Diffstat (limited to 'gcc/doc/extend.texi')
-rw-r--r-- gcc/doc/extend.texi 414
1 files changed, 392 insertions, 22 deletions
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 882c082..7427825 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -1131,6 +1131,14 @@ such an initializer, as shown here:
char **foo = (char *[]) @{ "x", "y", "z" @};
@end smallexample
+As a GNU extension, GCC allows compound literals with a variable size.
+In this case, only empty initialization is allowed.
+
+@smallexample
+int n = 4;
+char (*p)[n] = &(char[n])@{ @};
+@end smallexample
+
Compound literals for scalar types and union types are also allowed. In
the following example the variable @code{i} is initialized to the value
@code{2}, the result of incrementing the unnamed object created by
@@ -3463,12 +3471,41 @@ Function Attributes}, @ref{PowerPC Function Attributes},
@ref{ARM Function Attributes}, @ref{AArch64 Function Attributes},
and @ref{S/390 Function Attributes} for details.
+On targets supporting @code{target} function multiversioning (x86), when using
+C++, you can declare multiple functions with the same signature but different
+@code{target} attribute values, and the correct version is chosen by the
+dynamic linker. In the example below, two function versions are produced
+with differing mangling. Additionally, an ifunc resolver is created to
+select the correct version to populate the @code{func} symbol.
+
+@smallexample
+int func (void) __attribute__ ((target ("arch=core2"))) @{ return 1; @}
+int func (void) __attribute__ ((target ("sse3"))) @{ return 2; @}
+@end smallexample
+
+Declarations annotated with @code{target} cannot be used in combination with
+declarations annotated with @code{target_clones} in a single multiversioned
+function definition.
+
+@xref{Function Multiversioning}, for more details.
+
+@cindex @code{target_version} function attribute
+@item target_version (@var{option})
+On targets with @code{target_version} function multiversioning (AArch64 and
+RISC-V) in C or C++, you can declare multiple functions with
+@code{target_version} or @code{target_clones} attributes to define a function
+version set.
+
+@xref{Function Multiversioning}, for more details.
+
@cindex @code{target_clones} function attribute
@item target_clones (@var{options})
The @code{target_clones} attribute is used to specify that a function
be cloned into multiple versions compiled with different target options
-than specified on the command line. The supported options and restrictions
-are the same as for @code{target} attribute.
+than specified on the command line.
+
+For the x86 and PowerPC targets, the supported options and restrictions
+are the same as for the @code{target} attribute.
For instance, on an x86, you could compile a function with
@code{target_clones("sse4.1,avx")}. GCC creates two function clones,
@@ -3480,16 +3517,20 @@ function clones, one compiled with @option{-mcpu=power9} and another
with the default options. GCC must be configured to use GLIBC 2.23 or
newer in order to use the @code{target_clones} attribute.
-It also creates a resolver function (see
-the @code{ifunc} attribute above) that dynamically selects a clone
-suitable for current architecture. The resolver is created only if there
-is a usage of a function with @code{target_clones} attribute.
+@code{target_clones} works similarly for targets that support the
+@code{target_version} attribute (AArch64 and RISC-V). The attribute takes
+multiple arguments, and generates a versioned clone for each. A function
+annotated with @code{target_clones} is equivalent to the same function
+duplicated for each valid version string in the argument, where each
+version is instead annotated with @code{target_version}. This means that a
+@code{target_clones} annotated function definition can be used in combination
+with @code{target_version} annotated function definitions and other
+@code{target_clones} annotated function definitions.
-Note that any subsequent call of a function without @code{target_clone}
-from a @code{target_clone} caller will not lead to copying
-(target clone) of the called function.
-If you want to enforce such behavior,
-we recommend declaring the calling function with the @code{flatten} attribute?
+For these targets the supported options and restrictions are the same as for
+the @code{target_version} attribute.
+
+@xref{Function Multiversioning}, for more details.
@cindex @code{unavailable} function attribute
@item unavailable
@@ -7311,11 +7352,16 @@ the attribute.
When the field that represents the number of the elements is assigned a
negative integer value, the compiler treats the value as zero.
-The @code{counted_by} attribute is not allowed for a pointer to @code{void},
-a pointer to function, or a pointer to a structure or union that includes
-a flexible array member. However, it is allowed for a pointer to
-non-void incomplete structure or union types, as long as the type could
-be completed before the first reference to the pointer.
+The @code{counted_by} attribute is not allowed for a pointer to function,
+or a pointer to a structure or union that includes a flexible array member.
+However, it is allowed for a pointer to non-void incomplete structure
+or union types, as long as the type could be completed before the first
+reference to the pointer.
+
+The attribute is allowed for a pointer to @code{void}. However,
+warnings are issued for such cases when @option{-Wpointer-arith} is
+specified. When this attribute is applied to a pointer to @code{void},
+the size of each element of the array it points to is treated as 1.
An explicit @code{counted_by} annotation defines a relationship between
two objects, @code{p->array} and @code{p->count}, and there are the
@@ -19688,7 +19734,16 @@ into the data cache. The instruction is issued in slot I1@.
These built-in functions are available for LoongArch.
-Data Type Description:
+@menu
+* Data Types::
+* Directly-mapped Builtin Functions::
+* Directly-mapped Division Builtin Functions::
+* Other Builtin Functions::
+@end menu
+
+@node Data Types
+@subsubsection Data Types
+
@itemize
@item @code{imm0_31}, a compile-time constant in range 0 to 31;
@item @code{imm0_16383}, a compile-time constant in range 0 to 16383;
@@ -19696,6 +19751,9 @@ Data Type Description:
@item @code{imm_n2048_2047}, a compile-time constant in range -2048 to 2047;
@end itemize
+@node Directly-mapped Builtin Functions
+@subsubsection Directly-mapped Builtin Functions
+
The intrinsics provided are listed below:
@smallexample
unsigned int __builtin_loongarch_movfcsr2gr (imm0_31)
@@ -19819,6 +19877,9 @@ function you need to include @code{larchintrin.h}.
void __break (imm0_32767)
@end smallexample
+@node Directly-mapped Division Builtin Functions
+@subsubsection Directly-mapped Division Builtin Functions
+
These intrinsic functions are available by including @code{larchintrin.h} and
using @option{-mfrecipe}.
@smallexample
@@ -19828,6 +19889,9 @@ using @option{-mfrecipe}.
double __frsqrte_d (double);
@end smallexample
+@node Other Builtin Functions
+@subsubsection Other Builtin Functions
+
Additional built-in functions are available for LoongArch family
processors to efficiently use 128-bit floating-point (__float128)
values.
@@ -19854,6 +19918,15 @@ GCC provides intrinsics to access the LSX (Loongson SIMD Extension) instructions
The interface is made available by including @code{<lsxintrin.h>} and using
@option{-mlsx}.
+@menu
+* SX Data Types::
+* Directly-mapped SX Builtin Functions::
+* Directly-mapped SX Division Builtin Functions::
+@end menu
+
+@node SX Data Types
+@subsubsection SX Data Types
+
The following vectors typedefs are included in @code{lsxintrin.h}:
@itemize
@@ -19881,6 +19954,9 @@ input/output values manipulated:
@item @code{imm_n2048_2047}, an integer literal in range -2048 to 2047.
@end itemize
+@node Directly-mapped SX Builtin Functions
+@subsubsection Directly-mapped SX Builtin Functions
+
For convenience, GCC defines functions @code{__lsx_vrepli_@{b/h/w/d@}} and
@code{__lsx_b[n]z_@{v/b/h/w/d@}}, which are implemented as follows:
@@ -20664,6 +20740,9 @@ __m128i __lsx_vxori_b (__m128i, imm0_255);
__m128i __lsx_vxor_v (__m128i, __m128i);
@end smallexample
+@node Directly-mapped SX Division Builtin Functions
+@subsubsection Directly-mapped SX Division Builtin Functions
+
These intrinsic functions are available by including @code{lsxintrin.h} and
using @option{-mfrecipe} and @option{-mlsx}.
@smallexample
@@ -20680,6 +20759,16 @@ GCC provides intrinsics to access the LASX (Loongson Advanced SIMD Extension)
instructions. The interface is made available by including @code{<lasxintrin.h>}
and using @option{-mlasx}.
+@menu
+* ASX Data Types::
+* Directly-mapped ASX Builtin Functions::
+* Directly-mapped ASX Division Builtin Functions::
+* Directly-mapped SX and ASX Conversion Builtin Functions::
+@end menu
+
+@node ASX Data Types
+@subsubsection ASX Data Types
+
The following vectors typedefs are included in @code{lasxintrin.h}:
@itemize
@@ -20708,6 +20797,9 @@ input/output values manipulated:
@item @code{imm_n2048_2047}, an integer literal in range -2048 to 2047.
@end itemize
+@node Directly-mapped ASX Builtin Functions
+@subsubsection Directly-mapped ASX Builtin Functions
+
For convenience, GCC defines functions @code{__lasx_xvrepli_@{b/h/w/d@}} and
@code{__lasx_b[n]z_@{v/b/h/w/d@}}, which are implemented as follows:
@@ -21512,6 +21604,9 @@ __m256i __lasx_xvxori_b (__m256i, imm0_255);
__m256i __lasx_xvxor_v (__m256i, __m256i);
@end smallexample
+@node Directly-mapped ASX Division Builtin Functions
+@subsubsection Directly-mapped ASX Division Builtin Functions
+
These intrinsic functions are available by including @code{lasxintrin.h} and
using @option{-mfrecipe} and @option{-mlasx}.
@smallexample
@@ -21521,6 +21616,213 @@ __m256d __lasx_xvfrsqrte_d (__m256d);
__m256 __lasx_xvfrsqrte_s (__m256);
@end smallexample
+@node Directly-mapped SX and ASX Conversion Builtin Functions
+@subsubsection Directly-mapped SX and ASX Conversion Builtin Functions
+
+For convenience, @code{lasxintrin.h} includes @code{lsxintrin.h} and
+provides 18 additional interface functions for conversions between 128-bit
+and 256-bit vectors. These functions require the @option{-mlasx} option.
+@smallexample
+__m256 __lasx_cast_128_s (__m128);
+__m256d __lasx_cast_128_d (__m128d);
+__m256i __lasx_cast_128 (__m128i);
+__m256 __lasx_concat_128_s (__m128, __m128);
+__m256d __lasx_concat_128_d (__m128d, __m128d);
+__m256i __lasx_concat_128 (__m128i, __m128i);
+__m128 __lasx_extract_128_lo_s (__m256);
+__m128 __lasx_extract_128_hi_s (__m256);
+__m128d __lasx_extract_128_lo_d (__m256d);
+__m128d __lasx_extract_128_hi_d (__m256d);
+__m128i __lasx_extract_128_lo (__m256i);
+__m128i __lasx_extract_128_hi (__m256i);
+__m256 __lasx_insert_128_lo_s (__m256, __m128);
+__m256 __lasx_insert_128_hi_s (__m256, __m128);
+__m256d __lasx_insert_128_lo_d (__m256d, __m128d);
+__m256d __lasx_insert_128_hi_d (__m256d, __m128d);
+__m256i __lasx_insert_128_lo (__m256i, __m128i);
+__m256i __lasx_insert_128_hi (__m256i, __m128i);
+@end smallexample
+
+When GCC does not support the interfaces for 128-bit and 256-bit vector
+conversions, use the following code as an equivalent substitute.
+
+@smallexample
+
+ #ifndef __loongarch_asx_sx_conv
+
+ #include <lasxintrin.h>
+ #include <lsxintrin.h>
+ __m256 inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+ __lasx_cast_128_s (__m128 src)
+ @{
+ __m256 dest;
+ asm ("" : "=f"(dest) : "0"(src));
+ return dest;
+ @}
+
+ __m256d inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+ __lasx_cast_128_d (__m128d src)
+ @{
+ __m256d dest;
+ asm ("" : "=f"(dest) : "0"(src));
+ return dest;
+ @}
+
+ __m256i inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+ __lasx_cast_128 (__m128i src)
+ @{
+ __m256i dest;
+ asm ("" : "=f"(dest) : "0"(src));
+ return dest;
+ @}
+
+ __m256 inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+ __lasx_concat_128_s (__m128 src1, __m128 src2)
+ @{
+ __m256 dest;
+ asm ("xvpermi.q %u0,%u2,0x02\n"
+ : "=f"(dest)
+ : "0"(src1), "f"(src2));
+ return dest;
+ @}
+
+ __m256d inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+ __lasx_concat_128_d (__m128d src1, __m128d src2)
+ @{
+ __m256d dest;
+ asm ("xvpermi.q %u0,%u2,0x02\n"
+ : "=f"(dest)
+ : "0"(src1), "f"(src2));
+ return dest;
+ @}
+
+ __m256i inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+ __lasx_concat_128 (__m128i src1, __m128i src2)
+ @{
+ __m256i dest;
+ asm ("xvpermi.q %u0,%u2,0x02\n"
+ : "=f"(dest)
+ : "0"(src1), "f"(src2));
+ return dest;
+ @}
+
+ __m128 inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+ __lasx_extract_128_lo_s (__m256 src)
+ @{
+ __m128 dest;
+ asm ("" : "=f"(dest) : "0"(src));
+ return dest;
+ @}
+
+ __m128d inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+ __lasx_extract_128_lo_d (__m256d src)
+ @{
+ __m128d dest;
+ asm ("" : "=f"(dest) : "0"(src));
+ return dest;
+ @}
+
+ __m128i inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+ __lasx_extract_128_lo (__m256i src)
+ @{
+ __m128i dest;
+ asm ("" : "=f"(dest) : "0"(src));
+ return dest;
+ @}
+
+ __m128 inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+ __lasx_extract_128_hi_s (__m256 src)
+ @{
+ __m128 dest;
+ asm ("xvpermi.d %u0,%u1,0xe\n"
+ : "=f"(dest)
+ : "f"(src));
+ return dest;
+ @}
+
+ __m128d inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+ __lasx_extract_128_hi_d (__m256d src)
+ @{
+ __m128d dest;
+ asm ("xvpermi.d %u0,%u1,0xe\n"
+ : "=f"(dest)
+ : "f"(src));
+ return dest;
+ @}
+
+ __m128i inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+ __lasx_extract_128_hi (__m256i src)
+ @{
+ __m128i dest;
+ asm ("xvpermi.d %u0,%u1,0xe\n"
+ : "=f"(dest)
+ : "f"(src));
+ return dest;
+ @}
+
+ __m256 inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+ __lasx_insert_128_lo_s (__m256 src1, __m128 src2)
+ @{
+ __m256 dest;
+ asm ("xvpermi.q %u0,%u2,0x30\n"
+ : "=f"(dest)
+ : "0"(src1), "f"(src2));
+ return dest;
+ @}
+
+ __m256d inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+ __lasx_insert_128_lo_d (__m256d src1, __m128d src2)
+ @{
+ __m256d dest;
+ asm ("xvpermi.q %u0,%u2,0x30\n"
+ : "=f"(dest)
+ : "0"(src1), "f"(src2));
+ return dest;
+ @}
+
+ __m256i inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+ __lasx_insert_128_lo (__m256i src1, __m128i src2)
+ @{
+ __m256i dest;
+ asm ("xvpermi.q %u0,%u2,0x30\n"
+ : "=f"(dest)
+ : "0"(src1), "f"(src2));
+ return dest;
+ @}
+
+ __m256 inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+ __lasx_insert_128_hi_s (__m256 src1, __m128 src2)
+ @{
+ __m256 dest;
+ asm ("xvpermi.q %u0,%u2,0x02\n"
+ : "=f"(dest)
+ : "0"(src1), "f"(src2));
+ return dest;
+ @}
+
+ __m256d inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+ __lasx_insert_128_hi_d (__m256d src1, __m128d src2)
+ @{
+ __m256d dest;
+ asm ("xvpermi.q %u0,%u2,0x02\n"
+ : "=f"(dest)
+ : "0"(src1), "f"(src2));
+ return dest;
+ @}
+
+ __m256i inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+ __lasx_insert_128_hi (__m256i src1, __m128i src2)
+ @{
+ __m256i dest;
+ asm ("xvpermi.q %u0,%u2,0x02\n"
+ : "=f"(dest)
+ : "0"(src1), "f"(src2));
+ return dest;
+ @}
+ #endif
+
+@end smallexample
+
@node MIPS DSP Built-in Functions
@subsection MIPS DSP Built-in Functions
@@ -30669,11 +30971,79 @@ For the effects of the @code{hot} attribute on functions, see
@section Function Multiversioning
@cindex function versions
-With the GNU C++ front end, for x86 targets, you may specify multiple
-versions of a function, where each function is specialized for a
-specific target feature. At runtime, the appropriate version of the
-function is automatically executed depending on the characteristics of
-the execution platform. Here is an example.
+Function multiversioning is a mechanism that enables compiling multiple
+versions of a function, each specialized for different combinations of
+architecture extensions. Additionally, the compiler generates a resolver that
+the dynamic linker uses to detect architecture support and choose the
+appropriate version at runtime.
+
+Function multiversioning relies on the indirect function extension to the ELF
+standard, and therefore Binutils version 2.20.1 or higher and GNU C Library
+version 2.11.1 or higher are required to use this feature.
+
+There are two versions of function multiversioning supported by GCC.
+
+For targets supporting the @code{target_version} attribute (AArch64 and RISC-V),
+when compiling for C or C++, a function version set can be defined by a
+combination of function definitions with @code{target_version} and
+@code{target_clones} attributes, across translation units.
+
+For example:
+
+@smallexample
+// fmv.h:
+int foo ();
+int foo [[gnu::target_clones("sve", "sve2")]] ();
+int foo [[gnu::target_version("dotprod;priority=1")]] ();
+
+// fmv1.cc
+#include "fmv.h"
+
+int foo ()
+@{
+ // The default version of foo.
+ return 0;
+@}
+
+// fmv2.cc:
+#include "fmv.h"
+
+int foo [[gnu::target_clones("sve", "sve2")]] ()
+@{
+ // foo versions for sve and sve2
+ return 1;
+@}
+
+int foo [[gnu::target_version("dotprod")]] ()
+@{
+ // foo version for dotprod extension
+ return 2;
+@}
+
+// main.cc
+#include <cassert>
+#include "fmv.h"
+
+int main ()
+@{
+ int (*p)() = &foo;
+ assert ((*p) () == foo ());
+ return 0;
+@}
+@end smallexample
+
+This example results in 4 versions of the @code{foo} function being generated,
+and a resolver that the dynamic linker uses to choose the correct version.
+
+For the AArch64 target, GCC implements function multiversioning with the
+semantics and version strings specified in the
+@ref{ARM C Language Extensions (ACLE)}.
+
+For targets that support multiversioning with the @code{target} attribute
+(x86), a multiversioned function can be defined either with multiple function
+definitions with the @code{target} attribute (in C++) within a translation unit,
+or with a single definition with the @code{target_clones} attribute.
+
+Here is an example.
@smallexample
__attribute__ ((target ("default")))