Diffstat (limited to 'gcc/doc/extend.texi')
| -rw-r--r-- | gcc/doc/extend.texi | 414 |
1 files changed, 392 insertions, 22 deletions
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 882c082..7427825 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -1131,6 +1131,14 @@ such an initializer, as shown here: char **foo = (char *[]) @{ "x", "y", "z" @}; @end smallexample +As a GNU extension, GCC allows compound literals with a variable size. +In this case, only empty initialization is allowed. + +@smallexample +int n = 4; +char (*p)[n] = &(char[n])@{ @}; +@end smallexample + Compound literals for scalar types and union types are also allowed. In the following example the variable @code{i} is initialized to the value @code{2}, the result of incrementing the unnamed object created by @@ -3463,12 +3471,41 @@ Function Attributes}, @ref{PowerPC Function Attributes}, @ref{ARM Function Attributes}, @ref{AArch64 Function Attributes}, and @ref{S/390 Function Attributes} for details. +On targets supporting @code{target} function multiversioning (x86), when using +C++, you can declare multiple functions with the same signatures but different +@code{target} attribute values, and the correct version is chosen by the +dynamic linker. In the example below, two function versions are produced +with differing mangling. Additionally an ifunc resolver is created to +select the correct version to populate the @code{func} symbol. + +@smallexample +int func (void) __attribute__ ((target ("arch=core2"))) @{ return 1; @} +int func (void) __attribute__ ((target ("sse3"))) @{ return 2; @} +@end smallexample + +Declarations annotated with @code{target} cannot be used in combination with +declarations annotated with @code{target_clones} in a single multiversioned +function definition. + +@xref{Function Multiversioning} for more details. 
+ +@cindex @code{target_version} function attribute +@item target_version (@var{option}) +On targets with @code{target_version} function multiversioning (AArch64 and +RISC-V) in C or C++, you can declare multiple functions with +@code{target_version} or @code{target_clones} attributes to define a function +version set. + +@xref{Function Multiversioning} for more details. + @cindex @code{target_clones} function attribute @item target_clones (@var{options}) The @code{target_clones} attribute is used to specify that a function be cloned into multiple versions compiled with different target options -than specified on the command line. The supported options and restrictions -are the same as for @code{target} attribute. +than specified on the command line. + +For the x86 and PowerPC targets, the supported options and restrictions +are the same as for the @code{target} attribute. For instance, on an x86, you could compile a function with @code{target_clones("sse4.1,avx")}. GCC creates two function clones, @@ -3480,16 +3517,20 @@ function clones, one compiled with @option{-mcpu=power9} and another with the default options. GCC must be configured to use GLIBC 2.23 or newer in order to use the @code{target_clones} attribute. -It also creates a resolver function (see -the @code{ifunc} attribute above) that dynamically selects a clone -suitable for current architecture. The resolver is created only if there -is a usage of a function with @code{target_clones} attribute. +@code{target_clones} works similarly for targets that support the +@code{target_version} attribute (AArch64 and RISC-V). The attribute takes +multiple arguments, and generates a versioned clone for each. A function +annotated with @code{target_clones} is equivalent to the same function +duplicated for each valid version string in the argument, where each +version is instead annotated with @code{target_version}. 
This means that a +@code{target_clones} annotated function definition can be used in combination +with @code{target_version} annotated function definitions and other +@code{target_clones} annotated function definitions. -Note that any subsequent call of a function without @code{target_clone} -from a @code{target_clone} caller will not lead to copying -(target clone) of the called function. -If you want to enforce such behavior, -we recommend declaring the calling function with the @code{flatten} attribute? +For these targets the supported options and restrictions are the same as for +the @code{target_version} attribute. + +@xref{Function Multiversioning} for more details. @cindex @code{unavailable} function attribute @item unavailable @@ -7311,11 +7352,16 @@ the attribute. When the field that represents the number of the elements is assigned a negative integer value, the compiler treats the value as zero. -The @code{counted_by} attribute is not allowed for a pointer to @code{void}, -a pointer to function, or a pointer to a structure or union that includes -a flexible array member. However, it is allowed for a pointer to -non-void incomplete structure or union types, as long as the type could -be completed before the first reference to the pointer. +The @code{counted_by} attribute is not allowed for a pointer to function, +or a pointer to a structure or union that includes a flexible array member. +However, it is allowed for a pointer to non-void incomplete structure +or union types, as long as the type could be completed before the first +reference to the pointer. + +The attribute is allowed for a pointer to @code{void}. However, +warnings will be issued for such cases when @option{-Wpointer-arith} is +specified. When this attribute is applied on a pointer to @code{void}, +the size of each element of this pointer array is treated as 1. 
An explicit @code{counted_by} annotation defines a relationship between two objects, @code{p->array} and @code{p->count}, and there are the @@ -19688,7 +19734,16 @@ into the data cache. The instruction is issued in slot I1@. These built-in functions are available for LoongArch. -Data Type Description: +@menu +* Data Types:: +* Directly-mapped Builtin Functions:: +* Directly-mapped Division Builtin Functions:: +* Other Builtin Functions:: +@end menu + +@node Data Types +@subsubsection Data Types + @itemize @item @code{imm0_31}, a compile-time constant in range 0 to 31; @item @code{imm0_16383}, a compile-time constant in range 0 to 16383; @@ -19696,6 +19751,9 @@ Data Type Description: @item @code{imm_n2048_2047}, a compile-time constant in range -2048 to 2047; @end itemize +@node Directly-mapped Builtin Functions +@subsubsection Directly-mapped Builtin Functions + The intrinsics provided are listed below: @smallexample unsigned int __builtin_loongarch_movfcsr2gr (imm0_31) @@ -19819,6 +19877,9 @@ function you need to include @code{larchintrin.h}. void __break (imm0_32767) @end smallexample +@node Directly-mapped Division Builtin Functions +@subsubsection Directly-mapped Division Builtin Functions + These intrinsic functions are available by including @code{larchintrin.h} and using @option{-mfrecipe}. @smallexample @@ -19828,6 +19889,9 @@ using @option{-mfrecipe}. double __frsqrte_d (double); @end smallexample +@node Other Builtin Functions +@subsubsection Other Builtin Functions + Additional built-in functions are available for LoongArch family processors to efficiently use 128-bit floating-point (__float128) values. @@ -19854,6 +19918,15 @@ GCC provides intrinsics to access the LSX (Loongson SIMD Extension) instructions The interface is made available by including @code{<lsxintrin.h>} and using @option{-mlsx}. 
+@menu +* SX Data Types:: +* Directly-mapped SX Builtin Functions:: +* Directly-mapped SX Division Builtin Functions:: +@end menu + +@node SX Data Types +@subsubsection SX Data Types + The following vectors typedefs are included in @code{lsxintrin.h}: @itemize @@ -19881,6 +19954,9 @@ input/output values manipulated: @item @code{imm_n2048_2047}, an integer literal in range -2048 to 2047. @end itemize +@node Directly-mapped SX Builtin Functions +@subsubsection Directly-mapped SX Builtin Functions + For convenience, GCC defines functions @code{__lsx_vrepli_@{b/h/w/d@}} and @code{__lsx_b[n]z_@{v/b/h/w/d@}}, which are implemented as follows: @@ -20664,6 +20740,9 @@ __m128i __lsx_vxori_b (__m128i, imm0_255); __m128i __lsx_vxor_v (__m128i, __m128i); @end smallexample +@node Directly-mapped SX Division Builtin Functions +@subsubsection Directly-mapped SX Division Builtin Functions + These intrinsic functions are available by including @code{lsxintrin.h} and using @option{-mfrecipe} and @option{-mlsx}. @smallexample @@ -20680,6 +20759,16 @@ GCC provides intrinsics to access the LASX (Loongson Advanced SIMD Extension) instructions. The interface is made available by including @code{<lasxintrin.h>} and using @option{-mlasx}. +@menu +* ASX Data Types:: +* Directly-mapped ASX Builtin Functions:: +* Directly-mapped ASX Division Builtin Functions:: +* Directly-mapped SX and ASX Conversion Builtin Functions:: +@end menu + +@node ASX Data Types +@subsubsection ASX Data Types + The following vectors typedefs are included in @code{lasxintrin.h}: @itemize @@ -20708,6 +20797,9 @@ input/output values manipulated: @item @code{imm_n2048_2047}, an integer literal in range -2048 to 2047. 
@end itemize +@node Directly-mapped ASX Builtin Functions +@subsubsection Directly-mapped ASX Builtin Functions + For convenience, GCC defines functions @code{__lasx_xvrepli_@{b/h/w/d@}} and @code{__lasx_b[n]z_@{v/b/h/w/d@}}, which are implemented as follows: @@ -21512,6 +21604,9 @@ __m256i __lasx_xvxori_b (__m256i, imm0_255); __m256i __lasx_xvxor_v (__m256i, __m256i); @end smallexample +@node Directly-mapped ASX Division Builtin Functions +@subsubsection Directly-mapped ASX Division Builtin Functions + These intrinsic functions are available by including @code{lasxintrin.h} and using @option{-mfrecipe} and @option{-mlasx}. @smallexample @@ -21521,6 +21616,213 @@ __m256d __lasx_xvfrsqrte_d (__m256d); __m256 __lasx_xvfrsqrte_s (__m256); @end smallexample +@node Directly-mapped SX and ASX Conversion Builtin Functions +@subsubsection Directly-mapped SX and ASX Conversion Builtin Functions + +For convenience, the @code{lsxintrin.h} file was imported into @code{ +lasxintrin.h} and 18 new interface functions for 128 and 256 vector +conversions were added, using the @option{-mlasx} option. 
+@smallexample +__m256 __lasx_cast_128_s (__m128); +__m256d __lasx_cast_128_d (__m128d); +__m256i __lasx_cast_128 (__m128i); +__m256 __lasx_concat_128_s (__m128, __m128); +__m256d __lasx_concat_128_d (__m128d, __m128d); +__m256i __lasx_concat_128 (__m128i, __m128i); +__m128 __lasx_extract_128_lo_s (__m256); +__m128 __lasx_extract_128_hi_s (__m256); +__m128d __lasx_extract_128_lo_d (__m256d); +__m128d __lasx_extract_128_hi_d (__m256d); +__m128i __lasx_extract_128_lo (__m256i); +__m128i __lasx_extract_128_hi (__m256i); +__m256 __lasx_insert_128_lo_s (__m256, __m128); +__m256 __lasx_insert_128_hi_s (__m256, __m128); +__m256d __lasx_insert_128_lo_d (__m256d, __m128d); +__m256d __lasx_insert_128_hi_d (__m256d, __m128d); +__m256i __lasx_insert_128_lo (__m256i, __m128i); +__m256i __lasx_insert_128_hi (__m256i, __m128i); +@end smallexample + +When gcc does not support interfaces for 128 and 256 conversions, +use the following code for equivalent substitution. + +@smallexample + + #ifndef __loongarch_asx_sx_conv + + #include <lasxintrin.h> + #include <lsxintrin.h> + __m256 inline __attribute__((__gnu_inline__, __always_inline__, __artificial__)) + __lasx_cast_128_s (__m128 src) + @{ + __m256 dest; + asm ("" : "=f"(dest) : "0"(src)); + return dest; + @} + + __m256d inline __attribute__((__gnu_inline__, __always_inline__, __artificial__)) + __lasx_cast_128_d (__m128d src) + @{ + __m256d dest; + asm ("" : "=f"(dest) : "0"(src)); + return dest; + @} + + __m256i inline __attribute__((__gnu_inline__, __always_inline__, __artificial__)) + __lasx_cast_128 (__m128i src) + @{ + __m256i dest; + asm ("" : "=f"(dest) : "0"(src)); + return dest; + @} + + __m256 inline __attribute__((__gnu_inline__, __always_inline__, __artificial__)) + __lasx_concat_128_s (__m128 src1, __m128 src2) + @{ + __m256 dest; + asm ("xvpermi.q %u0,%u2,0x02\n" + : "=f"(dest) + : "0"(src1), "f"(src2)); + return dest; + @} + + __m256d inline __attribute__((__gnu_inline__, __always_inline__, __artificial__)) + 
__lasx_concat_128_d (__m128d src1, __m128d src2) + @{ + __m256d dest; + asm ("xvpermi.q %u0,%u2,0x02\n" + : "=f"(dest) + : "0"(src1), "f"(src2)); + return dest; + @} + + __m256i inline __attribute__((__gnu_inline__, __always_inline__, __artificial__)) + __lasx_concat_128 (__m128i src1, __m128i src2) + @{ + __m256i dest; + asm ("xvpermi.q %u0,%u2,0x02\n" + : "=f"(dest) + : "0"(src1), "f"(src2)); + return dest; + @} + + __m128 inline __attribute__((__gnu_inline__, __always_inline__, __artificial__)) + __lasx_extract_128_lo_s (__m256 src) + @{ + __m128 dest; + asm ("" : "=f"(dest) : "0"(src)); + return dest; + @} + + __m128d inline __attribute__((__gnu_inline__, __always_inline__, __artificial__)) + __lasx_extract_128_lo_d (__m256d src) + @{ + __m128d dest; + asm ("" : "=f"(dest) : "0"(src)); + return dest; + @} + + __m128i inline __attribute__((__gnu_inline__, __always_inline__, __artificial__)) + __lasx_extract_128_lo (__m256i src) + @{ + __m128i dest; + asm ("" : "=f"(dest) : "0"(src)); + return dest; + @} + + __m128 inline __attribute__((__gnu_inline__, __always_inline__, __artificial__)) + __lasx_extract_128_hi_s (__m256 src) + @{ + __m128 dest; + asm ("xvpermi.d %u0,%u1,0xe\n" + : "=f"(dest) + : "f"(src)); + return dest; + @} + + __m128d inline __attribute__((__gnu_inline__, __always_inline__, __artificial__)) + __lasx_extract_128_hi_d (__m256d src) + @{ + __m128d dest; + asm ("xvpermi.d %u0,%u1,0xe\n" + : "=f"(dest) + : "f"(src)); + return dest; + @} + + __m128i inline __attribute__((__gnu_inline__, __always_inline__, __artificial__)) + __lasx_extract_128_hi (__m256i src) + @{ + __m128i dest; + asm ("xvpermi.d %u0,%u1,0xe\n" + : "=f"(dest) + : "f"(src)); + return dest; + @} + + __m256 inline __attribute__((__gnu_inline__, __always_inline__, __artificial__)) + __lasx_insert_128_lo_s (__m256 src1, __m128 src2) + @{ + __m256 dest; + asm ("xvpermi.q %u0,%u2,0x30\n" + : "=f"(dest) + : "0"(src1), "f"(src2)); + return dest; + @} + + __m256d inline 
__attribute__((__gnu_inline__, __always_inline__, __artificial__)) + __lasx_insert_128_lo_d (__m256d src1, __m128d src2) + @{ + __m256d dest; + asm ("xvpermi.q %u0,%u2,0x30\n" + : "=f"(dest) + : "0"(src1), "f"(src2)); + return dest; + @} + + __m256i inline __attribute__((__gnu_inline__, __always_inline__, __artificial__)) + __lasx_insert_128_lo (__m256i src1, __m128i src2) + @{ + __m256i dest; + asm ("xvpermi.q %u0,%u2,0x30\n" + : "=f"(dest) + : "0"(src1), "f"(src2)); + return dest; + @} + + __m256 inline __attribute__((__gnu_inline__, __always_inline__, __artificial__)) + __lasx_insert_128_hi_s (__m256 src1, __m128 src2) + @{ + __m256 dest; + asm ("xvpermi.q %u0,%u2,0x02\n" + : "=f"(dest) + : "0"(src1), "f"(src2)); + return dest; + @} + + __m256d inline __attribute__((__gnu_inline__, __always_inline__, __artificial__)) + __lasx_insert_128_hi_d (__m256d src1, __m128d src2) + @{ + __m256d dest; + asm ("xvpermi.q %u0,%u2,0x02\n" + : "=f"(dest) + : "0"(src1), "f"(src2)); + return dest; + @} + + __m256i inline __attribute__((__gnu_inline__, __always_inline__, __artificial__)) + __lasx_insert_128_hi (__m256i src1, __m128i src2) + @{ + __m256i dest; + asm ("xvpermi.q %u0,%u2,0x02\n" + : "=f"(dest) + : "0"(src1), "f"(src2)); + return dest; + @} + #endif + +@end smallexample + @node MIPS DSP Built-in Functions @subsection MIPS DSP Built-in Functions @@ -30669,11 +30971,79 @@ For the effects of the @code{hot} attribute on functions, see @section Function Multiversioning @cindex function versions -With the GNU C++ front end, for x86 targets, you may specify multiple -versions of a function, where each function is specialized for a -specific target feature. At runtime, the appropriate version of the -function is automatically executed depending on the characteristics of -the execution platform. Here is an example. +Function multiversioning is a mechanism that enables compiling multiple +versions of a function, each specialized for different combinations of +architecture extensions. 
Additionally, the compiler generates a resolver that +the dynamic linker uses to detect architecture support and choose the +appropriate version at runtime. + +Function multiversioning relies on the indirect function extension to the ELF +standard, and therefore Binutils version 2.20.1 or higher and GNU C Library +version 2.11.1 are required to use this feature. + +There are two versions of function multiversioning supported by GCC. + +For targets supporting the @code{target_version} attribute (AArch64 and RISC-V), +when compiling for C or C++, a function version set can be defined by a +combination of function definitions with @code{target_version} and +@code{target_clones} attributes, across translation units. + +For example: + +@smallexample +// fmv.h: +int foo (); +int foo [[gnu::target_clones("sve", "sve2")]] (); +int foo [[gnu::target_version("dotprod;priority=1")]] (); + +// fmv1.cc +#include "fmv.h" + +int foo () +@{ + // The default version of foo. + return 0; +@} + +// fmv2.cc: +#include "fmv.h" + +int foo [[gnu::target_clones("sve", "sve2")]] () +@{ + // foo versions for sve and sve2 + return 1; +@} + +int foo [[gnu::target_version("dotprod")]] () +@{ + // foo version for dotprod extension + return 2; +@} + +// main.cc +#include "fmv.h" + +int main () +@{ + int (*p)() = &foo; + assert ((*p) () == foo ()); + return 0; +@} +@end smallexample + +This example results in 4 versions of the foo function being generated, and +a resolver which is used by the dynamic linker to choose the correct version. + +For the AArch64 target GCC implements function multiversioning, with the +semantics and version strings as specified in the +@ref{ARM C Language Extensions (ACLE)}. 
+ +For targets that support multiversioning with the @code{target} attribute +(x86) a multiversioned function can be defined with either multiple function +definitions with the @code{target} attribute (in C++) within a translation unit, +or a single definition with the @code{target_clones} attribute. + +Here is an example. @smallexample __attribute__ ((target ("default"))) |
