aboutsummaryrefslogtreecommitdiff
path: root/gcc/config.gcc
AgeCommit message (Collapse)AuthorFilesLines
2024-03-07LoongArch: Use /lib instead of /lib64 as the library search path for MUSL.Yang Yujie1-0/+3
gcc/ChangeLog: * config.gcc: Add a case for loongarch*-*-linux-musl*. * config/loongarch/linux.h: Disable the multilib-compatible treatment for *musl* targets. * config/loongarch/musl.h: New file.
2024-02-28bpf: renames coreout.* files to btfext-out.*.Cupertino Miranda1-2/+2
gcc/ChangeLog: * config.gcc (target_gtfiles): Change coreout to btfext-out. (extra_objs): Change coreout to btfext-out. * config/bpf/coreout.cc: Rename to btfext-out.cc. * config/bpf/btfext-out.cc: Add. * config/bpf/coreout.h: Rename to btfext-out.h. * config/bpf/btfext-out.h: Add. * config/bpf/core-builtins.cc: Change include. * config/bpf/core-builtins.h: Change include. * config/bpf/t-bpf: Accomodate renamed files.
2024-02-23Add ia64*-*-* to the list of obsolete targetsRichard Biener1-0/+1
The following deprecates ia64*-*-* for GCC 14. Since we plan to force LRA for GCC 15 and the target only has slim chances of getting updated this notifies people in advance. Given both Linux and glibc have axed the target further development is also made difficult. There is no listed maintainer for ia64 either. PR target/90785 gcc/ * config.gcc: Add ia64*-*-* to the list of obsoleted targets. contrib/ * config-list.mk (LIST): --enable-obsolete for ia64*-*-*.
2024-01-26amdgcn: config.gcc - enable gfx1030 and gfx1100 multilib; add them to the docsTobias Burnus1-1/+1
gcc/ChangeLog: * config.gcc (amdgcn-*-*): Add gfx1030 and gfx1100 to TM_MULTILIB_CONFIG. * doc/install.texi (Configuration amdgcn-*-*): Mention gfx1030/gfx1100. * doc/invoke.texi (AMD GCN Options): Add gfx1030 and gfx1100 to -march/-mtune. libgomp/ChangeLog: * testsuite/libgomp.c/declare-variant-4.h: Add variant functions for gfx1030 and gfx1100. * testsuite/libgomp.c/declare-variant-4-gfx1030.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1100.c: New test. Signed-off-by: Tobias Burnus <tburnus@baylibre.com>
2024-01-18RISC-V: Handle differences between XTheadvector and VectorJun Sha (Joshua)1-1/+1
This patch is to handle the differences in instruction generation between Vector and XTheadVector. In this version, we only support partial xtheadvector instructions that leverage directly from current RVV1.0 with simple adding "th." prefix. For different name xtheadvector instructions but share same patterns as RVV1.0 instructions, we will use ASM targethook to rewrite the whole string of the instructions in the following patches. For some vector patterns that cannot be avoided, we use "!TARGET_XTHEADVECTOR" to disable them in vector.md in order not to generate instructions that xtheadvector does not support, like vmv1r. gcc/ChangeLog: * config.gcc: Add files for XTheadVector intrinsics. * config/riscv/autovec.md: Guard XTheadVector. * config/riscv/predicates.md: Disable immediate vl for XTheadVector. * config/riscv/riscv-c.cc (riscv_pragma_intrinsic): Add pragma for XTheadVector. * config/riscv/riscv-string.cc (riscv_expand_block_move): Guard XTheadVector. * config/riscv/riscv-v.cc (vls_mode_valid_p): Avoid autovec. * config/riscv/riscv-vector-builtins-bases.cc: Do not normalize vsetvl instructions for XTheadVector. * config/riscv/riscv-vector-builtins-shapes.cc (check_type): New check type function. (build_one): Adjust for XTheadVector. * config/riscv/riscv-vector-switch.def (ENTRY): Disable fractional mode for the XTheadVector extension. (TUPLE_ENTRY): Likewise. * config/riscv/riscv.cc (riscv_v_adjust_bytesize): Guard XTheadVector. (riscv_preferred_simd_mode): Likewsie. (riscv_autovectorize_vector_modes): Likewise. (riscv_vector_mode_supported_any_target_p): Likewise. (TARGET_VECTOR_MODE_SUPPORTED_ANY_TARGET_P): Likewise. * config/riscv/thead.cc (th_asm_output_opcode): Rewrite vsetvl instructions. * config/riscv/vector.md: Include thead-vector.md and change fractional LMUL into 1 for vbool. * config/riscv/riscv_th_vector.h: New file. * config/riscv/thead-vector.md: New file. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pragma-1.c: Add XTheadVector. * gcc.target/riscv/rvv/base/abi-1.c: Exclude XTheadVector. * lib/target-supports.exp: Add target for XTheadVector. Co-authored-by: Jin Ma <jinma@linux.alibaba.com> Co-authored-by: Xianmiao Qu <cooper.qu@linux.alibaba.com> Co-authored-by: Christoph Müllner <christoph.muellner@vrull.eu>
2024-01-15RISC-V: Add C intrinsic for Scalar Bitmanip ExtensionLiao Shihua1-1/+1
This patch adds C intrinsics for Bitmanip Extension. RISCV_BUILTIN_NO_PREFIX is a new riscv_builtin_description like RISCV_BUILTIN. But it uses CODE_FOR_##INSN rather than CODE_FOR_riscv_##INSN. Changed orcb, clmul, brev8 pattern's mode form X to GPR because orcbsi, clmul_si, brev8_si are both included in rv32 and rv64. Test them in scalar_bitmanip_intrinsic-64-emulated.c. gcc/ChangeLog: * config.gcc: Include riscv_bitmanip.h. * config/riscv/bitmanip.md: Changed mode form X to GPR in orcb and clmul pattern. * config/riscv/crypto.md: Changed mode form X to GPR in brev8 pattern. * config/riscv/riscv-builtins.cc (AVAIL): Adding new bitmanip builtins. (RISCV_BUILTIN_NO_PREFIX): New helper macro. * config/riscv/riscv-cmo.def (RISCV_BUILTIN): Add '_32'/'_64' postfix to builtins. * config/riscv/riscv-ftypes.def (2): New ftypes. * config/riscv/riscv-scalar-crypto.def (RISCV_BUILTIN): New builtins. (RISCV_BUILTIN_NO_PREFIX): Likewise. * config/riscv/riscv_bitmanip.h: New file. gcc/testsuite/ChangeLog: * gcc.target/riscv/scalar_bitmanip_intrinsic-32.c: New test. * gcc.target/riscv/scalar_bitmanip_intrinsic-64-emulated.c: New test. * gcc.target/riscv/scalar_bitmanip_intrinsic-64.c: New test.
2024-01-15RISC-V: Add C intrinsic for Scalar Crypto ExtensionLiao Shihua1-1/+1
This patch adds C intrinsics for Scalar Crypto Extension. gcc/ChangeLog: * config.gcc: Include riscv_crypto.h. * config/riscv/riscv_crypto.h: New file. gcc/testsuite/ChangeLog: * gcc.target/riscv/scalar_crypto_intrinsic-32.c: New test. * gcc.target/riscv/scalar_crypto_intrinsic-64.c: New test.
2024-01-09aarch64: Fix up GC of aarch64_simd_types [PR113270]Jakub Jelinek1-1/+1
The r14-6524 changes created aarch64-builtins.h header and moved struct aarch64_simd_type_info definition in there. Unfortunately, the new header wasn't added to target_gtfiles, so the trees and const char * pointer elements in the aarch64_simd_types array aren't marked as GC roots anymore. That breaks e.g. PCH, when the array elements then can refer to ggc_freed memory instead of the expected types, but also any other GC collection could free them and further uses would not work correctly. Unfortunately, just adding the new header to target_gtfiles doesn't fix this, because non-static variable definitions marked with GTY(()) aren't considered by gengtype, it looks in those cases for an extern GTY(()) declaration, and there was none - the aarch64-builtins.h header contains an extern declaration without GTY(()). Adding GTY(()) to that extern declaration doesn't work, because then gengtype attempts to emit the aarch64_simd_types GC roots in gtype-desc.cc but the corresponding header isn't included there. So, the patch instead adds another extern declaration in aarch64-builtins.cc right before the actual definition, which makes sure the GC roots are registered correctly in gt-aarch64-builtins.h (where we want them). 2024-01-09 Jakub Jelinek <jakub@redhat.com> PR target/113270 * config.gcc (aarch64*-*-*): Add aarch64-builtins.h to target_gtfiles. * config/aarch64/aarch64-builtins.cc (aarch64_simd_types): Add extern GTY(()) declaration before the definition, drop GTY(()) drom the definition.
2024-01-08GCN: Add pre-initial support for gfx1100Tobias Burnus1-1/+1
ROCm since 5.7.1 supports gfx1100 (RDNA3) cards. This commit adds support for it, mostly by assuming gfx1100 behaves identical to gfx1030. Like gfx1030, gfx1100 support is neither documented nor the build of the multilib enabled by default. But contrary to gfx1030, gfx1100 has a known issue causing some libraries not to build, including newlib: The sdwa variant of v_mov_b32_sdwa is not supported by the hardware but GCC current does generates this instruction. This will be addressed in a later commit. gcc/ChangeLog: * config.gcc (amdgcn-*-amdhsa): Accept --with-arch=gfx1100. * config/gcn/gcn-hsa.h (NO_XNACK): Add gfx1100: (ASM_SPEC): Handle gfx1100. * config/gcn/gcn-opts.h (enum processor_type): Add PROCESSOR_GFX1100. (enum gcn_isa): Add ISA_RDNA3. (TARGET_GFX1100, TARGET_RDNA2_PLUS, TARGET_RDNA3): Define. * config/gcn/gcn-valu.md: Change TARGET_RDNA2 to TARGET_RDNA2_PLUS. * config/gcn/gcn.cc (gcn_option_override, gcn_omp_device_kind_arch_isa, output_file_start): Handle gfx1100. (gcn_global_address_p, gcn_addr_space_legitimate_address_p): Change TARGET_RDNA2 to TARGET_RDNA2_PLUS. (gcn_hsa_declare_function_name): Don't use '.amdhsa_reserve_flat_scratch' with gfx1100. * config/gcn/gcn.h (ASSEMBLER_DIALECT): Likewise. (TARGET_CPU_CPP_BUILTINS): Define __RDNA3__, __gfx1030__ and __gfx1100__. * config/gcn/gcn.md: Change TARGET_RDNA2 to TARGET_RDNA2_PLUS. * config/gcn/gcn.opt (Enum gpu_type): Add gfx1100. * config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX1100): Define. (isa_has_combined_avgprs, main): Handle gfx1100. * config/gcn/t-omp-device (isa): Add gfx1100. libgomp/ChangeLog: * plugin/plugin-gcn.c (gcn_gfx1100_s): New const string. (gcn_isa_name_len): Fix length. (isa_hsa_name, isa_code, max_isa_vgprs): Handle gfx1100.
2024-01-03Update copyright years.Jakub Jelinek1-1/+1
2023-12-18LoongArch: Add support for D frontend.liushuyu1-0/+1
gcc/ChangeLog: * config.gcc: Add loongarch-d.o to d_target_objs for LoongArch architecture. * config/loongarch/t-loongarch: Add object target for loongarch-d.cc. * config/loongarch/loongarch-d.cc (loongarch_d_target_versions): add interface function to define builtin D versions for LoongArch architecture. (loongarch_d_handle_target_float_abi): add interface function to define builtin D traits for LoongArch architecture. (loongarch_d_register_target_info): add interface function to register loongarch_d_handle_target_float_abi function. * config/loongarch/loongarch-d.h (loongarch_d_target_versions): add function prototype. (loongarch_d_register_target_info): Likewise. libphobos/ChangeLog: * configure.tgt: Enable libphobos for LoongArch architecture. * libdruntime/gcc/sections/elf.d: Add TLS_DTV_OFFSET constant for LoongArch64. * libdruntime/gcc/unwind/generic.d: Add __aligned__ constant for LoongArch64.
2023-12-15aarch64: Add new load/store pair fusion pass.Alex Coplan1-1/+1
This adds a new aarch64-specific RTL-SSA pass dedicated to forming load and store pairs (LDPs and STPs). As a motivating example for the kind of thing this improves, take the following testcase: extern double c[20]; double f(double x) { double y = x*x; y += c[16]; y += c[17]; y += c[18]; y += c[19]; return y; } for which we currently generate (at -O2): f: adrp x0, c add x0, x0, :lo12:c ldp d31, d29, [x0, 128] ldr d30, [x0, 144] fmadd d0, d0, d0, d31 ldr d31, [x0, 152] fadd d0, d0, d29 fadd d0, d0, d30 fadd d0, d0, d31 ret but with the pass, we generate: f: .LFB0: adrp x0, c add x0, x0, :lo12:c ldp d31, d29, [x0, 128] fmadd d0, d0, d0, d31 ldp d30, d31, [x0, 144] fadd d0, d0, d29 fadd d0, d0, d30 fadd d0, d0, d31 ret The pass is local (only considers a BB at a time). In theory, it should be possible to extend it to run over EBBs, at least in the case of pure (MEM_READONLY_P) loads, but this is left for future work. The pass works by identifying two kinds of bases: tree decls obtained via MEM_EXPR, and RTL register bases in the form of RTL-SSA def_infos. If a candidate memory access has a MEM_EXPR base, then we track it via this base, and otherwise if it is of a simple reg + <imm> form, we track it via the RTL-SSA def_info for the register. For each BB, for a given kind of base, we build up a hash table mapping the base to an access_group. The access_group data structure holds a list of accesses at each offset relative to the same base. It uses a splay tree to support efficient insertion (while walking the bb), and the nodes are chained using a linked list to support efficient iteration (while doing the transformation). For each base, we then iterate over the access_group to identify adjacent accesses, and try to form load/store pairs for those insns that access adjacent memory. The pass is currently run twice, both before and after register allocation. The first copy of the pass is run late in the pre-RA RTL pipeline, immediately after sched1, since it was found that sched1 was increasing register pressure when the pass was run before. The second copy of the pass runs immediately before peephole2, so as to get any opportunities that the existing ldp/stp peepholes can handle. There are some cases that we punt on before RA, e.g. accesses relative to eliminable regs (such as the soft frame pointer). We do this since we can't know the elimination offset before RA, and we want to avoid the RA reloading the offset (due to being out of ldp/stp immediate range) as this can generate worse code. The post-RA copy of the pass is there to pick up the crumbs that were left behind / things we punted on in the pre-RA pass. Among other things, it's needed to handle accesses relative to the stack pointer. It can also handle code that didn't exist at the time the pre-RA pass was run (spill code, prologue/epilogue code). This is an initial implementation, and there are (among other possible improvements) the following notable caveats / missing features that are left for future work, but could give further improvements: - Moving accesses between BBs within in an EBB, see above. - Out-of-range opportunities: currently the pass refuses to form pairs if there isn't a suitable base register with an immediate in range for ldp/stp, but it can be profitable to emit anchor addresses in the case that there are four or more out-of-range nearby accesses that can be formed into pairs. This is handled by the current ldp/stp peepholes, so it would be good to support this in the future. - Discovery: currently we prioritize MEM_EXPR bases over RTL bases, which can lead to us missing opportunities in the case that two accesses have distinct MEM_EXPR bases (i.e. different DECLs) but they are still adjacent in memory (e.g. adjacent variables on the stack). I hope to address this for GCC 15, hopefully getting to the point where we can remove the ldp/stp peepholes and scheduling hooks. Furthermore it would be nice to make the pass aware of section anchors (adding these as a third kind of base) allowing merging accesses to adjacent variables within the same section. gcc/ChangeLog: * config.gcc: Add aarch64-ldp-fusion.o to extra_objs for aarch64. * config/aarch64/aarch64-passes.def: Add copies of pass_ldp_fusion before and after RA. * config/aarch64/aarch64-protos.h (make_pass_ldp_fusion): Declare. * config/aarch64/aarch64.opt (-mearly-ldp-fusion): New. (-mlate-ldp-fusion): New. (--param=aarch64-ldp-alias-check-limit): New. (--param=aarch64-ldp-writeback): New. * config/aarch64/t-aarch64: Add rule for aarch64-ldp-fusion.o. * config/aarch64/aarch64-ldp-fusion.cc: New file. * doc/invoke.texi (AArch64 Options): Document new -m{early,late}-ldp-fusion options.
2023-12-13aarch64: SVE/NEON Bridging intrinsicsRichard Ball1-1/+1
ACLE has added intrinsics to bridge between SVE and Neon. The NEON_SVE Bridge adds intrinsics that allow conversions between NEON and SVE vectors. This patch adds support to GCC for the following 3 intrinsics: svset_neonq, svget_neonq and svdup_neonq gcc/ChangeLog: * config.gcc: Adds new header to config. * config/aarch64/aarch64-builtins.cc (enum aarch64_type_qualifiers): Moved to header file. (ENTRY): Likewise. (enum aarch64_simd_type): Likewise. (struct aarch64_simd_type_info): Remove static. (GTY): Likewise. * config/aarch64/aarch64-c.cc (aarch64_pragma_aarch64): Defines pragma for arm_neon_sve_bridge.h. * config/aarch64/aarch64-protos.h: Add handle_arm_neon_sve_bridge_h * config/aarch64/aarch64-sve-builtins-base.h: New intrinsics. * config/aarch64/aarch64-sve-builtins-base.cc (class svget_neonq_impl): New intrinsic implementation. (class svset_neonq_impl): Likewise. (class svdup_neonq_impl): Likewise. (NEON_SVE_BRIDGE_FUNCTION): New intrinsics. * config/aarch64/aarch64-sve-builtins-functions.h (NEON_SVE_BRIDGE_FUNCTION): Defines macro for NEON_SVE_BRIDGE functions. * config/aarch64/aarch64-sve-builtins-shapes.h: New shapes. * config/aarch64/aarch64-sve-builtins-shapes.cc (parse_element_type): Add NEON element types. (parse_type): Likewise. (struct get_neonq_def): Defines function shape for get_neonq. (struct set_neonq_def): Defines function shape for set_neonq. (struct dup_neonq_def): Defines function shape for dup_neonq. * config/aarch64/aarch64-sve-builtins.cc (DEF_SVE_TYPE_SUFFIX): Changed to be called through SVE_NEON macro. (DEF_SVE_NEON_TYPE_SUFFIX): Defines macro for NEON_SVE_BRIDGE type suffixes. (DEF_NEON_SVE_FUNCTION): Defines macro for NEON_SVE_BRIDGE functions. (function_resolver::infer_neon128_vector_type): Infers type suffix for overloaded functions. (handle_arm_neon_sve_bridge_h): Handles #pragma arm_neon_sve_bridge.h. * config/aarch64/aarch64-sve-builtins.def (DEF_SVE_NEON_TYPE_SUFFIX): Macro for handling neon_sve type suffixes. (bf16): Replace entry with neon-sve entry. (f16): Likewise. (f32): Likewise. (f64): Likewise. (s8): Likewise. (s16): Likewise. (s32): Likewise. (s64): Likewise. (u8): Likewise. (u16): Likewise. (u32): Likewise. (u64): Likewise. * config/aarch64/aarch64-sve-builtins.h (GCC_AARCH64_SVE_BUILTINS_H): Include aarch64-builtins.h. (ENTRY): Add aarch64_simd_type definiton. (enum aarch64_simd_type): Add neon information to type_suffix_info. (struct type_suffix_info): New function. * config/aarch64/aarch64-sve.md (@aarch64_sve_get_neonq_<mode>): New intrinsic insn for big endian. (@aarch64_sve_set_neonq_<mode>): Likewise. * config/aarch64/iterators.md: Add UNSPEC_SET_NEONQ. * config/aarch64/aarch64-builtins.h: New file. * config/aarch64/aarch64-neon-sve-bridge-builtins.def: New file. * config/aarch64/arm_neon_sve_bridge.h: New file. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve/acle/asm/test_sve_acle.h: Add include arm_neon_sve_bridge header file * gcc.dg/torture/neon-sve-bridge.c: New test. * gcc.target/aarch64/sve/acle/asm/dup_neonq_bf16.c: New test. * gcc.target/aarch64/sve/acle/asm/dup_neonq_f16.c: New test. * gcc.target/aarch64/sve/acle/asm/dup_neonq_f32.c: New test. * gcc.target/aarch64/sve/acle/asm/dup_neonq_f64.c: New test. * gcc.target/aarch64/sve/acle/asm/dup_neonq_s16.c: New test. * gcc.target/aarch64/sve/acle/asm/dup_neonq_s32.c: New test. * gcc.target/aarch64/sve/acle/asm/dup_neonq_s64.c: New test. * gcc.target/aarch64/sve/acle/asm/dup_neonq_s8.c: New test. * gcc.target/aarch64/sve/acle/asm/dup_neonq_u16.c: New test. * gcc.target/aarch64/sve/acle/asm/dup_neonq_u32.c: New test. * gcc.target/aarch64/sve/acle/asm/dup_neonq_u64.c: New test. * gcc.target/aarch64/sve/acle/asm/dup_neonq_u8.c: New test. * gcc.target/aarch64/sve/acle/asm/get_neonq_bf16.c: New test. * gcc.target/aarch64/sve/acle/asm/get_neonq_f16.c: New test. * gcc.target/aarch64/sve/acle/asm/get_neonq_f32.c: New test. * gcc.target/aarch64/sve/acle/asm/get_neonq_f64.c: New test. * gcc.target/aarch64/sve/acle/asm/get_neonq_s16.c: New test. * gcc.target/aarch64/sve/acle/asm/get_neonq_s32.c: New test. * gcc.target/aarch64/sve/acle/asm/get_neonq_s64.c: New test. * gcc.target/aarch64/sve/acle/asm/get_neonq_s8.c: New test. * gcc.target/aarch64/sve/acle/asm/get_neonq_u16.c: New test. * gcc.target/aarch64/sve/acle/asm/get_neonq_u32.c: New test. * gcc.target/aarch64/sve/acle/asm/get_neonq_u64.c: New test. * gcc.target/aarch64/sve/acle/asm/get_neonq_u8.c: New test. * gcc.target/aarch64/sve/acle/asm/set_neonq_bf16.c: New test. * gcc.target/aarch64/sve/acle/asm/set_neonq_f16.c: New test. * gcc.target/aarch64/sve/acle/asm/set_neonq_f32.c: New test. * gcc.target/aarch64/sve/acle/asm/set_neonq_f64.c: New test. * gcc.target/aarch64/sve/acle/asm/set_neonq_s16.c: New test. * gcc.target/aarch64/sve/acle/asm/set_neonq_s32.c: New test. * gcc.target/aarch64/sve/acle/asm/set_neonq_s64.c: New test. * gcc.target/aarch64/sve/acle/asm/set_neonq_s8.c: New test. * gcc.target/aarch64/sve/acle/asm/set_neonq_u16.c: New test. * gcc.target/aarch64/sve/acle/asm/set_neonq_u32.c: New test. * gcc.target/aarch64/sve/acle/asm/set_neonq_u64.c: New test. * gcc.target/aarch64/sve/acle/asm/set_neonq_u8.c: New test. * gcc.target/aarch64/sve/acle/general-c/dup_neonq_1.c: New test. * gcc.target/aarch64/sve/acle/general-c/get_neonq_1.c: New test. * gcc.target/aarch64/sve/acle/general-c/set_neonq_1.c: New test.
2023-12-07aarch64: Add an early RA for strided registersRichard Sandiford1-1/+1
This pass adds a simple register allocator for FP & SIMD registers. Its main purpose is to make use of SME2's strided LD1, ST1 and LUTI2/4 instructions, which require a very specific grouping structure, and so would be difficult to exploit with general allocation. The allocator is very simple. It gives up on anything that would require spilling, or that it might not handle well for other reasons. The allocator needs to track liveness at the level of individual FPRs. Doing that fixes a lot of the PRs relating to redundant moves caused by structure loads and stores. That particular problem is going to be fixed more generally for GCC 15 by Lehua's RA patches. However, the early-RA pass runs before scheduling, so it has a chance to bag a spill-free allocation of vector code before the scheduler moves things around. It could therefore still be useful for non-SME code (e.g. for hand-scheduled ACLE code) even after Lehua's patches are in. The pass is controlled by a tristate switch: - -mearly-ra=all: run on all functions - -mearly-ra=strided: run on functions that have access to strided registers - -mearly-ra=none: don't run on any function The patch makes -mearly-ra=all the default at -O2 and above for now. We can revisit this for GCC 15 once Lehua's patches are in; -mearly-ra=strided might then be more appropriate. As said previously, the pass is very naive. There's much more that we could do, such as handling invariants better. The main focus is on not committing to a bad allocation, rather than on handling as much as possible. gcc/ PR rtl-optimization/106694 PR rtl-optimization/109078 PR rtl-optimization/109391 * config.gcc: Add aarch64-early-ra.o for AArch64 targets. * config/aarch64/t-aarch64 (aarch64-early-ra.o): New rule. * config/aarch64/aarch64-opts.h (aarch64_early_ra_scope): New enum. * config/aarch64/aarch64.opt (mearly_ra): New option. * doc/invoke.texi: Document it. * common/config/aarch64/aarch64-common.cc (aarch_option_optimization_table): Use -mearly-ra=strided by default for -O2 and above. * config/aarch64/aarch64-passes.def (pass_aarch64_early_ra): New pass. * config/aarch64/aarch64-protos.h (aarch64_strided_registers_p) (make_pass_aarch64_early_ra): Declare. * config/aarch64/aarch64-sme.md (@aarch64_sme_lut<LUTI_BITS><mode>): Add a stride_type attribute. (@aarch64_sme_lut<LUTI_BITS><mode>_strided2): New pattern. (@aarch64_sme_lut<LUTI_BITS><mode>_strided4): Likewise. * config/aarch64/aarch64-sve-builtins-base.cc (svld1_impl::expand) (svldnt1_impl::expand, svst1_impl::expand, svstn1_impl::expand): Handle new way of defining multi-register loads and stores. * config/aarch64/aarch64-sve.md (@aarch64_ld1<SVE_FULLx24:mode>) (@aarch64_ldnt1<SVE_FULLx24:mode>, @aarch64_st1<SVE_FULLx24:mode>) (@aarch64_stnt1<SVE_FULLx24:mode>): Delete. * config/aarch64/aarch64-sve2.md (@aarch64_<LD1_COUNT:optab><mode>) (@aarch64_<LD1_COUNT:optab><mode>_strided2): New patterns. (@aarch64_<LD1_COUNT:optab><mode>_strided4): Likewise. (@aarch64_<ST1_COUNT:optab><mode>): Likewise. (@aarch64_<ST1_COUNT:optab><mode>_strided2): Likewise. (@aarch64_<ST1_COUNT:optab><mode>_strided4): Likewise. * config/aarch64/aarch64.cc (aarch64_strided_registers_p): New function. * config/aarch64/aarch64.md (UNSPEC_LD1_SVE_COUNT): Delete. (UNSPEC_ST1_SVE_COUNT, UNSPEC_LDNT1_SVE_COUNT): Likewise. (UNSPEC_STNT1_SVE_COUNT): Likewise. (stride_type): New attribute. * config/aarch64/constraints.md (Uwd, Uwt): New constraints. * config/aarch64/iterators.md (UNSPEC_LD1_COUNT, UNSPEC_LDNT1_COUNT) (UNSPEC_ST1_COUNT, UNSPEC_STNT1_COUNT): New unspecs. (optab): Handle them. (LD1_COUNT, ST1_COUNT): New iterators. * config/aarch64/aarch64-early-ra.cc: New file. gcc/testsuite/ PR rtl-optimization/106694 PR rtl-optimization/109078 PR rtl-optimization/109391 * gcc.target/aarch64/ldp_stp_16.c (cons4_4_float): Tighten expected output test. * gcc.target/aarch64/sve/shift_1.c: Allow reversed shifts for .s as well as .d. * gcc.target/aarch64/sme/strided_1.c: New test. * gcc.target/aarch64/pr109078.c: Likewise. * gcc.target/aarch64/pr109391.c: Likewise. * gcc.target/aarch64/sve/pr106694.c: Likewise.
2023-12-05aarch64: Add support for <arm_sme.h>Richard Sandiford1-2/+2
This adds support for the SME parts of arm_sme.h. gcc/ * doc/invoke.texi: Document +sme-i16i64 and +sme-f64f64. * config.gcc (aarch64*-*-*): Add arm_sme.h to the list of headers to install and aarch64-sve-builtins-sme.o to the list of objects to build. * config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Define or undefine TARGET_SME, TARGET_SME_I16I64 and TARGET_SME_F64F64. (aarch64_pragma_aarch64): Handle arm_sme.h. * config/aarch64/aarch64-option-extensions.def (sme-i16i64) (sme-f64f64): New extensions. * config/aarch64/aarch64-protos.h (aarch64_sme_vq_immediate) (aarch64_addsvl_addspl_immediate_p, aarch64_output_addsvl_addspl) (aarch64_output_sme_zero_za): Declare. (aarch64_output_move_struct): Delete. (aarch64_sme_ldr_vnum_offset): Declare. (aarch64_sve::handle_arm_sme_h): Likewise. * config/aarch64/aarch64.h (AARCH64_ISA_SM_ON): New macro. (AARCH64_ISA_SME_I16I64, AARCH64_ISA_SME_F64F64): Likewise. (TARGET_STREAMING, TARGET_STREAMING_SME): Likewise. (TARGET_SME_I16I64, TARGET_SME_F64F64): Likewise. * config/aarch64/aarch64.cc (aarch64_sve_rdvl_factor_p): Rename to... (aarch64_sve_rdvl_addvl_factor_p): ...this. (aarch64_sve_rdvl_immediate_p): Update accordingly. (aarch64_rdsvl_immediate_p, aarch64_add_offset): Likewise. (aarch64_sme_vq_immediate): Likewise. Make public. (aarch64_sve_addpl_factor_p): New function. (aarch64_sve_addvl_addpl_immediate_p): Use aarch64_sve_rdvl_addvl_factor_p and aarch64_sve_addpl_factor_p. (aarch64_addsvl_addspl_immediate_p): New function. (aarch64_output_addsvl_addspl): Likewise. (aarch64_cannot_force_const_mem): Return true for RDSVL immediates. (aarch64_classify_index): Handle .Q scaling for VNx1TImode. (aarch64_classify_address): Likewise for vnum offsets. (aarch64_output_sme_zero_za): New function. (aarch64_sme_ldr_vnum_offset_p): Likewise. * config/aarch64/predicates.md (aarch64_addsvl_addspl_immediate): New predicate. (aarch64_pluslong_operand): Include it for SME. * config/aarch64/constraints.md (Ucj, Uav): New constraints. * config/aarch64/iterators.md (VNx1TI_ONLY): New mode iterator. (SME_ZA_I, SME_ZA_SDI, SME_ZA_SDF_I, SME_MOP_BHI): Likewise. (SME_MOP_HSDF): Likewise. (UNSPEC_SME_ADDHA, UNSPEC_SME_ADDVA, UNSPEC_SME_FMOPA) (UNSPEC_SME_FMOPS, UNSPEC_SME_LD1_HOR, UNSPEC_SME_LD1_VER) (UNSPEC_SME_READ_HOR, UNSPEC_SME_READ_VER, UNSPEC_SME_SMOPA) (UNSPEC_SME_SMOPS, UNSPEC_SME_ST1_HOR, UNSPEC_SME_ST1_VER) (UNSPEC_SME_SUMOPA, UNSPEC_SME_SUMOPS, UNSPEC_SME_UMOPA) (UNSPEC_SME_UMOPS, UNSPEC_SME_USMOPA, UNSPEC_SME_USMOPS) (UNSPEC_SME_WRITE_HOR, UNSPEC_SME_WRITE_VER): New unspecs. (elem_bits): Handle x2 and x4 structure modes, plus VNx1TI. (Vetype, Vesize, VPRED): Handle VNx1TI. (b): New mode attribute. (SME_LD1, SME_READ, SME_ST1, SME_WRITE, SME_BINARY_SDI, SME_INT_MOP) (SME_FP_MOP): New int iterators. (optab): Handle SME unspecs. (hv): New int attribute. * config/aarch64/aarch64.md (*add<mode>3_aarch64): Handle ADDSVL and ADDSPL. * config/aarch64/aarch64-sme.md (UNSPEC_SME_LDR): New unspec. (@aarch64_sme_<optab><mode>, @aarch64_sme_<optab><mode>_plus) (aarch64_sme_ldr0, @aarch64_sme_ldrn<mode>): New patterns. (UNSPEC_SME_STR): New unspec. (@aarch64_sme_<optab><mode>, @aarch64_sme_<optab><mode>_plus) (aarch64_sme_str0, @aarch64_sme_strn<mode>): New patterns. (@aarch64_sme_<optab><v_int_container><mode>): Likewise. (*aarch64_sme_<optab><v_int_container><mode>_plus): Likewise. (@aarch64_sme_<optab><VNx1TI_ONLY:mode><SVE_FULL:mode>): Likewise. (@aarch64_sme_<optab><v_int_container><mode>): Likewise. (*aarch64_sme_<optab><v_int_container><mode>_plus): Likewise. (@aarch64_sme_<optab><VNx1TI_ONLY:mode><SVE_FULL:mode>): Likewise. (UNSPEC_SME_ZERO): New unspec. (aarch64_sme_zero): New pattern. (@aarch64_sme_<SME_BINARY_SDI:optab><mode>): Likewise. (@aarch64_sme_<SME_INT_MOP:optab><mode>): Likewise. (@aarch64_sme_<SME_FP_MOP:optab><mode>): Likewise. * config/aarch64/aarch64-sve-builtins.def: Add ZA type suffixes. Include aarch64-sve-builtins-sme.def. (DEF_SME_ZA_FUNCTION): New macro. * config/aarch64/aarch64-sve-builtins.h (CP_READ_ZA): New call property. (CP_WRITE_ZA): Likewise. (PRED_za_m): New predication type. (type_suffix_index): Handle DEF_SME_ZA_SUFFIX. (type_suffix_info): Add vector_p and za_p fields. (function_instance::num_za_tiles): New member function. (function_builder::get_attributes): Add an aarch64_feature_flags argument. (function_expander::get_contiguous_base): Take a base argument number, a vnum argument number, and an argument that indicates whether the vnum parameter is a factor of the SME vector length or the prevailing vector length. (function_expander::add_integer_operand): Take a poly_int64. (sve_switcher::sve_switcher): Take a base set of flags. (sme_switcher): New class. (scalar_types): Add a null entry for NUM_VECTOR_TYPES. * config/aarch64/aarch64-sve-builtins.cc: Include aarch64-sve-builtins-sme.h. (pred_suffixes): Add an entry for PRED_za_m. (type_suffixes): Initialize vector_p and za_p. Handle ZA suffixes. (TYPES_all_za, TYPES_d_za, TYPES_za_bhsd_data, TYPES_za_all_data) (TYPES_za_s_integer, TYPES_za_d_integer, TYPES_mop_base) (TYPES_mop_base_signed, TYPES_mop_base_unsigned, TYPES_mop_i16i64) (TYPES_mop_i16i64_signed, TYPES_mop_i16i64_unsigned, TYPES_za): New type suffix macros. (preds_m, preds_za_m): New predication lists. (function_groups): Handle DEF_SME_ZA_FUNCTION. (scalar_types): Add an entry for NUM_VECTOR_TYPES. (find_type_suffix_for_scalar_type): Check positively for vectors rather than negatively for predicates. (check_required_extensions): Handle PSTATE.SM and PSTATE.ZA requirements. (report_out_of_range): Handle the case where the minimum and maximum are the same. (function_instance::reads_global_state_p): Return true for functions that read ZA. (function_instance::modifies_global_state_p): Return true for functions that write to ZA. (sve_switcher::sve_switcher): Add a base flags argument. (function_builder::get_name): Handle "__arm_" prefixes. (add_attribute): Add an overload that takes a namespaces. (add_shared_state_attribute): New function. (function_builder::get_attributes): Take the required feature flags as argument. Add streaming and ZA attributes where appropriate. (function_builder::add_unique_function): Update calls accordingly. (function_resolver::check_gp_argument): Assert that the predication isn't ZA _m predication. (function_checker::function_checker): Don't bias the argument number for ZA _m predication. (function_expander::get_contiguous_base): Add arguments that specify the base argument number, the vnum argument number, and an argument that indicates whether the vnum parameter is a factor of the SME vector length or the prevailing vector length. Handle the SME case. (function_expander::add_input_operand): Handle pmode_register_operand. (function_expander::add_integer_operand): Take a poly_int64. (init_builtins): Call handle_arm_sme_h for LTO. (handle_arm_sve_h): Skip SME intrinsics. (handle_arm_sme_h): New function. * config/aarch64/aarch64-sve-builtins-functions.h (read_write_za, write_za): New classes. (unspec_based_sme_function, za_arith_function): New using aliases. (quiet_za_arith_function): Likewise. * config/aarch64/aarch64-sve-builtins-shapes.h (binary_za_int_m, binary_za_m, binary_za_uint_m, bool_inherent) (inherent_za, inherent_mask_za, ldr_za, load_za, read_za_m, store_za) (str_za, unary_za_m, write_za_m): Declare. * config/aarch64/aarch64-sve-builtins-shapes.cc (apply_predication): Expect za_m functions to have an existing governing predicate. (binary_za_m_base, binary_za_int_m_def, binary_za_m_def): New classes. (binary_za_uint_m_def, bool_inherent_def, inherent_za_def): Likewise. (inherent_mask_za_def, ldr_za_def, load_za_def, read_za_m_def) (store_za_def, str_za_def, unary_za_m_def, write_za_m_def): Likewise. * config/aarch64/arm_sme.h: New file. * config/aarch64/aarch64-sve-builtins-sme.h: Likewise. * config/aarch64/aarch64-sve-builtins-sme.cc: Likewise. * config/aarch64/aarch64-sve-builtins-sme.def: Likewise. * config/aarch64/t-aarch64 (aarch64-sve-builtins.o): Depend on aarch64-sve-builtins-sme.def and aarch64-sve-builtins-sme.h. (aarch64-sve-builtins-sme.o): New rule. gcc/testsuite/ * lib/target-supports.exp: Add sme and sme-i16i64 features. * gcc.target/aarch64/pragma_cpp_predefs_4.c: Test __ARM_FEATURE_SME* macros. * gcc.target/aarch64/sve/acle/asm/test_sve_acle.h: Allow functions to be marked as __arm_streaming, __arm_streaming_compatible, and __arm_inout("za"). * g++.target/aarch64/sve/acle/general-c++/func_redef_4.c: Mark the function as __arm_streaming_compatible. * g++.target/aarch64/sve/acle/general-c++/func_redef_5.c: Likewise. * g++.target/aarch64/sve/acle/general-c++/func_redef_7.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/func_redef_4.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/func_redef_5.c: Likewise. * g++.target/aarch64/sme/aarch64-sme-acle-asm.exp: New test harness. * gcc.target/aarch64/sme/aarch64-sme-acle-asm.exp: Likewise. * gcc.target/aarch64/sve/acle/general-c/binary_za_int_m_1.c: New test. * gcc.target/aarch64/sve/acle/general-c/binary_za_m_1.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/binary_za_m_2.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/binary_za_uint_m_1.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/read_za_m_1.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/unary_za_m_1.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/write_za_m_1.c: Likewise.
2023-11-27GCN: Remove 'last_arg' spec functionThomas Schwinge1-1/+0
The LLVM 13.0.1 assembler ('llvm-mc') indeed still does complain in presence of multiple '-mcpu=[...]' options: as: for the --mcpu option: may only occur zero or one times! However, as of "GCN: Tag '-march=[...]', '-mtune=[...]' as 'Negative' of themselves [PR112669]", the GCC-side special handling is no longer necessary. gcc/ * config.gcc <amdgcn-*-amdhsa> (extra_gcc_objs): Don't set. * config/gcn/driver-gcn.cc: Remove. * config/gcn/gcn-hsa.h (ASM_SPEC, EXTRA_SPEC_FUNCTIONS): Remove 'last_arg' spec function. * config/gcn/t-gcn-hsa (driver-gcn.o): Remove.
2023-11-27hurd: Add multilib paths for gnu-x86_64Samuel Thibault1-0/+3
We need the multilib paths in gcc to find e.g. glibc crt files on Debian. This is essentially based on t-linux64 version. gcc/ChangeLog: * config/i386/t-gnu64: New file. * config.gcc [x86_64-*-gnu*]: Add i386/t-gnu64 to tmake_file.
2023-11-27mips: Fix up mips*-sde-elf* build [PR112300]Jakub Jelinek1-6/+6
As reported in the PR, mipsisa64r2-sde-elf doesn't build because HEAP_TRAMPOLINES_INIT macro isn't defined anywhere. It is normally defined by # Figure out if we need to enable heap trampolines by default case ${target} in *-*-darwin2*) # Currently, we do this for macOS 11 and above. tm_defines="$tm_defines HEAP_TRAMPOLINES_INIT=1" ;; *) tm_defines="$tm_defines HEAP_TRAMPOLINES_INIT=0" ;; esac in config.gcc, but mips*-sde-elf* is the only target which overwrites tm_defines shell variable rather than just appending to it (or in one case prepending), all other targets append something to it, including other mips* triplets. I believe (just from looking at config.gcc) that the difference is that LIBC_GLIBC=1 LIBC_UCLIBC=2 LIBC_BIONIC=3 LIBC_MUSL=4 HEAP_TRAMPOLINES_INIT=0 isn't defined without the patch and is with the patch. I think defining those first 4 shouldn't cause any harm and defining the last one is required for it to actually build at all. 2023-11-27 Jakub Jelinek <jakub@redhat.com> PR target/112300 * config.gcc (mips*-sde-elf*): Append to tm_defines rather than overwriting them.
2023-11-27RISC-V: Initial RV64E and LP64E supportTsukasa OI1-4/+6
Along with RV32E, RV64E is ratified. Though ILP32E and LP64E ABIs are still draft, it's worth supporting it. gcc/ChangeLog: * common/config/riscv/riscv-common.cc (riscv_ext_version_table): Set version to ratified 2.0. (riscv_subset_list::parse_std_ext): Allow RV64E. * config.gcc: Parse base ISA 'rv64e' and ABI 'lp64e'. * config/riscv/arch-canonicalize: Parse base ISA 'rv64e'. * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Define different macro per XLEN. Add handling for ABI_LP64E. * config/riscv/riscv-d.cc (riscv_d_handle_target_float_abi): Add handling for ABI_LP64E. * config/riscv/riscv-opts.h (enum riscv_abi_type): Add ABI_LP64E. * config/riscv/riscv.cc (riscv_option_override): Enhance error handling to support RV64E and LP64E. (riscv_conditional_register_usage): Change "RV32E" in a comment to "RV32E/RV64E". * config/riscv/riscv.h (UNITS_PER_FP_ARG): Add handling for ABI_LP64E. (STACK_BOUNDARY): Ditto. (ABI_STACK_BOUNDARY): Ditto. (MAX_ARGS_IN_REGISTERS): Ditto. (ABI_SPEC): Add support for "lp64e". * config/riscv/riscv.opt: Parse -mabi=lp64e as ABI_LP64E. * doc/invoke.texi: Add documentation of the LP64E ABI. gcc/testsuite/ChangeLog: * gcc.target/riscv/predef-1.c: Test for __riscv_64e. * gcc.target/riscv/predef-2.c: Ditto. * gcc.target/riscv/predef-3.c: Ditto. * gcc.target/riscv/predef-4.c: Ditto. * gcc.target/riscv/predef-5.c: Ditto. * gcc.target/riscv/predef-6.c: Ditto. * gcc.target/riscv/predef-7.c: Ditto. * gcc.target/riscv/predef-8.c: Ditto. * gcc.target/riscv/predef-9.c: New test for RV64E and LP64E, based on predef-7.c.
2023-11-27bpf: remove bpf-helpers.hJose E. Marchesi1-1/+0
Now that we are finally able to use the kernel provided bpf_helpers.h file and associated machinery, there is no longer need to distribute our own version. This patch removes bpf-helpers.h and deletes most of the associated tests from the gcc.target/bpf testsuite. Two tests are adapted and retained: one testing the kernel_helper attribute, which is still useful, and the other making sure that proper constant propagation is performed with -O2, which is necessary to use the helpers defined as static pointers in the kernel's bpf_helpers.h. Regtested in target bpf-unknown-none and host x86_64-linux-gnu. gcc/ChangeLog * config/bpf/bpf-helpers.h: Remove. * config.gcc: Adapt accordingly. gcc/testsuite/ChangeLog * gcc.target/bpf/helper-bind.c: Do not include bpf-helpers.h. * gcc.target/bpf/helper-skb-ancestor-cgroup-id.c: Likewise, and renamed from skb-ancestor-cgroup-id.c. * gcc.target/bpf/helper-bpf-redirect.c: Remove. * gcc.target/bpf/helper-clone-redirect.c: Likewise. * gcc.target/bpf/helper-csum-diff.c: Likewise. * gcc.target/bpf/helper-csum-update.c: Likewise. * gcc.target/bpf/helper-current-task-under-cgroup.c: Likewise. * gcc.target/bpf/helper-fib-lookup.c: Likewise. * gcc.target/bpf/helper-get-cgroup-classid.c: Likewise. * gcc.target/bpf/helper-get-current-cgroup-id.c: Likewise. * gcc.target/bpf/helper-get-current-comm.c: Likewise. * gcc.target/bpf/helper-get-current-pid-tgid.c: Likewise. * gcc.target/bpf/helper-get-current-task.c: Likewise. * gcc.target/bpf/helper-get-current-uid-gid.c: Likewise. * gcc.target/bpf/helper-get-hash-recalc.c: Likewise. * gcc.target/bpf/helper-get-listener-sock.c: Likewise. * gcc.target/bpf/helper-get-local-storage.c: Likewise. * gcc.target/bpf/helper-get-numa-node-id.c: Likewise. * gcc.target/bpf/helper-get-prandom-u32.c: Likewise. * gcc.target/bpf/helper-get-route-realm.c: Likewise. * gcc.target/bpf/helper-get-smp-processor-id.c: Likewise. * gcc.target/bpf/helper-get-socket-cookie.c: Likewise. * gcc.target/bpf/helper-get-socket-uid.c: Likewise. * gcc.target/bpf/helper-get-stack.c: Likewise. * gcc.target/bpf/helper-get-stackid.c: Likewise. * gcc.target/bpf/helper-getsockopt.c: Likewise. * gcc.target/bpf/helper-ktime-get-ns.c: Likewise. * gcc.target/bpf/helper-l3-csum-replace.c: Likewise. * gcc.target/bpf/helper-l4-csum-replace.c: Likewise. * gcc.target/bpf/helper-lwt-push-encap.c: Likewise. * gcc.target/bpf/helper-lwt-seg6-action.c: Likewise. * gcc.target/bpf/helper-lwt-seg6-adjust-srh.c: Likewise. * gcc.target/bpf/helper-lwt-seg6-store-bytes.c: Likewise. * gcc.target/bpf/helper-map-delete-elem.c: Likewise. * gcc.target/bpf/helper-map-lookup-elem.c: Likewise. * gcc.target/bpf/helper-map-peek-elem.c: Likewise. * gcc.target/bpf/helper-map-pop-elem.c: Likewise. * gcc.target/bpf/helper-map-push-elem.c: Likewise. * gcc.target/bpf/helper-map-update-elem.c: Likewise. * gcc.target/bpf/helper-msg-apply-bytes.c: Likewise. * gcc.target/bpf/helper-msg-cork-bytes.c: Likewise. * gcc.target/bpf/helper-msg-pop-data.c: Likewise. * gcc.target/bpf/helper-msg-pull-data.c: Likewise. * gcc.target/bpf/helper-msg-push-data.c: Likewise. * gcc.target/bpf/helper-msg-redirect-hash.c: Likewise. * gcc.target/bpf/helper-msg-redirect-map.c: Likewise. * gcc.target/bpf/helper-override-return.c: Likewise. * gcc.target/bpf/helper-perf-event-output.c: Likewise. * gcc.target/bpf/helper-perf-event-read-value.c: Likewise. * gcc.target/bpf/helper-perf-event-read.c: Likewise. * gcc.target/bpf/helper-perf-prog-read-value.c: Likewise. * gcc.target/bpf/helper-probe-read-str.c: Likewise. * gcc.target/bpf/helper-probe-read.c: Likewise. * gcc.target/bpf/helper-probe-write-user.c: Likewise. * gcc.target/bpf/helper-rc-keydown.c: Likewise. * gcc.target/bpf/helper-rc-pointer-rel.c: Likewise. * gcc.target/bpf/helper-rc-repeat.c: Likewise. * gcc.target/bpf/helper-redirect-map.c: Likewise. * gcc.target/bpf/helper-set-hash-invalid.c: Likewise. * gcc.target/bpf/helper-set-hash.c: Likewise. * gcc.target/bpf/helper-setsockopt.c: Likewise. * gcc.target/bpf/helper-sk-fullsock.c: Likewise. * gcc.target/bpf/helper-sk-lookup-tcp.c: Likewise. * gcc.target/bpf/helper-sk-lookup-upd.c: Likewise. * gcc.target/bpf/helper-sk-redirect-hash.c: Likewise. * gcc.target/bpf/helper-sk-redirect-map.c: Likewise. * gcc.target/bpf/helper-sk-release.c: Likewise. * gcc.target/bpf/helper-sk-select-reuseport.c: Likewise. * gcc.target/bpf/helper-sk-storage-delete.c: Likewise. * gcc.target/bpf/helper-sk-storage-get.c: Likewise. * gcc.target/bpf/helper-skb-adjust-room.c: Likewise. * gcc.target/bpf/helper-skb-cgroup-id.c: Likewise. * gcc.target/bpf/helper-skb-change-head.c: Likewise. * gcc.target/bpf/helper-skb-change-proto.c: Likewise. * gcc.target/bpf/helper-skb-change-tail.c: Likewise. * gcc.target/bpf/helper-skb-change-type.c: Likewise. * gcc.target/bpf/helper-skb-ecn-set-ce.c: Likewise. * gcc.target/bpf/helper-skb-get-tunnel-key.c: Likewise. * gcc.target/bpf/helper-skb-get-tunnel-opt.c: Likewise. * gcc.target/bpf/helper-skb-get-xfrm-state.c: Likewise. * gcc.target/bpf/helper-skb-load-bytes-relative.c: Likewise. * gcc.target/bpf/helper-skb-load-bytes.c: Likewise. * gcc.target/bpf/helper-skb-pull-data.c: Likewise. * gcc.target/bpf/helper-skb-set-tunnel-key.c: Likewise. * gcc.target/bpf/helper-skb-set-tunnel-opt.c: Likewise. * gcc.target/bpf/helper-skb-store-bytes.c: Likewise. * gcc.target/bpf/helper-skb-under-cgroup.c: Likewise. * gcc.target/bpf/helper-skb-vlan-pop.c: Likewise. * gcc.target/bpf/helper-skb-vlan-push.c: Likewise. * gcc.target/bpf/helper-skc-lookup-tcp.c: Likewise. * gcc.target/bpf/helper-sock-hash-update.c: Likewise. * gcc.target/bpf/helper-sock-map-update.c: Likewise. * gcc.target/bpf/helper-sock-ops-cb-flags-set.c: Likewise. * gcc.target/bpf/helper-spin-lock.c: Likewise. * gcc.target/bpf/helper-spin-unlock.c: Likewise. * gcc.target/bpf/helper-strtol.c: Likewise. * gcc.target/bpf/helper-strtoul.c: Likewise. * gcc.target/bpf/helper-sysctl-get-current-value.c: Likewise. * gcc.target/bpf/helper-sysctl-get-name.c: Likewise. * gcc.target/bpf/helper-sysctl-get-new-value.c: Likewise. * gcc.target/bpf/helper-sysctl-set-new-value.c: Likewise. * gcc.target/bpf/helper-tail-call.c: Likewise. * gcc.target/bpf/helper-tcp-check-syncookie.c: Likewise. * gcc.target/bpf/helper-tcp-sock.c: Likewise. * gcc.target/bpf/helper-trace-printk.c: Likewise. * gcc.target/bpf/helper-xdp-adjust-head.c: Likewise. * gcc.target/bpf/helper-xdp-adjust-meta.c: Likewise. * gcc.target/bpf/helper-xdp-adjust-tail.c: Likewise. * gcc.target/bpf/skb-ancestor-cgroup-id.c: Likewise.
2023-11-18LoongArch: Add LA664 support.Lulu Cheng1-5/+5
Define ISA_BASE_LA64V110, which represents the base instruction set defined in LoongArch1.1. Support the configure setting --with-arch =la664, and support -march=la664,-mtune=la664. gcc/ChangeLog: * config.gcc: Support LA664. * config/loongarch/genopts/loongarch-strings: Likewise. * config/loongarch/genopts/loongarch.opt.in: Likewise. * config/loongarch/loongarch-cpu.cc (fill_native_cpu_config): Likewise. * config/loongarch/loongarch-def.c: Likewise. * config/loongarch/loongarch-def.h (N_ISA_BASE_TYPES): Likewise. (ISA_BASE_LA64V110): Define macro. (N_ARCH_TYPES): Update value. (N_TUNE_TYPES): Update value. (CPU_LA664): New macro. * config/loongarch/loongarch-opts.cc (isa_default_abi): Likewise. (isa_base_compat_p): Likewise. * config/loongarch/loongarch-opts.h (TARGET_64BIT): This parameter is enabled when la_target.isa.base is equal to ISA_BASE_LA64V100 or ISA_BASE_LA64V110. (TARGET_uARCH_LA664): Define macro. * config/loongarch/loongarch-str.h (STR_CPU_LA664): Likewise. * config/loongarch/loongarch.cc (loongarch_cpu_sched_reassociation_width): Add LA664 support. * config/loongarch/loongarch.opt: Regenerate.
2023-11-16RISC-V: Implement target attributeKito Cheng1-1/+1
The target attribute which proposed in [1], target attribute allow user to specify a local setting per-function basis. The syntax of target attribute is `__attribute__((target("<ATTR-STRING>")))`. and the syntax of `<ATTR-STRING>` describes below: ``` ATTR-STRING := ATTR-STRING ';' ATTR | ATTR ATTR := ARCH-ATTR | CPU-ATTR | TUNE-ATTR ARCH-ATTR := 'arch=' EXTENSIONS-OR-FULLARCH EXTENSIONS-OR-FULLARCH := <EXTENSIONS> | <FULLARCHSTR> EXTENSIONS := <EXTENSION> ',' <EXTENSIONS> | <EXTENSION> FULLARCHSTR := <full-arch-string> EXTENSION := <OP> <EXTENSION-NAME> <VERSION> OP := '+' VERSION := [0-9]+ 'p' [0-9]+ | [1-9][0-9]* | EXTENSION-NAME := Naming rule is defined in RISC-V ISA manual CPU-ATTR := 'cpu=' <valid-cpu-name> TUNE-ATTR := 'tune=' <valid-tune-name> ``` Changes since v1: - Use std::unique_ptr rather than alloca to prevent memory issue. - Error rather than warning when attribute duplicated. [1] https://github.com/riscv-non-isa/riscv-c-api-doc/pull/35 gcc/ChangeLog: * config.gcc (riscv): Add riscv-target-attr.o. * config/riscv/riscv-protos.h (riscv_declare_function_size) New. (riscv_option_valid_attribute_p): New. (riscv_override_options_internal): New. (struct riscv_tune_info): New. (riscv_parse_tune): New. * config/riscv/riscv-target-attr.cc (class riscv_target_attr_parser): New. (struct riscv_attribute_info): New. (riscv_attributes): New. (riscv_target_attr_parser::parse_arch): New. (riscv_target_attr_parser::handle_arch): New. (riscv_target_attr_parser::handle_cpu): New. (riscv_target_attr_parser::handle_tune): New. (riscv_target_attr_parser::update_settings): New. (riscv_process_one_target_attr): New. (num_occurences_in_str): New. (riscv_process_target_attr): New. (riscv_option_valid_attribute_p): New. * config/riscv/riscv.cc: Include target-globals.h and riscv-subset.h. (struct riscv_tune_info): Move to riscv-protos.h. (get_tune_str): New. (riscv_parse_tune): New parameter null_p. (riscv_declare_function_size): New. (riscv_option_override): Build target_option_default_node and target_option_current_node. (riscv_save_restore_target_globals): New. (riscv_option_restore): New. (riscv_previous_fndecl): New. (riscv_set_current_function): Apply the target attribute. (TARGET_OPTION_RESTORE): Define. (TARGET_OPTION_VALID_ATTRIBUTE_P): Ditto. * config/riscv/riscv.h (SWITCHABLE_TARGET): Define to 1. (ASM_DECLARE_FUNCTION_SIZE) Define. * config/riscv/riscv.opt (mtune=): Add Save attribute. (mcpu=): Ditto. (mcmodel=): Ditto. * config/riscv/t-riscv: Add build rule for riscv-target-attr.o * doc/extend.texi: Add doc for target attribute. gcc/testsuite/ChangeLog: * gcc.target/riscv/target-attr-01.c: New. * gcc.target/riscv/target-attr-02.c: Ditto. * gcc.target/riscv/target-attr-03.c: Ditto. * gcc.target/riscv/target-attr-04.c: Ditto. * gcc.target/riscv/target-attr-05.c: Ditto. * gcc.target/riscv/target-attr-06.c: Ditto. * gcc.target/riscv/target-attr-07.c: Ditto. * gcc.target/riscv/target-attr-bad-01.c: Ditto. * gcc.target/riscv/target-attr-bad-02.c: Ditto. * gcc.target/riscv/target-attr-bad-03.c: Ditto. * gcc.target/riscv/target-attr-bad-04.c: Ditto. * gcc.target/riscv/target-attr-bad-05.c: Ditto. * gcc.target/riscv/target-attr-bad-06.c: Ditto. * gcc.target/riscv/target-attr-bad-07.c: Ditto. * gcc.target/riscv/target-attr-bad-08.c: Ditto. * gcc.target/riscv/target-attr-bad-09.c: Ditto. * gcc.target/riscv/target-attr-bad-10.c: Ditto. Reviewed-by: Christoph Müllner <christoph.muellner@vrull.eu>
2023-11-14IBM Z: Add GTY marker to builtin data structuresAndreas Krebbel1-0/+1
This adds GTY markers to s390_builtin_types, s390_builtin_fn_types, and s390_builtin_decls. These were missing causing problems in particular when using builtins after including a precompiled header. Unfortunately the declaration of these data structures use enum values from s390-builtins.h. This file however is not included everywhere and is rather large. In order to include it only for the purpose of gtype-desc.cc we place a preprocessed copy of it in the build directory and include only this. This is going to be backported to GCC 12 and 13. gcc/ChangeLog: * config.gcc: Add s390-gen-builtins.h to target_gtfiles. * config/s390/s390-builtins.h (s390_builtin_types) (s390_builtin_fn_types, s390_builtin_decls): Add GTY marker. * config/s390/t-s390 (EXTRA_GTYPE_DEPS): Add s390-gen-builtins.h. Add build rule for s390-gen-builtins.h.
2023-10-30i386: Zhaoxin yongfeng enablementMayshao1-2/+10
Enable -march/-mtune=yongfeng. Costs and tunings are set according to the characteristics of the processor. Add a new .md file to describe yongfeng processor. gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_zhaoxin_cpu): Recognize yongfeng. * common/config/i386/i386-common.cc: Add yongfeng. * common/config/i386/i386-cpuinfo.h (enum processor_subtypes): Add ZHAOXIN_FAM7H_YONGFENG. * config.gcc: Add yongfeng. * config/i386/driver-i386.cc (host_detect_local_cpu): Let -march=native recognize yongfeng processors. * config/i386/i386-c.cc (ix86_target_macros_internal): Add yongfeng. * config/i386/i386-options.cc (m_YONGFENG): New definition. (m_ZHAOXIN): Ditto. * config/i386/i386.h (enum processor_type): Add PROCESSOR_YONGFENG. * config/i386/i386.md: Add yongfeng. * config/i386/lujiazui.md: Fix typo. * config/i386/x86-tune-costs.h (struct processor_costs): Add yongfeng costs. * config/i386/x86-tune-sched.cc (ix86_issue_rate): Add yongfeng. (ix86_adjust_cost): Ditto. * config/i386/x86-tune.def (X86_TUNE_SCHEDULE): Replace m_LUJIAZUI with m_ZHAOXIN. (X86_TUNE_PARTIAL_REG_DEPENDENCY): Ditto. (X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY): Ditto. (X86_TUNE_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY): Ditto. (X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY): Ditto. (X86_TUNE_MOVX): Ditto. (X86_TUNE_MEMORY_MISMATCH_STALL): Ditto. (X86_TUNE_FUSE_CMP_AND_BRANCH_32): Ditto. (X86_TUNE_FUSE_CMP_AND_BRANCH_64): Ditto. (X86_TUNE_FUSE_CMP_AND_BRANCH_SOFLAGS): Ditto. (X86_TUNE_FUSE_ALU_AND_BRANCH): Ditto. (X86_TUNE_ACCUMULATE_OUTGOING_ARGS): Ditto. (X86_TUNE_USE_LEAVE): Ditto. (X86_TUNE_PUSH_MEMORY): Ditto. (X86_TUNE_LCP_STALL): Ditto. (X86_TUNE_INTEGER_DFMODE_MOVES): Ditto. (X86_TUNE_OPT_AGU): Ditto. (X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB): Ditto. (X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES): Ditto. (X86_TUNE_USE_SAHF): Ditto. (X86_TUNE_USE_BT): Ditto. (X86_TUNE_AVOID_FALSE_DEP_FOR_BMI): Ditto. (X86_TUNE_ONE_IF_CONV_INSN): Ditto. (X86_TUNE_AVOID_MFENCE): Ditto. (X86_TUNE_EXPAND_ABS): Ditto. (X86_TUNE_USE_SIMODE_FIOP): Ditto. (X86_TUNE_USE_FFREEP): Ditto. (X86_TUNE_EXT_80387_CONSTANTS): Ditto. (X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL): Ditto. (X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL): Ditto. (X86_TUNE_SSE_TYPELESS_STORES): Ditto. (X86_TUNE_SSE_LOAD0_BY_PXOR): Ditto. (X86_TUNE_USE_GATHER_2PARTS): Add m_YONGFENG. (X86_TUNE_USE_GATHER_4PARTS): Ditto. (X86_TUNE_USE_GATHER_8PARTS): Ditto. (X86_TUNE_AVOID_128FMA_CHAINS): Ditto. * doc/extend.texi: Add details about yongfeng. * doc/invoke.texi: Ditto. * config/i386/yongfeng.md: New file to describe yongfeng processor. gcc/testsuite/ChangeLog: * g++.target/i386/mv32.C: Handle new -march. * gcc.target/i386/funcspec-56.inc: Ditto.
2023-10-27RISC-V: Add AVL propagation PASS for RVV auto-vectorizationJuzhe-Zhong1-1/+1
This patch addresses the redundant AVL/VL toggling in RVV partial auto-vectorization which is a known issue for a long time and I finally find the time to address it. Consider a simple vector addition operation: https://godbolt.org/z/7hfGfEjW3 void foo (int *__restrict a, int *__restrict b, int *__restrict n) { for (int i = 0; i < n; i++) a[i] = a[i] + b[i]; } Optimized IR: Loop body: _38 = .SELECT_VL (ivtmp_36, POLY_INT_CST [4, 4]); -> vsetvli a5,a2,e8,mf4,ta,ma ... vect__4.8_27 = .MASK_LEN_LOAD (vectp_a.6_29, 32B, { -1, ... }, _38, 0); -> vle32.v v2,0(a0) vect__6.11_20 = .MASK_LEN_LOAD (vectp_b.9_25, 32B, { -1, ... }, _38, 0); -> vle32.v v1,0(a1) vect__7.12_19 = vect__6.11_20 + vect__4.8_27; -> vsetvli a6,zero,e32,m1,ta,ma + vadd.vv v1,v1,v2 .MASK_LEN_STORE (vectp_a.13_11, 32B, { -1, ... }, _38, 0, vect__7.12_19); -> vsetvli zero,a5,e32,m1,ta,ma + vse32.v v1,0(a4) We can see 2 redundant vsetvls inside the loop body due to AVL/VL toggling. The AVL/VL toggling is because we are missing LEN information in simple PLUS_EXPR GIMPLE assignment: vect__7.12_19 = vect__6.11_20 + vect__4.8_27; GCC apply partial predicate load/store and un-predicated full vector operation on partial vectorization. Such flow are used by all other targets like ARM SVE (RVV also uses such flow): ARM SVE: .L3: ld1w z30.s, p7/z, [x0, x3, lsl 2] -> predicated load ld1w z31.s, p7/z, [x1, x3, lsl 2] -> predicated load add z31.s, z31.s, z30.s -> un-predicated add st1w z31.s, p7, [x0, x3, lsl 2] -> predicated store Such vectorization flow causes AVL/VL toggling on RVV so we need AVL propagation PASS for it. Also, It's very unlikely that we can apply predicated operations on all vectorization for following reasons: 1. It's very heavy workload to support them on all vectorization and we don't see any benefits if we can handle that on targets backend. 2. Changing Loop vectorizer for it will make code base ugly and hard to maintain. 3. We will need so many patterns for all operations. Not only COND_LEN_ADD, COND_LEN_SUB, .... We also need COND_LEN_EXTEND, ...., COND_LEN_CEIL, ... .. over 100+ patterns, unreasonable number of patterns. To conclude, we prefer un-predicated operations here, and design a nice and clean AVL propagation PASS for it to elide the redundant vsetvls due to AVL/VL toggling. The second question is that why we separate a PASS called AVL propagation. Why not optimize it in VSETVL PASS (We definitetly can optimize AVL in VSETVL PASS) Frankly, I was planning to address such issue in VSETVL PASS that's why we recently refactored VSETVL PASS. However, I changed my mind recently after several experiments and tries. The reasons as follows: 1. For code base management and maintainience. Current VSETVL PASS is complicated enough and aleady has enough aggressive and fancy optimizations which turns out it can always generate optimal codegen in most of the cases. It's not a good idea keep adding more features into VSETVL PASS to make VSETVL PASS become heavy and heavy again, then we will need to refactor it again in the future. Actuall, the VSETVL PASS is very stable and optimal after the recent refactoring. Hopefully, we should not change VSETVL PASS any more except the minor fixes. 2. vsetvl insertion (VSETVL PASS does this thing) and AVL propagation are 2 different things, I don't think we should fuse them into same PASS. 3. VSETVL PASS is an post-RA PASS, wheras AVL propagtion should be done before RA which can reduce register allocation. 4. This patch's AVL propagation PASS only does AVL propagation for RVV partial auto-vectorization situations. This patch's codes are only hundreds lines which is very managable and can be very easily extended features and enhancements. We can easily extend and enhance more AVL propagation in a clean and separate PASS in the future. (If we do it on VSETVL PASS, we will complicate VSETVL PASS again which is already so complicated.) Here is an example to demonstrate more: https://godbolt.org/z/bE86sv3q5 void foo2 (int *__restrict a, int *__restrict b, int *__restrict c, int *__restrict a2, int *__restrict b2, int *__restrict c2, int *__restrict a3, int *__restrict b3, int *__restrict c3, int *__restrict a4, int *__restrict b4, int *__restrict c4, int *__restrict a5, int *__restrict b5, int *__restrict c5, int n) { for (int i = 0; i < n; i++){ a[i] = b[i] + c[i]; b5[i] = b[i] + c[i]; a2[i] = b2[i] + c2[i]; a3[i] = b3[i] + c3[i]; a4[i] = b4[i] + c4[i]; a5[i] = a[i] + a4[i]; a[i] = a5[i] + b5[i]+ a[i]; a[i] = a[i] + c[i]; b5[i] = a[i] + c[i]; a2[i] = a[i] + c2[i]; a3[i] = a[i] + c3[i]; a4[i] = a[i] + c4[i]; a5[i] = a[i] + a4[i]; a[i] = a[i] + b5[i]+ a[i]; } } 1. Loop Body: Before this patch: After this patch: vsetvli a4,t1,e8,mf4,ta,ma vsetvli a4,t1,e32,m1,ta,ma vle32.v v2,0(a2) vle32.v v2,0(a2) vle32.v v4,0(a1) vle32.v v3,0(t2) vle32.v v1,0(t2) vle32.v v4,0(a1) vsetvli a7,zero,e32,m1,ta,ma vle32.v v1,0(t0) vadd.vv v4,v2,v4 vadd.vv v4,v2,v4 vsetvli zero,a4,e32,m1,ta,ma vadd.vv v1,v3,v1 vle32.v v3,0(s0) vadd.vv v1,v1,v4 vsetvli a7,zero,e32,m1,ta,ma vadd.vv v1,v1,v4 vadd.vv v1,v3,v1 vadd.vv v1,v1,v4 vadd.vv v1,v1,v4 vadd.vv v1,v1,v2 vadd.vv v1,v1,v4 vadd.vv v2,v1,v2 vadd.vv v1,v1,v4 vse32.v v2,0(t5) vsetvli zero,a4,e32,m1,ta,ma vadd.vv v2,v2,v1 vle32.v v4,0(a5) vadd.vv v2,v2,v1 vsetvli a7,zero,e32,m1,ta,ma slli a7,a4,2 vadd.vv v1,v1,v2 vadd.vv v3,v1,v3 vadd.vv v2,v1,v2 vle32.v v5,0(a5) vadd.vv v4,v1,v4 vle32.v v6,0(t6) vsetvli zero,a4,e32,m1,ta,ma vse32.v v3,0(t3) vse32.v v2,0(t5) vse32.v v2,0(a0) vse32.v v4,0(a3) vadd.vv v3,v3,v1 vsetvli a7,zero,e32,m1,ta,ma vadd.vv v2,v1,v5 vadd.vv v3,v1,v3 vse32.v v3,0(t4) vadd.vv v2,v2,v1 vadd.vv v1,v1,v6 vadd.vv v2,v2,v1 vse32.v v2,0(a3) vsetvli zero,a4,e32,m1,ta,ma vse32.v v1,0(a6) vse32.v v2,0(a0) vse32.v v3,0(t3) vle32.v v2,0(t0) vsetvli a7,zero,e32,m1,ta,ma vadd.vv v3,v3,v1 vsetvli zero,a4,e32,m1,ta,ma vse32.v v3,0(t4) vsetvli a7,zero,e32,m1,ta,ma slli a7,a4,2 vadd.vv v1,v1,v2 sub t1,t1,a4 vsetvli zero,a4,e32,m1,ta,ma vse32.v v1,0(a6) It's quite obvious, all heavy && redundant vsetvls inside loop body are eliminated. 2. Epilogue: Before this patch: After this patch: .L5: .L5: ld s0,8(sp) ret addi sp,sp,16 jr ra This is the benefit we do the AVL propation before RA since we eliminate the use of 'a7' register which is used by the redudant AVL/VL toggling instruction: 'vsetvli a7,zero,e32,m1,ta,ma' The final codegen after this patch: foo2: lw t1,56(sp) ld t6,0(sp) ld t3,8(sp) ld t0,16(sp) ld t2,24(sp) ld t4,32(sp) ld t5,40(sp) ble t1,zero,.L5 .L3: vsetvli a4,t1,e32,m1,ta,ma vle32.v v2,0(a2) vle32.v v3,0(t2) vle32.v v4,0(a1) vle32.v v1,0(t0) vadd.vv v4,v2,v4 vadd.vv v1,v3,v1 vadd.vv v1,v1,v4 vadd.vv v1,v1,v4 vadd.vv v1,v1,v4 vadd.vv v1,v1,v2 vadd.vv v2,v1,v2 vse32.v v2,0(t5) vadd.vv v2,v2,v1 vadd.vv v2,v2,v1 slli a7,a4,2 vadd.vv v3,v1,v3 vle32.v v5,0(a5) vle32.v v6,0(t6) vse32.v v3,0(t3) vse32.v v2,0(a0) vadd.vv v3,v3,v1 vadd.vv v2,v1,v5 vse32.v v3,0(t4) vadd.vv v1,v1,v6 vse32.v v2,0(a3) vse32.v v1,0(a6) sub t1,t1,a4 add a1,a1,a7 add a2,a2,a7 add a5,a5,a7 add t6,t6,a7 add t0,t0,a7 add t2,t2,a7 add t5,t5,a7 add a3,a3,a7 add a6,a6,a7 add t3,t3,a7 add t4,t4,a7 add a0,a0,a7 bne t1,zero,.L3 .L5: ret PR target/111318 PR target/111888 gcc/ChangeLog: * config.gcc: Add AVL propagation pass. * config/riscv/riscv-passes.def (INSERT_PASS_AFTER): Ditto. * config/riscv/riscv-protos.h (make_pass_avlprop): Ditto. * config/riscv/t-riscv: Ditto. * config/riscv/riscv-avlprop.cc: New file. gcc/testsuite/ChangeLog: * gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-5.c: Adapt test. * gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-2.c: Ditto. * gcc.target/riscv/rvv/autovec/partial/select_vl-2.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-2.c: Ditto. * gcc.target/riscv/rvv/autovec/pr111318.c: New test. * gcc.target/riscv/rvv/autovec/pr111888.c: New test. Tested-by: Patrick O'Neill <patrick@rivosinc.com>
2023-10-25config, aarch64: Use a more compatible sed invocation.Iain Sandoe1-6/+6
Currently, the sed command used to parse --with-{cpu,tune,arch} are using GNU-specific extension (automatically recognising extended regex). This is failing on Darwin, which defualts to Posix behaviour. However '-E' is accepted to indicate an extended RE. Strictly, this is also not really sufficient, since we should only require a Posix sed. gcc/ChangeLog: * config.gcc: Use -E to to sed to indicate that we are using extended REs. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
2023-10-22target: Support heap-based trampolinesAndrew Burgess1-0/+11
Enable -ftrampoline-impl=heap by default if we are on macOS 11 or later. Co-Authored-By: Maxim Blinov <maxim.blinov@embecosm.com> Co-Authored-By: Francois-Xavier Coudert <fxcoudert@gcc.gnu.org> Co-Authored-By: Iain Sandoe <iain@sandoe.co.uk> gcc/ChangeLog: * config.gcc: Default to heap trampolines on macOS 11 and above. * config/i386/darwin.h: Define X86_CUSTOM_FUNCTION_TEST. * config/i386/i386.h: Define X86_CUSTOM_FUNCTION_TEST. * config/i386/i386.cc: Use X86_CUSTOM_FUNCTION_TEST.
2023-10-20amdgcn: add -march=gfx1030 EXPERIMENTALAndrew Stubbs1-1/+1
Accept the architecture configure option and resolve build failures. This is enough to build binaries, but I've not got a device to test it on, so there are probably runtime issues to fix. The cache control instructions might be unsafe (or too conservative), and the kernel metadata might be off. Vector reductions will need to be reworked for RDNA2. In principle, it would be better to use wavefrontsize32 for this architecture, but that would mean switching everything to allow SImode masks, so wavefrontsize64 it is. The multilib is not included in the default configuration so either configure --with-arch=gfx1030 or include it in --with-multilib-list=gfx1030,.... The majority of this patch has no effect on other devices, but changing from using scalar writes for the exit value to vector writes means we don't need the scalar cache write-back instruction anywhere (which doesn't exist in RDNA2). gcc/ChangeLog: * config.gcc: Allow --with-arch=gfx1030. * config/gcn/gcn-hsa.h (NO_XNACK): gfx1030 does not support xnack. (ASM_SPEC): gfx1030 needs -mattr=+wavefrontsize64 set. * config/gcn/gcn-opts.h (enum processor_type): Add PROCESSOR_GFX1030. (TARGET_GFX1030): New. (TARGET_RDNA2): New. * config/gcn/gcn-valu.md (@dpp_move<mode>): Disable for RDNA2. (addc<mode>3<exec_vcc>): Add RDNA2 syntax variant. (subc<mode>3<exec_vcc>): Likewise. (<convop><mode><vndi>2_exec): Add RDNA2 alternatives. (vec_cmp<mode>di): Likewise. (vec_cmp<u><mode>di): Likewise. (vec_cmp<mode>di_exec): Likewise. (vec_cmp<u><mode>di_exec): Likewise. (vec_cmp<mode>di_dup): Likewise. (vec_cmp<mode>di_dup_exec): Likewise. (reduc_<reduc_op>_scal_<mode>): Disable for RDNA2. (*<reduc_op>_dpp_shr_<mode>): Likewise. (*plus_carry_dpp_shr_<mode>): Likewise. (*plus_carry_in_dpp_shr_<mode>): Likewise. * config/gcn/gcn.cc (gcn_option_override): Recognise gfx1030. (gcn_global_address_p): RDNA2 only allows smaller offsets. (gcn_addr_space_legitimate_address_p): Likewise. (gcn_omp_device_kind_arch_isa): Recognise gfx1030. (gcn_expand_epilogue): Use VGPRs instead of SGPRs. (output_file_start): Configure gfx1030. * config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Add __RDNA2__; (ASSEMBLER_DIALECT): New. * config/gcn/gcn.md (rdna): New define_attr. (enabled): Use "rdna" attribute. (gcn_return): Remove s_dcache_wb. (addcsi3_scalar): Add RDNA2 syntax variant. (addcsi3_scalar_zero): Likewise. (addptrdi3): Likewise. (mulsi3): v_mul_lo_i32 should be v_mul_lo_u32 on all ISA. (*memory_barrier): Add RDNA2 syntax variant. (atomic_load<mode>): Add RDNA2 cache control variants, and disable scalar atomics for RDNA2. (atomic_store<mode>): Likewise. (atomic_exchange<mode>): Likewise. * config/gcn/gcn.opt (gpu_type): Add gfx1030. * config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX1030): New. (main): Recognise -march=gfx1030. * config/gcn/t-omp-device: Add gfx1030 isa. libgcc/ChangeLog: * config/gcn/amdgcn_veclib.h (CDNA3_PLUS): Set false for __RDNA2__. libgomp/ChangeLog: * plugin/plugin-gcn.c (EF_AMDGPU_MACH_AMDGCN_GFX1030): New. (isa_hsa_name): Recognise gfx1030. (isa_code): Likewise. * team.c (defined): Remove s_endpgm.
2023-10-19amdgcn: deprecate Fiji device and multilibAndrew Stubbs1-1/+13
LLVM wants to remove it, which breaks our build. This patch means that most users won't notice that change, when it comes, and those that do will have chosen to enable Fiji explicitly. I'm selecting gfx900 as the new default as that's the least likely for users to want, which means most users will specify -march explicitly, which means we'll be free to change the default again, when we need to, without breaking anybody's makefiles. gcc/ChangeLog: * config.gcc (amdgcn): Switch default to --with-arch=gfx900. Implement support for --with-multilib-list. * config/gcn/t-gcn-hsa: Likewise. * doc/install.texi: Likewise. * doc/invoke.texi: Mark Fiji deprecated.
2023-10-18Initial Panther Lake SupportHaochen Jiang1-1/+1
gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_intel_cpu): Add Panther Lake. * common/config/i386/i386-common.cc (processor_name): Ditto. (processor_alias_table): Ditto. * common/config/i386/i386-cpuinfo.h (enum processor_types): Add INTEL_PANTHERLAKE. * config.gcc: Add -march=pantherlake. * config/i386/driver-i386.cc (host_detect_local_cpu): Refactor the if clause. Handle pantherlake. * config/i386/i386-c.cc (ix86_target_macros_internal): Handle pantherlake. * config/i386/i386-options.cc (processor_cost_table): Ditto. (m_PANTHERLAKE): New. (m_CORE_HYBRID): Add pantherlake. * config/i386/i386.h (enum processor_type): Ditto. * doc/extend.texi: Ditto. * doc/invoke.texi: Ditto. gcc/testsuite/ChangeLog: * g++.target/i386/mv16.C: Ditto. * gcc.target/i386/funcspec-56.inc: Handle new march.
2023-10-18Initial Clearwater Forest SupportHaochen Jiang1-1/+1
gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_intel_cpu): Handle Clearwater Forest. * common/config/i386/i386-common.cc (processor_name): Add Clearwater Forest. (processor_alias_table): Ditto. * common/config/i386/i386-cpuinfo.h (enum processor_types): Add INTEL_CLEARWATERFOREST. * config.gcc: Add -march=clearwaterforest. * config/i386/driver-i386.cc (host_detect_local_cpu): Handle clearwaterforest. * config/i386/i386-c.cc (ix86_target_macros_internal): Ditto. * config/i386/i386-options.cc (processor_cost_table): Ditto. (m_CLEARWATERFOREST): New. (m_CORE_ATOM): Add clearwaterforest. * config/i386/i386.h (enum processor_type): Ditto. * doc/extend.texi: Ditto. * doc/invoke.texi: Ditto. gcc/testsuite/ChangeLog: * g++.target/i386/mv16.C: Ditto. * gcc.target/i386/funcspec-56.inc: Handle new march.
2023-10-12Support Intel USER_MSRHu, Lin11-1/+2
gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_available_features): Detect USER_MSR. * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_USER_MSR_SET): New. (OPTION_MASK_ISA2_USER_MSR_UNSET): Ditto. (ix86_handle_option): Handle -musermsr. * common/config/i386/i386-cpuinfo.h (enum processor_features): Add FEATURE_USER_MSR. * common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for usermsr. * config.gcc: Add usermsrintrin.h * config/i386/cpuid.h (bit_USER_MSR): New. * config/i386/i386-builtin-types.def: Add DEF_FUNCTION_TYPE (VOID, UINT64, UINT64). * config/i386/i386-builtins.cc (ix86_init_mmx_sse_builtins): Add __builtin_urdmsr and __builtin_uwrmsr. * config/i386/i386-builtins.h (ix86_builtins): Add IX86_BUILTIN_URDMSR and IX86_BUILTIN_UWRMSR. * config/i386/i386-c.cc (ix86_target_macros_internal): Define __USER_MSR__. * config/i386/i386-expand.cc (ix86_expand_builtin): Handle new builtins. * config/i386/i386-isa.def (USER_MSR): Add DEF_PTA(USER_MSR). * config/i386/i386-options.cc (ix86_valid_target_attribute_inner_p): Handle usermsr. * config/i386/i386.md (urdmsr): New define_insn. (uwrmsr): Ditto. * config/i386/i386.opt: Add option -musermsr. * config/i386/x86gprintrin.h: Include usermsrintrin.h * doc/extend.texi: Document usermsr. * doc/invoke.texi: Document -musermsr. * doc/sourcebuild.texi: Document target usermsr. * config/i386/usermsrintrin.h: New file. gcc/testsuite/ChangeLog: * gcc.target/i386/funcspec-56.inc: Add new target attribute. * gcc.target/i386/x86gprintrin-1.c: Add -musermsr for 64bit target. * gcc.target/i386/x86gprintrin-2.c: Ditto. * gcc.target/i386/x86gprintrin-3.c: Ditto. * gcc.target/i386/x86gprintrin-4.c: Add musermsr for 64bit target. * gcc.target/i386/x86gprintrin-5.c: Ditto * gcc.target/i386/user_msr-1.c: New test. * gcc.target/i386/user_msr-2.c: Ditto.
2023-10-12LoongArch: Adjust makefile dependency for loongarch headers.Yang Yujie1-2/+2
gcc/ChangeLog: * config.gcc: Add loongarch-driver.h to tm_files. * config/loongarch/loongarch.h: Do not include loongarch-driver.h. * config/loongarch/t-loongarch: Append loongarch-multilib.h to $(GTM_H) instead of $(TM_H) for building generator programs.
2023-10-09[PATCH 4/5] Push evex512 target for 512 bit intrinsHaochen Jiang1-9/+10
gcc/ChangeLog: * config.gcc: Add avx512bitalgvlintrin.h. * config/i386/avx5124fmapsintrin.h: Add evex512 target for 512 bit intrins. * config/i386/avx5124vnniwintrin.h: Ditto. * config/i386/avx512bf16intrin.h: Ditto. * config/i386/avx512bitalgintrin.h: Add evex512 target for 512 bit intrins. Split 128/256 bit intrins to avx512bitalgvlintrin.h. * config/i386/avx512erintrin.h: Add evex512 target for 512 bit intrins * config/i386/avx512ifmaintrin.h: Ditto * config/i386/avx512pfintrin.h: Ditto * config/i386/avx512vbmi2intrin.h: Ditto. * config/i386/avx512vbmiintrin.h: Ditto. * config/i386/avx512vnniintrin.h: Ditto. * config/i386/avx512vp2intersectintrin.h: Ditto. * config/i386/avx512vpopcntdqintrin.h: Ditto. * config/i386/gfniintrin.h: Ditto. * config/i386/immintrin.h: Add avx512bitalgvlintrin.h. * config/i386/vaesintrin.h: Add evex512 target for 512 bit intrins. * config/i386/vpclmulqdqintrin.h: Ditto. * config/i386/avx512bitalgvlintrin.h: New.
2023-09-21rust: Implement TARGET_RUST_OS_INFO for *-*-*linux*.Iain Buclaw1-0/+2
gcc/ChangeLog: * config.gcc (*linux*): Set rust target_objs, and target_has_targetrustm, * config/t-linux (linux-rust.o): New rule. * config/linux-rust.cc: New file.
2023-09-21rust: Implement TARGET_RUST_OS_INFO for i[34567]86-*-mingw* and x86_64-*-mingw*.Iain Buclaw1-0/+2
gcc/ChangeLog: * config.gcc (i[34567]86-*-mingw* | x86_64-*-mingw*): Set rust_target_objs and target_has_targetrustm. * config/t-winnt (winnt-rust.o): New rule. * config/winnt-rust.cc: New file.
2023-09-21rust: Implement TARGET_RUST_OS_INFO for *-*-fuchsia*.Iain Buclaw1-0/+3
gcc/ChangeLog: * config.gcc (*-*-fuchsia): Set tmake_rule, rust_target_objs, and target_has_targetrustm. * config/fuchsia-rust.cc: New file. * config/t-fuchsia: New file.
2023-09-21rust: Implement TARGET_RUST_OS_INFO for *-*-vxworks*Iain Buclaw1-0/+3
gcc/ChangeLog: * config.gcc (*-*-vxworks*): Set rust_target_objs and target_has_targetrustm. * config/t-vxworks (vxworks-rust.o): New rule. * config/vxworks-rust.cc: New file.
2023-09-21rust: Implement TARGET_RUST_OS_INFO for *-*-dragonfly*Iain Buclaw1-0/+2
gcc/ChangeLog: * config.gcc (*-*-dragonfly*): Set rust_target_objs and target_has_targetrustm. * config/t-dragonfly (dragonfly-rust.o): New rule. * config/dragonfly-rust.cc: New file.
2023-09-21rust: Implement TARGET_RUST_OS_INFO for *-*-solaris2*.Iain Buclaw1-0/+2
gcc/ChangeLog: * config.gcc (*-*-solaris2*): Set rust_target_objs and target_has_targetrustm. * config/t-sol2 (sol2-rust.o): New rule. * config/sol2-rust.cc: New file.
2023-09-21rust: Implement TARGET_RUST_OS_INFO for *-*-openbsd*Iain Buclaw1-0/+2
gcc/ChangeLog: * config.gcc (*-*-openbsd*): Set rust_target_objs and target_has_targetrustm. * config/t-openbsd (openbsd-rust.o): New rule. * config/openbsd-rust.cc: New file.
2023-09-21rust: Implement TARGET_RUST_OS_INFO for *-*-netbsd*Iain Buclaw1-0/+2
gcc/ChangeLog: * config.gcc (*-*-netbsd*): Set rust_target_objs and target_has_targetrustm. * config/t-netbsd (netbsd-rust.o): New rule. * config/netbsd-rust.cc: New file.
2023-09-21rust: Implement TARGET_RUST_OS_INFO for *-*-freebsd*Iain Buclaw1-0/+2
gcc/ChangeLog: * config.gcc (*-*-freebsd*): Set rust_target_objs and target_has_targetrustm. * config/t-freebsd (freebsd-rust.o): New rule. * config/freebsd-rust.cc: New file.
2023-09-21rust: Implement TARGET_RUST_OS_INFO for *-*-darwin*Iain Buclaw1-0/+2
gcc/ChangeLog: * config.gcc (*-*-darwin*): Set rust_target_objs and target_has_targetrustm. * config/t-darwin (darwin-rust.o): New rule. * config/darwin-rust.cc: New file.
2023-09-21rust: Add skeleton support and documentation for targetrustm hooks.Iain Buclaw1-0/+25
gcc/ChangeLog: * Makefile.in (tm_rust_file_list, tm_rust_include_list, TM_RUST_H, RUST_TARGET_DEF, RUST_TARGET_H, RUST_TARGET_OBJS): New variables. (tm_rust.h, cs-tm_rust.h, default-rust.o, rust/rust-target-hooks-def.h, s-rust-target-hooks-def-h): New rules. (s-tm-texi): Also check timestamp on rust-target.def. (generated_files): Add TM_RUST_H and rust-target-hooks-def.h. (build/genhooks.o): Also depend on RUST_TARGET_DEF. * config.gcc (tm_rust_file, rust_target_objs, target_has_targetrustm): New variables. * configure: Regenerate. * configure.ac (tm_rust_file_list, tm_rust_include_list, rust_target_objs): Add substitutes. * doc/tm.texi: Regenerate. * doc/tm.texi.in (targetrustm): Document. (target_has_targetrustm): Document. * genhooks.cc: Include rust/rust-target.def. * config/default-rust.cc: New file. gcc/rust/ChangeLog: * rust-target-def.h: New file. * rust-target.def: New file. * rust-target.h: New file.
2023-09-15LoongArch: Reimplement multilib build option handling.Yang Yujie1-3/+3
Library build options from --with-multilib-list used to be processed with *self_spec, which missed the driver's initial canonicalization. This caused limitations on CFLAGS override and the use of driver-only options like -m[no]-lsx. The problem is solved by promoting the injection rules of --with-multilib-list options to the first element of DRIVER_SELF_SPECS, to make them execute before the canonialization. The library-build options are also hard-coded in the driver and can be used conveniently by the builders of other non-gcc libraries via the use of -fmultiflags. Bootstrapped and tested on loongarch64-linux-gnu. ChangeLog: * config-ml.in: Remove unneeded loongarch clause. * configure.ac: Register custom makefile fragments mt-loongarch-* for loongarch targets. * configure: Regenerate. config/ChangeLog: * mt-loongarch-mlib: New file. Pass -fmultiflags when building target libraries (FLAGS_FOR_TARGET). * mt-loongarch-elf: New file. * mt-loongarch-gnu: New file. gcc/ChangeLog: * config.gcc: Pass the default ABI via TM_MULTILIB_CONFIG. * config/loongarch/loongarch-driver.h: Invoke MLIB_SELF_SPECS before the driver canonicalization routines. * config/loongarch/loongarch.h: Move definitions of CC1_SPEC etc. to loongarch-driver.h * config/loongarch/t-linux: Move multilib-related definitions to t-multilib. * config/loongarch/t-multilib: New file. Inject library build options obtained from --with-multilib-list. * config/loongarch/t-loongarch: Same.
2023-09-12riscv: Add support for strlen inline expansionChristoph Müllner1-1/+2
This patch implements the expansion of the strlen builtin for RV32/RV64 for xlen-aligned aligned strings if Zbb or XTheadBb instructions are available. The inserted sequences are: rv32gc_zbb (RV64 is similar): add a3,a0,4 li a4,-1 .L1: lw a5,0(a0) add a0,a0,4 orc.b a5,a5 beq a5,a4,.L1 not a5,a5 ctz a5,a5 srl a5,a5,0x3 add a0,a0,a5 sub a0,a0,a3 rv64gc_xtheadbb (RV32 is similar): add a4,a0,8 .L2: ld a5,0(a0) add a0,a0,8 th.tstnbz a5,a5 beqz a5,.L2 th.rev a5,a5 th.ff1 a5,a5 srl a5,a5,0x3 add a0,a0,a5 sub a0,a0,a4 This allows to inline calls to strlen(), with optimized code for xlen-aligned strings, resulting in the following benefits over a call to libc: * no call/ret instructions * no stack frame allocation * no register saving/restoring * no alignment test The inlining mechanism is gated by a new switch ('-minline-strlen') and by the variable 'optimize_size'. Tested using the glibc string tests. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu> gcc/ChangeLog: * config.gcc: Add new object riscv-string.o. riscv-string.cc. * config/riscv/riscv-protos.h (riscv_expand_strlen): New function. * config/riscv/riscv.md (strlen<mode>): New expand INSN. * config/riscv/riscv.opt: New flag 'minline-strlen'. * config/riscv/t-riscv: Add new object riscv-string.o. * config/riscv/thead.md (th_rev<mode>2): Export INSN name. (th_rev<mode>2): Likewise. (th_tstnbz<mode>2): New INSN. * doc/invoke.texi: Document '-minline-strlen'. * emit-rtl.cc (emit_likely_jump_insn): New helper function. (emit_unlikely_jump_insn): Likewise. * rtl.h (emit_likely_jump_insn): New prototype. (emit_unlikely_jump_insn): Likewise. * config/riscv/riscv-string.cc: New file. gcc/testsuite/ChangeLog: * gcc.target/riscv/xtheadbb-strlen-unaligned.c: New test. * gcc.target/riscv/xtheadbb-strlen.c: New test. * gcc.target/riscv/zbb-strlen-disabled-2.c: New test. * gcc.target/riscv/zbb-strlen-disabled.c: New test. * gcc.target/riscv/zbb-strlen-unaligned.c: New test. * gcc.target/riscv/zbb-strlen.c: New test.
2023-09-08LoongArch: Fix unintentional bash-ism in r14-3665.Yang Yujie1-1/+1
gcc/ChangeLog: * config.gcc: remove non-POSIX syntax "<<<".
2023-09-05LoongArch: Add Loongson ASX directive builtin function support.Lulu Cheng1-1/+1
gcc/ChangeLog: * config.gcc: Export the header file lasxintrin.h. * config/loongarch/loongarch-builtins.cc (enum loongarch_builtin_type): Add Loongson ASX builtin functions support. (AVAIL_ALL): Ditto. (LASX_BUILTIN): Ditto. (LASX_NO_TARGET_BUILTIN): Ditto. (LASX_BUILTIN_TEST_BRANCH): Ditto. (CODE_FOR_lasx_xvsadd_b): Ditto. (CODE_FOR_lasx_xvsadd_h): Ditto. (CODE_FOR_lasx_xvsadd_w): Ditto. (CODE_FOR_lasx_xvsadd_d): Ditto. (CODE_FOR_lasx_xvsadd_bu): Ditto. (CODE_FOR_lasx_xvsadd_hu): Ditto. (CODE_FOR_lasx_xvsadd_wu): Ditto. (CODE_FOR_lasx_xvsadd_du): Ditto. (CODE_FOR_lasx_xvadd_b): Ditto. (CODE_FOR_lasx_xvadd_h): Ditto. (CODE_FOR_lasx_xvadd_w): Ditto. (CODE_FOR_lasx_xvadd_d): Ditto. (CODE_FOR_lasx_xvaddi_bu): Ditto. (CODE_FOR_lasx_xvaddi_hu): Ditto. (CODE_FOR_lasx_xvaddi_wu): Ditto. (CODE_FOR_lasx_xvaddi_du): Ditto. (CODE_FOR_lasx_xvand_v): Ditto. (CODE_FOR_lasx_xvandi_b): Ditto. (CODE_FOR_lasx_xvbitsel_v): Ditto. (CODE_FOR_lasx_xvseqi_b): Ditto. (CODE_FOR_lasx_xvseqi_h): Ditto. (CODE_FOR_lasx_xvseqi_w): Ditto. (CODE_FOR_lasx_xvseqi_d): Ditto. (CODE_FOR_lasx_xvslti_b): Ditto. (CODE_FOR_lasx_xvslti_h): Ditto. (CODE_FOR_lasx_xvslti_w): Ditto. (CODE_FOR_lasx_xvslti_d): Ditto. (CODE_FOR_lasx_xvslti_bu): Ditto. (CODE_FOR_lasx_xvslti_hu): Ditto. (CODE_FOR_lasx_xvslti_wu): Ditto. (CODE_FOR_lasx_xvslti_du): Ditto. (CODE_FOR_lasx_xvslei_b): Ditto. (CODE_FOR_lasx_xvslei_h): Ditto. (CODE_FOR_lasx_xvslei_w): Ditto. (CODE_FOR_lasx_xvslei_d): Ditto. (CODE_FOR_lasx_xvslei_bu): Ditto. (CODE_FOR_lasx_xvslei_hu): Ditto. (CODE_FOR_lasx_xvslei_wu): Ditto. (CODE_FOR_lasx_xvslei_du): Ditto. (CODE_FOR_lasx_xvdiv_b): Ditto. (CODE_FOR_lasx_xvdiv_h): Ditto. (CODE_FOR_lasx_xvdiv_w): Ditto. (CODE_FOR_lasx_xvdiv_d): Ditto. (CODE_FOR_lasx_xvdiv_bu): Ditto. (CODE_FOR_lasx_xvdiv_hu): Ditto. (CODE_FOR_lasx_xvdiv_wu): Ditto. (CODE_FOR_lasx_xvdiv_du): Ditto. (CODE_FOR_lasx_xvfadd_s): Ditto. (CODE_FOR_lasx_xvfadd_d): Ditto. (CODE_FOR_lasx_xvftintrz_w_s): Ditto. (CODE_FOR_lasx_xvftintrz_l_d): Ditto. (CODE_FOR_lasx_xvftintrz_wu_s): Ditto. (CODE_FOR_lasx_xvftintrz_lu_d): Ditto. (CODE_FOR_lasx_xvffint_s_w): Ditto. (CODE_FOR_lasx_xvffint_d_l): Ditto. (CODE_FOR_lasx_xvffint_s_wu): Ditto. (CODE_FOR_lasx_xvffint_d_lu): Ditto. (CODE_FOR_lasx_xvfsub_s): Ditto. (CODE_FOR_lasx_xvfsub_d): Ditto. (CODE_FOR_lasx_xvfmul_s): Ditto. (CODE_FOR_lasx_xvfmul_d): Ditto. (CODE_FOR_lasx_xvfdiv_s): Ditto. (CODE_FOR_lasx_xvfdiv_d): Ditto. (CODE_FOR_lasx_xvfmax_s): Ditto. (CODE_FOR_lasx_xvfmax_d): Ditto. (CODE_FOR_lasx_xvfmin_s): Ditto. (CODE_FOR_lasx_xvfmin_d): Ditto. (CODE_FOR_lasx_xvfsqrt_s): Ditto. (CODE_FOR_lasx_xvfsqrt_d): Ditto. (CODE_FOR_lasx_xvflogb_s): Ditto. (CODE_FOR_lasx_xvflogb_d): Ditto. (CODE_FOR_lasx_xvmax_b): Ditto. (CODE_FOR_lasx_xvmax_h): Ditto. (CODE_FOR_lasx_xvmax_w): Ditto. (CODE_FOR_lasx_xvmax_d): Ditto. (CODE_FOR_lasx_xvmaxi_b): Ditto. (CODE_FOR_lasx_xvmaxi_h): Ditto. (CODE_FOR_lasx_xvmaxi_w): Ditto. (CODE_FOR_lasx_xvmaxi_d): Ditto. (CODE_FOR_lasx_xvmax_bu): Ditto. (CODE_FOR_lasx_xvmax_hu): Ditto. (CODE_FOR_lasx_xvmax_wu): Ditto. (CODE_FOR_lasx_xvmax_du): Ditto. (CODE_FOR_lasx_xvmaxi_bu): Ditto. (CODE_FOR_lasx_xvmaxi_hu): Ditto. (CODE_FOR_lasx_xvmaxi_wu): Ditto. (CODE_FOR_lasx_xvmaxi_du): Ditto. (CODE_FOR_lasx_xvmin_b): Ditto. (CODE_FOR_lasx_xvmin_h): Ditto. (CODE_FOR_lasx_xvmin_w): Ditto. (CODE_FOR_lasx_xvmin_d): Ditto. (CODE_FOR_lasx_xvmini_b): Ditto. (CODE_FOR_lasx_xvmini_h): Ditto. (CODE_FOR_lasx_xvmini_w): Ditto. (CODE_FOR_lasx_xvmini_d): Ditto. (CODE_FOR_lasx_xvmin_bu): Ditto. (CODE_FOR_lasx_xvmin_hu): Ditto. (CODE_FOR_lasx_xvmin_wu): Ditto. (CODE_FOR_lasx_xvmin_du): Ditto. (CODE_FOR_lasx_xvmini_bu): Ditto. (CODE_FOR_lasx_xvmini_hu): Ditto. (CODE_FOR_lasx_xvmini_wu): Ditto. (CODE_FOR_lasx_xvmini_du): Ditto. (CODE_FOR_lasx_xvmod_b): Ditto. (CODE_FOR_lasx_xvmod_h): Ditto. (CODE_FOR_lasx_xvmod_w): Ditto. (CODE_FOR_lasx_xvmod_d): Ditto. (CODE_FOR_lasx_xvmod_bu): Ditto. (CODE_FOR_lasx_xvmod_hu): Ditto. (CODE_FOR_lasx_xvmod_wu): Ditto. (CODE_FOR_lasx_xvmod_du): Ditto. (CODE_FOR_lasx_xvmul_b): Ditto. (CODE_FOR_lasx_xvmul_h): Ditto. (CODE_FOR_lasx_xvmul_w): Ditto. (CODE_FOR_lasx_xvmul_d): Ditto. (CODE_FOR_lasx_xvclz_b): Ditto. (CODE_FOR_lasx_xvclz_h): Ditto. (CODE_FOR_lasx_xvclz_w): Ditto. (CODE_FOR_lasx_xvclz_d): Ditto. (CODE_FOR_lasx_xvnor_v): Ditto. (CODE_FOR_lasx_xvor_v): Ditto. (CODE_FOR_lasx_xvori_b): Ditto. (CODE_FOR_lasx_xvnori_b): Ditto. (CODE_FOR_lasx_xvpcnt_b): Ditto. (CODE_FOR_lasx_xvpcnt_h): Ditto. (CODE_FOR_lasx_xvpcnt_w): Ditto. (CODE_FOR_lasx_xvpcnt_d): Ditto. (CODE_FOR_lasx_xvxor_v): Ditto. (CODE_FOR_lasx_xvxori_b): Ditto. (CODE_FOR_lasx_xvsll_b): Ditto. (CODE_FOR_lasx_xvsll_h): Ditto. (CODE_FOR_lasx_xvsll_w): Ditto. (CODE_FOR_lasx_xvsll_d): Ditto. (CODE_FOR_lasx_xvslli_b): Ditto. (CODE_FOR_lasx_xvslli_h): Ditto. (CODE_FOR_lasx_xvslli_w): Ditto. (CODE_FOR_lasx_xvslli_d): Ditto. (CODE_FOR_lasx_xvsra_b): Ditto. (CODE_FOR_lasx_xvsra_h): Ditto. (CODE_FOR_lasx_xvsra_w): Ditto. (CODE_FOR_lasx_xvsra_d): Ditto. (CODE_FOR_lasx_xvsrai_b): Ditto. (CODE_FOR_lasx_xvsrai_h): Ditto. (CODE_FOR_lasx_xvsrai_w): Ditto. (CODE_FOR_lasx_xvsrai_d): Ditto. (CODE_FOR_lasx_xvsrl_b): Ditto. (CODE_FOR_lasx_xvsrl_h): Ditto. (CODE_FOR_lasx_xvsrl_w): Ditto. (CODE_FOR_lasx_xvsrl_d): Ditto. (CODE_FOR_lasx_xvsrli_b): Ditto. (CODE_FOR_lasx_xvsrli_h): Ditto. (CODE_FOR_lasx_xvsrli_w): Ditto. (CODE_FOR_lasx_xvsrli_d): Ditto. (CODE_FOR_lasx_xvsub_b): Ditto. (CODE_FOR_lasx_xvsub_h): Ditto. (CODE_FOR_lasx_xvsub_w): Ditto. (CODE_FOR_lasx_xvsub_d): Ditto. (CODE_FOR_lasx_xvsubi_bu): Ditto. (CODE_FOR_lasx_xvsubi_hu): Ditto. (CODE_FOR_lasx_xvsubi_wu): Ditto. (CODE_FOR_lasx_xvsubi_du): Ditto. (CODE_FOR_lasx_xvpackod_d): Ditto. (CODE_FOR_lasx_xvpackev_d): Ditto. (CODE_FOR_lasx_xvpickod_d): Ditto. (CODE_FOR_lasx_xvpickev_d): Ditto. (CODE_FOR_lasx_xvrepli_b): Ditto. (CODE_FOR_lasx_xvrepli_h): Ditto. (CODE_FOR_lasx_xvrepli_w): Ditto. (CODE_FOR_lasx_xvrepli_d): Ditto. (CODE_FOR_lasx_xvandn_v): Ditto. (CODE_FOR_lasx_xvorn_v): Ditto. (CODE_FOR_lasx_xvneg_b): Ditto. (CODE_FOR_lasx_xvneg_h): Ditto. (CODE_FOR_lasx_xvneg_w): Ditto. (CODE_FOR_lasx_xvneg_d): Ditto. (CODE_FOR_lasx_xvbsrl_v): Ditto. (CODE_FOR_lasx_xvbsll_v): Ditto. (CODE_FOR_lasx_xvfmadd_s): Ditto. (CODE_FOR_lasx_xvfmadd_d): Ditto. (CODE_FOR_lasx_xvfmsub_s): Ditto. (CODE_FOR_lasx_xvfmsub_d): Ditto. (CODE_FOR_lasx_xvfnmadd_s): Ditto. (CODE_FOR_lasx_xvfnmadd_d): Ditto. (CODE_FOR_lasx_xvfnmsub_s): Ditto. (CODE_FOR_lasx_xvfnmsub_d): Ditto. (CODE_FOR_lasx_xvpermi_q): Ditto. (CODE_FOR_lasx_xvpermi_d): Ditto. (CODE_FOR_lasx_xbnz_v): Ditto. (CODE_FOR_lasx_xbz_v): Ditto. (CODE_FOR_lasx_xvssub_b): Ditto. (CODE_FOR_lasx_xvssub_h): Ditto. (CODE_FOR_lasx_xvssub_w): Ditto. (CODE_FOR_lasx_xvssub_d): Ditto. (CODE_FOR_lasx_xvssub_bu): Ditto. (CODE_FOR_lasx_xvssub_hu): Ditto. (CODE_FOR_lasx_xvssub_wu): Ditto. (CODE_FOR_lasx_xvssub_du): Ditto. (CODE_FOR_lasx_xvabsd_b): Ditto. (CODE_FOR_lasx_xvabsd_h): Ditto. (CODE_FOR_lasx_xvabsd_w): Ditto. (CODE_FOR_lasx_xvabsd_d): Ditto. (CODE_FOR_lasx_xvabsd_bu): Ditto. (CODE_FOR_lasx_xvabsd_hu): Ditto. (CODE_FOR_lasx_xvabsd_wu): Ditto. (CODE_FOR_lasx_xvabsd_du): Ditto. (CODE_FOR_lasx_xvavg_b): Ditto. (CODE_FOR_lasx_xvavg_h): Ditto. (CODE_FOR_lasx_xvavg_w): Ditto. (CODE_FOR_lasx_xvavg_d): Ditto. (CODE_FOR_lasx_xvavg_bu): Ditto. (CODE_FOR_lasx_xvavg_hu): Ditto. (CODE_FOR_lasx_xvavg_wu): Ditto. (CODE_FOR_lasx_xvavg_du): Ditto. (CODE_FOR_lasx_xvavgr_b): Ditto. (CODE_FOR_lasx_xvavgr_h): Ditto. (CODE_FOR_lasx_xvavgr_w): Ditto. (CODE_FOR_lasx_xvavgr_d): Ditto. (CODE_FOR_lasx_xvavgr_bu): Ditto. (CODE_FOR_lasx_xvavgr_hu): Ditto. (CODE_FOR_lasx_xvavgr_wu): Ditto. (CODE_FOR_lasx_xvavgr_du): Ditto. (CODE_FOR_lasx_xvmuh_b): Ditto. (CODE_FOR_lasx_xvmuh_h): Ditto. (CODE_FOR_lasx_xvmuh_w): Ditto. (CODE_FOR_lasx_xvmuh_d): Ditto. (CODE_FOR_lasx_xvmuh_bu): Ditto. (CODE_FOR_lasx_xvmuh_hu): Ditto. (CODE_FOR_lasx_xvmuh_wu): Ditto. (CODE_FOR_lasx_xvmuh_du): Ditto. (CODE_FOR_lasx_xvssran_b_h): Ditto. (CODE_FOR_lasx_xvssran_h_w): Ditto. (CODE_FOR_lasx_xvssran_w_d): Ditto. (CODE_FOR_lasx_xvssran_bu_h): Ditto. (CODE_FOR_lasx_xvssran_hu_w): Ditto. (CODE_FOR_lasx_xvssran_wu_d): Ditto. (CODE_FOR_lasx_xvssrarn_b_h): Ditto. (CODE_FOR_lasx_xvssrarn_h_w): Ditto. (CODE_FOR_lasx_xvssrarn_w_d): Ditto. (CODE_FOR_lasx_xvssrarn_bu_h): Ditto. (CODE_FOR_lasx_xvssrarn_hu_w): Ditto. (CODE_FOR_lasx_xvssrarn_wu_d): Ditto. (CODE_FOR_lasx_xvssrln_bu_h): Ditto. (CODE_FOR_lasx_xvssrln_hu_w): Ditto. (CODE_FOR_lasx_xvssrln_wu_d): Ditto. (CODE_FOR_lasx_xvssrlrn_bu_h): Ditto. (CODE_FOR_lasx_xvssrlrn_hu_w): Ditto. (CODE_FOR_lasx_xvssrlrn_wu_d): Ditto. (CODE_FOR_lasx_xvftint_w_s): Ditto. (CODE_FOR_lasx_xvftint_l_d): Ditto. (CODE_FOR_lasx_xvftint_wu_s): Ditto. (CODE_FOR_lasx_xvftint_lu_d): Ditto. (CODE_FOR_lasx_xvsllwil_h_b): Ditto. (CODE_FOR_lasx_xvsllwil_w_h): Ditto. (CODE_FOR_lasx_xvsllwil_d_w): Ditto. (CODE_FOR_lasx_xvsllwil_hu_bu): Ditto. (CODE_FOR_lasx_xvsllwil_wu_hu): Ditto. (CODE_FOR_lasx_xvsllwil_du_wu): Ditto. (CODE_FOR_lasx_xvsat_b): Ditto. (CODE_FOR_lasx_xvsat_h): Ditto. (CODE_FOR_lasx_xvsat_w): Ditto. (CODE_FOR_lasx_xvsat_d): Ditto. (CODE_FOR_lasx_xvsat_bu): Ditto. (CODE_FOR_lasx_xvsat_hu): Ditto. (CODE_FOR_lasx_xvsat_wu): Ditto. (CODE_FOR_lasx_xvsat_du): Ditto. (loongarch_builtin_vectorized_function): Ditto. (loongarch_expand_builtin_insn): Ditto. (loongarch_expand_builtin): Ditto. * config/loongarch/loongarch-ftypes.def (1): Ditto. (2): Ditto. (3): Ditto. (4): Ditto. * config/loongarch/lasxintrin.h: New file.
2023-09-05LoongArch: Add Loongson SX directive builtin function support.Lulu Cheng1-1/+1
gcc/ChangeLog: * config.gcc: Export the header file lsxintrin.h. * config/loongarch/loongarch-builtins.cc (LARCH_FTYPE_NAME4): Add builtin function support. (enum loongarch_builtin_type): Ditto. (AVAIL_ALL): Ditto. (LARCH_BUILTIN): Ditto. (LSX_BUILTIN): Ditto. (LSX_BUILTIN_TEST_BRANCH): Ditto. (LSX_NO_TARGET_BUILTIN): Ditto. (CODE_FOR_lsx_vsadd_b): Ditto. (CODE_FOR_lsx_vsadd_h): Ditto. (CODE_FOR_lsx_vsadd_w): Ditto. (CODE_FOR_lsx_vsadd_d): Ditto. (CODE_FOR_lsx_vsadd_bu): Ditto. (CODE_FOR_lsx_vsadd_hu): Ditto. (CODE_FOR_lsx_vsadd_wu): Ditto. (CODE_FOR_lsx_vsadd_du): Ditto. (CODE_FOR_lsx_vadd_b): Ditto. (CODE_FOR_lsx_vadd_h): Ditto. (CODE_FOR_lsx_vadd_w): Ditto. (CODE_FOR_lsx_vadd_d): Ditto. (CODE_FOR_lsx_vaddi_bu): Ditto. (CODE_FOR_lsx_vaddi_hu): Ditto. (CODE_FOR_lsx_vaddi_wu): Ditto. (CODE_FOR_lsx_vaddi_du): Ditto. (CODE_FOR_lsx_vand_v): Ditto. (CODE_FOR_lsx_vandi_b): Ditto. (CODE_FOR_lsx_bnz_v): Ditto. (CODE_FOR_lsx_bz_v): Ditto. (CODE_FOR_lsx_vbitsel_v): Ditto. (CODE_FOR_lsx_vseqi_b): Ditto. (CODE_FOR_lsx_vseqi_h): Ditto. (CODE_FOR_lsx_vseqi_w): Ditto. (CODE_FOR_lsx_vseqi_d): Ditto. (CODE_FOR_lsx_vslti_b): Ditto. (CODE_FOR_lsx_vslti_h): Ditto. (CODE_FOR_lsx_vslti_w): Ditto. (CODE_FOR_lsx_vslti_d): Ditto. (CODE_FOR_lsx_vslti_bu): Ditto. (CODE_FOR_lsx_vslti_hu): Ditto. (CODE_FOR_lsx_vslti_wu): Ditto. (CODE_FOR_lsx_vslti_du): Ditto. (CODE_FOR_lsx_vslei_b): Ditto. (CODE_FOR_lsx_vslei_h): Ditto. (CODE_FOR_lsx_vslei_w): Ditto. (CODE_FOR_lsx_vslei_d): Ditto. (CODE_FOR_lsx_vslei_bu): Ditto. (CODE_FOR_lsx_vslei_hu): Ditto. (CODE_FOR_lsx_vslei_wu): Ditto. (CODE_FOR_lsx_vslei_du): Ditto. (CODE_FOR_lsx_vdiv_b): Ditto. (CODE_FOR_lsx_vdiv_h): Ditto. (CODE_FOR_lsx_vdiv_w): Ditto. (CODE_FOR_lsx_vdiv_d): Ditto. (CODE_FOR_lsx_vdiv_bu): Ditto. (CODE_FOR_lsx_vdiv_hu): Ditto. (CODE_FOR_lsx_vdiv_wu): Ditto. (CODE_FOR_lsx_vdiv_du): Ditto. (CODE_FOR_lsx_vfadd_s): Ditto. (CODE_FOR_lsx_vfadd_d): Ditto. (CODE_FOR_lsx_vftintrz_w_s): Ditto. (CODE_FOR_lsx_vftintrz_l_d): Ditto. (CODE_FOR_lsx_vftintrz_wu_s): Ditto. (CODE_FOR_lsx_vftintrz_lu_d): Ditto. (CODE_FOR_lsx_vffint_s_w): Ditto. (CODE_FOR_lsx_vffint_d_l): Ditto. (CODE_FOR_lsx_vffint_s_wu): Ditto. (CODE_FOR_lsx_vffint_d_lu): Ditto. (CODE_FOR_lsx_vfsub_s): Ditto. (CODE_FOR_lsx_vfsub_d): Ditto. (CODE_FOR_lsx_vfmul_s): Ditto. (CODE_FOR_lsx_vfmul_d): Ditto. (CODE_FOR_lsx_vfdiv_s): Ditto. (CODE_FOR_lsx_vfdiv_d): Ditto. (CODE_FOR_lsx_vfmax_s): Ditto. (CODE_FOR_lsx_vfmax_d): Ditto. (CODE_FOR_lsx_vfmin_s): Ditto. (CODE_FOR_lsx_vfmin_d): Ditto. (CODE_FOR_lsx_vfsqrt_s): Ditto. (CODE_FOR_lsx_vfsqrt_d): Ditto. (CODE_FOR_lsx_vflogb_s): Ditto. (CODE_FOR_lsx_vflogb_d): Ditto. (CODE_FOR_lsx_vmax_b): Ditto. (CODE_FOR_lsx_vmax_h): Ditto. (CODE_FOR_lsx_vmax_w): Ditto. (CODE_FOR_lsx_vmax_d): Ditto. (CODE_FOR_lsx_vmaxi_b): Ditto. (CODE_FOR_lsx_vmaxi_h): Ditto. (CODE_FOR_lsx_vmaxi_w): Ditto. (CODE_FOR_lsx_vmaxi_d): Ditto. (CODE_FOR_lsx_vmax_bu): Ditto. (CODE_FOR_lsx_vmax_hu): Ditto. (CODE_FOR_lsx_vmax_wu): Ditto. (CODE_FOR_lsx_vmax_du): Ditto. (CODE_FOR_lsx_vmaxi_bu): Ditto. (CODE_FOR_lsx_vmaxi_hu): Ditto. (CODE_FOR_lsx_vmaxi_wu): Ditto. (CODE_FOR_lsx_vmaxi_du): Ditto. (CODE_FOR_lsx_vmin_b): Ditto. (CODE_FOR_lsx_vmin_h): Ditto. (CODE_FOR_lsx_vmin_w): Ditto. (CODE_FOR_lsx_vmin_d): Ditto. (CODE_FOR_lsx_vmini_b): Ditto. (CODE_FOR_lsx_vmini_h): Ditto. (CODE_FOR_lsx_vmini_w): Ditto. (CODE_FOR_lsx_vmini_d): Ditto. (CODE_FOR_lsx_vmin_bu): Ditto. (CODE_FOR_lsx_vmin_hu): Ditto. (CODE_FOR_lsx_vmin_wu): Ditto. (CODE_FOR_lsx_vmin_du): Ditto. (CODE_FOR_lsx_vmini_bu): Ditto. (CODE_FOR_lsx_vmini_hu): Ditto. (CODE_FOR_lsx_vmini_wu): Ditto. (CODE_FOR_lsx_vmini_du): Ditto. (CODE_FOR_lsx_vmod_b): Ditto. (CODE_FOR_lsx_vmod_h): Ditto. (CODE_FOR_lsx_vmod_w): Ditto. (CODE_FOR_lsx_vmod_d): Ditto. (CODE_FOR_lsx_vmod_bu): Ditto. (CODE_FOR_lsx_vmod_hu): Ditto. (CODE_FOR_lsx_vmod_wu): Ditto. (CODE_FOR_lsx_vmod_du): Ditto. (CODE_FOR_lsx_vmul_b): Ditto. (CODE_FOR_lsx_vmul_h): Ditto. (CODE_FOR_lsx_vmul_w): Ditto. (CODE_FOR_lsx_vmul_d): Ditto. (CODE_FOR_lsx_vclz_b): Ditto. (CODE_FOR_lsx_vclz_h): Ditto. (CODE_FOR_lsx_vclz_w): Ditto. (CODE_FOR_lsx_vclz_d): Ditto. (CODE_FOR_lsx_vnor_v): Ditto. (CODE_FOR_lsx_vor_v): Ditto. (CODE_FOR_lsx_vori_b): Ditto. (CODE_FOR_lsx_vnori_b): Ditto. (CODE_FOR_lsx_vpcnt_b): Ditto. (CODE_FOR_lsx_vpcnt_h): Ditto. (CODE_FOR_lsx_vpcnt_w): Ditto. (CODE_FOR_lsx_vpcnt_d): Ditto. (CODE_FOR_lsx_vxor_v): Ditto. (CODE_FOR_lsx_vxori_b): Ditto. (CODE_FOR_lsx_vsll_b): Ditto. (CODE_FOR_lsx_vsll_h): Ditto. (CODE_FOR_lsx_vsll_w): Ditto. (CODE_FOR_lsx_vsll_d): Ditto. (CODE_FOR_lsx_vslli_b): Ditto. (CODE_FOR_lsx_vslli_h): Ditto. (CODE_FOR_lsx_vslli_w): Ditto. (CODE_FOR_lsx_vslli_d): Ditto. (CODE_FOR_lsx_vsra_b): Ditto. (CODE_FOR_lsx_vsra_h): Ditto. (CODE_FOR_lsx_vsra_w): Ditto. (CODE_FOR_lsx_vsra_d): Ditto. (CODE_FOR_lsx_vsrai_b): Ditto. (CODE_FOR_lsx_vsrai_h): Ditto. (CODE_FOR_lsx_vsrai_w): Ditto. (CODE_FOR_lsx_vsrai_d): Ditto. (CODE_FOR_lsx_vsrl_b): Ditto. (CODE_FOR_lsx_vsrl_h): Ditto. (CODE_FOR_lsx_vsrl_w): Ditto. (CODE_FOR_lsx_vsrl_d): Ditto. (CODE_FOR_lsx_vsrli_b): Ditto. (CODE_FOR_lsx_vsrli_h): Ditto. (CODE_FOR_lsx_vsrli_w): Ditto. (CODE_FOR_lsx_vsrli_d): Ditto. (CODE_FOR_lsx_vsub_b): Ditto. (CODE_FOR_lsx_vsub_h): Ditto. (CODE_FOR_lsx_vsub_w): Ditto. (CODE_FOR_lsx_vsub_d): Ditto. (CODE_FOR_lsx_vsubi_bu): Ditto. (CODE_FOR_lsx_vsubi_hu): Ditto. (CODE_FOR_lsx_vsubi_wu): Ditto. (CODE_FOR_lsx_vsubi_du): Ditto. (CODE_FOR_lsx_vpackod_d): Ditto. (CODE_FOR_lsx_vpackev_d): Ditto. (CODE_FOR_lsx_vpickod_d): Ditto. (CODE_FOR_lsx_vpickev_d): Ditto. (CODE_FOR_lsx_vrepli_b): Ditto. (CODE_FOR_lsx_vrepli_h): Ditto. (CODE_FOR_lsx_vrepli_w): Ditto. (CODE_FOR_lsx_vrepli_d): Ditto. (CODE_FOR_lsx_vsat_b): Ditto. (CODE_FOR_lsx_vsat_h): Ditto. (CODE_FOR_lsx_vsat_w): Ditto. (CODE_FOR_lsx_vsat_d): Ditto. (CODE_FOR_lsx_vsat_bu): Ditto. (CODE_FOR_lsx_vsat_hu): Ditto. (CODE_FOR_lsx_vsat_wu): Ditto. (CODE_FOR_lsx_vsat_du): Ditto. (CODE_FOR_lsx_vavg_b): Ditto. (CODE_FOR_lsx_vavg_h): Ditto. (CODE_FOR_lsx_vavg_w): Ditto. (CODE_FOR_lsx_vavg_d): Ditto. (CODE_FOR_lsx_vavg_bu): Ditto. (CODE_FOR_lsx_vavg_hu): Ditto. (CODE_FOR_lsx_vavg_wu): Ditto. (CODE_FOR_lsx_vavg_du): Ditto. (CODE_FOR_lsx_vavgr_b): Ditto. (CODE_FOR_lsx_vavgr_h): Ditto. (CODE_FOR_lsx_vavgr_w): Ditto. (CODE_FOR_lsx_vavgr_d): Ditto. (CODE_FOR_lsx_vavgr_bu): Ditto. (CODE_FOR_lsx_vavgr_hu): Ditto. (CODE_FOR_lsx_vavgr_wu): Ditto. (CODE_FOR_lsx_vavgr_du): Ditto. (CODE_FOR_lsx_vssub_b): Ditto. (CODE_FOR_lsx_vssub_h): Ditto. (CODE_FOR_lsx_vssub_w): Ditto. (CODE_FOR_lsx_vssub_d): Ditto. (CODE_FOR_lsx_vssub_bu): Ditto. (CODE_FOR_lsx_vssub_hu): Ditto. (CODE_FOR_lsx_vssub_wu): Ditto. (CODE_FOR_lsx_vssub_du): Ditto. (CODE_FOR_lsx_vabsd_b): Ditto. (CODE_FOR_lsx_vabsd_h): Ditto. (CODE_FOR_lsx_vabsd_w): Ditto. (CODE_FOR_lsx_vabsd_d): Ditto. (CODE_FOR_lsx_vabsd_bu): Ditto. (CODE_FOR_lsx_vabsd_hu): Ditto. (CODE_FOR_lsx_vabsd_wu): Ditto. (CODE_FOR_lsx_vabsd_du): Ditto. (CODE_FOR_lsx_vftint_w_s): Ditto. (CODE_FOR_lsx_vftint_l_d): Ditto. (CODE_FOR_lsx_vftint_wu_s): Ditto. (CODE_FOR_lsx_vftint_lu_d): Ditto. (CODE_FOR_lsx_vandn_v): Ditto. (CODE_FOR_lsx_vorn_v): Ditto. (CODE_FOR_lsx_vneg_b): Ditto. (CODE_FOR_lsx_vneg_h): Ditto. (CODE_FOR_lsx_vneg_w): Ditto. (CODE_FOR_lsx_vneg_d): Ditto. (CODE_FOR_lsx_vshuf4i_d): Ditto. (CODE_FOR_lsx_vbsrl_v): Ditto. (CODE_FOR_lsx_vbsll_v): Ditto. (CODE_FOR_lsx_vfmadd_s): Ditto. (CODE_FOR_lsx_vfmadd_d): Ditto. (CODE_FOR_lsx_vfmsub_s): Ditto. (CODE_FOR_lsx_vfmsub_d): Ditto. (CODE_FOR_lsx_vfnmadd_s): Ditto. (CODE_FOR_lsx_vfnmadd_d): Ditto. (CODE_FOR_lsx_vfnmsub_s): Ditto. (CODE_FOR_lsx_vfnmsub_d): Ditto. (CODE_FOR_lsx_vmuh_b): Ditto. (CODE_FOR_lsx_vmuh_h): Ditto. (CODE_FOR_lsx_vmuh_w): Ditto. (CODE_FOR_lsx_vmuh_d): Ditto. (CODE_FOR_lsx_vmuh_bu): Ditto. (CODE_FOR_lsx_vmuh_hu): Ditto. (CODE_FOR_lsx_vmuh_wu): Ditto. (CODE_FOR_lsx_vmuh_du): Ditto. (CODE_FOR_lsx_vsllwil_h_b): Ditto. (CODE_FOR_lsx_vsllwil_w_h): Ditto. (CODE_FOR_lsx_vsllwil_d_w): Ditto. (CODE_FOR_lsx_vsllwil_hu_bu): Ditto. (CODE_FOR_lsx_vsllwil_wu_hu): Ditto. (CODE_FOR_lsx_vsllwil_du_wu): Ditto. (CODE_FOR_lsx_vssran_b_h): Ditto. (CODE_FOR_lsx_vssran_h_w): Ditto. (CODE_FOR_lsx_vssran_w_d): Ditto. (CODE_FOR_lsx_vssran_bu_h): Ditto. (CODE_FOR_lsx_vssran_hu_w): Ditto. (CODE_FOR_lsx_vssran_wu_d): Ditto. (CODE_FOR_lsx_vssrarn_b_h): Ditto. (CODE_FOR_lsx_vssrarn_h_w): Ditto. (CODE_FOR_lsx_vssrarn_w_d): Ditto. (CODE_FOR_lsx_vssrarn_bu_h): Ditto. (CODE_FOR_lsx_vssrarn_hu_w): Ditto. (CODE_FOR_lsx_vssrarn_wu_d): Ditto. (CODE_FOR_lsx_vssrln_bu_h): Ditto. (CODE_FOR_lsx_vssrln_hu_w): Ditto. (CODE_FOR_lsx_vssrln_wu_d): Ditto. (CODE_FOR_lsx_vssrlrn_bu_h): Ditto. (CODE_FOR_lsx_vssrlrn_hu_w): Ditto. (CODE_FOR_lsx_vssrlrn_wu_d): Ditto. (loongarch_builtin_vector_type): Ditto. (loongarch_build_cvpointer_type): Ditto. (LARCH_ATYPE_CVPOINTER): Ditto. (LARCH_ATYPE_BOOLEAN): Ditto. (LARCH_ATYPE_V2SF): Ditto. (LARCH_ATYPE_V2HI): Ditto. (LARCH_ATYPE_V2SI): Ditto. (LARCH_ATYPE_V4QI): Ditto. (LARCH_ATYPE_V4HI): Ditto. (LARCH_ATYPE_V8QI): Ditto. (LARCH_ATYPE_V2DI): Ditto. (LARCH_ATYPE_V4SI): Ditto. (LARCH_ATYPE_V8HI): Ditto. (LARCH_ATYPE_V16QI): Ditto. (LARCH_ATYPE_V2DF): Ditto. (LARCH_ATYPE_V4SF): Ditto. (LARCH_ATYPE_V4DI): Ditto. (LARCH_ATYPE_V8SI): Ditto. (LARCH_ATYPE_V16HI): Ditto. (LARCH_ATYPE_V32QI): Ditto. (LARCH_ATYPE_V4DF): Ditto. (LARCH_ATYPE_V8SF): Ditto. (LARCH_ATYPE_UV2DI): Ditto. (LARCH_ATYPE_UV4SI): Ditto. (LARCH_ATYPE_UV8HI): Ditto. (LARCH_ATYPE_UV16QI): Ditto. (LARCH_ATYPE_UV4DI): Ditto. (LARCH_ATYPE_UV8SI): Ditto. (LARCH_ATYPE_UV16HI): Ditto. (LARCH_ATYPE_UV32QI): Ditto. (LARCH_ATYPE_UV2SI): Ditto. (LARCH_ATYPE_UV4HI): Ditto. (LARCH_ATYPE_UV8QI): Ditto. (loongarch_builtin_vectorized_function): Ditto. (LARCH_GET_BUILTIN): Ditto. (loongarch_expand_builtin_insn): Ditto. (loongarch_expand_builtin_lsx_test_branch): Ditto. (loongarch_expand_builtin): Ditto. * config/loongarch/loongarch-ftypes.def (1): Ditto. (2): Ditto. (3): Ditto. (4): Ditto. * config/loongarch/lsxintrin.h: New file.