path: root/gcc
Age | Commit message | Author | Files | Lines
2024-10-22 | GCN: Initial generic-target handling, add more GCN macro defines | Tobias Burnus | 7 | -32/+147
Newer llvm-mc assemblers support the gfx*-generic targets, which make it possible to generate code for all GPUs belonging to the same generation, even if that code is not optimal. This requires LLVM 19.

This patch adds the compiler-side support for generic gfx and also adds -march=gfx10-3-generic and -march=gfx11-generic. However, those -march= values are neither documented nor used anywhere yet.

Disclaimer: Not tested (as my ROCm does not support it); additionally, libgomp/plugin/plugin-gcn.c has to be updated before it becomes useful.

For better compatibility with LLVM's Clang, this commit additionally adds the macro definitions __GFX<9|10|11>__ for the architecture family, __AMDGPU__ besides the existing __AMDGCN__, and the two string-valued macros __amdgcn_processor__ and __amdgcn_target_id__, where the former has '-' replaced by '_' but otherwise both contain the lower-case name. For the new generic targets, the same happens, yielding, e.g., __gfx10_3_generic__.

gcc/ChangeLog:

	* config/gcn/gcn-devices.def: Add generic version/flag as additional value and architecture family entry; update; add gfx10-3-generic and gfx11-generic.
	* config/gcn/gcn-hsa.h (ABI_VERSION_SPEC): Remove.
	(ASM_SPEC): Use generated ABI_VERSION_OPT instead.
	* config/gcn/gcn-tables.opt: Regenerate.
	* config/gcn/gcn.h (gcn_device_def): Add generic_version and arch_family members.
	(TARGET_CPU_CPP_BUILTINS): Fix allocation bug, handle '-' in the name and add additional macro defines.
	* config/gcn/gcn.cc (gcn_devices): Handle it.
	* config/gcn/gen-gcn-device-macros.awk: Likewise; use ELF name for the macro name; generate ABI_VERSION_OPT.
	* config/gcn/mkoffload.cc (ELFABIVERSION_AMDGPU_HSA_V6, EF_AMDGPU_GENERIC_VERSION_V, EF_AMDGPU_GENERIC_VERSION_OFFSET, GET_GENERIC_VERSION, SET_GENERIC_VERSION): Define.
	(get_arch): Call SET_GENERIC_VERSION flag on elf_flags.
	(copy_early_debug_info): If the arch sets the generic version, use ELFABIVERSION_AMDGPU_HSA_V6.
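As an illustration of how the family macros described above might be consumed, here is a minimal sketch. It only relies on __GFX9__, __GFX10__, __GFX11__, __AMDGCN__ and __AMDGPU__ as named in the commit message; the function names gfx_family_name and check_gfx_family are hypothetical and exist only for this example. On a non-AMD target none of the macros is defined and the last branch is taken.

```c
#include <assert.h>

/* Report which AMD GPU architecture family, if any, the compiler says it
   is targeting.  Only macros named in the commit message are tested; the
   function itself is a hypothetical helper for illustration.  */
const char *
gfx_family_name (void)
{
#if defined (__GFX11__)
  return "gfx11";
#elif defined (__GFX10__)
  return "gfx10";
#elif defined (__GFX9__)
  return "gfx9";
#elif defined (__AMDGCN__) || defined (__AMDGPU__)
  return "amdgcn, family macro not defined";
#else
  return "not an AMD GPU target";
#endif
}

/* Whatever the target is, exactly one branch above is compiled in, so the
   result is always a non-empty string.  */
void
check_gfx_family (void)
{
  assert (gfx_family_name ()[0] != '\0');
}
```

Ordering the checks from the newest family down is an arbitrary choice here; the commit message does not say whether more than one family macro can be defined at once.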
2024-10-22 | testsuite: arm: Use check-function-bodies in fp16-aapcs-* tests | Torbjörn SVENSSON | 4 | -23/+150
Converted the tests to use check-function-bodies in order to ensure that the sequence is correct. gcc/testsuite/ChangeLog: * gcc.target/arm/fp16-aapcs-1.c: Use check-function-bodies. * gcc.target/arm/fp16-aapcs-2.c: Likewise. * gcc.target/arm/fp16-aapcs-3.c: Likewise. * gcc.target/arm/fp16-aapcs-4.c: Likewise. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
2024-10-22 | testsuite: arm: Relax expected asm in bitfield* and union-2 tests | Torbjörn SVENSSON | 5 | -10/+10
Below -O2, lsls/lsrs are preferred. For -O2 and above, lsl/lsr are preferred.

gcc/testsuite/ChangeLog:

	* gcc.target/arm/cmse/mainline/8_1m/bitfield-4.c: Allow lsl and lsr instructions.
	* gcc.target/arm/cmse/mainline/8_1m/bitfield-6.c: Likewise.
	* gcc.target/arm/cmse/mainline/8_1m/bitfield-8.c: Likewise.
	* gcc.target/arm/cmse/mainline/8_1m/bitfield-and-union.c: Likewise.
	* gcc.target/arm/cmse/mainline/8_1m/union-2.c: Likewise.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
2024-10-22 | testsuite: arm: Use check-function-bodies in cmse-5 tests | Torbjörn SVENSSON | 10 | -145/+274
Converted the tests to use check-function-bodies in order to ensure that the sequence is correct. This also allows both APSR_nzcvq and APSR_nzcvqg as target selector does not work when the -march and/or -mcpu overrides the target to test. gcc/testsuite/ChangeLog: * gcc.target/arm/cmse/mainline/8m/hard-sp/cmse-5.c: Use check-function-bodies. * gcc.target/arm/cmse/mainline/8m/hard/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/8m/soft/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/8m/softfp-sp/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/8m/softfp/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/hard/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/soft/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/softfp-sp/cmse-5.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-5.c: Likewise. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
2024-10-22 | testsuite: Add test directive checking removal of link_error | Jennifer Schmitz | 1 | -1/+3
This test needs a directive checking the removal of the link_error. Committed as obvious. Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com> gcc/testsuite/ * gcc.dg/tree-ssa/log_ident.c: Add scan for removal of link_error in optimized tree dump.
2024-10-22 | c++: redundant hashing in register_specialization | Patrick Palka | 1 | -3/+1
After r15-4050-g5dad738c1dd164 register_specialization needs to set elt.hash to the (maybe) precomputed hash so that the lookup uses it rather than redundantly computing it from scratch. gcc/cp/ChangeLog: * pt.cc (register_specialization): Set elt.hash. Reviewed-by: Jason Merrill <jason@redhat.com>
2024-10-22 | testsuite: Skip pr112305.c for -O[01] on simulators | Richard Sandiford | 1 | -0/+1
gcc.dg/torture/pr112305.c contains an inner loop that executes 0x8000_0014 times and an outer loop that executes 5 times, giving about 10 billion total executions of the inner loop body. At -O2 and above we are able to remove the inner loop, but at -O1 we keep a no-op loop:

	dls	lr, r3
.L3:
	subs	r3, r3, #1
	le	lr, .L3

and at -O0 we of course don't optimise. This can lead to long execution times on simulators, possibly triggering a timeout.

gcc/testsuite

	* gcc.dg/torture/pr112305.c: Skip at -O0 and -O1 for simulators.
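The "about 10 billion" figure follows directly from the two trip counts quoted in the message. A small sketch of that arithmetic, with hypothetical helper names, in case the exact number is useful when sizing a simulator timeout:

```c
#include <assert.h>
#include <stdint.h>

/* Total executions of the inner loop body for a two-level loop nest.
   Hypothetical helper; 64-bit arithmetic is needed because the product
   does not fit in 32 bits.  */
uint64_t
total_inner_iterations (uint64_t inner_trips, uint64_t outer_trips)
{
  return inner_trips * outer_trips;
}

/* 0x80000014 inner iterations times 5 outer iterations, the trip counts
   given in the commit message.  */
void
check_pr112305_trip_count (void)
{
  uint64_t total = total_inner_iterations (UINT64_C (0x80000014), 5);
  assert (total == UINT64_C (10737418340));
}
```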
2024-10-22 | c++/modules: Handle forward-declared class types | Nathaniel Shead | 10 | -30/+64
In some cases we can access members of a namespace-scope class without ever having performed name-lookup on it; this can occur when a forward-declaration of the class is used as a return type, for instance, or with PIMPL. One possible approach would be to do name lookup in complete_type to force lazy loading to occur, but this seems overly expensive for a relatively rare case. Instead, this patch generalises the existing pending-entity support to handle this case as well. Unfortunately this does mean that almost every class definition will be added to the pending-entity table, and almost always unnecessarily, but I don't see a good way to avoid this. gcc/cp/ChangeLog: * module.cc (depset::DB_IS_MEMBER_BIT): Rename to... (depset::DB_IS_PENDING_BIT): ...this. (depset::is_member): Remove. (depset::is_pending_entity): New function. (depset::hash::make_dependency): Mark definitions of namespace-scope types as maybe-pending entities. (depset::hash::add_class_entities): Rename DB_IS_MEMBER_BIT to DB_IS_PENDING_BIT. (depset::hash::find_dependencies): Use is_pending_entity instead of is_member. (module_state::write_pendings): Likewise; adjust comment. gcc/testsuite/ChangeLog: * g++.dg/modules/inst-4_b.C: Adjust pending-entity count. * g++.dg/modules/member-def-1_c.C: Likewise. * g++.dg/modules/member-def-2_c.C: Likewise. * g++.dg/modules/tpl-spec-3_b.C: Likewise. * g++.dg/modules/tpl-spec-4_b.C: Likewise. * g++.dg/modules/tpl-spec-5_b.C: Likewise. * g++.dg/modules/class-9_a.H: New test. * g++.dg/modules/class-9_b.H: New test. * g++.dg/modules/class-9_c.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Jason Merrill <jason@redhat.com>
2024-10-22 | tree-optimization/117254 - ICE with access diagnostics | Richard Biener | 2 | -1/+12
The diagnostics code fails to handle non-constant domain max. PR tree-optimization/117254 * gimple-ssa-warn-access.cc (maybe_warn_nonstring_arg): Check the array domain max is constant before using it. * gcc.dg/pr117254.c: New testcase.
2024-10-22 | amdgcn: Refactor device settings into a def file | Andrew Stubbs | 14 | -272/+478
Almost all device-specific settings are now centralised into gcn-devices.def for the compiler, mkoffload, and libgomp. No longer will we have to touch 10 files in multiple places just to add another device without any exotic features. (New ISAs and devices with incompatible metadata will continue to need a bit more.) In order to remove the device-specific conditionals in the code a new value HSACO_ATTR_UNSUPPORTED has been added, indicating that the assembler will reject any setting of that option. This incorporates some of Tobias's patch from March 2024. Co-Authored-By: Tobias Burnus <tburnus@baylibre.com> gcc/ChangeLog: * config.gcc (amdgcn): Add gcn-device-macros.h to tm_file. Add gcn-tables.opt to extra_options. * config/gcn/gcn-hsa.h (NO_XNACK): Delete. (NO_SRAM_ECC): Delete. (SRAMOPT): Move definition to generated file gcn-device-macros.h. (XNACKOPT): Likewise. (ASM_SPEC): Redefine using generated values from gcn-device-macros.h. * config/gcn/gcn-opts.h (enum processor_type): Generate from gcn-devices.def. (TARGET_VEGA10): Delete. (TARGET_VEGA20): Delete. (TARGET_GFX908): Delete. (TARGET_GFX90a): Delete. (TARGET_GFX90c): Delete. (TARGET_GFX1030): Delete. (TARGET_GFX1036): Delete. (TARGET_GFX1100): Delete. (TARGET_GFX1103): Delete. (TARGET_XNACK): Redefine to allow for HSACO_ATTR_UNSUPPORTED. (enum hsaco_attr_type): Add HSACO_ATTR_UNSUPPORTED. (TARGET_TGSPLIT): New define. * config/gcn/gcn.cc (gcn_devices): New constant table. (gcn_option_override): Rework to use gcn_devices table. (gcn_omp_device_kind_arch_isa): Likewise. (output_file_start): Likewise. (gcn_hsa_declare_function_name): Rework using TARGET_* macros. * config/gcn/gcn.h (gcn_devices): Declare struct and table. (TARGET_CPU_CPP_BUILTINS): Rework using gcn_devices. * config/gcn/gcn.opt: Move enum data to generated file gcn-tables.opt. Use new names for the default values. * config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX900): Delete. (EF_AMDGPU_MACH_AMDGCN_GFX906): Delete. 
(EF_AMDGPU_MACH_AMDGCN_GFX908): Delete. (EF_AMDGPU_MACH_AMDGCN_GFX90a): Delete. (EF_AMDGPU_MACH_AMDGCN_GFX90c): Delete. (EF_AMDGPU_MACH_AMDGCN_GFX1030): Delete. (EF_AMDGPU_MACH_AMDGCN_GFX1036): Delete. (EF_AMDGPU_MACH_AMDGCN_GFX1100): Delete. (EF_AMDGPU_MACH_AMDGCN_GFX1103): Delete. (enum elf_arch_code): Define using gcn-devices.def. (get_arch): Rework using gcn-devices.def. (main): Rework using gcn-devices.def * config/gcn/t-gcn-hsa (gcn-tables.opt): Generate file. (gcn-device-macros.h): Generate file. * config/gcn/t-omp-device: Generate isa list from gcn-devices.def. * config/gcn/gcn-devices.def: New file. * config/gcn/gcn-tables.opt: New file. * config/gcn/gcn-tables.opt.urls: New file. * config/gcn/gen-gcn-device-macros.awk: New file. * config/gcn/gen-opt-tables.awk: New file. libgomp/ChangeLog: * plugin/plugin-gcn.c (EF_AMDGPU_MACH): Generate from gcn-devices.def. (gcn_gfx803_s): Delete. (gcn_gfx900_s): Delete. (gcn_gfx906_s): Delete. (gcn_gfx908_s): Delete. (gcn_gfx90a_s): Delete. (gcn_gfx90c_s): Delete. (gcn_gfx1030_s): Delete. (gcn_gfx1036_s): Delete. (gcn_gfx1100_s): Delete. (gcn_gfx1103_s): Delete. (gcn_isa_name_len): Delete. (isa_hsa_name): Rename ... (isa_name): ... to this, and rework using gcn-devices.def. (isa_gcc_name): Delete. (isa_code): Rework using gcn-devices.def. (max_isa_vgprs): Rework using gcn-devices.def. (isa_matches_agent): Update isa_name usage. (GOMP_OFFLOAD_init_device): Improve diagnostic using the name.
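The general technique of driving several tables from one definition list can be sketched with C "X-macros". This is not the actual layout of gcn-devices.def, which the commit message does not show: the device names, field names and values below are placeholders invented for the example, and the list is kept inline instead of in a separate .def file so that the sketch is self-contained.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical device list.  In a real tree this would live in a .def
   file that each consumer #includes; all names and numbers here are
   placeholders, not real GCN data.  */
#define EXAMPLE_DEVICES(X)  \
  X (DEVA, "deva", 1, 256)  \
  X (DEVB, "devb", 2, 512)

/* First consumer: an enum with one entry per device.  */
enum example_device
{
#define X(id, name, elf_code, max_vgprs) EXAMPLE_##id,
  EXAMPLE_DEVICES (X)
#undef X
  EXAMPLE_DEVICE_COUNT
};

struct example_device_info
{
  const char *name;
  unsigned elf_code;
  unsigned max_vgprs;
};

/* Second consumer: a table indexed by the enum above.  */
static const struct example_device_info example_devices[] = {
#define X(id, name, elf_code, max_vgprs) { name, elf_code, max_vgprs },
  EXAMPLE_DEVICES (X)
#undef X
};

/* Both expansions come from the same list, so they cannot get out of
   step; this check documents that.  */
static_assert (sizeof example_devices / sizeof example_devices[0]
	       == EXAMPLE_DEVICE_COUNT, "table and enum must match");

/* Return the enum value for NAME, or -1 if it is not in the list.  */
int
example_lookup_device (const char *name)
{
  for (int i = 0; i < EXAMPLE_DEVICE_COUNT; i++)
    if (strcmp (example_devices[i].name, name) == 0)
      return i;
  return -1;
}
```

With this shape, adding a device means adding one line to the list, which matches the stated goal of no longer touching many files for a device without exotic features.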
2024-10-22 | tree-optimization/117123 - missed PHI equivalence in VN | Richard Biener | 2 | -25/+77
Value-numbering can use its set of equivalences to prove that a PHI node with args <a_1, 5, 10> is equal to a_1 iff on the edges with the constants a_1 == 5 and a_1 == 10 hold. This breaks down when the order of PHI args is <5, 10, a_1>, as then we drop to VARYING early. The following mitigates this by shuffling a copy of the edge vector to always process an SSA name argument first. This should also handle the special case of a two-argument <5, a_1> that we already had.

	PR tree-optimization/117123
	* tree-ssa-sccvn.cc (visit_phi): First process a non-constant argument edge to handle more equivalences. Remove the two-arg special case.

	* g++.dg/tree-ssa/pr117123.C: New testcase.
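To make the equivalence concrete, here is a hand-written C function whose merge point has the kind of arguments described above. It is not the committed testcase, and the function names are hypothetical. Whether GCC builds the PHI as <5, 10, a> or in another order depends on the CFG it constructs, so the comments only describe the intended shape.

```c
#include <assert.h>

/* On the edge from the first branch a == 5 holds, and on the edge from
   the second a == 10 holds, so the merged value is equal to a on every
   path.  The merge corresponds to a PHI with two constant arguments and
   one SSA name argument.  */
int
phi_select (int a)
{
  int r;
  if (a == 5)
    r = 5;
  else if (a == 10)
    r = 10;
  else
    r = a;
  return r;
}

/* The function is the identity, which is what value-numbering is meant
   to prove.  */
void
check_phi_select (void)
{
  for (int a = -20; a <= 20; a++)
    assert (phi_select (a) == a);
}
```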
2024-10-22 | testsuite: Fix typo in ext-floating19.C | Stefan Schulze Frielinghaus | 1 | -1/+1
gcc/testsuite/ChangeLog: * g++.dg/cpp23/ext-floating19.C: Fix typo for bfloat16 guard.
2024-10-22 | RISC-V: Add testcases for unsigned .SAT_SUB form 1 with IMM = 1. | xuli | 4 | -0/+86
form 1: T __attribute__((noinline)) \ sat_u_sub_imm##IMM##_##T##_fmt_1 (T y) \ { \ return (T)IMM >= y ? (T)IMM - y : 0; \ } Passed the rv64gcv regression test. Change-Id: I8805225b445cdbbc685f4f54a4d66c7ee8f748e1 Signed-off-by: Li Xu <xuli1@eswincomputing.com> gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_u_sub_imm-1_4.c: New test. * gcc.target/riscv/sat_u_sub_imm-2_4.c: New test. * gcc.target/riscv/sat_u_sub_imm-3_4.c: New test. * gcc.target/riscv/sat_u_sub_imm-4_2.c: New test.
2024-10-22 | Match: Support IMM=1 for unsigned scalar .SAT_SUB IMM form 1 | xuli | 1 | -0/+7
This patch adds support for .SAT_SUB when one of the operands is IMM = 1 in form 1.

Form 1:
  #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \
  T __attribute__((noinline)) \
  sat_u_sub_imm##IMM##_##T##_fmt_1 (T y) \
  { \
    return IMM >= y ? IMM - y : 0; \
  }

Take the following instance of form 1 as an example:
  DEF_SAT_U_SUB_IMM_FMT_1(uint8_t, 1)

Before this patch:
  __attribute__((noinline))
  uint8_t sat_u_sub_imm1_uint8_t_fmt_1 (uint8_t y)
  {
    uint8_t _1;
    uint8_t _3;

    <bb 2> [local count: 1073741824]:
    if (y_2(D) <= 1)
      goto <bb 3>; [41.00%]
    else
      goto <bb 4>; [59.00%]

    <bb 3> [local count: 440234144]:
    _3 = y_2(D) ^ 1;

    <bb 4> [local count: 1073741824]:
    # _1 = PHI <0(2), _3(3)>
    return _1;
  }

After this patch:
  __attribute__((noinline))
  uint8_t sat_u_sub_imm1_uint8_t_fmt_1 (uint8_t y)
  {
    uint8_t _1;

  ;; basic block 2, loop depth 0
  ;; pred: ENTRY
    _1 = .SAT_SUB (1, y_2(D)); [tail call]
    return _1;
  ;; succ: EXIT
  }

The following test suites pass with this patch:
1. The rv64gcv full regression tests.
2. The x86 bootstrap tests.
3. The x86 full regression tests.

Signed-off-by: Li Xu <xuli1@eswincomputing.com>

gcc/ChangeLog:

	* match.pd: Support IMM=1.
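The "before" dump computes y ^ 1 where the source says 1 - y, which is why the generic form 1 pattern apparently did not match this case. A minimal sketch, with hypothetical function names, showing that the two spellings agree for every uint8_t input:

```c
#include <assert.h>
#include <stdint.h>

/* Form 1 with T = uint8_t and IMM = 1, written out by hand.  */
uint8_t
sat_u_sub_imm1_u8 (uint8_t y)
{
  return 1 >= y ? 1 - y : 0;
}

/* The shape seen in the "before" dump: for y <= 1 the subtraction 1 - y
   has been turned into y ^ 1 (0 -> 1, 1 -> 0), otherwise the result
   is 0.  */
uint8_t
sat_u_sub_imm1_u8_xor (uint8_t y)
{
  return y <= 1 ? (uint8_t) (y ^ 1) : 0;
}

/* Exhaustive comparison over the whole uint8_t range.  */
void
check_sat_u_sub_imm1 (void)
{
  for (unsigned y = 0; y <= UINT8_MAX; y++)
    assert (sat_u_sub_imm1_u8 ((uint8_t) y)
	    == sat_u_sub_imm1_u8_xor ((uint8_t) y));
}
```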
2024-10-22 | RISC-V: Add testcases for unsigned .SAT_SUB form 1 with IMM = max -1. | xuli | 4 | -0/+89
form 1: T __attribute__((noinline)) \ sat_u_sub_imm##IMM##_##T##_fmt_1 (T y) \ { \ return (T)IMM >= y ? (T)IMM - y : 0; \ } Passed the rv64gcv regression test. Change-Id: Idaa1ab41f2a5785112279ea8ee2c93236457b740 Signed-off-by: Li Xu <xuli1@eswincomputing.com> gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_u_sub_imm-1_3.c: New test. * gcc.target/riscv/sat_u_sub_imm-2_3.c: New test. * gcc.target/riscv/sat_u_sub_imm-3_3.c: New test. * gcc.target/riscv/sat_u_sub_imm-4_1.c: New test.
2024-10-22 | Match: Support IMM=max-1 for unsigned scalar .SAT_SUB IMM form 1 | xuli | 1 | -1/+17
This patch adds support for .SAT_SUB when one of the operands is IMM = max - 1 in form 1.

Form 1:
  #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \
  T __attribute__((noinline)) \
  sat_u_sub_imm##IMM##_##T##_fmt_1 (T y) \
  { \
    return IMM >= y ? IMM - y : 0; \
  }

Take the following instance of form 1 as an example:
  DEF_SAT_U_SUB_IMM_FMT_1(uint8_t, 254)

Before this patch:
  __attribute__((noinline))
  uint8_t sat_u_sub_imm254_uint8_t_fmt_1 (uint8_t y)
  {
    uint8_t _1;
    uint8_t _3;

    <bb 2> [local count: 1073741824]:
    if (y_2(D) != 255)
      goto <bb 3>; [66.00%]
    else
      goto <bb 4>; [34.00%]

    <bb 3> [local count: 708669600]:
    _3 = 254 - y_2(D);

    <bb 4> [local count: 1073741824]:
    # _1 = PHI <0(2), _3(3)>
    return _1;
  }

After this patch:
  __attribute__((noinline))
  uint8_t sat_u_sub_imm254_uint8_t_fmt_1 (uint8_t y)
  {
    uint8_t _1;

    <bb 2> [local count: 1073741824]:
    _1 = .SAT_SUB (254, y_2(D)); [tail call]
    return _1;
  }

The following test suites pass with this patch:
1. The rv64gcv full regression tests.
2. The x86 bootstrap tests.
3. The x86 full regression tests.

Signed-off-by: Li Xu <xuli1@eswincomputing.com>

gcc/ChangeLog:

	* match.pd: Support IMM=max-1.
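In the "before" dump the comparison 254 >= y has been canonicalised to y != 255, which holds for exactly the same uint8_t values. A minimal sketch, with hypothetical function names, checking that the two forms agree:

```c
#include <assert.h>
#include <stdint.h>

/* Form 1 with T = uint8_t and IMM = 254 (max - 1), written out by hand.  */
uint8_t
sat_u_sub_imm254_u8 (uint8_t y)
{
  return 254 >= y ? 254 - y : 0;
}

/* The shape seen in the "before" dump: for an 8-bit unsigned y, the test
   254 >= y is the same as y != 255.  */
uint8_t
sat_u_sub_imm254_u8_ne (uint8_t y)
{
  return y != 255 ? (uint8_t) (254 - y) : 0;
}

/* Exhaustive comparison over the whole uint8_t range.  */
void
check_sat_u_sub_imm254 (void)
{
  for (unsigned y = 0; y <= UINT8_MAX; y++)
    assert (sat_u_sub_imm254_u8 ((uint8_t) y)
	    == sat_u_sub_imm254_u8_ne ((uint8_t) y));
}
```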
2024-10-22 | Daily bump. | GCC Administrator | 3 | -1/+494
2024-10-21 | [committed][PR rtl-optimization/116488] Fix SIGN_EXTEND source handling in ext-dce | Jeff Law | 5 | -6/+98
A while back I noticed that carry_backpropagate was being called after the optimization step. That seemed wrong, but at the time I didn't have a testcase showing it as a problem. Now I have 4 :-)

The way things used to work, the extension would be stripped away before calling carry_backpropagate, meaning carry_backpropagate would never see a SIGN_EXTEND. Thus the code trying to account for the sign-extended bit was never reached.

Getting that bit marked live is what's needed to fix these testcases. Fallout is minor, with just an adjustment needed to sensibly deal with vector modes in a place where we didn't have them before.

I'm still somewhat concerned about this code. Specifically, whether or not we can get in here with arbitrarily complex RTL, and if so, whether we need to recurse down and look at those sub-expressions. So while this patch fixes the most pressing issue, I wouldn't be terribly surprised if we're back inside this code at some point.

Bootstrapped and regression tested on x86_64, ppc64le, riscv64, s390x, mips64, loongarch, aarch64, m68k, alpha, hppa, sh4, sh4eb, perhaps something else that I've forgotten... Also tested on all the crosses in my tester.

	PR rtl-optimization/116488
	PR rtl-optimization/116579
	PR rtl-optimization/116915
	PR rtl-optimization/117226

gcc/
	* ext-dce.cc (carry_backpropagate): Properly handle SIGN_EXTEND, add ZERO_EXTEND handling as well.
	(ext_dce_process_uses): Call carry_backpropagate before the optimization step.

gcc/testsuite/
	* gcc.dg/torture/pr116488.c: New test.
	* gcc.dg/torture/pr116579.c: New test.
	* gcc.dg/torture/pr116915.c: New test.
	* gcc.dg/torture/pr117226.c: New test.
2024-10-21 | RISC-V: Add testcases for form 8 of vector signed SAT_TRUNC | Pan Li | 13 | -0/+172
Form 8: #define DEF_VEC_SAT_S_TRUNC_FMT_8(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline)) \ vec_sat_s_trunc_##NT##_##WT##_fmt_8 (NT *out, WT *in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT x = in[i]; \ NT trunc = (NT)x; \ out[i] = (WT)NT_MIN >= x || x >= (WT)NT_MAX \ ? x < 0 ? NT_MIN : NT_MAX \ : trunc; \ } \ } The below test are passed for this patch. * The rv64gcv fully regression test. It is test only patch and obvious up to a point, will commit it directly if no comments in next 48H. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i64-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
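The numbered forms are different source spellings of the same saturating truncation. As a sketch of why form 8 (inverted test with >= on both bounds) computes the same result as form 1 (direct test with <= on both bounds), here are scalar int16_t to int8_t versions written by hand from the macros quoted in these commits. The function names are hypothetical, and the (int8_t) cast of an out-of-range value is implementation-defined in C; that result is only used when x is in range.

```c
#include <assert.h>
#include <stdint.h>

/* Form 1: in-range test with <= on both bounds.  */
int8_t
sat_s_trunc_i16_to_i8_fmt_1 (int16_t x)
{
  int8_t trunc = (int8_t) x;
  return (int16_t) INT8_MIN <= x && x <= (int16_t) INT8_MAX
	 ? trunc
	 : x < 0 ? INT8_MIN : INT8_MAX;
}

/* Form 8: out-of-range test with >= on both bounds.  At x == INT8_MIN and
   x == INT8_MAX the saturated value equals the truncated one, so including
   the bounds in the saturating arm does not change the result.  */
int8_t
sat_s_trunc_i16_to_i8_fmt_8 (int16_t x)
{
  int8_t trunc = (int8_t) x;
  return (int16_t) INT8_MIN >= x || x >= (int16_t) INT8_MAX
	 ? x < 0 ? INT8_MIN : INT8_MAX
	 : trunc;
}

/* Exhaustive comparison over the whole int16_t range.  */
void
check_sat_s_trunc_forms (void)
{
  for (int x = INT16_MIN; x <= INT16_MAX; x++)
    assert (sat_s_trunc_i16_to_i8_fmt_1 ((int16_t) x)
	    == sat_s_trunc_i16_to_i8_fmt_8 ((int16_t) x));
}
```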
2024-10-21 | RISC-V: Add testcases for form 7 of vector signed SAT_TRUNC | Pan Li | 13 | -0/+172
Form 7: #define DEF_VEC_SAT_S_TRUNC_FMT_7(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline)) \ vec_sat_s_trunc_##NT##_##WT##_fmt_7 (NT *out, WT *in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT x = in[i]; \ NT trunc = (NT)x; \ out[i] = (WT)NT_MIN > x || x >= (WT)NT_MAX \ ? x < 0 ? NT_MIN : NT_MAX \ : trunc; \ } \ } The below test are passed for this patch. * The rv64gcv fully regression test. It is test only patch and obvious up to a point, will commit it directly if no comments in next 48H. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i64-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-21 | RISC-V: Add testcases for form 6 of vector signed SAT_TRUNC | Pan Li | 13 | -0/+172
Form 6: #define DEF_VEC_SAT_S_TRUNC_FMT_6(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline)) \ vec_sat_s_trunc_##NT##_##WT##_fmt_6 (NT *out, WT *in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT x = in[i]; \ NT trunc = (NT)x; \ out[i] = (WT)NT_MIN >= x || x > (WT)NT_MAX \ ? x < 0 ? NT_MIN : NT_MAX \ : trunc; \ } \ } The below test are passed for this patch. * The rv64gcv fully regression test. It is test only patch and obvious up to a point, will commit it directly if no comments in next 48H. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i64-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-21 | RISC-V: Add testcases for form 5 of vector signed SAT_TRUNC | Pan Li | 13 | -0/+172
Form 5: #define DEF_VEC_SAT_S_TRUNC_FMT_5(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline)) \ vec_sat_s_trunc_##NT##_##WT##_fmt_5 (NT *out, WT *in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT x = in[i]; \ NT trunc = (NT)x; \ out[i] = (WT)NT_MIN > x || x > (WT)NT_MAX \ ? x < 0 ? NT_MIN : NT_MAX \ : trunc; \ } \ } The below test are passed for this patch. * The rv64gcv fully regression test. It is test only patch and obvious up to a point, will commit it directly if no comments in next 48H. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i64-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-21 | RISC-V: Add testcases for form 4 of vector signed SAT_TRUNC | Pan Li | 13 | -0/+172
Form 4: #define DEF_VEC_SAT_S_TRUNC_FMT_4(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline)) \ vec_sat_s_trunc_##NT##_##WT##_fmt_4 (NT *out, WT *in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT x = in[i]; \ NT trunc = (NT)x; \ out[i] = (WT)NT_MIN <= x && x < (WT)NT_MAX \ ? trunc \ : x < 0 ? NT_MIN : NT_MAX; \ } \ } The below test are passed for this patch. * The rv64gcv fully regression test. It is test only patch and obvious up to a point, will commit it directly if no comments in next 48H. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i64-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-21 | RISC-V: Add testcases for form 3 of vector signed SAT_TRUNC | Pan Li | 13 | -0/+172
Form 3: #define DEF_VEC_SAT_S_TRUNC_FMT_3(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline)) \ vec_sat_s_trunc_##NT##_##WT##_fmt_3 (NT *out, WT *in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT x = in[i]; \ NT trunc = (NT)x; \ out[i] = (WT)NT_MIN < x && x < (WT)NT_MAX \ ? trunc \ : x < 0 ? NT_MIN : NT_MAX; \ } \ } The below test are passed for this patch. * The rv64gcv fully regression test. It is test only patch and obvious up to a point, will commit it directly if no comments in next 48H. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i64-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-21 | RISC-V: Add testcases for form 2 of vector signed SAT_TRUNC | Pan Li | 13 | -0/+172
Form 2: #define DEF_VEC_SAT_S_TRUNC_FMT_2(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline)) \ vec_sat_s_trunc_##NT##_##WT##_fmt_2 (NT *out, WT *in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT x = in[i]; \ NT trunc = (NT)x; \ out[i] = (WT)NT_MIN < x && x < (WT)NT_MAX \ ? trunc \ : x < 0 ? NT_MIN : NT_MAX; \ } \ } The below test are passed for this patch. * The rv64gcv fully regression test. It is test only patch and obvious up to a point, will commit it directly if no comments in next 48H. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i64-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-21 | RISC-V: Add testcases for form 1 of vector signed SAT_TRUNC | Pan Li | 14 | -0/+463
Form 1: #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline)) \ vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT x = in[i]; \ NT trunc = (NT)x; \ out[i] = (WT)NT_MIN <= x && x <= (WT)NT_MAX \ ? trunc \ : x < 0 ? NT_MIN : NT_MAX; \ } \ } The below test are passed for this patch. * The rv64gcv fully regression test. It is test only patch and obvious up to a point, will commit it directly if no comments in next 48H. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/unop/vec_sat_data.h: Add test data for signed SAT_TRUNC. * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i64-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i16-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i32-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i32-to-i8.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i64-to-i16.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i64-to-i32.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-21 | RISC-V: Implement vector SAT_TRUNC for signed integer | Pan Li | 3 | -0/+84
This patch implements the sstrunc pattern for vector signed integers.

Form 1:
  #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \
  void __attribute__((noinline)) \
  vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *in, unsigned limit) \
  { \
    unsigned i; \
    for (i = 0; i < limit; i++) \
      { \
        WT x = in[i]; \
        NT trunc = (NT)x; \
        out[i] = (WT)NT_MIN <= x && x <= (WT)NT_MAX \
          ? trunc \
          : x < 0 ? NT_MIN : NT_MAX; \
      } \
  }

  DEF_VEC_SAT_S_TRUNC_FMT_1(int32_t, int64_t, INT32_MIN, INT32_MAX)

Before this patch:
  vsetvli a5,a2,e64,m1,ta,ma
  vle64.v v1,0(a1)
  slli a3,a5,3
  slli a4,a5,2
  sub a2,a2,a5
  add a1,a1,a3
  vadd.vv v0,v1,v5
  vsetvli zero,zero,e32,mf2,ta,ma
  vnsrl.wx v2,v1,a6
  vncvt.x.x.w v1,v1
  vsetvli zero,zero,e64,m1,ta,ma
  vmsgtu.vv v0,v0,v4
  vsetvli zero,zero,e32,mf2,ta,mu
  vneg.v v2,v2
  vxor.vv v1,v2,v3,v0.t
  vse32.v v1,0(a0)
  add a0,a0,a4
  bne a2,zero,.L3

After this patch:
  vsetvli a5,a2,e32,mf2,ta,ma
  vle64.v v1,0(a1)
  slli a3,a5,3
  slli a4,a5,2
  sub a2,a2,a5
  add a1,a1,a3
  vnclip.wi v1,v1,0
  vse32.v v1,0(a0)
  add a0,a0,a4
  bne a2,zero,.L3

The following test suite passes with this patch.
* The rv64gcv full regression test.

gcc/ChangeLog:

	* config/riscv/autovec.md (sstrunc<mode><v_double_trunc>2): Add new pattern sstrunc for double trunc.
	(sstrunc<mode><v_quad_trunc>2): Ditto but for quad trunc.
	(sstrunc<mode><v_oct_trunc>2): Ditto but for oct trunc.
	* config/riscv/riscv-protos.h (expand_vec_double_sstrunc): Add new func decl to expand double trunc.
	(expand_vec_quad_sstrunc): Ditto but for quad trunc.
	(expand_vec_oct_sstrunc): Ditto but for oct trunc.
	* config/riscv/riscv-v.cc (expand_vec_double_sstrunc): Add new func to expand double trunc.
	(expand_vec_quad_sstrunc): Ditto but for quad trunc.
	(expand_vec_oct_sstrunc): Ditto but for oct trunc.

Signed-off-by: Pan Li <pan2.li@intel.com>
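For reference, the per-element behaviour that the single vnclip.wi in the "after" listing is expected to provide for the int64_t to int32_t case can be written as a plain clamp. This is a scalar reference sketch with hypothetical function names, not code from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a 64-bit signed value into the int32_t range, then narrow.  */
int32_t
sat_s_trunc_i64_to_i32 (int64_t x)
{
  if (x < INT32_MIN)
    return INT32_MIN;
  if (x > INT32_MAX)
    return INT32_MAX;
  return (int32_t) x;
}

/* Element-wise loop with the same signature shape as the form 1 macro
   instantiated for (int32_t, int64_t).  */
void
vec_sat_s_trunc_i32_i64_ref (int32_t *out, const int64_t *in, unsigned limit)
{
  for (unsigned i = 0; i < limit; i++)
    out[i] = sat_s_trunc_i64_to_i32 (in[i]);
}

/* Small self-check covering both saturation directions and the
   pass-through case.  */
void
check_vec_sat_s_trunc_ref (void)
{
  const int64_t in[4] = { INT64_MIN, -1, 42, INT64_MAX };
  int32_t out[4];
  vec_sat_s_trunc_i32_i64_ref (out, in, 4);
  assert (out[0] == INT32_MIN);
  assert (out[1] == -1);
  assert (out[2] == 42);
  assert (out[3] == INT32_MAX);
}
```

A reference like this is mainly useful for generating expected data for run tests; the committed tests use their own data tables in vec_sat_data.h.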
2024-10-21Vect: Try the pattern of vector signed integer SAT_TRUNCPan Li1-1/+3
As with the vector unsigned integer SAT_TRUNC, try to match the signed version during vector pattern matching. The following test suites passed for this patch: * The rv64gcv full regression test. * The x86 bootstrap test. * The x86 full regression test. gcc/ChangeLog: * tree-vect-patterns.cc (gimple_signed_integer_sat_trunc): Add new func decl for signed SAT_TRUNC. (vect_recog_sat_trunc_pattern): Try signed match pattern for the SAT_TRUNC. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-21Match: Support form 1 for vector signed integer SAT_TRUNCPan Li1-1/+3
This patch supports form 1 of the vector signed integer SAT_TRUNC, as in the example below.

Form 1:
  #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \
  void __attribute__((noinline)) \
  vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *in, unsigned limit) \
  { \
    unsigned i; \
    for (i = 0; i < limit; i++) \
      { \
        WT x = in[i]; \
        NT trunc = (NT)x; \
        out[i] = (WT)NT_MIN <= x && x <= (WT)NT_MAX \
          ? trunc \
          : x < 0 ? NT_MIN : NT_MAX; \
      } \
  }

  DEF_VEC_SAT_S_TRUNC_FMT_1(int32_t, int64_t, INT32_MIN, INT32_MAX)

Before this patch:
  _87 = .SELECT_VL (ivtmp_85, POLY_INT_CST [2, 2]);
  ivtmp_64 = _87 * 8;
  vect_x_14.10_67 = .MASK_LEN_LOAD (vectp_in.8_65, 64B, { -1, ... }, _87, 0);
  vect_trunc_15.21_78 = (vector([2,2]) int) vect_x_14.10_67;
  _61 = VIEW_CONVERT_EXPR<vector([2,2]) unsigned long>(vect_x_14.10_67);
  _32 = _61 >> 63;
  vect_patt_52.16_73 = (vector([2,2]) int) _32;
  vect__46.17_74 = VIEW_CONVERT_EXPR<vector([2,2]) unsigned int>(vect_patt_52.16_73);
  vect__47.18_75 = -vect__46.17_74;
  vect__21.19_76 = VIEW_CONVERT_EXPR<vector([2,2]) int>(vect__47.18_75);
  vect_x.11_68 = VIEW_CONVERT_EXPR<vector([2,2]) unsigned long>(vect_x_14.10_67);
  vect__5.12_69 = vect_x.11_68 + { 2147483648, ... };
  mask__34.13_70 = vect__5.12_69 > { 4294967295, ... };
  _25 = .COND_XOR (mask__34.13_70, vect__21.19_76, { 2147483647, ... }, vect_trunc_15.21_78);
  ivtmp_80 = _87 * 4;
  .MASK_LEN_STORE (vectp_out.23_81, 32B, { -1, ... }, _87, 0, _25);
  vectp_in.8_66 = vectp_in.8_65 + ivtmp_64;
  vectp_out.23_82 = vectp_out.23_81 + ivtmp_80;
  ivtmp_86 = ivtmp_85 - _87;

After this patch:
  _77 = .SELECT_VL (ivtmp_75, POLY_INT_CST [2, 2]);
  ivtmp_65 = _77 * 8;
  vect_x_14.10_68 = .MASK_LEN_LOAD (vectp_in.8_66, 64B, { -1, ... }, _77, 0);
  vect_patt_53.11_69 = .SAT_TRUNC (vect_x_14.10_68);
  ivtmp_70 = _77 * 4;
  .MASK_LEN_STORE (vectp_out.12_71, 32B, { -1, ... }, _77, 0, vect_patt_53.11_69);
  vectp_in.8_67 = vectp_in.8_66 + ivtmp_65;
  vectp_out.12_72 = vectp_out.12_71 + ivtmp_70;
  ivtmp_76 = ivtmp_75 - _77;

The following test suites passed for this patch:
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:
* match.pd: Refine matching for vector signed SAT_TRUNC form 1.

Signed-off-by: Pan Li <pan2.li@intel.com>
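The "before" dump computes the saturated value without branches: a range test on the biased unsigned input selects either the truncated value or the negated sign bit XORed with 2147483647. The sketch below is one reading of that sequence in scalar C, checked against a plain clamp. The function names are hypothetical, and the `memcpy` helper is there only to avoid implementation-defined narrowing conversions in the model.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret 32 bits as a signed value without relying on
   implementation-defined narrowing conversions.  */
static int32_t
bits_to_i32 (uint32_t u)
{
  int32_t r;
  memcpy (&r, &u, sizeof r);
  return r;
}

/* Hand-written scalar model of the pre-patch GIMPLE sequence
   (hypothetical name, not taken from the patch).  */
static int32_t
sat_trunc_branchless (int64_t x)
{
  uint64_t ux = (uint64_t) x;
  uint32_t trunc = (uint32_t) ux;            /* Low 32 bits of x.  */
  uint32_t sign = (uint32_t) (ux >> 63);     /* 1 if x is negative.  */
  uint32_t sat = (0u - sign) ^ 0x7fffffffu;  /* INT32_MIN or INT32_MAX bits.  */
  /* Biasing by 2^31 maps the int32_t range onto [0, 2^32 - 1].  */
  int out_of_range = (ux + 2147483648u) > 4294967295u;
  return bits_to_i32 (out_of_range ? sat : trunc);
}

/* Direct clamp: the per-element meaning assumed for .SAT_TRUNC here.  */
static int32_t
sat_trunc_clamp (int64_t x)
{
  if (x < INT32_MIN)
    return INT32_MIN;
  if (x > INT32_MAX)
    return INT32_MAX;
  return (int32_t) x;
}
```

Both functions should agree on every input, which is why the longer sequence can be folded into a single internal function call once the pattern is recognized.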
2024-10-21aarch64: Fix costing of move to/from MOVEABLE_SYSREGSAndrew Carlotti1-0/+6
This is necessary to prevent reload assuming that a direct FP->FPMR move is valid. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_register_move_cost): Increase costs involving MOVEABLE_SYSREGS.
2024-10-21amdgcn: silence warningAndrew Stubbs1-1/+1
FIRST_SGPR_REG is register zero so the compiler always claims this comparison is redundant. It's right, of course, but I'd have preferred to keep the comparison for completeness. Probably the "correct" solution is to use an enum for these values. gcc/ChangeLog: * config/gcn/gcn.h (SGPR_REGNO_P): Silence warning.
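The warning in question is easy to reproduce in isolation. The sketch below uses stand-in definitions: the value 101 for `LAST_SGPR_REG` and the `_FULL`/`_SHORT` macro names are assumptions made for illustration, and the upper-bound-only form is just one way such a warning can be silenced, not necessarily the exact change made in gcn.h.

```c
/* Stand-in register numbering for illustration only; the real
   definitions live in config/gcn/gcn.h.  */
#define FIRST_SGPR_REG 0
#define LAST_SGPR_REG 101   /* Assumed value, not taken from the commit.  */

/* Complete range check.  With an unsigned argument the lower-bound
   test compares against zero and is always true, which is the kind
   of comparison a compiler may flag as redundant.  */
#define SGPR_REGNO_P_FULL(N) \
  ((N) >= FIRST_SGPR_REG && (N) <= LAST_SGPR_REG)

/* Upper bound only: equivalent for non-negative register numbers.  */
#define SGPR_REGNO_P_SHORT(N) ((N) <= LAST_SGPR_REG)
```

The two forms agree for every non-negative register number, so dropping the lower bound loses nothing as long as `FIRST_SGPR_REG` stays zero.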
2024-10-21pair-fusion: Assume alias conflict if common address reg changes [PR116783]Alex Coplan2-12/+213
As the PR shows, pair-fusion was tricking memory_modified_in_insn_p into returning false when a common base register (in this case, x1) was modified between the mem and the store insn. This led to wrong code as the accesses really did alias. To avoid this sort of problem, this patch avoids invoking RTL alias analysis altogether (and assumes an alias conflict) if the two insns to be compared share a common address register R, and the insns see different definitions of R (i.e. it was modified in between). gcc/ChangeLog: PR rtl-optimization/116783 * pair-fusion.cc (def_walker::cand_addr_uses): New. (def_walker::def_walker): Add parameter for candidate address uses. (def_walker::alias_conflict_p): Declare. (def_walker::addr_reg_conflict_p): New. (def_walker::conflict_p): New. (store_walker::store_walker): Add parameter for candidate address uses and pass to base ctor. (store_walker::conflict_p): Rename to ... (store_walker::alias_conflict_p): ... this. (load_walker::load_walker): Add parameter for candidate address uses and pass to base ctor. (load_walker::conflict_p): Rename to ... (load_walker::alias_conflict_p): ... this. (pair_fusion_bb_info::try_fuse_pair): Collect address register uses for candidate insns and pass down to alias walkers. gcc/testsuite/ChangeLog: PR rtl-optimization/116783 * g++.dg/torture/pr116783.C: New test.
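The rule described above can be sketched as a conservative check placed in front of alias analysis. Everything in this model is hypothetical and heavily simplified: `struct addr_use`, the `def_id` field, and the callback standing in for RTL alias analysis are illustration devices, not the data structures used in pair-fusion.cc.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for one address register use: which register
   is read, and which definition of it the insn sees.  */
struct addr_use
{
  int regno;
  int def_id;
};

/* True if the two insns read a common address register but see
   different definitions of it.  */
static bool
addr_reg_conflict (const struct addr_use *a, size_t na,
                   const struct addr_use *b, size_t nb)
{
  for (size_t i = 0; i < na; i++)
    for (size_t j = 0; j < nb; j++)
      if (a[i].regno == b[j].regno && a[i].def_id != b[j].def_id)
        return true;
  return false;
}

/* Assume a conflict when the address registers disagree; only
   otherwise ask alias analysis (modelled here as a callback).  */
static bool
conflict_p (const struct addr_use *a, size_t na,
            const struct addr_use *b, size_t nb,
            bool (*alias_conflict) (void))
{
  if (addr_reg_conflict (a, na, b, nb))
    return true;
  return alias_conflict ();
}

/* Trivial stand-ins for the alias analysis result.  */
static bool never_alias (void) { return false; }
static bool always_alias (void) { return true; }
```

The point of the ordering is that alias analysis is never consulted with addresses whose base register changed in between, so it cannot be misled into reporting "no conflict".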
2024-10-21rs6000: Correct the function code for _AMO_LD_DEC_BOUNDEDJeevitha1-1/+1
Corrected the function code for the Atomic Memory Operation "Fetch and Decrement Bounded", changing it from 0x1A to 0x1C. 2024-10-11 Jeevitha Palanisamy <jeevitha@linux.ibm.com> gcc/ * config/rs6000/amo.h (enum _AMO_LD): Correct the function code for _AMO_LD_DEC_BOUNDED.
2024-10-21i386: Refactor get_intel_cpuHaochen Jiang1-295/+292
The ISE shows that Diamond Rapids will use family 0x13. Therefore, get_intel_cpu needs to be refactored to accept new families. The switch has also been reordered for clarity, with earlier added products placed on top to make searching easier. gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_intel_cpu): Refactor the function for future expansion on different family.
2024-10-21RISC-V: Skip flag -flto for all saturated arithmetic test cases.xuli232-0/+232
Skip flag -flto to address UNRESOLVED cases as follows: gcc.target/riscv/sat_s_add-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects: output file does not exist UNRESOLVED: gcc.target/riscv/sat_s_add-1.c Change-Id: I7ff55197b6294cd473dfaa6cc350c5e2eb5960fe Signed-off-by: Li Xu <xuli1@eswincomputing.com> gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_s_add-1.c: Skip flag -flto. * gcc.target/riscv/sat_s_add-10.c: Ditto. * gcc.target/riscv/sat_s_add-11.c: Ditto. * gcc.target/riscv/sat_s_add-12.c: Ditto. * gcc.target/riscv/sat_s_add-13.c: Ditto. * gcc.target/riscv/sat_s_add-14.c: Ditto. * gcc.target/riscv/sat_s_add-15.c: Ditto. * gcc.target/riscv/sat_s_add-16.c: Ditto. * gcc.target/riscv/sat_s_add-2.c: Ditto. * gcc.target/riscv/sat_s_add-3.c: Ditto. * gcc.target/riscv/sat_s_add-4.c: Ditto. * gcc.target/riscv/sat_s_add-5.c: Ditto. * gcc.target/riscv/sat_s_add-6.c: Ditto. * gcc.target/riscv/sat_s_add-7.c: Ditto. * gcc.target/riscv/sat_s_add-8.c: Ditto. * gcc.target/riscv/sat_s_add-9.c: Ditto. * gcc.target/riscv/sat_s_sub-1-i16.c: Ditto. * gcc.target/riscv/sat_s_sub-1-i32.c: Ditto. * gcc.target/riscv/sat_s_sub-1-i64.c: Ditto. * gcc.target/riscv/sat_s_sub-1-i8.c: Ditto. * gcc.target/riscv/sat_s_sub-2-i16.c: Ditto. * gcc.target/riscv/sat_s_sub-2-i32.c: Ditto. * gcc.target/riscv/sat_s_sub-2-i64.c: Ditto. * gcc.target/riscv/sat_s_sub-2-i8.c: Ditto. * gcc.target/riscv/sat_s_sub-3-i16.c: Ditto. * gcc.target/riscv/sat_s_sub-3-i32.c: Ditto. * gcc.target/riscv/sat_s_sub-3-i64.c: Ditto. * gcc.target/riscv/sat_s_sub-3-i8.c: Ditto. * gcc.target/riscv/sat_s_sub-4-i16.c: Ditto. * gcc.target/riscv/sat_s_sub-4-i32.c: Ditto. * gcc.target/riscv/sat_s_sub-4-i64.c: Ditto. * gcc.target/riscv/sat_s_sub-4-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-1-i16-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-1-i32-to-i16.c: Ditto. * gcc.target/riscv/sat_s_trunc-1-i32-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-1-i64-to-i16.c: Ditto. * gcc.target/riscv/sat_s_trunc-1-i64-to-i32.c: Ditto. 
* gcc.target/riscv/sat_s_trunc-1-i64-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-2-i16-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-2-i32-to-i16.c: Ditto. * gcc.target/riscv/sat_s_trunc-2-i32-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-2-i64-to-i16.c: Ditto. * gcc.target/riscv/sat_s_trunc-2-i64-to-i32.c: Ditto. * gcc.target/riscv/sat_s_trunc-2-i64-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-3-i16-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-3-i32-to-i16.c: Ditto. * gcc.target/riscv/sat_s_trunc-3-i32-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-3-i64-to-i16.c: Ditto. * gcc.target/riscv/sat_s_trunc-3-i64-to-i32.c: Ditto. * gcc.target/riscv/sat_s_trunc-3-i64-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-4-i16-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-4-i32-to-i16.c: Ditto. * gcc.target/riscv/sat_s_trunc-4-i32-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-4-i64-to-i16.c: Ditto. * gcc.target/riscv/sat_s_trunc-4-i64-to-i32.c: Ditto. * gcc.target/riscv/sat_s_trunc-4-i64-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-5-i16-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-5-i32-to-i16.c: Ditto. * gcc.target/riscv/sat_s_trunc-5-i32-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-5-i64-to-i16.c: Ditto. * gcc.target/riscv/sat_s_trunc-5-i64-to-i32.c: Ditto. * gcc.target/riscv/sat_s_trunc-5-i64-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-6-i16-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-6-i32-to-i16.c: Ditto. * gcc.target/riscv/sat_s_trunc-6-i32-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-6-i64-to-i16.c: Ditto. * gcc.target/riscv/sat_s_trunc-6-i64-to-i32.c: Ditto. * gcc.target/riscv/sat_s_trunc-6-i64-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-7-i16-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-7-i32-to-i16.c: Ditto. * gcc.target/riscv/sat_s_trunc-7-i32-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-7-i64-to-i16.c: Ditto. * gcc.target/riscv/sat_s_trunc-7-i64-to-i32.c: Ditto. * gcc.target/riscv/sat_s_trunc-7-i64-to-i8.c: Ditto. 
* gcc.target/riscv/sat_s_trunc-8-i16-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-8-i32-to-i16.c: Ditto. * gcc.target/riscv/sat_s_trunc-8-i32-to-i8.c: Ditto. * gcc.target/riscv/sat_s_trunc-8-i64-to-i16.c: Ditto. * gcc.target/riscv/sat_s_trunc-8-i64-to-i32.c: Ditto. * gcc.target/riscv/sat_s_trunc-8-i64-to-i8.c: Ditto. * gcc.target/riscv/sat_u_add-1.c: Ditto. * gcc.target/riscv/sat_u_add-10.c: Ditto. * gcc.target/riscv/sat_u_add-11.c: Ditto. * gcc.target/riscv/sat_u_add-12.c: Ditto. * gcc.target/riscv/sat_u_add-13.c: Ditto. * gcc.target/riscv/sat_u_add-14.c: Ditto. * gcc.target/riscv/sat_u_add-15.c: Ditto. * gcc.target/riscv/sat_u_add-16.c: Ditto. * gcc.target/riscv/sat_u_add-17.c: Ditto. * gcc.target/riscv/sat_u_add-18.c: Ditto. * gcc.target/riscv/sat_u_add-19.c: Ditto. * gcc.target/riscv/sat_u_add-2.c: Ditto. * gcc.target/riscv/sat_u_add-20.c: Ditto. * gcc.target/riscv/sat_u_add-21.c: Ditto. * gcc.target/riscv/sat_u_add-22.c: Ditto. * gcc.target/riscv/sat_u_add-23.c: Ditto. * gcc.target/riscv/sat_u_add-24.c: Ditto. * gcc.target/riscv/sat_u_add-3.c: Ditto. * gcc.target/riscv/sat_u_add-4.c: Ditto. * gcc.target/riscv/sat_u_add-5.c: Ditto. * gcc.target/riscv/sat_u_add-6.c: Ditto. * gcc.target/riscv/sat_u_add-7.c: Ditto. * gcc.target/riscv/sat_u_add-8.c: Ditto. * gcc.target/riscv/sat_u_add-9.c: Ditto. * gcc.target/riscv/sat_u_add_imm-1.c: Ditto. * gcc.target/riscv/sat_u_add_imm-10.c: Ditto. * gcc.target/riscv/sat_u_add_imm-11.c: Ditto. * gcc.target/riscv/sat_u_add_imm-12.c: Ditto. * gcc.target/riscv/sat_u_add_imm-13.c: Ditto. * gcc.target/riscv/sat_u_add_imm-14.c: Ditto. * gcc.target/riscv/sat_u_add_imm-15.c: Ditto. * gcc.target/riscv/sat_u_add_imm-16.c: Ditto. * gcc.target/riscv/sat_u_add_imm-2.c: Ditto. * gcc.target/riscv/sat_u_add_imm-3.c: Ditto. * gcc.target/riscv/sat_u_add_imm-4.c: Ditto. * gcc.target/riscv/sat_u_add_imm-5.c: Ditto. * gcc.target/riscv/sat_u_add_imm-6.c: Ditto. * gcc.target/riscv/sat_u_add_imm-7.c: Ditto. 
* gcc.target/riscv/sat_u_add_imm-8.c: Ditto. * gcc.target/riscv/sat_u_add_imm-9.c: Ditto. * gcc.target/riscv/sat_u_sub-1.c: Ditto. * gcc.target/riscv/sat_u_sub-10.c: Ditto. * gcc.target/riscv/sat_u_sub-11.c: Ditto. * gcc.target/riscv/sat_u_sub-12.c: Ditto. * gcc.target/riscv/sat_u_sub-13.c: Ditto. * gcc.target/riscv/sat_u_sub-14.c: Ditto. * gcc.target/riscv/sat_u_sub-15.c: Ditto. * gcc.target/riscv/sat_u_sub-16.c: Ditto. * gcc.target/riscv/sat_u_sub-17.c: Ditto. * gcc.target/riscv/sat_u_sub-18.c: Ditto. * gcc.target/riscv/sat_u_sub-19.c: Ditto. * gcc.target/riscv/sat_u_sub-2.c: Ditto. * gcc.target/riscv/sat_u_sub-20.c: Ditto. * gcc.target/riscv/sat_u_sub-21.c: Ditto. * gcc.target/riscv/sat_u_sub-22.c: Ditto. * gcc.target/riscv/sat_u_sub-23.c: Ditto. * gcc.target/riscv/sat_u_sub-24.c: Ditto. * gcc.target/riscv/sat_u_sub-25.c: Ditto. * gcc.target/riscv/sat_u_sub-26.c: Ditto. * gcc.target/riscv/sat_u_sub-27.c: Ditto. * gcc.target/riscv/sat_u_sub-28.c: Ditto. * gcc.target/riscv/sat_u_sub-29.c: Ditto. * gcc.target/riscv/sat_u_sub-3.c: Ditto. * gcc.target/riscv/sat_u_sub-30.c: Ditto. * gcc.target/riscv/sat_u_sub-31.c: Ditto. * gcc.target/riscv/sat_u_sub-32.c: Ditto. * gcc.target/riscv/sat_u_sub-33.c: Ditto. * gcc.target/riscv/sat_u_sub-34.c: Ditto. * gcc.target/riscv/sat_u_sub-35.c: Ditto. * gcc.target/riscv/sat_u_sub-36.c: Ditto. * gcc.target/riscv/sat_u_sub-37.c: Ditto. * gcc.target/riscv/sat_u_sub-38.c: Ditto. * gcc.target/riscv/sat_u_sub-39.c: Ditto. * gcc.target/riscv/sat_u_sub-4.c: Ditto. * gcc.target/riscv/sat_u_sub-40.c: Ditto. * gcc.target/riscv/sat_u_sub-41.c: Ditto. * gcc.target/riscv/sat_u_sub-42.c: Ditto. * gcc.target/riscv/sat_u_sub-43.c: Ditto. * gcc.target/riscv/sat_u_sub-44.c: Ditto. * gcc.target/riscv/sat_u_sub-45.c: Ditto. * gcc.target/riscv/sat_u_sub-46.c: Ditto. * gcc.target/riscv/sat_u_sub-47.c: Ditto. * gcc.target/riscv/sat_u_sub-48.c: Ditto. * gcc.target/riscv/sat_u_sub-5.c: Ditto. * gcc.target/riscv/sat_u_sub-6.c: Ditto. 
* gcc.target/riscv/sat_u_sub-7.c: Ditto. * gcc.target/riscv/sat_u_sub-8.c: Ditto. * gcc.target/riscv/sat_u_sub-9.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-1.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-10.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-10_1.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-10_2.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-11.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-11_1.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-11_2.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-12.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-13.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-13_1.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-13_2.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-14.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-14_1.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-14_2.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-15.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-15_1.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-15_2.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-16.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-1_1.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-1_2.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-2.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-2_1.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-2_2.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-3.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-3_1.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-3_2.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-4.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-5.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-5_1.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-5_2.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-6.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-6_1.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-6_2.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-7.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-7_1.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-7_2.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-8.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-9.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-9_1.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-9_2.c: Ditto. 
* gcc.target/riscv/sat_u_trunc-1.c: Ditto. * gcc.target/riscv/sat_u_trunc-10.c: Ditto. * gcc.target/riscv/sat_u_trunc-11.c: Ditto. * gcc.target/riscv/sat_u_trunc-12.c: Ditto. * gcc.target/riscv/sat_u_trunc-13.c: Ditto. * gcc.target/riscv/sat_u_trunc-14.c: Ditto. * gcc.target/riscv/sat_u_trunc-15.c: Ditto. * gcc.target/riscv/sat_u_trunc-16.c: Ditto. * gcc.target/riscv/sat_u_trunc-17.c: Ditto. * gcc.target/riscv/sat_u_trunc-18.c: Ditto. * gcc.target/riscv/sat_u_trunc-19.c: Ditto. * gcc.target/riscv/sat_u_trunc-2.c: Ditto. * gcc.target/riscv/sat_u_trunc-20.c: Ditto. * gcc.target/riscv/sat_u_trunc-21.c: Ditto. * gcc.target/riscv/sat_u_trunc-22.c: Ditto. * gcc.target/riscv/sat_u_trunc-23.c: Ditto. * gcc.target/riscv/sat_u_trunc-24.c: Ditto. * gcc.target/riscv/sat_u_trunc-3.c: Ditto. * gcc.target/riscv/sat_u_trunc-4.c: Ditto. * gcc.target/riscv/sat_u_trunc-5.c: Ditto. * gcc.target/riscv/sat_u_trunc-6.c: Ditto. * gcc.target/riscv/sat_u_trunc-7.c: Ditto. * gcc.target/riscv/sat_u_trunc-8.c: Ditto. * gcc.target/riscv/sat_u_trunc-9.c: Ditto.
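For readers unfamiliar with the mechanism, skipping a flag in a DejaGnu test is normally done with a `dg-skip-if` directive in the test's leading comments. The fragment below is a hedged sketch of what such a test header might look like: the directive follows the documented `dg-skip-if` syntax, but the test body and the function name `sat_u_add_u8` are made up here and are not copied from any of the files listed above.

```c
/* { dg-do compile } */
/* { dg-skip-if "" { *-*-* } { "-flto" } } */

#include <stdint.h>

/* Illustrative test body: unsigned saturating add on 8-bit values.
   The sum wrapped around exactly when it is smaller than an operand.  */
uint8_t
sat_u_add_u8 (uint8_t x, uint8_t y)
{
  uint8_t sum = (uint8_t) (x + y);
  return sum < x ? UINT8_MAX : sum;
}
```

With the directive in place, option combinations containing `-flto` are skipped for every target instead of being reported as UNRESOLVED when no output file is produced.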
2024-10-21[testsuite] [arm] add effective target and options for pacbti testsAlexandre Oliva6-12/+17
arm pac and bti tests that use -march=armv8.1-m.main get an implicit -mthumb, that is incompatible with vxworks kernel mode. Declaring the requirement for a 8.1-m.main-compatible toolchain is enough to avoid those fails, because the toolchain feature test fails in kernel mode, but taking the -march options from the standardized arch tests, after testing for support for the corresponding effective target, makes it generally safer, and enables us to drop skip directives and extraneous option variants. for gcc/testsuite/ChangeLog * gcc.target/arm/bti-1.c: Require arch, use its opts, drop skip. * gcc.target/arm/bti-2.c: Likewise. * gcc.target/arm/acle/pacbti-m-predef-11.c: Likewise. * gcc.target/arm/acle/pacbti-m-predef-12.c: Likewise. * gcc.target/arm/acle/pacbti-m-predef-7.c: Likewise. * g++.target/arm/pac-1.C: Likewise. Drop +mve.
2024-10-21Refine splitters related to "combine vpcmpuw + zero_extend to vpcmpuw"liuhongt4-121/+125
r12-6103-g1a7ce8570997eb combines vpcmpuw + zero_extend to vpcmpuw with the pre_reload splitter, but the splitter transforms the zero_extend into a subreg, which makes reload think the upper part is garbage; that is not correct. The patch changes the zero_extend define_insn_and_split to a define_insn to keep the zero_extend. gcc/ChangeLog: PR target/117159 * config/i386/sse.md (*<avx512>_cmp<V48H_AVX512VL:mode>3_zero_extend<SWI248x:mode>): Change from define_insn_and_split to define_insn. (*<avx512>_cmp<VI12_AVX512VL:mode>3_zero_extend<SWI248x:mode>): Ditto. (*<avx512>_ucmp<VI12_AVX512VL:mode>3_zero_extend<SWI248x:mode>): Ditto. (*<avx512>_ucmp<VI48_AVX512VL:mode>3_zero_extend<SWI248x:mode>): Ditto. (*<avx512>_cmp<V48H_AVX512VL:mode>3_zero_extend<SWI248x:mode>_2): Split to the zero_extend pattern. (*<avx512>_cmp<VI12_AVX512VL:mode>3_zero_extend<SWI248x:mode>_2): Ditto. (*<avx512>_ucmp<VI12_AVX512VL:mode>3_zero_extend<SWI248x:mode>_2): Ditto. (*<avx512>_ucmp<VI48_AVX512VL:mode>3_zero_extend<SWI248x:mode>_2): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr117159.c: New test. * gcc.target/i386/avx512bw-pr103750-1.c: Remove xfail. * gcc.target/i386/avx512bw-pr103750-2.c: Remove xfail.
2024-10-21Daily bump.GCC Administrator4-1/+27
2024-10-20Revert "[PATCH 7/7] RISC-V: Disable by pieces for vector setmem length > ↵Jeff Law4-35/+11
UNITS_PER_WORD" This reverts commit 72ceddbfb78dbb95f0808c3eca1765e8cd48b023.
2024-10-20modula2: M2MetaError.{def,mod} and P2SymBuild.mod further cleanupGaius Mulley3-14/+5
Further cleanups and improve the wording of an error message. gcc/m2/ChangeLog: * gm2-compiler/M2MetaError.mod (op): Corrected ordering. * gm2-compiler/P2SymBuild.def: Remove comment. * gm2-compiler/P2SymBuild.mod (GetComparison): Replace the word less with fewer. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2024-10-20Daily bump.GCC Administrator6-1/+518
2024-10-19diagnostics: libcpp: Improve locations for _Pragma lexing diagnostics [PR114423]Lewis Hyatt3-6/+28
libcpp is not currently set up to be able to generate valid locations for tokens lexed from a _Pragma string. Instead, after obtaining the tokens, it sets their locations all to the location of the _Pragma operator itself. This makes things like _Pragma("GCC diagnostic") work well enough, but if any diagnostics are issued during lexing, prior to resetting the token locations, those diagnostics get issued at the invalid locations. Fix that up by adding a new field pfile->diagnostic_override_loc that instructs libcpp to issue diagnostics at the alternate location. libcpp/ChangeLog: PR preprocessor/114423 * internal.h (struct cpp_reader): Add DIAGNOSTIC_OVERRIDE_LOC field. * directives.cc (destringize_and_run): Set the new field to the location of the _Pragma operator. * errors.cc (cpp_diagnostic_at): Support DIAGNOSTIC_OVERRIDE_LOC to temporarily issue diagnostics at a different location. (cpp_diagnostic_with_line): Likewise. gcc/testsuite/ChangeLog: PR preprocessor/114423 * c-c++-common/cpp/pragma-diagnostic-loc.c: New test. * c-c++-common/cpp/diagnostic-pragma-1.c: Adjust expected output. * g++.dg/pch/operator-1.C: Likewise.
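The override mechanism can be pictured with a very small model. This is only a sketch under stated assumptions: the struct, the function names, and the convention that 0 means "no override" are invented for illustration. Only the field name `diagnostic_override_loc` comes from the commit message.

```c
typedef unsigned int location_t;

/* Heavily reduced, hypothetical model of the reader state.  */
struct reader
{
  location_t diagnostic_override_loc;   /* 0 means "no override" here.  */
};

/* Pick the location a diagnostic is actually reported at.  */
static location_t
effective_loc (const struct reader *r, location_t requested)
{
  return r->diagnostic_override_loc ? r->diagnostic_override_loc : requested;
}

/* Model of lexing a _Pragma string: any diagnostic raised while the
   override is set lands on the _Pragma operator instead of on the
   not-yet-valid token location.  Returns the reported location.  */
static location_t
lex_pragma_string (struct reader *r, location_t pragma_loc,
                   location_t invalid_token_loc)
{
  location_t saved = r->diagnostic_override_loc;
  r->diagnostic_override_loc = pragma_loc;
  location_t reported = effective_loc (r, invalid_token_loc);
  r->diagnostic_override_loc = saved;   /* Restore once lexing is done.  */
  return reported;
}
```

Saving and restoring the field keeps the override scoped to the lexing of one _Pragma string, so diagnostics issued afterwards use their own locations again.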
2024-10-19modula2: Tidyup gm2-compiler/M2MetaError.modGaius Mulley1-25/+26
This patch is a tidyup for gm2-compiler/M2MetaError.mod. gcc/m2/ChangeLog: * gm2-compiler/M2MetaError.mod (op): Alphabetically order each case label and comment. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2024-10-19phiopt: do factor_out_conditional_operation for all phis [PR112418]Andrew Pinski7-63/+272
Sometimes factor_out_conditional_operation can factor out an operation that causes a phi node to become the same element. Other times, we want to factor out a binary operator because it can improve code generation; an example is PR 110015 (openjpeg). Note this includes a heuristic to decide whether factoring out the operation is profitable. It can be expanded to include a better live range extension detector. Right now it has a simple one: a value that is live on a dominating path is considered live, and if there are only a small number of assign statements (5 by default), factoring is taken not to extend the live range too much. Bootstrapped and tested on x86_64-linux-gnu. PR tree-optimization/112418 gcc/ChangeLog: * tree-ssa-phiopt.cc (is_factor_profitable): New function. (factor_out_conditional_operation): Add merge argument. Remove arg0/arg1 arguments. Return bool instead of the new phi. Early return for virtual ops. Call is_factor_profitable to check if the factoring would be profitable. (pass_phiopt::execute): Call factor_out_conditional_operation on all phis instead of just singleton phi. * doc/invoke.texi (--param phiopt-factor-max-stmts-live=): Document. * params.opt (--param=phiopt-factor-max-stmts-live=): New opt. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/factor_op_phi-1.c: New test. * gcc.dg/tree-ssa/factor_op_phi-2.c: New test. * gcc.dg/tree-ssa/factor_op_phi-3.c: New test. * gcc.dg/tree-ssa/factor_op_phi-4.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
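At source level, the effect of factoring an operation out of a phi can be illustrated with two equivalent functions. This is a hand-written sketch of the idea, not output from the pass, and the function names are made up for illustration.

```c
/* Before: the same operation (+ 1) appears in both arms, so the
   merge point joins two separately computed sums.  */
static int
before_factoring (int c, int a, int b)
{
  int x;
  if (c)
    x = a + 1;
  else
    x = b + 1;
  return x;
}

/* After: the merge point selects the operand and the addition is
   performed once on the selected value.  */
static int
after_factoring (int c, int a, int b)
{
  int t = c ? a : b;
  return t + 1;
}
```

The second form needs only one addition, but `a` and `b` must both stay live up to the selection, which is why the pass weighs the transformation against how far it extends live ranges.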
2024-10-19[PATCH][v5] RISC-V: add option -m(no-)autovec-segmentGreg McGary68-2/+397
Add option -m(no-)autovec-segment to enable/disable autovectorizer from emitting vector segment load/store instructions. This is useful for performance experiments. gcc/ChangeLog: * config/riscv/autovec.md (vec_mask_len_load_lanes, vec_mask_len_store_lanes): Predicate with TARGET_VECTOR_AUTOVEC_SEGMENT * config/riscv/riscv-opts.h (TARGET_VECTOR_AUTOVEC_SEGMENT): New macro. * config/riscv/riscv.opt (-m(no-)autovec-segment): New option. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-1.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-2.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-3.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-4.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-5.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-6.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-7.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-1.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-2.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-3.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-4.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-5.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-6.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-7.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-1.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-2.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-3.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-4.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-5.c: New test. 
* gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-6.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-7.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-1.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-2.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-3.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-4.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-5.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-6.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-7.c: New test. * gcc.target/riscv/rvv/autovec/no-segment.c: New test.
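A loop that reads interleaved struct fields is the typical shape where an autovectorizer may choose segment loads. The example below is a hypothetical illustration of that shape, not one of the new tests; `struct pair` and `deinterleave` are made-up names, and whether segment instructions are actually emitted depends on the target, the options, and the cost model.

```c
#include <stddef.h>
#include <stdint.h>

/* Two interleaved fields per element.  */
struct pair
{
  int32_t re;
  int32_t im;
};

/* Split the interleaved fields into two separate arrays.  Each
   iteration reads a two-field group, which is the access pattern a
   segment load is designed for.  */
void
deinterleave (int32_t *restrict re, int32_t *restrict im,
              const struct pair *restrict in, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      re[i] = in[i].re;
      im[i] = in[i].im;
    }
}
```

Building such a loop once with `-mautovec-segment` and once with `-mno-autovec-segment` and comparing the generated assembly is the kind of performance experiment the option is meant to support.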
2024-10-19Add missing dg-error to unsigned_38.f90.Thomas Koenig1-1/+1
gcc/testsuite/ChangeLog: PR fortran/117225 * gfortran.dg/unsigned_38.f90: Add missing dg-error directive.
2024-10-19[PATCH 7/7] RISC-V: Disable by pieces for vector setmem length > UNITS_PER_WORDCraig Blackmore4-11/+35
For fast unaligned access targets, by pieces uses up to UNITS_PER_WORD size pieces resulting in more store instructions than needed. For example gcc.target/riscv/rvv/base/setmem-1.c:f1 built with `-O3 -march=rv64gcv -mtune=thead-c906`:
```
f1:
        vsetivli zero,8,e8,mf2,ta,ma
        vmv.v.x v1,a1
        vsetivli zero,0,e32,mf2,ta,ma
        sb a1,14(a0)
        vmv.x.s a4,v1
        vsetivli zero,8,e16,m1,ta,ma
        vmv.x.s a5,v1
        vse8.v v1,0(a0)
        sw a4,8(a0)
        sh a5,12(a0)
        ret
```
The slow unaligned access version built with `-O3 -march=rv64gcv` used 15 sb instructions:
```
f1:
        sb a1,0(a0)
        sb a1,1(a0)
        sb a1,2(a0)
        sb a1,3(a0)
        sb a1,4(a0)
        sb a1,5(a0)
        sb a1,6(a0)
        sb a1,7(a0)
        sb a1,8(a0)
        sb a1,9(a0)
        sb a1,10(a0)
        sb a1,11(a0)
        sb a1,12(a0)
        sb a1,13(a0)
        sb a1,14(a0)
        ret
```
After this patch, the following is generated in both cases:
```
f1:
        vsetivli zero,15,e8,m1,ta,ma
        vmv.v.x v1,a1
        vse8.v v1,0(a0)
        ret
```

gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_use_by_pieces_infrastructure_p): New function.
(TARGET_USE_BY_PIECES_INFRASTRUCTURE_P): Define.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr113469.c: Expect mf2 setmem.
* gcc.target/riscv/rvv/base/setmem-2.c: Update f1 to expect straight-line vector memset.
* gcc.target/riscv/rvv/base/setmem-3.c: Likewise.
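The assembly listings are consistent with a function that sets 15 bytes at the destination in a0 to the value in a1. A source function of roughly the following shape would produce that; it is reconstructed from the listings as an assumption and is not copied from setmem-1.c.

```c
#include <string.h>

/* Assumed shape of the f1 test function: a fixed-length 15-byte set.
   15 is larger than UNITS_PER_WORD on rv64 (8 bytes), so it falls in
   the range this patch takes away from the by-pieces expansion.  */
void
f1 (void *a, int v)
{
  memset (a, v, 15);
}
```

Because the length is a compile-time constant that fits in a single vector register group, the whole operation can be expressed as one vector splat followed by one vector store.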
2024-10-19[PATCH 5/7] RISC-V: Move vector memcpy decision making to separate function ↵Craig Blackmore1-56/+87
[NFC] This moves the code for deciding whether to generate a vectorized memcpy, what vector mode to use and whether a loop is needed out of riscv_vector::expand_block_move and into a new function riscv_vector::use_vector_stringop_p so that it can be reused for other string operations. gcc/ChangeLog: * config/riscv/riscv-string.cc (struct stringop_info): New. (expand_block_move): Move decision making code to... (use_vector_stringop_p): ...here.
2024-10-19[PATCH 4/7] RISC-V: Honour -mrvv-max-lmul in riscv_vector::expand_block_moveCraig Blackmore30-121/+95
Unlike the other vector string ops, expand_block_move was using max LMUL m8 regardless of TARGET_MAX_LMUL. The check for whether to generate inline vector code for movmem has been moved from movmem<mode> to riscv_vector::expand_block_move to avoid maintaining multiple versions of similar logic. They already differed on the minimum length for which they would generate vector code. Now that the expand_block_move value is used, movmem will be generated for smaller lengths. Limiting memcpy to m1 caused some memcpy loops to be generated in the calling convention tests which makes it awkward to add suitable scan assembler tests checking the return value being set, so -mrvv-max-lmul=m8 has been added to these tests. Other tests have been adjusted to expect the new memcpy m1 generation where reasonably straight-forward, otherwise -mrvv-max-lmul=m8 has been added. pr111720-[0-9].c regressed because a memcpy loop is generated instead of straight-line. This reveals an existing issue where a redundant straight-line memcpy gets eliminated but a memcpy loop does not (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117205). For example, on pr111720-0.c after this patch: -mrvv-max-lmul=m8: test: lui a5,%hi(.LANCHOR0) li a4,32 addi sp,sp,-32 addi a5,a5,%lo(.LANCHOR0) vsetvli zero,a4,e8,m1,ta,ma vle8.v v8,0(a5) addi sp,sp,32 jr ra -mrvv-max-lmul=m1: test: addi sp,sp,-32 lui a5,%hi(.LANCHOR0) addi a5,a5,%lo(.LANCHOR0) mv a2,sp li a3,32 .L2: vsetvli a4,a3,e8,m1,ta,ma vle8.v v8,0(a5) sub a3,a3,a4 add a5,a5,a4 vse8.v v8,0(a2) add a2,a2,a4 bne a3,zero,.L2 li a5,32 vsetvli zero,a5,e8,m1,ta,ma vle8.v v8,0(sp) addi sp,sp,32 jr ra I have added -mrvv-max-lmul=m8 to pr111720-[0-9].c so that we continue to test the elimination of straight-line memcpy. gcc/ChangeLog: * config/riscv/riscv-protos.h (get_lmul_mode): New prototype. (expand_block_move): Add bool parameter for movmem_p. 
* config/riscv/riscv-string.cc (riscv_expand_block_move_scalar): Pass movmem_p as false to riscv_vector::expand_block_move. (expand_block_move): Add movmem_p parameter. Return false if loop needed and movmem_p is true. Respect TARGET_MAX_LMUL. * config/riscv/riscv-v.cc (get_lmul_mode): New function. * config/riscv/riscv.md (movmem<mode>): Move checking for whether to generate inline vector code to riscv_vector::expand_block_move by passing movmem_p as true. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr113206-1.c: Add -mrvv-max-lmul=m8. * gcc.target/riscv/rvv/autovec/pr113206-2.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-1.c: Add -mrvv-max-lmul=m8 and adjust assembly scans. * gcc.target/riscv/rvv/autovec/vls/calling-convention-2.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-3.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-4.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-5.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-6.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-7.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/spill-4.c: Add -mrvv-max-lmul=m8. * gcc.target/riscv/rvv/autovec/vls/spill-7.c: Likewise. * gcc.target/riscv/rvv/base/cpymem-1.c: Expect m1 in f1 and f2. * gcc.target/riscv/rvv/base/cpymem-2.c: Add -mrvv-max-lmul=m8. * gcc.target/riscv/rvv/base/movmem-1.c: Adjust f1 to a length that will not get vectorized. * gcc.target/riscv/rvv/base/pr111720-0.c: Add -mrvv-max-lmul=m8. * gcc.target/riscv/rvv/base/pr111720-1.c: Likewise. * gcc.target/riscv/rvv/base/pr111720-2.c: Likewise. * gcc.target/riscv/rvv/base/pr111720-3.c: Likewise. * gcc.target/riscv/rvv/base/pr111720-4.c: Likewise. * gcc.target/riscv/rvv/base/pr111720-5.c: Likewise. * gcc.target/riscv/rvv/base/pr111720-6.c: Likewise. * gcc.target/riscv/rvv/base/pr111720-7.c: Likewise. * gcc.target/riscv/rvv/base/pr111720-8.c: Likewise. 
* gcc.target/riscv/rvv/base/pr111720-9.c: Likewise. * gcc.target/riscv/rvv/vsetvl/pr112929-1.c: Expect memcpy m1 loops. * gcc.target/riscv/rvv/vsetvl/pr112988-1.c: Likewise.
2024-10-19 PR modula2/115328 The FORWARD keyword is not implemented Gaius Mulley 46 -449/+1888
This patch implements the FORWARD keyword found in the ISO standard. The patch checks incoming parameters against the prior declaration found in definition/forward sections and will issue an error based on virtual tokens highlighting the full parameter declaration. gcc/m2/ChangeLog: PR modula2/115328 * gm2-compiler/M2MetaError.def: Extend comment documenting new format specifiers. * gm2-compiler/M2MetaError.mod (GetTokProcedure): New declaration. (doErrorScopeModule): New procedure. (doErrorScopeForward): Ditto. (doErrorScopeMod): Reimplement. (doErrorScopeFor): New procedure. (declarationMod): Ditto. (doErrorScopeDefinition): Ditto. (doErrorScopeDef): Reimplement. (declaredDef): New procedure. (declaredFor): Ditto. (doErrorScopeProc): Ditto. (declaredVar): Ditto. (declaredType): Ditto. (declaredFull): Ditto. * gm2-compiler/M2Options.mod (SetAutoInit): Add missing return type. (GetDumpGimple): Remove duplicate implementation. * gm2-compiler/M2Quads.def (DupFrame): New procedure. * gm2-compiler/M2Quads.mod (DupFrame): New procedure. * gm2-compiler/M2Reserved.def (ForwardTok): New variable. * gm2-compiler/M2Reserved.mod (ForwardTok): Initialize variable. * gm2-compiler/M2Scaffold.mod (DeclareArgEnvParams): Add tokno parameter for call to PutParam. * gm2-compiler/P0SymBuild.def (EndForward): New procedure. * gm2-compiler/P0SymBuild.mod (EndForward): New procedure. * gm2-compiler/P0SyntaxCheck.bnf (BlockAssert): New procedure. (ProcedureDeclaration): Reimplement rule. (PostProcedureHeading): New rule. (ForwardDeclaration): Ditto. (ProperProcedure): Ditto. * gm2-compiler/P1Build.bnf (ProcedureDeclaration): Reimplement rule. (PostProcedureHeading): New rule. (ForwardDeclaration): Ditto. (ProperProcedure): Ditto. * gm2-compiler/P1SymBuild.def (Export): Remove unnecessary export. (EndBuildForward): New procedure. * gm2-compiler/P1SymBuild.mod (StartBuildProcedure): Reimplement. (EndBuildProcedure): Ditto. (EndBuildForward): Ditto. 
* gm2-compiler/P2Build.bnf (ProcedureDeclaration): Reimplement rule. (PostProcedureHeading): New rule. (ForwardDeclaration): Ditto. (ProperProcedure): Ditto. * gm2-compiler/P2SymBuild.def (BuildProcedureDefinedByForward): New procedure. (BuildProcedureDefinedByProper): Ditto. (CheckProcedure): Ditto. (EndBuildForward): Ditto. * gm2-compiler/P2SymBuild.mod (EndBuildProcedure): Reimplement. (EndBuildForward): New procedure. (BuildFPSection): Reimplement to allow forward declaration or checking of parameters. (BuildProcedureDefinedByProper): New procedure. (BuildProcedureDefinedByForward): Ditto. (FailParameter): Remove. (ParameterError): New procedure. (ParameterMismatch): Ditto. (EndBuildFormalParameters): Add parameter number check. (GetComparison): New procedure function. (GetSourceDesc): Ditto. (GetCurSrcDesc): Ditto. (GetDeclared): New procedure. (ReturnTypeMismatch): Ditto. (BuildFunction): Reimplement. (CheckProcedure): New procedure. (CheckFormalParameterSection): Reimplement using ParameterError. * gm2-compiler/P3Build.bnf (ProcedureDeclaration): Reimplement rule. (PostProcedureHeading): New rule. (ForwardDeclaration): Ditto. (ProperProcedure): Ditto. * gm2-compiler/P3SymBuild.def (Export): Remove unnecessary export. (EndBuildForward): New procedure. * gm2-compiler/P3SymBuild.mod (EndBuildForward): New procedure. * gm2-compiler/PCBuild.bnf (ProcedureDeclaration): Reimplement rule. (PostProcedureHeading): New rule. (ForwardDeclaration): Ditto. (ProperProcedure): Ditto. * gm2-compiler/PCSymBuild.def (EndBuildForward): New procedure. * gm2-compiler/PCSymBuild.mod (EndBuildForward): Ditto. * gm2-compiler/PHBuild.bnf (ProcedureDeclaration): Reimplement rule. (PostProcedureHeading): New rule. (ForwardDeclaration): Ditto. (ProperProcedure): Ditto. * gm2-compiler/SymbolTable.def (PutVarTok): New procedure. (PutParam): Add typetok parameter. (PutVarParam): Ditto. (PutParamName): Ditto. (GetDeclaredFor): New procedure function. 
(AreParametersDefinedInDefinition): Ditto. (PutParametersDefinedByForward): New procedure. (GetParametersDefinedByForward): New procedure function. (PutParametersDefinedByProper): New procedure. (GetParametersDefinedByProper): New procedure function. (GetProcedureDeclaredForward): Ditto. (PutProcedureDeclaredForward): New procedure. (GetProcedureDeclaredProper): New procedure function. (PutProcedureDeclaredProper): New procedure. (GetProcedureDeclaredDefinition): New procedure function. (PutProcedureDeclaredDefinition): New procedure. (GetVarDeclTypeTok): Ditto. (PutVarDeclTypeTok): New procedure. (GetVarDeclTok): Ditto. (PutVarDeclTok): New procedure. (GetVarDeclFullTok): Ditto. * gm2-compiler/SymbolTable.mod (ProcedureDecl): New record type. (VarDecl): Ditto. (SymProcedure): Add new field Declared. (SymVar): Add new field Declared. (PutVarTok): New procedure. (PutParam): Add typetok parameter. (PutVarParam): Ditto. (PutParamName): Ditto. (GetDeclaredFor): New procedure function. (AreParametersDefinedInDefinition): Ditto. (PutParametersDefinedByForward): New procedure. (GetParametersDefinedByForward): New procedure function. (PutParametersDefinedByProper): New procedure. (GetParametersDefinedByProper): New procedure function. (GetProcedureDeclaredForward): Ditto. (PutProcedureDeclaredForward): New procedure. (GetProcedureDeclaredProper): New procedure function. (PutProcedureDeclaredProper): New procedure. (GetProcedureDeclaredDefinition): New procedure function. (PutProcedureDeclaredDefinition): New procedure. (GetVarDeclTypeTok): Ditto. (PutVarDeclTypeTok): New procedure. (GetVarDeclTok): Ditto. (PutVarDeclTok): New procedure. (GetVarDeclFullTok): Ditto. (MakeProcedure): Initialize Declared field. (MakeVar): Initialize Declared field. * gm2-libs-log/FileSystem.def (FileNameChar): Add missing return type. * m2.flex: Add FORWARD keyword. gcc/testsuite/ChangeLog: PR modula2/115328 * gm2/iso/fail/badparam.def: New test. * gm2/iso/fail/badparam.mod: New test. 
* gm2/iso/fail/badparam2.def: New test. * gm2/iso/fail/badparam2.mod: New test. * gm2/iso/fail/badparam3.def: New test. * gm2/iso/fail/badparam3.mod: New test. * gm2/iso/fail/badparamarray.def: New test. * gm2/iso/fail/badparamarray.mod: New test. * gm2/iso/fail/simpledef1.def: New test. * gm2/iso/fail/simpledef1.mod: New test. * gm2/iso/fail/simpleforward.mod: New test. * gm2/iso/fail/simpleforward2.mod: New test. * gm2/iso/fail/simpleforward3.mod: New test. * gm2/iso/fail/simpleforward4.mod: New test. * gm2/iso/fail/simpleforward5.mod: New test. * gm2/iso/fail/simpleforward7.mod: New test. * gm2/iso/pass/simpleforward.mod: New test. * gm2/iso/pass/simpleforward6.mod: New test. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>