path: root/gcc/testsuite
Age | Commit message | Author | Files | Lines
2025-09-04 | RISC-V: Use correct target in expand_vec_perm [PR121780]. | Robin Dapp | 2 | -0/+100
This fixes a glaring mistake in yesterday's change to the expansion of vec_perm. We should of course move tmp_target into the real target and not the other way around. I wonder why my testing hasn't caught this... PR target/121742 PR target/121780 PR target/121781 gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_vec_perm): Swap target and tmp_target. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr121780.c: New test. * gcc.target/riscv/rvv/autovec/pr121781.c: New test.
2025-09-04 | tree-optimization/121768 - bogus double reduction detected | Richard Biener | 1 | -0/+15
The following changes how we detect double reductions, in particular not setting vect_double_reduction_def on the outer PHIs when the inner loop doesn't satisfy double reduction constraints. It also simplifies the setup a bit by not having to detect whether we process an inner loop of a double reduction. PR tree-optimization/121768 * tree-vect-loop.cc (vect_inner_phi_in_double_reduction_p): Remove. (vect_analyze_scalar_cycles_1): Analyze inner loops of double reductions immediately and only mark fully recognized double reductions. Skip already analyzed inner loops. (vect_is_simple_reduction): Change double_reduc from a flag to an output of the inner loop PHI and to whether we are processing an inner loop of a double reduction. * gcc.dg/vect/pr121768.c: New testcase.
2025-09-04 | tree-optimization/121685 - accesses to *this are not trapping | Richard Biener | 1 | -0/+20
When inside a method, we know the this pointer points to an object of at least the size of the method's base type. We can use this to compute more references as not trapping and enable invariant motion and in turn vectorization as for a slightly modified version of the testcase in the PR. PR tree-optimization/121685 * tree-eh.cc (ref_outside_object_p): Split out from ... (tree_could_trap_p): ... here. Assume the this pointer of a method refers to an object of at least size of its base type. * g++.dg/vect/pr121685-1.cc: New testcase.
2025-09-04 | forwprop: Improve the reject case for copy prop [PR107051] | Andrew Pinski | 1 | -0/+24
Currently the code rejects: ``` tmp = *a; *b = tmp; ``` (unless *a == *b). This can be improved such that if a and b are known to share the same base, then only reject it if they overlap; that is, the difference of the offsets (from the base) is maybe less than the size. This fixes the testcase in comment #0 of PR 107051. Changes since v1: * v2: Use ranges_maybe_overlap_p instead of manually checking the overlap. Allow for the case where the alignment is known to be greater than the size. PR tree-optimization/107051 gcc/ChangeLog: * tree-ssa-forwprop.cc (optimize_agr_copyprop_1): Allow for memory sharing the same base if they are known not to overlap over the size. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/copy-prop-aggregate-union-1.c: New test. Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
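The overlap condition described above can be sketched in C. This is only an illustration under stated assumptions, not the code in tree-ssa-forwprop.cc: `copy_ranges_may_overlap` and its byte-offset parameters are hypothetical names, and both accesses are assumed to share one base and to have the same size. The case of identical addresses (`*a == *b`) is accepted separately according to the message, so it is not special-cased here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not a GCC function): two accesses of SIZE bytes
   start at byte offsets OFF_A and OFF_B from the same base.  They may
   overlap when the distance between the offsets is smaller than SIZE.  */
static bool
copy_ranges_may_overlap (size_t off_a, size_t off_b, size_t size)
{
  size_t diff = off_a > off_b ? off_a - off_b : off_b - off_a;
  return diff < size;
}
```

For example, two 8-byte accesses at offsets 0 and 4 overlap, while the same accesses at offsets 0 and 8 do not.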
2025-09-04 | RISC-V: Always register vector built-in functions during LTO [PR110812] | Kito Cheng | 4 | -0/+90
Previously, vector built-in functions were not properly registered during the LTO pipeline, causing link failures when vector intrinsics were used in LTO builds with mixed architecture options. This patch ensures all vector built-in functions are always registered during LTO compilation. The key changes include: - Moving pragma intrinsic flag manipulation from riscv-c.cc to riscv-vector-builtins.cc for better encapsulation - Registering all vector built-in functions regardless of current ISA extensions, deferring the actual extension checking to expansion time - Adding proper support for built-in type registration during LTO This approach is safe because we already perform extension requirement checking at expansion time. The trade-off is a slight increase in bootstrap time for LTO builds due to registering more built-in functions. PR target/110812 gcc/ChangeLog: * config/riscv/riscv-c.cc (pragma_intrinsic_flags): Remove struct. (riscv_pragma_intrinsic_flags_pollute): Remove function. (riscv_pragma_intrinsic_flags_restore): Remove function. (riscv_pragma_intrinsic): Simplify to only call handle_pragma_vector. * config/riscv/riscv-vector-builtins.cc (pragma_intrinsic_flags): Move struct definition here from riscv-c.cc. (riscv_pragma_intrinsic_flags_pollute): Move and adapt from riscv-c.cc, add zvfbfmin, zvfhmin and vector_elen_bf_16 support. (riscv_pragma_intrinsic_flags_restore): Move from riscv-c.cc. (rvv_switcher::rvv_switcher): Add pollute_flags parameter to control flag manipulation. (rvv_switcher::~rvv_switcher): Restore flags conditionally. (register_builtin_types): Use rvv_switcher without polluting flags. (get_required_extensions): Remove function. (check_required_extensions): Simplify to only check type validity. (function_instance::function_returns_void_p): Move implementation from header. (function_builder::add_function): Register placeholder for LTO. (init_builtins): Simplify and handle LTO case. (reinit_builtins): Remove function. 
(handle_pragma_vector): Remove extension checking. * config/riscv/riscv-vector-builtins.h (function_instance::function_returns_void_p): Add declaration. (function_call_info::function_returns_void_p): Remove inline implementation. gcc/testsuite/ChangeLog: * gcc.target/riscv/lto/pr110812_0.c: New test. * gcc.target/riscv/lto/pr110812_1.c: New test. * gcc.target/riscv/lto/riscv-lto.exp: New test driver. * gcc.target/riscv/lto/riscv_vector.h: New header wrapper.
2025-09-04 | RISC-V: Fix extension subset check in riscv_can_inline_p | Kito Cheng | 5 | -0/+88
The extension subset check logic in riscv_ext_is_subset was incorrectly inverted, causing functions with more extensions to be incorrectly rejected from being inlined into functions with fewer extensions. This patch fixes the logic to correctly check if the callee's required extensions are a subset of the caller's extensions. The corrected logic now properly allows inlining when the caller has all the extensions that the callee requires. gcc/ * common/config/riscv/riscv-common.cc (riscv_ext_is_subset): Fix inverted logic in extension subset check. gcc/testsuite/ * gcc.target/riscv/can_inline_p_test-01.c: New test. * gcc.target/riscv/can_inline_p_test-02.c: New test. * gcc.target/riscv/can_inline_p_test-03.c: New test. * gcc.target/riscv/can_inline_p_test-04.c: New test. * gcc.target/riscv/riscv_vector.h: New header wrapper for vector tests.
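The corrected condition, that the callee's required extensions form a subset of the caller's, can be sketched with bit masks. This is a minimal sketch and not the implementation of riscv_ext_is_subset: the `EXT_*` bits and the function name `callee_exts_subset_of_caller` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical extension bits, for illustration only.  */
#define EXT_M   (UINT64_C(1) << 0)
#define EXT_V   (UINT64_C(1) << 1)
#define EXT_ZBB (UINT64_C(1) << 2)

/* True when every extension the callee needs is also enabled in the
   caller, i.e. no callee bit is missing from the caller's set.  */
static bool
callee_exts_subset_of_caller (uint64_t callee, uint64_t caller)
{
  return (callee & ~caller) == 0;
}
```

Swapping the two arguments gives the inverted check: a callee needing `EXT_M | EXT_V` would then wrongly appear compatible with a caller that only has `EXT_M`.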
2025-09-04 | tree-optimization/61247 - handle peeled converted IV in SCEV | Richard Biener | 1 | -0/+17
The following handles SCEV analysis of a peeled converted IV if that IV is known to not overflow. For # _15 = PHI <_4(6), 0(5)> # i_18 = PHI <i_11(6), 0(5)> i_11 = i_18 + 1; _4 = (long unsigned int) i_11; we cannot analyze _15 directly since the SCC has a widening conversion. But we can analyze _4 to (long unsigned int) {1, +, 1}_1 which is "peeled" (it's from after the first iteration of _15). If the un-peeled IV {0, +, 1}_1 has the same initial value as _15 and it does not overflow then _15 can be analyzed as {0ul, +, 1ul}_1. The following implements this in simplify_peeled_chrec. PR tree-optimization/61247 * tree-scalar-evolution.cc (simplify_peeled_chrec): Handle the case of a converted peeled chrec. * gcc.dg/vect/vect-pr61247.c: New testcase.
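A source-level loop with the same shape as the PHI cycle above may help. This is a hand-written sketch, not the testcase from the PR: the function name `sum_lagged_iv` is hypothetical, and the comments map its variables to the SSA names in the message on a best-effort basis.

```c
#include <assert.h>

/* LAGGED plays the role of _15: it starts at 0 and, on each later
   iteration, holds the widened value of the already incremented
   counter from the previous iteration.  */
static unsigned long
sum_lagged_iv (int n)
{
  unsigned long lagged = 0;
  unsigned long sum = 0;
  int i = 0;                       /* plays the role of i_18 */
  while (i < n)
    {
      sum += lagged;
      i = i + 1;                   /* i_11 = i_18 + 1 */
      lagged = (unsigned long) i;  /* _4 = (long unsigned int) i_11 */
    }
  return sum;
}
```

In iteration j (counting from 0) `lagged` equals j, which is the sequence {0, +, 1} that the message says _15 can be analyzed as once the counter is known not to overflow.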
2025-09-04 | tree-optimization/121740 - handle aggregate zeroing as skipped may-def | Richard Biener | 2 | -1/+16
The following makes value-numbering handle a situation like D.58046 = {}; SR.83_44->i = {}; pretmp_41 = MEM[(struct _Optional_payload_base &)&D.58046 + 8]._M_engaged; where the intermediate may-def SR.83_44->i = {} prevents CSE of the load to zero. The problem is two-fold here, one is that the code skipping may-defs does not handle zeroing via a CTOR, the other is that (partial) must-defs can be better handled by later code as otherwise we may not find an appropriate definition to CSE to. I've noticed we fail to guard against storage-order issues, so fixed that on the fly. PR tree-optimization/121740 * tree-ssa-sccvn.cc (vn_reference_lookup_3): Allow skipping may-defs from CTORs. Do not skip may-defs with storage-order issues or (partial) must-defs. * gcc.dg/tree-ssa/ssa-fre-104.c: Un-XFAIL. * gcc.dg/tree-ssa/ssa-fre-110.c: New testcase.
2025-09-04 | c++/modules: Fix ADL [PR117658] | Nathaniel Shead | 11 | -5/+185
On looking again at [basic.lookup.argdep] p4, I believe GCC hasn't fully implemented the wording here for ADL. This patch fixes two issues. First, 4.3 indicates that a function exported from a named module should be visible to ADL regardless of whether it's visible to normal name lookup, as long as some restrictions are followed. This patch implements this; for skipping declarations that "do not appear in the TU containing the point of lookup" I don't think there's anything special we need to do, as any declarations before the point of lookup will be found in other ways anyway, and any remaining declarations from the current TU cannot be seen regardless. Secondly, currently we only add the exported functions along the instantiation path of a lookup. But I don't think this is intended by the current wording, so this patch adjusts that. I also clean up the logic to do all different module processing in adl_namespace_fns so that we don't duplicate work in traversing the module binding list unnecessarily. This new handling means we need to do some extra work to properly error on overload sets containing TU-local entities (as this might actually come up now!) but I'm leaving that for a later patch. As a drive-by fix this also fixes an ICE for C++26 expansion statements with finding the instantiation path. PR c++/117658 gcc/cp/ChangeLog: * cp-tree.h (get_originating_module): Adjust parameter names. * module.cc (path_of_instantiation): Handle C++26 expansion statements. * name-lookup.cc (name_lookup::adl_namespace_fns): Handle exported declarations attached to the same module of an associated entity with the same innermost non-inline namespace, and non-exported functions on the instantiation path. (name_lookup::search_adl): Build mapping of namespace to modules that associated entities are attached to; remove now-unneeded instantiation path handling. gcc/testsuite/ChangeLog: * g++.dg/modules/adl-4_a.C: Test should pass. * g++.dg/modules/adl-4_b.C: Test should pass. 
* g++.dg/modules/adl-6_a.C: New test. * g++.dg/modules/adl-6_b.C: New test. * g++.dg/modules/adl-6_c.C: New test. * g++.dg/modules/adl-7_a.C: New test. * g++.dg/modules/adl-7_b.C: New test. * g++.dg/modules/adl-7_c.C: New test. * g++.dg/modules/adl-8_a.C: New test. * g++.dg/modules/adl-8_b.C: New test. * g++.dg/modules/adl-8_c.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Jason Merrill <jason@redhat.com>
2025-09-04 | c++/modules: Mark implicit inline namespaces as purview [PR121724] | Nathaniel Shead | 2 | -0/+23
When we push an existing namespace within the module purview for the first time, we also need to mark any parent inline namespaces as purview to not confuse the streaming logic. PR c++/121724 gcc/cp/ChangeLog: * name-lookup.cc (push_namespace): Mark inline namespace contexts as purview if needed. gcc/testsuite/ChangeLog: * g++.dg/modules/namespace-12_a.C: New test. * g++.dg/modules/namespace-12_b.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2025-09-04 | testsuite, darwin: Suppress unwind frames in scantest-lto.c. | Iain Sandoe | 1 | -0/+1
Currently, for Darwin unwind and EH frames are emitted without use of .cfi_xxx instructions; the emitted frames also contain the string 'ascii'. For the purpose of this test, omit them. PR testsuite/112728 gcc/testsuite/ChangeLog: * gcc.dg/scantest-lto.c: Omit unwind frames. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
2025-09-04 | Daily bump. | GCC Administrator | 1 | -0/+102
2025-09-03 | RISC-V: Add support for the XAndesbfhcvt ISA extension. | Kuan-Lin Chen | 2 | -0/+22
This extension defines instructions to perform scalar floating-point conversion between the BFLOAT16 floating-point data and the IEEE-754 32-bit single-precision floating-point (SP) data in a scalar floating point register. gcc/ChangeLog: * config/riscv/andes.def: Add nds_fcvt_s_bf16 and nds_fcvt_bf16_s. * config/riscv/riscv.md (truncsfbf2): Add TARGET_XANDESBFHCVT support. (extendbfsf2): Ditto. * config/riscv/riscv-builtins.cc: New AVAIL andesbfhcvt. Add new define RISCV_ATYPE_BF and RISCV_ATYPE_SF. * config/riscv/riscv-ftypes.def: New DEF_RISCV_FTYPE. gcc/testsuite/ChangeLog: * gcc.target/riscv/xandes/xandesbfhcvt-1.c: New test. * gcc.target/riscv/xandes/xandesbfhcvt-2.c: New test.
2025-09-03 | RISC-V: Add support for the XAndesperf ISA extension. | Kuan-Lin Chen | 12 | -0/+222
This patch adds support for the XAndesperf ISA extension. The 32-bit AndeStar V5 extension includes branch instructions, load effective address instructions, and string processing instructions for performance improvement. New INSN patterns are added into the new file andes.md as a separate vendor extension. gcc/ChangeLog: * config/riscv/constraints.md (Ou07): New constraint. (ads_Bext): New constraint. * config/riscv/iterators.md (ANYLE32): New iterator. (sizen): New iterator. (sh_limit): New iterator. (sh_bit): New iterator. (cs): New iterator. * config/riscv/predicates.md (ads_branch_bbcs_operand): New predicate. (ads_branch_bimm_operand): New predicate. (ads_imm_extract_operand): New predicate. (ads_extract_size_imm_si): New predicate. (ads_extract_size_imm_di): New predicate. (const_int5_operand): New predicate. * config/riscv/riscv-builtins.cc: Add new AVAIL andesperf32 and andesperf64. Add new define RISCV_ATYPE_DI. * config/riscv/riscv-ftypes.def: New DEF_RISCV_FTYPE. * config/riscv/riscv.cc (riscv_extend_cost): Cost for pattern 'bfo'. (riscv_rtx_costs): Cost for XAndesperf extension. * config/riscv/riscv.md: Add support for XAndesperf to patterns zero_extendsidi2_internal, zero_extendhi2, extendsidi2_internal, extend<SHORT:mode><SUPERQI:mode>2, <any_extract:optab><GPR:mode>3 and branch_on_bit. * config/riscv/vector-iterators.md (sz): Add sign_extract and zero_extract. * config/riscv/andes.def: New file for vendor Andes. * config/riscv/andes.md: New file for vendor Andes. gcc/testsuite/ChangeLog: * gcc.target/riscv/riscv.exp: Add runtest for subdir xandes. * gcc.target/riscv/xandes/xandesperf-1.c: New test. * gcc.target/riscv/xandes/xandesperf-10.c: New test. * gcc.target/riscv/xandes/xandesperf-2.c: New test. * gcc.target/riscv/xandes/xandesperf-3.c: New test. * gcc.target/riscv/xandes/xandesperf-4.c: New test. * gcc.target/riscv/xandes/xandesperf-5.c: New test. * gcc.target/riscv/xandes/xandesperf-6.c: New test. * gcc.target/riscv/xandes/xandesperf-7.c: New test. * gcc.target/riscv/xandes/xandesperf-8.c: New test. * gcc.target/riscv/xandes/xandesperf-9.c: New test.
2025-09-03 | RISC-V: Add basic XAndes vendor extension support. | Kuan-Lin Chen | 6 | -0/+84
This patch adds basic support for the following XAndes ISA extensions: XANDESPERF XANDESBFHCVT XANDESVBFHCVT XANDESVSINTLOAD XANDESVPACKFPH XANDESVDOT gcc/ChangeLog: * config/riscv/riscv-ext.def: Include riscv-ext-andes.def. * config/riscv/riscv-ext.opt (riscv_xandes_subext): New variable. (XANDESPERF): New mask. (XANDESBFHCVT): Ditto. (XANDESVBFHCVT): Ditto. (XANDESVSINTLOAD): Ditto. (XANDESVPACKFPH): Ditto. (XANDESVDOT): Ditto. * config/riscv/t-riscv: Add riscv-ext-andes.def. * doc/riscv-ext.texi: Regenerated. * config/riscv/riscv-ext-andes.def: New file. gcc/testsuite/ChangeLog: * gcc.target/riscv/xandes/xandes-predef-1.c: New test. * gcc.target/riscv/xandes/xandes-predef-2.c: New test. * gcc.target/riscv/xandes/xandes-predef-3.c: New test. * gcc.target/riscv/xandes/xandes-predef-4.c: New test. * gcc.target/riscv/xandes/xandes-predef-5.c: New test. * gcc.target/riscv/xandes/xandes-predef-6.c: New test. Co-author: Lino Hsing-Yu Peng (linopeng@andestech.com) Co-author: Kai Kai-Yi Weng (kaiweng@andestech.com).
2025-09-03 | RISC-V: Add pattern for vector-scalar floating-point max | Paul-Antoine Arras | 19 | -2/+251
This pattern enables the combine pass (or late-combine, depending on the case) to merge a vec_duplicate into an smax RTL instruction. Before this patch, we have two instructions, e.g.: vfmv.v.f v2,fa0 vfmax.vv v1,v1,v2 After, we get only one: vfmax.vf v1,v1,fa0 In some cases, it also shaves off one vsetvli. gcc/ChangeLog: * config/riscv/autovec-opt.md (*vfmax_vf_<mode>): Rename into... (*vf<optab>_vf_<mode>): New pattern to combine vec_duplicate + vf{min,max}.vv into vf{max,min}.vf. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/floating-point-max-2.c: Adjust scan dump. * gcc.target/riscv/rvv/autovec/vls/floating-point-max-4.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f16.c: Add vfmax. Also add missing scan-dump for vfmul. * gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f32.c: Add vfmax. * gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f64.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f16.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f32.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f64.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f16.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f32.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f64.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f16.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f32.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f64.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf_binop.h: Add max functions. * gcc.target/riscv/rvv/autovec/vx_vf/vf_binop_data.h: Add data for vfmax. * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmax-run-1-f16.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmax-run-1-f32.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmax-run-1-f64.c: New test.
2025-09-03 | Fortran: fix TRANSFER with rank 1 unlimited polymorphic SOURCE [PR121263] | Harald Anlauf | 1 | -0/+53
PR fortran/121263 gcc/fortran/ChangeLog: * trans-intrinsic.cc (gfc_conv_intrinsic_transfer): For an unlimited polymorphic SOURCE to TRANSFER use saved descriptor if possible. gcc/testsuite/ChangeLog: * gfortran.dg/transfer_class_5.f90: New test.
2025-09-03 | [RISC-V][PR target/121213] Avoid unnecessary sign extension in amoswap sequence | Austin Law | 1 | -1/+1
This is Austin's work to remove the redundant sign extension seen in pr121213. -- The .w form of amoswap will sign extend its result from 32 to 64 bits, thus any explicit sign extension insn doing the same is redundant. This uses Jivan's approach of allocating a DI temporary for an extended result and using a promoted subreg extraction to get that result into the final destination. Tested with no regressions on riscv32-elf and riscv64-elf and bootstrapped on the BPI and pioneer systems. PR target/121213 gcc/ * config/riscv/sync.md (amo_atomic_exchange_extended<mode>): Separate insn with sign extension for 64 bit targets. gcc/testsuite * gcc.target/riscv/amo/pr121213.c: Remove xfail.
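A source-level view of the kind of code involved is a 32-bit atomic exchange whose result is used as a signed value. The sketch below uses GCC's `__atomic_exchange_n` built-in; the wrapper name `exchange_word` is hypothetical, this is not the testcase from the PR, and which instruction is selected depends on the target and options.

```c
#include <assert.h>

/* Atomically store VAL into *P and return the previous contents.
   On a 64-bit RISC-V target with atomics this is the sort of
   operation that can be expected to use the .w form of amoswap.  */
static int
exchange_word (int *p, int val)
{
  return __atomic_exchange_n (p, val, __ATOMIC_RELAXED);
}
```

Because the returned `int` already has to be sign-extended in a 64-bit register, a separate extension after an instruction that sign-extends its own result is redundant, which is the point of the change.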
2025-09-03 | aarch64: PR target/121749: Use dg-assemble in testcase | Kyrylo Tkachov | 1 | -1/+1
Committing as obvious. Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com> gcc/testsuite/ PR target/121749 * gcc.target/aarch64/simd/pr121749.c: Use dg-assemble directive.
2025-09-03 | aarch64: PR target/121749: Use correct predicate for narrowing shift amounts | Kyrylo Tkachov | 1 | -0/+11
With g:d20b2ad845876eec0ee80a3933ad49f9f6c4ee30 the narrowing shift instructions are now represented with standard RTL and more merging optimisations occur. This exposed a wrong predicate for the shift amount operand. The shift amount is the number of bits of the narrow destination, not the input sources. Correct this by using the vn_mode attribute when specifying the predicate, which exists for this purpose. I've spotted a few more narrowing shift patterns that need the restriction, so they are updated as well. Bootstrapped and tested on aarch64-none-linux-gnu. Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com> gcc/ PR target/121749 * config/aarch64/aarch64-simd.md (aarch64_<shrn_op>shrn_n<mode>): Use aarch64_simd_shift_imm_offset_<vn_mode> instead of aarch64_simd_shift_imm_offset_<ve_mode> predicate. (aarch64_<shrn_op>shrn_n<mode> VQN define_expand): Likewise. (*aarch64_<shrn_op>rshrn_n<mode>_insn): Likewise. (aarch64_<shrn_op>rshrn_n<mode>): Likewise. (aarch64_<shrn_op>rshrn_n<mode> VQN define_expand): Likewise. (aarch64_sqshrun_n<mode>_insn): Likewise. (aarch64_sqshrun_n<mode>): Likewise. (aarch64_sqshrun_n<mode> VQN define_expand): Likewise. (aarch64_sqrshrun_n<mode>_insn): Likewise. (aarch64_sqrshrun_n<mode>): Likewise. (aarch64_sqrshrun_n<mode>): Likewise. * config/aarch64/iterators.md (vn_mode): Handle DI, SI, HI modes. gcc/testsuite/ PR target/121749 * gcc.target/aarch64/simd/pr121749.c: New test.
2025-09-03 | c++: constant non-dep init folding vs FIELD_DECL access [PR97740] | Patrick Palka | 2 | -0/+38
Here although the local templated variables x and y have the same reduced constant value, only x's initializer {a.get()} is well-formed as written since A::m has private access. We correctly reject y's initializer {&a.m} (at instantiation time), but we also reject x's initializer because we happen to constant fold it ahead of time, which means at instantiation time it's already represented as a COMPONENT_REF to a FIELD_DECL, and so when substituting this COMPONENT_REF we naively double check that the given FIELD_DECL is accessible, which fails. This patch sidesteps around this particular issue by not checking access when substituting a COMPONENT_REF to a FIELD_DECL. If the target of a COMPONENT_REF is already a FIELD_DECL (i.e. before substitution), then I think we can assume access has been already checked appropriately. PR c++/97740 gcc/cp/ChangeLog: * pt.cc (tsubst_expr) <case COMPONENT_REF>: Don't check access when the given member is already a FIELD_DECL. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/constexpr-97740a.C: New test. * g++.dg/cpp0x/constexpr-97740b.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>
2025-09-03 | tree-optimization/121756 - handle irreducible regions when sinking | Richard Biener | 1 | -0/+30
The sinking code currently does not heuristically avoid placing code into an irreducible region in the same way it avoids placing into a deeper loop nest. Critically for the PR we may not insert a VDEF into an irreducible region that does not contain a virtual definition. The following adds the missing heuristic and also a stop-gap for the VDEF issue - since we cannot determine validity inside an irreducible region we have to reject any VDEF movement with destination inside such region, even when it originates there. In particular irreducible sub-cycles are not tracked separately and can cause issues. I chose to not complicate the already partly incomplete assert but prune it down to essentials. PR tree-optimization/121756 * tree-ssa-sink.cc (select_best_block): Avoid irreducible regions in otherwise same loop depth. (statement_sink_location): When sinking a VDEF, never place that into an irreducible region. * gcc.dg/torture/pr121756.c: New testcase.
2025-09-03 | tree-optimization/121767 - modvar pattern breaking reductions | Richard Biener | 1 | -0/+9
The a % b -> a - (a / b) * b pattern breaks reduction constraints; disable it for reduction stmts. PR tree-optimization/121767 * tree-vect-patterns.cc (vect_recog_mod_var_pattern): Disable for reductions. * gcc.dg/vect/pr121767.c: New testcase.
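The arithmetic behind a modulo-by-variable rewrite can be checked in scalar C. This is a sketch of the identity only, assuming truncating integer division as in C99 and a nonzero divisor; `mod_via_div` is a hypothetical name and the code says nothing about how the vectorizer builds the pattern.

```c
#include <assert.h>

/* Compute a % b from a division, a multiplication and a subtraction.
   B must be nonzero; the result matches C's % for truncating
   division, including negative operands.  */
static int
mod_via_div (int a, int b)
{
  return a - (a / b) * b;
}
```

The rewritten form mentions `a` twice, which gives one way to see why it does not fit a reduction, where the accumulated value is expected to be used once per iteration.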
2025-09-03 | tree-optimization/121758 - fix pattern stmt REDUC_IDX updating | Richard Biener | 1 | -0/+15
The following fixes a corner case of pattern stmt STMT_VINFO_REDUC_IDX updating which happens auto-magically. When a 2nd pattern sequence uses defs from inside a prior pattern sequence then the first guess for the lookfor can be off. This happens when for example widening patterns use vect_get_internal_def, which looks into earlier patterns. PR tree-optimization/121758 * tree-vect-patterns.cc (vect_mark_pattern_stmts): Try harder to find a reduction continuation. * gcc.dg/vect/pr121758.c: New testcase.
2025-09-03 | fold: Unwrap MEM_REF after get_inner_reference in split_address_to_core_and_offset [PR121355] | Andrew Pinski | 1 | -0/+45
Inside split_address_to_core_and_offset, this calls get_inner_reference. Take: ``` _6 = t_3(D) + 12; _8 = &MEM[(struct s1 *)t_3(D) + 4B].t; _1 = _6 - _8; ``` On the assignment of _8, get_inner_reference will return `MEM[(struct s1 *)t_3(D) + 4B]` and an offset but that does not match up with `t_3(D)` which is how split_address_to_core_and_offset handles pointer plus. So this patch adds the unwrapping of the MEM_REF after the call to get_inner_reference and has it act like a pointer plus. Changes since v1: * v2: Remove check on operand 1 for poly_int_tree_p, it is always. Add before the check to see if it fits in shwi instead of after. Bootstrapped and tested on x86_64-linux-gnu. PR tree-optimization/121355 gcc/ChangeLog: * fold-const.cc (split_address_to_core_and_offset): Handle a MEM_REF after the call to get_inner_reference. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/ptrdiff-1.c: New test. Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2025-09-03 | Daily bump. | GCC Administrator | 1 | -0/+68
2025-09-02 | Fortran: Allow PDT parameterized procedure pointer components [PR89707] | Paul Thomas | 1 | -0/+28
2025-09-02 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/89707 * decl.cc (gfc_get_pdt_instance): Copy the typebound procedure field from the PDT template. If the template interface has kind=0, provide the new instance with an interface with a type spec that points to that of the parameterized component. (match_ppc_decl): When 'saved_kind_expr' this is a PDT and the expression should be copied to the component kind_expr. * gfortran.h: Define gfc_get_tbp. gcc/testsuite/ PR fortran/89707 * gfortran.dg/pdt_43.f03: New test.
2025-09-02 | Fortran: Handle PDTs correctly with unlimited selector [PR87669] | Paul Thomas | 1 | -0/+46
2025-09-02 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/87669 * expr.cc (gfc_spec_list_type): If no LEN components are seen, unconditionally return 'SPEC_ASSUMED'. This suppresses an invalid error in match.cc(gfc_match_type_is). gcc/testsuite/ PR fortran/87669 * gfortran.dg/pdt_42.f03: New test. libgfortran/ PR fortran/87669 * intrinsics/extends_type_of.c (is_extension_of): Use the vptr rather than the hash value to identify the types.
2025-09-02 | arm: testsuite: improve test compatibility of asm-hard-reg-... tests | Richard Earnshaw | 2 | -3/+3
On arm, overriding -march can lead to warnings if the testsuite options try to pass -mcpu. Avoid these by ensuring the -mcpu is unset before adding the architecture. Also, improve the compatibility of asm-hard-reg-error-3.c for hard-float environment by allowing FP instructions in the architecture. gcc/testsuite: * gcc.dg/asm-hard-reg-4.c: On Arm, unset the CPU before setting the arch. * gcc.dg/asm-hard-reg-error-3.c: Similarly. Also add floating-point instructions to aid hard-float variants. Match on arm* not just arm.
2025-09-02 | RISC-V: Handle overlap in expand_vec_perm PR121742. | Robin Dapp | 1 | -0/+30
In a two-source gather we unconditionally overwrite target with the first gather's result already. If op1 == target this clobbers the source operand for the second gather. This patch uses a temporary in that case. PR target/121742 gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_vec_perm): Use temporary if op1 and target overlap. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr121742.c: New test.
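The hazard and the temporary-based fix can be modelled on plain arrays. This is a scalar sketch under stated assumptions, not the RTL expansion in riscv-v.cc: `LANES`, `two_source_perm` and `perm_with_overlap_ok` are hypothetical names, and every selector value is assumed to be below `2 * LANES`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LANES 4

/* Selector values 0..LANES-1 pick from OP0, LANES..2*LANES-1 pick from
   OP1.  Both gathers write into a temporary, so OP1 is still intact
   for the second gather even when TARGET aliases it.  */
static void
two_source_perm (int *target, const int *op0, const int *op1,
                 const unsigned *sel)
{
  int tmp[LANES];

  for (size_t i = 0; i < LANES; i++)   /* first gather, from op0 */
    if (sel[i] < LANES)
      tmp[i] = op0[sel[i]];

  for (size_t i = 0; i < LANES; i++)   /* second gather, from op1 */
    if (sel[i] >= LANES)
      tmp[i] = op1[sel[i] - LANES];

  /* Move the temporary into the real target, not the reverse.  */
  memcpy (target, tmp, sizeof tmp);
}

/* Returns 1 when permuting with target == op1 gives the right lanes.  */
static int
perm_with_overlap_ok (void)
{
  int op0[LANES] = { 10, 11, 12, 13 };
  int op1[LANES] = { 20, 21, 22, 23 };
  unsigned sel[LANES] = { 0, 7, 2, 4 };

  two_source_perm (op1, op0, op1, sel);
  return op1[0] == 10 && op1[1] == 23 && op1[2] == 12 && op1[3] == 20;
}
```

Writing the first gather straight into `target` would replace `op1[0]` with 10 before lane 3 reads it, so lane 3 would end up as 10 instead of 20.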
2025-09-02 | s390: Adjust s390/spaceship-fp-*.c tests for recent changes | Jakub Jelinek | 2 | -4/+4
In r16-3414 libstdc++ changed ABI for (still experimental C++20) and uses unordered value -128 instead of 2. Generally the change improved code generation on all targets tested, see https://gcc.gnu.org/pipermail/gcc-patches/2025-August/693534.html for details. In r16-3474 I've adjusted the middle-end and backends to use that value. This apparently broke the gcc.target/s390/spaceship-fp-2.c test, with -ffast-math the 2 value is unreachable and so the .SPACESHIP last argument in that case is the default, which changed from 2 to -128. But spaceship-fp-1.c test also doesn't test what libstdc++ uses anymore, so the following patch uses -128 in all the spots. 2025-09-02 Jakub Jelinek <jakub@redhat.com> * gcc.target/s390/spaceship-fp-1.c: Expect .SPACESHIP call with -128 as last argument instead of 2. (TEST): Use -128 instead of 2. * gcc.target/s390/spaceship-fp-2.c: Expect .SPACESHIP call with -128 as last argument instead of 2. (TEST): Use -128 instead of 2.
2025-09-02 | RISC-V: Add Zbb extension sext testcase. | Jiawei | 1 | -0/+15
This patch updates the RISC-V Zbb extension 'sext' instruction generation tests, adding detection of 'sext.h' and 'sext.b' instruction generation. gcc/testsuite/ChangeLog: * gcc.target/riscv/zbb-sext.c: New test.
2025-09-02 | RISC-V: Update Zba 'shNadd.uw' testcase. | Jiawei | 1 | -1/+19
This patch updates the RISC-V Zba extension 'shNadd.uw' instruction generation test, adding detection of 'sh1add.uw' and 'sh3add.uw' instruction generation. gcc/testsuite/ChangeLog: * gcc.target/riscv/zba-shadd.c: New test functions.
2025-09-02 | testsuite: i386: Fix gcc.target/i386/memset-strategy-1[03].c on Solaris/x86 | Rainer Orth | 2 | -2/+2
The new gcc.target/i386/memset-strategy-1[03].c tests FAIL on Solaris/x86: FAIL: gcc.target/i386/memset-strategy-10.c check-function-bodies foo FAIL: gcc.target/i386/memset-strategy-13.c check-function-bodies foo The issue is the same as several times previously: they need to be compiled with -fasynchronous-unwind-tables -fdwarf2-cfi-asm, which this patch does. Tested on i386-pc-solaris2.11 (as and gas) and x86_64-pc-linux-gnu. 2025-09-01 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/testsuite: * gcc.target/i386/memset-strategy-10.c (dg-options): Add -fasynchronous-unwind-tables -fdwarf2-cfi-asm. * gcc.target/i386/memset-strategy-13.c: Likewise.
2025-09-02 | tree-optimization/121754 - ICE with vect_reduc_type and nested cycle | Richard Biener | 2 | -0/+27
The reduction guard isn't correct, STMT_VINFO_REDUC_DEF also exists for nested cycles not part of reductions but there's no reduction info for them. PR tree-optimization/121754 * tree-vectorizer.h (vect_reduc_type): Simplify to not ICE on nested cycles. * gcc.dg/vect/pr121754.c: New testcase. * gcc.target/aarch64/vect-pr121754.c: Likewise.
2025-09-02 | tree-cfg: Fix up assign_discriminator ICE with too large #line [PR121663] | Jakub Jelinek | 1 | -0/+9
As mentioned in the PR, LOCATION_LINE is represented in an int, and while we have -pedantic diagnostics (and -pedantic-errors error) for too large #line, we can still overflow into negative line numbers up to -2 and -1. We could overflow to that even with valid source if it, say, has #line 2147483640 and then just has 2G+ lines after it. Now, the ICE is because assign_discriminator{,s} uses a hash_map with int_hash <int64_t, -1, -2>, so values -2 and -1 are reserved for deleted and empty entries. We just need to make sure those aren't valid. One possible fix would be just that - discrim_entry &e = map.get_or_insert (LOCATION_LINE (loc), &existed); + discrim_entry &e + = map.get_or_insert ((unsigned) LOCATION_LINE (loc), &existed); by adding unsigned cast when the key is signed 64-bit, it will never be -1 or -2. But I think that is wasteful, discrim_entry is a struct with 2 unsigned non-static data members, so for lines which can only be 0 to 0xffffffff (sure, with wrap-around), I think just using a hash_map with 96bit elts is better than 128bit. So, the following patch just doesn't assign any discriminators for lines -1U and -2U, I think that is fine, normal programs never do that. Another possibility would be to handle lines -1U and -2U as if it was say -3U. 2025-09-02 Jakub Jelinek <jakub@redhat.com> PR middle-end/121663 * tree-cfg.cc (assign_discriminator): Change map argument type from hash_map with int_hash <int64_t, -1, -2> to one with int_hash <unsigned, -1U, -2U>. Cast LOCATION_LINE to unsigned. Return early for (unsigned) LOCATION_LINE above -3U. (assign_discriminators): Change map type from hash_map with int_hash <int64_t, -1, -2> to one with int_hash <unsigned, -1U, -2U>. * gcc.dg/pr121663.c: New test.
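The two options weighed in the message can be compared with plain integer arithmetic. This is a sketch assuming a 32-bit `int`; `widened_line_key` and `line_gets_discriminator` are hypothetical names, and GCC's hash_map is not modelled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Option considered first: keep a signed 64-bit key but cast the line
   to unsigned before widening.  The key then lies in
   0..4294967295 and can never equal the reserved values -1 or -2.  */
static int64_t
widened_line_key (int line)
{
  return (int64_t) (uint32_t) line;
}

/* Option the patch describes: use a 32-bit unsigned key and skip the
   two lines whose unsigned value collides with the reserved keys
   -1U and -2U.  */
static bool
line_gets_discriminator (int line)
{
  return (uint32_t) line <= UINT32_MAX - 2;
}
```

The second variant keeps each key at 32 bits at the price of not assigning discriminators to two line values that, per the message, normal programs never reach.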
2025-09-02testsuite: Fix gcc.dg/tree-ssa/cswtch-[67].c on Solaris/SPARC with asRainer Orth2-2/+2
The gcc.dg/tree-ssa/cswtch-[67].c tests FAIL on Solaris/SPARC with the native as: FAIL: gcc.dg/tree-ssa/cswtch-6.c scan-assembler .rodata.cst16 FAIL: gcc.dg/tree-ssa/cswtch-7.c scan-assembler .rodata.cst32 The issue is the same in both cases: compared to the gas version, with as there's only - .section .rodata.cst32,"aM",@progbits,32 + .section ".rodata" It turns out that varasm.c (mergeable_constant_section) only emits the former if HAVE_GAS_SHF_MERGE, which is 0 with the native as. Fixed by xfailing the tests in this case. Tested on sparc-sun-solaris2.11 with both as and gas. 2025-07-30 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/testsuite: * gcc.dg/tree-ssa/cswtch-6.c (dg-final): xfail on sparc*-*-solaris2* && !gas. * gcc.dg/tree-ssa/cswtch-7.c: Likewise.
2025-09-01Testsuite: Don't test vector-compare-1.C on strict alignment targetsAndrew Pinski1-1/+1
This testcase will fail on strict alignment targets because it requires a possibly unaligned load. This fixes that. Note this testcase still fails on arm (and maybe riscv) targets: they do have unaligned loads, but slow ones. Pushed as obvious after testing on x86_64-linux-gnu to make sure it is still testing. gcc/testsuite/ChangeLog: * g++.dg/tree-ssa/vector-compare-1.C: Restrict to non_strict_align targets. Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2025-09-02Daily bump.GCC Administrator1-0/+68
2025-09-01c: Implement C2Y N3457 - The __COUNTER__ predefined macroJakub Jelinek1-0/+44
The following patch implements the https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3457.htm paper without the first 3 lines in Recommended practice. Seems GCC behavior already matches the expected behavior except for diagnostics of more than 2147483648 __COUNTER__ expansions, so the patch adds a diagnostic for that (but not testcase because #define A __COUNTER__ __COUNTER__ __COUNTER__ __COUNTER__ __COUNTER__ __COUNTER__ __COUNTER__ __COUNTER__ #define B A A A A A A A A #define C B B B B B B B B #define D C C C C C C C C #define E D D D D D D D D #define F E E E E E E E E #define G F F F F F F F F #define H G G G G G G G G #define I H H H H H H H H #define J I I I I I I I I J J J J __COUNTER__ just takes too long to preprocess). Plus I've included all the snippets from the paper into one testcase. 2025-09-01 Jakub Jelinek <jakub@redhat.com> * macro.cc: Implement C2Y N3457 - The __COUNTER__ predefined macro. (_cpp_builtin_macro_text): Diagnose if __COUNTER__ reaches 2147483648 value. * gcc.dg/cpp/c2y-counter-1.c: New test.
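As a small illustration of the behavior the paper standardizes, the sketch below relies on each expansion of `__COUNTER__` producing the next integer. The enumerator and function names are hypothetical, and only the differences between values are checked, since the absolute values depend on how often the macro was expanded earlier in the translation unit.

```c
#include <assert.h>

/* Three expansions in a row: each one yields the previous value plus 1.  */
enum
{
  first_id = __COUNTER__,
  second_id = __COUNTER__,
  third_id = __COUNTER__
};

/* Return the distance between two consecutive expansions.  */
static int
counter_step (void)
{
  assert (second_id > first_id);
  return second_id - first_id;
}

/* Return the distance between the first and the third expansion.  */
static int
counter_span (void)
{
  return third_id - first_id;
}
```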
2025-09-01c: Rename uimaxabs to umaxabsJakub Jelinek3-28/+28
The following patch implements https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3577.txt No big deal on the GCC side, for uimaxabs we just won't recognize it as builtin and I don't see it worth preserving __builtin_uimaxabs, I doubt anything but gcc testsuite used that. But on the glibc side I think it will need to remain exported for ABI compatibility :( 2025-09-01 Jakub Jelinek <jakub@redhat.com> * builtins.def: Implement C2Y N3577 - Rename s/uimaxabs/umaxabs/. (BUILT_IN_UIMAXABS): Rename to ... (BUILT_IN_UMAXABS): ... this. Change second argument to "umaxabs". * builtins.cc (fold_builtin_1): Use BUILT_IN_UMAXABS rather than BUILT_IN_UIMAXABS. * gcc.c-torture/execute/builtins/lib/abs.c (uimaxabs): Rename to ... (umaxabs): ... this. * gcc.c-torture/execute/builtins/uabs-2.c (uimaxabs): Rename to ... (umaxabs): ... this. (main_test): Use umaxabs instead of uimaxabs. * gcc.c-torture/execute/builtins/uabs-3.c (main_test): Use umaxabs instead of uimaxabs.
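For readers unfamiliar with the function being renamed, here is a minimal sketch of what an unsigned absolute-value function on `intmax_t` computes. It assumes the usual semantics of the C2Y `uabs` family (absolute value returned in the corresponding unsigned type); `sketch_umaxabs` is a hypothetical name chosen to avoid clashing with a real library declaration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for umaxabs: the absolute value of X, returned
   as uintmax_t so that the result for INTMAX_MIN is representable.  */
static uintmax_t
sketch_umaxabs (intmax_t x)
{
  /* Convert first and negate in unsigned arithmetic, which wraps
     instead of overflowing.  */
  uintmax_t u = (uintmax_t) x;
  return x < 0 ? -u : u;
}

/* Usage: unlike imaxabs, the most negative value is handled.  */
static void
demo_umaxabs (void)
{
  assert (sketch_umaxabs (-7) == 7u);
  assert (sketch_umaxabs (INTMAX_MIN) == (uintmax_t) INTMAX_MAX + 1u);
}
```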
2025-09-01Fortran: truncate constant string passed to character,value dummy [PR121727]Harald Anlauf1-0/+43
PR fortran/121727 gcc/fortran/ChangeLog: * trans-expr.cc (gfc_const_length_character_type_p): New helper function. (conv_dummy_value): Use it to determine if a character actual argument has a constant length. If a character actual argument is constant and longer than the dummy, truncate it at compile time. gcc/testsuite/ChangeLog: * gfortran.dg/value_10.f90: New test.
2025-09-01PR target/89828 Internal compiler error on "-fno-omit-frame-pointer"Yoshinori Sato1-0/+49
The problem was caused by an erroneous note about creating a stack frame, which caused the cur_cfa reg to fail to assert with a value other than the frame pointer. This fix will generate notes that correctly update cur_cfa. v2 changes. Add testcase. All tests that failed with "internal compiler error: in dwarf2out_frame_debug_adjust_cfa, at dwarf2cfi.cc" now pass. PR target/89828 gcc * config/rx/rx.cc (add_pop_cfi_notes): Release the frame pointer if it is used. (rx_expand_prologue): Redesigned stack pointer and frame pointer update process. gcc/testsuite/ * gcc.dg/pr89828.c: New.
2025-09-01Add default arch/tuning to shift-gf2p8affine test casesAndi Kleen6-6/+6
This makes them not fail during test suite runs with overridden arch or tunings. gcc/testsuite/ChangeLog: * gcc.target/i386/shift-gf2p8affine-1.c: Use -march=x86-64 -mtune=generic. * gcc.target/i386/shift-gf2p8affine-2.c: Ditto. * gcc.target/i386/shift-gf2p8affine-3.c: Ditto. * gcc.target/i386/shift-gf2p8affine-5.c: Ditto. * gcc.target/i386/shift-gf2p8affine-6.c: Ditto. * gcc.target/i386/shift-gf2p8affine-7.c: Ditto.
2025-09-01testsuite: arm: factorize arm_v8_neon_ok flagsChristophe Lyon1-3/+3
Like we do in other effective-targets, add "-mcpu=unset -march=armv8-a" directly when setting et_arm_v8_neon_flags in arm_v8_neon_ok_nocache, to avoid having to add these two flags in all users of arm_v8_neon_ok. This avoids duplication and possible typos / oversights. gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_arm_v8_neon_ok_nocache): Add "-mcpu=unset -march=armv8-a" to et_arm_v8_neon_flags. (add_options_for_vect_early_break): Remove useless "-mcpu=unset -march=armv8-a". (add_options_for_arm_v8_neon): Likewise.
2025-09-01testsuite: arm: remove arm32 check from a few effective-targetsChristophe Lyon1-46/+39
A few arm effective-targets call check_effective_target_arm32 even though they would force a -march=XXX flag which supports Arm and/or Thumb-2, thus making the arm32 check useless. This has an impact when the toolchain is configured with a default -march or -mcpu which supports Thumb-1 only: in such a case, arm32 is false and we skip many tests, thus reducing coverage. This patch removes the call to check_effective_target_arm32 where it is useless, enabling about 2000 tests. In addition, add an early exit if the target is not an arm one, thus saving a few compilation cycles where not needed. In all callers of arm_neon_ok, remove the now useless "istarget arm*-*-*". gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_arm_neon_ok_nocache): Remove arm32 check. Add istarget arm*-*-* check. (check_effective_target_arm_neon_fp16_ok_nocache): Likewise. (check_effective_target_arm_neon_softfp_fp16_ok_nocache): Likewise. (check_effective_target_arm_v8_neon_ok_nocache): Likewise. (check_effective_target_arm_neonv2_ok_nocache): Likewise. (check_effective_target_vect_pack_trunc): Remove istarget arm*-*-* check. (check_effective_target_vect_unpack): Likewise. (check_effective_target_vect_condition): Likewise. (check_effective_target_vect_cond_mixed): Likewise. (available_vector_sizes): Likewise.
2025-09-01tree-optimization/121744 - handle CST << var in shift pattern recogRichard Biener1-0/+13
We currently do not handle promotion/demotion of 'var' when the left operand of a variable shift is constant. There's no good reason why, so the following fixes this omission. PR tree-optimization/121744 * tree-vect-patterns.cc (vect_recog_vector_vector_shift_pattern): Allow constant left operand. * gcc.dg/vect/pr121744-1.c: New testcase.
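The kind of loop in question looks roughly like the sketch below: the left operand of the shift is a constant and the shift count comes from a narrower type, so the count has to be promoted to the width of the result. This is a hypothetical example, not the contents of pr121744-1.c, and whether this exact loop exercises the pattern is an assumption.

```c
#include <assert.h>

/* Shift the constant 1 left by a per-element count.  The counts are
   unsigned char while the results are unsigned int, so a vectorized
   form needs the count widened to match the result.  Every count must
   be below 32 (assuming 32-bit unsigned int).  */
static void
shift_const_by_var (unsigned int *restrict out,
                    const unsigned char *restrict count, int n)
{
  for (int i = 0; i < n; i++)
    {
      assert (count[i] < 32);
      out[i] = 1u << count[i];
    }
}
```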
2025-08-31Fix ICE due to wrong operand being passed to ix86_vgf2p8affine_shift_matrix.liuhongt1-0/+23
1) Fix predicate of operands[3] in cond_<insn><mode> since only const_vec_dup_operand is accepted for masked operations, and pass the real count to ix86_vgf2p8affine_shift_matrix. 2) Pass operands[2] instead of operands[1] to gen_vgf2p8affineqb_<mode>_mask, which expects the operand to be shifted, but operands[1] is the mask operand in cond_<insn><mode>. gcc/ChangeLog: PR target/121699 * config/i386/predicates.md (const_vec_dup_operand): New predicate. * config/i386/sse.md (cond_<insn><mode>): Fix predicate of operands[3], and fix wrong operands passed to ix86_vgf2p8affine_shift_matrix and gen_vgf2p8affineqb_<mode>_mask. gcc/testsuite/ChangeLog: * gcc.target/i386/pr121699.c: New test.
2025-09-01Daily bump.GCC Administrator1-0/+36
2025-08-31Fortran: Pass PDTs to dummies with VALUE attribute [PR99709]Paul Thomas1-0/+47
2025-08-31 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/99709 * trans-array.cc (structure_alloc_comps): For the case COPY_ALLOC_COMP, do a deep copy of non-allocatable PDT arrays. Suppress the use of 'duplicate_allocatable' for PDT arrays. * trans-expr.cc (conv_dummy_value): When passing to a PDT dummy with the VALUE attribute, do a deep copy to ensure that parameterized components are reallocated. gcc/testsuite/ PR fortran/99709 * gfortran.dg/pdt_41.f03: New test.