Age  Commit message  Author  Files  Lines
2025-09-03RISC-V: Add basic XAndes vendor extension support.Kuan-Lin Chen11-1/+226
This patch adds basic support for the following XAndes ISA extensions: XANDESPERF XANDESBFHCVT XANDESVBFHCVT XANDESVSINTLOAD XANDESVPACKFPH XANDESVDOT gcc/ChangeLog: * config/riscv/riscv-ext.def: Include riscv-ext-andes.def. * config/riscv/riscv-ext.opt (riscv_xandes_subext): New variable. (XANDESPERF): New mask. (XANDESBFHCVT): Ditto. (XANDESVBFHCVT): Ditto. (XANDESVSINTLOAD): Ditto. (XANDESVPACKFPH): Ditto. (XANDESVDOT): Ditto. * config/riscv/t-riscv: Add riscv-ext-andes.def. * doc/riscv-ext.texi: Regenerated. * config/riscv/riscv-ext-andes.def: New file. gcc/testsuite/ChangeLog: * gcc.target/riscv/xandes/xandes-predef-1.c: New test. * gcc.target/riscv/xandes/xandes-predef-2.c: New test. * gcc.target/riscv/xandes/xandes-predef-3.c: New test. * gcc.target/riscv/xandes/xandes-predef-4.c: New test. * gcc.target/riscv/xandes/xandes-predef-5.c: New test. * gcc.target/riscv/xandes/xandes-predef-6.c: New test. Co-author: Lino Hsing-Yu Peng (linopeng@andestech.com) Co-author: Kai Kai-Yi Weng (kaiweng@andestech.com).
2025-09-03RISC-V: Add pattern for vector-scalar floating-point maxPaul-Antoine Arras21-6/+1800
This pattern enables the combine pass (or late-combine, depending on the case) to merge a vec_duplicate into an smax RTL instruction. Before this patch, we have two instructions, e.g.: vfmv.v.f v2,fa0 vfmax.vv v1,v1,v2 After, we get only one: vfmax.vf v1,v1,fa0 In some cases, it also shaves off one vsetvli. gcc/ChangeLog: * config/riscv/autovec-opt.md (*vfmax_vf_<mode>): Rename into... (*vf<optab>_vf_<mode>): New pattern to combine vec_duplicate + vf{min,max}.vv into vf{max,min}.vf. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/floating-point-max-2.c: Adjust scan dump. * gcc.target/riscv/rvv/autovec/vls/floating-point-max-4.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f16.c: Add vfmax. Also add missing scan-dump for vfmul. * gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f32.c: Add vfmax. * gcc.target/riscv/rvv/autovec/vx_vf/vf-1-f64.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f16.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f32.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-2-f64.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f16.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f32.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-3-f64.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f16.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f32.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f64.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf_binop.h: Add max functions. * gcc.target/riscv/rvv/autovec/vx_vf/vf_binop_data.h: Add data for vfmax. * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmax-run-1-f16.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmax-run-1-f32.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmax-run-1-f64.c: New test.
2025-09-03Fortran: fix TRANSFER with rank 1 unlimited polymorphic SOURCE [PR121263]Harald Anlauf2-1/+59
PR fortran/121263 gcc/fortran/ChangeLog: * trans-intrinsic.cc (gfc_conv_intrinsic_transfer): For an unlimited polymorphic SOURCE to TRANSFER use saved descriptor if possible. gcc/testsuite/ChangeLog: * gfortran.dg/transfer_class_5.f90: New test.
2025-09-03libstdc++: Implement LWG4222 'expected' constructor from a single value missing a constraintYihan Wang2-0/+40
libstdc++-v3/ChangeLog: * include/std/expected (expected(U&&)): Add missing constraint as per LWG 4222. * testsuite/20_util/expected/lwg4222.cc: New test. Signed-off-by: Yihan Wang <yronglin777@gmail.com>
2025-09-03[RISC-V][PR target/121213] Avoid unnecessary sign extension in amoswap sequenceAustin Law2-2/+27
This is Austin's work to remove the redundant sign extension seen in pr121213. -- The .w form of amoswap will sign extend its result from 32 to 64 bits, thus any explicit sign extension insn doing the same is redundant. This uses Jivan's approach of allocating a DI temporary for an extended result and using a promoted subreg extraction to get that result into the final destination. Tested with no regressions on riscv32-elf and riscv64-elf and bootstrapped on the BPI and pioneer systems. PR target/121213 gcc/ * config/riscv/sync.md (amo_atomic_exchange_extended<mode>): Separate insn with sign extension for 64 bit targets. gcc/testsuite * gcc.target/riscv/amo/pr121213.c: Remove xfail.
2025-09-03Dump profile_info in ipa-profile dumpJan Hubicka1-1/+11
WPA currently does not print profile_info, which might have been modified by the profile merging logic. This patch adds dumping logic to the ipa-profile pass. Bootstrapped/regtested x86_64-linux, committed. gcc/ChangeLog: * ipa-profile.cc (ipa_profile): Dump profile_info.
2025-09-03Do not auto-enable loop optimizations with AutoFDOJan Hubicka1-17/+28
With -O2 we automatically enable several loop optimizations with -fprofile-use. The rationale is that those optimizations are enabled only at -O3 mainly because they may hurt performance or not pay back in code size when used blindly on all loops. Profile feedback gives us data on the number of iterations, which is used by the heuristics controlling those optimizations. Currently auto-FDO is not that good at determining the number of iterations, so I think we do not want to enable them until we can prove that they are useful. This primarily affects -O2 codegen. Theoretically auto-FDO with LBR can be pretty good at estimating the number of iterations, but to make it useful we will need to implement multiplicity for discriminators at least. Bootstrapped/regtested x86_64-linux, committed. gcc/ChangeLog: * opts.cc (enable_fdo_optimizations): Do not auto-enable loop optimizations with AutoFDO.
2025-09-03aarch64: PR target/121749: Use dg-assemble in testcaseKyrylo Tkachov1-1/+1
Committing as obvious. Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com> gcc/testsuite/ PR target/121749 * gcc.target/aarch64/simd/pr121749.c: Use dg-assemble directive.
2025-09-03Increase default number of LTO partitionsJan Hubicka1-1/+1
The number of LTO partitions should exceed the number of CPUs (or hyper-threads) of commonly used CPUs. I think it is time to increase it again and, as discussed in the LTO and toplevel asm thread, doing so scales quite well. Tmp file usage grows from 2.7 to 2.9MB, which seems acceptable. Overall build time on a machine with 256 hyperthreads is comparable. Bootstrapped/regtested x86_64-linux, committed. gcc/ChangeLog: * params.opt (-param=lto-partitions=): Increase default value from 128 to 512.
2025-09-03aarch64: PR target/121749: Use correct predicate for narrowing shift amountsKyrylo Tkachov3-12/+24
With g:d20b2ad845876eec0ee80a3933ad49f9f6c4ee30 the narrowing shift instructions are now represented with standard RTL and more merging optimisations occur. This exposed a wrong predicate for the shift amount operand. The shift amount is the number of bits of the narrow destination, not the input sources. Correct this by using the vn_mode attribute when specifying the predicate, which exists for this purpose. I've spotted a few more narrowing shift patterns that need the restriction, so they are updated as well. Bootstrapped and tested on aarch64-none-linux-gnu. Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com> gcc/ PR target/121749 * config/aarch64/aarch64-simd.md (aarch64_<shrn_op>shrn_n<mode>): Use aarch64_simd_shift_imm_offset_<vn_mode> instead of aarch64_simd_shift_imm_offset_<ve_mode> predicate. (aarch64_<shrn_op>shrn_n<mode> VQN define_expand): Likewise. (*aarch64_<shrn_op>rshrn_n<mode>_insn): Likewise. (aarch64_<shrn_op>rshrn_n<mode>): Likewise. (aarch64_<shrn_op>rshrn_n<mode> VQN define_expand): Likewise. (aarch64_sqshrun_n<mode>_insn): Likewise. (aarch64_sqshrun_n<mode>): Likewise. (aarch64_sqshrun_n<mode> VQN define_expand): Likewise. (aarch64_sqrshrun_n<mode>_insn): Likewise. (aarch64_sqrshrun_n<mode>): Likewise. (aarch64_sqrshrun_n<mode>): Likewise. * config/aarch64/iterators.md (vn_mode): Handle DI, SI, HI modes. gcc/testsuite/ PR target/121749 * gcc.target/aarch64/simd/pr121749.c: New test.
2025-09-03c++: constant non-dep init folding vs FIELD_DECL access [PR97740]Patrick Palka3-0/+44
Here although the local templated variables x and y have the same reduced constant value, only x's initializer {a.get()} is well-formed as written since A::m has private access. We correctly reject y's initializer {&a.m} (at instantiation time), but we also reject x's initializer because we happen to constant fold it ahead of time, which means at instantiation time it's already represented as a COMPONENT_REF to a FIELD_DECL, and so when substituting this COMPONENT_REF we naively double check that the given FIELD_DECL is accessible, which fails. This patch sidesteps around this particular issue by not checking access when substituting a COMPONENT_REF to a FIELD_DECL. If the target of a COMPONENT_REF is already a FIELD_DECL (i.e. before substitution), then I think we can assume access has been already checked appropriately. PR c++/97740 gcc/cp/ChangeLog: * pt.cc (tsubst_expr) <case COMPONENT_REF>: Don't check access when the given member is already a FIELD_DECL. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/constexpr-97740a.C: New test. * g++.dg/cpp0x/constexpr-97740b.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>
2025-09-03tree-optimization/121756 - handle irreducible regions when sinkingRichard Biener2-6/+41
The sinking code currently does not heuristically avoid placing code into an irreducible region in the same way it avoids placing into a deeper loop nest. Critically for the PR we may not insert a VDEF into an irreducible region that does not contain a virtual definition. The following adds the missing heuristic and also a stop-gap for the VDEF issue - since we cannot determine validity inside an irreducible region we have to reject any VDEF movement with destination inside such a region, even when it originates there. In particular irreducible sub-cycles are not tracked separately and can cause issues. I chose to not complicate the already partly incomplete assert but prune it down to essentials. PR tree-optimization/121756 * tree-ssa-sink.cc (select_best_block): Avoid irreducible regions in otherwise same loop depth. (statement_sink_location): When sinking a VDEF, never place that into an irreducible region. * gcc.dg/torture/pr121756.c: New testcase.
2025-09-03libstdc++: Fix std::get<T> for std::pair with reference members [PR121745]Jonathan Wakely2-4/+56
Make the std::get<T> overloads for rvalues use std::forward<T>(p.first) not std::move(p.first), so that lvalue reference members are not incorrectly converted to rvalues. It might appear that std::move(p).first would also work, but the language rules say that for std::pair<T&&, U> that would produce T& rather than the expected T&& (see the discussion in P2445R1 §8.2). Additional tests are added to verify all combinations of reference members, value categories, and const-qualification. libstdc++-v3/ChangeLog: PR libstdc++/121745 * include/bits/stl_pair.h (get): Use forward instead of move in std::get<T> overloads for rvalue pairs. * testsuite/20_util/pair/astuple/get_by_type.cc: Check all value categories and cv-qualification. Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
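The difference between the two strategies can be sketched as follows. This is a minimal illustration, not the libstdc++ implementation: `first_by_forward` and `first_by_move` are hypothetical helpers, and only the resulting types are checked.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helper modelling the fixed behaviour: forward the member as
// its declared type T, so an lvalue-reference member stays an lvalue.
template<typename T, typename U>
constexpr T&& first_by_forward(std::pair<T, U>&& p) noexcept
{ return std::forward<T>(p.first); }

// Hypothetical helper modelling the old behaviour: std::move always yields
// an rvalue, even when the member is declared as an lvalue reference.
template<typename T, typename U>
constexpr decltype(auto) first_by_move(std::pair<T, U>&& p) noexcept
{ return std::move(p.first); }

// For pair<int&, char> the forwarding version returns int&, the moving
// version returns int&&.
static_assert(std::is_same_v<
  decltype(first_by_forward(std::declval<std::pair<int&, char>>())), int&>);
static_assert(std::is_same_v<
  decltype(first_by_move(std::declval<std::pair<int&, char>>())), int&&>);
```

For non-reference and rvalue-reference members both helpers return `int&&`, so only the lvalue-reference case changes.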
2025-09-03Remove vector type setting from vect_recog_cond_expr_convert_patternRichard Biener1-5/+3
This pattern doesn't do any target support check so no need to set a vector type. * tree-vect-patterns.cc (vect_recog_cond_expr_convert_pattern): Do not set any vector types.
2025-09-03tree-optimization/121767 - modvar pattern breaking reductionsRichard Biener2-1/+10
The a % b -> a - (a / b) * b pattern breaks reduction constraints, disable it for reduction stmts. PR tree-optimization/121767 * tree-vect-patterns.cc (vect_recog_mod_var_pattern): Disable for reductions. * gcc.dg/vect/pr121767.c: New testcase.
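For reference, the arithmetic identity behind the rewrite can be checked with a small scalar sketch. `mod_via_div` is a hypothetical name, and this is plain C++, not the vectorizer's pattern code.

```cpp
// Hypothetical helper: the remainder expressed without the % operator,
// using truncating integer division as C and C++ define it.
constexpr int mod_via_div(int a, int b)
{
  return a - (a / b) * b;
}

// The rewritten form agrees with % for positive and negative operands.
static_assert(mod_via_div(17, 5) == 17 % 5);
static_assert(mod_via_div(-17, 5) == -17 % 5);
```

Note that the rewritten expression uses `a` twice. That may be what conflicts with the constraints on reduction statements, though this is an inference rather than something the commit states.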
2025-09-03tree-optimization/121758 - fix pattern stmt REDUC_IDX updatingRichard Biener2-6/+39
The following fixes a corner case of pattern stmt STMT_VINFO_REDUC_IDX updating which happens auto-magically. When a 2nd pattern sequence uses defs from inside a prior pattern sequence then the first guess for the lookfor can be off. This happens when for example widening patterns use vect_get_internal_def, which looks into earlier patterns. PR tree-optimization/121758 * tree-vect-patterns.cc (vect_mark_pattern_stmts): Try harder to find a reduction continuation. * gcc.dg/vect/pr121758.c: New testcase.
2025-09-03MAINTAINERS: Add myself as an aarch64 port reviewerAlice Carlotti1-0/+1
Changelog: * MAINTAINERS: Add myself as an aarch64 port reviewer.
2025-09-03libstdc++: Make CTAD ignore pair(const T1&, const T2&) constructor [PR110853]Jonathan Wakely2-1/+11
For the pair(T1, T2) explicit deduction type to decay its arguments as intended, we need the pair(const T1&, const T2&) constructor to not be used for CTAD. Otherwise we try to instantiate pair<T1, T2> without decaying, which is ill-formed for function lvalues. Use std::type_identity_t<T1> to make the constructor unusable for an implicit deduction guide. libstdc++-v3/ChangeLog: PR libstdc++/110853 * include/bits/stl_pair.h [C++20] (pair(const T1&, const T2&)): Use std::type_identity_t<T1> for first parameter. * testsuite/20_util/pair/cons/110853.cc: New test. Reviewed-by: Patrick Palka <ppalka@redhat.com> Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
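The technique can be sketched on a stand-alone type. This is a minimal illustration under stated assumptions: `Pair`, `identity`, `identity_t` and `answer` are hypothetical names, and `identity` stands in for `std::type_identity` so the sketch also builds as C++17.

```cpp
#include <type_traits>

// Hypothetical stand-in for std::type_identity (C++20).
template<typename T> struct identity { using type = T; };
template<typename T> using identity_t = typename identity<T>::type;

// Hypothetical pair-like type, not std::pair.
template<typename T1, typename T2>
struct Pair
{
  T1 first;
  T2 second;

  // T1 appears only in a non-deduced context here, so the implicit
  // deduction guide generated from this constructor can never deduce T1
  // and drops out of class template argument deduction.
  constexpr Pair(const identity_t<T1>& a, const T2& b)
  : first(a), second(b) { }
};

// Explicit guide: by-value parameters decay functions and arrays.
template<typename T1, typename T2>
Pair(T1, T2) -> Pair<T1, T2>;

constexpr int answer() { return 42; }

// A function lvalue decays to a function pointer instead of trying to
// instantiate Pair<int(), int>.
static_assert(std::is_same_v<decltype(Pair(answer, 1)), Pair<int (*)(), int>>);
```

Once the arguments are deduced, the same constructor is still used for the actual initialization; it is only hidden from deduction.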
2025-09-03libstdc++: Restore C++20 <chrono> support for old std::string ABIJonathan Wakely4-6/+28
The r16-3416-g806de30f51c8b9 change to use __cpp_lib_chrono in preprocessor conditions broke support for <chrono> for freestanding and the COW std::string ABI. That happened because __cpp_lib_chrono is only defined to the C++20 value for hosted and for the new ABI, because the full set of C++20 features are not defined for freestanding and tzdb is not defined for the old ABI. This introduces a new internal feature test macro that corresponds to the features that are always supported (e.g. chrono::local_time, chrono::year, chrono::weekday). libstdc++-v3/ChangeLog: * include/bits/version.def (chrono_cxx20): Define. * include/bits/version.h: Regenerate. * include/std/chrono: Check __glibcxx_chrono_cxx20 instead of __cpp_lib_chrono for C++20 features that don't require the new std::string ABI and/or can be used for freestanding. * src/c++20/clock.cc: Adjust preprocessor condition. Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
2025-09-03fold: Unwrap MEM_REF after get_inner_reference in split_address_to_core_and_offset [PR121355]Andrew Pinski2-0/+56
split_address_to_core_and_offset calls get_inner_reference. Take: ``` _6 = t_3(D) + 12; _8 = &MEM[(struct s1 *)t_3(D) + 4B].t; _1 = _6 - _8; ``` On the assignment of _8, get_inner_reference will return `MEM[(struct s1 *)t_3(D) + 4B]` and an offset, but that does not match up with `t_3(D)`, which is how split_address_to_core_and_offset handles pointer plus. So this patch adds the unwrapping of the MEM_REF after the call to get_inner_reference and has it act like a pointer plus. Changes since v1: * v2: Remove check on operand 1 for poly_int_tree_p, it always is. Add before the check to see if it fits in shwi instead of after. Bootstrapped and tested on x86_64-linux-gnu. PR tree-optimization/121355 gcc/ChangeLog: * fold-const.cc (split_address_to_core_and_offset): Handle a MEM_REF after the call to get_inner_reference. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/ptrdiff-1.c: New test. Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2025-09-03Daily bump.GCC Administrator8-1/+234
2025-09-02Move the folding of memcmp to memcmp_eq to fold all builtinsAndrew Pinski2-71/+35
This is a small cleanup by moving the optimization of memcmp to memcmp_eq to fab from strlen pass. Since the copy of the other part of the memcmp strlen optimization to forwprop, this was the only thing left that strlen can do memcmp. Note this move will cause memcmp_eq to be used for -Os too. It also removes the optimization from strlen since both are now handled elsewhere. Bootstrapped and tested on x86_64-linux-gnu. gcc/ChangeLog: * tree-ssa-ccp.cc (optimize_memcmp_eq): New function. (pass_fold_builtins::execute): Call optimize_memcmp_eq for memcmp. * tree-ssa-strlen.cc (strlen_pass::handle_builtin_memcmp): Remove. (strlen_pass::check_and_optimize_call): Don't call handle_builtin_memcmp. Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2025-09-02strlen: Fixup load alignment for memcmpAndrew Pinski1-2/+10
Like the previous commit but for strlen copy so we can backport this commit. The loads should have the correct alignment on them so we need to create newly aligned types when the alignment of the pointer is less than the alignment of the current type. Pushed as pre-approved by https://gcc.gnu.org/pipermail/gcc-patches/2025-September/694016.html after a bootstrap/test on x86_64-linux-gnu. gcc/ChangeLog: * tree-ssa-strlen.cc (strlen_pass::handle_builtin_memcmp): Create unaligned types if the alignment of the pointers is less than the alignment of the new type. Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2025-09-02forwprop: Fix alignment of types in expansion of memcmpAndrew Pinski1-2/+10
I noticed that when looking into g++.dg/tree-ssa/vector-compare-1.C failure on arm, the wrong alignment was being used for the load. There needs to be an unaligned type here to get the correct alignment. NOTE this means the code in strlen is also wrong but that is on its way out so I am not sure if we should update it or not to backport to the release branches; there could be wrong code happening too. Bootstrapped and tested on x86_64-linux-gnu. gcc/ChangeLog: * tree-ssa-forwprop.cc (simplify_builtin_memcmp): Create unaligned types if the alignment of the pointers is less than the alignment of the new type. Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2025-09-02Fortran: Allow PDT parameterized procedure pointer components [PR89707]Paul Thomas3-0/+48
2025-09-02 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/89707 * decl.cc (gfc_get_pdt_instance): Copy the typebound procedure field from the PDT template. If the template interface has kind=0, provide the new instance with an interface with a type spec that points to that of the parameterized component. (match_ppc_decl): When 'saved_kind_expr' this is a PDT and the expression should be copied to the component kind_expr. * gfortran.h: Define gfc_get_tbp. gcc/testsuite/ PR fortran/89707 * gfortran.dg/pdt_43.f03: New test.
2025-09-02Fortran: Handle PDTs correctly with unlimited selector [PR87669]Paul Thomas3-2/+51
2025-09-02 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/87669 * expr.cc (gfc_spec_list_type): If no LEN components are seen, unconditionally return 'SPEC_ASSUMED'. This suppresses an invalid error in match.cc(gfc_match_type_is). gcc/testsuite/ PR fortran/87669 * gfortran.dg/pdt_42.f03: New test. libgfortran/ PR fortran/87669 * intrinsics/extends_type_of.c (is_extension_of): Use the vptr rather than the hash value to identify the types.
2025-09-02arm: testsuite: improve test compatibility of asm-hard-reg-... testsRichard Earnshaw2-3/+3
On arm, overriding -march can lead to warnings if the testsuite options try to pass -mcpu. Avoid these by ensuring the -mcpu is unset before adding the architecture. Also, improve the compatibility of asm-hard-reg-error-3.c for hard-float environment by allowing FP instructions in the architecture. gcc/testsuite: * gcc.dg/asm-hard-reg-4.c: On Arm, unset the CPU before setting the arch. * gcc.dg/asm-hard-reg-error-3.c: Similarly. Also add floating-point instructions to aid hard-float variants. Match on arm* not just arm.
2025-09-02tree-optimization/121753 - ICE with pattern breaking reduction constraintsRichard Biener1-0/+7
The recent change to vect_synth_mult_by_constant failed to handle the synth_shift_p case for alg_shift, so we still changed c * 4 to c + c + c + c. The following also amends alg_add_t2_m, alg_sub_t2_m, alg_add_factor and alg_sub_factor appropriately. PR tree-optimization/121753 * tree-vect-patterns.cc (vect_synth_mult_by_constant): Properly bail when synth_shift_p and an alg_shift use. Handle other problematic cases.
2025-09-02RISC-V: Fix is_vlmax_len_p and use for strided ops.Robin Dapp2-10/+30
This patch changes is_vlmax_len_p to handle VLS modes properly. Before we would check if len == GET_MODE_NUNITS (mode). This works for VLA modes but not necessarily for VLS modes. We regularly have e.g. small VLS modes where LEN equals their number of units but which do not span a full vector. Therefore now check if len * GET_MODE_UNIT_SIZE (mode) equals BYTES_PER_RISCV_VECTOR * TARGET_MAX_LMUL. Changing this uncovered an oversight in avlprop where we used GET_MODE_NUNITS as AVL when GET_MODE_NUNITS / NF would be correct. The testsuite is unchanged. I didn't bother to add a dedicated test because we would have seen the fallout anyway once the gather patch lands. gcc/ChangeLog: * config/riscv/riscv-v.cc (is_vlmax_len_p): Properly handle VLS modes. (imm_avl_p): Fix VLS length check. (expand_strided_load): Use is_vlmax_len_p. (expand_strided_store): Ditto. * config/riscv/riscv-avlprop.cc (pass_avlprop::execute): Use GET_MODE_NUNITS / NF as avl.
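The difference between the old and new check can be modelled with plain integers. This is only a sketch: the function and parameter names are hypothetical, and the sizes used (a 16-byte vector register, a maximum LMUL of 1) are illustrative assumptions rather than values taken from the patch.

```cpp
// Hypothetical model of the old check: LEN only has to match the number
// of units of the mode.
constexpr bool old_is_vlmax_len(unsigned len, unsigned mode_nunits)
{
  return len == mode_nunits;
}

// Hypothetical model of the new check: LEN elements must cover the full
// register group, i.e. bytes per vector register times the maximum LMUL.
constexpr bool new_is_vlmax_len(unsigned len, unsigned unit_size_bytes,
                                unsigned bytes_per_vector, unsigned max_lmul)
{
  return len * unit_size_bytes == bytes_per_vector * max_lmul;
}

// A small VLS mode with 2 x 4-byte elements: LEN equals the unit count,
// but 8 bytes do not span an assumed 16-byte vector.
static_assert(old_is_vlmax_len(2, 2));
static_assert(!new_is_vlmax_len(2, 4, 16, 1));

// 4 x 4-byte elements do fill the assumed 16-byte vector.
static_assert(new_is_vlmax_len(4, 4, 16, 1));
```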
2025-09-02RISC-V: Handle overlap in expand_vec_perm PR121742.Robin Dapp2-3/+38
In a two-source gather we unconditionally overwrite target with the first gather's result already. If op1 == target this clobbers the source operand for the second gather. This patch uses a temporary in that case. PR target/121742 gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_vec_perm): Use temporary if op1 and target overlap. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr121742.c: New test.
2025-09-02docs: Add NoOffload option flag to the internals manualAndrew Stubbs1-0/+6
The NoOffload flag was introduced recently (commit "Don't pass vector params through to offload targets"). gcc/ChangeLog: * doc/options.texi: Document NoOffload.
2025-09-02s390: Adjust s390/spaceship-fp-*.c tests for recent changesJakub Jelinek2-4/+4
In r16-3414 libstdc++ changed its ABI (for the still experimental C++20 support) and now uses the unordered value -128 instead of 2. Generally the change improved code generation on all targets tested, see https://gcc.gnu.org/pipermail/gcc-patches/2025-August/693534.html for details. In r16-3474 I've adjusted the middle-end and backends to use that value. This apparently broke the gcc.target/s390/spaceship-fp-2.c test, with -ffast-math the 2 value is unreachable and so the .SPACESHIP last argument in that case is the default, which changed from 2 to -128. But spaceship-fp-1.c test also doesn't test what libstdc++ uses anymore, so the following patch uses -128 in all the spots. 2025-09-02 Jakub Jelinek <jakub@redhat.com> * gcc.target/s390/spaceship-fp-1.c: Expect .SPACESHIP call with -128 as last argument instead of 2. (TEST): Use -128 instead of 2. * gcc.target/s390/spaceship-fp-2.c: Expect .SPACESHIP call with -128 as last argument instead of 2. (TEST): Use -128 instead of 2.
2025-09-02c++, contracts: Simplify contracts headers [NFC].Iain Sandoe13-84/+91
We have contracts-related declarations and macros split between contracts.h and cp-tree.h, and then contracts.h is included in the latter, which means that it is included in all c++ front end files. This patch: - moves all the contracts-related material to contracts.h. - makes some functions that are only used in contracts.cc static. - tries to group the external API for contracts into related topics. - includes contracts.h in the front end sources that need it. gcc/cp/ChangeLog: * constexpr.cc: Include contracts.h * coroutines.cc: Likewise. * cp-gimplify.cc: Likewise. * decl.cc: Likewise. * decl2.cc: Likewise. * mangle.cc: Likewise. * module.cc: Likewise. * pt.cc: Likewise. * search.cc: Likewise. * semantics.cc: Likewise. * contracts.cc (validate_contract_role, setup_default_contract_role, add_contract_role, get_concrete_axiom_semantic, get_default_contract_role): Make static. * cp-tree.h (make_postcondition_variable, grok_contract, finish_contract_condition, find_contract, set_decl_contracts, get_contract_semantic, set_contract_semantic): Move to contracts.h. * contracts.h (get_contract_role, add_contract_role, validate_contract_role, setup_default_contract_role, lookup_concrete_semantic, get_default_contract_role): Remove. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
2025-09-02D, Darwin, Powerpc: Fix build error.Iain Sandoe1-1/+1
osthread.d is trying to use PPC_THREAD_STATE32 which is not defined in thread_act.d (PPC_THREAD_STATE is defined for the 32b case). This leads to a build fail for libdruntime. libphobos/ChangeLog: * libdruntime/core/thread/osthread.d: Use PPC_THREAD_STATE instead of PPC_THREAD_STATE32. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
2025-09-02RISC-V: Add Zbb extension sext testcase.Jiawei1-0/+15
This patch updates RISC-V Zbb extension 'sext' instruction generation. Supplemented the instruction generation detection of 'sext.h' and 'sext.b'. gcc/testsuite/ChangeLog: * gcc.target/riscv/zbb-sext.c: New test.
2025-09-02RISC-V: Update Zba 'shNadd.uw' testcase.Jiawei1-1/+19
This patch updates RISC-V Zba extension 'shNadd.uw' instruction generation. Supplemented the instruction generation detection of 'sh1add.uw' and 'sh3add.uw'. gcc/testsuite/ChangeLog: * gcc.target/riscv/zba-shadd.c: New test functions.
2025-09-02libstdc++: Move _Index_tuple, _Build_index_tuple to <type_traits>.Luc Grosheintz2-20/+22
As preparation for implementing std::constant_wrapper that's part of the C++26 version of the <type_traits> header, the two classes _Index_tuple and _Build_index_tuple are moved to <type_traits>. These two helpers are needed by std::constant_wrapper to initialize the elements of one C array with another. Since <bits/utility.h> already includes <type_traits>, this solution avoids creating a very small header file for just these two internal classes. This approach doesn't move std::index_sequence and related code to <type_traits> and therefore doesn't change which headers provide user-facing features. libstdc++-v3/ChangeLog: * include/bits/utility.h (_Index_tuple): Move to <type_traits>. (_Build_index_tuple): Ditto. * include/std/type_traits (_Index_tuple): Ditto. (_Build_index_tuple): Ditto. Reviewed-by: Jonathan Wakely <jwakely@redhat.com> Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
2025-09-02testsuite: i386: Fix gcc.target/i386/memset-strategy-1[03].c on Solaris/x86Rainer Orth2-2/+2
The new gcc.target/i386/memset-strategy-1[03].c tests FAIL on Solaris/x86: FAIL: gcc.target/i386/memset-strategy-10.c check-function-bodies foo FAIL: gcc.target/i386/memset-strategy-13.c check-function-bodies foo The issue is the same as several times previously: they need to be compiled with -fasynchronous-unwind-tables -fdwarf2-cfi-asm, which this patch does. Tested on i386-pc-solaris2.11 (as and gas) and x86_64-pc-linux-gnu. 2025-09-01 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/testsuite: * gcc.target/i386/memset-strategy-10.c (dg-options): Add -fasynchronous-unwind-tables -fdwarf2-cfi-asm. * gcc.target/i386/memset-strategy-13.c: Likewise.
2025-09-02Restore STMT_VINFO_VECTYPE during analysis, set to NULL for all stmtsRichard Biener1-8/+5
The following makes vect_analyze_stmt call vectorizable_* with all STMT_VINFO_VECTYPE NULL_TREE but restores the value for eventual iteration with single-lane SLP. It clears it for every stmt during vect_transform_stmt. * tree-vect-stmts.cc (vect_transform_stmt): Clear STMT_VINFO_VECTYPE for all stmts. (vect_analyze_stmt): Likewise. But restore at the end again.
2025-09-02tree-optimization/121754 - ICE with vect_reduc_type and nested cycleRichard Biener3-7/+30
The reduction guard isn't correct, STMT_VINFO_REDUC_DEF also exists for nested cycles not part of reductions but there's no reduction info for them. PR tree-optimization/121754 * tree-vectorizer.h (vect_reduc_type): Simplify to not ICE on nested cycles. * gcc.dg/vect/pr121754.c: New testcase. * gcc.target/aarch64/vect-pr121754.c: Likewise.
2025-09-02Avoid touching STMT_VINFO_VECTYPE in bump_vector_ptrRichard Biener1-8/+2
bump is always specified, so remove the STMT_VINFO_VECTYPE touching path. * tree-vect-data-refs.cc (bump_vector_ptr): Remove the STMT_VINFO_VECTYPE use, bump is always specified.
2025-09-02Pass vectype to vect_check_gather_scatterRichard Biener5-18/+22
The strided-store path needs to have the SLP tree's vector type so the following patch passes down the vector type to be used to vect_check_gather_scatter and adjusts all other callers. This removes one of the last pieces requiring STMT_VINFO_VECTYPE during SLP stmt analysis. * tree-vectorizer.h (vect_check_gather_scatter): Add vectype parameter. * tree-vect-data-refs.cc (vect_check_gather_scatter): Get vectype as parameter. (vect_analyze_data_refs): Adjust. * tree-vect-patterns.cc (vect_recog_gather_scatter_pattern): Likewise. * tree-vect-slp.cc (vect_get_and_check_slp_defs): Get vectype as parameter, pass down. (vect_build_slp_tree_2): Adjust. * tree-vect-stmts.cc (vect_mark_stmts_to_be_vectorized): Likewise. (vect_use_strided_gather_scatters_p): Likewise.
2025-09-02libstdc++: Rename __cmp_cat::__unspec to __cmp_cat::__literal_zero.Tomasz Kamiński1-35/+35
This slightly improves the readability of the error message, by suggesting that 0 (literal) is expected as argument: invalid conversion from 'int' to 'std::__cmp_cat::__literal_zero*' libstdc++-v3/ChangeLog: * libsupc++/compare (__cmp_cat::__literal_zero): Rename from __unspec. (__cmp_cat::__unspec): Rename to __literal_zero. (operator==, operator<, operator>, operator<=, operator>=): Replace __cmp_cat::__unspec with __cmp_cat::__literal_zero.
2025-09-02doc: Fix sort order for counted_by attributeJonathan Wakely1-79/+79
gcc/ChangeLog: * doc/extend.texi (Common Variable Attributes): Put counted_by in alphabetical order.
2025-09-02tree-cfg: Fix up assign_discriminator ICE with too large #line [PR121663]Jakub Jelinek2-3/+16
As mentioned in the PR, LOCATION_LINE is represented in an int, and while we have -pedantic diagnostics (and -pedantic-errors error) for too large #line, we can still overflow into negative line numbers up to -2 and -1. We could overflow to that even with valid source if it, say, has #line 2147483640 and then just has 2G+ lines after it. Now, the ICE is because assign_discriminator{,s} uses a hash_map with int_hash <int64_t, -1, -2>, so values -2 and -1 are reserved for deleted and empty entries. We just need to make sure those aren't valid. One possible fix would be just that - discrim_entry &e = map.get_or_insert (LOCATION_LINE (loc), &existed); + discrim_entry &e + = map.get_or_insert ((unsigned) LOCATION_LINE (loc), &existed); by adding unsigned cast when the key is signed 64-bit, it will never be -1 or -2. But I think that is wasteful, discrim_entry is a struct with 2 unsigned non-static data members, so for lines which can only be 0 to 0xffffffff (sure, with wrap-around), I think just using a hash_map with 96bit elts is better than 128bit. So, the following patch just doesn't assign any discriminators for lines -1U and -2U, I think that is fine, normal programs never do that. Another possibility would be to handle lines -1U and -2U as if it was say -3U. 2025-09-02 Jakub Jelinek <jakub@redhat.com> PR middle-end/121663 * tree-cfg.cc (assign_discriminator): Change map argument type from hash_map with int_hash <int64_t, -1, -2> to one with int_hash <unsigned, -1U, -2U>. Cast LOCATION_LINE to unsigned. Return early for (unsigned) LOCATION_LINE above -3U. (assign_discriminators): Change map type from hash_map with int_hash <int64_t, -1, -2> to one with int_hash <unsigned, -1U, -2U>. * gcc.dg/pr121663.c: New test.
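The early-return condition described in the ChangeLog can be sketched in isolation. This is a hedged model, not the code in tree-cfg.cc: `line_gets_discriminator` is a hypothetical name, and the sketch assumes a 32-bit `int`.

```cpp
// Hypothetical model of the guard: after casting the line to unsigned,
// the two largest values (-1U and -2U) are reserved by the hash table as
// marker keys, so such lines are skipped.
constexpr bool line_gets_discriminator(int line)
{
  return static_cast<unsigned>(line) <= -3U;
}

// Ordinary line numbers are usable as keys.
static_assert(line_gets_discriminator(1));
// Overflowed line numbers -1 and -2 collide with the reserved keys.
static_assert(!line_gets_discriminator(-1));
static_assert(!line_gets_discriminator(-2));
// -3 wraps to -3U, the largest key that is still allowed.
static_assert(line_gets_discriminator(-3));
```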
2025-09-02testsuite: Fix gcc.dg/tree-ssa/cswtch-[67].c on Solaris/SPARC with asRainer Orth2-2/+2
The gcc.dg/tree-ssa/cswtch-[67].c tests FAIL on Solaris/SPARC with the native as: FAIL: gcc.dg/tree-ssa/cswtch-6.c scan-assembler .rodata.cst16 FAIL: gcc.dg/tree-ssa/cswtch-7.c scan-assembler .rodata.cst32 The issue is the same in both cases: compared to the gas version, with as there's only - .section .rodata.cst32,"aM",@progbits,32 + .section ".rodata" It turns out that varasm.c (mergeable_constant_section) only emits the former if HAVE_GAS_SHF_MERGE, which is 0 with the native as. Fixed by xfailing the tests in this case. Tested on sparc-sun-solaris2.11 with both as and gas. 2025-07-30 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/testsuite: * gcc.dg/tree-ssa/cswtch-6.c (dg-final): xfail on sparc*-*-solaris2* && !gas. * gcc.dg/tree-ssa/cswtch-7.c: Likewise.
2025-09-02RISC-V: Remove unused print_ext_doc_entry function [NFC]Kito Cheng1-44/+0
The print_ext_doc_entry function and associated version_t struct in gen-riscv-ext-opt.cc were not being used anywhere in the codebase. Remove them to clean up the code. gcc/ * config/riscv/gen-riscv-ext-opt.cc (version_t): Remove unused struct. (print_ext_doc_entry): Remove unused function.
2025-09-01Testsuite: Don't test vector-compare-1.C on strict alignment targetsAndrew Pinski1-1/+1
This testcase will fail on strict alignment targets due to the requirement of doing a possible unaligned load. This fixes that. Note this testcase still fails on arm (and maybe riscv) targets: they do have unaligned loads, but slow ones. Pushed as obvious after testing on x86_64-linux-gnu to make sure it is still testing. gcc/testsuite/ChangeLog: * g++.dg/tree-ssa/vector-compare-1.C: Restrict to non_strict_align targets. Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2025-09-02Daily bump.GCC Administrator6-1/+248
2025-09-01install: Fix spelling of "support" and "arithmetic"Jonathan Grant1-7/+7
gcc: * doc/install.texi (Configuration): Fix spelling of "support" and "floating-point arithmetic". Signed-off-by: Jonathan Grant <jg@jguk.org>