AgeCommit message (Collapse)AuthorFilesLines
2025-07-29testsuite: Generalise aarch64/saturating_arithmetic*.cRichard Sandiford2-10/+10
gcc.target/aarch64/saturating_arithmetic_{1,2}.c expect w0 and w1 to be duplicated into vectors. The tests expected the duplication of w1 to happen first, but the other order would be fine too. A later simplify-rtx.cc patch happens to change the order. gcc/testsuite/ * gcc.target/aarch64/saturating_arithmetic_1.c: Allow w0 and w1 to be duplicated in either order. * gcc.target/aarch64/saturating_arithmetic_2.c: Likewise.
2025-07-29testsuite: Make aarch64/cmpbr.c more forgivingRichard Sandiford1-20/+20
The 8-bit and 16-bit tests in cmpbr.c assumed an inverted operand order ("w1, w0"), but it's possible to use the uninverted operand order too. This patch generalises the tests to support both forms. This is a prerequisite for a later patch that adds a new simplify-rtx.cc rule. gcc/testsuite/ * gcc.target/aarch64/cmpbr.c: Support both operand orders for 8-bit and 16-bit comparisons.
2025-07-29aarch64: Fix function_expander::get_reg_targetRichard Sandiford1-1/+2
function_expander::get_reg_target didn't actually check for a register, meaning that it could return a memory target instead. That doesn't really matter for the current direct and indirect uses (svundef*, svcreate*, and svset*) but it will for later patches. gcc/ * config/aarch64/aarch64-sve-builtins.cc (function_expander::get_reg_target): Check whether the target is a valid register_operand.
2025-07-29[modula2] Tidyup remove unused local variablesGaius Mulley2-7/+0
This patch removes unused local variables from three procedures. gcc/m2/ChangeLog: * gm2-compiler/M2GenGCC.mod (FoldBecomes): Remove all local variables. (CodeIndrX): Remove length. Remove newstr. * gm2-compiler/M2Range.mod (FoldTypeIndrX): Remove desType. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2025-07-29asf: Fix case of multiple stores with base offset [PR120660]Konstantinos Eleftheriou2-8/+46
When having multiple stores with the same offset as the load, in the case that we are eliminating the load, we were generating a mov instruction for both of them, leading to the overwrite of the register containing the loaded value. This patch fixes this issue by generating a mov instruction only for the first store in the store-load sequence that has the same offset as the load. For the next ones that might be encountered, we use bit-field insertion. Bootstrapped/regtested on AArch64 and x86_64. PR rtl-optimization/120660 gcc/ChangeLog: * avoid-store-forwarding.cc (process_store_forwarding): Fix instruction generation when having multiple stores with base offset. gcc/testsuite/ChangeLog: * gcc.dg/pr120660.c: New test.
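The mov-then-insert scheme described above can be sketched in plain C++. This is only an illustration under stated assumptions: the helper names (`insert_bits`, `forwarded_value`, `overwritten_value`, `memory_value`) are hypothetical and do not appear in avoid-store-forwarding.cc, the scenario is a 32-bit store followed by an 8-bit store at the same offset and then a 32-bit load, and a little-endian byte layout is modelled explicitly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a bit-field insertion: put the low `width` bits of
// `field` into `reg` at bit position `pos`, leaving the other bits untouched.
inline uint32_t insert_bits(uint32_t reg, uint32_t field, unsigned pos, unsigned width)
{
  const uint32_t ones = (width >= 32) ? ~0u : ((1u << width) - 1u);
  const uint32_t mask = ones << pos;
  return (reg & ~mask) | ((field << pos) & mask);
}

// Scheme described in the commit message: a mov for the first store at the
// load's offset, bit-field insertion for every later store at that offset.
inline uint32_t forwarded_value(uint32_t wide, uint8_t narrow)
{
  uint32_t reg = wide;                   // mov for the first store
  reg = insert_bits(reg, narrow, 0, 8);  // later store is merged, not moved
  return reg;
}

// The behaviour being fixed: a second mov overwrites the whole register.
inline uint32_t overwritten_value(uint32_t wide, uint8_t narrow)
{
  uint32_t reg = wide;  // first mov
  reg = narrow;         // second mov clobbers the upper bytes
  return reg;
}

// Reference: what a real 32-bit load would see, with the byte layout
// modelled as little-endian (an assumption of this sketch).
inline uint32_t memory_value(uint32_t wide, uint8_t narrow)
{
  uint8_t mem[4];
  for (unsigned i = 0; i < 4; ++i)
    mem[i] = static_cast<uint8_t>(wide >> (8 * i));
  mem[0] = narrow;  // the later 8-bit store at the same offset
  uint32_t v = 0;
  for (unsigned i = 0; i < 4; ++i)
    v |= static_cast<uint32_t>(mem[i]) << (8 * i);
  return v;
}

inline void check_forwarding()
{
  assert(forwarded_value(0x11223344u, 0xAA) == memory_value(0x11223344u, 0xAA));
}
```

With these helpers, `forwarded_value` agrees with the memory model while `overwritten_value` loses the upper three bytes, which is the kind of miscompile the message describes.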
2025-07-29libstdc++: Test using range_format::map as format_kind.Tomasz Kamiński1-1/+3
This addresses a TODO from the test file. libstdc++-v3/ChangeLog: * testsuite/std/format/ranges/format_kind.cc: New test. Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
2025-07-29RISC-V: Remove use of structured binding to fix compiler warningChristoph Müllner1-1/+2
Function riscv_ext_is_subset () uses structured bindings to iterate over all keys and values of an unordered map. However, this is only available since C++17 and causes a warning like this: warning: structured bindings only available with ‘-std=c++17’ This patch addresses the warning. gcc/ChangeLog: * common/config/riscv/riscv-common.cc (riscv_ext_is_subset): Remove use of structured binding to fix compiler warning. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
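For illustration, a loop over an unordered map can avoid structured bindings by naming the pair members explicitly. The sketch below is not the code from riscv-common.cc: the `ext_map` alias, the `is_subset` function and its subset logic are assumptions made up for this example.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an extension table: name -> enabled flag.
using ext_map = std::unordered_map<std::string, bool>;

// Returns true if every extension enabled in `sub` is also enabled in `sup`.
inline bool is_subset(const ext_map &sub, const ext_map &sup)
{
  // C++17 would allow: for (const auto &[name, enabled] : sub)
  // The form below also compiles as C++11/C++14 without a warning.
  for (const auto &entry : sub)
    {
      const std::string &name = entry.first;
      const bool enabled = entry.second;
      if (!enabled)
        continue;
      const auto it = sup.find(name);
      if (it == sup.end() || !it->second)
        return false;
    }
  return true;
}

inline void check_is_subset()
{
  ext_map a, b;
  a["m"] = true;
  b["m"] = true;
  b["a"] = true;
  assert(is_subset(a, b));
}
```

The only change relative to the C++17 spelling is that `entry.first` and `entry.second` are bound to named references by hand.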
2025-07-29asf: Skip when an instruction doesn't satisfy the constraints [PR119795]Konstantinos Eleftheriou2-8/+83
While scanning the instructions and upon reaching an instruction that doesn't satisfy the constraints that we have set, we were removing the already detected stores, but we were continuing adding stores from that point onward. This was causing issues when the address ranges from later stores overlapped with the load's address, leading to partial and wrong update of the register containing the loaded value. With this patch, we are skipping the transformation for stores that operate on the load's address range, when stores that operate on the same range have been deleted due to constraint violations. PR rtl-optimization/119795 gcc/ChangeLog: * avoid-store-forwarding.cc (store_forwarding_analyzer::avoid_store_forwarding): Skip transformations for stores that operate on the same address range as deleted ones. gcc/testsuite/ChangeLog: * gcc.target/i386/pr119795.c: New test.
2025-07-29RISC-V: Add test cases for mul based unsigned scalar SAT_MULPan Li12-3/+117
Add run and tree-optimized check for mul based unsigned scalar SAT_MUL instead of the widen_mul. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat/sat_u_mul-run-1-u16-from-u64.c: Add rv64 target for run. * gcc.target/riscv/sat/sat_u_mul-run-1-u32-from-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_mul-run-1-u8-from-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_mul-1-u16-from-u32.c: New test. * gcc.target/riscv/sat/sat_u_mul-1-u8-from-u16.c: New test. * gcc.target/riscv/sat/sat_u_mul-1-u8-from-u32.c: New test. * gcc.target/riscv/sat/sat_u_mul-2-u16-from-u64.c: New test. * gcc.target/riscv/sat/sat_u_mul-2-u32-from-u64.c: New test. * gcc.target/riscv/sat/sat_u_mul-2-u8-from-u64.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-1-u16-from-u32.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-1-u8-from-u16.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-1-u8-from-u32.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2025-07-29Match: Introduce mul based pattern for unsigned SAT_MULPan Li1-5/+17
Like the widen_mul based pattern, we would like to introduce a mul based pattern as well. The pattern is quite simple compared to the widen_mul one, so it is added as a new pattern instead of going through the for loop in match.pd. gcc/ChangeLog: * match.pd: Add mul based unsigned SAT_MUL. Signed-off-by: Pan Li <pan2.li@intel.com>
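As a rough illustration of what an unsigned saturating multiply computed through a wider type looks like at the source level, here is a minimal sketch. The function name follows the naming of the test files listed above (`u8-from-u64`), but the body is an assumption: the log does not show the exact source form that the new match.pd pattern recognises.

```cpp
#include <cassert>
#include <cstdint>

// Saturating u8 multiply done with an ordinary multiplication in u64.
// If the product does not fit in 8 bits, the result saturates to UINT8_MAX.
inline uint8_t sat_u_mul_u8_from_u64(uint8_t a, uint8_t b)
{
  const uint64_t product = static_cast<uint64_t>(a) * static_cast<uint64_t>(b);
  const uint8_t max = UINT8_MAX;
  return product > max ? max : static_cast<uint8_t>(product);
}

inline void check_sat_u_mul()
{
  assert(sat_u_mul_u8_from_u64(16, 16) == 255);  // 256 saturates
}
```

For example, `sat_u_mul_u8_from_u64(10, 20)` yields 200, while `sat_u_mul_u8_from_u64(16, 16)` saturates to 255.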
2025-07-29Another testcase for PR120687Richard Biener1-0/+16
This shows reassoc is harmful even with len == 3. PR tree-optimization/120687 * gcc.dg/vect/pr120687-3.c: New testcase.
2025-07-29testsuite: Fix C++14 test failure with modules test [PR121285]Nathaniel Shead1-2/+2
I hadn't validated this test worked in C++14 before submitting, fixed thusly. PR testsuite/121285 gcc/testsuite/ChangeLog: * g++.dg/modules/class-11_a.H: Make static_asserts valid for C++14. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2025-07-29tree-optimization/120687 - avoid disturbing reduction chains in reassocRichard Biener4-4/+42
Reassoc carefully ranks operands to form reduction chains for vectorization so we are careful to not apply any width related changes in the early pass. Unfortunately we are not careful enough. The following gates fma related re-ordering and also the >= 3 ops tail "optimization" which is the culprit here. This does not fix the reported inefficient vectorization when using signed integer reductions yet. PR tree-optimization/120687 * tree-ssa-reassoc.cc (reassociate_bb): Do not disturb the sorted operand order in the early pass. * tree-vect-slp.cc (vect_analyze_slp): Dump when a detected reduction chain fails SLP discovery. * gcc.dg/vect/pr120687-1.c: New testcase. * gcc.dg/vect/pr120687-2.c: Likewise.
2025-07-29Fix UB in string_slice::operator== (PR 121261)Alfie Richards1-0/+4
This adds a nullptr check to fix a regression where it is possible to call `memcmp (NULL, NULL, 0)` which is UB prior to C26. This fixes the bootstrap-ubsan build. gcc/ChangeLog: PR middle-end/121261 * vec.h: Add null ptr check.
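A guard of this kind might look like the sketch below. The `slice` struct and `slices_equal` function are hypothetical stand-ins, not the string_slice code from vec.h, and treating two empty slices as equal is an assumption of this example. The point is only that memcmp is never reached with a null pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a non-owning string slice.
struct slice
{
  const char *ptr;
  std::size_t len;
};

inline bool slices_equal(const slice &a, const slice &b)
{
  if (a.len != b.len)
    return false;
  // Never hand a null pointer to memcmp, even with a length of zero.
  if (a.ptr == nullptr || b.ptr == nullptr)
    return a.len == 0;
  return std::memcmp(a.ptr, b.ptr, a.len) == 0;
}

inline void check_slices_equal()
{
  assert(slices_equal(slice{nullptr, 0}, slice{nullptr, 0}));
}
```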
2025-07-29PR modula2/121289 Poor warning location when using Wstyle optionGaius Mulley11-54/+116
This patch adds a token location parameter to CheckVariableAgainstKeyword and dependants ensuring that the warning is generated from the token associated with the variable rather than the end of the statement. gcc/m2/ChangeLog: PR modula2/121289 * gm2-compiler/M2Students.def (CheckVariableAgainstKeyword): New parameter tok. * gm2-compiler/M2Students.mod (CheckVariableAgainstKeyword): New parameter tok. Pass tok to PerformVariableKeywordCheck. (PerformVariableKeywordCheck): New parameter tok. Pass tok to MetaErrorStringT0. * gm2-compiler/P2SymBuild.mod (BuildVariable): Pass tok to CheckVariableAgainstKeyword. * gm2-libs-iso/LowLong.mod (except): Replace with ... (exceptSrc): ... this. * gm2-libs-iso/LowReal.mod (except): Replace with ... (exceptSrc): ... this. * gm2-libs-iso/LowShort.mod (except): Replace with ... (exceptSrc): ... this. * gm2-libs-iso/Processes.mod (Wait): Replace from with fromCor. * gm2-libs-iso/RndFile.mod (EndPos): Replace end with endP. * gm2-libs/SCmdArgs.mod (GetArg): Replace start with startPos. Replace end with endPos. (NArg): Replace start with startPos. Replace end with endPos. gcc/testsuite/ChangeLog: PR modula2/121289 * gm2/warnings/style/fail/badvarname.mod: New test. * gm2/warnings/style/fail/warnings-style-fail.exp: New test. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2025-07-29testsuite: Restore dg-do run on pr116906 and pr78185 testsChristophe Lyon3-0/+3
Commit r15-7152-g57b706d141b87c removed /* { dg-do run { target*-*-linux* *-*-gnu* *-*-uclinux* } } */ from these tests, turning them into 'compile' only tests, even when they could be executed. This patch adds /* { dg-do run } */ which is OK since the tests are correctly skipped if needed thanks to the following effective-targets (alarm and signal). With this patch we have again two entries for these tests on linux targets: * compile (test for excess errors) * execution test gcc/testsuite/ChangeLog: * gcc.dg/pr116906-1.c: Add 'dg-do run'. * gcc.dg/pr116906-2.c: Likewise. * gcc.dg/pr78185.c: Likewise.
2025-07-29calls: Allow musttail calls to noreturn [PR121159]Jakub Jelinek3-2/+20
In the PR119483 r15-9003 change we've allowed musttail calls to noreturn functions, after all the decision not to normally tail call noreturn functions is not because it is not possible to tail call those, but because it screws up backtraces. As the following testcase shows, we've done that only for functions not declared [[noreturn]]/_Noreturn but later on discovered through IPA as noreturn. Functions explicitly declared [[noreturn]] have (for historical reasons) volatile FUNCTION_TYPE and the FUNCTION_DECLs are volatile as well, so in order to support those we shouldn't complain on ECF_NORETURN (we've stopped doing so for musttail in PR119483) but also shouldn't complain about TYPE_VOLATILE on their FUNCTION_TYPE (something that IPA doesn't change, I think it only sets TREE_THIS_VOLATILE on the FUNCTION_DECL). volatile on function type really means noreturn as well, it has no other meaning. 2025-07-29 Jakub Jelinek <jakub@redhat.com> PR middle-end/121159 * calls.cc (can_implement_as_sibling_call_p): Don't reject declared noreturn functions in musttail calls. * c-c++-common/pr121159.c: New test. * gcc.dg/plugin/must-tail-call-2.c (test_5): Don't expect an error.
2025-07-28output: Move an special # (256) to a new macroAndrew Pinski3-6/+9
This is a followup to the review of mergability of CSWTCH patch located at https://gcc.gnu.org/pipermail/gcc-patches/2025-July/690810.html. Moves the special # (256) to a macro so it is not used bare in the source and there is only the need to change it in one place. This special # was added with r0-37392-g201556f0e00580 which added the original mergeable section support to gcc. Pushed as obvious after build and test on x86_64. gcc/ChangeLog: * output.h (MAX_ALIGN_MERGABLE): New define. * tree-switch-conversion.cc (switch_conversion::build_one_array): Use MAX_ALIGN_MERGABLE instead of 256. * varasm.cc (mergeable_string_section): Likewise (mergeable_constant_section): Likewise Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2025-07-28Improve mergability of CSWTCH [PR120523]Andrew Pinski4-7/+99
When I did r16-1067-gaa935ce40a7, I thought it would be enough to mark the decl as mergable to get it to merge on all targets. Turns out a few things needed to be changed to support it being mergable on all targets. The first thing is to improve the selection of the mergable section: instead of basing it on the DECL's mode, it should be based on the size. The second thing that needed to happen is to change the alignment of the CSWTCH decl to be aligned to the next power of 2 compared to the size if the size is less than 32 bytes (the max mergable size that is supported). With these changes, cswtch-6.c passes on ia32 and other targets. And the new testcase cswtch-7.c will pass now too. Note I noticed that darwin's darwin_mergeable_constant_section could be "fixed" up to use DECL_SIZE instead of the DECL_MODE but I am not sure it makes a huge difference. Bootstrapped and tested on x86_64-linux-gnu. PR middle-end/120523 gcc/ChangeLog: * output.h (mergeable_constant_section): New declaration taking unsigned HOST_WIDE_INT for the size. * tree-switch-conversion.cc (switch_conversion::build_one_array): Increase the alignment of CSWTCH for sizes less than 32bytes. * varasm.cc (mergeable_constant_section): Split out twice. One that takes the size in unsigned HOST_WIDE_INT and the other size in a tree. (default_elf_select_section): Pass DECL_SIZE instead of DECL_MODE to mergeable_constant_section. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/cswtch-7.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
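The alignment rule described above can be sketched as a small helper. Everything here is an assumption for illustration: the name `cswtch_alignment`, working in bytes rather than bits, and the exact boundary handling (the message says "less than 32 bytes"; whether the real code uses `<` or `<=` is not visible in the log).

```cpp
#include <cassert>

// Sketch: raise the alignment of a small table to the next power of two of
// its size so that it can land in a mergeable constant section.
inline unsigned cswtch_alignment(unsigned size_bytes, unsigned current_align)
{
  const unsigned max_mergeable_bytes = 32;  // limit quoted in the message
  if (size_bytes == 0 || size_bytes >= max_mergeable_bytes)
    return current_align;  // too large (or empty): leave alignment alone

  unsigned pow2 = 1;
  while (pow2 < size_bytes)
    pow2 <<= 1;  // smallest power of two that is >= size_bytes

  return pow2 > current_align ? pow2 : current_align;
}

inline void check_cswtch_alignment()
{
  assert(cswtch_alignment(6, 1) == 8);
}
```

A 6-byte table with byte alignment would be raised to 8, and a 12-byte table with 4-byte alignment to 16; a 40-byte table keeps its alignment.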
2025-07-29Un-factor vectorizable_load partsRichard Biener1-32/+30
When the costing refactoring happened we ended up with some strange inter-mixing of VMAT unrelated code. The following moves stuff closer to where it's actually used, at the expense of duplicating some lines. * tree-vect-stmts.cc (vectorizable_load): Un-factor VMAT specific code to their handling blocks.
2025-07-29Eliminate gather-scatter-info offset_dt memberRichard Biener3-26/+10
The following removes this only set member. Slightly complicated by the hoops get_group_load_store_type jumps through. I've simplified that, noting the offset vector type that's relevant is that of the actual offset SLP node, not of what vect_check_gather_scatter (re-)computes. * tree-vectorizer.h (gather_scatter_info::offset_dt): Remove. * tree-vect-data-refs.cc (vect_describe_gather_scatter_call): Do not set it. (vect_check_gather_scatter): Likewise. * tree-vect-stmts.cc (vect_truncate_gather_scatter_offset): Likewise. (get_group_load_store_type): Use the vector type of the offset SLP child. Do not re-check vect_is_simple_use validated by SLP build.
2025-07-29Daily bump.GCC Administrator6-1/+220
2025-07-28AVR: target/121277 - Don't load 0x800000 with const __flashx *x = NULL.Georg-Johann Lay1-6/+13
Converting from generic AS to __flashx used the same rule like for __memx, which tags RAM (generic AS) locations by setting bit 23. The justification was that generic isn't a subset of __flashx, though that lead to surprises with code like const __flashx *x = NULL. The natural thing to do is to just load 0x000000 in that case, so that the null pointer works in __flashx as expected. Apart from that, converting NULL to __flashx (or __flash) no more raises a -Waddr-space-convert diagnostic. gcc/ PR target/121277 * config/avr/avr.cc (avr_addr_space_convert): When converting from generic AS to __flashx, don't set bit 23. (avr_convert_to_type): Don't -Waddr-space-convert when NULL is converted to __flashx or to __flash.
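The difference between the two conversion rules can be modelled with plain integers. This is a sketch, not avr.cc: the function names are hypothetical, 24-bit addresses are held in `uint32_t`, and only the tagging rule quoted in the message (bit 23 marks RAM) is modelled.

```cpp
#include <cassert>
#include <cstdint>

// Bit 23 tags a 24-bit address as a RAM (generic address space) location.
constexpr uint32_t ram_tag = 1u << 23;

// Old rule, shared with __memx: tag every converted generic address.
inline uint32_t tag_ram_address(uint16_t generic_addr)
{
  return ram_tag | generic_addr;
}

// New rule for __flashx as described above: do not set bit 23.
inline uint32_t generic_to_flashx(uint16_t generic_addr)
{
  return generic_addr;
}

inline void check_flashx_null()
{
  assert(generic_to_flashx(0) == 0x000000u);
}
```

Under the old rule a null pointer became 0x800000; with the untagged conversion it stays 0x000000, so a null check on the `__flashx` pointer behaves as expected.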
2025-07-28ifcvt: Fix ifcvt for multiple phi nodes after factoring operator [PR121236]Andrew Pinski2-25/+55
When I added the factor operations to ifcvt, I messed up the handling of removing the phi nodes. The fix is we need to remove the phi node that was factored out as we factored out the operator because otherwise scev can go when it comes to detecting if the new args are from a reduction. Also the need to change the interface for is_cond_scalar_reduction as the phi node that was being passed after the factoring no longer exists so need to pass the parts that were being used. PR tree-optimization/121236 gcc/ChangeLog: * tree-if-conv.cc (is_cond_scalar_reduction): Instead of phi argument, pass bb and res of the phi. (factor_out_operators): Add iterator for the phi. Remove the phi if this is the first time. Return if we had removed the phi. (predicate_scalar_phi): Add the phi iterator argument. Update call to is_cond_scalar_reduction. Update call to factor_out_operators and set the return value to true when factor_out_operators returns true. (predicate_all_scalar_phis): Don't remove the phi if predicate_scalar_phi already removed it. gcc/testsuite/ChangeLog: * gcc.dg/torture/pr121236-1.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2025-07-28x86: Disallow -mtls-dialect=gnu with no_caller_saved_registersH.J. Lu7-0/+83
__tls_get_addr doesn't preserve vector registers. When a function with no_caller_saved_registers attribute calls __tls_get_addr, YMM and ZMM registers will be clobbered. Issue an error and suggest -mtls-dialect=gnu2 in this case. gcc/ PR target/121208 * config/i386/i386.cc (ix86_tls_get_addr): Issue an error for -mtls-dialect=gnu with no_caller_saved_registers attribute and suggest -mtls-dialect=gnu2. gcc/testsuite/ PR target/121208 * gcc.target/i386/pr121208-1a.c: New test. * gcc.target/i386/pr121208-1b.c: Likewise. * gcc.target/i386/pr121208-2a.c: Likewise. * gcc.target/i386/pr121208-2b.c: Likewise. * gcc.target/i386/pr121208-3a.c: Likewise. * gcc.target/i386/pr121208-3b.c: Likewise. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2025-07-28libstdc++: Teach std::distance and std::advance about C++20 iterators [PR102181]Jonathan Wakely2-3/+125
When the C++98 std::distance and std::advance functions (and C++11 std::next and std::prev) are used with C++20 iterators there can be unexpected results, ranging from compilation failure to decreased performance to undefined behaviour. An iterator which satisfies std::input_iterator but does not meet the Cpp17InputIterator requirements might have std::output_iterator_tag for its std::iterator_traits<I>::iterator_category, which means it currently cannot be used with std::advance at all. However, the implementation of std::advance for a Cpp17InputIterator doesn't do anything that isn't valid for iterator types satisfying C++20 std::input_iterator. Similarly, a type satisfying C++20 std::bidirectional_iterator might be usable with std::prev, if it weren't for the fact that its C++17 iterator_category is std::input_iterator_tag. Finally, a type satisfying C++20 std::random_access_iterator might use a slower implementation for std::distance or std::advance if its C++17 iterator_category is not std::random_access_iterator_tag. This commit adds a __promotable_iterator concept to detect C++20 iterators which explicitly define an iterator_concept member, and which either have no iterator_category, or their iterator_category is weaker than their iterator_concept. This is used by std::distance and std::advance to detect iterators which should dispatch based on their iterator_concept instead of their iterator_category. This means that those functions just work and do the right thing for C++20 iterators which would otherwise fail to compile or have suboptimal performance. This is related to LWG 3197, which considers making it undefined to use std::prev with types which do not meet the Cpp17BidirectionalIterator requirements. I think making it work, as in this commit, is a better solution than banning it (or rejecting it at compile-time as libc++ does).
PR libstdc++/102181 libstdc++-v3/ChangeLog: * include/bits/stl_iterator_base_funcs.h (distance, advance): Check C++20 iterator concepts and handle appropriately. (__detail::__iter_category_converts_to_concept): New concept. (__detail::__promotable_iterator): New concept. * testsuite/24_iterators/operations/cxx20_iterators.cc: New test. Reviewed-by: Patrick Palka <ppalka@redhat.com> Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
2025-07-28git_commit.py: add "diagnostics" to bug componentsDavid Malcolm1-0/+1
contrib/ChangeLog * gcc-changelog/git_commit.py: Add "diagnostics" to bug components.
2025-07-28restore bootstrap with --enable-checking=release [PR121260]Mikael Pettersson5-5/+15
Current trunk doesn't bootstrap with --enable-checking=release due to improper nesting of namespaces and #if CHECKING_P blocks. This corrects that. gcc/ PR other/121260 * diagnostics/changes.cc: Correct nesting of namespaces and #if CHECKING_P blocks. * diagnostics/context.cc: Likewise. * diagnostics/html-sink.cc: Likewise. * diagnostics/output-spec.cc: Likewise. * diagnostics/sarif-sink.cc: Likewise. Signed-off-by: Mikael Pettersson <mikpelinux@gmail.com> Signed-off-by: David Malcolm <dmalcolm@redhat.com>
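The nesting problem is easy to reproduce in miniature: a namespace opened inside an `#if CHECKING_P` block must also be closed inside it, otherwise the braces stop matching when CHECKING_P is 0. The sketch below shows a well-nested layout. The fallback definition of `CHECKING_P` and the names `answer`, `selftest` and `run_tests` are assumptions for this example and are not taken from the diagnostics sources.

```cpp
#include <cassert>

#ifndef CHECKING_P
#define CHECKING_P 0  // stand-in; in GCC the value comes from configure
#endif

namespace diagnostics {

inline int answer() { return 42; }

#if CHECKING_P
// Opened and closed entirely within the conditional block, so the enclosing
// namespace stays balanced whether or not checking is enabled.
namespace selftest {

inline void run_tests() { assert(answer() == 42); }

} // namespace selftest
#endif /* CHECKING_P */

} // namespace diagnostics
```

Building the same file with `-DCHECKING_P=1` and with `-DCHECKING_P=0` is a quick way to confirm that both configurations parse.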
2025-07-28nvptx/nvptx.opt: Update -march-map= for newer sm_xxx: test casesThomas Schwinge15-0/+285
Test cases for commit 60ba2b61af23e6d561c5cbab8df57ea093ade3b3 "nvptx/nvptx.opt: Update -march-map= for newer sm_xxx". gcc/testsuite/ * gcc.target/nvptx/march-map=sm_100.c: New. * gcc.target/nvptx/march-map=sm_100a.c: Likewise. * gcc.target/nvptx/march-map=sm_100f.c: Likewise. * gcc.target/nvptx/march-map=sm_101.c: Likewise. * gcc.target/nvptx/march-map=sm_101a.c: Likewise. * gcc.target/nvptx/march-map=sm_101f.c: Likewise. * gcc.target/nvptx/march-map=sm_103.c: Likewise. * gcc.target/nvptx/march-map=sm_103a.c: Likewise. * gcc.target/nvptx/march-map=sm_103f.c: Likewise. * gcc.target/nvptx/march-map=sm_120.c: Likewise. * gcc.target/nvptx/march-map=sm_120a.c: Likewise. * gcc.target/nvptx/march-map=sm_120f.c: Likewise. * gcc.target/nvptx/march-map=sm_121.c: Likewise. * gcc.target/nvptx/march-map=sm_121a.c: Likewise. * gcc.target/nvptx/march-map=sm_121f.c: Likewise.
2025-07-28nvptx/nvptx.opt: Update -march-map= for newer sm_xxxTobias Burnus1-0/+45
Usage of the -march-map=: "Select the closest available '-march=' value that is not more capable." As PTX ISA 8.6/8.7 (= unreleased CUDA 12.7 + CUDA 12.8) added the Nvidia Blackwell GPUs SM_100, SM_101, and SM_120, it makes sense to add them as well. Note that all three come as sm_XXX and sm_XXXa. PTX ISA 8.8 (CUDA 12.9) added SM_103 and SM_121 and the new 'f' suffix for all SM_1xx. Internally, GCC currently generates the same code for >= sm_80 (Ampere); however, as GCC's -march= also supports sm_89 (Ada), the here added sm_1xxs (Blackwell) will map to sm_89. [Naming note: while ptx code generated for sm_X can also run with sm_Y if Y > X, code generated for sm_XXXa can (generally) only run on the specific hardware; and sm_XXXf implies compatibility with only subsequent targets in the same family.] gcc/ChangeLog: * config/nvptx/nvptx.opt (march-map=): Add sm_100{,f,a}, sm_101{,f,a}, sm_103{,a,f}, sm_120{,a,f} and sm_121{,f,a}.
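The "closest available value that is not more capable" rule can be sketched as a lookup over a list of supported targets. This is an illustration only: the real mapping is a table in nvptx.opt, the `supported_sm` list below is a hypothetical subset (the message confirms only sm_80 and sm_89), comparing by SM number is an assumption, and the 'a' and 'f' suffixes are ignored.

```cpp
#include <array>
#include <cassert>

// Hypothetical subset of -march= values, as plain SM numbers.
constexpr std::array<int, 4> supported_sm = {70, 75, 80, 89};

// Return the largest supported SM number that does not exceed `requested`,
// or -1 if every supported value is more capable than the request.
inline int map_march(int requested)
{
  int best = -1;
  for (int sm : supported_sm)
    if (sm <= requested && sm > best)
      best = sm;
  return best;
}

inline void check_map_march()
{
  assert(map_march(100) == 89);
}
```

Under these assumptions every Blackwell request (100, 101, 103, 120, 121) maps to 89, which matches the statement that the newly added sm_1xx values map to sm_89.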
2025-07-28gcn: Fix CDNA3 atomics' buffer invalidationTobias Burnus1-10/+12
For device (agent) scope atomics - as needed when there is more than one team, a buffer_wbl2 followed by s_waitcnt is required. When doing the initial porting, the pre-atomic instruction got accidentally replaced by buffer_inv sc1, which is not quite the right instruction. gcc/ChangeLog: * config/gcn/gcn.md (atomic_load, atomic_store, atomic_exchange): Fix CDNA3 L2 cache write-back before atomic instructions.
2025-07-28Const correctness for gather-scatter infoRichard Biener1-6/+6
The following adds const qualification to gather_scatter_info * parameters for various APIs in the vectorizer. * tree-vect-stmts.cc (check_load_store_for_partial_vectors): Make *gs_info const. (vect_build_one_gather_load_call): Likewise. (vect_build_one_scatter_store_call): Likewise. (vect_get_gather_scatter_ops): Likewise. (vect_get_strided_load_store_ops): Likewise.
2025-07-28gcn: Add more s_nop for MI300Tobias Burnus3-38/+55
Implement another case where the CDNA3 ISA documentation requires s_nop, add a comment why another case does not need to be handled. And add one case where an s_nop is required by MI300A hardware but seems to be not mentioned in the CDNA3 ISA documentation. gcc/ChangeLog: * config/gcn/gcn.md (define_attr "vcmp"): Add with values vcmp/vcmpx/no. (*movbi, cstoredi4.., cstore<mode>4): Set it. * config/gcn/gcn-valu.md (vec_cmp<mode>...): Likewise. * config/gcn/gcn.cc (gcn_cmpx_insn_p): Remove. (gcn_md_reorg): Add two new conditions for MI300.
2025-07-28gcn: Add 'nops' insn, extend commentsTobias Burnus3-2/+18
Use 's_nops' with a number instead of multiple of 's_nop' when manually adding 1 to 5 wait state. This helps with the instruction cache and helps a tiny bit with PR119367 where a two-byte variable overflows in the debugging location view handling. Add a comment about 'sc0' to TARGET_GLC_NAME as for atomics it is unrelated to the scope but to whether the result is stored; i.e. using e.g. 'sc1' instead of 'sc0' will have undesired consequences! Update the comment above print_operand_address to document 'R' and 'V'; those are used below as "Temporary hack.", but it makes sense to see them in the list. gcc/ChangeLog: * config/gcn/gcn-opts.h (enum hsaco_attr_type): Add comment about 'sc0'. * config/gcn/gcn.cc (gcn_md_reorg): Use gen_nops instead of gen_nop. (print_operand_address): Document 'R' and 'V' in the pre-function comment as well. * config/gcn/gcn.md (nops): Add.
2025-07-28libstdc++: provide debug impl of P2697 ctor [PR119742]Nathan Myers1-0/+11
This adds the new bitset constructor from string_view defined in P2697 to the debug version of the type. libstdc++-v3/Changelog: PR libstdc++/119742 * include/debug/bitset: Add new ctor.
2025-07-28tree-optimization/121256 - properly support SLP in vectorizable recurrenceRichard Biener3-8/+155
We failed to build the correct initialization vector. For VLA vectors and a non-uniform initialization vector this rejects vectorization for now. PR tree-optimization/121256 * tree-vect-loop.cc (vectorizable_recurr): Build a correct initialization vector for SLP_TREE_LANES > 1. * gcc.dg/vect/vect-recurr-pr121256.c: New testcase. * gcc.dg/vect/vect-recurr-pr121256-2.c: Likewise.
2025-07-28libstdc++: Fix style issues in <mdspan>.Luc Grosheintz1-13/+4
libstdc++-v3/ChangeLog: * include/std/mdspan: Small stylistic adjustments. Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
2025-07-28Move STMT_VINFO_TYPE to SLP_TREE_TYPERichard Biener10-103/+94
I am at a point where I want to store additional information from analysis (from loads and stores) to re-use them at transform stage without repeating the analysis. I do not want to add to stmt_vec_info at this point, so this starts adding kind specific sub-structures by moving the STMT_VINFO_TYPE field to the SLP tree and adding a (dummy for now) union tagged by it to receive such data. The change is largely mechanical after RISC-V has been prepared to have a SLP node around. I have settled for a union (supposed to get pointers to data). As followup this enables getting rid of SLP_TREE_CODE and making VEC_PERM therein a separate type, unifying its handling. * tree-vectorizer.h (_slp_tree::type): Add. (_slp_tree::u): Likewise. (_stmt_vec_info::type): Remove. (STMT_VINFO_TYPE): Likewise. (SLP_TREE_TYPE): New. * tree-vectorizer.cc (vec_info::new_stmt_vec_info): Do not initialize type. * tree-vect-slp.cc (_slp_tree::_slp_tree): Initialize type. (vect_slp_analyze_node_operations): Adjust. (vect_schedule_slp_node): Likewise. * tree-vect-patterns.cc (vect_init_pattern_stmt): Do not copy STMT_VINFO_TYPE. * tree-vect-loop.cc: Set SLP_TREE_TYPE instead of STMT_VINFO_TYPE everywhere. (vect_create_loop_vinfo): Do not set STMT_VINFO_TYPE on loop conditions. * tree-vect-stmts.cc: Set SLP_TREE_TYPE instead of STMT_VINFO_TYPE everywhere. (vect_analyze_stmt): Adjust. (vect_transform_stmt): Likewise. * config/aarch64/aarch64.cc (aarch64_vector_costs::count_ops): Access SLP_TREE_TYPE instead of STMT_VINFO_TYPE. * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Remove non-SLP element-wise load/store matching. * config/rs6000/rs6000.cc (rs6000_cost_data::update_target_cost_per_stmt): Pass in the SLP node. Use that to get at the memory access kind and type. (rs6000_cost_data::add_stmt_cost): Pass down SLP node. * config/riscv/riscv-vector-costs.cc (variable_vectorized_p): Use SLP_TREE_TYPE. (costs::need_additional_vector_vars_p): Likewise. 
(costs::update_local_live_ranges): Likewise.
2025-07-28ada: Minor typo fix in commentMarc Poulhiès1-1/+1
gcc/ada/ChangeLog: * gcc-interface/trans.cc (gnat_to_gnu): Fix typo in comment.
2025-07-28aarch64: Add tuning model for Olympus core.Jennifer Schmitz3-1/+212
This patch adds a new tuning model for the NVIDIA Olympus core. The values used here are based on the Software Optimization Guide that will be published imminently. Bootstrapped and tested on aarch64-linux-gnu, no regression. OK for trunk? OK to backport to GCC 15? Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com> Co-Authored-By: Dhruv Chawla <dhruvc@nvidia.com> gcc/ChangeLog: * config/aarch64/aarch64-cores.def (olympus): Use olympus tuning model. * config/aarch64/aarch64.cc: Include olympus.h. * config/aarch64/tuning_models/olympus.h: New file.
2025-07-28libstdc++: Refactor tests for mdspan related accessors.Luc Grosheintz1-22/+37
Versions 1, 2 and 3 of the patch for adding aligned_accessor had a bug in the constraints that allowed conversion of aligned_accessor<T, N> a = aligned_accessor<const T, N>{}; and prevented the reverse. The file mdspan/accessors/generic.cc already contains code that checks all variations of the constraint. This commit allows passing in two different accessors, enabling it to be reused more widely. libstdc++-v3/ChangeLog: * testsuite/23_containers/mdspan/accessors/generic.cc: Refactor test_ctor. Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com> Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
2025-07-28libstdc++: Support braces as arguments for std::erase on inplace_vector [PR121196]Tomasz Kamiński2-4/+24
PR libstdc++/121196 libstdc++-v3/ChangeLog: * include/std/inplace_vector (std::erase): Provide default argument for _Up parameter. * testsuite/23_containers/inplace_vector/erasure.cc: Add test for using braces-init-list as arguments to erase_if and use function to verify content of inplace_vector Reviewed-by: Patrick Palka <ppalka@redhat.com> Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
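Why a default template argument makes braces work can be shown with an ordinary `std::vector`. The sketch below is not the inplace_vector code: `erase_value` is a hypothetical free function written for this example. A braced-init-list argument is a non-deduced context for `U`, so without the default `U = T` the call would not compile; with it, `U` falls back to the element type and the braces initialise a temporary of that type.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Erase all elements equal to `value`; returns how many were removed.
// The default `U = T` is what allows a braced-init-list as `value`.
template<typename T, typename U = T>
std::size_t erase_value(std::vector<T> &c, const U &value)
{
  const auto first_removed = std::remove(c.begin(), c.end(), value);
  const std::size_t removed = static_cast<std::size_t>(c.end() - first_removed);
  c.erase(first_removed, c.end());
  return removed;
}

inline void check_erase_value()
{
  std::vector<std::pair<int, int>> v{{1, 2}, {3, 4}, {1, 2}};
  assert(erase_value(v, {1, 2}) == 2);  // braces become a pair<int, int>
  assert(v.size() == 1);
}
```

Calls with an explicitly typed value, such as `erase_value(ints, 7)`, keep deducing `U` from the argument as before.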
2025-07-28LoongArch: Remove the definition of CASE_VECTOR_SHORTEN_MODE.Lulu Cheng1-2/+0
On LoongArch, the switch jump-table always stores absolute addresses, so there is no need to define the macro CASE_VECTOR_SHORTEN_MODE. gcc/ChangeLog: * config/loongarch/loongarch.h (CASE_VECTOR_SHORTEN_MODE): Delete.
2025-07-27xtensa: Fix remaining inaccuracies in xtensa_is_insn_L32R_p()Takayuki 'January June' Suwa1-11/+35
The previous fix also had some flaws: - The TARGET_CONST16 check was a bit premature - It didn't take into account the possibility of the RTL expression "(set (reg:SF gpr) (const_int))", especially when TARGET_AUTOLITPOOLS is configured This patch fixes the above. gcc/ChangeLog: * config/xtensa/xtensa.cc (xtensa_is_insn_L32R_p): Re-rewrite to more accurately capture insns that could be L32R machine instructions wherever possible, and add comments that help understand the intent of the process.
2025-07-28Daily bump.GCC Administrator4-1/+96
2025-07-27fortran: Consistently use the same assignment reallocation condition [PR121185]Mikael Morin1-6/+8
This is a follow-up to: r16-2248-gac8e536526393580bc9a4339bab2f8603eff8a47 fortran: Delay evaluation of array bounds after reallocation That revision delayed the evaluation of array bounds, with changes in two places: in the scalarizer where we save expressions without evaluating their values to variables, and in the reallocation code where we evaluate to variables the expressions previously saved. The effect should not have been visible in scalarized code, as the saving to a variable was only delayed after reallocation. Unfortunately, it's actually not the case, and there are cases where expressions that were saved to variables before the change, are no longer after it. The reason for that is differing conditions guarding the omission of the evaluation to variables in the scalarizer on one hand, and the emission of reallocation code with the saving to variables on the other hand. There is an additional check that avoids the emission of reallocation code if we can prove at compile time that both sides of the assignment are conformable. This change moves up the reallocation code condition definition, so that it can be used as well to flag the left hand side array as reallocatable, and omit the evaluation of expressions in the exact same conditions where the reallocation code would catch those unevaluated expressions. An explicit call to gfc_fix_class_refs is added before the evaluation of the reallocation code condition. It was implicit before, by the call to gfc_walk_expr. This is not a correctness issue, but PR #121185, that made the problem apparent, exhibited wrong code examples where the lack of an intermediary variable was making visible a class container at the beginning of an array reference, causing the non-polymorphic array reference to be evaluated in a polymorphic way. The preceding commits have already fixed the PR #121185 test, so I haven't found any addition to the testsuite that would reliably test this change. 
PR fortran/121185 gcc/fortran/ChangeLog: * trans-expr.cc (gfc_trans_assignment_1): Use the same condition to set the is_alloc_lhs flag and to decide to generate reallocation code. Add explicit call to gfc_fix_class_refs before evaluating the condition.
2025-07-27fortran: Trigger reference saving on pointer dereference [PR121185]Mikael Morin2-18/+60
This is a follow-up to revision: r16-2371-g8f41c87654fd819e48c9f6f1ac3d87e35794d310 fortran: Factor array descriptor references That revision introduced new variables to limit repeated subexpressions in array descriptor references. The change added a walk along the reference from child to parent, that selected subreferences worth saving and applied the saving if the reference proved non-trivial enough. Trivialness was defined in a comment as: only made of a DECL and NOPs and COMPONENTs. But the case of a pointer dereference didn't trigger the saving, so the code was also considering a dereference as if it were trivial. This change triggers the reference saving on pointer dereferences, making the trivialness as defined by the code aligned with the comment. This change is not strictly speaking a bug fix, but PR #121185 exhibited wrong code examples where the lack of a variable hiding the polymorphic leading part of a non-polymorphic array reference was causing the latter to be evaluated in a polymorphic way. PR fortran/121185 gcc/fortran/ChangeLog: * trans-array.cc (set_factored_descriptor_value): Also trigger the saving of the previously selected reference on encountering an INDIRECT_REF. Extract the saving code... (save_ref): ... here as a new function. gcc/testsuite/ChangeLog: * gfortran.dg/assign_14.f90: New test.
2025-07-27fortran: Bound class container lookup after array descriptor [PR121185]Mikael Morin2-0/+46
Don't look for a class container too far after an array descriptor. This avoids generating a polymorphic array reference, using the virtual table of a parent object, to access a non-polymorphic child having a type unrelated to that of the parent. PR fortran/121185 gcc/fortran/ChangeLog: * trans-expr.cc (gfc_get_class_from_expr): Give up class container lookup on the second COMPONENT_REF after an array descriptor. gcc/testsuite/ChangeLog: * gfortran.dg/assign_13.f90: New test.
2025-07-27RISC-V: Add test case for vaadd.vx combine polluting VXRMPan Li4-0/+92
Add asm check to make sure vx combine of vaadd.vx will not pollute the vxrm. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm-1-i16.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm-1-i32.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm-1-i64.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm-1-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2025-07-27RISC-V: Add test for vec_duplicate + vaadd.vv combine case 1 with GR2VR cost 0, 1 and 2Pan Li13-2/+67
Add asm dump check test for vec_duplicate + vaadd.vv combine to vaadd.vx, with GR2VR cost 0, 1 and 2. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check. * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c: Ditto. Signed-off-by: Pan Li <pan2.li@intel.com>