path: root/gcc
2025-08-04  aarch64: Use VNx16BI for svac*  (Richard Sandiford; 6 files changed, -17/+465)

This patch continues the work of making ACLE intrinsics use VNx16BI for svbool_t results.  It deals with the svac* intrinsics (floating-point compare absolute).

gcc/
	* config/aarch64/aarch64-sve.md (@aarch64_pred_fac<cmp_op><mode>):
	Replace with...
	(@aarch64_pred_fac<cmp_op><mode>_acle): ...this new expander.
	(*aarch64_pred_fac<cmp_op><mode>_strict_acle): New pattern.
	* config/aarch64/aarch64-sve-builtins-base.cc (svac_impl::expand):
	Update accordingly.

gcc/testsuite/
	* gcc.target/aarch64/sve/acle/general/acge_1.c: New test.
	* gcc.target/aarch64/sve/acle/general/acgt_1.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/acle_1.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/aclt_1.c: Likewise.
2025-08-04  aarch64: Use VNx16BI for floating-point svcmp*  (Richard Sandiford; 9 files changed, -2/+802)

This patch continues the work of making ACLE intrinsics use VNx16BI for svbool_t results.  It deals with the floating-point forms of svcmp*.

gcc/
	* config/aarch64/aarch64-sve.md (@aarch64_pred_fcm<cmp_op><mode>_acle)
	(*aarch64_pred_fcm<cmp_op><mode>_acle, @aarch64_pred_fcmuo<mode>_acle)
	(*aarch64_pred_fcmuo<mode>_acle): New patterns.
	* config/aarch64/aarch64-sve-builtins-base.cc (svcmp_impl::expand,
	svcmpuo_impl::expand): Use them.

gcc/testsuite/
	* gcc.target/aarch64/sve/acle/general/cmpeq_6.c: New test.
	* gcc.target/aarch64/sve/acle/general/cmpge_9.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmpgt_9.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmple_9.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmplt_9.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmpne_5.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmpuo_1.c: Likewise.
2025-08-04  aarch64: Use VNx16BI for svcmp*_wide  (Richard Sandiford; 11 files changed, -1/+608)

This patch continues the work of making ACLE intrinsics use VNx16BI for svbool_t results.  It deals with the svcmp*_wide intrinsics.  Since the only uses of these patterns are for ACLE intrinsics, there didn't seem much point adding an "_acle" suffix.

gcc/
	* config/aarch64/aarch64-sve.md (@aarch64_pred_cmp<cmp_op><mode>_wide):
	Split into VNx16QI_ONLY and SVE_FULL_HSI patterns.  Use VNx16BI
	results for both.
	(*aarch64_pred_cmp<cmp_op><mode>_wide): New pattern.
	(*aarch64_pred_cmp<cmp_op><mode>_wide_cc): Likewise.

gcc/testsuite/
	* gcc.target/aarch64/sve/acle/general/cmpeq_5.c: New test.
	* gcc.target/aarch64/sve/acle/general/cmpge_7.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmpge_8.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmpgt_7.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmpgt_8.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmple_7.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmple_8.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmplt_7.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmplt_8.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmpne_4.c: Likewise.
2025-08-04  aarch64: Drop unnecessary GPs in svcmp_wide PTEST patterns  (Richard Sandiford; 11 files changed, -2/+679)

Patterns that fuse a predicate operation P with a PTEST use aarch64_sve_same_pred_for_ptest_p to test whether the governing predicates of P and the PTEST are compatible.  Most patterns were also written as define_insn_and_rewrites, with the rewrite replacing P's original governing predicate with PTEST's.  This ensures that we don't, for example, have both a .H PTRUE for the PTEST and a .B PTRUE for a comparison that feeds the PTEST.

The svcmp_wide* patterns were missing this rewrite, meaning that we did have redundant PTRUEs.

gcc/
	* config/aarch64/aarch64-sve.md
	(*aarch64_pred_cmp<cmp_op><mode>_wide_cc): Turn into a
	define_insn_and_rewrite and rewrite the governing predicate of the
	comparison so that it is identical to the PTEST's.
	(*aarch64_pred_cmp<cmp_op><mode>_wide_ptest): Likewise.

gcc/testsuite/
	* gcc.target/aarch64/sve/acle/general/cmpeq_1.c: Check the number
	of PTRUEs.
	* gcc.target/aarch64/sve/acle/general/cmpge_5.c: New test.
	* gcc.target/aarch64/sve/acle/general/cmpge_6.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmpgt_5.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmpgt_6.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmple_5.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmple_6.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmplt_5.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmplt_6.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmpne_3.c: Likewise.
2025-08-04  aarch64: Use the correct GP mode in the svcmp_wide patterns  (Richard Sandiford; 2 files changed, -4/+55)

The patterns for the svcmp_wide intrinsics used a VNx16BI input predicate for all modes, instead of the usual <VPRED>.  That unnecessarily made some input bits significant, but more importantly, it triggered an ICE in aarch64_sve_same_pred_for_ptest_p when testing whether a comparison pattern could be fused with a PTEST.  A later patch will add tests for other comparisons.

gcc/
	* config/aarch64/aarch64-sve.md (@aarch64_pred_cmp<cmp_op><mode>_wide)
	(*aarch64_pred_cmp<cmp_op><mode>_wide_cc): Use <VPRED> instead of
	VNx16BI for the governing predicate.
	(*aarch64_pred_cmp<cmp_op><mode>_wide_ptest): Likewise.

gcc/testsuite/
	* gcc.target/aarch64/sve/acle/general/cmpeq_1.c: Add more tests.
2025-08-04  aarch64: Use VNx16BI for non-widening integer svcmp*  (Richard Sandiford; 25 files changed, -7/+3133)

This patch continues the work of making ACLE intrinsics use VNx16BI for svbool_t results.  It deals with the non-widening integer forms of svcmp*.  The handling of the PTEST patterns is similar to that for the earlier svwhile* patch.

Unfortunately, on its own, this triggers a failure in the pred_clobber_*.c tests.  The problem is that, after the patch, we have a comparison instruction followed by a move into p0.  Combine combines the instructions together, so that the destination of the comparison is the hard register p0 rather than a pseudo.  This defeats IRA's make_early_clobber_and_input_conflicts, which requires the source and destination to be pseudo registers.  Before the patch, there was a subreg move between the comparison and the move into p0, so it was that subreg move that ended up with a hard register destination.

Arguably the fix for PR87600 should be extended to destination registers as well as source registers, but in the meantime, the patch just disables combine for these tests.  The tests are really testing the constraints and register allocation.

gcc/
	* config/aarch64/aarch64-sve.md (@aarch64_pred_cmp<cmp_op><mode>_acle)
	(*aarch64_pred_cmp<cmp_op><mode>_acle, *cmp<cmp_op><mode>_acle_cc)
	(*cmp<cmp_op><mode>_acle_and): New patterns that yield VNx16BI
	results for all element types.
	* config/aarch64/aarch64-sve-builtins-base.cc (svcmp_impl::expand):
	Use them.
	(svcmp_wide_impl::expand): Likewise when implementing an svcmp_wide
	against an in-range constant.

gcc/testsuite/
	* gcc.target/aarch64/sve/pred_clobber_1.c: Disable combine.
	* gcc.target/aarch64/sve/pred_clobber_2.c: Likewise.
	* gcc.target/aarch64/sve/pred_clobber_3.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmpeq_2.c: Add more cases.
	* gcc.target/aarch64/sve/acle/general/cmpeq_4.c: New test.
	* gcc.target/aarch64/sve/acle/general/cmpge_1.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmpge_2.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmpge_3.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmpge_4.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmpgt_1.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmpgt_2.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmpgt_3.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmpgt_4.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmple_1.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmple_2.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmple_3.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmple_4.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmplt_1.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmplt_2.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmplt_3.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmplt_4.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmpne_1.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/cmpne_2.c: Likewise.
2025-08-04  aarch64: Use VNx16BI for svunpklo/hi_b  (Richard Sandiford; 4 files changed, -1/+77)

This patch continues the work of making ACLE intrinsics use VNx16BI for svbool_t results.  It deals with the svunpk* intrinsics.

gcc/
	* config/aarch64/aarch64-sve.md (@aarch64_sve_punpk<perm_hilo>_acle)
	(*aarch64_sve_punpk<perm_hilo>_acle): New patterns.
	* config/aarch64/aarch64-sve-builtins-base.cc (svunpk_impl::expand):
	Use them for boolean svunpk*.

gcc/testsuite/
	* gcc.target/aarch64/sve/acle/general/unpkhi_1.c: New test.
	* gcc.target/aarch64/sve/acle/general/unpklo_1.c: Likewise.
2025-08-04  aarch64: Use VNx16BI for svrev_b* [PR121294]  (Richard Sandiford; 4 files changed, -2/+56)

The previous patch for PR121294 handled svtrn1/2, svuzp1/2, and svzip1/2.  This one extends it to handle svrev intrinsics, where the same kind of wrong code can be generated.

gcc/
	PR target/121294
	* config/aarch64/aarch64.md (UNSPEC_REV_PRED): New unspec.
	* config/aarch64/aarch64-sve.md (@aarch64_sve_rev<mode>_acle)
	(*aarch64_sve_rev<mode>_acle): New patterns.
	* config/aarch64/aarch64-sve-builtins-base.cc (svrev_impl::expand):
	Use the new patterns for boolean svrev.

gcc/testsuite/
	PR target/121294
	* gcc.target/aarch64/sve/acle/general/rev_2.c: New test.
2025-08-04  aarch64: Use VNx16BI for more permutations [PR121294]  (Richard Sandiford; 10 files changed, -12/+613)

The patterns for the predicate forms of svtrn1/2, svuzp1/2, and svzip1/2 are shared with aarch64_vectorize_vec_perm_const.  The .H, .S, and .D forms operate on VNx8BI, VNx4BI, and VNx2BI respectively.  Thus, for all four element widths, there is one significant bit per element, for both the inputs and the output.

That's appropriate for aarch64_vectorize_vec_perm_const but not for the ACLE intrinsics, where every bit of the output is significant, and where every bit of the selected input elements is therefore also significant.  The current expansion can lead the optimisers to simplify inputs by changing the upper bits of the input elements (since the current patterns claim that those bits don't matter), which in turn leads to wrong code.

The ACLE expansion should operate on VNx16BI instead, for all element widths.  There was already a pattern for a VNx16BI-only form of TRN1, for constructing certain predicate constants.  The patch generalises it to handle the other five permutations as well.  For the reasons given in the comments, this is done by making the permutation unspec an operand to a new UNSPEC_PERMUTE_PRED, rather than overloading the existing unspecs, and rather than adding a new unspec for each permutation.

gcc/
	PR target/121294
	* config/aarch64/iterators.md (UNSPEC_TRN1_CONV): Delete.
	(UNSPEC_PERMUTE_PRED): New unspec.
	* config/aarch64/aarch64-sve.md (@aarch64_sve_trn1_conv<mode>):
	Replace with...
	(@aarch64_sve_<perm_insn><mode>_acle)
	(*aarch64_sve_<perm_insn><mode>_acle): ...these new patterns.
	* config/aarch64/aarch64.cc (aarch64_expand_sve_const_pred_trn):
	Update accordingly.
	* config/aarch64/aarch64-sve-builtins-functions.h
	(binary_permute::expand): Use the new _acle patterns for predicate
	operations.

gcc/testsuite/
	PR target/121294
	* gcc.target/aarch64/sve/acle/general/perm_2.c: New test.
	* gcc.target/aarch64/sve/acle/general/perm_3.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/perm_4.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/perm_5.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/perm_6.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/perm_7.c: Likewise.
2025-08-04  aarch64: Use VNx16BI for more SVE WHILE* results [PR121118]  (Richard Sandiford; 13 files changed, -5/+895)

PR121118 is about a case where we try to construct a predicate constant using a permutation of a PFALSE and a WHILELO.  The WHILELO is a .H operation and its result has mode VNx8BI.  However, the permute instruction expects both inputs to be VNx16BI, leading to an unrecognisable insn ICE.

VNx8BI is effectively a form of VNx16BI in which every odd-indexed bit is insignificant.  In the PR's testcase that's OK, since those bits will be dropped by the permutation.  But if the WHILELO had been a VNx4BI, so that only every fourth bit is significant, the input to the permutation would have had undefined bits.  The testcase in the patch has an example of this.

This feeds into a related ACLE problem that I'd been meaning to fix for a long time: every bit of an svbool_t result is significant, and so every ACLE intrinsic that returns an svbool_t should return a VNx16BI.  That doesn't currently happen for ACLE svwhile* intrinsics.

This patch fixes both issues together.  We still need to keep the current WHILE* patterns for autovectorisation, where the result mode should match the element width.  The patch therefore adds a new set of patterns that are defined to return VNx16BI instead.  For want of a better scheme, it uses an "_acle" suffix to distinguish these new patterns from the "normal" ones.  The formulation used is:

  (and:VNx16BI (subreg:VNx16BI normal-pattern 0) C)

where C has mode VNx16BI and is a canonical ptrue for normal-pattern's element width (so that the low bit of each element is set and the upper bits are clear).  This is a bit clunky, and leads to some repetition.  But it has two advantages:

* After g:965564eafb721f8000013a3112f1bba8d8fae32b, converting the above expression back to normal-pattern's mode will reduce to normal-pattern, so that the pattern for testing the result using a PTEST doesn't change.

* It gives RTL optimisers a bit more information, as the new tests demonstrate.

In the expression above, C is matched using a new "special" predicate aarch64_ptrue_all_operand, where "special" means that the mode on the predicate is not necessarily the mode of the expression.  In this case, C always has mode VNx16BI, but the mode on the predicate indicates which kind of canonical PTRUE is needed.

gcc/
	PR testsuite/121118
	* config/aarch64/iterators.md (VNx16BI_ONLY): New mode iterator.
	* config/aarch64/predicates.md (aarch64_ptrue_all_operand): New
	predicate.
	* config/aarch64/aarch64-sve.md
	(@aarch64_sve_while_<while_optab_cmp><GPI:mode><VNx16BI_ONLY:mode>_acle)
	(@aarch64_sve_while_<while_optab_cmp><GPI:mode><PRED_HSD:mode>_acle)
	(*aarch64_sve_while_<while_optab_cmp><GPI:mode><PRED_HSD:mode>_acle)
	(*while_<while_optab_cmp><GPI:mode><PRED_HSD:mode>_acle_cc): New
	patterns.
	* config/aarch64/aarch64-sve-builtins-functions.h
	(while_comparison::expand): Use the new _acle patterns that always
	return a VNx16BI.
	* config/aarch64/aarch64-sve-builtins-sve2.cc
	(svwhilerw_svwhilewr_impl::expand): Likewise.
	* config/aarch64/aarch64.cc (aarch64_sve_move_pred_via_while):
	Likewise.

gcc/testsuite/
	PR testsuite/121118
	* gcc.target/aarch64/sve/acle/general/pr121118_1.c: New test.
	* gcc.target/aarch64/sve/acle/general/whilele_13.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/whilelt_6.c: Likewise.
	* gcc.target/aarch64/sve2/acle/general/whilege_1.c: Likewise.
	* gcc.target/aarch64/sve2/acle/general/whilegt_1.c: Likewise.
	* gcc.target/aarch64/sve2/acle/general/whilerw_5.c: Likewise.
	* gcc.target/aarch64/sve2/acle/general/whilewr_5.c: Likewise.
2025-08-04  aarch64: Improve svdupq_lane expansion for big-endian [PR121293]  (Richard Sandiford; 2 files changed, -2/+11)

If the index to svdupq_lane is variable, or is outside the range of the .Q form of DUP, the fallback expansion is to convert to VNx2DI and use TBL.  The problem in this PR was that the conversion used subregs, and on big-endian targets, a bitcast from VNx2DI to another element size requires a REV[BHW] in the best case or a spill and reload in the worst case.  (See the comment at the head of aarch64-sve.md for details.)  Here we want the conversion to act like svreinterpret, so it should use aarch64_sve_reinterpret instead of subregs.

gcc/
	PR target/121293
	* config/aarch64/aarch64-sve-builtins-base.cc (svdupq_lane::expand):
	Use aarch64_sve_reinterpret instead of subregs.  Explicitly
	reinterpret the result back to the required mode, rather than
	leaving the caller to take a subreg.

gcc/testsuite/
	PR target/121293
	* gcc.target/aarch64/sve/acle/general/dupq_lane_9.c: New test.
2025-08-04  tree-optimization/121362 - missed FRE through aggregate copy  (Richard Biener; 3 files changed, -21/+153)

The following streamlines and generalizes how we find the common base of the lookup ref and a kill ref when looking through aggregate copies.  In particular this tries to deal with all variants of punning that happens on the inner MEM_REF after forwarding of address taken components of the common base.

	PR tree-optimization/121362
	* tree-ssa-sccvn.cc (vn_reference_lookup_3): Generalize aggregate
	copy handling.

	* gcc.dg/tree-ssa/ssa-fre-105.c: New testcase.
	* gcc.dg/tree-ssa/ssa-fre-106.c: Likewise.
2025-08-04  invoke.texi: Update docs of -fdump-{rtl,tree}-<pass>-<options>  (Filip Kastl; 1 file changed, -7/+10)

This patch changes two things.  Firstly, -fdump-rtl-<whatever>-graph and other such options are documented under -fdump-tree, so at least add a remark about this under -fdump-rtl.  Secondly, the documentation incorrectly says that -fdump-tree-<whatever>-graph is not implemented.  Change that.

gcc/ChangeLog:

	* doc/invoke.texi: Add remark about -options being documented under
	-fdump-tree.  Remove remark about -graph working only for RTL.

Signed-off-by: Filip Kastl <fkastl@suse.cz>
2025-08-03  x86: Don't hoist non all 0s/1s vector set outside of loop  (H.J. Lu; 2 files changed, -50/+106)

Don't hoist a non all 0s/1s vector set outside of the loop, to avoid extra spills.

gcc/
	PR target/120941
	* config/i386/i386-features.cc (x86_cse_kind): Moved before
	ix86_place_single_vector_set.
	(redundant_load): Likewise.
	(ix86_place_single_vector_set): Replace the last argument with a
	pointer to redundant_load.  For X86_CSE_VEC_DUP, don't place the
	vector set outside of the loop to avoid extra spills.
	(remove_redundant_vector_load): Pass load to
	ix86_place_single_vector_set.

gcc/testsuite/
	PR target/120941
	* gcc.target/i386/pr120941-1.c: New test.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2025-08-04  Daily bump.  (GCC Administrator; 3 files changed, -1/+34)
2025-08-03  c++: Add stringification testcase for CWG1709 [PR120778]  (Jakub Jelinek; 1 file changed, -0/+18)

CWG1709 just codifies existing GCC (and clang) behavior, so this just adds a testcase for that.

2025-08-03  Jakub Jelinek  <jakub@redhat.com>

	PR preprocessor/120778
	* g++.dg/DRs/dr1709.C: New test.
2025-08-03  libcpp: Fix up cpp_maybe_module_directive [PR120845]  (Jakub Jelinek; 1 file changed, -0/+8)

My changes for "Module Declarations Shouldn’t be Macros" paper broke the following testcase.  The backup handling intentionally tries to drop CPP_PRAGMA_EOL token if things go wrong, which is desirable for the case where we haven't committed to the module preprocessing directive (i.e. changed the first token to the magic one).  In that case there is no preprocessing directive start and so CPP_PRAGMA_EOL would be wrong.  If there is a premature new-line after we've changed the first token though, we shouldn't drop CPP_PRAGMA_EOL, because otherwise we ICE in the FE.

While clang++ and MSVC accept the testcase, in my reading it is incorrect at least in the C++23 and newer wordings and I think the changes have been a DR, https://eel.is/c++draft/cpp.module has no exception for new-lines and https://eel.is/c++draft/cpp.pre#1.sentence-2 says that new-line (unless deleted during phase 2 when after backslash) ends the preprocessing directive.

The patch arranges for eol being set only in the not_module case.

2025-08-03  Jakub Jelinek  <jakub@redhat.com>

	PR c++/120845
libcpp/
	* lex.cc (cpp_maybe_module_directive): Move eol variable declaration
	to the start of the function, initialize to false and only set it to
	peek->type == CPP_PRAGMA_EOL in the not_module case.  Formatting fix.
gcc/testsuite/
	* g++.dg/modules/cpp-21.C: New test.
2025-08-03  AVR: Use avr_add_ccclobber / DONE_ADD_CCC in md instead of repeats.  (Georg-Johann Lay; 3 files changed, -961/+436)

There are many post-reload define_insn_and_split's that just append a (clobber (reg:CC REG_CC)) to the pattern.  Instead of repeating the original patterns, avr_add_ccclobber (curr_insn) is used to do that job.  This avoids repeating patterns all over the place, and splits that do something different (like using a canonical form) stand out clearly.

gcc/
	* config/avr/avr.md (define_insn_and_split) [reload_completed]:
	For splits that just append a (clobber (reg:CC REG_CC)) to the
	pattern, use avr_add_ccclobber (curr_insn) instead of repeating
	the original pattern.
	* config/avr/avr-dimode.md: Same.
	* config/avr/avr-fixed.md: Same.
2025-08-03  AVR: Add avr.cc::avr_add_ccclobber().  (Georg-Johann Lay; 2 files changed, -0/+25)

gcc/
	* config/avr/avr.cc (avr_add_ccclobber): New function.
	* config/avr/avr-protos.h (avr_add_ccclobber): New proto.
	(DONE_ADD_CCC): New define.
2025-08-03  tree-optimization/90242 - UBSAN error in vn_reference_compute_hash  (Richard Biener; 1 file changed, -3/+3)

The following plugs possible overflow issues in vn_reference_compute_hash and possibly in vn_reference_eq.  The inchash "integer" adds are a bit of a mess, but I know overloads with different integer types can get messy, so not this time.  For hashing simply truncate to 64bits.

	PR tree-optimization/90242
	* tree-ssa-sccvn.cc (vn_reference_compute_hash): Use
	poly_offset_int for offset accumulation.  For hashing truncate
	to 64 bits and also hash 64 bits.
	(vn_reference_eq): Likewise.
2025-08-03  Daily bump.  (GCC Administrator; 6 files changed, -1/+33)
2025-08-03  doc: Drop note on 16-bit Windows support  (Gerald Pfeifer; 1 file changed, -7/+0)

gcc:
	PR target/69374
	* doc/install.texi (Specific) <windows>: Drop note on 16-bit
	Windows support.  Streamline note on 32-bit support.
2025-08-02  cobol: Use %td in error_msg in 3 spots  (Jakub Jelinek; 3 files changed, -5/+5)

On Thu, Jul 31, 2025 at 11:33:07PM +0200, Jakub Jelinek via Gcc wrote:
> > this was all described in excruciating detail in the patch submission
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2025-June/687385.html
> >
> > and the commit message.
>
> Looking at that patch, the dbgmsg change looks correct (dbgmsg is
> ATTRIBUTE_PRINTF_1), while the last 3 hunks are suboptimal, they should
> really use %td and keep the ptrdiff_t arguments without casts.

Here it is in patch form.  I couldn't find other similar casts in calls to ATTRIBUTE_GCOBOL_DIAG functions.

2025-08-02  Jakub Jelinek  <jakub@redhat.com>

	* parse.y (intrinsic): Use %td format specifier with no cast on
	argument instead of %ld with cast to long.
	* scan_ante.h (numstr_of): Likewise.
	* util.cc (cbl_field_t::report_invalid_initial_value): Likewise.
2025-08-02  c: rewrite implementation of `arg spec' attribute  (Martin Uecker; 4 files changed, -199/+145)

Rewrite the implementation of the `arg spec' attribute to pass the original type of an array parameter instead of passing a string description and a list of bounds.  The string and list are then created from this type during the integration of the information of `arg spec' into the access attribute, because it is still needed later for various warnings.  This change makes the implementation simpler and more robust, as the declarator is not processed and information that is already encoded in the type is not duplicated.  A similar change to the access attribute could then completely remove the string processing.  With this change, the original type is now available and can be used for other warnings or for _Countof.  The new implementation tries to be faithful to the original, but this is not entirely true as it fixes a bug in the old implementation.

gcc/c-family/ChangeLog:

	* c-attribs.cc (handle_argspec_attribute): Update.
	(build_arg_spec): New function.
	(build_attr_access_from_parms): Rewrite `arg spec' handling.

gcc/c/ChangeLog:

	* c-decl.cc (get_parm_array_spec): Remove.
	(push_parm_decl): Do not add `arg spec` attribute.
	(build_arg_spec_attribute): New function.
	(grokdeclarator): Add `arg spec` attribute.

gcc/testsuite/ChangeLog:

	* gcc.dg/Warray-parameter-11.c: Change Warray-parameter to
	-Wvla-parameter as these are VLAs.
	* gcc.dg/Warray-parameter.c: Remove xfail.
2025-08-02  Daily bump.  (GCC Administrator; 7 files changed, -1/+134)
2025-08-01  i386: Fix incorrect attributes-error.c test  (Artemiy Granat; 1 file changed, -1/+2)

gcc/testsuite/ChangeLog:

	* gcc.target/i386/attributes-error.c: Change incorrect
	sseregparm,fastcall combination to cdecl,fastcall.
2025-08-01  cobol: Minor changes to quiet cppcheck warnings. [PR119324]  (Robert Dubner; 4 files changed, -10/+12)

gcc/cobol/ChangeLog:

	PR cobol/119324
	* cbldiag.h (location_dump): Inline suppression of
	knownConditionTrueFalse.
	* genapi.cc (parser_statement_begin): Combine two if() statements.
	* genutil.cc (get_binary_value): File-level suppression of
	duplicateBreak.
	* symbols.cc (symbol_elem_cmp): File-level suppression of
	duplicateBreak.
2025-08-01  PR modula2/121354: ICE when attempting to fold HIGH from an unbounded array in a nested procedure  (Gaius Mulley; 1 file changed, -18/+33)

The bug fix re-implements gcc/m2/gm2-compiler/M2GenGCC.mod:FoldHigh to ignore any attempt to constant fold HIGH if it has an unbounded array operand.

gcc/m2/ChangeLog:

	PR modula2/121354
	* gm2-compiler/M2GenGCC.mod (FoldHigh): Rewrite.
	(IsUnboundedArray): New procedure function.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2025-08-01  fortran: Fix closing brace in comment  (Mikael Morin; 1 file changed, -1/+1)

In a comment, fix the closing brace of the tree layout definition of the openmp allocate clause.  It was confusing vim's matching brace support.

gcc/fortran/ChangeLog:

	* trans-decl.cc (gfc_trans_deferred_vars): Fix closing brace in a
	comment.
2025-08-01  Properly record SLP node when costing a vectorized store  (Richard Biener; 1 file changed, -1/+1)

Even when we emit scalar stores we should pass down the SLP node.

	PR tree-optimization/121350
	* tree-vect-stmts.cc (vectorizable_store): Pass down SLP node
	when costing scalar stores in vect_body.
2025-08-01  Avoid representing SLP mask by scalar op  (Richard Biener; 1 file changed, -55/+62)

The following removes the scalar mask output from vect_check_scalar_mask and deals with the fallout, eliminating uses of it.  That's mostly replacing checks on 'mask' by checks on 'mask_node' but also realizing PR121349 and fixing that up a bit in check_load_store_for_partial_vectors.

	PR tree-optimization/121349
	* tree-vect-stmts.cc (check_load_store_for_partial_vectors):
	Get full SLP mask, reduce to uniform scalar_mask for further
	processing if possible.
	(vect_check_scalar_mask): Remove scalar mask output, remove code
	conditional on slp_mask.
	(vectorizable_call): Adjust.
	(check_scan_store): Get and check SLP mask.
	(vectorizable_store): Eliminate scalar mask variable.
	(vectorizable_load): Likewise.
2025-08-01  doc: mdocml.bsd.lv is now mandoc.bsd.lv  (Gerald Pfeifer; 1 file changed, -1/+1)

On the way switch from http to https.

gcc:
	* doc/install.texi (Prerequisites): mdocml.bsd.lv is now
	mandoc.bsd.lv.
2025-08-01  Merge get_group_load_store_type into get_load_store_type  (Richard Biener; 1 file changed, -74/+31)

The following merges back get_group_load_store_type into get_load_store_type, it gets easier to follow that way.  I've removed the unused ncopies parameter as well.

	* tree-vect-stmts.cc (get_group_load_store_type): Remove, inline
	into ...
	(get_load_store_type): ... this.  Remove ncopies parameter.
	(vectorizable_load): Adjust.
	(vectorizable_store): Likewise.
2025-08-01  Some TLC to vectorizable_store  (Richard Biener; 1 file changed, -33/+12)

The following removes redundant checks and scalar operand uses.

	* tree-vect-stmts.cc (get_group_load_store_type): Remove checks
	performed at SLP build time.
	(vect_check_store_rhs): Remove scalar RHS output.
	(vectorizable_store): Remove uses of scalar RHS.
2025-08-01  Add VMAT_UNINITIALIZED  (Richard Biener; 2 files changed, -1/+3)

We're using VMAT_INVARIANT as default, but we should simply have an uninitialized state.

	* tree-vectorizer.h (VMAT_UNINITIALIZED): New
	vect_memory_access_type.
	* tree-vect-slp.cc (_slp_tree::_slp_tree): Use it.
2025-08-01  tree-optimization/121338 - UBSAN error in adjust_setup_cost  (Richard Biener; 1 file changed, -3/+5)

The following avoids possibly overflowing adds for rounding.  We know cost is bound, so it's enough to do this simple test.

	PR tree-optimization/121338
	* tree-ssa-loop-ivopts.cc (avg_loop_niter): Return an unsigned.
	(adjust_setup_cost): When niters is so large the division result
	is one or zero avoid it.
	(create_new_ivs): Adjust.
2025-08-01  Put SLP_TREE_SIMD_CLONE_INFO into type specific data  (Richard Biener; 3 files changed, -11/+17)

The following adds vect_simd_clone_data as a container for vect type specific data for vectorizable_simd_clone_call and moves SLP_TREE_SIMD_CLONE_INFO there.

	* tree-vectorizer.h (vect_simd_clone_data): New.
	(_slp_tree::simd_clone_info): Remove.
	(SLP_TREE_SIMD_CLONE_INFO): Likewise.
	* tree-vect-slp.cc (_slp_tree::_slp_tree): Adjust.
	(_slp_tree::~_slp_tree): Likewise.
	* tree-vect-stmts.cc (vectorizable_simd_clone_call): Use type
	specific data to store SLP_TREE_SIMD_CLONE_INFO.
2025-08-01  Use a class hierarchy for vect specific data  (Richard Biener; 2 files changed, -6/+13)

The following turns the union into a class hierarchy.  Once completed, SLP_TREE_TYPE could move into the base class.

	* tree-vect-slp.cc (_slp_tree::_slp_tree): Adjust.
	(_slp_tree::~_slp_tree): Likewise.
	* tree-vectorizer.h (vect_data): New base class.
	(_slp_tree::u): Remove.
	(_slp_tree::data): Add pointer to vect_data.
	(_slp_tree::get_data): New helper template.
2025-08-01  bswap: Fix up ubsan detected UB in find_bswap_or_nop [PR121322]  (Jakub Jelinek; 2 files changed, -0/+16)

The following testcase results in compiler UB as detected by ubsan.  find_bswap_or_nop first checks is_bswap_or_nop_p and if that fails on the tmp_n value, tries some rotation of that if possible.  The discovery what rotate count to use ignores zero bytes from the least significant end (those mean zero bytes and so can be masked away) and on the first non-zero non-0xff byte (0xff means don't know), 1-8 means some particular byte of the original, computes count (the rotation count) from that byte + the byte index.

Now, on the following testcase we have tmp_n 0x403020105060700, i.e. the least significant byte is zero, then the msb from the original value, byte below it, another one below it, then the low 32 bits of the original value.  So, we stop at count 7 with i 1, it wraps around and we get count 0.  Then we invoke UB on

  tmp_n = tmp_n >> count | tmp_n << (range - count);

because count is 0 and range is 64.

Now, of course I could fix it up by doing tmp_n << ((range - count) % range) or something similar, but that is just wasted compile time, if count is 0, we already know that is_bswap_or_nop_p failed on that tmp_n value and so it will fail again if the value is the same.  So I think better just return NULL (i.e. punt).

2025-08-01  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/121322
	* gimple-ssa-store-merging.cc (find_bswap_or_nop): Return NULL
	if count is 0.

	* gcc.dg/pr121322.c: New test.
2025-08-01  c++/modules: Warn for optimize attributes instead of ICEing [PR108080]  (Nathaniel Shead; 2 files changed, -10/+24)

This PR is the most frequently reported modules bug for 15, as the ICE message does not indicate the issue at all and reducing to find the underlying cause can be tricky.

I have a WIP patch to fix this issue by just reconstructing these nodes on stream-in from any attributes applied to the functions, but since at this stage it may still take a while to be ready, it seems useful to me to at least make the error here more friendly and guide users to what they could do to work around this issue.

In fact, as noted on the PR, a lot of the time it should be harmless to just ignore the optimize etc. attribute and continue translation, at the user's own risk; this patch as such turns the ICE into a warning with no option to silence.

	PR c++/108080

gcc/cp/ChangeLog:

	* module.cc (trees_out::core_vals): Warn when streaming
	target/optimize node; adjust comments.
	(trees_in::core_vals): Don't stream a target/optimize node.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/pr108080.H: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
2025-08-01  c++/modules: Merge PARM_DECL properties from function definitions [PR121238]  (Nathaniel Shead; 4 files changed, -0/+56)

When we merge a function definition, if there already exists a forward declaration in the importing TU we use the PARM_DECLs belonging to that decl.  This usually works fine, except as noted in the linked PR there are some flags (such as TREE_ADDRESSABLE) that only get set on a PARM_DECL once a definition is provided.  This patch fixes the wrong-code issues by propagating any properties on PARM_DECLs I could find that may affect codegen.

	PR c++/121238

gcc/cp/ChangeLog:

	* module.cc (trees_in::fn_parms_fini): Merge properties for
	definitions.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/merge-19.h: New test.
	* g++.dg/modules/merge-19_a.H: New test.
	* g++.dg/modules/merge-19_b.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
2025-08-01Daily bump.GCC Administrator8-1/+578
2025-07-31PR modula2/121314: quotes appearing in concatenated error stringsGaius Mulley9-12/+177
This patch fixes the addition of strings so that no extraneous quotes appear in the result string.  The fix is made to the bootstrap tool mc and it has been rebuilt.

gcc/m2/ChangeLog:

	PR modula2/121314
	* mc-boot/GFormatStrings.cc (PerformFormatString): Rebuilt.
	* mc-boot/GM2EXCEPTION.cc (M2EXCEPTION_M2Exception): Rebuilt.
	* mc-boot/GSFIO.cc (SFIO_GetFileName): Rebuilt.
	* mc-boot/GSFIO.h (SFIO_GetFileName): Rebuilt.
	* mc-boot/Gdecl.cc: Rebuilt.
	* mc-boot/GmcFileName.h: Rebuilt.
	* mc/decl.mod (getStringChar): New procedure function.
	(getStringContents): Call getStringChar.
	(addQuotes): New procedure function.
	(foldBinary): Call addQuotes to add delimiting quotes to the
	new string.

gcc/testsuite/ChangeLog:

	PR modula2/121314
	* gm2/errors/fail/badindrtype.mod: New test.
	* gm2/errors/fail/badindrtype2.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
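The shape of the fix can be sketched outside Modula-2 (a minimal C++ analogue; the function names mirror those in mc/decl.mod but the implementations here are illustrative): strip the delimiting quotes from each operand before concatenating, then re-add a single pair around the result, so no quotes leak into the middle of the folded string:

```cpp
#include <string>

// Contents of a string literal without its delimiting quotes,
// roughly what getStringContents/getStringChar provide in decl.mod.
std::string string_contents(const std::string &lit)
{
    if (lit.size() >= 2 && lit.front() == '"' && lit.back() == '"')
        return lit.substr(1, lit.size() - 2);
    return lit;
}

// Re-add delimiting quotes, the role of addQuotes.
std::string add_quotes(const std::string &s)
{
    return '"' + s + '"';
}

// Folding the '+' of two string constants: concatenate the bare
// contents, then quote the result exactly once.
std::string fold_concat(const std::string &a, const std::string &b)
{
    return add_quotes(string_contents(a) + string_contents(b));
}
```

The pre-fix bug corresponds to concatenating the quoted forms directly, which leaves a stray `""` pair in the middle of the result string.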
2025-07-31fortran: Evaluate class function bounds in the scalarizer [PR121342]Mikael Morin3-54/+132
There is code in gfc_conv_procedure_call that, for polymorphic functions, initializes the scalarization array descriptor information and forcefully sets loop bounds.  This code changes the decisions made by the scalarizer behind its back, and the test shows an example where the consequences are (badly) visible.  In the test, for one of the actual arguments to an elemental subroutine, an offset to the loop variable is missing to access the array, as it was the one originally chosen to set the loop bounds from.

This could theoretically be fixed by just clearing the array of choice for the loop bounds.  This change takes instead the harder path of adding the missing information to the scalarizer's knowledge so that its decision doesn't need to be forced to something else after the fact.  The array descriptor information initialisation for polymorphic functions is moved to gfc_add_loop_ss_code (after the function call generation), and the loop bounds initialization to a new function called after that.

As the array chosen to set the loop bounds from is no longer forced to be the polymorphic function result, we have to let the scalarizer set a delta for polymorphic function results.  For regular non-polymorphic function result arrays, they are zero-based and the temporary creation makes the loop zero-based as well, so we can continue to skip the delta calculation.

In the cases where a temporary is created to store the result of the array function, the creation of the temporary shifts the loop bounds to be zero-based.  As there was no delta for polymorphic result arrays, the function result descriptor offset was set to zero in that case for a zero-based array reference to be correct.  Now that the scalarizer sets a delta, those forced offset updates have to go, because they can make the descriptor invalid and cause erroneous array references.

	PR fortran/121342

gcc/fortran/ChangeLog:

	* trans-expr.cc (gfc_conv_subref_array_arg): Remove offset update.
	(gfc_conv_procedure_call): For polymorphic functions, move the
	scalarizer descriptor information...
	* trans-array.cc (gfc_add_loop_ss_code): ... here, and evaluate
	the bounds to fresh variables.
	(get_class_info_from_ss): Remove offset update.
	(gfc_conv_ss_startstride): Don't set a zero value for function
	result upper bounds.
	(late_set_loop_bounds): New.
	(gfc_conv_loop_setup): If the bounds of a function result have
	been set, and no other array provided loop bounds for a dimension,
	use the function result bounds as loop bounds for that dimension.
	(gfc_set_delta): Don't skip delta setting for polymorphic function
	results.

gcc/testsuite/ChangeLog:

	* gfortran.dg/class_elemental_1.f90: New test.
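The "delta" the scalarizer now computes for polymorphic results can be illustrated with a small C++ sketch (purely conceptual, not gfortran's descriptor representation): the loop runs over its own bounds, each array carries its own lower bound, and delta rebases the loop variable onto the array's index space so element accesses stay correct when the two disagree — the missing delta in the PR left accesses off by exactly that offset:

```cpp
#include <vector>

// Sketch of a scalarized elemental operation over two arrays whose lower
// bounds need not match the loop bounds.  Element i of an array (in its
// own lbound..ubound index space) lives at storage slot i - lbound.
// delta = array_lbound - loop_lbound is added to the loop variable on
// every access, which is the adjustment gfc_set_delta now also computes
// for polymorphic function results.
std::vector<int> elemental_add(const std::vector<int> &a, long a_lbound,
                               const std::vector<int> &b, long b_lbound,
                               long loop_from, long loop_to)
{
    long delta_a = a_lbound - loop_from;  // rebase loop var onto a's bounds
    long delta_b = b_lbound - loop_from;  // ... and onto b's bounds
    std::vector<int> r;
    for (long s = loop_from; s <= loop_to; ++s) {
        long ia = (s + delta_a) - a_lbound;  // 0-based slot in a's storage
        long ib = (s + delta_b) - b_lbound;
        r.push_back(a[ia] + b[ib]);
    }
    return r;
}
```

With delta forced out of the picture (as the old gfc_conv_procedure_call code effectively did), an array whose lower bound differs from the loop's start gets indexed with the raw loop variable and the reference lands on the wrong elements.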
2025-07-31AVR: avr.opt.urls: Add -mfuse-move2Georg-Johann Lay1-0/+3
	PR rtl-optimization/121340

gcc/
	* config/avr/avr.opt.urls (-mfuse-move2): Add url.
2025-07-31AVR: Set .type of jump table label.Georg-Johann Lay1-0/+7
gcc/
	* config/avr/avr.cc (avr_output_addr_vec) <labl>: Asm out its .type.
2025-07-31AVR: rtl-optimization/121340 - New mini-pass to undo superfluous moves from insn combine.Georg-Johann Lay6-1/+158
Insn combine may come up with superfluous reg-reg moves, where the combine people say that these are no problem since reg-alloc is supposed to optimize them.  The issue is that the lower-subreg pass sitting between combine and reg-alloc may split such moves, coming up with a zoo of subregs which are only handled poorly by the register allocator.  This patch adds a new avr mini-pass that handles such cases.

As an example, take

    int f_ffssi (long x)
    {
        return __builtin_ffsl (x);
    }

where the two functions have the same interface, i.e. there are no extra moves required for the argument or for the return value.  However,

    $ avr-gcc -S -Os -dp -mno-fuse-move ...

    f_ffssi:
        mov r20,r22      ;  29  [c=4 l=1]  movqi_insn/0
        mov r21,r23      ;  30  [c=4 l=1]  movqi_insn/0
        mov r22,r24      ;  31  [c=4 l=1]  movqi_insn/0
        mov r23,r25      ;  32  [c=4 l=1]  movqi_insn/0
        mov r25,r23      ;  33  [c=4 l=4]  *movsi/0
        mov r24,r22
        mov r23,r21
        mov r22,r20
        rcall __ffssi2   ;  34  [c=16 l=1]  *ffssihi2.libgcc
        ret              ;  37  [c=0 l=1]  return

where all the moves add up to a no-op.  The -mno-fuse-move option stops any attempts by the avr backend to clean up that mess.

	PR rtl-optimization/121340

gcc/
	* config/avr/avr.opt (-mfuse-move2): New option.
	* config/avr/avr-passes.def (avr_pass_2moves): Insert after combine.
	* config/avr/avr-passes.cc (make_avr_pass_2moves): New function.
	(pass_data avr_pass_data_2moves): New static variable.
	(avr_pass_2moves): New rtl_opt_pass.
	* config/avr/avr-protos.h (make_avr_pass_2moves): New proto.
	* common/config/avr/avr-common.cc (default_options
	avr_option_optimization_table) <-mfuse-move2>: Set for -O1 and
	higher.
	* doc/invoke.texi (AVR Options) <-mfuse-move2>: Document.
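The claim that the eight moves add up to a no-op is easy to check by simulating them on a register map (a quick C++ sketch, not the pass's own RTL representation): after the sequence runs, r22–r25 hold exactly their original values, so the chain is dead apart from clobbering the scratch registers r20/r21:

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Simulate reg-reg moves symbolically: register r starts holding "v<r>".
// Returns true if registers first..last end up with their original values,
// i.e. the move sequence is a no-op as far as those registers go.
bool moves_are_noop(const std::vector<std::pair<int, int>> &moves, // (dst, src)
                    int first, int last)
{
    std::map<int, std::string> regs;
    for (int r = 20; r <= 25; ++r)
        regs[r] = "v" + std::to_string(r);
    for (auto [dst, src] : moves)
        regs[dst] = regs[src];
    for (int r = first; r <= last; ++r)
        if (regs[r] != "v" + std::to_string(r))
            return false;
    return true;
}

// The sequence emitted for f_ffssi above: four movqi_insn, then the
// four moves of the *movsi/0 expansion, in listed order.
const std::vector<std::pair<int, int>> f_ffssi_moves = {
    {20, 22}, {21, 23}, {22, 24}, {23, 25},  // mov r20,r22 ... mov r23,r25
    {25, 23}, {24, 22}, {23, 21}, {22, 20},  // mov r25,r23 ... mov r22,r20
};
```

Since the `long` argument and return value of `f_ffssi` both live in r22–r25 on AVR, identity on those four registers means the entire chain can be deleted, which is what the new mini-pass does before lower-subreg gets a chance to shred the wide move.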
2025-07-31c++: constexpr, array, private ctor [PR120800]Jason Merrill2-0/+25
Here cxx_eval_vec_init_1 wants to recreate the default constructor call that we previously built and threw away in build_vec_init_elt, but we aren't in the same access context at this point.  Since we already checked access, let's just suppress access control here.

Redoing overload resolution at constant evaluation time is sketchy, but should usually be fine for a default/copy constructor.

	PR c++/120800

gcc/cp/ChangeLog:

	* constexpr.cc (cxx_eval_vec_init_1): Suppress access control.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/constexpr-array30.C: New test.
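The shape of the problem can be sketched without constexpr (class names are hypothetical, not the PR's testcase): default-initializing an array of a class with a private default constructor is only legal from an access context that can see that constructor, and constant evaluation of the equivalent constexpr code re-entered the access check from the wrong context:

```cpp
// Default-constructing elem is private; holder is befriended, so
// default-initializing holder's array member is well-formed -- but only
// because access is checked in holder's context.  The constexpr variant
// of this pattern is what cxx_eval_vec_init_1 re-evaluates, and before
// the fix it redid the access check outside that context.
class elem {
    elem() = default;          // private: only friends may default-construct
    friend struct holder;
public:
    int v = 7;
};

struct holder {
    elem a[3];                 // array default-init calls elem() via friendship
};
```

Suppressing access control when the call is recreated is safe precisely because the check already passed once, at the point where `holder` (or its constexpr analogue) was originally compiled.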
2025-07-31Revert "Ada: Add System.C_Time and GNAT.C_Time units to libgnat"Eric Botcazou71-546/+2164
This reverts commit 41974d6ed349507ca1532629851b7b5d74f44abc.
2025-07-31Ada: Fix miscompilation of GNAT tools with -march=znver3Eric Botcazou1-4/+4
The throw and catch sides of the Ada exception machinery disagree about the BIGGEST_ALIGNMENT setting.

gcc/ada/
	PR ada/120440
	* gcc-interface/Makefile.in (GNATLINK_OBJS): Add s-excmac.o.
	(GNATMAKE_OBJS): Likewise.