Age | Commit message | Author | Files | Lines
2025-07-30 | libcpp: Fix up comma diagnostics in preprocessor for C++ [PR120778] | Jakub Jelinek | 3 | -3/+60
The P2843R3 "Preprocessing is never undefined" paper comments that various compilers handle comma operators in preprocessor expressions incorrectly, and I think it is right.

In both C and C++ the grammar uses the constant-expression non-terminal for #if/#elif, and in both C and C++ that non-terminal is conditional-expression, so
  #if 1, 2
is IMHO clearly wrong in both languages.

C89 then says for constant-expression:
"Constant expressions shall not contain assignment, increment, decrement, function-call, or comma operators, except when they are contained within the operand of a sizeof operator."
Because all the remaining identifiers in the #if/#elif expression are replaced with 0, I think assignment, increment, decrement and function-call aren't that big a deal, because (0 = 1) or ++4 etc. are all invalid, but for comma expressions I think it matters. In r0-56429 (PR456) Joseph added the !CPP_OPTION (pfile, c99) check to handle that correctly.

C99 then changed that to:
"Constant expressions shall not contain assignment, increment, decrement, function-call, or comma operators, except when they are contained within a subexpression that is not evaluated."
That made
  #if 1 || (1, 2)
etc. valid for C99+, but
  #if (1, 2)
is still invalid, ditto
  #if 1 ? 1, 2 : 3

In C++ I can't find anything like that though. As can be seen, say
  int a[(1, 2)];
  int b[1 ? 1, 2 : 3];
are accepted by C++ and rejected by C, while
  int c[1, 2];
  int d[1 ? 2 : 3, 4];
are rejected in both C and C++, so I think for C++ it is indeed just the grammar that prevents #if 1, 2. When the comma is in the second operand of ?: or inside of (), the grammar just uses expression, and that allows the comma operator.

So the following patch uses different decisions for C++ about when to diagnose the comma operator in preprocessor expressions: for C++ it tracks whether it is inside of () (obviously () around #embed clauses don't count unless one uses limit ((1, 2)) etc.) or inside of the second ?: operand, allows the comma operator there, and disallows it elsewhere.
BTW, I wonder if anything in the standard disallows <=> in preprocessor expressions, say
  #if (0 <=> 1) < 0
etc.
  #include <compare>
  constexpr int a = (0 <=> 1) < 0;
is valid (but not valid without #include <compare>) and the expressions don't use any identifiers.

2025-07-30  Jakub Jelinek  <jakub@redhat.com>

PR c++/120778
* internal.h (struct lexer_state): Add comma_ok member.
* expr.cc (_cpp_parse_expr): Initialize it to 0, increment on CPP_OPEN_PAREN and CPP_QUERY, decrement on CPP_CLOSE_PAREN and CPP_COLON.
(num_binary_op): For C++ pedwarn on comma operator if pfile->state.comma_ok is 0 instead of !c99 or skip_eval.

* g++.dg/cpp/if-comma-1.C: New test.
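The counting scheme from the ChangeLog entry above can be modelled in a few lines. This is a minimal sketch, not libcpp code: `commas_allowed_cxx` is a hypothetical helper that scans raw characters rather than preprocessor tokens, so it ignores details such as character literals, `::`, and the #embed clause exception mentioned above.

```cpp
// Hypothetical model of the comma_ok counter; not the libcpp implementation.
// Returns true if every ',' in EXPR is nested inside parentheses or sits
// between the '?' and ':' of a conditional expression.
constexpr bool commas_allowed_cxx(const char *expr) {
  int comma_ok = 0;
  for (; *expr != '\0'; ++expr) {
    switch (*expr) {
    case '(':
    case '?':
      ++comma_ok;  // entering a context whose grammar is "expression"
      break;
    case ')':
    case ':':
      --comma_ok;  // leaving that context
      break;
    case ',':
      if (comma_ok == 0)
        return false;  // top-level comma: would be diagnosed for C++
      break;
    default:
      break;
    }
  }
  return true;
}
```

Under this model "(1, 2)" and "1 ? 1, 2 : 3" are accepted, while "1, 2" and "1 ? 2 : 3, 4" are not, which matches the array-bound examples in the commit message.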
2025-07-30 | vect: Add missing skip-vector check for peeling with versioning [PR121020] | Pengfei Li | 3 | -1/+59
This fixes a miscompilation issue introduced by the enablement of combined loop peeling and versioning. A test case that reproduces the issue is included in the patch.

When performing loop peeling, GCC usually inserts a skip-vector check. This ensures that after peeling, there are enough remaining iterations to enter the main vectorized loop. Previously, the check was omitted if loop versioning for alignment was applied. That was safe because versioning and peeling for alignment were mutually exclusive.

However, with combined peeling and versioning enabled, this is no longer safe. A loop may be peeled and versioned at the same time. Without the skip-vector check, the main vectorized loop can be entered even if its iteration count is zero. This can cause the loop to run many more iterations than needed, producing incorrect results.

To fix this, the patch changes the condition so that the skip-vector check is omitted only when versioning is performed alone, without peeling.

gcc/ChangeLog:

PR tree-optimization/121020
* tree-vect-loop-manip.cc (vect_do_peeling): Update the condition of omitting the skip-vector check.
* tree-vectorizer.h (LOOP_VINFO_USE_VERSIONING_WITHOUT_PEELING): Add a helper macro.

gcc/testsuite/ChangeLog:

PR tree-optimization/121020
* gcc.dg/vect/vect-early-break_138-pr121020.c: New test.
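The purpose of the skip-vector check can be illustrated with a small model. This is only a sketch under assumed names: `enter_vector_loop`, `niters`, `peeled` and `vf` are hypothetical and do not come from tree-vect-loop-manip.cc.

```cpp
// Hypothetical model of a skip-vector check; not GCC's implementation.
// niters: scalar iteration count of the original loop
// peeled: iterations executed by the peeled prologue
// vf:     number of scalar iterations covered by one vector iteration
constexpr bool enter_vector_loop(unsigned long niters, unsigned long peeled,
                                 unsigned long vf) {
  // Enter the main vectorized loop only if at least one full vector
  // iteration remains after peeling.
  return niters >= peeled && niters - peeled >= vf;
}
```

With 5 iterations, 3 of them peeled, and a vectorization factor of 4, the check fails and the vector loop is skipped. Omitting the check in that situation corresponds to the zero-iteration entry described above.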
2025-07-30 | vect: Fix insufficient alignment requirement for speculative loads [PR121190] | Pengfei Li | 3 | -15/+69
This patch fixes a segmentation fault that can occur in vectorized loops with an early break. When GCC vectorizes such loops, it may insert a versioning check to ensure that data references (DRs) with speculative loads are aligned. The check normally requires DRs to be aligned to the vector mode size, which prevents generated vector load instructions from crossing page boundaries.

However, this is not sufficient when a single scalar load is vectorized into multiple loads within the same iteration. In such cases, even if none of the vector loads crosses a page boundary, subsequent loads after the first one may still access memory beyond the current valid page.

Consider the following loop as an example:

  while (i < MAX_COMPARE)
    {
      if (*(p + i) != *(q + i))
        return i;
      i++;
    }

When compiled with "-O3 -march=znver2" on x86, the vectorized loop may include instructions like:

  vmovdqa (%rcx,%rax), %ymm0
  vmovdqa 32(%rcx,%rax), %ymm1
  vpcmpeqq (%rdx,%rax), %ymm0, %ymm0
  vpcmpeqq 32(%rdx,%rax), %ymm1, %ymm1

Note that two speculative vector loads are generated for each DR (p and q). The first vmovdqa and vpcmpeqq are safe due to the vector size (32-byte) alignment, but the following ones (at offset 32) may not be safe because they could read from the beginning of the next memory page, potentially leading to segmentation faults.

To avoid the issue, this patch increases the alignment requirement for speculative loads to DR_TARGET_ALIGNMENT. This ensures that all vector loads in the same vector iteration access memory within the same page.

gcc/ChangeLog:

PR tree-optimization/121190
* tree-vect-data-refs.cc (vect_enhance_data_refs_alignment): Increase alignment requirement for speculative loads.

gcc/testsuite/ChangeLog:

PR tree-optimization/121190
* gcc.dg/vect/vect-early-break_52.c: Update an unsafe test.
* gcc.dg/vect/vect-early-break_137-pr121190.c: New test.
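The page-crossing argument above can be checked with simple address arithmetic. The sketch below is illustrative only: the 4096-byte page size and the helper name `crosses_page` are assumptions, and the addresses are plain integers rather than real pointers.

```cpp
#include <cstdint>

// Assumed page size for illustration; real targets may differ.
constexpr std::uint64_t kPageSize = 4096;

// Returns true if an access of BYTES bytes starting at ADDR touches more
// than one page.
constexpr bool crosses_page(std::uint64_t addr, std::uint64_t bytes) {
  return addr / kPageSize != (addr + bytes - 1) / kPageSize;
}
```

An address that is only 32-byte aligned, such as 4096 - 32, keeps a single 32-byte load inside the page, but the pair of loads covering 64 bytes reaches into the next page. An address aligned to the full 64 bytes accessed per vector iteration, such as 4096 - 64, keeps both loads inside the page.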
2025-07-30 | aarch64: Fix sme2+faminmax intrinsic gating (PR 121300) | Alfie Richards | 2 | -1/+11
Fixes the feature gating for the SME2+FAMINMAX intrinsics. PR target/121300 gcc/ChangeLog: * config/aarch64/aarch64-sve-builtins-sme.def (svamin/svamax): Fix arch gating. gcc/testsuite/ChangeLog: * gcc.target/aarch64/pr121300.c: New test.
2025-07-30 | tree-optimization/121304 - set memory_access_type before reading it | Richard Biener | 1 | -43/+43
The following re-orders gather/scatter handling back to be before we check for fallback situations, specifically make sure to set memory_access_type before reading it. * tree-vect-stmts.cc (get_group_load_store_type): Process STMT_VINFO_GATHER_SCATTER before reading memory_access_type.
2025-07-30 | aarch64: Add support for unpacked SVE FP conditional ternary arithmetic | Spencer Abson | 11 | -41/+196
This patch extends the expander for fma, fnma, fms, and fnms to support partial SVE FP modes.

We add the missing BF16 tests, which we can now trigger, having implemented the conditional expander.

We also add tests for the 'merging with multiplicand' case, which this expander canonicalizes (albeit under SVE_STRICT_GP).

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md (@cond_<optab><mode>): Extend to support partial FP modes.
(*cond_<optab><mode>_2_strict): Extend from SVE_FULL_F to SVE_F, use aarch64_predicate_operand.
(*cond_<optab><mode>_4_strict): Extend from SVE_FULL_F_B16B16 to SVE_F_B16B16, use aarch64_predicate_operand.
(*cond_<optab><mode>_any_strict): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/unpacked_cond_fmla_1.c: Add test cases for merging with multiplicand.
* gcc.target/aarch64/sve/unpacked_cond_fmls_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fnmla_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fnmls_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fmla_2.c: New test.
* gcc.target/aarch64/sve/unpacked_cond_fmls_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fnmla_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fnmls_2.c: Likewise.
* g++.target/aarch64/sve/unpacked_cond_ternary_bf16_1.C: Likewise.
* g++.target/aarch64/sve/unpacked_cond_ternary_bf16_2.C: Likewise.
2025-07-30 | aarch64: Relaxed SEL combiner patterns for unpacked SVE FP ternary arithmetic | Spencer Abson | 5 | -19/+207
Extend the ternary op/UNSPEC_SEL combiner patterns from SVE_FULL_F/SVE_FULL_F_BF to SVE_F/SVE_F_BF, where the strictness value is SVE_RELAXED_GP.

We can only reliably test the 'merging with the third input' (addend) and 'independent value' patterns at this stage, as the canonicalization that reorders the multiplicands based on the second SEL input would be performed by the conditional expander.

Another difficulty is that we can't test these fused multiply/SEL combines without using __builtin_fma and friends. The reason for this is as follows:

We support COND_ADD, COND_SUB, and COND_MUL optabs, so match.pd will canonicalize patterns like ADD/SUB/MUL combined with a VEC_COND_EXPR into these conditional forms. Later, when widening_mul tries to fold these into conditional fused multiply operations, the transformation fails, simply because we haven't implemented those conditional fused multiply optabs yet.

Hence this patch lacks tests for BFloat16.

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md (*cond_<optab><mode>_2_relaxed): Extend from SVE_FULL_F to SVE_F.
(*cond_<optab><mode>_4_relaxed): Extend from SVE_FULL_F_B16B16 to SVE_F_B16B16.
(*cond_<optab><mode>_any_relaxed): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/unpacked_cond_fmla_1.c: New test.
* gcc.target/aarch64/sve/unpacked_cond_fmls_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fnmla_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fnmls_1.c: Likewise.
2025-07-30 | fortran: Remove useless elements count variable | Mikael Morin | 3 | -15/+9
The function gfc_array_init_size evaluates the number of array elements to a variable from a caller, but the single caller providing the variable actually doesn't use it. This change removes the variable and the function arguments passing its address down the call chain. gcc/fortran/ChangeLog: * trans-array.cc (gfc_array_init_size): Remove the nelems argument. (gfc_array_allocate): Update caller. Remove the nelems argument. * trans-stmt.cc (gfc_trans_allocate): Update caller. Remove the nelems variable. * trans-array.h (gfc_array_allocate): Update prototype.
2025-07-30 | fortran: implement split for fortran 2023 | Yuao Ma | 15 | -0/+328
This patch includes the implementation, documentation, and test case for SPLIT. gcc/fortran/ChangeLog: * check.cc (gfc_check_split): Argument check for SPLIT. * gfortran.h (enum gfc_isym_id): Define GFC_ISYM_SPLIT. * intrinsic.cc (add_subroutines): Register SPLIT intrinsic. * intrinsic.h (gfc_check_split): New decl. (gfc_resolve_split): Ditto. * intrinsic.texi: SPLIT documentation. * iresolve.cc (gfc_resolve_split): Add resolved_sym for SPLIT. * trans-decl.cc (gfc_build_intrinsic_function_decls): Add decl for SPLIT in libgfortran. * trans-intrinsic.cc (conv_intrinsic_split): SPLIT codegen. (gfc_conv_intrinsic_subroutine): Handle SPLIT case. * trans.h (GTY): Declare gfor_fndecl_string_split{, _char4}. libgfortran/ChangeLog: * gfortran.map: Add split symbol. * intrinsics/string_intrinsics_inc.c (string_split): Runtime support for SPLIT. gcc/testsuite/ChangeLog: * gfortran.dg/split_1.f90: New test. * gfortran.dg/split_2.f90: New test. * gfortran.dg/split_3.f90: New test. * gfortran.dg/split_4.f90: New test. Signed-off-by: Yuao Ma <c8ef@outlook.com>
2025-07-30 | aarch64: Add support for unpacked SVE FP ternary arithmetic | Spencer Abson | 11 | -13/+263
This patch extends the expander for unconditional fma, fnma, fms, and fnms, so that it supports partial SVE FP modes.

gcc/ChangeLog:

* config/aarch64/aarch64-sve.md (<optab><mode>4): Extend from SVE_FULL_F_B16B16 to SVE_F_B16B16. Use aarch64_sve_fp_pred instead of aarch64_ptrue_reg.
(@aarch64_pred_<optab><mode>): Extend from SVE_FULL_F_B16B16 to SVE_F_B16B16. Use aarch64_predicate_operand.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/sve/unpacked_ternary_bf16_1.C: New test.
* g++.target/aarch64/sve/unpacked_ternary_bf16_2.C: Likewise.
* gcc.target/aarch64/sve/unpacked_fmla_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fmla_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fmls_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fmls_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fnmla_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fnmla_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fnmls_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fnmls_2.c: Likewise.
2025-07-30 | Remove V64SFmode and V64SImode. | liuhongt | 2 | -4/+1
It's needed by avx5124vnniw/avx5124fmaps which have been removed by r15-656-ge1a7e2c54d52d0. gcc/ChangeLog: * config/i386/i386-modes.def: Remove VECTOR_MODES(FLOAT, 256) and VECTOR_MODE (INT, SI, 64). * config/i386/i386.cc (ix86_hard_regno_nregs): Remove related code for V64SF/V64SImode.
2025-07-30 | Eliminate redundant vpextrq/vpinsrq when move TI to V4SI. | liuhongt | 2 | -0/+37
r14-1902-g96c3539f2a3813 split a TImode move into 2 DImode moves. It is supposed to optimize TImode in parameters/return values since, according to the psABI, such a value is stored in 2 general registers. But when TImode is not in a parameter/return value, it can create redundancy, as in the PR.

The patch adds a splitter to handle that, i.e.

  (insn 10 9 14 2 (set (subreg:V2DI (reg:V4SI 98 [ <retval> ]) 0)
          (vec_concat:V2DI (subreg:DI (reg:TI 101) 0)
              (subreg:DI (reg:TI 101) 8))) 8442 {vec_concatv2di}
       (expr_list:REG_DEAD (reg:TI 101)

gcc/ChangeLog:

PR target/121274
* config/i386/sse.md (*vec_concatv2di_0): Add a splitter before it.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr121274.c: New test.
2025-07-30 | RISC-V: Add testcases for unsigned avg ceil vx combine. | Pan Li | 20 | -3/+303
The unsigned avg ceil shares vaaddux.vx for the vx combine, so add test cases to make sure it works as expected.

The following test suites pass for this patch series:
* The rv64gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c: Add asm check for unsigned avg ceil.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary.h: Add test helper macros.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test data.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vaadd-run-2-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vaadd-run-2-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vaadd-run-2-u64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vaadd-run-2-u8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
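For reference, the scalar operation that "unsigned avg ceil" denotes is an average rounded up, computed without losing the carry. The sketch below shows it for 8-bit elements; the function name `avg_ceil_u8` is hypothetical and the code is not taken from the test helpers named above.

```cpp
#include <cstdint>

// Unsigned average rounded up: (a + b + 1) >> 1, evaluated in a wider type
// so that the intermediate sum cannot overflow.
constexpr std::uint8_t avg_ceil_u8(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>(
      (static_cast<std::uint16_t>(a) + b + 1u) >> 1);
}
```

The widening matters at the top of the range: averaging 255 and 255 must give 255, which an 8-bit intermediate sum would not.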
2025-07-30 | Daily bump. | GCC Administrator | 5 | -1/+283
2025-07-29 | simplify-rtx: Fix Distribute subregs over logic ops [PR121302] | Andrew Pinski | 1 | -4/+4
r16-2614-g965564eafb721f had a typo where it would assume byte==0 rather than use the byte (offset) that was passed. This fixes that typo and also fixes the comment since it is not just about lowerpart subregs but all non-paradoxical subregs. Pushed as obvious after bootstrap/test on x86_64-linux-gnu. PR rtl-optimization/121302 gcc/ChangeLog: * simplify-rtx.cc (simplify_context::simplify_subreg): Use byte instead of 0 when calling simplify_subreg. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2025-07-29 | Fortran: Andre's tweak | Jerry DeLisle | 1 | -16/+0
gcc/fortran/ChangeLog: * coarray.cc (check_add_new_component): Don't do addition checks.
2025-07-29 | testsuite: Cleanup after auto-profile testcases when auto-profile is not supported [PR121215] | Andrew Pinski | 1 | -0/+2
The problem here is that tree-prof.exp does not clean up when a testcase requires auto-profile, auto-profile is not supported, and the testcase uses dg-additional-sources. Currently additional_sources is not reset to "", and then another testcase comes along and thinks that is the additional source to be added.

Committed as obvious after testing:
  make check-gcc RUNTESTFLAGS="tree-prof.exp=afdo-crossmodule-1.c tree-ssa.exp=pr67891.c"
to make sure pr67891.c no longer uses the additional source.

PR testsuite/121215

gcc/testsuite/ChangeLog:

* lib/profopt.exp (profopt-execute): Call cleanup-after-saved-dg-test if returning early for the -fauto-profile failing case.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2025-07-29 | Fortran: Recommit changes for coarray after merging. | Jerry DeLisle | 59 | -234/+5640
Testing only. Work in progress. gcc/fortran/ChangeLog: * check.cc (gfc_check_image_status): Modify (gfc_check_failed_or_stopped_images): Modify * coarray.cc (check_add_new_component): Modify * invoke.texi: Modify * trans-decl.cc (gfc_build_builtin_function_decls): Modify * trans-expr.cc (get_scalar_to_descriptor_type): Modify (copy_coarray_desc_part): Modify (gfc_class_array_data_assign): Modify (gfc_conv_derived_to_class): Modify * trans-intrinsic.cc (conv_intrinsic_image_status): Modify * trans-stmt.cc (gfc_trans_sync): Modify libgfortran/ChangeLog: * Makefile.am: Modify * Makefile.in: Modify * caf/libcaf.h (LIBCAF_H): Modify (_gfortran_caf_failed_images): Modify (_gfortran_caf_image_status): Modify (_gfortran_caf_stopped_images): Modify * caf/single.c (caf_internal_error): Modify * caf/caf_error.c: New file. Modify * caf/caf_error.h: New file. Modify * caf/shmem.c: New file. * caf/shmem/alloc.c: New file. * caf/shmem/alloc.h: New file. * caf/shmem/allocator.c: New file. * caf/shmem/allocator.h: New file. * caf/shmem/collective_subroutine.c: New file. * caf/shmem/collective_subroutine.h: New file. * caf/shmem/counter_barrier.c: New file. * caf/shmem/counter_barrier.h: New file. * caf/shmem/hashmap.c: New file. * caf/shmem/hashmap.h: New file. * caf/shmem/shared_memory.c: New file. * caf/shmem/shared_memory.h: New file. * caf/shmem/supervisor.c: New file. * caf/shmem/supervisor.h: New file. * caf/shmem/sync.c: New file. * caf/shmem/sync.h: New file. * caf/shmem/teams_mgmt.c: New file. * caf/shmem/teams_mgmt.h: New file. * caf/shmem/thread_support.c: New file. * caf/shmem/thread_support.h: New file. 
gcc/testsuite/ChangeLog: * gfortran.dg/coarray/alloc_comp_4.f90: Modify * gfortran.dg/coarray/atomic_2.f90: Modify * gfortran.dg/coarray/caf.exp: Modify * gfortran.dg/coarray/coarray_allocated.f90: Modify * gfortran.dg/coarray/coindexed_1.f90: Modify * gfortran.dg/coarray/coindexed_3.f08: Modify * gfortran.dg/coarray/coindexed_5.f90: Modify * gfortran.dg/coarray/dummy_3.f90: Modify * gfortran.dg/coarray/event_1.f90: Modify * gfortran.dg/coarray/event_3.f08: Modify * gfortran.dg/coarray/event_4.f08: Modify * gfortran.dg/coarray/failed_images_1.f08: Modify * gfortran.dg/coarray/failed_images_2.f08: Modify * gfortran.dg/coarray/image_status_1.f08: Modify * gfortran.dg/coarray/image_status_2.f08: Modify * gfortran.dg/coarray/lock_2.f90: Modify * gfortran.dg/coarray/poly_run_3.f90: Modify * gfortran.dg/coarray/scalar_alloc_1.f90: Modify * gfortran.dg/coarray/stopped_images_1.f08: Modify * gfortran.dg/coarray/stopped_images_2.f08: Modify * gfortran.dg/coarray/sync_1.f90: Modify * gfortran.dg/coarray/sync_3.f90: Modify * gfortran.dg/coarray_sync_memory.f90: Modify * gfortran.dg/coarray/co_reduce_string.f90: New test. Modify * gfortran.dg/coarray/sync_team.f90: New test. Modify
2025-07-29 | Merge branch 'master' into devel/gfortran-test | Jerry DeLisle | 789 | -13401/+32751
2025-07-29 | Revert "fortran: Testing patches for coarray shared memory." | Jerry DeLisle | 59 | -5640/+234
This reverts commit 6955bb63595259d94a8c8eaba56650fe7652c3cd.
2025-07-29 | aarch64: Add support for unpacked SVE FP conditional binary arithmetic | Spencer Abson | 12 | -73/+319
This patch extends the expander for conditional smax, smin, add, sub, mul, min, max, and div to support partial SVE FP modes.

If exceptions from undefined vector elements must be suppressed, this expansion converts the container-level predicate to an element-level one, and ensures that these elements are inactive for the operation. In practice, this is a predicate AND with the existing mask and a container-size PTRUE.

gcc/ChangeLog:

* config/aarch64/aarch64-protos.h (aarch64_sve_emit_masked_fp_pred): Declare.
* config/aarch64/aarch64-sve.md (and<mode>3): Change this to...
(@and<mode>3): ...this, so that we can use gen_and3.
(@cond_<optab><mode>): Extend from SVE_FULL_F_B16B16 to SVE_F_B16B16, use aarch64_predicate_operand.
(*cond_<optab><mode>_2_strict): Likewise.
(*cond_<optab><mode>_3_strict): Likewise.
(*cond_<optab><mode>_any_strict): Likewise.
(*cond_<optab><mode>_2_const_strict): Extend from SVE_FULL_F to SVE_F, use aarch64_predicate_operand.
(*cond_<optab><mode>_any_const_strict): Likewise.
(*cond_sub<mode>_3_const_strict): Likewise.
(*cond_sub<mode>_const_strict): Likewise.
(*vcond_mask_<mode><vpred>): Use aarch64_predicate_operand, and update the comment here.
* config/aarch64/aarch64.cc (aarch64_sve_emit_masked_fp_pred): New function. Helper to mask the predicate in conditional expanders.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/sve/unpacked_cond_binary_bf16_2.C: New test.
* gcc.target/aarch64/sve/unpacked_cond_builtin_fmax_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_builtin_fmin_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fadd_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fdiv_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fmaxnm_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fminnm_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fmul_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fsubr_2.c: Likewise.
2025-07-29 | x86: Pass -mno-80387 to compile pr121208-1(a|b).c | H.J. Lu | 2 | -2/+2
Pass -mno-80387 to compile pr121208-1(a|b).c to silence .../pr121208-1a.c:11:1: sorry, unimplemented: 80387 instructions aren’t allowed in a function with the ‘no_caller_saved_registers’ attribute PR target/121208 * gcc.target/i386/pr121208-1a.c (dg-options): Add -mno-80387. * gcc.target/i386/pr121208-1b.c (dg-options): Likewise. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2025-07-29 | testsuite: Adjust s390x params for vector tests. | Juergen Christ | 2 | -0/+2
Loop peeling and minimal loop vectorization threshold prevented loop vectorization in these examples. Adjust parameters in the test to make the test pass. Signed-off-by: Juergen Christ <jchrist@linux.ibm.com> PR testsuite/121286 PR testsuite/121288 gcc/testsuite/ChangeLog: * gcc.dg/vect/pr112325.c: Adjust parameters for s390. * gcc.dg/vect/pr117888-1.c: Ditto.
2025-07-29 | RISC-V: Generate -mcpu and -mtune options from riscv-cores.def. | Dongyan Chen | 7 | -23/+251
Automatically generate -mcpu and -mtune options in invoke.texi from the unified riscv-cores.def metadata, ensuring documentation stays in sync with definitions and reducing manual maintenance. gcc/ChangeLog: * Makefile.in: Add riscv-mcpu.texi and riscv-mtune.texi to the list of files to be processed by the Texinfo generator. * config/riscv/t-riscv: Add rule for generating riscv-mcpu.texi and riscv-mtune.texi. * doc/invoke.texi: Replace hand‑written extension table with `@include riscv-mcpu.texi` and `@include riscv-mtune.texi` to pull in auto‑generated entries. * config/riscv/gen-riscv-mcpu-texi.cc: New file. * config/riscv/gen-riscv-mtune-texi.cc: New file. * doc/riscv-mcpu.texi: New file. * doc/riscv-mtune.texi: New file.
2025-07-29 | simplify-rtx: Simplify subregs of logic ops | Richard Sandiford | 2 | -1/+108
This patch adds a new rule for distributing lowpart subregs through ANDs, IORs, and XORs with a constant, in cases where one of the terms then disappears. For example:

  (lowpart-subreg:QI (and:HI x 0x100))

simplifies to zero and

  (lowpart-subreg:QI (and:HI x 0xff))

simplifies to (lowpart-subreg:QI x).

This would often be handled at some point using nonzero bits. However, the specific case I want the optimisation for is SVE predicates, where nonzero bit tracking isn't currently an option. Specifically: the predicate modes VNx8BI, VNx4BI and VNx2BI have the same size as VNx16BI, but treat only every second, fourth, or eighth bit as significant. Thus if we have:

  (subreg:VNx8BI (and:VNx16BI x C))

where C is the repeating constant { 1, 0, 1, 0, ... }, then the AND only clears bits that are made insignificant by the subreg, and so the result is equal to (subreg:VNx8BI x). Later patches rely on this.

gcc/
* simplify-rtx.cc (simplify_context::simplify_subreg): Distribute lowpart subregs through AND/IOR/XOR, if doing so eliminates one of the terms.
(test_scalar_int_ext_ops): Add some tests of the above for integers.
* config/aarch64/aarch64.cc (aarch64_test_sve_folding): Likewise add tests for predicate modes.
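The two integer examples and the predicate case can be checked with ordinary bit arithmetic. This is a sketch with assumed names, not RTL: `lowpart_qi` models taking the low 8 bits of a 16-bit value, and `as_vnx8bi` models "only every second bit is significant" on a 16-bit stand-in for a predicate register.

```cpp
#include <cstdint>

// Model of (lowpart-subreg:QI x) for a 16-bit x: keep the low 8 bits.
constexpr std::uint8_t lowpart_qi(std::uint16_t x) {
  return static_cast<std::uint8_t>(x & 0xff);
}

// Assumption for illustration: a 16-bit predicate in which a VNx8BI view
// treats only every second bit (bits 0, 2, 4, ...) as significant.
constexpr std::uint16_t kEverySecondBit = 0x5555;

constexpr std::uint16_t as_vnx8bi(std::uint16_t pred) {
  return static_cast<std::uint16_t>(pred & kEverySecondBit);
}
```

Masking with 0x100 before taking the low byte always yields zero, masking with 0xff changes nothing, and ANDing a predicate with the { 1, 0, 1, 0, ... } constant before viewing it as VNx8BI leaves the significant bits unchanged.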
2025-07-29 | testsuite: Generalise aarch64/saturating_arithmetic*.c | Richard Sandiford | 2 | -10/+10
gcc.target/aarch64/saturating_arithmetic_{1,2}.c expect w0 and w1 to be duplicated into vectors. The tests expected the duplication of w1 to happen first, but the other order would be fine too. A later simplify-rtx.cc patch happens to change the order. gcc/testsuite/ * gcc.target/aarch64/saturating_arithmetic_1.c: Allow w0 and w1 to be duplicated in either order. * gcc.target/aarch64/saturating_arithmetic_2.c: Likewise.
2025-07-29 | testsuite: Make aarch64/cmpbr.c more forgiving | Richard Sandiford | 1 | -20/+20
The 8-bit and 16-bit tests in cmpbr.c assumed an inverted operand order ("w1, w0"), but it's possible to use the uninverted operand order too. This patch generalises the tests to support both forms. This is a prerequisite for a later patch that adds a new simplify-rtx.cc rule. gcc/testsuite/ * gcc.target/aarch64/cmpbr.c: Support both operand orders for 8-bit and 16-bit comparisons.
2025-07-29 | aarch64: Fix function_expander::get_reg_target | Richard Sandiford | 1 | -1/+2
function_expander::get_reg_target didn't actually check for a register, meaning that it could return a memory target instead. That doesn't really matter for the current direct and indirect uses (svundef*, svcreate*, and svset*) but it will for later patches. gcc/ * config/aarch64/aarch64-sve-builtins.cc (function_expander::get_reg_target): Check whether the target is a valid register_operand.
2025-07-29 | [modula2] Tidyup remove unused local variables | Gaius Mulley | 2 | -7/+0
This patch removes unused local variables from three procedures. gcc/m2/ChangeLog: * gm2-compiler/M2GenGCC.mod (FoldBecomes): Remove all local variables. (CodeIndrX): Remove length. Remove newstr. * gm2-compiler/M2Range.mod (FoldTypeIndrX): Remove desType. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2025-07-29 | asf: Fix case of multiple stores with base offset [PR120660] | Konstantinos Eleftheriou | 2 | -8/+46
When there were multiple stores with the same offset as the load, and we were eliminating the load, we generated a mov instruction for each of them, which overwrote the register containing the loaded value.

This patch fixes the issue by generating a mov instruction only for the first store in the store-load sequence that has the same offset as the load. For any later ones that are encountered, we use bit-field insertion.

Bootstrapped/regtested on AArch64 and x86_64.

PR rtl-optimization/120660

gcc/ChangeLog:

* avoid-store-forwarding.cc (process_store_forwarding): Fix instruction generation when having multiple stores with base offset.

gcc/testsuite/ChangeLog:

* gcc.dg/pr120660.c: New test.
2025-07-29 | libstdc++: Test using range_format::map as format_kind. | Tomasz Kamiński | 1 | -1/+3
This addresses a TODO from the test file.

libstdc++-v3/ChangeLog:

* testsuite/std/format/ranges/format_kind.cc: New test.

Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
2025-07-29 | RISC-V: Remove use of structured binding to fix compiler warning | Christoph Müllner | 1 | -1/+2
Function riscv_ext_is_subset () uses structured bindings to iterate over all keys and values of an unordered map. However, this is only available since C++17 and causes a warning like this: warning: structured bindings only available with ‘-std=c++17’ This patch addresses the warning. gcc/ChangeLog: * common/config/riscv/riscv-common.cc (riscv_ext_is_subset): Remove use of structured binding to fix compiler warning. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
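The kind of rewrite that avoids this warning can be sketched as follows. The example is hypothetical and is not the code from riscv-common.cc: `total_weight` and its map contents are made up for illustration. It iterates over an unordered map using the pair members `first` and `second`, which compiles before C++17.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical example: sums key lengths and values of a map without
// structured bindings.
inline int total_weight(const std::unordered_map<std::string, int> &m) {
  int total = 0;
  // C++17 form, which triggers the warning under older standards:
  //   for (const auto &[key, value] : m) ...
  for (const auto &entry : m) {
    const std::string &key = entry.first;  // was: key
    const int value = entry.second;        // was: value
    total += static_cast<int>(key.size()) + value;
  }
  return total;
}
```

Binding `entry.first` and `entry.second` to named locals keeps the loop body as readable as the structured-binding version.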
2025-07-29 | asf: Skip when an instruction doesn't satisfy the constraints [PR119795] | Konstantinos Eleftheriou | 2 | -8/+83
While scanning the instructions, upon reaching an instruction that doesn't satisfy the constraints that we have set, we were removing the already detected stores but continuing to add stores from that point onward. This caused issues when the address ranges of later stores overlapped with the load's address, leading to a partial and wrong update of the register containing the loaded value.

With this patch, we skip the transformation for stores that operate on the load's address range when stores that operate on the same range have been deleted due to constraint violations.

PR rtl-optimization/119795

gcc/ChangeLog:

* avoid-store-forwarding.cc (store_forwarding_analyzer::avoid_store_forwarding): Skip transformations for stores that operate on the same address range as deleted ones.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr119795.c: New test.
2025-07-29 | RISC-V: Add test cases for mul based unsigned scalar SAT_MUL | Pan Li | 12 | -3/+117
Add run and tree-optimized check for mul based unsigned scalar SAT_MUL instead of the widen_mul. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat/sat_u_mul-run-1-u16-from-u64.c: Add rv64 target for run. * gcc.target/riscv/sat/sat_u_mul-run-1-u32-from-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_mul-run-1-u8-from-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_mul-1-u16-from-u32.c: New test. * gcc.target/riscv/sat/sat_u_mul-1-u8-from-u16.c: New test. * gcc.target/riscv/sat/sat_u_mul-1-u8-from-u32.c: New test. * gcc.target/riscv/sat/sat_u_mul-2-u16-from-u64.c: New test. * gcc.target/riscv/sat/sat_u_mul-2-u32-from-u64.c: New test. * gcc.target/riscv/sat/sat_u_mul-2-u8-from-u64.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-1-u16-from-u32.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-1-u8-from-u16.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-1-u8-from-u32.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2025-07-29 | Match: Introduce mul based pattern for unsigned SAT_MUL | Pan Li | 1 | -5/+17
As with the widen_mul based pattern, we would like to introduce a mul based pattern as well. The pattern is quite simple compared to the widen_mul one, so add a new pattern instead of extending the for loop in match.pd.

gcc/ChangeLog:

* match.pd: Add mul based unsigned SAT_MUL.

Signed-off-by: Pan Li <pan2.li@intel.com>
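For orientation, an unsigned saturating multiply computed through a wider type looks roughly like the sketch below. This is an assumption about the general shape of such source code, not the pattern text added to match.pd; the function name `sat_u_mul_u8` is hypothetical.

```cpp
#include <cstdint>

// Unsigned saturating multiply for 8-bit operands: multiply in a wider
// type, then clamp to the maximum of the narrow type.
constexpr std::uint8_t sat_u_mul_u8(std::uint8_t a, std::uint8_t b) {
  // 255 * 255 = 65025 fits in 16 bits, so the wide product is exact.
  const std::uint16_t wide = static_cast<std::uint16_t>(a * b);
  return wide > UINT8_MAX ? static_cast<std::uint8_t>(UINT8_MAX)
                          : static_cast<std::uint8_t>(wide);
}
```

For example, 10 * 20 stays 200, while 16 * 16 saturates to 255 instead of wrapping to 0.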
2025-07-29 | Another testcase for PR120687 | Richard Biener | 1 | -0/+16
This shows reassoc is harmful even with len == 3. PR tree-optimization/120687 * gcc.dg/vect/pr120687-3.c: New testcase.
2025-07-29 | testsuite: Fix C++14 test failure with modules test [PR121285] | Nathaniel Shead | 1 | -2/+2
I hadn't validated this test worked in C++14 before submitting, fixed thusly. PR testsuite/121285 gcc/testsuite/ChangeLog: * g++.dg/modules/class-11_a.H: Make static_asserts valid for C++14. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2025-07-29 | tree-optimization/120687 - avoid disturbing reduction chains in reassoc | Richard Biener | 4 | -4/+42
Reassoc carefully ranks operands to form reduction chains for vectorization so we are careful to not apply any width related changes in the early pass. Unfortunately we are not careful enough. The following gates fma related re-ordering and also the >= 3 ops tail "optimization" which is the culprit here. This does not fix the reported inefficient vectorization when using signed integer reductions yet. PR tree-optimization/120687 * tree-ssa-reassoc.cc (reassociate_bb): Do not disturb the sorted operand order in the early pass. * tree-vect-slp.cc (vect_analyze_slp): Dump when a detected reduction chain fails SLP discovery. * gcc.dg/vect/pr120687-1.c: New testcase. * gcc.dg/vect/pr120687-2.c: Likewise.
2025-07-29Fix UB in string_slice::operator== (PR 121261)Alfie Richards1-0/+4
This adds a nullptr check to fix a regression where it is possible to call `memcmp (NULL, NULL, 0)` which is UB prior to C26. This fixes the bootstrap-ubsan build. gcc/ChangeLog: PR middle-end/121261 * vec.h: Add null ptr check.
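A minimal C sketch of the general idea, under stated assumptions: `struct slice` and `slice_equal` are hypothetical stand-ins for the C++ string_slice code, and guarding on the length is just one way to make sure memcmp never sees a null pointer. It is not the actual change made to vec.h.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical C stand-in for a string slice: a pointer plus a length.
   An empty slice may have a null BEGIN.  */
struct slice
{
  const char *begin;
  size_t len;
};

bool
slice_equal (struct slice a, struct slice b)
{
  if (a.len != b.len)
    return false;
  /* Both slices are empty: return early so that memcmp is never
     called with null pointers, even with a length of zero.  */
  if (a.len == 0)
    return true;
  return memcmp (a.begin, b.begin, a.len) == 0;
}
```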
2025-07-29PR modula2/121289 Poor warning location when using Wstyle optionGaius Mulley11-54/+116
This patch adds a token location parameter to CheckVariableAgainstKeyword and dependants ensuring that the warning is generated from the token associated with the variable rather than the end of the statement. gcc/m2/ChangeLog: PR modula2/121289 * gm2-compiler/M2Students.def (CheckVariableAgainstKeyword): New parameter tok. * gm2-compiler/M2Students.mod (CheckVariableAgainstKeyword): New parameter tok. Pass tok to PerformVariableKeywordCheck. (PerformVariableKeywordCheck): New parameter tok. Pass tok to MetaErrorStringT0. * gm2-compiler/P2SymBuild.mod (BuildVariable): Pass tok to CheckVariableAgainstKeyword. * gm2-libs-iso/LowLong.mod (except): Replace with ... (exceptSrc): ... this. * gm2-libs-iso/LowReal.mod (except): Replace with ... (exceptSrc): ... this. * gm2-libs-iso/LowShort.mod (except): Replace with ... (exceptSrc): ... this. * gm2-libs-iso/Processes.mod (Wait): Replace from with fromCor. * gm2-libs-iso/RndFile.mod (EndPos): Replace end with endP. * gm2-libs/SCmdArgs.mod (GetArg): Replace start with startPos. Replace end with endPos. (NArg): Replace start with startPos. Replace end with endPos. gcc/testsuite/ChangeLog: PR modula2/121289 * gm2/warnings/style/fail/badvarname.mod: New test. * gm2/warnings/style/fail/warnings-style-fail.exp: New test. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2025-07-29testsuite: Restore dg-do run on pr116906 and pr78185 testsChristophe Lyon3-0/+3
Commit r15-7152-g57b706d141b87c removed /* { dg-do run { target*-*-linux* *-*-gnu* *-*-uclinux* } } */ from these tests, turning them into 'compile' only tests, even when they could be executed. This patch adds /* { dg-do run } */ which is OK since the tests are correctly skipped if needed thanks to the following effective-targets (alarm and signal). With this patch we have again two entries for these tests on linux targets: * compile (test for excess errors) * execution test gcc/testsuite/ChangeLog: * gcc.dg/pr116906-1.c: Add 'dg-do run'. * gcc.dg/pr116906-2.c: Likewise. * gcc.dg/pr78185.c: Likewise.
2025-07-29calls: Allow musttail calls to noreturn [PR121159]Jakub Jelinek3-2/+20
In the PR119483 r15-9003 change we've allowed musttail calls to noreturn functions, after all the decision not to normally tail call noreturn functions is not because it is not possible to tail call those, but because it screws up backtraces. As the following testcase shows, we've done that only for functions not declared [[noreturn]]/_Noreturn but later on discovered through IPA as noreturn. Functions explicitly declared [[noreturn]] have (for historical reasons) volatile FUNCTION_TYPE and the FUNCTION_DECLs are volatile as well, so in order to support those we shouldn't complain on ECF_NORETURN (we've stopped doing so for musttail in PR119483) but also shouldn't complain about TYPE_VOLATILE on their FUNCTION_TYPE (something that IPA doesn't change, I think it only sets TREE_THIS_VOLATILE on the FUNCTION_DECL). volatile on function type really means noreturn as well, it has no other meaning. 2025-07-29 Jakub Jelinek <jakub@redhat.com> PR middle-end/121159 * calls.cc (can_implement_as_sibling_call_p): Don't reject declared noreturn functions in musttail calls. * c-c++-common/pr121159.c: New test. * gcc.dg/plugin/must-tail-call-2.c (test_5): Don't expect an error.
2025-07-28output: Move a special # (256) to a new macroAndrew Pinski3-6/+9
This is a followup to the review of the mergability of CSWTCH patch located at https://gcc.gnu.org/pipermail/gcc-patches/2025-July/690810.html. Moves the special # (256) to a macro so it is not used bare in the source and there is only the need to change it in one place. This special # was added with r0-37392-g201556f0e00580 which added the original mergeable section support to gcc. Pushed as obvious after build and test on x86_64. gcc/ChangeLog: * output.h (MAX_ALIGN_MERGABLE): New define. * tree-switch-conversion.cc (switch_conversion::build_one_array): Use MAX_ALIGN_MERGABLE instead of 256. * varasm.cc (mergeable_string_section): Likewise. (mergeable_constant_section): Likewise. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2025-07-28Improve mergability of CSWTCH [PR120523]Andrew Pinski4-7/+99
When I did r16-1067-gaa935ce40a7, I thought it would be enough to mark the decl as mergable to get it to merge on all targets. It turns out a few things needed to change to support it being mergable on all targets. The first is to improve the selection of the mergable section: instead of basing it on the DECL's mode, it should be based on the size. The second is to change the alignment of the CSWTCH decl so that it is aligned to the next power of 2 relative to the size, if the size is less than 32 bytes (the max mergable size that is supported). With these changes, cswtch-6.c passes on ia32 and other targets, and the new testcase cswtch-7.c now passes too. Note I noticed that darwin's darwin_mergeable_constant_section could be "fixed" up to use DECL_SIZE instead of DECL_MODE, but I am not sure it makes a huge difference. Bootstrapped and tested on x86_64-linux-gnu. PR middle-end/120523 gcc/ChangeLog: * output.h (mergeable_constant_section): New declaration taking unsigned HOST_WIDE_INT for the size. * tree-switch-conversion.cc (switch_conversion::build_one_array): Increase the alignment of CSWTCH for sizes less than 32 bytes. * varasm.cc (mergeable_constant_section): Split out twice. One that takes the size in unsigned HOST_WIDE_INT and the other size in a tree. (default_elf_select_section): Pass DECL_SIZE instead of DECL_MODE to mergeable_constant_section. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/cswtch-7.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
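The alignment rule described in this commit can be sketched in plain C. This is a hedged illustration, not the code in tree-switch-conversion.cc: the helper name `cswtch_align_bytes`, the constant `MAX_MERGEABLE_BYTES`, and the reading of "next power of 2" as "smallest power of two not below the size" are all assumptions.

```c
/* Assumed limit, taken from the commit text: sizes of 32 bytes or more
   are left alone.  The name is hypothetical.  */
#define MAX_MERGEABLE_BYTES 32u

/* Hypothetical helper: return the smallest power of two that is not
   below SIZE_BYTES when the size is under the limit, or 0 to mean
   "leave the alignment unchanged".  */
unsigned
cswtch_align_bytes (unsigned size_bytes)
{
  if (size_bytes == 0 || size_bytes >= MAX_MERGEABLE_BYTES)
    return 0;

  unsigned align = 1;
  while (align < size_bytes)
    align <<= 1;
  return align;
}
```

Under these assumptions a 3-byte table would be aligned to 4 bytes and a 24-byte table to 32 bytes, which lets each one fill exactly one entry of a power-of-two sized mergeable section.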
2025-07-29Un-factor vectorizable_load partsRichard Biener1-32/+30
When the costing refactoring happened we ended up with some strange inter-mixing of VMAT unrelated code. The following moves stuff closer to where it's actually used, at the expense of duplicating some lines. * tree-vect-stmts.cc (vectorizable_load): Un-factor VMAT specific code to their handling blocks.
2025-07-29Eliminate gather-scatter-info offset_dt memberRichard Biener3-26/+10
The following removes this member, which is only ever set. Slightly complicated by the hoops get_group_load_store_type jumps through. I've simplified that, noting the offset vector type that's relevant is that of the actual offset SLP node, not of what vect_check_gather_scatter (re-)computes. * tree-vectorizer.h (gather_scatter_info::offset_dt): Remove. * tree-vect-data-refs.cc (vect_describe_gather_scatter_call): Do not set it. (vect_check_gather_scatter): Likewise. * tree-vect-stmts.cc (vect_truncate_gather_scatter_offset): Likewise. (get_group_load_store_type): Use the vector type of the offset SLP child. Do not re-check vect_is_simple_use validated by SLP build.
2025-07-29Daily bump.GCC Administrator6-1/+220
2025-07-28AVR: target/121277 - Don't load 0x800000 with const __flashx *x = NULL.Georg-Johann Lay1-6/+13
Converting from the generic AS to __flashx used the same rule as for __memx, which tags RAM (generic AS) locations by setting bit 23. The justification was that generic isn't a subset of __flashx, though that led to surprises with code like const __flashx *x = NULL. The natural thing to do is to just load 0x000000 in that case, so that the null pointer works in __flashx as expected. Apart from that, converting NULL to __flashx (or __flash) no longer raises a -Waddr-space-convert diagnostic. gcc/ PR target/121277 * config/avr/avr.cc (avr_addr_space_convert): When converting from generic AS to __flashx, don't set bit 23. (avr_convert_to_type): Don't -Waddr-space-convert when NULL is converted to __flashx or to __flash.
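To make the bit-23 tagging concrete, here is a small host-side model in portable C. It does not use the AVR address-space keywords and is not the code in avr.cc; the function names and the `MEMX_RAM_TAG` macro are hypothetical, and the 24-bit values are modeled with `uint32_t`.

```c
#include <stdint.h>

/* Bit 23 marks a RAM (generic address space) location in a 24-bit
   __memx pointer, per the commit text.  The macro name is hypothetical.  */
#define MEMX_RAM_TAG UINT32_C (0x800000)

/* Model of generic -> __memx: tag the 16-bit RAM address with bit 23.  */
uint32_t
generic_to_memx (uint16_t addr)
{
  return MEMX_RAM_TAG | addr;
}

/* Model of generic -> __flashx after the change: no tag is applied,
   so a null pointer stays 0x000000.  */
uint32_t
generic_to_flashx (uint16_t addr)
{
  return addr;
}
```

In this model, applying the __memx rule to a null pointer yields 0x800000, which is the surprising value named in the commit title, while the untagged conversion keeps it at zero.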
2025-07-28ifcvt: Fix ifcvt for multiple phi nodes after factoring operator [PR121236]Andrew Pinski2-25/+55
When I added the factor operations to ifcvt, I messed up the handling of phi node removal. The fix is to remove the phi node that was factored out when we factor out the operator, because otherwise scev can go wrong when detecting whether the new args are from a reduction. The interface of is_cond_scalar_reduction also needs to change: the phi node that was being passed no longer exists after the factoring, so the parts that were being used need to be passed instead. PR tree-optimization/121236 gcc/ChangeLog: * tree-if-conv.cc (is_cond_scalar_reduction): Instead of phi argument, pass bb and res of the phi. (factor_out_operators): Add iterator for the phi. Remove the phi if this is the first time. Return if we had removed the phi. (predicate_scalar_phi): Add the phi iterator argument. Update call to is_cond_scalar_reduction. Update call to factor_out_operators and set the return value to true when factor_out_operators returns true. (predicate_all_scalar_phis): Don't remove the phi if predicate_scalar_phi already removed it. gcc/testsuite/ChangeLog: * gcc.dg/torture/pr121236-1.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2025-07-28x86: Disallow -mtls-dialect=gnu with no_caller_saved_registersH.J. Lu7-0/+83
__tls_get_addr doesn't preserve vector registers. When a function with no_caller_saved_registers attribute calls __tls_get_addr, YMM and ZMM registers will be clobbered. Issue an error and suggest -mtls-dialect=gnu2 in this case. gcc/ PR target/121208 * config/i386/i386.cc (ix86_tls_get_addr): Issue an error for -mtls-dialect=gnu with no_caller_saved_registers attribute and suggest -mtls-dialect=gnu2. gcc/testsuite/ PR target/121208 * gcc.target/i386/pr121208-1a.c: New test. * gcc.target/i386/pr121208-1b.c: Likewise. * gcc.target/i386/pr121208-2a.c: Likewise. * gcc.target/i386/pr121208-2b.c: Likewise. * gcc.target/i386/pr121208-3a.c: Likewise. * gcc.target/i386/pr121208-3b.c: Likewise. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>