aboutsummaryrefslogtreecommitdiff
path: root/gcc
AgeCommit message (Collapse)AuthorFilesLines
2025-08-25c++: Check for *jump_target earlier in cxx_bind_parameters_in_call [PR121601]Jakub Jelinek2-2/+21
The following testcase ICEs, because the /* Check we aren't dereferencing a null pointer when calling a non-static member function, which is undefined behaviour. */ if (i == 0 && DECL_OBJECT_MEMBER_FUNCTION_P (fun) && integer_zerop (arg) /* But ignore calls from within compiler-generated code, to handle cases like lambda function pointer conversion operator thunks which pass NULL as the 'this' pointer. */ && !(TREE_CODE (t) == CALL_EXPR && CALL_FROM_THUNK_P (t))) { if (!ctx->quiet) error_at (cp_expr_loc_or_input_loc (x), "dereferencing a null pointer"); *non_constant_p = true; } checking is done before testing if (*jump_target). Especially when throws (jump_target), arg can be (and is on this testcase) NULL_TREE, so calling integer_zerop on it ICEs. Fixed by moving the if (*jump_target) test earlier. 2025-08-25 Jakub Jelinek <jakub@redhat.com> PR c++/121601 * constexpr.cc (cxx_bind_parameters_in_call): Move break if *jump_target before the check for null this object pointer. * g++.dg/cpp26/constexpr-eh16.C: New test.
2025-08-25tree-optimization/121638 - missed SLP discovery of live inductionRichard Biener2-6/+82
The following fixes a missed SLP discovery of a live induction. Our pattern matching of those fails because of the PR81529 fix which I think was misguided and should now no longer be relevant. So this essentially reverts that fix. I have added a GIMPLE testcase to increase the chance the particular IL is preserved through the future. This shows that how we make some IVs live because of early-break isn't quite correct, so I had to preserve a hack here. Hopefully to be investigated at some point. PR tree-optimization/121638 * tree-vect-stmts.cc (process_use): Do not make induction PHI backedge values relevant. * gcc.dg/vect/pr121638.c: New testcase.
2025-08-25targhooks: i386: rename TAG_SIZE to TAG_BITSIZEIndu Bhagat7-11/+11
gcc/Changelog: * asan.h (HWASAN_TAG_SIZE): Use targetm.memtag.tag_bitsize. * config/i386/i386.cc (ix86_memtag_tag_size): Rename to ix86_memtag_tag_bitsize. (TARGET_MEMTAG_TAG_SIZE): Renamed to TARGET_MEMTAG_TAG_BITSIZE. * doc/tm.texi (TARGET_MEMTAG_TAG_SIZE): Likewise. * doc/tm.texi.in (TARGET_MEMTAG_TAG_SIZE): Likewise. * target.def (tag_size): Rename to tag_bitsize. * targhooks.cc (default_memtag_tag_size): Rename to default_memtag_tag_bitsize. * targhooks.h (default_memtag_tag_size): Likewise. Signed-off-by: Claudiu Zissulescu <claudiu.zissulescu-ianculescu@oracle.com> Co-authored-by: Claudiu Zissulescu <claudiu.zissulescu-ianculescu@oracle.com>
2025-08-25RISC-V: Replace deprecated FUNCTION_VALUE/LIBCALL_VALUE macros with target hooksKito Cheng3-21/+36
The FUNCTION_VALUE and LIBCALL_VALUE macros are deprecated in favor of the TARGET_FUNCTION_VALUE and TARGET_LIBCALL_VALUE target hooks. This patch replaces the macro definitions with proper target hook implementations. This change is also a preparatory step for VLS calling convention support, which will require additional information that is more easily handled through the target hook interface. gcc/ChangeLog: * config/riscv/riscv-protos.h (riscv_init_cumulative_args): Change fntype parameter from tree to const_tree. * config/riscv/riscv.cc (riscv_init_cumulative_args): Likewise. (riscv_function_value): Replace with new implementation that conforms to TARGET_FUNCTION_VALUE hook signature. (riscv_libcall_value): New function implementing TARGET_LIBCALL_VALUE. (TARGET_FUNCTION_VALUE): Define. (TARGET_LIBCALL_VALUE): Define. * config/riscv/riscv.h (FUNCTION_VALUE): Remove. (LIBCALL_VALUE): Remove.
2025-08-25Use x86 GFNI for vectorized constant byte shifts/rotatesAndi Kleen11-14/+595
The GFNI AVX gf2p8affineqb instruction can be used to implement vectorized byte shifts or rotates. This patch uses them to implement shift and rotate patterns to allow the vectorizer to use them. Previously AVX couldn't do rotates (except with XOP) and had to handle 8 bit shifts with a half throughput 16 bit shift. This is only implemented for constant shifts. In theory it could be used with a lookup table for variable shifts, but it's unclear if it's worth it. The vectorizer cost model could be improved, but seems to work for now. It doesn't model the true latencies of the instructions. Also it doesn't account for the memory loading of the mask, assuming that for a loop it will be loaded outside the loop. The instructions would also support more complex patterns (e.g. arbitary bit movement or inversions), so some of the tricks applied to ternlog could be applied here too to collapse more code. It's trickier because the input patterns can be much longer since they can apply to every bit individually. I didn't attempt any of this. There's currently no test case for the masked/cond_ variants, they seem to be difficult to trigger with the vectorizer. Suggestions for a test case for them welcome. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_vgf2p8affine_shift_matrix): New function to lookup shift/rotate matrixes for gf2p8affine. * config/i386/i386-protos.h (ix86_vgf2p8affine_shift_matrix): Declare new function. * config/i386/i386.cc (ix86_shift_rotate_cost): Add cost model for shift/rotate implemented using gf2p8affine. * config/i386/sse.md (VI1_AVX512_3264): New mode iterator. (<insn><mode>3): Add GFNI case for shift patterns. (cond_<insn><mode>3): New pattern. (<insn><mode>3<mask_name>): Dito. (<insn>v16qi): New rotate pattern to handle XOP V16QI case and GFNI. (rotl<mode>3, rotr<mode>3): Exclude V16QI case. gcc/testsuite/ChangeLog: * gcc.target/i386/shift-gf2p8affine-1.c: New test * gcc.target/i386/shift-gf2p8affine-2.c: New test * gcc.target/i386/shift-gf2p8affine-3.c: New test * gcc.target/i386/shift-v16qi-4.c: New test * gcc.target/i386/shift-gf2p8affine-5.c: New test * gcc.target/i386/shift-gf2p8affine-6.c: New test * gcc.target/i386/shift-gf2p8affine-7.c: New test
2025-08-25LoongArch: Fix ICE in highway-1.3.0 testsuite [PR121634]Xi Ruoyao2-1/+16
I can't believe I made such a stupid pasto and the regression test didn't detect anything wrong. PR target/121634 gcc/ * config/loongarch/simd.md (simd_maddw_evod_<mode>_<su>): Use WVEC_HALF instead of WVEC for the mode of the sign_extend for the rhs of multiplication. gcc/testsuite/ * gcc.target/loongarch/pr121634.c: New test.
2025-08-24Fix invalid right shift count with recent ifcvt changesJeff Law2-6/+10
I got too clever trying to simplify the right shift computation in my recent ifcvt patch. Interestingly enough, I haven't seen anything but the Linaro CI configuration actually trip the problem, though the code is clearly wrong. The problem I was trying to avoid were the leading zeros when calling clz on a HWI when the real object is just say 32 bits. The net is we get a right shift count of "2" when we really wanted a right shift count of 30. That causes the execution aspect of bics_3 to fail. The scan failures are due to creating slightly more efficient code. THe new code sequences don't need to use conditional execution for selection and thus we can use bic rather bics which requires a twiddle in the scan. I reviewed recent bug reports and haven't seen one for this issue. So no new testcase as this is covered by the armv7 testsuite in the right configuration. Bootstrapped and regression tested on x86_64, also verified it fixes the Linaro reported CI failure and verified the crosses are still happy. Pushing to the trunk. gcc/ * ifcvt.cc (noce_try_sign_bit_splat): Fix right shift computation. gcc/testsuite/ * gcc.target/arm/bics_3.c: Adjust expected output
2025-08-25Daily bump.GCC Administrator2-1/+5
2025-08-24Update gcc de.poJoseph Myers1-331/+332
* de.po: Update.
2025-08-24i386: fix ChangeLog entrySam James1-1/+1
Fix a typo in the ChangeLog entry from r16-3355-g96a291c4bb0b8a.
2025-08-24Daily bump.GCC Administrator4-1/+55
2025-08-23c++: Fix greater-than operator in braced-init-lists [PR116928]Eczbek2-0/+8
PR c++/116928 gcc/cp/ChangeLog: * parser.cc (cp_parser_braced_list): Set greater_than_is_operator_p. gcc/testsuite/ChangeLog: * g++.dg/parse/template33.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>
2025-08-23x86: Compile noplt-(g|l)d-1.c with -mtls-dialect=gnuH.J. Lu2-2/+2
Compile noplt-gd-1.c and noplt-ld-1.c with -mtls-dialect=gnu to support the --with-tls=gnu2 configure option since they scan the assembly output for the __tls_get_addr call which is generated by -mtls-dialect=gnu. PR target/120933 * gcc.target/i386/noplt-gd-1.c (dg-options): Add -mtls-dialect=gnu. * gcc.target/i386/noplt-ld-1.c (dg-options): Likewise. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2025-08-23i386: wire up --with-tls to control -mtls-dialect= defaultSam James3-2/+17
Allow passing --with-tls= at configure-time to control the default value of -mtls-dialect= for i386 and x86_64. The default itself (gnu) is not changed unless --with-tls= is passed. --with-tls= is already wired up for ARM and RISC-V. gcc/ChangeLog: PR target/120933 * config.gcc (supported_defaults): Add tls for i386, x86_64. * config/i386/i386.h (host_detect_local_cpu): Add tls. * doc/install.texi: Document --with-tls= for i386, x86_64.
2025-08-22driver: Rework for_each_path using C++John Ericson1-108/+77
The old C-style was cumbersome make making one responsible for manually creating and passing in two parts a closure (separate function and *_info class for closed-over variables). With C++ lambdas, we can just: - derive environment types implicitly - have fewer stray static functions Also thanks to templates we can - make the return type polymorphic, to avoid casting pointee types. Note that `struct spec_path` was *not* converted because it is used multiple times. We could still convert to a lambda, but we would want to put the for_each_path call with that lambda inside a separate function anyways, to support the multiple callers. Unlike the other two refactors, it is not clear that this one would make anything shorter. Instead, I define the `operator()` explicitly. Keeping the explicit struct gives us some nice "named arguments", versus the wrapper function alternative, too. gcc/ChangeLog: * gcc.cc (for_each_path): templated, to make passing lambdas possible/easy/safe, and to have a polymorphic return type. (struct add_to_obstack_info): Deleted, lambda captures replace it. (add_to_obstack): Moved to lambda in build_search_list. (build_search_list): Has above lambda now. (struct file_at_path_info): Deleted, lambda captures replace it. (file_at_path): Moved to lambda in find_a_file. (find_a_file): Has above lambda now. (struct spec_path_info): Reamed to just struct spec_path. (struct spec_path): New name. (spec_path): Rnamed to spec_path::operator() (spec_path::operator()): New name (do_spec_1): Updated for_each_path call sites. Signed-off-by: John Ericson <git@JohnEricson.me> Reviewed-by: Jason Merrill <jason@redhat.com>
2025-08-23c++/modules: Provide definitions of synthesized methods outside their ↵Nathaniel Shead4-0/+59
defining module [PR120499] In the PR, we're getting a linker error from _Vector_impl's destructor never getting emitted. This is because of a combination of factors: 1. in imp-member-4_a, the destructor is not used and so there is no definition generated. 2. in imp-member-4_b, the destructor gets synthesized (as part of the synthesis for Coll's destructor) but is not ODR-used and so does not get emitted. Despite there being a definition provided in this TU, the destructor is still considered imported and so isn't streamed into the module body. 3. in imp-member-4_c, we need to ODR-use the destructor but we only got a forward declaration from imp-member-4_b, so we cannot emit a body. The point of failure here is step 2; this function has effectively been declared in the imp-member-4_b module, and so we shouldn't treat it as imported. This way we'll properly stream the body so that importers can emit it. PR c++/120499 gcc/cp/ChangeLog: * method.cc (synthesize_method): Set the instantiating module. gcc/testsuite/ChangeLog: * g++.dg/modules/imp-member-4_a.C: New test. * g++.dg/modules/imp-member-4_b.C: New test. * g++.dg/modules/imp-member-4_c.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2025-08-23Daily bump.GCC Administrator3-1/+130
2025-08-23rs6000: Add shift count guards to avoid undefined behavior [PR118890]Kishan Parmar1-2/+2
This patch adds missing guards on shift amounts to prevent UB when the shift count equals or exceeds HOST_BITS_PER_WIDE_INT. In the patch (r16-2666-g647bd0a02789f1), shift counts were only checked for nonzero but not for being within valid bounds. This patch tightens those conditions by enforcing that shift counts are greater than zero and less than HOST_BITS_PER_WIDE_INT. 2025-08-23 Kishan Parmar <kishan@linux.ibm.com> gcc/ PR target/118890 * config/rs6000/rs6000.cc (can_be_rotated_to_negative_lis): Add bounds checks for shift counts to prevent undefined behavior. (rs6000_emit_set_long_const): Likewise.
2025-08-22[PR rtl-optimization/120553] Improve selecting between constants based on ↵Jeff Law9-0/+783
sign bit test While working to remove mvconst_internal I stumbled over a regression in the code to handle signed division by a power of two. In that sequence we want to select between 0, 2^n-1 by pairing a sign bit splat with a subsequent logical right shift. This can be done without branches or conditional moves. Playing with it a bit made me realize there's a handful of selections we can do based on a sign bit test. Essentially there's two broad cases. Clearing bits after the sign bit splat. So we have 0, -1, if we clear bits the 0 stays as-is, but the -1 could easily turn into 2^n-1, ~2^n-1, or some small constants. Setting bits after the sign bit splat. If we have 0, -1, setting bits the -1 stays as-is, but the 0 can turn into 2^n, a small constant, etc. Shreya and I originally started looking at target patterns to do this, essentially discovering conditional move forms of the selects and rewriting them into something more efficient. That got out of control pretty quickly and it relied on if-conversion to initially create the conditional move. The better solution is to actually discover the cases during if-conversion itself. That catches cases that were previously being missed, checks cost models, and is actually simpler since we don't have to distinguish between things like ori and bseti, instead we just emit the natural RTL and let the target figure it out. In the ifcvt implementation we put these cases just before trying the traditional conditional move sequences. Essentially these are a last attempt before trying the generalized conditional move sequence. This as been bootstrapped and regression tested on aarch64, riscv, ppc64le, s390x, alpha, m68k, sh4eb, x86_64 and probably a couple others I've forgotten. It's also been tested on the other embedded targets. Obviously the new tests are risc-v specific, so that testing was primarily to make sure we didn't ICE, generate incorrect code or regress target existing specific tests. Raphael has some changes to attack this from the gimple direction as well. I think the latest version of those is on me to push through internal review. PR rtl-optimization/120553 gcc/ * ifcvt.cc (noce_try_sign_bit_splat): New function. (noce_process_if_block): Use it. gcc/testsuite/ * gcc.target/riscv/pr120553-1.c: New test. * gcc.target/riscv/pr120553-2.c: New test. * gcc.target/riscv/pr120553-3.c: New test. * gcc.target/riscv/pr120553-4.c: New test. * gcc.target/riscv/pr120553-5.c: New test. * gcc.target/riscv/pr120553-6.c: New test. * gcc.target/riscv/pr120553-7.c: New test. * gcc.target/riscv/pr120553-8.c: New test.
2025-08-22Pass representative of live SLP node to vect_create_epilog_for_reductionRichard Biener1-2/+3
We passed the reduc_info which is close, but the representative is more spot on and will not collide with making the reduc_info a distinct type. * tree-vect-loop.cc (vectorizable_live_operation): Pass the representative of the PHIs node to vect_create_epilog_for_reduction.
2025-08-22Fixups around reduction info and STMT_VINFO_REDUC_VECTYPE_INRichard Biener1-6/+6
STMT_VINFO_REDUC_VECTYPE_IN exists on relevant reduction stmts, not the reduction info. And STMT_VINFO_DEF_TYPE exists on the reduction info. The following fixes up a few places. * tree-vect-loop.cc (vectorizable_lane_reducing): Get reduction info properly. Adjust checks according to comments. (vectorizable_reduction): Do not set STMT_VINFO_REDUC_VECTYPE_IN on the reduc info. (vect_transform_reduction): Query STMT_VINFO_REDUC_VECTYPE_IN on the actual reduction stmt, not the info.
2025-08-22RISC-V: Add testcase for scalar unsigned SAT_MUL form 3Pan Li27-0/+382
Add run and asm check test cases for scalar unsigned SAT_MUL form 3. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat/sat_arith.h: Add test helper macros. * gcc.target/riscv/sat/sat_u_mul-4-u16-from-u128.c: New test. * gcc.target/riscv/sat/sat_u_mul-4-u16-from-u32.c: New test. * gcc.target/riscv/sat/sat_u_mul-4-u16-from-u64.c: New test. * gcc.target/riscv/sat/sat_u_mul-4-u16-from-u64.rv32.c: New test. * gcc.target/riscv/sat/sat_u_mul-4-u32-from-u128.c: New test. * gcc.target/riscv/sat/sat_u_mul-4-u32-from-u64.c: New test. * gcc.target/riscv/sat/sat_u_mul-4-u32-from-u64.rv32.c: New test. * gcc.target/riscv/sat/sat_u_mul-4-u64-from-u128.c: New test. * gcc.target/riscv/sat/sat_u_mul-4-u8-from-u128.c: New test. * gcc.target/riscv/sat/sat_u_mul-4-u8-from-u16.c: New test. * gcc.target/riscv/sat/sat_u_mul-4-u8-from-u32.c: New test. * gcc.target/riscv/sat/sat_u_mul-4-u8-from-u64.c: New test. * gcc.target/riscv/sat/sat_u_mul-4-u8-from-u64.rv32.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-4-u16-from-u128.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-4-u16-from-u32.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-4-u16-from-u64.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-4-u16-from-u64.rv32.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-4-u32-from-u128.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-4-u32-from-u64.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-4-u32-from-u64.rv32.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-4-u64-from-u128.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-4-u8-from-u128.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-4-u8-from-u16.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-4-u8-from-u32.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-4-u8-from-u64.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-4-u8-from-u64.rv32.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2025-08-22Match: Add form 3 for unsigned SAT_MULPan Li1-1/+26
This patch would like to try to match the the unsigned SAT_MUL form 3, aka below: #define DEF_SAT_U_MUL_FMT_3(NT, WT) \ NT __attribute__((noinline)) \ sat_u_mul_##NT##_from_##WT##_fmt_3 (NT a, NT b) \ { \ WT x = (WT)a * (WT)b; \ if ((x >> sizeof(a) * 8) == 0) \ return (NT)x; \ else \ return (NT)-1; \ } While WT is T is uint16_t, uint32_t, uint64_t and uint128_t, and NT is is uint8_t, uint16_t, uint32_t and uint64_t. gcc/ChangeLog: * match.pd: Add form 3 for unsigned SAT_MUL. Signed-off-by: Pan Li <pan2.li@intel.com>
2025-08-22Emit the TLS call after NOTE_INSN_FUNCTION_BEGH.J. Lu3-4/+44
For the beginning basic block: (note 4 0 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK) (note 2 4 26 2 NOTE_INSN_FUNCTION_BEG) emit the TLS call after NOTE_INSN_FUNCTION_BEG. gcc/ PR target/121635 * config/i386/i386-features.cc (ix86_emit_tls_call): Emit the TLS call after NOTE_INSN_FUNCTION_BEG. gcc/testsuite/ PR target/121635 * gcc.target/i386/pr121635-1a.c: New test. * gcc.target/i386/pr121635-1b.c: Likewise. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2025-08-22Use REDUC_GROUP_FIRST_ELEMENT lessRichard Biener1-34/+33
REDUC_GROUP_FIRST_ELEMENT is often checked to see whether we are dealing with a SLP reduction or a reduction chain. When we are in the context of analyzing the reduction (so we are sure the SLP instance we see is correct), then we can use the SLP instance kind instead. * tree-vect-loop.cc (get_initial_defs_for_reduction): Adjust comment. (vect_create_epilog_for_reduction): Get at the reduction kind via the instance, re-use the slp_reduc flag instead of checking REDUC_GROUP_FIRST_ELEMENT again. Remove unreachable code. (vectorizable_reduction): Compute a reduc_chain flag from the SLP instance kind, avoid REDUC_GROUP_FIRST_ELEMENT checks. (vect_transform_cycle_phi): Likewise. (vectorizable_live_operation): Check the SLP instance kind instead of REDUC_GROUP_FIRST_ELEMENT.
2025-08-22testsuite: Fix g++.dg/abi/mangle83.C for -fshort-enumsNathaniel Shead1-2/+2
Linaro CI informed me that this test fails on ARM thumb-m7-hard-eabi. This appears to be because the target defaults to -fshort-enums, and so the mangled names are inaccurate. This patch just disables the implicit type enum test for this case. gcc/testsuite/ChangeLog: * g++.dg/abi/mangle83.C: Disable implicit enum test for -fshort-enums. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2025-08-22Decouple parloops from vect reduction infra some moreRichard Biener1-66/+48
The following removes the use of STMT_VINFO_REDUC_* from parloops, also fixing a mistake with analyzing double reductions which rely on the outer loop vinfo so the inner loop is properly detected as nested. * tree-parloops.cc (parloops_is_simple_reduction): Pass in double reduction inner loop LC phis and query that. (parloops_force_simple_reduction): Similar, but set it. Check for valid reduction types here. (valid_reduction_p): Remove. (gather_scalar_reductions): Adjust, fixup double reduction inner loop processing.
2025-08-22RTEMS: Add riscv multilibsSebastian Huber1-2/+7
gcc/ChangeLog: * config/riscv/t-rtems: Add -mstrict-align multilibs for targets without support for misaligned access in hardware.
2025-08-21[arm] require armv7 support for [PR120424]Alexandre Oliva2-1/+4
Without stating the architecture version required by the test, test runs with options that are incompatible with the required architecture version fail, e.g. -mfloat-abi=hard. armv7 was not covered by the long list of arm variants in target-supports.exp, so add it, and use it for the effective target requirement and for the option. for gcc/testsuite/ChangeLog PR rtl-optimization/120424 * lib/target-supports.exp (arm arches): Add arm_arch_v7. * g++.target/arm/pr120424.C: Require armv7 support. Use dg-add-options arm_arch_v7 instead of explicit -march=armv7.
2025-08-22Daily bump.GCC Administrator6-1/+171
2025-08-21Fortran: Fix NULL pointer issue.Steven G. Kargl2-2/+10
PR fortran/121627 gcc/fortran/ChangeLog: * module.cc (create_int_parameter_array): Avoid NULL pointer dereference and enhance error message. gcc/testsuite/ChangeLog: * gfortran.dg/pr121627.f90: New test.
2025-08-21pru: libgcc: Add software implementation for multiplicationDimitar Dimitrov1-1/+10
For cores without a hardware multiplier, set respective optabs with library functions which use software implementation of multiplication. The implementation was copied from the RL78 backend. gcc/ChangeLog: * config/pru/pru.cc (pru_init_libfuncs): Set softmpy libgcc functions for optab multiplication entries if TARGET_OPT_MUL option is not set. libgcc/ChangeLog: * config/pru/libgcc-eabi.ver: Add __pruabi_softmpyi and __pruabi_softmpyll symbols. * config/pru/t-pru: Add softmpy source files. * config/pru/pru-softmpy.h: New file. * config/pru/softmpyi.c: New file. * config/pru/softmpyll.c: New file. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2025-08-21pru: Define multilib for different core variantsDimitar Dimitrov3-1/+33
Enable multilib builds for contemporary PRU core versions (AM335x and later), and older versions present in AM18xx. gcc/ChangeLog: * config.gcc: Include pru/t-multilib. * config/pru/pru.h (MULTILIB_DEFAULTS): Define. * config/pru/t-multilib: New file. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2025-08-21pru: Add options to disable MUL/FILL/ZERO instructionsDimitar Dimitrov6-15/+42
Older PRU core versions (e.g. in AM1808 SoC) do not support XIN, XOUT, FILL, ZERO instructions. Add GCC command line options to optionally disable generation of those instructions, so that code can be executed on such older PRU cores. gcc/ChangeLog: * common/config/pru/pru-common.cc (TARGET_DEFAULT_TARGET_FLAGS): Keep multiplication, FILL and ZERO instructions enabled by default. * config/pru/pru.md (prumov<mode>): Gate code generation on TARGET_OPT_FILLZERO. (mov<mode>): Ditto. (zero_extendqidi2): Ditto. (zero_extendhidi2): Ditto. (zero_extendsidi2): Ditto. (@pru_ior_fillbytes<mode>): Ditto. (@pru_and_zerobytes<mode>): Ditto. (@<code>di3): Ditto. (mulsi3): Gate code generation on TARGET_OPT_MUL. * config/pru/pru.opt: Add mmul and mfillzero options. * config/pru/pru.opt.urls: Regenerate. * config/rl78/rl78.opt.urls: Regenerate. * doc/invoke.texi: Document new options. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2025-08-21c: Add folding of nullptr_t in some cases [PR121478]Andrew Pinski3-3/+58
The middle-end does not fully understand NULLPTR_TYPE. So it gets confused a lot of the time when dealing with it. This adds the folding that is similarly done in the C++ front-end already. In some cases it should produce slightly better code as there is no reason to load from a nullptr_t variable as it is always NULL. The following is handled: nullptr_v ==/!= nullptr_v -> true/false (ptr)nullptr_v -> (ptr)0, nullptr_v f(nullptr_v) -> f ((nullptr, nullptr_v)) The last one is for conversion inside ... . Bootstrapped and tested on x86_64-linux-gnu. PR c/121478 gcc/c/ChangeLog: * c-fold.cc (c_fully_fold_internal): Fold nullptr_t ==/!= nullptr_t. * c-typeck.cc (convert_arguments): Handle conversion from nullptr_t for varargs. (convert_for_assignment): Handle conversions from nullptr_t to pointer type specially. gcc/testsuite/ChangeLog: * gcc.dg/torture/pr121478-1.c: New test. Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2025-08-21c++: constexpr clobber of const [PR121068]Jason Merrill2-0/+29
Since r16-3022, 20_util/variant/102912.cc was failing in C++20 and above due to wrong errors about destruction modifying a const object; destruction is OK. PR c++/121068 gcc/cp/ChangeLog: * constexpr.cc (cxx_eval_store_expression): Allow clobber of a const object. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/constexpr-dtor18.C: New test.
2025-08-21RISC-V: testsuite: Fix DejaGnu support for riscv_zvfhPaul-Antoine Arras13-18/+14
Call check_effective_target_riscv_zvfh_ok rather than check_effective_target_riscv_zvfh in vx_vf_*run-1-f16.c run tests and ensure that they are actually run. Also fix remove_options_for_riscv_zvfh. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmacc-run-1-f16.c: Call check_effective_target_riscv_zvfh_ok rather than check_effective_target_riscv_zvfh. * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmadd-run-1-f16.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmsac-run-1-f16.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmsub-run-1-f16.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmacc-run-1-f16.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmadd-run-1-f16.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmsac-run-1-f16.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmsub-run-1-f16.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwmacc-run-1-f16.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwmsac-run-1-f16.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwnmacc-run-1-f16.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwnmsac-run-1-f16.c: Likewise. * lib/target-supports.exp (check_effective_target_riscv_zvfh_ok): Append zvfh instead of v to march. (remove_options_for_riscv_zvfh): Remove duplicate and call remove_ rather than add_options_for_riscv_z_ext.
2025-08-21rtl-ssa: Add missing live-out uses [PR121619]Richard Sandiford4-1/+74
This PR is another bug in the rtl-ssa code to manage live-out uses. It seems that this didn't get much coverage until recently. In the testcase, late-combine first removed a register-to-register move by substituting into all uses, some of which were in other EBBs. This was done after checking make_uses_available, which (as expected) says that single dominating definitions are available everywhere that the definition dominates. But the update failed to add appropriate live-out uses, so a later parallelisation attempt tried to move the new destination into a later block. gcc/ PR rtl-optimization/121619 * rtl-ssa/functions.h (function_info::commit_make_use_available): Declare. * rtl-ssa/blocks.cc (function_info::commit_make_use_available): New function. * rtl-ssa/changes.cc (function_info::apply_changes_to_insn): Use it. gcc/testsuite/ PR rtl-optimization/121619 * gcc.dg/pr121619.c: New test.
2025-08-21tree-optimization/111494 - reduction vectorization with signed UBRichard Biener3-1/+56
The following makes sure to pun arithmetic that's used in vectorized reduction to unsigned when overflow invokes undefined behavior. PR tree-optimization/111494 * gimple-fold.h (arith_code_with_undefined_signed_overflow): Declare. * gimple-fold.cc (arith_code_with_undefined_signed_overflow): Export. * tree-vect-stmts.cc (vectorizable_operation): Use unsigned arithmetic for operations participating in a reduction.
2025-08-21x86-64: Emit the TLS call after NOTE_INSN_BASIC_BLOCKH.J. Lu3-3/+86
For a basic block with only a label: (code_label 78 11 77 3 14 (nil) [1 uses]) (note 77 78 54 3 [bb 3] NOTE_INSN_BASIC_BLOCK) emit the TLS call after NOTE_INSN_BASIC_BLOCK, instead of before NOTE_INSN_BASIC_BLOCK, to avoid x.c: In function ‘aout_16_write_syms’: x.c:54:1: error: NOTE_INSN_BASIC_BLOCK is missing for block 3 54 | } | ^ x.c:54:1: error: NOTE_INSN_BASIC_BLOCK 77 in middle of basic block 3 during RTL pass: x86_cse x.c:54:1: internal compiler error: verify_flow_info failed gcc/ PR target/121607 * config/i386/i386-features.cc (ix86_emit_tls_call): Emit the TLS call after NOTE_INSN_BASIC_BLOCK in a basic block with only a label. gcc/testsuite/ PR target/121607 * gcc.target/i386/pr121607-1a.c: New test. * gcc.target/i386/pr121607-1b.c: Likewise. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2025-08-21xtensa: Small improvement to "*btrue_INT_MIN"Takayuki 'January June' Suwa1-10/+7
This patch changes the implementation of the insn to test whether the result itself is negative or not, rather than the MSB of the result of the ABS machine instruction. This eliminates the need to consider bit- endianness and allows for longer branch distances. /* example */ extern void foo(int); void test0(int a) { if (a == -2147483648) foo(a); } void test1(int a) { if (a != -2147483648) foo(a); } ;; before (endianness: little) test0: entry sp, 32 abs a8, a2 bbci a8, 31, .L1 mov.n a10, a2 call8 foo .L1: retw.n test1: entry sp, 32 abs a8, a2 bbsi a8, 31, .L4 mov.n a10, a2 call8 foo .L4: retw.n ;; after (endianness-independent) test0: entry sp, 32 abs a8, a2 bgez a8, .L1 mov.n a10, a2 call8 foo .L1: retw.n test1: entry sp, 32 abs a8, a2 bltz a8, .L4 mov.n a10, a2 call8 foo .L4: retw.n gcc/ChangeLog: * config/xtensa/xtensa.md (*btrue_INT_MIN): Change the branch insn condition to test for a negative number rather than testing for the MSB.
2025-08-21Merge BB and loop path in vect_analyze_stmtRichard Biener4-64/+37
We have now common patterns for most of the vectorizable_* calls, so merge. This also avoids calling vectorizable_early_exit for BB vect and clarifies signatures of it and vectorizable_phi. * tree-vectorizer.h (vectorizable_phi): Take bb_vec_info. (vectorizable_early_exit): Take loop_vec_info. * tree-vect-loop.cc (vectorizable_phi): Adjust. * tree-vect-slp.cc (vect_slp_analyze_operations): Likewise. (vectorize_slp_instance_root_stmt): Likewise. * tree-vect-stmts.cc (vectorizable_early_exit): Likewise. (vect_transform_stmt): Likewise. (vect_analyze_stmt): Merge the sequences of vectorizable_* where common.
2025-08-21Fortran: gfortran PDT component access [PR84122, PR85942]Paul Thomas4-2/+193
2025-08-21 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/84122 * parse.cc (parse_derived): PDT type parameters are not allowed an explicit access specification and must appear before a PRIVATE statement. If a PRIVATE statement is seen, mark all the other components as PRIVATE. PR fortran/85942 * simplify.cc (get_kind): Convert a PDT KIND component into a specification expression using the default initializer. gcc/testsuite/ PR fortran/84122 * gfortran.dg/pdt_38.f03: New test. PR fortran/85942 * gfortran.dg/pdt_39.f03: New test.
2025-08-20c++: pointer to auto member function [PR120757]Jason Merrill2-2/+26
Here r13-1210 correctly changed &A<int>::foo to not be considered type-dependent, but tsubst_expr of the OFFSET_REF got confused trying to tsubst a type that involved auto. Fixed by getting the type from the member rather than tsubst. PR c++/120757 gcc/cp/ChangeLog: * pt.cc (tsubst_expr) [OFFSET_REF]: Don't tsubst the type. gcc/testsuite/ChangeLog: * g++.dg/cpp1y/auto-fn66.C: New test.
2025-08-21Daily bump.GCC Administrator6-1/+216
2025-08-20c++: lambda capture and shadowing [PR121553]Marek Polacek5-7/+23
P2036 says that this: [x=1]{ int x; } should be rejected, but with my P2036 we started giving an error for the attached testcase as well, breaking Dolphin. So let's keep the error only for init-captures. PR c++/121553 gcc/cp/ChangeLog: * name-lookup.cc (check_local_shadow): Check !is_normal_capture_proxy. gcc/testsuite/ChangeLog: * g++.dg/warn/Wshadow-19.C: Revert P2036 changes. * g++.dg/warn/Wshadow-6.C: Likewise. * g++.dg/warn/Wshadow-20.C: New test. * g++.dg/warn/Wshadow-21.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>
2025-08-20Regenerate common.opt.urls for -fdiagnostics-show-contextQing Zhao1-0/+6
When -fdiagnostics-show-context[=DEPTH] was added, they were documented, but common.opt.urls wasn't regenerated. gcc/ChangeLog: * common.opt.urls: Regenerate.
2025-08-20Provide new option -fdiagnostics-show-context=N for -Warray-bounds, ↵Qing Zhao23-100/+1099
-Wstringop-* warnings [PR109071,PR85788,PR88771,PR106762,PR108770,PR115274,PR117179] '-fdiagnostics-show-context[=DEPTH]' '-fno-diagnostics-show-context' With this option, the compiler might print the interesting control flow chain that guards the basic block of the statement which has the warning. DEPTH is the maximum depth of the control flow chain. Currently, The list of the impacted warning options includes: '-Warray-bounds', '-Wstringop-overflow', '-Wstringop-overread', '-Wstringop-truncation'. and '-Wrestrict'. More warning options might be added to this list in future releases. The forms '-fdiagnostics-show-context' and '-fno-diagnostics-show-context' are aliases for '-fdiagnostics-show-context=1' and '-fdiagnostics-show-context=0', respectively. For example: $ cat t.c extern void warn(void); static inline void assign(int val, int *regs, int *index) { if (*index >= 4) warn(); *regs = val; } struct nums {int vals[4];}; void sparx5_set (int *ptr, struct nums *sg, int index) { int *val = &sg->vals[index]; assign(0, ptr, &index); assign(*val, ptr, &index); } $ gcc -Wall -O2 -c -o t.o t.c t.c: In function ‘sparx5_set’: t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ [-Warray-bounds=] 12 | int *val = &sg->vals[index]; | ~~~~~~~~^~~~~~~ t.c:8:18: note: while referencing ‘vals’ 8 | struct nums {int vals[4];}; | ^~~~ In the above, Although the warning is correct in theory, the warning message itself is confusing to the end-user since there is information that cannot be connected to the source code directly. It will be a nice improvement to add more information in the warning message to report where such index value come from. With the new option -fdiagnostics-show-context=1, the warning message for the above testing case is now: $ gcc -Wall -O2 -fdiagnostics-show-context=1 -c -o t.o t.c t.c: In function ‘sparx5_set’: t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ [-Warray-bounds=] 12 | int *val = &sg->vals[index]; | ~~~~~~~~^~~~~~~ ‘sparx5_set’: events 1-2 4 | if (*index >= 4) | ^ | | | (1) when the condition is evaluated to true ...... 12 | int *val = &sg->vals[index]; | ~~~~~~~~~~~~~~~ | | | (2) warning happens here t.c:8:18: note: while referencing ‘vals’ 8 | struct nums {int vals[4];}; | ^~~~ PR tree-optimization/109071 PR tree-optimization/85788 PR tree-optimization/88771 PR tree-optimization/106762 PR tree-optimization/108770 PR tree-optimization/115274 PR tree-optimization/117179 gcc/ChangeLog: * Makefile.in (OBJS): Add diagnostic-context-rich-location.o. * common.opt (fdiagnostics-show-context): New option. (fdiagnostics-show-context=): New option. * diagnostic-context-rich-location.cc: New file. * diagnostic-context-rich-location.h: New file. * doc/invoke.texi (fdiagnostics-details): Add documentation for the new options. * gimple-array-bounds.cc (check_out_of_bounds_and_warn): Add one new parameter. Use rich location with details for warning_at. (array_bounds_checker::check_array_ref): Use rich location with ditails for warning_at. (array_bounds_checker::check_mem_ref): Add one new parameter. Use rich location with details for warning_at. (array_bounds_checker::check_addr_expr): Use rich location with move_history_diagnostic_path for warning_at. (array_bounds_checker::check_array_bounds): Call check_mem_ref with one more parameter. * gimple-array-bounds.h: Update prototype for check_mem_ref. * gimple-ssa-warn-access.cc (warn_string_no_nul): Use rich location with details for warning_at. (maybe_warn_nonstring_arg): Likewise. (maybe_warn_for_bound): Likewise. (warn_for_access): Likewise. (check_access): Likewise. (pass_waccess::check_strncat): Likewise. (pass_waccess::maybe_check_access_sizes): Likewise. * gimple-ssa-warn-restrict.cc (pass_wrestrict::execute): Calculate dominance info for diagnostics show context. (maybe_diag_overlap): Use rich location with details for warning_at. (maybe_diag_access_bounds): Use rich location with details for warning_at. gcc/testsuite/ChangeLog: * gcc.dg/pr109071.c: New test. * gcc.dg/pr109071_1.c: New test. * gcc.dg/pr109071_10.c: New test. * gcc.dg/pr109071_11.c: New test. * gcc.dg/pr109071_12.c: New test. * gcc.dg/pr109071_2.c: New test. * gcc.dg/pr109071_3.c: New test. * gcc.dg/pr109071_4.c: New test. * gcc.dg/pr109071_5.c: New test. * gcc.dg/pr109071_6.c: New test. * gcc.dg/pr109071_7.c: New test. * gcc.dg/pr109071_8.c: New test. * gcc.dg/pr109071_9.c: New test. * gcc.dg/pr117375.c: New test.
2025-08-20sra: Make build_ref_for_offset static [PR121568]Andrew Pinski2-5/+1
build_ref_for_offset was originally made external with r0-95095-g3f84bf08c48ea4. The call was extracted out into ipa_get_jf_ancestor_result by r0-110216-g310bc6334823b9. Then the call was removed by r10-7273-gf3280e4c0c98e1. So there is no use of build_ref_for_offset outside of SRA, so let's make it static again. Bootstrapped and tested on x86_64-linux-gnu. PR tree-optimization/121568 gcc/ChangeLog: * ipa-prop.h (build_ref_for_offset): Remove. * tree-sra.cc (build_ref_for_offset): Make static. Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2025-08-20Merge aarch64-cc-fusion into late-combineRichard Sandiford7-345/+208
I'd added the aarch64-specific CC fusion pass to fold a PTEST instruction into the instruction that feeds the PTEST, in cases where the latter instruction can set the appropriate flags as a side-effect. Combine does the same optimisation. However, as explained in the comments, the PTEST case often has: A: set predicate P based on inputs X B: clobber X C: test P and so the fusion is only possible if we move C before B. That's something that combine currently can't do (for the cases that we needed). The optimisation was never really AArch64-specific. It's just that, in an all-too-familiar fashion, we needed it in stage 3, when it was too late to add something target-independent. late-combine adds a convenient place to do the optimisation in a target-independent way, just as combine is a convenient place to do its related optimisation. gcc/ * config.gcc (aarch64*-*-*): Remove aarch64-cc-fusion.o from extra_objs. * config/aarch64/aarch64-passes.def (pass_cc_fusion): Delete. * config/aarch64/aarch64-protos.h (make_pass_cc_fusion): Delete. * config/aarch64/t-aarch64 (aarch64-cc-fusion.o): Delete. * config/aarch64/aarch64-cc-fusion.cc: Delete. * late-combine.cc (late_combine::optimizable_set): Take a set_info * rather than an insn_info * and move destination tests from... (late_combine::combine_into_uses): ...here. Take a set_info * rather an insn_info *. Take the rtx set. (late_combine::parallelize_insns, late_combine::combine_cc_setter) (late_combine::combine_insn): New member functions. (late_combine::m_parallel): New member variable. * rtlanal.cc (pattern_cost): Handle sets of CC registers in the same way as comparisons.