path: root/gcc
AgeCommit message (Collapse)AuthorFilesLines
2025-07-31c++: constexpr, array, private ctor [PR120800]Jason Merrill2-0/+25
Here cxx_eval_vec_init_1 wants to recreate the default constructor call that we previously built and threw away in build_vec_init_elt, but we aren't in the same access context at this point. Since we already checked access, let's just suppress access control here. Redoing overload resolution at constant evaluation time is sketchy, but should usually be fine for a default/copy constructor. PR c++/120800 gcc/cp/ChangeLog: * constexpr.cc (cxx_eval_vec_init_1): Suppress access control. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/constexpr-array30.C: New test.
2025-07-31Revert "Ada: Add System.C_Time and GNAT.C_Time units to libgnat"Eric Botcazou71-546/+2164
This reverts commit 41974d6ed349507ca1532629851b7b5d74f44abc.
2025-07-31Ada: Fix miscompilation of GNAT tools with -march=znver3Eric Botcazou1-4/+4
The throw and catch sides of the Ada exception machinery disagree about the BIGGEST_ALIGNMENT setting. gcc/ada/ PR ada/120440 * gcc-interface/Makefile.in (GNATLINK_OBJS): Add s-excmac.o. (GNATMAKE_OBJS): Likewise.
2025-07-31Ada: Add System.C_Time and GNAT.C_Time units to libgnatNicolas Boulenguez71-2164/+546
The first unit provides the time_t, timeval and timespec types corresponding to the C types defined by the OS, as well as various conversion functions. The second unit is a mere renaming of the first under the GNAT hierarchy. This removes C time types and conversions under System, and from bodies and private parts under GNAT, while keeping visible types and conversions under GNAT as Obsolescent. [changelog] PR ada/114065 * Makefile.rtl (GNATRTL_NONTASKING_OBJS): Add g-c_time$(objext) and s-c_time$(objext). (Aarch64/Android): Do not use s-osinte__android.adb. (SPARC/Solaris): Do not use s-osprim__solaris.adb. (x86/Solaris): Likewise. (LynxOS178): Do not use s-parame__posix2008.ads. (RTEMS): Likewise. (x32/Linux): Likewise, as well as s-linux__x32.ads. Replace s-osprim__x32.adb with s-osprim__posix.adb. (LIBGNAT_OBJS): Remove cal.o. * cal.c: Delete. * doc/gnat_rm/the_gnat_library.rst (GNAT.C_Time): New entry. (GNAT.Calendar): Do not mention the obsolete conversion functions. * impunit.adb (Non_Imp_File_Names_95): Add g-c_time. * libgnarl/a-exetim__posix.adb: Add with clause for System.C_Time (Clock): Use type and functions from System.C_Time. * libgnarl/s-linux.ads: Remove with clause for System.Parameters. Remove declarations of C time types. * libgnarl/s-linux__alpha.ads: Likewise. * libgnarl/s-linux__android-aarch64.ads: Likewise. * libgnarl/s-linux__android-arm.ads: Likewise. * libgnarl/s-linux__hppa.ads: Likewise. * libgnarl/s-linux__loongarch.ads: Likewise. * libgnarl/s-linux__mips.ads: Likewise. * libgnarl/s-linux__riscv.ads: Likewise. * libgnarl/s-linux__sparc.ads: Likewise. * libgnarl/s-osinte__aix.ads: Likewise. * libgnarl/s-osinte__android.ads: Likewise. * libgnarl/s-osinte__cheribsd.ads: Likewise. * libgnarl/s-osinte__darwin.ads: Likewise. * libgnarl/s-osinte__dragonfly.ads: Likewise. * libgnarl/s-osinte__freebsd.ads: Likewise. * libgnarl/s-osinte__gnu.ads: Likewise. * libgnarl/s-osinte__hpux.ads: Likewise. * libgnarl/s-osinte__kfreebsd-gnu.ads: Likewise. 
* libgnarl/s-osinte__linux.ads: Likewise. * libgnarl/s-osinte__lynxos178e.ads: Likewise. * libgnarl/s-osinte__qnx.ads: Likewise. * libgnarl/s-osinte__rtems.ads: Likewise. * libgnarl/s-osinte__solaris.ads: Likewise. * libgnarl/s-osinte__vxworks.ads: Likewise. * libgnarl/s-qnx.ads: Likewise. * libgnarl/s-linux__x32.ads: Delete. * libgnarl/s-osinte__darwin.adb (To_Duration): Remove. (To_Timespec): Likewise. * libgnarl/s-osinte__aix.adb: Likewise. * libgnarl/s-osinte__dragonfly.adb: Likewise. * libgnarl/s-osinte__freebsd.adb: Likewise. * libgnarl/s-osinte__gnu.adb: Likewise. * libgnarl/s-osinte__lynxos178.adb: Likewise. * libgnarl/s-osinte__posix.adb: Likewise. * libgnarl/s-osinte__qnx.adb: Likewise. * libgnarl/s-osinte__rtems.adb: Likewise. * libgnarl/s-osinte__solaris.adb: Likewise. * libgnarl/s-osinte__vxworks.adb: Likewise. * libgnarl/s-osinte__x32.adb: Likewise. * libgnarl/s-taprop__solaris.adb: Add with clause for System.C_Time. (Monotonic_Clock): Use type and functions from System.C_Time. (RT_Resolution): Likewise. (Timed_Sleep): Likewise. (Timed_Delay): Likewise. * libgnarl/s-taprop__vxworks.adb: Likewise. * libgnarl/s-tpopmo.adb: Likewise. * libgnarl/s-osinte__android.adb: Delete. * libgnat/g-c_time.ads: New file. * libgnat/g-calend.adb: Delegate to System.C_Time. * libgnat/g-calend.ads: Likewise. * libgnat/g-socket.adb: Likewise. * libgnat/g-socthi.adb: Likewise. * libgnat/g-socthi__vxworks.adb: Likewise. * libgnat/g-sothco.ads: Likewise. * libgnat/g-spogwa.adb: Likewise. * libgnat/s-c_time.adb: New file. * libgnat/s-c_time.ads: Likewise. * libgnat/s-optide.adb: Import nanosleep here. * libgnat/s-os_lib.ads (time_t): Remove. (To_Ada): Adjust. (To_C): Likewise. * libgnat/s-os_lib.adb: Likewise. * libgnat/s-osprim__darwin.adb: Delegate to System.C_Time. * libgnat/s-osprim__posix.adb: Likewise. * libgnat/s-osprim__posix2008.adb: Likewise. * libgnat/s-osprim__rtems.adb: Likewise. * libgnat/s-osprim__unix.adb: Likewise. * libgnat/s-osprim__solaris.adb: Delete. 
* libgnat/s-osprim__x32.adb: Likewise. * libgnat/s-parame.ads (time_t_bits): Remove. * libgnat/s-parame__hpux.ads: Likewise. * libgnat/s-parame__vxworks.ads: Likewise. * libgnat/s-parame__posix2008.ads: Delete. * s-oscons-tmplt.c (SIZEOF_tv_nsec): New constant.
2025-07-31c++: consteval blocksMarek Polacek14-39/+600
This patch implements consteval blocks, as specified by P2996. They aren't very useful without define_aggregate, but having a reviewed implementation on trunk would be great. consteval {} can be anywhere where a member-declaration or block-declaration can be. The expression corresponding to it is: [] -> void static consteval compound-statement () and it must be a constant expression. I've used cp_parser_lambda_expression to take care of most of the parsing. Since a consteval block can find itself in a template, we need a vehicle to carry the block for instantiation. Rather than inventing a new tree, I'm using STATIC_ASSERT. A consteval block can't return a value but that is checked by virtue of the lambda having a void return type. PR c++/120775 gcc/cp/ChangeLog: * constexpr.cc (cxx_eval_outermost_constant_expr): Use extract_call_expr. * cp-tree.h (CONSTEVAL_BLOCK_P, LAMBDA_EXPR_CONSTEVAL_BLOCK_P): Define. (finish_static_assert): Adjust declaration. (current_nonlambda_function): Likewise. * lambda.cc (current_nonlambda_function): New parameter. Only keep iterating if the function represents a consteval block. * parser.cc (cp_parser_lambda_expression): New parameter for consteval blocks. Use it. Set LAMBDA_EXPR_CONSTEVAL_BLOCK_P. (cp_parser_lambda_declarator_opt): Likewise. (build_empty_string): New. (cp_parser_next_tokens_are_consteval_block_p): New. (cp_parser_consteval_block): New. (cp_parser_block_declaration): Handle consteval blocks. (cp_parser_static_assert): Use build_empty_string. (cp_parser_member_declaration): Handle consteval blocks. * pt.cc (tsubst_stmt): Adjust a call to finish_static_assert. * semantics.cc (finish_fname): Warn for consteval blocks. (finish_static_assert): New parameter for consteval blocks. Set CONSTEVAL_BLOCK_P. Evaluate consteval blocks specially. gcc/testsuite/ChangeLog: * g++.dg/cpp26/consteval-block1.C: New test. * g++.dg/cpp26/consteval-block2.C: New test. * g++.dg/cpp26/consteval-block3.C: New test. 
* g++.dg/cpp26/consteval-block4.C: New test. * g++.dg/cpp26/consteval-block5.C: New test. * g++.dg/cpp26/consteval-block6.C: New test. * g++.dg/cpp26/consteval-block7.C: New test. * g++.dg/cpp26/consteval-block8.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>
2025-07-31RISC-V: Add testcases for signed avg ceil vx combinePan Li22-5/+293
The unsigned avg ceil share the vaaddx.vx for the vx combine, so add the test case to make sure it works well as expected. The below test suites are passed for this patch series. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check for signed avg ceil. * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i8.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c: Ditto. * gcc.target/riscv/rvv/autovec/vx_vf/vx_binary.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test data for run test. * gcc.target/riscv/rvv/autovec/vx_vf/vx_vaadd-run-2-i16.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vx_vaadd-run-2-i32.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vx_vaadd-run-2-i64.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vx_vaadd-run-2-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
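As a rough illustration of what "signed avg ceil" means at the source level, the sketch below computes (a + b + 1) >> 1 in a wider type and applies it in a loop with one vector operand and one scalar operand. The function names and the loop shape are hypothetical and are not taken from the testsuite files listed above.

```cpp
#include <cstdint>

// Signed "average, rounding up": (a + b + 1) >> 1.
// The operands are promoted to int, so the intermediate sum cannot overflow.
// Assumes an arithmetic right shift for negative values (guaranteed since
// C++20, and what GCC does in earlier modes).
constexpr std::int8_t avg_ceil_i8(std::int8_t a, std::int8_t b) {
    const int sum = a + b + 1;
    return static_cast<std::int8_t>(sum >> 1);
}

// Hypothetical loop shape: every element of `in` is averaged with the
// scalar `x`, which is the vector-scalar (vx) form of the operation.
inline void avg_ceil_loop(std::int8_t *out, const std::int8_t *in,
                          std::int8_t x, int n) {
    for (int i = 0; i < n; ++i)
        out[i] = avg_ceil_i8(in[i], x);
}
```

Widening before the addition is what keeps the result exact at the extremes, for example when both inputs are 127 or both are -128.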
2025-07-31vect: Don't set bogus bounds on epilogues [PR120805]Tamar Christina1-1/+1
The testcases in the PR are failing due to the code trying to set a vector range on an epilogue. However on epilogues the range doesn't make sense. In particular we are setting ranges to help niters analysis, but the epilogue doesn't iterate. Secondly the bounds variable hasn't been adjusted to vector iterations. In the epilogue this is calculated as

  <bb 13> [local count: 81467476]:
  # i_127 = PHI <tmp.7_131(10), 0(5)>
  # _132 = PHI <_133(10), 0(5)>
  _181 = (unsigned int) n_41(D);
  bnd.31_180 = _181 - _132;

where _133 = niters_vector_mult_vf.6_130; but _132 is a phi node, and if coming from the vector loop skip edge _181 will be <1, VF>. But this is a range VRP or Ranger can easily report due to the guard on the skip_vector loop. Previously, non-const VF would skip this code entirely due to the .is_constant() check. Non-partial vector loops would also skip it because the bounds would fold to a constant, so it doesn't enter the !gimple_value check. When support for partial vector ranges was added, this accidentally enabled ranges on partial vector epilogues. This patch now makes it explicit that ranges shouldn't be set for epilogues, as they don't seem to be useful anyway. gcc/ChangeLog: PR tree-optimization/120805 * tree-vect-loop-manip.cc (vect_gen_vector_loop_niters): Skip setting bounds on epilogues.
2025-07-31libgcc: Update FMV features to latest ACLE spec 2024Q4Wilco Dijkstra2-18/+19
Update FMV features to latest ACLE spec of 2024Q4 - several features have been removed or merged. Add FMV support for CSSC and MOPS. Preserve the ordering in enum CPUFeatures. gcc: * common/config/aarch64/cpuinfo.h: Remove unused features, add FEAT_CSSC and FEAT_MOPS. * config/aarch64/aarch64-option-extensions.def: Remove FMV support for RPRES, use PULL rather than AES, add FMV support for CSSC and MOPS. libgcc: * config/aarch64/cpuinfo.c (__init_cpu_features_constructor): Remove unused features, add support for CSSC and MOPS.
2025-07-31AArch64: Use correct cost for shifted halfword load/storesWilco Dijkstra1-1/+1
Since all Armv9 cores support shifted LDRH/STRH, use the correct cost of zero for these. gcc: * config/aarch64/tuning_models/generic_armv9_a.h (generic_armv9_a_addrcost_table): Use zero cost for himode.
2025-07-31Fixup wrong change to get_group_load_store_typeRichard Biener1-1/+1
The following fixes up the r16-2593-g6ac78317aa6adf change which made us match up a scalar with a vector type. Oops. Noticed when removing the gather/scatter pattern that creates the IFNs early. * tree-vect-stmts.cc (get_group_load_store_type): Properly compare the scalar type of the gather/scatter offset to the offset vector component type.
2025-07-31Extend gimple_fold_inplace APIRichard Biener2-6/+6
The following allows specifying the valueization hook to be used. * gimple-fold.h (fold_stmt_inplace): Add valueization hook argument, defaulted to no_follow_ssa_edges. * gimple-fold.cc (fold_stmt_inplace): Adjust.
2025-07-31cobol: Eliminate various errors. [PR120244]Robert Dubner3-13/+25
The following coding errors were located by running extended tests through valgrind. These changes repair the errors. gcc/cobol/ChangeLog: PR cobol/120244 * genapi.cc (get_level_88_domain): Increase array size for final byte. (psa_FldLiteralA): Use correct length in build_string_literal call. * scan.l: Use a loop instead of std::transform to avoid EOF overrun. * scan_ante.h (binary_integer_usage): Use a variable-length buffer.
2025-07-31i386: Fix typo in diagnostic about simultaneous regparm and thiscall useArtemiy Granat1-1/+1
gcc/ChangeLog: * config/i386/i386-options.cc (ix86_handle_cconv_attribute): Fix typo.
2025-07-31i386: Fix incorrect handling of simultaneous regparm and thiscall useArtemiy Granat2-7/+38
gcc/ChangeLog: * config/i386/i386-options.cc (ix86_handle_cconv_attribute): Handle simultaneous use of regparm and thiscall attributes in case when regparm is set before thiscall. gcc/testsuite/ChangeLog: * gcc.target/i386/attributes-error.c: Add more attributes combinations.
2025-07-31i386: Fix incorrect comment about stdcall and fastcall compatibilityArtemiy Granat1-3/+2
gcc/ChangeLog: * config/i386/i386-options.cc (ix86_handle_cconv_attribute): Fix comments which state that combination of stdcall and fastcall attributes is valid but redundant.
2025-07-31i386: Ignore regparm attribute and warn for it in 64-bit modeArtemiy Granat10-25/+69
The regparm attribute does not affect code generation on x86-64 target. Despite this, regparm was accepted silently, unlike other calling convention attributes handled in the ix86_handle_cconv_attribute function. Due to lack of diagnostics, Linux kernel attempted to specify regparm(0) on vmread_error_trampoline declaration, which is supposed to be invoked with all arguments on stack: https://lore.kernel.org/all/20220928232015.745948-1-seanjc@google.com/ To produce a warning for regparm in 64-bit mode, simply move the block that produces diagnostics above the block that handles the regparm attribute. gcc/ChangeLog: * config/i386/i386-options.cc (ix86_handle_cconv_attribute): Move 64-bit mode check before regparm handling. gcc/testsuite/ChangeLog: * g++.dg/abi/regparm1.C: Require ia32 target. * gcc.target/i386/20020224-1.c: Likewise. * gcc.target/i386/pr103785.c: Use regparm attribute only if not in 64-bit mode. * gcc.target/i386/pr36533.c: Likewise. * gcc.target/i386/pr59099.c: Likewise. * gcc.target/i386/sibcall-8.c: Likewise. * gcc.target/i386/sw-1.c: Likewise. * gcc.target/i386/pr15184-2.c: Fix invalid comment. * gcc.target/i386/attributes-ignore.c: New test.
2025-07-31tree-optimization/121320 - UBSAN error in ao_ref_init_from_vn_referenceRichard Biener1-2/+2
The multiplication by BITS_PER_UNIT should be done in poly_offset_int. PR tree-optimization/121320 * tree-ssa-sccvn.cc (ao_ref_init_from_vn_reference): Convert op->off to poly_offset_int before multiplying by BITS_PER_UNIT.
2025-07-31tree-optimization/121323 - UBSAN error in ao_ref_init_from_ptr_and_rangeRichard Biener1-1/+3
We should check the offset fits a HWI when multiplied to be in bits. PR tree-optimization/121323 * tree-ssa-alias.cc (ao_ref_init_from_ptr_and_range): Check the pointer offset fits in a HWI when represented in bits.
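The two UBSAN fixes above both concern converting a byte offset to a bit offset. A minimal sketch of the "check that it fits before multiplying" idea follows, assuming a 64-bit signed integer as a stand-in for HOST_WIDE_INT and 8 as a stand-in for BITS_PER_UNIT. The helper name is hypothetical, and the GCC sources use their own poly_int and wide-int helpers rather than this builtin.

```cpp
#include <cstdint>

// Stand-in for BITS_PER_UNIT (assumption: 8-bit units).
constexpr std::int64_t kBitsPerUnit = 8;

// Convert a byte offset to a bit offset.
// Returns false, leaving *bits untouched, when the product does not fit in
// a signed 64-bit integer. __builtin_mul_overflow is a GCC/Clang builtin
// that performs the multiplication without undefined behaviour.
inline bool bytes_to_bits(std::int64_t bytes, std::int64_t *bits) {
    std::int64_t result;
    if (__builtin_mul_overflow(bytes, kBitsPerUnit, &result))
        return false;
    *bits = result;
    return true;
}
```

A caller would treat a false return as "offset unknown" instead of using a wrapped value.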
2025-07-31testsuite: Add runtime test for FMV resolversYury Khrustalev1-0/+82
gcc/testsuite/ChangeLog: * g++.target/aarch64/mv-cpu-features.C: new test.
2025-07-31testsuite: Add tests for __init_cpu_features_constructorYury Khrustalev6-0/+118
Add tests that would call __init_cpu_features_resolver() directly from an ifunc resolver that would in turn call the function under test __init_cpu_features_constructor() using synthetic parameters for different sizes of the 2nd argument. gcc/testsuite/ChangeLog: * gcc.target/aarch64/ifunc-resolver.in: add core test functions. * gcc.target/aarch64/ifunc-resolver-0.c: new test. * gcc.target/aarch64/ifunc-resolver-1.c: ditto. * gcc.target/aarch64/ifunc-resolver-2.c: ditto. * gcc.target/aarch64/ifunc-resolver-3.c: ditto. * gcc.target/aarch64/ifunc-resolver-4.c: as above.
2025-07-31aarch64: Stop using sys/ifunc.h header in libatomic and libgccYury Khrustalev1-0/+12
This optional header is used to bring in the definition of the struct __ifunc_arg_t type. Since it has been added to glibc only recently, the previous implementation had to check whether this header is present and, if not, it provided its own definition. This creates dead code because either one of these two parts would not be tested. The ABI specification for ifunc resolvers allows creating one's own ABI-compatible definition for this type, which is the right way of doing it. In addition to improving consistency, the new approach also helps with addition of new fields to struct __ifunc_arg_t type without the need to work around situations when the definition imported from the header lacks these new fields. ABI allows defining as many hwcap fields in this struct as needed, provided that at runtime we only access the fields that are permitted by the _size value. gcc/ * config/aarch64/aarch64.cc (build_ifunc_arg_type): Add new fields _hwcap3 and _hwcap4. libatomic/ * config/linux/aarch64/host-config.h (__ifunc_arg_t): Remove sys/ifunc.h and add new fields _hwcap3 and _hwcap4. libgcc/ * config/aarch64/cpuinfo.c (__ifunc_arg_t): Likewise. (__init_cpu_features): obtain and assign values for the fields _hwcap3 and _hwcap4. (__init_cpu_features_constructor): check _size in the arg argument.
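The rule that only the fields covered by _size may be read can be sketched as below. The struct and helper names are hypothetical stand-ins, not the definitions added by this commit, and the layout (a size field followed by four 64-bit hwcap words) is an assumption made for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, locally defined stand-in for the resolver argument.
struct ifunc_arg_sketch {
    std::uint64_t size;    // number of valid bytes, filled in by the caller
    std::uint64_t hwcap;
    std::uint64_t hwcap2;
    std::uint64_t hwcap3;  // newer field, may lie beyond `size`
    std::uint64_t hwcap4;  // newer field, may lie beyond `size`
};

// Read hwcap3 only when the caller says the field is within the valid
// bytes; otherwise report "no features" instead of reading garbage.
inline std::uint64_t hwcap3_or_zero(const ifunc_arg_sketch &arg) {
    const std::size_t end = offsetof(ifunc_arg_sketch, hwcap3) + sizeof arg.hwcap3;
    return arg.size >= end ? arg.hwcap3 : 0;
}

// Same check for hwcap4.
inline std::uint64_t hwcap4_or_zero(const ifunc_arg_sketch &arg) {
    const std::size_t end = offsetof(ifunc_arg_sketch, hwcap4) + sizeof arg.hwcap4;
    return arg.size >= end ? arg.hwcap4 : 0;
}
```

With this shape, a definition that declares more fields than an older runtime provides stays safe, because the size check decides at run time which fields are meaningful.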
2025-07-31rs6000: Avoid undefined behavior caused by overflow and invalid shiftsKishan Parmar2-11/+16
While building GCC with --with-build-config=bootstrap-ubsan on powerpc64le-unknown-linux-gnu, multiple UBSAN runtime errors were encountered in rs6000.cc and rs6000.md due to undefined behavior involving left shifts on negative values and shift exponents equal to or exceeding the type width. The issue was in bit pattern recognition code (in can_be_rotated_to_negative_lis and can_be_built_by_li_and_rldic), where signed values were shifted without handling negative inputs or guarding against shift counts equal to the type width, causing UB. The fix ensures shifts and rotations are done in unsigned HOST_WIDE_INT, casting back only where needed (like for arithmetic right shifts), with proper guards to prevent shift-by-64. 2025-07-31 Kishan Parmar <kishan@linux.ibm.com> gcc: PR target/118890 * config/rs6000/rs6000.cc (can_be_rotated_to_negative_lis): Avoid left shift of negative value and guard shift count. (can_be_built_by_li_and_rldic): Likewise. (rs6000_emit_set_long_const): Likewise. * config/rs6000/rs6000.md (splitter for plus into two 16-bit parts): Fix UB from overflow in addition.
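The general pattern described above (shift in the unsigned type, guard the count, convert back only at the end) can be sketched as follows. This is an illustration with hypothetical helper names and a 64-bit integer standing in for HOST_WIDE_INT, not the code from rs6000.cc.

```cpp
#include <cstdint>

// Rotate left by n bits. The count is reduced modulo 64 and the n == 0
// case is handled separately, so no shift by 64 is ever executed.
constexpr std::uint64_t rotl64(std::uint64_t x, unsigned n) {
    n &= 63;
    return n == 0 ? x : (x << n) | (x >> (64 - n));
}

// Left-shift a possibly negative value: do the shift on the unsigned
// representation and convert back afterwards. A count of 64 or more is
// guarded and yields 0. The unsigned-to-signed conversion is modular
// (guaranteed since C++20, and what GCC does in earlier modes).
constexpr std::int64_t shl_signed(std::int64_t v, unsigned n) {
    return n >= 64
        ? 0
        : static_cast<std::int64_t>(static_cast<std::uint64_t>(v) << n);
}
```

Both helpers avoid the two kinds of undefined behaviour named in the commit message: a left shift of a negative value and a shift count equal to the type width.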
2025-07-31Add checks for node in aarch64 vector cost modelingRichard Biener1-1/+3
After removing STMT_VINFO_MEMORY_ACCESS_TYPE we now ICE when costing for scalar stmts required in the epilog since the cost model tries to pattern-match gathers (an earlier patch tried to improve this by introducing stmt groups, but that was on hold due to negative feedback). The following short-cuts those attempts when node is NULL as that then cannot be a vector stmt. Another possibility would be to gate on vect_body, or restructure everything. Note we now ensure that node is NULL when m_costing_for_scalar is set. * config/aarch64/aarch64.cc (aarch64_detect_vector_stmt_subtype): Check for node before dereferencing. (aarch64_vector_costs::add_stmt_cost): Likewise.
2025-07-31aarch64: Prevent streaming-compatible code from assembler rejection [PR121028]Spencer Abson4-6/+61
Streaming-compatible functions can be compiled without SME enabled, but need to use "SMSTART SM" and "SMSTOP SM" to temporarily switch into the streaming state of a callee. These switches are conditional on the current mode being opposite to the target mode, so no SME instructions are executed if SME is not available. However, in GAS, "SMSTART SM" and "SMSTOP SM" always require +sme. A call from a streaming-compatible function, compiled without SME enabled, to a non-streaming function will be rejected as: Error: selected processor does not support `smstop sm'.. To work around this, we make use of the .inst directive to insert the literal encodings of "SMSTART SM" and "SMSTOP SM". gcc/ChangeLog: PR target/121028 * config/aarch64/aarch64-sme.md (aarch64_smstart_sm): Use the .inst directive if !TARGET_SME. (aarch64_smstop_sm): Likewise. gcc/testsuite/ChangeLog: PR target/121028 * gcc.target/aarch64/sme/call_sm_switch_1.c: Tell check-function-bodies not to ignore .inst directives, and replace the test for "smstart sm" with one for its encoding. * gcc.target/aarch64/sme/call_sm_switch_11.c: Likewise. * gcc.target/aarch64/sme/pr121028.c: New test.
2025-07-31Remove STMT_VINFO_MEMORY_ACCESS_TYPERichard Biener5-38/+19
This should be present only on SLP nodes now. The RISC-V changes are mechanical along the line of the SLP_TREE_TYPE changes. * tree-vectorizer.h (_stmt_vec_info::memory_access_type): Remove. (STMT_VINFO_MEMORY_ACCESS_TYPE): Likewise. (vect_mem_access_type): Likewise. * tree-vect-stmts.cc (vectorizable_store): Do not set STMT_VINFO_MEMORY_ACCESS_TYPE. Fix SLP_TREE_MEMORY_ACCESS_TYPE usage. * tree-vect-loop.cc (update_epilogue_loop_vinfo): Remove checking of memory access type. * config/riscv/riscv-vector-costs.cc (costs::compute_local_live_ranges): Use SLP_TREE_MEMORY_ACCESS_TYPE. (costs::need_additional_vector_vars_p): Likewise. (segment_loadstore_group_size): Get SLP node as argument, use SLP_TREE_MEMORY_ACCESS_TYPE. (costs::adjust_stmt_cost): Pass down SLP node. * config/aarch64/aarch64.cc (aarch64_ld234_st234_vectors): Use SLP_TREE_MEMORY_ACCESS_TYPE instead of vect_mem_access_type. (aarch64_detect_vector_stmt_subtype): Likewise. (aarch64_vector_costs::count_ops): Likewise. (aarch64_vector_costs::add_stmt_cost): Likewise.
2025-07-31Do not bother with fake verifying of shared DRsRichard Biener1-4/+2
The following avoids comparing the shared DRs against their unmodified copy for epilogues during loop transform since they are actually modified by update_epilogue_loop_vinfo. Avoid the pointless faking of the original DRs there. * tree-vect-loop.cc (vect_transform_loop): Do not verify DRs have not been modified for epilogue loops. (update_epilogue_loop_vinfo): Do not copy modified DRs to the originals.
2025-07-31change get_best_mode args int -> HOST_WIDE_INT [PR121264]Jakub Jelinek3-2/+15
The following testcase is miscompiled, because byte 0x20000000 is bit 0x100000000 and ifcombine incorrectly combines the two loads into a BIT_FIELD_REF even when they are very far away. The problem is that gimple-fold.cc ifcombine uses get_best_mode heavily, and that function has just int bitsize and int bitpos arguments, so when called e.g. with if (get_best_mode (end_bit - first_bit, first_bit, 0, ll_end_region, ll_align, BITS_PER_WORD, volatilep, &lnmode)) where end_bit - first_bit doesn't fit into int, it is silently truncated. If there was just a single problematic get_best_mode call, I would probably just check for overflows in the caller, but there are many. And the two arguments are used solely as arguments to bit_field_mode_iterator constructor which has HOST_WIDE_INT arguments, so I think the easiest fix is just make the get_best_mode arguments also HOST_WIDE_INT. 2025-07-31 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/121264 * machmode.h (get_best_mode): Change type of first 2 arguments from int to HOST_WIDE_INT. * stor-layout.cc (get_best_mode): Likewise. * gcc.dg/tree-ssa/pr121264.c: New test.
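The arithmetic behind this bug can be checked directly. The sketch below uses the byte offset quoted in the commit message. The helper names are hypothetical, and the 32-bit narrowing is modelled with an unsigned type so that the result is well defined in every C++ mode.

```cpp
#include <cstdint>
#include <limits>

// Byte offset from the commit message and the same position in bits.
constexpr std::int64_t kByteOffset = 0x20000000;
constexpr std::int64_t kBitOffset = kByteOffset * 8;

// True when v is representable in a plain int parameter.
constexpr bool fits_in_int(std::int64_t v) {
    return v >= std::numeric_limits<int>::min() &&
           v <= std::numeric_limits<int>::max();
}

// What survives when only the low 32 bits are kept.
constexpr std::uint32_t low32(std::int64_t v) {
    return static_cast<std::uint32_t>(v);
}

static_assert(kBitOffset == 0x100000000LL, "byte 0x20000000 is bit 0x100000000");
static_assert(fits_in_int(kByteOffset), "the byte offset itself fits in int");
static_assert(!fits_in_int(kBitOffset), "the bit offset needs more than 32 bits");
static_assert(low32(kBitOffset) == 0, "narrowing makes the far offset look like 0");
```

Because the low 32 bits of the bit offset are all zero, a narrowed argument is indistinguishable from bit position 0, which is consistent with two far-apart loads being combined.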
2025-07-31aarch64: testsuite: Fix do-assemble tests for SMESpencer Abson13-4/+48
GCC doesn't support SME without SVE2, so the -march=armv8-a+<ext> argument to check_no_compiler_messages causes aarch64_asm_<ext>_ok to return zero for SME and any <ext> that implies it. This patch changes the baseline architecture to armv9-a for these extensions. The tests for ACLE SME2 intrinsics that require FEAT_FAMINMAX were configured to do-assemble if aarch64_asm_sme2_ok returned 1 (by default), but they really need to check if +faminmax is supported too. The fix above exposed this, so we also fix the do-assemble/do-compile choice for those tests here. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sme2/acle-asm/amax_f16_x2.c: Gate do-assemble on assembler support for +faminmax and +sme2. * gcc.target/aarch64/sme2/acle-asm/amax_f16_x4.c: Likewise. * gcc.target/aarch64/sme2/acle-asm/amax_f32_x2.c: Likewise. * gcc.target/aarch64/sme2/acle-asm/amax_f32_x4.c: Likewise. * gcc.target/aarch64/sme2/acle-asm/amax_f64_x2.c: Likewise. * gcc.target/aarch64/sme2/acle-asm/amax_f64_x4.c: Likewise. * gcc.target/aarch64/sme2/acle-asm/amin_f16_x2.c: Likewise. * gcc.target/aarch64/sme2/acle-asm/amin_f16_x4.c: Likewise. * gcc.target/aarch64/sme2/acle-asm/amin_f32_x2.c: Likewise. * gcc.target/aarch64/sme2/acle-asm/amin_f32_x4.c: Likewise. * gcc.target/aarch64/sme2/acle-asm/amin_f64_x2.c: Likewise. * gcc.target/aarch64/sme2/acle-asm/amin_f64_x4.c: Likewise. * lib/target-supports.exp: Split the extensions that require SME into a separate set, and use armv9-a as their baseline.
2025-07-31Fix comment typos - hanlde -> handleJakub Jelinek6-8/+8
2025-07-31 Jakub Jelinek <jakub@redhat.com> * gimple-ssa-store-merging.cc (find_bswap_or_nop): Fix comment typos, hanlde -> handle. * config/i386/i386.cc (ix86_gimple_fold_builtin, ix86_rtx_costs): Likewise. * config/i386/i386-features.cc (remove_partial_avx_dependency): Likewise. * gcc.target/i386/apx-1.c (apx_hanlder): Rename to ... (apx_handler): ... this. * gcc.target/i386/uintr-2.c (UINTR_hanlder): Rename to ... (UINTR_handler): ... this. * gcc.target/i386/uintr-5.c (UINTR_hanlder): Rename to ... (UINTR_handler): ... this.
2025-07-31Disallow scan-store vectorization in epiloguesRichard Biener1-1/+2
The following disallows vectorizing epilogues containing scan-stores. Since code generation works by walking gimple stmts it is not ready for this when cleaning up epilogue vectorization. I believe scan-store vectorization needs most of the work done during SLP discovery to reflect the data flow. * tree-vect-stmts.cc (check_scan_store): Remove redundant slp_node check. Disallow epilogue vectorization.
2025-07-31Avoid passing vectype != NULL when costing scalar ILRichard Biener1-0/+8
The following makes sure to not leak a set vectype on a stmt when doing scalar IL costing as this can confuse vector cost models which do not look at m_costing_for_scalar most of the time. * tree-vectorizer.h (vector_costs::costing_for_scalar): New accessor. (add_stmt_cost): For scalar costing force vectype to NULL. Verify we do not pass in a SLP node.
2025-07-31RISC-V: Adding H to the canonical order [PR121312]Kito Cheng1-1/+1
We added H into canonical order before, but forgot to add it to arch-canonicalize as well... gcc/ChangeLog: PR target/121312 * config/riscv/arch-canonicalize: Add H extension to the canonical order.
2025-07-31Daily bump.GCC Administrator5-1/+396
2025-07-31c++: Don't assume trait funcs return error_mark_node when tf_error is passed [PR121291]Nathaniel Shead5-30/+52
For the sake of determining if there are other errors in user code to report early, many trait functions don't always return error_mark_node if not called in a SFINAE context (i.e., tf_error is set). This patch removes some assumptions on this behaviour I'd made when improving diagnostics of builtin traits. PR c++/121291 gcc/cp/ChangeLog: * constraint.cc (diagnose_trait_expr): Remove assumption about failures returning error_mark_node. * except.cc (explain_not_noexcept): Allow expr not being noexcept. * method.cc (build_invoke): Adjust comment. (is_trivially_xible): Always note non-trivial components if expr is not null or error_mark_node. (is_nothrow_xible): Likewise for non-noexcept components. (is_nothrow_convertible): Likewise. gcc/testsuite/ChangeLog: * g++.dg/ext/is_invocable7.C: New test. * g++.dg/ext/is_nothrow_convertible5.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Patrick Palka <ppalka@redhat.com>
2025-07-30c++: improve non-constant template arg diagnosticJason Merrill4-26/+45
A conversation today pointed out that the current diagnostic for this case doesn't mention the constant evaluation failure, it just says e.g. "'p' is not a valid template argument for 'int*' because it is not the address of a variable" This patch improves the diagnostic in C++17 and above (post-N4268) to diagnose failed constant-evaluation. gcc/cp/ChangeLog: * pt.cc (convert_nontype_argument_function): Check cxx_constant_value on failure. (invalid_tparm_referent_p): Likewise. gcc/testsuite/ChangeLog: * g++.dg/tc1/dr49.C: Adjust diagnostic. * g++.dg/template/func2.C: Likewise. * g++.dg/cpp1z/nontype8.C: New test.
2025-07-30simplify-rtx: Add `(subreg (not a))` simplification for word_mode [PR121308]Andrew Pinski1-0/+9
Right now in simplify_subreg, there is code to try to simplify for word_mode with the binary bitwise operators. The unary bitwise operator is not handled; this causes an odd mismatch that the new self-testing code added with r16-2614-g965564eafb721f was not expecting. The self-testing code was for testing the newly added code, but since there was already code that handles word_mode, we hit the mismatch, though only for targets where word_mode is SImode (or smaller). This adds the code to handle `not` in a similar fashion as the other bitwise operators for word_mode. Changes since v1: * v2: add `&& SCALAR_INT_MODE_P (innermode)` to the conditional. Bootstrapped and tested on x86_64-linux-gnu. PR rtl-optimization/121308 gcc/ChangeLog: * simplify-rtx.cc (simplify_context::simplify_subreg): Handle subreg of `not` with word_mode to make it symmetric with the other bitwise operators.
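A rewrite of a subreg of `not` presumably relies on the fact that taking the low part commutes with bitwise complement. The sketch below checks that identity on plain integers, with a 64-bit value standing in for the inner mode and its low 32 bits standing in for a word_mode subreg. The function names are hypothetical, and this is not RTL.

```cpp
#include <cstdint>

// Low 32 bits of the complement: models (subreg (not a)).
constexpr std::uint32_t low_of_not(std::uint64_t a) {
    return static_cast<std::uint32_t>(~a);
}

// Complement of the low 32 bits: models (not (subreg a)).
constexpr std::uint32_t not_of_low(std::uint64_t a) {
    return ~static_cast<std::uint32_t>(a);
}

static_assert(low_of_not(0) == not_of_low(0), "holds for zero");
static_assert(low_of_not(0x123456789ABCDEF0ULL) == not_of_low(0x123456789ABCDEF0ULL),
              "holds for a mixed bit pattern");
static_assert(low_of_not(~0ULL) == not_of_low(~0ULL), "holds for all ones");
```

The same bit-by-bit argument applies to and, ior and xor, which is why treating `not` like the binary bitwise operators keeps the simplification symmetric.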
2025-07-30IFCVT: Fix factor_out_operators correctly for more than 1 phi [PR121295]Andrew Pinski3-0/+40
r16-2590-ga51bf9e10182cf was not the correct fix for this in the end. Instead a much simpler and localized fix is needed: just change the phi that is being worked on with the new result and arguments that come from the factored out operator. This solves the issue of not having result in the IR and causing issues that way. Bootstrapped and tested on x86_64-linux-gnu. Note this depends on reverting r16-2590-ga51bf9e10182cf. PR tree-optimization/121236 PR tree-optimization/121295 gcc/ChangeLog: * tree-if-conv.cc (factor_out_operators): Change the phi node to the new result and args. gcc/testsuite/ChangeLog: * gcc.dg/torture/pr121236-1.c: New test. * gcc.dg/torture/pr121295-1.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2025-07-30Revert "ifcvt: Fix ifcvt for multiple phi nodes after factoring operator [PR121236]"Andrew Pinski2-55/+25
This reverts commit a51bf9e10182cf7ac858db0ea6c5cb11b4f12377.
2025-07-30Report read errors when reading auto-profileJan Hubicka3-2/+13
Currently -fauto-profile will happily read a truncated file without any warning and interpret it as a zero profile, which will in turn result in slow code. This patch exports gcov_is_error and adds checks so truncated files are detected. gcc/ChangeLog: * auto-profile.cc (string_table::read): Check gcov_is_error. (read_profile): Likewise. * gcov-io.cc (gcov_is_error): Export for gcc linkage. * gcov-io.h (gcov_is_error): Declare.
2025-07-30[x86] factor out worker from ix86_builtin_vectorization_costRichard Biener1-18/+23
The following factors out a worker that gets a mode argument rather than a vectype argument. That makes a difference when we hit the fallback in add_stmt_cost for scalar stmts where vectype might be NULL and thus mode is derived from the scalar stmt there. But ix86_builtin_vectorization_cost does not have access to the stmt. So the patch instead dispatches to the new ix86_default_vector_cost there, passing down the mode we derived from the stmt. This is to avoid regressions with a patch that makes even more scalar stmt costings have a vectype passed. * config/i386/i386.cc (ix86_default_vector_cost): Split out from ... (ix86_builtin_vectorization_cost): ... this and use mode instead of vectype as argument. (ix86_vector_costs::add_stmt_cost): Call ix86_default_vector_cost instead of ix86_builtin_vectorization_cost.
2025-07-30s390: Implement spaceship optab [PR117015]Stefan Schulze Frielinghaus10-0/+381
gcc/ChangeLog: PR target/117015 * config/s390/s390-protos.h (s390_expand_int_spaceship): New function. (s390_expand_fp_spaceship): New function. * config/s390/s390.cc (s390_expand_int_spaceship): New function. (s390_expand_fp_spaceship): New function. * config/s390/s390.md (spaceship<mode>4): New expander. gcc/testsuite/ChangeLog: * gcc.target/s390/spaceship-fp-1.c: New test. * gcc.target/s390/spaceship-fp-2.c: New test. * gcc.target/s390/spaceship-fp-3.c: New test. * gcc.target/s390/spaceship-fp-4.c: New test. * gcc.target/s390/spaceship-int-1.c: New test. * gcc.target/s390/spaceship-int-2.c: New test. * gcc.target/s390/spaceship-int-3.c: New test.
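For readers unfamiliar with the operation, a three-way ("spaceship") comparison can be written in plain C++ as below. The function names are hypothetical, and returning 2 for the unordered floating-point case is an assumption made for this sketch; it says nothing about which instructions the s390 expanders emit.

```cpp
#include <cassert>
#include <cmath>

// Integer three-way comparison: -1 if a < b, 0 if equal, 1 if a > b.
int spaceship_int(int a, int b)
{
  return (a > b) - (a < b);
}

// Floating-point three-way comparison. NaN operands compare unordered;
// this sketch reports that case as 2 (an assumption, see the lead-in).
int spaceship_fp(double a, double b)
{
  if (std::isnan(a) || std::isnan(b))
    return 2;
  return (a > b) - (a < b);
}
```

The integer and floating-point cases differ only in the extra unordered outcome, which is why the patch adds separate integer and floating-point expansion functions.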
2025-07-30cprop: Allow jump bypassing for single set insnsStefan Schulze Frielinghaus1-6/+18
During jump bypassing also consider insns of the form (insn 25 57 26 9 (parallel [ (set (reg:CCZ 33 %cc) (compare:CCZ (reg:SI 60 [ _9 ]) (const_int 0 [0]))) (clobber (scratch:SI)) ]) "spaceship-fp-4.c":27:1 1746 {*tstsi_cconly_extimm} (nil)) by testing for a single set insn during bypass_conditional_jumps(). This is a requirement for test gcc.target/s390/spaceship-fp-4.c of the subsequent commit. In order to silence cprop.cc:1621:40: error: 'setcc_dest' may be used uninitialized [-Werror=maybe-uninitialized] 1621 | src = simplify_replace_rtx (src, setcc_dest, setcc_src); | ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~ initialize setcc_{dest,src} in bypass_block() although this is not really required. gcc/ChangeLog: * cprop.cc (bypass_block): Extract single set. (bypass_conditional_jumps): Ditto.
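The idea of a "single set" insn, such as the compare-plus-clobber parallel quoted above, can be modelled with a small toy. Everything here is hypothetical (`pattern_elem`, `toy_single_set`); GCC's real helper operates on RTL and handles more cases than this sketch does.

```cpp
#include <cassert>
#include <vector>

// Toy model of the elements inside a PARALLEL.
enum class elem_kind { set, clobber };

struct pattern_elem
{
  elem_kind kind;
  int dest;   // register number being written (or clobbered)
  int src;    // register number being read, -1 if not applicable
};

// Returns the only SET when the pattern contains exactly one SET and
// everything else is a CLOBBER; returns nullptr otherwise.
const pattern_elem *toy_single_set(const std::vector<pattern_elem> &parallel)
{
  const pattern_elem *found = nullptr;
  for (const pattern_elem &e : parallel)
    if (e.kind == elem_kind::set)
      {
        if (found)
          return nullptr;   // a second SET: not a single-set pattern
        found = &e;
      }
  return found;
}
```

A pattern made of one set of the condition-code register plus a scratch clobber passes this test, whereas matching only a bare SET would reject it.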
2025-07-30x86: Transform to "pushq $-1; popq reg" for -OzH.J. Lu2-1/+12
commit 4c80062d7b8c272e2e193b8074a8440dbb4fe588 Author: H.J. Lu <hjl.tools@gmail.com> Date: Sun May 25 07:40:29 2025 +0800 x86: Enable *mov<mode>_(and|or) only for -Oz disabled transformation from "movq $-1,reg" to "pushq $-1; popq reg" for -Oz. But for legacy integer registers, the former is 4 bytes and the latter is 3 bytes. Enable such transformation for -Oz. gcc/ PR target/120427 * config/i386/i386.md (peephole2): Transform "movq $-1,reg" to "pushq $-1; popq reg" for -Oz if reg is a legacy integer register. gcc/testsuite/ PR target/120427 * gcc.target/i386/pr120427-5.c: New test. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
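The size argument can be checked by writing out the instruction bytes. The sketch below assumes that the 4-byte form mentioned above corresponds to the `orq $-1, %rax` encoding; the array and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// orq $-1, %rax      : REX.W 83 /1 ib  (assumed to be the 4-byte form)
const unsigned char or_minus1_rax[] = { 0x48, 0x83, 0xc8, 0xff };

// pushq $-1; popq %rax : 6A ib, then 58+rd
const unsigned char push_pop_rax[] = { 0x6a, 0xff, 0x58 };

// pushq $-1; popq %r8  : popping r8..r15 needs an extra REX.B prefix
const unsigned char push_pop_r8[] = { 0x6a, 0xff, 0x41, 0x58 };

// Number of bytes in an encoding, deduced from the array type.
template <std::size_t N>
std::size_t encoded_size(const unsigned char (&)[N])
{
  return N;
}
```

For `%rax` the push/pop pair saves one byte, while for `%r8` the extra prefix removes the saving, which is consistent with restricting the peephole to legacy integer registers.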
2025-07-30auto-profile fixesJan Hubicka1-3/+9
This patch silences the warning about a bad location in function_instance::match, which is issued when the profile contains records for line numbers that are not matched by the function body. While this is a bogus profile (and we will end up losing the profile data), create_gcov does not have enough information to output them correctly in all contexts, since in dwarf5 we output multiple locations per single instruction (possibly coming from different inlines) while it can only represent one inline stack. The patch also fixes an issue with profile scaling. By making force_nonzero take cutoffs into account, I made the test for the counter being non-zero before scaling too aggressive. gcc/ChangeLog: * auto-profile.cc (function_instance::match): Disable warning about bogus locations since dwarf does not represent enough info to output them correctly in all cases. (add_scale): Use nonzero_p instead of orig.force_nonzero () == orig. (afdo_adjust_guessed_profile): Add missing newline in dump file.
2025-07-30Fix symbol_table::change_decl_assembler_name when DECL_RTL is already computedJan Hubicka1-0/+5
While working on a patch assigning unique names to static symbols I noticed that Fortran symbols are not renamed since the frontend calls make_decl_rtl. This gets DECL_ASSEMBLER_NAME and DECL_RTL out of sync. I think we can drop that call, but it is also a good idea to avoid this inconsistency, so this patch makes symbol_table::change_decl_assembler_name recompute DECL_RTL in this case. gcc/ChangeLog: * symtab.cc (symbol_table::change_decl_assembler_name): Recompute DECL_RTL in case it is already computed.
2025-07-30Fix false profile inconsistency errorJan Hubicka2-2/+60
This patch fixes a false inconsistent profile error message seen when building SPEC with -fprofile-use -fdump-ipa-profile. The problem is that with dumping, tree_estimate_probability is run in dry run mode to report success rates of heuristics. It however runs determine_unlikely_bbs, which overwrites some counts to profile_count::zero, and later value profiling sees the mismatch. In a sane profile determine_unlikely_bbs should almost always be a no-op since it should only drop to 0 things that are known to be unlikely executed. What happens here is that there is a comdat where the profile is lost and we see a call with non-zero count calling a function with zero count and "fix" the profile by making the call have zero count, too. I also extended the unlikely predicates to avoid tampering with predictions when the prediction is believed to be reliable. This also keeps us from dropping all EH regions to 0 count, as tested by the testcase. gcc/ChangeLog: * predict.cc (unlikely_executed_edge_p): Ignore EDGE_EH if profile is reliable. (unlikely_executed_stmt_p): Special case builtin_trap/unreachable and ignore other heuristics for reliable profiles. (tree_estimate_probability): Disable unlikely bb detection when doing dry run. gcc/testsuite/ChangeLog: * g++.dg/tree-prof/eh1.C: New test.
2025-07-30vect: Add target hook to prefer gather/scatter instructionsAndrew Stubbs7-10/+68
For AMD GCN, the instructions available for loading/storing vectors are always scatter/gather operations (i.e. there are separate addresses for each vector lane), so the current heuristic to avoid gather/scatter operations with too many elements in get_group_load_store_type is counterproductive. Avoiding such operations in that function can subsequently lead to a missed vectorization opportunity whereby later analyses in the vectorizer try to use a very wide array type which is not available on this target, and thus it bails out. This patch adds a target hook to override the "single_element_p" heuristic in that function, and activates it for GCN. This allows much better code to be generated for affected loops. Co-authored-by: Julian Brown <julian@codesourcery.com> gcc/ * doc/tm.texi.in (TARGET_VECTORIZE_PREFER_GATHER_SCATTER): Add documentation hook. * doc/tm.texi: Regenerate. * target.def (prefer_gather_scatter): Add target hook under vectorizer. * hooks.cc (hook_bool_mode_int_unsigned_false): New function. * hooks.h (hook_bool_mode_int_unsigned_false): New prototype. * tree-vect-stmts.cc (vect_use_strided_gather_scatters_p): Add parameters group_size and single_element_p, and rework to use targetm.vectorize.prefer_gather_scatter. (get_group_load_store_type): Move some of the condition into vect_use_strided_gather_scatters_p. * config/gcn/gcn.cc (gcn_prefer_gather_scatter): New function. (TARGET_VECTORIZE_PREFER_GATHER_SCATTER): Define hook.
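The general shape of a hook that lets a target override a generic heuristic might look like the following. This is a simplified sketch, not the actual vectorizer logic: the type and function names are hypothetical, and the real hook takes different arguments.

```cpp
#include <cassert>

// Hypothetical hook type: asks the target whether it prefers
// gather/scatter for a group of the given size.
using prefer_gather_scatter_fn = bool (*)(unsigned group_size);

// Default for most targets: no preference, keep the generic heuristic.
bool default_prefer_gather_scatter(unsigned)
{
  return false;
}

// A target whose vector memory operations are all gather/scatter.
bool always_prefer_gather_scatter(unsigned)
{
  return true;
}

// Toy decision: the generic heuristic only accepts gather/scatter for
// single-element groups, unless the target hook says otherwise.
bool use_strided_gather_scatter(prefer_gather_scatter_fn hook,
                                unsigned group_size,
                                bool single_element_p)
{
  if (hook(group_size))
    return true;
  return single_element_p;
}
```

With the default hook the behaviour is unchanged, so only a target that installs its own hook sees different code generation.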
2025-07-30Don't pass vector params through to offload targetsAndrew Stubbs3-5/+26
The optimization options are deliberately passed through to the LTO compiler, but when the same mechanism is reused for offloading it ends up forcing the host compiler settings onto the device compiler. Maybe this should be removed completely, but this patch just fixes a few of them. In particular, param_vect_partial_vector_usage is disabled by x86 and this really hurts amdgcn. I also fixed an ambiguous else warning in the generated file by adding braces. gcc/ChangeLog: * config/gcn/gcn.cc (gcn_option_override): Add note to set default for param_vect_partial_vector_usage to "1". * optc-save-gen.awk: Don't pass through options marked "NoOffload". * params.opt (-param=vect-epilogues-nomask): Add NoOffload. (-param=vect-partial-vector-usage): Likewise. (-param=vect-inner-loop-cost-factor): Likewise.
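The filtering idea can be sketched as follows. The real logic lives in code generated by optc-save-gen.awk; the struct and function below are hypothetical stand-ins that only show how a per-option "NoOffload" mark keeps host-specific settings away from the device compiler.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of a saved option.
struct option_desc
{
  std::string name;
  bool no_offload;   // true if the option must not reach offload compilers
};

// Returns the names of the options to hand to the next compiler.
// When that compiler is an offload target, options marked no_offload
// are dropped; for ordinary LTO everything is passed through.
std::vector<std::string> options_to_pass(const std::vector<option_desc> &opts,
                                         bool for_offload)
{
  std::vector<std::string> result;
  for (const option_desc &o : opts)
    {
      if (for_offload && o.no_offload)
        continue;
      result.push_back(o.name);
    }
  return result;
}
```

Marking individual options this way leaves the pass-through for host LTO untouched, which matches the patch's choice to fix a few parameters rather than remove the mechanism.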
2025-07-30tree-optimization/121130 - vectorizable_call cannot handle .MASK_CALLRichard Biener2-1/+18
The following makes it correctly reject them; vectorizable_simd_clone_call is solely responsible for them. PR tree-optimization/121130 * tree-vect-stmts.cc (vectorizable_call): Bail out for .MASK_CALL. * gcc.dg/vect/vect-simd-pr121130.c: New testcase.
2025-07-30c++: Make __extension__ silence -Wlong-long pedwarns/warnings [PR121133]Jakub Jelinek5-11/+52
The PR13358 r0-92909 change changed the diagnostics on long long in C++ (either with -std=c++98 or -Wlong-long), but unlike the C FE we unfortunately warn even in the __extension__ long long a; etc. cases. The C FE in that case in disable_extension_diagnostics saves and clears not just pedantic flag but also warn_long_long (and several others), while C++ FE only temporarily disables pedantic. The following patch makes it behave like the C FE in this regard, though (__extension__ 1LL) still doesn't work because of the separate lexing (and I must say I have no idea how to fix that). Or do you prefer a solution closer to the C FE, cp_parser_extension_opt saving the values into a bitfield and have another function to restore the state (or use RAII)? 2025-07-30 Jakub Jelinek <jakub@redhat.com> PR c++/121133 * parser.cc (cp_parser_unary_expression): Adjust cp_parser_extension_opt caller and restore warn_long_long. (cp_parser_declaration): Likewise. (cp_parser_block_declaration): Likewise. (cp_parser_member_declaration): Likewise. (cp_parser_extension_opt): Add SAVED_LONG_LONG argument, save previous warn_long_long state into it and clear it for __extension__. * g++.dg/warn/pr121133-1.C: New test. * g++.dg/warn/pr121133-2.C: New test. * g++.dg/warn/pr121133-3.C: New test. * g++.dg/warn/pr121133-4.C: New test.
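A minimal example of the declaration forms this change is about is shown below. `__extension__` is a GNU extension, so the snippet assumes a GCC-compatible compiler; the names `ext_ll_t`, `ext_counter` and `twice` are invented for the illustration.

```cpp
#include <cassert>

// Declarations prefixed with __extension__. With -std=c++98 and
// -Wlong-long (or -pedantic), these are the forms that the patch
// stops diagnosing.
__extension__ typedef long long ext_ll_t;
__extension__ long long ext_counter = 1;

// Uses the typedef only, so no further "long long" spelling appears.
ext_ll_t twice(ext_ll_t v)
{
  return v * 2;
}
```

The sketch deliberately avoids `LL` literals, since the message notes that `(__extension__ 1LL)` is still diagnosed because of the separate lexing.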