riscv-gnu-toolchain/gcc.git

Age	Commit message	Author	Files	Lines
2025-09-02	testsuite: Fix gcc.dg/tree-ssa/cswtch-[67].c on Solaris/SPARC with as	Rainer Orth	2	-2/+2
	The gcc.dg/tree-ssa/cswtch-[67].c tests FAIL on Solaris/SPARC with the native as: FAIL: gcc.dg/tree-ssa/cswtch-6.c scan-assembler .rodata.cst16 FAIL: gcc.dg/tree-ssa/cswtch-7.c scan-assembler .rodata.cst32 The issue is the same in both cases: compared to the gas version, with as there's only - .section .rodata.cst32,"aM",@progbits,32 + .section ".rodata" It turns out that varasm.c (mergeable_constant_section) only emits the former if HAVE_GAS_SHF_MERGE, which is 0 with the native as. Fixed by xfailing the tests in this case. Tested on sparc-sun-solaris2.11 with both as and gas. 2025-07-30 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/testsuite: * gcc.dg/tree-ssa/cswtch-6.c (dg-final): xfail on sparc*-*-solaris2* && !gas. * gcc.dg/tree-ssa/cswtch-7.c: Likewise.
2025-09-02	RISC-V: Remove unused print_ext_doc_entry function [NFC]	Kito Cheng	1	-44/+0
	The print_ext_doc_entry function and associated version_t struct in gen-riscv-ext-opt.cc were not being used anywhere in the codebase. Remove them to clean up the code. gcc/ * config/riscv/gen-riscv-ext-opt.cc (version_t): Remove unused struct. (print_ext_doc_entry): Remove unused function.
2025-09-01	Testsuite: Don't test vector-compare-1.C on strict alignment targets	Andrew Pinski	1	-1/+1
	This testcase will fail on strict alignment targets because it requires a possibly unaligned load. This fixes that. Note this testcase still fails on arm (and maybe riscv) targets: they do have unaligned loads, but slow ones. Pushed as obvious after testing on x86_64-linux-gnu to make sure the test still runs there. gcc/testsuite/ChangeLog: * g++.dg/tree-ssa/vector-compare-1.C: Restrict to non_strict_align targets. Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2025-09-02	Daily bump.	GCC Administrator	6	-1/+248

2025-09-01	install: Fix spelling of "support" and "arithmetic"	Jonathan Grant	1	-7/+7
	gcc: * doc/install.texi (Configuration): Fix spelling of "support" and "floating-point arithmetic". Signed-off-by: Jonathan Grant <jg@jguk.org>
2025-09-01	Fix assertion when trying to represent Ada arrays in CodeView	Mark Harmstone	1	-0/+13
	The LF_ARRAY CodeView type represents a C- or C++-style array, with a length known at compile time. We were crashing when using -gcodeview with Ada (bug #121157), as the DW_AT_upper_bound value is not an unsigned integer but something more complicated: 0x00000123: DW_TAG_array_type DW_AT_type (0x0000014d "character") DW_AT_sibling (0x00000142) 0x0000012c: DW_TAG_subrange_type DW_AT_type (0x00000142 "integer") DW_AT_lower_bound (DW_OP_push_object_address, DW_OP_plus_uconst 0x8, DW_OP_deref, DW_OP_deref_size 0x4) DW_AT_upper_bound (DW_OP_push_object_address, DW_OP_plus_uconst 0x8, DW_OP_deref, DW_OP_plus_uconst 0x4, DW_OP_deref_size 0x4) It doesn't look like we can represent Ada arrays in CodeView, so return 0 in get_type_num_array_type so that they come through as an unknown type. gcc/ * dwarf2codeview.cc (get_type_num_array_type): Don't try to encode non-C-style arrays.
2025-09-01	maintainer-scripts: Improve syncing of libstdc++ docs	Gerald Pfeifer	1	-3/+3
	rsync generally is a more commonly used tool for syncing data - among others it retains time stamps and is able to remove orphaned files on the receiver side. We just need to exclude some directories and a symlink from being removed as "orphaned", since they originate elsewhere. maintainer-scripts: * update_web_docs_libstdcxx_git: Copy our "inner" documentation into the web area using rsync instead of cpio and remove orphaned files.
2025-09-01	c: Implement C2Y N3457 - The __COUNTER__ predefined macro	Jakub Jelinek	2	-0/+47
	The following patch implements the https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3457.htm paper without the first 3 lines in Recommended practice. Seems GCC behavior already matches the expected behavior except for diagnostics of more than 2147483648 __COUNTER__ expansions, so the patch adds a diagnostic for that (but not testcase because #define A __COUNTER__ __COUNTER__ __COUNTER__ __COUNTER__ __COUNTER__ __COUNTER__ __COUNTER__ __COUNTER__ #define B A A A A A A A A #define C B B B B B B B B #define D C C C C C C C C #define E D D D D D D D D #define F E E E E E E E E #define G F F F F F F F F #define H G G G G G G G G #define I H H H H H H H H #define J I I I I I I I I J J J J __COUNTER__ just takes too long to preprocess). Plus I've included all the snippets from the paper into one testcase. 2025-09-01 Jakub Jelinek <jakub@redhat.com> * macro.cc: Implement C2Y N3457 - The __COUNTER__ predefined macro. (_cpp_builtin_macro_text): Diagnose if __COUNTER__ reaches 2147483648 value. * gcc.dg/cpp/c2y-counter-1.c: New test.
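As a rough illustration of the behaviour the paper standardises, the sketch below shows `__COUNTER__` yielding a fresh integer on each expansion, together with the common unique-identifier idiom built on it. The helper names (`CONCAT`, `UNIQUE_NAME`, `counter_step`) are made up for this example and are not taken from the patch or its testcase. It is written in C++, where GCC provides the same macro as an extension.

```cpp
#include <cassert>

// Sketch only: all helper names here are hypothetical, not from the GCC patch.
#define CONCAT_(a, b) a##b
#define CONCAT(a, b) CONCAT_(a, b)   // extra level so __COUNTER__ expands before pasting
#define UNIQUE_NAME(prefix) CONCAT(prefix, __COUNTER__)

// Each expansion of __COUNTER__ yields the previous value plus one.
constexpr int first_value = __COUNTER__;
constexpr int second_value = __COUNTER__;
static_assert(second_value == first_value + 1,
              "consecutive expansions differ by one");

// Two declarations produced by the same macro get distinct identifiers.
int UNIQUE_NAME(tmp_) = 1;
int UNIQUE_NAME(tmp_) = 2;

// Difference between two consecutive expansions.
constexpr int counter_step() { return second_value - first_value; }

// Runtime check of the same property.
inline void check_counter() { assert(counter_step() == 1); }
```

The example compares relative values only, since headers included earlier in a translation unit may already have expanded `__COUNTER__`.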
2025-09-01	c: Rename uimaxabs to umaxabs	Jakub Jelinek	5	-30/+30
	The following patch implements https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3577.txt No big deal on the GCC side, for uimaxabs we just won't recognize it as builtin and I don't see it worth preserving __builtin_uimaxabs, I doubt anything but gcc testsuite used that. But on the glibc side I think it will need to remain exported for ABI compatibility :( 2025-09-01 Jakub Jelinek <jakub@redhat.com> * builtins.def: Implement C2Y N3577 - Rename s/uimaxabs/umaxabs/. (BUILT_IN_UIMAXABS): Rename to ... (BUILT_IN_UMAXABS): ... this. Change second argument to "umaxabs". * builtins.cc (fold_builtin_1): Use BUILT_IN_UMAXABS rather than BUILT_IN_UIMAXABS. * gcc.c-torture/execute/builtins/lib/abs.c (uimaxabs): Rename to ... (umaxabs): ... this. * gcc.c-torture/execute/builtins/uabs-2.c (uimaxabs): Rename to ... (umaxabs): ... this. (main_test): Use umaxabs instead of uimaxabs. * gcc.c-torture/execute/builtins/uabs-3.c (main_test): Use umaxabs instead of uimaxabs.
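For readers unfamiliar with the renamed function, here is a minimal sketch of the semantics it is understood to have: the absolute value of an `intmax_t`, returned as `uintmax_t` so that the most negative input is representable. `sketch_umaxabs` is a hypothetical stand-in written for this note, assuming that signature; it is neither the GCC builtin nor a libc implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the assumed semantics of umaxabs.
constexpr std::uintmax_t sketch_umaxabs(std::intmax_t x) {
  // Convert to the unsigned type first and negate there, so that
  // INTMAX_MIN does not overflow the way a signed negation would.
  return x < 0 ? 0u - static_cast<std::uintmax_t>(x)
               : static_cast<std::uintmax_t>(x);
}

static_assert(sketch_umaxabs(5) == 5, "positive values pass through");
static_assert(sketch_umaxabs(-5) == 5, "negative values are negated");
static_assert(sketch_umaxabs(INTMAX_MIN) ==
                  static_cast<std::uintmax_t>(INTMAX_MAX) + 1,
              "the most negative value is representable in the unsigned result");

// Runtime check of the same property.
inline void check_umaxabs() { assert(sketch_umaxabs(-1) == 1); }
```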
2025-09-01	Fortran: truncate constant string passed to character,value dummy [PR121727]	Harald Anlauf	2	-0/+77
	PR fortran/121727 gcc/fortran/ChangeLog: * trans-expr.cc (gfc_const_length_character_type_p): New helper function. (conv_dummy_value): Use it to determine if a character actual argument has a constant length. If a character actual argument is constant and longer than the dummy, truncate it at compile time. gcc/testsuite/ChangeLog: * gfortran.dg/value_10.f90: New test.
2025-09-01	doc: Update perfwiki web address	Gerald Pfeifer	1	-1/+1
	gcc: * doc/invoke.texi (Optimize Options): Update the perfwiki web address.
2025-09-01	diagnostics: Fix bootstrap fail on Darwin 32b hosts.	Iain Sandoe	1	-1/+1
	The use of HOST_SIZE_T_PRINT_HEX needs to be paired with a c-style cast to (fmt_size_t) otherwise the detection mechanisms in hwint.h are not sufficient to deal with size_t defined as 'long unsigned int' which is done on Darwin (and I think on Windows). This patch just makes that update. gcc/ChangeLog: * diagnostics/logging.h (log_param_location_t): Cast location_t value to fmt_size_t. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
2025-09-01	configure, Darwin: Do not claim .cfi_xxx instruction support.	Iain Sandoe	2	-0/+12
	While the LLVM-based assemblers used by Darwin do support .cfi_ instructions, their use triggers production of compact unwind, which currently does not interwork properly with GCC's output. When the system objdump is used in the configure process this currently works by good fortune (the objdump does not recognise the command and we fail to detect the cfi_advance). However, if a user has binutils objdump earlier in their PATH then we will detect support and try to use .cfi_, which will cause later and hard-to-diagnose issues. Until we have this resolved, force cfi instruction use off for Darwin. gcc/ChangeLog: * configure: Regenerate. * configure.ac: Do not claim cfi instruction support even if the assembler has it. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
2025-09-01	PR target/89828 Internal compiler error on "-fno-omit-frame-pointer"	Yoshinori Sato	2	-32/+66
	The problem was caused by an erroneous note about creating a stack frame, which caused the cur_cfa reg to fail to assert with a value other than the frame pointer. This fix will generate notes that correctly update cur_cfa. v2 changes. Add testcase. All tests that failed with "internal compiler error: in dwarf2out_frame_debug_adjust_cfa, at dwarf2cfi.cc" now pass. PR target/89828 gcc * config/rx/rx.cc (add_pop_cfi_notes): Release the frame pointer if it is used. (rx_expand_prologue): Redesigned stack pointer and frame pointer update process. gcc/testsuite/ * gcc.dg/pr89828.c: New.
2025-09-01	Add default arch/tuning to shift-gf2p8affine test cases	Andi Kleen	6	-6/+6
	This makes them not fail during test suite runs with overridden arch or tunings. gcc/testsuite/ChangeLog: * gcc.target/i386/shift-gf2p8affine-1.c: Use -march=x86-64 -mtune=generic. * gcc.target/i386/shift-gf2p8affine-2.c: Ditto. * gcc.target/i386/shift-gf2p8affine-3.c: Ditto. * gcc.target/i386/shift-gf2p8affine-5.c: Ditto. * gcc.target/i386/shift-gf2p8affine-6.c: Ditto. * gcc.target/i386/shift-gf2p8affine-7.c: Ditto.
2025-09-01	testsuite: arm: factorize arm_v8_neon_ok flags	Christophe Lyon	1	-3/+3
	Like we do in other effective-targets, add "-mcpu=unset -march=armv8-a" directly when setting et_arm_v8_neon_flags in arm_v8_neon_ok_nocache, to avoid having to add these two flags in all users of arm_v8_neon_ok. This avoids duplication and possible typos / oversights. gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_arm_v8_neon_ok_nocache): Add "-mcpu=unset -march=armv8-a" to et_arm_v8_neon_flags. (add_options_for_vect_early_break): Remove useless "-mcpu=unset -march=armv8-a". (add_options_for_arm_v8_neon): Likewise.
2025-09-01	testsuite: arm: remove arm32 check from a few effective-targets	Christophe Lyon	1	-46/+39
	A few arm effective-targets call check_effective_target_arm32 even though they would force a -march=XXX flag which supports Arm and/or Thumb-2, thus making the arm32 check useless. This has an impact when the toolchain is configured with a default -march or -mcpu which supports Thumb-1 only: in such a case, arm32 is false and we skip many tests, thus reducing coverage. This patch removes the call to check_effective_target_arm32 where it is useless, enabling about 2000 tests. In addition, add an early exit if the target is not an arm one, thus saving a few compilation cycles where not needed. In all callers of arm_neon_ok, remove the now useless "istarget arm*-*-*". gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_arm_neon_ok_nocache): Remove arm32 check. Add istarget arm*-*-* check. (check_effective_target_arm_neon_fp16_ok_nocache): Likewise. (check_effective_target_arm_neon_softfp_fp16_ok_nocache): Likewise. (check_effective_target_arm_v8_neon_ok_nocache): Likewise. (check_effective_target_arm_neonv2_ok_nocache): Likewise. (check_effective_target_vect_pack_trunc): Remove istarget arm*-*-* check. (check_effective_target_vect_unpack): Likewise. (check_effective_target_vect_condition): Likewise. (check_effective_target_vect_cond_mixed): Likewise. (available_vector_sizes): Likewise.
2025-09-01	tree-optimization/121744 - handle CST << var in shift pattern recog	Richard Biener	2	-2/+14
	We currently do not handle promotion/demotion of 'var' when the left operand of a variable shift is constant. There's no good reason why, so the following fixes this omission. PR tree-optimization/121744 * tree-vect-patterns.cc (vect_recog_vector_vector_shift_pattern): Allow constant left operand. * gcc.dg/vect/pr121744-1.c: New testcase.
2025-09-01	Eliminate some STMT_VINFO_REDUC_IDX for SLP_TREE_REDUC_IDX	Richard Biener	2	-22/+17
	The following uses SLP_TREE_REDUC_IDX where it looks more appropriate. * tree-vect-loop.cc (vect_create_epilog_for_reduction): Use SLP_TREE_REDUC_IDX for following the SLP graph and for identifying whether we use the 'else' in a COND. (vectorizable_lane_reducing): Simplify check of whether we are in a reduction. (vectorizable_reduction): Add sanity checking around SLP_TREE_REDUC_IDX and use it where it looks appropriate. (vect_transform_reduction): Use SLP_TREE_REDUC_IDX. * tree-vect-stmts.cc (vectorizable_call): Likewise. (vectorizable_operation): Likewise. (vectorizable_condition): Likewise.
2025-09-01	Remove no longer needed STMT_VINFO_REDUC_DEF sets	Richard Biener	1	-15/+1
	The following removes no longer needed extra sets of STMT_VINFO_REDUC_DEF and replaces a single remaining one with a more appropriate check. * tree-vect-loop.cc (vectorizable_live_operation): Check vect_is_reduction on the SLP node rather than STMT_VINFO_REDUC_DEF on the stmt. (vectorizable_reduction): Do not set STMT_VINFO_REDUC_DEF on live stmts.
2025-09-01	Introduce abstraction for vect reduction info, tracked from SLP nodes	Richard Biener	6	-198/+294
	While we already have the accessor info_for_reduction, its result is a plain stmt_vec_info. The following turns that into a class for the purpose of changing accesses to reduction info to a new set of accessors prefixed with VECT_REDUC_INFO and removes the corresponding STMT_VINFO prefixed accessors where possible. There are a few reduction-related things that are used by scalar cycle detection and thus have to stay as-is for now and as copies in the future. This also separates reduction info into one object per reduction and associates it with SLP nodes, splitting it out from stmt_vec_info, retaining (and duplicating) parts used by scalar cycle analysis. The data is then associated with SLP nodes forming reduction cycles and accessible via info_for_reduction. The data is created at SLP discovery time as we look at it even pre-vectorizable_reduction analysis, but most of the data is only populated by the latter. There is no reduction info with nested cycles that are not part of an outer reduction. In the process this adds cycle info to each SLP tree, notably the reduc-idx and a way to identify the reduction info. * tree-vectorizer.h (vect_reduc_info): New. (create_info_for_reduction): Likewise. (VECT_REDUC_INFO_TYPE): Likewise. (VECT_REDUC_INFO_CODE): Likewise. (VECT_REDUC_INFO_FN): Likewise. (VECT_REDUC_INFO_SCALAR_RESULTS): Likewise. (VECT_REDUC_INFO_INITIAL_VALUES): Likewise. (VECT_REDUC_INFO_REUSED_ACCUMULATOR): Likewise. (VECT_REDUC_INFO_INDUC_COND_INITIAL_VAL): Likewise. (VECT_REDUC_INFO_EPILOGUE_ADJUSTMENT): Likewise. (VECT_REDUC_INFO_FORCE_SINGLE_CYCLE): Likewise. (VECT_REDUC_INFO_RESULT_POS): Likewise. (VECT_REDUC_INFO_VECTYPE): Likewise. (STMT_VINFO_VEC_INDUC_COND_INITIAL_VAL): Remove. (STMT_VINFO_REDUC_EPILOGUE_ADJUSTMENT): Likewise. (STMT_VINFO_FORCE_SINGLE_CYCLE): Likewise. (STMT_VINFO_REDUC_FN): Likewise. (STMT_VINFO_REDUC_VECTYPE): Likewise. (vect_reusable_accumulator::reduc_info): Adjust. (vect_reduc_type): Adjust. (_slp_tree::cycle_info): New member.
(SLP_TREE_REDUC_IDX): Likewise. (vect_reduc_info_s): Move/copy data from ... (_stmt_vec_info): ... here. (_loop_vec_info::redcu_infos): New member. (info_for_reduction): Adjust to take SLP node. (vect_reduc_type): Adjust. (vect_is_reduction): Add overload for SLP node. * tree-vectorizer.cc (vec_info::new_stmt_vec_info): Do not initialize removed members. (vec_info::free_stmt_vec_info): Do not release them. * tree-vect-stmts.cc (vectorizable_condition): Adjust. * tree-vect-slp.cc (_slp_tree::_slp_tree): Initialize cycle info. (vect_build_slp_tree_2): Compute SLP reduc_idx and store it. Create, populate and propagate reduction info. (vect_print_slp_tree): Print cycle info. (vect_analyze_slp_reduc_chain): Set cycle info on the manual added conversion node. (vect_optimize_slp_pass::start_choosing_layouts): Adjust. * tree-vect-loop.cc (_loop_vec_info::~_loop_vec_info): Release reduction infos. (info_for_reduction): Get the reduction info from the vector in the loop_vinfo. (vect_create_epilog_for_reduction): Adjust. (vectorizable_reduction): Likewise. (vect_transform_reduction): Likewise. (vect_transform_cycle_phi): Likewise, deal with nested cycles not part of a double reduction have no reduction info. * config/aarch64/aarch64.cc (aarch64_force_single_cycle): Use VECT_REDUC_INFO_FORCE_SINGLE_CYCLE, get SLP node and use that. (aarch64_vector_costs::count_ops): Adjust.
2025-09-01	install.texi: For amdgcn, update Newlib version recommendation	Tobias Burnus	1	-1/+2
	Add two Newlib commits to the recommended Newlib version, fixing two other SIMD issues. Cf. PR target/121392 and Newlib Bug 33272. gcc/ChangeLog: PR target/121392 * doc/install.texi (amdgcn): Mention Newlib commit that fixes another SIMD issue.
2025-09-01	Simplify vectorizer IV analysis	Richard Biener	1	-26/+18
	The following simplifies the flow of IV analysis a bit. * tree-vect-loop.cc (vect_is_simple_iv_evolution): Get stmt_info and store into STMT_VINFO_LOOP_PHI_EVOLUTION_BASE_UNCHANGED and STMT_VINFO_LOOP_PHI_EVOLUTION_PART here. Drop unused output parameters. (vect_is_nonlinear_iv_evolution): Likewise. (vect_analyze_scalar_cycles_1): Remove redundant setting of STMT_VINFO_LOOP_PHI_EVOLUTION_BASE_UNCHANGED and STMT_VINFO_LOOP_PHI_EVOLUTION_PART.
2025-09-01	ira: Remove soft conflict related code in improve_allocation. [PR117838]	Cui, Lili	1	-30/+11
	The original intention of this code was to allow more allocnos to share the same register, but this led to expensive allocno overflows. Extracted a small case (a bit large, see Bugzilla PR117838 for details) from 548.exchange2_r to analyze this register allocation issue. Before improve_allocation function: a537 (cost 1896, reg42) a20 (cost 270, reg1) a13 (cost 144, spill) a551 (cost 70, reg40) a5 (cost 43, spill) a493 (cost 30, reg42) a499 (cost 12, reg40) ------------------------------ Dump info in improve_allocation function: Base: Spilling a493r125 for a5r113 Spilling a573r202 for a5r113 Spilling a499r248 for a13r106 Spilling a551r120 for a13r106 Spilling a20r237 for a551r120 With patch: Spilling a499r248 for a13r106 Spilling a551r120 for a13r106 Spilling a493r125 for a551r120 ------------------------------ After assign_hard_reg (at the end of improve_allocation): Base: a537 (cost 1896, reg1) a20 (cost 270, spill) -----> This is unreasonable a13 (cost 144, reg40) a551 (cost 70, reg1) a5 (cost 43, reg42) a493 (cost 30, spill) a499 (cost 12, reg1) With patch: a537 (cost 1896, reg42) a20 (cost 270, reg1) a13 (cost 144, reg40) a551 (cost 70, reg42) a5 (cost 43, spill) a493 (cost 30, spill) a499 (cost 12, reg42) ----------------------------- Collected spec2017 performance on Znver3/Graviton4/EMR/SRF for O2 and Ofast. No performance regression was observed. FOR multi-copy O2 SRF: 548.exchange2_r increased by 7.5%, 500.perlbench_r increased by 2.0%. EMR: 548.exchange2_r increased by 4.5%, 500.perlbench_r increased by 1.7%. Graviton4: 548.exchange2_r Increased by 2.2%, 511.povray_r increased by 2.8%. Znver3 : 500.perlbench_r increased by 2.0%. gcc/ChangeLog: PR rtl-optimization/117838 * ira-color.cc (improve_allocation): Remove soft conflict related code.
2025-08-31	Fix ICE due to wrong operand being passed to ix86_vgf2p8affine_shift_matrix.	liuhongt	3	-4/+30
	1) Fix predicate of operands[3] in cond_<insn><mode> since only const_vec_dup_operand is accepted for masked operations, and pass the real count to ix86_vgf2p8affine_shift_matrix. 2) Pass operands[2] instead of operands[1] to gen_vgf2p8affineqb_<mode>_mask, which expects the operand to be shifted, whereas operands[1] is the mask operand in cond_<insn><mode>. gcc/ChangeLog: PR target/121699 * config/i386/predicates.md (const_vec_dup_operand): New predicate. * config/i386/sse.md (cond_<insn><mode>): Fix predicate of operands[3], and fix wrong operands passed to ix86_vgf2p8affine_shift_matrix and gen_vgf2p8affineqb_<mode>_mask. gcc/testsuite/ChangeLog: * gcc.target/i386/pr121699.c: New test.
2025-09-01	Daily bump.	GCC Administrator	5	-1/+77

2025-08-31	xtensa: Optimize branch whether (reg:SI) is within/out the range handled by CLAMPS instruction	Takayuki 'January June' Suwa	2	-0/+39
	The CLAMPS instruction in Xtensa ISA, provided when the TARGET_CLAMPS configuration is enabled (and also requires TARGET_MINMAX), returns the value in the specified register clamped to between -(1<<N) and (1<<N)-1 inclusive, where N is an immediate value from 7 to 22. Therefore, when the above configurations are met, by comparing the clamped result with the original value for equality, branching on whether the value is within the range mentioned above or not is implemented with fewer instructions, especially when the upper and lower bounds of the range are too large to fit into a single immediate assignment. /* example (TARGET_MINMAX and TARGET_CLAMPS) */ extern void foo(void); void test0(int a) { if (a >= -(1 << 9) && a < (1 << 9)) foo(); } void test1(int a) { if (a < -(1 << 20) || a >= (1 << 20)) foo(); } ;; before test0: entry sp, 32 addmi a2, a2, 0x200 movi a8, 0x3ff bltu a8, a2, .L1 call8 foo .L1: retw.n test1: entry sp, 32 movi.n a9, 1 movi.n a8, -1 slli a9, a9, 20 srli a8, a8, 11 add.n a2, a2, a9 bgeu a8, a2, .L4 call8 foo .L4: retw.n ;; after test0: entry sp, 32 clamps a8, a2, 9 bne a2, a8, .L1 call8 foo .L1: retw.n test1: entry sp, 32 clamps a8, a2, 20 beq a2, a8, .L4 call8 foo .L4: retw.n (note: Currently, in the RTL instruction combination pass, the possible const_int values are fundamentally constrained by TARGET_LEGITIMATE_CONSTANT_P() if no bare large constant assignments are possible (i.e., neither -mconst16 nor -mauto-litpools), so limiting N to a range of 7 to only 10 instead of to 22. A series of forthcoming patches will introduce an entirely new "xt_largeconst" pass that will solve several issues including this.) gcc/ChangeLog: * config/xtensa/predicates.md (alt_ubranch_operator): New predicate. * config/xtensa/xtensa.md (*eqne_in_range): New insn_and_split pattern.
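The equivalence this optimization relies on can be modelled in plain C++: a value lies in [-(1<<N), (1<<N)-1] exactly when clamping it to that range leaves it unchanged. The sketch below is only a software model under that reading of the commit message. `clamps_model` and `in_clamps_range` are names invented here, and nothing in it reflects how the Xtensa backend implements the pattern.

```cpp
#include <cassert>
#include <cstdint>

// Software model of CLAMPS: clamp x to [-(1<<n), (1<<n)-1].
// Hypothetical helper; n is assumed to be in the documented range 7..22.
constexpr std::int32_t clamps_model(std::int32_t x, int n) {
  return x < -(std::int32_t{1} << n)      ? -(std::int32_t{1} << n)
         : x > (std::int32_t{1} << n) - 1 ? (std::int32_t{1} << n) - 1
                                          : x;
}

// The range test: x is in range iff clamping does not change it.
constexpr bool in_clamps_range(std::int32_t x, int n) {
  return clamps_model(x, n) == x;
}

// Boundaries for N = 9, matching test0 in the commit message.
static_assert(in_clamps_range(-512, 9) && in_clamps_range(511, 9),
              "both inclusive bounds are in range");
static_assert(!in_clamps_range(-513, 9) && !in_clamps_range(512, 9),
              "values just outside the bounds are rejected");

// Runtime check mirroring the condition in test1 (N = 20).
inline void check_clamps() {
  const std::int32_t a = 1 << 20;
  assert((a < -(1 << 20) || a >= (1 << 20)) == !in_clamps_range(a, 20));
}
```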
2025-08-31	Fortran: Pass PDTs to dummies with VALUE attribute [PR99709]	Paul Thomas	3	-1/+72
	2025-08-31 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/99709 * trans-array.cc (structure_alloc_comps): For the case COPY_ALLOC_COMP, do a deep copy of non-allocatable PDT arrays Suppress the use of 'duplicate_allocatable' for PDT arrays. * trans-expr.cc (conv_dummy_value): When passing to a PDT dummy with the VALUE attribute, do a deep copy to ensure that parameterized components are reallocated. gcc/testsuite/ PR fortran/99709 * gfortran.dg/pdt_41.f03: New test.
2025-08-31	[RISC-V] Improve initial RTL generation for SImode adds on rv64	Shreya Munnangi	4	-13/+125
	So this is the next chunk of Shreya's work to adjust our add expanders. In this patch we're adding support for adding a 2s12 immediate in SI for rv64. To recap, the basic idea is to reduce our reliance on the define_insn_and_split that was added a year or so ago by synthesizing the more efficient sequence at expansion time. By handling this early rather than late the synthesized sequence participates in the various optimizer passes in the natural way. In contrast using the define_insn_and_split bypasses the cost modeling in combine and hides the synthesis until after reload has completed (which in turn leads to the problems seen in pr120811). This doesn't solve pr120811, but it is the last prerequisite patch before directly tackling pr120811. This has been bootstrapped & regression tested on the pioneer & bpi and been through the usual testing on riscv32-elf and riscv64-elf. Waiting on pre-commit CI before moving forward. gcc/ * config/riscv/riscv-protos.h (synthesize_add_extended): Prototype. * config/riscv/riscv.cc (synthesize_add_extended): New function. * config/riscv/riscv.md (addsi3): For RV64, try synthesize_add_extended. gcc/testsuite/ * gcc.target/riscv/add-synthesis-2.c: New test.
2025-08-31	install: Drop MinGW binaries download link	Gerald Pfeifer	1	-2/+1
	This has been unavailable for well over a year. gcc: * doc/install.texi (Binaries): Drop MinGW.
2025-08-31	libstdc++: Update link to Boost "Exception-Safety"	Gerald Pfeifer	2	-2/+2
	libstdc++-v3: * doc/xml/manual/using_exceptions.xml: Update link to Boost's "Exception-Safety" * doc/html/manual/using_exceptions.html: Rebuild.
2025-08-31	libstdc++: Fix bootstrap failures in src/c++26/debugging.cc	Jonathan Wakely	1	-1/+2
	ptrace on Darwin requires <sys/types.h>. The inline x86 asm doesn't work with the Solaris assembler. libstdc++-v3/ChangeLog: * src/c++26/debugging.cc [_GLIBCXX_HAVE_SYS_PTRACE_H]: Include <sys/types.h>. (breakpoint) [__i386__ || __x86_64__]: Use "int 0x03" instead of "int3".
2025-08-31	RISC-V: Add test case for unsigned scalar SAT_MUL form 4	Pan Li	24	-0/+320
	The form 4 of unsigned scalar SAT_MUL is covered in middle-expand already, add test case here to cover form 4. The below test suites are passed for this patch series. * The rv64gcv full regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat/sat_arith.h: Add test helper macros. * gcc.target/riscv/sat/sat_u_mul-5-u16-from-u128.c: New test. * gcc.target/riscv/sat/sat_u_mul-5-u16-from-u32.c: New test. * gcc.target/riscv/sat/sat_u_mul-5-u16-from-u64.rv32.c: New test. * gcc.target/riscv/sat/sat_u_mul-5-u16-from-u64.rv64.c: New test. * gcc.target/riscv/sat/sat_u_mul-5-u32-from-u128.c: New test. * gcc.target/riscv/sat/sat_u_mul-5-u32-from-u64.rv32.c: New test. * gcc.target/riscv/sat/sat_u_mul-5-u32-from-u64.rv64.c: New test. * gcc.target/riscv/sat/sat_u_mul-5-u64-from-u128.c: New test. * gcc.target/riscv/sat/sat_u_mul-5-u8-from-u128.c: New test. * gcc.target/riscv/sat/sat_u_mul-5-u8-from-u16.c: New test. * gcc.target/riscv/sat/sat_u_mul-5-u8-from-u32.c: New test. * gcc.target/riscv/sat/sat_u_mul-5-u8-from-u64.rv32.c: New test. * gcc.target/riscv/sat/sat_u_mul-5-u8-from-u64.rv64.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-5-u16-from-u128.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-5-u16-from-u32.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-5-u16-from-u64.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-5-u32-from-u128.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-5-u32-from-u64.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-5-u64-from-u128.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-5-u8-from-u128.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-5-u8-from-u16.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-5-u8-from-u32.c: New test. * gcc.target/riscv/sat/sat_u_mul-run-5-u8-from-u64.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2025-08-31	Daily bump.	GCC Administrator	4	-1/+89

2025-08-30	phiopt, math-opts: Adjust spaceship_replacement and optimize_spaceship for recent libstdc++ changes [PR121698]	Jakub Jelinek	10	-233/+473
	libstdc++ changed its ABI in <compare> for C++20 recently (under the C++20 is still experimental rule). In addition to the -1, 0, 1 values for less, equal, greater it now uses -128 for unordered instead of former 2 and changes some of the operators, instead of checks like (_M_value & ~1) == _M_value in some cases it now uses _M_reverse() which is negation in unsigned char type + conversion back to the original type. _M_reverse() thus turns the -1, 0, 1, -128 values into 1, 0, -1, -128. Note libc++ uses value -127 instead of 2/-128. Now, the middle-end has some optimizations which rely on the particular implementation and don't optimize if not. One is optimize_spaceship which on some targets (currently x86, aarch64 and s390) attempts to use better comparison instructions (ideally just one floating point comparison to get all 4 possible outcomes plus some flag tests or magic instead of 2 or 3 floating point comparisons). This one can actually handle arbitrary int non-[-1,1] values for unordered but still has a default of 2. The patch changes that default to -128 so that even if something is expanded as branches if it is later during RTL optimizations determined to convert that into partial_ordering we get better code. The other optimization (phiopt one) is about optimizing (x <=> y) < 0 etc. into just x < y. This one actually relies on the exact unordered value (2) and has code to deal with that (_M_value & ~1) == _M_value kind of tests and whatever match.pd lowers it. So, this patch partially rewrites it to look for -128 instead of 2, drop those (_M_value & ~1) == _M_value pattern recognitions and instead introduces pattern recognition of _M_reverse(), i.e. cast to unsigned char, negation in that type and cast back to the original signed type.
With all these changes we get back the desired optimizations for all the cases we could optimize previously (note, for HONOR_NANS case we don't try to optimize say (x <=> y) == 0 because the original will raise exception if either x or y is a NaN, while turning it into x == y will not, but (x <=> y) <= 0 is fine (x <= y), because it does raise those exceptions. 2025-08-30 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/121698 * tree-ssa-phiopt.cc (spaceship_replacement): Adjust to handle spaceship unordered value -128 rather than 2 and stmts from the new std::partial_order::_M_reverse() instead of (_M_value & ~1) == _M_value etc. * doc/md.texi (spaceship@var{m}4): Use -128 instead of 2. * tree-ssa-math-opts.cc (optimize_spaceship): Adjust comments that libstdc++ unordered value is -128 rather than 2 and use that as the default unordered value. * config/i386/i386-expand.cc (ix86_expand_fp_spaceship): Use GEN_INT (-128) instead of const2_rtx and adjust comment accordingly. * config/aarch64/aarch64.cc (aarch64_expand_fp_spaceship): Likewise. * config/s390/s390.cc (s390_expand_fp_spaceship): Likewise. * gcc.dg/pr94589-2.c: Adjust for expected unordered value -128 rather than 2 and negations in unsigned char instead of and with ~1 and comparison against original value. * gcc.dg/pr94589-4.c: Likewise. * gcc.dg/pr94589-5.c: Likewise. * gcc.dg/pr94589-6.c: Likewise.
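The value mapping attributed to `_M_reverse()` above (cast to unsigned char, negate in that type, cast back) can be checked with a few lines of C++. This is a sketch of the arithmetic only: `reverse_order` is a name made up for this note and is not the libstdc++ source, and the constants are the ones quoted in the commit message.

```cpp
#include <cassert>

// Hypothetical model of the described transformation:
// negate in unsigned char, then convert back to the signed type.
constexpr signed char reverse_order(signed char v) {
  return static_cast<signed char>(
      static_cast<unsigned char>(0u - static_cast<unsigned char>(v)));
}

// less (-1) and greater (1) swap, equal (0) and unordered (-128) are fixed points.
static_assert(reverse_order(-1) == 1, "less becomes greater");
static_assert(reverse_order(0) == 0, "equal stays equal");
static_assert(reverse_order(1) == -1, "greater becomes less");
static_assert(reverse_order(-128) == -128, "unordered stays unordered");

// Runtime check: applying the mapping twice returns the original value.
inline void check_reverse() {
  const signed char values[] = {-1, 0, 1, -128};
  for (signed char v : values) assert(reverse_order(reverse_order(v)) == v);
}
```

The fixed point at -128 is what lets a single negation stand in for the earlier mask-based tests: 128 is its own negation modulo 256.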
2025-08-30	doc: Improve markup for list of vector operators	Gerald Pfeifer	1	-2/+3
	gcc: * doc/extend.texi (Vector Extensions): Improve markup for list of operators.
2025-08-30	doc: Update Objective-C language reference	Gerald Pfeifer	1	-3/+2
	gcc: * doc/standards.texi (Standards): Update "Object-Oriented Programming and the Objective-C Language" reference.
2025-08-30	x86-64: Use UNSPEC_DTPOFF to check source operand in TLS64_COMBINE	H.J. Lu	3	-22/+57
	Since the first operand of PLUS in the source of TLS64_COMBINE pattern: (set (reg/f:DI 128) (plus:DI (unspec:DI [ (symbol_ref:DI ("_TLS_MODULE_BASE_") [flags 0x10]) (reg:DI 126) (reg/f:DI 7 sp) ] UNSPEC_TLSDESC) (const:DI (unspec:DI [ (symbol_ref:DI ("bfd_error") [flags 0x1a] <var_decl 0x7fffe99d6e40 bfd_error>) ] UNSPEC_DTPOFF)))) is unused, use the second operand of PLUS: (const:DI (unspec:DI [ (symbol_ref:DI ("bfd_error") [flags 0x1a] <var_decl 0x7fffe99d6e40 bfd_error>) ] UNSPEC_DTPOFF)) to check if 2 TLS_COMBINE patterns have the same source. gcc/ PR target/121725 * config/i386/i386-features.cc (pass_x86_cse::candidate_gnu2_tls_p): Use the UNSPEC_DTPOFF operand to check source operand in TLS64_COMBINE pattern. gcc/testsuite/ PR target/121725 * gcc.target/i386/pr121725-1a.c: New test. * gcc.target/i386/pr121725-1b.c: Likewise. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2025-08-29	forwprop: Copy the memcmp optimization from strlen to forwprop [PR116651]	Andrew Pinski	3	-0/+97
	To better optimize code dealing with `memcmp == 0` where we have a small constant size, we can inline the memcmp in those cases. There is code to do this in strlen but that is run too late in the case where we can figure out the value of one of the arguments to memcmp. So this copies the optimization to forwprop. An example of where this helps is: ``` bool cmpvect(const std::vector<int> &a) { return a == std::vector<int>{10}; } ``` Where the above should be optimized to just `return a.size() == 1 && a[0] == 10;`. Note pr44130.c testcase needed to change as now it will be optimized away otherwise. Note the loop in pr44130.c is also vectorized which it was not before. Note the optimization remains in strlen as the other part (memcmp -> memcmp_eq) should move to either isel or fab and I didn't want to remove it just yet. Bootstrapped and tested on x86_64-linux-gnu. Changes since v1: * v2: Add verification of arguments to memcmp to simplify_builtin_memcmp. PR tree-optimization/116651 PR tree-optimization/93265 PR tree-optimization/103647 PR tree-optimization/52171 gcc/ChangeLog: * tree-ssa-forwprop.cc (simplify_builtin_memcmp): New function. (simplify_builtin_call): Call simplify_builtin_memcmp for memcmp memcmp_eq builtins. gcc/testsuite/ChangeLog: * gcc.target/i386/pr44130.c: Add an inline-asm clobber. * g++.dg/tree-ssa/vector-compare-1.C: New test. Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
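To make the intent concrete, the sketch below pairs the `cmpvect` example from the commit message with the hand-written form it is expected to reduce to, and shows the same idea for a bare 4-byte `memcmp`. Only `cmpvect` comes from the commit; `cmpvect_by_hand`, `equal4_memcmp` and `equal4_loads` are illustrative names. The code demonstrates source-level equivalence and says nothing about what any particular compiler version emits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// The example from the commit message.
bool cmpvect(const std::vector<int>& a) { return a == std::vector<int>{10}; }

// The form the commit message says it should be optimized to.
bool cmpvect_by_hand(const std::vector<int>& a) {
  return a.size() == 1 && a[0] == 10;
}

// memcmp with a small constant size, compared against zero.
bool equal4_memcmp(const void* p, const void* q) {
  return std::memcmp(p, q, 4) == 0;
}

// Hand-inlined equivalent: one 4-byte load from each side and one compare.
bool equal4_loads(const void* p, const void* q) {
  std::uint32_t a, b;
  std::memcpy(&a, p, sizeof a);  // memcpy keeps the load alignment-safe
  std::memcpy(&b, q, sizeof b);
  return a == b;
}

// Runtime check that both spellings agree on a few inputs.
inline void check_memcmp_forms() {
  const std::vector<int> inputs[] = {{10}, {11}, {10, 10}, {}};
  for (const auto& v : inputs) assert(cmpvect(v) == cmpvect_by_hand(v));
}
```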
2025-08-29	Revert "Fix _Decimal128 arithmetic error under FE_UPWARD."	liuhongt	6	-586/+215
	This reverts commit 50064b2898edfb83bc37f2597a35cbd3c1c853e3.
2025-08-30	Daily bump.	GCC Administrator	8	-1/+319

2025-08-29	PR modula2/121709: Failed bootstrap in m2	Gaius Mulley	2	-27/+82
	This patch is a followup to PR modula2/121629 which uses the cpp_include_defaults array to configure the default search path entries. In particular it creates default search paths based on LOCAL_INCLUDE_DIR, PREFIX_INCLUDE_DIR, gcc version path and NATIVE_SYSTEM_HEADER_DIR. gcc/m2/ChangeLog: PR modula2/121709 * gm2-lang.cc (concat_component): New function. (find_cpp_entry): Ditto. (lookup_cpp_default): Ditto. (add_default_include_paths): Rewrite. (m2_pathname_root): Remove. gcc/ChangeLog: PR modula2/121709 * doc/gm2.texi (Module Search Path): Reflect the new search order. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2025-08-29	c++: array subscript with COND_EXPR as the array	Sirui Mu	2	-13/+44
	The following minimum reproducer would miscompile with vanilla gcc: extern int x[10], y[10]; bool g(); void f() { 0[g() ? x : y] = 1; } gcc would mistakenly treat the subexpression (g() ? x : y) as a prvalue and move that array to stack. The following assignment would then write to the stack instead of to the global arrays. When optimizations are enabled, this assignment is discarded by dse and gcc generates the following code for the f function: "_Z1fi": jmp "_Z1gv" The miscompilation requires all the following conditions to be met: - The array subscription expression is written as idx[array], instead of the usual form array[idx]; - The "array" part must be a ternary expression (COND_EXPR in gcc tree) and it must be an lvalue. - The code must be compiled with -fstrong-eval-order which is the default for -std=c++17 or later. The cause of the issue lies in cp_build_array_ref, where it mistakenly generates a COND_EXPR with ARRAY_TYPE to the IL when all the criteria above are met. This patch tries to resolve this issue. It moves the canonicalization step that transforms idx[array] to array[idx] early in cp_build_array_ref to ensure we handle these two forms of array subscription consistently. Tested on x86_64-linux. gcc/cp/ChangeLog: * typeck.cc (cp_build_array_ref): Handle 0[arr] earlier. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/array-condition-expr.C: New test. Signed-off-by: Sirui Mu <msrlancern@gmail.com>
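The two spellings discussed above can be put side by side. This is a sketch based on the reproducer in the commit message; the `pick_x` flag and the function names `f_reversed` and `f_canonical` are hypothetical additions for illustration. According to the message, only the `idx[array]` spelling was affected, so what `f_reversed` does depends on whether the compiler includes the fix.

```cpp
int x[10], y[10];
bool pick_x = true;  // hypothetical switch so both arrays can be selected

bool g() { return pick_x; }

// idx[array] spelling: the form the commit message reports as miscompiled
// under -fstrong-eval-order before the fix.
void f_reversed() { 0[g() ? x : y] = 1; }

// array[idx] spelling: the canonical form the patch converts to early.
// The conditional expression is an lvalue, so the store must reach the
// selected global array.
void f_canonical() { (g() ? x : y)[0] = 1; }
```

With a fixed compiler both functions should store into the selected global array.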
2025-08-29	diagnostics: add GCC_DIAGNOSTICS_LOG	David Malcolm	15	-36/+784
	Whilst experimenting with PR diagnostics/121039 (potentially capturing suppressed diagnostics in SARIF output), I found it very useful to have a text log from the diagnostic subsystem to track what it's doing and the decisions it's making (e.g. exactly when and why a diagnostic is being rejected). This patch adds a simple logging mechanism to the diagnostics subsystem, enabled by setting GCC_DIAGNOSTICS_LOG in the environment, which emits nested text like this to stderr (or a named file): warning (option_id: 668, gmsgid: "%<-Wformat-security%> ignored without %<-Wformat%>") diagnostics::context::diagnostic_impl (option_id: 668, kind: warning, gmsgid: "%<-Wformat-security%> ignored without %<-Wformat%>") diagnostics::context::report_diagnostic rejecting: diagnostic not enabled false <- diagnostics::context::diagnostic_impl false <- warning This logging mechanism doesn't use pretty_printer because it can be helpful to use it to debug pretty_printer itself. gcc/ChangeLog: * Makefile.in (OBJS-libcommon): Add diagnostics/logging.o. * diagnostic-global-context.cc: Include "diagnostics/logging.h". (log_function_params, auto_inc_log_depth): New "using" decls. (verbatim): Add logging. (emit_diagnostic): Likewise. (emit_diagnostic_valist): Likewise. (emit_diagnostic_valist_meta): Likewise. (inform): Likewise. (inform_n): Likewise. (warning): Likewise. (warning_at): Likewise. (warning_meta): Likewise. (warning_n): Likewise. (pedwarn): Likewise. (permerror): Likewise. (permerror_opt): Likewise. * diagnostics/context.cc: Include "diagnostics/logging.h". (context::initialize): Initialize m_logger. Add logging. (context::finish): Add logging. Clean up m_logger. (context::dump): Add indent param. (context::set_sink): Add logging. (context::add_sink): Add logging. (diagnostic_kind_debug_text): New. (get_debug_string_for_kind): New. (context::report_diagnostic): Add logging. (context::diagnostic_impl): Likewise. (context::diagnostic_n_impl): Likewise. 
(context::end_group): Likewise. * diagnostics/context.h: Include "diagnostics/logging.h". (context::dump): Add indent param. (context::get_logger): New accessor. (context::classify_diagnostics): Add logging. (context::push_diagnostics): Likewise. (context::pop_diagnostics): Likewise. (context::m_logger): New field. * diagnostics/html-sink.cc: Include "diagnostics/logging.h". (html_builder::flush_to_file): Add logging. (html_sink::on_report_diagnostic): Likewise. * diagnostics/kinds.h (get_debug_string_for_kind): New decl. * diagnostics/logging.cc: New file. * diagnostics/logging.h: New file. * diagnostics/output-file.h: Include "label-text.h". * diagnostics/sarif-sink.cc: Include "diagnostics/logging.h". (sarif_builder::flush_to_object): Add logging. (sarif_builder::flush_to_file): Likewise. (sarif_sink::on_report_diagnostic): Likewise. * diagnostics/sink.h (sink::get_logger): New. * diagnostics/text-sink.cc: Include "diagnostics/logging.h". (text_sink::on_report_diagnostic): Add logging. * doc/invoke.texi (Environment Variables): Document GCC_DIAGNOSTICS_LOG. * opts-diagnostic.cc: Include "diagnostics/logging.h". (handle_OPT_fdiagnostics_add_output_): Add logging. (handle_OPT_fdiagnostics_set_output_): Likewise. gcc/analyzer/ChangeLog: * pending-diagnostic.cc: Include "diagnostics/logging.h". (diagnostic_emission_context::warn): Add logging. (diagnostic_emission_context::inform): Likewise. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2025-08-29	xtensa: Rewrite bswapsi2_internal with compact syntax	Takayuki 'January June' Suwa	4	-29/+126
	Also, the omission of the instruction that sets the shift amount register (SAR) to 8 is now more efficient: it is omitted if there was a previous bswapsi2 in the same BB, but not omitted if no bswapsi2 is found or another insn that modifies SAR is found first (see below). Note that the five instructions for writing to SAR are as follows, along with the insns that use them (except for bswapsi2_internal itself): - SSA8B shift_per_byte, shlrd_per_byte - SSA8L shift_per_byte, shlrd_per_byte - SSR ashrsi3 (alt 1), lshrsi3 (alt 1), shlrd_reg, rotrsi3 (alt 1) - SSL ashlsi3_internal (alt 1), shlrd_reg, rotlsi3 (alt 1) - SSAI shlrd_const, rotlsi3 (alt 0), rotrsi3 (alt 0) gcc/ChangeLog: * config/xtensa/xtensa-protos.h (xtensa_bswapsi2_output): New function prototype. * config/xtensa/xtensa.cc (xtensa_bswapsi2_output_1, xtensa_bswapsi2_output): New functions. * config/xtensa/xtensa.md (bswapsi2_internal): Rewrite in compact syntax and use xtensa_bswapsi2_output() as asm output.  gcc/testsuite/ChangeLog: * gcc.target/xtensa/bswap-SSAI8.c: New.
2025-08-29	[RISC-V][PR target/121548] Avoid bogus index into recog operand cache	Jeff Law	2	-0/+7
	So the RISC-V port has attributes which indicate the index within the recog_data where certain operands will be found. For this BZ the default value for the merge_op_idx attribute on the given insn is "2". But the insn only has operands 0 & 1. So we do an out of bounds array access and boom the ICE/valgrind failure. As we discussed in the patchwork meeting, this is all a bit clunky and has been fairly error prone. This doesn't add any massive checking, but does introduce some asserts to help catch problems a bit earlier and clearer. In particular in cases where we're already asserting that the returned index is valid (!= INVALID_ATTRIBUTE) we also assert that the index is less than the total number of operands. In the get_vlmax_ta_preferred_avl routine it appears like we need to handle these two cases more gracefully as we apparently legitimately query for the merge_op_idx on a fairly arbitrary insn. We just have to make sure to not use the result if it's INVALID_ATTRIBUTE. So for that code we assert that merge_op_idx is either INVALID_ATTRIBUTE or smaller than the number of operands. This patch also adds overrides for 3 patterns to return INVALID_ATTRIBUTE for merge_op_idx, similar to how they already do for mode_idx and avl_type_idx. This has been bootstrapped and regression tested on the bpi & pioneer systems and regression tested for riscv32-elf and riscv64-elf. Waiting on CI before pushing. PR target/121548 gcc/ * config/riscv/riscv-avlprop.cc (get_insn_vtype_mode): Assert MODE_IDX is smaller than the number of operands. (simplify_replace_vlmax_avl): Similarly. (pass_avlprop::get_vlmax_ta_preferred_avl): Similarly. * config/riscv/vector.md: Override merge_op_idx computation for simple moves, just like is done for avl_type_idx and mode_idx.
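The two kinds of check described above can be sketched in isolation. This is not the code in riscv-avlprop.cc: the names `kInvalidAttribute`, `checked_operand_index` and `usable_operand_index` are hypothetical, and the sentinel value is a placeholder chosen for the sketch.

```cpp
#include <cassert>

// Placeholder sentinel standing in for INVALID_ATTRIBUTE (value assumed).
constexpr int kInvalidAttribute = 255;

// Strict form: the caller already requires a valid index, so also require
// that it names an operand the insn actually has.
int checked_operand_index(int idx, int n_operands) {
  assert(idx != kInvalidAttribute);
  assert(idx < n_operands);
  return idx;
}

// Lenient form: the query may legitimately come back invalid, so assert
// only that a non-sentinel index is in range, and report whether the
// index may be used at all.
bool usable_operand_index(int idx, int n_operands) {
  assert(idx == kInvalidAttribute || idx < n_operands);
  return idx != kInvalidAttribute;
}
```

In this sketch an insn with operands 0 and 1 and a default index of 2 trips the assertion immediately rather than reading past the end of the operand array later.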
2025-08-29	Fortran: improve compile-time checking of character dummy arguments [PR93330]	Harald Anlauf	7	-38/+370
	PR fortran/93330 gcc/fortran/ChangeLog: * interface.cc (get_sym_storage_size): Add argument size_known to indicate that the storage size could be successfully determined. (get_expr_storage_size): Likewise. (gfc_compare_actual_formal): Use them to handle zero-sized dummy and actual arguments. If a character formal argument has the pointer or allocatable attribute, or is an array that is not assumed or explicit size, we generate an error by default unless -std=legacy is specified, which falls back to just giving a warning. If -Wcharacter-truncation is given, warn on a character actual argument longer than the dummy. Generate an error for too short scalar character arguments if -std=f* is given instead of just a warning. gcc/testsuite/ChangeLog: * gfortran.dg/argument_checking_15.f90: Adjust dg-pattern. * gfortran.dg/bounds_check_strlen_7.f90: Add dg-pattern. * gfortran.dg/char_length_3.f90: Adjust options. * gfortran.dg/whole_file_24.f90: Add dg-pattern. * gfortran.dg/whole_file_29.f90: Likewise. * gfortran.dg/argument_checking_27.f90: New test.
2025-08-29	RISC-V: Add patterns for vector-scalar IEEE floating-point min	Paul-Antoine Arras	17	-14/+139
	This pattern enables the combine pass (or late-combine, depending on the case) to merge a vec_duplicate into an unspec_vfmin RTL instruction. Before this patch, we have two instructions, e.g.: vfmv.v.f v2,fa0 vfmin.vv v1,v1,v2 After, we get only one: vfmin.vf v1,v1,fa0 gcc/ChangeLog: * config/riscv/autovec-opt.md (vfmin_vf_ieee_<mode>): Add new patterns to combine vec_duplicate + vfmin.vv (unspec) into vfmin.vf. (vfmul_vf_<mode>, vfrdiv_vf_<mode>, vfmin_vf_<mode>): Fix attribute types. * config/riscv/vector.md (@pred_<ieee_fmaxmin_op><mode>_scalar): Allow VLS modes. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f16.c: Add vfmin. * gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f32.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-4-f64.c: Likewise. * gcc.target/riscv/rvv/autovec/vx_vf/vf-5-f16.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vf-5-f32.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vf-5-f64.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vf-6-f16.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vf-6-f32.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vf-6-f64.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vf-7-f16.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vf-7-f32.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vf-7-f64.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vf-8-f16.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vf-8-f32.c: New test. * gcc.target/riscv/rvv/autovec/vx_vf/vf-8-f64.c: New test.
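A loop of the following shape is the kind of source that takes the minimum of each vector element and one scalar. It is a sketch only: the function name `min_with_scalar` is hypothetical, and whether a given GCC actually emits `vfmin.vf` for it depends on the target, the optimization flags and the floating-point options in effect.

```cpp
#include <cmath>
#include <cstddef>

// Element-wise minimum of an array and a single scalar. When vectorized,
// the scalar `s` is the value that would otherwise be broadcast into a
// vector register before a vector-vector minimum.
void min_with_scalar(float *dst, const float *src, float s, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = std::fmin(src[i], s);
}
```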
2025-08-29	x86: Allow by_pieces op when expanding memcpy/memset epilogue	H.J. Lu	11	-0/+132
	Since commit 401199377c50045ede560daf3f6e8b51749c2a87 Author: H.J. Lu <hjl.tools@gmail.com> Date: Tue Jun 17 10:17:17 2025 +0800 x86: Improve vector_loop/unrolled_loop for memset/memcpy uses move_by_pieces and store_by_pieces to expand memcpy/memset epilogue with vector_loop even when targetm.use_by_pieces_infrastructure_p returns false, which triggers gcc_assert (targetm.use_by_pieces_infrastructure_p (len, align, memsetp ? SET_BY_PIECES : STORE_BY_PIECES, optimize_insn_for_speed_p ())); in store_by_pieces. Fix it by: 1. Add by_pieces_in_use to machine_function to indicate that by_pieces op is currently in use. 2. Set and clear by_pieces_in_use when expanding memcpy/memset epilogue with move_by_pieces and store_by_pieces. 3. Define TARGET_USE_BY_PIECES_INFRASTRUCTURE_P to return true if by_pieces_in_use is true. gcc/ PR target/121096 * config/i386/i386-expand.cc (expand_cpymem_epilogue): Set and clear by_pieces_in_use when using by_pieces op. (expand_setmem_epilogue): Likewise. * config/i386/i386.cc (ix86_use_by_pieces_infrastructure_p): New. (TARGET_USE_BY_PIECES_INFRASTRUCTURE_P): Likewise. * config/i386/i386.h (machine_function): Add by_pieces_in_use. gcc/testsuite/ PR target/121096 * gcc.target/i386/memcpy-strategy-14.c: New test. * gcc.target/i386/memcpy-strategy-15.c: Likewise. * gcc.target/i386/memset-strategy-10.c: Likewise. * gcc.target/i386/memset-strategy-11.c: Likewise. * gcc.target/i386/memset-strategy-12.c: Likewise. * gcc.target/i386/memset-strategy-13.c: Likewise. * gcc.target/i386/memset-strategy-14.c: Likewise. * gcc.target/i386/memset-strategy-15.c: Likewise. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
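The three steps above can be sketched outside of GCC. Everything here is hypothetical scaffolding: `machine_state`, `default_policy`, `use_by_pieces_p` and `scoped_by_pieces` are illustrative names, the 16-byte threshold is made up, and the RAII guard is a design choice of this sketch rather than something the commit message describes.

```cpp
// Stand-in for the per-function state that carries the new flag.
struct machine_state {
  bool by_pieces_in_use = false;
};

// Made-up stand-in for the default decision: small sizes only.
bool default_policy(unsigned len) { return len <= 16; }

// Step 3: the hook answers "yes" whenever an expansion is in progress,
// and otherwise defers to the default policy.
bool use_by_pieces_p(const machine_state &m, unsigned len) {
  return m.by_pieces_in_use || default_policy(len);
}

// Steps 1 and 2: set the flag around the epilogue expansion and clear it
// afterwards, here via a scope guard so it cannot be left set.
struct scoped_by_pieces {
  explicit scoped_by_pieces(machine_state &m) : m_(m) {
    m_.by_pieces_in_use = true;
  }
  ~scoped_by_pieces() { m_.by_pieces_in_use = false; }
  machine_state &m_;
};
```

Inside the guarded scope the hook accepts a length that the default policy would reject; once the guard is destroyed the default answer applies again.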
2025-08-29	x86: Handle constant in any modes in setmem_epilogue_gen_val	H.J. Lu	2	-8/+14
	Since the constant passed to setmem_epilogue_gen_val may not be in word_mode, update setmem_epilogue_gen_val to handle any integer modes. gcc/ PR target/121108 * config/i386/i386-expand.cc (setmem_epilogue_gen_val): Don't assert op_mode == word_mode and handle any integer modes. gcc/testsuite/ PR target/121108 * gcc.target/i386/memset-strategy-16.c: New test. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>