riscv-gnu-toolchain/gcc.git

Age	Commit message	Author	Files	Lines
2024-02-06	riscv: Fix compiler warning in thead.cc	Christoph Müllner	1	-1/+2
	A recent commit introduced a compiler warning in thead.cc: error: invalid suffix on literal; C++11 requires a space between literal and string macro [-Werror=literal-suffix] 1144 \| fprintf (file, "(%s),"HOST_WIDE_INT_PRINT_DEC",%u", reg_names[REGNO (addr.reg)], \| ^ This commit addresses this issue and breaks the line such that it won't exceed 80 characters. gcc/ChangeLog: * config/riscv/thead.cc (th_print_operand_address): Fix compiler warning. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
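The diagnostic above appears because C++11 lexes a string literal immediately followed by an identifier as a literal with a suffix. A minimal sketch of the spaced form follows, written in C and using the standard `PRId64` macro as a stand-in for GCC's internal `HOST_WIDE_INT_PRINT_DEC`; the helper name `format_operand` is hypothetical and not taken from thead.cc.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: formats "(reg),offset,shift" into buf.
   PRId64 stands in for GCC's HOST_WIDE_INT_PRINT_DEC (an assumption
   made for illustration).  The spaces around the macro keep the
   source valid as C++11 too, where "..."PRId64 would be read as a
   string literal carrying a user-defined suffix.  */
static int
format_operand (char *buf, size_t len, const char *reg,
                int64_t offset, unsigned shift)
{
  return snprintf (buf, len, "(%s),%" PRId64 ",%u", reg, offset, shift);
}
```

Splitting the format string at the macro boundary is also a natural place to break the line when it would otherwise exceed 80 columns.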
2024-02-05	c++: -frounding-math test [PR109359]	Jason Merrill	1	-0/+8
	This test was fixed by the patch for PR95226, but that patch had no testcase so let's add this one. PR c++/109359 gcc/testsuite/ChangeLog: * g++.dg/ext/frounding-math1.C: New test.
2024-02-05	Update gcc zh_CN.po	Joseph Myers	1	-202/+157
	* zh_CN.po: Update.
2024-02-05	c++: prvalue of array type [PR111286]	Jason Merrill	2	-4/+17
	Here we want to build a prvalue array to bind to the T reference, but we were wrongly trying to strip cv-quals from the array prvalue, which should be treated the same as a class prvalue. PR c++/111286 gcc/cp/ChangeLog: * tree.cc (rvalue): Don't drop cv-quals from an array. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/initlist-array22.C: New test.
2024-02-05	libgo: bump libgo version for GCC 14 release	Ian Lance Taylor	1	-1/+1
	PR go/113668 Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/560676
2024-02-05	compiler: add Type::message_name	Ian Lance Taylor	4	-25/+375
	As we move toward generics, the error messages need to be able to refer to types in a readable manner. Add that capability, and use it today in AST dumps. Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/536716
2024-02-05	x86-64: Find a scratch register for large model profiling	H.J. Lu	4	-15/+214
	2 scratch registers, %r10 and %r11, are available at function entry for large model profiling. But %r10 may be used by stack realignment and we can't use %r10 in this case. Add x86_64_select_profile_regnum to find a caller-saved register which isn't live or a callee-saved register which has been saved on stack in the prologue at entry for large model profiling and sorry if we can't find one. gcc/ PR target/113689 * config/i386/i386.cc (x86_64_select_profile_regnum): New. (x86_function_profiler): Call x86_64_select_profile_regnum to get a scratch register for large model profiling. gcc/testsuite/ PR target/113689 * gcc.target/i386/pr113689-1.c: New file. * gcc.target/i386/pr113689-2.c: Likewise. * gcc.target/i386/pr113689-3.c: Likewise.
2024-02-05	c: Avoid ICE with _BitInt(N) : 0 bitfield [PR113740]	Jakub Jelinek	2	-1/+6
	finish_struct already made sure not to call build_bitint_type for signed _BitInt(2) : 1; or signed _BitInt(2) : 0; bitfields (but instead build a zero precision integral type, we remove it later), this patch makes sure we do it also for unsigned _BitInt(1) : 0; because of the build_bitint_type assertion that precision is >= (unsigned ? 1 : 2). 2024-02-05 Jakub Jelinek <jakub@redhat.com> PR c/113740 * c-decl.cc (finish_struct): Only use build_bitint_type if bit-field has width larger or equal to minimum _BitInt precision. * gcc.dg/bitint-85.c: New test.
2024-02-05	arm: Fix missing bti instruction for virtual thunks	Richard Ball	3	-0/+24
	Adds missing bti instruction at the beginning of a virtual thunk, when bti is enabled. gcc/ChangeLog: * config/arm/arm.cc (arm_output_mi_thunk): Emit insn for bti_c when bti is enabled. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Add v8_1_m_main_pacbti. * g++.target/arm/bti_thunk.C: New test.
2024-02-05	x86-64: Update gcc.target/i386/apx-ndd.c	H.J. Lu	1	-34/+34
	Fix the following issues: 1. Replace long with int64_t to support x32. 2. Replace \\(%rdi\\) with \\(%(?:r\|e)di\\) for memory operand since x32 uses (%edi). 3. Replace %(?:\|r\|e)al with %al in negb scan. * gcc.target/i386/apx-ndd.c: Updated.
2024-02-05	mips: Fix missing mode in neg<mode:MSA>2	Xi Ruoyao	1	-1/+1
	I was too sleepy writing this :(. gcc/ChangeLog: * config/mips/mips-msa.md (neg<mode:MSA>2): Add missing mode for neg.
2024-02-05	MIPS: Fix wrong MSA FP vector negation	Xi Ruoyao	1	-3/+15
	We expanded (neg x) to (minus const0 x) for MSA FP vectors, this is wrong because -0.0 is not 0 - 0.0. This causes some Python tests to fail when Python is built with MSA enabled. Use the bnegi.df instructions to simply reverse the sign bit instead. gcc/ChangeLog: * config/mips/mips-msa.md (elmsgnbit): New define_mode_attr. (neg<mode>2): Change the mode iterator from MSA to IMSA because in FP arithmetic we cannot use (0 - x) for -x. (neg<mode>2): New define_insn to implement FP vector negation, using a bnegi instruction to negate the sign bit.
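The signed-zero problem described above can be reproduced in scalar code. The sketch below assumes `double` is IEEE 754 binary64 and uses hypothetical helper names; it contrasts subtraction from zero with a sign-bit flip, which is the scalar analogue of what a bit-negate instruction does per vector element.

```c
#include <stdint.h>
#include <string.h>

/* Negate by subtracting from zero.  Wrong for x == +0.0, because
   0.0 - 0.0 is +0.0 in the default rounding mode.  */
static double
neg_by_subtraction (double x)
{
  return 0.0 - x;
}

/* Negate by flipping the sign bit (assumes IEEE 754 binary64).  */
static double
neg_by_sign_flip (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  bits ^= UINT64_C (1) << 63;
  memcpy (&x, &bits, sizeof bits);
  return x;
}

/* Returns nonzero if the sign bit of x is set.  */
static int
sign_bit_set (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return (bits >> 63) != 0;
}
```

For an input of +0.0, only the sign-flip variant yields -0.0. This difference is what code printing or comparing signed zeros can observe.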
2024-02-05	tree-optimization/113707 - ICE with VN elimination	Richard Biener	3	-0/+76
	The following avoids different avail answers depending on how the iteration progressed. PR tree-optimization/113707 * tree-ssa-sccvn.cc (rpo_elim::eliminate_avail): After checking the avail set treat out-of-region defines as available. * gcc.dg/torture/pr113707-1.c: New testcase. * gcc.dg/torture/pr113707-2.c: Likewise.
2024-02-05	Vectorizer and address-spaces	Richard Biener	1	-1/+1
	The following makes sure to use the correct pointer mode when building pointer types to a non-default address-space. * tree-vect-data-refs.cc (vect_create_data_ref_ptr): Use the default mode when building a pointer.
2024-02-05	lower-bitint: Remove single label _BitInt switches [PR113737]	Jakub Jelinek	2	-1/+40
	The following testcase ICEs, because group_case_labels_stmt optimizes switch (a.0_7) <default: <L6> [50.00%], case 0: <L7> [50.00%], case 2: <L7> [50.00%]> where L7 block starts with __builtin_unreachable (); to switch (a.0_7) <default: <L6> [50.00%]> and single label GIMPLE_SWITCH is something the switch expansion refuses to lower: if (gimple_switch_num_labels (m_switch) == 1 \|\| range_check_type (index_type) == NULL_TREE) return false; (range_check_type never returns NULL for BITINT_TYPE), but the gimple lowering pass relies on all large/huge _BitInt switches to be lowered by that pass. The following patch just removes those after making the single successor edge EDGE_FALLTHRU. I've done it even if !optimize just in case we'd end up with single case label from earlier passes. 2024-02-05 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/113737 * gimple-lower-bitint.cc (gimple_lower_bitint): If GIMPLE_SWITCH has just a single label, remove it and make single successor edge EDGE_FALLTHRU. * gcc.dg/bitint-84.c: New test.
2024-02-05	i386: Clear REG_UNUSED and REG_DEAD notes from the IL at the end of vzeroupper pass [PR113059]	Jakub Jelinek	1	-0/+26
	The move of the vzeroupper pass from after reload pass to after postreload_cse helped only partially, CSE-like passes can still invalidate those notes (especially REG_UNUSED) if they use some earlier register holding some value later on in the IL. So, either we could try to move it one pass further after gcse2 and hope no later pass invalidates the notes, or the following patch attempts to restore the REG_DEAD/REG_UNUSED state from GCC 13 and earlier, where the LRA or reload passes remove all REG_DEAD/REG_UNUSED notes and the notes reappear only at the start of dse2 pass when it calls df_note_add_problem (); df_analyze (); So, effectively NEXT_PASS (pass_postreload_cse); NEXT_PASS (pass_gcse2); NEXT_PASS (pass_split_after_reload); NEXT_PASS (pass_ree); NEXT_PASS (pass_compare_elim_after_reload); NEXT_PASS (pass_thread_prologue_and_epilogue); passes operate without those notes in the IL. While in GCC 14 mode switching computes the notes problem at the start of vzeroupper, the patch below removes them at the end of the pass again, so that the above passes continue to operate without them. 2024-02-05 Jakub Jelinek <jakub@redhat.com> PR target/113059 * config/i386/i386-features.cc (rest_of_handle_insert_vzeroupper): Remove REG_DEAD/REG_UNUSED notes at the end of the pass before df_analyze call.
2024-02-05	target/113255 - avoid REG_POINTER on a pointer difference	Richard Biener	1	-1/+1
	The following avoids re-using a register holding a pointer (and thus might be REG_POINTER) for the result of a pointer difference computation. That might confuse heuristics in (broken) RTL alias analysis which relies on REG_POINTER indicating that we're dealing with one. This alone doesn't fix anything. PR target/113255 * config/i386/i386-expand.cc (expand_set_or_cpymem_prologue_epilogue_by_misaligned_moves): Use a new pseudo for the skipped number of bytes.
2024-02-04	RISC-V: Add sifive-p450, sifive-p670 to -mcpu	Monk Chiang	4	-1/+85
	gcc/ChangeLog: * config/riscv/riscv-cores.def: Add sifive-p450, sifive-p670. * doc/invoke.texi (RISC-V Options): Add sifive-p450, sifive-p670. gcc/testsuite/ChangeLog: * gcc.target/riscv/mcpu-sifive-p450.c: New test. * gcc.target/riscv/mcpu-sifive-p670.c: New test.
2024-02-04	RISC-V: Support scheduling for sifive p400 series	Monk Chiang	7	-3/+198
	Add sifive p400 series scheduler module. For more information see https://www.sifive.com/cores/performance-p450-470. gcc/ChangeLog: * config/riscv/riscv.md: Include sifive-p400.md. * config/riscv/sifive-p400.md: New file. * config/riscv/riscv-cores.def (RISCV_TUNE): Add parameter. * config/riscv/riscv-opts.h (enum riscv_microarchitecture_type): Add sifive_p400. * config/riscv/riscv.cc (sifive_p400_tune_info): New. * config/riscv/riscv.h (TARGET_SFB_ALU): Update. * doc/invoke.texi (RISC-V Options): Add sifive-p400-series
2024-02-05	Daily bump.	GCC Administrator	4	-1/+74

2024-02-04	xtensa: Fix missing mode warning in "*eqne_zero_masked_bits"	Takayuki 'January June' Suwa	1	-1/+1
	gcc/ChangeLog: * config/xtensa/xtensa.md (*eqne_zero_masked_bits): Add missing ":SI" to the match_operator.
2024-02-04	xtensa: Recover constant synthesis for HImode after LRA transition	Takayuki 'January June' Suwa	1	-8/+14
	After LRA transition, HImode constants that don't fit into signed 12 bits are no longer subject to constant synthesis: /* example */ void test(void) { short foo = 32767; __asm__ ("" :: "r"(foo)); } ;; before .literal_position .literal .LC0, 32767 test: l32r a9, .LC0 ret.n This patch fixes that: ;; after test: movi.n a9, -1 extui a9, a9, 17, 15 ret.n gcc/ChangeLog: * config/xtensa/xtensa.md (SHI): New mode iterator. (2 split patterns related to constsynth): Change to also accept HImode operands.
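The two-instruction sequence in the "after" listing can be checked arithmetically. The sketch below models `extui` as "shift right, then mask to a field width", which is how the Xtensa extract-unsigned-immediate instruction is generally described; the C function names are hypothetical.

```c
#include <stdint.h>

/* Model of EXTUI: extract `width` bits of `value` starting at bit
   `shift`, zero-extended.  Valid here for widths below 32.  */
static uint32_t
extui (uint32_t value, unsigned shift, unsigned width)
{
  return (value >> shift) & ((UINT32_C (1) << width) - 1);
}

/* Mirrors the synthesized sequence from the commit message.  */
static uint32_t
synthesize_32767 (void)
{
  uint32_t a9 = (uint32_t) -1;   /* movi.n a9, -1  (all ones) */
  return extui (a9, 17, 15);     /* extui a9, a9, 17, 15 */
}
```

Shifting all ones right by 17 leaves 15 set bits, so the result is 0x7fff, the constant that previously required a literal-pool load.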
2024-02-04	[committed] Reasonably handle SUBREGs in risc-v cost modeling	Jeff Law	2	-7/+26
	This patch adjusts the costs so that we treat REG and SUBREG expressions the same for costing. This was motivated by bt_skip_func and bt_find_func in xz and results in nearly a 5% improvement in the dynamic instruction count for input #2 and smaller, but definitely visible improvements pretty much across the board. Exceptions would be perlbench input #1 and exchange2 which showed very small regressions. In the bt_find_func and bt_skip_func cases we have something like this: > (insn 10 7 11 2 (set (reg/v:DI 136 [ x ]) > (zero_extend:DI (subreg/s/u:SI (reg/v:DI 137 [ a ]) 0))) "zz.c":6:21 387 {zero_extendsidi2_bitmanip} > (nil)) > (insn 11 10 12 2 (set (reg:DI 142 [ _1 ]) > (plus:DI (reg/v:DI 136 [ x ]) > (reg/v:DI 139 [ b ]))) "zz.c":7:23 5 {adddi3} > (nil)) [ ... ]> (insn 13 12 14 2 (set (reg:DI 143 [ _2 ]) > (plus:DI (reg/v:DI 136 [ x ]) > (reg/v:DI 141 [ c ]))) "zz.c":8:23 5 {adddi3} > (nil)) Note the two uses of (reg 136). The best way to handle that in combine might be a 3->2 split. But there's a much better approach if we look at fwprop... (set (reg:DI 142 [ _1 ]) (plus:DI (zero_extend:DI (subreg/s/u:SI (reg/v:DI 137 [ a ]) 0)) (reg/v:DI 139 [ b ]))) change not profitable (cost 4 -> cost 8) So that should be the same cost as a regular DImode addition when the ZBA extension is enabled. But it ends up costing more because the clause to cost this variant isn't prepared to handle a SUBREG. That results in the RTL above having too high a cost and fwprop gives up. One approach would be to replace the REG_P with REG_P \|\| SUBREG_P in the costing code. I ultimately decided against that and instead check if the operand in question passes register_operand. By far the most important case to handle is the DImode PLUS. But for the sake of consistency, I changed the other instances in riscv_rtx_costs as well. For those other cases we're talking about improvements in the .000001% range. 
While we are into stage4, this just hits cost modeling which we've generally agreed is still appropriate (though we were mostly talking about vector). So I'm going to extend that general agreement ever so slightly and include scalar cost modeling :-) gcc/ * config/riscv/riscv.cc (riscv_rtx_costs): Handle SUBREG and REG similarly. gcc/testsuite/ * gcc.target/riscv/reg_subreg_costs.c: New test. Co-authored-by: Jivan Hakobyan <jivanhakobyan9@gmail.com>
2024-02-04	LoongArch: Fix wrong LSX FP vector negation	Xi Ruoyao	3	-27/+18
	We expanded (neg x) to (minus const0 x) for LSX FP vectors, this is wrong because -0.0 is not 0 - 0.0. This causes some Python tests to fail when Python is built with LSX enabled. Use the vbitrevi.{d/w} instructions to simply reverse the sign bit instead. We are already doing this for LASX and now we can unify them into simd.md. gcc/ChangeLog: * config/loongarch/lsx.md (neg<mode:FLSX>2): Remove the incorrect expand. * config/loongarch/simd.md (simdfmt_as_i): New define_mode_attr. (elmsgnbit): Likewise. (neg<mode:FVEC>2): New define_insn. * config/loongarch/lasx.md (negv4df2, negv8sf2): Remove as they are now instantiated in simd.md.
2024-02-04	LoongArch: Avoid out-of-bounds access in loongarch_symbol_insns	Xi Ruoyao	1	-1/+2
	We call loongarch_symbol_insns with mode = MAX_MACHINE_MODE sometimes. But in loongarch_symbol_insns: if (LSX_SUPPORTED_MODE_P (mode) \|\| LASX_SUPPORTED_MODE_P (mode)) return 0; And LSX_SUPPORTED_MODE_P is defined as: #define LSX_SUPPORTED_MODE_P(MODE) \ (ISA_HAS_LSX \ && GET_MODE_SIZE (MODE) == UNITS_PER_LSX_REG ... ... GET_MODE_SIZE is expanded to a call to mode_to_bytes, which is defined: ALWAYS_INLINE poly_uint16 mode_to_bytes (machine_mode mode) { #if GCC_VERSION >= 4001 return (__builtin_constant_p (mode) ? mode_size_inline (mode) : mode_size[mode]); #else return mode_size[mode]; #endif } There is an assertion in mode_size_inline: gcc_assert (mode >= 0 && mode < NUM_MACHINE_MODES); Note that NUM_MACHINE_MODES = MAX_MACHINE_MODE (emitted by genmodes.cc), thus if __builtin_constant_p (mode) is evaluated true (it happens when GCC is bootstrapped with LTO+PGO), the assertion will be triggered and cause an ICE. OTOH if __builtin_constant_p (mode) is evaluated false, mode_size[mode] is still an out-of-bound array access (the length of the mode_size array is NUM_MACHINE_MODES). So we shouldn't call LSX_SUPPORTED_MODE_P or LASX_SUPPORTED_MODE_P with MAX_MACHINE_MODE in loongarch_symbol_insns. This is very similar to a MIPS bug PR98491 fixed by me about 3 years ago. gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_symbol_insns): Do not use LSX_SUPPORTED_MODE_P or LASX_SUPPORTED_MODE_P if mode is MAX_MACHINE_MODE.
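The pattern behind this fix, a sentinel enumerator equal to the table length being used as an index, can be shown in miniature. Everything in the sketch below (the enum, the table, the function names) is a hypothetical stand-in and not GCC's real machine-mode machinery.

```c
/* Hypothetical miniature of a mode table.  MAX_MODE doubles as
   "no mode" sentinel and as the number of real modes.  */
enum mode { MODE_QI, MODE_HI, MODE_SI, MODE_DI, MAX_MODE };

static const unsigned char mode_size[MAX_MODE] = { 1, 2, 4, 8 };

#define UNITS_PER_VEC_REG 8

/* Unguarded predicate, like the macro in the commit message: it
   indexes the table directly, so MAX_MODE would read past the end.  */
static int
vec_supported_mode_p (enum mode m)
{
  return mode_size[m] == UNITS_PER_VEC_REG;
}

/* Guarded caller: test for the sentinel before the table lookup.  */
static int
symbol_insns (enum mode m)
{
  if (m != MAX_MODE && vec_supported_mode_p (m))
    return 0;
  return 1;
}
```

Checking the sentinel at the call site keeps the predicate itself unchanged for the callers that always pass a real mode.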
2024-02-04	LoongArch: testsuite: Fix gcc.dg/vect/vect-reduc-mul_{1, 2}.c FAIL.	Li Wei	1	-55/+163
	This FAIL was introduced from r14-6908. The reason is that when merging constant vector permutation implementations, the 128-bit matching situation was not fully considered. In fact, the expansion of 128-bit vectors after merging only supports value-based 4 elements set shuffle, so this time is a complete implementation of the entire 128-bit vector constant permutation, and some structural adjustments have also been made to the code. gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_expand_vselect): Adjust. (loongarch_expand_vselect_vconcat): Ditto. (loongarch_try_expand_lsx_vshuf_const): New, use vshuf to implement all 128-bit constant permutation situations. (loongarch_expand_lsx_shuffle): Adjust and rename function name. (loongarch_is_imm_set_shuffle): Renamed function name. (loongarch_expand_vec_perm_even_odd): Function forward declaration. (loongarch_expand_vec_perm_even_odd_1): Add implement for 128-bit extract-even and extract-odd permutations. (loongarch_is_odd_extraction): Delete. (loongarch_is_even_extraction): Ditto. (loongarch_expand_vec_perm_const): Adjust.
2024-02-04	d: Merge dmd, druntime a6f1083699, phobos 31dedd7da	Iain Buclaw	65	-3150/+3401
	D front-end changes: - Import dmd v2.107.0. - Character postfixes can now also be used for integers of size two or four. D run-time changes: - Import druntime v2.107.0. Phobos changes: - Import phobos v2.107.0. gcc/d/ChangeLog: * dmd/MERGE: Merge upstream dmd a6f1083699. * dmd/VERSION: Bump version to v2.107.0 * Make-lang.in (D_FRONTEND_OBJS): Add d/pragmasem.o. * d-builtins.cc (strip_type_modifiers): Update for new front-end interface. * d-codegen.cc (declaration_type): Likewise. (parameter_type): Likewise. * d-target.cc (TargetCPP::parameterType): Likewise. * expr.cc (ExprVisitor::visit (IndexExp *)): Likewise. (ExprVisitor::visit (VarExp *)): Likewise. (ExprVisitor::visit (AssocArrayLiteralExp *)): Likewise. * runtime.cc (get_libcall_type): Likewise. * typeinfo.cc (TypeInfoVisitor::visit (TypeInfoConstDeclaration *)): Likewise. (TypeInfoVisitor::visit (TypeInfoInvariantDeclaration *)): Likewise. (TypeInfoVisitor::visit (TypeInfoSharedDeclaration *)): Likewise. (TypeInfoVisitor::visit (TypeInfoWildDeclaration *)): Likewise. * types.cc (build_ctype): Likewise. libphobos/ChangeLog: * libdruntime/MERGE: Merge upstream druntime a6f1083699. * src/MERGE: Merge upstream phobos 31dedd7da.
2024-02-04	Daily bump.	GCC Administrator	4	-1/+69

2024-02-03	Fix xfail for 32-bit hppa*-*-* in gcc.dg/pr84877.c	John David Anglin	1	-1/+1
	2024-02-03 John David Anglin <danglin@gcc.gnu.org> gcc/testsuite/ChangeLog: * gcc.dg/pr84877.c: Adjust xfail parentheses.
2024-02-03	libgfortran: EN0.0E0 and ES0.0E0 format editing.	Jerry DeLisle	4	-5/+77
	F2018 and F2023 standards added zero width exponents. This required additional special handling in the process of building formatted floating point strings. G formatting uses either F or E formatting as documented in write_float.def comments. This logic changes the format token from FMT_G to FMT_F or FMT_E. The new formatting requirements interfere with this process when a FMT_G float string is being built. To avoid this, a new component called 'pushed' is added to the fnode structure to save this condition. The 'pushed' condition is then used to bypass portions of the new ES,E,EN, and D formatting, falling through to the existing default formatting which is retained. libgfortran/ChangeLog: PR libfortran/111022 * io/format.c (get_fnode): Update initialization of fnode. (parse_format_list): Initialization. * io/format.h (struct fnode): Added the new 'pushed' component. * io/write.c (select_buffer): Whitespace. (write_real): Whitespace. (write_real_w0): Adjust logic for the d == 0 condition. * io/write_float.def (determine_precision): Whitespace. (build_float_string): Calculate width of ..E0 exponents and adjust logic accordingly. (build_infnan_string): Whitespace. (CALCULATE_EXP): Whitespace. (quadmath_snprintf): Whitespace. (determine_en_precision): Whitespace. gcc/testsuite/ChangeLog: PR libfortran/111022 * gfortran.dg/fmt_error_10.f: Show D+0 exponent. * gfortran.dg/pr96436_4.f90: Show E+0 exponent. * gfortran.dg/pr96436_5.f90: Show E+0 exponent. * gfortran.dg/pr111022.f90: New test.
2024-02-03	wide-int: Fix up wi::bswap_large [PR113722]	Jakub Jelinek	2	-6/+27
	Since bswap has been converted from a method to a function we miscompile the following testcase. The problem is the assumption that the passed in len argument (number of limbs in the xval array) is the upper bound for the bswap result, which is true only if precision is <= 64. If precision is larger than that, e.g. 128 as in the testcase, if the argument has only one limb (i.e. 0 to ~(unsigned HOST_WIDE_INT) 0), the result can still need 2 limbs for that precision, or generally BLOCKS_NEEDED (precision) limbs, it all depends on how many least significant limbs of the operand are zero. bswap_large as implemented only cleared len limbs of result, then swapped the bytes (invoking UB when oring something in all the limbs above it) and finally passed len to canonize, saying that more limbs aren't needed. The following patch fixes it by renaming len to xlen (so that it is clear it is X's length), using it solely for safe_uhwi argument when we attempt to read from X, and using new len = BLOCKS_NEEDED (precision) instead in the other two spots (i.e. when clearing the val array, turned it also into memset, and in canonize argument). wi::bswap asserts it isn't invoked on widest_int, so we are always invoked on wide_int or similar and those have preallocated result sized for the corresponding precision (i.e. BLOCKS_NEEDED (precision)). 2024-02-03 Jakub Jelinek <jakub@redhat.com> PR middle-end/113722 * wide-int.cc (wi::bswap_large): Rename third argument from len to xlen and adjust use in safe_uhwi. Add len variable, set it to BLOCKS_NEEDED (precision) and use it for clearing of val and as canonize argument. Clear val using memset instead of a loop. * gcc.dg/pr113722.c: New test.
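The limb-count issue described above can be illustrated with a much simplified model: 64-bit limbs, precisions that are multiples of 64, and no canonicalization of the result. The names `read_limb` and `bswap_limbs` are hypothetical; only `BLOCKS_NEEDED` borrows its name from the commit message, and its definition here is an assumption.

```c
#include <stdint.h>

#define LIMB_BITS 64
#define BLOCKS_NEEDED(prec) (((prec) + LIMB_BITS - 1) / LIMB_BITS)

/* Portable byte swap of one 64-bit limb.  */
static uint64_t
bswap64 (uint64_t x)
{
  uint64_t r = 0;
  for (int i = 0; i < 8; i++)
    {
      r = (r << 8) | (x & 0xff);
      x >>= 8;
    }
  return r;
}

/* Read limb i of a compressed value that stores only xlen limbs;
   limbs above xlen are the sign extension of the top stored limb.  */
static uint64_t
read_limb (const uint64_t *xval, unsigned xlen, unsigned i)
{
  if (i < xlen)
    return xval[i];
  return (xval[xlen - 1] >> 63) ? ~UINT64_C (0) : 0;
}

/* Byte-swap a value of `precision` bits.  The result length comes
   from the precision, not from xlen, which is the point of the fix.
   Returns the number of limbs written to val.  */
static unsigned
bswap_limbs (uint64_t *val, const uint64_t *xval, unsigned xlen,
             unsigned precision)
{
  unsigned len = BLOCKS_NEEDED (precision);
  for (unsigned i = 0; i < len; i++)
    val[len - 1 - i] = bswap64 (read_limb (xval, xlen, i));
  return len;
}
```

With a 128-bit precision and a one-limb input, the swapped bytes land in the upper limb, so a result buffer sized by the input length would be one limb too short.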
2024-02-03	ggc-common: Fix save PCH assertion	Jakub Jelinek	1	-1/+1
	We are getting a gnuradio PCH ICE /usr/include/pybind11/stl.h:447:1: internal compiler error: in gt_pch_save, at ggc-common.cc:693 0x1304e7d gt_pch_save(_IO_FILE*) ../../gcc/ggc-common.cc:693 0x12a45fb c_common_write_pch() ../../gcc/c-family/c-pch.cc:175 0x18ad711 c_parse_final_cleanups() ../../gcc/cp/decl2.cc:5062 0x213988b c_common_parse_file() ../../gcc/c-family/c-opts.cc:1319 (unfortunately it isn't reproducible always, but often needs up to 100 attempts, isn't reproducible in a cross etc.). The bug is in the assertion I've added in gt_pch_save when adding relocation support for the PCH files in case they happen not to be mmapped at the selected address. addr is a relocated address which points to a location in the PCH blob (starting at mmi.preferred_base, with mmi.size bytes) which contains a pointer that needs to be relocated. So the assertion is meant to verify the address is within the PCH blob, obviously it needs to be equal or above mmi.preferred_base, but I got the other comparison wrong and when one is very unlucky and the last sizeof (void *) bytes of the blob happen to be a pointer which needs to be relocated, such as on the s390x host addr 0x8008a04ff8, mmi.preferred_base 0x8000000000 and mmi.size 0x8a05000, addr + sizeof (void *) is equal to mmi.preferred_base + mmi.size and that is still fine, both addresses are end of something. 2024-02-03 Jakub Jelinek <jakub@redhat.com> * ggc-common.cc (gt_pch_save): Allow addr to be equal to mmi.preferred_base + mmi.size - sizeof (void *).
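The boundary condition can be checked with the numbers quoted in the message. The sketch below is a standalone range check with a hypothetical name, not the assertion from ggc-common.cc; it treats both `addr + slot` and `base + size` as one-past-the-end addresses, so equality is accepted.

```c
#include <stdint.h>

/* Returns nonzero if a slot of `slot` bytes starting at addr lies
   entirely inside the blob [base, base + size).  The end of the slot
   may coincide with the end of the blob, hence <= rather than <.  */
static int
slot_in_blob (uint64_t addr, uint64_t base, uint64_t size, uint64_t slot)
{
  return addr >= base && addr + slot <= base + size;
}
```

For addr 0x8008a04ff8, base 0x8000000000, size 0x8a05000 and an 8-byte pointer, both end addresses are 0x8008a05000. A strict `<` comparison would reject this valid last slot.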
2024-02-03	d: Merge dmd, druntime e770945277, phobos 6d6e0b9b9	Iain Buclaw	118	-2937/+3959
	D front-end changes: - Import latest fixes from dmd v2.107.0-beta.1. - Hex strings can now be cast to integer arrays. - Add support for Interpolated Expression Sequences. D runtime changes: - Import latest fixes from druntime v2.107.0-beta.1. - New core.interpolation module to provide run-time support for D interpolated expression sequence literals. Phobos changes: - Import latest fixes from phobos v2.107.0-beta.1. - `std.range.primitives.isBidirectionalRange', and `std.range.primitives.isRandomAccessRange' now take an optional element type. gcc/d/ChangeLog: * dmd/MERGE: Merge upstream dmd e770945277. * Make-lang.in (D_FRONTEND_OBJS): Add d/basicmangle.o, d/enumsem.o, d/funcsem.o, d/templatesem.o. * d-builtins.cc (build_frontend_type): Update for new front-end interface. * d-codegen.cc (declaration_type): Likewise. (parameter_type): Likewise. * d-incpath.cc (add_globalpaths): Likewise. (add_filepaths): Likewise. (add_import_paths): Likewise. * d-lang.cc (d_init_options): Likewise. (d_handle_option): Likewise. (d_parse_file): Likewise. * decl.cc (DeclVisitor::finish_vtable): Likewise. (DeclVisitor::visit (FuncDeclaration *)): Likewise. (get_symbol_decl): Likewise. * expr.cc (ExprVisitor::visit (StringExp *)): Likewise. Implement support for 8-byte hexadecimal strings. * typeinfo.cc (create_tinfo_types): Update internal TypeInfo representation. (TypeInfoVisitor::visit (TypeInfoConstDeclaration *)): Update for new front-end interface. (TypeInfoVisitor::visit (TypeInfoInvariantDeclaration *)): Likewise. (TypeInfoVisitor::visit (TypeInfoSharedDeclaration *)): Likewise. (TypeInfoVisitor::visit (TypeInfoWildDeclaration *)): Likewise. (TypeInfoVisitor::visit (TypeInfoClassDeclaration *)): Move data for TypeInfo_Class.nameSig to the end of the object. (create_typeinfo): Update for new front-end interface. libphobos/ChangeLog: * libdruntime/MERGE: Merge upstream druntime e770945277. * libdruntime/Makefile.am (DRUNTIME_SOURCES): Add core/interpolation.d. 
* libdruntime/Makefile.in: Regenerate. * src/MERGE: Merge upstream phobos 6d6e0b9b9.
2024-02-03	LoongArch: Fix an ODR violation	Xi Ruoyao	2	-2/+3
	When bootstrapping GCC 14 with --with-build-config=bootstrap-lto, an ODR violation is detected: ../../gcc/config/loongarch/loongarch-opts.cc:57: warning: 'abi_minimal_isa' violates the C++ One Definition Rule [-Wodr] 57 \| abi_minimal_isa[N_ABI_BASE_TYPES][N_ABI_EXT_TYPES]; ../../gcc/config/loongarch/loongarch-def.cc:186: note: 'abi_minimal_isa' was previously declared here 186 \| abi_minimal_isa = array<array<loongarch_isa, N_ABI_EXT_TYPES>, ../../gcc/config/loongarch/loongarch-def.cc:186: note: code may be misoptimized unless '-fno-strict-aliasing' is used Fix it by adding a proper declaration of abi_minimal_isa into loongarch-def.h and remove the ODR-violating local declaration in loongarch-opts.cc. gcc/ChangeLog: * config/loongarch/loongarch-def.h (abi_minimal_isa): Declare. * config/loongarch/loongarch-opts.cc (abi_minimal_isa): Remove the ODR-violating local declaration.
2024-02-03	Daily bump.	GCC Administrator	7	-1/+526

2024-02-02	c++: requires-exprs and partial constraint subst [PR110006]	Patrick Palka	4	-13/+84
	In r11-3261-gb28b621ac67bee we made tsubst_requires_expr never partially substitute into a requires-expression so as to avoid checking its requirements out of order during e.g. generic lambda regeneration. These PRs however illustrate that we still sometimes do need to partially substitute into a requires-expression, in particular when it appears in associated constraints that we're directly substituting for sake of declaration matching or dguide constraint rewriting. In these cases we're being called from tsubst_constraint during which processing_constraint_expression_p is true, so this patch checks this predicate to control whether we defer substitution or partially substitute. In turn, we now need to propagate semantic tsubst flags through tsubst_requires_expr rather than just using tf_none, notably for sake of dguide constraint rewriting which sets tf_dguide. PR c++/110006 PR c++/112769 gcc/cp/ChangeLog: * constraint.cc (subst_info::quiet): Accommodate non-diagnostic tsubst flags. (tsubst_valid_expression_requirement): Likewise. (tsubst_simple_requirement): Return a substituted _REQ node when processing_template_decl. (tsubst_type_requirement_1): Accommodate non-diagnostic tsubst flags. (tsubst_type_requirement): Return a substituted _REQ node when processing_template_decl. (tsubst_compound_requirement): Likewise. Accommodate non-diagnostic tsubst flags. (tsubst_nested_requirement): Likewise. (tsubst_requires_expr): Don't defer partial substitution when processing_constraint_expression_p is true, in which case return a substituted REQUIRES_EXPR. * pt.cc (tsubst_expr) <case REQUIRES_EXPR>: Accommodate non-diagnostic tsubst flags. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/class-deduction-alias18.C: New test. * g++.dg/cpp2a/concepts-friend16.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>
2024-02-03	PR modula2/113730 Unexpected handling of mixed-precision integer arithmetic	Gaius Mulley	16	-55/+463
	This patch fixes a bug which occurs when an expression is created with a ZType and an integer or cardinal. The resulting subexpression is incorrectly given a ZType which allows compatibility with another integer/cardinal type. The solution was to fix the subexpression type. In turn this required a minor change to SArgs.mod. gcc/m2/ChangeLog: PR modula2/113730 * gm2-compiler/M2Base.mod (IsUserType): New procedure function. (MixTypes): Use IsUserType instead of IsType before calling MixTypes. * gm2-compiler/M2GenGCC.mod (GetTypeMode): Remove and import from SymbolTable. (CodeBinaryCheck): Replace call to MixTypes with MixTypesBinary. (CodeBinary): Replace call to MixTypes with MixTypesBinary. (CodeIfLess): Replace MixTypes with ComparisonMixTypes. (CodeIfGre): Replace MixTypes with ComparisonMixTypes. (CodeIfLessEqu): Replace MixTypes with ComparisonMixTypes. (CodeIfGreEqu): Replace MixTypes with ComparisonMixTypes. (CodeIfEqu): Replace MixTypes with ComparisonMixTypes. (CodeIfNotEqu): Replace MixTypes with ComparisonMixTypes. (ComparisonMixTypes): New procedure function. * gm2-compiler/M2Quads.mod (BuildEndFor): Replace GenQuadO with GenQuadOtok and pass tokenpos for the operands to the AddOp and XIndrOp. (CheckRangeIncDec): Check etype against NulSym and dtype for a pointer and set etype to Address. (BuildAddAdrFunction): New variable opa. Convert operand to an address and save result in opa. Replace GenQuad with GenQuadOtok. (BuildSubAdrFunction): New variable opa. Convert operand to an address and save result in opa. Replace GenQuad with GenQuadOtok. (BuildDiffAdrFunction): New variable opa. Convert operand to an address and save result in opa. Replace GenQuad with GenQuadOtok. (calculateMultipicand): Replace GenQuadO with GenQuadOtok. (ConvertToAddress): New procedure function. (BuildDynamicArray): Convert index to address before adding to the base. * gm2-compiler/SymbolTable.def (GetTypeMode): New procedure function. 
* gm2-compiler/SymbolTable.mod (GetTypeMode): New procedure function implemented (moved from M2GenGCC.mod). * gm2-libs/SArgs.mod (GetArg): Replace cast to PtrToChar with ADDRESS. gcc/testsuite/ChangeLog: PR modula2/113730 * gm2/extensions/fail/arith1.mod: New test. * gm2/extensions/fail/arith2.mod: New test. * gm2/extensions/fail/arith3.mod: New test. * gm2/extensions/fail/arith4.mod: New test. * gm2/extensions/fail/arithpromote.mod: New test. * gm2/extensions/fail/extensions-fail.exp: New test. * gm2/linking/fail/badimp.def: New test. * gm2/linking/fail/badimp.mod: New test. * gm2/linking/fail/linking-fail.exp: New test. * gm2/linking/fail/testbadimp.mod: New test. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2024-02-02	middle-end: check memory accesses in the destination block [PR113588].	Tamar Christina	3	-47/+104
	When analyzing loads for early break it was always the intention that for the exit where things get moved to we only check the loads that can be reached from the condition. However the main loop checks all loads and we skip the destination BB. As such we never actually check the loads reachable from the COND in the last BB unless this BB was also the exit chosen by the vectorizer. This leads us to incorrectly vectorize the loop in the PR and in doing so access out of bounds. gcc/ChangeLog: PR tree-optimization/113588 PR tree-optimization/113467 * tree-vect-data-refs.cc (vect_analyze_data_ref_dependence): Choose correct dest and fix checks. (vect_analyze_early_break_dependences): Update comments. gcc/testsuite/ChangeLog: PR tree-optimization/113588 PR tree-optimization/113467 * gcc.dg/vect/vect-early-break_108-pr113588.c: New test. * gcc.dg/vect/vect-early-break_109-pr113588.c: New test.
2024-02-02	Fix some of vect-avg-*.c testcases	Andrew Pinski	12	-12/+24
	The vect-avg-*.c testcases are trying to make sure the AVG internal functions are used and that no promotion to `vector unsigned short` is done, but when V4QI is implemented, `vector(2) unsigned short` shows up in the detail dump file and causes the failure. To fix this, check the optimized dump instead of the vect dump for `vector unsigned short` to make sure the vectorizer does not do the promotion. Built and tested for aarch64-linux-gnu. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-avg-1.c: Check optimized dump for `vector signed short` instead of the `vect` dump. * gcc.dg/vect/vect-avg-11.c: Likewise. * gcc.dg/vect/vect-avg-12.c: Likewise. * gcc.dg/vect/vect-avg-13.c: Likewise. * gcc.dg/vect/vect-avg-14.c: Likewise. * gcc.dg/vect/vect-avg-2.c: Likewise. * gcc.dg/vect/vect-avg-3.c: Likewise. * gcc.dg/vect/vect-avg-4.c: Likewise. * gcc.dg/vect/vect-avg-5.c: Likewise. * gcc.dg/vect/vect-avg-6.c: Likewise. * gcc.dg/vect/vect-avg-7.c: Likewise. * gcc.dg/vect/vect-avg-8.c: Likewise. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-02-03	d: Merge dmd, druntime bce5c1f7b5, phobos e4d0dd513.	Iain Buclaw	43	-1359/+1843
	D front-end changes: - Import latest changes from dmd v2.107.0-beta.1. - Keywords like `__FILE__' are now always evaluated at the callsite. D runtime changes: - Import latest changes from druntime v2.107.0-beta.1. - Added `nameSig' field to TypeInfo_Class in object.d. Phobos changes: - Import latest changes from phobos v2.107.0-beta.1. gcc/d/ChangeLog: * dmd/MERGE: Merge upstream dmd bce5c1f7b5. * d-attribs.cc (build_attributes): Update for new front-end interface. * d-lang.cc (d_parse_file): Likewise. * decl.cc (DeclVisitor::visit (VarDeclaration *)): Likewise. * expr.cc (build_lambda_tree): New function. (ExprVisitor::visit (FuncExp *)): Use build_lambda_tree. (ExprVisitor::visit (SymOffExp *)): Likewise. (ExprVisitor::visit (VarExp *)): Likewise. * typeinfo.cc (create_tinfo_types): Add two ulong fields to internal TypeInfo representation. (TypeInfoVisitor::visit (TypeInfoClassDeclaration *)): Emit stub data for TypeInfo_Class.nameSig. (TypeInfoVisitor::visit (TypeInfoStructDeclaration *)): Update for new front-end interface. libphobos/ChangeLog: * libdruntime/MERGE: Merge upstream druntime bce5c1f7b5. * src/MERGE: Merge upstream phobos e4d0dd513.
2024-02-03	d: Merge dmd, druntime d8e3976a58, phobos 7a6e95688	Iain Buclaw	183	-618/+1025
	D front-end changes: - Import dmd v2.107.0-beta.1. - A string literal as an assert condition is deprecated. - Added `@standalone` for module constructors. D runtime changes: - Import druntime v2.107.0-beta.1. Phobos changes: - Import phobos v2.107.0-beta.1. gcc/d/ChangeLog: * dmd/MERGE: Merge upstream dmd d8e3976a58. * dmd/VERSION: Bump version to v2.107.0-beta.1. * d-lang.cc (d_parse_file): Update for new front-end interface. * modules.cc (struct module_info): Add standalonectors. (build_module_tree): Implement @standalone. (register_module_decl): Likewise. libphobos/ChangeLog: * libdruntime/MERGE: Merge upstream druntime d8e3976a58. * src/MERGE: Merge upstream phobos 7a6e95688.
2024-02-03	d: Merge upstream dmd, druntime f1a045928e	Iain Buclaw	57	-1873/+1901
	D front-end changes: - Import dmd v2.106.1-rc.1. - Unrecognized pragmas are no longer an error by default. D runtime changes: - Import druntime v2.106.1-rc.1. gcc/d/ChangeLog: * dmd/MERGE: Merge upstream dmd f1a045928e. * dmd/VERSION: Bump version to v2.106.1-rc.1. * gdc.texi (fignore-unknown-pragmas): Update documentation. * d-builtins.cc (covariant_with_builtin_type_p): Update for new front-end interface. * d-lang.cc (d_parse_file): Likewise. * typeinfo.cc (make_frontend_typeinfo): Likewise. libphobos/ChangeLog: * libdruntime/MERGE: Merge upstream druntime f1a045928e. * libdruntime/Makefile.am (DRUNTIME_DSOURCES): Add core/stdc/stdatomic.d. * libdruntime/Makefile.in: Regenerate.
2024-02-02	libgo: better error messages for unknown GOARCH/GOOS	Ian Lance Taylor	1	-1/+1
	PR go/113530 Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/557655
2024-02-02	compiler: export the type "any" as a builtin	Ian Lance Taylor	4	-2/+5
	Otherwise we can't tell the difference between builtin type "any" and a locally defined type "any". This will require updates to the gccgo export data parsers in the main Go repo and the x/tools repo. These updates are https://go.dev/cl/537195 and https://go.dev/cl/537215. Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/536715
2024-02-02	libgcc: Fix up _BitInt division [PR113604]	Jakub Jelinek	2	-0/+92
	The following testcase ends up with SIGFPE in __divmodbitint4. The problem is a thinko in my attempt to implement Knuth's algorithm. The algorithm does (where b is 65536, i.e. one larger than what fits in their unsigned short word): // Compute estimate qhat of q[j]. qhat = (un[j+n]*b + un[j+n-1])/vn[n-1]; rhat = (un[j+n]*b + un[j+n-1]) - qhat*vn[n-1]; again: if (qhat >= b || qhat*vn[n-2] > b*rhat + un[j+n-2]) { qhat = qhat - 1; rhat = rhat + vn[n-1]; if (rhat < b) goto again; } The problem is that it uses a double-word / word -> double-word division (and modulo), while all we have is udiv_qrnnd unless we'd want to do further library calls, and udiv_qrnnd is a double-word / word -> word division and modulo. Now, as the algorithm description says, it can produce at most word bits + 1 bit quotient. And I believe that actually the highest qhat the original algorithm can produce is (1 << word_bits) + 1. The algorithm performs earlier canonicalization where both the divisor and dividend are shifted left such that divisor has msb set. If it has msb set already before, no shifting occurs but we start with added 0 limb, so in the first uv1:uv0 double-word uv1 is 0 and so we can't get too high qhat, if shifting occurs, the first limb of dividend is shifted right by UWtype bits - shift count into a new limb, so again in the first iteration in the uv1:uv0 double-word uv1 doesn't have msb set while vv1 does and qhat has to fit into word. In the following iterations, previous iteration should guarantee that the previous quotient digit is correct. Even if the divisor was the maximal possible vv1:all_ones_in_all_lower_limbs, if the old uv0:lower_limbs would be larger or equal to the divisor, the previous quotient digit would increase and another divisor would be subtracted, which I think implies that in the next iteration in uv1:uv0 double-word uv1 <= vv1, but uv0 could be up to all ones, e.g. 
in case of all lower limbs of divisor being all ones and at least one dividend limb below uv0 being not all ones. So, we can e.g. for 64-bit UWtype see uv1:uv0 / vv1 0x8000000000000000UL:0xffffffffffffffffUL / 0x8000000000000000UL or 0xffffffffffffffffUL:0xffffffffffffffffUL / 0xffffffffffffffffUL In all these cases (when uv1 == vv1 && uv0 >= uv1), qhat is 0x10000000000000001UL, i.e. 2 more than fits into UWtype result, if uv1 == vv1 && uv0 < uv1 it would be 0x10000000000000000UL, i.e. 1 more than fits into UWtype result. Because we only have udiv_qrnnd which can't deal with those too large cases (SIGFPEs or otherwise invokes undefined behavior on those), I've tried to handle the uv1 >= vv1 case separately, but for one thing I thought it would be at most 1 larger than what fits, and for two have actually subtracted vv1:vv1 from uv1:uv0 instead of subtracting 0:vv1 from uv1:uv0. For the uv1 < vv1 case, the implementation already performs roughly what the algorithm does. Now, let's see what happens with the two possible extra cases in the original algorithm. If uv1 == vv1 && uv0 < uv1, qhat above would be b, so we take if (qhat >= b, decrement qhat by 1 (it becomes b - 1), add vn[n-1] aka vv1 to rhat and goto again if rhat < b (but because qhat already fits we can goto to the again label in the uv1 < vv1 code). rhat in this case is uv0 and rhat + vv1 can but doesn't have to overflow, say for uv0 42UL and vv1 0x8000000000000000UL it will not (and so we should goto again), while for uv0 0x8000000000000000UL and vv1 0x8000000000000001UL it will (and we shouldn't goto again). If uv1 == vv1 && uv0 >= uv1, qhat above would be b + 1, so we take if (qhat >= b, decrement qhat by 1 (it becomes b), add vn[n-1] aka vv1 to rhat. 
But because vv1 has msb set and rhat in this case is uv0 - vv1, the rhat + vv1 addition certainly doesn't overflow, because (uv0 - vv1) + vv1 is uv0, so in the algorithm we goto again, again take if (qhat >= b and decrement qhat so it finally becomes b - 1, and add vn[n-1] aka vv1 to rhat again. But this time I believe it must always overflow, simply because we added (uv0 - vv1) + vv1 + vv1 and vv1 has msb set, so already vv1 + vv1 must overflow. And because it overflowed, it will not goto again. So, I believe the following patch implements this correctly, by subtracting vv1 from uv1:uv0 double-word once, then comparing again if uv1 >= vv1. If that is true, subtract vv1 from uv1:uv0 again and add 2 * vv1 to rhat, no __builtin_add_overflow is needed as we know it always overflowed and so won't goto again. If after the first subtraction uv1 < vv1, use __builtin_add_overflow when adding vv1 to rhat, because it can but doesn't have to overflow. I've added an extra testcase which tests the behavior of all the changed cases, so it has a case where uv1:uv0 / vv1 is 1:1, where it is 1:0 and rhat + vv1 overflows and where it is 1:0 and rhat + vv1 does not overflow, and includes tests also from Zdenek's other failing tests. 2024-02-02 Jakub Jelinek <jakub@redhat.com> PR libgcc/113604 * libgcc2.c (__divmodbitint4): If uv1 >= vv1, subtract vv1 from uv1:uv0 once or twice as needed, rather than subtracting vv1:vv1. * gcc.dg/torture/bitint-53.c: New test. * gcc.dg/torture/bitint-55.c: New test.
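The quotient-digit estimate described in the message above can be sketched in C as follows. This is only an illustration under stated assumptions, not the code in libgcc2.c: it uses 32-bit words instead of UWtype, the names word, udiv_word and estimate_qhat are hypothetical, udiv_word is a stand-in for udiv_qrnnd (it uses 64-bit arithmetic internally but refuses inputs whose quotient would not fit in a word), and the second half of the correction test (the comparison involving vn[n-2]) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t word;   /* hypothetical stand-in for UWtype */

/* Stand-in for udiv_qrnnd: double-word / word -> word.
   Only valid when hi < d, so that the quotient fits in a word. */
static void udiv_word(word *q, word *r, word hi, word lo, word d)
{
    assert(hi < d);
    uint64_t n = ((uint64_t)hi << 32) | lo;
    *q = (word)(n / d);
    *r = (word)(n % d);
}

/* Estimate qhat for hi:lo / d, where d has its msb set and hi <= d.
   On return *rhat holds the remainder estimate truncated to a word and
   *rhat_overflow is true when the untruncated rhat is >= b, meaning no
   further "goto again" correction would be taken. */
static word estimate_qhat(word hi, word lo, word d,
                          word *rhat, bool *rhat_overflow)
{
    word q;
    assert((d >> 31) == 1 && hi <= d);

    if (hi < d) {
        /* Quotient fits in a word: plain division, no forced correction. */
        udiv_word(&q, rhat, hi, lo, d);
        *rhat_overflow = false;
        return q;
    }

    /* hi == d: the true quotient is b or b + 1.  Subtract 0:d once,
       which lowers the quotient by one and keeps the remainder. */
    hi -= (lo < d);   /* borrow out of the low word */
    lo -= d;

    if (hi >= d) {
        /* Quotient was b + 1: subtract 0:d a second time.  rhat gains
           2 * d, which always overflows because d has its msb set. */
        hi -= (lo < d);
        lo -= d;
        udiv_word(&q, rhat, hi, lo, d);
        *rhat += 2 * d;           /* wraps modulo 2^32 on purpose */
        *rhat_overflow = true;
    } else {
        /* Quotient was b: rhat gains d once, which may or may not overflow. */
        udiv_word(&q, rhat, hi, lo, d);
        *rhat_overflow = __builtin_add_overflow(*rhat, d, rhat);
    }
    return q;
}
```

Scaled down to 32-bit words, the three situations named in the message map to the inputs 0x80000000:0xffffffff / 0x80000000 (two subtractions), 0x80000000:42 / 0x80000000 (one subtraction, no overflow) and 0x80000001:0x80000000 / 0x80000001 (one subtraction, overflow). In each of them the returned qhat is b - 1.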
2024-02-02	libgccjit: Implement sizeof operator	Antoni Boucher	11	-0/+209
	gcc/jit/ChangeLog: * docs/topics/compatibility.rst (LIBGCCJIT_ABI_27): New ABI tag. * docs/topics/expressions.rst: Document gcc_jit_context_new_sizeof. * jit-playback.cc (new_sizeof): New method. * jit-playback.h (new_sizeof): New method. * jit-recording.cc (recording::context::new_sizeof, recording::memento_of_sizeof::replay_into, recording::memento_of_sizeof::make_debug_string, recording::memento_of_sizeof::write_reproducer): New methods. * jit-recording.h (class memento_of_sizeof): New class. * libgccjit.cc (gcc_jit_context_new_sizeof): New function. * libgccjit.h (gcc_jit_context_new_sizeof): New function. * libgccjit.map: New function. gcc/testsuite/ChangeLog: * jit.dg/all-non-failing-tests.h: New test. * jit.dg/test-sizeof.c: New test.
2024-02-02	c++: op== defaulted outside class [PR110084]	Jason Merrill	3	-1/+13
	defaulted_late_check is for checks that need to happen after the class is complete; we shouldn't call it sooner. PR c++/110084 gcc/cp/ChangeLog: * pt.cc (tsubst_function_decl): Only check a function defaulted outside the class if the class is complete. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/spaceship-synth-neg3.C: Check error message. * g++.dg/cpp2a/spaceship-eq16.C: New test.
2024-02-02	hppa: Implement TARGET_ATOMIC_ASSIGN_EXPAND_FENV	John David Anglin	2	-1/+298
	This change implements __builtin_get_fpsr() and __builtin_set_fpsr(x) to get and set the floating-point status register. They are used to implement pa_atomic_assign_expand_fenv(). 2024-02-02 John David Anglin <danglin@gcc.gnu.org> gcc/ChangeLog: PR target/59778 * config/pa/pa.cc (enum pa_builtins): Add PA_BUILTIN_GET_FPSR and PA_BUILTIN_SET_FPSR builtins. * (pa_builtins_icode): Declare. * (def_builtin, pa_fpu_init_builtins): New. * (pa_init_builtins): Initialize FPU builtins. * (pa_builtin_decl, pa_expand_builtin_1): New. * (pa_expand_builtin): Handle PA_BUILTIN_GET_FPSR and PA_BUILTIN_SET_FPSR builtins. * (pa_atomic_assign_expand_fenv): New. * config/pa/pa.md (UNSPECV_GET_FPSR, UNSPECV_SET_FPSR): New UNSPECV constants. (get_fpsr, put_fpsr): New expanders. (get_fpsr_32, get_fpsr_64, set_fpsr_32, set_fpsr_64): New insn patterns.
2024-02-03	RISC-V: Expand VLMAX scalar move in reduction	Juzhe-Zhong	2	-5/+21
	This patch fixes the following: vsetvli a5,a1,e32,m1,tu,ma slli a4,a5,2 sub a1,a1,a5 vle32.v v2,0(a0) add a0,a0,a4 vadd.vv v1,v2,v1 bne a1,zero,.L3 vsetivli zero,1,e32,m1,ta,ma vmv.s.x v2,zero vsetvli a5,zero,e32,m1,ta,ma ---> Redundant vsetvl. vredsum.vs v1,v1,v2 vmv.x.s a0,v1 ret VSETVL PASS is able to fuse avl = 1 of scalar move and VLMAX avl of reduction. However, the following RTL blocks the fusion in dependence analysis in VSETVL PASS: (insn 49 24 50 5 (set (reg:RVVM1SI 98 v2 [148]) (if_then_else:RVVM1SI (unspec:RVVMF32BI [ (const_vector:RVVMF32BI [ (const_int 1 [0x1]) repeat [ (const_int 0 [0]) ] ]) (const_int 1 [0x1]) (const_int 2 [0x2]) repeated x2 (const_int 0 [0]) (reg:SI 66 vl) (reg:SI 67 vtype) ] UNSPEC_VPREDICATE) (const_vector:RVVM1SI repeat [ (const_int 0 [0]) ]) (unspec:RVVM1SI [ (reg:DI 0 zero) ] UNSPEC_VUNDEF))) 3813 {*pred_broadcastrvvm1si_zero} (nil)) (insn 50 49 51 5 (set (reg:DI 15 a5 [151]) ----> It sets a5, which blocks fusing the following VLMAX into the scalar move above. (unspec:DI [ (const_int 32 [0x20]) ] UNSPEC_VLMAX)) 2566 {vlmax_avldi} (expr_list:REG_EQUIV (unspec:DI [ (const_int 32 [0x20]) ] UNSPEC_VLMAX) (nil))) (insn 51 50 52 5 (set (reg:RVVM1SI 97 v1 [150]) (unspec:RVVM1SI [ (unspec:RVVMF32BI [ (const_vector:RVVMF32BI repeat [ (const_int 1 [0x1]) ]) (reg:DI 15 a5 [151]) (const_int 2 [0x2]) (const_int 1 [0x1]) (reg:SI 66 vl) (reg:SI 67 vtype) ] UNSPEC_VPREDICATE) (unspec:RVVM1SI [ (reg:RVVM1SI 97 v1 [orig:134 vect_result_14.6 ] [134]) (reg:RVVM1SI 98 v2 [148]) ] UNSPEC_REDUC_SUM) (unspec:RVVM1SI [ (reg:DI 0 zero) ] UNSPEC_VUNDEF) ] UNSPEC_REDUC)) 17541 {pred_redsumrvvm1si} (expr_list:REG_DEAD (reg:RVVM1SI 98 v2 [148]) (expr_list:REG_DEAD (reg:SI 66 vl) (expr_list:REG_DEAD (reg:DI 15 a5 [151]) (expr_list:REG_DEAD (reg:DI 0 zero) (nil)))))) Such a situation can only happen with auto-vectorization, never with intrinsic code. 
Since the reduction is passed VLMAX AVL, it is more natural to also pass VLMAX to the scalar move that initializes the value of the reduction. After this patch: vsetvli a5,a1,e32,m1,tu,ma slli a4,a5,2 sub a1,a1,a5 vle32.v v2,0(a0) add a0,a0,a4 vadd.vv v1,v2,v1 bne a1,zero,.L3 vsetvli a5,zero,e32,m1,ta,ma vmv.s.x v2,zero vredsum.vs v1,v1,v2 vmv.x.s a0,v1 ret Tested on both RV32 and RV64 with no regressions. PR target/113697 gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_reduction): Pass VLMAX avl to scalar move. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr113697.c: New test.
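For orientation, a scalar loop of the following shape is the kind of source that auto-vectorizes into a vle32.v / vadd.vv loop followed by a vredsum.vs reduction, as in the assembly quoted above. It is an illustrative sketch only and not the contents of pr113697.c; the function name sum_i32 is hypothetical.

```c
#include <stdint.h>

/* Hypothetical example: sum reduction over 32-bit integers.  The vector
   loop accumulates partial sums per lane; the final reduction folds the
   lanes into one scalar, seeded by a scalar move of zero. */
int32_t sum_i32(const int32_t *a, int n)
{
    int32_t sum = 0;
    for (int i = 0; i < n; i++)
        sum += a[i];
    return sum;
}
```

Whether this exact loop reproduces the redundant vsetvli depends on the compiler revision and options; a RISC-V target with the vector extension enabled (for example -march=rv64gcv at -O3) is assumed.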
2024-02-02	testsuite, Darwin: Allow for undefined symbols in shared test.	Iain Sandoe	1	-1/+8
	Darwin's linker defaults to error on undefined (which makes it look as if we do not support shared, leading to tests being marked incorrectly as unsupported). This fixes the issue by allowing the symbols used in the target supports test to be undefined. gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_shared): Allow the external symbols referenced in the test to be undefined.