riscv-gnu-toolchain/gcc.git

Age	Commit message	Author	Files	Lines
2023-06-26	compiler: support -fgo-importcfg	Ian Lance Taylor	10	-7/+182
	* lang.opt (fgo-importcfg): New option. * go-c.h (struct go_create_gogo_args): Add importcfg field. * go-lang.cc (go_importcfg): New static variable. (go_langhook_init): Set args.importcfg. (go_langhook_handle_option): Handle -fgo-importcfg. * gccgo.texi (Invoking gccgo): Document -fgo-importcfg. Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/506095
2023-06-26	aarch64: Use <DWI> instead of <V2XWIDE> in scalar SQRSHRUN pattern	Kyrylo Tkachov	1	-10/+10
	In the scalar pattern for SQRSHRUN it's a bit clearer to use DWI instead of V2XWIDE to make it more clear that no vector modes are involved. No behavioural change intended. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-simd.md (aarch64_sqrshrun_n<mode>_insn): Use <DWI> instead of <V2XWIDE>. (aarch64_sqrshrun_n<mode>): Likewise.
2023-06-26	aarch64: Clean up some rounding immediate predicates	Kyrylo Tkachov	4	-24/+20
	aarch64_simd_rsra_rnd_imm_vec is now used for more than just RSRA and accepts more than just vectors so rename it to make it more truthful. The aarch64_simd_rshrn_imm_vec is now unused and can be deleted. No behavioural change intended. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-protos.h (aarch64_const_vec_rsra_rnd_imm_p): Rename to... (aarch64_rnd_imm_p): ... This. * config/aarch64/predicates.md (aarch64_simd_rsra_rnd_imm_vec): Rename to... (aarch64_int_rnd_operand): ... This. (aarch64_simd_rshrn_imm_vec): Delete. * config/aarch64/aarch64-simd.md (aarch64_<sra_op>rsra_n<mode>_insn): Adjust for the above. (aarch64_<sra_op>rshr_n<mode><vczle><vczbe>_insn): Likewise. (aarch64_<shrn_op>rshrn_n<mode>_insn): Likewise. (aarch64_sqrshrun_n<mode>_insn<vczle><vczbe>): Likewise. (aarch64_sqrshrun_n<mode>_insn): Likewise. (aarch64_<shrn_op>rshrn2_n<mode>_insn_le): Likewise. (aarch64_<shrn_op>rshrn2_n<mode>_insn_be): Likewise. (aarch64_sqrshrun2_n<mode>_insn_le): Likewise. (aarch64_sqrshrun2_n<mode>_insn_be): Likewise. * config/aarch64/aarch64.cc (aarch64_const_vec_rsra_rnd_imm_p): Rename to... (aarch64_rnd_imm_p): ... This.
2023-06-26	IBM zSystems: Assume symbols without explicit alignment to be ok	Andreas Krebbel	2	-2/+36
	A change we committed back in 2015 relies on the backend requested ABI alignment to be applied to ALL symbols by the middle-end. However, this does not appear to be the case for external symbols. With this commit we assume all symbols without explicit alignment to be aligned according to the ABI. That's the behavior we had before. This fixes a performance regression caused by the 2015 patch. Since then the addresses of external char type symbols have been pushed to the literal pool, although it is safe to access them with larl (which requires symbols to reside at even addresses). gcc/ * config/s390/s390.cc (s390_encode_section_info): Set SYMBOL_FLAG_SET_NOTALIGN2 only if the symbol has explicitly been misaligned. gcc/testsuite/ * gcc.target/s390/larl-1.c: New test.
2023-06-26	Fix profile of forwarders produced by cd-dce	Jan Hubicka	1	-0/+3
	Compiling the testcase from PR109849 (which uses a std::vector based stack to drive a loop) with profile feedback leads to profile mismatches introduced by tree-ssa-dce. This is the new code to produce unified forwarder blocks for PHIs. I am not including the testcase itself since checking it for Invalid sum is probably going to be too fragile and this should show in our LNT testers. The patch however fixes the mismatch. Bootstrapped/regtested x86_64-linux and plan to commit it shortly. gcc/ChangeLog: PR tree-optimization/109849 * tree-ssa-dce.cc (make_forwarders_with_degenerate_phis): Fix profile count of newly constructed forwarder block.
2023-06-26	docs: Fix typo	Andrew Carlotti	1	-1/+1
	gcc/ChangeLog: * doc/optinfo.texi: Fix "steam" -> "stream".
2023-06-26	DSE: Add LEN_MASK_STORE analysis into DSE and fix LEN_STORE	Ju-Zhe Zhong	1	-16/+31
	Hi, Richi. This patch adds LEN_MASK_STORE to DSE. My understanding is that LEN_MASK_STORE is predicated by mask and len. Whether or not len is constant, the ao_ref should be the same as for MASK_STORE. Whereas for LEN_STORE, when len is constant, we use (len - bias); otherwise, it's the same as MASK_STORE/LEN_MASK_STORE. Not sure whether I am on the same page with you, feel free to correct me. Thanks. gcc/ChangeLog: * tree-ssa-dse.cc (initialize_ao_ref_for_dse): Add LEN_MASK_STORE and fix LEN_STORE. (dse_optimize_stmt): Add LEN_MASK_STORE.
2023-06-26	GIMPLE_FOLD: Fix gimple fold for LEN_{MASK}_{LOAD,STORE}	Ju-Zhe Zhong	2	-2/+47
	Hi, previously I made a mistake in the GIMPLE_FOLD of LEN_MASK_{LOAD,STORE}. We should fold LEN_MASK_{LOAD,STORE} with (bias+len) == vf (nunits instead of bytesize) && mask = all-true mask into: MEM_REF [...]. This patch adds a testcase to test the gimple fold of LEN_MASK_{LOAD,STORE}. Also, I fix LEN_LOAD/LEN_STORE to make them have the same behavior. Ok for trunk? gcc/ChangeLog: * gimple-fold.cc (gimple_fold_partial_load_store_mem_ref): Fix gimple fold of LOAD/STORE with length. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/partial/gimple_fold-1.c: New test.
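	The fold condition described above can be pictured with a scalar model. This is only a sketch under stated assumptions, not GCC's implementation: the function names `len_mask_load_model` and `folds_to_plain_load` are hypothetical, and the active length is taken to be `len + bias` counted in lanes, as the message states. In this model a lane is loaded only when it lies below the active length and its mask bit is set, so the access degenerates to an ordinary full-vector memory reference exactly when the active length equals the number of lanes and every mask bit is true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model (hypothetical, for illustration only) of a load predicated
   by both a length and a mask.  Lane i is loaded only if it lies below the
   active length (len + bias) and its mask bit is set; all other lanes of
   dst are left untouched.  */
void
len_mask_load_model (int32_t *dst, const int32_t *src, size_t nunits,
                     size_t len, int bias, const bool *mask)
{
  size_t active = (size_t) ((long) len + bias);
  assert (active <= nunits);
  for (size_t i = 0; i < active; i++)
    if (mask[i])
      dst[i] = src[i];
}

/* True when the predicated access covers every lane unconditionally, i.e.
   when it is equivalent to a plain full-vector memory reference.  */
bool
folds_to_plain_load (size_t nunits, size_t len, int bias, const bool *mask)
{
  if ((size_t) ((long) len + bias) != nunits)
    return false;
  for (size_t i = 0; i < nunits; i++)
    if (!mask[i])
      return false;
  return true;
}
```

	The comparison is made against the lane count rather than the vector size in bytes, which mirrors the "nunits instead of bytesize" remark in the message.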
2023-06-26	Avoid redundant GORI calculations.	Andrew MacLeod	1	-4/+17
	When GORI evaluates a statement, if operand 1 and 2 are both in the dependency chain, GORI evaluates the name through both operands sequentially and combines the results. If either operand is in the dependency chain of the other, this evaluation will do the same work twice, for questionable gain. Instead, simply evaluate only the operand which depends on the other and keep the evaluation linear in time. * gimple-range-gori.cc (compute_operand1_and_operand2_range): Check for interdependence between operands 1 and 2.
2023-06-26	vect: Cost intermediate conversions	Richard Sandiford	1	-2/+3
	g:6f19cf7526168f8 extended N-vector to N-vector conversions to handle cases where an intermediate integer extension or truncation is needed. This patch adjusts the cost to account for these intermediate conversions. gcc/ * tree-vect-stmts.cc (vectorizable_conversion): Take multi_step_cvt into account when costing non-widening/truncating conversions.
2023-06-26	tree-optimization/110381 - preserve SLP permutation with in-order reductions	Richard Biener	2	-2/+56
	The following fixes a bug that manifests itself during fold-left reduction transform in picking not the last scalar def to replace and thus double-counting some elements. But the underlying issue is that we merge a load permutation into the in-order reduction which is of course wrong. Now, reduction analysis has not yet been performed when optimizing permutations so we have to resort to checking that ourselves. PR tree-optimization/110381 * tree-vect-slp.cc (vect_optimize_slp_pass::start_choosing_layouts): Materialize permutes before fold-left reductions. * gcc.dg/vect/pr110381.c: New testcase.
2023-06-26	RISC-V: Remove duplicated extern function_base decl	Pan Li	1	-5/+0
	Signed-off-by: Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.h: Remove duplicated decl.
2023-06-26	narrowing initializers and initializer_constant_valid_p_1	Richard Biener	1	-0/+2
	initializer_constant_valid_p_1 attempts to handle narrowing differences and sums but fails to handle when the overall value looks like VIEW_CONVERT_EXPR<long long int>(NON_LVALUE_EXPR <v> - VEC_COND_EXPR < { 0, 0 } == { 0, 0 } , { -1, -1 } , { 0, 0 } > ) where endtype is scalar integer but value is a vector type. In this particular case all is good and we recurse since two vector lanes is more than 64bits of long long. But still it compares apples and oranges. Fixed by appropriately also requiring the type of the value to be scalar integral. * varasm.cc (initializer_constant_valid_p_1): Also constrain the type of value to be scalar integral before dispatching to narrowing_initializer_constant_valid_p.
2023-06-26	Avoid shorten_binary_op on VECTOR_TYPE	Richard Biener	1	-0/+4
	When we disallow TYPE_PRECISION on VECTOR_TYPEs it shows that shorten_binary_op performs some checks on that that are likely harmless in the end. The following bails out early for VECTOR_TYPE operations to avoid those questionable checks. gcc/c-family/ * c-common.cc (shorten_binary_op): Exit early for VECTOR_TYPE operations.
2023-06-26	Fix TYPE_PRECISION use in hashable_expr_equal_p	Richard Biener	1	-1/+1
	While the checks look unnecessary they probably are quick and thus done early. The following avoids using TYPE_PRECISION on VECTOR_TYPEs by making the code match the comment which talks about precision and signedness. An alternative would be to only retain the ERROR_MARK and TYPE_MODE checks or use TYPE_PRECISION_RAW (but I like that least). * tree-ssa-scopedtables.cc (hashable_expr_equal_p): Use element_precision.
2023-06-26	RISC-V: Remove redundant vcond patterns	Juzhe-Zhong	3	-61/+0
	Previously, Richi suggested that vcond patterns are only needed when the target supports comparison + select consuming 1 instruction. Now, I have experimented with removing those "vcond" patterns, and it works perfectly. All testcases PASS. Really appreciate Richi helping us recognize this issue. Now remove all "vcond" patterns as Richi suggested. gcc/ChangeLog: * config/riscv/autovec.md (vcond<V:mode><VI:mode>): Remove redundant vcond patterns. (vcondu<V:mode><VI:mode>): Ditto. * config/riscv/riscv-protos.h (expand_vcond): Ditto. * config/riscv/riscv-v.cc (expand_vcond): Ditto.
2023-06-26	tree-optimization/110392 - ICE with predicate analysis	Richard Biener	1	-2/+2
	Feeding not optimized IL can result in predicate normalization to simplify things so a predicate can get true or false. The following re-orders the early exit in that case to come after simplification and normalization to take care of that. PR tree-optimization/110392 * gimple-predicate-analysis.cc (uninit_analysis::is_use_guarded): Do early exits on true/false predicate only after normalization.
2023-06-26	SCCVN: Fix repeating variable name "len"	Ju-Zhe Zhong	1	-7/+7
	Line 3292: has variable name "len": tree mask = NULL_TREE, len = NULL_TREE, bias = NULL_TREE; Line 3349: has variable name "len": HOST_WIDE_INT start = 0, len = 0; Since they are never used simultaneously, such issue is not recognized for now. However, I want to add LEN_MASK_{LOAD,STORE} which will need these 2 variables, so fix naming in this path. Change HOST_WIDE_INT start = 0, len = 0; into HOST_WIDE_INT start = 0, length = 0; gcc/ChangeLog: * tree-ssa-sccvn.cc (vn_reference_lookup_3): Change name "len" into "length".
2023-06-26	i386: New *ashl<dwi3>_doubleword_highpart define_insn_and_split.	Roger Sayle	3	-0/+67
	This patch contains a pair of (related) optimizations in i386.md that allow us to generate better code for the example below (this is a step towards fixing a bugzilla PR, but I've forgotten the number). __int128 foo64(__int128 x, long long y) { __int128 t = (__int128)y << 64; return x ^ t; } The hidden issue is that the RTL currently seen by reload contains the sign extension of y from DImode to TImode, even though this is dead (not required) for left shifts by more than WORD_SIZE bits. (insn 11 8 12 2 (parallel [ (set (reg:TI 0 ax [orig:91 y ] [91]) (sign_extend:TI (reg:DI 1 dx [97]))) (clobber (reg:CC 17 flags)) (clobber (scratch:DI)) ]) {extendditi2} What makes this particularly undesirable is that the sign-extension pattern above requires an additional DImode scratch register, indicated by the clobber, which unnecessarily increases register pressure. The proposed solution is to add a define_insn_and_split for such left shifts (of sign or zero extensions) that only have a non-zero highpart, where the extension is redundant and eliminated, that can be split after reload, without scratch registers or early clobbers. This (late split) exposes a second optimization opportunity where setting the lowpart to zero can sometimes be combined/simplified with the following instruction during peephole2. For the test case above, we previously generated with -O2: foo64: xorl %eax, %eax xorq %rsi, %rdx xorq %rdi, %rax ret with this patch, we now generate: foo64: movq %rdi, %rax xorq %rsi, %rdx ret Likewise for the related -m32 test case, we go from: foo32: movl 12(%esp), %eax movl %eax, %edx xorl %eax, %eax xorl 8(%esp), %edx xorl 4(%esp), %eax ret to the improved: foo32: movl 12(%esp), %edx movl 4(%esp), %eax xorl 8(%esp), %edx ret 2023-06-26 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog * config/i386/i386.md (peephole2): Simplify zeroing a register followed by an IOR, XOR or PLUS operation on it, into a move. 
(ashl<dwi>3_doubleword_highpart): New define_insn_and_split to eliminate (and hide from reload) unnecessary word to doubleword extensions that are followed by left shifts by sufficiently large, but valid, bit counts. gcc/testsuite/ChangeLog gcc.target/i386/ashldi3-1.c: New 32-bit test case. * gcc.target/i386/ashlti3-2.c: New 64-bit test case.
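	To see why the sign extension is dead in the foo64 example, it helps to model the 128-bit value as two 64-bit words. The sketch below is an illustration only: `struct u128` and `foo64_model` are hypothetical names, and the two-word layout is used just to keep the example portable. Shifting the extended y left by 64 moves y into the high word, clears the low word, and discards every sign-extension bit, so only the high word of x is affected.

```c
#include <assert.h>
#include <stdint.h>

/* A 128-bit value as two 64-bit words, to keep the sketch portable.  */
struct u128 { uint64_t lo, hi; };

static_assert (sizeof (struct u128) == 16, "two 64-bit words expected");

/* Model of foo64: x ^ ((__int128) y << 64).  The shifted operand has an
   all-zero low word and y in its high word, so the sign-extension bits of
   y are shifted out and never contribute to the result.  */
struct u128
foo64_model (struct u128 x, int64_t y)
{
  struct u128 r;
  r.lo = x.lo;                 /* x.lo ^ 0: the low word is just copied */
  r.hi = x.hi ^ (uint64_t) y;  /* x.hi ^ y: the only real work */
  return r;
}
```

	This matches the improved assembly quoted in the message: one move for the low word and one xor for the high word.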
2023-06-26	Use cvt_op to save intermediate type operand instead of "subtle" vec_dest.	liuhongt	2	-4/+30
	When there are multiple operands in vec_oprnds0, vec_dest will be overwritten with vectype_out, but in the multi_step_cvt case, cvt_type is expected. It caused an ICE in verify_gimple_in_cfg. gcc/ChangeLog: PR tree-optimization/110371 PR tree-optimization/110018 * tree-vect-stmts.cc (vectorizable_conversion): Use cvt_op to save intermediate type operand instead of "subtle" vec_dest for case NONE. gcc/testsuite/ChangeLog: * gcc.target/aarch64/pr110371.c: New test.
2023-06-26	Don't use intermediate type for FIX_TRUNC_EXPR when ftrapping-math.	liuhongt	3	-3/+4
	> > Hmm, good question. GENERIC has a direct truncation to unsigned char > > for example, the C standard generally says if the integral part cannot > > be represented then the behavior is undefined. So I think we should be > > safe here (0x1.0p32 doesn't fit an int). > > We should be following Annex F (unspecified value plus "invalid" exception > for out-of-range floating-to-integer conversions rather than undefined > behavior). But we don't achieve that very well at present (see bug 93806 > comments 27-29 for examples of how such conversions produce wobbly > values). That would mean guarding this with !flag_trapping_math would be the appropriate thing to do. gcc/ChangeLog: PR tree-optimization/110371 PR tree-optimization/110018 * tree-vect-stmts.cc (vectorizable_conversion): Don't use intermediate type for FIX_TRUNC_EXPR when ftrapping-math. gcc/testsuite/ChangeLog: * gcc.target/i386/pr110018-1.c: Add -fno-trapping-math to dg-options. * gcc.target/i386/pr110018-2.c: Ditto.
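	A plausible illustration of why the guard matters, offered as a sketch rather than as the patch's own rationale: when a floating-point value is narrowed through a wider intermediate integer type, a value such as 1000.0 fits the intermediate type, so the float-to-integer step raises nothing, and the following integer truncation wraps silently. A direct conversion to the narrow type would have to report "invalid" under Annex F. The function name `trunc_via_int` is hypothetical and a 32-bit `int` is assumed.

```c
#include <assert.h>

/* Two-step narrowing: double -> int -> unsigned char.  Assumes a 32-bit
   int.  The first step is in range for the values accepted here, so it
   raises no floating-point exception; the second step is an ordinary
   integer conversion that wraps modulo 256.  */
unsigned char
trunc_via_int (double d)
{
  /* Reject NaN, infinities and anything outside the range of int, for
     which the first conversion itself would be out of range.  */
  assert (d > -2147483649.0 && d < 2147483648.0);
  int wide = (int) d;           /* in range: no "invalid" exception */
  return (unsigned char) wide;  /* silent modular wrap */
}
```

	With trapping math in effect, the out-of-range condition that a direct narrowing conversion would signal is lost in this two-step form, which is consistent with restricting the intermediate-type path to -fno-trapping-math.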
2023-06-26	i386: Sync tune_string with arch_string for target attribute arch=*	Hongyu Wang	2	-1/+16
	For function with target attribute arch=, current logic will set its tune to -mtune from command line so all target_clones will get same tuning flags which would affect the performance for each clone. Override tune with arch if tune was not explicitly specified to get proper tuning flags for target_clones. gcc/ChangeLog: config/i386/i386-options.cc (ix86_valid_target_attribute_tree): Override tune_string with arch_string if tune_string is not explicitly specified. gcc/testsuite/ChangeLog: * gcc.target/i386/mvc17.c: New test.
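	The selection rule described above can be summarized in a few lines. This is a minimal sketch, not the code in i386-options.cc: `effective_tune` and its parameters are hypothetical names, and a null `explicit_tune` stands for "no tune was explicitly specified".

```c
#include <assert.h>
#include <stddef.h>

/* Pick the tuning string for a function carrying a target attribute
   arch=ATTR_ARCH.  An explicitly specified tune always wins; otherwise
   the tune follows the arch, so each clone is tuned for its own arch.  */
const char *
effective_tune (const char *attr_arch, const char *explicit_tune)
{
  assert (attr_arch != NULL);
  return explicit_tune != NULL ? explicit_tune : attr_arch;
}
```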
2023-06-26	RISC-V: Fix one test failure of dg config.	Juzhe-Zhong	1	-1/+1
	gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/vlmul_ext-2.c: Add -Wno-psabi for dg.
2023-06-26	d: Suboptimal codegen for __builtin_expect(cond, false)	Iain Buclaw	2	-12/+41
	Since PR96435, both boolean objects and expressions have been evaluated in the following way. ((ubyte)&obj_or_expr) & 1 It has been noted that sometimes this can cause the back-end to optimize in non-obvious ways - in particular with __builtin_expect. This @safe feature is now restricted to just when reading the value of a bool field that comes from a union. PR d/110359 gcc/d/ChangeLog: * d-convert.cc (convert_for_rvalue): Only apply the @safe boolean conversion to boolean fields of a union. (convert_for_condition): Call convert_for_rvalue in the default case. gcc/testsuite/ChangeLog: * gdc.dg/pr110359.d: New test.
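	The `((ubyte)&obj_or_expr) & 1` evaluation quoted above can be modelled in C for the one case where the message says it is still applied, a bool field read out of a union. This is only an analogy with hypothetical names (`union flag_or_byte`, `read_flag_safely`), not the D front-end's code: the storage is read as a raw byte and masked to its lowest bit, so a byte written through another union member cannot yield a boolean that is neither 0 nor 1.

```c
#include <assert.h>
#include <stdbool.h>

/* A bool that shares its storage with a raw byte.  */
union flag_or_byte { bool flag; unsigned char raw; };

/* Read the flag through its byte representation and keep only bit 0,
   mirroring the ((ubyte)&obj) & 1 evaluation.  */
bool
read_flag_safely (const union flag_or_byte *u)
{
  assert (u != 0);
  return (u->raw & 1) != 0;
}
```

	The masking is extra work that the optimizer has to see through, which is consistent with the message's observation that it can interfere with hints such as __builtin_expect.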
2023-06-26	Daily bump.	GCC Administrator	4	-1/+186

2023-06-26	d: Merge upstream dmd, druntime a45f4e9f43, phobos 106038f2e.	Iain Buclaw	42	-205/+428
	D front-end changes: - Import dmd v2.103.1. - Deprecated invalid special token sequences inside token strings. D runtime changes: - Import druntime v2.103.1. Phobos changes: - Import phobos v2.103.1. gcc/d/ChangeLog: * dmd/MERGE: Merge upstream dmd a45f4e9f43. * dmd/VERSION: Bump version to v2.103.1. libphobos/ChangeLog: * libdruntime/MERGE: Merge upstream druntime a45f4e9f43. * src/MERGE: Merge upstream phobos 106038f2e.
2023-06-25	RISC-V: Optimize VSETVL codegen of SELECT_VL with LEN_MASK_{LOAD, STORE}	Juzhe-Zhong	4	-4/+76
	This patch depends on the LEN_MASK_{LOAD,STORE} patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622742.html After enabling LEN_MASK_{LOAD,STORE}, I noticed that there is a case where the VSETVL pass needs to be optimized: void f (int32_t *__restrict a, int32_t *__restrict b, int32_t *__restrict cond, int n) { for (int i = 0; i < 8; i++) if (cond[i]) a[i] = b[i]; } Before this patch: f: vsetivli a5,8,e8,mf4,tu,mu --> Propagate "8" to the following vsetvl vsetvli zero,a5,e32,m1,ta,ma vle32.v v0,0(a2) vsetvli a6,zero,e32,m1,ta,ma li a3,8 vmsne.vi v0,v0,0 vsetvli zero,a5,e32,m1,ta,ma vle32.v v1,0(a1),v0.t vse32.v v1,0(a0),v0.t sub a4,a3,a5 beq a3,a5,.L6 slli a5,a5,2 add a2,a2,a5 add a1,a1,a5 add a0,a0,a5 vsetvli a5,a4,e8,mf4,tu,mu --> Propagate "a4" to the following vsetvl vsetvli zero,a5,e32,m1,ta,ma vle32.v v0,0(a2) vsetvli a6,zero,e32,m1,ta,ma vmsne.vi v0,v0,0 vsetvli zero,a5,e32,m1,ta,ma vle32.v v1,0(a1),v0.t vse32.v v1,0(a0),v0.t .L6: ret The current VSETVL pass only enables AVL propagation of VLMAX AVL ("zero"). Now, we enable AVL propagation of immediate && conservative non-VLMAX. After this patch: f: vsetivli a5,8,e8,mf4,ta,ma vle32.v v0,0(a2) vsetvli a6,zero,e32,m1,ta,ma li a3,8 vmsne.vi v0,v0,0 vsetivli zero,8,e32,m1,ta,ma vle32.v v1,0(a1),v0.t vse32.v v1,0(a0),v0.t sub a4,a3,a5 beq a3,a5,.L6 slli a5,a5,2 vsetvli a4,a4,e8,mf4,ta,ma add a2,a2,a5 vle32.v v0,0(a2) add a1,a1,a5 vsetvli a6,zero,e32,m1,ta,ma add a0,a0,a5 vmsne.vi v0,v0,0 vsetvli zero,a4,e32,m1,ta,ma vle32.v v1,0(a1),v0.t vse32.v v1,0(a0),v0.t .L6: ret gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (vector_insn_info::parse_insn): Enhance AVL propagation. * config/riscv/riscv-vsetvl.h: New function. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/partial/select_vl-1.c: Add dump checks. * gcc.target/riscv/rvv/autovec/partial/select_vl-2.c: New test.
2023-06-25	RISC-V: fix expand function of vlmul_ext RVV intrinsic	Li Xu	2	-1/+9
	Consider the following case: void test_vlmul_ext_v_i8mf8_i8mf4(vint8mf8_t op1) { vint8mf4_t res = __riscv_vlmul_ext_v_i8mf8_i8mf4(op1); } Compilation fails with: test.c: In function 'test_vlmul_ext_v_i8mf8_i8mf4': test.c:5:1: error: unrecognizable insn: 5 | } | ^ (insn 30 29 0 2 (set (mem/c:VNx2QI (reg/f:DI 143) [0 x+0 S[2, 2] A32]) (mem/c:VNx2QI (reg/f:DI 148) [0 op1+0 S[2, 2] A16])) "test.c":4:18 -1 (nil)) during RTL pass: vregs test.c:5:1: internal compiler error: in extract_insn, at recog.cc:2791 0x7c61b8 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*) ../.././riscv-gcc/gcc/rtl-error.cc:108 0x7c61d7 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*) ../.././riscv-gcc/gcc/rtl-error.cc:116 0xed58a7 extract_insn(rtx_insn*) ../.././riscv-gcc/gcc/recog.cc:2791 0xb7f789 instantiate_virtual_regs_in_insn ../.././riscv-gcc/gcc/function.cc:1611 0xb7f789 instantiate_virtual_regs ../.././riscv-gcc/gcc/function.cc:1984 gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc: Change emit_insn to emit_move_insn. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/vlmul_ext-2.c: New test.
2023-06-25	RISC-V: Enable len_mask{load, store} and remove len_{load, store}	Juzhe-Zhong	12	-15/+346
	This patch enables len_mask_{load,store} to support flow-control in RVV auto-vectorization. Consider the following case: void f (int32_t *__restrict a, int32_t *__restrict b, int32_t *__restrict cond, int n) { for (int i = 0; i < n; i++) if (cond[i]) a[i] = b[i]; } Before this patch: <source>:9:21: missed: couldn't vectorize loop <source>:9:21: missed: not vectorized: control flow in loop. After this patch: f: ble a3,zero,.L5 .L3: vsetvli a5,a3,e32,m1,ta,ma vle32.v v0,0(a2) vsetvli a6,zero,e32,m1,ta,ma slli a4,a5,2 vmsne.vi v0,v0,0 sub a3,a3,a5 vsetvli zero,a5,e32,m1,ta,ma vle32.v v1,0(a1),v0.t vse32.v v1,0(a0),v0.t add a2,a2,a4 add a1,a1,a4 add a0,a0,a4 bne a3,zero,.L3 .L5: ret gcc/ChangeLog: * config/riscv/autovec.md (len_load_<mode>): Remove. (len_maskload<mode><vm>): Remove. (len_store_<mode>): New pattern. (len_maskstore<mode><vm>): New pattern. * config/riscv/predicates.md (autovec_length_operand): New predicate. * config/riscv/riscv-protos.h (enum insn_type): New enum. (expand_load_store): New function. * config/riscv/riscv-v.cc (emit_vlmax_masked_insn): Ditto. (emit_nonvlmax_masked_insn): Ditto. (expand_load_store): Ditto. * config/riscv/riscv-vector-builtins.cc (function_expander::use_contiguous_store_insn): Add avl_type operand into pred_store. * config/riscv/vector.md: Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.c: New test. * gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.h: New test. * gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.c: New test. * gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.h: New test. * gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-2.c: New test. * gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-3.c: New test.
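	The vectorized loop in the "After this patch" listing follows a strip-mining shape that can be written out in scalar C. This is a sketch under assumptions, not compiler output: `cond_copy_stripmined` and the `vlmax` parameter are hypothetical, and the length chosen per iteration is modelled as min(n, vlmax), which is a simplification of what vsetvli is permitted to return.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of the strip-mined, mask-predicated copy.  Each trip of the
   outer loop stands for one vector iteration: pick a length, build a mask
   from cond, then load and store only the lanes that are both below the
   length and enabled by the mask.  */
void
cond_copy_stripmined (int32_t *a, const int32_t *b, const int32_t *cond,
                      int n, int vlmax)
{
  assert (vlmax > 0);
  while (n > 0)                          /* ble/bne on the remaining count */
    {
      int vl = n < vlmax ? n : vlmax;    /* models vsetvli a5,a3,...       */
      for (int i = 0; i < vl; i++)       /* one predicated vector step     */
        if (cond[i] != 0)                /* vmsne.vi builds the mask       */
          a[i] = b[i];                   /* vle32.v / vse32.v under v0.t   */
      a += vl;                           /* advance the three pointers     */
      b += vl;
      cond += vl;
      n -= vl;                           /* sub a3,a3,a5                   */
    }
}
```

	The length handles the loop tail and the mask handles the `if`, which is why a single load or store predicated on both is enough to vectorize a loop that was previously rejected for containing control flow.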
2023-06-25	internal-fn: Fix bug of BIAS argument index	Ju-Zhe Zhong	1	-1/+1
	When trying to enable LEN_MASK_{LOAD,STORE} in RISC-V port, I found I made a mistake in case of argument index of BIAS. This patch is an obvious fix. gcc/ChangeLog: * internal-fn.cc (expand_partial_store_optab_fn): Fix bug of BIAS argument index.
2023-06-25	Revert "RISC-V:Add float16 tuple type abi"	Pan Li	9	-630/+17
	This reverts commit f9ab5d62c94547499de52c800ab914cc8e802212 due to the bootstrap failure on machine mode out of range memory access. Signed-off-by: Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/vector.md: Revert. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/abi-10.c: Revert. * gcc.target/riscv/rvv/base/abi-11.c: Ditto. * gcc.target/riscv/rvv/base/abi-12.c: Ditto. * gcc.target/riscv/rvv/base/abi-15.c: Ditto. * gcc.target/riscv/rvv/base/abi-8.c: Ditto. * gcc.target/riscv/rvv/base/abi-9.c: Ditto. * gcc.target/riscv/rvv/base/abi-17.c: Ditto. * gcc.target/riscv/rvv/base/abi-18.c: Ditto.
2023-06-25	Revert "RISC-V:Add float16 tuple type support"	Pan Li	12	-366/+3
	This reverts commit 8a96f240d71d367a2955ab9e0f0fef3a0b0e2a74 due to bootstrap failure on mode out of range access, will commit this patch after the issue addressed. gcc/ChangeLog: * config/riscv/genrvv-type-indexer.cc (valid_type): Revert changes. * config/riscv/riscv-modes.def (RVV_TUPLE_MODES): Ditto. (ADJUST_ALIGNMENT): Ditto. (RVV_TUPLE_PARTIAL_MODES): Ditto. (ADJUST_NUNITS): Ditto. * config/riscv/riscv-vector-builtins-types.def (vfloat16mf4x2_t): Ditto. (vfloat16mf4x3_t): Ditto. (vfloat16mf4x4_t): Ditto. (vfloat16mf4x5_t): Ditto. (vfloat16mf4x6_t): Ditto. (vfloat16mf4x7_t): Ditto. (vfloat16mf4x8_t): Ditto. (vfloat16mf2x2_t): Ditto. (vfloat16mf2x3_t): Ditto. (vfloat16mf2x4_t): Ditto. (vfloat16mf2x5_t): Ditto. (vfloat16mf2x6_t): Ditto. (vfloat16mf2x7_t): Ditto. (vfloat16mf2x8_t): Ditto. (vfloat16m1x2_t): Ditto. (vfloat16m1x3_t): Ditto. (vfloat16m1x4_t): Ditto. (vfloat16m1x5_t): Ditto. (vfloat16m1x6_t): Ditto. (vfloat16m1x7_t): Ditto. (vfloat16m1x8_t): Ditto. (vfloat16m2x2_t): Ditto. (vfloat16m2x3_t): Ditto. (vfloat16m2x4_t): Ditto. (vfloat16m4x2_t): Ditto. * config/riscv/riscv-vector-builtins.def (vfloat16mf4x2_t): Ditto. (vfloat16mf4x3_t): Ditto. (vfloat16mf4x4_t): Ditto. (vfloat16mf4x5_t): Ditto. (vfloat16mf4x6_t): Ditto. (vfloat16mf4x7_t): Ditto. (vfloat16mf4x8_t): Ditto. (vfloat16mf2x2_t): Ditto. (vfloat16mf2x3_t): Ditto. (vfloat16mf2x4_t): Ditto. (vfloat16mf2x5_t): Ditto. (vfloat16mf2x6_t): Ditto. (vfloat16mf2x7_t): Ditto. (vfloat16mf2x8_t): Ditto. (vfloat16m1x2_t): Ditto. (vfloat16m1x3_t): Ditto. (vfloat16m1x4_t): Ditto. (vfloat16m1x5_t): Ditto. (vfloat16m1x6_t): Ditto. (vfloat16m1x7_t): Ditto. (vfloat16m1x8_t): Ditto. (vfloat16m2x2_t): Ditto. (vfloat16m2x3_t): Ditto. (vfloat16m2x4_t): Ditto. (vfloat16m4x2_t): Ditto. * config/riscv/riscv-vector-switch.def (TUPLE_ENTRY): Ditto. * config/riscv/riscv.md: Ditto. * config/riscv/vector-iterators.md: Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/tuple-28.c: Removed. 
* gcc.target/riscv/rvv/base/tuple-29.c: Removed. * gcc.target/riscv/rvv/base/tuple-30.c: Removed. * gcc.target/riscv/rvv/base/tuple-31.c: Removed. * gcc.target/riscv/rvv/base/tuple-32.c: Removed. Signed-off-by: Pan Li <pan2.li@intel.com>
2023-06-25	GIMPLE_FOLD: Apply LEN_MASK_{LOAD,STORE} into GIMPLE_FOLD	Ju-Zhe Zhong	1	-5/+18
	Hi, since we are going to add LEN_MASK_{LOAD,STORE} to the loop vectorizer. Currently, 1. we can fold MASK_{LOAD,STORE} into MEM when mask is all ones. 2. we can fold LEN_{LOAD,STORE} into MEM when (len - bias) is VF. Now, I think it makes sense to support folding LEN_MASK_{LOAD,STORE} into MEM when both mask = all ones and (len - bias) is VF. gcc/ChangeLog: * gimple-fold.cc (arith_overflowed_p): Apply LEN_MASK_{LOAD,STORE}. (gimple_fold_partial_load_store_mem_ref): Ditto. (gimple_fold_partial_store): Ditto. (gimple_fold_call): Ditto.
2023-06-25	Refine maskloadmn pattern with UNSPEC_MASKLOAD.	liuhongt	2	-14/+28
	If mem_addr points to a memory region with less than whole vector size bytes of accessible memory and k is a mask that would prevent reading the inaccessible bytes from mem_addr, add UNSPEC_MASKLOAD to prevent it from being transformed to vpblendd. gcc/ChangeLog: PR target/110309 * config/i386/sse.md (maskload<mode><avx512fmaskmodelower>): Refine pattern with UNSPEC_MASKLOAD. (maskload<mode><avx512fmaskmodelower>): Ditto. (<avx512>_load<mode>_mask): Extend mode iterator to VI12HFBF_AVX512VL. (<avx512>_load<mode>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr110309.c: New test.
2023-06-25	SSA ALIAS: Apply LEN_MASK_STORE to 'ref_maybe_used_by_call_p_1'	Ju-Zhe Zhong	1	-0/+1
	gcc/ChangeLog: * tree-ssa-alias.cc (call_may_clobber_ref_p_1): Add LEN_MASK_STORE.
2023-06-25	SSA ALIAS: Apply LEN_MASK_{LOAD, STORE} into SSA alias analysis	Ju-Zhe Zhong	1	-0/+2
	gcc/ChangeLog: * tree-ssa-alias.cc (ref_maybe_used_by_call_p_1): Apply LEN_MASK_{LOAD,STORE}
2023-06-25	RISC-V:Add float16 tuple type abi	yulong	9	-17/+630
	gcc/ChangeLog: * config/riscv/vector.md: Add float16 attr at sew, vlmul and ratio. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/abi-10.c: Add float16 tuple type case. * gcc.target/riscv/rvv/base/abi-11.c: Ditto. * gcc.target/riscv/rvv/base/abi-12.c: Ditto. * gcc.target/riscv/rvv/base/abi-15.c: Ditto. * gcc.target/riscv/rvv/base/abi-8.c: Ditto. * gcc.target/riscv/rvv/base/abi-9.c: Ditto. * gcc.target/riscv/rvv/base/abi-17.c: New test. * gcc.target/riscv/rvv/base/abi-18.c: New test.
2023-06-25	Daily bump.	GCC Administrator	5	-1/+128

2023-06-24	i386: Add alternate representation for {and,or,xor}b %ah,%dh.	Roger Sayle	1	-0/+22
	A patch that I'm working on to improve RTL simplifications in the middle-end results in the regression of pr78904-1b.c, due to changes in the canonical representation of high-byte (%ah, %bh, %ch, %dh) logic. See also PR target/78904. This patch avoids/prevents those failures by adding support for the alternate representation, duplicating the existing <code>qi_ext<mode>_2 as <code>qi_ext<mode>_3 (the new version also replacing any_or with any_logic to provide andqi_ext<mode>_3 in the same pattern). Removing the original pattern isn't trivial, as it's generated by define_split, but this can be investigated after the other pieces are approved. The current representation of this instruction is: (set (zero_extract:DI (reg/v:DI 87 [ aD.2763 ]) (const_int 8 [0x8]) (const_int 8 [0x8])) (subreg:DI (xor:QI (subreg:QI (zero_extract:DI (reg:DI 94) (const_int 8 [0x8]) (const_int 8 [0x8])) 0) (subreg:QI (zero_extract:DI (reg/v:DI 87 [ aD.2763 ]) (const_int 8 [0x8]) (const_int 8 [0x8])) 0)) 0)) after my proposed middle-end improvement, we attempt to recognize: (set (zero_extract:DI (reg/v:DI 87 [ aD.2763 ]) (const_int 8 [0x8]) (const_int 8 [0x8])) (zero_extract:DI (xor:DI (reg:DI 94) (reg/v:DI 87 [ aD.2763 ])) (const_int 8 [0x8]) (const_int 8 [0x8]))) 2023-06-24 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog config/i386/i386.md (*<code>qi_ext<mode>_3): New define_insn.
2023-06-24	Fortran: ABI for scalar CHARACTER(LEN=1),VALUE dummy argument [PR110360]	Harald Anlauf	1	-8/+13
	gcc/fortran/ChangeLog: PR fortran/110360 * trans-expr.cc (gfc_conv_procedure_call): Truncate constant string argument of length > 1 passed to scalar CHARACTER(1),VALUE dummy.
2023-06-24	RISC-V: Refactor the integer ternary autovec pattern	Juzhe-Zhong	1	-26/+28
	A long time ago, I encountered an ICE when trying to set the clobber register as Pmode, and I forgot the reason. So, I clobbered an SI scratch and used PUT_MODE to make it Pmode after reload, which makes the patterns look unreasonable. Following Jeff's comments, I tried it again; setting the clobber register as Pmode works now and the patterns look more reasonable. The tests all pass. Ok for trunk? gcc/ChangeLog: * config/riscv/autovec.md (fma<mode>): set clobber to Pmode in expand stage. (fma<VI:mode><P:mode>): Ditto. (fnma<mode>): Ditto. (fnma<VI:mode><P:mode>): Ditto.
2023-06-24	RISC-V: Support RVV floating-point auto-vectorization	Juzhe-Zhong	40	-34/+1386
	This patch adds RVV floating-point auto-vectorization. Also, fix attribute bug of floating-point ternary operations in vector.md. gcc/ChangeLog: * config/riscv/autovec.md (fma<mode>4): New pattern. (fma<mode>): Ditto. (fnma<mode>4): Ditto. (fnma<mode>): Ditto. (fms<mode>4): Ditto. (fms<mode>): Ditto. (fnms<mode>4): Ditto. (fnms<mode>): Ditto. * config/riscv/riscv-protos.h (emit_vlmax_fp_ternary_insn): New function. * config/riscv/riscv-v.cc (emit_vlmax_fp_ternary_insn): Ditto. * config/riscv/vector.md: Fix attribute bug. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/ternop/ternop-1.c: Adjust tests. * gcc.target/riscv/rvv/autovec/ternop/ternop-2.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop-3.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop-4.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop-5.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop-6.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-1.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-2.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-3.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-4.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-5.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-6.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop-10.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop-11.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop-12.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop-7.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop-8.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop-9.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-10.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-11.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-12.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-7.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run-8.c: New test. 
* gcc.target/riscv/rvv/autovec/ternop/ternop_run-9.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-1.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-10.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-11.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-12.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-2.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-3.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-4.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-5.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-6.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-7.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-8.c: New test. * gcc.target/riscv/rvv/autovec/ternop/ternop_run_zvfh-9.c: New test.
2023-06-24	LOOP IVOPTS: Apply LEN_MASK_{LOAD,STORE}	Ju-Zhe Zhong	1	-3/+11
	Hi, Jeff. I fixed the format as you suggested. OK for trunk? gcc/ChangeLog: * tree-ssa-loop-ivopts.cc (get_mem_type_for_internal_fn): Apply LEN_MASK_{LOAD,STORE}.
2023-06-24	IVOPTS: Add LEN_MASK_{LOAD, STORE} into 'get_alias_ptr_type_for_ptr_address'	Ju-Zhe Zhong	1	-0/+2
	gcc/ChangeLog: * tree-ssa-loop-ivopts.cc (get_alias_ptr_type_for_ptr_address): Add LEN_MASK_{LOAD,STORE}.
2023-06-23	text-art: remove explicit #include of C++ standard library headers	David Malcolm	18	-5/+25
	gcc/analyzer/ChangeLog: * access-diagram.cc: Add #define INCLUDE_VECTOR. * bounds-checking.cc: Likewise. gcc/ChangeLog: * diagnostic-format-sarif.cc: Add #define INCLUDE_VECTOR. * diagnostic.cc: Likewise. * text-art/box-drawing.cc: Likewise. * text-art/canvas.cc: Likewise. * text-art/ruler.cc: Likewise. * text-art/selftests.cc: Likewise. * text-art/selftests.h (text_art::canvas): New forward decl. * text-art/style.cc: Add #define INCLUDE_VECTOR. * text-art/styled-string.cc: Likewise. * text-art/table.cc: Likewise. * text-art/table.h: Remove #include <vector>. * text-art/theme.cc: Add #define INCLUDE_VECTOR. * text-art/types.h: Check that INCLUDE_VECTOR is defined. Remove #include of <vector> and <string>. * text-art/widget.cc: Add #define INCLUDE_VECTOR. * text-art/widget.h: Remove #include <vector>. gcc/testsuite/ChangeLog: * gcc.dg/plugin/diagnostic_plugin_test_text_art.c: Add #define INCLUDE_VECTOR. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-06-24	VECT: Apply LEN_MASK_{LOAD,STORE} into vectorizer	Ju-Zhe Zhong	4	-83/+267
	Address comments from Richard and Bernhard on the V5 patch. V6 fixes all issues raised in their comments. gcc/ChangeLog: * internal-fn.cc (expand_partial_store_optab_fn): Adapt for LEN_MASK_STORE. (internal_load_fn_p): Add LEN_MASK_LOAD. (internal_store_fn_p): Add LEN_MASK_STORE. (internal_fn_mask_index): Add LEN_MASK_{LOAD,STORE}. (internal_fn_stored_value_index): Add LEN_MASK_STORE. (internal_len_load_store_bias): Add LEN_MASK_{LOAD,STORE}. * optabs-tree.cc (can_vec_mask_load_store_p): Adapt for LEN_MASK_{LOAD,STORE}. (get_len_load_store_mode): Ditto. * optabs-tree.h (can_vec_mask_load_store_p): Ditto. (get_len_load_store_mode): Ditto. * tree-vect-stmts.cc (check_load_store_for_partial_vectors): Ditto. (get_all_ones_mask): New function. (vectorizable_store): Apply LEN_MASK_{LOAD,STORE} into vectorizer. (vectorizable_load): Ditto.
2023-06-24	Daily bump.	GCC Administrator	6	-1/+154

2023-06-23	compiler, libgo: support bootstrapping gc compiler	Ian Lance Taylor	3	-3/+33
	In the Go 1.21 release the package internal/profile imports internal/lazyregexp. That works when bootstrapping with Go 1.17, because that compiler has internal/lazyregexp and permits importing it. We also have internal/lazyregexp in libgo, but since it is not installed it is not available for importing. This CL adds internal/lazyregexp to the list of internal packages that are installed for bootstrapping. The Go 1.21, and earlier, releases have a couple of functions in the internal/abi package that are always fully intrinsified. The gofrontend recognizes and intrinsifies those functions as well. However, the gofrontend was also building function descriptors for references to the functions without calling them, which failed because there was nothing to refer to. That is OK for the gc compiler, which guarantees that the functions are only called, not referenced. This CL arranges to not generate function descriptors for these functions. For golang/go#60913 Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/504798
2023-06-23	c++: provide #include hint for missing includes [PR110164]	David Malcolm	4	-1/+22
	PR c++/110164 notes that in cases where we have a forward decl of a std library type such as: std::array<int, 10> x; we emit this diagnostic: error: aggregate ‘std::array<int, 10> x’ has incomplete type and cannot be defined This patch adds this hint to the diagnostic: note: ‘std::array’ is defined in header ‘<array>’; this is probably fixable by adding ‘#include <array>’ gcc/cp/ChangeLog: PR c++/110164 * cp-name-hint.h (maybe_suggest_missing_header): New decl. * decl.cc: Define INCLUDE_MEMORY. Add include of "cp/cp-name-hint.h". (start_decl_1): Call maybe_suggest_missing_header. * name-lookup.cc (maybe_suggest_missing_header): Remove "static". gcc/testsuite/ChangeLog: PR c++/110164 * g++.dg/diagnostic/missing-header-pr110164.C: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
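A minimal sketch of the situation the new hint addresses, assuming the fix the note suggests is applied: with `<array>` included, `std::array<int, 10>` is a complete type and the definition compiles. The function name `array_extent` is hypothetical and only exists to make the snippet checkable.

```cpp
#include <array>    // the header named in the hint; if it is missing, some other
                    // header may provide only a forward declaration of std::array
#include <cstddef>

// Hypothetical helper: defines the same object as in the PR and reports its size.
std::size_t array_extent()
{
  std::array<int, 10> x{};  // complete type here, so the definition is accepted
  return x.size();
}
```

Removing the `#include <array>` line may or may not reproduce the error in a given translation unit, since it depends on whether another standard header happens to forward-declare `std::array`.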
2023-06-23	c++: Add support for -std={c,gnu}++2{c,6}	Marek Polacek	9	-12/+103
	It seems prudent to add C++26 now that the first C++26 papers have been approved. I followed commit r11-6920 as well as r8-3237. Since C++23 is essentially finished and its __cplusplus value has settled to 202302L, I've updated cpp_init_builtins and marked -std=c++2b Undocumented and made -std=c++23 no longer Undocumented. As for __cplusplus, I've chosen 202400L: $ xg++ -std=c++26 -dM -E -x c++ - < /dev/null | grep cplusplus #define __cplusplus 202400L I've verified the patch with a simple test, exercising the new directives. Don't forget to update your GXX_TESTSUITE_STDS! This patch does not add -Wc++26-extensions. gcc/c-family/ChangeLog: * c-common.h (cxx_dialect): Add cxx26 as a dialect. * c-opts.cc (set_std_cxx26): New. (c_common_handle_option): Set options when -std={c,gnu}++2{c,6} is enabled. (c_common_post_options): Adjust comments. * c.opt: Add options for -std=c++26, std=c++2c, -std=gnu++26, and -std=gnu++2c. (std=c++2b): Mark as Undocumented. (std=c++23): No longer Undocumented. gcc/ChangeLog: * doc/cpp.texi (__cplusplus): Document value for -std=c++26 and -std=gnu++26. Document that for C++23, its value is 202302L. * doc/invoke.texi: Document -std=c++26 and -std=gnu++26. * dwarf2out.cc (highest_c_language): Handle GNU C++26. (gen_compile_unit_die): Likewise. libcpp/ChangeLog: * include/cpplib.h (c_lang): Add CXX26 and GNUCXX26. * init.cc (lang_defaults): Add rows for CXX26 and GNUCXX26. (cpp_init_builtins): Set __cplusplus to 202400L for C++26. Set __cplusplus to 202302L for C++23. gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_c++23): Return 1 also if check_effective_target_c++26. (check_effective_target_c++23_down): New. (check_effective_target_c++26_only): New. (check_effective_target_c++26): New. * g++.dg/cpp23/cplusplus.C: Adjust expected value. * g++.dg/cpp26/cplusplus.C: New test.
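As a rough illustration of how user code might react to these `__cplusplus` values, the sketch below maps a value to a dialect name. The function `dialect_name` and its labels are hypothetical. The 202302L and 202400L values come from the commit message above; the older thresholds are the published values for earlier standards. The 202400L value is provisional and should be expected to change once C++26 is finalized, so the sketch treats anything above 202302L as "newer than C++23" rather than testing for equality.

```cpp
#include <string>

// Hypothetical helper: classify a __cplusplus value by dialect.
// Thresholds are checked from newest to oldest.
std::string dialect_name(long v)
{
  if (v > 202302L)  return "newer than C++23";  // e.g. 202400L with -std=c++26 here
  if (v >= 202302L) return "C++23";
  if (v >= 202002L) return "C++20";
  if (v >= 201703L) return "C++17";
  if (v >= 201402L) return "C++14";
  if (v >= 201103L) return "C++11";
  return "C++98";
}
```

Calling `dialect_name(__cplusplus)` reports the dialect of the current translation unit. Comparing with `>=` rather than `==` keeps the check robust against interim values that compilers use before a standard is published.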