riscv-gnu-toolchain/gcc.git

Age	Commit message	Author	Files	Lines
2024-08-29	RISC-V: Fix subreg of VLS modes larger than a vector [PR116086].	Robin Dapp	3	-0/+248
	When the source mode is potentially larger than one vector (e.g. an LMUL2 mode for VLEN=128) we don't know which vector the subreg actually refers to. For zvl128b and LMUL=2 the subreg in (subreg:V2DI (reg:V4DI)) could actually be a full (high) vector register of a two-register group (at VLEN=128) or the higher part of a single register (at VLEN>128). As the subreg is statically ambiguous we prevent such situations in can_change_mode_class. The culprit in PR116086 is _12 = BIT_FIELD_REF <vect_cst__42, 128, 128>; which can be expanded with a vector-vector extract (from V4DI to V2DI). This patch adds a VLS-mode vector-vector extract that handles "halving" cases like this one by sliding down the source vector, thus making sure the correct part is used. PR target/116086 gcc/ChangeLog: * config/riscv/autovec.md (vec_extract<mode><v_half>): Add vector-vector extract for VLS modes. * config/riscv/riscv.cc (riscv_can_change_mode_class): Forbid VLS modes larger than one vector. * config/riscv/vector-iterators.md: Add vector-vector extract iterators. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Add effective target checks for zvl256b and zvl512b. * gcc.target/riscv/rvv/autovec/pr116086-2-run.c: New test. * gcc.target/riscv/rvv/autovec/pr116086-2.c: New test. * gcc.target/riscv/rvv/autovec/pr116086.c: New test.
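The ambiguity described in the PR116086 entry above can be made concrete with a little arithmetic. The sketch below is not GCC code: `locate_subreg` and `struct subreg_loc` are hypothetical names, used only to show where byte offset 16 of a V4DI value (the start of its high V2DI half) lands for two different vector lengths.

```c
#include <assert.h>

/* Hypothetical illustration, not part of GCC: position of a byte inside a
   group of vector registers that each hold vlen_bits bits.  */
struct subreg_loc
{
  unsigned reg_index;   /* which register of the group */
  unsigned byte_in_reg; /* byte offset inside that register */
};

static struct subreg_loc
locate_subreg (unsigned byte_offset, unsigned vlen_bits)
{
  assert (vlen_bits >= 8 && vlen_bits % 8 == 0);
  unsigned bytes_per_reg = vlen_bits / 8;
  struct subreg_loc loc = { byte_offset / bytes_per_reg,
                            byte_offset % bytes_per_reg };
  return loc;
}
```

With VLEN=128, `locate_subreg (16, 128)` points at the start of the second register of the group; with VLEN=256, `locate_subreg (16, 256)` points at the upper 16 bytes of the first register. A single static subreg cannot describe both, which is why the commit slides the source vector down instead.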
2024-08-28	RISC-V: Add missing mode_idx for vrol and vror	Kito Cheng	1	-1/+1
	We added patterns for vector rotate, but it seems we forgot to add the mode_idx that is used in AVL propagation (riscv-avlprop.cc). gcc/ChangeLog: * config/riscv/vector.md (mode_idx): Add vrol and vror. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/rotr.c: New.
2024-08-27	RISC-V: Move helper functions above expand_const_vector	Patrick O'Neill	1	-66/+66
	These subroutines will be used in expand_const_vector in a future patch. Relocate so expand_const_vector can use them. gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_vector_init_insert_elems): Relocate. (expand_vector_init_trailing_same_elem): Ditto. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-08-27	RISC-V: Allow non-duplicate bool patterns in expand_const_vector	Patrick O'Neill	1	-15/+8
	Currently we assert when encountering a non-duplicate boolean vector. This patch allows non-duplicate vectors to fall through to the gcc_unreachable and assert there. This will be useful when adding a catch-all pattern to emit costs and handle arbitrary vectors. gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_const_vector): Allow non-duplicate to fall through other patterns before asserting. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-08-27	RISC-V: Handle 0.0 floating point pattern costing to match const_vector expander	Patrick O'Neill	3	-6/+15
	The comment previously here stated that the Wc0/Wc1 cases are handled by the vi constraint but that is not true for the 0.0 Wc0 case. gcc/ChangeLog: * config/riscv/riscv-v.h (valid_vec_immediate_p): Add new helper. * config/riscv/riscv-v.cc (valid_vec_immediate_p): Ditto. (expand_const_vector): Use new helper. * config/riscv/riscv.cc (riscv_const_insns): Handle 0.0 floating-point case. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-08-27	RISC-V: Emit costs for bool and stepped const vectors	Patrick O'Neill	3	-52/+131
	These cases are handled in the expander (riscv-v.cc:expand_const_vector). We need the vector builder to detect these cases so extract that out into a new riscv-v.h header file. gcc/ChangeLog: * config/riscv/riscv-v.cc (class rvv_builder): Move to riscv-v.h. * config/riscv/riscv.cc (riscv_const_insns): Emit placeholder costs for bool/stepped const vectors. * config/riscv/riscv-v.h: New file. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-08-27	RISC-V: Handle case when constant vector construction target rtx is not a register	Patrick O'Neill	1	-32/+41
	This manifests in RTL that is optimized away which causes runtime failures in the testsuite. Update all patterns to use a temp result register if required. gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_const_vector): Use tmp register if needed. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-08-27	RISC-V: Reorder insn cost match order to match corresponding expander match order	Patrick O'Neill	1	-9/+9
	The corresponding expander (riscv-v.cc:expand_const_vector) matches const_vec_duplicate_p before const_vec_series_p. Reorder to match this behavior when calculating costs. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_const_insns): Relocate. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-08-27	RISC-V: Fix vid const vector expander for non-npatterns size steps	Patrick O'Neill	1	-6/+42
	Prior to this patch the expander would emit vectors like: { 0, 0, 5, 5, 10, 10, ...} as: { 0, 0, 2, 2, 4, 4, ...} This patch sets the step size to the requested value. gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_const_vector): Fix STEP size in expander. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
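The stepped pattern in the entry above can be written down as a scalar reference. This is only a sketch of the intended element values, not the expander itself; `fill_stepped` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference model: groups of `npatterns` equal elements,
   where each group is `step` larger than the previous one.  */
static void
fill_stepped (int64_t *out, size_t n, size_t npatterns, int64_t step)
{
  assert (npatterns > 0);
  for (size_t i = 0; i < n; i++)
    out[i] = (int64_t) (i / npatterns) * step;
}
```

Calling `fill_stepped (v, 6, 2, 5)` gives the requested { 0, 0, 5, 5, 10, 10 }. Calling it with `step` equal to `npatterns` (2) gives { 0, 0, 2, 2, 4, 4 }, which is the output the commit message reports from before the fix.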
2024-08-27	RISC-V: Support IMM for operand 1 of ussub pattern	Pan Li	2	-2/+2
	This patch allows an IMM for operand 1 of the ussub pattern, i.e. .SAT_SUB(x, 22), as in the example below. Form 2: #define DEF_SAT_U_SUB_IMM_FMT_2(T, IMM) \ T __attribute__((noinline)) \ sat_u_sub_imm##IMM##_##T##_fmt_2 (T x) \ { \ return x >= (T)IMM ? x - (T)IMM : 0; \ } DEF_SAT_U_SUB_IMM_FMT_2(uint64_t, 1022) It is almost the same as the IMM support for operand 0 of the ussub pattern, but allows the second operand to be an IMM instead of the first operand. The test suites below passed for this patch: 1. The rv64gcv full regression test. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_expand_ussub): Gen xmode for the second operand, aka y in parameter. * config/riscv/riscv.md (ussub<mode>3): Allow const_int for operand 2. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add test helper macros. * gcc.target/riscv/sat_u_sub_imm-5.c: New test. * gcc.target/riscv/sat_u_sub_imm-5_1.c: New test. * gcc.target/riscv/sat_u_sub_imm-5_2.c: New test. * gcc.target/riscv/sat_u_sub_imm-6.c: New test. * gcc.target/riscv/sat_u_sub_imm-6_1.c: New test. * gcc.target/riscv/sat_u_sub_imm-6_2.c: New test. * gcc.target/riscv/sat_u_sub_imm-7.c: New test. * gcc.target/riscv/sat_u_sub_imm-7_1.c: New test. * gcc.target/riscv/sat_u_sub_imm-7_2.c: New test. * gcc.target/riscv/sat_u_sub_imm-8.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-5.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-6.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-7.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-08-26	RISC-V: Support IMM for operand 0 of ussub pattern	Pan Li	2	-2/+46
	This patch allows an IMM for operand 0 of the ussub pattern, i.e. .SAT_SUB(1023, y), as in the example below. Form 1: #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \ T __attribute__((noinline)) \ sat_u_sub_imm##IMM##_##T##_fmt_1 (T y) \ { \ return (T)IMM >= y ? (T)IMM - y : 0; \ } DEF_SAT_U_SUB_IMM_FMT_1(uint64_t, 1023) Before this patch: sat_u_sub_imm82_uint64_t_fmt_1: li a5,82 bgtu a0,a5,.L3 sub a0,a5,a0 ret .L3: li a0,0 ret After this patch: sat_u_sub_imm82_uint64_t_fmt_1: li a5,82 sltu a4,a5,a0 addi a4,a4,-1 sub a0,a5,a0 and a0,a4,a0 ret The test suites below passed for this patch: 1. The rv64gcv full regression test. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_gen_unsigned_xmode_reg): Add new func impl to gen xmode rtx reg from operand rtx. (riscv_expand_ussub): Gen xmode reg for operand 1. * config/riscv/riscv.md: Allow const_int for operand 1. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add test helper macro. * gcc.target/riscv/sat_u_sub_imm-1.c: New test. * gcc.target/riscv/sat_u_sub_imm-1_1.c: New test. * gcc.target/riscv/sat_u_sub_imm-1_2.c: New test. * gcc.target/riscv/sat_u_sub_imm-2.c: New test. * gcc.target/riscv/sat_u_sub_imm-2_1.c: New test. * gcc.target/riscv/sat_u_sub_imm-2_2.c: New test. * gcc.target/riscv/sat_u_sub_imm-3.c: New test. * gcc.target/riscv/sat_u_sub_imm-3_1.c: New test. * gcc.target/riscv/sat_u_sub_imm-3_2.c: New test. * gcc.target/riscv/sat_u_sub_imm-4.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-1.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-2.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-3.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-4.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
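The branchless sequence shown after the patch in the entry above (sltu, addi -1, sub, and) corresponds to the following C, written here as a sketch. `sat_u_sub_branchless` is a hypothetical name, and the mapping of each line to an instruction is my reading of the listing rather than something the commit states.

```c
#include <stdint.h>

/* Hypothetical C rendering of the branchless unsigned saturating subtract.
   Returns x - y, or 0 when y is larger than x.  */
static uint64_t
sat_u_sub_branchless (uint64_t x, uint64_t y)
{
  uint64_t borrow = x < y;     /* sltu: 1 when the subtraction would wrap */
  uint64_t mask = borrow - 1;  /* addi -1: all ones if no wrap, else zero */
  uint64_t diff = x - y;       /* sub: wraps modulo 2^64 when x < y */
  return diff & mask;          /* and: keep diff or force zero */
}
```

For the immediate 82 used in the listing, `sat_u_sub_branchless (82, 10)` is 72 and `sat_u_sub_branchless (82, 100)` is 0, with no branch in either case.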
2024-08-25	RISC-V: Fix double mode under RV32 not utilize vf	demin.han	1	-1/+2
	Currently, some binops of vector vs double scalar under RV32 can't be translated to vf and become vfmv+vxx.vv instead. The cause is that vec_duplicate is also expanded to broadcast for double mode under RV32, and last-combine can't process the expanded broadcast. gcc/ChangeLog: * config/riscv/vector.md: Add !FLOAT_MODE_P constraint. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vadd-rv32gcv-nofm.c: Fix test. * gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vmul-rv32gcv-nofm.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vsub-rv32gcv-nofm.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv32gcv.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fadd-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fadd-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fadd-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fadd-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-5.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-6.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c: Ditto. 
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-5.c: Ditto.
2024-08-23	RISC-V: Use encoded nelts when calling repeating_sequence_p	Patrick O'Neill	1	-7/+3
	repeating_sequence_p operates directly on the encoded pattern and does not derive elements using the .elt() accessor. Passing in the length of the unencoded vector can cause an out-of-bounds read of the encoded pattern. gcc/ChangeLog: * config/riscv/riscv-v.cc (rvv_builder::can_duplicate_repeating_sequence_p): Use encoded_nelts when calling repeating_sequence_p. (rvv_builder::is_repeating_sequence): Ditto. (rvv_builder::repeating_sequence_use_merge_profitable_p): Ditto. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-08-23	RISC-V: Expand vec abs without masking.	Robin Dapp	1	-18/+8
	Standard abs synthesis during expand is max (a, -a). This expansion has the advantage of avoiding masking and is thus potentially faster than the a < 0 ? -a : a synthesis. gcc/ChangeLog: * config/riscv/autovec.md (abs<mode>2): Expand via max (a, -a). gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c: Adjust test expectation. * gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c: Ditto. * gcc.target/riscv/rvv/autovec/vls/abs-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_unary-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_unary-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_unary-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_unary-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_unary-5.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_unary-6.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_unary-7.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_unary-8.c: Ditto.
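The two abs syntheses mentioned in the entry above can be compared on scalars. This is a sketch only; `abs_via_select` and `abs_via_max` are hypothetical names, and the negation is done in unsigned arithmetic so that the most negative value wraps rather than invoking undefined behavior (the conversion back to `int32_t` is implementation-defined in C, and wraps on GCC).

```c
#include <stdint.h>

/* Negate with wraparound, mirroring what a vector negate does per element.  */
static int32_t
wrapping_neg (int32_t a)
{
  return (int32_t) (0u - (uint32_t) a);
}

/* Select-based form: a < 0 ? -a : a.  In vector code the condition becomes
   a mask that the negate has to be predicated on.  */
static int32_t
abs_via_select (int32_t a)
{
  return a < 0 ? wrapping_neg (a) : a;
}

/* Max-based form: max (a, -a).  No mask is needed, only a negate and a max.  */
static int32_t
abs_via_max (int32_t a)
{
  int32_t neg = wrapping_neg (a);
  return a > neg ? a : neg;
}
```

Both forms agree for every input; the max form simply avoids the compare-and-mask step, which is the advantage the commit message points to.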
2024-08-22	RISC-V: Fix vector cfi notes for stack-clash protection	Raphael Moreira Zinsly	1	-2/+16
	The stack-clash code is generating wrong cfi directives in riscv_v_adjust_scalable_frame because REG_CFA_DEF_CFA has a different encoding than REG_FRAME_RELATED_EXPR. This patch fixes the offset sign in the prologue and starts using REG_CFA_DEF_CFA in the epilogue. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_v_adjust_scalable_frame): Add epilogue code for stack-clash and fix prologue cfi note. gcc/testsuite/ChangeLog: * gcc.target/riscv/stack-check-cfa-3.c: Fix the expected output.
2024-08-18	RISC-V: Implement the quad and oct .SAT_TRUNC for scalar	Pan Li	2	-0/+40
	This patch implements the quad and oct .SAT_TRUNC patterns in the riscv backend. Aka: Form 1: #define DEF_SAT_U_TRUC_FMT_1(NT, WT) \ NT __attribute__((noinline)) \ sat_u_truc_##WT##_to_##NT##_fmt_1 (WT x) \ { \ bool overflow = x > (WT)(NT)(-1); \ return ((NT)x) | (NT)-overflow; \ } DEF_SAT_U_TRUC_FMT_1(uint16_t, uint64_t) Before this patch: __attribute__((noinline)) uint16_t sat_u_truc_uint64_t_to_uint16_t_fmt_1 (uint64_t x) { _Bool overflow; short unsigned int _1; short unsigned int _2; short unsigned int _3; uint16_t _6; ;; basic block 2, loop depth 0 ;; pred: ENTRY overflow_5 = x_4(D) > 65535; _1 = (short unsigned int) x_4(D); _2 = (short unsigned int) overflow_5; _3 = -_2; _6 = _1 | _3; return _6; ;; succ: EXIT } After this patch: __attribute__((noinline)) uint16_t sat_u_truc_uint64_t_to_uint16_t_fmt_1 (uint64_t x) { uint16_t _6; ;; basic block 2, loop depth 0 ;; pred: ENTRY _6 = .SAT_TRUNC (x_4(D)); [tail call] return _6; ;; succ: EXIT } The test suites below passed for this patch: 1. The rv64gcv full regression test. 2. The rv64gcv build with glibc. gcc/ChangeLog: * config/riscv/iterators.md (ANYI_QUAD_TRUNC): New iterator for quad truncation. (ANYI_OCT_TRUNC): New iterator for oct truncation. (ANYI_QUAD_TRUNCATED): New attr for truncated quad modes. (ANYI_OCT_TRUNCATED): New attr for truncated oct modes. (anyi_quad_truncated): Ditto but for lower case. (anyi_oct_truncated): Ditto but for lower case. * config/riscv/riscv.md (ustrunc<mode><anyi_quad_truncated>2): Add new pattern for quad truncation. (ustrunc<mode><anyi_oct_truncated>2): Ditto but for oct. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-2.c: Adjust the expand dump check times. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-3.c: Ditto. 
* gcc.target/riscv/sat_arith_data.h: Add test helper macros. * gcc.target/riscv/sat_u_trunc-4.c: New test. * gcc.target/riscv/sat_u_trunc-5.c: New test. * gcc.target/riscv/sat_u_trunc-6.c: New test. * gcc.target/riscv/sat_u_trunc-run-4.c: New test. * gcc.target/riscv/sat_u_trunc-run-5.c: New test. * gcc.target/riscv/sat_u_trunc-run-6.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-08-18	RISC-V: Make sure high bits of usadd operands is clean for non-Xmode [PR116278]	Pan Li	1	-12/+22
	For QI/HImode of .SAT_ADD, the operands may be sign-extended, so the high bits of the Xmode register may be all 1s, which is not expected. For example, consider the code below. signed char b[1]; unsigned short c; signed char *d = b; int main() { b[0] = -40; c = ({ (unsigned short)d[0] < 0xFFF6 ? (unsigned short)d[0] : 0xFFF6; }) + 9; __builtin_printf("%d\n", c); } After expanding we have: ;; _6 = .SAT_ADD (_3, 9); (insn 8 7 9 (set (reg:DI 143) (high:DI (symbol_ref:DI ("d") [flags 0x86] <var_decl d>))) (nil)) (insn 9 8 10 (set (reg/f:DI 142) (mem/f/c:DI (lo_sum:DI (reg:DI 143) (symbol_ref:DI ("d") [flags 0x86] <var_decl d>)) [1 d+0 S8 A64])) (nil)) (insn 10 9 11 (set (reg:HI 144 [ _3 ]) (sign_extend:HI (mem:QI (reg/f:DI 142) [0 d.0_1+0 S1 A8]))) "test.c":7:10 -1 (nil)) The conversion from signed char to unsigned short produces the sign_extend RTL above, which finally becomes the lb insn below: lb a1,0(a5) // a1 is -40, aka 0xffffffffffffffd8 lui a0,0x1a addi a5,a1,9 slli a5,a5,0x30 srli a5,a5,0x30 // a5 is 65505 sltu a1,a5,a1 // compare 65505 and 0xffffffffffffffd8 => TRUE The sltu compares 65505 and 0xffffffffffffffd8 here, but we actually want to compare 65505 and 65496 (0xffd8). Thus we need to clear the high bits to ensure this. The test suites below passed for this patch: * The rv64gcv full regression test. PR target/116278 gcc/ChangeLog: * config/riscv/riscv.cc (riscv_gen_zero_extend_rtx): Add new func impl to zero extend rtx. (riscv_expand_usadd): Leverage above func to cleanup operands 0 and remove the special handling for SImode in RV64. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_u_add-11.c: Adjust asm check body. * gcc.target/riscv/sat_u_add-15.c: Ditto. * gcc.target/riscv/sat_u_add-19.c: Ditto. * gcc.target/riscv/sat_u_add-23.c: Ditto. * gcc.target/riscv/sat_u_add-3.c: Ditto. * gcc.target/riscv/sat_u_add-7.c: Ditto. * gcc.target/riscv/sat_u_add_imm-11.c: Ditto. * gcc.target/riscv/sat_u_add_imm-15.c: Ditto. * gcc.target/riscv/sat_u_add_imm-3.c: Ditto. 
* gcc.target/riscv/sat_u_add_imm-7.c: Ditto. * gcc.target/riscv/pr116278-run-1.c: New test. * gcc.target/riscv/pr116278-run-2.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
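The PR116278 failure above can be reproduced with plain 64-bit integer arithmetic. The sketch below models the registers as `uint64_t`; both function names are hypothetical, and the model follows the assembly in the commit message rather than the actual expander code.

```c
#include <stdint.h>

/* Model of the miscompiled sequence: x_reg still carries sign-extension
   bits above bit 15 when it reaches the overflow compare.  */
static uint16_t
sat_add_u16_dirty (uint64_t x_reg, uint64_t y_reg)
{
  uint64_t sum = (x_reg + y_reg) & 0xffff;  /* addi; slli 48; srli 48 */
  uint64_t overflow = sum < x_reg;          /* sltu against the dirty x */
  return (uint16_t) (sum | ((uint64_t) 0 - overflow));
}

/* Same computation after zero-extending the operands first.  */
static uint16_t
sat_add_u16_clean (uint64_t x_reg, uint64_t y_reg)
{
  x_reg &= 0xffff;
  y_reg &= 0xffff;
  uint64_t sum = (x_reg + y_reg) & 0xffff;
  uint64_t overflow = sum < x_reg;
  return (uint16_t) (sum | ((uint64_t) 0 - overflow));
}
```

With `x_reg = 0xffffffffffffffd8` (the sign-extended -40) and `y_reg = 9`, the dirty version saturates to 0xffff because 65505 compares below the sign-extended value, while the clean version returns the expected 65505.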
2024-08-17	t-rtems: add rv32imf architecture to the RTEMS multilib for RISC-V	Kevin Kirspel	1	-2/+3
	The attached patch is specific to the RTEMS RISC-V architecture multilib which is controlled by the t-rtems file in the gcc/config/riscv/ directory. The patch file was created from the gcc-13.3.0 branch. It was successfully tested within RTEMS Source Builder. gcc/ * config/riscv/t-rtems: Add ilp32f multilib.
2024-08-17	RISC-V: Fix ICE for vector single-width integer multiply-add intrinsics	Jin Ma	1	-40/+40
	When rs1 is the immediate 0, the following ICE occurs: error: unrecognizable insn: (insn 8 5 12 2 (set (reg:RVVM1DI 134 [ <retval> ]) (if_then_else:RVVM1DI (unspec:RVVMF64BI [ (const_vector:RVVMF64BI repeat [ (const_int 1 [0x1]) ]) (reg/v:DI 137 [ vl ]) (const_int 2 [0x2]) repeated x2 (const_int 0 [0]) (reg:SI 66 vl) (reg:SI 67 vtype) ] UNSPEC_VPREDICATE) (plus:RVVM1DI (mult:RVVM1DI (vec_duplicate:RVVM1DI (const_int 0 [0])) (reg/v:RVVM1DI 136 [ vs2 ])) (reg/v:RVVM1DI 135 [ vd ])) (reg/v:RVVM1DI 135 [ vd ]))) gcc/ChangeLog: * config/riscv/vector.md: Allow scalar operand to be 0. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/bug-7.c: New test. * gcc.target/riscv/rvv/base/bug-8.c: New test.
2024-08-17	[RISC-V][PR target/116282] Stabilize pattern conditions	Jeff Law	4	-31/+55
	So as expected the core problem with target/116282 is that the cost of certain constant synthesis cases varied depending on whether or not we're allowed to generate new pseudos or not. That in turn meant that in obscure cases an insn might change from recognizable to unrecognizable and triggers the observed failure. So we need to keep the cost stable, at least when called from a pattern's condition. So we pass another boolean down when necessary. I've tried to keep API fallout minimized. Built and tested on rv32 in my tester. Let's see what pre-commit testing has to say though 🙂 Note this will also require a minor change to the in-flight constant synthesis work. PR target/116282 gcc/ * config/riscv/riscv-protos.h (riscv_const_insns): Add new argument. * config/riscv/riscv.cc (riscv_build_integer): Add new argument ALLOW_NEW_PSEUDOS. Pass it down to recursive calls and check it before using synthesis which allows new registers to be created. (riscv_split_integer_cost): Pass new argument to riscv_build_integer. (riscv_integer_cost): Add ALLOW_NEW_PSEUDOS argument, pass it down to riscv_build_integer. (riscv_legitimate_constant_p): Pass new argument to riscv_const_insns. (riscv_const_insns): New argument ALLOW_NEW_PSEUDOS. Pass it down to riscv_integer_cost and riscv_const_insns. (riscv_split_const_insns): Pass new argument to riscv_const_insns. (riscv_move_integer, riscv_rtx_costs): Similarly. * config/riscv/riscv.md (shadd with costly constant): Pass new argument to riscv_const_insns. * config/riscv/bitmanip.md (and with costly constant): Pass new argument to riscv_const_insns. gcc/testsuite/ * gcc.target/riscv/pr116282.c: New test.
2024-08-17	RISC-V: Bugfix for RVV rounding intrinsic ICE in function checker	Jin Ma	3	-3/+7
	When compiling an interface for rounding of type 'vfloat16' without using zvfh or zvfhmin, it is not enough to use FLOAT_MODE_P because the type does not support it. Although the subsequent riscv_validate_vector_type checks will still fail and throw exceptions, I don't think we should have an ICE here. internal compiler error: in check, at config/riscv/riscv-vector-builtins-shapes.cc:444 10 | return __riscv_vfadd_vv_f16m1_rm (vs2, vs1, 0, vl); | ^~~~~~ 0x4191794 internal_error(char const*, ...) /iothome/jin.ma/code/master/gcc/gcc/diagnostic-global-context.cc:491 0x416ebf5 fancy_abort(char const*, int, char const*) /iothome/jin.ma/code/master/gcc/gcc/diagnostic.cc:1772 0x220aae6 riscv_vector::build_frm_base::check(riscv_vector::function_checker&) const /iothome/jin.ma/code/master/gcc/gcc/config/riscv/riscv-vector-builtins-shapes.cc:444 0x2205323 riscv_vector::function_checker::check() /iothome/jin.ma/code/master/gcc/gcc/config/riscv/riscv-vector-builtins.cc:4414 gcc/ChangeLog: * config/riscv/riscv-protos.h (riscv_vector_float_type_p): New. * config/riscv/riscv-vector-builtins.cc (function_instance::any_type_float_p): Use riscv_vector_float_type_p instead of FLOAT_MODE_P for judgment. * config/riscv/riscv.cc (riscv_vector_int_type_p): Change static to extern. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/bug-9.c: New test.
2024-08-17	RISC-V: Bugfix incorrect operand for vwsll auto-vect	Pan Li	1	-0/+4
	This patch fixes an ICE for vwsll with rv64gcv_zvbb. Consider the example below. void vwsll_vv_test (short *restrict dst, char *restrict a, int *restrict b, int n) { for (int i = 0; i < n; i++) dst[i] = a[i] << b[i]; } It will hit the vwsll pattern with the following operands. operand 0 -> (reg:RVVMF2HI 146 [ vect__7.13 ]) operand 1 -> (reg:RVVMF4QI 165 [ vect_cst__33 ]) operand 2 -> (reg:RVVM1SI 171 [ vect_cst__36 ]) According to the ISA, operand 2 should have the same mode as operand 1, i.e. RVVMF4QI as above. Thus, add quad truncation for operand 2 before emitting vwsll. The test suites below passed for this patch: the rv64gcv full regression test. PR target/116280 gcc/ChangeLog: * config/riscv/autovec-opt.md: Add quad truncation to align the mode requirement for vwsll. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr116280-1.c: New test. * gcc.target/riscv/rvv/base/pr116280-2.c: New test.
2024-08-17	RISC-V: Add auto-vect pattern for vector rotate shift	Feng Wang	1	-0/+16
	This patch adds the vector rotate shift pattern for auto-vect. With this patch, the scalar rotate shift can be automatically vectorized into vector rotate shift. gcc/ChangeLog: * config/riscv/autovec.md (v<bitmanip_optab><mode>3): Add new define_expand pattern for vector rotate shift. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vrolr-1.c: New test. * gcc.target/riscv/rvv/autovec/binop/vrolr-run.c: New test. * gcc.target/riscv/rvv/autovec/binop/vrolr-template.h: New test.
2024-08-17	RISC-V: Fix factor in dwarf_poly_indeterminate_value [PR116305]	曾治金	1	-2/+2
	This patch fixes the bug (PR116305) introduced by commit bd93ef for the RISC-V target. Commit bd93ef changes chunk_num from 1 to TARGET_MIN_VLEN/128 in riscv_convert_vector_bits if TARGET_MIN_VLEN is larger than 128, and therefore changes the value of BYTES_PER_RISCV_VECTOR. For example, if TARGET_MIN_VLEN is 256, the value of BYTES_PER_RISCV_VECTOR was [8, 8] before commit bd93ef but is now [16, 16]. The values of riscv_bytes_per_vector_chunk and BYTES_PER_RISCV_VECTOR are no longer equal. The prologue uses BYTES_PER_RISCV_VECTOR.coeffs[1] to estimate the vlenb register value in riscv_legitimize_poly_move, and dwarf2cfi also gets the estimated vlenb register value in riscv_dwarf_poly_indeterminate_value to calculate the factor by which to multiply the vlenb register value. So the factor needs to change from riscv_bytes_per_vector_chunk to BYTES_PER_RISCV_VECTOR; otherwise we get incorrect DWARF information. An incorrect example follows:
```
csrr t0,vlenb
slli t1,t0,1
sub sp,sp,t1
.cfi_escape 0xf,0xb,0x72,0,0x92,0xa2,0x38,0,0x34,0x1e,0x23,0x50,0x22
```
	The sequence '0x92,0xa2,0x38,0' means the vlenb register, '0x34' means the literal 4, and '0x1e' means the multiply operation. But in fact, the vlenb register value only needs to be multiplied by the literal 2. PR target/116305 gcc/ChangeLog: * config/riscv/riscv.cc (riscv_dwarf_poly_indeterminate_value): Take BYTES_PER_RISCV_VECTOR for factor instead of riscv_bytes_per_vector_chunk. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/scalable_vector_cfi.c: New test. Signed-off-by: Zhijin Zeng <zhijin.zeng@spacemit.com>
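The register operand in the .cfi_escape bytes of the PR116305 entry above is ULEB128-encoded, as DWARF specifies for the operand of the 0x92 (DW_OP_bregx) opcode. A minimal decoder, sketched under the assumption of well-formed input that fits in 64 bits, shows how the pair 0xa2,0x38 turns into a register number; `decode_uleb128` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one unsigned LEB128 value starting at p.  Assumes the encoding is
   well formed and short enough to fit in 64 bits.  Stores the number of
   bytes consumed in *len when len is not NULL.  */
static uint64_t
decode_uleb128 (const uint8_t *p, size_t *len)
{
  uint64_t value = 0;
  unsigned shift = 0;
  size_t i = 0;
  uint8_t byte;

  do
    {
      byte = p[i++];
      assert (shift < 64);
      value |= (uint64_t) (byte & 0x7f) << shift; /* low 7 bits are payload */
      shift += 7;
    }
  while (byte & 0x80); /* high bit set means another byte follows */

  if (len)
    *len = i;
  return value;
}
```

Decoding the two bytes 0xa2,0x38 yields 7202, which equals 4096 + 0xc22. That matches my reading of the RISC-V psABI DWARF numbering, where CSRs start at 4096 and vlenb is CSR 0xc22, and is consistent with the commit message calling this sequence "the vlenb register".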
2024-08-15	RISC-V: use fclass insns to implement isfinite,isnormal and isinf builtins	Vineet Gupta	1	-0/+63
	Currently these builtins use float compare instructions which require FP flags to be saved/restored which could be costly in uarch. RV Base ISA already has FCLASS.{d,s,h} instruction to compare/identify FP values w/o disturbing FP exception flags. Now that upstream supports the corresponding optabs, wire them up in the backend. gcc/ChangeLog: * config/riscv/riscv.md: define_insn for fclass insn. define_expand for isfinite, isnormal, isinf. gcc/testsuite/ChangeLog: * gcc.target/riscv/fclass.c: New tests. Tested-by: Edwin Lu <ewlu@rivosinc.com> # pre-commit-CI #2060 Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
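How a class mask can answer isinf, isnormal and isfinite is sketched below with a software model of FCLASS.D. The bit layout (bit 0 for negative infinity up to bit 9 for quiet NaN) follows the RISC-V ISA manual; the function names and the three masks are my own derivation from that layout, not taken from the patch, which is not shown in this log.

```c
#include <stdint.h>
#include <string.h>

/* Software model of FCLASS.D: returns a word with exactly one bit set.
   bit 0: -inf        bit 1: -normal     bit 2: -subnormal  bit 3: -0
   bit 4: +0          bit 5: +subnormal  bit 6: +normal     bit 7: +inf
   bit 8: signaling NaN                  bit 9: quiet NaN  */
static unsigned
fclass_d (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);

  int neg = (int) (bits >> 63);
  unsigned exp = (unsigned) ((bits >> 52) & 0x7ff);
  uint64_t frac = bits & 0xfffffffffffffULL; /* low 52 bits */

  if (exp == 0x7ff)
    {
      if (frac == 0)
        return neg ? 1u << 0 : 1u << 7;
      return ((frac >> 51) & 1) ? 1u << 9 : 1u << 8;
    }
  if (exp == 0)
    {
      if (frac == 0)
        return neg ? 1u << 3 : 1u << 4;
      return neg ? 1u << 2 : 1u << 5;
    }
  return neg ? 1u << 1 : 1u << 6;
}

/* Each predicate is a single AND against the class word.  */
static int isinf_via_fclass (double x)    { return (fclass_d (x) & 0x81) != 0; }
static int isnormal_via_fclass (double x) { return (fclass_d (x) & 0x42) != 0; }
static int isfinite_via_fclass (double x) { return (fclass_d (x) & 0x7e) != 0; }
```

Because the classification only inspects the bit pattern, none of the predicates performs a floating-point compare, which is the property the commit relies on to leave the FP exception flags untouched.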
2024-08-13	RISC-V: Fix non-obvious comment typos	Patrick O'Neill	5	-8/+8
	This fixes the remainder of the typos I found when reading various parts of the RISC-V backend. gcc/ChangeLog: * config/riscv/riscv-v.cc (legitimize_move): extrac -> extract. (expand_vec_cmp_float): Remove duplicate vmnor.mm. * config/riscv/riscv-vector-builtins.cc: ins -> insns. * config/riscv/riscv.cc (riscv_init_machine_status): mwrvv -> mrvv. * config/riscv/vector-iterators.md: RVVM8QImde -> RVVM8QImode * config/riscv/vector.md: Replaced non-existent vsetivl with vsetivli. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-08-09	[RISC-V][PR target/116283] Fix split code for recent Zbs improvements with masked bit positions	Jeff Law	1	-6/+18
	So Patrick's fuzzer found an interesting little buglet in the Zbs improvements I added a couple months back. Specifically, when we have a masked bit position for a Zbs instruction: if the mask has extraneous bits set we'll generate an unrecognizable insn due to an invalid constant. More concretely, let's take this pattern: > (define_insn_and_split "" > [(set (match_operand:DI 0 "register_operand" "=r") > (any_extend:DI > (ashift:SI (const_int 1) > (subreg:QI (and:DI (match_operand:DI 1 "register_operand" "r") > (match_operand 2 "const_int_operand")) 0))))] What do we need to know to transform this into bset for rv64? After masking the shift count we want to know the low 5 bits aren't 0x1f. If they were 0x1f, then the constant generated would be 0x80000000 which would then need sign extension out to 64bits, which the bset instruction will not do for us. We can ignore anything outside the low 5 bits. The mode of the shift is SI, so shifting by 32+ bits is undefined behavior. It's also worth explicitly mentioning that the hardware is going to mask the count against 0x3f. The net is if (operands[2] & 0x1f) != 0x1f, then this transformation is safe. So onto the generated split code... > [(set (match_dup 0) (and:DI (match_dup 1) (match_dup 2))) > (set (match_dup 0) (zero_extend:DI (ashift:SI > (const_int 1) > (subreg:QI (match_dup 0) 0))))] Which would seemingly do exactly what we want. The problem is the first split insn. If the constant does not fit into a simm12, that insn won't be recognized resulting in the ICE. The fix is simple, we just need to mask the constant before generating RTL. We can just mask it against 0x1f since we only care about the low 5 bits. This affects multiple patterns. I've added the appropriate fix to all of them. Tested in my tester. Waiting for the pre-commit bits to run before pushing. PR target/116283 gcc/ * config/riscv/bitmanip.md (Zbs combiner patterns/splitters): Mask the bit position in the split code appropriately. 
gcc/testsuite/ * gcc.target/riscv/pr116283.c: New test
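The two conditions discussed in the PR116283 entry above, namely when the bset transformation is safe and why masking the constant makes the first split insn recognizable, can be sketched in C. All three helper names are hypothetical, and the simm12 range check reflects the usual 12-bit signed immediate range of -2048 to 2047.

```c
#include <stdbool.h>
#include <stdint.h>

/* The transformation is safe when the masked bit position can never be 31,
   because 1 << 31 in SImode would need sign extension to 64 bits.  */
static bool
bset_transform_safe (uint64_t mask)
{
  return (mask & 0x1f) != 0x1f;
}

/* Only the low 5 bits matter for an SImode shift count, so the split code
   can use this reduced constant instead of the original mask.  */
static uint64_t
split_mask (uint64_t mask)
{
  return mask & 0x1f;
}

/* True when v fits in a 12-bit signed immediate.  */
static bool
fits_simm12 (int64_t v)
{
  return v >= -2048 && v <= 2047;
}
```

For a mask such as 0x100f the transformation is safe, but the raw constant does not fit a simm12, so an and-immediate using it would not be recognized; `split_mask (0x100f)` is 0xf, which always fits.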
2024-08-09	RISC-V: Enable stack clash in alloca	Raphael Moreira Zinsly	2	-0/+34
	Add the TARGET_STACK_CLASH_PROTECTION_ALLOCA_PROBE_RANGE to riscv in order to enable stack clash protection when using alloca. The code and tests are the same used by aarch64. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_compute_frame_info): Update outgoing args size. (riscv_stack_clash_protection_alloca_probe_range): New. (TARGET_STACK_CLASH_PROTECTION_ALLOCA_PROBE_RANGE): New. * config/riscv/riscv.h (STACK_CLASH_MIN_BYTES_OUTGOING_ARGS): New. (STACK_DYNAMIC_OFFSET): New. gcc/testsuite/ChangeLog: * gcc.target/riscv/stack-check-14.c: New test. * gcc.target/riscv/stack-check-15.c: New test. * gcc.target/riscv/stack-check-alloca-1.c: New test. * gcc.target/riscv/stack-check-alloca-2.c: New test. * gcc.target/riscv/stack-check-alloca-3.c: New test. * gcc.target/riscv/stack-check-alloca-4.c: New test. * gcc.target/riscv/stack-check-alloca-5.c: New test. * gcc.target/riscv/stack-check-alloca-6.c: New test. * gcc.target/riscv/stack-check-alloca-7.c: New test. * gcc.target/riscv/stack-check-alloca-8.c: New test. * gcc.target/riscv/stack-check-alloca-9.c: New test. * gcc.target/riscv/stack-check-alloca-10.c: New test. * gcc.target/riscv/stack-check-alloca.h: New.
2024-08-09	RISC-V: Add support to vector stack-clash protection	Raphael Moreira Zinsly	2	-21/+83
	Adds basic support to vector stack-clash protection using a loop to do the probing and stack adjustments. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_allocate_and_probe_stack_loop): New function. (riscv_v_adjust_scalable_frame): Add stack-clash protection support. (riscv_allocate_and_probe_stack_space): Move the probe loop implementation to riscv_allocate_and_probe_stack_loop. * config/riscv/riscv.h: Define RISCV_STACK_CLASH_VECTOR_CFA_REGNUM. gcc/testsuite/ChangeLog: * gcc.target/riscv/stack-check-cfa-3.c: New test. * gcc.target/riscv/stack-check-prologue-16.c: New test. * gcc.target/riscv/struct_vect_24.c: New test.
2024-08-09	RISC-V: Stack-clash protection implementation	Raphael Moreira Zinsly	2	-35/+217
	This implements stack-clash protection for riscv, with riscv_allocate_and_probe_stack_space being based on aarch64_allocate_and_probe_stack_space from aarch64's implementation. We enforce the probing interval and the guard size to always be equal; their default value is 4KB, which is the riscv page size. We also probe up by 1024 bytes in the general case when a probe is required. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_option_override): Enforce that interval is the same size as guard size. (riscv_allocate_and_probe_stack_space): New function. (riscv_expand_prologue): Call riscv_allocate_and_probe_stack_space to the final allocation of the stack and add stack-clash dump information. * config/riscv/riscv.h: Define STACK_CLASH_CALLER_GUARD and STACK_CLASH_MAX_UNROLL_PAGES. gcc/testsuite/ChangeLog: * gcc.dg/params/blocksort-part.c: Skip riscv for stack-clash protection intervals. * gcc.dg/pr82788.c: Skip riscv. * gcc.dg/stack-check-6.c: Skip residual check for riscv. * gcc.dg/stack-check-6a.c: Skip riscv. * gcc.target/riscv/stack-check-12.c: New test. * gcc.target/riscv/stack-check-13.c: New test. * gcc.target/riscv/stack-check-cfa-1.c: New test. * gcc.target/riscv/stack-check-cfa-2.c: New test. * gcc.target/riscv/stack-check-prologue-1.c: New test. * gcc.target/riscv/stack-check-prologue-10.c: New test. * gcc.target/riscv/stack-check-prologue-11.c: New test. * gcc.target/riscv/stack-check-prologue-12.c: New test. * gcc.target/riscv/stack-check-prologue-13.c: New test. * gcc.target/riscv/stack-check-prologue-14.c: New test. * gcc.target/riscv/stack-check-prologue-15.c: New test. * gcc.target/riscv/stack-check-prologue-2.c: New test. * gcc.target/riscv/stack-check-prologue-3.c: New test. * gcc.target/riscv/stack-check-prologue-4.c: New test. * gcc.target/riscv/stack-check-prologue-5.c: New test. * gcc.target/riscv/stack-check-prologue-6.c: New test. * gcc.target/riscv/stack-check-prologue-7.c: New test. * gcc.target/riscv/stack-check-prologue-8.c: New test. 
* gcc.target/riscv/stack-check-prologue-9.c: New test. * gcc.target/riscv/stack-check-prologue.h: New file. * lib/target-supports.exp (check_effective_target_supports_stack_clash_protection): Add riscv. (check_effective_target_caller_implicit_probes): Likewise.
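The general idea behind stack-clash probing (never move the stack pointer by more than one guard-sized interval without touching the newly allocated memory) can be sketched as follows. This is a simplified model and not the logic in riscv_allocate_and_probe_stack_space: the function name probe_schedule is hypothetical, and leaving the residual allocation unprobed is an assumption made to keep the sketch short.

```python
def probe_schedule(frame_size, interval=4096):
    """Split a frame allocation into (bytes_to_allocate, probe_afterwards) steps.

    Hypothetical helper: every full interval is allocated and then probed,
    so the stack pointer never skips over an untouched guard-sized region.
    The final residual (smaller than one interval) is left unprobed here;
    a real implementation decides this from additional parameters.
    """
    steps = []
    remaining = frame_size
    while remaining >= interval:
        steps.append((interval, True))
        remaining -= interval
    if remaining:
        steps.append((remaining, False))
    return steps


# A 10000-byte frame with the default 4KB interval from the commit message.
schedule = probe_schedule(10000)
```

Each tuple corresponds to one stack-pointer adjustment; the boolean says whether a store to the new area follows it.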
2024-08-09	RISC-V: Move riscv_v_adjust_scalable_frame	Raphael Moreira Zinsly	1	-31/+31
	Move riscv_v_adjust_scalable_frame () in preparation for the stack clash protection support. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_v_adjust_scalable_frame): Move closer to riscv_expand_prologue.
2024-08-09	RISC-V: Small stack tie changes	Raphael Moreira Zinsly	2	-10/+10
	Enable the register used by riscv_emit_stack_tie () to be passed as an argument so we can tie the stack with other registers besides hard_frame_pointer_rtx. Also don't allow operand 1 of stack_tie<mode> to be optimized to sp in preparation for the stack clash protection support. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_emit_stack_tie): Pass the register to be tied to the stack pointer as argument. * config/riscv/riscv.md (stack_tie<mode>): Don't match equal operands.
2024-08-08	RISC-V: rv32/DF: Prevent 2 SImode loads using XTheadMemIdx	Christoph Müllner	2	-2/+11
	When enabling XTheadFmv/Zfa and XThead(F)MemIdx, we might end up with the following insn (registers are examples, but of correct class): (set (reg:DF a4) (mem:DF (plus:SI (mult:SI (reg:SI a0) (const_int 8)) (reg:SI a5)))) This is a result of an attempt to load the DF register via two SI register loads followed by XTheadFmv/Zfa instructions to move the contents of the two SI registers into the DF register. The two loads are generated in riscv_split_doubleword_move(), where the second load adds an offset of 4 to the load address. While this works fine for RVI loads, this can't be handled for XTheadMemIdx addresses. Coming back to the example above, we would end up with the following insn, which can't be simplified or matched: (set (reg:SI a4) (mem:SI (plus:SI (plus:SI (mult:SI (reg:SI a0) (const_int 8)) (reg:SI a5)) (const_int 4)))) This triggered an ICE in the past, which was resolved in b79cd204c780, which also added the test xtheadfmemidx-medany.c, where the examples are from. The patch postponed the optimization insn_and_split pattern for XThead(F)MemIdx, so that the situation could effectively be avoided. Since we don't want to rely on these optimization patterns in the future, we need a different solution. Therefore, this patch restricts the movdf_hardfloat_rv32 insn to not match for split-double-word-moves with XThead(F)MemIdx operands. This ensures we don't need to split them up later. When looking at the code generation of the test file, we can see that we have fewer GP<->FP conversions, but cannot use the indexed loads. The new sequence is identical to rv32gc_xtheadfmv (similar to rv32gc_zfa). Old: [...] lla a5,.LANCHOR0 th.flrd fa5,a5,a0,3 fmv.x.w a4,fa5 th.fmv.x.hw a5,fa5 .L1: fmv.w.x fa0,a4 th.fmv.hw.x fa0,a5 ret [...] New: [...] lla a5,.LANCHOR0 slli a4,a0,3 add a4,a4,a5 lw a5,4(a4) lw a4,0(a4) .L1: fmv.w.x fa0,a4 th.fmv.hw.x fa0,a5 ret [...]
This was tested (together with the patch that eliminates the XTheadMemIdx optimization patterns) with SPEC CPU 2017 intrate on QEMU (RV64/lp64d). gcc/ChangeLog: * config/riscv/constraints.md (th_m_noi): New constraint. * config/riscv/riscv.md: Adjust movdf_hardfloat_rv32 for XTheadMemIdx. gcc/testsuite/ChangeLog: * gcc.target/riscv/xtheadfmemidx-xtheadfmv-medany.c: Adjust. * gcc.target/riscv/xtheadfmemidx-zfa-medany.c: Likewise. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
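Why the split fails for indexed addresses can be illustrated with a toy address model. The sketch below is an assumption-laden simplification, not GCC code: the tuple encodings and the function split_word_addresses are hypothetical, and only two address shapes are modelled (register plus immediate, and base register plus shifted index register).

```python
def split_word_addresses(addr):
    """Return the addresses of the low and high 32-bit halves, or None.

    Hypothetical encodings:
      ("base+imm", reg, imm)             plain RVI-style address
      ("base+index", base, index, shift) XTheadMemIdx-style indexed address
    """
    if addr[0] == "base+imm":
        _, reg, imm = addr
        # The second word sits 4 bytes higher; the offset folds into imm.
        return [("base+imm", reg, imm), ("base+imm", reg, imm + 4)]
    # (base + (index << shift)) + 4 has no slot for the extra constant
    # in this model, so the second load cannot be expressed.
    return None


rvi = split_word_addresses(("base+imm", "a4", 0))
indexed = split_word_addresses(("base+index", "a5", "a0", 3))
```

In this model the indexed form has to be materialised into a single register first, which matches the slli/add sequence followed by two lw instructions in the new output quoted above.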
2024-08-08	RISC-V: xthead(f)memidx: Eliminate optimization patterns	Christoph Müllner	2	-430/+75
	We have a large number of optimization patterns (insn_and_split) for XTheadMemIdx and XTheadFMemIdx that attempt to do something that can be done more efficiently by generic GCC passes, if we have proper support code. A key function in eliminating the optimization patterns is th_memidx_classify_address_index(), which needs to identify each possible memory expression that can be lowered into a XTheadMemIdx/XTheadFMemIdx instruction. This patch adds all memory expressions that were previously only recognized by the optimization patterns. Now that the address classification is complete, we can finally remove all optimization patterns, with the side-effect of getting rid of the non-canonical memory expression they produced: (plus (reg) (ashift (reg) (imm))). A positive side-effect of this change is that we address an RV32 ICE that was caused by the th_memidx_I_c pattern, which did not properly handle SUBREGs (more details are in PR116131). A temporary negative side-effect of this change is that we cause a regression of the xtheadfmemidx + xtheadfmv/zfa tests (initially introduced as part of b79cd204c780 to address an ICE). As this issue cannot be addressed in the code parts that are adjusted in this patch, we just accept the regression for now. PR target/116131 gcc/ChangeLog: * config/riscv/thead.cc (th_memidx_classify_address_index): Recognize all possible XTheadMemIdx memory operand structures. (th_fmemidx_output_index): Do strict classification. * config/riscv/thead.md (th_memidx_operand): Remove. (TARGET_XTHEADMEMIDX): Likewise. (TARGET_HARD_FLOAT && TARGET_XTHEADFMEMIDX): Likewise. (!TARGET_64BIT && TARGET_XTHEADMEMIDX): Likewise. (th_memidx_I_a): Likewise. (th_memidx_I_b): Likewise. (th_memidx_I_c): Likewise. (th_memidx_US_a): Likewise. (th_memidx_US_b): Likewise. (th_memidx_US_c): Likewise. (th_memidx_UZ_a): Likewise. (th_memidx_UZ_b): Likewise. (th_memidx_UZ_c): Likewise. (th_fmemidx_movsf_hardfloat): Likewise. (th_fmemidx_movdf_hardfloat_rv64): Likewise.
(th_fmemidx_I_a): Likewise. (th_fmemidx_I_c): Likewise. (th_fmemidx_US_a): Likewise. (th_fmemidx_US_c): Likewise. (th_fmemidx_UZ_a): Likewise. (th_fmemidx_UZ_c): Likewise. gcc/testsuite/ChangeLog: * gcc.target/riscv/pr116131.c: New test. Reported-by: Patrick O'Neill <patrick@rivosinc.com> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
2024-08-08	RISC-V: Delete duplicate '#define RISCV_DWARF_VLENB'	Jin Ma	1	-2/+0
	gcc/ChangeLog: * config/riscv/riscv.h (RISCV_DWARF_VLENB): Delete.
2024-08-08	[RISC-V][PR target/116240] Ensure object is a comparison before extracting arguments	Jeff Law	1	-2/+4
	This was supposed to go out the door yesterday, but I kept getting interrupted. The target bits for rtx costing can't assume the rtl they're given actually matches a target pattern. It's just kind of inherent in how the costing routines get called in various places. In this particular case we're trying to cost a conditional move: (set (dest) (if_then_else (cond) (true) (false))) On the RISC-V port the backend only allows actual conditionals for COND. So something like (eq (reg) (const_int 0)). In the costing code for if-then-else we did something like (XEXP (XEXP (cond, 0), 0)) Which fails miserably if COND is a terminal node like (reg) rather than (ne (reg) (const_int 0)). So this patch tightens up the RTL scanning to ensure that we have a comparison before we start looking at the comparison's arguments. Run through my tester without incident, but I'll wait for the pre-commit tester to run through a cycle before pushing to the trunk. Jeff ps. We probably could support a naked REG for the condition and internally convert it to (ne (reg) (const_int 0)), but I don't think it likely happens with any regularity. PR target/116240 gcc/ * config/riscv/riscv.cc (riscv_rtx_costs): Ensure object is a comparison before looking at its arguments. gcc/testsuite * gcc.target/riscv/pr116240.c: New test.
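The guard described in this commit (check that the condition really is a comparison before reading its operands) can be modelled on a toy expression tree. This is only an illustration in Python, not the code in riscv_rtx_costs: the tuple representation, COMPARISON_CODES, and both helper functions are hypothetical.

```python
# Toy RTL-like expressions as nested tuples: (code, operand0, operand1, ...).
COMPARISON_CODES = {"eq", "ne", "lt", "le", "gt", "ge", "ltu", "leu", "gtu", "geu"}


def is_comparison(expr):
    """True when EXPR is a two-operand comparison node in this toy model."""
    return (isinstance(expr, tuple) and len(expr) == 3
            and expr[0] in COMPARISON_CODES)


def comparison_lhs(cond):
    """Return the first argument of COND, or None if COND is not a comparison.

    Without the is_comparison check, a bare ("reg", "a0") condition would be
    indexed as if it had comparison operands, which is the failure mode the
    commit message describes.
    """
    if not is_comparison(cond):
        return None
    return cond[1]


reg = ("reg", "a0")
good = ("if_then_else", ("ne", reg, ("const_int", 0)), ("reg", "a1"), ("reg", "a2"))
bare = ("if_then_else", reg, ("reg", "a1"), ("reg", "a2"))
```

Calling comparison_lhs on the condition of good yields the register operand, while the bare register condition is rejected instead of being dereferenced.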
2024-08-08	RISC-V: Minimal support for Zimop extension.	Jiawei	1	-0/+7
	This patch adds minimal support for the Zimop and Zcmop extensions [1], so that GCC recognizes and processes them correctly at compile time. [1] https://github.com/riscv/riscv-isa-manual/blob/main/src/zimop.adoc gcc/ChangeLog: * common/config/riscv/riscv-common.cc: New extension. * config/riscv/riscv.opt: New mask. gcc/testsuite/ChangeLog: * gcc.target/riscv/arch-42.c: New test. * gcc.target/riscv/arch-43.c: New test.
2024-08-06	RISC-V: Fix typos in code	Patrick O'Neill	7	-52/+52
	This fixes typos in function names and executed code. gcc/ChangeLog: * config/riscv/riscv-target-attr.cc (num_occurences_in_str): Rename... (num_occurrences_in_str): here. (riscv_process_target_attr): Update num_occurences_in_str callsite. * config/riscv/riscv-v.cc (emit_vec_widden_cvt_x_f): widden -> widen. (emit_vec_widen_cvt_x_f): Ditto. (emit_vec_widden_cvt_f_f): Ditto. (emit_vec_widen_cvt_f_f): Ditto. (emit_vec_rounding_to_integer): Update widden callsites. * config/riscv/riscv-vector-builtins.cc (expand_builtin): Update required_ext_to_isa_name callsite and fix xtheadvector typo. * config/riscv/riscv-vector-builtins.h (reqired_ext_to_isa_name): Rename... (required_ext_to_isa_name): here. * config/riscv/riscv_th_vector.h: Fix endif label. * config/riscv/vector-crypto.md: boardcast_scalar -> broadcast_scalar. * config/riscv/vector.md: Ditto. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-08-06	RISC-V: Fix comment typos	Patrick O'Neill	21	-70/+70
	This fixes most of the typos I found when reading various parts of the RISC-V backend. gcc/ChangeLog: * config/riscv/arch-canonicalize: Fix typos in comments. * config/riscv/autovec.md: Ditto. * config/riscv/riscv-avlprop.cc (avl_can_be_propagated_p): Ditto. (pass_avlprop::get_vlmax_ta_preferred_avl): Ditto. * config/riscv/riscv-modes.def (ADJUST_FLOAT_FORMAT): Ditto. (VLS_MODES): Ditto. * config/riscv/riscv-opts.h (TARGET_ZICOND_LIKE): Ditto. (enum rvv_vector_bits_enum): Ditto. * config/riscv/riscv-protos.h (enum insn_flags): Ditto. (enum insn_type): Ditto. * config/riscv/riscv-sr.cc (riscv_sr_match_epilogue): Ditto. * config/riscv/riscv-string.cc (expand_block_move): Ditto. * config/riscv/riscv-v.cc (rvv_builder::is_repeating_sequence): Ditto. (rvv_builder::single_step_npatterns_p): Ditto. (calculate_ratio): Ditto. (expand_const_vector): Ditto. (shuffle_merge_patterns): Ditto. (shuffle_compress_patterns): Ditto. (expand_select_vl): Ditto. * config/riscv/riscv-vector-builtins-functions.def (REQUIRED_EXTENSIONS): Ditto. * config/riscv/riscv-vector-builtins-shapes.h: Ditto. * config/riscv/riscv-vector-builtins.cc (function_builder::add_function): Ditto. (resolve_overloaded_builtin): Ditto. * config/riscv/riscv-vector-builtins.def (vbool1_t): Ditto. (vuint8m8_t): Ditto. (vuint16m8_t): Ditto. (vfloat16m8_t): Ditto. (unsigned_vector): Ditto. * config/riscv/riscv-vector-builtins.h (enum required_ext): Ditto. * config/riscv/riscv-vector-costs.cc (get_store_value): Ditto. (costs::analyze_loop_vinfo): Ditto. (costs::add_stmt_cost): Ditto. * config/riscv/riscv.cc (riscv_build_integer): Ditto. (riscv_vector_type_p): Ditto. * config/riscv/thead.cc (th_mempair_output_move): Ditto. * config/riscv/thead.md: Ditto. * config/riscv/vector-iterators.md: Ditto. * config/riscv/vector.md: Ditto. * config/riscv/zc.md: Ditto. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-08-06	RISC-V: Fix format-diag warning from improperly formatted url	Patrick O'Neill	1	-2/+2
	gcc/ChangeLog: PR target/116152 * config/riscv/riscv.cc (riscv_option_override): Fix url formatting. gcc/testsuite/ChangeLog: * gcc.target/riscv/predef-9.c: Update testcase. Co-authored-by: Jakub Jelinek <jakub@redhat.com> Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-08-05	RISC-V: Add deprecation warning to LP64E abi	Patrick O'Neill	1	-0/+7
	gcc/ChangeLog: PR target/116152 * config/riscv/riscv.cc (riscv_option_override): Add deprecation warning. gcc/testsuite/ChangeLog: * gcc.target/riscv/predef-9.c: Add check for warning. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-08-02	RISC-V: Improve length attributes for atomic insn sequences	Patrick O'Neill	3	-8/+21
	gcc/ChangeLog: * config/riscv/sync-rvwmo.md: Add conditional length attributes. * config/riscv/sync-ztso.md: Ditto. * config/riscv/sync.md: Fix incorrect insn length attributes and reformat existing conditional checks. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-08-01	RISC-V: Correct mode_idx attribute for viwalu wx variants [PR116149].	Robin Dapp	1	-0/+2
	In PR116149 we choose a wrong vector length which causes wrong values in a reduction. The problem happens in avlprop where we choose the number of units in the instruction's mode as vector length. For the non-scalar variants the respective operand has the correct non-widened mode. For the scalar variants, however, the same operand has a scalar mode which obviously only has one unit. This makes us choose VL = 1 leaving three elements undisturbed (so potentially -1). Those end up in the reduction causing the wrong result. This patch adjusts the mode_idx just for the scalar variants of the affected instruction patterns. gcc/ChangeLog: PR target/116149 * config/riscv/vector.md: Fix mode_idx attribute of scalar widen add/sub variants. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr116149.c: New test.
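The effect of choosing VL = 1 can be reproduced with a small model of a tail-undisturbed vector operation followed by a reduction. The sketch is an illustration only: widening_add_wx is a hypothetical stand-in for the scalar widening-add variants, and the element values are made up.

```python
def widening_add_wx(dest, wide_src, scalar, vl):
    """Add SCALAR to the first VL elements of WIDE_SRC.

    Elements at index >= vl keep whatever DEST held before
    (tail-undisturbed behaviour in this toy model).
    """
    out = list(dest)
    for i in range(vl):
        out[i] = wide_src[i] + scalar
    return out


old_dest = [-1, -1, -1, -1]   # stale contents of the destination register
wide_src = [10, 20, 30, 40]

correct = widening_add_wx(old_dest, wide_src, 1, vl=4)
wrong = widening_add_wx(old_dest, wide_src, 1, vl=1)

# A reduction over all four elements picks up the stale -1 values
# when only one element was actually written.
correct_sum = sum(correct)
wrong_sum = sum(wrong)
```

With vl=4 the reduction sees four freshly computed elements; with vl=1 three stale elements leak into the sum, which is the kind of wrong result reported in the PR.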
2024-08-01	RISC-V: Reject 'd' extension with ILP32E ABI	Patrick O'Neill	1	-0/+5
	Also add a testcase for -mabi=lp64d where 'd' is required. gcc/ChangeLog: PR target/116111 * config/riscv/riscv.cc (riscv_option_override): Add error. gcc/testsuite/ChangeLog: * gcc.target/riscv/arch-41.c: New test. * gcc.target/riscv/pr116111.c: New test. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-07-30	RISC-V: Add basic support for the Zacas extension	Gianluca Guida	3	-12/+102
	This patch adds support for amocas.{b\|h\|w\|d}. Support for amocas.q (64/128 bit cas for rv32/64) will be added in a future patch. Extension: https://github.com/riscv/riscv-zacas Ratification: https://jira.riscv.org/browse/RVS-680 gcc/ChangeLog: * common/config/riscv/riscv-common.cc: Add zacas extension. * config/riscv/arch-canonicalize: Make zacas imply zaamo. * config/riscv/riscv.opt: Add zacas. * config/riscv/sync.md (zacas_atomic_cas_value<mode>): New pattern. (atomic_compare_and_swap<mode>): Use new pattern for compare-and-swap ops. (zalrsc_atomic_cas_value_strong<mode>): Rename atomic_cas_value_strong. * doc/sourcebuild.texi: Add Zacas documentation. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Add zacas testsuite infra support. * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire-release.c: Remove zacas to continue to test the lr/sc pairs. * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire.c: Ditto. * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-consume.c: Ditto. * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-relaxed.c: Ditto. * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-release.c: Ditto. * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst-relaxed.c: Ditto. * gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst.c: Ditto. * gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire-release.c: Ditto. * gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire.c: Ditto. * gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-consume.c: Ditto. * gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-relaxed.c: Ditto. * gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-release.c: Ditto. * gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst-relaxed.c: Ditto. * gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst.c: Ditto. * gcc.target/riscv/amo/zabha-zacas-preferred-over-zalrsc.c: New test. 
* gcc.target/riscv/amo/zacas-char-requires-zabha.c: New test. * gcc.target/riscv/amo/zacas-char-requires-zacas.c: New test. * gcc.target/riscv/amo/zacas-preferred-over-zalrsc.c: New test. * gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-char-acq-rel.c: New test. * gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-char-acquire.c: New test. * gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-char-relaxed.c: New test. * gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-char-release.c: New test. * gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-char-seq-cst.c: New test. * gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-compatability-mapping-no-fence.c: New test. * gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-compatability-mapping.cc: New test. * gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-int-acq-rel.c: New test. * gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-int-acquire.c: New test. * gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-int-relaxed.c: New test. * gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-int-release.c: New test. * gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-int-seq-cst.c: New test. * gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-short-acq-rel.c: New test. * gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-short-acquire.c: New test. * gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-short-relaxed.c: New test. * gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-short-release.c: New test. * gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-short-seq-cst.c: New test. * gcc.target/riscv/amo/zacas-ztso-compare-exchange-char-seq-cst.c: New test. * gcc.target/riscv/amo/zacas-ztso-compare-exchange-char.c: New test. * gcc.target/riscv/amo/zacas-ztso-compare-exchange-compatability-mapping-no-fence.c: New test. * gcc.target/riscv/amo/zacas-ztso-compare-exchange-compatability-mapping.cc: New test. * gcc.target/riscv/amo/zacas-ztso-compare-exchange-int-seq-cst.c: New test. 
* gcc.target/riscv/amo/zacas-ztso-compare-exchange-int.c: New test. * gcc.target/riscv/amo/zacas-ztso-compare-exchange-short-seq-cst.c: New test. * gcc.target/riscv/amo/zacas-ztso-compare-exchange-short.c: New test. Co-authored-by: Patrick O'Neill <patrick@rivosinc.com> Tested-by: Andrea Parri <andrea@rivosinc.com> Signed-Off-By: Gianluca Guida <gianluca@rivosinc.com>
2024-07-30	RISC-V: Take Xmode instead of Pmode for ussub expanding	Pan Li	1	-12/+12
	Pmode is intended for pointers, so use Xmode instead when expanding ussub. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_expand_ussub): Promote to Xmode instead of Pmode. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-07-26	[RISC-V][target/116085] Fix rv64 minmax extension avoidance splitter	Jeff Law	2	-18/+29
	A patch introduced a pattern to avoid unnecessary extensions when doing a min/max operation where one of the values is a 32 bit positive constant. > (define_insn_and_split "minmax" > [(set (match_operand:DI 0 "register_operand" "=r") > (sign_extend:DI > (subreg:SI > (bitmanip_minmax:DI (zero_extend:DI (match_operand:SI 1 "register_operand" "r")) > (match_operand:DI 2 "immediate_operand" "i")) > 0))) > (clobber (match_scratch:DI 3 "=&r")) > (clobber (match_scratch:DI 4 "=&r"))] > "TARGET_64BIT && TARGET_ZBB && sext_hwi (INTVAL (operands[2]), 32) >= 0" > "#" > "&& reload_completed" > [(set (match_dup 3) (sign_extend:DI (match_dup 1))) > (set (match_dup 4) (match_dup 2)) > (set (match_dup 0) (<minmax_optab>:DI (match_dup 3) (match_dup 4)))] Lots going on in here. The key is the nonconstant value is zero extended from SI to DI in the original RTL and we know the constant value is unchanged if we were to sign extend it from 32 to 64 bits. We change the extension of the nonconstant operand from zero to sign extension. I'm pretty confident the goal there is to take advantage of the fact that SI values are kept sign extended and will often be optimized away. The problem occurs when the nonconstant operand has the SI sign bit set. As an example: smax (0x80000000, 0x7) resulting in 0x80000000 The split RTL will generate smax (sign_extend (0x80000000), 0x7) smax (0xffffffff80000000, 0x7) resulting in 0x7 Oops. We really needed to change the opcode to umax for this transformation to work. That's easy enough. But there are further improvements we can make. First the pattern is a define_insn_and_split with a post-reload split condition. It would be better implemented as a 4->3 define_split so that the costing model just works. Second, if operands[1] is a suitably promoted subreg, then we can elide the sign extension when we generate the split code, so often it'll be a 4->2 split, again with the cost model working with no adjustments needed. Tested on rv32 and rv64 in my tester.
I'll wait for the pre-commit tester to spin it as well. PR target/116085 gcc/ * config/riscv/bitmanip.md (minmax extension avoidance splitter): Rewrite as a simpler define_split. Adjust the opcode appropriately. Avoid emitting sign extension if it's clearly not needed. * config/riscv/iterators.md (minmax_optab): Rename to uminmax_optab and map everything to unsigned variants. gcc/testsuite/ * gcc.target/riscv/pr116085.c: New test.
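The smax example from this commit message can be checked with explicit 64-bit arithmetic. The sketch below models the original RTL and both split variants on Python integers masked to 64 bits; all function names are hypothetical and the model covers only the smax case with a positive 32-bit constant.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def zext32(x):
    """Zero-extend the low 32 bits of x to 64 bits."""
    return x & MASK32


def sext32(x):
    """Sign-extend the low 32 bits of x to a 64-bit pattern."""
    x &= MASK32
    return x | (MASK64 ^ MASK32) if x & 0x8000_0000 else x


def as_signed64(x):
    """Interpret a 64-bit pattern as a signed value."""
    return x - (1 << 64) if x & (1 << 63) else x


def smax64(a, b):
    return a if as_signed64(a) >= as_signed64(b) else b


def umax64(a, b):
    return max(a, b)


def original(x, c):
    # sign_extend:DI (subreg:SI (smax:DI (zero_extend:DI x) c) 0)
    return sext32(smax64(zext32(x), c))


def buggy_split(x, c):
    # smax after sign-extending x: wrong when x has the SI sign bit set
    return smax64(sext32(x), c)


def fixed_split(x, c):
    # umax after sign-extending x
    return umax64(sext32(x), c)


samples = [0, 1, 7, 8, 0x7FFF_FFFF, 0x8000_0000, 0xFFFF_FFFF]
all_match = all(original(x, 7) == fixed_split(x, 7) for x in samples)
```

For x = 0x80000000 and c = 7 the buggy variant returns 7, while the original and the umax variant agree on 0xffffffff80000000.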
2024-07-26	RISC-V: Work around bare apostrophe in error string.	Robin Dapp	1	-1/+1
	An unquoted apostrophe slipped through when testing the recent V/M extension patch. This, again, re-words the message to "Currently the 'V' implementation requires the 'M' extension". Going to commit as obvious after testing. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_override_options_internal): Reword error string without apostrophe. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr116036.c: Adjust expected error string.
2024-07-25	RISC-V: xtheadmemidx: Fix mode test for pre/post-modify addressing	Christoph Müllner	1	-4/+2
	auto_inc_dec (-O3) performs optimizations like the following if RVV and XTheadMemIdx are enabled. (insn 23 20 27 3 (set (mem:V4QI (reg:DI 136 [ ivtmp.13 ]) [0 MEM <vector(4) char> [(char )_39]+0 S4 A32]) (reg:V4QI 168)) "gcc/testsuite/gcc.target/riscv/pr116033.c":12:27 3183 {movv4qi} (nil)) (insn 40 39 41 3 (set (reg:DI 136 [ ivtmp.13 ]) (plus:DI (reg:DI 136 [ ivtmp.13 ]) (const_int 20 [0x14]))) 5 {adddi3} (nil)) ====> (insn 23 20 27 3 (set (mem:V4QI (post_modify:DI (reg:DI 136 [ ivtmp.13 ]) (plus:DI (reg:DI 136 [ ivtmp.13 ]) (const_int 20 [0x14]))) [0 MEM <vector(4) char> [(char )_39]+0 S4 A32]) (reg:V4QI 168)) "gcc/testsuite/gcc.target/riscv/pr116033.c":12:27 3183 {movv4qi} (expr_list:REG_INC (reg:DI 136 [ ivtmp.13 ]) (nil))) The pass believes that this is legal because the mode test in th_memidx_classify_address_modify() requires INTEGRAL_MODE_P (mode), which includes vector modes. Let's restrict the mode test so that only MODE_INT is allowed. PR target/116033 gcc/ChangeLog: * config/riscv/thead.cc (th_memidx_classify_address_modify): Fix mode test. gcc/testsuite/ChangeLog: * gcc.target/riscv/pr116033.c: New test. Reported-by: Patrick O'Neill <patrick@rivosinc.com> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
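The difference between the two mode tests can be shown with a toy table of mode classes. This is a sketch under assumptions: the set of classes accepted by integral_mode_p reflects one reading of GCC's INTEGRAL_MODE_P macro and should be checked against machmode.h, and the MODE_CLASS table and both predicates are hypothetical Python stand-ins.

```python
from enum import Enum, auto


class ModeClass(Enum):
    MODE_INT = auto()
    MODE_PARTIAL_INT = auto()
    MODE_COMPLEX_INT = auto()
    MODE_VECTOR_BOOL = auto()
    MODE_VECTOR_INT = auto()
    MODE_FLOAT = auto()


# Assumed to mirror the classes accepted by INTEGRAL_MODE_P.
INTEGRAL_CLASSES = {
    ModeClass.MODE_INT,
    ModeClass.MODE_PARTIAL_INT,
    ModeClass.MODE_COMPLEX_INT,
    ModeClass.MODE_VECTOR_BOOL,
    ModeClass.MODE_VECTOR_INT,
}

# A few modes that appear in the commit message, with their classes.
MODE_CLASS = {
    "SI": ModeClass.MODE_INT,
    "DI": ModeClass.MODE_INT,
    "V4QI": ModeClass.MODE_VECTOR_INT,
    "DF": ModeClass.MODE_FLOAT,
}


def integral_mode_p(mode):
    """Old test: also accepts integer vector modes such as V4QI."""
    return MODE_CLASS[mode] in INTEGRAL_CLASSES


def scalar_int_mode_p(mode):
    """New test: only plain scalar integer modes pass."""
    return MODE_CLASS[mode] is ModeClass.MODE_INT


accepted_before = [m for m in MODE_CLASS if integral_mode_p(m)]
accepted_after = [m for m in MODE_CLASS if scalar_int_mode_p(m)]
```

V4QI passes the old predicate but not the new one, which is why the post_modify form on a vector store is no longer classified as a valid XTheadMemIdx address.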
2024-07-25	rtl-ssa: Define INCLUDE_ARRAY	Richard Sandiford	2	-0/+2
	g:72fbd3b2b2a497dbbe6599239bd61c5624203ed0 added a use of std::array without explicitly forcing <array> to be included. That didn't cause problems in my local builds but understandably did for some people. gcc/ * doc/rtl.texi: Document the need to define INCLUDE_ARRAY before including rtl-ssa.h. * rtl-ssa.h: Likewise (in comment). * config/aarch64/aarch64-cc-fusion.cc: Add INCLUDE_ARRAY. * config/aarch64/aarch64-early-ra.cc: Likewise. * config/riscv/riscv-avlprop.cc: Likewise. * config/riscv/riscv-vsetvl.cc: Likewise. * fwprop.cc: Likewise. * late-combine.cc: Likewise. * pair-fusion.cc: Likewise. * rtl-ssa/accesses.cc: Likewise. * rtl-ssa/blocks.cc: Likewise. * rtl-ssa/changes.cc: Likewise. * rtl-ssa/functions.cc: Likewise. * rtl-ssa/insns.cc: Likewise. * rtl-ssa/movement.cc: Likewise.