Age | Commit message (Collapse) | Author | Files | Lines |
|
When the source mode is potentially larger than one vector (e.g. an
LMUL2 mode for VLEN=128) we don't know which vector the subreg actually
refers to. For zvl128b and LMUL=2 the subreg in (subreg:V2DI (reg:V4DI))
could actually be the a full (high) vector register of a two-register
group (at VLEN=128) or the higher part of a single register (at VLEN>128).
As the subreg is statically ambiguous we prevent such situations in
can_change_mode_class.
The culprit in PR116086 is
_12 = BIT_FIELD_REF <vect_cst__42, 128, 128>;
which can be expanded with a vector-vector extract (from V4DI to V2DI).
This patch adds a VLS-mode vector-vector extract that handles "halving"
cases like this one by sliding down the source vector, thus making sure
the correct part is used.
PR target/116086
gcc/ChangeLog:
* config/riscv/autovec.md (vec_extract<mode><v_half>): Add
vector-vector extract for VLS modes.
* config/riscv/riscv.cc (riscv_can_change_mode_class): Forbid
VLS modes larger than one vector.
* config/riscv/vector-iterators.md: Add vector-vector extract
iterators.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp: Add effective target checks for
zvl256b and zvl512b.
* gcc.target/riscv/rvv/autovec/pr116086-2-run.c: New test.
* gcc.target/riscv/rvv/autovec/pr116086-2.c: New test.
* gcc.target/riscv/rvv/autovec/pr116086.c: New test.
|
|
We add pattern for vector rotate, but seems like we forgot adding
mode_idx which used in AVL propgation (riscv-avlprop.cc).
gcc/ChangeLog:
* config/riscv/vector.md (mode_idx): Add vrol and vror.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/rotr.c: New.
|
|
These subroutines will be used in expand_const_vector in a future patch.
Relocate so expand_const_vector can use them.
gcc/ChangeLog:
* config/riscv/riscv-v.cc (expand_vector_init_insert_elems): Relocate.
(expand_vector_init_trailing_same_elem): Ditto.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
Currently we assert when encountering a non-duplicate boolean vector.
This patch allows non-duplicate vectors to fall through to the
gcc_unreachable and assert there.
This will be useful when adding a catch-all pattern to emit costs and
handle arbitary vectors.
gcc/ChangeLog:
* config/riscv/riscv-v.cc (expand_const_vector): Allow non-duplicate
to fall through other patterns before asserting.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
The comment previously here stated that the Wc0/Wc1 cases are handled by
the vi constraint but that is not true for the 0.0 Wc0 case.
gcc/ChangeLog:
* config/riscv/riscv-v.h (valid_vec_immediate_p): Add new helper.
* config/riscv/riscv-v.cc (valid_vec_immediate_p): Ditto.
(expand_const_vector): Use new helper.
* config/riscv/riscv.cc (riscv_const_insns): Handle 0.0 floating-point
case.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
These cases are handled in the expander
(riscv-v.cc:expand_const_vector). We need the vector builder to detect
these cases so extract that out into a new riscv-v.h header file.
gcc/ChangeLog:
* config/riscv/riscv-v.cc (class rvv_builder): Move to riscv-v.h.
* config/riscv/riscv.cc (riscv_const_insns): Emit placeholder costs for
bool/stepped const vectors.
* config/riscv/riscv-v.h: New file.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
register
This manifests in RTL that is optimized away which causes runtime failures
in the testsuite. Update all patterns to use a temp result register if required.
gcc/ChangeLog:
* config/riscv/riscv-v.cc (expand_const_vector): Use tmp register if
needed.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
order
The corresponding expander (riscv-v.cc:expand_const_vector) matches
const_vec_duplicate_p before const_vec_series_p. Reorder to match this
behavior when calculating costs.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_const_insns): Relocate.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
Prior to this patch the expander would emit vectors like:
{ 0, 0, 5, 5, 10, 10, ...}
as:
{ 0, 0, 2, 2, 4, 4, ...}
This patch sets the step size to the requested value.
gcc/ChangeLog:
* config/riscv/riscv-v.cc (expand_const_vector): Fix STEP size in
expander.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
This patch would like to allow IMM for the operand 1 of ussub pattern.
Aka .SAT_SUB(x, 22) as the below example.
Form 2:
#define DEF_SAT_U_SUB_IMM_FMT_2(T, IMM) \
T __attribute__((noinline)) \
sat_u_sub_imm##IMM##_##T##_fmt_2 (T x) \
{ \
return x >= (T)IMM ? x - (T)IMM : 0; \
}
DEF_SAT_U_SUB_IMM_FMT_2(uint64_t, 1022)
It is almost the as support imm for operand 0 of ussub pattern, but
allow the second operand to be imm insted of the first operand.
The below test suites are passed for this patch:
1. The rv64gcv fully regression test.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_expand_ussub): Gen xmode for the
second operand, aka y in parameter.
* config/riscv/riscv.md (ussub<mode>3): Allow const_int for operand 2.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_u_sub_imm-5.c: New test.
* gcc.target/riscv/sat_u_sub_imm-5_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-5_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-6.c: New test.
* gcc.target/riscv/sat_u_sub_imm-6_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-6_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-7.c: New test.
* gcc.target/riscv/sat_u_sub_imm-7_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-7_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-8.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-5.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-6.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-7.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-8.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
This patch would like to allow IMM for the operand 0 of ussub pattern.
Aka .SAT_SUB(1023, y) as the below example.
Form 1:
#define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \
T __attribute__((noinline)) \
sat_u_sub_imm##IMM##_##T##_fmt_1 (T y) \
{ \
return (T)IMM >= y ? (T)IMM - y : 0; \
}
DEF_SAT_U_SUB_IMM_FMT_1(uint64_t, 1023)
Before this patch:
10 │ sat_u_sub_imm82_uint64_t_fmt_1:
11 │ li a5,82
12 │ bgtu a0,a5,.L3
13 │ sub a0,a5,a0
14 │ ret
15 │ .L3:
16 │ li a0,0
17 │ ret
After this patch:
10 │ sat_u_sub_imm82_uint64_t_fmt_1:
11 │ li a5,82
12 │ sltu a4,a5,a0
13 │ addi a4,a4,-1
14 │ sub a0,a5,a0
15 │ and a0,a4,a0
16 │ ret
The below test suites are passed for this patch:
1. The rv64gcv fully regression test.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_gen_unsigned_xmode_reg): Add new
func impl to gen xmode rtx reg from operand rtx.
(riscv_expand_ussub): Gen xmode reg for operand 1.
* config/riscv/riscv.md: Allow const_int for operand 1.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sat_arith.h: Add test helper macro.
* gcc.target/riscv/sat_u_sub_imm-1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-1_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-1_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-2_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-2_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-3.c: New test.
* gcc.target/riscv/sat_u_sub_imm-3_1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-3_2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-4.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-1.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-2.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-3.c: New test.
* gcc.target/riscv/sat_u_sub_imm-run-4.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
Currently, some binops of vector vs double scalar under RV32 can't
translated to vf but vfmv+vxx.vv.
The cause is that vec_duplicate is also expanded to broadcast for double mode
under RV32. last-combine can't process expanded broadcast.
gcc/ChangeLog:
* config/riscv/vector.md: Add !FLOAT_MODE_P constraint.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/vadd-rv32gcv-nofm.c: Fix test.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vmul-rv32gcv-nofm.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vsub-rv32gcv-nofm.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv32gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-5.c: Ditto.
|
|
repeating_sequence_p operates directly on the encoded pattern and does
not derive elements using the .elt() accessor. Passing in the length of
the unencoded vector can cause an out-of-bounds read of the encoded
pattern.
gcc/ChangeLog:
* config/riscv/riscv-v.cc (rvv_builder::can_duplicate_repeating_sequence_p):
Use encoded_nelts when calling repeating_sequence_p.
(rvv_builder::is_repeating_sequence): Ditto.
(rvv_builder::repeating_sequence_use_merge_profitable_p): Ditto.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
Standard abs synthesis during expand is max (a, -a). This
expansion has the advantage of avoiding masking and is thus potentially
faster than the a < 0 ? -a : a synthesis.
gcc/ChangeLog:
* config/riscv/autovec.md (abs<mode>2): Expand via max (a, -a).
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c: Adjust test
expectation.
* gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/abs-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-8.c: Ditto.
|
|
The stack-clash code is generating wrong cfi directives in
riscv_v_adjust_scalable_frame because REG_CFA_DEF_CFA has a different
encoding than REG_FRAME_RELATED_EXPR, this patch fixes the offset sign
in prologue and starts using REG_CFA_DEF_CFA in the epilogue.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_v_adjust_scalable_frame): Add
epilogue code for stack-clash and fix prologue cfi note.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/stack-check-cfa-3.c: Fix the expected output.
|
|
This patch would like to implement the quad and oct .SAT_TRUNC pattern
in the riscv backend. Aka:
Form 1:
#define DEF_SAT_U_TRUC_FMT_1(NT, WT) \
NT __attribute__((noinline)) \
sat_u_truc_##WT##_to_##NT##_fmt_1 (WT x) \
{ \
bool overflow = x > (WT)(NT)(-1); \
return ((NT)x) | (NT)-overflow; \
}
DEF_SAT_U_TRUC_FMT_1(uint16_t, uint64_t)
Before this patch:
4 │ __attribute__((noinline))
5 │ uint16_t sat_u_truc_uint64_t_to_uint16_t_fmt_1 (uint64_t x)
6 │ {
7 │ _Bool overflow;
8 │ short unsigned int _1;
9 │ short unsigned int _2;
10 │ short unsigned int _3;
11 │ uint16_t _6;
12 │
13 │ ;; basic block 2, loop depth 0
14 │ ;; pred: ENTRY
15 │ overflow_5 = x_4(D) > 65535;
16 │ _1 = (short unsigned int) x_4(D);
17 │ _2 = (short unsigned int) overflow_5;
18 │ _3 = -_2;
19 │ _6 = _1 | _3;
20 │ return _6;
21 │ ;; succ: EXIT
22 │
23 │ }
After this patch:
3 │
4 │ __attribute__((noinline))
5 │ uint16_t sat_u_truc_uint64_t_to_uint16_t_fmt_1 (uint64_t x)
6 │ {
7 │ uint16_t _6;
8 │
9 │ ;; basic block 2, loop depth 0
10 │ ;; pred: ENTRY
11 │ _6 = .SAT_TRUNC (x_4(D)); [tail call]
12 │ return _6;
13 │ ;; succ: EXIT
14 │
15 │ }
The below tests suites are passed for this patch
1. The rv64gcv fully regression test.
2. The rv64gcv build with glibc
gcc/ChangeLog:
* config/riscv/iterators.md (ANYI_QUAD_TRUNC): New iterator for
quad truncation.
(ANYI_OCT_TRUNC): New iterator for oct truncation.
(ANYI_QUAD_TRUNCATED): New attr for truncated quad modes.
(ANYI_OCT_TRUNCATED): New attr for truncated oct modes.
(anyi_quad_truncated): Ditto but for lower case.
(anyi_oct_truncated): Ditto but for lower case.
* config/riscv/riscv.md (ustrunc<mode><anyi_quad_truncated>2):
Add new pattern for quad truncation.
(ustrunc<mode><anyi_oct_truncated>2): Ditto but for oct.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-2.c: Adjust
the expand dump check times.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-3.c: Ditto.
* gcc.target/riscv/sat_arith_data.h: Add test helper macros.
* gcc.target/riscv/sat_u_trunc-4.c: New test.
* gcc.target/riscv/sat_u_trunc-5.c: New test.
* gcc.target/riscv/sat_u_trunc-6.c: New test.
* gcc.target/riscv/sat_u_trunc-run-4.c: New test.
* gcc.target/riscv/sat_u_trunc-run-5.c: New test.
* gcc.target/riscv/sat_u_trunc-run-6.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
For QI/HImode of .SAT_ADD, the operands may be sign-extended and the
high bits of Xmode may be all 1 which is not expected. For example as
below code.
signed char b[1];
unsigned short c;
signed char *d = b;
int main() {
b[0] = -40;
c = ({ (unsigned short)d[0] < 0xFFF6 ? (unsigned short)d[0] : 0xFFF6; }) + 9;
__builtin_printf("%d\n", c);
}
After expanding we have:
;; _6 = .SAT_ADD (_3, 9);
(insn 8 7 9 (set (reg:DI 143)
(high:DI (symbol_ref:DI ("d") [flags 0x86] <var_decl d>)))
(nil))
(insn 9 8 10 (set (reg/f:DI 142)
(mem/f/c:DI (lo_sum:DI (reg:DI 143)
(symbol_ref:DI ("d") [flags 0x86] <var_decl d>)) [1 d+0 S8 A64]))
(nil))
(insn 10 9 11 (set (reg:HI 144 [ _3 ])
(sign_extend:HI (mem:QI (reg/f:DI 142) [0 *d.0_1+0 S1 A8]))) "test.c":7:10 -1
(nil))
The convert from signed char to unsigned short will have sign_extend rtl
as above. And finally become the lb insn as below:
lb a1,0(a5) // a1 is -40, aka 0xffffffffffffffd8
lui a0,0x1a
addi a5,a1,9
slli a5,a5,0x30
srli a5,a5,0x30 // a5 is 65505
sltu a1,a5,a1 // compare 65505 and 0xffffffffffffffd8 => TRUE
The sltu try to compare 65505 and 0xffffffffffffffd8 here, but we
actually want to compare 65505 and 65496 (0xffd8). Thus we need to
clean up the high bits to ensure this.
The below test suites are passed for this patch:
* The rv64gcv fully regression test.
PR target/116278
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_gen_zero_extend_rtx): Add new
func impl to zero extend rtx.
(riscv_expand_usadd): Leverage above func to cleanup operands 0
and remove the special handing for SImode in RV64.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sat_u_add-11.c: Adjust asm check body.
* gcc.target/riscv/sat_u_add-15.c: Ditto.
* gcc.target/riscv/sat_u_add-19.c: Ditto.
* gcc.target/riscv/sat_u_add-23.c: Ditto.
* gcc.target/riscv/sat_u_add-3.c: Ditto.
* gcc.target/riscv/sat_u_add-7.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-11.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-15.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-3.c: Ditto.
* gcc.target/riscv/sat_u_add_imm-7.c: Ditto.
* gcc.target/riscv/pr116278-run-1.c: New test.
* gcc.target/riscv/pr116278-run-2.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
The attach patch is specific to the RTEMS RISC-V architecture multilib which is
controlled by the t-rtems file in the gcc/config/riscv/ directory. The patch
file was created from the gcc-13.3.0 branch. It was successfully tested within
RTEMS Source Builder.
gcc/
* config/riscv/t-rtems: Add ilp32f multilib.
|
|
When rs1 is the immediate 0, the following ICE occurs:
error: unrecognizable insn:
(insn 8 5 12 2 (set (reg:RVVM1DI 134 [ <retval> ])
(if_then_else:RVVM1DI (unspec:RVVMF64BI [
(const_vector:RVVMF64BI repeat [
(const_int 1 [0x1])
])
(reg/v:DI 137 [ vl ])
(const_int 2 [0x2]) repeated x2
(const_int 0 [0])
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE)
(plus:RVVM1DI (mult:RVVM1DI (vec_duplicate:RVVM1DI (const_int 0 [0]))
(reg/v:RVVM1DI 136 [ vs2 ]))
(reg/v:RVVM1DI 135 [ vd ]))
(reg/v:RVVM1DI 135 [ vd ])))
gcc/ChangeLog:
* config/riscv/vector.md: Allow scalar operand to be 0.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/bug-7.c: New test.
* gcc.target/riscv/rvv/base/bug-8.c: New test.
|
|
So as expected the core problem with target/116282 is that the cost of certain
constant synthesis cases varied depending on whether or not we're allowed to
generate new pseudos or not.
That in turn meant that in obscure cases an insn might change from recognizable
to unrecognizable and triggers the observed failure.
So we need to keep the cost stable, at least when called from a pattern's
condition. So we pass another boolean down when necessary. I've tried to keep
API fallout minimized.
Built and tested on rv32 in my tester. Let's see what pre-commit testing has
to say though 🙂
Note this will also require a minor change to the in-flight constant synthesis
work.
PR target/116282
gcc/
* config/riscv/riscv-protos.h (riscv_const_insns): Add new argument.
* config/riscv/riscv.cc (riscv_build_integer): Add new argument
ALLOW_NEW_PSEUDOS. Pass it down to recursive calls and check it
before using synthesis which allows new registers to be created.
(riscv_split_integer_cost): Pass new argument to riscv_build_integer.
(riscv_integer_cost): Add ALLOW_NEW_PSEUDOS argument, pass it down to
riscv_build_integer.
(riscv_legitimate_constant_p): Pass new argument to riscv_const_insns.
(riscv_const_insns): New argment ALLOW_NEW_PSEUDOS. Pass it down to
riscv_integer_cost and riscv_const_insns.
(riscv_split_const_insns): Pass new argument to riscv_const_insns.
(riscv_move_integer, riscv_rtx_costs): Similarly.
* config/riscv/riscv.md (shadd with costly constant): Pass new argument
to riscv_const_insns.
* config/riscv/bitmanip.md (and with costly constant): Pass new argument
to riscv_const_insns.
gcc/testsuite/
* gcc.target/riscv/pr116282.c: New test.
|
|
When compiling an interface for rounding of type 'vfloat16*' without using zvfh
or zvfhmin, it is not enough to use FLOAT_MODE_P because the type does not
support it. Although the subsequent riscv_validate_vector_type checks will
still fail and throw exceptions, I don't think we should have ICE here.
internal compiler error: in check, at config/riscv/riscv-vector-builtins-shapes.cc:444
10 | return __riscv_vfadd_vv_f16m1_rm (vs2, vs1, 0, vl);
| ^~~~~~
0x4191794 internal_error(char const*, ...)
/iothome/jin.ma/code/master/gcc/gcc/diagnostic-global-context.cc:491
0x416ebf5 fancy_abort(char const*, int, char const*)
/iothome/jin.ma/code/master/gcc/gcc/diagnostic.cc:1772
0x220aae6 riscv_vector::build_frm_base::check(riscv_vector::function_checker&) const
/iothome/jin.ma/code/master/gcc/gcc/config/riscv/riscv-vector-builtins-shapes.cc:444
0x2205323 riscv_vector::function_checker::check()
/iothome/jin.ma/code/master/gcc/gcc/config/riscv/riscv-vector-builtins.cc:4414
gcc/ChangeLog:
* config/riscv/riscv-protos.h (riscv_vector_float_type_p): New.
* config/riscv/riscv-vector-builtins.cc (function_instance::any_type_float_p):
Use riscv_vector_float_type_p instead of FLOAT_MODE_P for judgment.
* config/riscv/riscv.cc (riscv_vector_int_type_p): Change static to extern.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/bug-9.c: New test.
|
|
This patch would like to fix one ICE when rv64gcv_zvbb for vwsll.
Consider below example.
void vwsll_vv_test (short *restrict dst, char *restrict a,
int *restrict b, int n)
{
for (int i = 0; i < n; i++)
dst[i] = a[i] << b[i];
}
It will hit the vwsll pattern with following operands.
operand 0 -> (reg:RVVMF2HI 146 [ vect__7.13 ])
operand 1 -> (reg:RVVMF4QI 165 [ vect_cst__33 ])
operand 2 -> (reg:RVVM1SI 171 [ vect_cst__36 ])
According to the ISA, operand 2 should be the same as operand 1.
Aka operand 2 should have RVVMF4QI mode as above. Thus, add
quad truncation for operand 2 before emit vwsll.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
PR target/116280
gcc/ChangeLog:
* config/riscv/autovec-opt.md: Add quad truncation to
align the mode requirement for vwsll.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pr116280-1.c: New test.
* gcc.target/riscv/rvv/base/pr116280-2.c: New test.
|
|
This patch add the vector rotate shift pattern for auto-vect.
With this patch, the scalar rotate shift can be automatically
vectorized into vector rotate shift.
gcc/ChangeLog:
* config/riscv/autovec.md (v<bitmanip_optab><mode>3):
Add new define_expand pattern for vector rotate shift.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/vrolr-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vrolr-run.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vrolr-template.h: New test.
|
|
This patch is to fix the bug (BugId:116305) introduced by the commit
bd93ef for risc-v target.
The commit bd93ef changes the chunk_num from 1 to TARGET_MIN_VLEN/128
if TARGET_MIN_VLEN is larger than 128 in riscv_convert_vector_bits. So
it changes the value of BYTES_PER_RISCV_VECTOR. For example, before
merging the commit bd93ef and if TARGET_MIN_VLEN is 256, the value
of BYTES_PER_RISCV_VECTOR should be [8, 8], but now [16, 16]. The value
of riscv_bytes_per_vector_chunk and BYTES_PER_RISCV_VECTOR are no longer
equal.
Prologue will use BYTES_PER_RISCV_VECTOR.coeffs[1] to estimate the vlenb
register value in riscv_legitimize_poly_move, and dwarf2cfi will also
get the estimated vlenb register value in riscv_dwarf_poly_indeterminate_value
to calculate the number of times to multiply the vlenb register value.
So need to change the factor from riscv_bytes_per_vector_chunk to
BYTES_PER_RISCV_VECTOR, otherwise we will get the incorrect dwarf
information. The incorrect example as follow:
```
csrr t0,vlenb
slli t1,t0,1
sub sp,sp,t1
.cfi_escape 0xf,0xb,0x72,0,0x92,0xa2,0x38,0,0x34,0x1e,0x23,0x50,0x22
```
The sequence '0x92,0xa2,0x38,0' means the vlenb register, '0x34' means
the literal 4, '0x1e' means the multiply operation. But in fact, the
vlenb register value just need to multiply the literal 2.
PR target/116305
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_dwarf_poly_indeterminate_value): Take
BYTES_PER_RISCV_VECTOR for *factor instead of riscv_bytes_per_vector_chunk.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/scalable_vector_cfi.c: New test.
Signed-off-by: Zhijin Zeng <zhijin.zeng@spacemit.com>
|
|
Currently these builtins use float compare instructions which require
FP flags to be saved/restored which could be costly in uarch.
RV Base ISA already has FCLASS.{d,s,h} instruction to compare/identify FP
values w/o disturbing FP exception flags.
Now that upstream supports the corresponding optabs, wire them up in the
backend.
gcc/ChangeLog:
* config/riscv/riscv.md: define_insn for fclass insn.
define_expand for isfinite, isnormal, isinf.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/fclass.c: New tests.
Tested-by: Edwin Lu <ewlu@rivosinc.com> # pre-commit-CI #2060
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
|
|
This fixes the remainder of the typos I found when reading various parts of the
RISC-V backend.
gcc/ChangeLog:
* config/riscv/riscv-v.cc (legitimize_move): extrac -> extract.
(expand_vec_cmp_float): Remove duplicate vmnor.mm.
* config/riscv/riscv-vector-builtins.cc: ins -> insns.
* config/riscv/riscv.cc (riscv_init_machine_status): mwrvv -> mrvv.
* config/riscv/vector-iterators.md: RVVM8QImde -> RVVM8QImode
* config/riscv/vector.md: Replaced non-existant vsetivl with vsetivli.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
masked bit positions
So Patrick's fuzzer found an interesting little buglet in the Zbs improvements
I added a couple months back.
Specifically when we have masked bit position for a Zbs instruction. If the
mask has extraneous bits set we'll generate an unrecognizable insn due to an
invalid constant.
More concretely, let's take this pattern:
> (define_insn_and_split ""
> [(set (match_operand:DI 0 "register_operand" "=r")
> (any_extend:DI
> (ashift:SI (const_int 1)
> (subreg:QI (and:DI (match_operand:DI 1 "register_operand" "r")
> (match_operand 2 "const_int_operand")) 0))))]
What we need to know to transform this into bset for rv64.
After masking the shift count we want to know the low 5 bits aren't 0x1f. If
they were 0x1f, then the constant generated would be 0x80000000 which would
then need sign extension out to 64bits, which the bset instruction will not do
for us.
We can ignore anything outside the low 5 bits. The mode of the shift is SI, so
shifting by 32+ bits is undefined behavior.
It's also worth explicitly mentioning that the hardware is going to mask the
count against 0x3f.
The net is if (operands[2] & 0x1f) != 0x1f, then this transformation is safe.
So onto the generated split code...
> [(set (match_dup 0) (and:DI (match_dup 1) (match_dup 2)))
> (set (match_dup 0) (zero_extend:DI (ashift:SI
> (const_int 1)
> (subreg:QI (match_dup 0) 0))))]
Which would seemingly do exactly what we want. The problem is the first split
insn. If the constant does not fit into a simm12, that insn won't be
recognized resulting in the ICE.
The fix is simple, we just need to mask the constant before generating RTL. We
can just mask it against 0x1f since we only care about the low 5 bits.
This affects multiple patterns. I've added the appropriate fix to all of them.
Tested in my tester. Waiting for the pre-commit bits to run before pushing.
PR target/116283
gcc/
* config/riscv/bitmanip.md (Zbs combiner patterns/splitters): Mask the
bit position in the split code appropriately.
gcc/testsuite/
* gcc.target/riscv/pr116283.c: New test
|
|
Add the TARGET_STACK_CLASH_PROTECTION_ALLOCA_PROBE_RANGE to riscv in
order to enable stack clash protection when using alloca.
The code and tests are the same used by aarch64.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_compute_frame_info): Update
outgoing args size.
(riscv_stack_clash_protection_alloca_probe_range): New.
(TARGET_STACK_CLASH_PROTECTION_ALLOCA_PROBE_RANGE): New.
* config/riscv/riscv.h
(STACK_CLASH_MIN_BYTES_OUTGOING_ARGS): New.
(STACK_DYNAMIC_OFFSET): New.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/stack-check-14.c: New test.
* gcc.target/riscv/stack-check-15.c: New test.
* gcc.target/riscv/stack-check-alloca-1.c: New test.
* gcc.target/riscv/stack-check-alloca-2.c: New test.
* gcc.target/riscv/stack-check-alloca-3.c: New test.
* gcc.target/riscv/stack-check-alloca-4.c: New test.
* gcc.target/riscv/stack-check-alloca-5.c: New test.
* gcc.target/riscv/stack-check-alloca-6.c: New test.
* gcc.target/riscv/stack-check-alloca-7.c: New test.
* gcc.target/riscv/stack-check-alloca-8.c: New test.
* gcc.target/riscv/stack-check-alloca-9.c: New test.
* gcc.target/riscv/stack-check-alloca-10.c: New test.
* gcc.target/riscv/stack-check-alloca.h: New.
|
|
Adds basic support to vector stack-clash protection using a loop to do
the probing and stack adjustments.
gcc/ChangeLog:
* config/riscv/riscv.cc
(riscv_allocate_and_probe_stack_loop): New function.
(riscv_v_adjust_scalable_frame): Add stack-clash protection
support.
(riscv_allocate_and_probe_stack_space): Move the probe loop
implementation to riscv_allocate_and_probe_stack_loop.
* config/riscv/riscv.h: Define RISCV_STACK_CLASH_VECTOR_CFA_REGNUM.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/stack-check-cfa-3.c: New test.
* gcc.target/riscv/stack-check-prologue-16.c: New test.
* gcc.target/riscv/struct_vect_24.c: New test.
|
|
This implements stack-clash protection for riscv, with
riscv_allocate_and_probe_stack_space being based of
aarch64_allocate_and_probe_stack_space from aarch64's implementation.
We enforce the probing interval and the guard size to always be equal, their
default value is 4Kb which is riscv page size.
We also probe up by 1024 bytes in the general case when a probe is required.
gcc/ChangeLog:
* config/riscv/riscv.cc
(riscv_option_override): Enforce that interval is the same size as
guard size.
(riscv_allocate_and_probe_stack_space): New function.
(riscv_expand_prologue): Call riscv_allocate_and_probe_stack_space
to the final allocation of the stack and add stack-clash dump
information.
* config/riscv/riscv.h: Define STACK_CLASH_CALLER_GUARD and
STACK_CLASH_MAX_UNROLL_PAGES.
gcc/testsuite/ChangeLog:
* gcc.dg/params/blocksort-part.c: Skip riscv for
stack-clash protection intervals.
* gcc.dg/pr82788.c: Skip riscv.
* gcc.dg/stack-check-6.c: Skip residual check for riscv.
* gcc.dg/stack-check-6a.c: Skip riscv.
* gcc.target/riscv/stack-check-12.c: New test.
* gcc.target/riscv/stack-check-13.c: New test.
* gcc.target/riscv/stack-check-cfa-1.c: New test.
* gcc.target/riscv/stack-check-cfa-2.c: New test.
* gcc.target/riscv/stack-check-prologue-1.c: New test.
* gcc.target/riscv/stack-check-prologue-10.c: New test.
* gcc.target/riscv/stack-check-prologue-11.c: New test.
* gcc.target/riscv/stack-check-prologue-12.c: New test.
* gcc.target/riscv/stack-check-prologue-13.c: New test.
* gcc.target/riscv/stack-check-prologue-14.c: New test.
* gcc.target/riscv/stack-check-prologue-15.c: New test.
* gcc.target/riscv/stack-check-prologue-2.c: New test.
* gcc.target/riscv/stack-check-prologue-3.c: New test.
* gcc.target/riscv/stack-check-prologue-4.c: New test.
* gcc.target/riscv/stack-check-prologue-5.c: New test.
* gcc.target/riscv/stack-check-prologue-6.c: New test.
* gcc.target/riscv/stack-check-prologue-7.c: New test.
* gcc.target/riscv/stack-check-prologue-8.c: New test.
* gcc.target/riscv/stack-check-prologue-9.c: New test.
* gcc.target/riscv/stack-check-prologue.h: New file.
* lib/target-supports.exp
(check_effective_target_supports_stack_clash_protection):
Add riscv.
(check_effective_target_caller_implicit_probes): Likewise.
|
|
Move riscv_v_adjust_scalable_frame () in preparation for the stack clash
protection support.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_v_adjust_scalable_frame): Move
closer to riscv_expand_prologue.
|
|
Enable the register used by riscv_emit_stack_tie () to be passed as
an argument so we can tie the stack with other registers besides
hard_frame_pointer_rtx.
Also don't allow operand 1 of stack_tie<mode> to be optimized to sp
in preparation for the stack clash protection support.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_emit_stack_tie): Pass the
register to be tied to the stack pointer as argument.
* config/riscv/riscv.md (stack_tie<mode>): Don't match equal
operands.
|
|
When enabling XTheadFmv/Zfa and XThead(F)MemIdx, we might end up
with the following insn (registers are examples, but of correct class):
(set (reg:DF a4)
(mem:DF (plus:SI (mult:SI (reg:SI a0)
(const_int 8))
(reg:SI a5))))
This is a result of an attempt to load the DF register via two SI
register loads followed by XTheadFmv/Zfa instructions to move the
contents of the two SI registers into the DF register.
The two loads are generated in riscv_split_doubleword_move(),
where the second load adds an offset of 4 to load address.
While this works fine for RVI loads, this can't be handled
for XTheadMemIdx addresses. Coming back to the example above,
we would end up with the following insn, which can't be simplified
or matched:
(set (reg:SI a4)
(mem:SI (plus:SI (plus:SI (mult:SI (reg:SI a0)
(const_int 8))
(reg:SI a5))
(const_int 4))))
This triggered an ICE in the past, which was resolved in b79cd204c780,
which also added the test xtheadfmemidx-medany.c, where the examples
are from. The patch postponed the optimization insn_and_split pattern
for XThead(F)MemIdx, so that the situation could effectively be avoided.
Since we don't want to rely on these optimization pattern in the future,
we need a different solution. Therefore, this patch restricts the
movdf_hardfloat_rv32 insn to not match for split-double-word-moves
with XThead(F)MemIdx operands. This ensures we don't need to split
them up later.
When looking at the code generation of the test file, we can see that
we have less GP<->FP conversions, but cannot use the indexed loads.
The new sequence is identical to rv32gc_xtheadfmv (similar to rv32gc_zfa).
Old:
[...]
lla a5,.LANCHOR0
th.flrd fa5,a5,a0,3
fmv.x.w a4,fa5
th.fmv.x.hw a5,fa5
.L1:
fmv.w.x fa0,a4
th.fmv.hw.x fa0,a5
ret
[...]
New:
[...]
lla a5,.LANCHOR0
slli a4,a0,3
add a4,a4,a5
lw a5,4(a4)
lw a4,0(a4)
.L1:
fmv.w.x fa0,a4
th.fmv.hw.x fa0,a5
ret
[...]
This was tested (together with the patch that eliminates the
XTheadMemIdx optimization patterns) with SPEC CPU 2017 intrate
on QEMU (RV64/lp64d).
gcc/ChangeLog:
* config/riscv/constraints.md (th_m_noi): New constraint.
* config/riscv/riscv.md: Adjust movdf_hardfloat_rv32 for
XTheadMemIdx.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/xtheadfmemidx-xtheadfmv-medany.c: Adjust.
* gcc.target/riscv/xtheadfmemidx-zfa-medany.c: Likewise.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
|
|
We have a huge amount of optimization patterns (insn_and_split) for
XTheadMemIdx and XTheadFMemIdx that attempt to do something, that can be
done more efficient by generic GCC passes, if we have proper support code.
A key function in eliminating the optimization patterns is
th_memidx_classify_address_index(), which needs to identify each possible
memory expression that can be lowered into a XTheadMemIdx/XTheadFMemIdx
instruction. This patch adds all memory expressions that were
previously only recognized by the optimization patterns.
Now, that the address classification is complete, we can finally remove
all optimization patterns with the side-effect or getting rid of the
non-canonical memory expression they produced: (plus (reg) (ashift (reg) (imm))).
A positive side-effect of this change is, that we address an RV32 ICE,
that was caused by the th_memidx_I_c pattern, which did not properly
handle SUBREGs (more details are in PR116131).
A temporary negative side-effect of this change is, that we cause a
regression of the xtheadfmemidx + xtheadfmv/zfa tests (initially
introduced as part of b79cd204c780 to address an ICE).
As this issue cannot be addressed in the code parts that are
adjusted in this patch, we just accept the regression for now.
PR target/116131
gcc/ChangeLog:
* config/riscv/thead.cc (th_memidx_classify_address_index):
Recognize all possible XTheadMemIdx memory operand structures.
(th_fmemidx_output_index): Do strict classification.
* config/riscv/thead.md (*th_memidx_operand): Remove.
(TARGET_XTHEADMEMIDX): Likewise.
(TARGET_HARD_FLOAT && TARGET_XTHEADFMEMIDX): Likewise.
(!TARGET_64BIT && TARGET_XTHEADMEMIDX): Likewise.
(*th_memidx_I_a): Likewise.
(*th_memidx_I_b): Likewise.
(*th_memidx_I_c): Likewise.
(*th_memidx_US_a): Likewise.
(*th_memidx_US_b): Likewise.
(*th_memidx_US_c): Likewise.
(*th_memidx_UZ_a): Likewise.
(*th_memidx_UZ_b): Likewise.
(*th_memidx_UZ_c): Likewise.
(*th_fmemidx_movsf_hardfloat): Likewise.
(*th_fmemidx_movdf_hardfloat_rv64): Likewise.
(*th_fmemidx_I_a): Likewise.
(*th_fmemidx_I_c): Likewise.
(*th_fmemidx_US_a): Likewise.
(*th_fmemidx_US_c): Likewise.
(*th_fmemidx_UZ_a): Likewise.
(*th_fmemidx_UZ_c): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr116131.c: New test.
Reported-by: Patrick O'Neill <patrick@rivosinc.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
|
|
gcc/ChangeLog:
* config/riscv/riscv.h (RISCV_DWARF_VLENB): Delete.
|
|
arguments
This was supposed to go out the door yesterday, but I kept getting interrupted.
The target bits for rtx costing can't assume the rtl they're given actually
matches a target pattern. It's just kind of inherent in how the costing
routines get called in various places.
In this particular case we're trying to cost a conditional move:
(set (dest) (if_then_else (cond) (true) (false))
On the RISC-V port the backend only allows actual conditionals for COND. So
something like (eq (reg) (const_int 0)). In the costing code for if-then-else
we did something like
(XEXP (XEXP (cond, 0), 0)))
Which fails miserably if COND is a terminal node like (reg) rather than (ne
(reg) (const_int 0)
So this patch tightens up the RTL scanning to ensure that we have a comparison
before we start looking at the comparison's arguments.
Run through my tester without incident, but I'll wait for the pre-commit tester
to run through a cycle before pushing to the trunk.
Jeff
ps. We probably could support a naked REG for the condition and internally convert it to (ne (reg) (const_int 0)), but I don't think it likely happens with any regularity.
PR target/116240
gcc/
* config/riscv/riscv.cc (riscv_rtx_costs): Ensure object is a
comparison before looking at its arguments.
gcc/testsuite
* gcc.target/riscv/pr116240.c: New test.
|
|
This patch support Zimop and Zcmop extension[1].To enable GCC to recognize
and process Zimop and Zcmop extension correctly at compile time.
https://github.com/riscv/riscv-isa-manual/blob/main/src/zimop.adoc
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: New extension.
* config/riscv/riscv.opt: New mask.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/arch-42.c: New test.
* gcc.target/riscv/arch-43.c: New test.
|
|
This fixes typos in function names and executed code.
gcc/ChangeLog:
* config/riscv/riscv-target-attr.cc (num_occurences_in_str): Rename...
(num_occurrences_in_str): here.
(riscv_process_target_attr): Update num_occurences_in_str callsite.
* config/riscv/riscv-v.cc (emit_vec_widden_cvt_x_f): widden -> widen.
(emit_vec_widen_cvt_x_f): Ditto.
(emit_vec_widden_cvt_f_f): Ditto.
(emit_vec_widen_cvt_f_f): Ditto.
(emit_vec_rounding_to_integer): Update *widden* callsites.
* config/riscv/riscv-vector-builtins.cc (expand_builtin): Update
required_ext_to_isa_name callsite and fix xtheadvector typo.
* config/riscv/riscv-vector-builtins.h (reqired_ext_to_isa_name): Rename...
(required_ext_to_isa_name): here.
* config/riscv/riscv_th_vector.h: Fix endif label.
* config/riscv/vector-crypto.md: boardcast_scalar -> broadcast_scalar.
* config/riscv/vector.md: Ditto.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
This fixes most of the typos I found when reading various parts of the RISC-V
backend.
gcc/ChangeLog:
* config/riscv/arch-canonicalize: Fix typos in comments.
* config/riscv/autovec.md: Ditto.
* config/riscv/riscv-avlprop.cc (avl_can_be_propagated_p): Ditto.
(pass_avlprop::get_vlmax_ta_preferred_avl): Ditto.
* config/riscv/riscv-modes.def (ADJUST_FLOAT_FORMAT): Ditto.
(VLS_MODES): Ditto.
* config/riscv/riscv-opts.h (TARGET_ZICOND_LIKE): Ditto.
(enum rvv_vector_bits_enum): Ditto.
* config/riscv/riscv-protos.h (enum insn_flags): Ditto.
(enum insn_type): Ditto.
* config/riscv/riscv-sr.cc (riscv_sr_match_epilogue): Ditto.
* config/riscv/riscv-string.cc (expand_block_move): Ditto.
* config/riscv/riscv-v.cc (rvv_builder::is_repeating_sequence): Ditto.
(rvv_builder::single_step_npatterns_p): Ditto.
(calculate_ratio): Ditto.
(expand_const_vector): Ditto.
(shuffle_merge_patterns): Ditto.
(shuffle_compress_patterns): Ditto.
(expand_select_vl): Ditto.
* config/riscv/riscv-vector-builtins-functions.def (REQUIRED_EXTENSIONS): Ditto.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.
* config/riscv/riscv-vector-builtins.cc (function_builder::add_function): Ditto.
(resolve_overloaded_builtin): Ditto.
* config/riscv/riscv-vector-builtins.def (vbool1_t): Ditto.
(vuint8m8_t): Ditto.
(vuint16m8_t): Ditto.
(vfloat16m8_t): Ditto.
(unsigned_vector): Ditto.
* config/riscv/riscv-vector-builtins.h (enum required_ext): Ditto.
* config/riscv/riscv-vector-costs.cc (get_store_value): Ditto.
(costs::analyze_loop_vinfo): Ditto.
(costs::add_stmt_cost): Ditto.
* config/riscv/riscv.cc (riscv_build_integer): Ditto.
(riscv_vector_type_p): Ditto.
* config/riscv/thead.cc (th_mempair_output_move): Ditto.
* config/riscv/thead.md: Ditto.
* config/riscv/vector-iterators.md: Ditto.
* config/riscv/vector.md: Ditto.
* config/riscv/zc.md: Ditto.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
gcc/ChangeLog:
PR target/116152
* config/riscv/riscv.cc (riscv_option_override): Fix url
formatting.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/predef-9.c: Update testcase.
Co-authored-by: Jakub Jelinek <jakub@redhat.com>
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
gcc/ChangeLog:
PR target/116152
* config/riscv/riscv.cc (riscv_option_override): Add deprecation
warning.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/predef-9.c: Add check for warning.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
gcc/ChangeLog:
* config/riscv/sync-rvwmo.md: Add conditional length attributes.
* config/riscv/sync-ztso.md: Ditto.
* config/riscv/sync.md: Fix incorrect insn length attributes and
reformat existing conditional checks.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
In PR116149 we choose a wrong vector length which causes wrong values in
a reduction. The problem happens in avlprop where we choose the
number of units in the instruction's mode as vector length. For the
non-scalar variants the respective operand has the correct non-widened
mode. For the scalar variants, however, the same operand has a scalar
mode which obviously only has one unit. This makes us choose VL = 1
leaving three elements undisturbed (so potentially -1). Those end up
in the reduction causing the wrong result.
This patch adjusts the mode_idx just for the scalar variants of the
affected instruction patterns.
gcc/ChangeLog:
PR target/116149
* config/riscv/vector.md: Fix mode_idx attribute of scalar
widen add/sub variants.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr116149.c: New test.
|
|
Also add a testcase for -mabi=lp64d where 'd' is required.
gcc/ChangeLog:
PR target/116111
* config/riscv/riscv.cc (riscv_option_override): Add error.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/arch-41.c: New test.
* gcc.target/riscv/pr116111.c: New test.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
This patch adds support for amocas.{b|h|w|d}. Support for amocas.q
(64/128 bit cas for rv32/64) will be added in a future patch.
Extension: https://github.com/riscv/riscv-zacas
Ratification: https://jira.riscv.org/browse/RVS-680
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add zacas extension.
* config/riscv/arch-canonicalize: Make zacas imply zaamo.
* config/riscv/riscv.opt: Add zacas.
* config/riscv/sync.md (zacas_atomic_cas_value<mode>): New pattern.
(atomic_compare_and_swap<mode>): Use new pattern for compare-and-swap ops.
(zalrsc_atomic_cas_value_strong<mode>): Rename atomic_cas_value_strong.
* doc/sourcebuild.texi: Add Zacas documentation.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp: Add zacas testsuite infra support.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire-release.c:
Remove zacas to continue to test the lr/sc pairs.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-consume.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-consume.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst.c: Ditto.
* gcc.target/riscv/amo/zabha-zacas-preferred-over-zalrsc.c: New test.
* gcc.target/riscv/amo/zacas-char-requires-zabha.c: New test.
* gcc.target/riscv/amo/zacas-char-requires-zacas.c: New test.
* gcc.target/riscv/amo/zacas-preferred-over-zalrsc.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-char-acq-rel.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-char-acquire.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-char-relaxed.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-char-release.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-char-seq-cst.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-compatability-mapping-no-fence.c:
New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-compatability-mapping.cc: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-int-acq-rel.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-int-acquire.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-int-relaxed.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-int-release.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-int-seq-cst.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-short-acq-rel.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-short-acquire.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-short-relaxed.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-short-release.c: New test.
* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-short-seq-cst.c: New test.
* gcc.target/riscv/amo/zacas-ztso-compare-exchange-char-seq-cst.c: New test.
* gcc.target/riscv/amo/zacas-ztso-compare-exchange-char.c: New test.
* gcc.target/riscv/amo/zacas-ztso-compare-exchange-compatability-mapping-no-fence.c:
New test.
* gcc.target/riscv/amo/zacas-ztso-compare-exchange-compatability-mapping.cc: New test.
* gcc.target/riscv/amo/zacas-ztso-compare-exchange-int-seq-cst.c: New test.
* gcc.target/riscv/amo/zacas-ztso-compare-exchange-int.c: New test.
* gcc.target/riscv/amo/zacas-ztso-compare-exchange-short-seq-cst.c: New test.
* gcc.target/riscv/amo/zacas-ztso-compare-exchange-short.c: New test.
Co-authored-by: Patrick O'Neill <patrick@rivosinc.com>
Tested-by: Andrea Parri <andrea@rivosinc.com>
Signed-Off-By: Gianluca Guida <gianluca@rivosinc.com>
|
|
The Pmode is designed for pointer, thus leverage the Xmode instead
for the expanding of the ussub.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_expand_ussub): Promote to Xmode
instead of Pmode.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
A patch introduced a pattern to avoid unnecessary extensions when doing a
min/max operation where one of the values is a 32 bit positive constant.
> (define_insn_and_split "*minmax"
> [(set (match_operand:DI 0 "register_operand" "=r")
> (sign_extend:DI
> (subreg:SI
> (bitmanip_minmax:DI (zero_extend:DI (match_operand:SI 1 "register_operand" "r"))
> (match_operand:DI 2 "immediate_operand" "i"))
> 0)))
> (clobber (match_scratch:DI 3 "=&r"))
> (clobber (match_scratch:DI 4 "=&r"))]
> "TARGET_64BIT && TARGET_ZBB && sext_hwi (INTVAL (operands[2]), 32) >= 0"
> "#"
> "&& reload_completed"
> [(set (match_dup 3) (sign_extend:DI (match_dup 1)))
> (set (match_dup 4) (match_dup 2))
> (set (match_dup 0) (<minmax_optab>:DI (match_dup 3) (match_dup 4)))]
Lots going on in here. The key is the nonconstant value is zero extended from
SI to DI in the original RTL and we know the constant value is unchanged if we
were to sign extend it from 32 to 64 bits.
We change the extension of the nonconstant operand from zero to sign extension.
I'm pretty confident the goal there is take advantage of the fact that SI
values are kept sign extended and will often be optimized away.
The problem occurs when the nonconstant operand has the SI sign bit set. As an
example:
smax (0x8000000, 0x7) resulting in 0x80000000
The split RTL will generate
smax (sign_extend (0x80000000), 0x7))
smax (0xffffffff80000000, 0x7) resulting in 0x7
Opps.
We really needed to change the opcode to umax for this transformation to work.
That's easy enough. But there's further improvements we can make.
First the pattern is a define_and_split with a post-reload split condition. It
would be better implemented as a 4->3 define_split so that the costing model
just works. Second, if operands[1] is a suitably promoted subreg, then we can
elide the sign extension when we generate the split code, so often it'll be a
4->2 split, again with the cost model working with no adjustments needed.
Tested on rv32 and rv64 in my tester. I'll wait for the pre-commit tester to
spin it as well.
PR target/116085
gcc/
* config/riscv/bitmanip.md (minmax extension avoidance splitter):
Rewrite as a simpler define_split. Adjust the opcode appropriately.
Avoid emitting sign extension if it's clearly not needed.
* config/riscv/iterators.md (minmax_optab): Rename to uminmax_optab
and map everything to unsigned variants.
gcc/testsuite/
* gcc.target/riscv/pr116085.c: New test.
|
|
An unquoted apostrophe slipped through when testing the recent
V/M extension patch. This, again, re-words the message to
"Currently the 'V' implementation requires the 'M' extension".
Going to commit as obvious after testing.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_override_options_internal):
Reword error string without apostrophe.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pr116036.c: Adjust expected error
string.
|
|
auto_inc_dec (-O3) performs optimizations like the following
if RVV and XTheadMemIdx is enabled.
(insn 23 20 27 3 (set (mem:V4QI (reg:DI 136 [ ivtmp.13 ]) [0 MEM <vector(4) char> [(char *)_39]+0 S4 A32])
(reg:V4QI 168)) "gcc/testsuite/gcc.target/riscv/pr116033.c":12:27 3183 {*movv4qi}
(nil))
(insn 40 39 41 3 (set (reg:DI 136 [ ivtmp.13 ])
(plus:DI (reg:DI 136 [ ivtmp.13 ])
(const_int 20 [0x14]))) 5 {adddi3}
(nil))
====>
(insn 23 20 27 3 (set (mem:V4QI (post_modify:DI (reg:DI 136 [ ivtmp.13 ])
(plus:DI (reg:DI 136 [ ivtmp.13 ])
(const_int 20 [0x14]))) [0 MEM <vector(4) char> [(char *)_39]+0 S4 A32])
(reg:V4QI 168)) "gcc/testsuite/gcc.target/riscv/pr116033.c":12:27 3183 {*movv4qi}
(expr_list:REG_INC (reg:DI 136 [ ivtmp.13 ])
(nil)))
The reason why the pass believes that this is legal is,
that the mode test in th_memidx_classify_address_modify()
requires INTEGRAL_MODE_P (mode), which includes vector modes.
Let's restrict the mode test such, that only MODE_INT is allowed.
PR target/116033
gcc/ChangeLog:
* config/riscv/thead.cc (th_memidx_classify_address_modify):
Fix mode test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr116033.c: New test.
Reported-by: Patrick O'Neill <patrick@rivosinc.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
|
|
g:72fbd3b2b2a497dbbe6599239bd61c5624203ed0 added a use of std::array
without explicitly forcing <array> to be included. That didn't cause
problems in my local builds but understandably did for some people.
gcc/
* doc/rtl.texi: Document the need to define INCLUDE_ARRAY before
including rtl-ssa.h.
* rtl-ssa.h: Likewise (in comment).
* config/aarch64/aarch64-cc-fusion.cc: Add INCLUDE_ARRAY.
* config/aarch64/aarch64-early-ra.cc: Likewise.
* config/riscv/riscv-avlprop.cc: Likewise.
* config/riscv/riscv-vsetvl.cc: Likewise.
* fwprop.cc: Likewise.
* late-combine.cc: Likewise.
* pair-fusion.cc: Likewise.
* rtl-ssa/accesses.cc: Likewise.
* rtl-ssa/blocks.cc: Likewise.
* rtl-ssa/changes.cc: Likewise.
* rtl-ssa/functions.cc: Likewise.
* rtl-ssa/insns.cc: Likewise.
* rtl-ssa/movement.cc: Likewise.
|