author | Juzhe-Zhong <juzhe.zhong@rivai.ai> | 2024-01-06 10:29:21 +0800 |
---|---|---|
committer | Pan Li <pan2.li@intel.com> | 2024-01-06 10:31:23 +0800 |
commit | 19c76b5a91837cdb3e7aa4cb484dd4cfdebca8ae (patch) | |
tree | e0e7dad08571e1398a0b70600bb5c92635284639 /gcc/tree-vect-loop.cc | |
parent | 9873f13d833b536b46cd6ff46d72e62407b048a8 (diff) | |
RISC-V: Teach liveness computation loop invariant shift amount
1). vashl_optab, vashr_optab and vlshr_optab vectorize a shift with a vector shift amount,
that is, 'a[i] >> x[i]', where the shift amount is loop variant.
2). ashl_optab, ashr_optab and lshr_optab vectorize a shift with a scalar shift amount,
that is, 'a[i] >> x', where the shift amount is loop invariant.
For case 2), we don't need to allocate a vector register group for the shift amount.
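To make the two cases concrete, here is a minimal sketch of the scalar loops behind them. The function names shift_variant and shift_invariant are hypothetical and used only for illustration; they do not appear in the patch. The sketch assumes a 32-bit int and non-negative inputs.

```c
#include <assert.h>
#include <stddef.h>

/* Case 1): the shift amount x[i] changes every iteration (loop variant).
   A vectorized version has to hold x[i] in its own vector register group
   next to a[i].  */
void
shift_variant (int *restrict out, const int *restrict a,
               const int *restrict x, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = a[i] >> x[i];
}

/* Case 2): the shift amount x is the same for every iteration (loop
   invariant).  A vectorized version can keep x in a scalar register,
   so only a[i] needs a vector register group.  */
void
shift_invariant (int *restrict out, const int *restrict a, int x, size_t n)
{
  /* Assumption for this sketch: int is 32 bits wide.  */
  assert (x >= 0 && x < 32);
  for (size_t i = 0; i < n; i++)
    out[i] = a[i] >> x;
}
```

On RISC-V, the second form is the one that can map to a vector-scalar instruction such as vsra.vx, which is what the generated code further down uses.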
Consider the following case:
void
f (int *restrict a, int *restrict b, int *restrict c, int *restrict d, int x,
   int n)
{
  for (int i = 0; i < n; i++)
    {
      int tmp = b[i] >> x;
      int tmp2 = tmp * b[i];
      c[i] = tmp2 * b[i];
      d[i] = tmp * tmp2 * b[i] >> x;
    }
}
Before this patch, we chose LMUL = 4; after this patch, we can choose LMUL = 8:
f:
        ble     a5,zero,.L5
.L3:
        vsetvli a0,a5,e32,m8,ta,ma
        slli    a6,a0,2
        vle32.v v16,0(a1)
        vsra.vx v24,v16,a4
        vmul.vv v8,v24,v16
        vmul.vv v0,v8,v16
        vse32.v v0,0(a2)
        vmul.vv v8,v8,v24
        vmul.vv v8,v8,v16
        vsra.vx v8,v8,a4
        vse32.v v8,0(a3)
        add     a1,a1,a6
        add     a2,a2,a6
        add     a3,a3,a6
        sub     a5,a5,a0
        bne     a5,zero,.L3
.L5:
        ret
Tested on both RV32 and RV64 with no regressions. OK for trunk?
Note that we will apply the same heuristic to vadd.vx, etc. once the late-combine pass from
Richard Sandiford is committed (we need the late-combine pass to do the vv->vx transformation for vadd).
gcc/ChangeLog:
* config/riscv/riscv-vector-costs.cc (loop_invariant_op_p): New function.
(variable_vectorized_p): Teach loop invariant.
(has_unexpected_spills_p): Ditto.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-12.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-14.c: New test.