author | Juzhe-Zhong <juzhe.zhong@rivai.ai> | 2024-01-02 15:26:55 +0800 |
---|---|---|
committer | Pan Li <pan2.li@intel.com> | 2024-01-02 17:12:15 +0800 |
commit | 76f069fef7dc12166fa65a664f03f82e7d2d9a78 (patch) | |
tree | b27aeedd4c4ba48f4a0f4b0f60ce6f2cc7718879 /gcc/rust/util/rust-attributes.h | |
parent | b041bd4ec2cff7b6cfa0b27fc631cba8a02975e4 (diff) | |
RISC-V: Add simplification of dummy len and dummy mask COND_LEN_xxx pattern
In https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d1eacedc6d9ba9f5522f2c8d49ccfdf7939ad72d
I optimized the COND_LEN_xxx pattern with dummy len and dummy mask using an overly simple solution, which
causes a redundant vsetvli in the following case:
vsetvli a5,a2,e8,m1,ta,ma
vle32.v v8,0(a0)
vsetivli zero,16,e32,m4,tu,mu ----> We should apply VLMAX instead of a CONST_INT AVL
slli a4,a5,2
vand.vv v0,v8,v16
vand.vv v4,v8,v12
vmseq.vi v0,v0,0
sub a2,a2,a5
vneg.v v4,v8,v0.t
vsetvli zero,a5,e32,m4,ta,ma
The root cause is the following code:
is_vlmax_len_p (...)
return poly_int_rtx_p (len, &value)
&& known_eq (value, GET_MODE_NUNITS (mode))
&& !satisfies_constraint_K (len); ---> incorrect check.
Actually, we should not exclude the VLMAX case when the AVL is in the range [0, 31].
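The effect of that extra check can be sketched in a small standalone model. This is not the GCC implementation: `fits_uimm5`, `is_vlmax_len_old` and `is_vlmax_len_new` are hypothetical names, plain integers stand in for the RTL operands, and `fits_uimm5` only approximates satisfies_constraint_K as "AVL in [0, 31]", as described above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for satisfies_constraint_K: true when the AVL
// fits the 5-bit unsigned immediate of vsetivli, i.e. lies in [0, 31].
inline bool fits_uimm5(std::int64_t len) { return len >= 0 && len <= 31; }

// Model of the check before this patch: a length equal to the number of
// units of the mode is NOT treated as VLMAX when it fits in the immediate.
inline bool is_vlmax_len_old(std::int64_t len, std::int64_t nunits) {
  return len == nunits && !fits_uimm5(len);
}

// Model of the check after this patch: only the comparison with the
// number of units remains.
inline bool is_vlmax_len_new(std::int64_t len, std::int64_t nunits) {
  return len == nunits;
}

// Mirrors the first case: AVL = 16 with 16 units in the mode.
inline void check_case1() {
  assert(!is_vlmax_len_old(16, 16));  // old check misses the VLMAX case
  assert(is_vlmax_len_new(16, 16));   // new check recognises it
}
```

In this model, a length of 16 on a 16-unit mode is rejected by the old check only because 16 fits in the immediate, which matches the `vsetivli zero,16,...` seen in the first case.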
After removing the check above, we run into the following issue:
vsetivli zero,4,e32,m1,ta,ma
vlseg4e32.v v4,(a5)
vlseg4e32.v v12,(a3)
vsetvli a5,zero,e32,m1,tu,ma ---> This is redundant since VLMAX AVL = 4 when it is fixed-vlmax
vfadd.vf v3,v13,fa0
vfadd.vf v1,v12,fa1
vfmul.vv v17,v3,v5
vfmul.vv v16,v1,v5
Since all of the following operations (vfadd.vf, etc.) are COND_LEN_xxx with dummy len and dummy mask,
we simplify operations with dummy len and dummy mask to use VLMAX with the TA and MA policies.
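A minimal sketch of that decision, under stated assumptions: `Tail`, `Mask`, `VectorConfig` and `select_config` are hypothetical names, not the types or functions used in riscv-v.cc, and the conservative tail-undisturbed/mask-undisturbed fallback for the remaining cases is an assumption of this sketch rather than something the commit message specifies.

```cpp
#include <cassert>

enum class Tail { Agnostic, Undisturbed };
enum class Mask { Agnostic, Undisturbed };

// Hypothetical summary of what a COND_LEN_xxx expansion would request.
struct VectorConfig {
  bool use_vlmax;  // request the VLMAX AVL instead of the explicit length
  Tail tail;
  Mask mask;
};

// With a dummy (all-ones) mask and a dummy (VLMAX) length, no element is
// masked off and there is no tail, so the operation can be treated as an
// unconditional VLMAX operation with TA and MA policies.
inline VectorConfig select_config(bool mask_is_all_ones, bool len_is_vlmax) {
  if (mask_is_all_ones && len_is_vlmax)
    return {true, Tail::Agnostic, Mask::Agnostic};
  // Assumption of this sketch: every other combination stays conservative.
  return {false, Tail::Undisturbed, Mask::Undisturbed};
}

inline void check_dummy_case() {
  VectorConfig c = select_config(true, true);
  assert(c.use_vlmax && c.tail == Tail::Agnostic && c.mask == Mask::Agnostic);
}
```

The point of the sketch is only that both conditions must hold before the simplification applies; a real mask or a real length keeps the operation conditional.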
After this patch, both cases generate optimal code:
case 1:
vsetvli a5,a2,e32,m1,ta,mu
vle32.v v2,0(a0)
slli a4,a5,2
vand.vv v1,v2,v3
vand.vv v0,v2,v4
sub a2,a2,a5
vmseq.vi v0,v0,0
vneg.v v1,v2,v0.t
vse32.v v1,0(a1)
case 2:
vsetivli zero,4,e32,m1,tu,ma
addi a4,a5,400
vlseg4e32.v v12,(a3)
vfadd.vf v3,v13,fa0
vfadd.vf v1,v12,fa1
vlseg4e32.v v4,(a4)
vfadd.vf v2,v14,fa1
vfmul.vv v17,v3,v5
vfmul.vv v16,v1,v5
This patch is just an additional fix to the previously approved patch.
Tested on both RV32 and RV64 newlib with no regressions. Committed.
gcc/ChangeLog:
* config/riscv/riscv-v.cc (is_vlmax_len_p): Remove satisfies_constraint_K.
(expand_cond_len_op): Add simplification of dummy len and dummy mask.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/vf_avl-3.c: New test.