author | Juzhe-Zhong <juzhe.zhong@rivai.ai> | 2023-10-18 12:32:59 +0800 |
---|---|---|
committer | Lehua Ding <lehua.ding@rivai.ai> | 2023-10-18 15:58:53 +0800 |
commit | c51040cb43404f411d4234abe7cf1a238b6e0d34 (patch) | |
tree | 767b7cfaa3a4e53d9d09e14db8a38ce66dc656f7 /gcc/tree-vectorizer.h | |
parent | 372c5da2153835baa95f232b917da81ea7445c7d (diff) | |
RISC-V: Optimize consecutive permutation index pattern by vrgather.vi/vx
This patch optimizes the following permutation, whose index is a repeated run of consecutive values:
typedef char vnx16i __attribute__ ((vector_size (16)));
#define MASK_16 12, 13, 14, 15, 12, 13, 14, 15, 12, 13, 14, 15, 12, 13, 14, 15
vnx16i __attribute__ ((noinline, noclone))
test_1 (vnx16i x, vnx16i y)
{
return __builtin_shufflevector (x, y, MASK_16);
}
Before this patch:
lui a5,%hi(.LC0)
addi a5,a5,%lo(.LC0)
vsetivli zero,16,e8,m1,ta,ma
vle8.v v3,0(a5)
vle8.v v2,0(a1)
vrgather.vv v1,v2,v3
vse8.v v1,0(a0)
ret
After this patch:
vsetivli zero,16,e8,mf8,ta,ma
vle8.v v2,0(a1)
vsetivli zero,4,e32,mf2,ta,ma
vrgather.vi v1,v2,3
vsetivli zero,16,e8,mf8,ta,ma
vse8.v v1,0(a0)
ret
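The rewrite works because the byte shuffle above is equivalent to broadcasting one 32-bit element. The scalar model below is only an illustrative sketch of that equivalence, not code from the patch; the helper names `shuffle_bytes` and `broadcast_word3` are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of the original byte shuffle: output byte i takes byte
   12 + (i % 4) of x.  MASK_16 never selects an element of y.  */
static void shuffle_bytes(const uint8_t x[16], uint8_t out[16])
{
  for (int i = 0; i < 16; i++)
    out[i] = x[12 + (i % 4)];
}

/* Scalar model of the rewritten form: view x as four 32-bit elements and
   copy element 3 into every output element, which is what vrgather.vi
   with immediate 3 does at e32.  */
static void broadcast_word3(const uint8_t x[16], uint8_t out[16])
{
  uint32_t in[4], res[4];
  memcpy(in, x, sizeof in);
  for (int i = 0; i < 4; i++)
    res[i] = in[3];
  memcpy(out, res, sizeof res);
}
```

Both functions produce the same 16 bytes for any input, which is why the index vector load can be dropped.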
Overall, this saves one instruction: a vector load, which is much more expensive
than VL toggling.
Also, with this patch we use vrgather.vi, which reduces vector register consumption by one.
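As a rough illustration of what recognizing such an index could involve, here is a minimal sketch. It is not the implementation in riscv-v.cc: the function name `find_consecutive_pattern` and its interface are hypothetical, and it assumes every index selects from the first operand.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not GCC's shuffle_consecutive_patterns.
   Returns true if idx[0..n) repeats one run of *len consecutive indices
   whose start is aligned to *len.  The shuffle then equals a broadcast of
   wide element *elt, where each wide element spans *len narrow ones.
   Assumes all indices refer to the first input vector.  */
static bool find_consecutive_pattern(const unsigned char *idx, size_t n,
                                     size_t *len, size_t *elt)
{
  for (size_t p = 2; p < n; p *= 2) {  /* candidate run lengths 2, 4, 8, ... */
    if (n % p != 0 || idx[0] % p != 0)
      continue;
    bool ok = true;
    for (size_t i = 0; i < n && ok; i++)
      ok = idx[i] == idx[0] + (i % p);
    if (ok) {
      *len = p;
      *elt = idx[0] / p;
      return true;
    }
  }
  return false;
}
```

For MASK_16 this reports a run length of 4 and wide element 3, matching the e32 vrgather.vi with immediate 3 in the generated code above.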
gcc/ChangeLog:
* config/riscv/riscv-v.cc (shuffle_consecutive_patterns): New function.
(expand_vec_perm_const_1): Add consecutive pattern recognition.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls/def.h: Add new test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/consecutive-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/consecutive-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/consecutive-3.c: New test.