author | Juzhe-Zhong <juzhe.zhong@rivai.ai> | 2023-11-08 14:05:00 +0800 |
---|---|---|
committer | Lehua Ding <lehua.ding@rivai.ai> | 2023-11-08 14:38:16 +0800 |
commit | f9148120048f4508156acfcd19a334f4dcbb96f0 (patch) | |
tree | 9e6636ecb048609bfe6faf7714aaf881ed32b8be /libgfortran/generated/minloc1_4_s1.c | |
parent | 078087d1605060da4f993af83b1bfa351b278d38 (diff) | |
RISC-V: Normalize user vsetvl intrinsics[PR112092]
Our user vsetvl intrinsics are defined to just calculate the VL output,
which is the number of elements to be processed. Such intrinsics do not
have any side effects, so we should normalize them when they have the same
SEW/LMUL ratio. E.g. the result of __riscv_vsetvl_e8mf8 is the same as that
of __riscv_vsetvl_e64m1, since both have ratio 64.
Normalizing them allows us to generate better code.
Consider the following example:
#include "riscv_vector.h"
void foo(int32_t *in1, int32_t *in2, int32_t *in3, int32_t *out, size_t n, int cond, int avl) {
  size_t vl;
  if (cond)
    vl = __riscv_vsetvl_e32m1(avl);
  else
    vl = __riscv_vsetvl_e16mf2(avl);
  for (size_t i = 0; i < n; i += 1) {
    vint32m1_t a = __riscv_vle32_v_i32m1(in1, vl);
    vint32m1_t b = __riscv_vle32_v_i32m1_tu(a, in2, vl);
    vint32m1_t c = __riscv_vle32_v_i32m1_tu(b, in3, vl);
    __riscv_vse32_v_i32m1(out, c, vl);
  }
}
Before this patch:
foo:
	beq	a5,zero,.L2
	vsetvli	a6,a6,e32,m1,tu,ma
.L3:
	li	a5,0
	beq	a4,zero,.L9
.L4:
	vle32.v	v1,0(a0)
	addi	a5,a5,1
	vle32.v	v1,0(a1)
	vle32.v	v1,0(a2)
	vse32.v	v1,0(a3)
	bne	a4,a5,.L4
.L9:
	ret
.L2:
	vsetvli	zero,a6,e32,m1,tu,ma
	j	.L3
After this patch:
foo:
	li	a5,0
	vsetvli	zero,a6,e32,m1,tu,ma
	beq	a4,zero,.L9
.L4:
	vle32.v	v1,0(a0)
	addi	a5,a5,1
	vle32.v	v1,0(a1)
	vle32.v	v1,0(a2)
	vse32.v	v1,0(a3)
	bne	a4,a5,.L4
.L9:
	ret
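The normalization criterion in the message above (vsetvl intrinsics with the
same SEW/LMUL ratio yield the same VL) can be illustrated with a small
host-side sketch. The helper names sew_lmul_ratio and model_vsetvl are
hypothetical and are not part of GCC or riscv_vector.h. The VL model
min(avl, VLMAX) is an assumption: it is one behaviour the RVV specification
permits, not necessarily what a given implementation does.

```c
#include <stddef.h>

/* Hypothetical helper, not part of GCC: SEW/LMUL ratio of a vsetvl
   configuration.  LMUL is passed as the fraction lmul_num/lmul_den,
   e.g. mf8 is 1/8, mf2 is 1/2 and m1 is 1/1.  */
static unsigned
sew_lmul_ratio (unsigned sew, unsigned lmul_num, unsigned lmul_den)
{
  return sew * lmul_den / lmul_num;
}

/* Simplified model (assumption) of the VL computed by a vsetvl
   intrinsic: min (avl, VLMAX) with VLMAX = VLEN / ratio.  VL depends on
   SEW and LMUL only through their ratio, which is why two intrinsics
   with equal ratio are interchangeable when only VL is needed.  */
static size_t
model_vsetvl (size_t avl, unsigned vlen, unsigned ratio)
{
  size_t vlmax = vlen / ratio;
  return avl < vlmax ? avl : vlmax;
}
```

Under this model, e32m1 and e16mf2 from the example both have ratio 32, so
model_vsetvl returns the same value for either one for any avl and VLEN.
That matches the single vsetvli left in the code after this patch.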
PR target/112092
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins-bases.cc: Normalize the vsetvls.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/pr109743-1.c: Adapt test.
* gcc.target/riscv/rvv/vsetvl/pr109743-3.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvl-11.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvl-15.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvl-22.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-13.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-15.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-5.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-7.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-8.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/pr112092-1.c: New test.
* gcc.target/riscv/rvv/vsetvl/pr112092-2.c: New test.