author | Juzhe-Zhong <juzhe.zhong@rivai.ai> | 2023-07-14 06:17:09 +0800 |
---|---|---|
committer | Pan Li <pan2.li@intel.com> | 2023-07-14 20:53:30 +0800 |
commit | 0d2673e995f0dd69f406a34d2e87d2a25cf3c285 | |
tree | a3e7dd7baefe5a74f5182941f9d59fc758fb1c7e /gcc/tree-pass.h | |
parent | 53d12ecd624ec901d8449cfa1917f6f90e910927 | |
RISC-V: Enable COND_LEN_FMA auto-vectorization
Add comments per Robin's suggestion in scatter_store_run-7.c.
Enable COND_LEN_FMA so that floating-point FMA is auto-vectorized **without** -ffast-math.
The middle-end support has been approved, and I will merge it after I finish bootstrap and regression testing on x86:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624395.html
Now, it's time to send this patch.
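For readers unfamiliar with the operation, the following is a scalar sketch of what a COND_LEN_FMA-style operation computes. It is not GCC code: the function name, the operand order (mask, a, b, c, else value, len, bias) and the treatment of inactive lanes are assumptions based on how the conditional length-controlled operations are usually described, and the fused multiply-add is written as a plain `a * b + c`, so rounding may differ from a real single-rounding FMA for inexact inputs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Scalar reference model of a COND_LEN_FMA-style operation (illustrative
   only; names and operand order are assumptions, not GCC's API).
   Lane i is active when i < len + bias and mask[i] is set.  Active lanes
   receive a[i] * b[i] + c[i]; every other lane receives else_val[i].  */
void
cond_len_fma_model (double *res, const bool *mask, const double *a,
                    const double *b, const double *c,
                    const double *else_val, size_t nunits,
                    size_t len, ptrdiff_t bias)
{
  for (size_t i = 0; i < nunits; i++)
    {
      bool in_range = (ptrdiff_t) i < (ptrdiff_t) len + bias;
      bool active = in_range && mask[i];
      res[i] = active ? a[i] * b[i] + c[i] : else_val[i];
    }
}

/* Self-check using exactly representable values, so the result does not
   depend on whether the compiler contracts the multiply-add.  */
void
cond_len_fma_model_check (void)
{
  bool mask[4] = { true, false, true, true };
  double a[4] = { 1.0, 2.0, 3.0, 4.0 };
  double b[4] = { 2.0, 2.0, 2.0, 2.0 };
  double c[4] = { 10.0, 20.0, 30.0, 40.0 };
  double else_val[4] = { -1.0, -2.0, -3.0, -4.0 };
  double res[4];

  cond_len_fma_model (res, mask, a, b, c, else_val, 4, 3, 0);
  assert (res[0] == 12.0);  /* active: 1 * 2 + 10 */
  assert (res[1] == -2.0);  /* masked off */
  assert (res[2] == 36.0);  /* active: 3 * 2 + 30 */
  assert (res[3] == -4.0);  /* beyond len */
}
```

In the loop this patch targets, the mask is all-true and the else value is the accumulator itself, which is why a single tail-undisturbed `vfmacc.vv` can implement it.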
Consider the following case:
  __attribute__ ((noipa)) void ternop_##TYPE (TYPE *__restrict dst, \
                                              TYPE *__restrict a, \
                                              TYPE *__restrict b, int n) \
  { \
    for (int i = 0; i < n; i++) \
      dst[i] += a[i] * b[i]; \
  }
  TEST_ALL ()
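The fragment above is only the body of a macro. A minimal sketch of how it might be wrapped so that it compiles on its own is shown below. The `TEST_TYPE` name and the list of types in `TEST_ALL` are assumptions made for illustration, since the commit message does not show them, and `noipa` is a GCC-specific attribute.

```c
#include <assert.h>

/* Hypothetical wrapper: the commit message shows only the macro body, so
   the name TEST_TYPE is an assumption.  */
#define TEST_TYPE(TYPE) \
  __attribute__ ((noipa)) void ternop_##TYPE (TYPE *__restrict dst, \
                                              TYPE *__restrict a, \
                                              TYPE *__restrict b, int n) \
  { \
    for (int i = 0; i < n; i++) \
      dst[i] += a[i] * b[i]; \
  }

/* Hypothetical type list; the real test may cover more types.  */
#define TEST_ALL() \
  TEST_TYPE (float) \
  TEST_TYPE (double)

/* Expands to ternop_float and ternop_double.  */
TEST_ALL ()
```

Calling `ternop_double (dst, a, b, n)` then accumulates `a[i] * b[i]` into `dst[i]` for each of the `n` elements, which is the multiply-add pattern the vectorizer is expected to recognize.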
Before this patch:
ternop_double:
        ble a3,zero,.L5
        mv a6,a0
.L3:
        vsetvli a5,a3,e64,m1,tu,ma
        slli a4,a5,3
        vle64.v v1,0(a0)
        vle64.v v2,0(a1)
        vle64.v v3,0(a2)
        sub a3,a3,a5
        vfmul.vv v2,v2,v3
        vfadd.vv v1,v1,v2
        vse64.v v1,0(a6)
        add a0,a0,a4
        add a1,a1,a4
        add a2,a2,a4
        add a6,a6,a4
        bne a3,zero,.L3
.L5:
        ret
After this patch:
ternop_double:
        ble a3,zero,.L5
        mv a6,a0
.L3:
        vsetvli a5,a3,e64,m1,tu,ma
        slli a4,a5,3
        vle64.v v1,0(a0)
        vle64.v v2,0(a1)
        vle64.v v3,0(a2)
        sub a3,a3,a5
        vfmacc.vv v1,v3,v2
        vse64.v v1,0(a6)
        add a0,a0,a4
        add a1,a1,a4
        add a2,a2,a4
        add a6,a6,a4
        bne a3,zero,.L3
.L5:
        ret
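The control flow of the loop above can be sketched in scalar C as follows. This is a model for reading the assembly, not generated code: `MODEL_VLMAX` is a hypothetical lane count (the real value depends on the hardware vector length), `model_vsetvli` simply grants the minimum of the remaining count and that maximum, and the fused multiply-add is written as a plain `+=` of a product.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical number of 64-bit lanes per vector register; the real value
   depends on the hardware and is not stated in the commit.  */
#define MODEL_VLMAX 4

/* Simplified model of vsetvli: process min (remaining, VLMAX) elements.  */
size_t
model_vsetvli (size_t remaining)
{
  return remaining < MODEL_VLMAX ? remaining : MODEL_VLMAX;
}

/* Scalar model of the strip-mined loop in the assembly listing.  */
void
ternop_double_model (double *dst, const double *a, const double *b, int n)
{
  if (n <= 0)                                 /* ble a3,zero,.L5 */
    return;

  size_t remaining = (size_t) n;
  while (remaining != 0)                      /* bne a3,zero,.L3 */
    {
      size_t vl = model_vsetvli (remaining);  /* vsetvli a5,a3,... */

      /* Three vle64.v loads, one vfmacc.vv, one vse64.v store.  */
      for (size_t i = 0; i < vl; i++)
        dst[i] += a[i] * b[i];

      remaining -= vl;                        /* sub a3,a3,a5 */
      dst += vl;                              /* add ...,a4 (a4 = vl * 8) */
      a += vl;
      b += vl;
    }
}
```

With `n = 10` and `MODEL_VLMAX = 4`, the model runs three iterations handling 4, 4 and 2 elements, mirroring how the length returned by `vsetvli` shrinks on the final pass instead of requiring a scalar epilogue.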
Note: this patch supports only COND_LEN_FMA, **not** COND_LEN_FNMA, etc., since I have not yet added
middle-end support for them.
I will support them in follow-up patches soon.
gcc/ChangeLog:
* config/riscv/autovec.md (cond_len_fma<mode>): New pattern.
* config/riscv/riscv-protos.h (enum insn_type): New enum.
(expand_cond_len_ternop): New function.
* config/riscv/riscv-v.cc (emit_nonvlmax_fp_ternary_tu_insn): Ditto.
(expand_cond_len_ternop): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c:
Adapt testcase for link fail.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-1.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-2.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm_run-3.c: New test.