author | Juzhe-Zhong <juzhe.zhong@rivai.ai> | 2023-11-01 14:56:39 +0800 |
---|---|---|
committer | Pan Li <pan2.li@intel.com> | 2023-11-02 08:51:15 +0800 |
commit | 1a0af6e5a99cd895a663f0221c25321ae802413f | |
tree | 0402ab6261de29021fa24ab0f7386627b1f42e86 | |
parent | c73d2d49f9beec33bb843a0c04bde8bc41d7a0b9 | |
RISC-V: Allow dest operand and accumulator operand overlap of widen reduction instruction[PR112327]
Consider the following intrinsic code:
void rvv_dot_prod(int16_t *pSrcA, int16_t *pSrcB, uint32_t n, int64_t *result)
{
    size_t vl;
    vint16m4_t vSrcA, vSrcB;
    vint64m1_t vSum = __riscv_vmv_s_x_i64m1(0, 1);
    while (n > 0) {
        vl = __riscv_vsetvl_e16m4(n);
        vSrcA = __riscv_vle16_v_i16m4(pSrcA, vl);
        vSrcB = __riscv_vle16_v_i16m4(pSrcB, vl);
        vSum = __riscv_vwredsum_vs_i32m8_i64m1(__riscv_vwmul_vv_i32m8(vSrcA, vSrcB, vl), vSum, vl);
        pSrcA += vl;
        pSrcB += vl;
        n -= vl;
    }
    *result = __riscv_vmv_x_s_i64m1_i64(vSum);
}
https://godbolt.org/z/vWd35W7G6
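As a cross-check of what the intrinsic code above computes, here is a minimal scalar sketch. It is not part of the patch: the name dot_prod_ref is hypothetical, and the mapping of each statement to vwmul.vv and vwredsum.vs is an illustration of their widening behaviour, not a statement about the code GCC generates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference for rvv_dot_prod (illustration only). */
void dot_prod_ref(const int16_t *a, const int16_t *b, uint32_t n, int64_t *result)
{
    assert(result != NULL);
    int64_t sum = 0;  /* plays the role of element 0 of vSum */
    for (uint32_t i = 0; i < n; i++) {
        /* Widening multiply, like vwmul.vv: 16-bit x 16-bit -> 32-bit. */
        int32_t prod = (int32_t)a[i] * (int32_t)b[i];
        /* Widening accumulate, like vwredsum.vs: 32-bit -> 64-bit. */
        sum += (int64_t)prod;
    }
    *result = sum;
}
```

Because the accumulator is 64 bits wide, the sum does not wrap even when every product is at the 32-bit extreme, which is why the widening reduction is used in the loop.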
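Before this patch, the loop contains an extra vmv1r.v that copies the accumulator, because the destination and the accumulator were not allowed to share a register:
    ...
Loop:
    ...
    vmv1r.v v2,v1
    ...
    vwredsum.vs v1,v8,v2
    ...
After this patch, the copy is gone and the accumulator register is reused as the destination:
    ...
Loop:
    ...
    vwredsum.vs v1,v8,v1
    ...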
PR target/112327
gcc/ChangeLog:
* config/riscv/vector.md: Add '0'.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pr112327-1.c: New test.
* gcc.target/riscv/rvv/base/pr112327-2.c: New test.
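For readers unfamiliar with the '0' in the ChangeLog entry: in GCC, a digit constraint is a matching constraint, meaning the operand must be placed in the same location as the operand with that number. The sketch below shows the same idea with GCC-style inline assembly in C. It is an analogy only, assuming a GCC-compatible compiler; the function name same_reg_copy is hypothetical, and this is not the machine-description change made in vector.md.

```c
#include <assert.h>

/* Hypothetical illustration of a matching constraint (GCC extension).
   The "0" on the input ties it to the same register as output operand 0,
   so the empty asm template leaves the input value in the output. */
long same_reg_copy(long in)
{
    long out;
    __asm__("" : "=r"(out) : "0"(in));
    return out;
}
```

Since input and output are forced into one register, no separate move is needed, which mirrors how the vmv1r.v disappears from the loop once the accumulator may overlap the destination.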