riscv-gnu-toolchain/gcc.git

author	Pan Li <pan2.li@intel.com>	2023-09-28 13:51:07 +0800
committer	Pan Li <pan2.li@intel.com>	2023-09-28 22:22:30 +0800
commit	88d8829e4f435bfc844db5a9df730e20faf7c2c7 (patch)
tree	479cb90bcb3e78229c7b20b3b73d53a55d14b370 /gcc/function-tests.cc
parent	0c8ecbcd3cf7d7187d2017ad02b663a57123b417 (diff)

RISC-V: Support {U}INT64 to FP16 auto-vectorization

Update in v2:

* Add math trap check.
* Adjust some test cases.

Original logs:

This patch supports auto-vectorization of the INT64 to FP16
conversion. The conversion takes the steps below.

* INT64 to FP32.
* FP32 to FP16.

Given sample code as below:

  void
  test_func (int64_t * __restrict a, _Float16 *b, unsigned n)
  {
    for (unsigned i = 0; i < n; i++)
      b[i] = (_Float16) (a[i]);
  }

Before this patch:

  test.c:6:26: missed: couldn't vectorize loop
  test.c:6:26: missed: not vectorized: unsupported data-type
    ld      a0,0(s0)
    call    __floatdihf
    fsh     fa0,0(s1)
    addi    s0,s0,8
    addi    s1,s1,2
    bne     s2,s0,.L3
    ld      ra,24(sp)
    ld      s0,16(sp)
    ld      s1,8(sp)
    ld      s2,0(sp)
    addi    sp,sp,32

After this patch:

    vsetvli a5,a2,e8,mf8,ta,ma
    vle64.v v1,0(a0)
    vsetvli a4,zero,e32,mf2,ta,ma
    vfncvt.f.x.w    v1,v1
    vsetvli zero,zero,e16,mf4,ta,ma
    vfncvt.f.f.w    v1,v1
    vsetvli zero,a2,e16,mf4,ta,ma
    vse16.v v1,0(a1)

Please note that VLS mode is also involved in this patch and is
covered by the test cases.

	PR target/111506

gcc/ChangeLog:

	* config/riscv/autovec.md (<float_cvt><mode><vnnconvert>2): New
	pattern.
	* config/riscv/vector-iterators.md: New iterator.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/unop/cvt-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/cvt-1.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/cvt-0.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Diffstat (limited to 'gcc/function-tests.cc')

0 files changed, 0 insertions, 0 deletions

