author: Pan Li <pan2.li@intel.com> 2023-05-25 10:58:49 +0800
committer: Pan Li <pan2.li@intel.com> 2023-05-29 18:24:52 +0800
commit: a99dc11fe272f6a1214f357b82f3f7eb5c7dabc3
tree: 7c0b524a29ac393cf0ea2c5e69d5fc85f9514343
parent: 8d1d9b16482d929c79e075782f165588e89c60ce
RISC-V: Using merge approach to optimize repeating sequence in vec_init

This patch optimizes VLS vector initialization from a repeating
sequence by switching from vslide1down to vmerge, guided by a simple
cost model in which every instruction has a cost of 1.

Given code with -march=rv64gcv_zvl256b --param riscv-autovec-preference=fixed-vlmax

    typedef int64_t vnx32di __attribute__ ((vector_size (256)));

    __attribute__ ((noipa)) void
    f_vnx32di (int64_t a, int64_t b, int64_t *out)
    {
      vnx32di v = {
        a, b, a, b, a, b, a, b,
        a, b, a, b, a, b, a, b,
        a, b, a, b, a, b, a, b,
        a, b, a, b, a, b, a, b,
      };
      *(vnx32di *) out = v;
    }

Before this patch:

    vslide1down.vx (x31 times)

After this patch:

    li        a5,-1431654400
    addi      a5,a5,-1365
    li        a3,-1431654400
    addi      a3,a3,-1366
    slli      a5,a5,32
    add       a5,a5,a3
    vsetvli   a4,zero,e64,m8,ta,ma
    vmv.v.x   v8,a0
    vmv.s.x   v0,a5
    vmerge.vxm        v8,v8,a1,v0
    vs8r.v    v8,0(a2)

Since we don't have SEW = 128 in vec_duplicate, we can't combine a and b
into a single SEW = 128 element and then broadcast this big element.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-Authored by: Juzhe-Zhong <juzhe.zhong@rivai.ai>

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (enum insn_type): New type.
	* config/riscv/riscv-v.cc (RVV_INSN_OPERANDS_MAX): New macro.
	(rvv_builder::can_duplicate_repeating_sequence_p): Align the
	referenced class member.
	(rvv_builder::get_merged_repeating_sequence): Ditto.
	(rvv_builder::repeating_sequence_use_merge_profitable_p): New
	function to evaluate the optimization cost.
	(rvv_builder::get_merge_scalar_mask): New function to get the
	merge mask.
	(emit_scalar_move_insn): New function to emit vmv.s.x.
	(emit_vlmax_integer_move_insn): New function to emit vlmax
	vmv.v.x.
	(emit_nonvlmax_integer_move_insn): New function to emit
	nonvlmax vmv.v.x.
	(get_repeating_sequence_dup_machine_mode): New function to get
	the dup machine mode.
	(expand_vector_init_merge_repeating_sequence): New function to
	perform the optimization.
	(expand_vec_init): Add this vector init optimization.
	* config/riscv/riscv.h (BITS_PER_WORD): New macro.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-1.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-2.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-3.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-4.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-5.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-run-2.c: New test.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-run-3.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
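
Illustration (not part of the patch): the merge approach described in the
commit message can be modelled in scalar C. This is a minimal sketch, not
the code in riscv-v.cc. The helper names merge_mask and init_by_merge are
hypothetical, and the loop over sequence elements is an assumption about
how a period longer than 2 might be handled. For the { a, b, a, b, ... }
example, the constant built by the li/addi/slli/add sequence appears to
work out to 0xAAAAAAAAAAAAAAAA, so every odd element takes b from a1 while
the even elements keep the broadcast value a.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the GCC implementation: return a mask in which
   bit i is set when element i of an n-element vector should hold the
   sequence value at index `which` of a sequence with the given period.  */
uint64_t
merge_mask (size_t n, size_t period, size_t which)
{
  assert (n <= 64 && period > 0 && which < period);
  uint64_t mask = 0;
  for (size_t i = 0; i < n; i++)
    if (i % period == which)
      mask |= (uint64_t) 1 << i;
  return mask;
}

/* Scalar model of the emitted instructions: broadcast seq[0] to every
   element (the vmv.v.x step), then apply one masked merge per remaining
   sequence element (the vmv.s.x + vmerge.vxm steps).  */
void
init_by_merge (int64_t *out, size_t n, const int64_t *seq, size_t period)
{
  for (size_t i = 0; i < n; i++)
    out[i] = seq[0];

  for (size_t k = 1; k < period; k++)
    {
      uint64_t mask = merge_mask (n, period, k);
      for (size_t i = 0; i < n; i++)
        if ((mask >> i) & 1)
          out[i] = seq[k];
    }
}
```

With n = 32 and period = 2 this model needs one broadcast and one merge,
against the 31 slides of the old approach; only the low 32 bits of the
mask matter for a 32-element vector.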