diff options
author | Pan Li <pan2.li@intel.com> | 2023-05-25 10:58:49 +0800 |
---|---|---|
committer | Pan Li <pan2.li@intel.com> | 2023-05-29 18:24:52 +0800 |
commit | a99dc11fe272f6a1214f357b82f3f7eb5c7dabc3 (patch) | |
tree | 7c0b524a29ac393cf0ea2c5e69d5fc85f9514343 /gcc/rtl.h | |
parent | 8d1d9b16482d929c79e075782f165588e89c60ce (diff) | |
download | gcc-a99dc11fe272f6a1214f357b82f3f7eb5c7dabc3.zip gcc-a99dc11fe272f6a1214f357b82f3f7eb5c7dabc3.tar.gz gcc-a99dc11fe272f6a1214f357b82f3f7eb5c7dabc3.tar.bz2 |
RISC-V: Using merge approach to optimize repeating sequence in vec_init
This patch would like to optimize the VLS vector initialization like
repeating sequence. From the vslide1down to the vmerge with a simple
cost model, aka every instruction only has 1 cost.
Given code with -march=rv64gcv_zvl256b --param riscv-autovec-preference=fixed-vlmax
typedef int64_t vnx32di __attribute__ ((vector_size (256)));
__attribute__ ((noipa)) void
f_vnx32di (int64_t a, int64_t b, int64_t *out)
{
vnx32di v = {
a, b, a, b, a, b, a, b,
a, b, a, b, a, b, a, b,
a, b, a, b, a, b, a, b,
a, b, a, b, a, b, a, b,
};
*(vnx32di *) out = v;
}
Before this patch:
vslide1down.vx (x31 times)
After this patch:
li a5,-1431654400
addi a5,a5,-1365
li a3,-1431654400
addi a3,a3,-1366
slli a5,a5,32
add a5,a5,a3
vsetvli a4,zero,e64,m8,ta,ma
vmv.v.x v8,a0
vmv.s.x v0,a5
vmerge.vxm v8,v8,a1,v0
vs8r.v v8,0(a2)
Since we dont't have SEW = 128 in vec_duplicate, we can't combine ab into
SEW = 128 element and then broadcast this big element.
Signed-off-by: Pan Li <pan2.li@intel.com>
Co-Authored by: Juzhe-Zhong <juzhe.zhong@rivai.ai>
gcc/ChangeLog:
* config/riscv/riscv-protos.h (enum insn_type): New type.
* config/riscv/riscv-v.cc (RVV_INSN_OPERANDS_MAX): New macro.
(rvv_builder::can_duplicate_repeating_sequence_p): Align the referenced
class member.
(rvv_builder::get_merged_repeating_sequence): Ditto.
(rvv_builder::repeating_sequence_use_merge_profitable_p): New function
to evaluate the optimization cost.
(rvv_builder::get_merge_scalar_mask): New function to get the merge
mask.
(emit_scalar_move_insn): New function to emit vmv.s.x.
(emit_vlmax_integer_move_insn): New function to emit vlmax vmv.v.x.
(emit_nonvlmax_integer_move_insn): New function to emit nonvlmax
vmv.v.x.
(get_repeating_sequence_dup_machine_mode): New function to get the dup
machine mode.
(expand_vector_init_merge_repeating_sequence): New function to perform
the optimization.
(expand_vec_init): Add this vector init optimization.
* config/riscv/riscv.h (BITS_PER_WORD): New macro.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-run-3.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
Diffstat (limited to 'gcc/rtl.h')
0 files changed, 0 insertions, 0 deletions