author     Kito Cheng <kito.cheng@sifive.com>  2023-05-12 10:26:06 +0800
committer  Kito Cheng <kito.cheng@sifive.com>  2023-05-12 21:31:11 +0800
commit     c919d059fcb67747d3c0bd539c7044e874b03fb7 (patch)
tree       d6ede9bcf47623a094ea5faf17d882004ff36586 /libgcc
parent     cc0e22b3f25d4b2a326322bce711179c02377e6c (diff)
RISC-V: Optimize vsetvli of LCM INSERTED edge for user vsetvli [PR 109743]
Rebase to trunk and send V3 patch for:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617821.html

This patch fixes:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109743

This issue happens because we are currently very conservative in the
optimization of user vsetvli.

Consider the following case:

bb 1:
  vsetvli a5,a4... (demand AVL = a4).
bb 2:
  RVV insn uses a5 (demand AVL = a5).

LCM will hoist the vsetvl of bb 2 into bb 1.  We don't do AVL propagation
for this situation since it is complicated: we would need to analyze the
code sequence between the vsetvli in bb 1 and the RVV insn in bb 2, and
they are not necessarily consecutive blocks.

This patch does the optimization after LCM: we check and eliminate the
vsetvli on an LCM-inserted edge if such a vsetvli is redundant.  Such an
approach is much simpler and safe.

code:

void
foo2 (int32_t *a, int32_t *b, int n)
{
  if (n <= 0)
    return;
  int i = n;
  size_t vl = __riscv_vsetvl_e32m1 (i);
  for (; i >= 0; i--)
    {
      vint32m1_t v = __riscv_vle32_v_i32m1 (a, vl);
      __riscv_vse32_v_i32m1 (b, v, vl);
      if (i >= vl)
        continue;
      if (i == 0)
        return;
      vl = __riscv_vsetvl_e32m1 (i);
    }
}

Before this patch:

foo2:
.LFB2:
        .cfi_startproc
        ble     a2,zero,.L1
        mv      a4,a2
        li      a3,-1
        vsetvli a5,a2,e32,m1,ta,mu
        vsetvli zero,a5,e32,m1,ta,ma    <- can be eliminated.
.L5:
        vle32.v v1,0(a0)
        vse32.v v1,0(a1)
        bgeu    a4,a5,.L3
.L10:
        beq     a2,zero,.L1
        vsetvli a5,a4,e32,m1,ta,mu
        addi    a4,a4,-1
        vsetvli zero,a5,e32,m1,ta,ma    <- can be eliminated.
        vle32.v v1,0(a0)
        vse32.v v1,0(a1)
        addiw   a2,a2,-1
        bltu    a4,a5,.L10
.L3:
        addiw   a2,a2,-1
        addi    a4,a4,-1
        bne     a2,a3,.L5
.L1:
        ret

After this patch:

f:
        ble     a2,zero,.L1
        mv      a4,a2
        li      a3,-1
        vsetvli a5,a2,e32,m1,ta,ma
.L5:
        vle32.v v1,0(a0)
        vse32.v v1,0(a1)
        bgeu    a4,a5,.L3
.L10:
        beq     a2,zero,.L1
        vsetvli a5,a4,e32,m1,ta,ma
        addi    a4,a4,-1
        vle32.v v1,0(a0)
        vse32.v v1,0(a1)
        addiw   a2,a2,-1
        bltu    a4,a5,.L10
.L3:
        addiw   a2,a2,-1
        addi    a4,a4,-1
        bne     a2,a3,.L5
.L1:
        ret

	PR target/109743

gcc/ChangeLog:

	* config/riscv/riscv-vsetvl.cc (pass_vsetvl::get_vsetvl_at_end): New.
	(local_avl_compatible_p): New.
	(pass_vsetvl::local_eliminate_vsetvl_insn): Enhance local
	optimizations for LCM, rewrite as a backward algorithm.
	(pass_vsetvl::cleanup_insns): Use new local_eliminate_vsetvl_insn
	interface, handle a BB at once.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/vsetvl/pr109743-1.c: New test.
	* gcc.target/riscv/rvv/vsetvl/pr109743-2.c: New test.
	* gcc.target/riscv/rvv/vsetvl/pr109743-3.c: New test.
	* gcc.target/riscv/rvv/vsetvl/pr109743-4.c: New test.

Co-authored-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>
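
Note: the following is a minimal, self-contained C++ sketch of the redundancy
test this commit describes, written as a toy model for illustration.  All
names here (toy_insn, vtype_compatible_p, eliminate_redundant_vsetvls) are
invented for this example and are not GCC's riscv-vsetvl.cc internals; the
real pass runs backward over the block, tracks much richer demand information,
and checks that nothing in between clobbers vl/vtype or the AVL register.

#include <iostream>
#include <string>
#include <vector>

struct toy_insn {
  bool is_vsetvl;
  bool user_vsetvl;    // written by the user via __riscv_vsetvl_*
  std::string vl_def;  // register that receives the new vl ("zero" if discarded)
  std::string avl;     // AVL operand (a register name in this toy model)
  std::string vtype;   // e.g. "e32,m1,ta,ma"
  std::string text;    // printable form of the instruction
  bool deleted;
};

// Toy compatibility check: identical vtype strings.  The real pass compares
// the individual SEW/LMUL/tail/mask-policy demands instead.
static bool vtype_compatible_p (const toy_insn &a, const toy_insn &b)
{
  return a.vtype == b.vtype;
}

// Forward walk for clarity (the actual pass is backward): remember the last
// user vsetvli; a later vsetvli that merely re-requests the same
// configuration with that vsetvli's vl result as its AVL is redundant.
static void eliminate_redundant_vsetvls (std::vector<toy_insn> &bb)
{
  const toy_insn *last_user = nullptr;
  for (auto &insn : bb)
    {
      if (!insn.is_vsetvl)
        continue;  // toy assumption: other insns never clobber vl/vtype/AVL
      if (insn.user_vsetvl)
        last_user = &insn;  // a user vsetvli defines a fresh vl value to track
      else if (last_user != nullptr
               && insn.avl == last_user->vl_def
               && vtype_compatible_p (insn, *last_user))
        insn.deleted = true;  // re-requests a configuration that already holds
    }
}

int main ()
{
  // Mirrors the motivating pattern: the second vsetvli (LCM-inserted) only
  // re-establishes what the first, user-written vsetvli already guarantees.
  std::vector<toy_insn> bb = {
    {true, true, "a5", "a2", "e32,m1,ta,ma",
     "vsetvli a5,a2,e32,m1,ta,ma", false},
    {true, false, "zero", "a5", "e32,m1,ta,ma",
     "vsetvli zero,a5,e32,m1,ta,ma", false},
    {false, false, "", "", "", "vle32.v v1,0(a0)", false},
  };
  eliminate_redundant_vsetvls (bb);
  for (const auto &insn : bb)
    std::cout << insn.text << (insn.deleted ? "   <- eliminated" : "") << "\n";
  return 0;
}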
Diffstat (limited to 'libgcc')
0 files changed, 0 insertions, 0 deletions