author     Jeff Law <jlaw@ventanamicro.com>    2024-05-18 15:08:07 -0600
committer  Jeff Law <jlaw@ventanamicro.com>    2024-05-18 15:08:07 -0600
commit     3c9c52a1c0fa7af22f769a2116b28a0b7ea18129 (patch)
tree       982b496dfe1476963239e7d0a824c66fa88c6fab /gcc/config/riscv/riscv.md
parent     988838da722dea09bd81ee9d49800a6f24980372 (diff)
[to-be-committed,RISC-V] Improve some shift-add sequences
So this is a minor fix/improvement for shift-add sequences. This was
supposed to help xz in a minor way IIRC.
Combine may present us with (x << C1) + C2, which was canonicalized from
(x + C2') << C1 (so C2 == C2' << C1).
Depending on the precise values of C2 and C2', one form may be cheaper
to synthesize than the other. We can (somewhat awkwardly) use
riscv_const_insns to test which sequence is preferred.
Tested on Ventana's CI system as well as my own. Waiting on CI results
from Rivos's tester before moving forward.
Jeff
gcc/
	* config/riscv/riscv.md: Add new patterns to allow selection
	between (x << C1) + C2 and (x + C2') << C1 depending on the
	cost of C2 vs C2'.

gcc/testsuite/
	* gcc.target/riscv/shift-add-1.c: New test.
Diffstat (limited to 'gcc/config/riscv/riscv.md')
 gcc/config/riscv/riscv.md | 56 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+), 0 deletions(-)
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index ff4557c..78c16ad 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -4162,6 +4162,62 @@
 }
 )
 
+;; These are forms of (x << C1) + C2, potentially canonicalized from
+;; (x + C2') << C1.  Depending on the cost to load C2 vs C2' we may
+;; want to go ahead and recognize this form as C2 may be cheaper to
+;; synthesize than C2'.
+;;
+;; It might be better to refactor riscv_const_insns a bit so that we
+;; can have an API that passes integer values around rather than
+;; constructing a lot of garbage RTL.
+;;
+;; The mvconst_internal pattern in effect requires this pattern to
+;; also be a define_insn_and_split due to insn count costing when
+;; splitting in combine.
+(define_insn_and_split ""
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(plus:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
+			    (match_operand 2 "const_int_operand" "n"))
+		 (match_operand 3 "const_int_operand" "n")))
+   (clobber (match_scratch:DI 4 "=&r"))]
+  "(TARGET_64BIT
+    && riscv_const_insns (operands[3])
+    && ((riscv_const_insns (operands[3])
+	 < riscv_const_insns (GEN_INT (INTVAL (operands[3]) >> INTVAL (operands[2]))))
+	|| riscv_const_insns (GEN_INT (INTVAL (operands[3]) >> INTVAL (operands[2]))) == 0))"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0) (ashift:DI (match_dup 1) (match_dup 2)))
+   (set (match_dup 4) (match_dup 3))
+   (set (match_dup 0) (plus:DI (match_dup 0) (match_dup 4)))]
+  ""
+  [(set_attr "type" "arith")])
+
+(define_insn_and_split ""
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(sign_extend:DI (plus:SI (ashift:SI
+				   (match_operand:SI 1 "register_operand" "r")
+				   (match_operand 2 "const_int_operand" "n"))
+				 (match_operand 3 "const_int_operand" "n"))))
+   (clobber (match_scratch:DI 4 "=&r"))]
+  "(TARGET_64BIT
+    && riscv_const_insns (operands[3])
+    && ((riscv_const_insns (operands[3])
+	 < riscv_const_insns (GEN_INT (INTVAL (operands[3]) >> INTVAL (operands[2]))))
+	|| riscv_const_insns (GEN_INT (INTVAL (operands[3]) >> INTVAL (operands[2]))) == 0))"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0) (ashift:DI (match_dup 1) (match_dup 2)))
+   (set (match_dup 4) (match_dup 3))
+   (set (match_dup 0) (sign_extend:DI (plus:SI (match_dup 5) (match_dup 6))))]
+  "{
+     operands[1] = gen_lowpart (DImode, operands[1]);
+     operands[5] = gen_lowpart (SImode, operands[0]);
+     operands[6] = gen_lowpart (SImode, operands[4]);
+   }"
+  [(set_attr "type" "arith")])
+
 (include "bitmanip.md")
 (include "crypto.md")
 (include "sync.md")