author     Jeff Law <jlaw@ventanamicro.com>  2023-11-09 17:34:01 -0700
committer  Jeff Law <jlaw@ventanamicro.com>  2023-11-09 17:34:01 -0700
commit     57dbc02d261bb833f6ef287187eb144321dd595c
tree       65b0d2164967b2a49137c0a5e787ca176cafe556 /gcc
parent     9a0cc04b9c9b02426762892b88efc5c44ba546bd
[committed] Improve single bit zero extraction on H8.
When zero-extracting a single-bit field from bits 16..31 on the H8 we
currently generate some pretty bad code.
The fundamental issue is we can't shift efficiently and there's no trivial way
to extract a single bit out of the high half word of an SImode value.
What usually happens is we use a synthesized right shift to get the single bit
into the desired position, then a bit-and to mask off everything we don't care
about.
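For reference, the source-level pattern in question is just a shift-and-mask;
a minimal C sketch (the function name is illustrative, not from the patch):

```c
#include <stdint.h>

/* Zero-extract a single bit from bits 16..31 of a 32-bit value.
   The generic lowering is a synthesized right shift followed by a
   mask with 1 -- exactly the expensive sequence described above.  */
static inline uint32_t
extract_bit (uint32_t x, unsigned pos)
{
  return (x >> pos) & 1;
}
```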
The shifts are expensive, even using tricks like half-word and quarter-word
moves to implement shift-by-16 and shift-by-8. Additionally, a logical right
shift must clear out the upper bits, which is redundant work since we're going
to mask everything with &1 later anyway.
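The shift-by-16 trick and the redundant clearing can be modeled in portable C
(illustrative only; the compiler emits H8 half-word moves, not this code):

```c
#include <stdint.h>

/* Model of a synthesized 16-bit logical right shift on the H8:
   a half-word move places the high half in the low half, and then
   the upper half must still be cleared -- wasted work when the
   result is immediately masked with &1 anyway.  */
static uint32_t
lshiftrt_16 (uint32_t x)
{
  uint32_t r = (x & 0xffff0000u) | (x >> 16);  /* mov.w high -> low */
  r &= 0x0000ffffu;                            /* clear upper bits  */
  return r;
}
```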
This patch provides a consistently better sequence for such extractions. The
general form moves the high half into the low half, extracts the desired bit
into the C flag, clears the destination, then moves C into the destination,
with a few special cases.
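A minimal C model of that general form, assuming the H8 behavior that xor does
not disturb the carry flag (the helper name is hypothetical, not part of the
patch):

```c
#include <stdint.h>

/* Steps modeled: mov.w (high half -> low half), bld (bit -> C flag),
   xor.l (clear the destination; C is preserved on the H8),
   rotxl.l (rotate C into bit 0).  Valid for 16 <= pos <= 31.  */
static uint32_t
extract_high_bit (uint32_t x, unsigned pos)
{
  uint32_t low = x >> 16;               /* mov.w %e1,%f0            */
  unsigned c = (low >> (pos - 16)) & 1; /* bld: bit into C          */
  uint32_t dest = 0;                    /* xor.l %0,%0 (C untouched) */
  dest = (dest << 1) | c;               /* rotxl.l %0: C into bit 0 */
  return dest;
}
```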
This also avoids all the shenanigans for the H8/SX, which has a much more
capable shifter. It's not single cycle, but it is reasonably efficient.
This has been regression tested on the H8 without issues. Pushing to the trunk
momentarily.
jeff
ps. Yes, zero extraction of multi-bit fields could likely be improved as
well. But I've already spent more time on this than I can reasonably justify.
gcc/
* config/h8300/combiner.md (single bit sign_extract): Avoid recently
added patterns for H8/SX.
(single bit zero_extract): New patterns.
Diffstat (limited to 'gcc')
-rw-r--r--   gcc/config/h8300/combiner.md | 70
1 file changed, 68 insertions, 2 deletions
diff --git a/gcc/config/h8300/combiner.md b/gcc/config/h8300/combiner.md
index 2f7faf7..e1179b5 100644
--- a/gcc/config/h8300/combiner.md
+++ b/gcc/config/h8300/combiner.md
@@ -1278,7 +1278,7 @@
 	(sign_extract:SI (match_operand:QHSI 1 "register_operand" "0")
 			 (const_int 1)
 			 (match_operand 2 "immediate_operand")))]
-  ""
+  "!TARGET_H8300SX"
   "#"
   "&& reload_completed"
   [(parallel [(set (match_dup 0)
@@ -1291,7 +1291,7 @@
 			 (const_int 1)
 			 (match_operand 2 "immediate_operand")))
    (clobber (reg:CC CC_REG))]
-  ""
+  "!TARGET_H8300SX"
 {
   int position = INTVAL (operands[2]);
@@ -1359,3 +1359,69 @@
     return "subx\t%s0,%s0\;exts.w %T0\;exts.l %0";
 }
   [(set_attr "length" "10")])
+
+;; For shift counts >= 16 we can always do better than the
+;; generic sequences.  Other patterns handle smaller counts.
+(define_insn_and_split ""
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(and:SI (lshiftrt:SI (match_operand:SI 1 "register_operand" "0")
+			     (match_operand 2 "immediate_operand" "n"))
+		(const_int 1)))]
+  "!TARGET_H8300SX && INTVAL (operands[2]) >= 16"
+  "#"
+  "&& reload_completed"
+  [(parallel [(set (match_dup 0)
+		   (and:SI (lshiftrt:SI (match_dup 0) (match_dup 2))
+			   (const_int 1)))
+	      (clobber (reg:CC CC_REG))])])
+
+(define_insn ""
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(and:SI (lshiftrt:SI (match_operand:SI 1 "register_operand" "0")
+			     (match_operand 2 "immediate_operand" "n"))
+		(const_int 1)))
+   (clobber (reg:CC CC_REG))]
+  "!TARGET_H8300SX && INTVAL (operands[2]) >= 16"
+{
+  int position = INTVAL (operands[2]);
+
+  /* If the bit we want is the highest bit we can just rotate it into
+     position and mask off everything else.  */
+  if (position == 31)
+    {
+      output_asm_insn ("rotl.l\t%0", operands);
+      return "and.l\t#1,%0";
+    }
+
+  /* Special case for H8/S.  Similar to bit 31.  */
+  if (position == 30 && TARGET_H8300S)
+    return "rotl.l\t#2,%0\;and.l\t#1,%0";
+
+  if (position <= 30 && position >= 17)
+    {
+      /* Shift 16 bits, without worrying about extensions.  */
+      output_asm_insn ("mov.w\t%e1,%f0", operands);
+
+      /* Get the bit we want into C.  */
+      operands[2] = GEN_INT (position % 8);
+      if (position >= 24)
+	output_asm_insn ("bld\t%2,%t0", operands);
+      else
+	output_asm_insn ("bld\t%2,%s0", operands);
+
+      /* xor + rotate to clear the destination, then rotate
+	 the C into position.  */
+      return "xor.l\t%0,%0\;rotxl.l\t%0";
+    }
+
+  if (position == 16)
+    {
+      /* Shift 16 bits, without worrying about extensions.  */
+      output_asm_insn ("mov.w\t%e1,%f0", operands);
+
+      /* And finally, mask out everything we don't want.  */
+      return "and.l\t#1,%0";
+    }
+
+  gcc_unreachable ();
+}
+  [(set_attr "length" "10")])
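The bld operand selection in the pattern above (bit number position % 8, byte
addressed via %t0 for positions >= 24 and %s0 otherwise, after the bit has
been moved into the low word) can be sanity-checked with a small C model; the
function name is hypothetical, not GCC code:

```c
#include <stdint.h>

/* After mov.w moves the high half into the low word, bit `position`
   (17..30) lives at bit position - 16 of that word.  bld addresses a
   single byte with a 0..7 bit number: bits 8..15 through %t0, bits
   0..7 through %s0; in both cases the bit number is position % 8.  */
static unsigned
bld_model (uint32_t low_word, int position)
{
  unsigned bitno = position % 8;
  if (position >= 24)
    return (low_word >> (8 + bitno)) & 1;  /* bld %2,%t0 */
  else
    return (low_word >> bitno) & 1;        /* bld %2,%s0 */
}
```

For every position in 17..30, this picks out the same bit that the original
(x >> position) & 1 would, given low_word = x >> 16.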