author | Roger Sayle <roger@nextmovesoftware.com> | 2023-04-28 14:21:53 +0100 |
---|---|---|
committer | Roger Sayle <roger@nextmovesoftware.com> | 2023-04-28 14:21:53 +0100 |
commit | 650c36ec461a722d9c65e82512b4c3aeec2ffee1 | |
tree | a10528a33f9e8c8411c6f532acbfc0d49e69f3aa /gcc/fold-const.h | |
parent | fde00589911b5ff75ca167a45128d1d13fa76e57 | |
PR rtl-optimization/109476: Use ZERO_EXTEND instead of zeroing a SUBREG.

This patch fixes PR rtl-optimization/109476, which is a code quality
regression affecting AVR. The cause is that the lower-subreg pass is
sometimes overly aggressive, lowering the LSHIFTRT below:
  (insn 7 4 8 2 (set (reg:HI 51)
          (lshiftrt:HI (reg/v:HI 49 [ b ])
              (const_int 8 [0x8]))) "t.ii":4:36 557 {lshrhi3}
       (nil))
into a pair of QImode SUBREG assignments:
  (insn 19 4 20 2 (set (subreg:QI (reg:HI 51) 0)
          (reg:QI 54 [ b+1 ])) "t.ii":4:36 86 {movqi_insn_split}
       (nil))
  (insn 20 19 8 2 (set (subreg:QI (reg:HI 51) 1)
          (const_int 0 [0])) "t.ii":4:36 86 {movqi_insn_split}
       (nil))
but this idiom, SETs of SUBREGs, interferes with combine's ability
to associate/fuse instructions. The solution, on targets that
have a suitable ZERO_EXTEND (i.e. where the lower-subreg pass
wouldn't itself split a ZERO_EXTEND, so "splitting_zext" is false),
is to split/lower LSHIFTRT to a ZERO_EXTEND.
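
As a rough illustration of why the two lowerings are interchangeable, the sketch below models the HImode shift from insn 7 in plain C, once as the pair of byte-wise SUBREG sets (insns 19 and 20) and once as a single zero extension of the high byte. The struct and function names are hypothetical and exist only for this example; the RTL shape in the last comment is an assumption about what a ZERO_EXTEND lowering could look like, not a dump taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of (reg:HI 51) as two QImode bytes.
   Byte 0 is the low part, matching the SUBREG byte offsets above. */
struct hi_reg { uint8_t byte[2]; };

uint16_t hi_value(struct hi_reg r)
{
    return (uint16_t)(r.byte[0] | (r.byte[1] << 8));
}

/* insn 7: the original 16-bit logical right shift by 8. */
uint16_t shift_whole(uint16_t b)
{
    return (uint16_t)(b >> 8);
}

/* insns 19 and 20: two separate byte-sized SUBREG assignments. */
uint16_t shift_as_subreg_sets(uint16_t b)
{
    struct hi_reg r51;
    r51.byte[0] = (uint8_t)(b >> 8);  /* subreg 0 <- high byte of b ("b+1") */
    r51.byte[1] = 0;                  /* subreg 1 <- const_int 0 */
    return hi_value(r51);
}

/* Assumed alternative: one set of the whole register, roughly
   (set (reg:HI 51) (zero_extend:HI (reg:QI 54 [ b+1 ]))). */
uint16_t shift_as_zero_extend(uint16_t b)
{
    uint8_t high_byte = (uint8_t)(b >> 8);
    return (uint16_t)high_byte;       /* zero extension QI -> HI */
}

/* Exhaustively confirm that all three forms compute the same value. */
void check_lowerings_agree(void)
{
    for (uint32_t b = 0; b <= 0xFFFF; b++) {
        uint16_t expected = shift_whole((uint16_t)b);
        assert(shift_as_subreg_sets((uint16_t)b) == expected);
        assert(shift_as_zero_extend((uint16_t)b) == expected);
    }
}
```

The values are identical in every case. The difference that matters to combine is only that the zero-extend form writes the destination register in one instruction rather than through two partial SUBREG sets.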
To answer Richard's question in comment #10 of the bugzilla PR,
the function resolve_shift_zext is called with one of four RTX
codes: ASHIFTRT, LSHIFTRT, ZERO_EXTEND and ASHIFT. Only for
LSHIFTRT can the separate sets of the low_part and high_part SUBREGs
be replaced by a single ZERO_EXTEND. For ASHIFTRT we require a sign
extension, so the high_part is not set to zero. If we're already
splitting a ZERO_EXTEND, it makes no sense to replace it with
another ZERO_EXTEND. For ASHIFT we've played games to swap the
high_part and low_part SUBREGs, so that it is the low_part that is
assigned zero (for double-word shifts by at least the word size in
bits).
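
The four cases can be sketched in C on a toy machine with 8-bit words and a 16-bit double word. Everything here is a hypothetical illustration: the enum, struct dword, may_use_zero_extend and lower_by_words are names invented for this sketch and are not taken from lower-subreg.cc. The splitting_zext parameter only mirrors the flag of that name mentioned in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the four RTX codes named above. */
enum shift_code { CODE_ASHIFTRT, CODE_LSHIFTRT, CODE_ZERO_EXTEND, CODE_ASHIFT };

/* One 16-bit double-word value held as two 8-bit words. */
struct dword { uint8_t low, high; };

uint16_t dword_value(struct dword d)
{
    return (uint16_t)(d.low | (d.high << 8));
}

/* Only a logical right shift, on a target that is not itself
   splitting ZERO_EXTEND, qualifies for the single-instruction form. */
bool may_use_zero_extend(enum shift_code code, bool splitting_zext)
{
    return code == CODE_LSHIFTRT && !splitting_zext;
}

/* Word-by-word lowering. For the shift codes, count must be in [8, 15].
   For CODE_ZERO_EXTEND, count is ignored and src.low is the narrow operand. */
struct dword lower_by_words(enum shift_code code, struct dword src,
                            unsigned count)
{
    struct dword dst = { 0, 0 };
    unsigned k = 0;
    if (code != CODE_ZERO_EXTEND) {
        assert(count >= 8 && count <= 15);
        k = count - 8;                /* residual shift within one word */
    }
    uint8_t sign_fill = (src.high & 0x80) ? 0xFF : 0x00;

    switch (code) {
    case CODE_LSHIFTRT:               /* high word becomes zero */
        dst.low = (uint8_t)(src.high >> k);
        dst.high = 0;
        break;
    case CODE_ASHIFTRT:               /* high word becomes sign copies */
        dst.low = (uint8_t)((src.high >> k) | (uint8_t)(sign_fill << (8 - k)));
        dst.high = sign_fill;
        break;
    case CODE_ZERO_EXTEND:            /* already a zero extension */
        dst.low = src.low;
        dst.high = 0;
        break;
    case CODE_ASHIFT:                 /* roles swapped: low word is zeroed */
        dst.high = (uint8_t)(src.low << k);
        dst.low = 0;
        break;
    }
    return dst;
}
```

In this model only the CODE_LSHIFTRT case produces "narrow value in the low word, zero in the high word" from a shift. ASHIFTRT fills the high word with sign bits, ASHIFT zeroes the low word instead, and ZERO_EXTEND is already the operation in question.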

2023-04-28  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR rtl-optimization/109476
	* lower-subreg.cc: Include explow.h for force_reg.
	(find_decomposable_shift_zext): Pass an additional SPEED_P argument.
	If decomposing a suitable LSHIFTRT and we're not splitting
	ZERO_EXTEND (based on the current SPEED_P), then use a ZERO_EXTEND
	instead of setting a high part SUBREG to zero, which helps combine.
	(decompose_multiword_subregs): Update call to resolve_shift_zext.

gcc/testsuite/ChangeLog
	PR rtl-optimization/109476
	* gcc.target/avr/mmcu/pr109476.c: New test case.