author | Jeff Law <jlaw@ventanamicro.com> | 2024-06-18 06:40:40 -0600 |
---|---|---|
committer | Jeff Law <jlaw@ventanamicro.com> | 2024-06-18 06:42:51 -0600 |
commit | a78e2c3a00d8b147b44416f7a843c9df61f04531 (patch) | |
tree | 131cff440687a5e9a69092c0756ca31ee3f4850a /gcc/tree-vect-loop.cc | |
parent | 89c26a99102d2cc00455333795d81d6426be7057 (diff) | |
[to-be-committed,RISC-V] Improve bset generation when bit position is limited
So more work in the ongoing effort to make better use of the Zbs
extension. This time we're trying to exploit knowledge of the shift
count/bit position to allow us to use a bset instruction.
Consider this expression in SImode:
(1 << (pos & 0xf))
None of the resulting values will have bit 31 set. So if there's an
explicit zero or sign extension to DI we can drop that explicit
extension and generate a simple bset with x0 as the input value.
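As a sanity check on the reasoning above, here is a minimal C sketch. It is not part of the patch. The helper names (one_shl_masked, bset_x0, extensions_agree) are made up for illustration, and bset_x0 is only a software model of the RV64 bset instruction with x0 as the first source. The sketch checks that sign extension, zero extension, and the modelled bset all give the same 64-bit value whenever the bit position is masked to 0..15.

```c
#include <assert.h>
#include <stdint.h>

/* The SImode expression from the text: the mask keeps pos in 0..15,
   so the result never has bit 31 set. */
static int32_t one_shl_masked(uint32_t pos)
{
    return (int32_t)(1u << (pos & 0xf));
}

/* Software model (an assumption for illustration) of "bset rd,x0,rs2"
   on RV64: set bit (rs2 & 63) in an all-zero value. */
static uint64_t bset_x0(uint64_t rs2)
{
    return (uint64_t)1 << (rs2 & 63);
}

/* Returns 1 if sign extension, zero extension and the modelled bset
   agree for every tested pos, 0 otherwise. */
static int extensions_agree(void)
{
    for (uint32_t pos = 0; pos < 64; pos++) {
        int32_t v = one_shl_masked(pos);
        uint64_t sext = (uint64_t)(int64_t)v;   /* explicit sign extension */
        uint64_t zext = (uint64_t)(uint32_t)v;  /* explicit zero extension */
        if (sext != zext || sext != bset_x0(pos & 0xf))
            return 0;
    }
    return 1;
}
```

Because bit 31 is never set, the two extensions cannot differ, which is what makes dropping the explicit extension safe in this case.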
Or another example (which I think came from SPEC at some point and, IIRC,
was the primary motivation for this patch):
(1 << (7-(pos) % 8))
Before this change they'd generate something like this respectively:

  li a5,1
  andi a0,a0,15
  sllw a0,a5,a0

  li a5,7
  andn a0,a5,a0
  li a5,1
  sllw a0,a5,a0
After this change they generate:

  andi a0,a0,15 # 9 [c=4 l=4] *anddi3/1
  bset a0,x0,a0 # 17 [c=8 l=4] *bsetdi_2

  li a5,7 # 27 [c=4 l=4] *movdi_64bit/1
  andn a0,a5,a0 # 28 [c=4 l=4] and_notdi3
  bset a0,x0,a0 # 19 [c=8 l=4] *bsetdi_2
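The second sequence relies on 7 - (pos % 8) being the same as 7 & ~pos, which is what the li/andn pair computes. A small C sketch of that equivalence follows. It assumes an unsigned pos (with a signed pos the C remainder could be negative), and the helpers andn, bset_zero, after_sequence and sequence_matches are illustrative names that model the instructions, not anything taken from GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Shift count of the second expression, assuming pos is unsigned. */
static uint32_t shift_count_c(uint32_t pos)
{
    return 7 - pos % 8;
}

/* Model of "andn rd,rs1,rs2": rs1 & ~rs2. */
static uint64_t andn(uint64_t rs1, uint64_t rs2)
{
    return rs1 & ~rs2;
}

/* Model of "bset rd,x0,rs2" on RV64: set bit (rs2 & 63) in zero. */
static uint64_t bset_zero(uint64_t rs2)
{
    return (uint64_t)1 << (rs2 & 63);
}

/* Mirrors the three-instruction sequence shown above. */
static uint64_t after_sequence(uint64_t a0)
{
    uint64_t a5 = 7;        /* li   a5,7     */
    a0 = andn(a5, a0);      /* andn a0,a5,a0 */
    return bset_zero(a0);   /* bset a0,x0,a0 */
}

/* Returns 1 if the modelled sequence matches the C expression
   for every pos in 0..255, 0 otherwise. */
static int sequence_matches(void)
{
    for (uint32_t pos = 0; pos < 256; pos++) {
        uint64_t expected = (uint64_t)(1u << shift_count_c(pos));
        if (after_sequence(pos) != expected)
            return 0;
    }
    return 1;
}
```

Since the shift count stays in 0..7, the result again cannot reach bit 31, so the same argument about dropping the extension applies.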
We achieve this with simple define_splits which target the bsetdi_2
pattern I recently added. Much better than the original implementation
I did a few months back :-) I've got a bclr/binv variant from a few
months back as well, but it needs to be updated to the simpler
implementation found here.
Just ran this through my tester. Will wait for the precommit CI to
render its verdict before moving forward.
gcc/
* config/riscv/bitmanip.md (bset splitters): New patterns for
generating bset when bit position is limited.