author | Roger Sayle <roger@nextmovesoftware.com> | 2024-07-16 07:58:28 +0100 |
---|---|---|
committer | Roger Sayle <roger@nextmovesoftware.com> | 2024-07-16 07:58:28 +0100 |
commit | df9451936c6c9e4faea371e3f188e1fc6b6d39e3 (patch) | |
tree | f47fe44bf0245d3b73bef1185c7061b01c58fdfe /gcc/tree-vect-loop.cc | |
parent | a902e35396d68f10bd27477153fafa4f5ac9c319 (diff) | |
PR tree-optimization/114661: Generalize MULT_EXPR recognition in match.pd.
This patch resolves PR tree-optimization/114661, by generalizing the set
of expressions that we canonicalize to multiplication. This extends the
optimization(s) contributed (by me) back in July 2021.
https://gcc.gnu.org/pipermail/gcc-patches/2021-July/575999.html
The existing transformation folds (X*C1)^(X<<C2) into X*C3 when
allowed. A subtlety is that for non-wrapping integer types, we
actually fold this into (int)((unsigned)X*C3) so that we don't
introduce an undefined overflow that wasn't in the original.
Unfortunately, this transformation confuses itself, as the type-cast
multiplication isn't recognized when further combining bit operations.
This is fixed here by allowing optional useless type conversions in the
transforms, so that (int)((unsigned)X*C1)^(X<<C2) is turned into
(int)((unsigned)X*C3) and match.pd and EVRP can continue to construct
multiplications.
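As a rough illustration of the fold described above, the following C sketch compares the unfolded and folded forms for one arbitrary choice of constants (C1 = 5, C2 = 4, hence C3 = 5 + 16 = 21). The function names and constants are hypothetical, chosen only for this example; they are not taken from match.pd. The fold is only valid when the two operands share no set bits, which the sketch arranges by restricting x to 0..3.

```c
#include <assert.h>

/* Hypothetical example constants: C1 = 5, C2 = 4,
   so C3 = C1 + (1 << C2) = 21.  */

/* Unfolded form: (X*C1) ^ (X<<C2).  */
unsigned unfolded(unsigned x)
{
    return (x * 5u) ^ (x << 4);
}

/* Folded form for a signed X: multiply in unsigned arithmetic so that
   no signed overflow is introduced, then convert back to int.  */
int folded_signed(int x)
{
    return (int)((unsigned)x * 21u);
}

/* Asserts that both forms agree for every x in 0..3.  In that range
   x*5 occupies bits 0-3 and x<<4 occupies bits 4-5, so XOR behaves
   like addition.  */
void check_fold(void)
{
    for (int x = 0; x <= 3; x++)
        assert(unfolded((unsigned)x) == (unsigned)folded_signed(x));
}
```

Calling check_fold() from a small test driver should complete without tripping an assertion. Outside the 0..3 range the operands can overlap and the two forms are not expected to agree, which is why the real transformation depends on known nonzero bits.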
For the example given in the PR:
unsigned mul(unsigned char c) {
    if (c > 3) __builtin_unreachable();
    return c << 18 | c << 15 |
           c << 12 | c << 9 |
           c << 6 | c << 3 | c;
}
GCC on x86_64 with -O2 previously generated:
mul:    movzbl  %dil, %edi
        leal    (%rdi,%rdi,8), %edx
        leal    0(,%rdx,8), %eax
        movl    %edx, %ecx
        sall    $15, %edx
        orl     %edi, %eax
        sall    $9, %ecx
        orl     %ecx, %eax
        orl     %edx, %eax
        ret
With this patch we now generate:
mul:    movzbl  %dil, %eax
        imull   $299593, %eax, %eax
        ret
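The constant in the imull can be checked by hand: for c in 0..3 the seven shifted copies of c occupy disjoint bit positions, so OR behaves as addition and the result is c * ((1<<18) + (1<<15) + (1<<12) + (1<<9) + (1<<6) + (1<<3) + 1) = c * 299593. A small C sketch of that check follows; mul_shifts, mul_const and check_mul are hypothetical names, and the __builtin_unreachable hint is replaced here by simply restricting the inputs that are tested.

```c
#include <assert.h>

/* The shift-and-OR form from the PR, without the range hint.  */
unsigned mul_shifts(unsigned char c)
{
    return c << 18 | c << 15 |
           c << 12 | c << 9 |
           c << 6 | c << 3 | c;
}

/* The single multiplication that the patched compiler emits.  */
unsigned mul_const(unsigned char c)
{
    return c * 299593u;
}

/* Asserts that the two forms agree on the whole range 0..3.  */
void check_mul(void)
{
    for (unsigned c = 0; c <= 3; c++)
        assert(mul_shifts((unsigned char)c) == mul_const((unsigned char)c));
}
```

For c greater than 3 the shifted copies start to overlap, so the two functions are not expected to match in general; the range information is what makes the fold legal.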
2024-07-16 Roger Sayle <roger@nextmovesoftware.com>
Richard Biener <rguenther@suse.de>
gcc/ChangeLog
PR tree-optimization/114661
* match.pd ((X*C1)|(X*C2) to X*(C1+C2)): Allow optional useless
type conversions around multiplications, such as those inserted
by this transformation.
gcc/testsuite/ChangeLog
PR tree-optimization/114661
* gcc.dg/pr114661.c: New test case.