author | Roger Sayle <roger@nextmovesoftware.com> | 2024-02-01 06:10:42 +0000 |
---|---|---|
committer | Roger Sayle <roger@nextmovesoftware.com> | 2024-02-01 06:10:42 +0000 |
commit | 2f14c0dbb789852947cb58fdf7d3162413f053fa (patch) | |
tree | 87b48a81a1d7734aecb79533c601a42ca4b7bdef /libcpp | |
parent | fd4829dde46b9836c40c9ab27bde98521e692119 (diff) | |
PR target/113560: Enhance is_widening_mult_rhs_p.
This patch resolves PR113560, a code quality regression from GCC12
affecting x86_64, by enhancing the middle-end's tree-ssa-math-opts.cc
to recognize more instances of widening multiplications.
The widening multiplication perception code identifies cases like:
_1 = (unsigned __int128) x;
__res = _1 * 100;
but in the reported test case, the original input looks like:
_1 = (unsigned long long) x;
_2 = (unsigned __int128) _1;
__res = _2 * 100;
which gets optimized by constant folding during tree-ssa to:
_2 = x & 18446744073709551615; // x & 0xffffffffffffffff
__res = _2 * 100;
where the BIT_AND_EXPR hides (has consumed) the extension operation.
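For illustration, the two forms above can be written as C source. This is a
hypothetical reduction, not the test case from the PR: the function names are
invented, and x is assumed to be a 128-bit unsigned value, as the folded mask
form suggests. Both functions compute the same result; the second spells out
the mask that constant folding leaves behind.

```c
#include <assert.h>

typedef unsigned __int128 u128;   /* GCC extension, 64-bit targets only */

/* Hypothetical source shape: narrow to 64 bits, widen again, multiply. */
u128 scale_low64(u128 x)
{
    unsigned long long lo = (unsigned long long) x;  /* _1 = (unsigned long long) x */
    u128 wide = (u128) lo;                           /* _2 = (unsigned __int128) _1 */
    return wide * 100;                               /* __res = _2 * 100 */
}

/* The same computation after folding: the cast pair has become a mask. */
u128 scale_low64_folded(u128 x)
{
    u128 masked = x & (u128) 0xffffffffffffffffULL;  /* _2 = x & 0xffffffffffffffff */
    return masked * 100;                             /* __res = _2 * 100 */
}

/* Both forms must agree, including when the upper 64 bits of x are set. */
void check_fold_equivalence(void)
{
    u128 x = ((u128) 0xdeadbeefULL << 64) | 12345;
    assert(scale_low64(x) == scale_low64_folded(x));
}
```

The extension is still present semantically in the folded form, but only as
the fact that the upper 64 bits of the masked value are zero.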
This reveals a more general deficiency (a missed optimization
opportunity) in widening multiplication perception: in addition to
the case above, both
__int128 foo(__int128 x, __int128 y) {
return (x & 1000) * (y & 1000);
}
and
unsigned __int128 bar(unsigned __int128 x, unsigned __int128 y) {
return (x >> 80) * (y >> 80);
}
should be recognized as widening multiplications. Hence rather than
test explicitly for BIT_AND_EXPR (as in the first version of this patch)
the more general solution is to make use of range information, as
provided by tree_non_zero_bits.
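The reasoning behind using range information can be sketched in plain C. The
code below is only a model of the idea, not GCC's implementation: the helper
names are hypothetical and nothing here calls or imitates the signature of
tree_non_zero_bits. If every bit of an operand that may be nonzero lies in the
low 64 bits, the operand can be truncated without loss, and a 64x64->128
widening multiply gives the same product as the full 128-bit multiply.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;   /* GCC extension, 64-bit targets only */

/* Hypothetical helper: nonzero_mask has a 1 wherever the operand might
   have a 1.  The operand fits in 64 bits if no such bit is in the top half. */
static int fits_in_low_half(u128 nonzero_mask)
{
    return (nonzero_mask >> 64) == 0;
}

/* 64x64->128 widening multiply: only one operand needs the explicit cast. */
static u128 widening_mul(uint64_t a, uint64_t b)
{
    return (u128) a * b;
}

/* Function bar from the text: a full 128-bit multiplication. */
u128 bar(u128 x, u128 y)
{
    return (x >> 80) * (y >> 80);
}

/* The same computation expressed as a widening multiplication. */
u128 bar_widened(u128 x, u128 y)
{
    /* After a logical shift right by 80, only the low 48 bits can be set. */
    const u128 possible_bits = ~(u128) 0 >> 80;
    assert(fits_in_low_half(possible_bits));
    return widening_mul((uint64_t) (x >> 80), (uint64_t) (y >> 80));
}
```

The same argument covers a BIT_AND_EXPR with a small constant, which is why a
range-based test subsumes an explicit check for masks.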
As a demonstration of the observed improvements, function foo above
currently compiles with -O2 on x86_64 to:
foo: movq %rdi, %rsi
movq %rdx, %r8
xorl %edi, %edi
xorl %r9d, %r9d
andl $1000, %esi
andl $1000, %r8d
movq %rdi, %rcx
movq %r9, %rdx
imulq %rsi, %rdx
movq %rsi, %rax
imulq %r8, %rcx
addq %rdx, %rcx
mulq %r8
addq %rdx, %rcx
movq %rcx, %rdx
ret
With this patch, GCC recognizes the widening multiplication (w*) and
instead generates:
foo: movq %rdi, %rsi
movq %rdx, %r8
andl $1000, %esi
andl $1000, %r8d
movq %rsi, %rax
imulq %r8
ret
which is perhaps easier to understand at the tree-level where
__int128 foo (__int128 x, __int128 y)
{
__int128 _1;
__int128 _2;
__int128 _5;
<bb 2> [local count: 1073741824]:
_1 = x_3(D) & 1000;
_2 = y_4(D) & 1000;
_5 = _1 * _2;
return _5;
}
gets transformed to:
__int128 foo (__int128 x, __int128 y)
{
__int128 _1;
__int128 _2;
__int128 _5;
signed long _7;
signed long _8;
<bb 2> [local count: 1073741824]:
_1 = x_3(D) & 1000;
_2 = y_4(D) & 1000;
_7 = (signed long) _1;
_8 = (signed long) _2;
_5 = _7 w* _8;
return _5;
}
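The transformed GIMPLE can also be read as ordinary C. The following is a
hand-written equivalent, assuming an LP64 target such as x86_64 where long is
64 bits; the names foo_widened and check_foo_equivalence are hypothetical and
it is not compiler output. The narrowing casts correspond to the explicit
casts that the patch inserts, and the final multiplication has the shape that
the w* operator denotes.

```c
#include <assert.h>

/* Function foo from the text: a full 128-bit multiplication. */
__int128 foo(__int128 x, __int128 y)
{
    return (x & 1000) * (y & 1000);
}

/* Hand-written C equivalent of the transformed GIMPLE above. */
__int128 foo_widened(__int128 x, __int128 y)
{
    __int128 m1 = x & 1000;        /* _1 = x_3(D) & 1000 */
    __int128 m2 = y & 1000;        /* _2 = y_4(D) & 1000 */
    long n1 = (long) m1;           /* _7 = (signed long) _1, lossless: 0 <= m1 <= 1000 */
    long n2 = (long) m2;           /* _8 = (signed long) _2 */
    return (__int128) n1 * n2;     /* _5 = _7 w* _8 */
}

/* The two forms agree even for negative inputs, since the mask clears
   the sign bit before the multiplication. */
void check_foo_equivalence(void)
{
    assert(foo(-1, -1) == foo_widened(-1, -1));
    assert(foo(1023, 24) == foo_widened(1023, 24));
}
```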
2024-02-01 Roger Sayle <roger@nextmovesoftware.com>
Richard Biener <rguenther@suse.de>
gcc/ChangeLog
PR target/113560
* tree-ssa-math-opts.cc (is_widening_mult_rhs_p): Use range
information via tree_non_zero_bits to check if this operand
is suitably extended for a widening (or highpart) multiplication.
(convert_mult_to_widen): Insert explicit casts if the RHS or LHS
isn't already of the claimed type.
gcc/testsuite/ChangeLog
PR target/113560
* g++.target/i386/pr113560.C: New test case.
* gcc.target/i386/pr113560.c: Likewise.
* gcc.dg/pr87954.c: Update test case.