author | Tamar Christina <tamar.christina@arm.com> | 2023-07-14 11:21:12 +0100 |
---|---|---|
committer | Tamar Christina <tamar.christina@arm.com> | 2023-07-14 11:21:12 +0100 |
commit | d8f5e349772b6652bddb0620bb178290905998b9 (patch) | |
tree | ea989ad3a5834016d436064ccd5380fe44ac1d23 /gcc/tree-pass.h | |
parent | b77161e60bce7b4416319defe5f141f14fd375c4 (diff) | |
ifcvt: Reduce comparisons on conditionals by tracking truths [PR109154]
Following on from Jakub's patch in g:de0ee9d14165eebb3d31c84e98260c05c3b33acb
these two patches finish the work of fixing the regression and improve codegen.
As explained in that commit, ifconvert sorts PHI args in increasing number of
occurrences in order to reduce the number of comparisons done while
traversing the tree.
The remaining task that this patch addresses is the long chain of comparisons
that can be created from PHI nodes, particularly when they share a common
successor (the classic example is a diamond).
On a PHI node the true and false branches each carry a condition: the true
branch carries `a` and the false branch `~a`. The issue is that at the moment
GCC tests both `a` and `~a` when the PHI node has more than 2 arguments. This
is unnecessary, and the deeper the nesting of PHI nodes, the more the tests
are repeated.
As an example, for
void
foo (int *f, int d, int e)
{
  for (int i = 0; i < 1024; i++)
    {
      int a = f[i];
      int t;
      if (a < 0)
        t = 1;
      else if (a < e)
        t = 1 - a * d;
      else
        t = 0;
      f[i] = t;
    }
}
after Jakub's patch we generate:
_7 = a_10 < 0;
_21 = a_10 >= 0;
_22 = a_10 < e_11(D);
_23 = _21 & _22;
_ifc__42 = _23 ? t_13 : 0;
t_6 = _7 ? 1 : _ifc__42;
This is better than before but still inefficient: in the false branch, where
we know ~_7 is true, we still test _21.
This leads to superfluous tests for every diamond node. After this patch we
generate:
_7 = a_10 < 0;
_22 = a_10 < e_11(D);
_ifc__42 = _22 ? t_13 : 0;
t_6 = _7 ? 1 : _ifc__42;
This correctly elides the test of _21. It is done by borrowing the
vectorizer's helper functions to limit predicate mask usages. Ifcvt chains
conditionals on the false edge (unless specifically inverted), so when
creating the cond a ? b : c this patch registers ~a while traversing c. If c
is itself a conditional, it is simplified to the smallest possible predicate
given the assumptions already known to be true.
gcc/ChangeLog:
PR tree-optimization/109154
* tree-if-conv.cc (gen_simplified_condition,
gen_phi_nest_statement): New.
(gen_phi_arg_condition, predicate_scalar_phi): Use it.
gcc/testsuite/ChangeLog:
PR tree-optimization/109154
* gcc.dg/vect/vect-ifcvt-19.c: New test.