| field | value | date |
|---|---|---|
| author | Richard Sandiford <richard.sandiford@linaro.org> | 2018-05-25 08:18:42 +0000 |
| committer | Richard Sandiford <rsandifo@gcc.gnu.org> | 2018-05-25 08:18:42 +0000 |
| commit | 8f76f377861b4195487416806c4a0eacabc433c9 | |
| tree | bf586692b116d22b1e515fb4c7f3bb8fc7f2774a /gcc/tree-vect-patterns.c | |
| parent | 0d2b3bca81acf226e6c10defbc6072de4cf7e75c | |
Prefer open-coding vector integer division
vect_recog_divmod_pattern currently bails out if the target has
native support for integer division, but I think in practice
it's always going to be better to open-code it anyway, just as
we usually open-code scalar divisions by constants.
I think the only currently affected targets are MIPS MSA and
powerpcspe (which is currently marked obsolete). For:
    void
    foo (int *x)
    {
      for (int i = 0; i < 100; ++i)
        x[i] /= 2;
    }
the MSA port previously preferred to use division for powers of 2:
        .set    noreorder
        bnz.w   $w1,1f
        div_s.w $w0,$w0,$w1
        break   7
        .set    reorder
    1:
(or just the div_s.w for -mno-check-zero-division), but after the patch
it open-codes them using shifts:
        clt_s.w $w1,$w0,$w2
        subv.w  $w0,$w0,$w1
        srai.w  $w0,$w0,1
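As a rough per-lane scalar model of that three-instruction sequence (a sketch, not compiler output): it assumes `$w2` holds zeros, which the excerpt above does not show, and that `>>` on a negative `int32_t` is an arithmetic shift, which ISO C leaves implementation-defined but GCC provides. The names `div2_open_coded` and `div2_self_check` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one vector lane of the open-coded x / 2.  */
int32_t
div2_open_coded (int32_t x)
{
  int32_t mask = x < 0 ? -1 : 0;  /* clt_s.w: all-ones where x < 0 (assumes $w2 == 0) */
  int32_t biased = x - mask;      /* subv.w: adds 1 to negative x */
  return biased >> 1;             /* srai.w: arithmetic shift right by 1 */
}

/* Hypothetical self-check against C's truncating division.  */
void
div2_self_check (void)
{
  for (int32_t x = -1000; x <= 1000; ++x)
    assert (div2_open_coded (x) == x / 2);
  assert (div2_open_coded (INT32_MIN) == INT32_MIN / 2);
  assert (div2_open_coded (INT32_MAX) == INT32_MAX / 2);
}
```

The bias step is what makes the shift round towards zero for negative inputs, matching C division, rather than towards negative infinity.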
MSA doesn't define a high-part pattern, so it still uses a division
instruction for the non-power-of-2 case.
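To illustrate why a high-part multiply matters here, the following sketch open-codes a division by a non-power-of-2 constant as the upper half of a widening multiply followed by a shift. It uses unsigned division by 3 for simplicity (the example above is signed, which needs extra fix-up steps), and the helper names `mulhi_u32`, `udiv3_open_coded` and `udiv3_self_check` are hypothetical, not GCC functions.

```c
#include <assert.h>
#include <stdint.h>

/* Upper 32 bits of the 64-bit product: the operation a target's
   high-part multiply pattern would provide per lane.  */
uint32_t
mulhi_u32 (uint32_t a, uint32_t b)
{
  return (uint32_t) (((uint64_t) a * b) >> 32);
}

/* x / 3 without a division: 0xAAAAAAAB is ceil (2^33 / 3), so
   (x * 0xAAAAAAAB) >> 33 equals x / 3 for every 32-bit x.  */
uint32_t
udiv3_open_coded (uint32_t x)
{
  return mulhi_u32 (x, 0xAAAAAAABu) >> 1;
}

/* Hypothetical self-check against the native division.  */
void
udiv3_self_check (void)
{
  for (uint32_t x = 0; x < 100000u; ++x)
    assert (udiv3_open_coded (x) == x / 3);
  assert (udiv3_open_coded (UINT32_MAX) == UINT32_MAX / 3);
  assert (udiv3_open_coded (UINT32_MAX - 1) == (UINT32_MAX - 1) / 3);
}
```

Without a vector equivalent of `mulhi_u32`, this rewrite is not available, which is consistent with MSA falling back to the division instruction.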
Richard B pointed out that this would disable SLP of division by
different amounts, but I think in practice that's a price worth paying,
since the current cost model can't really tell whether using a general
vector division is better than using open-coded scalar divisions.
The fix would be either to support SLP of mixed open-coded divisions
or to improve the cost model and try SLP again without the patterns.
The patch adds an XFAILed test for this.
2018-05-23  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
    * tree-vect-patterns.c: Include predict.h.
    (vect_recog_divmod_pattern): Restrict check for division support
    to when optimizing for size.

gcc/testsuite/
    * gcc.dg/vect/bb-slp-div-1.c: New XFAILed test.

From-SVN: r260711
Diffstat (limited to 'gcc/tree-vect-patterns.c')
-rw-r--r-- | gcc/tree-vect-patterns.c | 21 |
1 files changed, 13 insertions, 8 deletions
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index 75bf84b..6da784c 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -45,6 +45,7 @@ along with GCC; see the file COPYING3. If not see
 #include "attribs.h"
 #include "cgraph.h"
 #include "omp-simd-clone.h"
+#include "predict.h"
 
 /* Pattern recognition functions */
 static gimple *vect_recog_widen_sum_pattern (vec<gimple *> *, tree *,
@@ -2674,15 +2675,19 @@ vect_recog_divmod_pattern (vec<gimple *> *stmts,
   if (vectype == NULL_TREE)
     return NULL;
 
-  /* If the target can handle vectorized division or modulo natively,
-     don't attempt to optimize this. */
-  optab = optab_for_tree_code (rhs_code, vectype, optab_default);
-  if (optab != unknown_optab)
+  if (optimize_bb_for_size_p (gimple_bb (last_stmt)))
     {
-      machine_mode vec_mode = TYPE_MODE (vectype);
-      int icode = (int) optab_handler (optab, vec_mode);
-      if (icode != CODE_FOR_nothing)
-        return NULL;
+      /* If the target can handle vectorized division or modulo natively,
+         don't attempt to optimize this, since native division is likely
+         to give smaller code. */
+      optab = optab_for_tree_code (rhs_code, vectype, optab_default);
+      if (optab != unknown_optab)
+        {
+          machine_mode vec_mode = TYPE_MODE (vectype);
+          int icode = (int) optab_handler (optab, vec_mode);
+          if (icode != CODE_FOR_nothing)
+            return NULL;
+        }
     }
 
   prec = TYPE_PRECISION (itype);
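The behavioural change in the second hunk can be summarised with a toy model of this one early return. This is only a sketch: the two boolean parameters are hypothetical stand-ins for the optab lookup and for `optimize_bb_for_size_p`, and the real function has other bail-out conditions that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Before the patch: give up on the pattern whenever the target has a
   native vector division, regardless of the optimization goal.  */
bool
bails_out_before (bool has_native_div, bool optimizing_for_size)
{
  (void) optimizing_for_size;  /* not consulted by the old check */
  return has_native_div;
}

/* After the patch: give up only when optimizing for size, where the
   native division is likely to give smaller code.  */
bool
bails_out_after (bool has_native_div, bool optimizing_for_size)
{
  return optimizing_for_size && has_native_div;
}

/* Hypothetical self-check: the two versions differ only for a target
   with native division that is not optimizing for size.  */
void
bail_out_self_check (void)
{
  assert (bails_out_before (true, false) && !bails_out_after (true, false));
  assert (bails_out_before (true, true) == bails_out_after (true, true));
  assert (bails_out_before (false, false) == bails_out_after (false, false));
  assert (bails_out_before (false, true) == bails_out_after (false, true));
}
```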