author | Jakub Jelinek <jakub@redhat.com> | 2021-04-27 15:42:47 +0200 |
---|---|---|
committer | Jakub Jelinek <jakub@redhat.com> | 2021-04-27 15:42:47 +0200 |
commit | 83d26d0e1b3625ab6c2d83610a13976b52f63e0a (patch) | |
tree | 8a5ffbe1f59590ad485a150e098ff30af783e841 /gcc | |
parent | 26690993d0a93656b0a20788b5c3439fbd260da2 (diff) | |
veclower: Fix up vec_shl matching of VEC_PERM_EXPR [PR100239]
The following testcase ICEs at -O0, because lower_vec_perm sees the
  _1 = { 0, 0, 0, 0, 0, 0, 0, 0 };
  _2 = VEC_COND_EXPR <_1, { -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0 }>;
  _3 = { 6, 0, 0, 0, 0, 0, 0, 0 };
  _4 = VEC_PERM_EXPR <{ 0, 0, 0, 0, 0, 0, 0, 0 }, _2, _3>;
and as the ISA is SSE2, there is no support for the particular permutation
nor for variable mask permutation. But, the code to match vec_shl matches
it, because the permutation has the first operand a zero vector and the
mask picks all elements randomly from that vector.
So, in the end that isn't a vec_shl, but the permutation could in theory be
optimized into the first argument. As we keep it as is, it will fail
during expansion though, because the expansion code for vec_shl correctly
requires that it actually is a shift:
  unsigned firstidx = 0;
  for (unsigned int i = 0; i < nelt; i++)
    {
      if (known_eq (sel[i], nelt))
        {
          if (i == 0 || firstidx)
            return NULL_RTX;
          firstidx = i;
        }
      else if (firstidx
               ? maybe_ne (sel[i], nelt + i - firstidx)
               : maybe_ge (sel[i], nelt))
        return NULL_RTX;
    }
  if (firstidx == 0)
    return NULL_RTX;
  first = firstidx;
The if (firstidx == 0) return NULL_RTX; is what is missing a counterpart
on the lower_vec_perm side.
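To make the mismatch concrete, here is a minimal standalone sketch of the two checks. It is not GCC code: it assumes plain unsigned selectors instead of poly_uint64, and the function names expand_side_first, lower_side_matches and check_examples are hypothetical, introduced only for illustration. The require_first flag models the lowering-side check with and without the added condition.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the expansion-side loop quoted above (assumption:
   plain unsigned selectors instead of poly_uint64).  Returns the index of
   the first element taken from the second operand, or 0 when the selector
   is not a shift.  */
unsigned
expand_side_first (const unsigned *sel, unsigned nelt)
{
  unsigned firstidx = 0;
  for (unsigned i = 0; i < nelt; i++)
    {
      if (sel[i] == nelt)
        {
          if (i == 0 || firstidx)
            return 0;
          firstidx = i;
        }
      else if (firstidx ? sel[i] != nelt + i - firstidx : sel[i] >= nelt)
        return 0;
    }
  return firstidx;
}

/* Hypothetical model of the lowering-side match.  With require_first
   false it accepts any selector that survives the loop; with
   require_first true it also demands that some element came from the
   second operand.  */
bool
lower_side_matches (const unsigned *sel, unsigned nelt, bool require_first)
{
  unsigned first = 0, i;
  for (i = 0; i < nelt; i++)
    {
      if (sel[i] == nelt)
        {
          if (i == 0 || first)
            break;
          first = i;
        }
      else if (first ? sel[i] != nelt + i - first : sel[i] >= nelt)
        break;
    }
  return (!require_first || first) && i == nelt;
}

/* Exercise both models on the selector from the testcase and on a
   genuine shift by two elements.  */
void
check_examples (void)
{
  const unsigned from_zero_only[8] = { 6, 0, 0, 0, 0, 0, 0, 0 };
  const unsigned real_shift[8] = { 0, 0, 8, 9, 10, 11, 12, 13 };

  /* All indices pick from the zero vector: the expansion side rejects it,
     but the lowering side accepts it unless first is required.  */
  assert (expand_side_first (from_zero_only, 8) == 0);
  assert (lower_side_matches (from_zero_only, 8, false));
  assert (!lower_side_matches (from_zero_only, 8, true));

  /* A real shift is accepted by both sides either way.  */
  assert (expand_side_first (real_shift, 8) == 2);
  assert (lower_side_matches (real_shift, 8, false));
  assert (lower_side_matches (real_shift, 8, true));
}
```

In this model, the selector { 6, 0, 0, 0, 0, 0, 0, 0 } is the only case where the two sides disagree, and requiring a nonzero first on the lowering side removes the disagreement.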
As with optimize != 0 we fold it in other spots, I think it is not needed
to optimize this corner case in lower_vec_perm (which would mean we'd need
to recurse on the newly created _4 = { 0, 0, 0, 0, 0, 0, 0, 0 };
whether it is supported or not).
2021-04-27 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/100239
* tree-vect-generic.c (lower_vec_perm): Don't accept constant
permutations with all indices from the first zero element as vec_shl.
* gcc.dg/pr100239.c: New test.
Diffstat (limited to 'gcc')
-rw-r--r-- | gcc/testsuite/gcc.dg/pr100239.c | 12 | ||||
-rw-r--r-- | gcc/tree-vect-generic.c | 2 |
2 files changed, 13 insertions, 1 deletions
diff --git a/gcc/testsuite/gcc.dg/pr100239.c b/gcc/testsuite/gcc.dg/pr100239.c
new file mode 100644
index 0000000..1ade810
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr100239.c
@@ -0,0 +1,12 @@
+/* PR tree-optimization/100239 */
+/* { dg-do compile } */
+/* { dg-options "-O0" } */
+
+typedef short __attribute__((__vector_size__ (8 * sizeof (short)))) V;
+V v, w;
+
+void
+foo (void)
+{
+  w = __builtin_shuffle (v != v, 0 < (V) {}, (V) {192} >> 5);
+}
diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index 751f181..5cc32c4 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -1563,7 +1563,7 @@ lower_vec_perm (gimple_stmt_iterator *gsi)
                                               elements + i - first)
                      : maybe_ge (poly_uint64 (indices[i]), elements))
               break;
-          if (i == elements)
+          if (first && i == elements)
             {
               gimple_assign_set_rhs3 (stmt, mask);
               update_stmt (stmt);