author | Richard Sandiford <richard.sandiford@arm.com> | 2024-01-25 12:03:17 +0000 |
---|---|---|
committer | Richard Sandiford <richard.sandiford@arm.com> | 2024-01-25 12:03:17 +0000 |
commit | f251bbfec9174169510b2dec14b9bf763e7b77af (patch) | |
tree | dc10fa6f862e6900de4bd267f9b0eb1d2c173d1a /gcc/fold-const.cc | |
parent | c6c2a1d79eb333a00124bf67820a7f405d0d8641 (diff) | |
aarch64: Avoid paradoxical subregs in UXTL split [PR113485]

g:74e3e839ab2d36841320 handled the UXTL{,2}-ZIP[12] optimisation
in split1. The UXTL input is a 64-bit vector of N-bit elements
and the result is a 128-bit vector of 2N-bit elements. The
corresponding ZIP1 operates on 128-bit vectors of N-bit elements.

This meant that the ZIP1 input had to be a 128-bit paradoxical subreg
of the 64-bit UXTL input. In the PRs, it wasn't possible to generate
this subreg because the inputs were already subregs of a x[234]
structure of 64-bit vectors.

I don't think the same thing can happen for UXTL2->ZIP2 because
the UXTL2 input is a 128-bit vector rather than a 64-bit vector.

It isn't really necessary for ZIP1 to take 128-bit inputs,
since the upper 64 bits are ignored. This patch therefore adds
a pattern for 64-bit → 128-bit ZIP1s.

In principle, we should probably use this form for all ZIP1s.
But in practice, that creates an awkward special case, and
would be quite invasive for stage 4.
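
The equivalence this relies on can be illustrated with a small scalar model. This is only a sketch, not code from the patch: the helper names (`model_uxtl`, `model_zip1`, `lane16`, `zip1_matches_uxtl`) are hypothetical, it fixes N = 8, and it assumes a little-endian lane layout. It checks that interleaving the low 64 bits of an input with zero gives the same halfwords as zero-extending them, whatever the upper 64 bits of the ZIP1 input contain.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of UXTL Vd.8H, Vn.8B: zero-extend eight bytes to halfwords. */
static void model_uxtl(const uint8_t in[8], uint16_t out[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = in[i];
}

/* Scalar model of ZIP1 Vd.16B, Vn.16B, Vm.16B: interleave the low eight
   bytes of each input.  Bytes 8..15 of a and b are never read. */
static void model_zip1(const uint8_t a[16], const uint8_t b[16],
                       uint8_t out[16])
{
    for (int i = 0; i < 8; i++) {
        out[2 * i] = a[i];
        out[2 * i + 1] = b[i];
    }
}

/* Read halfword lane i of a 16-byte vector (assumed little-endian layout). */
static uint16_t lane16(const uint8_t v[16], int i)
{
    return (uint16_t)(v[2 * i] | (v[2 * i + 1] << 8));
}

/* Return 1 if ZIP1 against zero matches UXTL, with the upper 64 bits of
   the ZIP1 input filled with an arbitrary byte (hypothetical stand-in for
   the undefined upper half of a paradoxical subreg). */
static int zip1_matches_uxtl(const uint8_t in[8], uint8_t upper_fill)
{
    uint8_t wide[16], zero[16] = {0}, zipped[16];
    uint16_t extended[8];

    memcpy(wide, in, 8);
    memset(wide + 8, upper_fill, 8);

    model_uxtl(in, extended);
    model_zip1(wide, zero, zipped);

    for (int i = 0; i < 8; i++)
        if (lane16(zipped, i) != extended[i])
            return 0;
    return 1;
}
```

Because the model never reads bytes 8..15 of the ZIP1 inputs, a form of ZIP1 that takes 64-bit inputs directly loses nothing, which is the motivation for the 64-bit → 128-bit pattern.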
gcc/
PR target/113485
* config/aarch64/aarch64-simd.md (aarch64_zip1<mode>_low): New
pattern.
(<optab><Vnarrowq><mode>2): Use it instead of generating a
paradoxical subreg for the input.
gcc/testsuite/
PR target/113485
* gcc.target/aarch64/pr113485.c: New test.
* gcc.target/aarch64/pr113573.c: Likewise.