author Richard Sandiford <richard.sandiford@linaro.org> 2018-02-01 11:04:00 +0000
committer Richard Sandiford <rsandifo@gcc.gnu.org> 2018-02-01 11:04:00 +0000
commit 8179efe00e04285184112de7dbb977a75852197c
tree 5a7162b0e53a7fb9c7dbbc86cf05d9a3619cbda2 /gcc/tree-data-ref.c
parent 947b137212d16d432eec201fe7f800dfdb481203
[AArch64] Prefer LD1RQ for big-endian SVE

This patch deals with cases in which a CONST_VECTOR contains a
repeating bit pattern that is wider than one element but narrower
than 128 bits.  The current code:

* treats the repeating pattern as a single element
* uses the associated LD1R to load and replicate it (such as LD1RD
  for 64-bit patterns)
* uses a subreg to cast the result back to the original vector type

The problem is that for big-endian targets, the final cast is
effectively a form of element reverse.  E.g. say we're using LD1RD to
load 16-bit elements, with h being the high parts and l being the low
parts:

                        +-----+-----+-----+-----+-----+----
                  lanes |  0  |  1  |  2  |  3  |  4  | ...
                        +-----+-----+-----+-----+-----+----
           memory bytes |h0 l0 h1 l1 h2 l2 h3 l3 h0 l0 ....
                        +----------------------------------
                          V  V  V  V  V  V  V  V
              ----------+-----------------------+
  register       ....   |           0           |
   after      ----------+-----------------------+  lsb
   LD1RD      .... h3 l3 h0 l0 h1 l1 h2 l2 h3 l3|
              ----------------------------------+

              ----+-----+-----+-----+-----+-----+
  expected    ... |  4  |  3  |  2  |  1  |  0  |
  register    ----+-----+-----+-----+-----+-----+  lsb
  contents    .... h0 l0 h3 l3 h2 l2 h1 l1 h0 l0|
              ----------------------------------+

A later patch fixes the handling of general subregs to account for
this, but it means that we need to do a REV instruction after the
load.  It seems better to use LD1RQ[BHW] on a 128-bit pattern instead,
since that gets the endianness right without a separate fixup
instruction.

2018-02-01  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
    * config/aarch64/aarch64.c (aarch64_expand_sve_const_vector): Prefer
    the TImode handling for big-endian targets.

gcc/testsuite/
    * gcc.target/aarch64/sve/slp_2.c: Expect LD1RQ to be used instead
    of LD1R[HWD] for multi-element constants on big-endian targets.
    * gcc.target/aarch64/sve/slp_3.c: Likewise.
    * gcc.target/aarch64/sve/slp_4.c: Likewise.

Reviewed-by: James Greenhalgh <james.greenhalgh@arm.com>

From-SVN: r257288
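
The lane reversal described in the commit message can be reproduced
with a small software model.  The sketch below is illustrative only and
is not GCC or Arm code: the helper names (ld1r, ld1rq, lanes) are
hypothetical, a vector register is modelled as a Python integer whose
bit 0 is the lsb, and the subreg cast is assumed to be a plain
reinterpretation of the register bits.  Memory is laid out big-endian,
as in the diagram above.

```python
def ld1r(memory, elem_bits, vector_bits):
    """Model of LD1R on big-endian: load one element, replicate it."""
    value = int.from_bytes(memory[: elem_bits // 8], "big")
    reg = 0
    for lane in range(vector_bits // elem_bits):
        reg |= value << (lane * elem_bits)
    return reg


def ld1rq(memory, elem_bits, vector_bits):
    """Model of LD1RQ on big-endian: load 128 bits element by element
    (element i goes to lane i), then replicate the quadword."""
    elem_bytes = elem_bits // 8
    quad = 0
    for lane in range(128 // elem_bits):
        chunk = memory[lane * elem_bytes : (lane + 1) * elem_bytes]
        quad |= int.from_bytes(chunk, "big") << (lane * elem_bits)
    reg = 0
    for q in range(vector_bits // 128):
        reg |= quad << (q * 128)
    return reg


def lanes(reg, elem_bits, vector_bits):
    """Reinterpret the register bits as lanes of elem_bits, lane 0 at the lsb."""
    mask = (1 << elem_bits) - 1
    return [(reg >> (i * elem_bits)) & mask
            for i in range(vector_bits // elem_bits)]


# Four 16-bit elements forming a 64-bit repeating pattern (values are
# arbitrary); the pattern is stored twice so that 128 bits are available.
pattern = [0x1110, 0x2120, 0x3130, 0x4140]
memory = b"".join(e.to_bytes(2, "big") for e in pattern * 2)
VECTOR_BITS = 256  # assumed vector length for the sketch

via_ld1rd = lanes(ld1r(memory, 64, VECTOR_BITS), 16, VECTOR_BITS)
via_ld1rqh = lanes(ld1rq(memory, 16, VECTOR_BITS), 16, VECTOR_BITS)
```

In this model via_ld1rd comes out with each group of four 16-bit lanes
reversed, which is the case that would need a REV fixup, while
via_ld1rqh matches the pattern in lane order.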