author | Kyrylo Tkachov <ktkachov@nvidia.com> | 2025-07-03 09:45:02 -0700 |
---|---|---|
committer | Kyrylo Tkachov <ktkachov@nvidia.com> | 2025-07-11 16:09:40 +0200 |
commit | 1b7bcac0327ccd84f1966c748f4d1aedef64a9c5 (patch) | |
tree | 07f2cffc4ba78f6cb1879727fe514ab145192cd9 /gcc/testsuite | |
parent | 4da7ba86179ffe27956c0ae0191ad9c4a7724443 (diff) | |
aarch64: Handle DImode BCAX operations
To handle DImode BCAX operations we want to do them on the SIMD side only if
the incoming arguments don't require a cross-bank move.
This means we need to split back the combination to separate GP BIC+EOR
instructions if the operands are expected to be in GP regs through reload.
The split happens pre-reload if we already know that the destination will be
a GP reg. Otherwise, if reload decides to use the "=r,r" alternative we ensure
operand 0 is early-clobber.
This scheme is similar to how we handle the BSL operations elsewhere in
aarch64-simd.md.
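
To illustrate why the early-clobber matters, here is a small host-side model of
the two-instruction split. It is only a sketch and not the code in the patch:
the register file, the helper names (bcax_ref, bcax_split_regs,
split_is_correct) and the assumption that the BIC result is written directly
into the destination register are all hypothetical, chosen to show the hazard.

```c
#include <assert.h>
#include <stdint.h>

/* Reference semantics, matching the BCAX macro used in the testcase.  */
static uint64_t
bcax_ref (uint64_t a, uint64_t b, uint64_t c)
{
  return a ^ (b & ~c);
}

/* Model of the BIC+EOR split on a tiny register file.
   ASSUMPTION: the intermediate BIC result is written straight into the
   destination register, as a split with no spare scratch would do.  */
static uint64_t
bcax_split_regs (uint64_t regs[4], int dst, int ra, int rb, int rc)
{
  regs[dst] = regs[rb] & ~regs[rc];  /* bic dst, rb, rc  */
  regs[dst] = regs[dst] ^ regs[ra];  /* eor dst, dst, ra  */
  return regs[dst];
}

/* Return 1 if the split matches the reference for this register
   allocation, 0 otherwise.  The input values are arbitrary.  */
static int
split_is_correct (int dst, int ra, int rb, int rc)
{
  uint64_t regs[4] = { 0x00ff00ff00ff00ffULL, 0x0f0f0f0f0f0f0f0fULL,
                       0x3333333333333333ULL, 0 };
  uint64_t expected = bcax_ref (regs[ra], regs[rb], regs[rc]);
  return bcax_split_regs (regs, dst, ra, rb, rc) == expected;
}
```

In this model split_is_correct (3, 0, 1, 2) holds, while
split_is_correct (0, 0, 1, 2) does not: when the destination shares a register
with the XOR input, the BIC overwrites that input before the EOR reads it.
Marking operand 0 as early-clobber keeps the register allocator from choosing
such an overlap.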
Thus, for the functions:
uint64_t bcax_d_gp (uint64_t a, uint64_t b, uint64_t c) { return BCAX (a, b, c); }
uint64x1_t bcax_d (uint64x1_t a, uint64x1_t b, uint64x1_t c) { return BCAX (a, b, c); }
we now generate the desired:
bcax_d_gp:
        bic     x1, x1, x2
        eor     x0, x1, x0
        ret

bcax_d:
        bcax    v0.16b, v0.16b, v1.16b, v2.16b
        ret
When the inputs are in SIMD regs we use BCAX and when they are in GP regs we
don't force them to SIMD with extra moves.
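
As a sanity check on the equivalence of the two sequences, the following
host-side sketch compares the single-operation form against the BIC-then-EOR
form on pseudo-random inputs. The helper names (bcax_fused, bic64,
bcax_two_step, forms_agree) and the xorshift generator are illustrative
assumptions and are not part of the patch or the testsuite.

```c
#include <assert.h>
#include <stdint.h>

#define BCAX(x,y,z) ((x) ^ ((y) & ~(z)))

/* What a single 64-bit BCAX computes.  */
static uint64_t
bcax_fused (uint64_t a, uint64_t b, uint64_t c)
{
  return BCAX (a, b, c);
}

/* BIC: bitwise AND of X with the complement of Y.  */
static uint64_t
bic64 (uint64_t x, uint64_t y)
{
  return x & ~y;
}

/* The GP-register sequence: BIC followed by EOR.  */
static uint64_t
bcax_two_step (uint64_t a, uint64_t b, uint64_t c)
{
  uint64_t t = bic64 (b, c);  /* bic x1, x1, x2  */
  return t ^ a;               /* eor x0, x1, x0  */
}

/* One step of a xorshift64 generator, used only to vary the inputs.  */
static uint64_t
xorshift64 (uint64_t x)
{
  x ^= x << 13;
  x ^= x >> 7;
  x ^= x << 17;
  return x;
}

/* Return 1 if both forms agree on ITERATIONS pseudo-random triples.  */
static int
forms_agree (unsigned iterations)
{
  uint64_t s = 0x9e3779b97f4a7c15ULL;
  for (unsigned i = 0; i < iterations; i++)
    {
      uint64_t a = s = xorshift64 (s);
      uint64_t b = s = xorshift64 (s);
      uint64_t c = s = xorshift64 (s);
      if (bcax_fused (a, b, c) != bcax_two_step (a, b, c))
        return 0;
    }
  return 1;
}
```

Calling forms_agree (1000) should return 1. The check runs on any host since it
uses only plain 64-bit integer arithmetic and no SIMD intrinsics.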
Bootstrapped and tested on aarch64-none-linux-gnu.
Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
gcc/

	* config/aarch64/aarch64-simd.md (*bcaxqdi4): New
	define_insn_and_split.

gcc/testsuite/

	* gcc.target/aarch64/simd/bcax_d.c: Add tests for DImode arguments.
Diffstat (limited to 'gcc/testsuite')
-rw-r--r-- | gcc/testsuite/gcc.target/aarch64/simd/bcax_d.c | 6 |
1 files changed, 5 insertions, 1 deletions
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/bcax_d.c b/gcc/testsuite/gcc.target/aarch64/simd/bcax_d.c
index d68f0e1..a7640c3 100644
--- a/gcc/testsuite/gcc.target/aarch64/simd/bcax_d.c
+++ b/gcc/testsuite/gcc.target/aarch64/simd/bcax_d.c
@@ -7,9 +7,13 @@
 
 #define BCAX(x,y,z) ((x) ^ ((y) & ~(z)))
 
+/* When the inputs come from GP regs don't form a BCAX. */
+uint64_t bcax_d_gp (uint64_t a, uint64_t b, uint64_t c) { return BCAX (a, b, c); }
+
+uint64x1_t bcax_d (uint64x1_t a, uint64x1_t b, uint64x1_t c) { return BCAX (a, b, c); }
 uint32x2_t bcax_s (uint32x2_t a, uint32x2_t b, uint32x2_t c) { return BCAX (a, b, c); }
 uint16x4_t bcax_h (uint16x4_t a, uint16x4_t b, uint16x4_t c) { return BCAX (a, b, c); }
 uint8x8_t bcax_b (uint8x8_t a, uint8x8_t b, uint8x8_t c) { return BCAX (a, b, c); }
 
-/* { dg-final { scan-assembler-times {bcax\tv0.16b, v0.16b, v1.16b, v2.16b} 3 } } */
+/* { dg-final { scan-assembler-times {bcax\tv0.16b, v0.16b, v1.16b, v2.16b} 4 } } */