author | Jennifer Schmitz <jschmitz@nvidia.com> | 2025-02-13 04:34:30 -0800 |
---|---|---|
committer | Jennifer Schmitz <jschmitz@nvidia.com> | 2025-04-30 11:05:11 +0200 |
commit | 83bb288faa39a0bf5ce2d62e21a090a130d8dda4 (patch) | |
tree | dd7482b9b3986d6e280bedab72ec644e72eebf47 /libjava | |
parent | cc8b8c0b69200ab816a2626e29d91ac995f7438f (diff) | |
AArch64: Fold LD1/ST1 with ptrue to LDR/STR for 128-bit VLS
With -msve-vector-bits=128, SVE loads and stores (LD1 and ST1) with a
ptrue predicate can be replaced by Neon instructions (LDR and STR),
thus avoiding the predicate altogether. This also enables formation of
LDP/STP pairs.
For example, the test cases
svfloat64_t
ptrue_load (float64_t *x)
{
  svbool_t pg = svptrue_b64 ();
  return svld1_f64 (pg, x);
}
void
ptrue_store (float64_t *x, svfloat64_t data)
{
  svbool_t pg = svptrue_b64 ();
  svst1_f64 (pg, x, data);
}
were previously compiled to
(with -O2 -march=armv8.2-a+sve -msve-vector-bits=128):
ptrue_load:
        ptrue   p3.b, vl16
        ld1d    z0.d, p3/z, [x0]
        ret
ptrue_store:
        ptrue   p3.b, vl16
        st1d    z0.d, p3, [x0]
        ret
Now they are compiled to:
ptrue_load:
        ldr     q0, [x0]
        ret
ptrue_store:
        str     q0, [x0]
        ret
The implementation includes the if-statement
  if (known_eq (GET_MODE_SIZE (mode), 16)
      && aarch64_classify_vector_mode (mode) == VEC_SVE_DATA)
which checks for 128-bit VLS and excludes partial modes, whose
mode size is below 128 bits (e.g. VNx2QI).
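As a rough illustration of why this condition singles out 128-bit VLS,
the following self-contained C sketch models it outside of GCC.
Everything in it is a hypothetical stand-in: struct poly_size,
known_eq_model, struct mode_model, the flag values, and folds_to_ldr_str
are invented for the example and are not GCC's own definitions. It
assumes a mode size can be written as c0 + c1 * x bytes for a runtime
value x, so a length-agnostic SVE vector of 16 + 16x bytes can never be
known to equal 16.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a compile-time size: c0 + c1 * x bytes, where
   x >= 0 is only known at run time (x plays no role when c1 == 0).  */
struct poly_size { int c0; int c1; };

/* Hypothetical classification flags; the values are illustrative only.  */
enum { VEC_ADVSIMD = 1, VEC_SVE_DATA = 2, VEC_SVE_PRED = 4, VEC_PARTIAL = 8 };

struct mode_model
{
  const char *name;
  struct poly_size size;
  unsigned flags;
};

/* True only if the size equals C for every possible x.  */
static bool
known_eq_model (struct poly_size p, int c)
{
  return p.c1 == 0 && p.c0 == c;
}

/* Model of the condition quoted above: exactly 16 bytes and a full
   (non-partial) SVE data mode.  */
static bool
folds_to_ldr_str (const struct mode_model *m)
{
  return known_eq_model (m->size, 16) && m->flags == VEC_SVE_DATA;
}

/* The same mode under different vector-length settings, plus one
   partial mode.  */
static void
check_examples (void)
{
  struct mode_model vls128 = { "VNx2DF, 128-bit VLS", { 16, 0 }, VEC_SVE_DATA };
  struct mode_model vls256 = { "VNx2DF, 256-bit VLS", { 32, 0 }, VEC_SVE_DATA };
  struct mode_model vla = { "VNx2DF, length-agnostic", { 16, 16 }, VEC_SVE_DATA };
  struct mode_model partial
    = { "VNx2QI, 128-bit VLS", { 2, 0 }, VEC_SVE_DATA | VEC_PARTIAL };

  assert (folds_to_ldr_str (&vls128));
  assert (!folds_to_ldr_str (&vls256));
  assert (!folds_to_ldr_str (&vla));
  assert (!folds_to_ldr_str (&partial));
}
```

Calling check_examples () from a main function runs the assertions; in
this model only the full 16-byte SVE data mode passes the check.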
The patch was bootstrapped and tested on aarch64-linux-gnu with no regressions.
OK for mainline?
Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
* config/aarch64/aarch64.cc (aarch64_emit_sve_pred_move):
Fold LD1/ST1 with ptrue to LDR/STR for 128-bit VLS.
gcc/testsuite/
* gcc.target/aarch64/sve/ldst_ptrue_128_to_neon.c: New test.
* gcc.target/aarch64/sve/cond_arith_6.c: Adjust expected outcome.
* gcc.target/aarch64/sve/pcs/return_4_128.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_5_128.c: Likewise.
* gcc.target/aarch64/sve/pcs/struct_3_128.c: Likewise.