author Jennifer Schmitz <jschmitz@nvidia.com> 2025-03-11 02:18:46 -0700
committer Jennifer Schmitz <jschmitz@nvidia.com> 2025-05-09 09:14:01 +0200
commit 3d7e67ac0d9acc43927c2fb7c358924c84d90f37
tree 9249d33fc6a2a82f5c4bf297f77231589f57717b /libgomp/testsuite/libgomp.oacc-c-c++-common/mapping-1.c
parent 86a7642ef5979ff1cf28f4b3eda73dae8f0e0ef2
AArch64: Optimize SVE loads/stores with ptrue predicates to unpredicated instructions.
SVE loads and stores where the predicate is all-true can be optimized to
unpredicated instructions. For example,

    svuint8_t foo (uint8_t *x)
    {
      return svld1 (svptrue_b8 (), x);
    }

was compiled to:

    foo:
            ptrue   p3.b, all
            ld1b    z0.b, p3/z, [x0]
            ret

but can be compiled to:

    foo:
            ldr     z0, [x0]
            ret

Late_combine2 had already been trying to do this, but was missing the
instruction:

    (set (reg/i:VNx16QI 32 v0)
        (unspec:VNx16QI [
                (const_vector:VNx16BI repeat [
                        (const_int 1 [0x1])
                    ])
                (mem:VNx16QI (reg/f:DI 0 x0 [orig:106 x ] [106])
                    [0 MEM <svuint8_t> [(unsigned char *)x_2(D)]+0 S[16, 16] A8])
            ] UNSPEC_PRED_X))

This patch adds a new define_insn_and_split that matches the missing
instruction and splits it to an unpredicated load/store. Because LDR
offers fewer addressing modes than LD1[BHWD], the pattern is
guarded under reload_completed to only apply the transform once the
address modes have been chosen during RA.

The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
OK for mainline?

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>

gcc/
	* config/aarch64/aarch64-sve.md (*aarch64_sve_ptrue<mode>_ldr_str):
	Add define_insn_and_split to fold predicated SVE loads/stores with
	ptrue predicates to unpredicated instructions.

gcc/testsuite/
	* gcc.target/aarch64/sve/ptrue_ldr_str.c: New test.
	* gcc.target/aarch64/sve/acle/general/attributes_6.c: Adjust
	expected outcome.
	* gcc.target/aarch64/sve/cost_model_14.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/cost_model_4.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/cost_model_5.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/cost_model_6.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/cost_model_7.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/pcs/varargs_2_f16.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/pcs/varargs_2_f32.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/pcs/varargs_2_f64.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/pcs/varargs_2_mf8.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/pcs/varargs_2_s16.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/pcs/varargs_2_s32.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/pcs/varargs_2_s64.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/pcs/varargs_2_s8.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/pcs/varargs_2_u16.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/pcs/varargs_2_u32.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/pcs/varargs_2_u64.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/pcs/varargs_2_u8.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/peel_ind_2.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/single_1.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/single_2.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/single_3.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/single_4.c: Adjust expected outcome.
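The commit message shows only the load case. As an illustration, the following sketch exercises both a load and a store under an all-true predicate, which is the kind of source the new pattern is meant to catch. It is not taken from the patch or its test file: the names copy_vector, vector_bytes and roundtrip_ok are hypothetical, and the non-SVE fallback branch exists only so the file builds on other targets. Whether the SVE branch actually assembles to plain LDR/STR depends on using a GCC that contains this change and on the options used (for example -O2 with SVE enabled), so the comments describe candidates, not guaranteed output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>

/* Number of bytes in one SVE vector on the running hardware.  */
static size_t vector_bytes (void)
{
  return (size_t) svcntb ();
}

/* Copy one full vector of bytes.  Both accesses use an all-true
   predicate, so they are candidates for the unpredicated forms.  */
static void copy_vector (uint8_t *dst, const uint8_t *src)
{
  svbool_t all = svptrue_b8 ();
  svuint8_t v = svld1_u8 (all, src);   /* load candidate (LDR)  */
  svst1_u8 (all, dst, v);              /* store candidate (STR) */
}
#else
/* Assumed fallback for targets without SVE: treat 16 bytes as one
   "vector" so the example still builds and runs.  */
static size_t vector_bytes (void)
{
  return 16;
}

static void copy_vector (uint8_t *dst, const uint8_t *src)
{
  memcpy (dst, src, 16);
}
#endif

/* Copy one vector and check that the copied bytes match the source.  */
static int roundtrip_ok (void)
{
  uint8_t src[256], dst[256];   /* 256 bytes is the largest SVE vector */
  size_t n = vector_bytes ();
  assert (n <= sizeof src);
  for (size_t i = 0; i < sizeof src; i++)
    {
      src[i] = (uint8_t) i;
      dst[i] = 0;
    }
  copy_vector (dst, src);
  return memcmp (dst, src, n) == 0;
}
```

The behaviour of copy_vector is the same whether the compiler emits the predicated or the unpredicated instructions; the transform only affects the generated code, so a check like roundtrip_ok passes either way.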