riscv-gnu-toolchain/gcc.git

author	Richard Sandiford <richard.sandiford@arm.com>	2025-07-10 16:54:45 +0100
committer	Richard Sandiford <richard.sandiford@arm.com>	2025-07-10 16:54:45 +0100
commit	e7f049471c6caf22c65ac48773d864fca7a4cdc4 (patch)
tree	e374b4bb7ed781375336e86ee9e961f848a172ca /libjava/java/io/ObjectStreamField.h
parent	3f59a1cac717f8af84e884e9ec0f6ef14e102e6e (diff)

aarch64: Fix LD1Q and ST1Q failures for big-endian

LD1Q gathers and ST1Q scatters are unusual in that they operate
on 128-bit blocks (effectively VNx1TI).  However, we don't have
modes or ACLE types for 128-bit integers, and 128-bit integers
are not the intended use case.  Instead, the instructions are
intended to be used in "hybrid VLA" operations, where each 128-bit
block is an Advanced SIMD vector.

The normal SVE modes therefore capture the intended use case better
than VNx1TI would.  For example, VNx2DI is effectively N copies
of V2DI, VNx4SI N copies of V4SI, etc.

Since there is only one LD1Q instruction and one ST1Q instruction,
the ACLE support used a single pattern for each, with the loaded or
stored data having mode VNx2DI.

The ST1Q pattern was generated by:

    rtx data = e.args.last ();
    e.args.last () = force_lowpart_subreg (VNx2DImode, data, GET_MODE (data));
    e.prepare_gather_address_operands (1, false);
    return e.use_exact_insn (CODE_FOR_aarch64_scatter_st1q);

where the force_lowpart_subreg bitcast the stored data to VNx2DI.
But such subregs require an element reverse on big-endian targets
(see the comment at the head of aarch64-sve.md), which wasn't the
intention.  The code should have used aarch64_sve_reinterpret instead.

The LD1Q pattern was used as follows:

    e.prepare_gather_address_operands (1, false);
    return e.use_exact_insn (CODE_FOR_aarch64_gather_ld1q);

which always returns a VNx2DI value, leaving the caller to bitcast
that to the correct mode.  That bitcast again uses subregs and has
the same problem as above.

However, for the reasons explained in the comment, using
aarch64_sve_reinterpret does not work well for LD1Q.  The patch
instead parameterises the LD1Q based on the required data mode.

gcc/
	* config/aarch64/aarch64-sve2.md (aarch64_gather_ld1q): Replace with...
	(@aarch64_gather_ld1q<mode>): ...this, parameterizing based on mode.
	* config/aarch64/aarch64-sve-builtins-sve2.cc
	(svld1q_gather_impl::expand): Update accordingly.
	(svst1q_scatter_impl::expand): Use aarch64_sve_reinterpret
	instead of force_lowpart_subreg.
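
The difference between a mode-changing subreg and a register-level
reinterpret on big-endian can be illustrated with a small standalone
model.  This is only a sketch, not GCC code: it assumes that a
mode-changing subreg behaves like storing the register to memory in the
old mode and reloading it in the new mode, with each element laid out
most significant byte first, and that a reinterpret leaves the register
bits untouched.  The type Block and the names subreg_model and
reinterpret_model are hypothetical and exist only for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// One 128-bit block of a vector register, modelled as 16 bytes.
// Index 0 holds the least significant byte of lane 0.
using Block = std::array<std::uint8_t, 16>;

// Build a block from four 32-bit lanes, lane 0 in the low bits.
Block from_u32_lanes(const std::array<std::uint32_t, 4> &lanes) {
    Block reg{};
    for (int lane = 0; lane < 4; ++lane)
        for (int byte = 0; byte < 4; ++byte)
            reg[lane * 4 + byte] =
                static_cast<std::uint8_t>(lanes[lane] >> (8 * byte));
    return reg;
}

// Read a block back as four 32-bit lanes.
std::array<std::uint32_t, 4> to_u32_lanes(const Block &reg) {
    std::array<std::uint32_t, 4> lanes{};
    for (int lane = 0; lane < 4; ++lane)
        for (int byte = 0; byte < 4; ++byte)
            lanes[lane] |= static_cast<std::uint32_t>(reg[lane * 4 + byte])
                           << (8 * byte);
    return lanes;
}

// Reverse the bytes inside each element of elt_bytes bytes.  In this
// model that is exactly the mapping between register bytes and
// big-endian memory bytes, in either direction.
Block swap_bytes_within_elements(const Block &in, int elt_bytes) {
    assert(elt_bytes > 0 && 16 % elt_bytes == 0);
    Block out{};
    for (int elt = 0; elt < 16 / elt_bytes; ++elt)
        for (int byte = 0; byte < elt_bytes; ++byte)
            out[elt * elt_bytes + byte] =
                in[elt * elt_bytes + (elt_bytes - 1 - byte)];
    return out;
}

// Assumed subreg semantics on big-endian: store with the old element
// size, then load with the new element size.
Block subreg_model(const Block &reg, int from_elt_bytes, int to_elt_bytes) {
    Block mem = swap_bytes_within_elements(reg, from_elt_bytes);  // store
    return swap_bytes_within_elements(mem, to_elt_bytes);         // load
}

// Assumed reinterpret semantics: the register bits are unchanged.
Block reinterpret_model(const Block &reg) {
    return reg;
}
```

In this model, starting from 32-bit lanes {1, 2, 3, 4},
subreg_model (reg, 4, 8) produces register contents that read back as
{2, 1, 4, 3}: the two 32-bit elements inside each 64-bit container are
swapped, which is the kind of element reverse the message describes.
reinterpret_model leaves the lanes as {1, 2, 3, 4}.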
