author | Richard Sandiford <richard.sandiford@arm.com> | 2025-07-10 16:54:45 +0100 |
---|---|---|
committer | Richard Sandiford <richard.sandiford@arm.com> | 2025-07-10 16:54:45 +0100 |
commit | e7f049471c6caf22c65ac48773d864fca7a4cdc4 (patch) | |
tree | e374b4bb7ed781375336e86ee9e961f848a172ca /libjava/java/io/ObjectStreamField.h | |
parent | 3f59a1cac717f8af84e884e9ec0f6ef14e102e6e (diff) | |
LD1Q gathers and ST1Q scatters are unusual in that they operate
on 128-bit blocks (effectively VNx1TI). However, we don't have
modes or ACLE types for 128-bit integers, and 128-bit integers
are not the intended use case. Instead, the instructions are
intended to be used in "hybrid VLA" operations, where each 128-bit
block is an Advanced SIMD vector.
The normal SVE modes therefore capture the intended use case better
than VNx1TI would. For example, VNx2DI is effectively N copies
of V2DI, VNx4SI N copies of V4SI, etc.
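The block structure described above can be sketched with a little arithmetic. This is only an illustrative model, not GCC code: `kBlockBits`, `lanes_per_block`, `total_lanes` and `block_of_lane` are hypothetical names, and `blocks` stands for the runtime number N of 128-bit blocks in the vector.

```cpp
#include <cstddef>

// Illustrative model only: each SVE vector is treated as N 128-bit blocks,
// and each block is viewed as one Advanced SIMD vector.
constexpr std::size_t kBlockBits = 128;

// Number of lanes that fit in one 128-bit block for a given element width.
constexpr std::size_t lanes_per_block(std::size_t element_bits) {
  return kBlockBits / element_bits;
}

// Total number of lanes in a vector made of `blocks` 128-bit blocks.
constexpr std::size_t total_lanes(std::size_t blocks, std::size_t element_bits) {
  return blocks * lanes_per_block(element_bits);
}

// Index of the 128-bit block that contains a given lane.
constexpr std::size_t block_of_lane(std::size_t lane, std::size_t element_bits) {
  return lane / lanes_per_block(element_bits);
}

// VNx2DI: two 64-bit lanes per block, matching V2DI.
static_assert(lanes_per_block(64) == 2, "VNx2DI block should look like V2DI");
// VNx4SI: four 32-bit lanes per block, matching V4SI.
static_assert(lanes_per_block(32) == 4, "VNx4SI block should look like V4SI");
```

With N = 4 blocks, for instance, the model gives 8 lanes for 64-bit elements and 16 lanes for 32-bit elements, and lane 5 of the 32-bit view falls in block 1.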
Since there is only one LD1Q instruction and one ST1Q instruction,
the ACLE support used a single pattern for each, with the loaded or
stored data having mode VNx2DI. The ST1Q pattern was generated by:
  rtx data = e.args.last ();
  e.args.last () = force_lowpart_subreg (VNx2DImode, data, GET_MODE (data));
  e.prepare_gather_address_operands (1, false);
  return e.use_exact_insn (CODE_FOR_aarch64_scatter_st1q);
where the force_lowpart_subreg bitcast the stored data to VNx2DI.
But such subregs require an element reverse on big-endian targets
(see the comment at the head of aarch64-sve.md), which wasn't the
intention. The code should have used aarch64_sve_reinterpret instead.
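The difference between the two kinds of bitcast can be sketched with a simplified model of one 128-bit block. The model assumes that a subreg between element sizes behaves like storing the vector and reloading it with the other element size, while a reinterpret leaves the register bits in place; that is a reading of the comment referred to above, not a quotation of it. All names below (`reinterpret_in_register`, `pun_through_memory`, `check_model`) are hypothetical and none of this is GCC code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Lanes32 = std::array<std::uint32_t, 4>;  // one block viewed as 4 x 32-bit
using Lanes64 = std::array<std::uint64_t, 2>;  // one block viewed as 2 x 64-bit

// Register-level reinterpret: the bits stay where they are, so 64-bit lane j
// is formed from 32-bit lanes 2j (low half) and 2j+1 (high half).
Lanes64 reinterpret_in_register(const Lanes32 &v) {
  Lanes64 out{};
  for (int j = 0; j < 2; ++j)
    out[j] = std::uint64_t(v[2 * j]) | (std::uint64_t(v[2 * j + 1]) << 32);
  return out;
}

// Memory-order pun: store the 32-bit lanes element by element, then reload
// the same 16 bytes as 64-bit lanes, using the given byte order for both.
Lanes64 pun_through_memory(const Lanes32 &v, bool big_endian) {
  std::array<std::uint8_t, 16> mem{};
  for (int i = 0; i < 4; ++i)
    for (int b = 0; b < 4; ++b) {
      int shift = big_endian ? 8 * (3 - b) : 8 * b;
      mem[4 * i + b] = std::uint8_t(v[i] >> shift);
    }
  Lanes64 out{};
  for (int j = 0; j < 2; ++j)
    for (int b = 0; b < 8; ++b) {
      int shift = big_endian ? 8 * (7 - b) : 8 * b;
      out[j] |= std::uint64_t(mem[8 * j + b]) << shift;
    }
  return out;
}

// Self-check: the two views agree for little-endian but differ for
// big-endian, where the 32-bit halves of each 64-bit lane end up swapped.
void check_model() {
  const Lanes32 v = {1, 2, 3, 4};
  const Lanes64 reg = reinterpret_in_register(v);
  assert(pun_through_memory(v, false) == reg);
  const Lanes64 be = pun_through_memory(v, true);
  assert(be != reg);
  assert(be[0] == ((std::uint64_t(v[0]) << 32) | v[1]));
}
```

In this model the big-endian memory pun needs an element reverse within each 64-bit container to match the register view. The ST1Q expansion did not want that reverse.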
The LD1Q pattern was used as follows:
  e.prepare_gather_address_operands (1, false);
  return e.use_exact_insn (CODE_FOR_aarch64_gather_ld1q);
which always returns a VNx2DI value, leaving the caller to bitcast
that to the correct mode. That bitcast again uses subregs and has
the same problem as above.
However, for the reasons explained in the comment, using
aarch64_sve_reinterpret does not work well for LD1Q. The patch
instead parameterises the LD1Q based on the required data mode.
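As a rough illustration of what parameterising by mode means: GCC's `@` prefix on a pattern name generates a `code_for_...` helper that takes the mode and returns the matching instruction code, so the caller no longer receives a fixed VNx2DI result that it has to bitcast. The toy model below imitates that lookup. The enum names, the function `code_for_gather_ld1q` and the set of modes are hypothetical stand-ins, not the generated GCC interface.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical subset of SVE data modes; the real iterator may cover more.
enum class Mode { VNx16QI, VNx8HI, VNx4SI, VNx2DI };

// Hypothetical instruction codes, one per instance of the parameterised pattern.
enum class InsnCode {
  gather_ld1q_vnx16qi,
  gather_ld1q_vnx8hi,
  gather_ld1q_vnx4si,
  gather_ld1q_vnx2di
};

// Toy counterpart of a generated code_for_* helper: pick the pattern
// instance whose result already has the mode the caller wants.
InsnCode code_for_gather_ld1q(Mode mode) {
  switch (mode) {
    case Mode::VNx16QI: return InsnCode::gather_ld1q_vnx16qi;
    case Mode::VNx8HI:  return InsnCode::gather_ld1q_vnx8hi;
    case Mode::VNx4SI:  return InsnCode::gather_ld1q_vnx4si;
    case Mode::VNx2DI:  return InsnCode::gather_ld1q_vnx2di;
  }
  throw std::logic_error("unhandled mode");
}

// Self-check: different data modes select different pattern instances.
void check_lookup() {
  assert(code_for_gather_ld1q(Mode::VNx4SI) == InsnCode::gather_ld1q_vnx4si);
  assert(code_for_gather_ld1q(Mode::VNx4SI) != code_for_gather_ld1q(Mode::VNx2DI));
}
```

Because the selected instance produces the required mode directly, no subreg is needed on the result in this sketch.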
gcc/
	* config/aarch64/aarch64-sve2.md (aarch64_gather_ld1q): Replace with...
	(@aarch64_gather_ld1q<mode>): ...this, parameterizing based on mode.
	* config/aarch64/aarch64-sve-builtins-sve2.cc
	(svld1q_gather_impl::expand): Update accordingly.
	(svst1q_scatter_impl::expand): Use aarch64_sve_reinterpret
	instead of force_lowpart_subreg.