author | Richard Sandiford <richard.sandiford@arm.com> | 2024-06-13 12:48:21 +0100 |
---|---|---|
committer | Richard Sandiford <richard.sandiford@arm.com> | 2024-06-13 12:48:21 +0100 |
commit | 0970ff46ba6330fc80e8736fc05b2eaeeae0b6a0 (patch) | |
tree | b97e9211988dd560d55a7d3fdfeacd1ce24bf87d /gcc/explow.cc | |
parent | 3dac1049c1211e6d06c2536b86445a6334c3866d (diff) | |
aarch64: Fix invalid nested subregs [PR115464]

The testcase extracts one arm_neon.h vector from a pair (one subreg)
and then reinterprets the result as an SVE vector (another subreg).
Each subreg makes sense individually, but we can't fold them together
into a single subreg: it's 32 bytes -> 16 bytes -> 16*N bytes,
but the interpretation of 32 bytes -> 16*N bytes depends on
whether N==1 or N>1.

Since the second subreg makes sense individually, simplify_subreg
should bail out rather than ICE on it. simplify_gen_subreg will
then do the same (because it already checks validate_subreg).
This leaves simplify_gen_subreg returning null, requiring the
caller to take appropriate action.

I think this is relatively likely to occur elsewhere, so the patch
adds a helper for forcing a subreg, allowing a temporary pseudo to
be created where necessary.

I'll follow up by using force_subreg in more places. This patch
is intended to be a minimal backportable fix for the PR.
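
The size argument in the first paragraph can be illustrated with a small standalone sketch. It is not GCC code: `ToyPolySize`, `toy_known_le` and `toy_ordered` are hypothetical names that only loosely imitate the idea of a size with a runtime-dependent component, under the assumption that an SVE vector is 16*N bytes for some N >= 1 that is unknown at compile time.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in (not GCC's poly_int) for a size in bytes that may depend
// on the runtime vector length: bytes = fixed + scaled * (N - 1), N >= 1.
struct ToyPolySize {
  std::uint64_t fixed;
  std::uint64_t scaled;

  // Size for one concrete choice of N.
  std::uint64_t at(std::uint64_t n) const {
    assert(n >= 1);
    return fixed + scaled * (n - 1);
  }
};

// True only if a <= b holds for every N >= 1.
bool toy_known_le(ToyPolySize a, ToyPolySize b) {
  return a.fixed <= b.fixed && a.scaled <= b.scaled;
}

// True if the two sizes compare the same way for every N >= 1.
bool toy_ordered(ToyPolySize a, ToyPolySize b) {
  return toy_known_le(a, b) || toy_known_le(b, a);
}

// The three sizes from the commit message.
const ToyPolySize neon_pair{32, 0};    // always 32 bytes
const ToyPolySize neon_vector{16, 0};  // always 16 bytes
const ToyPolySize sve_vector{16, 16};  // 16*N bytes
```

In this model each individual step (32 -> 16 and 16 -> 16*N) compares sizes that are ordered for every N, whereas the folded step (32 -> 16*N) does not: the SVE vector is smaller than the pair when N==1 and at least as large when N>1.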
gcc/
PR target/115464
* simplify-rtx.cc (simplify_context::simplify_subreg): Don't try
to fold two subregs together if their relationship isn't known
at compile time.
* explow.h (force_subreg): Declare.
* explow.cc (force_subreg): New function.
* config/aarch64/aarch64-sve-builtins-base.cc
(svset_neonq_impl::expand): Use it instead of simplify_gen_subreg.
gcc/testsuite/
PR target/115464
* gcc.target/aarch64/sve/acle/general/pr115464.c: New test.
Diffstat (limited to 'gcc/explow.cc')
-rw-r--r-- | gcc/explow.cc | 15 |
1 files changed, 15 insertions, 0 deletions
diff --git a/gcc/explow.cc b/gcc/explow.cc
index 8e5f6b8..f684339 100644
--- a/gcc/explow.cc
+++ b/gcc/explow.cc
@@ -745,6 +745,21 @@ force_reg (machine_mode mode, rtx x)
   return temp;
 }
 
+/* Like simplify_gen_subreg, but force OP into a new register if the
+   subreg cannot be formed directly.  */
+
+rtx
+force_subreg (machine_mode outermode, rtx op,
+              machine_mode innermode, poly_uint64 byte)
+{
+  rtx x = simplify_gen_subreg (outermode, op, innermode, byte);
+  if (x)
+    return x;
+
+  op = copy_to_mode_reg (innermode, op);
+  return simplify_gen_subreg (outermode, op, innermode, byte);
+}
+
 /* If X is a memory ref, copy its contents to a new temp reg and return
    that reg.  Otherwise, return X.  */
 
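
The control flow of force_subreg (try the direct form, otherwise copy the operand into a fresh register and retry) can be mimicked outside GCC with a minimal sketch. Every name below (`ToyOperand`, `toy_try_view`, `toy_copy_to_fresh_register`, `toy_force_view`) is hypothetical and stands in for the corresponding GCC concept only by analogy. In particular, the rule that only plain registers can be viewed directly is an assumption made for the example, not the rule that validate_subreg applies.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Toy operand: either a plain register or something more complex
// (for example, a value that is already a view of another value).
struct ToyOperand {
  std::string name;
  bool is_plain_register;
};

// Hypothetical stand-in for simplify_gen_subreg: returns no value when
// the view cannot be formed directly.
std::optional<std::string> toy_try_view(const ToyOperand &op) {
  if (!op.is_plain_register)
    return std::nullopt;
  return "view(" + op.name + ")";
}

// Hypothetical stand-in for copy_to_mode_reg: yields a fresh plain
// register. The copy instruction itself is not modelled.
ToyOperand toy_copy_to_fresh_register(const ToyOperand & /*source*/,
                                      int &next_temp) {
  return ToyOperand{"tmp" + std::to_string(next_temp++), true};
}

// Same shape as force_subreg: try directly, then retry on a temporary.
std::optional<std::string> toy_force_view(ToyOperand op, int &next_temp) {
  if (auto direct = toy_try_view(op))
    return direct;
  op = toy_copy_to_fresh_register(op, next_temp);
  return toy_try_view(op);
}
```

A caller that previously had to handle a missing result from the direct attempt can call the forcing variant instead and accept that a temporary may be created, which is the trade-off the commit message describes.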