author | Richard Sandiford <richard.sandiford@arm.com> | 2025-08-04 11:45:28 +0100 |
---|---|---|
committer | Richard Sandiford <richard.sandiford@arm.com> | 2025-08-04 11:45:28 +0100 |
commit | f702b593e7268ab161053bafd097f1b09933b783 (patch) | |
tree | cb430637a4828a26ecd2629e69f86440be08e90c /libcpp/include/cpplib.h | |
parent | fcfbe83d88c1bfae49e654b5095ebe46cbe361d8 (diff) | |

aarch64: Use VNx16BI for more SVE WHILE* results [PR121118]

PR121118 is about a case where we try to construct a predicate
constant using a permutation of a PFALSE and a WHILELO. The WHILELO
is a .H operation and its result has mode VNx8BI. However, the
permute instruction expects both inputs to be VNx16BI, leading to
an unrecognisable insn ICE.

VNx8BI is effectively a form of VNx16BI in which every odd-indexed
bit is insignificant. In the PR's testcase that's OK, since those
bits will be dropped by the permutation. But if the WHILELO had been a
VNx4BI, so that only every fourth bit is significant, the input to the
permutation would have had undefined bits. The testcase in the patch
has an example of this.
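
As a rough illustration of the point above (not code from the patch), the significant bits of each predicate mode can be modelled on one 16-bit chunk of a predicate register, with one bit per vector byte. The helper names `significant_bits` and `undefined_bits` are hypothetical and exist only for this sketch.

```c
#include <stdint.h>

/* Model one 16-bit chunk of an SVE predicate: one bit per vector byte.
   elem_bytes is 1, 2, 4 or 8 for .B, .H, .S and .D elements, which
   correspond to VNx16BI, VNx8BI, VNx4BI and VNx2BI respectively.
   Only the lowest bit of each element is significant.  */
static uint16_t significant_bits(unsigned elem_bytes)
{
    uint16_t mask = 0;
    for (unsigned bit = 0; bit < 16; bit += elem_bytes)
        mask |= (uint16_t)(1u << bit);
    return mask;
}

/* Bits that a consumer working on consumer_bytes-wide elements would
   read, but that a producer working on producer_bytes-wide elements
   leaves undefined.  (Hypothetical helper for illustration only.)  */
static uint16_t undefined_bits(unsigned producer_bytes,
                               unsigned consumer_bytes)
{
    return (uint16_t)(significant_bits(consumer_bytes)
                      & ~significant_bits(producer_bytes));
}
```

In this model a .S result (VNx4BI) read by a .H consumer leaves bits 2, 6, 10 and 14 undefined, which is the kind of situation the paragraph above describes.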

This feeds into a related ACLE problem that I'd been meaning to
fix for a long time: every bit of an svbool_t result is significant,
and so every ACLE intrinsic that returns an svbool_t should return a
VNx16BI. That doesn't currently happen for ACLE svwhile* intrinsics.

This patch fixes both issues together.

We still need to keep the current WHILE* patterns for autovectorisation,
where the result mode should match the element width. The patch
therefore adds a new set of patterns that are defined to return
VNx16BI instead. For want of a better scheme, it uses an "_acle"
suffix to distinguish these new patterns from the "normal" ones.

The formulation used is:

  (and:VNx16BI (subreg:VNx16BI normal-pattern 0) C)

where C has mode VNx16BI and is a canonical ptrue for normal-pattern's
element width (so that the low bit of each element is set and the upper
bits are clear).
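
The effect of the formulation above can be sketched with plain bit operations on one 16-bit predicate chunk. This is only a model under the assumption of one predicate bit per vector byte; it is not the RTL or any GCC code, and `canonical_ptrue`, `as_vnx16bi` and `equal_in_elem_mode` are hypothetical names chosen for the example.

```c
#include <stdint.h>

/* Canonical ptrue for a given element width, modelled on 16 predicate
   bits: the low bit of each element is set and the upper bits are
   clear.  elem_bytes is 2, 4 or 8 for .H, .S and .D.  */
static uint16_t canonical_ptrue(unsigned elem_bytes)
{
    uint16_t mask = 0;
    for (unsigned bit = 0; bit < 16; bit += elem_bytes)
        mask |= (uint16_t)(1u << bit);
    return mask;
}

/* Model of (and:VNx16BI (subreg:VNx16BI x 0) C): reinterpret the
   narrower-mode value as 16 bits, then clear every bit that the
   narrower mode treats as insignificant, so all 16 bits are defined.  */
static uint16_t as_vnx16bi(uint16_t raw, unsigned elem_bytes)
{
    return (uint16_t)(raw & canonical_ptrue(elem_bytes));
}

/* Compare two values as the narrower mode would see them, i.e. looking
   only at the low bit of each element.  */
static int equal_in_elem_mode(uint16_t a, uint16_t b, unsigned elem_bytes)
{
    return ((a ^ b) & canonical_ptrue(elem_bytes)) == 0;
}
```

In this model, `equal_in_elem_mode(raw, as_vnx16bi(raw, n), n)` holds for any `raw`, which mirrors the idea that viewing the masked value in the original mode gives back the original value.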

This is a bit clunky, and leads to some repetition. But it has two
advantages:

* After g:965564eafb721f8000013a3112f1bba8d8fae32b, converting the
  above expression back to normal-pattern's mode will reduce to
  normal-pattern, so that the pattern for testing the result using a
  PTEST doesn't change.

* It gives RTL optimisers a bit more information, as the new tests
  demonstrate.

In the expression above, C is matched using a new "special" predicate
aarch64_ptrue_all_operand, where "special" means that the mode on the
predicate is not necessarily the mode of the expression. In this case,
C always has mode VNx16BI, but the mode on the predicate indicates which
kind of canonical PTRUE is needed.

gcc/
	PR testsuite/121118
	* config/aarch64/iterators.md (VNx16BI_ONLY): New mode iterator.
	* config/aarch64/predicates.md (aarch64_ptrue_all_operand): New
	predicate.
	* config/aarch64/aarch64-sve.md
	(@aarch64_sve_while_<while_optab_cmp><GPI:mode><VNx16BI_ONLY:mode>_acle)
	(@aarch64_sve_while_<while_optab_cmp><GPI:mode><PRED_HSD:mode>_acle)
	(*aarch64_sve_while_<while_optab_cmp><GPI:mode><PRED_HSD:mode>_acle)
	(*while_<while_optab_cmp><GPI:mode><PRED_HSD:mode>_acle_cc): New
	patterns.
	* config/aarch64/aarch64-sve-builtins-functions.h
	(while_comparison::expand): Use the new _acle patterns that
	always return a VNx16BI.
	* config/aarch64/aarch64-sve-builtins-sve2.cc
	(svwhilerw_svwhilewr_impl::expand): Likewise.
	* config/aarch64/aarch64.cc
	(aarch64_sve_move_pred_via_while): Likewise.

gcc/testsuite/
	PR testsuite/121118
	* gcc.target/aarch64/sve/acle/general/pr121118_1.c: New test.
	* gcc.target/aarch64/sve/acle/general/whilele_13.c: Likewise.
	* gcc.target/aarch64/sve/acle/general/whilelt_6.c: Likewise.
	* gcc.target/aarch64/sve2/acle/general/whilege_1.c: Likewise.
	* gcc.target/aarch64/sve2/acle/general/whilegt_1.c: Likewise.
	* gcc.target/aarch64/sve2/acle/general/whilerw_5.c: Likewise.
	* gcc.target/aarch64/sve2/acle/general/whilewr_5.c: Likewise.