author | David Sherwood <david.sherwood@arm.com> | 2025-09-03 09:51:54 +0100 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-09-03 09:51:54 +0100 |
commit | 73bed64433072338d11ebf770d6db99c2ce810aa (patch) | |
tree | 79c3d9d717694baa65567dcfa74ece3af057733c /llvm/utils/FileCheck | |
parent | 349523e26b80155b200e52e628006855371b6a93 (diff) | |
[AArch64] Improve lowering for scalable masked deinterleaving loads (#154338)
For IR like this:
  %mask = ... @llvm.vector.interleave2(<vscale x 16 x i1> %a, <vscale x 16 x i1> %a)
  %vec = ... @llvm.masked.load(..., <vscale x 32 x i1> %mask, ...)
  %dvec = ... @llvm.vector.deinterleave2(<vscale x 32 x i8> %vec)
where we're deinterleaving a wide masked load of a supported type
with an interleaved mask, we can lower this directly to an ld2b
instruction. Similarly, we can also support other variants of ld2
and ld4.
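To illustrate why the two forms are interchangeable, here is a minimal scalar sketch in C++. It is not the code from this patch: the helper names (interleave2, maskedLoad, deinterleave2, ld2Model) are hypothetical, a fixed lane count stands in for the scalable vector length, and the masked load is assumed to use a zero passthru so that inactive lanes match the zeroing predication of an SVE ld2b.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Mask = std::vector<bool>;
using Bytes = std::vector<std::uint8_t>;

// Models llvm.vector.interleave2 on masks: out[2*i] = a[i], out[2*i+1] = b[i].
// Assumes a and b have the same length.
Mask interleave2(const Mask &a, const Mask &b) {
  Mask out;
  for (std::size_t i = 0; i < a.size(); ++i) {
    out.push_back(a[i]);
    out.push_back(b[i]);
  }
  return out;
}

// Models llvm.masked.load with a zero passthru (an assumption of this sketch):
// active lanes read memory, inactive lanes produce 0.
// Assumes mem holds at least mask.size() bytes.
Bytes maskedLoad(const Bytes &mem, const Mask &mask) {
  Bytes out(mask.size(), 0);
  for (std::size_t i = 0; i < mask.size(); ++i)
    if (mask[i])
      out[i] = mem[i];
  return out;
}

// Models llvm.vector.deinterleave2: even lanes go to the first result,
// odd lanes go to the second.
std::pair<Bytes, Bytes> deinterleave2(const Bytes &v) {
  std::pair<Bytes, Bytes> out;
  for (std::size_t i = 0; i + 1 < v.size(); i += 2) {
    out.first.push_back(v[i]);
    out.second.push_back(v[i + 1]);
  }
  return out;
}

// Models a predicated two-register structured load (ld2b-like): active lane i
// reads the byte pair at mem[2*i] and mem[2*i+1]; inactive lanes are zeroed.
// Assumes mem holds at least 2 * pred.size() bytes.
std::pair<Bytes, Bytes> ld2Model(const Bytes &mem, const Mask &pred) {
  std::pair<Bytes, Bytes> out{Bytes(pred.size(), 0), Bytes(pred.size(), 0)};
  for (std::size_t i = 0; i < pred.size(); ++i) {
    if (pred[i]) {
      out.first[i] = mem[2 * i];
      out.second[i] = mem[2 * i + 1];
    }
  }
  return out;
}
```

In this model, deinterleave2(maskedLoad(mem, interleave2(p, p))) and ld2Model(mem, p) return the same pair of vectors for any predicate p. The equivalence depends on both halves of the interleaved mask being the same predicate, as in the IR above where %a is passed twice.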
This PR adds a DAG combine to spot such patterns and lower to ld2X
or ld4X variants accordingly, whilst being careful to ensure the
masked load is only used by the deinterleave intrinsic.
Diffstat (limited to 'llvm/utils/FileCheck')
0 files changed, 0 insertions, 0 deletions