author     Oliver Stannard <oliver.stannard@arm.com>   2025-03-04 08:10:22 +0000
committer  GitHub <noreply@github.com>                 2025-03-04 08:10:22 +0000
commit     a619a2e53a9ba09ba18a047b8389bf4dd1912b72 (patch)
tree       82bee21412f26bf94ca5b8e3fb58eca6e146ad66 /clang/lib/CodeGen/CodeGenModule.cpp
parent     c61c88862805905dfa8a2c2f8c9f8ef7e1874720 (diff)
[ARM] Fix lane ordering for AdvSIMD intrinsics on big-endian targets (#127068)
In arm_neon.h, we insert shufflevectors around each intrinsic when the
target is big-endian, to compensate for the difference between the
ABI-defined memory format of vectors (the whole vector stored as one
big-endian access) and LLVM's target-independent expectation (the
lowest-numbered lane at the lowest address). However, this code was
written for the AArch64 ABI, and the AArch32 ABI differs slightly: it
requires vectors to be stored in memory as if stored with VSTM, which
performs a series of 64-bit accesses, rather than with AArch64's single
128-bit STR access. This means that for AArch32 we need to reverse the
lanes within each 64-bit chunk of the vector, rather than across the
whole vector.
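
To make the difference concrete, here is a minimal sketch (not code from
the patch; the type and helper names are invented) using Clang's vector
extensions, showing the two lane reorderings for a 128-bit vector of
four 32-bit lanes:

```c
#include <stdint.h>

/* A 128-bit vector of four 32-bit lanes, mirroring NEON's int32x4_t. */
typedef int32_t v4i32 __attribute__((vector_size(16)));

/* AArch64 big-endian: the whole vector is stored as one 128-bit
   big-endian access, so all four lanes must be reversed. */
static inline v4i32 rev_whole_vector(v4i32 v) {
  return __builtin_shufflevector(v, v, 3, 2, 1, 0);
}

/* AArch32 big-endian: VSTM stores the vector as two 64-bit accesses,
   so lanes are reversed only within each 64-bit half. */
static inline v4i32 rev_per_64bit_chunk(v4i32 v) {
  return __builtin_shufflevector(v, v, 1, 0, 3, 2);
}
```

Little-endian targets need no shuffle at all, since the lanes already
sit at ascending addresses in memory.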
Since only a small number of different shufflevector orderings are
needed, I've split them out into macros, so that the ordering logic
doesn't have to be repeated as separate conditions in each intrinsic
definition.
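
A hedged sketch of what such a macro split might look like (the macro
name and exact structure are invented for illustration; the real
definitions are emitted into arm_neon.h by the header generator):

```c
/* Hypothetical macro centralizing one shufflevector ordering; the real
   generated macros in arm_neon.h may differ in name and shape. */
#if defined(__ARM_BIG_ENDIAN) && !defined(__aarch64__)
/* AArch32 big-endian: reverse lanes within each 64-bit chunk. */
#define __rev_q_v4i32(v) __builtin_shufflevector(v, v, 1, 0, 3, 2)
#elif defined(__ARM_BIG_ENDIAN)
/* AArch64 big-endian: reverse all lanes of the 128-bit vector. */
#define __rev_q_v4i32(v) __builtin_shufflevector(v, v, 3, 2, 1, 0)
#else
/* Little-endian: lanes already match LLVM's expected order. */
#define __rev_q_v4i32(v) (v)
#endif
```

Each intrinsic definition can then wrap its operands and result in one
such macro instead of duplicating the per-target shuffle masks inline.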
Diffstat (limited to 'clang/lib/CodeGen/CodeGenModule.cpp')
0 files changed, 0 insertions, 0 deletions