author    Oliver Stannard <oliver.stannard@arm.com>    2025-03-04 08:10:22 +0000
committer GitHub <noreply@github.com>    2025-03-04 08:10:22 +0000
commit    a619a2e53a9ba09ba18a047b8389bf4dd1912b72 (patch)
tree      82bee21412f26bf94ca5b8e3fb58eca6e146ad66 /clang/lib/CodeGen/CodeGenModule.cpp
parent    c61c88862805905dfa8a2c2f8c9f8ef7e1874720 (diff)
[ARM] Fix lane ordering for AdvSIMD intrinsics on big-endian targets (#127068)
In arm_neon.h, we insert shufflevectors around each intrinsic when the target is big-endian, to compensate for the difference between the ABI-defined in-memory format of vectors (the whole vector stored as one big-endian access) and LLVM's target-independent expectation (the lowest-numbered lane at the lowest address).

However, this code was written for the AArch64 ABI, and the AArch32 ABI differs slightly: it requires that vectors be stored in memory as if stored by VSTM, which performs a series of 64-bit accesses, rather than as a single 128-bit access as on AArch64. This means that for AArch32 we need to reverse the lanes within each 64-bit chunk of the vector, instead of across the whole vector.

Since only a small number of distinct shufflevector orderings are needed, I've split them out into macros, so that each intrinsic definition does not need its own set of conditions.
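To make the two orderings concrete, here is a minimal sketch (illustrative only, not code from the patch; the helper names are hypothetical) showing, for a 128-bit vector of eight 16-bit lanes, the whole-vector reversal that big-endian AArch64 needs versus the per-64-bit-chunk reversal that big-endian AArch32 needs, written with clang's __builtin_shufflevector:

  #include <arm_neon.h>

  /* Hypothetical helper, for illustration. AArch64 big-endian: the ABI
   * stores the whole vector as one 128-bit big-endian access, so the
   * compensating shuffle reverses all eight lanes. */
  static int16x8_t be_fixup_aarch64(int16x8_t v) {
    return __builtin_shufflevector(v, v, 7, 6, 5, 4, 3, 2, 1, 0);
  }

  /* Hypothetical helper, for illustration. AArch32 big-endian: the ABI
   * stores the vector as if by VSTM (a series of 64-bit accesses), so
   * lanes are reversed only within each 64-bit chunk: lanes 0-3 swap
   * order among themselves, as do lanes 4-7. */
  static int16x8_t be_fixup_aarch32(int16x8_t v) {
    return __builtin_shufflevector(v, v, 3, 2, 1, 0, 7, 6, 5, 4);
  }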
Diffstat (limited to 'clang/lib/CodeGen/CodeGenModule.cpp')
0 files changed, 0 insertions, 0 deletions