[CIR][AArch64] Lower NEON vzip intrinsics (#193658) - rocket-tools/riscv-gnu-toolchain/llvm.git

author	Jiahao Guo <eoonguo@gmail.com>	2026-04-29 23:41:31 +0800
committer	GitHub <noreply@github.com>	2026-04-29 16:41:31 +0100
commit	b46904a0326ac685e970b1a8a3f576b865098763 (patch)
tree	db4ce7d9645b539d7e4a4df8fab0e776cd8ae2df /polly/lib/CodeGen/LoopGeneratorsKMP.cpp
parent	f11ad99f08fc64a93ba9b6f8d2c7faa8ecbdcd52 (diff)

[CIR][AArch64] Lower NEON vzip intrinsics (#193658)

### Summary

Part of https://github.com/llvm/llvm-project/issues/185382. This lowers a subset of the intrinsics listed at https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#zip-elements.

Lower `NEON::BI__builtin_neon_vzip_v` and `NEON::BI__builtin_neon_vzipq_v` in `CIRGenBuiltinAArch64.cpp` by porting the existing incubator logic (`clangir/clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp`) onto ClangIR: two bitcasts on the input vectors, then two rounds of `cir.vec.shuffle` generating the low and high interleave patterns, with each shuffle result stored through a `ptr_stride` of the sret base pointer.

### Test

- test_vzip_mf8
- test_vzipq_mf8

I found that these two are defined in `llvm-project/clang/test/CodeGen/AArch64/fp8-intrinsics/acle_neon_fp8_untyped.c`, but that file appears to be a test suite specifically for the `mfloat8` type, so I did not remove their original test cases.

Some of the new CHECK lines additionally match a pair of bitcasts before the shuffle. This shape comes from the inline wrappers in `arm_neon.h`, which re-cast typed vectors (e.g. `<4 x i16>`) through `<8 x i8>` before calling `__builtin_neon_vzip_v`. Variants whose element type is already i8 (s8/u8/p8/mf8) skip that round-trip and therefore have no bitcasts in the check lines.
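To illustrate the low/high interleave patterns mentioned in the summary, here is a minimal standalone sketch of the index masks a zip produces. It is not code from this patch: `zipIndices` and `shuffle` are hypothetical helper names, and the index formula is one way to express the ACLE zip semantics (result 0 interleaves the low halves of the two inputs, result 1 interleaves the high halves). Indices below `n` select from the first operand and indices from `n` upward select from the second, as in a two-operand vector shuffle.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: build the shuffle mask for one half of a zip over
// two n-element vectors. half == 0 yields the low interleave,
// half == 1 yields the high interleave.
std::vector<int> zipIndices(unsigned n, unsigned half) {
  assert(n % 2 == 0 && half < 2);
  std::vector<int> idx;
  for (unsigned i = 0; i != n; i += 2) {
    unsigned lane = (i + half * n) >> 1;
    idx.push_back(static_cast<int>(lane));      // lane from the first operand
    idx.push_back(static_cast<int>(lane + n));  // same lane from the second operand
  }
  return idx;
}

// Hypothetical helper: apply a two-operand shuffle mask to plain vectors.
std::vector<int> shuffle(const std::vector<int> &a, const std::vector<int> &b,
                         const std::vector<int> &idx) {
  assert(a.size() == b.size());
  std::vector<int> out;
  for (int i : idx) {
    std::size_t k = static_cast<std::size_t>(i);
    out.push_back(k < a.size() ? a[k] : b[k - a.size()]);
  }
  return out;
}
```

For 4-element inputs this gives the masks `{0, 4, 1, 5}` and `{2, 6, 3, 7}`, so the two results together hold every element of both inputs exactly once.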

Diffstat (limited to 'polly/lib/CodeGen/LoopGeneratorsKMP.cpp')

0 files changed, 0 insertions, 0 deletions
