[LLVM] Add `llvm.experimental.vector.compress` intrinsic (#92289)

This PR adds a new vector intrinsic `@llvm.experimental.vector.compress` to "compress" data within a vector based on a selection mask: it moves all selected values (those where `mask[i] == 1`) to consecutive lanes in the result vector. A `passthru` vector can be provided, from which the remaining lanes are filled.

The main reason for this is that the existing `@llvm.masked.compressstore` has very strong constraints: it can only write values that were selected, which results in guard branches on all targets except AVX-512 (and even there the AMD implementation is _very_ slow). More instruction sets support "compress" logic, but only within registers, so storing the values requires an additional store. This combination is still likely to be significantly faster on many targets, as it avoids branches.

In follow-up PRs, my plan is to add target-specific lowerings for x86, SVE, and possibly RISCV. I also want to combine this with a store instruction, as this is probably a common case and we can avoid some memory writes in that case.

See [discussion in forum](https://discourse.llvm.org/t/new-intrinsic-for-masked-vector-compress-without-store/78663) for initial discussion on the design.
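
The compress semantics described above can be sketched as a scalar reference model in C++. This is only an illustration, not code from the PR: the helper name `vectorCompress` is hypothetical, and it assumes that each remaining lane takes the value at the same lane index in `passthru`.

```cpp
#include <array>
#include <cstddef>

// Scalar reference model of a vector "compress" (hypothetical helper, not an
// LLVM API). Selected elements of `vec` are packed into the lowest lanes of
// the result in their original order.
// Assumption: each remaining lane keeps the value of the same lane in
// `passthru`.
template <typename T, std::size_t N>
std::array<T, N> vectorCompress(const std::array<T, N> &vec,
                                const std::array<bool, N> &mask,
                                const std::array<T, N> &passthru) {
  std::array<T, N> result = passthru; // start from the fill values
  std::size_t out = 0;                // next free lane in the result
  for (std::size_t i = 0; i < N; ++i)
    if (mask[i])
      result[out++] = vec[i];         // pack the selected value
  return result;
}
```

For example, compressing `{10, 20, 30, 40}` with mask `{0, 1, 0, 1}` and passthru `{-1, -2, -3, -4}` yields `{20, 40, -3, -4}` under this model. The loop contains a data-dependent branch, which is exactly what a native in-register compress instruction would avoid.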
author: Lawrence Benson <github@lawben.com> 2024-07-17 14:24:24 +0200
committer: GitHub <noreply@github.com> 2024-07-17 14:24:24 +0200
commit: 177ce1900f0de05337f744edd3f4e454f7a93b06 (patch)
tree: 04c2c4069e81bbd32f52b01558c5bde56cc3bdf2 /llvm/lib/CodeGen/TargetLoweringBase.cpp
parent: 329e7c80ac2dbc16c267390da5f1baaf1cd438b1 (diff)
download: llvm-177ce1900f0de05337f744edd3f4e454f7a93b06.zip
llvm-177ce1900f0de05337f744edd3f4e454f7a93b06.tar.gz
llvm-177ce1900f0de05337f744edd3f4e454f7a93b06.tar.bz2
1 files changed, 3 insertions, 0 deletions
diff --git a/llvm/lib/CodeGen/TargetLoweringBase.cpp b/llvm/lib/CodeGen/TargetLoweringBase.cpp
index bf031c0..8040f1e 100644
--- a/llvm/lib/CodeGen/TargetLoweringBase.cpp
+++ b/llvm/lib/CodeGen/TargetLoweringBase.cpp
@@ -758,6 +758,9 @@ void TargetLoweringBase::initActions() {
     // Named vector shuffles default to expand.
     setOperationAction(ISD::VECTOR_SPLICE, VT, Expand);
 
+    // Only some target support this vector operation. Most need to expand it.
+    setOperationAction(ISD::VECTOR_COMPRESS, VT, Expand);
+
     // VP operations default to expand.
 #define BEGIN_REGISTER_VP_SDNODE(SDOPC, ...)                                   \
     setOperationAction(ISD::SDOPC, VT, Expand);
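
The hunk above makes `Expand` the default action for `ISD::VECTOR_COMPRESS` on every value type, leaving it to targets to opt in. A toy C++ model of that default-then-override pattern might look as follows. All names here (`ToyLowering`, `ToyTargetWithCompress`, the enums) are hypothetical stand-ins for illustration only, not LLVM's real classes, and which value types a real target would mark as legal is an assumption.

```cpp
#include <initializer_list>
#include <map>
#include <utility>

// Toy stand-ins for LLVM's ISD opcodes, value types, and legalize actions.
enum class Op { VectorSplice, VectorCompress };
enum class VT { v4i32, v8i16 };
enum class Action { Legal, Custom, Expand };

// Hypothetical miniature of a lowering table: the base class sets defaults.
class ToyLowering {
  std::map<std::pair<Op, VT>, Action> Actions;

public:
  ToyLowering() {
    // Mirror of the diff: every value type defaults to Expand.
    for (VT Ty : {VT::v4i32, VT::v8i16}) {
      setOperationAction(Op::VectorSplice, Ty, Action::Expand);
      setOperationAction(Op::VectorCompress, Ty, Action::Expand);
    }
  }

  void setOperationAction(Op O, VT Ty, Action A) {
    Actions[std::make_pair(O, Ty)] = A;
  }

  Action getOperationAction(Op O, VT Ty) const {
    return Actions.at(std::make_pair(O, Ty));
  }
};

// Hypothetical target that has a native compress for one value type only.
struct ToyTargetWithCompress : ToyLowering {
  ToyTargetWithCompress() {
    setOperationAction(Op::VectorCompress, VT::v4i32, Action::Legal);
  }
};
```

In this sketch, a plain `ToyLowering` reports `Expand` for every type, while `ToyTargetWithCompress` reports `Legal` for `v4i32` and still falls back to `Expand` for `v8i16`.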