[LoopVectorize] Generate wide active lane masks (#147535)

This patch adds a new flag (-enable-wide-lane-mask) which allows LoopVectorize to generate wider-than-VF active lane masks when it is safe to do so (i.e. the mask is used for data and control flow).

The transform in extractFromWideActiveLaneMask creates vector extracts from the first active lane mask in the header & loop body, modifying the active lane mask phi operands to use the extracts.

An additional operand is passed to the ActiveLaneMask instruction, the value of which is used as a multiplier of VF when generating the mask. By default this is 1, and is updated to UF by extractFromWideActiveLaneMask.

The motivation for this change is to improve interleaved loops when SVE2.1 is available, where we can make use of the whilelo instruction which returns a predicate pair.

This is based on a PR that was created by @momchil-velikov (#81140) and contains tests which were added there.
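
As an illustration of the idea described in the commit message, the sketch below models a single wide active lane mask of VF * UF lanes and checks that extracting UF sub-masks of VF lanes each gives the same result as UF separate VF-wide masks. This is a scalar, fixed-width model and not code from the patch: the helper names (activeLaneMask, extractSubvector, wideMaskMatchesNarrowMasks) are hypothetical, and it assumes the usual reading of llvm.get.active.lane.mask, where lane i is active when Base + i < TripCount, while ignoring scalable vectors and overflow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model of an active lane mask (hypothetical helper, not LLVM API):
// lane I is active iff Base + I < TripCount.
std::vector<bool> activeLaneMask(uint64_t Base, uint64_t TripCount,
                                 size_t NumLanes) {
  std::vector<bool> Mask(NumLanes);
  for (size_t I = 0; I < NumLanes; ++I)
    Mask[I] = Base + I < TripCount;
  return Mask;
}

// Scalar model of a subvector extract (hypothetical helper): copies
// NumLanes lanes of Wide, starting at lane Idx.
std::vector<bool> extractSubvector(const std::vector<bool> &Wide, size_t Idx,
                                   size_t NumLanes) {
  assert(Idx + NumLanes <= Wide.size() && "extract out of range");
  return std::vector<bool>(Wide.begin() + Idx, Wide.begin() + Idx + NumLanes);
}

// Returns true if every VF-lane slice of one (VF * UF)-lane mask equals the
// VF-lane mask that interleaved part would otherwise compute on its own.
bool wideMaskMatchesNarrowMasks(uint64_t Base, uint64_t TripCount, size_t VF,
                                size_t UF) {
  std::vector<bool> Wide = activeLaneMask(Base, TripCount, VF * UF);
  for (size_t Part = 0; Part < UF; ++Part) {
    std::vector<bool> Narrow =
        activeLaneMask(Base + Part * VF, TripCount, VF);
    if (extractSubvector(Wide, Part * VF, VF) != Narrow)
      return false;
  }
  return true;
}
```

For example, with VF = 4, UF = 2 and a trip count of 10, the wide mask at Base = 8 has only its first two lanes active, so the first extract carries the two remaining iterations and the second extract is all-false.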
author: Kerry McLaughlin <kerry.mclaughlin@arm.com> 2025-09-01 13:53:30 +0100
committer: GitHub <noreply@github.com> 2025-09-01 13:53:30 +0100
commit: f0e9bba024d44b55d54b02025623ce4a3ba5a37c (patch)
tree: 38683bafa768fa6424f372eca65628df21c4d5a4 /llvm/lib/Analysis/VectorUtils.cpp
parent: 5f412415645edb0d735dbc58306e9deb6ab0bcd4 (diff)
download: llvm-f0e9bba024d44b55d54b02025623ce4a3ba5a37c.zip
llvm-f0e9bba024d44b55d54b02025623ce4a3ba5a37c.tar.gz
llvm-f0e9bba024d44b55d54b02025623ce4a3ba5a37c.tar.bz2
1 files changed, 2 insertions, 0 deletions
diff --git a/llvm/lib/Analysis/VectorUtils.cpp b/llvm/lib/Analysis/VectorUtils.cpp
index 425ea31..091d948 100644
--- a/llvm/lib/Analysis/VectorUtils.cpp
+++ b/llvm/lib/Analysis/VectorUtils.cpp
@@ -166,6 +166,7 @@ bool llvm::isVectorIntrinsicWithScalarOpAtArg(Intrinsic::ID ID,
   case Intrinsic::is_fpclass:
   case Intrinsic::vp_is_fpclass:
   case Intrinsic::powi:
+  case Intrinsic::vector_extract:
     return (ScalarOpdIdx == 1);
   case Intrinsic::smul_fix:
   case Intrinsic::smul_fix_sat:
@@ -200,6 +201,7 @@ bool llvm::isVectorIntrinsicWithOverloadTypeAtArg(
   case Intrinsic::vp_llrint:
   case Intrinsic::ucmp:
   case Intrinsic::scmp:
+  case Intrinsic::vector_extract:
     return OpdIdx == -1 || OpdIdx == 0;
   case Intrinsic::modf:
   case Intrinsic::sincos: