author: Philip Reames <preames@rivosinc.com> 2025-06-16 10:20:09 -0700
committer: GitHub <noreply@github.com> 2025-06-16 10:20:09 -0700
commit: 90d62e0ae352e67d808f94ffb6d215d033f4ec22
tree: 4c8134f44f15ed2717f998ee5eb629764e02917a
parent: 6f9cd79fa2f43b8128be3e4386ee182ad5a843cc
[RISCV][TTI] Refine reverse shuffle costing for high LMUL (#144155)

This contains two closely related changes:
1) Explicitly recurse on the i1 case - "3" happens to be the right magic
   constant at m1, but is not otherwise correct, and we're better off
   deferring this to existing logic.
2) Match the lowering for high LMUL shuffles - we've switched to using a
   linear number of m1 vrgather instead of a single big vrgather. This
   results in substantially faster (but also larger) code for reverse
   shuffles larger than m1. Note that fixed vectors need a slide at the
   end, but scalable ones don't.

This will have the effect of biasing the vectorizer towards larger
(particularly scalable larger) vector factors. This increases VF for
the s112 and s1112 loops from TSVC_2 (in all configurations).

We could refine the high LMUL estimates a bit more, but I think getting
the linear scaling right is probably close enough for the moment.
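
As a rough illustration of the linear scaling described in the second change, here is a minimal toy model. It is not the code from this patch: the function names, the unit costs, and in particular the quadratic cost assumed for a single large vrgather are all hypothetical and chosen only to show the shape of the comparison.

```cpp
// Toy cost model for a vector reverse shuffle at a given LMUL.
// Every name and constant below is hypothetical, not taken from LLVM.

// Assumed cost of one m1 vrgather, in arbitrary units.
constexpr unsigned M1GatherCost = 1;
// Assumed cost of the trailing slide that fixed-length vectors need.
constexpr unsigned SlideCost = 1;

// Assumption for illustration: a single vrgather spanning LMUL
// registers costs roughly LMUL^2 times an m1 vrgather.
constexpr unsigned singleGatherCost(unsigned LMUL) {
  return LMUL * LMUL * M1GatherCost;
}

// Linear model: one m1 vrgather per register, plus one slide at the
// end for fixed-length vectors only (scalable vectors need none).
constexpr unsigned linearReverseCost(unsigned LMUL, bool IsFixed) {
  return LMUL * M1GatherCost + (IsFixed ? SlideCost : 0);
}
```

Under these assumed unit costs, an m8 reverse would cost 64 units as one big vrgather, versus 8 units (scalable) or 9 units (fixed) with per-m1 vrgathers. The absolute numbers carry no meaning; only the linear versus quadratic growth does.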