author | Philip Reames <preames@rivosinc.com> | 2025-06-16 10:20:09 -0700 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-06-16 10:20:09 -0700 |
commit | 90d62e0ae352e67d808f94ffb6d215d033f4ec22 (patch) | |
tree | 4c8134f44f15ed2717f998ee5eb629764e02917a /clang/lib/CodeGen/ModuleBuilder.cpp | |
parent | 6f9cd79fa2f43b8128be3e4386ee182ad5a843cc (diff) | |
[RISCV][TTI] Refine reverse shuffle costing for high LMUL (#144155)

This contains two closely related changes:
1) Explicitly recurse on the i1 case - "3" happens to be the right
   magic constant at m1, but is not otherwise correct, and we're
   better off deferring this to existing logic.
2) Match the lowering for high LMUL shuffles - we've switched to using
   a linear number of m1 vrgathers instead of a single big vrgather.
   This results in substantially faster (but also larger) code for
   reverse shuffles larger than m1. Note that fixed vectors need
   a slide at the end, but scalable ones don't.

This will have the effect of biasing the vectorizer towards larger
(particularly larger scalable) vector factors. This increases VF for the
s112 and s1112 loops from TSVC_2 (in all configurations).

We could refine the high LMUL estimates a bit more, but I think getting
the linear scaling right is probably close enough for the moment.
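
A rough sketch of the costing shape described above, for illustration only. This is not the code in the patch. `VecKind`, `reverseCost`, `singleGatherCost`, and every `k...Cost` constant are hypothetical names and values. The quadratic cost for a single LMUL-wide vrgather is an assumption about how such a gather might be modelled, and so is charging the fixed-vector slide only above m1. The sketch also keeps LMUL unchanged when widening i1 to i8, which real lowering would not.

```cpp
#include <cassert>

// Hypothetical unit costs; real values would come from the target's cost tables.
constexpr unsigned kM1GatherCost = 1;   // one m1 vrgather (assumed)
constexpr unsigned kIndexSetupCost = 2; // building the reversed index vector (assumed)
constexpr unsigned kSlideCost = 1;      // trailing slide for fixed vectors (assumed)
constexpr unsigned kMaskExtendCost = 1; // i1 -> i8 widening (assumed)
constexpr unsigned kMaskTruncCost = 1;  // i8 -> i1 narrowing (assumed)

// Hypothetical description of a vector type: element width in bits,
// register group size (LMUL), and whether it is a fixed-length vector.
struct VecKind {
  unsigned elemBits;
  unsigned lmul;
  bool fixed;
};

// Reverse shuffle cost that scales linearly with LMUL.
unsigned reverseCost(VecKind v) {
  assert(v.lmul >= 1 && "fractional LMUL is not modelled in this sketch");
  if (v.elemBits == 1) {
    // Change 1: no magic constant for masks. Widen, recurse into the
    // ordinary path, then narrow again.
    VecKind wide = v;
    wide.elemBits = 8;
    return kMaskExtendCost + reverseCost(wide) + kMaskTruncCost;
  }
  // Change 2: one m1 vrgather per register in the group.
  unsigned cost = kIndexSetupCost + v.lmul * kM1GatherCost;
  // Fixed vectors need a slide at the end; scalable ones do not.
  if (v.fixed && v.lmul > 1)
    cost += kSlideCost;
  return cost;
}

// For comparison: a single big vrgather, assumed quadratic in LMUL.
unsigned singleGatherCost(VecKind v) {
  return kIndexSetupCost + v.lmul * v.lmul * kM1GatherCost;
}
```

Under these assumed numbers, the estimate for an m8 reverse drops from 66 (single gather) to 10 (per-register gathers), which is the kind of gap that would bias the vectorizer towards larger vector factors.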
Diffstat (limited to 'clang/lib/CodeGen/ModuleBuilder.cpp')
0 files changed, 0 insertions, 0 deletions