author: Florian Hahn <flo@fhahn.com> 2025-07-31 19:20:05 +0100
committer: GitHub <noreply@github.com> 2025-07-31 19:20:05 +0100
commit: 078d214672e23691566137fa88b851c7022666b7
tree: c886f264c930def397460d7154f4319d873b2b6e /llvm/lib/Transforms/Utils/LoopUtils.cpp
parent: 5482ef76f5b3d5ffcaded397fa924569e83f0b2d
[TailDup] Delay aggressive computed-goto taildup to after RegAlloc. (#150911)

https://github.com/llvm/llvm-project/pull/114990 allowed more aggressive tail duplication for computed-gotos in both pre- and post-regalloc tail duplication.

In some cases, performing tail-duplication too early can lead to worse results, especially if we duplicate blocks with a number of phi nodes.

This is causing a ~3% performance regression in some workloads using Python 3.12.

This patch updates TailDup to delay aggressive tail-duplication for computed gotos to after register allocation.

This means we can keep the non-duplicated version for a bit longer throughout the backend, which should reduce compile-time as well as allowing a number of optimizations and simplifications to trigger before drastically expanding the CFG.

For the case in https://github.com/llvm/llvm-project/issues/106846, I get the same performance with and without this patch on Skylake.

PR: https://github.com/llvm/llvm-project/pull/150911
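
For readers unfamiliar with the pattern, the following is a minimal sketch of the kind of computed-goto dispatch loop this message refers to. It is an illustration only, not code from this patch or from CPython: the opcodes, the `run_toy_interpreter` name, and the `DISPATCH` macro are all hypothetical, and it relies on the GNU "labels as values" extension supported by GCC and Clang. In the source, every handler ends with its own indirect jump. A compiler may represent these as one shared dispatch block with many predecessors, and tail duplication is the transformation that copies that block back into each handler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy stack machine (illustration only). */
enum { OP_PUSH1, OP_ADD, OP_DOUBLE, OP_HALT };

/* Runs `code` and returns the value on top of the stack at OP_HALT.
 * Assumes every byte in `code` is a valid opcode and that the program
 * ends with OP_HALT. Uses the GNU "labels as values" extension. */
int run_toy_interpreter(const unsigned char *code) {
    /* Jump table indexed by opcode; &&label takes the address of a label. */
    static void *const targets[] = { &&do_push1, &&do_add, &&do_double, &&do_halt };
    int stack[16];
    size_t sp = 0; /* number of values currently on the stack */
    size_t pc = 0; /* index of the next opcode to execute */

    /* Fetch the next opcode and jump to its handler. Each use below is a
     * separate indirect jump in the source; this is the "tail" that tail
     * duplication would copy into every handler if the compiler had
     * merged them into a single dispatch block. */
#define DISPATCH() goto *targets[code[pc++]]

    DISPATCH();

do_push1:
    assert(sp < 16);
    stack[sp++] = 1;
    DISPATCH();

do_add:
    assert(sp >= 2);
    stack[sp - 2] += stack[sp - 1];
    sp--;
    DISPATCH();

do_double:
    assert(sp >= 1);
    stack[sp - 1] *= 2;
    DISPATCH();

do_halt:
    assert(sp >= 1);
    return stack[sp - 1];

#undef DISPATCH
}
```

Calling `run_toy_interpreter` on the sequence `OP_PUSH1, OP_PUSH1, OP_ADD, OP_DOUBLE, OP_HALT` pushes two ones, adds them, doubles the result, and returns it. The local state carried between handlers (`sp`, `pc`, and the stack) is the sort of value that would need phi nodes at a merged dispatch block, which is why duplicating that block early can be costly.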
Diffstat (limited to 'llvm/lib/Transforms/Utils/LoopUtils.cpp')
0 files changed, 0 insertions, 0 deletions