author: Joel E. Denny <jdenny.ornl@gmail.com> 2025-10-07 10:45:49 -0400
committer: GitHub <noreply@github.com> 2025-10-07 10:45:49 -0400
commit: 6d44b9082e42b918a152098ec70ed409c4da8c79
tree: bced5d7583c7f177a56e408a8bdb9323dd718ea0 /llvm/lib/BinaryFormat/DXContainer.cpp
parent: b36e762cdb2e90e29f65c7abffc00541addfed3f
[LoopUnroll] Skip remainder loop guard if skip unrolled loop (#156549)

The original loop (OL) that serves as input to LoopUnroll has basic
blocks that are arranged as follows:

```
OLPreHeader
OLHeader <-.
...        |
OLLatch ---'
OLExit
```

In this depiction, every block has an implicit edge to the next block
below, so any explicit edge indicates a conditional branch.

Given OL and unroll count N, LoopUnroll sometimes creates an unrolled
loop (UL) with a remainder loop (RL) epilogue arranged like this:

```
,-- ULGuard
|   ULPreHeader
|   ULHeader <-.
|   ...        |
|   ULLatch ---'
|   ULExit
`-> RLGuard -----.
    RLPreHeader  |
,-> RLHeader     |
|   ...          |
`-- RLLatch      |
    RLExit       |
    OLExit <-----'
```

Each UL iteration executes N OL iterations, but each RL iteration
executes 1 OL iteration. ULGuard or RLGuard checks whether the first
iteration of UL or RL should execute, respectively. If so, ULLatch or
RLLatch checks whether to execute each subsequent iteration.

Once reached, OL always executes its first iteration but not
necessarily the next N-1 iterations. Thus, ULGuard is always required
before the first UL iteration. However, when control flows from ULGuard
directly to RLGuard, the first OL iteration has yet to execute, so
RLGuard is then redundant before the first RL iteration.

Thus, this patch makes the following changes:

- Adjust ULGuard to branch to RLPreHeader instead of RLGuard, thus
  eliminating RLGuard's unnecessary branch instruction for that path.
- Eliminate the creation of RLGuard phi node poison values. Without
  this patch, RLGuard has such a phi node for each value that is
  defined by any OL iteration and used in OLExit. The poison value is
  required where ULGuard is the predecessor. The poison value indicates
  that control flow from ULGuard to RLGuard to OLExit has no
  counterpart in OL because the first OL iteration must execute either
  in UL or RL.
- Simplify the CFG by not splitting ULExit and RLGuard because, without
  the ULGuard predecessor, the single block can now be a dedicated UL
  exit.
- To RLPreHeader, add an `llvm.assume` call that asserts the RL trip
  count is non-zero. Without this patch, RLPreHeader is reachable only
  when RLGuard guarantees that assertion is true. With this patch,
  RLGuard guarantees it only when RLGuard is the predecessor, and the
  OL structure guarantees it when ULGuard is the predecessor. If RL
  itself is unrolled later, this guarantee somehow prevents
  ScalarEvolution from giving up when trying to compute a maximum trip
  count for RL. That maximum trip count enables the branch instruction
  in the final unrolled instance of RLLatch to be eliminated. Without
  the `llvm.assume` call, some existing unroll tests start to fail
  because that instruction is not eliminated.

The original motivation for this patch is to facilitate later patches
that fix LoopUnroll's computation of branch weights so that they
maintain the block frequency of OL's body (see #135812). Specifically,
this patch ensures RLGuard's branch weights do not affect RL's
contribution to the block frequency of OL's body in the case that
ULGuard skips UL.
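
For illustration only, the control flow that results from the changes
listed above can be modeled in plain C++. This is a minimal sketch, not
LLVM code: `Trace`, `run_unrolled`, and the counter-based guard
conditions are hypothetical names and simplifications chosen here, and
the `assert` in the RLPreHeader position merely stands in for the
`llvm.assume` call. The sketch assumes a trip count of at least 1,
matching the statement that OL always executes its first iteration once
reached.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the post-patch CFG. It counts OL body executions
// and how often RLGuard's branch is evaluated.
struct Trace {
  std::uint64_t body_runs = 0;        // OL iterations executed (UL + RL)
  std::uint64_t rl_guard_checks = 0;  // evaluations of RLGuard's branch
};

// Assumption: trip_count >= 1 and unroll_count >= 1.
Trace run_unrolled(std::uint64_t trip_count, std::uint64_t unroll_count) {
  assert(trip_count >= 1 && unroll_count >= 1);
  Trace t;
  std::uint64_t remaining = trip_count;

  // ULGuard: run UL only if at least one full group of N iterations fits.
  if (remaining >= unroll_count) {
    do {  // ULHeader ... ULLatch
      t.body_runs += unroll_count;  // one UL iteration = N OL iterations
      remaining -= unroll_count;
    } while (remaining >= unroll_count);

    // ULExit and RLGuard are a single block here: its only predecessor
    // is UL, so it acts as a dedicated UL exit.
    ++t.rl_guard_checks;
    if (remaining == 0)
      return t;  // RLGuard -> OLExit
  }
  // When ULGuard skips UL, control arrives here without passing RLGuard.

  // RLPreHeader: stand-in for llvm.assume(RL trip count != 0). It holds
  // because either RLGuard just checked it, or no OL iteration has run
  // yet and trip_count >= 1.
  assert(remaining != 0);

  do {  // RLHeader ... RLLatch
    t.body_runs += 1;  // one RL iteration = one OL iteration
    --remaining;
  } while (remaining != 0);

  return t;  // RLExit -> OLExit
}
```

Under these assumptions, a call such as `run_unrolled(3, 4)` skips UL
and never evaluates RLGuard, while `run_unrolled(7, 4)` runs one UL
iteration, evaluates RLGuard once, and then runs three RL iterations.
In both cases the body count equals the original trip count.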
Diffstat (limited to 'llvm/lib/BinaryFormat/DXContainer.cpp')
0 files changed, 0 insertions, 0 deletions