aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/Transforms/Utils/LoopPeel.cpp
AgeCommit message (Collapse)AuthorFilesLines
2025-08-25[LoopPeel] Address followup comments on #121104 (#155221)Ryotaro Kasuga1-11/+9
This is a follow-up PR for post-commit comments in #121104 . Details: - Rename `mergeTwoCounter` to `mergeTwoCounters` (add trailing `s`). - Avoid duplicated hash lookup. - Use `///` instead of `//`. - Fix typo.
2025-08-20[LoopPeel] Add new option to peeling loops to convert PHI into IV (#121104)Ryotaro Kasuga1-28/+171
LoopPeel currently considers PHI nodes that become loop invariants through peeling. However, in some cases, peeling transforms PHI nodes into induction variables (IVs), potentially enabling further optimizations such as loop vectorization. For example: ```c // TSVC s292 int im = N-1; for (int i=0; i<N; i++) { a[i] = b[i] + b[im]; im = i; } ``` In this case, peeling one iteration converts `im` into an IV, allowing it to be handled by the loop vectorizer. This patch adds a new feature to peel loops when to convert PHIs into IVs. At the moment this feature is disabled by default. Enabling it allows to vectorize the above example. I have measured on neoverse-v2 and observed a speedup of more than 60% (options: `-O3 -ffast-math -mcpu=neoverse-v2 -mllvm -enable-peeling-for-iv`). This PR is taken over from #94900 Related #81851
2025-06-17[LoopPeel] Support last iteration peeling of min/max intrinsics (#143598)Philip Reames1-1/+4
This isn't terribly useful at the moment because of the step=1 restriction but it should be functionally sound. This is mostly just making sure the codepaths don't diverge as we make other changes.
2025-06-10[LoopPeel] Use loop guards when checking if last iter can be peeled. (#142605)Florian Hahn1-0/+3
Apply loop guards to BTC before checking if the last iteration should be peeled off. This also adds an assert to make sure applying the guards does not pessimize the results. I checked on a large test set and it did not trigger there, but it adds an additional guard to catch potential cases where loop-guards pessimize results. Peels ~15% more loops. PR: https://github.com/llvm/llvm-project/pull/142605
2025-06-06[LoopPeel] Handle non-local instructions/arguments when updating exiting ↵Yingwei Zheng1-1/+5
values (#142993) Similar to https://github.com/llvm/llvm-project/commit/7e14161f49b32387988cf9d937bbfaa27d0fbdd5, the exiting value may be a non-local instruction or an argument. Closes https://github.com/llvm/llvm-project/issues/142895.
2025-05-28Reapply "[LoopPeel] Remove known trip count restriction when peeling last. ↵Florian Hahn1-21/+62
(#140792)" This reverts commit 580454526b936f7a576ddbc9bb932cf9be376ec4. The recommitted version contains an extra check to not peel if the latch exit is controlled by a pointer induction. Original message: Remove the restriction that the loop must be known to execute at least 2 iterations when peeling the last iteration. If we cannot prove at least 2 iterations are executed, a check and branch to skip the peeled loop is inserted. PR: https://github.com/llvm/llvm-project/pull/140792
2025-05-27Revert "[LoopPeel] Remove known trip count restriction when peeling last. ↵Florian Hahn1-61/+21
(#140792)" This reverts commit 24b97756decb7bf0e26dcf0e30a7a9aaf27f417c. Also reverts ac9a466e39bf97ffeab127982aa7c405cb257551. Building CMake triggers a crash with the patch, revert while I investigate.
2025-05-27[LoopPeel] Insert new phis before first non-PHI when peeling last iter.Florian Hahn1-1/+1
Make sure the new phis are inserted before any non-phi instructions. This fixes a crash when dbg_value instructions are present in the original exit block.
2025-05-26[LoopPeel] Remove known trip count restriction when peeling last. (#140792)Florian Hahn1-21/+61
Remove the restriction that the loop must be known to execute at least 2 iterations when peeling the last iteration. If we cannot prove at least 2 iterations are executed, a check and branch to skip the peeled loop is inserted. PR: https://github.com/llvm/llvm-project/pull/140792
2025-05-25[LoopPeel] Make sure bound in exit condition is loop invariant.Florian Hahn1-5/+7
Follow-up to post-commit comment for (https://github.com/llvm/llvm-project/pull/139551. This should effectively be NFC, given the other existing restrictions.
2025-05-25[LoopPeel] Make sure AddRec is for correct loop when peeling last iter.Florian Hahn1-2/+4
Follow-up to post-commit comment for (https://github.com/llvm/llvm-project/pull/139551. This should effectively be NFC, given the other existing restrictions.
2025-05-18[LoopPeel] Make sure exit condition has a single use when peeling last.Florian Hahn1-3/+3
Update the check in canPeelLastIteration to make sure the exiting condition has a single use. When peeling the last iteration, we adjust the condition in the loop body to be true one iteration early, which would be incorrect for other users. Fixes https://github.com/llvm/llvm-project/issues/140444.
2025-05-18[LoopPeel] Handle constants when updating exit values when peeling last.Florian Hahn1-1/+1
Account for constant values when updating exit values after peeling an iteration from the end. This can happen if the inner loop gets unrolled and simplified. Fixes https://github.com/llvm/llvm-project/issues/140442.
2025-05-17Reapply "[LoopPeel] Implement initial peeling off the last loop iteration. ↵Florian Hahn1-106/+273
(#139551)" This reverts the revert commit bf92b127d2637948f53d11a187e865aa10e2e74c. This adds missing initialization of PeelLast in gatherPeelingPreferences. Original message: Generalize countToEliminateCompares to also consider peeling off the last iteration if it eliminates a compare. At the moment, codegen for peeling off the last iteration is quite restrictive and callers have to make sure that the exit condition can be adjusted when peeling and that the loop executes at least 2 iterations. Both will be relaxed in follow-ups. PR: https://github.com/llvm/llvm-project/pull/139551
2025-05-16Revert "[LoopPeel] Implement initial peeling off the last loop iteration. ↵Florian Hahn1-272/+106
(#139551)" This reverts commit bb10c3ba7f77d40a7fbfd4ac815015d3a4ae476a. Also reverts 4f663cca15f2b53c2bc6a84d1b1f5bd81679356d: Revert "[LoopPeel] Make sure PeelLast is always initialized." Revert for now to bring msan bots back to green https://lab.llvm.org/buildbot/#/builders/164/builds/9992 https://lab.llvm.org/buildbot/#/builders/94/builds/7158
2025-05-15[LoopPeel] Make sure PeelLast is always initialized.Florian Hahn1-0/+1
Make sure PeelLast is initialized on all paths. Should fix MSan bootstrap failures https://lab.llvm.org/buildbot/#/builders/164/builds/9992 https://lab.llvm.org/buildbot/#/builders/94/builds/7158 Fixup after https://github.com/llvm/llvm-project/pull/139551.
2025-05-15[LoopPeel] Implement initial peeling off the last loop iteration. (#139551)Florian Hahn1-106/+271
Generalize countToEliminateCompares to also consider peeling off the last iteration if it eliminates a compare. At the moment, codegen for peeling off the last iteration is quite restrictive and callers have to make sure that the exit condition can be adjusted when peeling and that the loop executes at least 2 iterations. Both will be relaxed in follow-ups. PR: https://github.com/llvm/llvm-project/pull/139551
2025-03-23[Transforms] Use *Set::insert_range (NFC) (#132652)Kazu Hirata1-7/+3
We can use *Set::insert_range to collapse: for (auto Elem : Range) Set.insert(E); down to: Set.insert_range(Range); In some cases, we can further fold that into the set declaration.
2025-03-04[LoopUtils] Don't wrap in getLoopEstimatedTripCount (#129080)Ramkumar Ramachandra1-13/+11
getLoopEstimatedTripCount returns the trip count based on profiling data, and its documentation says that it could return 0 when the trip count is zero, but this is not the case: a valid trip count can never be zero, and it returns 0 when the unsigned ExitCount is incremented by 1 and wraps. Some callers are careful about checking for this zero value in an std::optional, but it makes for an API with footguns, as a std::optional return value indicates that a non-nullopt value would be a valid trip count. Fix this by explicitly returning std::nullopt when the return value would wrap, and strip additional checks in callers. This also fixes a minor bug in LoopVectorize.
2024-12-13PatternMatch: migrate to CmpPredicate (#118534)Ramkumar Ramachandra1-1/+1
With the introduction of CmpPredicate in 51a895a (IR: introduce struct with CmpInst::Predicate and samesign), PatternMatch is one of the first key pieces of infrastructure that must be updated to match a CmpInst respecting samesign information. Implement this change to Cmp-matchers. This is a preparatory step in migrating the codebase over to CmpPredicate. Since we no functional changes are desired at this stage, we have chosen not to migrate CmpPredicate::operator==(CmpPredicate) calls to use CmpPredicate::getMatching(), as that would have visible impact on tests that are not yet written: instead, we call CmpPredicate::operator==(Predicate), preserving the old behavior, while also inserting a few FIXME comments for follow-ups.
2024-10-17[Transforms] Avoid repeated hash lookups (NFC) (#112654)Kazu Hirata1-6/+4
2024-09-20[LoopPeel] Fix LCSSA phi node invalidationNikita Popov1-1/+1
In the test case, the BECount of the second loop uses %load, but we only have an LCSSA phi node for %add, so that is what gets invalidated. Use the forgetLcssaPhiWithNewPredecessor() API instead, which will invalidate the roots of the expression instead. Fixes https://github.com/llvm/llvm-project/issues/109333.
2024-06-27[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)Nikita Popov1-1/+1
This is a helper to avoid writing `getModule()->getDataLayout()`. I regularly try to use this method only to remember it doesn't exist... `getModule()->getDataLayout()` is also a common (the most common?) reason why code has to include the Module.h header.
2024-06-12Reapply "[llvm][IR] Extend BranchWeightMetadata to track provenance o… ↵Paul Kirth1-2/+2
(#95281) …f weights" #95136 Reverts #95060, and relands #86609, with the unintended code generation changes addressed. This patch implements the changes to LLVM IR discussed in https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032 In this patch, we add an optional field to MD_prof meatdata nodes for branch weights, which can be used to distinguish weights added from llvm.expect* intrinsics from those added via other methods, e.g. from profiles or inserted by the compiler. One of the major motivations, is for use with MisExpect diagnostics, which need to know if branch_weight metadata originates from an llvm.expect intrinsic. Without that information, we end up checking branch weights multiple times in the case if ThinLTO + SampleProfiling, leading to some inaccuracy in how we report MisExpect related diagnostics to users. Since we change the format of MD_prof metadata in a fundamental way, we need to update code handling branch weights in a number of places. We also update the lang ref for branch weights to reflect the change.
2024-06-11Revert "[llvm][IR] Extend BranchWeightMetadata to track provenance of ↵Paul Kirth1-2/+2
weights" (#95060) Reverts llvm/llvm-project#86609 This change causes compile-time regressions for stage2 builds (https://llvm-compile-time-tracker.com/compare.php?from=3254f31a66263ea9647c9547f1531c3123444fcd&to=c5978f1eb5eeca8610b9dfce1fcbf1f473911cd8&stat=instructions:u). It also introduced unintended changes to `.text` which should be addressed before relanding.
2024-06-10[llvm][IR] Extend BranchWeightMetadata to track provenance of weights (#86609)Paul Kirth1-2/+2
This patch implements the changes to LLVM IR discussed in https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032 In this patch, we add an optional field to MD_prof metadata nodes for branch weights, which can be used to distinguish weights added from `llvm.expect*` intrinsics from those added via other methods, e.g. from profiles or inserted by the compiler. One of the major motivations, is for use with MisExpect diagnostics, which need to know if branch_weight metadata originates from an llvm.expect intrinsic. Without that information, we end up checking branch weights multiple times in the case if ThinLTO + SampleProfiling, leading to some inaccuracy in how we report MisExpect related diagnostics to users. Since we change the format of MD_prof metadata in a fundamental way, we need to update code handling branch weights in a number of places. We also update the lang ref for branch weights to reflect the change.
2024-05-31[LoopPeel] Support min/max intrinsics in loop peeling (#93162)Sergey Kachkov1-23/+63
This patch adds processing of min/max intrinsics in LoopPeel in the similar way as it was done for conditional statements: for min/max(IterVal, BoundVal) we peel iterations where IterVal < BoundVal for monotonically increasing IterVal; for monotonically decreasing IterVal we peel iterations where IterVal > BoundVal (strict comparision predicates are used to minimize number of peeled iterations).
2023-12-02[LoopPeel] Peel iterations based on and, or conditions (#73413)Joshua Cao1-4/+13
For example, this allows us to peel this loop with a `and`: ``` for (int i = 0; i < N; ++i) { if (i % 2 == 0 && i < 3) // can peel based on || as well f1(); f2(); ``` into: ``` for (int i = 0; i < 3; ++i) { // peel three iterations if (i % 2 == 0) f1(); f2(); } for (int i = 3; i < N; ++i) f2(); ```
2023-11-16Add setBranchWeigths convenience function. NFC (#72446)Matthias Braun1-13/+4
Add `setBranchWeights` convenience function to ProfDataUtils.h and use it where appropriate.
2023-10-31[LoopPeeling] Fix weights updating of peeled off branches (#70094)Aleksandr Popov1-3/+8
In https://reviews.llvm.org/D64235 a new algorithm has been introduced for updating the branch weights of latch blocks and their copies. It increases the probability of going to the exit block for each next peel iteration, calculating weights by (F - I * E, E), where: - F is a weight of the edge from latch to header. - E is a weight of the edge from latch to exit. - I is a number of peeling iteration. E.g: Let's say the latch branch weights are (100,300) and the estimated trip count is 4. If we peel off all 4 iterations the weights of the copied branches will be: 0: (100,300) 1: (100,200) 2: (100,100) 3: (100,1) https://godbolt.org/z/93KnoEsT6 So we make the original loop almost unreachable from the 3rd peeled copy according to the profile data. But that's only true if the profiling data is accurate. Underestimated trip count can lead to a performance issues with the register allocator, which may decide to spill intervals inside the loop assuming it's unreachable. Since we don't know how accurate the profiling data is, it seems better to set neutral 1/1 weights on the last peeled latch branch. After this change, the weights in the example above will look like this: 0: (100,300) 1: (100,200) 2: (100,100) 3: (100,100) Co-authored-by: Aleksandr Popov <apopov@azul.com>
2023-09-01[llvm] Fix duplicate word typos. NFCFangrui Song1-1/+1
Those fixes were taken from https://reviews.llvm.org/D137338
2023-07-19[LoopPeel] Clear dispositions after peelingNikita Popov1-0/+1
Block dispositions of values defined inside the loop may change during peeling, so clear them. We already do this for other kinds of unrolling. Differential Revision: https://reviews.llvm.org/D153762
2023-05-24[LoopUnroll] Peel iterations based on select conditionsJoshua Cao1-16/+33
This also allows us to peel loops with a `select`: ``` for (int i = 0; i <= N; ++i); f3(i == 0 ? a : b); // select instruction ``` into: ``` f3(a); // peel one iteration for (int i = 1; i <= N; ++i) f3(b); ``` Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D151052
2022-12-19[LoopPeel] Expose ValueMap of last peeled iteration. NFCAnna Thomas1-3/+1
The value map of last peeled iteration is computed within peelLoop API. This patch exposes it for callers of peelLoop. While this is not currently used by upstream passes, we have a usecase downstream which benefits from this API update. Future users of peelLoop can also use the ValueMap if needed. Similar value maps are exposed by other loop utilities such as loop cloning. Differential Revision: https://reviews.llvm.org/D138228
2022-12-14Don't include Optional.hKazu Hirata1-1/+0
These files no longer use llvm::Optional.
2022-12-14[NFC] Cleanup: Replace Function::getBasicBlockList().splice() with ↵Vasileios Porpodas1-3/+2
Function::splice() This is part of a series of patches that aim at making Function::getBasicBlockList() private. Differential Revision: https://reviews.llvm.org/D139984
2022-12-12Transforms/Utils: llvm::Optional => std::optionalFangrui Song1-8/+10
2022-12-05Expand loop peeling phi computation to handle binary ops and castsJamie Schmeiser1-1/+16
Summary: Expand the capabilities of the code for computing how many peels are needed to make phis determined. A cast gets the peel count for the value being casted while a binary op gets the maximum of the operands. Respond to review comments: remove redundant asserts. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By:mkazantsev (Max Kazantsev),syzaara (Zaara Syeda) Differential Revision: https://reviews.llvm.org/D138719
2022-12-02[Transforms] Use std::nullopt instead of None (NFC)Kazu Hirata1-2/+2
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26[Utils] Use std::optional in LoopPeel.cpp (NFC)Kazu Hirata1-1/+2
This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-25[NFC] Refactor loop peeling code for calculating phi invariance.Jamie Schmeiser1-62/+158
Summary: Refactor loop peeling code by moving code for calculating phi invariance into a separate class that does the calculation. Redescribe and rework the algorithm in preparation for adding increased functionality. Add test case that does not exhibit peeling that will be subsequently supported. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By: mkazantsev (Max Kazantsev) Differential Revision: https://reviews.llvm.org/D138232
2022-11-23[NFC] Replaced BB->getInstList().{erase(),pop_front(),pop_back()} with ↵Vasileios Porpodas1-1/+1
eraseFromParent(). Differential Revision: https://reviews.llvm.org/D138617
2022-10-25[LoopPeeling] Add flag to disable support for peeling loops with non-latch exitsAlina Sbirlea1-1/+21
Add a flag to allow disabling the changes in https://reviews.llvm.org/D134803. Differential Revision: https://reviews.llvm.org/D136643
2022-10-07[LoopPeeling] Support peeling loops with non-latch exitsNikita Popov1-104/+90
Loop peeling currently requires that a) the latch is exiting b) a branch and c) other exits are unreachable/deopt. This patch removes all of these limitations, and adds the necessary branch weight updating support. It essentially works the same way as before with latch -> exiting terminator and loop trip count -> per exit trip count. It's worth noting that there are still other limitations in profitability heuristics: This patch enables peeling of loops to make conditions invariant (which is pretty much always highly profitable if possible), while peeling to make loads dereferenceable still checks that non-latch exits are unreachable and PGO-based peeling has even more conditions. Those checks could be relaxed later if we consider those cases profitable. The motivation for this change is that loops using iterator adaptors in Rust often optimize very badly, and end up with a loop phi of the form phi(true, false) in the final result. Peeling eliminates that phi and conditions based on it, which enables a lot of follow-on simplification. Differential Revision: https://reviews.llvm.org/D134803
2022-09-26LoopPeel: Pass through AssumptionCache (NFC)Matt Arsenault1-4/+6
2022-09-19Analysis: Add AssumptionCache argument to isDereferenceableAndAlignedPointerMatt Arsenault1-1/+1
This does not try to pass it through from the end users.
2022-08-20Remove redundant initialization of Optional (NFC)Kazu Hirata1-1/+1
2022-08-07[Transforms] Fix comment typos (NFC)Kazu Hirata1-1/+1
2022-08-03[llvm][NFC] Refactor code to use ProfDataUtilsPaul Kirth1-1/+2
In this patch we replace common code patterns with the use of utility functions for dealing with profiling metadata. There should be no change in functionality, as the existing checks should be preserved in all cases. Reviewed By: bogner, davidxl Differential Revision: https://reviews.llvm.org/D128860
2022-07-27Revert "[llvm][NFC] Refactor code to use ProfDataUtils"Paul Kirth1-2/+1
This reverts commit 300c9a78819b4608b96bb26f9320bea6b8a0c4d0. We will reland once these issues are ironed out.