path: root/llvm/test/Transforms/LoopPredication
Age Commit message Author Files Lines
2025-10-17[SimpleLoopUnswitch] Don't use BlockFrequencyInfo to skip cold loops (#159522)Luke Lau1-60/+0
In https://reviews.llvm.org/D129599, non-trivial unswitching was disabled for cold loops in the interest of code size. This added a dependency on BlockFrequencyInfo with PGO, but in loop passes this is only available on a lossy basis: see https://reviews.llvm.org/D86156. LICM moved away from BFI, so as of today SimpleLoopUnswitch is the only remaining loop pass that uses BFI, solely to prevent code size increases in PGO builds. It doesn't use BFI if there's no profile summary available. After some investigation on llvm-test-suite it turns out that the lossy BFI causes very significant deviations in block frequency, since when loops are deleted or created during the loop pass manager it can return frequencies for different loops altogether. This results in unswitchable loops being mistakenly skipped because they are thought to be cold. This patch removes the use of BFI from SimpleLoopUnswitch and thus the last remaining use of BFI in a loop pass. To recover the original intent of not unswitching cold code, PGOForceFunctionAttrs can be used to annotate functions which can be optimized for code size, since SimpleLoopUnswitch will respect OptSize: https://reviews.llvm.org/D94559. This isn't 100% the same behaviour, since the previous behaviour checked for coldness at the loop level and this is now at the function level. We could expand PGOForceFunctionAttrs to be more granular at the loop level; https://github.com/llvm/llvm-project/issues/159595 tracks this idea.
2025-01-29[IR] Convert from nocapture to captures(none) (#123181)Nikita Popov1-6/+6
This PR removes the old `nocapture` attribute, replacing it with the new `captures` attribute introduced in #116990. This change is intended to be essentially NFC, replacing existing uses of `nocapture` with `captures(none)` without adding any new analysis capabilities. Making use of non-`none` values is left for a followup. Some notes: * `nocapture` will be upgraded to `captures(none)` by the bitcode reader. * `nocapture` will also be upgraded by the textual IR reader. This is to make it easier to use old IR files and somewhat reduce the test churn in this PR. * Helper APIs like `doesNotCapture()` will check for `captures(none)`. * MLIR import will convert `captures(none)` into an `llvm.nocapture` attribute. The representation in the LLVM IR dialect should be updated separately.
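As a rough illustration of the textual upgrade mentioned in the notes above, the sketch below rewrites a standalone `nocapture` token to `captures(none)` in a line of IR text. The helper name `upgrade_nocapture` and the regex approach are assumptions made for illustration only; the real upgrade happens inside LLVM's bitcode and textual IR readers, not through string replacement.

```python
import re

# Hypothetical pattern: match `nocapture` only when it stands alone as an
# attribute keyword, not when it is part of a symbol such as
# `@nocapture_helper` or a dotted name such as `llvm.nocapture`.
_NOCAPTURE = re.compile(r"(?<![\w@%.])nocapture(?![\w.])")


def upgrade_nocapture(ir_line: str) -> str:
    """Return ir_line with the old attribute spelled as captures(none)."""
    return _NOCAPTURE.sub("captures(none)", ir_line)


if __name__ == "__main__":
    print(upgrade_nocapture("declare void @f(ptr nocapture)"))
```

The lookarounds keep the sketch from touching identifiers that merely contain the word, which is the main pitfall of a purely textual rewrite.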
2024-11-21[llvm] Remove `br i1 undef` from some regression tests [NFC] (#117112)Lee Wei1-1/+1
This PR removes `br i1 undef` from tests under `llvm/test/Transforms/Loop*, Lower*`.
2024-05-03[StandardInstrumentation] Annotate loops with the function name (#90756)annamthomas1-2/+2
When analyzing pass debug output it is helpful to have the function name along with the loop name.
2023-12-07[SCEVExpander] Attempt to reinfer flags dropped due to CSE (#72431)Philip Reames1-1/+1
LSR uses SCEVExpander to generate induction formulas. The expander internally tries to reuse existing IR expressions. To do that, it needs to strip any poison generating flags (nsw, nuw, exact, nneg, etc.) which may not be valid for the newly added users. This is conservatively correct, but has the effect that LSR will strip nneg flags on zext instructions involved in trip counts in loop preheaders. To avoid this, this patch adjusts the expander to reinfer the flags on the CSE candidate if legal for all possible users. This should fix the regression reported in https://github.com/llvm/llvm-project/issues/71200. This should arguably be done inside canReuseInstruction instead, but doing it outside is more conservative compile time wise. Both canReuseInstruction and isGuaranteedNotToBePoison walk operand lists, so right now we are performing work which is roughly O(N^2) in the size of the operand graph. We should fix that before making the per operand step more expensive. My tentative plan is to land this, and then rework the code to sink the logic into more core interfaces.
2023-09-20[GuardUtils] Revert llvm::isWidenableBranch change (#66411)Aleksandr Popov1-0/+50
Commit d6e7c162e1df3736d8e2b3610a831b7cfa5be99b introduced a utility to extract widenable conditions from a branch. That utility was applied in llvm::isWidenableBranch to check whether a branch is widenable, so a branch was considered widenable if it had a widenable condition anywhere in its condition tree. That will only be correct once the GuardWidening rework from branch widening to widenable-condition widening is finished. For now we still need to check that a widenable branch has the form `br(widenable_condition & (...))`, because that form is assumed by the LoopPredication and GuardWidening algorithms. Fixes: https://github.com/llvm/llvm-project/issues/66418 Co-authored-by: Aleksander Popov <apopov@azul.com>
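The difference between the two checks can be sketched on a toy expression tree. Everything here is a hypothetical model: conditions are nested tuples `("and", lhs, rhs)`, the string `"WC"` stands for a widenable condition call, and the function names are invented. Accepting `WC` as either direct operand of the top-level `and` is also an assumption of this sketch, not a statement about the real llvm::isWidenableBranch.

```python
from typing import Tuple, Union

# Toy condition tree: a leaf is a string, an inner node is ("and", lhs, rhs).
Expr = Union[str, Tuple[str, "Expr", "Expr"]]

WC = "WC"  # stands in for a call to the widenable condition intrinsic


def has_wc_anywhere(cond: Expr) -> bool:
    """Relaxed check: WC appears somewhere in the condition tree."""
    if isinstance(cond, str):
        return cond == WC
    _, lhs, rhs = cond
    return has_wc_anywhere(lhs) or has_wc_anywhere(rhs)


def is_widenable_branch_strict(cond: Expr) -> bool:
    """Strict check: br(WC) or br(WC & (...)) with WC at the top level."""
    if cond == WC:
        return True
    if isinstance(cond, tuple) and cond[0] == "and":
        return cond[1] == WC or cond[2] == WC
    return False


if __name__ == "__main__":
    nested = ("and", "c1", ("and", "c2", WC))
    print(has_wc_anywhere(nested), is_widenable_branch_strict(nested))
```

A condition such as `br(c1 && (c2 && WC))` passes the relaxed check but fails the strict one, which is the case the revert restores.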
2023-09-19[LoopPredication] Fix division by zero in case of zero branch weights (#66506)Danila Malyutin2-1/+279
Treat the case where all branch weights are zero as if there was no profile. Fixes #66382
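A minimal sketch of the guard described above, assuming branch weights arrive as a plain list of non-negative integers. The function name `branch_probabilities` is hypothetical and does not correspond to an LLVM API; returning `None` models "behave as if there was no profile".

```python
from typing import List, Optional, Sequence


def branch_probabilities(weights: Sequence[int]) -> Optional[List[float]]:
    """Turn branch weights into probabilities.

    Returns None when every weight is zero (or there are no weights), so the
    caller falls back to its no-profile path instead of dividing by zero.
    """
    total = sum(weights)
    if total == 0:
        return None
    return [w / total for w in weights]


if __name__ == "__main__":
    print(branch_probabilities([1, 3]))
    print(branch_probabilities([0, 0]))
```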
2023-09-14[NFC] Add test for #66382Danila Malyutin1-0/+32
2023-08-18[LoopPredication] Rework assumes of widened conditionsAleksandr Popov1-7/+6
Currently after widening br(WC && (c1 && c2)) we insert an assume of (c1 && c2), which is joined to WC by an And operation. But we are going to support a more flexible form of widenable branches where WC could be placed arbitrarily in the expression tree, e.g. br(c1 && (c2 && WC)). In that case we won't have (c1 && c2) in the IR, so we need to add an explicit (c1 && c2) and then create an assumption of it. Reviewed By: anna Differential Revision: https://reviews.llvm.org/D157502
2023-08-10Revert "[NFC][LoopPredication] Add parsed checks logging"Aleksandr Popov1-7/+2
This reverts commit aa603c41caab63e246f4a4258c8b96e6ea06fdc9. Revert due to LLVM Buildbot failure
2023-08-10[NFC][LoopPredication] Add parsed checks loggingAleksandr Popov1-2/+7
Differential Revision: https://reviews.llvm.org/D157491
2023-05-25[GuardUtils] Allow intermediate blocks between widenable branch and deopt blockSerguei Katkov1-0/+75
Reviewed By: anna Differential Revision: https://reviews.llvm.org/D151082
2023-04-10[LoopPredication] Fix where we generate widened condition. PR61963Anna Thomas2-9/+4
Loop predication's predicateLoopExit pass does two incorrect things: (1) it sinks the widenable call into the loop, thereby converting an invariant condition to a variant one; (2) it widens the widenable call at a branch, thereby converting the branch into a loop-varying one. The latter is problematic when the branch may have been loop-invariant and prior optimizations (such as indvars) may have relied on this fact and updated the deopt state accordingly. When we then widen this with a loop-varying condition, the deopt state is no longer correct. Fixes https://github.com/llvm/llvm-project/issues/61963. Differential Revision: https://reviews.llvm.org/D147662
2023-04-10Simplify test with deopt state in D147662. NFCAnna Thomas1-4/+4
2023-04-06Precommit test from D147662Anna Thomas1-0/+101
2023-03-29[LoopPredication] Fix the LoopPredication by freezing the result of predication.Serguei Katkov11-153/+225
LoopPredication introduces the use of a possibly poison value in a branch (guard) instruction, so to avoid introducing undefined behavior it should be frozen. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D146685
2023-03-24[Test] Regenerate checks in test fileMax Kazantsev1-29/+50
2023-03-22[LoopPredication] Add a test demonstrating bug.Serguei Katkov1-0/+111
LoopPredication may introduce undefined behavior.
2023-02-27[LoopPredication] Account for critical edges when inserting assumes. PR26496Max Kazantsev1-5/+28
Loop predication can insert assumes to preserve knowledge about some facts that may otherwise be lost, because loop predication is a lossy transform. When a guard is represented as a branch on a widenable condition, the assume should be inserted in the guarded block. However, if the guarded block has predecessors other than the guard block, then the condition might not dominate it. Currently we generate invalid code here. One possible fix is to split the critical edge and insert the assume there, but in that case we would have to modify the CFG, which Loop Predication does not currently do, and we want to keep it that way. The fix is to handle this case by inserting a Phi which takes `Cond` as input from the guard block and `true` from any other blocks. This is valid in terms of IR and does not introduce any new knowledge if we came from another block. Differential Revision: https://reviews.llvm.org/D144859 Reviewed By: nikic, skatkov
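The Phi construction can be sketched as follows, using block names and value names as plain strings. The function `build_assume_phi` and its list-of-pairs result are hypothetical modelling choices for illustration; the pass itself builds a real PHI node in LLVM IR.

```python
from typing import List, Tuple


def build_assume_phi(
    guard_block: str, cond: str, predecessors: List[str]
) -> List[Tuple[str, str]]:
    """Return the (incoming value, incoming block) pairs for the Phi.

    The edge from the guard block carries the condition; every other
    predecessor carries the constant true, so assuming the Phi adds no
    knowledge on paths that bypassed the guard.
    """
    return [
        (cond if pred == guard_block else "true", pred)
        for pred in predecessors
    ]


if __name__ == "__main__":
    for value, block in build_assume_phi("guard", "%cond", ["guard", "other"]):
        print(f"[ {value}, %{block} ]")
```

Because the Phi lives in the guarded block itself, no edge has to be split and the CFG stays untouched.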
2023-02-27[Test] Add failing test for PR61022Max Kazantsev1-0/+43
Details: https://github.com/llvm/llvm-project/issues/61022
2023-01-02[LoopPredication] Convert tests to opaque pointers (NFC)Nikita Popov12-654/+654
2022-12-08[NFC] Port all LoopPredication tests to `-passes=` syntaxRoman Lebedev9-9/+9
2022-11-29[Test] Update tests for LoopPredication constant ranges wideningDmitry Makogon1-0/+18
2022-11-08[Test] Add tests with range checks with known constant rangesDmitry Makogon1-0/+532
LoopPredication might be able to turn such checks (which are not necessarily done on an IV) into loop-invariant checks.
2022-10-07[opt] Don't translate legacy -analysis flag to require<analysis>Arthur Eubanks1-1/+1
Tests relying on this should explicitly use -passes='require<analysis>,foo'.
2022-10-07[LoopPredication] Insert assumes of conditions of predicated guardsDmitry Makogon9-22/+270
As LoopPredication performs non-equivalent transforms removing some checks from loops, other passes may not be able to perform transforms they'd be able to do if the checks were left in loops. This patch makes LoopPredication insert assumes of the replaced conditions either after a guard call or in the true block of widenable condition branch. Differential Revision: https://reviews.llvm.org/D135354
2022-10-06[Test] Add test showing missed branch elimination due to loop predication transformDmitry Makogon1-0/+115
2022-09-09Loop names used in reporting can grow very largeJamie Schmeiser2-10/+9
Summary: The code for generating a name for loops for various reporting scenarios created a name by serializing the loop into a string. This may result in a very large name for a loop containing many blocks. Use the getName() function on the loop instead. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By: Whitney (Whitney Tsang), aeubanks (Arthur Eubanks) Differential Revision: https://reviews.llvm.org/D133587
2022-08-08[SimpleLoopUnswitch] Skip non-trivial unswitching of cold loopsRuobing Han1-0/+1
With profile data, non-trivial LoopUnswitch will only apply on non-cold loops, as unswitching cold loops may not gain much benefit but significantly increase the code size. Reviewed By: aeubanks, asbirlea Differential Revision: https://reviews.llvm.org/D129599
2021-11-29[SCEVExpander] Drop poison generating flags when reusing instructionsPhilip Reames1-1/+1
The basic problem we have is that we're trying to reuse an instruction which is mapped to some SCEV. Since we can have multiple such instructions (potentially with different flags), this is analogous to our need to drop flags when performing CSE. A trivial implementation would simply drop flags on any instruction we decided to reuse, and that would be correct. This patch is almost that trivial patch except that we preserve flags on the reused instruction when existing users would imply UB on overflow already. Adding new users can, at most, refine this program to one which doesn't execute UB which is valid. In practice, this fixes two conceptual problems with the previous code: 1) a binop could have been canonicalized into a form with different opcode or operands, or 2) the inbounds GEP case which was simply unhandled. On the test changes, most are pretty straightforward. We lose some flags (in some cases, they'd have been dropped on the next CSE pass anyway). The one that took me the longest to understand was the ashr-expansion test. What's happening there is that we're considering reuse of the mul; previously we disallowed it entirely, now we allow it with no flags. The surrounding diffs are all effects of generating the same mul with a different operand order, and then doing simple DCE. The loss of the inbounds is unfortunate, but even there, we can recover most of those once we actually treat branch-on-poison as immediate UB. Differential Revision: https://reviews.llvm.org/D112734
2021-10-28Revert rest of `IRBuilderBase`'s short-circuiting foldsRoman Lebedev4-7/+13
Upon further investigation and discussion, this is actually the opposite direction from what we should be taking, and this direction wouldn't solve the motivational problem anyway. Additionally, some more (polly) tests have escaped being updated. So, let's just take a step back here. This reverts commit f3190dedeef9da2109ea57e4cb372f295ff53b88. This reverts commit 749581d21f2b3f53e4fca4eb8728c942d646893b. This reverts commit f3df87d57e096143670e0fd396e81d43393a2dd2. This reverts commit ab1dbcecd6f0969976fafd62af34730436ad5944.
2021-10-27[IR] `IRBuilderBase::CreateAnd()`: short-circuit `x & 0` --> `0`Roman Lebedev2-6/+3
https://alive2.llvm.org/ce/z/YzPhSb Refs. https://reviews.llvm.org/D109368#3089809
2021-10-27[IR] `IRBuilderBase::CreateAnd()`: fix short-circuiting for constant on LHSRoman Lebedev3-7/+4
Refs. https://reviews.llvm.org/D109368#3089809
2021-10-27[NFC] Re-autogenerate check lines in some tests to ease future updatesRoman Lebedev3-69/+69
2021-10-19[LoopPredication] Calculate profitability without BPIAnna Thomas1-4/+4
Using BPI within loop predication is non-trivial because BPI is only preserved lossily in loop pass manager (one fix exposed by lossy preservation is up for review at D111448). However, since loop predication is only used in downstream pipelines, it is hard to keep BPI from breaking for incomplete state with upstream changes in BPI. Also, correctly preserving BPI for all loop passes is a non-trivial undertaking (D110438 does this lossily), while the benefit of using it in loop predication isn't clear. In this patch, we rely on profile metadata to get almost similar benefit as BPI, without actually using the complete heuristics provided by BPI. This avoids the compile time explosion we tried to fix with D110438 and also avoids fragile bugs because BPI can be lossy in loop passes (D111448). Reviewed-By: asbirlea, apilipenko Differential Revision: https://reviews.llvm.org/D111668
2021-09-30[BPI] Keep BPI available in loop passes through LoopStandardAnalysisResultsAnna Thomas1-11/+6
This is analogous to D86156 (which preserves "lossy" BFI in loop passes). Lossy means that the analysis preserved may not be up to date with regards to new blocks that are added in loop passes, but BPI will not contain stale pointers to basic blocks that are deleted by the loop passes. This is achieved through BasicBlockCallbackVH in BPI, which calls eraseBlock that updates the data structures in BPI whenever a basic block is deleted. This patch does not have any changes in the upstream pipeline, since none of the loop passes in the pipeline use BPI currently. However, since BPI wasn't previously preserved in loop passes, the loop predication pass was invoking BPI *on the entire function* every time it ran in an LPM. This caused massive compile time in our downstream LPM invocation which contained loop predication. See updated test with an invocation of a loop-pipeline containing loop predication and -debug-pass turned ON. Reviewed-By: asbirlea, modimo Differential Revision: https://reviews.llvm.org/D110438
2021-09-28Add profile count. Regenerate check lines. NFCAnna Thomas1-7/+7
Function profile counts added to test cases. Regenerated test lines for loop predication test.
2021-09-27[LoopPred Test] Fix lld-x86_64-win BB failureAnna Thomas1-1/+1
Need a more general CHECK line for testcase in 5df9112 for correctly handling lld-x86_64-win buildbot.
2021-09-27Reland "[LoopPredication] Add testcase showing BPI computation. NFC"Anna Thomas1-0/+65
This relands commit 16a62d4f. Relanded after fixing CHECK-LINES for opt pipeline output to be more general (based on failures seen in buildbot).
2021-09-27Revert "[LoopPredication] Add testcase showing BPI computation. NFC"Anna Thomas1-70/+0
This reverts commit 16a62d4f3dca189b0e0565c7ebcd83ddfcc67629. Needs some update to check lines to fix bb failure.
2021-09-27[LoopPredication] Add testcase showing BPI computation. NFCAnna Thomas1-0/+70
Precommit testcase for D110438. Since we do not preserve BPI in the loop pass manager, we are forced to compute BPI every time loop predication is invoked. The referenced patch changes that behaviour by preserving lossy BPI for loop passes.
2021-09-16Update LoopPredication test to fix buildbot failure.Daniil Suchkov1-9/+8
This patch updates tests added in 5f2b7879f16ad5023f0684febeb0a20f7d53e4a8.
2021-09-16[LoopPredication] Report changes correctly when attempting loop exit predicationDaniil Suchkov1-2/+2
To make the IR easier to analyze, this pass makes some minor transformations. After that, even if it doesn't decide to optimize anything, it can't report that it changed nothing and preserved all the analyses. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D109855
2021-09-16NFC. Add tests exposing missing analysis invalidation in LoopPredication.Daniil Suchkov1-0/+166
2021-09-02[LoopPredication] Fix MemorySSA crash in predicateLoopExitsAnna Thomas1-0/+28
The attached testcase crashes without the patch (Not the same accesses in the same order). When we move instructions before another instruction, we also need to update the memory accesses corresponding to it. Reviewed-By: asbirlea Differential Revision: https://reviews.llvm.org/D109197
2021-08-26[LoopPredication] Preserve MemorySSAAnna Thomas10-8/+11
Since LICM has now unconditionally moved to MemorySSA based form, all passes that run in same LPM as LICM need to preserve MemorySSA (i.e. our downstream pipeline). Added loop-mssa to all tests and perform -verify-memoryssa within LoopPredication itself. Differential Revision: https://reviews.llvm.org/D108724
2021-03-06[NFCI] SCEVExpander: emit intrinsics for integral {u,s}{min,max} SCEV expressionsRoman Lebedev1-95/+78
These intrinsics, not the icmp+select, are the canonical form nowadays, so we might as well directly emit them. This should not cause any regressions, but if it does, then they would need to be fixed regardless. Note that this doesn't deal with `SCEVExpander::isHighCostExpansion()`, but that is a pessimization, not a correctness issue. Additionally, the non-intrinsic form has issues with undef, see https://reviews.llvm.org/D88287#2587863
2020-06-26[BasicAA] Rename deprecated -basicaa to -basic-aaFangrui Song1-1/+1
Follow-up to D82607 Revert an accidental change (empty.ll) of D82683
2020-01-17[BasicBlock] fix looping in getPostdominatingDeoptimizeCallFedor Sergeev1-0/+55
Blindly following unique-successors chain appeared to be a bad idea. In a degenerate case when block jumps to itself that goes into endless loop. Discovered this problem when playing with additional changes, managed to reproduce it on existing LoopPredication code. Fix by checking a "visited" set while iterating through unique successors. Reviewed By: skatkov Tags: #llvm Differential Revision: https://reviews.llvm.org/D72908
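The visited-set fix can be sketched on a toy CFG. This is a hypothetical model: `unique_successor` maps a block name to its only successor (blocks without one are absent from the map), and `deopt_blocks` is the set of blocks that end in a deoptimize call. None of these names come from LLVM.

```python
from typing import Dict, Optional, Set


def postdominating_deopt_block(
    start: str,
    unique_successor: Dict[str, str],
    deopt_blocks: Set[str],
) -> Optional[str]:
    """Follow the unique-successor chain from start.

    Returns the final block if it ends in a deoptimize call, or None if the
    chain ends elsewhere or loops back on itself.
    """
    visited = {start}
    block = start
    while block in unique_successor:
        succ = unique_successor[block]
        if succ in visited:
            # Degenerate cycle (e.g. a block that jumps to itself): give up
            # instead of iterating forever.
            return None
        visited.add(succ)
        block = succ
    return block if block in deopt_blocks else None


if __name__ == "__main__":
    print(postdominating_deopt_block("a", {"a": "b", "b": "c"}, {"c"}))
    print(postdominating_deopt_block("a", {"a": "a"}, set()))
```

Without the `visited` check, the self-loop case in the second call would never terminate.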
2019-11-21[LoopPred] Robustly handle partially unswitched loopsPhilip Reames1-0/+111
We may end up with a case where we have a widenable branch above the loop, but not all widenable branches within the loop have been removed. Since a widenable branch inhibits SCEV's ability to reason about exit counts (by design), we have a tradeoff between the effectiveness of this optimization and allowing future widening of the branches within the loop. LoopPred is thought to be one of the most important optimizations for range check elimination, so let's pay the cost.