path: root/llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
Age | Commit message | Author | Files | Lines
2025-09-05 | [SimpleLoopUnswitch] Adjust cost multiplier accounting for parent loop size | Antonio Frighetto | 1 | -5/+22
When estimating the cost to avoid exponential unswitches of non-trivial invariant conditions, also consider the parent loop basic blocks size, ensuring this does not grow unexpectedly. Fixes: https://github.com/llvm/llvm-project/issues/138509.
2025-08-18 | Revert "[SimpleLoopUnswitch] Record loops from unswitching non-trivial conditions" | Antonio Frighetto | 1 | -26/+23
This reverts commit e9de32fd159d30cfd6fcc861b57b7e99ec2742ab due to multiple performance regressions observed across downstream Numba benchmarks (https://github.com/llvm/llvm-project/issues/138509#issuecomment-3193855772). While avoiding non-trivial unswitches on newly-cloned loops helps mitigate the pathological case reported in https://github.com/llvm/llvm-project/issues/138509, it may also make the IR less friendly to vectorization / loop-canonicalization (in the test reported, previously no select with loop-carried dependence existed in the new specialized loops), leading the abovementioned approach to be reconsidered.
2025-07-24 | [SimpleLoopUnswitch] Record loops from unswitching non-trivial conditions | Antonio Frighetto | 1 | -23/+26
Track newly-cloned loops coming from unswitching non-trivial invariant conditions, so as to prevent conditions in such cloned blocks from being unswitched again. Fixes: https://github.com/llvm/llvm-project/issues/138509.
2025-06-11 | [DLCov][NFC] Annotate intentionally-blank DebugLocs in existing code (#136192) | Stephen Tozer | 1 | -0/+3
Following the work in PR #107279, this patch applies the annotative DebugLocs, which indicate that a particular instruction is intentionally missing a location for a given reason, to existing sites in the compiler where their conditions apply. This is NFC in ordinary LLVM builds (each function `DebugLoc::getFoo()` is inlined as `DebugLoc()`), but marks the instruction in coverage-tracking builds so that it will be ignored by Debugify, allowing only real errors to be reported. From a developer standpoint, it also communicates the intentionality and reason for a missing DebugLoc.
Some notes for reviewers:
- The difference between `I->dropLocation()` and `I->setDebugLoc(DebugLoc::getDropped())` is that the former _may_ decide to keep some debug info alive, while the latter will always be empty; in this patch, I always used the latter (even if the former could technically be correct), because the former could result in some (barely) different output, and I'd prefer to keep this patch purely NFC.
- I've generally documented the uses of `DebugLoc::getUnknown()`, with the exception of the vectorizers - in summary, they are a huge cause of dropped source locations, and I don't have the time or the domain knowledge currently to solve that, so I've plastered it all over them as a form of "fixme".
2025-05-09 | [KeyInstr][LoopUnswitch] Remap cloned instructions' atoms (#133491) | Orlando Cazalet-Hyams | 1 | -0/+4
RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668
2025-05-04 | [Transforms] Remove unused local variables (NFC) (#138442) | Kazu Hirata | 1 | -1/+0
2025-03-30 | [Transforms] Use llvm::append_range (NFC) (#133650) | Kazu Hirata | 1 | -4/+2
2025-01-27 | [NFC][DebugInfo] Switch more call-sites to using iterator-insertion (#124283) | Jeremy Morse | 1 | -2/+2
To finalise the "RemoveDIs" work removing debug intrinsics, we're updating call sites that insert instructions to use iterators instead. This set of changes are those where it's not immediately obvious that just calling getIterator to fetch an iterator is correct, and one or two places where more than one line needs to change. Overall the same rule holds though: iterators generated for the start of a block such as getFirstNonPHIIt need to be passed into insert/move methods without being unwrapped/rewrapped, everything else can use getIterator.
2025-01-27 | [NFC][DebugInfo] Make some block-start-position methods return iterators (#124287) | Jeremy Morse | 1 | -1/+1
As part of the "RemoveDIs" work to eliminate debug intrinsics, we're replacing methods that use Instruction*'s as positions with iterators. A number of these (such as getFirstNonPHIOrDbg) are sufficiently infrequently used that we can just replace the pointer-returning version with an iterator-returning version, hopefully without much/any disruption. Thus this patch has getFirstNonPHIOrDbg and getFirstNonPHIOrDbgOrLifetime return an iterator, and updates all call-sites. There are no concerns about the iterators returned being converted to Instruction*'s and losing the debug-info bit: because the methods skip debug intrinsics, the iterator head bit is always false anyway.
2025-01-24 | [NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583) | Jeremy Morse | 1 | -1/+1
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago; however, to ensure some type safety, we'd like to have all calls to moveBefore use iterators. This patch adds a (guaranteed dereferenceable) iterator-taking moveBefore, and changes a bunch of call-sites where it's obviously safe to change to use it by just calling getIterator() on an instruction pointer. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer insertBefore, but not before adding concise documentation of what considerations are needed (very few).
2024-12-13 | PatternMatch: migrate to CmpPredicate (#118534) | Ramkumar Ramachandra | 1 | -4/+6
With the introduction of CmpPredicate in 51a895a (IR: introduce struct with CmpInst::Predicate and samesign), PatternMatch is one of the first key pieces of infrastructure that must be updated to match a CmpInst respecting samesign information. Implement this change to Cmp-matchers. This is a preparatory step in migrating the codebase over to CmpPredicate. Since no functional changes are desired at this stage, we have chosen not to migrate CmpPredicate::operator==(CmpPredicate) calls to use CmpPredicate::getMatching(), as that would have visible impact on tests that are not yet written: instead, we call CmpPredicate::operator==(Predicate), preserving the old behavior, while also inserting a few FIXME comments for follow-ups.
2024-11-27 | [SimpleLoopUnswitch] Fix LCSSA phi node invalidation | Nikita Popov | 1 | -2/+3
Fixes https://github.com/llvm/llvm-project/issues/117537.
2024-10-16 | [LLVM] Add `Intrinsic::getDeclarationIfExists` (#112428) | Rahul Joshi | 1 | -2/+2
Add `Intrinsic::getDeclarationIfExists` to look up an existing declaration of an intrinsic in a `Module`.
2024-09-12 | [LoopUnswitch] Remove redundant condition. (NFC) (#107893) | Amr Hesham | 1 | -2/+1
Simplify the redundant condition '!A || (A && B)' to '!A || B'. Fixes: #99799
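The equivalence claimed in this commit message can be checked exhaustively. The following is a minimal sketch, not code from the patch; the function names `originalForm`, `simplifiedForm`, and `formsAgree` are hypothetical and chosen only for illustration.

```cpp
#include <cassert>

// Original shape of the condition, as quoted in the commit message.
constexpr bool originalForm(bool A, bool B) { return !A || (A && B); }

// Simplified shape: the inner `A &&` is redundant because the right-hand
// side of `||` is only evaluated when `!A` is false, i.e. when A is true.
constexpr bool simplifiedForm(bool A, bool B) { return !A || B; }

// Compare both forms over all four combinations of A and B.
constexpr bool formsAgree() {
  return originalForm(false, false) == simplifiedForm(false, false) &&
         originalForm(false, true) == simplifiedForm(false, true) &&
         originalForm(true, false) == simplifiedForm(true, false) &&
         originalForm(true, true) == simplifiedForm(true, true);
}

static_assert(formsAgree(), "!A || (A && B) must equal !A || B");
```

Since the check runs at compile time, a mismatch would fail the build rather than a test run.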
2024-07-19 | [DebugInfo][SimpleLoopUnswitch] Fix missing debug location updates for new terminators (#98789) | Shan Huang | 1 | -7/+14
Fix #98787.
2024-07-15 | [DebugInfo][SimpleLoopUnswitch] Fix missing debug location updates (#97662) | Shan Huang | 1 | -11/+22
Fix #97559. For the change at line 1253, I propagate the debug location of the terminator (i.e., the insertion point) to the new phi. Because `MergeBB` is generated by splitting `ExitBB` several lines above, it only has the terminator, which could provide a reasonable debug location. For the change at line 2348, I switch the order of moving and cloning `TI`. Because `NewTI` cloned from `TI` is inserted into the original place where `TI` is, `NewTI` should preserve the original debug location. At the same time, doing this allows us to propagate the debug location to the new branch instruction replacing `NewTI` (the change at line 2446).
2024-06-29 | [IRBuilder] Don't include Module.h (NFC) (#97159) | Nikita Popov | 1 | -0/+1
This used to be necessary to fetch the DataLayout, but isn't anymore.
2024-05-08 | [RemoveDIs] Change remapDbgVariableRecord to remapDbgRecord (#91456) | Harald van Dijk | 1 | -3/+2
We need to remap any DbgRecord, not just DbgVariableRecords. This is the followup to #91447. Co-authored-by: PietroGhg <pietro.ghiglio@codeplay.com>
2024-04-12 | [Passes] Run SimpleLoopUnswitch after introducing invariant branches. (#81271) | Florian Hahn | 1 | -0/+1
IndVars may be able to replace a loop dependent condition with a loop invariant one, but loop-unswitch runs before IndVars, so the invariant check remains in the loop. For an example, consider a read-only loop with a bounds check: https://godbolt.org/z/8cdj4qhbG
This patch uses an approach similar to the way extra cleanup passes are run on demand after vectorization (added in acea6e9cfa4c4a0e8678c7). It introduces a new ShouldRunExtraSimpleLoopUnswitch analysis marker, which IndVars can use to indicate that extra unswitching is beneficial. ExtraSimpleLoopUnswitchPassManager uses this analysis to determine whether to run its passes on a loop.
Compile-time impact (geomean) ranges from +0.0% to 0.02% https://llvm-compile-time-tracker.com/compare.php?from=138c0beb109ffe47f75a0fe8c4dc2cdabe8a6532&to=19e6e99eeb280d426907ea73a21b139ba7225627&stat=instructions%3Au
Compile-time impact (geomean) of unconditionally running SimpleLoopUnswitch ranges from +0.05% - +0.16% https://llvm-compile-time-tracker.com/compare.php?from=138c0beb109ffe47f75a0fe8c4dc2cdabe8a6532&to=2930dfd5accdce2e6f8d5146ae4d626add2065a2&stat=instructions:u
Unconditionally running SimpleLoopUnswitch seems to indicate that there are multiple other scenarios where we fail to run unswitching when opportunities remain.
Fixes https://github.com/llvm/llvm-project/issues/85551. PR: https://github.com/llvm/llvm-project/pull/81271
2024-03-19 | [RemoveDIs][NFC] Rename DPValue -> DbgVariableRecord (#85216) | Stephen Tozer | 1 | -2/+3
This is the major rename patch that prior patches have built towards. The DPValue class is being renamed to DbgVariableRecord, which reflects the updated terminology for the "final" implementation of the RemoveDI feature. This is a pure string substitution + clang-format patch. The only manual component of this patch was determining where to perform these string substitutions: `DPValue` and `DPV` are almost exclusively used for DbgRecords, *except* for:
- llvm/lib/target, where 'DP' is used to mean double-precision, and so appears as part of .td files and in variable names. NB: There is a single existing use of `DPValue` here that refers to debug info, which I've manually updated.
- llvm/tools/gold, where 'LDPV' is used as a prefix for symbol visibility enums.
Outside of these places, I've applied several basic string substitutions, with the intent that they only affect DbgRecord-related identifiers; I've checked them as I went through to verify this, with reasonable confidence that there are no unintended changes that slipped through the cracks. The substitutions applied are all case-sensitive, and are applied in the order shown:
```
DPValue -> DbgVariableRecord
DPVal -> DbgVarRec
DPV -> DVR
```
Following the previous rename patches, it should be the case that there are no instances of any of these strings that are meant to refer to the general case of DbgRecords, or anything other than the DPValue class. The idea behind this patch is therefore that pure string substitution is correct in all cases as long as these assumptions hold.
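The order of the three substitutions matters because each shorter pattern is a prefix of the longer one. The sketch below illustrates that with plain `std::string` operations; it is not the tooling used for the patch, and the helper names `replaceAll` and `applyRenames` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Replace every case-sensitive occurrence of `from` in `text` with `to`.
std::string replaceAll(std::string text, const std::string &from,
                       const std::string &to) {
  if (from.empty())
    return text;
  std::size_t pos = 0;
  while ((pos = text.find(from, pos)) != std::string::npos) {
    text.replace(pos, from.size(), to);
    pos += to.size(); // continue after the inserted text
  }
  return text;
}

// Apply the three renames in the order listed in the commit message:
// longest pattern first, so that "DPValue" is never partially rewritten
// by the shorter "DPVal" or "DPV" rules.
std::string applyRenames(std::string text) {
  const std::vector<std::pair<std::string, std::string>> rules = {
      {"DPValue", "DbgVariableRecord"},
      {"DPVal", "DbgVarRec"},
      {"DPV", "DVR"},
  };
  for (const auto &rule : rules)
    text = replaceAll(std::move(text), rule.first, rule.second);
  return text;
}
```

Applying the `DPV` rule first would turn `DPValue` into `DVRalue`, which is why the longest pattern goes first. The same prefix effect would also rewrite the `LDPV` identifiers in llvm/tools/gold, which matches the exclusion noted in the message.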
2024-03-12 | [RemoveDIs][NFC] Rename common interface functions for DPValues->DbgRecords (#84793) | Stephen Tozer | 1 | -1/+1
As part of the effort to rename the DbgRecord classes, this patch renames the widely-used functions that operate on DbgRecords but refer to DbgValues or DPValues in their names to refer to DbgRecords instead; all such functions are defined in one of `BasicBlock.h`, `Instruction.h`, and `DebugProgramInstruction.h`. This patch explicitly does not change the names of any comments or variables, except for where they use the exact name of one of the renamed functions. The reason for this is reviewability; this patch can be trivially examined to determine that the only changes are direct string substitutions and any results from clang-format responding to the changed line lengths. Future patches will cover renaming variables and comments, and then renaming the classes themselves.
2024-03-05 | [NFC][RemoveDIs] Insert instruction using iterators in Transforms/ | Jeremy Morse | 1 | -6/+7
As part of the RemoveDIs project we need LLVM to insert instructions using iterators wherever possible, so that the iterators can carry a bit of debug-info. This commit implements some of that by updating the contents of llvm/lib/Transforms/Utils to always use iterator-versions of instruction constructors. There are two general flavours of update:
* Almost all call-sites just call getIterator on an instruction
* Several make use of an existing iterator (scenarios where the code is actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar APIs that identify the start of a block need to have that iterator passed directly to the insertion function, without being converted to a bare Instruction pointer along the way. Noteworthy changes:
* FindInsertedValue now takes an optional iterator rather than an instruction pointer, as we need to always insert with iterators,
* I've added a few iterator-taking versions of some value-tracking and DomTree methods -- they just unwrap the iterator. These are purely convenience methods to avoid extra syntax in some passes.
* A few calls to getNextNode become std::next instead (to keep in the theme of using iterators for positions),
* SeparateConstOffsetFromGEP has its insertion-position field changed. Noteworthy because it's not a purely localised spelling change.
All this should be NFC.
2023-11-24 | [SimpleLoopUnswitch] Remove callbacks (#73300) | Aiden Grossman | 1 | -88/+76
After the removal of the legacyPM version of simple loop unswitch, there is no longer a need for the callback mechanism to handle PM specific tasks. This patch removes the callbacks to help simplify the code now that they're no longer needed.
2023-11-24 | [DebugInfo][RemoveDIs] Support cloning and remapping DPValues (#72546) | Jeremy Morse | 1 | -0/+3
This patch adds support for CloneBasicBlock duplicating the DPValues attached to instructions, and adds facilities to remap them into their new context. The plumbing to achieve this is fairly straightforward and mechanical. I've also added illustrative uses to LoopUnrollRuntime, SimpleLoopUnswitch and SimplifyCFG. The former only updates for the epilogue right now so I've added CHECK lines just for the end of an unrolled loop (further updates coming later). SimpleLoopUnswitch had no debug-info tests so I've added a new one. The two modified parts of SimplifyCFG are covered by the two modified SimplifyCFG tests. These are scenarios where we have to do extra cloning for copying of DPValues because they're no longer instructions, and remap them too.
2023-11-21 | [NewPM] Remove SimpleLoopUnswitchLegacyPass (#72934) | Aiden Grossman | 1 | -105/+0
This pass isn't used anywhere in upstream and thus doesn't have any test coverage. For these reasons, remove it.
2023-09-21 | [SimpleLoopUnswitch] Fix exponential unswitch | Nikita Popov | 1 | -27/+34
When unswitching via invariant condition injection, we currently mark the condition in the old loop, so that it does not get unswitched again. However, if there are multiple branches for which conditions can be injected, then we can do that for both the old and new loop. This means that the number of unswitches increases exponentially. Change the handling to be more similar to partial unswitching, where we instead mark the whole loop, rather than a single condition. This means that we will only generate a linear number of loops. TBH I think even that is still highly undesirable, and we should probably be unswitching all candidates at the same time, so that we end up with only two loops. But at least this mitigates the worst case. The test case is a reduced variant that generates 1700 lines of IR without this patch and 290 with it. Fixes https://github.com/llvm/llvm-project/issues/66868.
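The difference between exponential and linear growth described above can be shown with a toy counting model. This is only a sketch under simplifying assumptions that are not taken from the pass itself: every unswitch splits one loop into two, there are `candidates` independent injectable conditions, and with whole-loop marking only one of the two resulting loops remains eligible for further injection. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of per-condition marking: after one unswitch, both resulting
// loops can still be unswitched on each remaining candidate, so the loop
// count doubles per candidate.
std::uint64_t loopsWithPerConditionMarking(unsigned candidates) {
  if (candidates == 0)
    return 1;
  return 2 * loopsWithPerConditionMarking(candidates - 1);
}

// Toy model of whole-loop marking: one of the two resulting loops is
// excluded from further injection, so each candidate adds a single loop.
std::uint64_t loopsWithWholeLoopMarking(unsigned candidates) {
  if (candidates == 0)
    return 1;
  return 1 + loopsWithWholeLoopMarking(candidates - 1);
}
```

Under these assumptions, 10 candidates give 1024 loops with per-condition marking and 11 with whole-loop marking. The real pass also applies cost checks, so actual counts will differ.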
2023-09-20 | [SimpleLoopUnswitch] Fix reversed branch during condition injection | Nikita Popov | 1 | -2/+2
The in-loop successor is only on the left after a potential condition inversion. As we re-use the old condition as-is, we should also reuse the old successors as-is. Fixes https://github.com/llvm/llvm-project/issues/63962.
2023-09-11 | [NFC][RemoveDIs] Provide an iterator-taking split-block method | Jeremy Morse | 1 | -4/+4
As per the stack of patches this is attached to, allow users of BasicBlock::splitBasicBlock to provide an iterator for a position, instead of just an instruction pointer. This is to fit with my proposal for how to get rid of debug intrinsics [0]. There are other call-sites that would need to change, but this is sufficient for a stage2clang self host and some other C++ projects to build identical binaries, in the context of the whole remove-DIs project. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939 Differential Revision: https://reviews.llvm.org/D152545
2023-09-11 | [NFC][RemoveDIs] Prefer iterator-insertion over instructions | Jeremy Morse | 1 | -6/+7
Continuing the patch series to get rid of debug intrinsics [0], instruction insertion needs to be done with iterators rather than instruction pointers, so that we can communicate information in the iterator class. This patch adds an iterator-taking insertBefore method and converts various call sites to take iterators. These are all sites where such debug-info needs to be preserved so that a stage2 clang can be built identically; it's likely that many more will need to be changed in the future. At this stage, this is just changing the spelling of a few operations, which will eventually become significant once the debug-info bearing iterator is used. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939 Differential Revision: https://reviews.llvm.org/D152537
2023-07-07 | TTI: Pass function to hasBranchDivergence in a few passes | Matt Arsenault | 1 | -2/+4
https://reviews.llvm.org/D152033
2023-06-14 | [SimpleLoopUnswitch] Unswitch AND/OR conditions of selects | Joshua Cao | 1 | -26/+29
If a select's condition is an AND/OR, we can unswitch invariant operands. This patch uses existing logic from unswitching AND/OR's for branch conditions. This patch fixes the Cost computation for unswitching selects to have the cost of the entire loop, since unswitching selects does not remove branches. This is required for this patch because otherwise, there are cases where unswitching selects of AND/OR is beating out unswitching of branches. This patch also prevents unswitching of logical AND/OR selects. This should instead be done by unswitching of AND/OR branch conditions. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D151677
2023-06-09 | [SimpleLoopUnswitch] Verify LoopInfo in turnGuardIntoBranch under a flag | Dmitry Makogon | 1 | -1/+2
A follow-up for 64397d8. Only do verification if VerifyLoopInfo is set.
2023-06-08 | [SimpleLoopUnswitch] Verify LoopInfo after turning guards to branches | Dmitry Makogon | 1 | -0/+1
SplitBlockAndInsertIfThen doesn't correctly update LoopInfo when called with Unreachable=true, which is the case when we turn guards to branches in SimpleLoopUnswitch. This adds LoopInfo verification before returning from turnGuardIntoBranch.
2023-06-05 | Revert "[LCSSA] Remove unused ScalarEvolution argument (NFC)" | Nikita Popov | 1 | -2/+2
This reverts commit 5362a0d859d8e96b3f7c0437b7866e17a818a4f7. In preparation for reverting a dependent revision.
2023-05-17 | [SimpleLoopUnswitch] turnGuardIntoBranch use BB utils to update DT | Joshua Cao | 1 | -24/+4
turnGuardIntoBranch() can use splitBlockAndInsertIfThen to update the DominatorTree rather than implementing it itself.
2023-05-15 | [SimpleLoopUnswitch] Skip trivial select conds for selects | Joshua Cao | 1 | -1/+1
Fixes https://github.com/llvm/llvm-project/issues/62715
If a select's condition is a trivial select:
```
%s = select %cond, i1 true, i1 false
```
Unswitch on %cond, rather than %s. This fixes crashes where there is a disparity between finding candidates and the transformation logic.
2023-05-11 | [SimpleLoopUnswitch][reland 2] unswitch selects | Joshua Cao | 1 | -43/+130
The old LoopUnswitch pass unswitched selects, but the changes were never ported to the new SimpleLoopUnswitch. We unswitch by turning:
```
S = select %cond, %a, %b
```
into:
```
head:
br %cond, label %then, label %tail
then:
br label %tail
tail:
S = phi [ %a, %then ], [ %b, %head ]
```
Unswitch selects are always nontrivial, since the successors do not exit the loop and the loop body always needs to be cloned. Unswitch selects always need to freeze the conditional if the conditional could be poison or undef. Selects don't propagate poison/undef, and branching on poison/undef causes UB.
Reland 1 - Fix the insertion of freeze instructions. The original implementation inserts a dead freeze instruction that is not used by the unswitched branch.
Reland 2 - Include https://reviews.llvm.org/D149560 in the same patch, which was originally reverted along with this patch. The patch prevents unswitching of selects with a vector conditional. This could have been caught in SimpleLoopUnswitch/crash.ll if it included tests for nontrivial unswitching. This reland also adds a run for the test file with nontrivial unswitching.
Reviewed By: nikic, kachkov98, vitalybuka Differential Revision: https://reviews.llvm.org/D138526
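A source-level analogy may help show what unswitching a select achieves once the branch form above has been hoisted out of the loop. The sketch below is written in C++ rather than IR and is not taken from the pass; `sumWithSelect` and `sumUnswitched` are hypothetical names, and the loop-invariant `cond` plays the role of `%cond`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Before: the loop-invariant condition is evaluated as a select on every
// iteration (the analogue of `S = select %cond, %a, %b`).
int sumWithSelect(const std::vector<int> &a, const std::vector<int> &b,
                  bool cond) {
  assert(a.size() == b.size());
  int sum = 0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    int s = cond ? a[i] : b[i];
    sum += s;
  }
  return sum;
}

// After: the condition is tested once outside the loop and the loop body is
// cloned into two specialized loops, which is why the commit message calls
// this kind of unswitch nontrivial.
int sumUnswitched(const std::vector<int> &a, const std::vector<int> &b,
                  bool cond) {
  assert(a.size() == b.size());
  int sum = 0;
  if (cond) {
    for (std::size_t i = 0; i < a.size(); ++i)
      sum += a[i];
  } else {
    for (std::size_t i = 0; i < b.size(); ++i)
      sum += b[i];
  }
  return sum;
}
```

Both functions return the same result for any inputs of equal length. The unswitched version trades code size (two loops) for a loop body without the per-iteration choice. The freeze requirement mentioned in the message has no direct C++ counterpart and is not modeled here.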
2023-05-10 | Revert "[SimpleLoopUnswitch] unswitch selects" | Benjamin Kramer | 1 | -129/+43
This reverts commit 21f226fc4591db6e98faf380137a42067c909582. Crashes on this test case:
  define void @test2() nounwind {
  entry:
    br label %bb.nph
  bb.nph: ; preds = %entry
    %and.i13521 = and <4 x i1> undef, undef
    br label %for.body
  for.body: ; preds = %for.body, %bb.nph
    %or.i = select <4 x i1> %and.i13521, <4 x i32> undef, <4 x i32> undef
    br i1 false, label %for.body, label %for.end
  for.end: ; preds = %for.body, %entry
    ret void
  }
2023-05-10 | [SimpleLoopUnswitch] unswitch selects | Joshua Cao | 1 | -43/+129
The old LoopUnswitch pass unswitched selects, but the changes were never ported to the new SimpleLoopUnswitch. We unswitch by turning:
```
S = select %cond, %a, %b
```
into:
```
head:
br %cond, label %then, label %tail
then:
br label %tail
tail:
S = phi [ %a, %then ], [ %b, %head ]
```
Unswitch selects are always nontrivial, since the successors do not exit the loop and the loop body always needs to be cloned. Unswitch selects always need to freeze the conditional if the conditional could be poison or undef. Selects don't propagate poison/undef, and branching on poison/undef causes UB.
Reviewed By: nikic, kachkov98, vitalybuka Differential Revision: https://reviews.llvm.org/D138526
2023-05-02 | [LCSSA] Remove unused ScalarEvolution argument (NFC) | Nikita Popov | 1 | -2/+2
After D149435, LCSSA formation no longer needs access to ScalarEvolution, so remove the argument from the utilities.
2023-05-01 | Revert "[SimpleLoopUnswitch] unswitch selects" | Vitaly Buka | 1 | -86/+15
Revert "Don't loop unswitch vector selects" Breaks msan. Details in D138526. This reverts commit bf089732775520624cb4983bfed6c341e1b4c405. This reverts commit e479ed90b591c18873fda68c12946b9d08cbe02f.
2023-04-30 | Don't loop unswitch vector selects | Valentin Churavy | 1 | -1/+2
Otherwise we could produce `br <2x i1>` which are of course not legal.
```
Branch condition is not 'i1' type!
br <2 x i1> %cond.fr1, label %entry.split.us, label %entry.split
%cond.fr1 = freeze <2 x i1> %cond
LLVM ERROR: Broken module found, compilation aborted!
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /home/vchuravy/builds/llvm/bin/opt -passes=simple-loop-unswitch<nontrivial> -S
```
Fixes change introduced by https://reviews.llvm.org/D138526 Reviewed By: caojoshua Differential Revision: https://reviews.llvm.org/D149560
2023-04-29 | [SimpleLoopUnswitch] unswitch selects | Joshua Cao | 1 | -15/+85
The old LoopUnswitch pass unswitched selects, but the changes were never ported to the new SimpleLoopUnswitch. We unswitch by turning:
```
S = select %cond, %a, %b
```
into:
```
head:
br %cond, label %then, label %tail
then:
br label %tail
tail:
S = phi [ %a, %then ], [ %b, %head ]
```
Unswitch selects are always nontrivial, since the successors do not exit the loop and the loop body always needs to be cloned.
Differential Revision: https://reviews.llvm.org/D138526 Co-authored-by: Sergey Kachkov <sergey.kachkov@syntacore.com>
2023-04-16 | [Transforms] Apply fixes from performance-for-range-copy (NFC) | Kazu Hirata | 1 | -1/+1
2023-04-12 | [SimpleLoopUnswitch] Do not try to inject pointer conditions. PR62058 | Max Kazantsev | 1 | -0/+2
As shown in https://github.com/llvm/llvm-project/issues/62058, canonicalization may fail with pointer types (and basically this transform is not expected to work with pointers).
2023-04-06 | [SimpleLoopUnswitch] Fix SCEV invalidation issue | Bjorn Pettersson | 1 | -7/+9
This patch is making sure that we use getTopMostExitingLoop when finding out which loops to forget, when dealing with unswitchNontrivialInvariants and unswitchTrivialSwitch. It seems to at least be needed for unswitchNontrivialInvariants as detected by the included test case. Note that unswitchTrivialBranch already used getTopMostExitingLoop. This was done in commit 4a9cde5a791cd49b96993e6. The commit message in that commit says "If the patch makes sense, I will also update those places to a similar approach ...", referring to these functions mentioned above. As far as I can tell that never happened, but this is an attempt to finally fix that. Fixes https://github.com/llvm/llvm-project/issues/61080 Differential Revision: https://reviews.llvm.org/D147058
2023-03-28 | [SimpleLoopUnswitch] Fix SCEV invalidation for unswitchTrivialSwitch | Bjorn Pettersson | 1 | -4/+7
When doing a trivial unswitch of a switch statement the code needs to "invalidate SCEVs for the outermost loop reached by any of the exits", as indicated by code comments. Depending on whether we find such an outermost loop or not we can limit the invalidation to some sub-loops or the full loop-nest. As shown in the added test case there seem to have been some bugs in the code that was finding the "outermost loop", so we could end up invalidating too few loops. Seems like commit 1bf8ae17f5e2714c8c87978 introduced the bug by moving the code that invalidates the loops above some of the code that computed 'OuterL'. This patch fixes that by also moving that computation of 'OuterL' so that we compute 'OuterL' properly before we use it for the SCEV invalidation. Differential Revision: https://reviews.llvm.org/D146963
2023-03-20 | [SimpleLoopUnswitch] Skip non-trivial unswitching of cold loop nests | Teresa Johnson | 1 | -4/+27
This fixes a compile time issue due to guarding loop unswitching based on whether the enclosing function is cold. That approach is very inefficient in the case of large cold functions that contain numerous loops, since the loop pass calls isFunctionColdInCallGraph once per loop, and that function walks all BBs in the function (twice for Sample PGO) looking for any non-cold blocks. Originally, this code only checked if the current Loop's header was cold (D129599). However, that apparently caused a slowdown on a SPEC benchmark, and the example given was that of a cold inner loop nested in a non-cold outer loop (see comments in D129599). The fix was to check if the whole function is cold, done in D133275. This is overkill, and we can simply check if the header of any loop in the current loop's loop nest is non-cold (looking at both outer and inner loops). This patch drops the compile time for a large module by 40% with this approach. I also updated PGO-nontrivial-unswitch2.ll since it only had one cold loop in a non-cold function, so that it instead had IR based off the example given in the comments relating to the SPEC degradation in D129599. I confirmed that the new version of the test fails with the original check done in D129599 of only the current loop's header coldness. Similarly updated test PGO-nontrivial-unswitch.ll to contain a cold loop in a cold loop nest, and created PGO-nontrivial-unswitch3.ll to contain a non-cold loop in a non-cold loop nest. Differential Revision: https://reviews.llvm.org/D146383
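The loop-nest coldness check described above can be sketched without any LLVM types. In the sketch below, `LoopNode` is a hypothetical stand-in for a loop (it is not LLVM's `Loop` class) and `HeaderIsCold` stands in for whatever the profile query reports. Reading "the current loop's loop nest" as the loop's enclosing loops plus the loops nested inside it is an assumption based on the wording of the message.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a loop in a loop nest; not LLVM's Loop class.
struct LoopNode {
  bool HeaderIsCold;                      // header block is cold per profile
  const LoopNode *Parent;                 // enclosing loop, or nullptr
  std::vector<const LoopNode *> SubLoops; // directly nested loops
};

// True if L or any loop nested inside it has a non-cold header.
bool hasNonColdHeaderInSubtree(const LoopNode &L) {
  if (!L.HeaderIsCold)
    return true;
  for (const LoopNode *Sub : L.SubLoops)
    if (hasNonColdHeaderInSubtree(*Sub))
      return true;
  return false;
}

// True only if L, every loop enclosing L, and every loop nested in L have
// cold headers. The work is bounded by the size of the loop nest instead of
// the number of basic blocks in the whole function.
bool isLoopNestCold(const LoopNode &L) {
  for (const LoopNode *Outer = L.Parent; Outer; Outer = Outer->Parent)
    if (!Outer->HeaderIsCold)
      return false;
  return !hasNonColdHeaderInSubtree(L);
}
```

With this shape, a cold inner loop inside a non-cold outer loop is not treated as a cold nest, which corresponds to the SPEC case that motivated moving away from the header-only check.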
2023-03-06 | [SimpleLoopUnswtich] Support zext when injecting invariant conditions | Max Kazantsev | 1 | -2/+14
This patch handles the following case: turn
```
if (x <u Invariant1) {
  if (zext(x) <u Invariant2) {
    ...
  }
}
```
into
```
if (x <u Invariant1) {
  if (zext(Invariant1) <=u Invariant2) { // Unswitch here
    // No check needed
  } else {
    if (zext(x) <u Invariant2) {
      ...
    }
  }
}
```
Differential Revision: https://reviews.llvm.org/D138015 Reviewed By: skatkov
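The reason the inner check becomes unnecessary is a short chain of unsigned comparisons: if `x <u Invariant1` and `zext(Invariant1) <=u Invariant2`, then `zext(x) <u zext(Invariant1) <=u Invariant2`. The sketch below brute-forces that implication for an assumed pair of widths (8-bit `x` and `Invariant1`, 16-bit `Invariant2`); the widths and all function names are illustrative choices, not details from the patch.

```cpp
#include <cassert>
#include <cstdint>

// The inner, loop-variant check: zext(x) <u Invariant2.
bool innerCheck(std::uint8_t x, std::uint16_t inv2) {
  return static_cast<std::uint16_t>(x) < inv2;
}

// The injected loop-invariant condition: zext(Invariant1) <=u Invariant2.
bool injectedCondition(std::uint8_t inv1, std::uint16_t inv2) {
  return static_cast<std::uint16_t>(inv1) <= inv2;
}

// For one pair of invariants: if the injected condition holds, every x that
// passes the outer check (x <u Invariant1) must also pass the inner check.
bool implicationHolds(std::uint8_t inv1, std::uint16_t inv2) {
  if (!injectedCondition(inv1, inv2))
    return true; // nothing is claimed on the else path
  for (unsigned x = 0; x < inv1; ++x)
    if (!innerCheck(static_cast<std::uint8_t>(x), inv2))
      return false;
  return true;
}

// Check all 8-bit Invariant1 values against Invariant2 in [0, maxInv2].
bool implicationHoldsUpTo(unsigned maxInv2) {
  for (unsigned inv1 = 0; inv1 <= 0xFF; ++inv1)
    for (unsigned inv2 = 0; inv2 <= maxInv2 && inv2 <= 0xFFFF; ++inv2)
      if (!implicationHolds(static_cast<std::uint8_t>(inv1),
                            static_cast<std::uint16_t>(inv2)))
        return false;
  return true;
}
```

When the injected condition is false, nothing can be concluded, which is why the else path keeps the original `zext(x) <u Invariant2` check.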
2023-03-06 | [NFC] Fix typo in comment | Max Kazantsev | 1 | -1/+1