path: root/llvm/lib/Transforms/Scalar/LoopFlatten.cpp
Age | Commit message | Author | Files | Lines
2025-08-01[LAA] Support assumptions in evaluatePtrAddRecAtMaxBTCWillNotWrap (#147047)Florian Hahn1-1/+2
This patch extends the logic added in https://github.com/llvm/llvm-project/pull/128061 to support dereferenceability information from assumptions as well. Unfortunately both assumption cache and the dominator tree need to be threaded through multiple layers to make them available where needed. PR: https://github.com/llvm/llvm-project/pull/147047
2024-10-17[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706)Jay Foad1-4/+4
Convert many instances of: Fn = Intrinsic::getOrInsertDeclaration(...); CreateCall(Fn, ...) to the equivalent CreateIntrinsic call.
2024-10-11[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752)Rahul Joshi1-2/+2
Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation of adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e., just lookup, no creation).
2024-07-04[DebugInfo][LoopFlatten] Fix missing debug location update for new br instruction (#97085)Shan Huang1-2/+4
Fix #97084.
2024-06-28[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)Nikita Popov1-1/+1
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds `getDataLayout()` helpers to Function and GlobalValue, replacing the current `getParent()->getDataLayout()` pattern.
2024-05-15[Transforms] Preserve inbounds attribute of transformed GEPs when flattening loops (#86961)AtariDreams1-2/+4
When flattening the loop, if the GEP was inbounds, it should stay inbounds, because the only thing that changed is how the pointers are calculated, not the elements being accessed. Proof: https://alive2.llvm.org/ce/z/dApMpQ
2024-05-10[LAA] Support backward dependences with non-constant distance. (#91525)Florian Hahn1-1/+1
Following up to 933f49248, also update the code reasoning about backwards dependences to support non-constant distances. Update the code to use the signed minimum distance instead of a constant distance. This means we checked the lower bound of the dependence distance and the distance may be larger at runtime (and safe for vectorization). Whether to classify it as Unknown or Backwards depends on the vector width and LAA was updated to take TTI to get the maximum vector register width. If the minimum dependence distance is larger than the max vector width, we consider it as backwards-vectorizable. Otherwise we classify them as Unknown, so we re-try with runtime checks. PR: https://github.com/llvm/llvm-project/pull/91525
2024-03-05[NFC][RemoveDIs] Insert instruction using iterators in Transforms/Jeremy Morse1-1/+1
As part of the RemoveDIs project we need LLVM to insert instructions using iterators wherever possible, so that the iterators can carry a bit of debug-info. This commit implements some of that by updating the contents of llvm/lib/Transforms/Utils to always use iterator-versions of instruction constructors.
There are two general flavours of update:
* Almost all call-sites just call getIterator on an instruction
* Several make use of an existing iterator (scenarios where the code is actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar APIs that identify the start of a block needs to have that iterator passed directly to the insertion function, without being converted to a bare Instruction pointer along the way.
Noteworthy changes:
* FindInsertedValue now takes an optional iterator rather than an instruction pointer, as we need to always insert with iterators,
* I've added a few iterator-taking versions of some value-tracking and DomTree methods -- they just unwrap the iterator. These are purely convenience methods to avoid extra syntax in some passes.
* A few calls to getNextNode become std::next instead (to keep in the theme of using iterators for positions),
* SeparateConstOffsetFromGEP has its insertion-position field changed. Noteworthy because it's not a purely localised spelling change.
All this should be NFC.
2024-01-25[LoopFlatten] Use loop versioning when overflow can't be disproven (#78576)John Brawn1-13/+62
Implement the TODO in loop flattening to version the loop when we can't prove that the trip count calculation won't overflow.
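The idea of versioning a loop nest on a runtime overflow check can be sketched in plain C++. This is only an illustration under stated assumptions, not the pass's implementation: `tripCountOverflows` and `countIterations` are hypothetical names, and the check shown is a simple unsigned 32-bit multiply-overflow test.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: true if N * M is not representable in 32 unsigned bits.
bool tripCountOverflows(std::uint32_t N, std::uint32_t M) {
  return M != 0 && N > std::numeric_limits<std::uint32_t>::max() / M;
}

// Hypothetical versioned loop nest: counts iterations and reports which
// version of the loop was executed.
std::uint64_t countIterations(std::uint32_t N, std::uint32_t M,
                              bool &UsedFlattened) {
  std::uint64_t Count = 0;
  if (!tripCountOverflows(N, M)) {
    // Fast version: a single loop over the combined trip count.
    UsedFlattened = true;
    std::uint32_t Total = N * M;
    for (std::uint32_t K = 0; K < Total; ++K)
      ++Count;
  } else {
    // Fallback version: the original nested loops, left untouched.
    UsedFlattened = false;
    for (std::uint32_t I = 0; I < N; ++I)
      for (std::uint32_t J = 0; J < M; ++J)
        ++Count;
  }
  return Count;
}
```

Both versions do the same work; the runtime check only decides whether the combined trip count can be trusted.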
2024-01-10[LoopFlatten] Recognise gep+gep (#72515)John Brawn1-27/+52
Now that InstCombine canonicalises add+gep to gep+gep, LoopFlatten needs to recognise (gep (gep ptr (i*M)), j) as being something it can optimise.
2023-12-18[LLVM][IR] Replace ConstantInt's specialisation of getType() with getIntegerType(). (#75217)Paul Walker1-3/+2
The specialisation will not be valid when ConstantInt gains native support for vector types. This is largely a mechanical change but with extra attention paid to constant folding, InstCombineVectorOps.cpp, LoopFlatten.cpp and Verifier.cpp to remove the need to call `getIntegerType()`. Co-authored-by: Nikita Popov <github@npopov.com>
2023-10-10[ValueTracking] Use SimplifyQuery for the overflow APIs (NFC)Nikita Popov1-2/+3
Accept a SimplifyQuery instead of an unpacked list of arguments.
2023-04-25[SCEV] Common code for computing trip count in a fixed type [NFC-ish]Philip Reames1-6/+7
This is a follow on to D147117 and D147355. In both cases, we were adding special cases to compute zext(BTC+1) instead of zext(BTC)+1 when the BTC+1 computation was known not to overflow. Differential Revision: https://reviews.llvm.org/D148661
2023-04-17Remove several no longer needed includes. NFCIBjorn Pettersson1-3/+0
Mostly removing includes of InitializePasses.h and Pass.h in passes that no longer have support for the legacy PM.
2023-02-15[LoopFlatten] Inline an external linkage function not in llvm::. NFCFangrui Song1-16/+8
2023-02-15[LoopFlatten] Remove legacy pass (unused in the pipeline)Fangrui Song1-57/+0
Following recent changes to remove non-core legacy passes.
2023-01-12[NFC][LoopFlatten][LoopInterchange] Do not explicitly forget subloopsJoshua Cao1-2/+1
We don't need to explicitly forget subloops because forgetting parent loops will automatically forget their subloops Differential Revision: https://reviews.llvm.org/D141029
2023-01-08[NFC] Hide implementation details in anonymous namespacesBenjamin Kramer1-0/+2
2023-01-06[LoopFlattening] Check for extra uses on MulDavid Green1-0/+9
Similar to D138404, we were not guarding against extra uses of the Mul. In most cases other checks would catch the issue due to unsupported instructions in the outer loop, but certain non-canonical loop forms could still get through. Fixes #59339 Differential Revision: https://reviews.llvm.org/D141114
2022-12-05[LoopFlatten] Add some LLVM_DEBUG messages. NFC.Sjoerd Meijer1-5/+19
2022-11-26[Scalar] Use std::optional in LoopFlatten.cpp (NFC)Kazu Hirata1-2/+3
This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-22[LoopFlatten] Fix IV increment use countDavid Green1-1/+2
The add from the IV in the inner loop was always checking for 2 uses, the phi and the compare. The compare could be based on the phi though, leaving one valid use of the compare. In the testcase we could be left with the phi and a lcssa phi as the two users, invalidly allowing flattening where we shouldn't. Fixes 58441 Differential Revision: https://reviews.llvm.org/D138404
2022-11-21Don't use Optional::getPointer (NFC)Kazu Hirata1-3/+3
Since std::optional does not offer getPointer(), this patch replaces X.getPointer() with &*X to make the migration from llvm::Optional to std::optional easier. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716 Differential Revision: https://reviews.llvm.org/D138466
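The `&*X` spelling works the same way on a standard `std::optional`. A minimal sketch, where `pointerTo` is a hypothetical helper introduced only for illustration:

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper: returns a pointer to the contained value, or nullptr
// when the optional is empty. std::optional has no getPointer(), so the
// address of the dereferenced optional (&*X) is used instead.
const std::string *pointerTo(const std::optional<std::string> &X) {
  return X ? &*X : nullptr;
}
```

Dereferencing an empty `std::optional` is undefined behaviour, which is why the helper checks for a value before taking the address.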
2022-11-14[LoopFlatten] Forget all block and loop dispositions after flattenluxufan1-0/+1
Method forgetLoop only forgets expressions of phis or their users. SCEV expressions other than those may still have loop dispositions that point to the destroyed loop, which might cause a crash. Fixes: https://github.com/llvm/llvm-project/issues/58865 Reviewed By: nikic, fhahn Differential Revision: https://reviews.llvm.org/D137651
2022-10-18Revert "Recommit "[LoopFlatten] Enable it by default""Sjoerd Meijer1-1/+0
This reverts commit 5b9597f59a445523bd59b5251ab1c2865e74919f. A miscompilation was reported: https://github.com/llvm/llvm-project/issues/58441 Reverting this while I look at that.
2022-10-17Recommit "[LoopFlatten] Enable it by default"Sjoerd Meijer1-0/+1
The sanitizer bots turned green again after another change went in, i.e. revert 26dd64ba9cfabe5474bb207f3b7099965f81fed7, so I don't think this patch was causing the problems.
2022-10-17Revert "[LoopFlatten] Enable it by default"Sjoerd Meijer1-1/+0
This reverts commit 233659c7ae9b83b64a9f739d340736bca39c3d2e. I see some sanitizer build bot failures. Not sure if it is this change causing them, but let's see if a revert returns the bots to green...
2022-10-17[LoopFlatten] Enable it by defaultSjoerd Meijer1-0/+1
LoopFlatten has been in the code base off by default for years, but this enables it to run by default. Downstream this has been running for years, so it has been exposed to quite some code. Then around the time we switched to the NPM, several fixes went in related to updating the MemorySSA state and we moved it to a loop pass manager, which both helped prevent rerunning certain analysis passes, and thus helped a bit with compile-times. About compile-times, adding a pass isn't free, but this should see only very minor increases. The pass is relatively simple and there shouldn't be anything algorithmically expensive because all it does is look at inner/outer loops and check assumptions on loop increments and indices. If we see increases, I expect this to mainly come from invalidation of analysis info, and perhaps subsequent passes to trigger and do more. Despite its simplicity/restrictions, it triggers in most code-bases, which makes it worth enabling by default. Differential Revision: https://reviews.llvm.org/D109958
2022-08-18[CostModel] Replace getUserCost with getInstructionCostSimon Pilgrim1-1/+1
* Replace getUserCost with getInstructionCost, covering all cost kinds. * Remove getInstructionLatency, it's not implemented by any backends, and we should fold the functionality into getUserCost (now getInstructionCost) to make it easier for targets to handle the cost kinds with their existing cost callbacks. Original Patch by @samparker (Sam Parker) Differential Revision: https://reviews.llvm.org/D79483
2022-08-07[Transforms] Fix comment typos (NFC)Kazu Hirata1-1/+1
2022-08-07[llvm] Fix comment typos (NFC)Kazu Hirata1-1/+1
2022-06-20Don't use Optional::hasValue (NFC)Kazu Hirata1-2/+2
2022-06-07[LoopFlatten] Fix crash if the inner loop trip count comes from a sext instruction.Craig Topper1-2/+3
If we look through a truncate in matchLinearIVUser, it's possible we find a sext/zext instruction that didn't come from widening. This will fail the MatchedItCount->getType() == InnerInductionPHI->getType() assertion. Fix this by checking that we did not look through a truncate already. Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D127149
2022-06-07[LoopFlatten] Replace unchecked dyn_cast with cast.Craig Topper1-1/+1
Spotted while reading through the code. Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D127146
2022-03-03Cleanup includes: Transform/Scalarserge-sans-paille1-1/+2
Estimated impact on preprocessor output line: before: 1062981579 after: 1062494547 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120817
2022-01-24[LoopFlatten] Address FIXME about getTripCountFromExitCount. NFC.Sjoerd Meijer1-152/+199
Together with the previous commit which mainly documents better LoopFlatten's overall strategy, this addresses a concern added as a FIXME comment in D110587; the code refactoring (NFC) introduces functions (also for the SCEV usage) to make this clearer.
2022-01-24[LoopFlatten] Added comments about usage of various Loop APIs. NFC.Sjoerd Meijer1-20/+61
2022-01-19[LoopFlatten] Update MemorySSA stateSjoerd Meijer1-23/+55
I would like to move LoopFlatten from LoopPass Manager LPM2 to LPM1 (D116612), but that is a LPM that is using MemorySSA and so LoopFlatten needs to preserve MemorySSA and this adds that. More specifically, LoopFlatten restructures the CFG and with this change the MSSA state is updated accordingly, where we also update the DomTree. LoopFlatten doesn't rewrite/optimise/delete load or store instructions, so I have not added any MSSA updates for that. Differential Revision: https://reviews.llvm.org/D116660
2022-01-06[LoopFlatten] checkOverflow - use cast<> instead of dyn_cast<> to avoid dereference of nullptr.Simon Pilgrim1-1/+1
Fix static analysis warning by using cast<> instead of dyn_cast<> as both isa<> and isGuaranteedToExecuteForEveryIteration expect a non-null Instruction pointer.
2021-10-11[SCEV] Extend trip count to avoid overflow by defaultPhilip Reames1-2/+7
As a brief reminder, an "exit count" is the number of times the backedge executes before some event. It can be zero if we exit before the backedge is reached. A "trip count" is the number of times the loop header is entered if we branch into the loop. In general, TC = BTC + 1 and thus a zero trip count is ill defined. There is a corner case which we don't handle well. Let's assume i8 for our examples to keep things simple. If BTC = 255, then the correct trip count is 256. However, 256 is not representable in i8. In theory, code which needs to reason about trip counts is responsible for checking for this corner case, and either bailing out, or handling it correctly. Historically, we don't have a great track record about actually doing so. When reviewing D109676, I found myself asking a basic question. Was there any good reason to preserve the current wrap-to-zero behavior when converting from backedge taken counts to trip counts? After reviewing existing code, I could not find a single case which appears to correctly and precisely handle the overflow case. This patch changes the default behavior to extend instead of wrap. That is, if the result might be 256, we return a value of i9 type to ensure we interpret the count correctly. I did leave the legacy behavior as an option since a) loop-flatten stops triggering if I extend due to weirdly specific pattern matching I didn't understand and b) we could reasonably use the mode if we'd externally established a lack of overflow. I want to emphasize that this change is *not* NFC. There are two call sites (one in ScalarEvolution.cpp, one in LoopCacheAnalysis.cpp) which are switched to the extend semantics. The former appears imprecise (but correct) for a constant 255 BTC. The latter appears incorrect, though I don't have a test case. Differential Revision: https://reviews.llvm.org/D110587
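The i8 example can be reproduced with fixed-width integers. This is a sketch of the arithmetic only, not of the SCEV API: the function names are hypothetical, and `std::uint16_t` stands in for the i9 type, which C++ does not have.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical: trip count computed in the same 8-bit type as the
// backedge-taken count (BTC), so BTC = 255 wraps around to 0.
std::uint8_t tripCountWrapping(std::uint8_t BTC) {
  return static_cast<std::uint8_t>(BTC + 1);
}

// Hypothetical: trip count computed in a wider type, so BTC = 255 gives 256.
std::uint16_t tripCountExtended(std::uint8_t BTC) {
  return static_cast<std::uint16_t>(static_cast<std::uint16_t>(BTC) + 1);
}
```

For every BTC below 255 the two functions agree; they differ only in the overflow case the commit message describes.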
2021-10-08[LoopFlatten] Mark inner loop as deletedNikita Popov1-8/+10
If a loop is flattened, the inner loop is removed and the LPM should be informed of this fact, so it can invalidate associated analyses. To support this, we relax an assertion in LPMUpdater to allow invalidating non-top-level loops when running in LoopNestMode, as the pass does not know how exactly it will get scheduled. Differential Revision: https://reviews.llvm.org/D111350
2021-10-07[LoopFlatten] Mark loop analyses as preservedNikita Popov1-1/+1
LoopFlatten does preserve loop analyses (DT, LI and SCEV), but currently doesn't mark them as preserved in the NewPM (they are marked as preserved in the LegacyPM). I think this doesn't really have an effect in the end because the loop pass adaptor will just assume they're preserved anyway, but let's be explicit about this for the sake of clarity. Differential Revision: https://reviews.llvm.org/D111328
2021-09-29[LoopFlatten] Bail if we can't perform flattening after IV wideningSjoerd Meijer1-4/+22
It can happen that after widening of the IV, flattening may not be possible, e.g. when it is deemed unprofitable. We were not properly checking this, which resulted in flattening being applied when it shouldn't, also leading to incorrect results (miscompilation). This should fix PR51980 (https://bugs.llvm.org/show_bug.cgi?id=51980) Differential Revision: https://reviews.llvm.org/D110712
2021-09-28[LoopFlatten] Updating Phi nodes after IV wideningSjoerd Meijer1-8/+17
In rG6a076fa9539e, a problem with updating the old/narrow phi nodes after IV widening was introduced. If after widening of the IV the transformation is *not* applied, the narrow phi node was incorrectly modified, which should only happen if flattening happens. This can be seen in the added test widen-iv2.ll, which incorrectly had 1 incoming value, but should have its original 2 incoming values, which is now restored. Differential Revision: https://reviews.llvm.org/D110234
2021-09-10nullptr initialize variables, spotted on msan bots.Eric Christopher1-2/+2
2021-09-10[LoopFlatten] Make the analysis more robust after IV wideningSjoerd Meijer1-14/+49
LoopFlatten wasn't triggering on this motivating case after IV widening:

  void foo(int *A, int N, int M) {
    for (int i = 0; i < N; ++i)
      for (int j = 0; j < M; ++j)
        f(A[i*M+j]);
  }

The reason was that the old induction phi nodes were getting in the way. These narrow and dead induction phis are not always trivially dead, and having both the narrow and wide IVs confused the analysis and caused it to bail. This adds some extra bookkeeping for these old phis, so we can filter them out when checks on phi nodes are performed. Other cleanup passes will get rid of these old phis and increment instructions. As this was one of the motivating examples from the beginning, it was surprising this wasn't triggering from C/C++ code. It looks like the IR and CFG is just slightly different. Differential Revision: https://reviews.llvm.org/D109309
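For readers unfamiliar with the transformation, the effect of flattening a nest like the one in this commit message can be sketched at source level. This is an illustration under assumptions: summation stands in for the call to `f`, both function names are hypothetical, and the product N*M is computed in a wider type so that it cannot overflow here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original shape: two nested loops indexing A[i*M+j].
long sumNested(const std::vector<int> &A, int N, int M) {
  long Sum = 0;
  for (int I = 0; I < N; ++I)
    for (int J = 0; J < M; ++J)
      Sum += A[static_cast<std::size_t>(I * M + J)];
  return Sum;
}

// Flattened shape: one loop over N*M iterations with a single index.
long sumFlattened(const std::vector<int> &A, int N, int M) {
  // Guard so that non-positive bounds run zero iterations, as in the nest.
  if (N <= 0 || M <= 0)
    return 0;
  long Sum = 0;
  long Total = static_cast<long>(N) * M; // wide multiply
  for (long K = 0; K < Total; ++K)
    Sum += A[static_cast<std::size_t>(K)];
  return Sum;
}
```

The flattened loop visits the elements in the same order as the nest, since K takes exactly the values i*M+j in increasing order.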
2021-08-25[LoopFlatten] Add statistic for number of loops flattened. NFCRosie Sumpter1-2/+10
Differential Revision: https://reviews.llvm.org/D108644
2021-08-19[LoopFlatten] Fix assertion failureRosie Sumpter1-11/+21
There is an assertion failure in computeOverflowForUnsignedMul (used in checkOverflow) due to the inner and outer trip counts having different types. This occurs when the IV has been widened, but the loop components are not successfully rediscovered. This is fixed by some refactoring of the code in findLoopComponents which identifies the trip count of the loop. Differential Revision: https://reviews.llvm.org/D108107
2021-08-13[LoopFlatten] Fix assertion failure in checkOverflowRosie Sumpter1-34/+59
There is an assertion failure in computeOverflowForUnsignedMul (used in checkOverflow) due to the inner and outer trip counts having different types. This occurs when the IV has been widened, but the loop components are not successfully rediscovered. This is fixed by some refactoring of the code in findLoopComponents which identifies the trip count of the loop.
2021-08-02[LoopFlatten] Fix missed LoopFlatten opportunityRosie Sumpter1-5/+21
When the limit of the inner loop is a known integer, the InstCombine pass now causes the transformation e.g. icmp ult i32 %inc, tripcount -> icmp ult %j, tripcount-step (where %j is the inner loop induction variable and %inc is add %j, step), which is now accounted for when identifying the trip count of the loop. This is also an acceptable use of %j (provided the step is 1) so is ignored as long as the compare that it's used in is also the condition of the inner branch. Differential Revision: https://reviews.llvm.org/D105802