aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
AgeCommit message (Collapse)AuthorFilesLines
2025-08-03[Scalar] Remove an unnecessary cast (NFC) (#151849)Kazu Hirata1-1/+1
LoadType is already of Type *.
2025-06-11[DLCov][NFC] Annotate intentionally-blank DebugLocs in existing code (#136192)Stephen Tozer1-1/+2
Following the work in PR #107279, this patch applies the annotative DebugLocs, which indicate that a particular instruction is intentionally missing a location for a given reason, to existing sites in the compiler where their conditions apply. This is NFC in ordinary LLVM builds (each function `DebugLoc::getFoo()` is inlined as `DebugLoc()`), but marks the instruction in coverage-tracking builds so that it will be ignored by Debugify, allowing only real errors to be reported. From a developer standpoint, it also communicates the intentionality and reason for a missing DebugLoc. Some notes for reviewers: - The difference between `I->dropLocation()` and `I->setDebugLoc(DebugLoc::getDropped())` is that the former _may_ decide to keep some debug info alive, while the latter will always be empty; in this patch, I always used the latter (even if the former could technically be correct), because the former could result in some (barely) different output, and I'd prefer to keep this patch purely NFC. - I've generally documented the uses of `DebugLoc::getUnknown()`, with the exception of the vectorizers - in summary, they are a huge cause of dropped source locations, and I don't have the time or the domain knowledge currently to solve that, so I've plastered it all over them as a form of "fixme".
2024-11-02[Scalar] Remove unused includes (NFC) (#114645)Kazu Hirata1-2/+0
Identified with misc-include-cleaner.
2024-10-28Check hasOptSize() in shouldOptimizeForSize() (#112626)Ellis Hoag1-5/+2
2024-07-23[LLVM] Fix typo "depedent"Jay Foad1-5/+5
2024-07-11Revert "[LV] Autovectorization for the all-in-one histogram intrinsic" (#98493)Graham Hunter1-1/+0
Reverts llvm/llvm-project#91458 to deal with post-commit reviewer requests.
2024-07-11[LV] Autovectorization for the all-in-one histogram intrinsic (#91458)Graham Hunter1-0/+1
This patch implements limited loop vectorization support for the 'all-in-one' histogram intrinsic. The feature is disabled by default, and when enabled will only vectorize if there are no other users of values in the gather-modify-scatter sequence.
2024-06-27[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)Nikita Popov1-4/+4
This is a helper to avoid writing `getModule()->getDataLayout()`. I regularly try to use this method only to remember it doesn't exist... `getModule()->getDataLayout()` is also a common (the most common?) reason why code has to include the Module.h header.
2024-05-17[DebugInfo][LoopLoadElim] Fix missing debug location updates (#91839)Shan Huang1-1/+11
2024-05-05[LAA] Directly pass DepChecker to getSource/getDestination (NFC).Florian Hahn1-3/+4
Instead of passing LoopAccessInfo only to fetch the MemoryDepChecker, directly pass MemoryDepChecker. This simplifies the code and also allows new uses in places where no LAI is available.
2024-04-25[LAA] Support different strides & non constant dep distances using SCEV. ↵Florian Hahn1-1/+3
(#88039) Extend LoopAccessAnalysis to support different strides and as a consequence non-constant distances between dependences using SCEV to reason about the direction of the dependence. In multiple places, logic to rule out dependences using the stride has been updated to only be used if StrideA == StrideB, i.e. there's a common stride. We now also may bail out at multiple places where we may have to set FoundNonConstantDistanceDependence. This is done when we need to bail out and the distance is not constant to preserve original behavior. Fixes https://github.com/llvm/llvm-project/issues/87336 PR: https://github.com/llvm/llvm-project/pull/88039
2024-03-10Add llvm::min/max_element and use it in llvm/ and mlir/ directories. (#84678)Justin Lebar1-11/+12
For some reason this was missing from STLExtras.
2024-03-05[NFC][RemoveDIs] Insert instruction using iterators in Transforms/Jeremy Morse1-5/+7
As part of the RemoveDIs project we need LLVM to insert instructions using iterators wherever possible, so that the iterators can carry a bit of debug-info. This commit implements some of that by updating the contents of llvm/lib/Transforms/Utils to always use iterator-versions of instruction constructors. There are two general flavours of update: * Almost all call-sites just call getIterator on an instruction * Several make use of an existing iterator (scenarios where the code is actually significant for debug-info) The underlying logic is that any call to getFirstInsertionPt or similar APIs that identify the start of a block need to have that iterator passed directly to the insertion function, without being converted to a bare Instruction pointer along the way. Noteworthy changes: * FindInsertedValue now takes an optional iterator rather than an instruction pointer, as we need to always insert with iterators, * I've added a few iterator-taking versions of some value-tracking and DomTree methods -- they just unwrap the iterator. These are purely convenience methods to avoid extra syntax in some passes. * A few calls to getNextNode become std::next instead (to keep in the theme of using iterators for positions), * SeparateConstOffsetFromGEP has it's insertion-position field changed. Noteworthy because it's not a purely localised spelling change. All this should be NFC.
2023-11-15[LAA] Check if dependencies access loop-varying underlying objects.Florian Hahn1-1/+2
This patch adds a new dependence kind UnsafeIndirect, for cases where at least one of the memory access instructions may access a loop varying object, e.g. the address of underlying object is loaded inside the loop, like A[B[i]]. We cannot determine direction or distance in those cases, and also are unable to generate any runtime checks. This fixes a miscompile, if we attempt to generate runtime checks for unknown dependencies. Note that in most cases we do not attempt to generate runtime checks for unknown dependences, except if FoundNonConstantDistanceDependence is true. Fixes https://github.com/llvm/llvm-project/issues/69744.
2023-09-11[NFC][RemoveDIs] Prefer iterator-insertion over instructionsJeremy Morse1-2/+2
Continuing the patch series to get rid of debug intrinsics [0], instruction insertion needs to be done with iterators rather than instruction pointers, so that we can communicate information in the iterator class. This patch adds an iterator-taking insertBefore method and converts various call sites to take iterators. These are all sites where such debug-info needs to be preserved so that a stage2 clang can be built identically; it's likely that many more will need to be changed in the future. At this stage, this is just changing the spelling of a few operations, which will eventually become signifiant once the debug-info bearing iterator is used. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939 Differential Revision: https://reviews.llvm.org/D152537
2023-06-01[LoopLoadElimination] Add support for stride equal to -1Igor Kirillov1-8/+17
This patch allows us to gain all the benefits provided by LoopLoadElimination pass to descending loops. Differential Revision: https://reviews.llvm.org/D151448
2023-04-21[LoopLoadElimination] Preserve DT and LI (NFCI)Nikita Popov1-0/+2
This pass makes control-flow changes, but only via LoopSimplify and LoopVersioning utilities, which perserve DT and LI.
2023-04-17Remove several no longer needed includes. NFCIBjorn Pettersson1-3/+0
Mostly removing includes of InitializePasses.h and Pass.h in passes that no longer has support for the legacy PM.
2023-02-14[LoopLoadElimination] Remove legacy passFangrui Song1-64/+0
Following recent changes to remove non-core features of the legacy PM/optimization pipeline.
2022-10-04[LoopVersioning,LLE] Clear LoopAccessInfoManager after making changes.Florian Hahn1-0/+2
Loop versioning changes the control-flow, which may impact SCEVs cached by for other loops in LoopAccessInfoManager. Clear the manager after making changes. Fixes #57825. Depends on D134609. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D134611
2022-10-04[LAA] Pass LoopAccessInfoManager instead of GetLAA function.Florian Hahn1-13/+11
Use LoopAccessInfoManager directly instead of various GetLAA lambdas. Depends on D134608. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D134609
2022-10-01[LAA] Change to function analysis for new PM.Florian Hahn1-9/+3
At the moment, LoopAccessAnalysis is a loop analysis for the new pass manager. The issue with that is that LAI caches SCEV expressions and modifications in a loop may impact SCEV expressions in other loops, but we do not have a convenient way to invalidate LAI for other loops withing a loop pipeline. To avoid this issue, turn it into a function analysis which returns a manager object that keeps track of the individual LAI objects per loop. Fixes #50940. Fixes #51669. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D134606
2022-09-27[LAA] Make getPtrStride return Option instead of overloading zero as error ↵Philip Reames1-2/+2
value [nfc] This is purely NFC restructure in advance of a change which actually exposes zero strides. This is mostly because I find this interface confusing each time I look at it.
2022-09-02[LoopLoadElim] Add stores with matching sizes as load-store candidatesJolanta Jensen1-6/+22
We are not building up a proper list of load-store candidates because we are throwing away stores where the type don't match the load. This patch adds stores with matching store sizes as candidates. Author of the original patch: David Sherwood. Differential Revision: https://reviews.llvm.org/D130233
2022-04-29[CompileTime] [Passes] Avoid computing unnecessary analyses. NFCAnna Thomas1-1/+5
Similar to c515b2f39e77, If there are no loops in the function as seen through LI, we should avoid computing the remaining expensive analyses (such as SCEV, BPI). Reordered the analyses requests and early return if there are no loops. The logic of avoiding expensive analyses is applied to LoopVectorizer, LoopLoadElimination and LoopUnrollPass, i.e. all function passes which operate on loops. This is an NFC with compile time improvement. Differential Revision: https://reviews.llvm.org/D124529
2022-03-03Cleanup includes: Transform/Scalarserge-sans-paille1-1/+0
Estimated impact on preprocessor output line: before: 1062981579 after: 1062494547 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120817
2022-02-10[PSE] Remove assumption that top level predicate is union from public ↵Philip Reames1-2/+2
interface [NFC*] Note that this doesn't actually cause the top level predicate to become a non-union just yet. The * above comes from a case in the LoopVectorizer where a predicate which is later proven no longer blocks vectorization due to a change from checking if predicates exists to whether the predicate is possibly false.
2022-02-09[LoopLoadElim] Support opaque pointersArthur Eubanks1-1/+2
With typed pointers the pointer operand type checks the address space and the load/store type. With opaque pointers we have to check the load/store type separately.
2021-09-30[BPI] Keep BPI available in loop passes through LoopStandardAnalysisResultsAnna Thomas1-2/+2
This is analogous to D86156 (which preserves "lossy" BFI in loop passes). Lossy means that the analysis preserved may not be up to date with regards to new blocks that are added in loop passes, but BPI will not contain stale pointers to basic blocks that are deleted by the loop passes. This is achieved through BasicBlockCallbackVH in BPI, which calls eraseBlock that updates the data structures in BPI whenever a basic block is deleted. This patch does not have any changes in the upstream pipeline, since none of the loop passes in the pipeline use BPI currently. However, since BPI wasn't previously preserved in loop passes, the loop predication pass was invoking BPI *on the entire function* every time it ran in an LPM. This caused massive compile time in our downstream LPM invocation which contained loop predication. See updated test with an invocation of a loop-pipeline containing loop predication and -debug-pass turned ON. Reviewed-By: asbirlea, modimo Differential Revision: https://reviews.llvm.org/D110438
2021-09-11[LAA] Pass access type to getPtrStride()Nikita Popov1-2/+2
Pass the access type to getPtrStride(), so it is not determined from the pointer element type. Many cases still fetch the element type at a higher level though, so this only partially addresses the issue.
2021-08-16[MemorySSA] Remove unnecessary MSSA dependenciesNikita Popov1-5/+1
LoopLoadElimination, LoopVersioning and LoopVectorize currently fetch MemorySSA when construction LoopAccessAnalysis. However, LoopAccessAnalysis does not actually use MemorySSA and we can pass nullptr instead. This saves one MemorySSA calculation in the default pipeline, and thus improves compile-time. Differential Revision: https://reviews.llvm.org/D108074
2021-07-13[NFC] Inline variable to prevent unused variable warningArthur Eubanks1-2/+1
2021-07-13[OpaquePtr] Get load/store type without PointerType::getElementType()Arthur Eubanks1-2/+2
2021-02-15[LoopLoadElim] Pass ScalarEvolution in old pass manager. PR49141Max Kazantsev1-1/+2
Loop canonicalization may end up deleting blocks from CFG. And Scalar Evolution may still keep cached referenced to those blocks unless updated properly.
2020-12-17[NFC] Reduce include files dependency and AA header cleanup (part 2).dfukalov1-1/+0
Continuing work started in https://reviews.llvm.org/D92489: Removed a bunch of includes from "AliasAnalysis.h" and "LoopPassManager.h". Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D92852
2020-12-14[LAA] Relax restrictions on early exits in loop structurePhilip Reames1-0/+3
his is a preparation patch for supporting multiple exits in the loop vectorizer, by itself it should be mostly NFC. This patch moves the loop structure checks from LAA to their respective consumers (where duplicates don't already exist). Moving the checks does end up changing some of the optimization warnings and debug output slightly, but nothing that appears to be a regression. Why do this? Well, after auditing the code, I can't actually find anything in LAA itself which relies on having all instructions within a loop execute an equal number of times. This patch simply makes this explicit so that if one consumer - say LV in the near future (hopefully) - wants to handle a broader class of loops, it can do so. Differential Revision: https://reviews.llvm.org/D92066
2020-11-26[LoopLoadElim] Make sure all loops are in simplify form. PR48150Max Kazantsev1-4/+9
LoopLoadElim may end up expanding an AddRec from a loop which is not the current loop. This loop may not be in simplify form. We figure it out after the no-return point, so cannot bail in this case. AddRec requires simplify form to expand. The only way to ensure this does not crash is to simplify all loops beforehand. The issue only exists in new PM. Old PM requests LoopSimplify required pass and it simplifies all loops before the opt begins. Differential Revision: https://reviews.llvm.org/D91525 Reviewed By: asbirlea, aeubanks
2020-10-15[LoopVersion] Unify SCEVChecks and alias check handling (NFC).Florian Hahn1-3/+1
This is an initial cleanup of the way LoopVersioning interacts with LAA. Currently LoopVersioning has 2 ways of initializing things: 1. Passing LAI and passing UseLAIChecks = true 2. Passing UseLAIChecks = false, followed by calling setSCEVChecks and setAliasChecks. Both ways of initializing lead to the same result and the duplication seems more complicated than necessary. This patch removes the UseLAIChecks flag from the constructor and the setSCEVChecks & setAliasChecks helpers and move initialization exclusively to the constructor. This simplifies things, by providing a single way to initialize LoopVersioning and reducing duplication. Reviewed By: Meinersbur, lebedev.ri Differential Revision: https://reviews.llvm.org/D84406
2020-09-22[LoopInfo] empty() -> isInnermost(), add isOutermost()Stefanos Baziotis1-1/+1
Differential Revision: https://reviews.llvm.org/D82895
2020-09-15[BFI] Make BFI information available through loop passes inside ↵Wenlei He1-1/+2
LoopStandardAnalysisResults ~~D65060 uncovered that trying to use BFI in loop passes can lead to non-deterministic behavior when blocks are re-used while retaining old BFI data.~~ ~~To make sure BFI is preserved through loop passes a Value Handle (VH) callback is registered on blocks themselves. When a block is freed it now also wipes out the accompanying BFI entry such that stale BFI data can no longer persist resolving the determinism issue. ~~ ~~An optimistic approach would be to incrementally update BFI information throughout the loop passes rather than only invalidating them on removed blocks. The issues with that are:~~ ~~1. It is not clear how BFI information should be incrementally updated: If a block is duplicated does its BFI information come with? How about if it's split/modified/moved around? ~~ ~~2. Assuming we can address these problems the implementation here will be a massive undertaking. ~~ ~~There's a known need of BFI in LICM analysis which requires correct but not incrementally updated BFI data. A follow-up change can register BFI in all loop passes so this preserved but potentially lossy data is available to any loop pass that wants it.~~ See: D75341 for an identical implementation of preserving BFI via VH callbacks. The previous statements do still apply but this change no longer has to be in this diff because it's already upstream 😄 . This diff also moves BFI to be a part of LoopStandardAnalysisResults since the previous method using getCachedResults now (correctly!) statically asserts (D72893) that this data isn't static through the loop passes. Testing Ninja check Reviewed By: asbirlea, nikic Differential Revision: https://reviews.llvm.org/D86156
2020-09-10[LoopLoadElim] Filter away candidates that stop being AddRecs after loop ↵Max Kazantsev1-5/+20
versioning. PR47457 The test in PR47457 demonstrates a situation when candidate load's pointer's SCEV is no loger a SCEVAddRec after loop versioning. The code there assumes that it is always a SCEVAddRec and crashes otherwise. This patch makes sure that we do not consider candidates for which this requirement is broken after the versioning. Differential Revision: https://reviews.llvm.org/D87355 Reviewed By: asbirlea
2020-07-20[LLE] std::inserter doesn't work with SmallSet, so don't use it.Benjamin Kramer1-7/+5
2020-05-20[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC).Florian Hahn1-1/+1
SCEVExpander modifies the underlying function so it is more suitable in Transforms/Utils, rather than Analysis. This allows using other transform utils in SCEVExpander. This patch was originally committed as b8a3c34eee06, but broke the modules build, as LoopAccessAnalysis was using the Expander. The code-gen part of LAA was moved to lib/Transforms recently, so this patch can be landed again. Reviewers: sanjoy.google, efriedma, reames Reviewed By: sanjoy.google Differential Revision: https://reviews.llvm.org/D71537
2020-05-13[NewPassManager] Add assertions when getting statefull cached analysis.Alina Sbirlea1-2/+2
Summary: Analyses that are statefull should not be retrieved through a proxy from an outer IR unit, as these analyses are only invalidated at the end of the inner IR unit manager. This patch disallows getting the outer manager and provides an API to get a cached analysis through the proxy. If the analysis is not stateless, the call to getCachedResult will assert. Reviewers: chandlerc Subscribers: mehdi_amini, eraman, hiraditya, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72893
2020-04-28[LAA] Move CheckingPtrGroup/PointerCheck outside class (NFC).Florian Hahn1-5/+4
This allows forward declarations of PointerCheck, which in turn reduce the number of times LoopAccessAnalysis needs to be included. Ultimately this helps with moving runtime check generation to Transforms/Utils/LoopUtils.h, without having to include it there. Reviewers: anemet, Ayal Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D78458
2020-04-10[LoopLoadElim] Fix crash by always checking simplify formMax Kazantsev1-5/+5
Loop simplify form should always be checked because logic of propagateStoredValueToLoadUsers relies on it (in particular, it requires preheader). Reviewed By: Fedor Sergeev, Florian Hahn Differential Revision: https://reviews.llvm.org/D77775
2020-04-08[LoopLoadElim] Add test showing that LoopLoadElim doesn't work correctly ↵Max Kazantsev1-0/+1
with new PM
2020-04-06[NFC] Modernize misc. uses of Align/MaybeAlign APIs.Eli Friedman1-2/+1
Use the current getAlign() APIs where it makes sense, and use Align instead of MaybeAlign when we know the value is non-zero.
2020-01-04Revert "[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC)."Florian Hahn1-1/+1
This reverts commit 51ef53f3bd23559203fe9af82ff2facbfedc1db3, as it breaks some bots.
2020-01-04[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC).Florian Hahn1-1/+1
SCEVExpander modifies the underlying function so it is more suitable in Transforms/Utils, rather than Analysis. This allows using other transform utils in SCEVExpander. Reviewers: sanjoy.google, efriedma, reames Reviewed By: sanjoy.google Differential Revision: https://reviews.llvm.org/D71537