aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/Analysis/LoopAccessAnalysis.cpp
AgeCommit message (Collapse)AuthorFilesLines
2023-06-23[LegacyPM] Remove LoopAccessLegacyAnalysisKazu Hirata1-43/+0
Differential Revision: https://reviews.llvm.org/D153610
2023-06-01[LV] Split off invariance check from isUniform (NFCI).Florian Hahn1-118/+4
After 572cfa3fde5433, isUniform now checks VF based uniformity instead of just invariance as before. As follow-up cleanup suggested in D148841, separate the invariance check out and update callers that currently check only for invariance. This also moves the implementation of isUniform from LoopAccessAnalysis to LoopVectorizationLegality, as LoopAccesAnalysis doesn't use the more general isUniform.
2023-06-01[LV] Bail out on loop-variant steps when rewriting SCEV exprs.Florian Hahn1-0/+4
If the step is not loop-invariant, we cannot create a modified AddRec, as the start needs to be loop-invariant. Mark those cases as CannotAnalyze and bail out, to fix a crash.
2023-05-31[LV] Use SCEV for uniformity analysis across VFFlorian Hahn1-2/+111
This patch uses SCEV to check if a value is uniform across a given VF. The basic idea is to construct SCEVs where the AddRecs of the loop are adjusted to reflect the version in the vectorized loop (Step multiplied by VF). We construct a SCEV for the value of the vector lane 0 (offset 0) compare it to the expressions for lanes 1 to the last vector lane (VF - 1). If they are equal, consider the expression uniform. While re-writing expressions, we also need to catch expressions we cannot determine uniformity (e.g. SCEVUnknown). Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D148841
2023-05-29[Analysis] Remove unused function stripIntegerCastKazu Hirata1-7/+0
The last use was removed by: commit d5b840131223f2ffef4e48ca769ad1eb7bb1869a Author: Philip Reames <preames@rivosinc.com> Date: Thu May 11 08:10:49 2023 -0700
2023-05-11[LAA] Simplify identification of speculatable strides [nfc]Philip Reames1-24/+24
Mostly just avoiding the need to keep both Value and SCEVs flowing through with consistent handling. We can do everything in terms of SCEV - aside from the profitability heuristics which are now isolated in one spot.
2023-05-11[LV/LAA] Use PSE to identify stride multiplies which simplify [mostly nfc]Philip Reames1-1/+0
LV/LAA will speculate that (some) strided access patterns have unit stride, and insert runtime checks if required. LV cost models a multiply by such a stride as free. We did this by keeping around the StrideSet structure, just to check if one of the operands were one of the strides we speculated. We can instead just ask PredicatedScalarEvolution if either of the operands are one (after predicates are applied). We get mostly the same result - PSE can prove it in more cases in theory - and simpler code.
2023-05-11[LAA/LV] Simplify stride speculation logic [NFC] (try 2)Philip Reames1-16/+27
The original commit wasn't quite NFC, and this was caught by an arguably overly strong assert. Specifically, I'd failed to strip off the integer cast off the SCEV before saving it in the map. The result - other than a failed assert - is that we'd speculate on the casted unknown, not the unknown. The only case I can think of where that might change behavior would be a sext(i1 load). I doubt that case is interesting in practice, but it's good to be strictly NFC on this change regardless. Original commit message follows.. The existing code makes it hard to tell that collectStridedAccess is really about identifying some loop invariant SCEV which is *profitable* to speculate is equal to one. The odd dual usage structure of Value and SCEV confuses this point. We could choose to loosen the profitability analysis if desired. I'm not proposing doing so at this time as it exposes too many cases where the speculation is unprofitable. Differential Revision: https://reviews.llvm.org/D147750
2023-05-11Revert "[LAA/LV] Simplify stride speculation logic [NFC]"Philip Reames1-19/+15
This reverts commit d5b840131223f2ffef4e48ca769ad1eb7bb1869a. Running this through broader testing after rebasing is revealing a crash. Reverting while I investigate.
2023-05-11[LAA/LV] Simplify stride speculation logic [NFC]Philip Reames1-15/+19
The existing code makes it hard to tell that collectStridedAccess is really about identifying some loop invariant SCEV which is *profitable* to speculate is equal to one. The odd dual usage structure of Value and SCEV confuses this point. We could choose to loosen the profitability analysis if desired. I'm not proposing doing so at this time as it exposes too many cases where the speculation is unprofitable. Differential Revision: https://reviews.llvm.org/D147750
2023-05-01[LAA] Add command line flag to disable unit stride speculationPhilip Reames1-0/+10
This is purely so that we can expose and work through downstream codegen issues. My intention is to see if we can get this disabled by default, but that requires fixing a bunch of downstream issues first.
2023-05-01[LAA] Rework overflow checking in getPtrStride [nfc]Philip Reames1-49/+35
The previous code structure and comments were exceedingly confusing. I have multiple times looked at this code and suspected a bug. This time, I decided to take the time to reflow the code and comment out why it is correct. The only suspect (to me) case left is that an underaligned access with a unit stride (in terms of the access type) might miss the undefined null pointer when wrapping. This is unlikely to be an issue for C/C++ code with real page sizes, so I'm not bothering to fully convince myself whether that case is correct or not.
2023-05-01[LAA] Use early return [nfc]Philip Reames1-17/+17
2023-04-25[SCEV] Do not plant SCEV checks unnecessarilyPaul Osmialowski1-0/+5
The vectorisation analysis collects strides for loop invariant pointers, which is wrong because they are not strided. We don't need to generate SCEV checks (which are costly performancewise) for such pointers, we just need to do the appropriate aliasing checks. This patch fixes the problem by changing getStrideFromPointer() to treat loop invariant pointers as having no stride. Originally proposed by David Sherwood with further suggestions from Florian Hahn. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D146958
2023-04-06[LAA] Cleanup casting in replaceSymbolicStrideSCEV [nfc]Philip Reames1-4/+4
2023-04-06[LAA] Continue moving utilities to sole use to isolate symbolic stride ↵Philip Reames1-1/+44
reasoning [nfc]
2023-04-05[LAA] Group implementation of stride speculation into one file [nfc]Philip Reames1-0/+91
These utilities are only used in one place, so move them there and make them static.
2023-03-17[Analysis] Make order of analysis executions more stableBjorn Pettersson1-5/+7
When debugging and using debug-pass-manager (e.g. in regression tests) we prefer a consistent order in which analysis passes are executed. But when for example doing return MyClass(AM.getResult<LoopAnalysis>(F), AM.getResult<DominatorTreeAnalysis>(F)); then the order in which LoopAnalysis and DominatorTreeAnalysis isn't guaranteed, and might for example depend on which compiler that is used when building LLVM. I've not scanned the full source tree, but this fixes some occurances of the above pattern found in lib/Analysis. This problem was discussed briefly in review for D146206.
2023-03-17[LAA] Fix transitive analysis invalidation bug by implementing ↵Bjorn Pettersson1-0/+18
LoopAccessInfoManager::invalidate The default invalidate method for analysis results is just looking at the preserved state of the pass itself. It does not consider if the analysis has an internal state that depend on other analyses. Thus, we need to implement LoopAccessInfoManager::invalidate in order to catch if LoopAccessAnalysis needs to be invalidated due to transitive analyses such as AAManager is being invalidated. Otherwise we might end up having references to an AAManager that is stale. Fixes https://github.com/llvm/llvm-project/issues/61324 Differential Revision: https://reviews.llvm.org/D146206
2023-01-11[NFC] Use TypeSize::geFixedValue() instead of TypeSize::getFixedSize()Guillaume Chatelet1-1/+1
This change is one of a series to implement the discussion from https://reviews.llvm.org/D141134.
2022-12-14[Analysis] llvm::Optional => std::optionalFangrui Song1-13/+15
2022-12-05[LAA] Use cross-iteration alias analysisNikita Popov1-1/+4
LAA analyzes cross-iteration memory dependencies, as such AA should not make assumptions about equality of values inside the loop, as they may come from different iterations. Fix this by exposing the MayBeCrossIteration AA flag and enabling it for LAA. Differential Revision: https://reviews.llvm.org/D137958
2022-12-05[AST] Make AliasSetTracker work on BatchAANikita Popov1-1/+4
D138014 restricted AST to work on immutable IR. This means it is also safe to use a single BatchAA instance for the entire AST lifetime, instead of only batching parts of individual queries. The primary motivation for this is not compile-time, but rather having a central place to control cross-iteration AA, which will be used by D137958. Differential Revision: https://reviews.llvm.org/D137955
2022-12-04Compress a few pairs using PointerIntPairsBenjamin Kramer1-53/+46
Use the uniform structured bindings interface where possible. NFCI.
2022-12-02[Analysis] Use std::nullopt instead of None (NFC)Kazu Hirata1-13/+13
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-10-04[LAA] Use LoopAccessInfoManager in legacy pass.Florian Hahn1-26/+6
Simplify LoopAccessLegacyAnalysis by using LoopAccessInfoManager from D134606. As a side-effect this also removes printing support from LoopAccessLegacyAnalysis. Depends on D134606. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D134608
2022-10-01[LAA] Change to function analysis for new PM.Florian Hahn1-5/+18
At the moment, LoopAccessAnalysis is a loop analysis for the new pass manager. The issue with that is that LAI caches SCEV expressions and modifications in a loop may impact SCEV expressions in other loops, but we do not have a convenient way to invalidate LAI for other loops withing a loop pipeline. To avoid this issue, turn it into a function analysis which returns a manager object that keeps track of the individual LAI objects per loop. Fixes #50940. Fixes #51669. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D134606
2022-09-27[LAA] Make getPtrStride return Option instead of overloading zero as error ↵Philip Reames1-16/+17
value [nfc] This is purely NFC restructure in advance of a change which actually exposes zero strides. This is mostly because I find this interface confusing each time I look at it.
2022-09-21[LAA] Fix ICE with scAddExpr in forked pointersGraham Hunter1-2/+11
The IR from https://github.com/llvm/llvm-project/issues/57368 results in an assert firing when trying to create a runtime check for the forked pointer. One of the forks is fine since it's loop invariant, but the other is a scAddExpr (containing a scAddRecExpr, so not invariant) when RtCheck::insert expects a scAddRecExpr. This is a simple fix to just avoid forks which aren't AddRec or loop invariant. We can allow it as a forked pointer later with more work. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D133020
2022-08-28[llvm] Use llvm::find_if (NFC)Kazu Hirata1-5/+4
2022-08-26[LAA] Require AddRecs to be in the innermost loop for diff-checks.Florian Hahn1-1/+2
The simpler diff-checks require pointers with add-recs from the same innermost loop, but this property wasn't check completely. Add the missing check to ensure both addrecs are in the innermost loop. Fixes #57315.
2022-08-25[LAA] Prune dependencies with distance large than access implied by trip countPhilip Reames1-7/+8
When we have a dependency with a dependence distance which can only be hit on an iteration beyond the actual trip count of the loop, we can ignore that dependency when analyzing said loop. We already had this code, but had restricted it solely to unknown dependence distances. This change applies it to all dependence distances. Without this code, we relied on the vectorizer reducing VF such that our infeasible dependence was respected. This usually worked out to about the same result, but not always. For fixed length vectorization, this could mean a smaller VF than optimal being chosen or additional runtime checks. For scalable vectorization - where the bounds on access implied by VF are broader - we could often not find a feasible VF at all. Differential Revision: https://reviews.llvm.org/D131924
2022-08-25[LAA] Cache PSE.getSE() in variable (NFC).Florian Hahn1-4/+4
Preparation for follow-up patches will introduce additional uses of SE.
2022-08-23[NFC] LoopAccess: Move expressions close to usageAditya Kumar1-5/+6
Avoids useless evaluation of these expressions. Reviewed By: michaelmaitland, fhahn Differential Revision: https://reviews.llvm.org/D132337
2022-08-18[LoopVectorize][LoopAccessAnalysis] add newline to debug messageMichael Maitland1-1/+1
A debug message in `LoopAccessAnalysis` did not have a newline in it, causing printed debug messages to be formatted incorrectly. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D132172
2022-08-17[LAA] Handle forked pointers with add/sub instructionsGraham Hunter1-0/+40
Handle cases where a forked pointer has an add or sub instruction before reaching a select. Reviewed By: fhahn Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D130278
2022-07-28[LAA] Remove block order sensitivity in LAA algorithm. PR56672Max Kazantsev1-2/+6
As test in PR56672 shows, LAA produces different results which lead to either positive or negative vectorization decisions depending on the order of blocks in loop. The exact reason of this is not clear to me, however this makes investigation of related bugs extremely complex. Current order of blocks in the loop is arbitrary. It may change, for example, if loop info analysis is dropped and recomputed. Seems that it interferes with LAA's logic. This patch chooses fixed traversal order of blocks in loops, making it RPOT. Note: this is *not* a fix for bug with incorrect analysis result. It just makes the answer more robust to make the investigation easier. Differential Revision: https://reviews.llvm.org/D130482 Reviewed By: aeubanks, fhahn
2022-07-24Use llvm::less_first and llvm::less_second (NFC)Kazu Hirata1-3/+1
2022-07-23[Analysis] Remove a redundant return statement (NFC)Kazu Hirata1-2/+0
Identified with readability-redundant-control-flow.
2022-07-21[LoopAccessAnalysis] Simplify D119047Arthur Eubanks1-11/+11
No need to add checks for every type per pointer that we couldn't create a check for the first time around, just the types that weren't successful. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D119376
2022-07-20[LAA] Fix latent missing check bug when mixing scalable and non-scalabe stridesPhilip Reames1-1/+3
Noticed via inspection; to my knowledge, impossible to hit today. In theory, we could have a fixed stride check be analyzed, then a scalable one. With the old code, the scalable one would be silently dropped, and the runtime guard would go ahead with only the fixed one. This would be a miscompile.
2022-07-18[LAA] Fix the build with older versions of ClangBenjamin Kramer1-1/+1
llvm/lib/Analysis/LoopAccessAnalysis.cpp:916:12: error: no viable conversion from returned value of type 'SmallVector<[...], 2>' to function return type 'SmallVector<[...], (default) CalculateSmallVectorDefaultInlinedElements<T>::value aka 3>' return Scevs; ^~~~~
2022-07-18[LAA] Add recursive IR walker for forked pointersGraham Hunter1-13/+143
This builds on the previous forked pointers patch, which only accepted a single select as the pointer to check. A recursive function to walk through IR has been added, which searches for either a loop-invariant or addrec SCEV. This will only handle a single fork at present, so selects of selects or a GEP with a select for both the base and offset will be rejected. There is also a recursion limit with a cli option to change it. Reviewed By: fhahn, david-arm Differential Revision: https://reviews.llvm.org/D108699
2022-07-16[Analysis] Qualify auto variables in for loops (NFC)Kazu Hirata1-3/+3
2022-06-17Recommit "[LAA] Initial support for runtime checks with pointer selects."Florian Hahn1-70/+108
This reverts commit 7aa8a678826dea86ff3e6c7df9d2a8a6ef868f5d. This version includes fixes to address issues uncovered after the commit landed and discussed at D11448. Those include: * Limit select-traversal to selects inside the loop. * Freeze pointers resulting from looking through selects to avoid branch-on-poison.
2022-06-01Revert "[LAA] Initial support for runtime checks with pointer selects."Alexander Kornienko1-88/+67
This reverts commit 5890b30105999a137e72e42f3760bebfd77001ca as per discussion on the review thread: https://reviews.llvm.org/D114487#3547560.
2022-05-16[LAA,LV] Add initial support for pointer-diff memory checks.Florian Hahn1-5/+93
This patch adds initial support for a pointer diff based runtime check scheme for vectorization. This scheme requires fewer computations and checks than the existing full overlap checking, if it is applicable. The main idea is to only check if source and sink of a dependency are far enough apart so the accesses won't overlap in the vector loop. To do so, it is sufficient to compute the difference and compare it to the `VF * UF * AccessSize`. It is sufficient to check `(Sink - Src) <u VF * UF * AccessSize` to rule out a backwards dependence in the vector loop with the given VF and UF. If Src >=u Sink, there is not dependence preventing vectorization, hence the overflow should not matter and using the ULT should be sufficient. Note that the initial version is restricted in multiple ways: 1. Pointers must only either be read or written, by a single instruction (this allows re-constructing source/sink for dependences with the available information) 2. Source and sink pointers must be add-recs, with matching steps 3. The step must be a constant. 3. abs(step) == AccessSize. Most of those restrictions can be relaxed in the future. See https://github.com/llvm/llvm-project/issues/53590. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D119078
2022-05-12[LAA] Initial support for runtime checks with pointer selects.Florian Hahn1-66/+87
Scaffolding support for generating runtime checks for multiple SCEV expressions per pointer. The initial version just adds support for looking through a single pointer select. The more sophisticated logic for analyzing forks is in D108699 Reviewed By: huntergr Differential Revision: https://reviews.llvm.org/D114487
2022-05-03[LoopVectorize] Support reductions that store intermediary resultIgor Kirillov1-1/+4
Adds ability to vectorize loops containing a store to a loop-invariant address as part of a reduction that isn't converted to SSA form due to lack of aliasing info. Runtime checks are generated to ensure the store does not alias any other accesses in the loop. Ordered fadd reductions are not yet supported. Differential Revision: https://reviews.llvm.org/D110235
2022-04-22[NFC][LAA] Match-up type sizes for possible extensions, based on actual ↵Chang-Sun Lin Jr1-6/+6
bit-size rather than rounded-up byte size. Differential Revision: https://reviews.llvm.org/D119200