path: root/llvm/lib/Analysis/LoopAccessAnalysis.cpp
Age | Commit message | Author | Files | Lines
36 hours[LAA] Revert 56a1cbb and 1aded51, due to crash (#160993)Ramkumar Ramachandra1-34/+22
This reverts commits 56a1cbb ([LAA] Fix non-NFC parts of 1aded51), 1aded51 ([LAA] Prepare to handle diff type sizes (NFC)). The original NFC patch caused some regressions, which the later patch tried to fix. However, the later patch is the cause of some crashes, and it would be best to revert both for now, and re-land after thorough testing.
3 days[LAA] Fix non-NFC parts of 1aded51 (#160701)Ramkumar Ramachandra1-1/+2
1aded51 ([LAA] Prepare to handle diff type sizes (NFC)) was supposed to be a non-functional patch, but introduced functional changes as known-non-negative and known-non-positive is not equivalent to !known-non-zero. Fix this.
11 days[LAA] Prepare to handle diff type sizes (NFC) (#122318)Ramkumar Ramachandra1-22/+33
As depend_diff_types shows, there are several places where the HasSameSize check can be relaxed for higher analysis precision. As a first step, return both the source size and the sink size from getDependenceDistanceStrideAndSize, along with a HasSameSize boolean for the moment.
2025-09-10[LAA] Strip findForkedPointer (NFC) (#140298)Ramkumar Ramachandra1-39/+28
Remove a level of indirection due to findForkedPointer, in an effort to improve code.
2025-09-04[LAA] Support assumptions with non-constant deref sizes. (#156758)Florian Hahn1-7/+7
Update evaluatePtrAddrecAtMaxBTCWillNotWrap to support non-constant sizes in dereferenceable assumptions. Apply loop-guards in a few places needed to reason about expressions involving trip counts of the form (BTC - 1). PR: https://github.com/llvm/llvm-project/pull/156758
2025-09-03Reapply "[LAA,Loads] Use loop guards and max BTC if needed when checking deref. (#155672)"Florian Hahn1-27/+42
This reverts commit f0df1e3dd4ec064821f673ced7d83e5a2cf6afa1. Recommit with extra check for SCEVCouldNotCompute. Test has been added in b16930204b. Original message: Remove the fall-back to constant max BTC if the backedge-taken-count cannot be computed. The constant max backedge-taken count is computed considering loop guards, so to avoid regressions we need to apply loop guards as needed. Also remove the special handling for Mul in willNotOverflow, as this should no longer be needed after 914374624f (https://github.com/llvm/llvm-project/pull/155300). PR: https://github.com/llvm/llvm-project/pull/155672
2025-09-02Revert "[LAA,Loads] Use loop guards and max BTC if needed when checking deref. (#155672)"Florian Hahn1-39/+27
This reverts commit 08001cf340185877665ee381513bf22a0fca3533. This triggers an assertion in some build configs, e.g. https://lab.llvm.org/buildbot/#/builders/24/builds/12211
2025-09-02[LAA,Loads] Use loop guards and max BTC if needed when checking deref. (#155672)Florian Hahn1-27/+39
Remove the fall-back to constant max BTC if the backedge-taken-count cannot be computed. The constant max backedge-taken count is computed considering loop guards, so to avoid regressions we need to apply loop guards as needed. Also remove the special handling for Mul in willNotOverflow, as this should no longer be needed after 914374624f (https://github.com/llvm/llvm-project/pull/155300). PR: https://github.com/llvm/llvm-project/pull/155672
2025-08-27[SCEV][LAA] Support multiplication overflow computation (#155236)annamthomas1-9/+13
Add support for identifying multiplication overflow in SCEV. This is needed in LoopAccessAnalysis and that limitation was worked around by 484417a. This allows early-exit vectorization to work as expected in vect.stats.ll test without needing the workaround.
2025-08-19[LAA] Move scalable vector check into `getStrideFromAddRec()` (#154013)Benjamin Maxwell1-5/+6
This moves the check closer to the `.getFixedValue()` call and fixes #153797 (which is a regression from #126971).
2025-08-14[LoopDist] Consider reads and writes together for runtime checks (#145623)Michael Berg1-5/+6
Emit safety guards for ptr accesses when cross partition loads exist which have a corresponding store to the same address in a different partition. This will emit the necessary ptr checks for these accesses. The test case was obtained from SuperTest, which SiFive runs regularly. We enabled LoopDistribution by default in our downstream compiler, this change was part of that enablement.
2025-08-01[LAA] Support assumptions in evaluatePtrAddRecAtMaxBTCWillNotWrap (#147047)Florian Hahn1-27/+49
This patch extends the logic added in https://github.com/llvm/llvm-project/pull/128061 to support dereferenceability information from assumptions as well. Unfortunately both assumption cache and the dominator tree need to be threaded through multiple layers to make them available where needed. PR: https://github.com/llvm/llvm-project/pull/147047
2025-07-22[LAA] Rename var used to retry with RT-checks (NFC) (#147307)Ramkumar Ramachandra1-12/+11
FoundNonConstantDistanceDependence is a misleading name for a variable that determines whether we retry with runtime checks. Rename it.
2025-07-16[LAA] Hoist check for SCEV-uncomputable dist (NFC) (#148841)Ramkumar Ramachandra1-7/+6
Hoist the check for SCEVCouldNotCompute distance into getDependenceDistanceAndSize.
2025-07-14Reapply "[LAA] Remove loop-invariant check added in 234cc40adc61."Florian Hahn1-18/+40
This reverts commit d43a80936d437d217d5a6dbbaa5fb131c27e7085. With the correctness issue blocking the recommit finally fixed (5d01697ec6cb), again unconditionally check if accesses are completely before or after each other.
2025-07-11[LAA] Move code to check if accesses are completely before/after (NFC).Florian Hahn1-27/+34
Factor out code to check if accesses are completely before/after each other. This reduces the diff for an upcoming re-commit and moving to a function also helps to reduce the nesting level via early exits.
2025-07-07[LAA] Strip outdated comment in isDependent (NFC) (#146367)Ramkumar Ramachandra1-16/+0
The comment has been outdated since 87ddd3a1 ([LAA] Rename and fix semantics of MaxSafeDepDistBytes to MinDepDistBytes).
2025-07-07[LAA] Hoist setting condition for RT-checks (#128045)Ramkumar Ramachandra1-33/+9
Strip ShouldRetryWithRuntimeCheck from the DepedenceDistanceStrideAndSizeInfo struct, and free isDependent from the responsibility of setting the condition for when runtime-checks are needed, transferring this responsibility to getDependenceDistanceStrideAndSize. We can have multiple DepType::Unknown dependences that, by themselves, do not trigger the retrying with runtime memory checks, and therefore block vectorization. But once a single FoundNonConstantDistanceDependence is found, the analysis seems to switch to the "LAA: Retrying with memory checks" path and allows all these dependences to be handled via runtime checks. There is hence no rationale for predicating FoundNonConstantDistanceDependence on DepType::Unknown, and removing this predication is one of the side-effects of this patch.
2025-06-30[LAA] Clean up APInt-overflow related code (#140048)Ramkumar Ramachandra1-17/+14
Co-authored-by: Florian Hahn <flo@fhahn.com>
2025-06-24[LAA] Address follow-up suggestions for #128061.Florian Hahn1-7/+7
Adjust naming and add argument comments as suggested.
2025-06-23[LAA] Be more careful when evaluating AddRecs at symbolic max BTC. (#128061)Florian Hahn1-14/+123
Evaluating AR at the symbolic max BTC may wrap and create an expression that is less than the start of the AddRec due to wrapping (for example consider MaxBTC = -2). If that's the case, set ScEnd to -(EltSize + 1). ScEnd will get incremented by EltSize before returning, so this effectively sets ScEnd to unsigned max. Note that LAA separately checks that accesses cannot wrap (52ded672492, https://github.com/llvm/llvm-project/pull/127543), so unsigned max represents an upper bound. When there is a computable backedge-taken count, we are guaranteed to execute the number of iterations, and if any pointer would wrap it would be UB (or the access will never be executed, so cannot alias). It includes new tests from the previous discussion that show a case we wrap with a BTC, but it is UB due to the pointer after the object wrapping (in `evaluate-at-backedge-taken-count-wrapping.ll`). When we have only a maximum backedge taken count, we instead try to use dereferenceability information to determine if the pointer access must be in bounds for the maximum backedge taken count. PR: https://github.com/llvm/llvm-project/pull/128061
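The wrap described in this message can be reproduced with plain 64-bit unsigned arithmetic. The following is only a sketch, not LAA's code: `evalAddRecAt` and `clampedEnd` are hypothetical helpers, the AddRec is modelled as `Start + Step * N` on `uint64_t`, and the wrap test assumes a non-negative step.

```cpp
#include <cstdint>

// Hypothetical model of evaluating an affine AddRec {Start,+,Step} at
// iteration N. Unsigned arithmetic wraps modulo 2^64, like pointer-sized
// SCEV arithmetic on a 64-bit target.
uint64_t evalAddRecAt(uint64_t Start, uint64_t Step, uint64_t N) {
  return Start + Step * N;
}

// Hypothetical sketch of the clamping described above: if the value at
// MaxBTC ends up below Start, treat it as wrapped and use -(EltSize + 1),
// so that adding EltSize afterwards yields the unsigned maximum.
uint64_t clampedEnd(uint64_t Start, uint64_t Step, uint64_t MaxBTC,
                    uint64_t EltSize) {
  uint64_t End = evalAddRecAt(Start, Step, MaxBTC);
  if (End < Start)
    End = uint64_t(0) - (EltSize + 1); // -(EltSize + 1) in modular arithmetic
  return End + EltSize;
}
```

With `Start = 100`, `Step = 4` and `MaxBTC = -2` (as an unsigned value), the unclamped evaluation lands at 92, below the start, so `clampedEnd` returns `UINT64_MAX` as a conservative upper bound.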
2025-06-20[LV] Strengthen loop-invariance checks in isPredicatedInst (#140744)Ramkumar Ramachandra1-2/+2
Check loop-invariance against SCEV as well.
2025-06-08[llvm] Compare std::optional<T> to values directly (NFC) (#143340)Kazu Hirata1-1/+1
This patch transforms: X && *X == Y to: X == Y where X is of std::optional<T>, and Y is of T or similar.
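The equivalence behind this transformation can be checked with standard C++17. A minimal sketch follows; the function names `equalsVerbose` and `equalsDirect` are illustrative and do not come from the patch.

```cpp
#include <optional>

// Old spelling: test for a value, then dereference and compare.
bool equalsVerbose(const std::optional<int> &X, int Y) {
  return X && *X == Y;
}

// New spelling: operator== between optional<T> and T is false when the
// optional is empty, so the explicit check is redundant.
bool equalsDirect(const std::optional<int> &X, int Y) {
  return X == Y;
}
```

Both functions agree for empty and engaged optionals, which is why the rewrite is NFC.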
2025-06-04[LAA] Keep pointer checks on partial analysis (#139719)John Brawn1-21/+34
Currently if there's any memory access that AccessAnalysis couldn't analyze then all of the runtime pointer check results are discarded. This patch makes this able to be controlled with the AllowPartial option, which makes it so we generate the runtime check information for those pointers that we could analyze, as transformations may still be able to make use of the partial information. Of the transformations that use LoopAccessAnalysis, only LoopVersioningLICM changes behaviour as a result of this change. This is because the others either:
* Check canVectorizeMemory, which will return false when we have partial pointer information as analyzeLoop() will return false.
* Examine the dependencies returned by getDepChecker(), which will be empty as we exit analyzeLoop if we have partial pointer information before calling areDepsSafe(), which is what fills in the dependency information.
2025-06-03[LAA] Improve code in findForkedSCEVs (NFC) (#140384)Ramkumar Ramachandra1-26/+19
2025-05-31[Remarks] Remove an upcast footgun. NFC (#142191)Jon Roelofs1-3/+3
CodeRegion's were previously passed as Value*, but then immediately upcast to BasicBlock. Let's keep the type information around until the use cases for non-BasicBlock code regions actually materialize.
2025-05-26[llvm] Value-initialize values with *Map::try_emplace (NFC) (#141522)Kazu Hirata1-1/+1
try_emplace value-initializes values, so we do not need to pass nullptr to try_emplace when the value types are raw pointers or std::unique_ptr<T>.
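The same behaviour can be observed with the standard library. The sketch below uses `std::map` as a stand-in for LLVM's map types (an assumption made so the example is self-contained), and the helper names are hypothetical.

```cpp
#include <map>
#include <memory>

// try_emplace with only a key value-initializes the mapped value, so a raw
// pointer comes out as nullptr without passing nullptr explicitly.
bool rawPointerDefaultsToNull(int Key) {
  std::map<int, int *> M;
  auto [It, Inserted] = M.try_emplace(Key);
  return Inserted && It->second == nullptr;
}

// The same holds for std::unique_ptr<T>, whose default state is empty.
bool uniquePtrDefaultsToNull(int Key) {
  std::map<int, std::unique_ptr<int>> M;
  auto [It, Inserted] = M.try_emplace(Key);
  return Inserted && It->second == nullptr;
}
```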
2025-05-26[LAA] Use m_scev_AffineAddRec in LAA (NFC).Florian Hahn1-22/+16
2025-05-23[Analysis] Remove unused includes (NFC) (#141319)Kazu Hirata1-1/+0
These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.
2025-05-22[LAA] Strip isNoWrapGEP: dead code (NFC) (#140308)Ramkumar Ramachandra1-51/+0
isNoWrap is the only caller of isNoWrapGEP, and it has a subsuming check on the GEP immediately after.
2025-05-22[LAA] Remove dead SE arg from canCheckPtrAtRT (NFC).Florian Hahn1-6/+4
2025-05-21[LAA] Tweak debug output for UTC stability (#140764)Ramkumar Ramachandra1-6/+16
UpdateTestChecks has a make_analyzer_generalizer to replace pointer addresses from the debug output of LAA with a pattern, which is an acceptable solution when there is one RUN line. However, when there are multiple RUN lines with a common pattern, UTC fails to recognize common output due to mismatched pointer addresses. Instead of hacking UTC to scrub the output before comparing the outputs from the different RUN lines, fix the issue once and for all by making LAA not output unstable pointer addresses in the first place. The removal of the now-dead make_analyzer_generalizer is left as a non-trivial exercise for a follow-up.
2025-05-18[LAA] Add assert check CanDoRTIfNeeded can be computed w/o RT.Need (NFC)Florian Hahn1-0/+2
Add assert to ensure that CanDoRTIfNeeded can be computed w/o RtCheck.Need, to prepare for adjusting the condition.
2025-05-15[LAA/SLP] Don't truncate APInt in getPointersDiff (#139941)Ramkumar Ramachandra1-14/+19
Change getPointersDiff to return an std::optional<int64_t>, and fill this value using APInt::trySExtValue. This simple change requires changes to other functions in LAA, and major changes in SLPVectorizer changing types from 32-bit to 64-bit. Fixes #139202.
2025-05-13[LAA][NFC] Unify naming of DepCandidates to DepCands (#139534)Igor Kirillov1-8/+7
The MemoryDepChecker::DepCandidates instance in each LoopAccessInfo had multiple names (AccessSets, DepCands, DependentAccesses), which was confusing. This patch renames all references to DepCands for consistency.
2025-05-12[LAA] Improve code in replaceSymbolicStrideSCEV (NFC) (#139532)Ramkumar Ramachandra1-3/+2
Prefer DenseMap::lookup over DenseMap::find.
2025-05-09[LAA] Strip dead code in getStrideFromPointer (NFC) (#139140)Ramkumar Ramachandra1-17/+0
The SCEV multiply by 1 doesn't make sense, because SCEV would fold it: therefore, the OrigPtr == Ptr branch effectively rejects a multiply. However, in this branch, we have a pointer SCEV that cannot be a multiply, and hence the code is dead. Strip it.
2025-05-09[SCEVPatternMatch] Extend with more matchers (#138836)Ramkumar Ramachandra1-12/+11
2025-05-07[LAA] Use MaxStride instead of CommonStride to calculate MaxVF (#98142)vaibhav1-7/+6
We bail out from MaxVF calculation if the strides are not the same. Instead, we are dependent on runtime checks, though not yet implemented. We could instead use the MaxStride to conservatively use an upper bound. This handles cases like the following:

```c
#define LEN 256 * 256
float a[LEN];

void gather() {
  for (int i = 0; i < LEN - 1024 - 255; i++) {
#pragma clang loop interleave(disable)
#pragma clang loop unroll(disable)
    for (int j = 0; j < 256; j++)
      a[i + j + 1024] += a[j * 4 + i];
  }
}
```

---------
Co-authored-by: Florian Hahn <flo@fhahn.com>
2025-05-04[llvm] Remove unused local variables (NFC) (#138454)Kazu Hirata1-3/+0
2025-04-29[LAA] Prefer set-contains over set-count (NFC) (#136749)Ramkumar Ramachandra1-9/+10
Improve code by preferring {SmallSet,SmallPtrSet}::contains() over the count() function, when used in a boolean context.
2025-04-16[llvm] Use llvm::append_range (NFC) (#136066)Kazu Hirata1-1/+1
This patch replaces: llvm::copy(Src, std::back_inserter(Dst)); with: llvm::append_range(Dst, Src); for brevity. One side benefit is that llvm::append_range eventually calls llvm::SmallVector::reserve if Dst is of llvm::SmallVector.
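A standard-library analogue of the two spellings is sketched below. It uses `std::vector` and a range `insert` as a stand-in for `llvm::append_range` (an assumption made to keep the example self-contained); the function names are hypothetical.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Old spelling: element-by-element copy through a back_inserter.
void appendWithCopy(std::vector<int> &Dst, const std::vector<int> &Src) {
  std::copy(Src.begin(), Src.end(), std::back_inserter(Dst));
}

// Stand-in for the new spelling: a single range insert, which lets the
// container size its allocation once for the whole range.
void appendWithInsert(std::vector<int> &Dst, const std::vector<int> &Src) {
  Dst.insert(Dst.end(), Src.begin(), Src.end());
}
```

Both leave `Dst` with the same contents; the range form gives the container the chance to grow once instead of per element.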
2025-04-12[LAA] Make sure MaxVF for Store-Load forward safe dep distances is pow2.Florian Hahn1-1/+2
MaxVF computed in couldPreventStoreLoadForward may not be a power of 2, as CommonStride may not be a power-of-2. This can cause crashes after 78777a20. Use bit_floor to make sure it is a suitable power-of-2. Fixes https://github.com/llvm/llvm-project/issues/134696.
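For illustration, rounding down to a power of two can be sketched as follows. `bitFloor` is a hand-written hypothetical helper with the same contract as C++20's `std::bit_floor` (largest power of two not greater than the input, 0 for 0); it is not the LLVM implementation.

```cpp
#include <cstdint>

// Largest power of two that is <= V, or 0 when V == 0.
uint64_t bitFloor(uint64_t V) {
  if (V == 0)
    return 0;
  uint64_t P = 1;
  // Compare against V / 2 so that P *= 2 can never overflow.
  while (P <= V / 2)
    P *= 2;
  return P;
}
```

A candidate MaxVF of 6, which could arise from a stride that is not a power of two, would be rounded down to 4, while values that are already powers of two are unchanged.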
2025-04-04[EquivClasses] Shorten members_{begin,end} idiom (#134373)Ramkumar Ramachandra1-3/+2
Introduce members() iterator-helper to shorten the members_{begin,end} idiom. A previous attempt of this patch was #130319, which had to be reverted due to unit-test failures when attempting to call members() on the end iterator. In this patch, members() accepts either an ECValue or an ElemTy, which is more intuitive and doesn't suffer from the same issue.
2025-03-31Reapply "[EquivalenceClasses] Replace findValue with contains (NFC)."Florian Hahn1-1/+1
This reverts the revert commit 616f447fc84bdc7655117f1b303d895dc3b93e4d. It includes updates to remaining users in Polly and Clang, to avoid failures when building those projects.
2025-03-31Revert "[EquivalenceClasses] Replace findValue with contains (NFC)."Florian Hahn1-1/+1
Breaks clang builds. This reverts commit 8e390dedd71d0c2bcbe8775aee2e234ef7a5b787.
2025-03-31[EquivalenceClasses] Replace findValue with contains (NFC).Florian Hahn1-1/+1
Replace remaining use of findValue with more compact and limited contains().
2025-03-31[LAA] Remove unneeded findValue calls (NFC).Florian Hahn1-7/+2
Use findLeader directly instead of going through findValue, getLeaderValue. This is simpler and more efficient.
2025-03-31[LV]Split store-load forward distance analysis from other checks, NFC (#121156)Alexey Bataev1-22/+25
The patch splits the store-load forwarding distance analysis from other dependency analysis in LAA. Currently it supports only power-of-2 distances; the split is required to support non-power-of-2 distances in the future. Part of #100755
2025-03-29[Analysis] Use llvm::append_range (NFC) (#133602)Kazu Hirata1-2/+1