aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/Analysis/LoopAccessAnalysis.cpp
AgeCommit message (Collapse)AuthorFilesLines
2021-07-26[LAA] Remove RuntimeCheckingPtrGroup::RtCheck member (NFC).Florian Hahn1-8/+20
This patch removes RtCheck from RuntimeCheckingPtrGroup to make it possible to construct RuntimeCheckingPtrGroup objects without a RuntimePointerChecking object. This should make it easier to re-use the code to generate runtime checks, e.g. in D102834. RtCheck was only used to access the pointer info for a given index. Instead, the start and end expressions can be passed directly. For code-gen, we also need to know the address space to use. This can also be explicitly passed at construction. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D105481
2021-07-19[LoopUtils] Fix incorrect RT check bounds of loop-invariant mem accessesMindong Chen1-8/+8
This fixes the lower and upper bound calculation of a RuntimeCheckingPtrGroup when it has more than one loop invariant pointers. Resolves PR50686. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D104148
2021-07-06Recommit [ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers.Eli Friedman1-1/+2
As part of making ScalarEvolution's handling of pointers consistent, we want to forbid multiplying a pointer by -1 (or any other value). This means we can't blindly subtract pointers. There are a few ways we could deal with this: 1. We could completely forbid subtracting pointers in getMinusSCEV() 2. We could forbid subracting pointers with different pointer bases (this patch). 3. We could try to ptrtoint pointer operands. The option in this patch is more friendly to non-integral pointers: code that works with normal pointers will also work with non-integral pointers. And it seems like there are very few places that actually benefit from the third option. As a minimal patch, the ScalarEvolution implementation of getMinusSCEV still ends up subtracting pointers if they have the same base. This should eliminate the shared pointer base, but eventually we'll need to rewrite it to avoid negating the pointer base. I plan to do this as a separate step to allow measuring the compile-time impact. This doesn't cause obvious functional changes in most cases; the one case that is significantly affected is ICmpZero handling in LSR (which is the source of almost all the test changes). The resulting changes seem okay to me, but suggestions welcome. As an alternative, I tried explicitly ptrtoint'ing the operands, but the result doesn't seem obviously better. I deleted the test lsr-undef-in-binop.ll becuase I couldn't figure out how to repair it to test what it was actually trying to test. Recommitting with fix to MemoryDepChecker::isDependent. Differential Revision: https://reviews.llvm.org/D104806
2021-06-23[LAA] Make getPointersDiff() API compatible with opaque pointersNikita Popov1-13/+20
Make getPointersDiff() and sortPtrAccesses() compatible with opaque pointers by explicitly passing in the element type instead of determining it from the pointer element type. The SLPVectorizer result is slightly non-optimal in that unnecessary pointer bitcasts are added. Differential Revision: https://reviews.llvm.org/D104784
2021-05-28Revert "[LAA] Support pointer phis in loop by analyzing each incoming pointer."Florian Hahn1-23/+2
This reverts commit 1ed7f8ede564c3b11da4fdca30c36ccbff422576. This change can cause loop-distribute to crash in some cases. Revert until I have more time to wrap up a fix. See PR50296, PR5028 and D102266.
2021-05-05Make dependency between certain analysis passes transitive (reapply)Bjorn Pettersson1-5/+5
LazyBlockFrequenceInfoPass, LazyBranchProbabilityInfoPass and LoopAccessLegacyAnalysis all cache pointers to their nestled required analysis passes. One need to use addRequiredTransitive to describe that the nestled passes can't be freed until those analysis passes no longer are used themselves. There is still a bit of a mess considering the getLazyBPIAnalysisUsage and getLazyBFIAnalysisUsage functions. Those functions are used from both Transform, CodeGen and Analysis passes. I figure it is OK to use addRequiredTransitive also when being used from Transform and CodeGen passes. On the other hand, I figure we must do it when used from other Analysis passes. So using addRequiredTransitive should be more correct here. An alternative solution would be to add a bool option in those functions to let the user tell if it is a analysis pass or not. Since those lazy passes will be obsolete when new PM has conquered the world I figure we can leave it like this right now. Intention with the patch is to fix PR49950. It at least solves the problem for the reproducer in PR49950. However, that reproducer need five passes in a specific order, so there are lots of various "solutions" that could avoid the crash without actually fixing the root cause. This is a reapply of commit 3655f0757f2b4b, that was reverted in 33ff3c20498ef5c2057 due to problems with assertions in the polly lit tests. That problem is supposed to be solved by also adjusting ScopPass to explicitly preserve LazyBlockFrequencyInfo and LazyBranchProbabilityInfo (it already preserved OptimizationRemarkEmitter which depends on those lazy passes). Differential Revision: https://reviews.llvm.org/D100958
2021-05-04Revert "Make dependency between certain analysis passes transitive"Bjorn Pettersson1-5/+5
This reverts commit 3655f0757f2b4b61419446b326410118658826ba. It caused assertion failures related to setLastUser in polly builds.
2021-05-04Make dependency between certain analysis passes transitiveBjorn Pettersson1-5/+5
LazyBlockFrequenceInfoPass, LazyBranchProbabilityInfoPass and LoopAccessLegacyAnalysis all cache pointers to their nestled required analysis passes. One need to use addRequiredTransitive to describe that the nestled passes can't be freed until those analysis passes no longer are used themselves. There is still a bit of a mess considering the getLazyBPIAnalysisUsage and getLazyBFIAnalysisUsage functions. Those functions are used from both Transform, CodeGen and Analysis passes. I figure it is OK to use addRequiredTransitive also when being used from Transform and CodeGen passes. On the other hand, I figure we must do it when used from other Analysis passes. So using addRequiredTransitive should be more correct here. An alternative solution would be to add a bool option in those functions to let the user tell if it is a analysis pass or not. Since those lazy passes will be obsolete when new PM has conquered the world I figure we can leave it like this right now. Intention with the patch is to fix PR49950. It at least solves the problem for the reproducer in PR49950. However, that reproducer need five passes in a specific order, so there are lots of various "solutions" that could avoid the crash without actually fixing the root cause. Differential Revision: https://reviews.llvm.org/D100958
2021-04-28[LAA] Support pointer phis in loop by analyzing each incoming pointer.Florian Hahn1-2/+23
SCEV does not look through non-header PHIs inside the loop. Such phis can be analyzed by adding separate accesses for each incoming pointer value. This results in 2 more loops vectorized in SPEC2000/186.crafty and avoids regressions when sinking instructions before vectorizing. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D101286
2021-03-25Fix a miscompile introduced by 99203f2.Richard Smith1-1/+2
getPointersDiff would previously round down the difference between two pointers to a multiple of the element size of the pointee, which could result in a pointer value being decreased a little. Alexey Bataev has graciously agreed to add a testcase for this; submitting the bugfix now to unblock.
2021-03-24[LoopAnalysis][NFC]Remove redundant code.Alexey Bataev1-1/+0
Removed redundant code for IsConsecutive variable.
2021-03-23[Analysis]Add getPointersDiff function to improve compile time.Alexey Bataev1-107/+91
Added getPointersDiff function to LoopAccessAnalysis and used it instead direct calculatoin of the distance between pointers and/or isConsecutiveAccess function in SLP vectorizer to improve compile time and detection of stores consecutive chains. Part of D57059 Differential Revision: https://reviews.llvm.org/D98967
2021-03-23Revert "[Analysis]Add getPointersDiff function to improve compile time."Alexey Bataev1-91/+107
This reverts commit 065a14a12d2694f26f4e894641f5ab8cfc5da8bd to investigate and fix crash in SLP vectorizer.
2021-03-23[Analysis]Add getPointersDiff function to improve compile time.Alexey Bataev1-107/+91
Added getPointersDiff function to LoopAccessAnalysis and used it instead direct calculatoin of the distance between pointers and/or isConsecutiveAccess function in SLP vectorizer to improve compile time and detection of stores consecutive chains. Part of D57059 Differential Revision: https://reviews.llvm.org/D98967
2021-01-09[llvm] Drop unnecessary make_range (NFC)Kazu Hirata1-1/+1
2021-01-07[Analysis] MemoryDepChecker::couldPreventStoreLoadForward - remove dead ↵Simon Pilgrim1-1/+1
store. NFCI. As we're breaking from the loop when clamping MaxVF, clang static analyzer was warning that the VF iterator was being updated and never used.
2020-12-14[LAA] Relax restrictions on early exits in loop structurePhilip Reames1-20/+0
his is a preparation patch for supporting multiple exits in the loop vectorizer, by itself it should be mostly NFC. This patch moves the loop structure checks from LAA to their respective consumers (where duplicates don't already exist). Moving the checks does end up changing some of the optimization warnings and debug output slightly, but nothing that appears to be a regression. Why do this? Well, after auditing the code, I can't actually find anything in LAA itself which relies on having all instructions within a loop execute an equal number of times. This patch simply makes this explicit so that if one consumer - say LV in the near future (hopefully) - wants to handle a broader class of loops, it can do so. Differential Revision: https://reviews.llvm.org/D92066
2020-11-26[AA] Split up LocationSize::unknown()Nikita Popov1-2/+2
Currently, we have some confusion in the codebase regarding the meaning of LocationSize::unknown(): Some parts (including most of BasicAA) assume that LocationSize::unknown() only allows accesses after the base pointer. Some parts (various callers of AA) assume that LocationSize::unknown() allows accesses both before and after the base pointer (but within the underlying object). This patch splits up LocationSize::unknown() into LocationSize::afterPointer() and LocationSize::beforeOrAfterPointer() to make this completely unambiguous. I tried my best to determine which one is appropriate for all the existing uses. The test changes in cs-cs.ll in particular illustrate a previously clearly incorrect AA result: We were effectively assuming that argmemonly functions were only allowed to access their arguments after the passed pointer, but not before it. I'm pretty sure that this was not intentional, and it's certainly not specified by LangRef that way. Differential Revision: https://reviews.llvm.org/D91649
2020-11-25[SVE] Fix TypeSize warning in RuntimePointerChecking::insertJoe Ellis1-3/+3
The TypeSize warning would occur because RuntimePointerChecking::insert was not scalable vector aware. The fix is to use ScalarEvolution::getSizeOfExpr to grab the size of types. Differential Revision: https://reviews.llvm.org/D90171
2020-11-25[LAA] NFC: Rename [get]MaxSafeRegisterWidth -> [get]MaxSafeVectorWidthInBitsCullen Rhodes1-1/+1
MaxSafeRegisterWidth is a misnomer since it actually returns the maximum safe vector width. Register suggests it relates directly to a physical register where it could be a vector spanning one or more physical registers. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D91727
2020-11-24[SCEV] Use isa<> pattern for testing for CouldNotCompute [NFC]Philip Reames1-1/+1
Some older code - and code copied from older code - still directly tested against the singelton result of SE::getCouldNotCompute. Using the isa<SCEVCouldNotCompute> form is both shorter, and more readable.
2020-11-24[LAA] Minor code style tweaks [NFC]Philip Reames1-23/+15
2020-10-07[LAA] Use DL to get element size for bound computation.Florian Hahn1-1/+2
Currently LAA uses getScalarSizeInBits to compute the size of an element when computing the end bound of an access. This does not work as expected for pointers to pointers, because getScalarSizeInBits will return 0 for pointer types. By using DataLayout to get the size of the element we can also correctly handle pointer element types. Note the changes to the existing test, which seems to also use the wrong offset for the end. Fixes PR47751. Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D88953
2020-10-02LoopAccessAnalysis.cpp - use const reference in for-range loops. NFCI.Simon Pilgrim1-4/+4
2020-09-22[LoopInfo] empty() -> isInnermost(), add isOutermost()Stefanos Baziotis1-1/+1
Differential Revision: https://reviews.llvm.org/D82895
2020-07-31[NFC] Remove unused GetUnderlyingObject paramenterVitaly Buka1-10/+7
Depends on D84617. Differential Revision: https://reviews.llvm.org/D84621
2020-07-30[NFC] GetUnderlyingObject -> getUnderlyingObjectVitaly Buka1-4/+4
I am going to touch them in the next patch anyway
2020-07-30[LAA] Avoid adding pointers to the checks if they are not needed.Florian Hahn1-27/+33
Currently we skip alias sets with only reads or a single write and no reads, but still add the pointers to the list of pointers in RtCheck. This can lead to cases where we try to access a pointer that does not exist when grouping checks. In most cases, the way we access PositionMap masked that, as the value would default to index 0. But in the example in PR46854 it causes a crash. This patch updates the logic to avoid adding pointers for alias sets that do not need any checks. It makes things slightly more verbose, by first checking the numbers of reads/writes and bailing out early if we don't need checks for the alias set. I think this makes the logic a bit simpler to follow. Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D84608
2020-07-07[LV] Vectorize without versioning-for-unit-stride under -Os/-OzAyal Zaks1-2/+6
If a loop is in a function marked OptSize, Loop Access Analysis should refrain from generating runtime checks for unit strides that will version the loop. If a loop is in a function marked OptSize and its vectorization is enabled, it should be vectorized w/o any versioning. Fixes PR46228. Differential Revision: https://reviews.llvm.org/D81345
2020-06-25LoopAccessAnalysis.h - reduce AliasAnalysis.h include to forward ↵Simon Pilgrim1-3/+3
declaration. NFC. Fix implicit include dependencies in source files and replace legacy AliasAnalysis typedef with AAResults where necessary.
2020-06-14[LAA] Do not set CanDoRT to false for AS that do not need RT checks.Florian Hahn1-2/+19
Alternative approach to D80570. canCheckPtrAtRT already contains checks the figure out for which alias sets runtime checks are needed. But it currently sets CanDoRT to false for alias sets for which we cannot do RT checks but also do not need any. If we know that we do not need RT checks based on the number of reads/writes in the alias set, we can skip processing the AS. This patch also adds an assertion to ensure that DepCands does not contain more than one write from the alias set. Reviewers: Ayal, anemet, hfinkel, dmgreen Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D80622
2020-05-27[LAA] We only need pointer checks if there are non-zero checks (NFC).Florian Hahn1-9/+12
If it turns out that we can do runtime checks, but there are no runtime-checks to generate, set RtCheck.Need to false. This can happen if we can prove statically that the pointers passed in to canCheckPtrAtRT do not alias. This should not change any results, but allows us to skip some work and assert that runtime checks are generated, if LAA indicates that runtime checks are required. Reviewers: anemet, Ayal Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D79969 Note: This is a recommit of 259abfc7cbc11cd98c05b1eb8e4b3fb6a9664bc0, with some suggested renaming.
2020-05-27Revert "[LAA] We only need pointer checks if there are non-zero checks (NFC)."Florian Hahn1-16/+10
This reverts commit 259abfc7cbc11cd98c05b1eb8e4b3fb6a9664bc0. Reverting this, as I missed a case where we return without setting RtCheck.Need.
2020-05-27[LAA] We only need pointer checks if there are non-zero checks (NFC).Florian Hahn1-10/+16
If it turns out that we can do runtime checks, but there are no runtime-checks to generate, set RtCheck.Need to false. This can happen if we can prove statically that the pointers passed in to canCheckPtrAtRT do not alias. This should not change any results, but allows us to skip some work and assert that runtime checks are generated, if LAA indicates that runtime checks are required. Reviewers: anemet, Ayal Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D79969
2020-05-20[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC).Florian Hahn1-1/+0
SCEVExpander modifies the underlying function so it is more suitable in Transforms/Utils, rather than Analysis. This allows using other transform utils in SCEVExpander. This patch was originally committed as b8a3c34eee06, but broke the modules build, as LoopAccessAnalysis was using the Expander. The code-gen part of LAA was moved to lib/Transforms recently, so this patch can be landed again. Reviewers: sanjoy.google, efriedma, reames Reviewed By: sanjoy.google Differential Revision: https://reviews.llvm.org/D71537
2020-05-10[LAA] Move runtime-check generation to Transforms/Utils/loopUtils (NFC)Florian Hahn1-153/+0
Currently LAA's uses of ScalarEvolutionExpander blocks moving the expander from Analysis to Transforms. Conceptually the expander does not fit into Analysis (it is only used for code generation) and runtime-check generation also seems to be better suited as a transformation utility. Reviewers: Ayal, anemet Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D78460
2020-05-10Recommit "[LAA] Remove one addRuntimeChecks function (NFC)."Florian Hahn1-8/+0
The failing assertion has been fixed and the problematic test case has been added. This reverts the revert commit fc44617f28847417e55836193bbe8e9c3f09eca9.
2020-05-10Revert "[LAA] Remove one addRuntimeChecks function (NFC)."Florian Hahn1-0/+8
This reverts commit c28114c8ffde705d7e16cd4c065fd23269661c81. This causes some bots to fail: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-android/builds/30596/steps/build%20android%2Faarch64/logs/stdio
2020-05-10[LAA] Remove one addRuntimeChecks function (NFC).Florian Hahn1-8/+0
In order to reduce the API surface area (preparation for D78460), remove a addRuntimeChecks() function and do the additional check in the single caller. Reviewers: Ayal, anemet Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D79679
2020-05-09[LAA] Remove unneeded PtrRtChecking argument (NFC).Florian Hahn1-9/+7
The argument is not required and simplifies D78460 a bit.
2020-04-28[LAA] Move CheckingPtrGroup/PointerCheck outside class (NFC).Florian Hahn1-23/+29
This allows forward declarations of PointerCheck, which in turn reduce the number of times LoopAccessAnalysis needs to be included. Ultimately this helps with moving runtime check generation to Transforms/Utils/LoopUtils.h, without having to include it there. Reviewers: anemet, Ayal Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D78458
2020-01-16[VectorUtils] Rework the Vector Function Database (VFDatabase).Francesco Petrogalli1-1/+1
Summary: This commits is a rework of the patch in https://reviews.llvm.org/D67572. The rework was requested to prevent out-of-tree performance regression when vectorizing out-of-tree IR intrinsics. The vectorization of such intrinsics is enquired via the static function `isTLIScalarize`. For detail see the discussion in https://reviews.llvm.org/D67572. Reviewers: uabelho, fhahn, sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72734
2020-01-04Revert "[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC)."Florian Hahn1-1/+1
This reverts commit 51ef53f3bd23559203fe9af82ff2facbfedc1db3, as it breaks some bots.
2020-01-04[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC).Florian Hahn1-1/+1
SCEVExpander modifies the underlying function so it is more suitable in Transforms/Utils, rather than Analysis. This allows using other transform utils in SCEVExpander. Reviewers: sanjoy.google, efriedma, reames Reviewed By: sanjoy.google Differential Revision: https://reviews.llvm.org/D71537
2019-12-13Revert "[VectorUtils] Introduce the Vector Function Database (VFDatabase)."Francesco Petrogalli1-1/+1
This reverts commit 0be81968a283fd4161cb9ac9748d5ed200926292. The VFDatabase needs some rework to be able to handle vectorization and subsequent scalarization of intrinsics in out-of-tree versions of the compiler. For more details, see the discussion in https://reviews.llvm.org/D67572.
2019-12-10[VectorUtils] Introduce the Vector Function Database (VFDatabase).Francesco Petrogalli1-1/+1
This patch introduced the VFDatabase, the framework proposed in http://lists.llvm.org/pipermail/llvm-dev/2019-June/133484.html. [*] In this patch the VFDatabase is used to bridge the TargetLibraryInfo (TLI) calls that were previously used to query for the availability of vector counterparts of scalar functions. The VFISAKind field `ISA` of VFShape have been moved into into VFInfo, under the assumption that different vector ISAs may provide the same vector signature. At the moment, the vectorizer accepts any of the available ISAs as long as the signature provided by the VFDatabase matches the one expected in the vectorization process. For example, when targeting AVX or AVX2, which both have 256-bit registers, the IR signature of the two vector functions associated to the two ISAs is the same. The `getVectorizedFunction` method at the moment returns the first available match. We will need to add more heuristics to the search system to decide which of the available version (TLI, AVX, AVX2, ...) the system should prefer, when multiple versions with the same VFShape are present. Some of the code in this patch is based on the work done by Sumedh Arani in https://reviews.llvm.org/D66025. [*] Notice that in the proposal the VFDatabase was called SVFS. The name VFDatabase is more in line with LLVM recommendations for naming classes and variables. Differential Revision: https://reviews.llvm.org/D67572
2019-11-13Sink all InitializePasses.h includesReid Kleckner1-0/+5
This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it caused lots of recompilation. I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout: recompiles touches affected_files header 342380 95 3604 llvm/include/llvm/ADT/STLExtras.h 314730 234 1345 llvm/include/llvm/InitializePasses.h 307036 118 2602 llvm/include/llvm/ADT/APInt.h 213049 59 3611 llvm/include/llvm/Support/MathExtras.h 170422 47 3626 llvm/include/llvm/Support/Compiler.h 162225 45 3605 llvm/include/llvm/ADT/Optional.h 158319 63 2513 llvm/include/llvm/ADT/Triple.h 140322 39 3598 llvm/include/llvm/ADT/StringRef.h 137647 59 2333 llvm/include/llvm/Support/Error.h 131619 73 1803 llvm/include/llvm/Support/FileSystem.h Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild. Reviewers: bkramer, asbirlea, bollu, jdoerfert Differential Revision: https://reviews.llvm.org/D70211
2019-10-02LoopAccessAnalysis isConsecutiveAccess() - silence static analyzer ↵Simon Pilgrim1-2/+2
dyn_cast<SCEVConstant> null dereference warning. NFCI. The static analyzer is warning about potential null dereferences, but in these cases we should be able to use cast<SCEVConstant> directly and if not assert will fire for us. llvm-svn: 373465
2019-09-07Change TargetLibraryInfo analysis passes to always require FunctionTeresa Johnson1-1/+1
Summary: This is the first change to enable the TLI to be built per-function so that -fno-builtin* handling can be migrated to use function attributes. See discussion on D61634 for background. This is an enabler for fixing handling of these options for LTO, for example. This change should not affect behavior, as the provided function is not yet used to build a specifically per-function TLI, but rather enables that migration. Most of the changes were very mechanical, e.g. passing a Function to the legacy analysis pass's getTLI interface, or in Module level cases, adding a callback. This is similar to the way the per-function TTI analysis works. There was one place where we were looking for builtins but not in the context of a specific function. See FindCXAAtExit in lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround could provide the wrong behavior in some corner cases. Suggestions welcome. Reviewers: chandlerc, hfinkel Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66428 llvm-svn: 371284
2019-08-15[llvm] Migrate llvm::make_unique to std::make_uniqueJonas Devlieghere1-5/+5
Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. llvm-svn: 369013