path: root/llvm/lib/Analysis/MemoryDependenceAnalysis.cpp
Age | Commit message | Author | Files | Lines
5 days | [GVN/MemDep] Limit the size of the cache for non-local dependencies. (#150539) | Alina Sbirlea | 1 | -0/+8
An attempt to resolve the issue flagged in [PR150531](https://github.com/llvm/llvm-project/issues/150531)
2025-08-08 | [IR] Remove size argument from lifetime intrinsics (#150248) | Nikita Popov | 1 | -5/+5
Now that #149310 has restricted lifetime intrinsics to only work on allocas, we can also drop the explicit size argument. Instead, the size is implied by the alloca. This removes the ability to only mark a prefix of an alloca alive/dead. We never used that capability, so we should remove the need to handle that possibility everywhere (though many key places, including stack coloring, did not actually respect this).
2025-07-25 | [MemDep] Optimize SortNonLocalDepInfoCache sorting strategy for large caches with few unsorted entries (#143107) | DingdWang | 1 | -20/+24
During compilation of large files with many branches, I observed that the function `SortNonLocalDepInfoCache` in `MemoryDependenceAnalysis` becomes a significant performance bottleneck. This is because `Cache.size()` can be very large (around 20,000), but only a small number of entries (approximately 5 to 8) actually need sorting. The original implementation performs a full sort in all cases, which is inefficient. This patch introduces a lightweight heuristic to quickly estimate the number of unsorted entries and choose a more efficient sorting method accordingly. As a result, the GVN pass runtime on a large file is reduced from approximately 26.3 minutes to 16.5 minutes.
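The idea described above can be sketched outside of LLVM as follows. This is only an illustration under stated assumptions: the cache is modelled as a plain `std::vector<int>`, the function name `sortMostlySortedCache` and the cutoff `FewEntriesThreshold` are hypothetical, and the actual patch may choose between the two strategies differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cutoff: below this many unsorted tail entries, inserting them
// one by one is assumed to be cheaper than re-sorting the whole cache.
constexpr std::size_t FewEntriesThreshold = 8;

// Sorts Cache, given that its first NumSortedEntries elements are already
// sorted and only the tail may be out of order.
void sortMostlySortedCache(std::vector<int> &Cache,
                           std::size_t NumSortedEntries) {
  assert(NumSortedEntries <= Cache.size() && "sorted prefix exceeds cache");
  const std::size_t NumUnsorted = Cache.size() - NumSortedEntries;
  if (NumUnsorted == 0)
    return; // Nothing was appended since the last sort.

  if (NumSortedEntries == 0 || NumUnsorted >= FewEntriesThreshold) {
    // Many unsorted entries: a full sort is the simplest choice.
    std::sort(Cache.begin(), Cache.end());
    return;
  }

  // Few unsorted entries: binary-search each one into the sorted prefix and
  // rotate it into place, growing the prefix by one element per step.
  for (std::size_t I = NumSortedEntries; I < Cache.size(); ++I) {
    auto Pos = std::upper_bound(Cache.begin(), Cache.begin() + I, Cache[I]);
    std::rotate(Pos, Cache.begin() + I, Cache.begin() + I + 1);
  }
}
```

With a cache of about 20,000 entries and fewer than ten new ones, the insertion path performs a handful of binary searches and element moves instead of a full O(n log n) sort.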
2025-07-15 | [DebugInfo][RemoveDIs] Suppress getNextNonDebugInfoInstruction (#144383) | Jeremy Morse | 1 | -1/+1
There are no longer debug-info instructions, thus we don't need this skipping. Hooray!
2025-06-17 | [DebugInfo][RemoveDIs] Remove a swathe of debug-intrinsic code (#144389) | Jeremy Morse | 1 | -8/+0
Seeing how we can't generate any debug intrinsics any more: delete a variety of codepaths where they're handled. For the most part these are plain deletions, in others I've tweaked comments to remain coherent, or added a type to (what was) type-generic-lambdas. This isn't all the DbgInfoIntrinsic call sites but it's most of the simple scenarios. Co-authored-by: Nikita Popov <github@npopov.com>
2025-04-21 | [LLVM] Cleanup pass initialization for Analysis passes (#135858) | Rahul Joshi | 1 | -3/+1
- Do not call pass initialization from pass constructors.
- Instead, pass initialization should happen in the `initializeAnalysis` function.
- https://github.com/llvm/llvm-project/issues/111767
2025-03-13 | [Analysis] Avoid repeated hash lookups (NFC) (#131066) | Kazu Hirata | 1 | -2/+4
2024-11-21 | [MemDepAnalysis] Don't reuse NonLocalPointerDeps cache if memory location size differs (#116936) | Arthur Eubanks | 1 | -34/+12
As seen in #111585, we can end up using a previous cache entry where the size was too large and was UB. Compile time impact: https://llvm-compile-time-tracker.com/compare.php?from=6a863f7e2679a60f2f38ae6a920d0b6e1a2c1690&to=faccf4e1f47fcd5360a438de2a56d02b770ad498&stat=instructions:u. Fixes #111585.
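A minimal sketch of the policy named in the commit title, assuming a simplified cache keyed by pointer. The types `CachedDeps` and `PointerDepCache` are hypothetical and do not reproduce MemDep's real data structures; dropping the stale entry on a mismatch is one plausible way to avoid reuse, not necessarily what the patch does.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical cache entry: the access size the result was computed for,
// plus an opaque list of dependency ids.
struct CachedDeps {
  std::uint64_t Size;
  std::vector<int> Deps;
};

class PointerDepCache {
  std::unordered_map<const void *, CachedDeps> Entries;

public:
  void store(const void *Ptr, std::uint64_t Size, std::vector<int> Deps) {
    assert(Ptr && "null pointers are not cached in this sketch");
    Entries[Ptr] = CachedDeps{Size, std::move(Deps)};
  }

  // Returns the cached result only when it was computed for exactly the same
  // access size. On a mismatch the stale entry is discarded, so the caller
  // has to recompute the dependencies for the new size.
  std::optional<std::vector<int>> lookup(const void *Ptr, std::uint64_t Size) {
    auto It = Entries.find(Ptr);
    if (It == Entries.end())
      return std::nullopt;
    if (It->second.Size != Size) {
      Entries.erase(It);
      return std::nullopt;
    }
    return It->second.Deps;
  }
};
```

The trade-off is extra recomputation whenever the same pointer is queried with different sizes, in exchange for never answering a query from a result computed for a different access width.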
2024-11-20 | [AA] Rename CaptureInfo -> CaptureAnalysis (NFC) (#116842) | Nikita Popov | 1 | -3/+3
I'd like to use the name CaptureInfo to represent the new attribute proposed at https://discourse.llvm.org/t/rfc-improvements-to-capture-tracking/81420, but it's already taken by AA, and I can't think of great alternatives (CaptureEffects would be something of a stretch). As such, I'd like to rename CaptureInfo -> CaptureAnalysis in AA, which also seems like the more accurate terminology.
2024-09-26 | [NFC] Reapply 3f37c517f, SmallDenseMap speedups | Jeremy Morse | 1 | -2/+2
This time with 100% more building unit tests. Original commit message follows. [NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109417) If we use SmallDenseMaps instead of DenseMaps at these locations, we get a substantial speedup because there's less spurious malloc traffic. Discovered by instrumenting DenseMap with some accounting code, then selecting sites where we'll get the most bang for our buck.
2024-09-25 | Revert "[NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109417)" | Jeremy Morse | 1 | -2/+2
This reverts commit 3f37c517fbc40531571f8b9f951a8610b4789cd6. Lo and behold, I missed a unit test.
2024-09-25 | [NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109417) | Jeremy Morse | 1 | -2/+2
If we use SmallDenseMaps instead of DenseMaps at these locations, we get a substantial speedup because there's less spurious malloc traffic. Discovered by instrumenting DenseMap with some accounting code, then selecting sites where we'll get the most bang for our buck.
2024-08-13 | [LLVM] Don't peek through bitcast on pointers and gep with zero indices. NFC. (#102889) | Yingwei Zheng | 1 | -41/+12
Since we are using opaque pointers now, we don't need to peek through bitcast on pointers and gep with zero indices.
2024-06-27 | [IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902) | Nikita Popov | 1 | -2/+2
This is a helper to avoid writing `getModule()->getDataLayout()`. I regularly try to use this method only to remember it doesn't exist... `getModule()->getDataLayout()` is also a common (the most common?) reason why code has to include the Module.h header.
2023-12-24 | [Analysis] Use range-based for loops (NFC) | Kazu Hirata | 1 | -9/+9
2023-10-24 | [Analysis] Add Scalable field in MemoryLocation.h (#69716) | Harvin Iriawan | 1 | -2/+6
This is the first of a series of patches to improve Alias Analysis on Scalable quantities. Keep Scalable information from TypeSize, which will be used in Alias Analysis.
2023-10-23 | [MemDep] Use EarliestEscapeInfo (#69727) | Nikita Popov | 1 | -7/+5
Use BatchAA with EarliestEscapeInfo instead of callCapturesBefore() in MemDepAnalysis. The advantage of this is that it will also take not-captured-before information into account for non-calls (see test_store_before_capture for a representative example), and that this is a cached analysis. The disadvantage is that EarliestEscapeInfo is slightly less precise than full CapturedBefore analysis. In practice the impact is positive, with gvn.NumGVNLoad going from 22022 to 22808 on test-suite. The impact to compile-time is also positive, mainly in the ThinLTO configuration.
2023-10-10 | [GVN] Drop Clobber dependency if store may overwrite only the same value (#68322) | Sergey Kachkov | 1 | -5/+40
In some cases a clobbering store can be safely skipped if it can only must-alias or no-alias the memory location and it writes the same value. This patch supports the simple case where the value was loaded from the memory location in the same basic block before the store and there are no modifications between them.
2023-02-27 | Revert "[GVN] Support address translation through select instructions" | Sergey Kachkov | 1 | -15/+4
This reverts commit b5bf6f6392a3408be1b7b7e036eb69358c5a2c29.
2023-02-27 | [GVN] Support address translation through select instructions | Sergey Kachkov | 1 | -4/+15
Handle the case where the phi's incoming value in a predecessor block is a select instruction whose address is unavailable, but addresses translated from both sides of the select instruction are available. Differential Revision: https://reviews.llvm.org/D142705
2023-02-03 | [NFC] PHITransAddr refactoring - return translated value directly or nullptr on failure (instead of bool flag) | Sergey Kachkov | 1 | -2/+2
Differential Revision: https://reviews.llvm.org/D143171
2023-02-02 | [NFC] Fix function naming conventions in PHITransAddr methods | Sergey Kachkov | 1 | -3/+3
Differential Revision: https://reviews.llvm.org/D143166
2023-01-17 | [GVN] Refactor handling of pointer-select in GVN pass | Sergey Kachkov | 1 | -1/+6
This patch extends Def memory dependency with support of select instructions to consistently handle pointer-select conversion. Differential Revision: https://reviews.llvm.org/D141619
2023-01-16 | Revert "[GVN] Refactor handling of pointer-select in GVN pass" | Sergey Kachkov | 1 | -6/+1
This reverts commit fc7cdaa373308ce3d72218b4d80101ae19850a6c.
2023-01-16 | [GVN] Refactor handling of pointer-select in GVN pass | Sergey Kachkov | 1 | -1/+6
This patch introduces a new type of memory dependency, Select, so that it is handled consistently with Def/Clobber dependencies. Differential Revision: https://reviews.llvm.org/D141619
2023-01-13 | [MemDep] Reduce block limit | Nikita Popov | 1 | -2/+2
The non-local MemDep analysis has a limit on the number of blocks it will scan trying to find dependencies. The current limit of 1000 is very high, especially when we consider that each block scan can also visit up to 100 instructions. In degenerate cases (where we actually scan that many blocks) MemDep/GVN dominate overall compile-time, for little benefit. This patch reduces the limit to 200, which is probably still too large, but at least mitigates some of the more catastrophic cases. (For comparison, MSSA clobber walks consider up to 100 MemoryDefs/MemoryPhis, rather than 200 blocks * 100 instructions, but these limits aren't directly comparable.) I know that we were kind of hoping that this issue would resolve itself in time, either by a switch to NewGVN or use of MSSA in GVN. But I think we should still address this in the meantime. Additionally, a switch to an MSSA-based implementation will effectively be doing this as well, in a roundabout way (by dint of MSSA having lower cutoffs than MDA). Differential Revision: https://reviews.llvm.org/D140097
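The effect of such a block limit can be illustrated with a small sketch. It assumes a toy CFG given as predecessor lists; the function name `scanPredecessors` and the give-up-on-limit behaviour shown here are illustrative assumptions, not the real MemDep code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Walks backwards through predecessor blocks starting at Start. Returns true
// if every reachable block was scanned, or false if the walk gave up because
// BlockLimit blocks had already been scanned (the caller would then have to
// answer conservatively). Scanned receives the blocks visited so far.
bool scanPredecessors(const std::vector<std::vector<int>> &Preds, int Start,
                      std::size_t BlockLimit, std::vector<int> &Scanned) {
  assert(Start >= 0 && static_cast<std::size_t>(Start) < Preds.size());
  std::vector<bool> Seen(Preds.size(), false);
  std::vector<int> Worklist{Start};
  Seen[Start] = true;
  Scanned.clear();

  while (!Worklist.empty()) {
    if (Scanned.size() >= BlockLimit)
      return false; // Budget exhausted: stop instead of scanning further.
    int BB = Worklist.back();
    Worklist.pop_back();
    Scanned.push_back(BB); // A real scan would inspect instructions here.
    for (int Pred : Preds[BB]) {
      if (!Seen[Pred]) {
        Seen[Pred] = true;
        Worklist.push_back(Pred);
      }
    }
  }
  return true;
}
```

Because each scanned block can itself visit up to 100 instructions, the block limit bounds the worst-case cost of a single query; lowering it from 1000 to 200 reduces that bound by a factor of five.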
2022-12-12 | [BasicAA] Remove support for PhiValues analysis | Nikita Popov | 1 | -14/+3
BasicAA currently has an optional dependency on the PhiValues analysis. However, at least with our current pipeline setup, we never actually make use of it. It's possible that this used to work with the legacy pass manager, but I'm not sure of that either. Given that this analysis has not actually been in use for a long time, and nobody noticed or complained, I think we should drop support for it and focus on one code path. It is worth noting that analysis quality for the non-PhiValues case has significantly improved in the meantime. If we really wanted to make use of PhiValues, the right way would probably be to pass it in via AAQI in places we want to use it, rather than using an optional pass manager dependency (which are an unpredictable PITA and should really only ever be used for analyses that are only preserved and not used). Differential Revision: https://reviews.llvm.org/D139719
2022-10-31 | [AliasAnalysis] Introduce getModRefInfoMask() as a generalization of pointsToConstantMemory(). | Patrick Walton | 1 | -1/+1
The pointsToConstantMemory() method returns true only if the memory pointed to by the memory location is globally invariant. However, the LLVM memory model also has the semantic notion of *locally-invariant*: memory that is known to be invariant for the life of the SSA value representing that pointer. The most common example of this is a pointer argument that is marked readonly noalias, which the Rust compiler frequently emits. It'd be desirable for LLVM to treat locally-invariant memory the same way as globally-invariant memory when it's safe to do so. This patch implements that, by introducing the concept of a *ModRefInfo mask*. A ModRefInfo mask is a bound on the Mod/Ref behavior of an instruction that writes to a memory location, based on the knowledge that the memory is globally-constant memory (in which case the mask is NoModRef) or locally-constant memory (in which case the mask is Ref). ModRefInfo values for an instruction can be combined with the ModRefInfo mask by simply using the & operator. Where appropriate, this patch has modified uses of pointsToConstantMemory() to instead examine the mask. The most notable optimization change I noticed with this patch is that now redundant loads from readonly noalias pointers can be eliminated across calls, even when the pointer is captured. Internally, before this patch, AliasAnalysis was assigning Ref to reads from constant memory; now AA can assign NoModRef, which is a tighter bound. Differential Revision: https://reviews.llvm.org/D136659
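The masking described above can be modelled in a few lines. This is a simplified stand-alone sketch: the two-bit `ModRefInfo` encoding mirrors the usual Ref/Mod bit layout, while `MemoryKind` and `maskFor` are hypothetical names introduced only for this illustration and are not LLVM APIs.

```cpp
#include <cstdint>

// Simplified model: bit 0 = may read (Ref), bit 1 = may write (Mod).
enum class ModRefInfo : std::uint8_t {
  NoModRef = 0,
  Ref = 1,
  Mod = 2,
  ModRef = 3,
};

// Intersect two ModRefInfo values, keeping only the effects allowed by both.
constexpr ModRefInfo operator&(ModRefInfo A, ModRefInfo B) {
  return static_cast<ModRefInfo>(static_cast<std::uint8_t>(A) &
                                 static_cast<std::uint8_t>(B));
}

// Hypothetical classification of the memory a pointer refers to.
enum class MemoryKind { Mutable, LocallyInvariant, GloballyInvariant };

// The mask bounds what any instruction can do to that memory:
//  - globally invariant: neither reads nor writes matter (NoModRef)
//  - locally invariant:  reads may matter, writes cannot happen (Ref)
//  - otherwise:          no restriction (ModRef)
constexpr ModRefInfo maskFor(MemoryKind Kind) {
  switch (Kind) {
  case MemoryKind::GloballyInvariant:
    return ModRefInfo::NoModRef;
  case MemoryKind::LocallyInvariant:
    return ModRefInfo::Ref;
  case MemoryKind::Mutable:
    return ModRefInfo::ModRef;
  }
  return ModRefInfo::ModRef; // Unreachable; keeps compilers quiet.
}

// A call that might read and write arbitrary memory is still bounded to Ref
// when the location is locally invariant.
static_assert((ModRefInfo::ModRef & maskFor(MemoryKind::LocallyInvariant)) ==
                  ModRefInfo::Ref,
              "locally-invariant memory cannot be modified");
```

Expressing the bound as a bit mask means a caller can tighten any existing mod/ref result with a single `&`, rather than branching on a boolean constant-memory query.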
2022-08-08 | [llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC | Fangrui Song | 1 | -2/+2
With C++17 there is no Clang pedantic warning or MSVC C5051.
2022-08-01 | [AA] Do not track Must in ModRefInfo | Nikita Popov | 1 | -1/+1
getModRefInfo() queries currently track whether the result is a MustAlias on a best-effort basis. The only user of this functionality is the optimized memory access type in MemorySSA -- which in turn has no users. Given that this functionality has not found a user since it was introduced five years ago (in D38862), I think we should drop it again. The context is that I'm working to separate FunctionModRefBehavior to track mod/ref for different location kinds (like argmem or inaccessiblemem) separately, and the fact that ModRefInfo also has an unrelated Must flag makes this quite awkward, especially as this means that NoModRef is not a zero value. If we want to retain the functionality, I would probably split getModRefInfo() results into a part that just contains the ModRef information, and a separate part containing a (best-effort) AliasResult. Differential Revision: https://reviews.llvm.org/D130713
2022-07-21 | [MemoryBuiltins] Add getFreedOperand() function (NFCI) | Nikita Popov | 1 | -4/+6
We currently assume in a number of places that free-like functions free their first argument. This is true for all hardcoded free-like functions, but with the new attribute-based design, the freed argument is supposed to be indicated by the allocptr attribute. To make sure we handle this correctly once allockind(free) is respected, add a getFreedOperand() helper which returns the freed argument, rather than just indicating whether the call frees *some* argument. This migrates most but not all users of isFreeCall() to the new API. The remaining users are a bit more tricky.
2022-06-08 | [NFC] Remove commented cerr debugging loggings | Chuanqi Xu | 1 | -2/+0
There are some unused, commented-out cerr debug logging statements in the code. It is odd to keep such commented-out debug helpers in the product.
2022-06-07 | Revert "[MemDep][NFCI] Remove redundant dyn_cast, replace with cast" | Philip Reames | 1 | -2/+2
This reverts commit 180d3f251d1ad5473705d3f00e6d426b5f8162e6. This commit is simply wrong. IsLoad is set within the same file based on modref state, not whether the instruction is a LoadInst. This went uncaught because cast<Ty>(X) has been broken. See https://discourse.llvm.org/t/cast-x-is-broken-implications-and-proposal-to-address/63033 for context.
2022-06-03 | [NFC][MemDep] Remove unnecessary Worklist.clear | Max Kazantsev | 1 | -1/+0
This execution path leads to return 'false' where the Worklist will be deallocated anyways. No need to clear it separately.
2022-05-30 | [MemDep][NFC] Remove duplicating check in `if` and `else` branch | Max Kazantsev | 1 | -7/+3
Same check is done whether the condition is true or false. Just hoist it out of conditional.
2022-05-30 | [MemDep][NFCI] Remove redundant dyn_cast, replace with cast | Max Kazantsev | 1 | -2/+2
When `IsLoad` is `true`, we don't need to check if the instruction is actually a load with dyn_cast. Saves some petty amount of CT.
2022-03-01 | Cleanup includes: LLVMAnalysis | serge-sans-paille | 1 | -7/+0
Number of lines output by preprocessor:
before: 1065940348
after: 1065307662
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120659
2022-02-18 | [MemoryDependency] Simplfy re-ordering condition. Cleanup. NFC. | Serguei Katkov | 1 | -26/+13
Make the reading of condition for restricting re-ordering simpler. Reviewers: reames Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D120005
2022-02-17 | [MemoryDependency] Relax the re-ordering of atomic store and unordered load/store | Serguei Katkov | 1 | -3/+20
Atomic store with Release semantic allows re-ordering of unordered load/store before the store. Implement it. Reviewers: reames Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D119844
2022-02-16 | [MemoryDependency] Relax the re-ordering with volatile store. | Serguei Katkov | 1 | -3/+1
Volatile store does not provide any special rules for reordering with atomics. Usual must-alias analysis is enough here. This makes the behavior similar to how volatile load is handled. Reviewers: reames, nikic Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D119818
2022-01-10 | [MemoryBuiltins] Remove isNoAliasFn() in favor of isNoAliasCall() | Nikita Popov | 1 | -1/+1
We currently have two similar implementations of this concept: isNoAliasCall() only checks for the noalias return attribute. isNoAliasFn() also checks for allocation functions. We should switch to only checking the attribute. SLC is responsible for inferring the noalias return attribute for non-new allocation functions (with a missing case fixed in https://github.com/llvm/llvm-project/commit/348bc76e3548c52dbcd442590ca0a7f5b09b7534). For new, clang is responsible for setting the attribute, if -fno-assume-sane-operator-new is not passed. Differential Revision: https://reviews.llvm.org/D116800
2021-11-20 | [llvm] Use range-based for loops (NFC) | Kazu Hirata | 1 | -3/+3
2021-05-31 | [NFC] MemoryDependenceAnalysis cleanup. | Daniil Fukalov | 1 | -16/+15
1. Removed redundant includes.
2. Removed `releaseMemory()`, which was never defined or used.
3. Fixed the first-letter case of member function names.
4. Renamed the duplicate name `NonLocalDeps` (in nested struct `NonLocalPointerInfo`) to `NonLocalDepsMap`.
Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D102358
2021-05-14 | [MemDep] Use BatchAA in more places (NFCI) | Nikita Popov | 1 | -9/+20
Previously, we already used BatchAA for individual simple pointer dependency queries. This extends BatchAA usage for the non-local case, so that only one BatchAA instance is used for all blocks, instead of one instance per block. Use of BatchAA is safe as IR cannot be modified during a MemDep query.
2021-05-14 | [AA] Support callCapturesBefore() on BatchAA (NFCI) | Nikita Popov | 1 | -2/+1
This is not expected to have any practical compile-time effect, as the alias() calls inside callCapturesBefore() are rare. This should still be supported for API completeness, and might be useful for reachability caching.
2021-05-14 | [GVN] Clobber partially aliased loads. | dfukalov | 1 | -7/+4
Use offsets stored in `AliasResult` implemented in D98718. Updated with fix of issue reported in https://reviews.llvm.org/D95543#2745161 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D95543
2021-05-11 | Revert "[GVN] Clobber partially aliased loads." | Jordan Rupprecht | 1 | -4/+7
This reverts commit 6c570442318e2d3b8b13e95c2f2f588d71491acb. It causes assertion errors due to widening atomic loads, and potentially causes miscompile elsewhere too. Repro, also posted to D95543:
```
$ cat repro.ll
; ModuleID = 'repro.ll'
source_filename = "repro.ll"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%struct.widget = type { i32 }
%struct.baz = type { i32, %struct.snork }
%struct.snork = type { %struct.spam }
%struct.spam = type { i32, i32 }

@global = external local_unnamed_addr global %struct.widget, align 4
@global.1 = external local_unnamed_addr global i8, align 1
@global.2 = external local_unnamed_addr global i32, align 4

define void @zot(%struct.baz* %arg) local_unnamed_addr align 2 {
bb:
  %tmp = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1
  %tmp1 = bitcast %struct.snork* %tmp to i64*
  %tmp2 = load i64, i64* %tmp1, align 4
  %tmp3 = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1, i32 0, i32 1
  %tmp4 = icmp ugt i64 %tmp2, 4294967295
  br label %bb5

bb5:                                              ; preds = %bb14, %bb
  %tmp6 = load i32, i32* %tmp3, align 4
  %tmp7 = icmp ne i32 %tmp6, 0
  %tmp8 = select i1 %tmp7, i1 %tmp4, i1 false
  %tmp9 = zext i1 %tmp8 to i8
  store i8 %tmp9, i8* @global.1, align 1
  %tmp10 = load i32, i32* @global.2, align 4
  switch i32 %tmp10, label %bb11 [
    i32 1, label %bb12
    i32 2, label %bb12
  ]

bb11:                                             ; preds = %bb5
  br label %bb14

bb12:                                             ; preds = %bb5, %bb5
  %tmp13 = load atomic i32, i32* getelementptr inbounds (%struct.widget, %struct.widget* @global, i64 0, i32 0) acquire, align 4
  br label %bb14

bb14:                                             ; preds = %bb12, %bb11
  br label %bb5
}

$ opt -O2 repro.ll -disable-output
opt: /home/rupprecht/src/llvm-project/llvm/lib/Transforms/Utils/VNCoercion.cpp:496: llvm::Value *llvm::VNCoercion::getLoadValueForLoad(llvm::LoadInst *, unsigned int, llvm::Type *, llvm::Instruction *, const llvm::DataLayout &): Assertion `SrcVal->isSimple() && "Cannot widen volatile/atomic load!"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0. Program arguments: /home/rupprecht/dev/opt -O2 repro.ll -disable-output
...
```
2021-04-24 | [GVN] Clobber partially aliased loads. | dfukalov | 1 | -7/+4
Use offsets stored in `AliasResult` implemented in D98718. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D95543
2021-04-09 | [NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset. | dfukalov | 1 | -8/+8
The main reason is to prepare for transforming AliasResult into a class that contains an offset for the PartialAlias case. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D98027
2021-04-01 | Extract isVolatile helper on Instruction [NFCI] | Philip Reames | 1 | -12/+2
We have this logic duplicated in several cases, none of which were exhaustive. Consolidate it in one place. I don't believe this actually impacts behavior of the callers. I think they all filter their inputs such that their partial implementations were correct. If not, this might be fixing a corner-case bug.