path: root/llvm/lib/Analysis
Age | Commit message | Author | Files | Lines
2 days | [LAA] Revert 56a1cbb and 1aded51, due to crash (#160993) | Ramkumar Ramachandra | 1 | -34/+22
This reverts commits 56a1cbb ([LAA] Fix non-NFC parts of 1aded51), 1aded51 ([LAA] Prepare to handle diff type sizes (NFC)). The original NFC patch caused some regressions, which the later patch tried to fix. However, the later patch is the cause of some crashes, and it would be best to revert both for now, and re-land after thorough testing.
2 days | Revert "[TTI][RISCV] Add cost modelling for intrinsic vp.load.ff (#160470)" | ShihPo Hung | 1 | -9/+0
This reverts commit aa08b1a9963f33ded658d3ee655429e1121b5212.
3 days | [TTI][RISCV] Add cost modelling for intrinsic vp.load.ff (#160470) | Shih-Po Hung | 1 | -0/+9
Split out from #151300 to isolate TargetTransformInfo cost modelling for fault-only-first loads from VPlan implementation details. This change adds costing support for vp.load.ff independently of the VPlan work. For now, model a vp.load.ff as cost-equivalent to a vp.load.
4 days | [NVPTX] Fix NaN + overflow semantics of f2ll/d2i (#159530) | Lewis Crawford | 1 | -8/+15
Fix the NaN-handling semantics of various NVVM intrinsics converting from fp types to integer types. Previously in ConstantFolding, NaN inputs would be constant-folded to 0. However, v9.0 of the PTX spec states that:
In float-to-integer conversions, depending upon conversion types, NaN input results in following value:
* Zero if source is not `.f64` and destination is not `.s64`, `.u64`.
* Otherwise `1 << (BitWidth(dst) - 1)` corresponding to the value of `(MAXINT >> 1) + 1` for unsigned type or `MININT` for signed type.
Also, support for constant-folding +/-Inf and values which overflow/underflow the integer output type has been added (they clamp to min/max int). Because of this NaN-handling semantic difference, we also need to disable transforming several intrinsics to FPToSI/FPToUI, as the LLVM instruction will return poison, but the intrinsics have defined behaviour for these edge-cases like NaN/Inf/overflow.
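The NaN and clamping rules quoted in this entry can be modelled in a few lines. The sketch below is only an illustration, not the ConstantFolding code from the patch: `foldFpToInt` and its parameters are hypothetical names, the NaN rule follows a literal reading of the quoted spec text, and rounding toward zero is assumed for in-range values.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical model of the folding rules described above.
// Returns the result as a raw bit pattern in the low dstBits bits.
uint64_t foldFpToInt(double v, bool srcIsF64, unsigned dstBits, bool dstSigned) {
  assert(dstBits == 32 || dstBits == 64);
  const uint64_t signBit = uint64_t{1} << (dstBits - 1);
  const uint64_t allOnes =
      dstBits == 64 ? ~uint64_t{0} : (uint64_t{1} << dstBits) - 1;

  // NaN: zero unless the source is f64 or the destination is 64 bits wide.
  if (std::isnan(v))
    return (!srcIsF64 && dstBits != 64) ? 0 : signBit;

  if (dstSigned) {
    const double limit = std::ldexp(1.0, static_cast<int>(dstBits) - 1);
    if (v >= limit) return signBit - 1; // clamp to MAXINT (covers +Inf)
    if (v < -limit) return signBit;     // clamp to MININT (covers -Inf)
    // In range: truncate toward zero (assumed rounding mode).
    return static_cast<uint64_t>(static_cast<int64_t>(std::trunc(v))) & allOnes;
  }

  const double limit = std::ldexp(1.0, static_cast<int>(dstBits));
  if (v >= limit) return allOnes; // clamp to the unsigned maximum
  if (v <= -1.0) return 0;        // clamp negative inputs to zero
  return static_cast<uint64_t>(std::trunc(v));
}
```

Under this reading, an f32-to-s32 conversion of NaN still folds to 0, while f32-to-s64 (f2ll) and f64-to-s32 (d2i) fold to the sign-bit pattern, which matches the two intrinsics named in the commit title.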
4 days | [LAA] Fix non-NFC parts of 1aded51 (#160701) | Ramkumar Ramachandra | 1 | -1/+2
1aded51 ([LAA] Prepare to handle diff type sizes (NFC)) was supposed to be a non-functional patch, but introduced functional changes as known-non-negative and known-non-positive is not equivalent to !known-non-zero. Fix this.
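The distinction this entry draws can be seen with a small interval model. This is a hedged sketch with hypothetical names (`Range`, `knownZero`, `mayBeZero`), not LAA or ScalarEvolution code: "known non-negative and known non-positive" means the value is proven to be zero, whereas "not known non-zero" only means zero has not been ruled out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical inclusive value range standing in for what an analysis knows.
struct Range {
  int64_t lo;
  int64_t hi;
};

bool knownNonNegative(Range r) { return r.lo >= 0; }
bool knownNonPositive(Range r) { return r.hi <= 0; }
bool knownNonZero(Range r) { return r.lo > 0 || r.hi < 0; }

// Proven to be exactly zero.
bool knownZero(Range r) {
  assert(r.lo <= r.hi);
  return knownNonNegative(r) && knownNonPositive(r);
}

// Zero merely has not been excluded.
bool mayBeZero(Range r) {
  assert(r.lo <= r.hi);
  return !knownNonZero(r);
}
```

For the range [-1, 1], `knownZero` is false but `mayBeZero` is true, so swapping one predicate for the other changes behaviour.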
4 days | [DropUnnecessaryAssumes] Add support for operand bundles (#160311) | Nikita Popov | 1 | -11/+20
This extends the DropUnnecessaryAssumes pass to also handle operand bundle assumes. For this purpose, export the affected value analysis for operand bundles from AssumptionCache. If the bundle only affects ephemeral values, drop it. If all bundles on an assume are dropped, drop the whole assume.
5 days | [AssumptionCache] Don't use ResultElem for assumption list (NFC) (#160462) | Nikita Popov | 1 | -2/+2
ResultElem stores a weak handle of an assume, plus an index for referring to a specific operand bundle. This makes sense for the results of assumptionsFor(), which refers to specific operands of assumes. However, assumptions() is a plain list of assumes. It does *not* contain separate entries for each operand bundle. The operand bundle index is always ExprResultIdx. As such, we should be directly using WeakVH for this case, without the additional wrapper.
5 days | [InstSimplify] Consider vscale_range for get active lane mask (#160073) | Matthew Devereau | 1 | -1/+18
A scalable get_active_lane_mask intrinsic call can be simplified to an i1 splat (ptrue) when its constant range is larger than or equal to the maximum possible number of elements, which can be inferred from vscale_range(x, y).
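The condition described here reduces to a single comparison. The sketch below is illustrative only: `isAllTrueLaneMask` and its parameters are hypothetical, the operands are assumed to be known constants, and lane `i` is taken to be active when `base + i < limit` without wraparound.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical check: a <vscale x minElts x i1> mask is all-true for every
// permitted vscale if the range [base, limit) covers the largest possible
// element count, minElts * vscaleMax.
bool isAllTrueLaneMask(uint64_t base, uint64_t limit, uint64_t minElts,
                       uint64_t vscaleMax) {
  assert(minElts > 0 && vscaleMax > 0);
  if (limit < base)
    return false; // empty range, no lane is active
  if (minElts > UINT64_MAX / vscaleMax)
    return false; // element count would overflow; stay conservative
  return limit - base >= minElts * vscaleMax;
}
```

For example, with `vscale_range(1, 4)` and a `<vscale x 4 x i1>` result there are at most 16 lanes, so a constant range of 16 or more yields an all-true mask.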
5 days | [ConstantFolding] Avoid use of isNonIntegralPointerType() | Alexander Richardson | 1 | -7/+7
Avoiding any new inttoptr is unnecessarily restrictive for "plain" non-integral pointers, but it is important for unstable pointers and pointers with external state. Fixes another test codegen regression from https://github.com/llvm/llvm-project/pull/105735. Reviewed By: nikic Pull Request: https://github.com/llvm/llvm-project/pull/159959
6 days | [GVN/MemDep] Limit the size of the cache for non-local dependencies. (#150539) | Alina Sbirlea | 1 | -0/+8
An attempt to resolve the issue flagged in [PR150531](https://github.com/llvm/llvm-project/issues/150531)
6 days | [TTI][ASan][RISCV] reland Move InterestingMemoryOperand to Analysis and embed in MemIntrinsicInfo #157863 (#159713) | Hank Chang | 1 | -0/+1
[Previously reverted due to failures on asan-rvv-intrinsics.ll; the test case is riscv only and it was triggered by other targets.] Reland [#157863](https://github.com/llvm/llvm-project/pull/157863), and add `; REQUIRES: riscv-registered-target` in the test case to skip configurations that don't register the riscv target. Previously, asan treated target intrinsics as black boxes, so it could not instrument accurate checks. This patch makes SmallVector<InterestingMemoryOperand> a member of MemIntrinsicInfo so that TTI can let targets describe their intrinsic information to asan. Note:
1. This patch moves InterestingMemoryOperand from Transforms to Analysis.
2. It extends MemIntrinsicInfo by adding a SmallVector<InterestingMemoryOperand> member.
3. It does not support RVV indexed/segment load/store.
8 days | [llvm][Analysis] Silence warning when building with MSVC | Alexandre Ganea | 1 | -1/+2
When building an assert-enabled target, silence the following:
```
C:\git\llvm-project\llvm\include\llvm/Analysis/DependenceAnalysis.h(290): warning C4018: '<=': signed/unsigned mismatch
```
9 days | [InstCombine] Generalise optimisation of redundant floating point comparisons with `ConstantFPRange` (#159315) | Rajveer Singh Bharadwaj | 1 | -28/+43
Follow up of #158097. Similar to `simplifyAndOrOfICmpsWithConstants`, we can do so for floating point comparisons.
9 days | [ValueTracking] a - b == NonZero -> a != b (#159792) | Yingwei Zheng | 1 | -1/+21
Alive2: https://alive2.llvm.org/ce/z/8rX5Rk Closes https://github.com/llvm/llvm-project/issues/118106.
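The fact behind this fold is that a wrapping difference is zero exactly when the operands are equal, so a difference known to be non-zero implies `a != b`. The linked Alive2 proof covers the real transform; the sketch below is only an exhaustive 8-bit sanity check with a hypothetical function name.

```cpp
#include <cassert>
#include <cstdint>

// Exhaustively checks, over all 8-bit pairs, that (a - b) != 0 (with
// wraparound) holds exactly when a != b.
bool subNonZeroMatchesNotEqual() {
  for (unsigned a = 0; a < 256; ++a) {
    for (unsigned b = 0; b < 256; ++b) {
      const uint8_t diff = static_cast<uint8_t>(a - b); // modular subtraction
      if ((diff != 0) != (a != b))
        return false;
    }
  }
  return true;
}
```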
10 days | [DependenceAnalysis] Extending SIV to handle fusable loops (#128782) | Alireza Torabian | 1 | -158/+303
When there is a dependency between two memory instructions in separate loops that have the same iteration space and depth, SIV will be able to test them and compute the direction and the distance of the dependency.
10 days | [KnownBits] Add setAllConflict to set all bits in Zero and One. NFC (#159815) | Craig Topper | 1 | -14/+8
This is a common pattern to initialize KnownBits that occurs before loops that call intersectWith.
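The reason an "all conflict" state is a useful starting point is that it acts as the identity for intersection. The sketch below uses a hypothetical 8-bit `MiniKnownBits` type rather than LLVM's `KnownBits` class, and assumes that intersecting means AND-ing both the Zero and One masks.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical 8-bit stand-in for a known-bits record.
struct MiniKnownBits {
  uint8_t Zero; // bits known to be 0
  uint8_t One;  // bits known to be 1

  // Every bit marked as both 0 and 1: the identity for intersectWith.
  void setAllConflict() { Zero = 0xFF; One = 0xFF; }

  // Keep only the facts that hold for both operands.
  MiniKnownBits intersectWith(MiniKnownBits rhs) const {
    return MiniKnownBits{static_cast<uint8_t>(Zero & rhs.Zero),
                         static_cast<uint8_t>(One & rhs.One)};
  }

  static MiniKnownBits makeConstant(uint8_t c) {
    return MiniKnownBits{static_cast<uint8_t>(~c), c};
  }
};

// The pattern from the commit message: initialise, then intersect in a loop.
MiniKnownBits commonBits(const std::vector<uint8_t> &values) {
  MiniKnownBits known{0, 0};
  known.setAllConflict();
  for (uint8_t v : values)
    known = known.intersectWith(MiniKnownBits::makeConstant(v));
  // Once at least one value has been merged, no bit is both 0 and 1.
  assert(values.empty() || (known.Zero & known.One) == 0);
  return known;
}
```

For the inputs `0x0C` and `0x0A`, only bit 3 is known to be one and bits 0 and 4 to 7 are known to be zero.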
10 days | [LLVM][SCEV] Look through common vscale multiplicand when simplifying compares. (#141798) | Paul Walker | 1 | -1/+20
My use case is simplifying the control flow generated by LoopVectorize when vectorising loops whose trip count is a function of the runtime vector length. This can be problematic because:
* CSE is a pre-LoopVectorize transform and so it's common for an IR function to include several calls to llvm.vscale(). (NOTE: Code generation will typically remove the duplicates)
* Pre-LoopVectorize instcombines will rewrite some multiplies as shifts. This leads to a mismatch between VL based maths of the scalar loop and that created for the vector loop, which prevents some obvious simplifications.
SCEV does not suffer these issues because it effectively does CSE during construction and shifts are represented as multiplies.
10 days | [DA] Add overflow check in ExactSIV (#157086) | Ryotaro Kasuga | 1 | -1/+13
This patch adds an overflow check to the `exactSIVtest` function to fix the issue demonstrated in the test case added in #157085. This patch only fixes one of the routines. To fully resolve the test case, the other functions need to be addressed as well.
10 days | Revert "[TTI][ASan][RISCV] Move InterestingMemoryOperand to Analysis and embed in MemIntrinsicInfo" (#159700) | Florian Mayer | 1 | -1/+0
Reverts llvm/llvm-project#157863
10 days | [TTI][ASan][RISCV] Move InterestingMemoryOperand to Analysis and embed in MemIntrinsicInfo (#157863) | Hank Chang | 1 | -0/+1
Previously, asan treated target intrinsics as black boxes, so it could not instrument accurate checks. This patch makes SmallVector<InterestingMemoryOperand> a member of MemIntrinsicInfo so that TTI can let targets describe their intrinsic information to asan. Note:
1. This patch moves InterestingMemoryOperand from Transforms to Analysis.
2. It extends MemIntrinsicInfo by adding a SmallVector<InterestingMemoryOperand> member.
3. It does not support RVV indexed/segment load/store.
11 days | [LAA] Prepare to handle diff type sizes (NFC) (#122318) | Ramkumar Ramachandra | 1 | -22/+33
As depend_diff_types shows, there are several places where the HasSameSize check can be relaxed for higher analysis precision. As a first step, return both the source size and the sink size from getDependenceDistanceStrideAndSize, along with a HasSameSize boolean for the moment.
12 days | [PatternMatch] Introduce match functor (NFC) (#159386) | Ramkumar Ramachandra | 2 | -13/+8
A common idiom is the usage of the PatternMatch match function within a functional algorithm like all_of. Introduce a match functor to shorten this idiom. Co-authored-by: Luke Lau <luke@igalia.com>
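The idiom and its shortened form can be illustrated without LLVM. Everything in this sketch is a hypothetical stand-in: `IsEven` plays the role of a pattern, `match` the role of the PatternMatch entry point, and `matchFn` the role of the new functor; the real names and signatures in PatternMatch.h may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Stand-in "pattern": anything with a match(value) member.
struct IsEven {
  bool match(int v) const { return v % 2 == 0; }
};

// Stand-in for the free match(value, pattern) function.
template <typename Pattern> bool match(int v, const Pattern &p) {
  return p.match(v);
}

// The functor wraps a pattern so it can be passed straight to an algorithm.
template <typename Pattern> struct MatchFunctor {
  Pattern P;
  bool operator()(int v) const { return match(v, P); }
};

template <typename Pattern> MatchFunctor<Pattern> matchFn(Pattern p) {
  return MatchFunctor<Pattern>{p};
}

// Before: a lambda that only forwards to match().
bool allEvenLambda(const std::vector<int> &vs) {
  return std::all_of(vs.begin(), vs.end(),
                     [](int v) { return match(v, IsEven{}); });
}

// After: the functor replaces the lambda.
bool allEvenFunctor(const std::vector<int> &vs) {
  return std::all_of(vs.begin(), vs.end(), matchFn(IsEven{}));
}
```

Both helpers return the same result for any input; the functor form simply removes the boilerplate lambda.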
12 days | Reapply "[SCEV] Fold (C1 * A /u C2) -> A /u (C2 /u C1), if C2 > C1." (#158328) | Florian Hahn | 1 | -6/+15
This reverts commit fd58f235f8c5bd40d98acfd8e7fb11d41de301c7. The recommit contains an extra check to make sure that D is a multiple of C2, if C2 > C1. This fixes the issue causing the revert fd58f235f8c. Tests have been added in 6a726e9a4d3d0. Original message: If C2 >u C1 and C1 >u 1, fold to A /u (C2 /u C1). Depends on https://github.com/llvm/llvm-project/pull/157555. Alive2 Proof: https://alive2.llvm.org/ce/z/BWvQYN PR: https://github.com/llvm/llvm-project/pull/157656
12 days | [CaptureTracking] Fix handling for non-returning read-only calls (#158979) | Nikita Popov | 1 | -9/+9
We currently infer `captures(none)` for calls that are read-only, nounwind and willreturn. As pointed out in https://github.com/llvm/llvm-project/issues/129090, this is not correct even with this set of pre-conditions, because the function could conditionally cause UB depending on the address. As such, change this logic to instead report `captures(address)`. This also allows dropping the nounwind and willreturn checks, as these can also only capture the address.
12 days | [BasicAA] Handle scalable vectors in new errno aliasing checks. (#159248) | David Green | 1 | -1/+2
This is a minor fixup for scalable vectors after f9f62ef4ae555a. It handles them in the same way as other memory locations that are larger than errno, preventing the failure on implicit conversion from a scalable location.
13 days | [DA] Add option to run only SIV routines (#157084) | Ryotaro Kasuga | 1 | -0/+14
This patch introduces a new option, `da-run-siv-routines-only`, which runs only the SIV family routines in the DA. This is useful for testing (regression tests, not dependence tests) as it helps detect behavioral changes in the SIV routines. Actually, regarding the test cases added in #157085, fixing the incorrect result requires changes across multiple functions (at a minimum, `exactSIVtest`, `gcdMIVtest` and `symbolicRDIVtest`). It is difficult to address all of them at once. This patch also generates the CHECK directives using the new option for `ExactSIV.ll` as it is necessary for subsequent patches. However, I believe it will also be useful for other `xxSIV.ll` tests. Notably, the SIV family routines tend to be affected by other routines, as they are typically invoked at the beginning of the overall analysis.
13 days | [AA] Refine ModRefInfo taking into account `errnomem` location | Antonio Frighetto | 3 | -1/+56
Ensure alias analyses mask out `errnomem` location, refining the resulting modref info, when the given access/location does not alias errno. This may occur either when TBAA proves there is no alias with errno (e.g., float TBAA for the same root would be disjoint with the int-only compatible TBAA node for errno); or if the memory access size is larger than the integer size, or when the underlying object is a potentially-escaping alloca. Previous discussion: https://discourse.llvm.org/t/rfc-modelling-errno-memory-effects/82972.
13 days | [InstCombine] Optimize redundant floating point comparisons in `or`/`and` inst's (#158097) | Rajveer Singh Bharadwaj | 1 | -0/+29
Resolves #157371. We can eliminate one of the `fcmp` instructions when we have two identical `olt` or `ogt` instructions matched in `or`/`and` simplification.
13 days | [ValueTracking] Don't take sign bit from NaN operands (#157250) | Yingwei Zheng | 1 | -0/+5
Closes https://github.com/llvm/llvm-project/issues/157238.
13 days | [DA] Remove base pointers from subscripts (NFCI) (#157083) | Ryotaro Kasuga | 1 | -4/+8
This patch removes base pointers from subscripts when delinearization fails. Previously, in such cases, the pointer type SCEVs were used instead of offset SCEVs derived from them. For example, here is a portion of the debug output when analyzing `strong0` in `test/Analysis/DependenceAnalysis/StrongSIV.ll`:
```
testing subscript 0, SIV
src = {(8 + %A),+,4}<nuw><%for.body>
dst = {(8 + %A),+,4}<nuw><%for.body>
Strong SIV test
Coeff = 4, i64
SrcConst = (8 + %A), ptr
DstConst = (8 + %A), ptr
Delta = 0, i64
UpperBound = (-1 + %n), i64
Distance = 0
Remainder = 0
```
As shown above, the `SrcConst` and `DstConst` are pointer values rather than integer offsets. `%A` should be removed. This change is necessary for #157086, since `ScalarEvolution::willNotOverflow` expects integer type SCEVs as arguments. This change alone should not affect the analysis results.
13 days | [SCEV] Don't perform implication checks with many predicates (#158652) | Nikita Popov | 1 | -2/+7
When adding a new predicate to a union, we currently do a bidirectional implication for all the contained predicates. This means that the number of implication checks is quadratic in the number of total predicates (if they don't end up being eliminated). Fix this by not checking for implication if the number of predicates grows too large. The expectation is that if there is a large number of predicates, we should be discarding them later anyway, as expanding them would be too expensive. Fixes https://github.com/llvm/llvm-project/issues/156114.
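The capping strategy described in this entry can be sketched with a toy predicate type. This is not the ScalarEvolution implementation: `UpperBoundPred` (modelling `x <u Bound`), `PredicateUnion`, and the cap parameter are all hypothetical, and the real threshold and predicate kinds are not stated in the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy predicate "x <u Bound"; a tighter bound implies a looser one.
struct UpperBoundPred {
  unsigned Bound;
};

bool implies(UpperBoundPred a, UpperBoundPred b) { return a.Bound <= b.Bound; }

class PredicateUnion {
  std::vector<UpperBoundPred> Preds;
  std::size_t MaxForImplication; // hypothetical cap on implication checking

public:
  explicit PredicateUnion(std::size_t cap) : MaxForImplication(cap) {}

  void add(UpperBoundPred n) {
    // Only pay for the bidirectional implication checks while the union is
    // small; past the cap, append without checking.
    if (Preds.size() < MaxForImplication) {
      for (const UpperBoundPred &p : Preds)
        if (implies(p, n))
          return; // n is already implied, nothing to add
      Preds.erase(std::remove_if(Preds.begin(), Preds.end(),
                                 [&](UpperBoundPred p) { return implies(n, p); }),
                  Preds.end()); // drop predicates that n makes redundant
    }
    Preds.push_back(n);
  }

  std::size_t size() const { return Preds.size(); }
};

// Helper for experimenting with different caps.
std::size_t sizeAfterAdding(std::size_t cap, const std::vector<unsigned> &bounds) {
  PredicateUnion u(cap);
  for (unsigned b : bounds)
    u.add(UpperBoundPred{b});
  return u.size();
}
```

With a cap of 4, adding bounds 10, 20 and 5 leaves a single predicate; with a cap of 0, no checks run and all three are kept, trading redundancy for linear cost.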
2025-09-12 | Revert "[SCEV] Fold (C1 * A /u C2) -> A /u (C2 /u C1), if C2 > C1." (#158328) | Reid Kleckner | 1 | -13/+5
Reverts llvm/llvm-project#157656. There are multiple reports that this is causing miscompiles in the MSan test suite after bootstrapping and that this is causing miscompiles in rustc. Let's revert for now, and work to capture a reproducer next week.
2025-09-12 | [SCEV] Fix a hang introduced by collectForPHI (#158153) | Philip Reames | 1 | -0/+9
If we have a phi where one of its source blocks is an unreachable block, we don't want to traverse back into the unreachable region. Doing so allows e.g. finding a trivial self loop when walking back the predecessor chain.
2025-09-12 | [InstSimplify] Simplify get.active.lane.mask when 2nd arg is zero (#158018) | David Sherwood | 1 | -0/+4
When the second argument passed to the get.active.lane.mask intrinsic is zero we can simplify the instruction to return an all-false mask regardless of the first operand.
2025-09-12 | [VPlan] Always consider register pressure on RISC-V (#156951) | Luke Lau | 1 | -0/+4
Stacked on #156923. In https://godbolt.org/z/8svWaredK, we spill a lot on RISC-V because whilst the largest element type is i8, we generate a bunch of pointer vectors for gathers and scatters. This means the VF chosen is quite high e.g. <vscale x 16 x i8>, but we end up using a bunch of <vscale x 16 x i64> m8 registers for the pointers. This was briefly fixed by #132190 where we computed register pressure in VPlan and used it to prune VFs that were likely to spill. The legacy cost model wasn't able to do this pruning because it didn't have visibility into the pointer vectors that were needed for the gathers/scatters. However, VF pruning was restricted again to just the case when max bandwidth was enabled in #141736 to avoid an AArch64 regression, and restricted again in #149056 to only prune VFs that had max bandwidth enabled. On RISC-V we take advantage of register grouping for performance and choose a default of LMUL 2, which means there are 16 registers to work with, half as many as SVE, so we encounter higher register pressure more frequently. As such, we likely want to always consider pruning VFs with high register pressure and not just the VFs from max bandwidth. This adds a TTI hook to opt into this behaviour for RISC-V which fixes the motivating godbolt example above. When last checked, this significantly reduces the number of spills on SPEC CPU 2017, up to 80% on 538.imagick_r.
2025-09-12 | Revert "[LoopInfo] Pointer to stack object may not be loop invariant in a coroutine function (#149936)" (#157986) | Weibo He | 1 | -17/+5
Since #156788 has resolved #149604, we can revert this workaround now.
2025-09-11 | [ConstFold] Don't crash on ConstantExprs when folding get_active_lane_m. | Florian Hahn | 1 | -3/+3
Check if operands are ConstantInt to avoid crashing on constant expression after https://github.com/llvm/llvm-project/pull/156659.
2025-09-11 | [ConstantFolding] Fold scalable get_active_lane_masks (#156659) | Matthew Devereau | 1 | -0/+7
Scalable get_active_lane_mask intrinsics with a range of 0 can be lowered to zeroinitializer. This helps remove no-op scalable masked stores and loads.
2025-09-11 | [SCEV] Fold (C1 * A /u C2) -> A /u (C2 /u C1), if C2 > C1. (#157656) | Florian Hahn | 1 | -5/+13
If C2 >u C1 and C1 >u 1, fold to A /u (C2 /u C1). Depends on https://github.com/llvm/llvm-project/pull/157555. Alive2 Proof: https://alive2.llvm.org/ce/z/BWvQYN PR: https://github.com/llvm/llvm-project/pull/157656
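The arithmetic behind this fold can be sanity-checked by brute force. The sketch below is not the SCEV code and does not replace the linked Alive2 proof: it assumes, beyond what this message states, that the multiply does not wrap and that C2 is a multiple of C1, and it models 8-bit values with a hypothetical `foldHolds` helper.

```cpp
#include <cassert>

// Checks (C1 * A) /u C2 == A /u (C2 /u C1) for every 8-bit A whose product
// C1 * A still fits in 8 bits (the assumed no-unsigned-wrap condition).
bool foldHolds(unsigned c1, unsigned c2) {
  assert(c1 > 1 && c2 > c1 && c2 % c1 == 0); // assumed preconditions
  for (unsigned a = 0; a <= 255; ++a) {
    const unsigned product = c1 * a;
    if (product > 255)
      continue; // the multiply would wrap in 8 bits; outside the assumption
    if (product / c2 != a / (c2 / c1))
      return false;
  }
  return true;
}
```

Writing C2 = C1 * k, the left side is floor(C1 * A / (C1 * k)) = floor(A / k), which is the right side, so the check passes for any such pair.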
2025-09-10 | [LVI] Support no constant range of cast value in getEdgeValueLocal. (#157614) | Andreas Jonson | 1 | -0/+18
proof: https://alive2.llvm.org/ce/z/8emkHY
2025-09-10 | [AMDGPU] Propagate Constants for Wave Reduction Intrinsics (#150395) | Aaditya | 1 | -0/+14
2025-09-10 | [LLVM][LangRef] Remove "n > 0" restriction from get.active.lanes.mask. (#152140) | Paul Walker | 1 | -7/+0
The specification for get.active.lanes.mask says a limit value of zero results in poison. This seems like an artificial restriction and means you cannot use the intrinsic to create minimal loops of the form:
```
foo(int count, ....) {
  int i = 0;
  while (mask = get.active.lane.mask(i, count)) {
    ; do work
    i += count_bits(mask);
  }
}
```
I cannot see any code that generates poison in this case; in fact, ConstantFoldFixedVectorCall returns the logical result (i.e. an all-false vector). There are also cases like `can_overflow_i64_induction_var` in sve-tail-folding-overflow-checks.ll that look broken by the current definition for the case when "%N <= vscale * 4".
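The loop shape from this message can be made runnable with a scalar model of the mask. This is a sketch under stated assumptions: a hypothetical fixed width of four lanes (`kLanes`), lane `i` active when `base + i < n` without wraparound, and an all-false mask for a limit of zero, which is the behaviour the commit argues for.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

constexpr unsigned kLanes = 4; // hypothetical fixed vector width

// Scalar model of the mask: lane i is active iff base + i < n.
std::vector<bool> getActiveLaneMask(uint64_t base, uint64_t n) {
  std::vector<bool> mask(kLanes);
  for (unsigned i = 0; i < kLanes; ++i)
    mask[i] = base + i < n;
  return mask;
}

struct LoopStats {
  uint64_t processed; // elements handled in total
  unsigned trips;     // number of times the loop body ran
};

// The "minimal loop" from the message: run while any lane is active.
LoopStats runMaskedLoop(uint64_t count) {
  LoopStats stats{0, 0};
  for (;;) {
    const std::vector<bool> mask = getActiveLaneMask(stats.processed, count);
    const unsigned active =
        static_cast<unsigned>(std::count(mask.begin(), mask.end(), true));
    if (active == 0)
      break;        // all-false mask ends the loop, including count == 0
    ++stats.trips;  // the masked work would go here
    stats.processed += active;
  }
  assert(stats.processed == count);
  return stats;
}
```

With `count == 0` the body never runs, which is only well defined if a zero limit produces an all-false mask rather than poison.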
2025-09-10 | [SCEV] Fold ((-1 * C1) * D / C1) -> -1 * D. (#157555) | Florian Hahn | 1 | -6/+10
Treat negative constants C as -1 * abs(C) when folding multiplies and udivs. Alive2 Proof: https://alive2.llvm.org/ce/z/bdj9W2 PR: https://github.com/llvm/llvm-project/pull/157555
2025-09-10 | [LAA] Strip findForkedPointer (NFC) (#140298) | Ramkumar Ramachandra | 1 | -39/+28
Remove a level of indirection due to findForkedPointer, in an effort to improve code.
2025-09-09 | [InstCombine] Support GEP chains in foldCmpLoadFromIndexedGlobal() (#157447) | Nikita Popov | 1 | -0/+64
Currently this fold only supports a single GEP. However, in ptradd representation, it may be split across multiple GEPs. In particular, PR #151333 will split off constant offset GEPs. To support this, add a new helper decomposeLinearExpression(), which decomposes a pointer into a linear expression of the form BasePtr + Index * Scale + Offset. I plan to also extend this helper to look through mul/shl on the index and use it in more places that currently use collectOffset() to extract a single index * scale. This will make sure such optimizations are not affected by the ptradd migration.
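The decomposition into `BasePtr + Index * Scale + Offset` can be pictured as a walk over the GEP chain that accumulates constant offsets and records a single scaled index. The sketch below is a simplified model, not the `decomposeLinearExpression()` helper itself: `GepStep`, `LinearExpr`, and `decomposeChain` are hypothetical, each variable step is treated as a distinct index, and offsets are plain byte counts.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// One ptradd-style step: either a scaled variable index (Scale != 0) or a
// constant byte offset, or both.
struct GepStep {
  int64_t Scale;  // multiplier of the variable index; 0 means no index
  int64_t Offset; // constant byte offset
};

// Result relative to the base pointer: Index * Scale + Offset.
struct LinearExpr {
  int64_t Scale;
  int64_t Offset;
};

// Folds a chain of steps into one linear expression, giving up if more than
// one variable index appears (assumption of this sketch).
std::optional<LinearExpr> decomposeChain(const std::vector<GepStep> &chain) {
  LinearExpr result{0, 0};
  for (const GepStep &step : chain) {
    result.Offset += step.Offset; // constant offsets simply add up
    if (step.Scale == 0)
      continue;
    if (result.Scale != 0)
      return std::nullopt; // second variable index: not a single Index * Scale
    result.Scale = step.Scale;
  }
  return result;
}
```

A chain of "+8", "+ i * 4", "+16" folds to `Index * 4 + 24`, the same expression a single GEP would have produced before constant offsets were split off.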
2025-09-09 | [SCEV] Generalize (C * A /u C) -> A fold to (C1 * A /u C2) -> C1/C2 * A. (#157159) | Florian Hahn | 1 | -6/+9
Generalize fold added in 74ec38fad0a1289 (https://github.com/llvm/llvm-project/pull/156730) to support multiplying and dividing by different constants, given they are both powers-of-2 and C1 is a multiple of C2, checked via logBase2. https://alive2.llvm.org/ce/z/eqJ2xj PR: https://github.com/llvm/llvm-project/pull/157159
2025-09-08 | [HashRecognize] Clarify hdr comment on GF(2^n) (NFC) (#157482) | Ramkumar Ramachandra | 1 | -8/+8
Unify explanation for GF(2^n) and GF(2), which was previously convoluted.
2025-09-08 | [HashRecognize] Strip excess-TC check (#157479) | Ramkumar Ramachandra | 1 | -1/+1
Checking if trip-count exceeds 256 is no longer necessary, as we have moved away from KnownBits computations to pattern-matching, which is very cheap and independent of TC.
2025-09-08 | [InstCombine][VectorCombine][NFC] Unify uses of lossless inverse cast (#156597) | Hongyu Chen | 1 | -0/+51
This patch addresses https://github.com/llvm/llvm-project/pull/155216#discussion_r2297724663. It adds a helper function to put the inverse cast on constants, with cast flags optionally preserved. Follow-up patches will add trunc/ext handling on VectorCombine and flags preservation on InstCombine.
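The idea of a lossless inverse cast on a constant is that the constant can be narrowed only if widening it again reproduces the original value. The sketch below illustrates that round-trip check with fixed 32-bit and 8-bit types; the function names are hypothetical and the real helper works on LLVM constants and arbitrary types.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Inverse of zext i8 -> i32: succeeds only if the upper 24 bits are zero.
std::optional<uint8_t> losslessTruncForZExt(uint32_t c) {
  const uint8_t narrow = static_cast<uint8_t>(c);
  if (static_cast<uint32_t>(narrow) != c)
    return std::nullopt; // zero-extending back would not give c
  return narrow;
}

// Inverse of sext i8 -> i32: succeeds only if c fits in a signed 8-bit value.
std::optional<int8_t> losslessTruncForSExt(int32_t c) {
  if (c < INT8_MIN || c > INT8_MAX)
    return std::nullopt; // sign-extending back would not give c
  return static_cast<int8_t>(c);
}
```

The constant 200 can be pulled through a zext but not a sext, while -1 can be pulled through a sext.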
2025-09-07 | [nfc][ir2vec] Remove `Valid` field (#157132) | Mircea Trofin | 1 | -8/+1
It is tied to the vocab having been set. Checking that vector's `empty` is sufficient. Less state to track (for a maintainer).