riscv-gnu-toolchain/llvm.git

Age	Commit message	Author	Files	Lines
2 days	[LAA] Revert 56a1cbb and 1aded51, due to crash (#160993)	Ramkumar Ramachandra	1	-34/+22
	This reverts commits 56a1cbb ([LAA] Fix non-NFC parts of 1aded51), 1aded51 ([LAA] Prepare to handle diff type sizes (NFC)). The original NFC patch caused some regressions, which the later patch tried to fix. However, the later patch is the cause of some crashes, and it would be best to revert both for now, and re-land after thorough testing.
2 days	Revert "[TTI][RISCV] Add cost modelling for intrinsic vp.load.ff (#160470)"	ShihPo Hung	1	-9/+0
	This reverts commit aa08b1a9963f33ded658d3ee655429e1121b5212.
3 days	[TTI][RISCV] Add cost modelling for intrinsic vp.load.ff (#160470)	Shih-Po Hung	1	-0/+9
	Split out from #151300 to isolate TargetTransformInfo cost modelling for fault-only-first loads from VPlan implementation details. This change adds costing support for vp.load.ff independently of the VPlan work. For now, model a vp.load.ff as cost-equivalent to a vp.load.
4 days	[NVPTX] Fix NaN + overflow semantics of f2ll/d2i (#159530)	Lewis Crawford	1	-8/+15
	Fix the NaN-handling semantics of various NVVM intrinsics converting from fp types to integer types. Previously in ConstantFolding, NaN inputs would be constant-folded to 0. However, v9.0 of the PTX spec states that:
	In float-to-integer conversions, depending upon conversion types, NaN input results in following value:
	* Zero if source is not `.f64` and destination is not `.s64`, `.u64`.
	* Otherwise `1 << (BitWidth(dst) - 1)` corresponding to the value of `(MAXINT >> 1) + 1` for unsigned type or `MININT` for signed type.
	Also, support for constant-folding +/-Inf and values which overflow/underflow the integer output type has been added (they clamp to min/max int). Because of this NaN-handling semantic difference, we also need to disable transforming several intrinsics to FPToSI/FPToUI, as the LLVM instruction will return poison, but the intrinsics have defined behaviour for edge cases like NaN/Inf/overflow.
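The edge cases described in this message can be illustrated with a small standalone model. This is a sketch only, not the ConstantFolding code: `modelF2I` and `nanPattern` are hypothetical helpers, the conversion is assumed to be the round-toward-zero f32 to s32 variant, and the NaN result follows the rule as quoted above.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper, not an LLVM or NVVM API: models an f32 -> s32
// conversion (round toward zero) with the edge cases described above.
int32_t modelF2I(float x) {
  if (std::isnan(x))
    return 0; // f32 source, 32-bit destination: the quoted rule gives zero
  if (x >= 2147483648.0f)
    return std::numeric_limits<int32_t>::max(); // +Inf and overflow clamp
  if (x <= -2147483648.0f)
    return std::numeric_limits<int32_t>::min(); // -Inf and underflow clamp
  return static_cast<int32_t>(x); // in range: truncate toward zero
}

// Hypothetical helper: the bit pattern the quoted rule gives for a NaN input
// in the "otherwise" case, i.e. 1 << (BitWidth(dst) - 1).
uint64_t nanPattern(unsigned dstBits) { return uint64_t{1} << (dstBits - 1); }
```

By contrast, a plain `fptosi`/`fptoui` on the same inputs would yield poison, which is why the message says the transform to those instructions has to be disabled.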
4 days	[LAA] Fix non-NFC parts of 1aded51 (#160701)	Ramkumar Ramachandra	1	-1/+2
	1aded51 ([LAA] Prepare to handle diff type sizes (NFC)) was supposed to be a non-functional patch, but it introduced functional changes, because the combination of known-non-negative and known-non-positive is not equivalent to !known-non-zero. Fix this.
4 days	[DropUnnecessaryAssumes] Add support for operand bundles (#160311)	Nikita Popov	1	-11/+20
	This extends the DropUnnecessaryAssumes pass to also handle operand bundle assumes. For this purpose, export the affected value analysis for operand bundles from AssumptionCache. If the bundle only affects ephemeral values, drop it. If all bundles on an assume are dropped, drop the whole assume.
5 days	[AssumptionCache] Don't use ResultElem for assumption list (NFC) (#160462)	Nikita Popov	1	-2/+2
	ResultElem stores a weak handle of an assume, plus an index for referring to a specific operand bundle. This makes sense for the results of assumptionsFor(), which refers to specific operands of assumes. However, assumptions() is a plain list of assumes. It does not contain separate entries for each operand bundle. The operand bundle index is always ExprResultIdx. As such, we should be directly using WeakVH for this case, without the additional wrapper.
5 days	[InstSimplify] Consider vscale_range for get active lane mask (#160073)	Matthew Devereau	1	-1/+18
	Scalable get_active_lane_mask intrinsic calls can be simplified to an i1 splat (ptrue) when their constant range is larger than or equal to the maximum possible number of elements, which can be inferred from vscale_range(x, y).
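The reasoning behind this simplification can be checked with a small model. The sketch below is not the InstSimplify code. It assumes lane `i` is active iff `base + i < n` (unsigned, non-wrapping) and that a scalable vector has `minElts * vscale` lanes; `isAlwaysAllTrue` and `foldIsSound` are hypothetical names.

```cpp
#include <cstdint>
#include <vector>

// Reference model: lane i is active iff base + i < n, with the addition
// treated as non-wrapping.
std::vector<bool> activeLaneMask(uint64_t base, uint64_t n, uint64_t lanes) {
  std::vector<bool> mask(lanes, false);
  for (uint64_t i = 0; i < lanes; ++i)
    mask[i] = base < n && i < n - base;
  return mask;
}

// Hypothetical predicate: the mask is all-true for every legal vscale when
// the constant range n - base covers the largest possible lane count.
bool isAlwaysAllTrue(uint64_t base, uint64_t n, uint64_t minElts,
                     uint64_t maxVscale) {
  return base < n && n - base >= minElts * maxVscale;
}

// Checks the predicate against the reference model for each vscale in
// [1, maxVscale]. Returns true when the predicate makes no claim.
bool foldIsSound(uint64_t base, uint64_t n, uint64_t minElts,
                 uint64_t maxVscale) {
  if (!isAlwaysAllTrue(base, n, minElts, maxVscale))
    return true;
  for (uint64_t vscale = 1; vscale <= maxVscale; ++vscale)
    for (bool lane : activeLaneMask(base, n, minElts * vscale))
      if (!lane)
        return false;
  return true;
}
```

For example, with 4 elements per vscale and an upper vscale bound of 4, a range of 16 is enough for an all-true mask, while a range of 15 is not.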
5 days	[ConstantFolding] Avoid use of isNonIntegralPointerType()	Alexander Richardson	1	-7/+7
	Avoiding any new inttoptr is unnecessarily restrictive for "plain" non-integral pointers, but it is important for unstable pointers and pointers with external state. Fixes another test codegen regression from https://github.com/llvm/llvm-project/pull/105735. Reviewed By: nikic Pull Request: https://github.com/llvm/llvm-project/pull/159959
6 days	[GVN/MemDep] Limit the size of the cache for non-local dependencies. (#150539)	Alina Sbirlea	1	-0/+8
	An attempt to resolve the issue flagged in [PR150531](https://github.com/llvm/llvm-project/issues/150531)
6 days	[TTI][ASan][RISCV] reland Move InterestingMemoryOperand to Analysis and embed in MemIntrinsicInfo #157863 (#159713)	Hank Chang	1	-0/+1
	[Previously reverted due to failures on asan-rvv-intrinsics.ll; the test case is RISC-V only, but it was triggered by other targets.] Reland [#157863](https://github.com/llvm/llvm-project/pull/157863), and add `; REQUIRES: riscv-registered-target` to the test case to skip configurations that do not register the RISC-V target. Previously, ASan treated target intrinsics as black boxes, so it could not instrument accurate checks. This patch makes SmallVector<InterestingMemoryOperand> a member of MemIntrinsicInfo so that TTI lets targets describe their intrinsic information to ASan. Note: 1. This patch moves InterestingMemoryOperand from Transforms to Analysis. 2. It extends MemIntrinsicInfo by adding a SmallVector<InterestingMemoryOperand> member. 3. It does not support RVV indexed/segment load/store.
8 days	[llvm][Analysis] Silence warning when building with MSVC	Alexandre Ganea	1	-1/+2
	When building an assert-enabled target, silence the following:
	```
	C:\git\llvm-project\llvm\include\llvm/Analysis/DependenceAnalysis.h(290): warning C4018: '<=': signed/unsigned mismatch
	```
9 days	[InstCombine] Generalise optimisation of redundant floating point comparisons with `ConstantFPRange` (#159315)	Rajveer Singh Bharadwaj	1	-28/+43
	Follow up of #158097. Similar to `simplifyAndOrOfICmpsWithConstants`, we can do so for floating point comparisons.
9 days	[ValueTracking] a - b == NonZero -> a != b (#159792)	Yingwei Zheng	1	-1/+21
	Alive2: https://alive2.llvm.org/ce/z/8rX5Rk Closes https://github.com/llvm/llvm-project/issues/118106.
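The identity in the subject line holds for wrapping integer arithmetic, which an exhaustive 8-bit check illustrates. This is a standalone sketch, not the ValueTracking change itself, and `subNonZeroIffNotEqual` is a hypothetical name.

```cpp
#include <cstdint>

// Exhaustively checks, for 8-bit wrapping arithmetic, that a - b is non-zero
// exactly when a != b. So if a - b equals a value known to be non-zero, a and
// b must differ.
bool subNonZeroIffNotEqual() {
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned b = 0; b < 256; ++b) {
      uint8_t diff = static_cast<uint8_t>(a - b); // wraps modulo 256
      if ((diff != 0) != (a != b))
        return false;
    }
  return true;
}
```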
10 days	[DependenceAnalysis] Extending SIV to handle fusable loops (#128782)	Alireza Torabian	1	-158/+303
	When there is a dependency between two memory instructions in separate loops that have the same iteration space and depth, SIV will be able to test them and compute the direction and the distance of the dependency.
10 days	[KnownBits] Add setAllConflict to set all bits in Zero and One. NFC (#159815)	Craig Topper	1	-14/+8
	This is a common pattern to initialize Knownbits that occurs before loops that call intersectWith.
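A minimal sketch of the pattern this message refers to, using a simplified 8-bit stand-in rather than the real `llvm::KnownBits` class. It assumes `intersectWith` keeps only the facts common to both inputs (a bitwise AND of each mask), so the all-conflict state acts as the identity for the loop. `Bits`, `fromConstant` and `commonBits` are hypothetical names.

```cpp
#include <cstdint>
#include <initializer_list>

// Simplified 8-bit stand-in for llvm::KnownBits; not the real class.
struct Bits {
  uint8_t Zero; // bits known to be 0
  uint8_t One;  // bits known to be 1

  // Mark every bit as both known-zero and known-one.
  void setAllConflict() {
    Zero = 0xFF;
    One = 0xFF;
  }

  // Keep only the facts that hold for both inputs.
  Bits intersectWith(Bits other) const {
    return Bits{static_cast<uint8_t>(Zero & other.Zero),
                static_cast<uint8_t>(One & other.One)};
  }
};

// All bits of a constant are known.
Bits fromConstant(uint8_t v) { return Bits{static_cast<uint8_t>(~v), v}; }

// Start from all-conflict, then intersect with every value in the list.
Bits commonBits(std::initializer_list<Bits> values) {
  Bits known{0, 0};
  known.setAllConflict();
  for (Bits v : values)
    known = known.intersectWith(v);
  return known;
}
```

Starting from all-conflict means the first intersection simply yields the first value, so the loop needs no special case for its first iteration.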
10 days	[LLVM][SCEV] Look through common vscale multiplicand when simplifying compares. (#141798)	Paul Walker	1	-1/+20
	My use case is simplifying the control flow generated by LoopVectorize when vectorising loops whose tripcount is a function of the runtime vector length. This can be problematic because:
	* CSE is a pre-LoopVectorize transform and so it's common for an IR function to include several calls to llvm.vscale(). (NOTE: Code generation will typically remove the duplicates)
	* Pre-LoopVectorize instcombines will rewrite some multiplies as shifts. This leads to a mismatch between VL based maths of the scalar loop and that created for the vector loop, which prevents some obvious simplifications.
	SCEV does not suffer these issues because it effectively does CSE during construction and shifts are represented as multiplies.
10 days	[DA] Add overflow check in ExactSIV (#157086)	Ryotaro Kasuga	1	-1/+13
	This patch adds an overflow check to the `exactSIVtest` function to fix the issue demonstrated in the test case added in #157085. This patch only fixes one of the routines. To fully resolve the test case, the other functions need to be addressed as well.
10 days	Revert "[TTI][ASan][RISCV] Move InterestingMemoryOperand to Analysis and embed in MemIntrinsicInfo" (#159700)	Florian Mayer	1	-1/+0
	Reverts llvm/llvm-project#157863
10 days	[TTI][ASan][RISCV] Move InterestingMemoryOperand to Analysis and embed in MemIntrinsicInfo (#157863)	Hank Chang	1	-0/+1
	Previously, ASan treated target intrinsics as black boxes, so it could not instrument accurate checks. This patch makes SmallVector<InterestingMemoryOperand> a member of MemIntrinsicInfo so that TTI lets targets describe their intrinsic information to ASan. Note: 1. This patch moves InterestingMemoryOperand from Transforms to Analysis. 2. It extends MemIntrinsicInfo by adding a SmallVector<InterestingMemoryOperand> member. 3. It does not support RVV indexed/segment load/store.
11 days	[LAA] Prepare to handle diff type sizes (NFC) (#122318)	Ramkumar Ramachandra	1	-22/+33
	As depend_diff_types shows, there are several places where the HasSameSize check can be relaxed for higher analysis precision. As a first step, return both the source size and the sink size from getDependenceDistanceStrideAndSize, along with a HasSameSize boolean for the moment.
12 days	[PatternMatch] Introduce match functor (NFC) (#159386)	Ramkumar Ramachandra	2	-13/+8
	A common idiom is the usage of the PatternMatch match function within a functional algorithm like all_of. Introduce a match functor to shorten this idiom. Co-authored-by: Luke Lau <luke@igalia.com>
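The idiom can be sketched without LLVM headers. Everything below is a toy stand-in: `IsZero`, `match`, `MatchFunctor` and `matchFn` are hypothetical names chosen for illustration, not the PatternMatch API added by the patch.

```cpp
#include <algorithm>
#include <vector>

// Toy pattern: matches the integer zero.
struct IsZero {
  bool match(int v) const { return v == 0; }
};

// Toy counterpart of a free match(value, pattern) function.
template <typename Val, typename Pattern>
bool match(Val v, const Pattern &p) {
  return p.match(v);
}

// Functor that binds a pattern so it can be passed directly to algorithms.
template <typename Pattern> struct MatchFunctor {
  Pattern P;
  template <typename Val> bool operator()(Val v) const { return match(v, P); }
};

template <typename Pattern> MatchFunctor<Pattern> matchFn(Pattern p) {
  return MatchFunctor<Pattern>{p};
}

// Before: a lambda wraps the call to match().
bool allZeroWithLambda(const std::vector<int> &vs) {
  return std::all_of(vs.begin(), vs.end(),
                     [](int v) { return match(v, IsZero{}); });
}

// After: the functor replaces the lambda.
bool allZeroWithFunctor(const std::vector<int> &vs) {
  return std::all_of(vs.begin(), vs.end(), matchFn(IsZero{}));
}
```

Both functions return the same result for any input; the functor form only removes the boilerplate lambda.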
12 days	Reapply "[SCEV] Fold (C1 * A /u C2) -> A /u (C2 /u C1), if C2 > C1." (#158328)	Florian Hahn	1	-6/+15
	This reverts commit fd58f235f8c5bd40d98acfd8e7fb11d41de301c7. The recommit contains an extra check to make sure that D is a multiple of C2, if C2 > C1. This fixes the issue causing the revert fd58f235f8c. Tests have been added in 6a726e9a4d3d0. Original message: If C2 >u C1 and C1 >u 1, fold to A /u (C2 /u C1). Depends on https://github.com/llvm/llvm-project/pull/157555. Alive2 Proof: https://alive2.llvm.org/ce/z/BWvQYN PR: https://github.com/llvm/llvm-project/pull/157656
12 days	[CaptureTracking] Fix handling for non-returning read-only calls (#158979)	Nikita Popov	1	-9/+9
	We currently infer `captures(none)` for calls that are read-only, nounwind and willreturn. As pointed out in https://github.com/llvm/llvm-project/issues/129090, this is not correct even with this set of pre-conditions, because the function could conditionally cause UB depending on the address. As such, change this logic to instead report `captures(address)`. This also allows dropping the nounwind and willreturn checks, as these can also only capture the address.
12 days	[BasicAA] Handle scalable vectors in new errno aliasing checks. (#159248)	David Green	1	-1/+2
	This is a minor fixup for scalable vectors after f9f62ef4ae555a. It handles them in the same way as other memory locations that are larger than errno, preventing the failure on implicit conversion from a scalable location.
13 days	[DA] Add option to run only SIV routines (#157084)	Ryotaro Kasuga	1	-0/+14
	This patch introduces a new option, `da-run-siv-routines-only`, which runs only the SIV family routines in the DA. This is useful for testing (regression tests, not dependence tests) as it helps detect behavioral changes in the SIV routines. Actually, regarding the test cases added in #157085, fixing the incorrect result requires changes across multiple functions (at a minimum, `exactSIVtest`, `gcdMIVtest` and `symbolicRDIVtest`). It is difficult to address all of them at once. This patch also generates the CHECK directives using the new option for `ExactSIV.ll` as it is necessary for subsequent patches. However, I believe it will also be useful for other `xxSIV.ll` tests. Notably, the SIV family routines tend to be affected by other routines, as they are typically invoked at the beginning of the overall analysis.
13 days	[AA] Refine ModRefInfo taking into account `errnomem` location	Antonio Frighetto	3	-1/+56
	Ensure alias analyses mask out `errnomem` location, refining the resulting modref info, when the given access/location does not alias errno. This may occur either when TBAA proves there is no alias with errno (e.g., float TBAA for the same root would be disjoint with the int-only compatible TBAA node for errno); or if the memory access size is larger than the integer size, or when the underlying object is a potentially-escaping alloca. Previous discussion: https://discourse.llvm.org/t/rfc-modelling-errno-memory-effects/82972.
13 days	[InstCombine] Optimize redundant floating point comparisons in `or`/`and` inst's (#158097)	Rajveer Singh Bharadwaj	1	-0/+29
	Resolves #157371. We can eliminate one of the `fcmp` instructions when two `olt` or two `ogt` comparisons are matched in `or`/`and` simplification.
13 days	[ValueTracking] Don't take sign bit from NaN operands (#157250)	Yingwei Zheng	1	-0/+5
	Closes https://github.com/llvm/llvm-project/issues/157238.
13 days	[DA] Remove base pointers from subscripts (NFCI) (#157083)	Ryotaro Kasuga	1	-4/+8
	This patch removes base pointers from subscripts when delinearization fails. Previously, in such cases, the pointer type SCEVs were used instead of offset SCEVs derived from them. For example, here is a portion of the debug output when analyzing `strong0` in `test/Analysis/DependenceAnalysis/StrongSIV.ll`:
	```
	testing subscript 0, SIV
	    src = {(8 + %A),+,4}<nuw><%for.body>
	    dst = {(8 + %A),+,4}<nuw><%for.body>
	    Strong SIV test
	        Coeff = 4, i64
	        SrcConst = (8 + %A), ptr
	        DstConst = (8 + %A), ptr
	        Delta = 0, i64
	        UpperBound = (-1 + %n), i64
	        Distance = 0
	        Remainder = 0
	```
	As shown above, the `SrcConst` and `DstConst` are pointer values rather than integer offsets. `%A` should be removed. This change is necessary for #157086, since `ScalarEvolution::willNotOverflow` expects integer type SCEVs as arguments. This change alone should not affect the analysis results.
13 days	[SCEV] Don't perform implication checks with many predicates (#158652)	Nikita Popov	1	-2/+7
	When adding a new predicate to a union, we currently do a bidirectional implication for all the contained predicates. This means that the number of implication checks is quadratic in the number of total predicates (if they don't end up being eliminated). Fix this by not checking for implication if the number of predicates grows too large. The expectation is that if there is a large number of predicates, we should be discarding them later anyway, as expanding them would be too expensive. Fixes https://github.com/llvm/llvm-project/issues/156114.
2025-09-12	Revert "[SCEV] Fold (C1 * A /u C2) -> A /u (C2 /u C1), if C2 > C1." (#158328)	Reid Kleckner	1	-13/+5
	Reverts llvm/llvm-project#157656 There are multiple reports that this is causing miscompiles in the MSan test suite after bootstrapping and that this is causing miscompiles in rustc. Let's revert for now, and work to capture a reproducer next week.
2025-09-12	[SCEV] Fix a hang introduced by collectForPHI (#158153)	Philip Reames	1	-0/+9
	If we have a phi where one of its source blocks is an unreachable block, we don't want to traverse back into the unreachable region. Doing so allows e.g. finding a trivial self loop when walking back the predecessor chain.
2025-09-12	[InstSimplify] Simplify get.active.lane.mask when 2nd arg is zero (#158018)	David Sherwood	1	-0/+4
	When the second argument passed to the get.active.lane.mask intrinsic is zero we can simplify the instruction to return an all-false mask regardless of the first operand.
2025-09-12	[VPlan] Always consider register pressure on RISC-V (#156951)	Luke Lau	1	-0/+4
	Stacked on #156923. In https://godbolt.org/z/8svWaredK, we spill a lot on RISC-V because whilst the largest element type is i8, we generate a bunch of pointer vectors for gathers and scatters. This means the VF chosen is quite high e.g. <vscale x 16 x i8>, but we end up using a bunch of <vscale x 16 x i64> m8 registers for the pointers. This was briefly fixed by #132190 where we computed register pressure in VPlan and used it to prune VFs that were likely to spill. The legacy cost model wasn't able to do this pruning because it didn't have visibility into the pointer vectors that were needed for the gathers/scatters. However VF pruning was restricted again to just the case when max bandwidth was enabled in #141736 to avoid an AArch64 regression, and restricted again in #149056 to only prune VFs that had max bandwidth enabled. On RISC-V we take advantage of register grouping for performance and choose a default of LMUL 2, which means there are 16 registers to work with, half as many as SVE, so we encounter higher register pressure more frequently. As such, we likely want to always consider pruning VFs with high register pressure and not just the VFs from max bandwidth. This adds a TTI hook to opt into this behaviour for RISC-V which fixes the motivating godbolt example above. When last checked this significantly reduces the number of spills on SPEC CPU 2017, up to 80% on 538.imagick_r.
2025-09-12	Revert "[LoopInfo] Pointer to stack object may not be loop invariant in a coroutine function (#149936)" (#157986)	Weibo He	1	-17/+5
	Since #156788 has resolved #149604, we can revert this workaround now.
2025-09-11	[ConstFold] Don't crash on ConstantExprs when folding get_active_lane_m.	Florian Hahn	1	-3/+3
	Check if operands are ConstantInt to avoid crashing on constant expression after https://github.com/llvm/llvm-project/pull/156659.
2025-09-11	[ConstantFolding] Fold scalable get_active_lane_masks (#156659)	Matthew Devereau	1	-0/+7
	Scalable get_active_lane_mask intrinsics with a range of 0 can be lowered to zeroinitializer. This helps remove no-op scalable masked stores and loads.
2025-09-11	[SCEV] Fold (C1 * A /u C2) -> A /u (C2 /u C1), if C2 > C1. (#157656)	Florian Hahn	1	-5/+13
	If C2 >u C1 and C1 >u 1, fold to A /u (C2 /u C1). Depends on https://github.com/llvm/llvm-project/pull/157555. Alive2 Proof: https://alive2.llvm.org/ce/z/BWvQYN PR: https://github.com/llvm/llvm-project/pull/157656
2025-09-10	[LVI] Support no constant range of cast value in getEdgeValueLocal. (#157614)	Andreas Jonson	1	-0/+18
	proof: https://alive2.llvm.org/ce/z/8emkHY
2025-09-10	[AMDGPU] Propagate Constants for Wave Reduction Intrinsics (#150395)	Aaditya	1	-0/+14

2025-09-10	[LLVM][LangRef] Remove "n > 0" restriction from get.active.lane.mask. (#152140)	Paul Walker	1	-7/+0
	The specification for get.active.lane.mask says a limit value of zero results in poison. This seems like an artificial restriction and means you cannot use the intrinsic to create minimal loops of the form:
	```
	foo(int count, ....) {
	  int i = 0;
	  while (mask = get.active.lane.mask(i, count)) {
	    ; do work
	    i += count_bits(mask);
	  }
	}
	```
	I cannot see any code that generates poison in this case; in fact, ConstantFoldFixedVectorCall returns the logical result (i.e. an all-false vector). There are also cases like `can_overflow_i64_induction_var` in sve-tail-folding-overflow-checks.ll that look broken by the current definition for the case when "%N <= vscale * 4".
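The loop shape in this message can be modelled in plain C++ to show why an all-false result for a zero limit is the natural behaviour. This is a sketch under assumptions: `Lanes` is an arbitrary fixed vector width, the mask sets lane `k` iff `base + k < n`, and `countIterations` is a hypothetical name.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>

constexpr std::size_t Lanes = 8; // assumed fixed vector width, for illustration

// Model of the mask: lane k is active iff base + k < n (non-wrapping).
std::bitset<Lanes> activeLaneMask(uint64_t base, uint64_t n) {
  std::bitset<Lanes> mask;
  for (std::size_t k = 0; k < Lanes; ++k)
    mask[k] = base < n && k < n - base;
  return mask;
}

// Runs the minimal loop and returns how many times its body executes.
unsigned countIterations(uint64_t count) {
  unsigned iterations = 0;
  uint64_t i = 0;
  while (true) {
    std::bitset<Lanes> mask = activeLaneMask(i, count);
    if (mask.none())
      break; // an all-false mask ends the loop
    ++iterations; // "do work" on the active lanes
    i += mask.count();
  }
  return iterations;
}
```

With `count == 0` the first mask is already all-false, so the body never runs and no separate guard is needed before the loop.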
2025-09-10	[SCEV] Fold ((-1 * C1) * D / C1) -> -1 * D. (#157555)	Florian Hahn	1	-6/+10
	Treat negative constants C as -1 * abs(C1) when folding multiplies and udivs. Alive2 Proof: https://alive2.llvm.org/ce/z/bdj9W2 PR: https://github.com/llvm/llvm-project/pull/157555
2025-09-10	[LAA] Strip findForkedPointer (NFC) (#140298)	Ramkumar Ramachandra	1	-39/+28
	Remove a level of indirection due to findForkedPointer, in an effort to improve code.
2025-09-09	[InstCombine] Support GEP chains in foldCmpLoadFromIndexedGlobal() (#157447)	Nikita Popov	1	-0/+64
	Currently this fold only supports a single GEP. However, in ptradd representation, it may be split across multiple GEPs. In particular, PR #151333 will split off constant offset GEPs. To support this, add a new helper decomposeLinearExpression(), which decomposes a pointer into a linear expression of the form BasePtr + Index * Scale + Offset. I plan to also extend this helper to look through mul/shl on the index and use it in more places that currently use collectOffset() to extract a single index * scale. This will make sure such optimizations are not affected by the ptradd migration.
2025-09-09	[SCEV] Generalize (C * A /u C) -> A fold to (C1 * A /u C2) -> C1/C2 * A. (#157159)	Florian Hahn	1	-6/+9
	Generalize fold added in 74ec38fad0a1289 (https://github.com/llvm/llvm-project/pull/156730) to support multiplying and dividing by different constants, given they are both powers-of-2 and C1 is a multiple of C2, checked via logBase2. https://alive2.llvm.org/ce/z/eqJ2xj PR: https://github.com/llvm/llvm-project/pull/157159
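The arithmetic behind this fold can be checked exhaustively at 8 bits. The sketch below is independent of the SCEV implementation and assumes the multiply does not wrap; `foldHolds` and `checkPow2MulDivFold` are hypothetical names.

```cpp
#include <cstdint>

// Compares both sides of the fold in plain unsigned arithmetic:
// (C1 * A) /u C2  versus  (C1 /u C2) * A.
bool foldHolds(unsigned c1, unsigned c2, unsigned a) {
  return (c1 * a) / c2 == (c1 / c2) * a;
}

// Exhaustive 8-bit check for powers of two C1 and C2 where C1 is a multiple
// of C2. Products that would wrap at 8 bits are skipped (assumed no-wrap).
bool checkPow2MulDivFold() {
  for (unsigned s1 = 0; s1 < 8; ++s1)
    for (unsigned s2 = 0; s2 <= s1; ++s2) {
      unsigned c1 = 1u << s1, c2 = 1u << s2;
      for (unsigned a = 0; a < 256; ++a) {
        if (c1 * a > 255)
          continue;
        if (!foldHolds(c1, c2, a))
          return false;
      }
    }
  return true;
}
```

The multiple-of condition matters: with C1 = 2, C2 = 4 and A = 2, the left side is 1 while `(C1 /u C2) * A` is 0, so `foldHolds(2, 4, 2)` is false.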
2025-09-08	[HashRecognize] Clarify hdr comment on GF(2^n) (NFC) (#157482)	Ramkumar Ramachandra	1	-8/+8
	Unify explanation for GF(2^n) and GF(2), which was previously convoluted.
2025-09-08	[HashRecognize] Strip excess-TC check (#157479)	Ramkumar Ramachandra	1	-1/+1
	Checking if trip-count exceeds 256 is no longer necessary, as we have moved away from KnownBits computations to pattern-matching, which is very cheap and independent of TC.
2025-09-08	[InstCombine][VectorCombine][NFC] Unify uses of lossless inverse cast (#156597)	Hongyu Chen	1	-0/+51
	This patch addresses https://github.com/llvm/llvm-project/pull/155216#discussion_r2297724663. It adds a helper function to put the inverse cast on constants, optionally preserving cast flags. Follow-up patches will add trunc/ext handling on VectorCombine and flags preservation on InstCombine.
2025-09-07	[nfc][ir2vec] Remove `Valid` field (#157132)	Mircea Trofin	1	-8/+1
	It is tied to the vocab having been set. Checking that vector's `empty` is sufficient. Less state to track (for a maintainer).