|
We should handle allocator attributes not only on function
declarations, but also at the call site. That way we can, for
example, also optimize cases where the allocator function is a
virtual function call.
This was already supported in some of the MemoryBuiltins helpers,
but not all of them. This adds support for allocsize, alloc-family
and allockind("free").
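A minimal IR sketch of what this enables (callee and attribute values
are illustrative):
```
; The allocator is reached through an indirect (e.g. virtual) call, so
; there is no declaration to carry the attributes; they are placed on
; the call site itself and are now honored by MemoryBuiltins.
define ptr @alloc_via_vtable(ptr %fn, i64 %n) {
  %p = call ptr %fn(i64 %n) allocsize(0) "alloc-family"="my_alloc"
  ret ptr %p
}
```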
|
|
This reverts commit ccb2b011e577e861254f61df9c59494e9e122b38.
Causes buildbot failures, e.g. on ppc64le builders.
|
|
Hosts that support a 128-bit float type can benefit from constant
fp128 folding.
|
|
This reverts commit 967185eeb85abb77bd6b6cdd2b026d5c54b7d4f3.
The problem was link dependencies; `UseCtxProfile` has been moved to
`Analysis`.
|
|
MustExec has special logic to determine whether the first loop iteration
will always be executed, by simplifying the IV comparison with the start
value. Currently, this code assumes that the IV is on the LHS of the
comparison, but this is not guaranteed. Make sure it handles the
commuted variant as well.
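A hedged sketch of the two forms (names illustrative); MustExec
simplifies the comparison against the IV's start value (0 here) to
decide whether %body runs on the first iteration:
```
define void @f(ptr %p, i32 %n) {
entry:
  br label %loop
loop:
  %iv = phi i32 [ 0, %entry ], [ %iv.next, %latch ]
  ; previously recognized: IV on the LHS
  ;   %c = icmp slt i32 %iv, %n
  ; now also recognized: the commuted form with the IV on the RHS
  %c = icmp sgt i32 %n, %iv
  br i1 %c, label %body, label %latch
body:
  %v = load i32, ptr %p
  br label %latch
latch:
  %iv.next = add i32 %iv, 1
  %done = icmp eq i32 %iv.next, %n
  br i1 %done, label %exit, label %loop
exit:
  ret void
}
```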
The changed PhaseOrdering test previously performed peeling to make the
loads dereferenceable -- as a side effect, this also reduced the exit
count by one, avoiding the awkward <= MAX case.
Now we know up-front that the loads are dereferenceable and can simply be
hoisted. As such, we retain the original exit count and now have to
handle it by widening the exit count calculation to i128. This is a
regression, but at least it preserves the vectorization, which was the
original goal. I'm not sure what else can be done about that test.
|
|
target intrinsics. (#97070)"
This reverts commit e8ad87c7d06afe8f5dde2e4c7f13c314cb3a99e9.
This reverts commit d3c9bb0cf811424dcb8c848cf06773dbdde19965.
A few buildbots trip up on asan-rvv-intrinsics.ll. I've also reverted
the follow-up commit d3c9bb0cf8.
https://lab.llvm.org/buildbot/#/builders/46/builds/2895
|
|
intrinsics. (#97070)
Previously, asan treated target intrinsics as black boxes, so it
could not instrument accurate checks. This patch provides TTI hooks
that let targets describe their intrinsic information to asan.
Note:
1. this patch renames InterestingMemoryOperand to MemoryRefInfo.
2. this patch does not support RVV indexed/segment load/store.
|
|
This is an immutable analysis that loads the contextual profile and
makes it available to other passes. This patch introduces the analysis
and an analysis printer pass. Subsequent patches will introduce the
APIs that IPO passes will call to modify the profile as a result of
their changes.
|
|
The index doesn't matter here.
|
|
If the GEP is both nuw and inbounds/nusw, the offset is non-negative.
Pass this information to CastedValue and make use of it when determining
the value range.
Proof for nusw+nuw->nneg: https://alive2.llvm.org/ce/z/a_CKAw
Proof for the test case: https://alive2.llvm.org/ce/z/yJ3ymP
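A minimal sketch (function and value names illustrative):
```
define ptr @idx(ptr %p, i64 %off) {
  ; nusw means the signed offset arithmetic does not wrap; combined
  ; with nuw this implies %off is non-negative, which CastedValue can
  ; now use when determining the value range.
  %q = getelementptr nusw nuw i8, ptr %p, i64 %off
  ret ptr %q
}
```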
|
|
Add support for detecting the __size_returning_new variants defined
in proposal P0901R5, which extends operator new; see
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0901r5.html
for details.
This PR matches the declarations exported by tcmalloc in
https://github.com/google/tcmalloc/blob/f2516691d01051defc558679f37720bba88d9862/tcmalloc/malloc_extension.h#L707-L711
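At the IR level these show up as declarations along these lines (a
sketch, assuming the sized-pointer result is modeled as a { ptr, i64 }
pair; the aligned variant's signature here is an assumption):
```
; returns both the allocation and the number of usable bytes
declare { ptr, i64 } @__size_returning_new(i64)
declare { ptr, i64 } @__size_returning_new_aligned(i64, i64)
```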
|
|
For the offset scaling, this is sufficient to guarantee nsw. The
other checks for inbounds in this file do need proper inbounds.
|
|
|
|
x -nsw y < -C is false when x > y and C >= 0
Alive2 proof for sgt, sge: https://alive2.llvm.org/ce/z/tupvfi
Note: It only really makes sense in the context of signed comparison for
"X - Y must be positive if X >= Y and no overflow".
Fixes https://github.com/llvm/llvm-project/issues/54735
|
|
|
|
Use const SCEV * explicitly in more places to prepare for
https://github.com/llvm/llvm-project/pull/91961. Split off as suggested.
|
|
The hash may contain fewer than 14 significant digits, which caused the
test to fail.
|
|
Produces -Wrange-loop-construct on some buildbots.
|
|
Currently it only deals with the case where we're subtracting adds with
at most one non-constant operand. This patch extends it to cancel out
common operands for the subtraction of arbitrary add expressions.
The background here is that I want to replace a getMinusSCEV() call in
LAA with computeConstantDifference():
https://github.com/llvm/llvm-project/blob/93fecc2577ece0329f3bbe2719bbc5b4b9b30010/llvm/lib/Analysis/LoopAccessAnalysis.cpp#L1602-L1603
This particular call is very expensive in some cases (e.g. lencod with
LTO) and computeConstantDifference() could achieve this much more
cheaply, because it does not need to construct new SCEV expressions.
However, the current computeConstantDifference() implementation is too
weak for this and misses many basic cases. This is a step towards making
it more powerful while still keeping it pretty fast.
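As a sketch of the kind of case this enables (SCEVs shown in
comments, names illustrative):
```
define i64 @diff(ptr %base, i64 %i) {
  ; SCEV(%q) = (%base + %i)
  ; SCEV(%p) = (16 + %base + %i)
  ; computeConstantDifference(SCEV(%p), SCEV(%q)) can now cancel the
  ; common operands %base and %i and return 16, without constructing
  ; a new SCEV subtraction expression as getMinusSCEV() would.
  %q = getelementptr i8, ptr %base, i64 %i
  %p = getelementptr i8, ptr %q, i64 16
  %pi = ptrtoint ptr %p to i64
  ret i64 %pi
}
```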
|
|
Add a common constantFoldAndGroupOps() helper that takes care of
constant folding and grouping transforms that are common to all nary
ops. This moves the constant folding prior to grouping, which is more
efficient, and excludes any constant from the sort.
The constant folding has hooks for folding, identity constants and
absorber constants.
This gives a compile-time improvement for SCEV-heavy workloads like
lencod.
|
|
Transitioned from inheritance to a has-a relationship in 9db7948e
|
|
We have existing code which reasons that a step evenly dividing the
iteration space of a finite loop with a single exit implies
no-self-wrap. The sign of the step doesn't affect this.
---------
Co-authored-by: Nikita Popov <github@npopov.com>
|
|
(#101380)
SCEV has logic for inferring wrap flags on AddRecs which are known to
control an exit based on whether the step is a power of two. This logic
only considered constants, and thus did not trigger for steps such as (4
x vscale) which are common in scalably vectorized loops.
The net effect is that we were very sensitive to the preservation of
nsw/nuw flags on such IVs, and could not infer trip counts if they got
lost for any reason.
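A hedged sketch of the kind of loop this now handles (assuming
vscale is a power of two on the target, as on typical scalable-vector
targets):
```
define void @f(i64 %n) vscale_range(1,16) {
entry:
  %vs = call i64 @llvm.vscale.i64()
  %step = shl i64 %vs, 2      ; step = (4 x vscale), a power of two
  br label %loop
loop:
  %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
  %iv.next = add i64 %iv, %step
  %ec = icmp ult i64 %iv.next, %n
  br i1 %ec, label %loop, label %exit
exit:
  ret void
}
declare i64 @llvm.vscale.i64()
```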
---------
Co-authored-by: Nikita Popov <github@npopov.com>
|
|
Change eraseNode to require that the basic block is still contained
inside the function. This is a preparation for using numbers of basic
blocks inside the dominator tree, which are invalid for blocks that are
not inside a function.
|
|
Unfortunately storing a `MaybeAlign` in ResourceInfo deletes our move
constructor in compilers that haven't implemented [P0602R4], like GCC 7.
Since we only ever use the alignment in ways where alignment 1 and unset
are ambiguous anyway, we'll just store the integer AlignLog2 value that
we'll eventually use directly.
[P0602R4]:
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0602r4.html
This reverts commit c22171f12fa9f260e2525cf61b93c136889e17f2, reapplying
a94edb6b8e321a46fe429934236aaa4e2e9fb97f.
|
|
The EqCache parameter has been removed.
|
|
The cache almost never triggers, and seems unlikely to help with
performance. However, when it does trigger, it is likely to cause the
comparator to become inconsistent, due to a bad interaction between
the depth limit and cache hits. This leads to crashes in debug builds.
See the new unit test for a reproducer.
|
|
No need to do the second one if the first one already failed.
|
|
Fix the build failure caused by
https://github.com/llvm/llvm-project/pull/94944
Fixes https://github.com/llvm/llvm-project/issues/100296
|
|
Seeing build failures, reverting to investigate.
Reverts llvm/llvm-project#100697
|
|
HLSL allows StructuredBuffer<> to be defined with scalar or
up-to-4-element vectors as well as with structs, but when doing so
`dxc` doesn't set the alignment. Emulate this.
Pull Request: https://github.com/llvm/llvm-project/pull/100697
|
|
This simplifies making sure we set all of the members of the unions,
and adds asserts to help catch it if we do something wrong.
Pull Request: https://github.com/llvm/llvm-project/pull/100696
|
|
Even if memory is valid from LLVM's point of view,
e.g. a local alloca, sanitizers have APIs for
user-specific memory annotations.
These annotations can be used to track the size of a
local object, e.g. inline vectors may prevent
accesses beyond the current vector size.
So valid programs should not access those parts of
the alloca before checking preconditions.
Fixes #100639.
|
|
And extract `suppressSpeculativeLoadForSanitizers`.
For #100639.
|
|
...and also in TTI::getMemcpyLoopResidualLoweringType.
|
|
Compile-time improvement:
http://llvm-compile-time-tracker.com/compare.php?from=13996378d81c8fa9a364aeaafd7382abbc1db83a&to=861ffa4ec5f7bde5a194a7715593a1b5359eb581&stat=instructions:u
baseline: 803eaf29267c6aae9162d1a83a4a2ae508b440d3
```
Top 5 improvements:
stockfish/movegen.ll 2541620819 2538599412 -0.12%
minetest/profiler.cpp.ll 431724935 431246500 -0.11%
abc/luckySwap.c.ll 581173720 580581935 -0.10%
abc/kitTruth.c.ll 2521936288 2519445570 -0.10%
abc/extraUtilTruth.c.ll 1216674614 1215495502 -0.10%
Top 5 regressions:
openssl/libcrypto-shlib-sm4.ll 1155054721 1155943201 +0.08%
openssl/libcrypto-lib-sm4.ll 1155054838 1155943063 +0.08%
spike/vsm4r_vv.ll 1296430080 1297039258 +0.05%
spike/vsm4r_vs.ll 1312496906 1313093460 +0.05%
nuttx/lib_rand48.c.ll 126201233 126246692 +0.04%
Overall: -0.02112308%
```
|
|
Update getDependenceDistanceStrideAndSize to reason about different
combinations of strides directly and explicitly.
Update getPtrStride to return 0 for invariant pointers.
Then proceed by checking the strides.
If either the source or the sink is neither strided by a constant
(i.e. a non-wrapping AddRec) nor invariant, the accesses may overlap
with earlier or later iterations and we cannot generate runtime
checks to disambiguate them.
Otherwise they are either loop invariant or strided. In that case, we
can generate a runtime check to disambiguate them.
If both are strided by constants, we proceed as previously.
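For instance (a sketch, names illustrative), an invariant store and a
constant-strided load can now be disambiguated with a runtime check:
```
define void @f(ptr %inv, ptr %a, i64 %n) {
entry:
  br label %loop
loop:
  %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
  ; invariant access: getPtrStride now returns 0 for %inv
  store i32 0, ptr %inv
  ; strided access with a constant stride of one element
  %gep = getelementptr i32, ptr %a, i64 %iv
  %l = load i32, ptr %gep
  %iv.next = add i64 %iv, 1
  %ec = icmp ult i64 %iv.next, %n
  br i1 %ec, label %loop, label %exit
exit:
  ret void
}
```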
This is an alternative to
https://github.com/llvm/llvm-project/pull/99239 and also replaces
additional checks if the underlying object is loop-invariant.
Fixes https://github.com/llvm/llvm-project/issues/87189.
PR: https://github.com/llvm/llvm-project/pull/99577
|
|
This makes the binding structure in a DXILResource default to empty
and need a separate call to set up, and also moves the unique ID into
it since bindings are the only place where those are actually used.
This will put us in a better position when dealing with resource
handles in libraries.
Pull Request: https://github.com/llvm/llvm-project/pull/100623
|
|
Pull Request: https://github.com/llvm/llvm-project/pull/100622
|
|
I had put this in Transforms/Utils, but that doesn't actually make
sense if we want to populate these structures via an analysis pass.
Pull Request: https://github.com/llvm/llvm-project/pull/100621
|
|
|
|
It is now translated to `<1 x i64>`, which allows the removal of a bunch
of special casing.
This _incompatibly_ changes the ABI of any LLVM IR function with
`x86_mmx` arguments or returns: instead of passing in mmx registers,
they will now be passed via integer registers. However, the real-world
incompatibility caused by this is expected to be minimal, because Clang
never uses the x86_mmx type -- it lowers `__m64` to either `<1 x i64>`
or `double`, depending on ABI.
This change does _not_ eliminate the SelectionDAG `MVT::x86mmx` type.
That type simply no longer corresponds to an IR type, and is used only
by MMX intrinsics and inline-asm operands.
Because SelectionDAGBuilder only knows how to generate the
operands/results of intrinsics based on the IR type, it thus now
generates the intrinsics with the type MVT::v1i64, instead of
MVT::x86mmx. We need to fix this before DAG type legalization, and
thus have the X86 backend fix the types up in DAGCombine. (This may be a
short-lived hack, if all the MMX intrinsics can be removed in upcoming
changes.)
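For illustration, a sketch using one real MMX intrinsic:
```
; SelectionDAGBuilder now emits this intrinsic with v1i64 operands and
; results instead of MVT::x86mmx; the X86 backend moves the values
; back into MMX registers in DAGCombine.
declare <1 x i64> @llvm.x86.mmx.padd.d(<1 x i64>, <1 x i64>)
```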
Works towards issue #98272.
|
|
If a result is potentially based on a not yet proven assumption,
BasicAA will remember it inside AssumptionBasedResults and remove
the cache entry if an assumption higher up is later disproved.
However, we currently miss the case where another cache entry ends
up depending on such an AssumptionBased result.
Fix this by introducing an additional AssumptionBased state for
cache entries. If such a result is used, we'll still increment
AAQI.NumAssumptionUses, which means that the using entry will
also become AssumptionBased and be cleared if the assumption is
disproved.
At the end of the root query, convert remaining AssumptionBased
results into definitive results.
Fixes https://github.com/llvm/llvm-project/issues/98978.
|
|
Retrieve `!tbaa` metadata via `!tbaa.struct` in `adjustForAccess`,
unless it already exists, so that struct-path aware `MDNodes` emitted
via `new-struct-path-tbaa` can be leveraged. Since `!tbaa.struct`
carries memcpy padding semantics among struct fields, while `!tbaa`
is already meant to describe alias semantics, it should be possible
to drop `!tbaa.struct` once the memcpy has been simplified.
The `SROA/tbaa-struct.ll` test has gone out of scope, as `!tbaa` has
already replaced `!tbaa.struct` in SROA.
Fixes: https://github.com/llvm/llvm-project/issues/95661.
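For reference, a self-contained sketch of the two shapes involved
(field offsets and sizes illustrative):
```
!0 = !{!"Simple C/C++ TBAA"}
!1 = !{!"omnipotent char", !0, i64 0}
!2 = !{!"int", !1, i64 0}
; a plain !tbaa access tag: (base type, access type, offset)
!3 = !{!2, !2, i64 0}
; a !tbaa.struct descriptor: (offset, size, tag) per field, which is
; what a memcpy carries and what adjustForAccess can translate from
!4 = !{i64 0, i64 4, !3, i64 8, i64 4, !3}
```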
|
|
|
|
Alive2: https://alive2.llvm.org/ce/z/g3xxnM
|
|
(#100316)
See the following case:
```
define i16 @pr100298() {
entry:
br label %for.inc
for.inc:
%indvar = phi i32 [ -15, %entry ], [ %mask, %for.inc ]
%add = add nsw i32 %indvar, 9
%mask = and i32 %add, 65535
%cmp1 = icmp ugt i32 %mask, 5
br i1 %cmp1, label %for.inc, label %for.end
for.end:
%conv = trunc i32 %add to i16
%cmp2 = icmp ugt i32 %mask, 3
%shl = shl nuw i16 %conv, 14
%res = select i1 %cmp2, i16 %conv, i16 %shl
ret i16 %res
}
```
When computing knownbits of `%shl` with `%cmp2=false`, we cannot use
this condition in the analysis of `%mask (%for.inc -> %for.inc)`.
Fixes https://github.com/llvm/llvm-project/issues/100298.
|
|
unpredictable. (#98495)
The `!unpredictable` metadata has been present for a long time, but
its use in optimizations is still limited. This patch teaches
`FoldTwoEntryPHINode()` to be more aggressive with an unpredictable
branch to reduce mispredictions.
A TTI interface `getBranchMispredictPenalty()` is added to distinguish
between different hardware, to ensure we don't go too far for simpler
cores. For simplicity, only a naive x86 implementation is included for
the time being.
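A sketch of the kind of diamond this targets (names illustrative):
```
define i32 @f(i1 %c, i32 %a, i32 %b) {
entry:
  ; with !unpredictable, FoldTwoEntryPHINode() is now more willing to
  ; turn this diamond into a select, trading extra work for avoiding
  ; a branch misprediction
  br i1 %c, label %then, label %else, !unpredictable !0
then:
  br label %merge
else:
  br label %merge
merge:
  %r = phi i32 [ %a, %then ], [ %b, %else ]
  ret i32 %r
}
!0 = !{}
```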
|
|
This addresses an optimization regression in Rust we have observed after
https://github.com/llvm/llvm-project/pull/82458. We now only perform
pointer replacement if they have the same underlying object. However,
getUnderlyingObject() by default only looks through linear chains, not
selects/phis. In particular, this means that we miss cases involving
pointer induction variables.
This patch fixes this by introducing a new helper
getUnderlyingObjectAggressive() which basically does what
getUnderlyingObjects() does, just specialized to the case where we must
arrive at a single underlying object in the end, and with a limit on the
number of inspected values.
Doing this more expensive underlying object check has no measurable
compile-time impact on CTMark.
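A sketch of the pattern (names illustrative): a pointer induction
variable whose single underlying object is only visible through the
phi:
```
define void @f(i64 %n) {
entry:
  %buf = alloca [16 x i8]
  br label %loop
loop:
  ; getUnderlyingObject() stops at the phi; looking through both
  ; incoming values shows the single underlying object %buf
  %p = phi ptr [ %buf, %entry ], [ %p.next, %loop ]
  %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
  store i8 0, ptr %p
  %p.next = getelementptr i8, ptr %p, i64 1
  %iv.next = add i64 %iv, 1
  %ec = icmp ult i64 %iv.next, %n
  br i1 %ec, label %loop, label %exit
exit:
  ret void
}
```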
|
|
isDereferenceableAndAlignedInLoop (#99490)
This patch now bails out explicitly for negative offsets so that it's
more consistent with the unsigned remainder and add calculations,
and it fixes a genuine bug as shown with the new test.
|