riscv-gnu-toolchain/llvm.git

Age	Commit message	Author	Files	Lines
2024-12-02	[ValueTracking] Handle and/or of conditions in ↵	Yingwei Zheng	1	-8/+18
	`computeKnownFPClassFromContext` (#118257) Fix a typo introduced by https://github.com/llvm/llvm-project/pull/83161. This patch also supports decomposition of and/or expressions in `computeKnownFPClassFromContext`. Compile-time improvement: http://llvm-compile-time-tracker.com/compare.php?from=688bb432c4b618de69a1d0e7807077a22f15762a&to=07493fc354b686f0aca79d6f817091a757bd7cd5&stat=instructions:u
2024-12-02	[InstCombine] Fold `X Pred C2 ? X BOp C1 : C2 BOp C1` to `min/max(X, C2) BOp ↵	Veera	1	-21/+48
	C1` (#116888) Fixes #82414. General Proof: https://alive2.llvm.org/ce/z/ERjNs4 Proof for Tests: https://alive2.llvm.org/ce/z/K-934G This PR transforms `select` instructions of the form `select (Cmp X C1) (BOp X C2) C3` to `BOp (min/max X C1) C2` iff `C3 == BOp C1 C2`. This helps in eliminating a noop loop in https://github.com/rust-lang/rust/issues/123845 but does not improve optimizations.
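The fold described above can be sanity-checked with a small exhaustive model. This is only a sketch, assuming 8-bit unsigned values, `ult` as the predicate and wrapping `add` as the binary operator; the function names are hypothetical and are not part of InstCombine.

```python
MASK = 0xFF  # model 8-bit unsigned integers (an assumption for this sketch)

def select_form(x, c1, c2):
    """x <u c2 ? x + c1 : c2 + c1, with wrapping 8-bit addition."""
    return (x + c1) & MASK if x < c2 else (c2 + c1) & MASK

def minmax_form(x, c1, c2):
    """umin(x, c2) + c1, with wrapping 8-bit addition."""
    return (min(x, c2) + c1) & MASK

def forms_agree(c1, c2):
    """Exhaustively compare both forms for every 8-bit x."""
    return all(select_form(x, c1, c2) == minmax_form(x, c1, c2)
               for x in range(256))
```

The Alive2 links in the commit message remain the authoritative proofs; the model only shows why the two forms coincide for one predicate and one operator.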
2024-11-08	[Analysis] atan2: isTriviallyVectorizable; add to massv and accelerate ↵	Tex Riddell	1	-0/+4
	veclibs (#113637) This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
	- Return true for atan2 from isTriviallyVectorizable
	- Add atan2 to VecFuncs.def for massv and accelerate libraries.
	- Add atan2 to hasOptimizedCodeGen
	- Add atan2 support in llvm/lib/Analysis/ValueTracking.cpp llvm::getIntrinsicForCallSite and update vectorization tests
	- Add atan2 name check to isLoweredToCall in llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
	- Note: there's no test coverage for these names in isLoweredToCall, except that Transforms/TailCallElim/inf-recursion.ll is impacted by the "fabs" case
	Thanks to @jroelofs for the atan2 accelerate veclib and associated test additions, plus the hasOptimizedCodeGen addition. Part of: Implement the atan2 HLSL Function #70096.
2024-11-07	ValueTracking: Do not return nullptr from getUnderlyingObject (#115258)	Matt Arsenault	1	-3/+4
	Fixup for 29a5c054e6d56a912ed5ba3f84e8ca631872db8b. The failure case should return the last value found.
2024-11-07	ValueTracking: simplify udiv/urem recurrences (#108973)	Ramkumar Ramachandra	1	-7/+19
	A urem recurrence has the property that the result can never exceed the start value. A udiv recurrence has the property that the result can never exceed either the start value or the numerator, whichever is greater. Implement a simplification based on these properties.
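The two bounds stated above can be illustrated with a short simulation. This is a sketch under assumptions: the udiv case is modelled with the recurrence value as the divisor of a fixed numerator, which is one possible reading of the commit message, and the helper names are hypothetical.

```python
def urem_recurrence(start, divisors):
    """Values of x_{n+1} = x_n urem d_n, skipping zero divisors (UB in LLVM IR)."""
    values, x = [start], start
    for d in divisors:
        if d == 0:
            continue
        x = x % d
        values.append(x)
    return values

def udiv_recurrence(start, numerator, steps):
    """Values of x_{n+1} = numerator udiv x_n, stopping before a division by zero."""
    values, x = [start], start
    for _ in range(steps):
        if x == 0:
            break
        x = numerator // x
        values.append(x)
    return values
```

In the urem case every value stays at or below `start`; in the udiv case every value stays at or below `max(start, numerator)`.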
2024-11-07	[ValueTracking] Don't special case depth for phi of select (#114996)	Nikita Popov	1	-10/+8
	As discussed on https://github.com/llvm/llvm-project/pull/114689#pullrequestreview-2411822612 and following, there is no principled reason why the phi of select case should have a different recursion limit than the general case. There may still be fan-out, and there may still be indirect recursion. Revert that part of #113707.
2024-11-06	ValueTracking: Allow getUnderlyingObject to look at vectors (#114311)	Matt Arsenault	1	-2/+2
	We can identify some easy vector of pointer cases, such as a getelementptr with a scalar base.
2024-11-05	[Analysis] Remove unused includes (NFC) (#114936)	Kazu Hirata	1	-1/+0
	Identified with misc-include-cleaner.
2024-11-02	[ValueTracking] Compute known bits from recursive select/phi (#113707)	Yingwei Zheng	1	-7/+17
	This patch is inspired by https://github.com/llvm/llvm-project/pull/113686. I found that it removes a lot of unnecessary "and X, 1" in some applications that represent boolean values with int.
2024-11-01	[ValueTracking] Handle recursive phis in knownFPClass (#114008)	David Green	1	-4/+15
	As a follow-on to 113686, this breaks the recursion between phi nodes that have p1 = phi(x, p2) and p2 = phi(y, p1). The knownFPClass can be calculated from the classes of p1 and p2.
2024-10-31	[ValueTracking] Compute KnownFP state from recursive select/phi. (#113686)	David Green	1	-0/+7
	Given a recursive phi with select:
	  %p = phi [ 0, entry ], [ %sel, loop]
	  %sel = select %c, %other, %p
	The fp state can be calculated using the knowledge that the select/phi pair can only be the initial state (0 here) or from %other. This adds a short-cut into computeKnownFPClass for PHI to detect that the select is recursive back to the phi, and if so use the state from the other operand. This helps to address a regression from #83200.
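The reasoning about the select/phi pair can be made concrete by simulating the loop. The sketch below is a hypothetical model, not the code in computeKnownFPClass: it records every value the phi takes and shows that each one is either the initial value or one of the `%other` values.

```python
def run_loop(init, conds, others):
    """Simulate %p = phi [init, entry], [%sel, loop]; %sel = select %c, %other, %p."""
    p = init
    seen = [p]
    for c, other in zip(conds, others):
        sel = other if c else p   # %sel = select %c, %other, %p
        p = sel                   # the back edge feeds %sel into the phi
        seen.append(p)
    return seen
```

Because the phi never holds anything outside `{init} ∪ others`, any floating-point class fact that holds for both the initial value and `%other` also holds for the phi.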
2024-10-29	Adding more vector calls for -fveclib=AMDLIBM (#109662)	Rohit Aggarwal	1	-0/+4
	AMD has its own implementation of vector calls. New vector calls are introduced in the library for exp10, log10, sincos and finite asin/acos. Please refer to [https://github.com/amd/aocl-libm-ose] --------- Co-authored-by: Rohit Aggarwal <Rohit.Aggarwal@amd.com>
2024-10-17	[ValueTracking] Respect `samesign` flag in `isKnownInversion` (#112390)	Yingwei Zheng	1	-0/+9
	In https://github.com/llvm/llvm-project/pull/93591 we introduced `isKnownInversion` and assumed that `X` being poison implies `Y` is poison because they share common operands. But after the introduction of `samesign`, this assumption no longer holds if `X` is an icmp with the `samesign` flag. Alive2 link: https://alive2.llvm.org/ce/z/rj3EwQ (Please run it locally with this patch and https://github.com/AliveToolkit/alive2/pull/1098). This approach is the most conservative way I can think of to address this problem. If `X` has the `samesign` flag, it checks whether `Y` also has this flag and makes sure constant RHS operands have the same sign. Fixes https://github.com/llvm/llvm-project/issues/112350.
2024-10-15	[InstCombine] Extend fcmp+select folding to minnum/maxnum intrinsics (#112088)	Alexey Bader	1	-6/+2
	Today, InstCombine can fold fcmp+select patterns to minnum/maxnum intrinsics when the nnan and nsz flags are set. The ordering of the operands in both the fcmp and select instructions is important for the folding to occur.
	maxnum patterns:
	1. (a op b) ? a : b -> maxnum(a, b), where op is one of {ogt, oge}
	2. (a op b) ? b : a -> maxnum(a, b), where op is one of {ule, ult}
	The second pattern is supposed to make the order of the operands in the select instruction irrelevant. However, the pattern matching code uses the CmpInst::getInversePredicate method to invert the comparison predicate. This method doesn't take into account the fast-math flags, which can lead to missing the folding opportunity. The patch extends the pattern matching code to handle unordered fcmp instructions. This allows the folding to occur even when the select instruction has the operands in the inverse order.
	New maxnum patterns:
	1. (a op b) ? a : b -> maxnum(a, b), where op is one of {ugt, uge}
	2. (a op b) ? b : a -> maxnum(a, b), where op is one of {ole, olt}
	The same changes are applied to the minnum intrinsic.
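The reason the unordered predicates can be accepted is that ordered and unordered comparisons only differ when an operand is NaN, which the nnan flag rules out. The sketch below models `ogt` and `ugt` in plain Python to show this; the helper names are hypothetical and signed zeros are not modelled.

```python
import math

def ogt(a, b):
    """Ordered greater-than: false if either operand is NaN."""
    return not (math.isnan(a) or math.isnan(b)) and a > b

def ugt(a, b):
    """Unordered greater-than: true if either operand is NaN, or a > b."""
    return math.isnan(a) or math.isnan(b) or a > b

def select_max(pred, a, b):
    """Model of (a pred b) ? a : b."""
    return a if pred(a, b) else b
```

For NaN-free inputs `select_max(ugt, a, b)` and `select_max(ogt, a, b)` both equal `max(a, b)`, while for a NaN operand the two predicates disagree.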
2024-10-15	InstCombine: extend select-equiv to support vectors (#111966)	Ramkumar Ramachandra	1	-1/+3
	foldSelectEquivalence currently doesn't support GVN-like replacements on vector types. Put in the checks for potentially lane-crossing operations, and lift the limitation.
2024-10-14	ValueTracking: handle more ops in isNotCrossLaneOperation (#112183)	Ramkumar Ramachandra	1	-19/+2
	Reuse llvm::isTriviallyVectorizable in llvm::isNotCrossLaneOperation, in order to get it to handle more intrinsics. Alive2 proofs for changed tests: https://alive2.llvm.org/ce/z/XSV_GT
2024-10-14	ValueTracking: introduce llvm::isNotCrossLaneOperation (#112011)	Ramkumar Ramachandra	1	-0/+23
	Factor out and unify common code from InstSimplify and InstCombine that partially guard against cross-lane vector operations into llvm::isNotCrossLaneOperation in ValueTracking. Alive2 proofs for changed tests: https://alive2.llvm.org/ce/z/68H4ka
2024-10-04	ValueTracking: refactor recurrence-matching (NFC) (#109659)	Ramkumar Ramachandra	1	-36/+55

2024-10-03	[ValueTracking] AllowEphemerals for alignment assumptions. (#108632)	Florian Hahn	1	-1/+4
	Allow AllowEphemerals in isValidAssumeForContext, as the CxtI might be the producer of the pointer in the bundle. At the moment, align assumptions aren't optimized away. This allows using the assumption in the computeKnownBits call in getConstantMultipleImpl. We could extend the computeKnownBits API to allow callers to specify if ephemerals are allowed, if the info from computeKnownBitsFromContext is used to remove alignment assumptions. PR: https://github.com/llvm/llvm-project/pull/108632
2024-10-02	[ValueTracking] mul nuw nsw with factor sgt 1 is non-negative (#110803)	Nikita Popov	1	-6/+14
	Proof: https://alive2.llvm.org/ce/z/bC0eJf
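The property in the subject line can also be checked by brute force on a small bit width. This sketch assumes 8-bit integers and searches for a pair where the multiply wraps neither unsigned nor signed, one factor is signed-greater-than 1, and the result is still negative; the Alive2 proof above covers the general case.

```python
BITS = 8  # assumed width for the exhaustive search

def to_signed(v):
    """Interpret an 8-bit pattern as a two's-complement signed value."""
    return v - (1 << BITS) if v >= 1 << (BITS - 1) else v

def counterexamples():
    """Find (x, y) with y >s 1 where mul nuw nsw is defined but negative."""
    bad = []
    for x in range(1 << BITS):
        for y in range(2, 1 << (BITS - 1)):      # y sgt 1, so y is in [2, 127]
            no_unsigned_wrap = x * y < (1 << BITS)
            sx = to_signed(x)
            no_signed_wrap = -(1 << (BITS - 1)) <= sx * y < (1 << (BITS - 1))
            if no_unsigned_wrap and no_signed_wrap and sx * y < 0:
                bad.append((x, y))
    return bad
```

The search finds nothing because a negative `x` has an unsigned value of at least 128, so multiplying it by 2 or more always wraps in the unsigned sense.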
2024-10-02	ValueTracking: strip stray break in recur-match (#109794)	Ramkumar Ramachandra	1	-2/+0
	There is a stray break statement in the recurrence-handling code in computeKnownBitsFromOperator, that seems to be unintended. Strip this statement so that we have the opportunity to go through the rest of phi-handling code, and refine KnownBits further.
2024-09-19	[ValueTracking] Support assume in entry block without DT (#109264)	Nikita Popov	1	-1/+2
	isValidAssumeForContext() handles a couple of trivial cases even if no dominator tree is available. This adds one more for the case where there is an assume in the entry block, and a use in some other block. The entry block always dominates all blocks. As having a context instruction but no DT is fairly rare, there is not much impact. The only test change is in assume-builder.ll, where fewer redundant assumes are generated. I've found having this special case is useful for an upcoming change though.
2024-09-13	[ValueTracking] Infer is-power-of-2 from dominating conditions (#107994)	Yingwei Zheng	1	-9/+28
	Addresses downstream rustc issue: https://github.com/rust-lang/rust/issues/129795
2024-09-10	[ValueTracking] Infer is-power-of-2 from assumptions. (#107745)	Yingwei Zheng	1	-3/+36
	This patch tries to infer is-power-of-2 from assumptions. I don't see that this kind of assumption exists in my dataset. Related issue: https://github.com/rust-lang/rust/issues/129795 Close https://github.com/llvm/llvm-project/issues/58996.
2024-09-05	[ConstantRange] Perform increment on APInt (NFC)	Nikita Popov	1	-1/+1
	This handles the edge case where BitWidth is 1 and doing the increment gets a value that's not valid in that width, while we just want wrap-around. Split out of https://github.com/llvm/llvm-project/pull/80309.
2024-09-03	[Analysis] getIntrinsicForCallSite - add vectorization support for ↵	Simon Pilgrim	1	-0/+24
	acos/asin/atan and cosh/sinh/tanh libcalls (#106844) Followup to #106584 - ensure acos/asin/atan and cosh/sinh/tanh libcalls correctly map to the llvm intrinsic equivalents
2024-08-30	[ValueTracking] use KnownBits to compute fpclass from bitcast (#97762)	Alex MacLean	1	-0/+55
	When we encounter a bitcast from an integer type we can use the information from `KnownBits` to glean some information about the fpclass:
	- If the sign bit is known, we can transfer this information over.
	- If the float is IEEE format and enough of the bits are known, we may be able to prove or rule out some fpclasses such as NaN, Zero, or Inf.
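A rough model of the two rules above, assuming IEEE binary32 and a pair of known-zero / known-one bit masks, might look like this. The fact names and the `fp_facts` helper are hypothetical; the real patch works on `KnownBits` and `KnownFPClass` and may derive more than is shown here.

```python
import struct

SIGN = 0x80000000  # sign bit of an IEEE binary32 value
EXP = 0x7F800000   # exponent field
ABS = 0x7FFFFFFF   # every bit except the sign

def fp_facts(known_zero, known_one):
    """Derive a few floating-point class facts from known bits."""
    facts = set()
    if known_one & SIGN:
        facts.add("sign bit set")
    if known_zero & SIGN:
        facts.add("sign bit clear")
    if known_zero & EXP:          # NaN and Inf need an all-ones exponent
        facts.update({"not nan", "not inf"})
    if known_one & ABS:           # +0.0 and -0.0 have no bit set below the sign
        facts.add("not zero")
    return facts

def float_bits(f):
    """Bit pattern of f as an IEEE binary32 value."""
    return struct.unpack("<I", struct.pack("<f", f))[0]
```

For example, knowing that a single exponent bit is zero is already enough to rule out both NaN and infinity.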
2024-08-19	[ValueTracking] Handle incompatible types instead of asserting in ↵	Noah Goldstein	1	-2/+3
	`isKnownNonEqual`; NFC A downstream user hit this assert. Since it doesn't really make any difference, just change the code to return false.
2024-08-15	[Analysis] Use a range-based for loop (NFC) (#104445)	Kazu Hirata	1	-2/+2

2024-08-15	[ValueTracking] Fix f16 fptosi range for large integers	Nikita Popov	1	-1/+1
	We were missing the signed flag on the negative value, so the range was incorrectly interpreted for integers larger than 64-bit. Split out from https://github.com/llvm/llvm-project/pull/80309.
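The effect of a missing signed flag can be sketched as follows. The model assumes the bound in question is -65504 (the most negative finite half-precision value) and that it is first held as a 64-bit pattern before being widened; `widen` and `as_signed` are hypothetical helpers, not the APInt API.

```python
def widen(value, bits, signed):
    """Model placing a 64-bit two's-complement pattern into a `bits`-wide integer."""
    low64 = (1 << 64) - 1
    pattern = value & low64
    if signed and pattern >> 63:
        pattern |= ((1 << bits) - 1) & ~low64   # sign-extend into the upper bits
    return pattern

def as_signed(pattern, bits):
    """Read a `bits`-wide pattern back as a signed value."""
    return pattern - (1 << bits) if pattern >> (bits - 1) else pattern
```

At 64 bits both variants read back as -65504, which is why the problem only shows up for wider integers: without sign extension the 128-bit value becomes a large positive number.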
2024-08-12	[KnownBits] Add KnownBits::add and KnownBits::sub helper wrappers. (#99468)	Simon Pilgrim	1	-12/+5

2024-08-06	[ValueTracking] Infer relationship for the select with SLT	zhongyunde 00443407	1	-0/+9

2024-08-06	[ValueTracking] Infer relationship for the select with ICmp	zhongyunde 00443407	1	-0/+11
	x -nsw y < -C is false when x > y and C >= 0. Alive2 proof for sgt, sge: https://alive2.llvm.org/ce/z/tupvfi Note: It only really makes sense in the context of signed comparison for "X - Y must be positive if X >= Y and no overflow". Fixes https://github.com/llvm/llvm-project/issues/54735
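The implication above can be checked exhaustively on a narrow integer type. This is a sketch assuming a default width of 5 bits to keep the search small; it looks for any `x > y` and `C >= 0` where the non-overflowing difference is still less than `-C`.

```python
def counterexamples(bits=5):
    """Search for x > y, C >= 0 with (x -nsw y) < -C on `bits`-wide signed ints."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    bad = []
    for x in range(lo, hi + 1):
        for y in range(lo, hi + 1):
            diff = x - y
            if not (lo <= diff <= hi) or not x > y:
                continue   # sub nsw would be poison, or the precondition fails
            for c in range(0, hi + 1):
                if diff < -c:
                    bad.append((x, y, c))
    return bad
```

No counterexample exists because `x > y` without signed overflow makes the difference positive, while `-C` is never positive.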
2024-08-04	[llvm] Construct SmallVector with ArrayRef (NFC) (#101872)	Kazu Hirata	1	-1/+1

2024-07-29	[NFC][Load] Find better place for `mustSuppressSpeculation` (#100794)	Vitaly Buka	1	-11/+0
	And extract `suppressSpeculativeLoadForSanitizers`. For #100639.
2024-07-29	[PatternMatch] Use `m_SpecificCmp` matchers. NFC. (#100878)	Yingwei Zheng	1	-6/+5
	Compile-time improvement: http://llvm-compile-time-tracker.com/compare.php?from=13996378d81c8fa9a364aeaafd7382abbc1db83a&to=861ffa4ec5f7bde5a194a7715593a1b5359eb581&stat=instructions:u
	baseline: 803eaf29267c6aae9162d1a83a4a2ae508b440d3
	```
	Top 5 improvements:
	  stockfish/movegen.ll 2541620819 2538599412 -0.12%
	  minetest/profiler.cpp.ll 431724935 431246500 -0.11%
	  abc/luckySwap.c.ll 581173720 580581935 -0.10%
	  abc/kitTruth.c.ll 2521936288 2519445570 -0.10%
	  abc/extraUtilTruth.c.ll 1216674614 1215495502 -0.10%
	Top 5 regressions:
	  openssl/libcrypto-shlib-sm4.ll 1155054721 1155943201 +0.08%
	  openssl/libcrypto-lib-sm4.ll 1155054838 1155943063 +0.08%
	  spike/vsm4r_vv.ll 1296430080 1297039258 +0.05%
	  spike/vsm4r_vs.ll 1312496906 1313093460 +0.05%
	  nuttx/lib_rand48.c.ll 126201233 126246692 +0.04%
	Overall: -0.02112308%
	```
2024-07-24	LAA: mark LoopInfo pointer const (NFC) (#100373)	Ramkumar Ramachandra	1	-1/+1

2024-07-24	[InstCombine] Infer sub nuw from dominating conditions (#100164)	Yingwei Zheng	1	-10/+7
	Alive2: https://alive2.llvm.org/ce/z/g3xxnM
2024-07-24	[ValueTracking] Don't use CondContext in dataflow analysis of phi nodes ↵	Yingwei Zheng	1	-11/+11
	(#100316) See the following case:
	```
	define i16 @pr100298() {
	entry:
	  br label %for.inc

	for.inc:
	  %indvar = phi i32 [ -15, %entry ], [ %mask, %for.inc ]
	  %add = add nsw i32 %indvar, 9
	  %mask = and i32 %add, 65535
	  %cmp1 = icmp ugt i32 %mask, 5
	  br i1 %cmp1, label %for.inc, label %for.end

	for.end:
	  %conv = trunc i32 %add to i16
	  %cmp2 = icmp ugt i32 %mask, 3
	  %shl = shl nuw i16 %conv, 14
	  %res = select i1 %cmp2, i16 %conv, i16 %shl
	  ret i16 %res
	}
	```
	When computing knownbits of `%shl` with `%cmp2=false`, we cannot use this condition in the analysis of `%mask (%for.inc -> %for.inc)`. Fixes https://github.com/llvm/llvm-project/issues/100298.
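Stepping through the function by hand shows why the condition cannot be reused on the back edge. The following is a hypothetical Python transcription of the IR above, written only to trace the values; it is not part of the patch.

```python
def simulate_pr100298():
    """Step through the loop, recording %mask on each iteration."""
    indvar = -15
    masks = []
    while True:
        add = indvar + 9              # %add = add nsw i32 %indvar, 9
        mask = add & 65535            # %mask = and i32 %add, 65535
        masks.append(mask)
        if not mask > 5:              # %cmp1 is false: leave the loop
            break
        indvar = mask                 # back edge: %indvar = %mask
    conv = add & 0xFFFF               # %conv = trunc i32 %add to i16
    cmp2 = mask > 3                   # %cmp2 = icmp ugt i32 %mask, 3
    shl = (conv << 14) & 0xFFFF       # %shl = shl nuw i16 %conv, 14
    res = conv if cmp2 else shl
    return masks, cmp2, res
```

On the first iteration `%mask` is 65530, which is greater than 3, even though `%cmp2` is false once the loop exits. Applying `%cmp2 = false` to the `%mask` value flowing along the back edge would therefore be wrong.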
2024-07-22	[GVN] Look through select/phi when determining underlying object (#99509)	Nikita Popov	1	-0/+42
	This addresses an optimization regression in Rust we have observed after https://github.com/llvm/llvm-project/pull/82458. We now only perform pointer replacement if they have the same underlying object. However, getUnderlyingObject() by default only looks through linear chains, not selects/phis. In particular, this means that we miss cases involving pointer induction variables. This patch fixes this by introducing a new helper getUnderlyingObjectAggressive() which basically does what getUnderlyingObjects() does, just specialized to the case where we must arrive at a single underlying object in the end, and with a limit on the number of inspected values. Doing this more expensive underlying object check has no measurable compile-time impact on CTMark.
2024-07-22	[IR] Remove non-canonical matchings (#96763)	AtariDreams	1	-1/+1

2024-07-22	[InstCombine] Do not use operand info in `replaceInInstruction` (#99492)	Yingwei Zheng	1	-4/+8
	Consider the following case:
	```
	%cmp = icmp eq ptr %p, null
	%load = load i32, ptr %p, align 4
	%sel = select i1 %cmp, i32 %load, i32 0
	```
	`foldSelectValueEquivalence` converts `load i32, ptr %p, align 4` into `load i32, ptr null, align 4`, which causes immediate UB. `%load` is speculatable, but it doesn't hold after operand substitution. This patch introduces a new helper `isSafeToSpeculativelyExecuteWithVariableReplaced`. It ignores operand info in these instructions since their operands will be replaced later. Fixes #99436. --------- Co-authored-by: Nikita Popov <github@npopov.com>
2024-07-19	[ValueTracking] Let ComputeKnownSignBits handle (shl (zext X), C) (#97693)	Bjorn Pettersson	1	-3/+14
	Add simple support for looking through a zext when doing ComputeKnownSignBits for shl. This is valid for the case when all extended bits are shifted out, because then the number of sign bits can be found by analysing the zext operand. The solution here is simple as it only handles a single zext (not passing the remaining left shift amount during recursion). It could be possible to generalize this in the future by, for example, passing an 'OffsetFromMSB' parameter to ComputeNumSignBitsImpl, telling it to calculate the number of sign bits starting at some offset from the most significant bit.
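The claim that the sign bits can be read off the zext operand once all extended bits are shifted out can be checked on a small case. The sketch assumes an i8 value zero-extended to i16 and shifted left by 8; `num_sign_bits` and `shl_zext` are hypothetical helpers, not LLVM functions.

```python
def num_sign_bits(value, bits):
    """Count leading bits equal to the sign bit (always at least 1)."""
    sign = (value >> (bits - 1)) & 1
    count = 0
    for i in range(bits - 1, -1, -1):
        if (value >> i) & 1 != sign:
            break
        count += 1
    return count

def shl_zext(x, dst_bits, shift):
    """Model shl (zext x to dst_bits), shift for a non-negative x."""
    return (x << shift) & ((1 << dst_bits) - 1)
```

With the shift amount equal to the number of extended bits, the top bits of the result are exactly the bits of `x`, so the result has at least as many sign bits as `x` itself.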
2024-07-18	[ValueTracking] Remove unnecessary `m_ElementWiseBitCast` from ↵	Noah Goldstein	1	-4/+1
	`isKnownNonZeroFromOperator`; NFC
2024-07-18	[ValueTracking] Consistently propagate `DemandedElts` in `computeKnownFPClass`	Noah Goldstein	1	-2/+3
	Closes #99080
2024-07-18	[ValueTracking] Consistently propagate `DemandedElts` in `ComputeNumSignBits`	Noah Goldstein	1	-27/+40

2024-07-18	[ValueTracking] Consistently propagate `DemandedElts` in `isKnownNonZero`	Noah Goldstein	1	-34/+56

2024-07-18	[ValueTracking] Consistently propagate `DemandedElts` in `computeKnownBits`	Noah Goldstein	1	-41/+47

2024-07-17	[ValueTracking] Implement Known{Bits,NonZero,FPClass} for `llvm.vector.reverse`	Noah Goldstein	1	-0/+9
	`llvm.vector.reverse` preserves each of the elements and thus any property common to all of them. Alive2 doesn't support the intrinsic yet, but the logic seems pretty self-evident. Closes #99013
2024-07-16	[SLP]Correctly detect minnum/maxnum patterns for select/cmp operations on ↵	Alexey Bataev	1	-4/+5
	floats. The patch enables detection of minnum/maxnum patterns for floating-point instructions represented as select/cmp. It also enables better cost estimation for integer min/max patterns, since the compiler starts to estimate the scalars separately. Reviewers: nikic, RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/98570