Age | Commit message (Collapse) | Author | Files | Lines |
|
Add support for the new maximumnum and minimumnum intrinsics in various
optimizations in InstSimplify.
Also, change the behavior of optimizing maxnum(sNaN, x) to simplify to
qNaN instead of x to better match the LLVM IR spec, and add more tests
for sNaN behavior for all 3 max/min intrinsic types.
|
|
This patch simplifies an fcmp into true/false if it is implied by a
dominating fcmp.
As an initial support, it only handles two cases:
+ `fcmp pred1, X, Y -> fcmp pred2, X, Y`: use set operations.
+ `fcmp pred1, X, C1 -> fcmp pred2, X, C2`: use `ConstantFPRange` and
set operations.
Note: It doesn't fix https://github.com/llvm/llvm-project/issues/70985,
as the second fcmp in the motivating case is not dominated by the edge.
We may need to adjust JumpThreading to handle this case.
Comptime impact (~+0.1%):
https://llvm-compile-time-tracker.com/compare.php?from=a728f213c863e4dd19f8969a417148d2951323c0&to=8ca70404fb0d66a824f39d83050ac38e2f1b25b9&stat=instructions:u
IR diff: https://github.com/dtcxzyw/llvm-opt-benchmark/pull/2848
|
|
When using information from dereferenceable assumptions, we need to make
sure that the memory is not freed between the assume and the specified
context instruction. Instead of just checking canBeFreed, check if there
any calls that may free between the assume and the context instruction.
This patch introduces a willNotFreeBetween to check for calls that may
free between an assume and a context instructions, to also be used in
https://github.com/llvm/llvm-project/pull/161255.
PR: https://github.com/llvm/llvm-project/pull/161725
|
|
ninf/nnan in the phi node may produce poison values. They should be
considered in `isGuaranteedNotToBeUndefOrPoison`.
Closes https://github.com/llvm/llvm-project/issues/161524.
|
|
Alive2: https://alive2.llvm.org/ce/z/8rX5Rk
Closes https://github.com/llvm/llvm-project/issues/118106.
|
|
This is a common pattern to initialize Knownbits that occurs before
loops that call intersectWith.
|
|
A common idiom is the usage of the PatternMatch match function within a
functional algorithm like all_of. Introduce a match functor to shorten
this idiom.
Co-authored-by: Luke Lau <luke@igalia.com>
|
|
Closes https://github.com/llvm/llvm-project/issues/157238.
|
|
Fixes an issue where bits next to the sign bit were not constant-folded
when squaring a sign- or zero-extended small integer. Added logic to
detect when both operands of a multiplication are the same extended
value, allowing InstCombine to mark bits above the maximum possible
square as known zero. This enables correct folding of (x * x) & (1 << N)
to 0 when N is out of range.
Proof: https://alive2.llvm.org/ce/z/YGou44
Fixes #152061
|
|
We may have already gotten information from range metadata, don't
overwrite it.
|
|
|
|
Add operators to shift left or right and insert unknown bits.
|
|
We just replaced SmallSet<T *, N> with SmallPtrSet<T *, N>, bypassing
the redirection found in SmallSet.h. With that, we no longer need to
include SmallSet.h in many files.
|
|
(#154404)
They cannot make poison or undef, same for IR. They can only make -1, 0,
or 1
Alive 2: https://alive2.llvm.org/ce/z/--Jd78
|
|
types. (#153623)
This makes the optimization in optimizeStringLength for strlen(gep
@glob, %x) -> sub endof@glob, %x a little more resilient, and maybe a
bit more correct for geps with non-array types.
|
|
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:
template <typename PointeeType, unsigned N>
class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N>
{};
We only have 140 instances that rely on this "redirection", with the
vast majority of them under llvm/. Since relying on the redirection
doesn't improve readability, this patch replaces SmallSet with
SmallPtrSet for pointer element types.
|
|
When InstTy is a type like IntrinsicInst which can have a variable
number of arguments, we can encounter a case where Operation will have
fewer than two arguments and error at the getOperand() calls.
Fixes: https://github.com/llvm/llvm-project/issues/152725.
|
|
Fixes #125228
---------
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
|
|
(#149243)
Fixes https://github.com/llvm/llvm-project/issues/146769
Test cases added to
`llvm/test/Transforms/InstSimplify/fold-intrinsics.ll`
|
|
Fold trig functions call of poison to poison.
This includes sin, cos, asin, acos, atan, atan2, sinh, cosh, sincos,
sincospi.
Test cases are fixed and also added to
llvm/test/Transforms/InstSimplify/fold-intrinsics.ll just like in
https://github.com/llvm/llvm-project/pull/146750
|
|
|
|
Fixes #147116.
|
|
(#146975)
If x - y == 0, then x ^ y == 0. Therefore, we can do the exact same
checks.
https://alive2.llvm.org/ce/z/MtBRoj
|
|
Fixes #146560 as well as propagate poison for [us]add, [us]sub and
[us]mul
|
|
This consolidates the "fold poison arg to poison result" constant
folding logic for intrinsics, based on a common
intrinsicPropagatesPoison() helper, which is also used for poison
propagation reasoning in ValueTracking. This ensures that the set of
supported intrinsics is consistent.
This add ucmp, scmp, smul.fix, smul.fix.sat, canonicalize and sqrt to
the intrinsicPropagatesPoison list, as these were handled by
ConstantFolding but not ValueTracking. The ctpop test is an example of
the converse, where it was handled by ValueTracking but not
ConstantFolding.
|
|
Similarly to what it is being done to match simple recurrence cycle
relations, attempt to match value-accumulating recurrences of kind:
```
%umax.acc = phi i8 [ %umax, %backedge ], [ %a, %entry ]
%umax = call i8 @llvm.umax.i8(i8 %umax.acc, i8 %b)
```
Preliminary work to let InstCombine avoid folding such recurrences,
so that simple loop-invariant computation may get hoisted. Minor
opportunity to refactor out code as well.
|
|
b` (#145105)
Alive2: https://alive2.llvm.org/ce/z/an9npN
Fixes #142283.
|
|
(#144686)
In our downstream GPU target, following IR is valid before instcombine
although the second addrspacecast causes UB.
define i1 @test(ptr addrspace(1) noundef %v) {
%0 = addrspacecast ptr addrspace(1) %v to ptr addrspace(4)
%1 = call i32 @llvm.xxxx.isaddr.shared(ptr addrspace(4) %0)
%2 = icmp eq i32 %1, 0
%3 = addrspacecast ptr addrspace(4) %0 to ptr addrspace(3)
%4 = select i1 %2, ptr addrspace(3) null, ptr addrspace(3) %3
%5 = icmp eq ptr addrspace(3) %4, null
ret i1 %5
}
We have a custom optimization that replaces invalid addrspacecast with
poison, and IR is still valid since `select` stops poison propagation.
However, instcombine pass optimizes `select` to `or`:
%0 = addrspacecast ptr addrspace(1) %v to ptr addrspace(4)
%1 = call i32 @llvm.xxxx.isaddr.shared(ptr addrspace(4) %0)
%2 = icmp eq i32 %1, 0
%3 = addrspacecast ptr addrspace(1) %v to ptr addrspace(3)
%4 = icmp eq ptr addrspace(3) %3, null
%5 = or i1 %2, %4
ret i1 %5
The transform is invalid for our target.
---------
Co-authored-by: Nikita Popov <github@npopov.com>
|
|
Co-authored-by: Yingwei Zheng <dtcxzyw2333@gmail.com>
|
|
Reverts llvm/llvm-project#125935
Causes miscompiles, see comments in #125935
|
|
Closes #125228
|
|
Seeing how we can't generate any debug intrinsics any more: delete a
variety of codepaths where they're handled. For the most part these are
plain deletions, in others I've tweaked comments to remain coherent, or
added a type to (what was) type-generic-lambdas.
This isn't all the DbgInfoIntrinsic call sites but it's most of the
simple scenarios.
Co-authored-by: Nikita Popov <github@npopov.com>
|
|
|
|
This also patches HashRecognize to avoid it mishandling some opcodes.
|
|
We can determine the range from a subtraction if it has nsw or nuw.
https://alive2.llvm.org/ce/z/tXAKVV
|
|
Alive2: https://alive2.llvm.org/ce/z/cJ75Ya
Closes https://github.com/llvm/llvm-project/issues/117436.
|
|
When the original predicate is ordered and both operands are non-NaN,
`Ordered` should be set to true. This variable still matters even if
both operands are non-NaN because FMF only applies to select, not fcmp.
Closes https://github.com/llvm/llvm-project/issues/143123.
|
|
isGuaranteedNotToBeUndefOrPoison. (#142894)
Scalable vectors use insertelt+shufflevector ConstantExpr to
represent a splat.
|
|
computeKnownFPClass. (#143051)
We can support scalable vectors by setting the demanded mask to APInt(1,
1) to demand the whole vector.
|
|
Having a finite Depth (or recursion limit) for computeKnownBits is very
limiting, but is currently a load-bearing necessity, as all KnownBits
are recomputed on each call and there is no caching. As a prerequisite
for an effort to remove the recursion limit altogether, either using a
clever caching technique, or writing a easily-invalidable KnownBits
analysis, make the Depth argument in APIs in ValueTracking uniformly the
last argument with a default value. This would aid in removing the
argument when the time comes, as many callers that currently pass 0
explicitly are now updated to omit the argument altogether.
|
|
|
|
This patch introduces an FMF parameter for
`matchDecomposedSelectPattern` to pass FMF flags from select, instead of
fcmp.
Closes https://github.com/llvm/llvm-project/issues/137998.
Closes https://github.com/llvm/llvm-project/issues/141017.
|
|
Already handled by default case.
|
|
more cases (#141015)
This patch was originally part of
https://github.com/llvm/llvm-project/pull/139861. It generalizes
`ignoreSignBitOfZero/NaN` to handle more instructions/intrinsics.
BTW, I find it mitigates performance regressions caused by
https://github.com/llvm/llvm-project/pull/141010 (IR diff
https://github.com/dtcxzyw/llvm-opt-benchmark/pull/2365/files). We don't
need to propagate FMF from fcmp into select, since we can infer demanded
properties from the user of select.
|
|
Proof: https://alive2.llvm.org/ce/z/oqQyxC
|
|
#140254 was previously missing 2 files in the bazel build config.
|
|
This reverts commit d00d74bb2564103ae3cb5ac6b6ffecf7e1cc2238.
The PR breaks our buildbots and blocks downstream merge.
|
|
add `GenericFloatingPointPredicateUtils` in order to generalize
effects of floating point comparisons on `KnownFPClass` for both IR and
MIR.
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
|
|
The inrange constexpr GEP case is handled since 425cbbc602c9.
|
|
As requested in https://github.com/llvm/llvm-project/pull/138961#discussion_r2078483175
|