Age | Commit message (Collapse) | Author | Files | Lines |
|
This strengthens our `isKnownNonEqual` logic with some fairly
trivial cases.
Proofs: https://alive2.llvm.org/ce/z/4pxRTj
Closes #87705
|
|
Inserts don't modify the data, so if all elements that end up in the
destination are non-zero the result is non-zero.
Closes #87703
|
|
Shuffles don't modify the data, so if all elements that end up in the
destination are non-zero the result is non-zero.
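A minimal sketch of the rule, written in Python rather than LLVM's C++. The function name and the per-lane boolean representation are hypothetical and are not ValueTracking APIs; treating a negative mask index as "give up" is a conservative choice made by this sketch only.

```python
def shuffle_known_non_zero(lhs_non_zero, rhs_non_zero, mask):
    """Return True if every lane selected by the shuffle mask is known non-zero.

    lhs_non_zero / rhs_non_zero: per-lane booleans for the two source vectors.
    mask: index i < len(lhs) selects lhs[i], otherwise rhs[i - len(lhs)].
    """
    n = len(lhs_non_zero)
    for idx in mask:
        if idx < 0:
            # Undefined lane: this sketch conservatively reports "unknown".
            return False
        known = lhs_non_zero[idx] if idx < n else rhs_non_zero[idx - n]
        if not known:
            return False
    return True
```

Lanes that never reach the destination do not matter: `shuffle_known_non_zero([True, False], [True, True], [0, 2, 3])` holds even though lane 1 of the first operand may be zero.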
Closes #87702
|
|
Instead of relying on known-bits for strictly positive, use the
`isKnownPositive` API. This will use `isKnownNonZero` which is more
accurate.
Closes #88170
|
|
Adds support for: `{s,u}{add,sub,mul}.with.overflow`
The logic is identical to the non-overflow binops; we were just
missing the cases.
Closes #87701
|
|
|
|
`computeKnownBits`
Previously missing. We compute by just applying the reduce function on
the knownbits of each element.
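As a rough illustration of "applying the reduce function on the knownbits", the sketch below models known bits as a hypothetical `(known_zero, known_one)` pair of integer masks. It only covers `and`/`or` reductions, and the helper names are assumptions, not LLVM's `KnownBits` API.

```python
from functools import reduce

def known_and(a, b):
    # A result bit is zero if it is zero in either input, one if it is one in both.
    return (a[0] | b[0], a[1] & b[1])

def known_or(a, b):
    # A result bit is one if it is one in either input, zero if it is zero in both.
    return (a[0] & b[0], a[1] | b[1])

def reduce_known(op, elems):
    """Fold the per-element known bits with the same operator the reduction uses."""
    return reduce(op, elems)
```

For example, with elements `(0b1100, 0b0001)` and `(0b1000, 0b0011)`, an `and` reduction yields `(0b1100, 0b0001)`.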
Closes #88169
|
|
`isKnownNonZero`
Previously missing, proofs for all implementations:
https://alive2.llvm.org/ce/z/G8wpmG
|
|
`insertelement`
It's the same logic as before; we just need to intersect what we know about
the new Elt and the entire pre-existing Vec.
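A small sketch of the intersection step, assuming the tracked facts are known bits modelled as a hypothetical `(known_zero, known_one)` pair of masks. The function name is illustrative and not taken from LLVM.

```python
def intersect_known(vec_known, elt_known):
    """Keep only the bits that are known, with the same value, in both inputs.

    The result describes any lane of the new vector, whether it came from the
    pre-existing vector or from the inserted element.
    """
    return (vec_known[0] & elt_known[0], vec_known[1] & elt_known[1])
```

With `vec_known = (0b1010, 0b0101)` and `elt_known = (0b1000, 0b0111)` the result is `(0b1000, 0b0101)`: bit 1 is dropped because the two inputs disagree on it.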
Closes #87707
|
|
`isKnownNonZero`; NFC
Closes #87700
|
|
computeKnownFPClass
|
|
|
|
|
|
`insertelement`
It's the same logic as before; we just need to intersect what we know about
the new Elt and the entire pre-existing Vec.
Closes #87708
|
|
There is one notable "regression". This patch replaces the bespoke `or
disjoint` logic with a direct match. This means we fail some
simplifications during `instsimplify`.
All the cases we fail in `instsimplify` we do handle in `instcombine`
as we add `disjoint` flags.
Other than that, just some basic cases.
See proofs: https://alive2.llvm.org/ce/z/_-g7C8
Closes #86083
|
|
In `(icmp eq (and x,y), C)` all 1s in `C` must also be set in both
`x`/`y`.
In `(icmp eq (or x,y), C)` all 0s in `C` must also be 0 in both
`x`/`y`.
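The two rules can be checked with a short sketch. Known bits are modelled as a hypothetical `(known_zero, known_one)` pair of masks, and both function names are illustrative only.

```python
def known_from_icmp_eq_and(c, width):
    """(x & y) == c: every 1 bit of c is 1 in x (and likewise in y)."""
    mask = (1 << width) - 1
    return (0, c & mask)

def known_from_icmp_eq_or(c, width):
    """(x | y) == c: every 0 bit of c is 0 in x (and likewise in y)."""
    mask = (1 << width) - 1
    return (~c & mask, 0)
```

For a 4-bit `C = 0b0110`, the `and` form gives known ones `0b0110` and the `or` form gives known zeros `0b1001`. Nothing is learned about the remaining bits in either case.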
Closes #87143
|
|
Assumption/DomCondition Cache
We can definitionally treat `or disjoint` as `add` anywhere.
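A quick way to convince yourself of the equivalence: when the operands share no set bits, addition produces no carries, so it matches bitwise or. The helper below is a hypothetical check, not part of LLVM.

```python
def or_disjoint_equals_add(x, y, width=8):
    """Check that x | y == x + y (mod 2**width) when x and y share no set bits."""
    if x & y:
        raise ValueError("operands are not disjoint")
    mask = (1 << width) - 1
    return (x | y) == ((x + y) & mask)
```

Exhaustively trying all disjoint 4-bit pairs with `width=4` returns `True` every time.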
Closes #86302
|
|
Handle the range attribute in ValueTracking.
|
|
This patch handles `not` in `isImpliedCondition` to enable more folds in
some multi-use cases.
|
|
All the isKnownNonZero() handling for instructions should be inside
this function. This makes the structure more similar to
computeKnownBitsFromOperator() as well.
This may not be entirely NFC due to different depth handling.
|
|
Move the declaration of the Ty variable outside the NDEBUG guard
and make use of it in the remainder of the function.
|
|
Nowadays !range can be placed on instructions with vector of int
return value. Support this case in isKnownNonZero().
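A sketch of the underlying check, assuming a single half-open `[lo, hi)` range that may wrap around and applies to every lane of the vector. The function name is hypothetical; `lo == hi` is rejected because this sketch has no way to tell an empty range from a full one.

```python
def range_excludes_zero(lo, hi, width):
    """Return True if the wrapping half-open range [lo, hi) cannot contain 0."""
    mask = (1 << width) - 1
    lo &= mask
    hi &= mask
    if lo == hi:
        raise ValueError("empty or full range is not representable here")
    if lo < hi:
        # Ordinary range: 0 is inside only when the range starts at 0.
        return lo != 0
    # Wrapped range [lo, 2**width) U [0, hi): 0 is inside unless hi == 0.
    return hi == 0
```

For `i8`, `[1, 10)` and `[250, 0)` exclude zero, while `[0, 10)` and `[250, 5)` do not.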
|
|
If we had a comparison to a literal nan with a false predicate,
we were incorrectly treating it as an unordered compare. This was
correct for fcmp true, but not fcmp false. I noticed this in the
review for e44d3b3e503fa12fdaead2936b28844aa36237c1 but misdiagnosed
the reason. Also change the test for the fcmp true case to be more
useful, but it wasn't wrong previously.
|
|
We don't always have canonical order here, so do it manually.
Closes #85575
|
|
If we have something like `(select (icmp ult x, 8), x, y)`, we can use
the `(icmp ult x, 8)` to help compute the knownbits of `x`.
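For the `ult` case, a sketch of what the condition tells us about `x` in the true arm. The function name is hypothetical and only unsigned less-than against a non-zero constant is handled.

```python
def known_zero_from_ult(c, width):
    """Bits of x known to be zero when `x <u c` holds.

    x <u c means x <= c - 1, so every bit above the highest set bit of
    c - 1 must be zero.
    """
    if c <= 0:
        raise ValueError("x <u 0 is never true")
    mask = (1 << width) - 1
    high = (c - 1).bit_length()
    return mask & ~((1 << high) - 1)
```

With `c = 8` and a 32-bit `x`, the result is `0xFFFFFFF8`: only the low three bits of `x` can be set.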
Closes #84699
|
|
In 2fe81edef6f
[NFC][RemoveDIs] Insert instruction using iterators in Transforms/
we changed
    if (*req_idx != *i)
      return FindInsertedValue(I->getAggregateOperand(), idx_range,
-                              InsertBefore);
+                              *InsertBefore);
  }
but there is no guarantee that InsertBefore is non-empty at that
point,
which we can see, e.g., in the added testcase.
Instead just pass on the optional InsertBefore in the recursive call to
FindInsertedValue, as we do at several other places already.
|
|
(#84339)
At the moment, getUnderlyingObjects simply continues for phis that do
not refer to the same underlying object in loops, without adding them to
the list of underlying objects, effectively ignoring those phis.
Instead of ignoring those phis, add them to the list of underlying
objects. This fixes a miscompile where LoopAccessAnalysis fails to
identify a memory dependence, because no underlying objects can be found
for a set of memory accesses.
Fixes https://github.com/llvm/llvm-project/issues/82665.
PR: https://github.com/llvm/llvm-project/pull/84339
|
|
|
|
If a function only exits for certain input values we can still derive
that an argument is "returned". We can also derive range metadata that
describe the possible value range returned by the function. However, it
turns out that those two analyses can result in conflicting information.
Example:
declare i16 @foo(i16 returned)
...
%A = call i16 @foo(i16 4095), !range !{i16 32, i16 33}
To avoid "Bits known to be one AND zero?" assertion failures we now
make sure to discard the known bits for this kind of scenario.
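The conflict in the example can be reproduced with a small sketch. Known bits are modelled as a hypothetical `(known_zero, known_one)` pair of masks, and returning `(0, 0)` stands in for "nothing known"; the helper names are illustrative only.

```python
def known_bits_of_constant(value, width):
    """All bits of a constant are known."""
    mask = (1 << width) - 1
    value &= mask
    return (~value & mask, value)

def combine_known_or_discard(a, b):
    """Merge two sources of known bits, dropping everything if they disagree."""
    zero, one = a[0] | b[0], a[1] | b[1]
    if zero & one:
        # Some bit is claimed to be both zero and one: keep nothing.
        return (0, 0)
    return (zero, one)
```

Merging the bits of the `returned` argument `4095` with those of the only value in the range, `32`, hits the conflict and yields `(0, 0)`.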
|
|
`isKnownPositive`; NFC
Just a simple compile time improvement. This function isn't used much,
however, so it's not particularly impactful.
Closes #83638
|
|
(#82803)
This patch handles the pattern `icmp pred (trunc X), C` in
`computeKnownBitsFromCmp` to infer low bits of `X` from dominating
conditions.
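A sketch for the simplest predicate only, `icmp eq (trunc X), C`. The function name is hypothetical and other predicates are not modelled here.

```python
def known_low_bits_from_trunc_eq(c, narrow_width):
    """Known bits of X when `trunc X to i<narrow_width>` equals c.

    Only the low narrow_width bits of X are constrained; the returned
    (known_zero, known_one) masks say nothing about the higher bits.
    """
    low_mask = (1 << narrow_width) - 1
    c &= low_mask
    return (~c & low_mask, c)
```

If an `i32` value truncated to `i8` equals `0x5A`, the low byte of `X` is fully known: zeros `0xA5`, ones `0x5A`.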
|
|
So that we can benefit from some instcombine optimizations.
This PR contains two commits: the first is for adding tests and the
second is for the optimization.
|
|
|
|
As part of the RemoveDIs project we need LLVM to insert instructions using
iterators wherever possible, so that the iterators can carry a bit of
debug-info. This commit implements some of that by updating the contents of
llvm/lib/Transforms/Utils to always use iterator-versions of instruction
constructors.
There are two general flavours of update:
* Almost all call-sites just call getIterator on an instruction
* Several make use of an existing iterator (scenarios where the code is
actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar
APIs that identify the start of a block need to have that iterator passed
directly to the insertion function, without being converted to a bare
Instruction pointer along the way.
Noteworthy changes:
* FindInsertedValue now takes an optional iterator rather than an
instruction pointer, as we need to always insert with iterators,
* I've added a few iterator-taking versions of some value-tracking and
DomTree methods -- they just unwrap the iterator. These are purely
convenience methods to avoid extra syntax in some passes.
* A few calls to getNextNode become std::next instead (to keep in the
theme of using iterators for positions),
* SeparateConstOffsetFromGEP has its insertion-position field changed.
Noteworthy because it's not a purely localised spelling change.
All this should be NFC.
|
|
DomConditionCache
This helps cover some missing cases in both and hopefully provides an
easier framework for extending general condition-based
analysis.
Closes #83161
|
|
AssumptionCache; NFC
|
|
|
|
of branch
The false branch for `and` and true branch for `or` provide less
information (intersection as opposed to union), but still can give
some useful information.
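To illustrate why intersection still helps, take the true branch of `(x == 4) or (x == 12)`: only one of the two comparisons is guaranteed, so we keep the bits on which both agree. Known bits are modelled as a hypothetical `(known_zero, known_one)` pair and the helper names are illustrative.

```python
def known_if_equal(value, width):
    """Known bits of x when `x == value` holds."""
    mask = (1 << width) - 1
    value &= mask
    return (~value & mask, value)

def known_when_both_hold(a, b):
    # Both conditions hold: everything either one proves is known (union).
    return (a[0] | b[0], a[1] | b[1])

def known_when_either_holds(a, b):
    # Only one condition is guaranteed: keep what both agree on (intersection).
    return (a[0] & b[0], a[1] & b[1])
```

For 8-bit `x`, `known_when_either_holds(known_if_equal(4, 8), known_if_equal(12, 8))` gives `(0xF3, 0x04)`: bit 2 is set, bit 3 is unknown, and every other bit is zero.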
Closes #82818
|
|
well-defined/non-poison operands. (#82812)
According to the [coverage
result](https://dtcxzyw.github.io/llvm-opt-benchmark/coverage/home/dtcxzyw/llvm-project/llvm/lib/Analysis/ValueTracking.cpp.html#L7193)
on my benchmark, `llvm::mustTriggerUB` returns true with an average of
35.0M/12.3M=2.85 matches. I think we can stop enumerating when one of
the matches succeeds to avoid filling the temporary buffer
`NonPoisonOps`.
This patch introduces two template functions
`handleGuaranteedWellDefinedOps/handleGuaranteedNonPoisonOps`. They will
pass well-defined/non-poison operands to inlinable callbacks `Handle`.
If the callback returns true, stop processing and return true.
Otherwise, return false.
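The control flow can be pictured with a loose Python analogue. `visit_ops_until` is a hypothetical name, and the real functions are C++ templates, so this only illustrates the early exit and not the inlining benefit.

```python
def visit_ops_until(operands, handle):
    """Pass each operand to `handle`; stop and return True on the first match.

    No temporary list of operands is built, so work stops as soon as the
    callback succeeds.
    """
    for op in operands:
        if handle(op):
            return True
    return False
```

Calling it with operands `["a", "b", "c"]` and a callback that matches `"b"` visits only `"a"` and `"b"`.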
Compile-time improvement:
https://llvm-compile-time-tracker.com/compare.php?from=13acb3af5ad48e850cf37dcf02270ede3f267bd4&to=2b55f513c1b6dd2732cb79a25f3eaf6c5e4d6619&stat=instructions:u
|stage1-O3|stage1-ReleaseThinLTO|stage1-ReleaseLTO-g|stage1-O0-g|stage2-O3|stage2-O0-g|stage2-clang|
|--|--|--|--|--|--|--|
|-0.03%|-0.04%|-0.06%|-0.03%|-0.05%|+0.03%|-0.02%|
|
|
This patch extends `propagatesPoison` to handle more integer intrinsics.
It will turn more logical ands/ors into bitwise ands/ors.
See also https://reviews.llvm.org/D99671.
|
|
Currently we only support `C` as the remainder, but we can also limit
with a constant numerator.
Proofs: https://alive2.llvm.org/ce/z/QB95gU
Closes #82303
|
|
This patch adds the missing `subnormal -> normal` part for `fpext` in
`computeKnownFPClass`.
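Why the extension changes the class can be seen with a short check: every `float` subnormal is well inside the normal range of `double`. The sketch relies on Python floats being IEEE doubles, so unpacking a 32-bit float behaves like an `fpext`; the helper names are hypothetical.

```python
import math
import struct
import sys

def f32_from_bits(bits):
    """Reinterpret a 32-bit pattern as a float, widened to a Python double."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def is_subnormal_f32(bits):
    # Subnormal: zero exponent field with a non-zero mantissa.
    return (bits >> 23) & 0xFF == 0 and bits & 0x7FFFFF != 0

def is_normal_f64(value):
    # Normal: finite and at least the smallest normal double in magnitude.
    return math.isfinite(value) and abs(value) >= sys.float_info.min
```

Both the smallest (`0x00000001`) and largest (`0x007FFFFF`) `float` subnormals satisfy `is_normal_f64(f32_from_bits(bits))`.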
Fixes the miscompilation reported by
https://github.com/llvm/llvm-project/pull/80941#issuecomment-1947302100.
|
|
This patch improves `computeKnownFPClass` by using context-sensitive
information from `DomConditionCache`.
The motivation of this patch is to optimize the following case found in
[fmt/format.h](https://github.com/fmtlib/fmt/blob/e17bc67547a66cdd378ca6a90c56b865d30d6168/include/fmt/format.h#L3555-L3566):
```
define float @test(float %x, i1 %cond) {
%i32 = bitcast float %x to i32
%cmp = icmp slt i32 %i32, 0
br i1 %cmp, label %if.then1, label %if.else
if.then1:
%fneg = fneg float %x
br label %if.end
if.else:
br i1 %cond, label %if.then2, label %if.end
if.then2:
br label %if.end
if.end:
%value = phi float [ %fneg, %if.then1 ], [ %x, %if.then2 ], [ %x, %if.else ]
%ret = call float @llvm.fabs.f32(float %value)
ret float %ret
}
```
We can prove the sign bit of %value is always zero. Then the fabs can be
eliminated.
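The claim can be sanity-checked by simulating the IR in Python. This is only a model of the example, with hypothetical function names: `struct` stands in for the `bitcast`, and `%cond` is omitted because both of its paths feed `%x` into the phi.

```python
import math
import struct

def sign_bit_set(x):
    """Model `icmp slt (bitcast float %x to i32), 0`."""
    bits = struct.unpack("<i", struct.pack("<f", x))[0]
    return bits < 0

def value_at_if_end(x):
    """Model the phi: %fneg on the if.then1 path, %x on the other paths."""
    return -x if sign_bit_set(x) else x

def fabs_is_redundant(x):
    # The sign of the phi result is always positive, so fabs would not change it.
    return math.copysign(1.0, value_at_if_end(x)) == 1.0
```

`fabs_is_redundant` holds for inputs such as `-0.0`, `0.0`, `-2.5`, `2.5` and negative infinity.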
This pattern also exists in cpython/duckdb/oiio/openexr.
Compile-time impact:
https://llvm-compile-time-tracker.com/compare.php?from=f82e0809ba12170e2f648f8a1ac01e78ef06c958&to=041218bf5491996edd828cc15b3aec5a59ddc636&stat=instructions:u
|stage1-O3|stage1-ReleaseThinLTO|stage1-ReleaseLTO-g|stage1-O0-g|stage2-O3|stage2-O0-g|stage2-clang|
|--|--|--|--|--|--|--|
|-0.00%|+0.01%|+0.00%|-0.03%|+0.00%|+0.00%|+0.02%|
|
|
(#81704)
This patch moves the `isSignBitCheck` helper into ValueTracking to reuse
the logic in ValueTracking/InstSimplify.
Addresses the comment
https://github.com/llvm/llvm-project/pull/80740#discussion_r1488440050.
|
|
|
|
This patch improves `computeKnownFPClass` by using context-sensitive
information from `DomConditionCache`.
|
|
This extends computeKnownBits() support for dominating conditions to
also handle and/or conditions. We'll look through either `and` or `or`
depending on which edge we're considering.
This change is mainly for the sake of completeness, so we don't start
missing optimizations if SimplifyCFG decides to merge some branches.
|
|
(#80657)
This patch refactors the interface of the `computeKnownFPClass` family
to pass `SimplifyQuery` directly.
The motivation of this patch is to compute known fpclass with
`DomConditionCache`, which was introduced by
https://github.com/llvm/llvm-project/pull/73662. With
`DomConditionCache`, we can do more optimization with context-sensitive
information.
Example (extracted from
[fmt/format.h](https://github.com/fmtlib/fmt/blob/e17bc67547a66cdd378ca6a90c56b865d30d6168/include/fmt/format.h#L3555-L3566)):
```
define float @test(float %x, i1 %cond) {
%i32 = bitcast float %x to i32
%cmp = icmp slt i32 %i32, 0
br i1 %cmp, label %if.then1, label %if.else
if.then1:
%fneg = fneg float %x
br label %if.end
if.else:
br i1 %cond, label %if.then2, label %if.end
if.then2:
br label %if.end
if.end:
%value = phi float [ %fneg, %if.then1 ], [ %x, %if.then2 ], [ %x, %if.else ]
%ret = call float @llvm.fabs.f32(float %value)
ret float %ret
}
```
We can prove the signbit of `%value` is always zero. Then the fabs can
be eliminated.
|
|
`computeKnownFPClass` (#76360)
This patch merges the logic of `cannotBeOrderedLessThanZeroImpl` into
`computeKnownFPClass` to improve the signbit inference.
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
|
|
If the input is non-zero, this intrinsic should also return a non-zero
value.
|