Age | Commit message (Collapse) | Author | Files | Lines |
|
This implements the `nusw` and `nuw` flags for `getelementptr` as
proposed at
https://discourse.llvm.org/t/rfc-add-nusw-and-nuw-flags-for-getelementptr/78672.
The three possible flags are encapsulated in the new `GEPNoWrapFlags`
class. Currently this class has a ctor from bool, interpreted as the
InBounds flag. This ctor should be removed in the future, as code gets
migrated to handle all flags.
There are a few places annotated with `TODO(gep_nowrap)`, where I've had
to touch code but opted to not infer or precisely preserve the new
flags, so as to keep this as NFC as possible and make sure any changes
of that kind get test coverage when they are made.
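
As a rough illustration of the idea (not the real `GEPNoWrapFlags` API), the three flags can be packed into one small value type with a transitional constructor from `bool`. The class name `NoWrapFlagsSketch` and all of its members are hypothetical, and the invariant that inbounds implies nusw is an assumption taken from the RFC:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a GEP no-wrap flag set; not LLVM's actual class.
class NoWrapFlagsSketch {
  enum : std::uint8_t { InBoundsBit = 1, NUSWBit = 2, NUWBit = 4 };
  std::uint8_t Bits = 0;

  static NoWrapFlagsSketch fromRaw(unsigned Raw) {
    // Assumed invariant (taken from the RFC): inbounds implies nusw.
    assert(!(Raw & InBoundsBit) || (Raw & NUSWBit));
    NoWrapFlagsSketch F;
    F.Bits = static_cast<std::uint8_t>(Raw);
    return F;
  }

public:
  NoWrapFlagsSketch() = default;
  // Transitional constructor: a bare bool is interpreted as the inbounds flag.
  NoWrapFlagsSketch(bool IsInBounds) {
    if (IsInBounds)
      Bits = static_cast<std::uint8_t>(InBoundsBit | NUSWBit);
  }

  static NoWrapFlagsSketch inBounds() { return fromRaw(InBoundsBit | NUSWBit); }
  static NoWrapFlagsSketch noUnsignedSignedWrap() { return fromRaw(NUSWBit); }
  static NoWrapFlagsSketch noUnsignedWrap() { return fromRaw(NUWBit); }

  bool isInBounds() const { return (Bits & InBoundsBit) != 0; }
  bool hasNoUnsignedSignedWrap() const { return (Bits & NUSWBit) != 0; }
  bool hasNoUnsignedWrap() const { return (Bits & NUWBit) != 0; }

  // Combines two flag sets.
  NoWrapFlagsSketch operator|(NoWrapFlagsSketch Other) const {
    return fromRaw(static_cast<unsigned>(Bits) | Other.Bits);
  }
};
```

With this sketch, existing call sites that pass `true` keep meaning "inbounds", while migrated call sites can combine flags explicitly, e.g. `NoWrapFlagsSketch::noUnsignedWrap() | NoWrapFlagsSketch::noUnsignedSignedWrap()`.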
|
|
cloned callee (#83809)
A related change is https://reviews.llvm.org/D133121, which correctly
preserves both branch weights and value profiles for invoke instructions.
* If the branch weight of the `invokeinst` specifies taken / not-taken branches, there is no scale.
|
|
CallBase::has/getFnAttrOnCalledFunction (#91392)
With opaque pointers, we shouldn't have bitcasts between function
pointer types.
|
|
These methods aren't used yet, but may be in the future. This keeps them
in line with other methods like getFnAttr().
|
|
This patch moves the following intrinsics:
* vector.interleave2/deinterleave2
* vector.reverse
* vector.splice
out of the experimental namespace.
All of these intrinsics have existed in LLVM for more than a year now and are
widely used, so they should no longer be considered experimental.
|
|
Similar to #87934, this adds costs to the shuffles in a canonical LD3/LD4
pattern, which are represented in LLVM as deinterleaving-shuffle(load). This
likely has less effect at the moment than the ST3/ST4 costs as instcombine will
perform certain transforms without considering the cost.
|
|
- Put the helper function in `ProfDataUtil.h/cpp`, which is already a
dependency of `Instructions.cpp`
- The helper function could be re-used to update profiles of
`InvokeInst` (in a follow-up pull request)
|
|
This patch removes the APIs that create NUW neg. It is a trivial case
because `sub nuw 0, X` always gets simplified into zero.
I believe there are no optimization opportunities in real-world
applications where we could take advantage of the nuw flag.
Motivated by
https://github.com/llvm/llvm-project/pull/84792#discussion_r1524891134.
Compile-time improvement:
https://llvm-compile-time-tracker.com/compare.php?from=d1f182c895728d89c5c3d198b133e212a5d9d4a3&to=da7b7478b7cbb32c09d760f6b8d0e67901e0d533&stat=instructions:u
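
The claim that `sub nuw 0, X` is always zero can be checked by brute force on a small bit width. The following is a minimal sketch, assuming 8-bit values and modelling poison as a `false` return; the function names are hypothetical and nothing here is LLVM code:

```cpp
#include <cstdint>

// Models `sub nuw A, B` on 8-bit values. Returns false when the unsigned
// subtraction would wrap (the result would be poison in IR terms).
bool subNuw(std::uint8_t A, std::uint8_t B, std::uint8_t &Out) {
  if (B > A)
    return false;
  Out = static_cast<std::uint8_t>(A - B);
  return true;
}

// Checks exhaustively that `sub nuw 0, X` is either 0 or poison.
bool negNuwIsZeroOrPoison() {
  for (unsigned X = 0; X <= 0xFF; ++X) {
    std::uint8_t R = 0;
    if (subNuw(0, static_cast<std::uint8_t>(X), R) && R != 0)
      return false;
  }
  return true;
}
```

The only input that does not wrap is `X == 0`, so the only non-poison result is 0, which is why the flag carries no extra information for negation.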
|
|
Fixes #86164
|
|
Handle the range attribute in ValueTracking.
|
|
I'd reverted this in 6c7805d5d1 after a bad stage. Original commit
message follows:
[NFC][RemoveDIs] Bulk update utilities to insert with iterators
As part of the RemoveDIs project we need LLVM to insert instructions using
iterators wherever possible, so that the iterators can carry a bit of
debug-info. This commit implements some of that by updating the contents of
llvm/lib/Transforms/Utils to always use iterator-versions of instruction
constructors.
There are two general flavours of update:
* Almost all call-sites just call getIterator on an instruction
* Several make use of an existing iterator (scenarios where the code is
actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar
APIs that identify the start of a block need to have that iterator passed
directly to the insertion function, without being converted to a bare
Instruction pointer along the way.
I've also switched DemotePHIToStack to take an optional iterator: it needs
to take an iterator, and having a no-insert-location behaviour appears to
be important. The constructors for ICmpInst and FCmpInst have been updated
too. They're the only instructions that take block _references_ rather than
pointers for certain calls, and a future patch is going to make use of
default-null block insertion locations.
All of this should be NFC.
|
|
This reverts commit 3fda50d3915b2163a54a37b602be7783a89dd808.
Apparently I've missed a hunk while staging this; will back out for now.
Picked up here: https://lab.llvm.org/buildbot/#/builders/139/builds/60429/steps/6/logs/stdio
|
|
As part of the RemoveDIs project we need LLVM to insert instructions using
iterators wherever possible, so that the iterators can carry a bit of
debug-info. This commit implements some of that by updating the contents of
llvm/lib/Transforms/Utils to always use iterator-versions of instruction
constructors.
There are two general flavours of update:
* Almost all call-sites just call getIterator on an instruction
* Several make use of an existing iterator (scenarios where the code is
actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar
APIs that identify the start of a block need to have that iterator passed
directly to the insertion function, without being converted to a bare
Instruction pointer along the way.
I've also switched DemotePHIToStack to take an optional iterator: it needs
to take an iterator, and having a no-insert-location behaviour appears to
be important. The constructors for ICmpInst and FCmpInst have been updated
too. They're the only instructions that take block _references_ rather than
pointers for certain calls, and a future patch is going to make use of
default-null block insertion locations.
All of this should be NFC.
|
|
In a previous commit I added declarations for all these functions, but
forgot to add bodies for them (as nothing uses them yet). These
iterator-taking constructors are necessary for the future where we only
use iterators for insertion, preserving some debug-info properties.
Also adds two extra declarations I missed in 76dd4bc036f
|
|
Removing debug-intrinsics requires that we always insert with an
iterator, not with an instruction position. To enforce that, we need to
eliminate the `Instruction *` taking functions. It's safe to leave the
insert-at-end-of-block functions as the intention is clear for debug
info purposes (i.e., insert after both instructions and debug-info at
the end of the function).
This patch demonstrates how that needs to happen. At a variety of
call-sites to the `CreateNeg` constructor we need to consider:
* Has this instruction been selected because of the operation it
performs? In that case, just call `getIterator` and pass an iterator in.
* Has this instruction been selected because of its position? If so, we
need to keep the iterator identifying that position (see the 3rd hunk
changing Reassociate.cpp, although it's coincidentally not debug-info
significant).
This also demonstrates what we'll try and do with the constructor
methods going forwards: have one fully explicit set of parameters
including iterator, and another with default-arguments where the
block-to-insert-into argument defaults to nullptr / no-position,
creating an instruction that hasn't been inserted yet.
|
|
Part of removing debug-intrinsics from LLVM requires using iterators
whenever we insert an instruction into a block. That means we need all
instruction constructors and factory functions to have an iterator
taking option, which this patch adds.
The whole of this patch should be NFC: it's adding new flavours of
existing constructors, and plumbing those through to the Instruction
constructor that takes iterators. It's almost entirely boilerplate
copy-and-paste too.
|
|
fpext from bfloat (#82493)
This fixes the case where we would shrink an frem to half and then
bitcast to bfloat, producing invalid results. The transformation was
written under the assumption that there is only one type with a given
bit width.
Also add a strategic assert to CastInst::CreateFPCast to turn this
miscompilation into a crash.
|
|
Miscompilation arises due to instruction combining of cast pairs of the
type `bitcast bfloat to half` + `<FPOp> bfloat to half` or `bitcast half
to bfloat` + `<FPOp> half to bfloat`. For example `bitcast bfloat to
half`+`fpext half to double` or `bitcast bfloat to half`+`fpext bfloat
to double` respectively reduce to `fpext bfloat to double` and `fpext
half to double`. This is an incorrect conversion as it assumes the
representation of `bfloat` and `half` are equivalent due to having the
same width. As a consequence miscompilation arises.
Fixes #61984
|
|
result type. (#77046)
It's not enough to just make sure the destination type is floating point,
because the following chain may be incorrectly optimized:
```LLVM
%trunc = fptrunc float %src to bfloat
%cast = bitcast bfloat %trunc to half
```
Before the fix, the instruction sequence mentioned above used to be
translated into a single fptrunc instruction as follows:
```LLVM
%trunc = fptrunc float %src to half
```
Such a transformation was semantically incorrect.
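
To see why a `bitcast` between `bfloat` and `half` cannot be folded into a floating-point conversion, it helps to compare the two 16-bit encodings of the same value. The sketch below assumes `float` is IEEE-754 binary32, truncates to bfloat16 without rounding, and only decodes normal binary16 values; the helper names are hypothetical:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Truncates a binary32 value to bfloat16 bits (upper 16 bits, no rounding).
std::uint16_t floatToBFloatBits(float F) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  return static_cast<std::uint16_t>(Bits >> 16);
}

// Decodes a normal (finite, non-zero, non-subnormal) binary16 bit pattern:
// 1 sign bit, 5 exponent bits (bias 15), 10 mantissa bits.
float decodeNormalHalf(std::uint16_t H) {
  int Sign = (H >> 15) & 1;
  int Exp = (H >> 10) & 0x1F;
  int Mant = H & 0x3FF;
  assert(Exp != 0 && Exp != 0x1F && "only normal values are handled");
  float Value = std::ldexp(1.0f + Mant / 1024.0f, Exp - 15);
  return Sign ? -Value : Value;
}
```

For example, `floatToBFloatBits(1.0f)` yields `0x3F80`, whereas the `half` encoding of 1.0 is `0x3C00`; reading `0x3F80` as a `half` gives 1.875. Same width does not mean same representation, so the bitcast changes the value's meaning and must not be merged into `fptrunc` or `fpext`.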
|
|
This Op<2> usage was missed in 1ee6ec2bf3, which replaced the third
shuffle operand with a vector of integer mask constants.
I noticed this when attempting to make changes to the layout of
llvm::Value.
|
|
Because we're storing some extra debug-info information in the iterator
class, we need to insert new LICM-created stores using such iterators.
Switch LICM to storing iterators instead of pointers when it promotes
variables in loops, add a test for the desired behaviour, and enable
RemoveDIs instrumentation on a variety of other LICM tests for good
measure.
(This would appear to be the only pass in LLVM that needs to store
iterators on the heap).
|
|
...behind an experimental CMAKE option that's off by default.
This patch adds a new ilist-iterator-like class that can carry two extra bits
as well as the usual node pointer. This is part of the project to remove
debug-intrinsics from LLVM: see the rationale here [0], they're needed to
signal whether a "position" in a BasicBlock includes any debug-info before or
after the iterator.
This entirely duplicates ilist_iterator; attempting re-use showed it to be a
false economy. It's enable-able through the existing ilist_node options
interface, hence a few sites where the instruction-list type needs to be
updated. The actual main feature, the extra bits in the class, aren't part of
the class unless the cmake flag is given: this is because there's a
compile-time cost associated with it, and I'd like to get everything in-tree
but off-by-default so that we can do proper comparisons.
Nothing actually makes use of this yet, but will do soon, see the Phab patch
stack.
[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
Differential Revision: https://reviews.llvm.org/D153777
|
|
Add a NumSrcElts param to the is..Mask functions in the
ShuffleVectorInst class for better mask analysis. Mask.size() does not
always match the size of the permuted vector(s). This allows better cost
estimation in SLP and fixes uses of the functions in other cases.
Differential Revision: https://reviews.llvm.org/D158449
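
A small sketch of why the source element count matters: a mask such as <0, 1> looks like an identity if only `Mask.size()` is consulted, but applied to a 4-element source it actually extracts a subvector. The helpers below are hypothetical simplifications (single source, `-1` marks an undefined lane) and are not the LLVM implementations:

```cpp
#include <cassert>
#include <vector>

// Size-only check: the source is assumed to have Mask.size() elements.
bool isIdentityMaskBySizeOnly(const std::vector<int> &Mask) {
  for (int I = 0, E = static_cast<int>(Mask.size()); I < E; ++I)
    if (Mask[I] != -1 && Mask[I] != I)
      return false;
  return true;
}

// Check with an explicit source element count: an identity must return the
// whole source, so the mask length has to equal NumSrcElts.
bool isIdentityMaskSketch(const std::vector<int> &Mask, int NumSrcElts) {
  assert(NumSrcElts > 0);
  if (static_cast<int>(Mask.size()) != NumSrcElts)
    return false;
  return isIdentityMaskBySizeOnly(Mask);
}
```

With a 4-element source, `isIdentityMaskBySizeOnly({0, 1})` reports an identity while `isIdentityMaskSketch({0, 1}, 4)` does not, which is the kind of misclassification that can skew a cost estimate.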
|
|
ShuffleVectorInst."
This reverts commit b186f1f68be11630355afb0c08b80374a6d31782.
Causes crashes, see https://reviews.llvm.org/D158449.
|
|
Add a NumSrcElts param to the is..Mask functions in the
ShuffleVectorInst class for better mask analysis. Mask.size() does not
always match the size of the permuted vector(s). This allows better cost
estimation in SLP and fixes uses of the functions in other cases.
Differential Revision: https://reviews.llvm.org/D158449
|
|
ShuffleVectorInst."
This reverts commit 6f43d28f3452b3ef598bc12b761cfc2dbd0f34c9 to fix
a crash reported in https://reviews.llvm.org/D158449.
|
|
Add a NumSrcElts param to the is..Mask functions in the
ShuffleVectorInst class for better mask analysis. Mask.size() does not
always match the size of the permuted vector(s). This allows better cost
estimation in SLP and fixes uses of the functions in other cases.
Differential Revision: https://reviews.llvm.org/D158449
|
|
ShuffleVectorInst."
This reverts commit 9f5960e004ff54082ccfa9396522e07358f5b66b to fix
buildbots reported here https://lab.llvm.org/buildbot/#/builders/230/builds/19412.
|
|
Add a NumSrcElts param to the is..Mask functions in the
ShuffleVectorInst class for better mask analysis. Mask.size() does not
always match the size of the permuted vector(s). This allows better cost
estimation in SLP and fixes uses of the functions in other cases.
Differential Revision: https://reviews.llvm.org/D158449
|
|
ShuffleVectorInst."
This reverts commit c88c281cf1ac1a01c55231b93826d7c8ae83985b to fix the
crash revealed by https://lab.llvm.org/buildbot/#/builders/230/builds/19353.
|
|
Add a NumSrcElts param to the is..Mask functions in the
ShuffleVectorInst class for better mask analysis. Mask.size() does not
always match the size of the permuted vector(s). This allows better cost
estimation in SLP and fixes uses of the functions in other cases.
Differential Revision: https://reviews.llvm.org/D158449
|
|
|
|
Similarly to D158861 I'm moving the `CreateFree` method from `CallInst` to `IRBuilderBase`.
Differential Revision: https://reviews.llvm.org/D159418
|
|
This removes `CreateMalloc` from `CallInst` and adds it to the `IRBuilderBase`
class.
We no longer need the `Instruction *InsertBefore` and
`BasicBlock *InsertAtEnd` arguments of the `createMalloc` helper
function because we're using `IRBuilder` now. That's why we also don't
need 4 `CreateMalloc` functions, but only two.
Differential Revision: https://reviews.llvm.org/D158861
|
|
This bitcast is no longer necessary with opaque pointers. This
results in some annoying variable name changes in tests.
|
|
|
|
Given a shuffle mask like <3, 0, 1, 2, 7, 4, 5, 6> for v8i8, we can
reinterpret it as a shuffle of v2i32 where the two i32s are bit rotated, and
lower it as a vror.vi (if legal with zvbb enabled).
We also need to make sure that the larger element type is a valid SEW, hence
the tests for zve32x.
X86 already did this, so I've extracted the logic for it and put it inside
ShuffleVectorSDNode so it could be reused by RISC-V. I originally tried to add
this as a generic combine in DAGCombiner.cpp, but it ended up causing worse
codegen on X86 and PPC.
Reviewed By: reames, pengfei
Differential Revision: https://reviews.llvm.org/D157417
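
The mask test described above can be sketched as follows. This is a simplified, hypothetical helper rather than the logic that was extracted into ShuffleVectorSDNode: it takes a fixed group size, rejects undefined lanes, and reports the rotation in elements instead of bits:

```cpp
#include <cassert>
#include <vector>

// Returns the rotation R (0 < R < GroupSize, in elements) if Mask permutes
// every group of GroupSize consecutive elements by the same rotation,
// or -1 if it does not.
int matchGroupRotate(const std::vector<int> &Mask, int GroupSize) {
  assert(GroupSize > 1);
  int N = static_cast<int>(Mask.size());
  if (N == 0 || N % GroupSize != 0)
    return -1;
  int Rot = -1;
  for (int I = 0; I < N; ++I) {
    int Base = I - I % GroupSize;
    int M = Mask[I];
    // Reject undefined lanes and lanes that read from another group.
    if (M < Base || M >= Base + GroupSize)
      return -1;
    // Lane I reads lane M, so data moved up by (I - M) modulo GroupSize.
    int R = ((I - M) % GroupSize + GroupSize) % GroupSize;
    if (Rot == -1)
      Rot = R;
    else if (Rot != R)
      return -1;
  }
  return Rot == 0 ? -1 : Rot; // An identity is not a rotate.
}
```

For <3, 0, 1, 2, 7, 4, 5, 6> with a group size of 4 this returns 1: each i32 has its bytes moved up by one lane. Assuming element 0 is the least significant byte, that is a left rotate by 8 bits, or equivalently a right rotate by 24 bits.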
|
|
/Users/jiefu/llvm-project/llvm/lib/IR/Instructions.cpp:166:3: error: ignoring return value of function declared with 'nodiscard' attribute [-Werror,-Wunused-result]
std::remove_if(const_cast<block_iterator>(block_begin()),
^~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
|
|
Add an API that allows removing multiple incoming phi values based
on a predicate callback, as suggested on D157621.
This makes sure that the removal is linear time rather than quadratic,
and avoids subtleties around iterator invalidation.
I have replaced some of the more straightforward users with the new
API, though there's a couple more places that should be able to use it.
Differential Revision: https://reviews.llvm.org/D158064
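
The linear-time behaviour can be illustrated with a standalone sketch. `PhiSketch` and `removeIncomingIf` are hypothetical names, and the parallel vectors merely stand in for a PHI node's incoming values and blocks; this is not the API added by the patch:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical PHI-like node: Values[i] arrives from block id Blocks[i].
struct PhiSketch {
  std::vector<int> Values;
  std::vector<int> Blocks;

  // Removes every incoming entry whose original index satisfies Pred.
  // Surviving entries are compacted in a single pass, so the cost is linear
  // in the number of entries, and no index is visited after being shifted.
  void removeIncomingIf(const std::function<bool(std::size_t)> &Pred) {
    assert(Values.size() == Blocks.size());
    std::size_t Out = 0;
    for (std::size_t In = 0; In < Values.size(); ++In) {
      if (Pred(In))
        continue;
      Values[Out] = Values[In];
      Blocks[Out] = Blocks[In];
      ++Out;
    }
    Values.resize(Out);
    Blocks.resize(Out);
  }
};
```

Removing entries one at a time would shift the tail on every removal, which is quadratic in the worst case and makes it easy to skip an element after the indices move. The single compaction pass avoids both problems.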
|
|
Differential Revision: https://reviews.llvm.org/D157551
|
|
It seems the ranges start with 0 in most cases.
Reviewed By: dblaikie, gchatelet
Differential Revision: https://reviews.llvm.org/D156135
|
|
|
|
Always returns true with opaque pointers.
|
|
Update to use new shufflevector semantics for undefined values in the mask
Differential Revision: https://reviews.llvm.org/D149548
|
|
Following the change in shufflevector semantics,
poison will be used to represent undefined elements in shufflevector masks.
Differential Revision: https://reviews.llvm.org/D149256
|
|
This is no longer relevant with opaque pointers.
Also drop the CastInst::isLosslessCast() method, which was only
used here.
|
|
integer code.
Confusingly ConstantFP's getZeroValueForNegation intentionally
handles non-FP constants. It calls getNullValue in Constant.
Nearly all uses in tree are for integers rather than FP. Maybe due
to replacing the FSub -0.0, X idiom with an FNeg instruction a few
years ago.
This patch replaces all the integer uses in tree with ConstantInt::get(Ty, 0).
The one remaining use is in clang with a FIXME that it should use fneg.
I'll fix that next and then delete ConstantFP::getZeroValueForNegation.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D147492
|
|
This adds two new methods to ShuffleVectorInst, isInterleave and
isInterleaveMask, so that the logic to check if a shuffle mask is an
interleave can be shared across the TTI, codegen and the interleaved
access pass.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D145971
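
As an illustration of what such a check looks for, the sketch below tests the simplest interleave shape: `Factor` groups of equal length laid out back to back in the concatenated source, with no undefined lanes. The function name is hypothetical and the real methods may accept more general masks:

```cpp
#include <cassert>
#include <vector>

// Returns true if Mask interleaves Factor consecutive groups of
// Mask.size() / Factor elements, e.g. <0, 4, 1, 5, 2, 6, 3, 7> for Factor 2.
bool isInterleaveMaskSketch(const std::vector<int> &Mask, int Factor) {
  assert(Factor >= 2);
  int N = static_cast<int>(Mask.size());
  if (N == 0 || N % Factor != 0)
    return false;
  int LaneLen = N / Factor;
  for (int I = 0; I < N; ++I) {
    // Output lane I takes element (I / Factor) of group (I % Factor).
    int Expected = (I % Factor) * LaneLen + I / Factor;
    if (Mask[I] != Expected)
      return false;
  }
  return true;
}
```

Keeping this test in one place means that every caller agrees on what counts as an interleave.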
|
|
I regularly try and fail to use this while debugging.
|
|
This carries a bitmask indicating forbidden floating-point value kinds
in the argument or return value. This will enable interprocedural
-ffinite-math-only optimizations. This is primarily to cover the
no-nans and no-infinities cases, but also covers the other floating
point classes for free. Textually, this provides a number of names
corresponding to bits in FPClassTest, e.g.
call nofpclass(nan inf) @must_be_finite()
call nofpclass(snan) @cannot_be_snan()
This is more expressive than the existing nnan and ninf fast math
flags. As an added bonus, you can represent fun things like nanf:
declare nofpclass(inf zero sub norm) float @only_nans()
Compared to nnan/ninf:
- Can be applied to individual call operands as well as the return value
- Can distinguish signaling and quiet nans
- Distinguishes the sign of infinities
- Can be safely propagated since it doesn't imply anything about
other operands.
- Does not apply to FP instructions; it's not a flag
This is one step closer to being able to retire "no-nans-fp-math" and
"no-infs-fp-math". The one remaining situation where we have no way to
represent no-nans/infs is for loads (if we wanted to solve this we
could introduce !nofpclass metadata, following along with
noundef/!noundef).
This is to help simplify the GPU builtin math library
distribution. Currently the library code has explicit finite math only
checks, read from global constants the compiler driver needs to set
based on the compiler flags during linking. We end up having to
internalize the library into each translation unit in case different
linked modules have different math flags. By propagating known-not-nan
and known-not-infinity information, we can automatically prune the
edge case handling in most functions if the function is only reached
from fast math uses.
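
The bitmask idea can be sketched outside of LLVM as follows. The enum below is a hypothetical simplification: its names and bit values are illustrative and are not LLVM's FPClassTest. The code assumes `float` is IEEE-754 binary32 and that a set top mantissa bit marks a quiet NaN:

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical class bits, one per floating-point value kind.
enum FPClassSketch : unsigned {
  SNan = 1u << 0,
  QNan = 1u << 1,
  NegInf = 1u << 2,
  PosInf = 1u << 3,
  Zero = 1u << 4,
  Sub = 1u << 5,
  Norm = 1u << 6,
  Nan = SNan | QNan,
  Inf = NegInf | PosInf
};

// Maps a value to exactly one class bit.
unsigned classify(float F) {
  if (std::isnan(F)) {
    std::uint32_t Bits;
    std::memcpy(&Bits, &F, sizeof(Bits));
    return (Bits & 0x00400000u) ? QNan : SNan; // assumed quiet-bit convention
  }
  if (std::isinf(F))
    return std::signbit(F) ? NegInf : PosInf;
  switch (std::fpclassify(F)) {
  case FP_ZERO:
    return Zero;
  case FP_SUBNORMAL:
    return Sub;
  default:
    return Norm;
  }
}

// True if F falls into one of the classes forbidden by ForbiddenMask.
bool violates(unsigned ForbiddenMask, float F) {
  return (classify(F) & ForbiddenMask) != 0;
}
```

In this model, `violates(Nan | Inf, x)` corresponds to the `nofpclass(nan inf)` example above, and `violates(Inf | Zero | Sub | Norm, x)` corresponds to `nofpclass(inf zero sub norm)`, which leaves only NaNs permitted.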
|