path: root/llvm/lib/IR/Instructions.cpp
Age | Commit message | Author | Files | Lines
2024-05-27[IR] Add getelementptr nusw and nuw flags (#90824)Nikita Popov1-1/+22
This implements the `nusw` and `nuw` flags for `getelementptr` as proposed at https://discourse.llvm.org/t/rfc-add-nusw-and-nuw-flags-for-getelementptr/78672. The three possible flags are encapsulated in the new `GEPNoWrapFlags` class. Currently this class has a ctor from bool, interpreted as the InBounds flag. This ctor should be removed in the future, as code gets migrated to handle all flags. There are a few places annotated with `TODO(gep_nowrap)`, where I've had to touch code but opted to not infer or precisely preserve the new flags, so as to keep this as NFC as possible and make sure any changes of that kind get test coverage when they are made.
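The flag wrapper described above can be pictured with a small standalone sketch. This is not LLVM's `GEPNoWrapFlags` definition: the class name `GepFlagsSketch`, its bit layout, and its method names are hypothetical, and the sketch assumes that `inbounds` implies `nusw`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a three-flag wrapper; names and layout are
// illustrative only and are not taken from LLVM.
class GepFlagsSketch {
public:
  enum : std::uint8_t {
    InBoundsBit = 1u << 0,
    NUSWBit = 1u << 1,
    NUWBit = 1u << 2,
  };

  // Transitional constructor: a bare bool is read as the InBounds flag.
  // Assumption: inbounds implies nusw, so both bits are set together.
  GepFlagsSketch(bool IsInBounds)
      : Bits(IsInBounds ? (InBoundsBit | NUSWBit) : 0) {}

  static GepFlagsSketch none() { return GepFlagsSketch(false); }
  static GepFlagsSketch noUnsignedWrap() { return fromRaw(NUWBit); }
  static GepFlagsSketch noUnsignedSignedWrap() { return fromRaw(NUSWBit); }

  bool isInBounds() const { return (Bits & InBoundsBit) != 0; }
  bool hasNoUnsignedSignedWrap() const { return (Bits & NUSWBit) != 0; }
  bool hasNoUnsignedWrap() const { return (Bits & NUWBit) != 0; }

  // Combines two flag sets; a flag is present if either side has it.
  GepFlagsSketch operator|(GepFlagsSketch Other) const {
    return fromRaw(Bits | Other.Bits);
  }

private:
  std::uint8_t Bits;

  static GepFlagsSketch fromRaw(unsigned Raw) {
    assert(Raw < 8 && "only three flag bits exist in this sketch");
    GepFlagsSketch F(false);
    F.Bits = static_cast<std::uint8_t>(Raw);
    return F;
  }
};
```

Callers that still pass a plain `bool` keep compiling through the implicit constructor, which is the migration property the commit message describes.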
2024-05-08[Inline][PGO] After inline, update InvokeInst profile counts in caller and cloned callee (#83809)Mingming Liu1-0/+12
A related change is https://reviews.llvm.org/D133121, which correctly preserves both branch weights and value profiles for invoke instructions. * If the branch weights of the `invokeinst` specify taken / not-taken branches, no scaling is applied.
2024-05-08[IR] Remove check for bitcast of called function in CallBase::has/getFnAttrOnCalledFunction (#91392)Arthur Eubanks1-18/+3
With opaque pointers, we shouldn't have bitcasts between function pointer types.
2024-05-08[IR] Check callee param attributes as well in CallBase::getParamAttr() (#91394)Arthur Eubanks1-0/+16
These methods aren't used yet, but may be in the future. This keeps them in line with other methods like getFnAttr().
2024-04-29Move several vector intrinsics out of experimental namespace (#88748)Maciej Gabka1-1/+1
This patch moves the following intrinsics out of the experimental namespace: * vector.interleave2/deinterleave2 * vector.reverse * vector.splice. All of these intrinsics have existed in LLVM for more than a year and are widely used, so they should no longer be considered experimental.
2024-04-21[AArch64] Add costs for LD3/LD4 shuffles.David Green1-0/+25
Similar to #87934, this adds costs to the shuffles in a canonical LD3/LD4 pattern, which are represented in LLVM as deinterleaving-shuffle(load). This likely has less effect at the moment than the ST3/ST4 costs as instcombine will perform certain transforms without considering the cost.
2024-03-27[nfc][PGO]Factor out profile scaling into a standalone helper function (#83780)Mingming Liu1-45/+1
- Put the helper function in `ProfDataUtil.h/cpp`, which is already a dependency of `Instructions.cpp` - The helper function could be re-used to update profiles of `InvokeInst` (in a follow-up pull request)
2024-03-26[LLVM] Remove nuw neg (#86295)Yingwei Zheng1-12/+0
This patch removes the APIs that create NUW neg. It is a trivial case because `sub nuw 0, X` always gets simplified into zero. I believe there are no optimization opportunities in real-world applications that could take advantage of the nuw flag. Motivated by https://github.com/llvm/llvm-project/pull/84792#discussion_r1524891134. Compile-time improvement: https://llvm-compile-time-tracker.com/compare.php?from=d1f182c895728d89c5c3d198b133e212a5d9d4a3&to=da7b7478b7cbb32c09d760f6b8d0e67901e0d533&stat=instructions:u
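The claim that `sub nuw 0, X` can only ever produce zero can be checked exhaustively on a small integer width. The sketch below models the `nuw` subtraction on 8-bit values; `SubResult`, `subNuw`, and `negNuwIsZeroOrPoison` are hypothetical helper names, and "poison" is represented only as a boolean marker.

```cpp
#include <cassert>
#include <cstdint>

// Result of a modelled `sub nuw i8`: either a defined value or poison.
struct SubResult {
  bool IsPoison;
  std::uint8_t Value;
};

// Models `sub nuw i8 A, B`: the result is poison whenever the unsigned
// subtraction would wrap, which happens exactly when B > A.
SubResult subNuw(std::uint8_t A, std::uint8_t B) {
  if (B > A)
    return {true, 0};
  return {false, static_cast<std::uint8_t>(A - B)};
}

// Checks every i8 value of X: any defined result of `sub nuw 0, X` is 0.
bool negNuwIsZeroOrPoison() {
  for (unsigned X = 0; X <= 0xFF; ++X) {
    SubResult R = subNuw(0, static_cast<std::uint8_t>(X));
    if (!R.IsPoison && R.Value != 0)
      return false;
  }
  return true;
}
```

Only `X == 0` avoids the wrap, and in that case the difference is zero, so the flag carries no extra information for a negation.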
2024-03-24[InstCombine] Copy flags of extractelement for extelt -> icmp combine (#86366)Marc Auberer1-0/+10
Fixes #86164
2024-03-20[ValueTracking] Handle range attributes (#85143)Andreas Jonson1-0/+8
Handle the range attribute in ValueTracking.
2024-03-04[RemoveDIs] Reapply 3fda50d3915, insert instructions using iteratorsJeremy Morse1-2/+2
I'd reverted this in 6c7805d5d1 after a bad stage. Original commit message follows: [NFC][RemoveDIs] Bulk update utilities to insert with iterators As part of the RemoveDIs project we need LLVM to insert instructions using iterators wherever possible, so that the iterators can carry a bit of debug-info. This commit implements some of that by updating the contents of llvm/lib/Transforms/Utils to always use iterator-versions of instruction constructors. There are two general flavours of update: * Almost all call-sites just call getIterator on an instruction * Several make use of an existing iterator (scenarios where the code is actually significant for debug-info) The underlying logic is that any call to getFirstInsertionPt or similar APIs that identify the start of a block need to have that iterator passed directly to the insertion function, without being converted to a bare Instruction pointer along the way. I've also switched DemotePHIToStack to take an optional iterator: it needs to take an iterator, and having a no-insert-location behaviour appears to be important. The constructors for ICmpInst and FCmpInst have been updated too. They're the only instructions that take block _references_ rather than pointers for certain calls, and a future patch is going to make use of default-null block insertion locations. All of this should be NFC.
2024-02-29Revert "[NFC][RemoveDIs] Bulk update utilities to insert with iterators"Jeremy Morse1-2/+2
This reverts commit 3fda50d3915b2163a54a37b602be7783a89dd808. Apparently I've missed a hunk while staging this; will back out for now. Picked up here: https://lab.llvm.org/buildbot/#/builders/139/builds/60429/steps/6/logs/stdio
2024-02-29[NFC][RemoveDIs] Bulk update utilities to insert with iteratorsJeremy Morse1-2/+2
As part of the RemoveDIs project we need LLVM to insert instructions using iterators wherever possible, so that the iterators can carry a bit of debug-info. This commit implements some of that by updating the contents of llvm/lib/Transforms/Utils to always use iterator-versions of instruction constructors. There are two general flavours of update: * Almost all call-sites just call getIterator on an instruction * Several make use of an existing iterator (scenarios where the code is actually significant for debug-info) The underlying logic is that any call to getFirstInsertionPt or similar APIs that identify the start of a block need to have that iterator passed directly to the insertion function, without being converted to a bare Instruction pointer along the way. I've also switched DemotePHIToStack to take an optional iterator: it needs to take an iterator, and having a no-insert-location behaviour appears to be important. The constructors for ICmpInst and FCmpInst have been updated too. They're the only instructions that take block _references_ rather than pointers for certain calls, and a future patch is going to make use of default-null block insertion locations. All of this should be NFC.
2024-02-29[NFC][RemoveDIs] Add bodies for inst-constructors taking iteratorsJeremy Morse1-0/+129
In a previous commit I added declarations for all these functions, but forgot to add bodies for them (as nothing uses them yet). These iterator-taking constructors are necessary for the future where we only use iterators for insertion, preserving some debug-info properties. Also adds two extra declarations I missed in 76dd4bc036f
2024-02-29[NFC][RemoveDIs] Have CreateNeg only accept iterators (#82999)Jeremy Morse1-8/+0
Removing debug-intrinsics requires that we always insert with an iterator, not with an instruction position. To enforce that, we need to eliminate the `Instruction *` taking functions. It's safe to leave the insert-at-end-of-block functions as the intention is clear for debug info purposes (i.e., insert after both instructions and debug-info at the end of the function). This patch demonstrates how that needs to happen. At a variety of call-sites to the `CreateNeg` constructor we need to consider: * Has this instruction been selected because of the operation it performs? In that case, just call `getIterator` and pass an iterator in. * Has this instruction been selected because of its position? If so, we need to keep the iterator identifying that position (see the 3rd hunk changing Reassociate.cpp, although it's coincidentally not debug-info significant). This also demonstrates what we'll try and do with the constructor methods going forwards: have one fully explicit set of parameters including iterator, and another with default-arguments where the block-to-insert-into argument defaults to nullptr / no-position, creating an instruction that hasn't been inserted yet.
2024-02-26[RemoveDIs] Add iterator-taking constructors and Create methods (#82778)Jeremy Morse1-6/+498
Part of removing debug-intrinsics from LLVM requires using iterators whenever we insert an instruction into a block. That means we need all instruction constructors and factory functions to have an iterator taking option, which this patch adds. The whole of this patch should be NFC: it's adding new flavours of existing constructors, and plumbing those through to the Instruction constructor that takes iterators. It's almost entirely boilerplate copy-and-paste too.
2024-02-22[InstCombine] Pick bfloat over half when shrinking ops that started with an fpext from bfloat (#82493)Benjamin Kramer1-0/+1
This fixes the case where we would shrink an frem to half and then bitcast to bfloat, producing invalid results. The transformation was written under the assumption that there is only one type with a given bit width. Also add a strategic assert to CastInst::CreateFPCast to turn this miscompilation into a crash.
2024-01-31[llvm][InstCombine] bitcast bfloat half castpair bug (#79832)Nashe Mncube1-7/+1
Miscompilation arises due to instruction combining of cast pairs of the type `bitcast bfloat to half` + `<FPOp> half to <type>` or `bitcast half to bfloat` + `<FPOp> bfloat to <type>`. For example, `bitcast bfloat to half` + `fpext half to double` and `bitcast half to bfloat` + `fpext bfloat to double` respectively reduce to `fpext bfloat to double` and `fpext half to double`. This is an incorrect conversion, as it assumes that the representations of `bfloat` and `half` are equivalent because they have the same width. As a consequence, miscompilation arises. Fixes #61984
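To see why equal width does not mean equal representation, the sketch below decodes the same 16-bit pattern under both layouts. It assumes the usual field splits (bfloat: 8 exponent bits and 7 mantissa bits; half: 5 exponent bits and 10 mantissa bits) and only handles normal numbers; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Decodes a 16-bit pattern with the given exponent/mantissa split.
// Sketch only: zero, subnormals, infinities, and NaNs are not handled.
double decodeNormal(std::uint16_t Bits, int ExpBits, int MantBits) {
  assert(ExpBits + MantBits == 15 && "1 sign bit + exponent + mantissa");
  int Sign = (Bits >> 15) & 1;
  int Exp = (Bits >> MantBits) & ((1 << ExpBits) - 1);
  int Mant = Bits & ((1 << MantBits) - 1);
  assert(Exp != 0 && Exp != (1 << ExpBits) - 1 && "normal numbers only");
  int Bias = (1 << (ExpBits - 1)) - 1;
  // value = (1 + Mant / 2^MantBits) * 2^(Exp - Bias)
  double Fraction = 1.0 + std::ldexp(static_cast<double>(Mant), -MantBits);
  double Value = std::ldexp(Fraction, Exp - Bias);
  return Sign ? -Value : Value;
}

double decodeBFloat(std::uint16_t Bits) { return decodeNormal(Bits, 8, 7); }
double decodeHalf(std::uint16_t Bits) { return decodeNormal(Bits, 5, 10); }
```

The pattern `0x3F80` decodes to 1.0 as a bfloat but to 1.875 as a half, so folding a bitcast between the two types into a neighbouring floating-point cast changes the numeric result.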
2024-01-16[InstCombine] Only fold bitcast(fptrunc) if destination type matches fptrunc result type. (#77046)Victor Mustya1-2/+2
It's not enough to just make sure the destination type is floating point, because the following chain may be incorrectly optimized:
```LLVM
%trunc = fptrunc float %src to bfloat
%cast = bitcast bfloat %trunc to half
```
Before the fix, the instruction sequence above was translated into a single fptrunc instruction as follows:
```LLVM
%trunc = fptrunc float %src to half
```
Such a transformation was semantically incorrect.
2023-12-15[IR] Fix UB on Op<2> in ShuffleVector predicates (#75549)Reid Kleckner1-8/+1
This Op<2> usage was missed in 1ee6ec2bf3, which replaced the third shuffle operand with a vector of integer mask constants. I noticed this when attempting to make changes to the layout of llvm::Value.
2023-11-30[DebugInfo][RemoveDIs] Have LICM insert at iterator positions (#73671)Jeremy Morse1-0/+28
Because we're storing some extra debug-info information in the iterator class, we need to insert new LICM-created stores using such iterators. Switch LICM to storing iterators instead of pointers when it promotes variables in loops, add a test for the desired behaviour, and enable RemoveDIs instrumentation on a variety of other LICM tests for good measure. (This would appear to be the only pass in LLVM that needs to store iterators on the heap).
2023-10-17[ADT][DebugInfo][RemoveDIs] Add extra bits to ilist_iterator for debug-infoJeremy Morse1-1/+1
...behind an experimental CMAKE option that's off by default. This patch adds a new ilist-iterator-like class that can carry two extra bits as well as the usual node pointer. This is part of the project to remove debug-intrinsics from LLVM: see the rationale here [0], they're needed to signal whether a "position" in a BasicBlock includes any debug-info before or after the iterator. This entirely duplicates ilist_iterator, attempting re-use showed it to be a false economy. It's enable-able through the existing ilist_node options interface, hence a few sites where the instruction-list type needs to be updated. The actual main feature, the extra bits in the class, aren't part of the class unless the cmake flag is given: this is because there's a compile-time cost associated with it, and I'd like to get everything in-tree but off-by-default so that we can do proper comparisons. Nothing actually makes use of this yet, but will do soon, see the Phab patch stack. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939 Differential Revision: https://reviews.llvm.org/D153777
2023-10-05[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst.Alexey Bataev1-31/+44
Adds a NumSrcElts parameter to the is..Mask functions in the ShuffleVectorInst class for better mask analysis. Mask.size() does not always match the size of the permuted vector(s). This allows better cost estimation in SLP and fixes uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449
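A small sketch shows why the mask length alone is not enough. The function below is a hypothetical simplification of an identity-mask check, not the LLVM implementation: it considers only the first source vector and treats `-1` as an undefined lane.

```cpp
#include <cassert>
#include <vector>

// Hypothetical simplification of an identity-mask check. A lane value of -1
// marks an undefined lane and is accepted in any position.
bool isIdentityMaskSketch(const std::vector<int> &Mask, int NumSrcElts) {
  assert(NumSrcElts > 0 && "source vector must have at least one element");
  // A mask that is shorter or longer than the source cannot be an identity,
  // even if its lanes are 0, 1, 2, ...: it extracts or widens instead.
  if (static_cast<int>(Mask.size()) != NumSrcElts)
    return false;
  for (int I = 0; I < NumSrcElts; ++I)
    if (Mask[I] != -1 && Mask[I] != I)
      return false;
  return true;
}
```

With a four-element source, the mask `<0, 1>` looks like an identity when judged by `Mask.size()` alone, but it actually extracts a two-element subvector, which is the kind of mismatch the extra parameter is meant to expose.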
2023-10-04Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst."Arthur Eubanks1-44/+31
This reverts commit b186f1f68be11630355afb0c08b80374a6d31782. Causes crashes, see https://reviews.llvm.org/D158449.
2023-10-04[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst.Alexey Bataev1-31/+44
Adds a NumSrcElts parameter to the is..Mask functions in the ShuffleVectorInst class for better mask analysis. Mask.size() does not always match the size of the permuted vector(s). This allows better cost estimation in SLP and fixes uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449
2023-10-03Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst."Alexey Bataev1-44/+31
This reverts commit 6f43d28f3452b3ef598bc12b761cfc2dbd0f34c9 to fix a crash reported in https://reviews.llvm.org/D158449.
2023-10-03[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst.Alexey Bataev1-31/+44
Adds a NumSrcElts parameter to the is..Mask functions in the ShuffleVectorInst class for better mask analysis. Mask.size() does not always match the size of the permuted vector(s). This allows better cost estimation in SLP and fixes uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449
2023-09-29Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst."Alexey Bataev1-44/+31
This reverts commit 9f5960e004ff54082ccfa9396522e07358f5b66b to fix buildbots reported here https://lab.llvm.org/buildbot/#/builders/230/builds/19412.
2023-09-29[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst.Alexey Bataev1-31/+44
Adds a NumSrcElts parameter to the is..Mask functions in the ShuffleVectorInst class for better mask analysis. Mask.size() does not always match the size of the permuted vector(s). This allows better cost estimation in SLP and fixes uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449
2023-09-28Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst."Alexey Bataev1-44/+31
This reverts commit c88c281cf1ac1a01c55231b93826d7c8ae83985b to fix the crash revealed by https://lab.llvm.org/buildbot/#/builders/230/builds/19353.
2023-09-28[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst.Alexey Bataev1-31/+44
Adds a NumSrcElts parameter to the is..Mask functions in the ShuffleVectorInst class for better mask analysis. Mask.size() does not always match the size of the permuted vector(s). This allows better cost estimation in SLP and fixes uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449
2023-09-22Use llvm::drop_begin and llvm::drop_end (NFC)Kazu Hirata1-1/+1
2023-09-19Move CallInst::CreateFree to IRBuilderBaseKonrad Kleine1-55/+0
Similarly to D158861 I'm moving the `CreateFree` method from `CallInst` to `IRBuilderBase`. Differential Revision: https://reviews.llvm.org/D159418
2023-09-19[llvm] Move CallInst::CreateMalloc to IRBuilderBase::CreateMallocKonrad Kleine1-125/+0
This removes `CreateMalloc` from `CallInst` and adds it to the `IRBuilderBase` class. We no longer need the `Instruction *InsertBefore` and `BasicBlock *InsertAtEnd` arguments of the `createMalloc` helper function because we're using `IRBuilder` now. That's why we also don't need four `CreateMalloc` functions, but only two. Differential Revision: https://reviews.llvm.org/D158861
2023-09-18[IR] Remove unnecessary bitcast from CreateMalloc()Nikita Popov1-15/+3
This bitcast is no longer necessary with opaque pointers. This results in some annoying variable name changes in tests.
2023-09-02[llvm] Use range-based for loops (NFC)Kazu Hirata1-2/+2
2023-08-30[RISCV][SelectionDAG] Lower shuffles as bitrotates with vror.vi when possibleLuke Lau1-0/+39
Given a shuffle mask like <3, 0, 1, 2, 7, 4, 5, 6> for v8i8, we can reinterpret it as a shuffle of v2i32 where the two i32s are bit rotated, and lower it as a vror.vi (if legal with zvbb enabled). We also need to make sure that the larger element type is a valid SEW, hence the tests for zve32x. X86 already did this, so I've extracted the logic for it and put it inside ShuffleVectorSDNode so it could be reused by RISC-V. I originally tried to add this as a generic combine in DAGCombiner.cpp, but it ended up causing worse codegen on X86 and PPC. Reviewed By: reames, pengfei Differential Revision: https://reviews.llvm.org/D157417
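The reinterpretation described above can be sketched in two steps: detect that every group of lanes is the same rotation of itself, then confirm on a concrete value that the byte shuffle equals a 32-bit rotate. All helper names are hypothetical, undefined lanes are not supported, and lanes are numbered from the least significant byte (a little-endian lane order is assumed).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the per-group lane offset if every group of NumSubElts lanes is the
// same rotation of itself, or -1 otherwise.
int rotateOffset(const std::vector<int> &Mask, int NumSubElts) {
  int Size = static_cast<int>(Mask.size());
  assert(NumSubElts > 0 && Size % NumSubElts == 0);
  int Offset = -1;
  for (int I = 0; I < Size; ++I) {
    int Base = I - I % NumSubElts; // first lane of this group
    int M = Mask[I];
    if (M < Base || M >= Base + NumSubElts)
      return -1; // lane is taken from outside its own group
    int ThisOffset = (M - I + NumSubElts) % NumSubElts;
    if (Offset == -1)
      Offset = ThisOffset;
    else if (Offset != ThisOffset)
      return -1; // groups (or lanes) disagree on the rotation
  }
  return Offset;
}

// Applies a 4-lane byte shuffle to one 32-bit value, lane 0 = lowest byte.
std::uint32_t shuffleBytes(std::uint32_t X, const std::vector<int> &Mask) {
  assert(Mask.size() == 4);
  std::uint32_t R = 0;
  for (int I = 0; I < 4; ++I) {
    std::uint32_t Byte = (X >> (8 * Mask[I])) & 0xFFu;
    R |= Byte << (8 * I);
  }
  return R;
}

// Rotates a 32-bit value left by Amt bits.
std::uint32_t rotl32(std::uint32_t X, unsigned Amt) {
  Amt %= 32;
  return Amt ? (X << Amt) | (X >> (32 - Amt)) : X;
}
```

For the mask `<3, 0, 1, 2, 7, 4, 5, 6>` with groups of four, `rotateOffset` reports a consistent offset of 3, and under the lane numbering assumed here the per-group shuffle `<3, 0, 1, 2>` matches a left rotate of each 32-bit element by 8 bits.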
2023-08-17[IR] Ignore the return value of std::remove_if (NFC)Jie Fu1-1/+1
/Users/jiefu/llvm-project/llvm/lib/IR/Instructions.cpp:166:3: error: ignoring return value of function declared with 'nodiscard' attribute [-Werror,-Wunused-result] std::remove_if(const_cast<block_iterator>(block_begin()), ^~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated.
2023-08-17[IR] Add PHINode::removeIncomingValueIf() (NFC)Nikita Popov1-0/+33
Add an API that allows removing multiple incoming phi values based on a predicate callback, as suggested on D157621. This makes sure that the removal is linear time rather than quadratic, and avoids subtleties around iterator invalidation. I have replaced some of the more straightforward users with the new API, though there's a couple more places that should be able to use it. Differential Revision: https://reviews.llvm.org/D158064
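The linear-time, predicate-based removal described above can be sketched with a single compaction pass. `PhiSketch` is a hypothetical stand-in that stores incoming values and predecessor blocks as plain integers; it does not model the real `PHINode` storage or its callback signature.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a phi node: parallel lists of incoming values and
// the predecessor block each value arrives from (ints used as ids).
struct PhiSketch {
  std::vector<int> Values;
  std::vector<int> Blocks;

  // Removes every incoming entry whose index satisfies Pred in one pass.
  // Kept entries are compacted toward the front, so the cost is linear in the
  // number of entries instead of quadratic from repeated single removals.
  void removeIncomingIf(const std::function<bool(std::size_t)> &Pred) {
    assert(Values.size() == Blocks.size());
    std::size_t Out = 0;
    for (std::size_t In = 0; In < Values.size(); ++In) {
      if (Pred(In))
        continue; // drop this entry
      Values[Out] = Values[In];
      Blocks[Out] = Blocks[In];
      ++Out;
    }
    Values.resize(Out);
    Blocks.resize(Out);
  }
};
```

Because the predicate is consulted once per original index and the caller never holds an iterator across the removal, the iterator-invalidation pitfalls of erasing inside a loop do not arise in this shape.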
2023-08-10[llvm] Drop some bitcasts and references related to typed pointersBjorn Pettersson1-13/+7
Differential Revision: https://reviews.llvm.org/D157551
2023-07-26[ADT] Support iterating size-based integer ranges.Ivan Kosarev1-1/+1
It seems the ranges start with 0 in most cases. Reviewed By: dblaikie, gchatelet Differential Revision: https://reviews.llvm.org/D156135
2023-07-18[llvm] Remove some uses of isOpaqueOrPointeeTypeEquals() (NFC)Nikita Popov1-14/+0
2023-07-14[llvm] Remove uses of hasSameElementTypeAs() (NFC)Nikita Popov1-9/+3
Always returns true with opaque pointers.
2023-06-13[IR] Update to use new shufflevector semanticsManuelJBrito1-1/+1
Update to use new shufflevector semantics for undefined values in the mask Differential Revision: https://reviews.llvm.org/D149548
2023-04-27[IR][NFC] Change UndefMaskElem to PoisonMaskElemManuelJBrito1-10/+10
Following the change in shufflevector semantics, poison will be used to represent undefined elements in shufflevector masks. Differential Revision: https://reviews.llvm.org/D149256
2023-04-05[InstCombine] Remove varargs cast transform (NFC)Nikita Popov1-17/+0
This is no longer relevant with opaque pointers. Also drop the CastInst::isLosslessCast() method, which was only used here.
2023-04-04[IR] Remove uses of the oddly named ConstantFP::getZeroValueForNegation in integer code.Craig Topper1-12/+12
Confusingly, ConstantFP's getZeroValueForNegation intentionally handles non-FP constants: it calls getNullValue in Constant. Nearly all uses in tree are for integers rather than FP, perhaps as a result of replacing the `FSub -0.0, X` idiom with FNeg instructions a few years ago. This patch replaces all the integer uses in tree with ConstantInt::get(Ty, 0). The one remaining use is in clang with a FIXME that it should use fneg. I'll fix that next and then delete ConstantFP::getZeroValueForNegation. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D147492
2023-03-14[RISCV][NFC] Share interleave mask checking logicLuke Lau1-0/+92
This adds two new methods to ShuffleVectorInst, isInterleave and isInterleaveMask, so that the logic to check if a shuffle mask is an interleave can be shared across the TTI, codegen and the interleaved access pass. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D145971
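As a rough illustration of what an interleave-mask check looks for, the sketch below accepts masks that take one element in turn from each of `Factor` equal-length chunks of the concatenated inputs. It is a hypothetical simplification: the chunk start positions are fixed at multiples of the chunk length and undefined lanes are not handled, which may be stricter than the shared LLVM logic.

```cpp
#include <cassert>
#include <vector>

// Hypothetical simplified check: lane I must read element (I / Factor) of
// chunk (I % Factor), where chunk K starts at index K * LaneLen.
bool isInterleaveMaskSketch(const std::vector<int> &Mask, int Factor) {
  assert(Factor >= 2 && "interleaving needs at least two chunks");
  int Size = static_cast<int>(Mask.size());
  if (Size == 0 || Size % Factor != 0)
    return false;
  int LaneLen = Size / Factor;
  for (int I = 0; I < Size; ++I) {
    int Expected = (I % Factor) * LaneLen + I / Factor;
    if (Mask[I] != Expected)
      return false;
  }
  return true;
}
```

For two four-element inputs, the mask `<0, 4, 1, 5, 2, 6, 3, 7>` passes with a factor of 2, while the plain concatenation `<0, 1, 2, 3, 4, 5, 6, 7>` does not.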
2023-03-07[IR] Add operator<< overload for CmpInst::Predicate (NFC)Nikita Popov1-0/+5
I regularly try and fail to use this while debugging.
2023-02-24IR: Add nofpclass parameter attributeMatt Arsenault1-0/+16
This carries a bitmask indicating forbidden floating-point value kinds in the argument or return value. This will enable interprocedural -ffinite-math-only optimizations. This is primarily to cover the no-nans and no-infinities cases, but also covers the other floating point classes for free. Textually, this provides a number of names corresponding to bits in FPClassTest, e.g. call nofpclass(nan inf) @must_be_finite() call nofpclass(snan) @cannot_be_snan() This is more expressive than the existing nnan and ninf fast math flags. As an added bonus, you can represent fun things like nanf: declare nofpclass(inf zero sub norm) float @only_nans() Compared to nnan/ninf: - Can be applied to individual call operands as well as the return value - Can distinguish signaling and quiet nans - Distinguishes the sign of infinities - Can be safely propagated since it doesn't imply anything about other operands. - Does not apply to FP instructions; it's not a flag This is one step closer to being able to retire "no-nans-fp-math" and "no-infs-fp-math". The one remaining situation where we have no way to represent no-nans/infs is for loads (if we wanted to solve this we could introduce !nofpclass metadata, following along with noundef/!noundef). This is to help simplify the GPU builtin math library distribution. Currently the library code has explicit finite math only checks, read from global constants the compiler driver needs to set based on the compiler flags during linking. We end up having to internalize the library into each translation unit in case different linked modules have different math flags. By propagating known-not-nan and known-not-infinity information, we can automatically prune the edge case handling in most functions if the function is only reached from fast math uses.
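The idea of a bitmask of forbidden floating-point classes can be sketched as follows. The enum is a coarse, hypothetical approximation: the commit message notes that the real attribute also distinguishes signaling from quiet NaNs and the sign of infinities, which this sketch does not.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical, coarse class bits for illustration only.
enum FPClassSketch : unsigned {
  ClassNaN = 1u << 0,
  ClassInf = 1u << 1,
  ClassZero = 1u << 2,
  ClassSubnormal = 1u << 3,
  ClassNormal = 1u << 4,
};

// Maps a value to exactly one of the coarse class bits above.
unsigned classify(double V) {
  switch (std::fpclassify(V)) {
  case FP_NAN:
    return ClassNaN;
  case FP_INFINITE:
    return ClassInf;
  case FP_ZERO:
    return ClassZero;
  case FP_SUBNORMAL:
    return ClassSubnormal;
  default:
    return ClassNormal;
  }
}

// True when V falls in a class that the mask marks as forbidden, that is,
// when a `nofpclass`-style promise with this mask would be broken by V.
bool violates(unsigned ForbiddenMask, double V) {
  return (classify(V) & ForbiddenMask) != 0;
}
```

A mask of `ClassNaN | ClassInf` plays the role of `nofpclass(nan inf)` from the text, and forbidding every class except NaN mirrors the `@only_nans` declaration.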