path: root/llvm/lib/CodeGen/CodeGenPrepare.cpp
Age | Commit message | Author | Files | Lines
4 days | AZero13 | 1 file, -0/+6
[CodeGenPrepare] Bail out of usubo creation if sub's parent is not the same as the comparison (#160358)
We match uadd's behavior here. Codegen comparison: https://godbolt.org/z/x8j4EhGno
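For context, the usubo pattern the pass forms is a wrapping subtract paired with a compare that computes the borrow. A minimal C++ sketch of that equivalence, using the GCC/Clang overflow builtin; the function names here are illustrative, not LLVM's:

```cpp
#include <cstdint>

// Manual form: the sub plus cmp that CodeGenPrepare pairs up into usubo.
bool subOverflowManual(uint32_t A, uint32_t B, uint32_t &Res) {
  Res = A - B;  // wrapping subtract
  return A < B; // the comparison that signals unsigned overflow (borrow)
}

// Intrinsic form, equivalent to llvm.usub.with.overflow at the IR level.
bool subOverflowIntrinsic(uint32_t A, uint32_t B, uint32_t &Res) {
  return __builtin_sub_overflow(A, B, &Res);
}
```

Forming the intrinsic only pays off when the sub and the cmp end up in the same block, which is what the bail-out above enforces.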
10 days | Jeffrey Byrnes | 1 file, -0/+13
[CodeGenPrepare] Consider target memory intrinsics as memory use (#159638)
When deciding to sink address instructions into their uses, we check whether it is profitable to do so. The profitability check is based on the types of uses of this address instruction: if there are users which are not memory instructions, then do not fold. However, this profitability check wasn't considering target intrinsics, which may be loads/stores. This adds some logic to handle target memory intrinsics.
13 days | Mingming Liu | 1 file, -4/+4
Re-apply "[NFCI][Globals] In GlobalObjects::setSectionPrefix, do conditional update if existing prefix is not equivalent to the new one. Returns whether prefix changed." (#159161)
This is a reland of https://github.com/llvm/llvm-project/pull/158460. Test failures are gone once I undo the changes in CodeGenPrepare.
13 days | Mingming Liu | 1 file, -4/+4
Revert "[NFCI][Globals] In GlobalObjects::setSectionPrefix, do conditional update if existing prefix is not equivalent to the new one. Returns whether prefix changed." (#159159)
Reverts llvm/llvm-project#158460 due to buildbot failures.
13 days | Mingming Liu | 1 file, -4/+4
[NFCI][Globals] In GlobalObjects::setSectionPrefix, do conditional update if existing prefix is not equivalent to the new one. Returns whether prefix changed. (#158460)
Before this change, `setSectionPrefix` overwrites the existing section prefix with the new one unconditionally. After this change, `setSectionPrefix` checks for equivalence, updates conditionally, and returns whether an update happened. Update the existing callers to make use of the return value. [PR 155337](https://github.com/llvm/llvm-project/pull/155337/files#diff-cc0c67ac89807f4453f0cfea9164944a4650cd6873a468a0f907e7158818eae9) is a motivating use case where the 'update' semantics are needed.
2025-09-04 | Nikita Popov | 1 file, -16/+3
[CodeGen] Remove ExpandInlineAsm hook (#156617)
This hook replaces inline asm with LLVM intrinsics. It was intended to match inline assembly implementations of bswap in libc headers and replace them with more optimizable implementations. At this point, it has outlived its usefulness (see https://github.com/llvm/llvm-project/issues/156571#issuecomment-3247638412), as libc implementations no longer use inline assembly for this purpose. Additionally, it breaks the "black box" property of inline assembly, which some languages like Rust would like to guarantee. Fixes https://github.com/llvm/llvm-project/issues/156571.
2025-08-18 | Kazu Hirata | 1 file, -8/+7
[llvm] Replace SmallSet with SmallPtrSet (NFC) (#154068)
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer element types:
```
template <typename PointeeType, unsigned N>
class SmallSet<PointeeType *, N> : public SmallPtrSet<PointeeType *, N> {};
```
We only have 140 instances that rely on this "redirection", with the vast majority of them under llvm/. Since relying on the redirection doesn't improve readability, this patch replaces SmallSet with SmallPtrSet for pointer element types.
2025-08-05 | Paul Walker | 1 file, -1/+1
[LLVM][CGP] Allow finer control for sinking compares. (#151366)
Compare sinking is selectable based on the result of hasMultipleConditionRegisters. This function is too coarse grained by not taking into account the differences between scalar and vector compares. This PR extends the interface to take an EVT to allow finer control. The new interface is used by AArch64 to disable sinking of scalable vector compares, but with isProfitableToSinkOperands updated to maintain the cases that are specifically tested.
2025-07-31 | Phoebe Wang | 1 file, -0/+23
[X86][APX] Do optimizeMemoryInst for v1X masked load/store (#151331)
Fix redundant LEA: https://godbolt.org/z/34xEYE818
2025-07-26 | Yingwei Zheng | 1 file, -0/+4
[CodeGenPrepare] Make sure that `AddOffset` is also a loop invariant (#150625)
Closes https://github.com/llvm/llvm-project/issues/150611.
2025-07-21 | Jeremy Morse | 1 file, -3/+1
[DebugInfo] Remove intrinsic-flavours of findDbgUsers (#149816)
This removes one of the final remaining debug-intrinsic-specific codepaths out there, along with pieces of cross-LLVM infrastructure to do with debug intrinsics.
2025-07-18 | Jeremy Morse | 1 file, -54/+5
[DebugInfo] Suppress lots of users of DbgValueInst (#149476)
This is another prune of dead code: we never generate debug intrinsics nowadays, therefore there's no need for these codepaths to run.
Co-authored-by: Nikita Popov <github@npopov.com>
2025-07-16 | Jeremy Morse | 1 file, -2/+2
[DebugInfo] Remove getPrevNonDebugInstruction (#148859)
With the advent of intrinsic-less debug-info, we no longer need to scatter calls to getPrevNonDebugInstruction around the codebase. Remove most of them; there are one or two that have the "SkipPseudoOp" flag turned on, however they don't seem to be in positions where skipping anything would be reasonable.
2025-07-15 | Jeremy Morse | 1 file, -1/+1
[DebugInfo][RemoveDIs] Suppress getNextNonDebugInfoInstruction (#144383)
There are no longer debug-info instructions, thus we don't need this skipping. Hooray!
2025-06-28 | Evgenii Kudriashov | 1 file, -0/+7
[CodeGenPrepare] Filter out unrecreatable addresses from memory optimization (#143566)
Follow-up on #139303.
2025-06-17 | Jeremy Morse | 1 file, -10/+4
[DebugInfo][RemoveDIs] Remove a swathe of debug-intrinsic code (#144389)
Seeing how we can't generate any debug intrinsics any more: delete a variety of codepaths where they're handled. For the most part these are plain deletions; in others I've tweaked comments to remain coherent, or added a type to what were type-generic lambdas. This isn't all the DbgInfoIntrinsic call sites, but it's most of the simple scenarios.
Co-authored-by: Nikita Popov <github@npopov.com>
2025-06-12 | Jeremy Morse | 1 file, -2/+1
[DebugInfo][RemoveDIs] Delete debug-info-format flag (#143746)
This flag was used to let us incrementally introduce debug records into LLVM; however, everything is now using records. It serves no purpose now, so delete it.
2025-06-06 | Florian Hahn | 1 file, -2/+11
[CGP] Bail out if (Base|Scaled)Reg does not dominate insert point. (#142949)
(Base|Scaled)Reg may not dominate the chosen insert point, if there are multiple uses of the address. Bail out if that's the case, otherwise we will generate invalid IR. In some cases, we could probably adjust the insert point or hoist the (Base|Scaled)Reg. Fixes https://github.com/llvm/llvm-project/issues/142830. PR: https://github.com/llvm/llvm-project/pull/142949
2025-06-03 | mikael-nilsson-arm | 1 file, -5/+11
[CodeGenPrepare] Fix signed overflow (#141487)
The signed addition could overflow, which is undefined behavior; the code now checks for it.
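The commit message is terse; as an illustration of the kind of guard it describes (not the actual pass code), a signed add can be checked for overflow with the GCC/Clang builtin instead of performing a possibly-UB plain addition:

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper: returns the sum only when it does not overflow.
// A plain `A + B` on overflowing int64_t inputs would be undefined behavior.
std::optional<int64_t> checkedAdd(int64_t A, int64_t B) {
  int64_t Sum;
  if (__builtin_add_overflow(A, B, &Sum))
    return std::nullopt; // overflow detected, no UB triggered
  return Sum;
}
```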
2025-05-22 | Tim Gymnich | 1 file, -0/+1
Reland [llvm] add GenericFloatingPointPredicateUtils #140254 (#141065)
#140254 was previously missing 2 files in the bazel build config.
2025-05-21 | Kewen12 | 1 file, -1/+0
Revert "[llvm] add GenericFloatingPointPredicateUtils (#140254)" (#140968)
This reverts commit d00d74bb2564103ae3cb5ac6b6ffecf7e1cc2238. The PR breaks our buildbots and blocks downstream merge.
2025-05-21 | Tim Gymnich | 1 file, -0/+1
[llvm] add GenericFloatingPointPredicateUtils (#140254)
Add `GenericFloatingPointPredicateUtils` in order to generalize effects of floating point comparisons on `KnownFPClass` for both IR and MIR.
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-05-15 | weiguozhi | 1 file, -5/+36
[CodeGenPrepare] Make sure instruction get from SunkAddrs is before MemoryInst (#139303)
Function optimizeBlock may do optimizations on a block multiple times. In the first iteration of the loop, MemoryInst1 may generate a sunk instruction and store it into SunkAddrs. In the second iteration, MemoryInst2 may use the same address and can then reuse the sunk instruction stored in SunkAddrs, but MemoryInst2 may come before MemoryInst1 and the corresponding sunk instruction. To avoid a use-before-def error, we need to find an appropriate insert position for the sunk instruction. Fixes #138208.
2025-05-08 | Matt Arsenault | 1 file, -0/+3
Reapply "IR: Remove uselist for constantdata (#137313)" (#138961)
This reverts commit 5936c02c8b9c6d1476f7830517781ce8b6e26e75. Fix checking uselists of constants in assume bundle queries.
2025-05-07 | Kirill Stoimenov | 1 file, -3/+0
Revert "IR: Remove uselist for constantdata (#137313)"
Possibly breaks the build: https://lab.llvm.org/buildbot/#/builders/24/builds/8119
This reverts commit 87f312aad6ede636cd2de5d18f3058bf2caf5651.
2025-05-06 | Matt Arsenault | 1 file, -0/+3
IR: Remove uselist for constantdata (#137313)
This is a resurrected version of the patch attached to this RFC: https://discourse.llvm.org/t/rfc-constantdata-should-not-have-use-lists/42606

In this adaptation, there are a few differences. In the original patch, the Use's use list was replaced with an unsigned* to the reference count in the value. This version leaves them as null and leaves the ref counting only in Value.

Remove use-lists from instances of ConstantData (which are shared across modules and have no operands). To continue supporting most of the use-list API, store a ref-count in place of the use-list; this is for API like Value::use_empty and Value::hasNUses. Operations that actually need the use-list, like Value::use_begin, will assert.

This change has three benefits:
1. The compiler output cannot in any way depend on the use-list order of instances of ConstantData.
2. There's no use-list traffic when adding and removing simple constants from operand lists (although there is ref-count traffic; YMMV).
3. It's cheaper to serialize use-lists (since we're no longer serializing the use-list order of things like i32 0).

The downside is that you can't look at all the users of ConstantData, but traversals of users of i32 0 are already ill-advised.

Possible follow-ups:
- Track if an instance of a ConstantVector/ConstantArray/etc. is known to have all ConstantData arguments, and drop the use-lists to ref-counts in those cases. Callers need to check Value::hasUseList before iterating through the use-list.
- Remove even the ref-counts. I'm not sure they have any benefit besides minimizing the scope of this commit, and maintaining the counts is not free.

Fixes #58629
Co-authored-by: Duncan P. N. Exon Smith <dexonsmith@apple.com>
2025-04-29 | Sergei Barannikov | 1 file, -2/+2
[CGP] Despeculate ctlz/cttz with "illegal" integer types (#137197)
The code below the removed check looks generic enough to support arbitrary integer widths. This change helps 32-bit targets avoid expensive expansion/libcalls in the case of zero input. Pull Request: https://github.com/llvm/llvm-project/pull/137197
2025-04-23 | Sergei Barannikov | 1 file, -28/+73
[CodeGenPrepare] Unfold slow ctpop when used in power-of-two test (#102731)
DAG combiner already does this transformation, but in some cases it does not have a chance because either CodeGenPrepare or SelectionDAGBuilder move icmp to a different basic block. https://alive2.llvm.org/ce/z/ARzh99 Fixes #94829. Pull Request: https://github.com/llvm/llvm-project/pull/102731
2025-04-18 | Matt Arsenault | 1 file, -1/+1
CodeGenPrepare: Check use_empty instead of getNumUses == 0 (#136334)
2025-04-18 | Kazu Hirata | 1 file, -3/+2
[CodeGen] Construct SmallVector with iterator ranges (NFC) (#136258)
2025-04-02 | Ryan Buchner | 1 file, -1/+2
[CodeGenPrepare][RISCV] Combine (X ^ Y) and (X == Y) where appropriate (#130922)
Fixes #130510. In RISCV, modify the folding of (X ^ Y == 0) -> (X == Y) to account for cases where the (X ^ Y) will be re-used. If a constant is being used for the XOR before a branch, ensure that it is small enough to fit within a 12-bit immediate field. Otherwise, the equality check is more efficient than the check against 0, see the following:
```
# %bb.0:
	lui	a1, 5
	addiw	a1, a1, 1365
	xor	a0, a0, a1
	beqz	a0, .LBB0_2
# %bb.1:
	ret
.LBB0_2:
```
```
# %bb.0:
	lui	a1, 5
	addiw	a1, a1, 1365
	beq	a0, a1, .LBB0_2
# %bb.1:
	xor	a0, a0, a1
	ret
.LBB0_2:
```
Similarly, if the XOR is between 1 and a size one integer, we should still fold away the XOR since that comparison can be optimized as a comparison against 0.
```
# %bb.0:
	slt	a0, a0, a1
	xor	a0, a0, 1
	beqz	a0, .LBB0_2
# %bb.1:
	ret
.LBB0_2:
```
```
# %bb.0:
	slt	a0, a0, a1
	bnez	a0, .LBB0_2
# %bb.1:
	xor	a0, a0, 1
	ret
.LBB0_2:
```
One question about my code is that I used a hard-coded value for the width of a RISC-V ALU immediate. Do you know of a way that I can gather this from the `context`? I was unable to devise one.
2025-03-27 | Kazu Hirata | 1 file, -2/+1
[llvm] Use *Set::insert_range (NFC) (#133353)
We can use *Set::insert_range to collapse:
```
for (auto Elem : Range)
  Set.insert(Elem.first);
```
down to:
```
Set.insert_range(llvm::make_first_range(Range));
```
In some cases, we can further fold that into the set declaration.
2025-03-23 | Kazu Hirata | 1 file, -2/+1
[CodeGen] Use *Set::insert_range (NFC) (#132651)
We can use *Set::insert_range to collapse:
```
for (auto Elem : Range)
  Set.insert(Elem);
```
down to:
```
Set.insert_range(Range);
```
2025-03-23 | Kazu Hirata | 1 file, -2/+1
[llvm] Use range constructors for *Set (NFC) (#132636)
2025-03-22 | Kazu Hirata | 1 file, -1/+1
[llvm] Use *Set::insert_range (NFC) (#132509)
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently gained C++23-style insert_range. This patch uses insert_range in conjunction with llvm::{predecessors,successors} and MachineBasicBlock::{predecessors,successors}.
2025-03-20 | Kazu Hirata | 1 file, -1/+1
[llvm] Use *Set::insert_range (NFC) (#132325)
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently gained C++23-style insert_range. This patch replaces:
```
Dest.insert(Src.begin(), Src.end());
```
with:
```
Dest.insert_range(Src);
```
This patch does not touch custom begin like succ_begin for now.
2025-03-09 | Kazu Hirata | 1 file, -3/+3
[CodeGen] Avoid repeated hash lookups (NFC) (#130543)
2025-01-30 | Yingwei Zheng | 1 file, -0/+11
[CodeGenPrepare] Replace deleted ext instr with the promoted value. (#71058)
This PR replaces the deleted ext with the promoted value in `AddrMode`. Fixes #70938.
2025-01-27 | Jeremy Morse | 1 file, -1/+1
[NFC][DebugInfo] Make some block-start-position methods return iterators (#124287)
As part of the "RemoveDIs" work to eliminate debug intrinsics, we're replacing methods that use Instruction*'s as positions with iterators. A number of these (such as getFirstNonPHIOrDbg) are sufficiently infrequently used that we can just replace the pointer-returning version with an iterator-returning version, hopefully without much/any disruption. Thus this patch has getFirstNonPHIOrDbg and getFirstNonPHIOrDbgOrLifetime return an iterator, and updates all call-sites. There are no concerns about the iterators returned being converted to Instruction*'s and losing the debug-info bit: because the methods skip debug intrinsics, the iterator head bit is always false anyway.
2025-01-27 | Jeremy Morse | 1 file, -11/+11
[NFC][DebugInfo] Use iterators for instruction insertion in more places (#124291)
As part of the "RemoveDIs" work to eliminate debug intrinsics, we're replacing methods that use Instruction*'s as positions with iterators. This patch changes some more complex call-sites, those crossing file boundaries and where I've had to perform some minor rewrites.
2025-01-24 | Jeremy Morse | 1 file, -16/+16
[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago; but to ensure some type safety, we'd like to have all calls to moveBefore use iterators. This patch adds a (guaranteed dereferenceable) iterator-taking moveBefore, and changes a bunch of call-sites where it's obviously safe to change to use it by just calling getIterator() on an instruction pointer. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer insertBefore, but not before adding concise documentation of what considerations are needed (very few).
2025-01-16 | Pedro Lobo | 1 file, -4/+4
[CodeGenPrepare] Replace `undef` use with `poison` [NFC] (#123111)
When generating a constant vector, if `UseSplat` is false, the indices different from the index of the extract can be filled with `poison` instead of `undef`.
2025-01-15 | Kazu Hirata | 1 file, -2/+2
[CodeGen] Avoid repeated hash lookups (NFC) (#123016)
2025-01-08 | Ryan Mansfield | 1 file, -1/+1
[LLVM] Fix various cl::desc typos and whitespace issues (NFC) (#121955)
2024-12-13 | Ramkumar Ramachandra | 1 file, -2/+2
PatternMatch: migrate to CmpPredicate (#118534)
With the introduction of CmpPredicate in 51a895a (IR: introduce struct with CmpInst::Predicate and samesign), PatternMatch is one of the first key pieces of infrastructure that must be updated to match a CmpInst respecting samesign information. Implement this change to Cmp-matchers. This is a preparatory step in migrating the codebase over to CmpPredicate. Since no functional changes are desired at this stage, we have chosen not to migrate CmpPredicate::operator==(CmpPredicate) calls to use CmpPredicate::getMatching(), as that would have visible impact on tests that are not yet written: instead, we call CmpPredicate::operator==(Predicate), preserving the old behavior, while also inserting a few FIXME comments for follow-ups.
2024-12-01 | Yingwei Zheng | 1 file, -0/+7
[CodeGenPrepare] Drop nsw flags in `optimizeLoadExt` (#118180)
Alive2: https://alive2.llvm.org/ce/z/pMcD7q Closes https://github.com/llvm/llvm-project/issues/118172.
2024-11-12 | Kazu Hirata | 1 file, -1/+0
[CodeGen] Remove unused includes (NFC) (#115996)
Identified with misc-include-cleaner.
2024-10-31 | goldsteinn | 1 file, -9/+49
[CGP] [CodeGenPrepare] Folding `urem` with loop invariant value plus offset (#104724)
This extends the existing fold:
```
for (i = Start; i < End; ++i)
  Rem = (i nuw+- IncrLoopInvariant) u% RemAmtLoopInvariant;
```
->
```
Rem = (Start nuw+- IncrLoopInvariant) % RemAmtLoopInvariant;
for (i = Start; i < End; ++i, ++Rem)
  Rem = Rem == RemAmtLoopInvariant ? 0 : Rem;
```
to work with a non-zero `IncrLoopInvariant`. This is a common usage in cases such as:
```
for (i = 0; i < N; ++i)
  if (((i + 1) % X) == 0)
    do_something_occasionally_but_not_first_iter();
```
Alive2 w/ i4/unrolled 6x (needs to be run locally due to timeout): https://alive2.llvm.org/ce/z/6tgyN3
Exhaustive proof over all uint8_t combinations in C++: https://godbolt.org/z/WYa561388
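The fold above can be demonstrated in plain C++: the per-iteration modulo is replaced by a running remainder that is incremented and wrapped at the loop-invariant amount. This is a hedged sketch with our own names, following the commit's pseudocode rather than the pass itself:

```cpp
#include <cstdint>
#include <vector>

// Direct form: one urem per iteration.
std::vector<uint64_t> remDirect(uint64_t Start, uint64_t End,
                                uint64_t Incr, uint64_t RemAmt) {
  std::vector<uint64_t> Out;
  for (uint64_t I = Start; I < End; ++I)
    Out.push_back((I + Incr) % RemAmt);
  return Out;
}

// Strength-reduced form: one urem before the loop, then add-and-wrap.
std::vector<uint64_t> remIncremental(uint64_t Start, uint64_t End,
                                     uint64_t Incr, uint64_t RemAmt) {
  std::vector<uint64_t> Out;
  uint64_t Rem = (Start + Incr) % RemAmt; // value for I == Start
  for (uint64_t I = Start; I < End; ++I) {
    Out.push_back(Rem);
    ++Rem;                // advance with I
    if (Rem == RemAmt)
      Rem = 0;            // wrap instead of taking a fresh urem
  }
  return Out;
}
```

Since Rem always stays below RemAmt, the increment-and-wrap step reproduces ((I + 1) + Incr) % RemAmt exactly, so both functions return the same sequence.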
2024-10-28 | Ellis Hoag | 1 file, -6/+3
Check hasOptSize() in shouldOptimizeForSize() (#112626)
2024-10-24 | Nuno Lopes | 1 file, -1/+1
replace 2 placeholder uses of undef with poison [NFC]