path: root/llvm/lib/CodeGen/InlineSpiller.cpp
Age | Commit message | Author | Files | Lines
3 days[CodeGen] Finish untangling LRE::scanRemattable [nfc] (#161963)Philip Reames1-6/+15
This is an attempt to simplify the rematerialization logic in InlineSpiller and SplitKit. I'd earlier done the same for RegisterCoalescer in 57b673. The basic idea of this change is that we don't need to check whether an instruction is rematerializable early. Instead, we can defer the check to the point where we're actually trying to materialize something. We also don't need to indirect that query through a VNI key, and can instead just check the instruction directly at the use site.
14 days[RegAlloc] Add additional tracing in InlineSpiller::rematerializeFor (#160761)Philip Reames1-2/+11
We didn't have trace logging for two cases in this routine, which sometimes makes it hard to tell what is going on. In addition to debug trace statements, add comments to explain the logic behind the early exits which don't mark the virtual register live. Suggestions on how to word these more precisely are very welcome; I'm not sure I understand all the intricacies of this code myself.
2025-07-10[InlineSpiller] Drop unused elements in Virt2SiblingsMap. NFC (#147866)csstormq1-1/+1
2025-05-24[CodeGen] Remove unused includes (NFC) (#141320)Kazu Hirata1-1/+0
These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.
2025-04-28[InlineSpiller] Check rematerialization before folding operand (#134015)weiguozhi1-7/+31
The current implementation tries to fold the operand before rematerializing because folding uses one fewer register. But if a physical register is available, we can still rematerialize without causing high register pressure. This patch performs that check to find the better choice. Then we can produce

    xorps %xmm1, %xmm1
    ucomiss %xmm1, %xmm0

instead of

    ucomiss LCPI0_1(%rip), %xmm0
2025-03-20[llvm] Use *Set::insert_range (NFC) (#132325)Kazu Hirata1-1/+1
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently gained C++23-style insert_range. This patch replaces:

    Dest.insert(Src.begin(), Src.end());

with:

    Dest.insert_range(Src);

This patch does not touch custom begin like succ_begin for now.
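The replacement described above can be sketched without the LLVM headers. `TinySet` and `bothSpellingsAgree` below are hypothetical stand-ins written for this illustration, not LLVM's `DenseSet` or any real ADT API; the sketch only assumes that `insert_range` accepts any range usable with `std::begin`/`std::end`.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <vector>

// Hypothetical stand-in for an LLVM-style set; NOT the real llvm::DenseSet.
template <typename T>
class TinySet {
  std::set<T> Storage;

public:
  // Older spelling: the caller passes an explicit iterator pair.
  template <typename It>
  void insert(It First, It Last) { Storage.insert(First, Last); }

  // Range spelling: the caller passes the whole container.
  template <typename Range>
  void insert_range(Range &&R) {
    Storage.insert(std::begin(R), std::end(R));
  }

  bool contains(const T &V) const { return Storage.count(V) != 0; }
  std::size_t size() const { return Storage.size(); }
};

// Fills one set with each spelling and reports whether the results agree.
inline bool bothSpellingsAgree(const std::vector<int> &Src) {
  TinySet<int> A, B;
  A.insert(Src.begin(), Src.end()); // before the patch
  B.insert_range(Src);              // after the patch
  if (A.size() != B.size())
    return false;
  for (int V : Src)
    if (!A.contains(V) || !B.contains(V))
      return false;
  return true;
}
```

The range form removes the chance of pairing `begin()` and `end()` from two different containers, which is the main practical benefit of the mechanical rewrite.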
2025-03-14[CodeGen] Remove parameter from LiveRangeEdit::canRematerializeAt [NFC]Philip Reames1-1/+1
Only one caller cares about the true case of this parameter, so move the check to that single caller. Note that RegisterCoalescer seems like it should care, but it already duplicates the check several lines above.
2025-03-02[InlineSpiller] Use Register. NFCCraig Topper1-10/+10
2025-02-22[CodeGen] Avoid repeated hash lookups (NFC) (#128300)Kazu Hirata1-12/+10
2025-02-19[CodeGen] Remove static member functions Register::stackSlot2Index/isStackSlot. NFCCraig Topper1-2/+1
Migrate the few users to the nonstatic member functions.
2025-02-02[CodeGen] Avoid repeated hash lookups (NFC) (#125382)Kazu Hirata1-4/+6
2025-02-01[CodeGen][NFC] Remove redundant map lookup (#125342)Balazs Benics1-3/+6
2025-01-26[CodeGen] Avoid repeated hash lookups (NFC) (#124455)Kazu Hirata1-1/+2
2025-01-13[aarch64][win] Update Called Globals info when updating Call Site info (#122762)Daniel Paoliello1-3/+3
Fixes the "use after poison" issue introduced by #121516 (see <https://github.com/llvm/llvm-project/pull/121516#issuecomment-2585912395>). The root cause of this issue is that #121516 introduced "Called Global" information for call instructions modeling how "Call Site" info is stored in the machine function, HOWEVER it didn't copy the copy/move/erase operations for call site information. The fix is to rename and update the existing copy/move/erase functions so they also take care of Called Global info.
2025-01-13Reapply "Spiller: Detach legacy pass and supply analyses instead (#119181)" (#122665)Akshat Oke1-25/+15
Makes Inline Spiller amenable to the new PM. This reapplies commit a531800344dc54e9c197a13b22e013f919f3f5e1, which was reverted because of two unused private members reported on sanitizer bots.
2025-01-10Revert "Spiller: Detach legacy pass and supply analyses instead (#119181)" (#122426)Akshat Oke1-14/+22
This reverts commit a531800344dc54e9c197a13b22e013f919f3f5e1.
2025-01-10Spiller: Detach legacy pass and supply analyses instead (#119181)Akshat Oke1-22/+14
Makes Inline Spiller amenable to the new PM.
2024-12-06[CodeGen][NewPM] Port LiveStacks analysis to NPM (#118778)Akshat Oke1-2/+2
2024-11-12[CodeGen] Remove unused includes (NFC) (#115996)Kazu Hirata1-1/+0
Identified with misc-include-cleaner.
2024-10-02[CodeGen][RAGreedy] Inform LiveDebugVariables about snippets spilled by InlineSpiller. (#109962)Bevin Hansson1-1/+14
RAGreedy invokes InlineSpiller to spill a particular virtreg inline. When the spiller does this, it also identifies small, adjacent live ranges called snippets. These are also spilled or rematerialized in the process. However, the spiller does not inform RA that it has spilled these regs. This means that debug variable locations referencing these regs/ranges are lost.

Mark any spilled regs which do not have a stack slot assigned to them as allocated to the slot being spilled to. This tells LDV that those regs are located in that slot, even though the regs might no longer exist in the program after regalloc is finished. Also, inform RA about all of the regs which were replaced (spilled or rematted), not just the one that was requested, so that it can properly manage the ranges of the debug vars.
2024-09-19[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133)Jay Foad1-1/+1
It is almost always simpler to use {} instead of std::nullopt to initialize an empty ArrayRef. This patch changes all occurrences I could find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor could be deprecated or removed.
2024-07-12[CodeGen][NewPM] Port `machine-block-freq` to new pass manager (#98317)paperchalice1-2/+4
- Add `MachineBlockFrequencyAnalysis`.
- Add `MachineBlockFrequencyPrinterPass`.
- Use `MachineBlockFrequencyInfoWrapperPass` in legacy pass manager.
- `LazyMachineBlockFrequencyInfo::print` is empty, drop it due to new pass manager migration.
2024-07-10[CodeGen][NewPM] Port `LiveIntervals` to new pass manager (#98118)paperchalice1-2/+2
- Add `LiveIntervalsAnalysis`.
- Add `LiveIntervalsPrinterPass`.
- Use `LiveIntervalsWrapperPass` in legacy pass manager.
- Use `std::unique_ptr` instead of raw pointer for `LICalc`, so destructor and default move constructor can handle it correctly.

This would be the last analysis required by `PHIElimination`.
2024-06-27[NFC][RegAlloc] Delete unused optionAiden Grossman1-2/+0
The option -disable-spill-hoist does not actually control anything and is not used anywhere, so it should be removed.
2024-06-11[CodeGen][NewPM] Split `MachineDominatorTree` into a concrete analysis result (#94571)paperchalice1-8/+8
Prepare for new pass manager version of `MachineDominatorTreeAnalysis`. We may need a machine dominator tree version of `DomTreeUpdater` to handle `SplitCriticalEdge` in some CodeGen passes.
2024-04-24[CodeGen] Make the parameter TRI required in some functions. (#85968)Xu Zhang1-1/+1
Fixes #82659

There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue #82411.

Following @RKSimon's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the call sites of these functions have been updated to ensure no additional impact.

After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
2023-12-03[llvm] Stop including vector (NFC)Kazu Hirata1-1/+0
Identified with clangd.
2023-11-16Remove unused LoopInfo from InlineSpiller and SpillPlacement (NFC) (#71874)Matthias Braun1-7/+2
2023-11-16[AMDGPU] RA inserted scalar instructions can be at the BB top (#72140)Christudasan Devadasan1-1/+1
We adjust the insertion point at the BB top for spills/copies during RA to ensure they are placed after the exec restore instructions required for the divergent control flow execution. This is, however, required only for the vector operations. The insertions for scalar registers can still go to the BB top.
2023-10-26[Inline Spiller] Consider bundles when marking defs as deadPiotr Sobczak1-0/+29
Fix a bug where the code expects just a single MI, but a series of bundled MIs needs to be handled instead.

The semi-formed bundles are created by SplitKit for the case where not all lanes are live (buildSingleSubRegCopy). Then the remat kicks in, and since the values that are copied in the bundle do not need to be preserved due to the remat (dead defs), all instructions in the bundle should be marked as dead. However, only the first one gets marked as dead, which causes the verifier to complain later with the error: "Live range continues after dead def flag".

Differential Revision: https://reviews.llvm.org/D156999
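The shape of this fix can be illustrated with a toy model. `ToyInstr` and `markBundleDefsDead` are hypothetical names invented for this sketch; they do not mirror the real `MachineInstr` API or the code in InlineSpiller.cpp. The sketch assumes a bundle is a run of consecutive instructions in which every member after the first is flagged as bundled with its predecessor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction (hypothetical): only the two properties the sketch needs.
struct ToyInstr {
  bool BundledWithPred; // true if this instruction continues a bundle
  bool DefIsDead;       // stands in for the "dead" flag on the def operand
};

// Marks the def dead on every instruction of the bundle that contains
// Instrs[Idx], rather than on a single instruction only.
inline void markBundleDefsDead(std::vector<ToyInstr> &Instrs, std::size_t Idx) {
  if (Idx >= Instrs.size())
    return;
  // Walk back to the first instruction of the bundle.
  std::size_t Begin = Idx;
  while (Begin > 0 && Instrs[Begin].BundledWithPred)
    --Begin;
  // Walk forward to one past the last instruction of the bundle.
  std::size_t End = Begin + 1;
  while (End < Instrs.size() && Instrs[End].BundledWithPred)
    ++End;
  for (std::size_t I = Begin; I < End; ++I)
    Instrs[I].DefIsDead = true;
}
```

An unbundled instruction is simply a bundle of length one, so the same routine covers the single-MI case the original code assumed.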
2023-10-20InlineSpiller: Delete assert that implicit_def has no implicit operands (#69087)Matt Arsenault1-2/+1
It's not a verifier enforced property that implicit_def may only have one operand. Fixes assertions after the coalescer implicit-defs to preserve super register liveness to arbitrary instructions. For some reason I'm unable to reproduce this as a MIR test running only the allocator for the x86 test. Not sure it's worth keeping around.
2023-10-02Revert "InlineSpiller: Consider if all subranges are the same when avoiding redundant spills"JP Lehr1-32/+1
This reverts commit d8127b2ba8a87a610851b9a462f2fc2526c36e37.
2023-10-01InlineSpiller: Consider if all subranges are the same when avoiding redundant spillsMatt Arsenault1-1/+32
This avoids some redundant spills of subranges, and avoids a compile failure. This greatly reduces the numbers of spills in a loop.

The main range is not informative when multiple instructions are needed to fully define a register. A common scenario is a lowered reg_sequence where every subregister is sequentially defined, but each def changes the main range's value number. If we look at specific lanes at the use index, we can see the value is actually the same.

In this testcase, there are a large number of materialized 64-bit constant defs which are hoisted outside of the loop by MachineLICM. These are feeding REG_SEQUENCES, which is not considered rematerializable inside the loop. After coalescing, the split constant defs produce main ranges with an apparent phi def. There's no phi def if you look at each individual subrange, and only half of the register is really redefined to a constant.

Fixes: SWDEV-380865
https://reviews.llvm.org/D147079
2023-07-31Reapply "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"Matt Arsenault1-16/+18
This reverts commit a496c8be6e638ae58bb45f13113dbe3a4b7b23fd. The workaround in c26dfc81e254c78dc23579cf3d1336f77249e1f6 should work around the underlying problem with SUBREG_TO_REG.
2023-07-26Revert "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"Vitaly Buka1-18/+16
And dependent commits. Details in D150388.

This reverts commit 825b7f0ca5f2211ec3c93139f98d1e24048c225c.
This reverts commit 7a98f084c4d121244ef7286bc6503b6a181d446e.
This reverts commit b4a62b1fa546312d882fa12dfdcd015177d66826.
This reverts commit b7836d856206ec39509d42529f958c920368166b.

No conflicts in the code, few tests had conflicts in autogenerated CHECKs:
llvm/test/CodeGen/Thumb2/mve-float32regloops.ll
llvm/test/CodeGen/AMDGPU/fix-frame-reg-in-custom-csr-spills.ll

Reviewed By: alexfh
Differential Revision: https://reviews.llvm.org/D156381
2023-07-17InlineSpiller: Fix copy identification bugs in isCopyOfBundleMatt Arsenault1-3/+4
Noticed by inspection of b7836d856206ec39509d42529f958c920368166b. This was checking if the first instruction was a copy, not the current MI. It should fully respect the isCopyInstr result. Hopefully this fixes a reported regression which we can extract a test from.
2023-07-07[CodeGen]Allow targets to use target specific COPY instructions for live range splittingYashwant Singh1-14/+15
Replacing D143754. Right now the LiveRangeSplitting during register allocation uses the TargetOpcode::COPY instruction for splitting. For the AMDGPU target that creates a problem, as we have both vector and scalar copies. Vector copies perform a copy over a vector register but only on the lanes (threads) that are active. This is mostly sufficient; however, we do run into cases where we have to copy the entire vector register and not just the active lane data. One major place where we need that is live range splitting.

Allowing targets to use their own copy instructions (if defined) will provide a lot of flexibility and make it easier to lower these pseudo instructions to correct MIR.

- Introduce getTargetCopyOpcode() virtual function and use it to generate copy in live range splitting.
- Replace necessary MI.isCopy() checks with TII.isCopyInstr() in register allocator pipeline.

Reviewed By: arsenm, cdevadas, kparzysz
Differential Revision: https://reviews.llvm.org/D150388
2023-06-20InlineSpiller: Consider copy bundles when looking for snippet copiesMatt Arsenault1-25/+69
This was looking for full copies produced by SplitKit, but SplitKit introduces copy bundles if not all lanes are live. The scan for uses needs to look at bundles, not individual instructions. This is a prerequisite to avoiding some redundant spills due to subregisters which will help avoid an allocation failure in a future patch.
2023-06-01[CodeGen] Make use of MachineInstr::all_defs and all_uses. NFCI.Jay Foad1-4/+4
Differential Revision: https://reviews.llvm.org/D151424
2023-04-17Fix uninitialized pointer members in CodeGenAkshay Khadse1-2/+2
This change initializes the TSI, LI, DT, PSI, and ORE pointer fields of the SelectOptimize class to nullptr.

Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D148303
2023-03-14[CodeGen] Use *{Set,Map}::contains (NFC)Kazu Hirata1-5/+4
2023-01-13[CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFCCraig Topper1-2/+2
Use isPhysical/isVirtual methods.

Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D141715
2023-01-09[Inline Spiller] Extend the snippet by statepoint usesSerguei Katkov1-2/+20
A snippet is a tiny live interval that has a copy-like or fill-like def and a copy-like or spill-like use at the end (either may be absent). A snippet has only one use/def inside the interval, and the interval is located in one basic block. When the inline spiller spills a reg around its uses, it also forces the spilling of connected snippets: those that were produced by splitting the same original reg and whose def is a full copy of our reg or whose last use is a full copy to our reg.

The definition of a snippet is extended to allow more than one use/def, provided that all the other uses are statepoint instructions, which will fold the fill into their operand. That way we do not introduce new fills/spills.

Reviewed By: qcolombet, dantrushin
Differential Revision: https://reviews.llvm.org/D138093
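The extended definition can be pictured as a simple predicate. This is a deliberately simplified sketch: `UseKind`, `ToyInterval`, and `looksLikeSnippet` are hypothetical names, and the real check in InlineSpiller.cpp inspects live intervals and machine instructions rather than a flat list. The sketch assumes only what the message states, namely that the interval must stay within one basic block and that at most one use/def may be something other than a statepoint.

```cpp
#include <cassert>
#include <vector>

// Hypothetical classification of an instruction touching the interval.
enum class UseKind { Statepoint, Other };

// Toy live interval (hypothetical): its locality plus its uses/defs.
struct ToyInterval {
  bool InSingleBlock;
  std::vector<UseKind> UsesAndDefs;
};

// Returns true if the interval fits the extended snippet definition:
// local to one block, with at most one non-statepoint use/def.
inline bool looksLikeSnippet(const ToyInterval &LI) {
  if (!LI.InSingleBlock)
    return false;
  int NonStatepoint = 0;
  for (UseKind K : LI.UsesAndDefs) {
    if (K == UseKind::Statepoint)
      continue; // statepoints fold the fill, so they add no spill code
    if (++NonStatepoint > 1)
      return false;
  }
  return true;
}
```

Under the older definition, any second use/def would have disqualified the interval; the statepoint exemption is the only change modelled here.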
2022-12-17[CodeGen] Additional Register argument to storeRegToStackSlot/loadRegFromStackSlotChristudasan Devadasan1-4/+4
With D134950, targets get notified when a virtual register is created and/or cloned. Targets can do the needful with the delegate callback. AMDGPU propagates the virtual register flags maintained in the target file itself. They are useful to identify a certain type of machine operands while inserting spill stores and reloads. Since RegAllocFast spills the physical register itself, there is no way its virtual register can be mapped back to retrieve the flags. It can be solved by passing the virtual register as an additional argument. This argument has no use when the spill interfaces are called during the greedy allocator or even the PrologEpilogInserter and can pass a null register in such cases.

Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D138656
2022-12-06[ADT] Don't including None.h (NFC)Kazu Hirata1-1/+0
These source files no longer use None, so they do not need to include None.h. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02[CodeGen] Use std::nullopt instead of None (NFC)Kazu Hirata1-1/+1
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-08-07[llvm] Fix comment typos (NFC)Kazu Hirata1-1/+1
2022-07-18CodeGen: Remove AliasAnalysis from regallocMatt Arsenault1-8/+4
This was stored in LiveIntervals, but not actually used for anything related to LiveIntervals. It was only used in one check for if a load instruction is rematerializable. I also don't think this was entirely correct, since it was implicitly assuming constant loads are also dereferenceable. Remove this and rely only on the invariant+dereferenceable flags in the memory operand. Set the flag based on the AA query upfront. This should have the same net benefit, but has the possible disadvantage of making this AA query nonlazy. Preserve the behavior of assuming pointsToConstantMemory implying dereferenceable for now, but maybe this should be changed.
2022-07-17[CodeGen] Qualify auto variables in for loops (NFC)Kazu Hirata1-5/+5
2022-06-22InlineSpiller: Don't fold spills into undef readsMatt Arsenault1-0/+7
This was producing a load into a dead register which was a verifier error.