path: root/llvm/lib/CodeGen/MachineRegisterInfo.cpp
Age | Commit message | Author | Files | Lines
4 days | CodeGen: Stop checking for physregs in constrainRegClass (#161795) | Matt Arsenault | 1 | -2/+0
It's nonsensical to call this function on a physical register.
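As a rough illustration of the API involved (a sketch under the assumption of a pass that already has an `MRI` reference and a target register class `RC` in hand; not code from the patch):

```cpp
#include "llvm/CodeGen/MachineRegisterInfo.h"

using namespace llvm;

// constrainRegClass() is only meaningful for virtual registers, so guard the
// call; physical registers have a fixed class and cannot be constrained.
static bool tryConstrain(MachineRegisterInfo &MRI, Register Reg,
                         const TargetRegisterClass *RC) {
  if (!Reg.isVirtual())
    return false;
  // Returns the resulting register class on success, or nullptr if Reg
  // cannot be constrained to RC.
  return MRI.constrainRegClass(Reg, RC) != nullptr;
}
```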
2025-09-03 | [AMDGPU] si-peephole-sdwa: reuse getOne{NonDBGUse,Def} (NFC) (#156455) | Frederik Harwath | 1 | -0/+5
This patch changes the findSingleRegDef function in si-peephole-sdwa to reuse MachineRegisterInfo::getOneDef, and findSingleRegUse to use a new MachineRegisterInfo::getOneNonDBGUse function.
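A hedged sketch of how a peephole pass might use these queries (`getOneDef` is an existing MachineRegisterInfo helper; `getOneNonDBGUse` is the helper added by this patch; the wrapper functions here are illustrative):

```cpp
#include "llvm/CodeGen/MachineRegisterInfo.h"

using namespace llvm;

// Return the single defining instruction of Reg, or nullptr if Reg has zero
// or multiple definitions.
static MachineInstr *findSingleDef(const MachineRegisterInfo &MRI,
                                   Register Reg) {
  if (MachineOperand *Def = MRI.getOneDef(Reg))
    return Def->getParent();
  return nullptr;
}

// Return the single non-debug user of Reg, or nullptr if there is not exactly
// one non-debug use.
static MachineInstr *findSingleUse(const MachineRegisterInfo &MRI,
                                   Register Reg) {
  if (MachineOperand *Use = MRI.getOneNonDBGUse(Reg))
    return Use->getParent();
  return nullptr;
}
```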
2025-05-13 | Reapply "[AMDGPU][Scheduler] Refactor ArchVGPR rematerialization during scheduling (#125885)" (#139548) | Lucas Ramirez | 1 | -0/+5
This reapplies 067caaa and 382a085 (reverting b35f6e2) with fixes to issues detected by the address sanitizer (MIs have to be removed from live intervals before being removed from their parent MBB). Original commit description below.

AMDGPU scheduler's `PreRARematStage` attempts to increase function occupancy w.r.t. ArchVGPR usage by rematerializing trivial ArchVGPR-defining instructions next to their single use. It first collects all eligible trivially rematerializable instructions in the function, then sinks them one by one while recomputing occupancy in all affected regions each time, to determine if and when it has managed to increase overall occupancy. If it does, the changes are committed to the scheduler's state; otherwise the modifications to the IR are reverted and the scheduling stage gives up. In both cases, this scheduling stage currently involves repeated queries for up-to-date occupancy estimates and some state copying to enable reversal of sinking decisions when occupancy turns out not to increase. The current implementation also does not accurately track register pressure changes in all regions affected by sinking decisions.

This commit refactors this scheduling stage, improving RP tracking and splitting the stage into two distinct steps to avoid repeated occupancy queries and IR/state rollbacks.
- Analysis and collection (`canIncreaseOccupancyOrReduceSpill`): compute the number of ArchVGPRs to save in order to reduce spilling or to increase function occupancy by 1 (when there is no spilling). Then collect instructions eligible for rematerialization, stopping as soon as enough have been identified to achieve that goal (according to slightly optimistic heuristics). If there aren't enough such instructions, the scheduling stage stops here.
- Rematerialization (`rematerialize`): instructions collected in the first step are rematerialized one by one. The scheduler's state can now be updated directly, since the occupancy analysis has already been done and no state rollback will be needed. Register pressure for impacted regions is recomputed only once, as opposed to at every sinking decision.

In the case where the stage attempted to increase occupancy, and if both rematerialization alone and rescheduling afterwards were unable to improve occupancy, then all rematerializations are rolled back.
2025-05-09 | Revert "[AMDGPU][Scheduler] Refactor ArchVGPR rematerialization during scheduling (#125885)" (#139341) | Vitaly Buka | 1 | -5/+0
Also reverts the related "[AMDGPU] Regenerate mfma-loop.ll test". The change introduced a memory error detected by ASan (#125885).
This reverts commit 382a085a95b0abeac77b150b7b644b372bd08e78.
This reverts commit 067caaafb58a156d0d77229422607782a639f5b5.
2025-05-08 | [AMDGPU][Scheduler] Refactor ArchVGPR rematerialization during scheduling (#125885) | Lucas Ramirez | 1 | -0/+5
AMDGPU scheduler's `PreRARematStage` attempts to increase function occupancy w.r.t. ArchVGPR usage by rematerializing trivial ArchVGPR-defining instructions next to their single use. It first collects all eligible trivially rematerializable instructions in the function, then sinks them one by one while recomputing occupancy in all affected regions each time, to determine if and when it has managed to increase overall occupancy. If it does, the changes are committed to the scheduler's state; otherwise the modifications to the IR are reverted and the scheduling stage gives up. In both cases, this scheduling stage currently involves repeated queries for up-to-date occupancy estimates and some state copying to enable reversal of sinking decisions when occupancy turns out not to increase. The current implementation also does not accurately track register pressure changes in all regions affected by sinking decisions.

This commit refactors this scheduling stage, improving RP tracking and splitting the stage into two distinct steps to avoid repeated occupancy queries and IR/state rollbacks.
- Analysis and collection (`canIncreaseOccupancyOrReduceSpill`): compute the number of ArchVGPRs to save in order to reduce spilling or to increase function occupancy by 1 (when there is no spilling). Then collect instructions eligible for rematerialization, stopping as soon as enough have been identified to achieve that goal (according to slightly optimistic heuristics). If there aren't enough such instructions, the scheduling stage stops here.
- Rematerialization (`rematerialize`): instructions collected in the first step are rematerialized one by one. The scheduler's state can now be updated directly, since the occupancy analysis has already been done and no state rollback will be needed. Register pressure for impacted regions is recomputed only once, as opposed to at every sinking decision.

In the case where the stage attempted to increase occupancy, and if both rematerialization alone and rescheduling afterwards were unable to improve occupancy, then all rematerializations are rolled back.
2025-01-23 | MachineRegisterInfo: Use variable for TRI | Matt Arsenault | 1 | -4/+3
2025-01-19 | [CodeGen] Remove some implicit conversions of MCRegister to unsigned by using(). NFC | Craig Topper | 1 | -2/+2
Many of these are indexing BitVectors or something similar, where we can't use MCRegister and need the plain register number.
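A small illustration of the explicit-conversion pattern (hedged; the actual call sites vary): when indexing a BitVector keyed by register number, the conversion from MCRegister is spelled out via `id()` instead of relying on an implicit conversion to `unsigned`.

```cpp
#include "llvm/ADT/BitVector.h"
#include "llvm/MC/MCRegister.h"

using namespace llvm;

static void markReserved(BitVector &Reserved, MCRegister Reg) {
  // BitVector is indexed by a plain register number, so convert explicitly.
  Reserved.set(Reg.id());
}
```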
2025-01-18 | [CodeGen] Use Register/MCRegister::isPhysical. NFC | Craig Topper | 1 | -1/+1
2025-01-02 | [CodeGen] Remove atEnd method from defusechain iterators (#120610) | Jay Foad | 1 | -3/+5
This was not used much and there are better ways of writing it.
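One of the "better ways" is a range-based loop over the use chain; a minimal sketch (illustrative, not the exact replacement from the patch):

```cpp
#include "llvm/ADT/STLExtras.h"
#include "llvm/CodeGen/MachineInstr.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"

using namespace llvm;

// Visit every non-debug use of Reg; the range-based form needs no explicit
// end-of-chain (atEnd) check.
static void visitUses(const MachineRegisterInfo &MRI, Register Reg,
                      function_ref<void(MachineInstr &)> Fn) {
  for (MachineOperand &MO : MRI.use_nodbg_operands(Reg))
    Fn(*MO.getParent());
}
```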
2024-12-06 | [RISCV][MRI] Account for fixed registers when determining callee saved regs (#115756) | Michael Maitland | 1 | -1/+7
This fixes https://discourse.llvm.org/t/fixed-register-being-spill-and-restored-in-clang/83058. We need to do it in `MachineRegisterInfo::getCalleeSavedRegs` instead of `RISCVRegisterInfo::getCalleeSavedRegs`, since the MF argument of `TargetRegisterInfo::getCalleeSavedRegs` is `const`, so we can't call `MF->getRegInfo().disableCalleeSavedRegister` there. To put it in `MachineRegisterInfo::getCalleeSavedRegs`, we move `isRegisterReservedByUser` into `TargetSubtargetInfo`.
2024-08-07 | [CodeGen] Allocate RegAllocHints map lazily (#102186) | Alexis Engelke | 1 | -2/+0
This hint map does not need to grow whenever a new register is added; in fact, at -O0 it is not used at all. Growing this map is quite expensive, as SmallVectors are not trivially copyable. Grow the map only when hints are actually added, to avoid multiple grows and to avoid any growth when no hints are added at all.
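A hedged sketch of the lazy-growth idea (the container here is a simplified stand-in for the real RegAllocHints member, and the function is illustrative):

```cpp
#include "llvm/ADT/SmallVector.h"
#include "llvm/CodeGen/Register.h"
#include <utility>

using namespace llvm;

// Simplified stand-in for the hint map: one (hint-type, hinted-registers)
// entry per virtual register.
using HintEntry = std::pair<unsigned, SmallVector<Register, 4>>;

static void addHint(SmallVector<HintEntry, 8> &Hints, Register VReg,
                    unsigned Type, Register PrefReg) {
  unsigned Idx = Register::virtReg2Index(VReg);
  // Grow only when a hint is actually recorded, not whenever a new virtual
  // register is created; at -O0 the map therefore never grows at all.
  if (Idx >= Hints.size())
    Hints.resize(Idx + 1);
  Hints[Idx].first = Type;
  Hints[Idx].second.push_back(PrefReg);
}
```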
2024-06-14 | [CodeGen] Remove target SubRegLiveness flags (#95437) | David Green | 1 | -2/+4
This removes the uses of target flags to disable subreg liveness, relying on the `-enable-subreg-liveness` flag instead. The `-enable-subreg-liveness` flag has been changed to take precedence over the subtarget if set, and one use of `Subtarget->enableSubRegLiveness()` has been changed to `MRI->subRegLivenessEnabled()` to make sure the option properly applies.
2024-03-11 | [CodeGen] Remove unused MachineRegisterInfo methods | Jay Foad | 1 | -12/+0
2024-03-11 | [CodeGen] Do not pass MF into MachineRegisterInfo methods. NFC. (#84770) | Jay Foad | 1 | -11/+8
MachineRegisterInfo already knows the MF so there is no need to pass it in as an argument.
2024-02-05 | AMDGPU/GlobalISelDivergenceLowering: select divergent i1 phis (#80003) | Petar Avramovic | 1 | -0/+9
Implement PhiLoweringHelper for GlobalISel in DivergenceLoweringHelper. Use machine uniformity analysis to find divergent i1 phis and select them as lane mask phis in the same way SILowerI1Copies selects VReg_1 phis. Note that divergent i1 phis include phis created by LCSSA, and all cases of uses outside of a cycle are actually covered by "lowering LCSSA phis". GlobalISel lane masks are registers with the sgpr register class and S1 LLT. TODO: the general goal is that instructions created in this pass are fully instruction-selected, so that selection of lane mask phis is not split across multiple passes. Patch 3 from: https://github.com/llvm/llvm-project/pull/73337
2024-01-24 | Revert "AMDGPU/GlobalISelDivergenceLowering: select divergent i1 phis" (#79274) | Petar Avramovic | 1 | -11/+0
Reverts llvm/llvm-project#78482
2024-01-24 | AMDGPU/GlobalISelDivergenceLowering: select divergent i1 phis (#78482) | Petar Avramovic | 1 | -0/+11
Implement PhiLoweringHelper for GlobalISel in DivergenceLoweringHelper. Use machine uniformity analysis to find divergent i1 phis and select them as lane mask phis in the same way SILowerI1Copies selects VReg_1 phis. Note that divergent i1 phis include phis created by LCSSA, and all cases of uses outside of a cycle are actually covered by "lowering LCSSA phis". GlobalISel lane masks are registers with the sgpr register class and S1 LLT. TODO: the general goal is that instructions created in this pass are fully instruction-selected, so that selection of lane mask phis is not split across multiple passes. Patch 3 from: https://github.com/llvm/llvm-project/pull/73337
2023-10-24 | [ADT] Rename llvm::erase_value to llvm::erase (NFC) (#70156) | Kazu Hirata | 1 | -1/+1
C++20 comes with std::erase to erase a value from std::vector. This patch renames llvm::erase_value to llvm::erase for consistency with C++20. We could make llvm::erase more similar to std::erase by having it return the number of elements removed, but I'm not doing that for now because nobody seems to care about that in our code base. Since there are only 50 occurrences of erase_value in our code base, this patch replaces all of them with llvm::erase and deprecates llvm::erase_value.
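Roughly, the rename amounts to the following at call sites (a hedged before/after; the container and value here are placeholders):

```cpp
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"

using namespace llvm;

static void removeAll(SmallVectorImpl<int> &Vec, int Value) {
  // Before: llvm::erase_value(Vec, Value);
  // After (this patch), matching the C++20 std::erase spelling:
  llvm::erase(Vec, Value);
}
```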
2023-08-13 | [CodeGen] MachineRegisterInfo::constrainRegAttrs - add explicit auto reference to prevent copy | Simon Pilgrim | 1 | -2/+2
Fixes a static analysis warning.
2023-04-18 | [MC] Simplify uses of subregs/superregs. NFC. | Jay Foad | 1 | -9/+2
2023-04-18 | [MC] Use subregs/superregs instead of MCSubRegIterator/MCSuperRegIterator. NFC. | Jay Foad | 1 | -3/+2
Differential Revision: https://reviews.llvm.org/D148613
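The shape of the simplification, as a hedged sketch (the reserved-register example is illustrative, not a hunk from the patch):

```cpp
#include "llvm/ADT/BitVector.h"
#include "llvm/MC/MCRegisterInfo.h"

using namespace llvm;

// Mark Reg and all of its sub-registers as reserved.
static void reserveWithSubRegs(BitVector &Reserved, MCRegister Reg,
                               const MCRegisterInfo &MCRI) {
  Reserved.set(Reg.id());
  // Before: for (MCSubRegIterator SubRegs(Reg, &MCRI); SubRegs.isValid(); ++SubRegs)
  //           Reserved.set(*SubRegs);
  for (MCPhysReg SubReg : MCRI.subregs(Reg))
    Reserved.set(SubReg);
}
```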
2023-04-17 | [nfc][llvm] Replace pointer cast functions in PointerUnion by llvm casting functions | Shraiysh Vaishay | 1 | -5/+5
This patch replaces uses of the PointerUnion::is function with llvm::isa, the PointerUnion::get function with llvm::cast, and PointerUnion::dyn_cast with llvm::dyn_cast_if_present, following the FIXME in the definition of the PointerUnion class. This patch does not remove the old functions, as they are still being used in other subprojects.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D148449
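The mapping, in a small hedged example (the PointerUnion element types here are placeholders):

```cpp
#include "llvm/ADT/PointerUnion.h"
#include "llvm/Support/Casting.h"

using namespace llvm;

struct A { int X; };
struct B { int Y; };

static int inspect(PointerUnion<A *, B *> PU) {
  // Old: PU.dyn_cast<A *>()   New: dyn_cast_if_present<A *>(PU)
  if (A *AP = dyn_cast_if_present<A *>(PU))
    return AP->X;
  // Old: PU.is<B *>() / PU.get<B *>()   New: isa<B *>(PU) / cast<B *>(PU)
  if (!PU.isNull() && isa<B *>(PU))
    return cast<B *>(PU)->Y;
  return 0;
}
```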
2023-01-13 | [CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFC | Craig Topper | 1 | -2/+2
Use the isPhysical/isVirtual methods.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D141715
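The change at a call site, roughly (hedged example):

```cpp
#include "llvm/CodeGen/Register.h"

using namespace llvm;

static bool isPhys(Register Reg) {
  // Before: return Register::isPhysicalRegister(Reg);
  return Reg.isPhysical();
}
```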
2022-12-17 | [CodeGen] Use delegate to notify targets when virtual registers are created | Christudasan Devadasan | 1 | -6/+4
This will help targets customize certain codegen decisions based on the virtual registers involved in special operations. This patch also extends the existing delegate in MRI to start supporting multicast.
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D134950
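A hedged sketch of the delegate mechanism from a target's point of view (the observer class and its body are illustrative; the hook name follows MachineRegisterInfo::Delegate):

```cpp
#include "llvm/CodeGen/MachineRegisterInfo.h"

using namespace llvm;

// A target-side observer: MRI notifies every registered delegate when a new
// virtual register is created, letting the target record state for it.
class MyRegObserver : public MachineRegisterInfo::Delegate {
  void MRI_NoteNewVirtualRegister(Register Reg) override {
    // Target-specific bookkeeping for the freshly created register.
  }
};

// Typical registration pattern (illustrative):
//   MyRegObserver Observer;
//   MRI.addDelegate(&Observer);   // multicast: several delegates may coexist
//   ... create virtual registers ...
//   MRI.resetDelegate(&Observer);
```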
2022-12-15 | [MRI] Print more debug info in clearVirtRegs() (NFC) | Nikita Popov | 1 | -1/+5
2022-09-15 | [AMDGPU] Always select s_cselect_b32 for uniform 'select' SDNode | Alexander Timofeev | 1 | -4/+4
This patch contains the changes necessary to carry physical condition register (SCC) dependencies through the SDNode scheduler. It adds the edge in the SDNodeScheduler dependency graph instead of inserting the SCC copy between each definition and use. This approach lets the scheduler place instructions in an optimal way, placing the copy only when the dependency cannot otherwise be resolved.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D133593
2022-07-27 | Use hasNItemsOrLess() in MRI::hasAtMostUserInstrs(). | Amara Emerson | 1 | -6/+2
2022-07-27 | [AArch64][GlobalISel] Add heuristics for localizing G_CONSTANT. | Amara Emerson | 1 | -0/+10
This adds heuristics similar to those for G_GLOBAL_VALUE, querying the code-size cost of materializing a specific constant. Doing so prevents us from sinking constants which require multiple instructions to generate into use blocks.

Code size savings on CTMark -Os (size.__text):

Program | before | after | diff
ClamAV/clamscan | 381940.00 | 382052.00 | 0.0%
lencod/lencod | 428408.00 | 428428.00 | 0.0%
SPASS/SPASS | 411868.00 | 411876.00 | 0.0%
kimwitu++/kc | 449944.00 | 449944.00 | 0.0%
Bullet/bullet | 463588.00 | 463556.00 | -0.0%
sqlite3/sqlite3 | 284696.00 | 284668.00 | -0.0%
consumer-typeset/consumer-typeset | 414492.00 | 414424.00 | -0.0%
7zip/7zip-benchmark | 595244.00 | 594972.00 | -0.0%
mafft/pairlocalalign | 247512.00 | 247368.00 | -0.1%
tramp3d-v4/tramp3d-v4 | 372884.00 | 372044.00 | -0.2%
Geomean difference | | | -0.0%

Differential Revision: https://reviews.llvm.org/D130554
2022-03-16 | Cleanup codegen includes | serge-sans-paille | 1 | -1/+0
This is a (fixed) recommit of https://reviews.llvm.org/D121169.
after: 1061034926
before: 1063332844
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
2022-03-10 | Revert "Cleanup codegen includes" | Nico Weber | 1 | -0/+1
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. It breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests; see comments on https://reviews.llvm.org/D121169.
2022-03-10 | Cleanup codegen includes | serge-sans-paille | 1 | -1/+0
after: 1061034926
before: 1063332844
Differential Revision: https://reviews.llvm.org/D121169
2022-02-08 | [X86] Implement -fzero-call-used-regs option | Bill Wendling | 1 | -0/+15
The "-fzero-call-used-regs" option tells the compiler to zero out certain registers before the function returns. It's also available as a function attribute: zero_call_used_regs. The two upper categories are: - "used": Zero out used registers. - "all": Zero out all registers, whether used or not. The individual options are: - "skip": Don't zero out any registers. This is the default. - "used": Zero out all used registers. - "used-arg": Zero out used registers that are used for arguments. - "used-gpr": Zero out used registers that are GPRs. - "used-gpr-arg": Zero out used GPRs that are used as arguments. - "all": Zero out all registers. - "all-arg": Zero out all registers used for arguments. - "all-gpr": Zero out all GPRs. - "all-gpr-arg": Zero out all GPRs used for arguments. This is used to help mitigate Return-Oriented Programming exploits. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D110869
2022-01-30 | [CodeGen] Use default member initialization (NFC) | Kazu Hirata | 1 | -2/+1
Identified with modernize-use-default-member-init.
2021-10-31 | [CodeGen] Use make_early_inc_range (NFC) | Kazu Hirata | 1 | -3/+1
2021-06-14 | [AIX][XCOFF] emit vector info of traceback table. | zhijian | 1 | -2/+3
Summary: emit vector info of the traceback table.
Reviewers: Jason Liu, Hubert Tong
Differential Revision: https://reviews.llvm.org/D93659
2021-03-05 | Reapply "[DebugInfo] Add new instruction and DIExpression operator for variadic debug values" | Stephen Tozer | 1 | -3/+3
Rewrites the test to use the correct architecture triple; fixes an incorrect reference in the SourceLevelDebugging doc; simplifies `spillReg` behaviour so as not to depend on changes elsewhere in the patch stack.
This reverts commit d2000b45d033c06dc7973f59909a0ad12887ff51.
2021-03-04 | Revert "[DebugInfo] Add new instruction and DIExpression operator for variadic debug values" | Stephen Tozer | 1 | -3/+3
This reverts commit d07f106f4a48b6e941266525b6f7177834d7b74e.
2021-03-04 | [DebugInfo] Add new instruction and DIExpression operator for variadic debug values | gbtozers | 1 | -3/+3
This patch adds a new instruction that can represent variadic debug values, DBG_VALUE_VAR. This patch alone covers the addition of the instruction and a set of basic code changes in MachineInstr and a few adjacent areas, but does not correctly handle variadic debug values outside of these areas, nor does it generate them at any point.

The new instruction is similar to the existing DBG_VALUE instruction, with the following differences: the operands are in a different order; any number of values may be used in the instruction following the Variable and Expression operands (these are referred to in code as "debug operands") and are indexed from 0, so that getDebugOperand(X) == getOperand(X+2); and the Expression in a DBG_VALUE_VAR must use the DW_OP_LLVM_arg operator to pass arguments into the expression.

The new DW_OP_LLVM_arg operator is only valid in expressions appearing in a DBG_VALUE_VAR; it takes a single argument and pushes the debug operand at the index given by the argument onto the Expression stack. For example, the sub-expression `DW_OP_LLVM_arg, 0` means "push the debug operand at index 0 onto the expression stack."

Differential Revision: https://reviews.llvm.org/D82363
2021-02-20 | [CodeGen] Use range-based for loops (NFC) | Kazu Hirata | 1 | -7/+4
2021-02-19 | [CodeGen] Use range-based for loops (NFC) | Kazu Hirata | 1 | -8/+8
2021-01-21 | [CodeGen] Use llvm::append_range (NFC) | Kazu Hirata | 1 | -2/+1
2021-01-20 | [llvm] Use hasSingleElement (NFC) | Kazu Hirata | 1 | -8/+2
2021-01-07 | [CodeGen] Remove unused function isCallerPreservedOrConstPhysReg (NFC) | Kazu Hirata | 1 | -7/+0
The last use of the function was removed on Oct 20, 2018 in commit 8d6ff4c0af843e1a61b76d89812aed91e358de34.
2020-12-13 | [CodeGen] Use llvm::erase_value (NFC) | Kazu Hirata | 1 | -2/+1
2020-10-28 | [NFC] Use [MC]Register in CSE & LICM | Gaurav Jain | 1 | -1/+1
Differential Revision: https://reviews.llvm.org/D90327
2020-06-22 | [DebugInfo] Update MachineInstr to help support variadic DBG_VALUE instructions | stozer | 1 | -1/+1
Following on from this RFC [0] from a while back, this is the first patch towards implementing variadic debug values. This patch specifically adds a set of functions to MachineInstr for performing operations specific to debug values, and replaces uses of the more general functions where appropriate. The most prevalent of these is replacing getOperand(0) with getDebugOperand(0) in debug-value-specific code, since the operands corresponding to values will no longer be at index 0, but at index 2 and upwards: getDebugOperand(x) == getOperand(x+2). Similar replacements have been added for the other operands, along with some helper functions to replace oft-repeated code and to operate on a variable number of value operands.

[0] http://lists.llvm.org/pipermail/llvm-dev/2020-February/139376.html

Differential Revision: https://reviews.llvm.org/D81852
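A hedged illustration of the accessor change (the helper function is illustrative):

```cpp
#include "llvm/CodeGen/MachineInstr.h"
#include <cassert>

using namespace llvm;

// Read the first value operand of a debug value instruction through the
// debug-specific accessor rather than a hard-coded operand index.
static const MachineOperand &firstDebugValueOperand(const MachineInstr &MI) {
  assert(MI.isDebugValue() && "expected a DBG_VALUE-style instruction");
  // For variadic debug values, getDebugOperand(X) == getOperand(X + 2).
  return MI.getDebugOperand(0);
}
```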
2020-04-07 | CodeGen: Use Register in more places | Matt Arsenault | 1 | -6/+6
2020-04-06 | Revert "[IPRA][ARM] Spill extra registers at -Oz" | Oliver Stannard | 1 | -37/+13
Reverting because this is causing failures on bots with expensive checks enabled. This reverts commit 73cea83a6f5ab521edf3cccfc603534776d691ec.
2020-03-18 | [IPRA][ARM] Spill extra registers at -Oz | Oliver Stannard | 1 | -13/+37
When optimising for code size at the expense of performance, it is often worth saving and restoring some of r0-r3, if IPRA will be able to take advantage of them. This doesn't cost any extra code size if we already have a PUSH/POP pair, and it increases the number of available registers across any calls to the function.

We already have an optimisation which tries to fold the subtract/add of the SP into the PUSH/POP by using extra registers, which somewhat conflicts with this. I've made the new optimisation less aggressive in cases where the existing one is likely to trigger, which gives better results than either of these optimisations by themselves.

Differential revision: https://reviews.llvm.org/D69936
2020-01-30 | CodeGen: Use Register | Matt Arsenault | 1 | -31/+31