path: root/llvm/lib/CodeGen/RegAllocFast.cpp
Age  Commit message  Author  Files  Lines
2022-12-17[CodeGen] Additional Register argument to storeRegToStackSlot/loadRegFromStackSlotChristudasan Devadasan1-2/+3
With D134950, targets are notified when a virtual register is created or cloned, and can react through the delegate callback. AMDGPU uses this to propagate the virtual register flags it maintains in its own target files. The flags help identify certain kinds of machine operands when inserting spill stores and reloads. Because RegAllocFast spills the physical register itself, there is no way to map back to the virtual register and retrieve its flags. Passing the virtual register as an additional argument solves this. The argument is unused when the spill interfaces are called from the greedy allocator or from PrologEpilogInserter, which can pass a null register. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D138656
2022-10-05[RegAllocFast] Clean-up. Remove redundant operations. NFC.Serguei Katkov1-7/+2
Reviewed By: MatzeB, arsenm Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D109213
2022-08-20(Reland) [fastalloc] Support allocating specific register class in fastallocLuo, Yuanke1-6/+40
This reverts commit 853bb192c407f5d9e75a5fd55cc089151530cbd3.
2022-08-15Revert "(Reland) [fastalloc] Support allocating specific register class in fastalloc"Luo, Yuanke1-40/+6
This reverts commit 30f9e6ebd30b79d13f99eaca4d829e0da07186b3.
2022-08-13(Reland) [fastalloc] Support allocating specific register class in fastallocLuo, Yuanke1-6/+40
Reland commit 719658d078c4. The base RA infrastructure supports restricting an RA pass to a specific register class. Because greedy RA and basic RA both derive from base RA, they support this as well. Fast RA does not. This patch enables ShouldAllocateClass in fast RA so that it can also allocate registers for a specific register class. Differential Revision: https://reviews.llvm.org/D131825
2022-07-17[CodeGen] Qualify auto variables in for loops (NFC)Kazu Hirata1-1/+1
2022-07-16[CodeGen] Use RegClassFilterFunc where appropriate (NFC)Kazu Hirata1-3/+2
2022-06-23Revert "[fastalloc] Support allocating specific register class in fastalloc"Nico Weber1-40/+6
This reverts commit 719658d078c4093d1ee716fb65ae94673df7b22b. Breaks a few things, see comments on https://reviews.llvm.org/D128437 There's disagreement about the best fix. So let's keep HEAD green while discussions are happening.
2022-06-23[fastalloc] Support allocating specific register class in fastallocLuo, Yuanke1-6/+40
The base RA infrastructure supports restricting an RA pass to a specific register class. Because greedy RA and basic RA both derive from base RA, they support this as well. Fast RA does not. This patch enables ShouldAllocateClass in fast RA so that it can also allocate registers for a specific register class. Differential Revision: https://reviews.llvm.org/D126771
2022-06-21[fastregalloc] Enhance the heuristics for liveout in self loop.Luo, Yuanke1-1/+10
In the case below, a virtual register is defined twice in a self loop. We don't need to spill %0 after the third instruction `%0 = def (tied %0)`, because it is defined in the second instruction `%0 = def`.
  1 bb.1
  2   %0 = def
  3   %0 = def (tied %0)
  4   ...
  5   jmp bb.1
Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D125079
2022-05-04[fastregalloc] Fix bug when undef value is tied to def.Luo, Yuanke1-5/+15
If the tied use is an undef value, fastregalloc should free the def register. No reload is needed for the undef value. Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D124834
2022-03-16Cleanup codegen includesserge-sans-paille1-5/+0
This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681
2022-03-10Revert "Cleanup codegen includes"Nico Weber1-0/+5
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10Cleanup codegen includesserge-sans-paille1-5/+0
after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169
2021-11-28[llvm] Use range-based for loops (NFC)Kazu Hirata1-6/+3
2021-11-09[RegAllocFast] Fix nondeterminism in debuginfo generationIlya Yanok1-1/+2
Changes from commit 1db137b1859692ae33228c530d4df9f2431b2151 added iteration over hash map that can result in non-deterministic order. Fix that by using a SmallMapVector to preserve the order. Differential Revision: https://reviews.llvm.org/D113468
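The fix above replaces a hash map with an insertion-ordered container. As a rough illustration in standard C++ (this is not LLVM's SmallMapVector; `InsertionOrderedMap` is a hypothetical name), the idea is to keep the entries in a vector and use the hash map only as an index, so iteration follows insertion order rather than hash order:

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical illustration of the map-plus-vector idea; not LLVM's code.
template <typename K, typename V>
class InsertionOrderedMap {
  std::unordered_map<K, std::size_t> Index; // key -> position in Items
  std::vector<std::pair<K, V>> Items;       // entries in insertion order

public:
  // Returns the value for Key, default-constructing it on first access.
  V &operator[](const K &Key) {
    auto It = Index.find(Key);
    if (It == Index.end()) {
      It = Index.emplace(Key, Items.size()).first;
      Items.emplace_back(Key, V());
    }
    return Items[It->second].second;
  }

  // Iteration walks the vector, so the order is deterministic.
  const std::vector<std::pair<K, V>> &items() const { return Items; }
};
```

Lookups still go through the hash map, while anything that emits output (such as debug info) iterates `items()` and therefore sees the same order on every run.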
2021-07-13RegAlloc: Allow targets to split register allocationMatt Arsenault1-5/+27
AMDGPU normally spills SGPRs to VGPRs. Previously, since all register classes are handled at the same time, this was problematic. We don't know ahead of time how many registers will need to be reserved to handle the spilling. If no VGPRs were left for spilling, we would have to try to spill to memory. If the spilled SGPRs were required for exec mask manipulation, it is highly problematic because the lanes active at the point of spill are not necessarily the same as at the restore point. Avoid this problem by fully allocating SGPRs in a separate regalloc run from VGPRs. This way we know the exact number of VGPRs needed, and can reserve them for a second run. This fixes the most serious issues, but it is still possible using inline asm to make all VGPRs unavailable. Start erroring in the case where we ever would require memory for an SGPR spill. This is implemented by giving each regalloc pass a callback which reports if a register class should be handled or not. A few passes need some small changes to deal with leftover virtual registers. In the AMDGPU implementation, a new pass is introduced to take the place of PrologEpilogInserter for SGPR spills emitted during the first run. One disadvantage is that StackSlotColoring is currently no longer used for SGPR spills. It would need to be run again, which will require more work. Error if the standard -regalloc option is used. Introduce new separate -sgpr-regalloc and -vgpr-regalloc flags, so the two runs can be controlled individually. PBQP is not currently supported, so this also prevents using the unhandled allocator.
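The callback mechanism described in this commit can be sketched in plain C++. Everything below (`RegClass`, `VirtReg`, `ClassFilter`, `runAllocation`) is a hypothetical stand-in chosen for illustration, not LLVM's actual types or signatures. Each allocation run receives a filter and leaves alone any virtual register whose class the filter rejects:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins; not LLVM's real types.
enum class RegClass { SGPR, VGPR };

struct VirtReg {
  std::string Name;
  RegClass Class;
  bool Assigned;
};

// Callback that reports whether a register class should be handled.
using ClassFilter = std::function<bool(RegClass)>;

// Assigns every not-yet-assigned virtual register accepted by the filter
// and skips the rest. Returns how many registers were assigned in this run.
int runAllocation(std::vector<VirtReg> &Regs, const ClassFilter &ShouldAllocate) {
  int NumAssigned = 0;
  for (VirtReg &R : Regs) {
    if (R.Assigned || !ShouldAllocate(R.Class))
      continue; // leftover virtual register for a later run
    R.Assigned = true;
    ++NumAssigned;
  }
  return NumAssigned;
}
```

Calling `runAllocation` once with an SGPR-only filter and once with a VGPR-only filter mirrors the two-run scheme: after the first run the exact SGPR demand is known, and the second run handles the leftover registers.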
2021-05-19MachineBasicBlock: add liveout iterator aware of which liveins are defined by the runtime.Tim Northover1-4/+2
Using this in RegAllocFast reduces register pressure, and in some cases allows x86 code to compile that wouldn't before.
2021-05-11[RegAllocFast] properly handle STATEPOINT instruction.Denis Antrushin1-13/+26
STATEPOINT is a complex pseudo instruction which has both tied defs and a regmask operand. The basic FastRA algorithm is as follows:
1. Mark registers used by defs as free.
2. If the instruction has a regmask operand, displace clobbered registers according to the regmask.
3. Assign registers for use operands.
In the case of tied defs, step 1 is replaced with allocation of registers for them. But the regmask is still processed, which may displace already allocated registers. As a result, a tied use and def will get assigned to different registers. This patch makes FastRA process the instruction's RegMask (if any) when checking for physical register interference. That way tied operands won't get registers clobbered by the regmask. Reviewed By: arsenm, skatkov Differential Revision: https://reviews.llvm.org/D99284
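The interference check described above can be sketched as follows. This is a simplified model, not the code from the patch: `isClobbered` and `pickRegister` are hypothetical helpers, and the mask encoding (a set bit means the register is preserved, a clear bit means it is clobbered) is an assumption stated for the example. The point is that a register is rejected for a tied operand if the instruction's regmask would clobber it:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed encoding: bit set = register preserved, bit clear = clobbered.
bool isClobbered(const std::vector<uint32_t> &Mask, unsigned PhysReg) {
  return (Mask[PhysReg / 32] & (1u << (PhysReg % 32))) == 0;
}

// Picks the lowest-numbered register that is neither in use nor clobbered
// by the instruction's regmask (if it has one). Returns -1 if none fits.
int pickRegister(const std::vector<bool> &InUse,
                 const std::vector<uint32_t> *Mask) {
  for (unsigned Reg = 0; Reg < InUse.size(); ++Reg) {
    if (InUse[Reg])
      continue; // interferes with an existing assignment
    if (Mask && isClobbered(*Mask, Reg))
      continue; // would be clobbered by the regmask
    return static_cast<int>(Reg);
  }
  return -1;
}
```

Without the mask check, the allocator in this model would hand out register 0 even when the mask clobbers it, which is the kind of mismatch between tied use and def that the commit describes.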
2021-04-29Revert "RegAlloc: do not consider liveins to EH-pad successors as liveout."Tim Northover1-6/+1
Some liveins *can* come from this block (e.g. any SSA value except the call), it's only the ones that produce `landingpad` values that can't and I didn't think it through properly.
2021-04-29RegAlloc: do not consider liveins to EH-pad successors as liveout.Tim Northover1-1/+6
These registers get defined by the runtime, not the block being allocated, and treating them as preassigned in RegAllocFast adds extra pressure, sometimes enough to make the function unallocatable.
2021-03-10[DebugInfo] Handle DBG_VALUES with multiple variable location operands in MIRStephen Tozer1-23/+25
This patch adds handling for DBG_VALUE_LIST in the MIR-passes (after finalize-isel), excluding the debug liveness passes and DWARF emission. This most significantly affects MachineSink, which now needs to consider all used registers of a debug value when sinking, but for most passes this change is simply replacing getDebugOperand(0) with an iteration over all debug operands. Differential Revision: https://reviews.llvm.org/D92578
2021-03-05Reapply "[DebugInfo] Add new instruction and DIExpression operator for variadic debug values"Stephen Tozer1-38/+50
Rewrites test to use correct architecture triple; fixes incorrect reference in SourceLevelDebugging doc; simplifies `spillReg` behaviour so as to not be dependent on changes elsewhere in the patch stack. This reverts commit d2000b45d033c06dc7973f59909a0ad12887ff51.
2021-03-04Revert "[DebugInfo] Add new instruction and DIExpression operator for variadic debug values"Stephen Tozer1-52/+39
This reverts commit d07f106f4a48b6e941266525b6f7177834d7b74e.
2021-03-04[DebugInfo] Add new instruction and DIExpression operator for variadic debug valuesgbtozers1-39/+52
This patch adds a new instruction that can represent variadic debug values, DBG_VALUE_VAR. This patch alone covers the addition of the instruction and a set of basic code changes in MachineInstr and a few adjacent areas, but does not correctly handle variadic debug values outside of these areas, nor does it generate them at any point. The new instruction is similar to the existing DBG_VALUE instruction, with the following differences: the operands are in a different order, any number of values may be used in the instruction following the Variable and Expression operands (these are referred to in code as “debug operands”) and are indexed from 0 so that getDebugOperand(X) == getOperand(X+2), and the Expression in a DBG_VALUE_VAR must use the DW_OP_LLVM_arg operator to pass arguments into the expression. The new DW_OP_LLVM_arg operator is only valid in expressions appearing in a DBG_VALUE_VAR; it takes a single argument and pushes the debug operand at the index given by the argument onto the Expression stack. For example the sub-expression `DW_OP_LLVM_arg, 0` has the meaning “Push the debug operand at index 0 onto the expression stack.” Differential Revision: https://reviews.llvm.org/D82363
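To make the stack behaviour of DW_OP_LLVM_arg concrete, here is a toy evaluator. It is only a sketch under stated assumptions: the `Op` enum, `Elem` struct, and `evaluate` function are hypothetical and model just three operators (an argument push, a constant push, and an addition), not LLVM's DIExpression machinery. `Op::Arg` with value N pushes the debug operand at index N, as the commit message describes:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified expression elements; not LLVM's DIExpression.
enum class Op { Arg, ConstU, Plus };

struct Elem {
  Op Kind;
  uint64_t Value; // operand index for Arg, literal for ConstU, unused for Plus
};

// Evaluates Expr over a stack. DebugOperands are the values that follow the
// Variable and Expression operands of the instruction, indexed from 0.
uint64_t evaluate(const std::vector<Elem> &Expr,
                  const std::vector<uint64_t> &DebugOperands) {
  std::vector<uint64_t> Stack;
  for (const Elem &E : Expr) {
    switch (E.Kind) {
    case Op::Arg: // push the debug operand at the given index
      assert(E.Value < DebugOperands.size());
      Stack.push_back(DebugOperands[E.Value]);
      break;
    case Op::ConstU: // push a literal
      Stack.push_back(E.Value);
      break;
    case Op::Plus: { // pop two entries, push their sum
      assert(Stack.size() >= 2);
      uint64_t Rhs = Stack.back();
      Stack.pop_back();
      Stack.back() += Rhs;
      break;
    }
    }
  }
  assert(!Stack.empty());
  return Stack.back();
}
```

In this model, the expression `Arg 0, Arg 1, Plus` combines two debug operands into one value, which is something a single-location DBG_VALUE cannot express.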
2021-02-17[CodeGen] Use range-based for loops (NFC)Kazu Hirata1-3/+2
2021-01-14[CodeGen, Transforms] Use llvm::sort (NFC)Kazu Hirata1-2/+1
2020-12-21[FastRA] Fix handling of bundled MIsPushpinder Singh1-0/+43
The fast register allocator skips bundled MIs, as the main assignment loop uses MachineBasicBlock::iterator (= MachineInstrBundleIterator). This was causing a crash in SIInsertWaitcnts, which expects all instructions to have registers assigned. This patch makes sure everything inside a bundle receives the same assignments as the BUNDLE header. Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D90369
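The propagation step can be pictured with a small model. The `Operand` and `Instr` structs and the `assignBundle` function are hypothetical simplifications, not the MachineInstr API. The header's virtual-to-physical assignments are collected and then copied onto matching operands of the instructions inside the bundle:

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical simplified model; not LLVM's MachineInstr/MachineOperand.
struct Operand {
  int VirtReg;
  int PhysReg; // negative means "not assigned yet"
};

struct Instr {
  std::vector<Operand> Ops;
};

// Copies the assignments made on the bundle header to every instruction
// inside the bundle. Operands the header does not mention stay untouched.
void assignBundle(const Instr &Header, std::vector<Instr> &Bundled) {
  std::map<int, int> Assignment; // virtual register -> physical register
  for (const Operand &Op : Header.Ops)
    if (Op.PhysReg >= 0)
      Assignment[Op.VirtReg] = Op.PhysReg;

  for (Instr &MI : Bundled)
    for (Operand &Op : MI.Ops) {
      auto It = Assignment.find(Op.VirtReg);
      if (It != Assignment.end())
        Op.PhysReg = It->second;
    }
}
```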
2020-12-02[NFC][MC] TargetRegisterInfo::getSubReg is a MCRegister.Mircea Trofin1-1/+1
Typing the API appropriately. Differential Revision: https://reviews.llvm.org/D92341
2020-11-02[NFC][regalloc] Use MCRegister appropriatelyMircea Trofin1-2/+2
Differential Revision: https://reviews.llvm.org/D90506
2020-10-28RegAlloc: Clear isSSAMatt Arsenault1-0/+5
The MIR parser may infer SSA, so -run-pass=regallocgreedy would hit a verifier error after multiple vreg defs are added.
2020-10-24Remove unused verifyRegStateMapping() function in RegAllocFast (NFC)Mehdi Amini1-14/+0
This fixes a compiler warning when building with assertions.
2020-09-30RegAllocFast: Add extra DBG_VALUE for live out spillsMatt Arsenault1-3/+16
This allows LiveDebugValues to insert the proper DBG_VALUEs in live out blocks if a spill is inserted before the use of a register. Previously, this would see the register use as the last DBG_VALUE, even though the stack slot should be treated as the live out value. This avoids an lldb test regression when D52010 is re-applied.
2020-09-30Reapply "RegAllocFast: Rewrite and improve"Matt Arsenault1-547/+725
This reverts commit 73a6a164b84a8195defbb8f5eeb6faecfc478ad4.
2020-09-22Revert "Reapply Revert "RegAllocFast: Rewrite and improve""Muhammad Omair Javaid1-725/+547
This reverts commit 55f9f87da2c2ad791b9e62cccb1c035e037444fa. Breaks following buildbots: http://lab.llvm.org:8011/builders/lldb-arm-ubuntu/builds/4306 http://lab.llvm.org:8011/builders/lldb-aarch64-ubuntu/builds/9154
2020-09-21Reapply Revert "RegAllocFast: Rewrite and improve"Matt Arsenault1-547/+725
This reverts commit dbd53a1f0c939a55e7719c39d08179468f9ad3dc. Needed lldb test updates
2020-09-18Temporarily Revert "RegAllocFast: Rewrite and improve"Eric Christopher1-725/+547
as it's breaking a few tests in the lldb test suite. Bot: http://lab.llvm.org:8011/builders/lldb-arm-ubuntu/builds/4226/steps/test/logs/stdio This reverts commit c8757ff3aa7dd7a25a6343f6ef74a70c7be04325.
2020-09-18RegAllocFast: Rewrite and improveMatt Arsenault1-547/+725
This rewrites big parts of the fast register allocator. The basic strategy of doing block-local allocation hasn't changed but I tweaked several details:
- Track register state on register units instead of physical registers. This simplifies and speeds up handling of register aliases.
- Process basic blocks in reverse order: definitions are known to end register lifetimes when walking backwards (by contrast, when walking forward a use may or may not be a kill, so heuristics are needed).
- Check register mask operands (calls) instead of conservatively assuming everything is clobbered.
- Enhance heuristics to detect killing uses: in case of a small number of defs/uses, check if they are all in the same basic block and if so the last one is a killing use.
- Enhance heuristic for copy-coalescing through hinting: we check the first k defs of a register for COPYs rather than relying on there just being a single definition.
When testing this on the full llvm test-suite including SPEC externals I measured:
- average 5.1% reduction in code size for X86, 4.9% reduction in code size on aarch64 (ranging between 0% and 20% depending on the test)
- 0.5% faster compile time (some analysis suggests the pass is slightly slower than before, but we more than make up for it because later passes are faster with the reduced instruction count)
Also adds a few testcases that were broken without this patch, in particular bug 47278. Patch mostly by Matthias Braun
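The benefit of walking a block in reverse can be shown with a short sketch. It assumes a purely block-local model with no live-out registers, and the `Instr` struct and `findKills` function are hypothetical names for illustration only. Walking backwards, the first use of a register that is encountered is necessarily its last use in program order, and a definition ends the live range, so no heuristic is needed to find kills:

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical model: each instruction defines at most one virtual register
// (Def < 0 means none) and uses any number of them.
struct Instr {
  int Def;
  std::vector<int> Uses;
};

// For each instruction, returns the used registers for which that
// instruction is the last use in the block. Assumes nothing is live-out.
std::vector<std::vector<int>> findKills(const std::vector<Instr> &Block) {
  std::set<int> Live;
  std::vector<std::vector<int>> Kills(Block.size());
  for (std::size_t I = Block.size(); I-- > 0;) {
    const Instr &MI = Block[I];
    if (MI.Def >= 0)
      Live.erase(MI.Def); // walking backwards, a def ends the live range
    for (int Use : MI.Uses)
      if (Live.insert(Use).second) // first time seen from the bottom
        Kills[I].push_back(Use);   // so this is the last use: a kill
  }
  return Kills;
}
```

A forward walk would have to guess, at each use, whether a later use exists; the backward walk has already seen every later instruction by the time it reaches a use.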
2020-09-18Reapply "RegAllocFast: Record internal state based on register units"Matt Arsenault1-135/+82
The regressions this caused should be fixed when https://reviews.llvm.org/D52010 is applied. This reverts commit a21387c65470417c58021f8d3194a4510bb64f46.
2020-09-16RegAllocFast: Make self loop live-out heuristic more aggressiveMatt Arsenault1-4/+33
This currently has no impact on code, but prevents sizeable code size regressions after D52010. This prevents spilling and reloading all values inside blocks that loop back. Add a baseline test which would regress without this patch.
2020-09-15Revert "RegAllocFast: Record internal state based on register units"Hans Wennborg1-82/+135
This seems to have caused incorrect register allocation in some cases, breaking tests in the Zig standard library (PR47278). As discussed on the bug, revert back to green for now.
> Record internal state based on register units. This is often more
> efficient as there are typically fewer register units to update
> compared to iterating over all the aliases of a register.
>
> Original patch by Matthias Braun, but I've been rebasing and fixing it
> for almost 2 years and fixed a few bugs causing intermediate failures
> to make this patch independent of the changes in
> https://reviews.llvm.org/D52010.
This reverts commit 66251f7e1de79a7c1620659b7f58352b8c8e892e, and follow-ups 931a68f26b9a3de853807ffad7b2cd0a2dd30922 and 0671a4c5087d40450603d9d26cf239f1a8b1367e. It also adjusts some test expectations.
2020-09-11RegAllocFast: Fix typo in commentMatt Arsenault1-2/+2
2020-06-22[DebugInfo] Update MachineInstr to help support variadic DBG_VALUE instructionsstozer1-1/+1
Following on from this RFC[0] from a while back, this is the first patch towards implementing variadic debug values. This patch specifically adds a set of functions to MachineInstr for performing operations specific to debug values, and replacing uses of the more general functions where appropriate. The most prevalent of these is replacing getOperand(0) with getDebugOperand(0) for debug-value-specific code, as the operands corresponding to values will no longer be at index 0, but index 2 and upwards: getDebugOperand(x) == getOperand(x+2). Similar replacements have been added for the other operands, along with some helper functions to replace oft-repeated code and operate on a variable number of value operands. [0] http://lists.llvm.org/pipermail/llvm-dev/2020-February/139376.html Differential Revision: https://reviews.llvm.org/D81852
2020-06-10RegAllocFast: Avoid unused method warning in release buildsMatt Arsenault1-0/+5
2020-06-04RegAllocFast: Remove dead codeMatt Arsenault1-10/+0
2020-06-03RegAllocFast: Record internal state based on register unitsMatt Arsenault1-135/+87
Record internal state based on register units. This is often more efficient as there are typically fewer register units to update compared to iterating over all the aliases of a register. Original patch by Matthias Braun, but I've been rebasing and fixing it for almost 2 years and fixed a few bugs causing intermediate failures to make this patch independent of the changes in https://reviews.llvm.org/D52010.
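A minimal sketch of unit-based tracking, assuming a made-up register file: the `UnitTracker` class and the unit numbering are hypothetical, and only the fact that AL and AH are sub-registers of AX on x86 is taken as given. State is stored per unit, so marking one register busy automatically affects every register that shares a unit with it, without walking an alias list:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of per-unit state tracking; not RegAllocFast's code.
class UnitTracker {
  std::map<std::string, std::vector<unsigned>> UnitsOf; // register -> units
  std::vector<bool> Busy;                               // state per unit

public:
  explicit UnitTracker(unsigned NumUnits) : Busy(NumUnits, false) {}

  void defineRegister(const std::string &Reg, std::vector<unsigned> Units) {
    UnitsOf[Reg] = std::move(Units);
  }

  // A register is free only if every one of its units is free.
  bool isFree(const std::string &Reg) const {
    for (unsigned Unit : UnitsOf.at(Reg))
      if (Busy[Unit])
        return false;
    return true;
  }

  // Updates only the register's own units; aliases see the change for free.
  void setBusy(const std::string &Reg, bool Value) {
    for (unsigned Unit : UnitsOf.at(Reg))
      Busy[Unit] = Value;
  }
};
```

With AL on unit 0, AH on unit 1, and AX covering both, marking AL busy touches a single unit yet makes AX report as unavailable while AH stays free.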
2020-04-02[Alignment][NFC] Use more Align versions of various functionsGuillaume Chatelet1-2/+2
Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: MatzeB, qcolombet, arsenm, sdardis, jvesely, nhaehnle, hiraditya, jrtc27, atanasyan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77291
2020-01-01[NFC] Fixes -Wrange-loop-analysis warningsMark de Wever1-1/+1
This avoids new warnings caused by D68912, which adds -Wrange-loop-analysis to -Wall. Differential Revision: https://reviews.llvm.org/D71857
2019-11-13Sink all InitializePasses.h includesReid Kleckner1-0/+1
This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it causes lots of recompilation. I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout:
  recompiles  touches  affected_files  header
      342380       95            3604  llvm/include/llvm/ADT/STLExtras.h
      314730      234            1345  llvm/include/llvm/InitializePasses.h
      307036      118            2602  llvm/include/llvm/ADT/APInt.h
      213049       59            3611  llvm/include/llvm/Support/MathExtras.h
      170422       47            3626  llvm/include/llvm/Support/Compiler.h
      162225       45            3605  llvm/include/llvm/ADT/Optional.h
      158319       63            2513  llvm/include/llvm/ADT/Triple.h
      140322       39            3598  llvm/include/llvm/ADT/StringRef.h
      137647       59            2333  llvm/include/llvm/Support/Error.h
      131619       73            1803  llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild. Reviewers: bkramer, asbirlea, bollu, jdoerfert Differential Revision: https://reviews.llvm.org/D70211
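The ranking metric in the table is simply touches multiplied by affected files. The sketch below recomputes it for a few rows; `HeaderStats`, `recompileCost`, and `sortByCost` are hypothetical names, and the numbers are copied from the commit message rather than measured:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record for one row of the table.
struct HeaderStats {
  std::string Path;
  long Touches;       // times the header changed in the sampled history
  long AffectedFiles; // object files that depend on the header
};

// Estimated recompiles: every touch rebuilds every dependent object file.
long recompileCost(const HeaderStats &H) { return H.Touches * H.AffectedFiles; }

// Sorts headers from most to least expensive.
void sortByCost(std::vector<HeaderStats> &Headers) {
  std::sort(Headers.begin(), Headers.end(),
            [](const HeaderStats &A, const HeaderStats &B) {
              return recompileCost(A) > recompileCost(B);
            });
}
```

InitializePasses.h has fewer dependents than STLExtras.h (1345 versus 3604) but changes far more often (234 versus 95 touches), which is why it still ranks second.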
2019-10-30RegAllocFast: Use RegisterMatt Arsenault1-69/+69