path: root/llvm/lib/CodeGen/PrologEpilogInserter.cpp
Age | Commit message | Author | Files | Lines
11 days | [MIR] Support save/restore points with independent sets of registers (#119358) | Elizaveta Noskova | 1 | -8/+26
This patch adds MIR parsing and serialization support for save and restore points with subsets of callee-saved registers. That is, it syntactically allows a function to contain two or more distinct sub-regions in which distinct subsets of registers are spilled/filled as callee-saved. This is useful if, for example, one of the CSRs isn't modified in one of the sub-regions but is in the other(s). Support for actually using this capability in code generation is still forthcoming. This patch is the next logical step for multiple save/restore points support. All points are now stored in a DenseMap from MBB to a vector of CalleeSavedInfo.
Shrink-Wrap points split, Part 4.
RFC: https://discourse.llvm.org/t/shrink-wrap-save-restore-points-splitting/83581
Part 1: https://github.com/llvm/llvm-project/pull/117862 (landed)
Part 2: https://github.com/llvm/llvm-project/pull/119355 (landed)
Part 3: https://github.com/llvm/llvm-project/pull/119357 (landed)
Part 5: https://github.com/llvm/llvm-project/pull/119359 (likely to be further split)
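The storage shape described above (a map from basic block to a list of callee-saved entries) can be sketched as follows. This is only an illustration: `std::map` stands in for `DenseMap`, a string stands in for the basic block, and `CalleeSavedInfoSketch`, `SaveRestorePointsSketch` and `registersSavedAt` are hypothetical names, not LLVM types or functions.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a callee-saved entry: a register and its stack slot.
struct CalleeSavedInfoSketch {
  unsigned Reg;
  int FrameIdx;
};

// Hypothetical stand-in for the per-block map: block name -> registers handled there.
using SaveRestorePointsSketch =
    std::map<std::string, std::vector<CalleeSavedInfoSketch>>;

// Returns the registers saved at the given block, or an empty list if the
// block is not a save point.
std::vector<unsigned> registersSavedAt(const SaveRestorePointsSketch &Points,
                                       const std::string &Block) {
  std::vector<unsigned> Regs;
  auto It = Points.find(Block);
  if (It == Points.end())
    return Regs;
  for (const CalleeSavedInfoSketch &CSI : It->second)
    Regs.push_back(CSI.Reg);
  return Regs;
}
```

With this layout, two save points can each carry a different subset of registers, which is the property the patch makes expressible in MIR.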
2025-08-12 | [llvm] Support multiple save/restore points in mir (#119357) | Elizaveta Noskova | 1 | -9/+18
Currently MIR supports only one save point and one restore point:
```
savePoint: '%bb.1'
restorePoint: '%bb.2'
```
This patch makes it possible to specify multiple save and restore points in MIR:
```
savePoints:
  - point: '%bb.1'
restorePoints:
  - point: '%bb.2'
```
Shrink-Wrap points split, Part 3.
RFC: https://discourse.llvm.org/t/shrink-wrap-save-restore-points-splitting/83581
Part 1: https://github.com/llvm/llvm-project/pull/117862
Part 2: https://github.com/llvm/llvm-project/pull/119355
Part 4: https://github.com/llvm/llvm-project/pull/119358
Part 5: https://github.com/llvm/llvm-project/pull/119359
2025-08-11 | Reapply "[X86] Correct 32-bit immediate assertion and fix 64-bit lowering for huge frame offsets" (#152239) | Wesley Wiser | 1 | -1/+1
The first commit is identical to 69bec0afbb8f2aa0021d18ea38768360b16583a9. The second commit fixes the instruction verification failures by replacing the erroneous instruction with a trap after the error is reported and adds `-verify-machineinstrs` to the tests added in the original PR to catch the issue sooner. After that change, all tests pass with both `LLVM_ENABLE_EXPENSIVE_CHECKS={On,Off}`. cc @RKSimon @e-kud @phoebewang @arsenm as reviewers on the original PR
2025-08-04 | Revert "[X86] Correct 32-bit immediate assertion and fix 64-bit lowering for huge frame offsets" (#151975) | Simon Pilgrim | 1 | -1/+1
Reverts llvm/llvm-project#123872 - this is breaking on EXPENSIVE_CHECKS builds Co-authored-by: Abhishek Kaushik <abhishek.kaushik@intel.com>
2025-08-03 | [X86] Correct 32-bit immediate assertion and fix 64-bit lowering for huge frame offsets (#123872) | Wesley Wiser | 1 | -1/+1
The assertion previously did not work correctly because the operand was being truncated to an `int` prior to comparison. Change the assertion into a reported error as suggested in https://github.com/llvm/llvm-project/pull/101840#issuecomment-2304992425 by @arsenm. Finally, fix the lowering on 64-bit targets so that offsets larger than 32 bits are correctly addressed, and add tests for various reported issues.
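Why a range check stops working once the value has been narrowed can be shown with a small sketch. The function names here are hypothetical and this is not the X86 backend code; it also assumes a platform where `int` is 32 bits wide.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Buggy shape: the 64-bit offset is narrowed to int first, so the range
// check only ever sees a value that already fits. With a 32-bit int this
// returns true for every input.
bool fitsIn32BitsTruncating(std::int64_t Offset) {
  int Narrowed = static_cast<int>(Offset);
  return Narrowed >= std::numeric_limits<std::int32_t>::min() &&
         Narrowed <= std::numeric_limits<std::int32_t>::max();
}

// Checking at full width: the comparison happens before any narrowing.
bool fitsIn32Bits(std::int64_t Offset) {
  return Offset >= std::numeric_limits<std::int32_t>::min() &&
         Offset <= std::numeric_limits<std::int32_t>::max();
}
```

An offset of 2^32 passes the first check and fails the second, which is the kind of silent acceptance the commit message describes.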
2025-05-24 | [CodeGen] Remove unused includes (NFC) (#141320) | Kazu Hirata | 1 | -1/+0
These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.
2025-05-22 | [LLVM][CodeGen] Add convenience accessors for MachineFunctionProperties (#140002) | Rahul Joshi | 1 | -2/+1
Add per-property has<Prop>/set<Prop>/reset<Prop> functions to MachineFunctionProperties.
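The has/set/reset accessor pattern can be sketched on a small bitset-backed class. `FunctionPropertiesSketch` is a hypothetical class written for illustration and covers only two example properties; it is not the MachineFunctionProperties implementation.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

class FunctionPropertiesSketch {
public:
  enum class Property : unsigned { NoVRegs, NoPHIs, Count };

  // Generic accessors keyed by the enum.
  bool hasProperty(Property P) const { return Bits.test(index(P)); }
  FunctionPropertiesSketch &set(Property P) {
    Bits.set(index(P));
    return *this;
  }
  FunctionPropertiesSketch &reset(Property P) {
    Bits.reset(index(P));
    return *this;
  }

  // Per-property convenience accessors, one trio per property.
  bool hasNoVRegs() const { return hasProperty(Property::NoVRegs); }
  FunctionPropertiesSketch &setNoVRegs() { return set(Property::NoVRegs); }
  FunctionPropertiesSketch &resetNoVRegs() { return reset(Property::NoVRegs); }

  bool hasNoPHIs() const { return hasProperty(Property::NoPHIs); }
  FunctionPropertiesSketch &setNoPHIs() { return set(Property::NoPHIs); }
  FunctionPropertiesSketch &resetNoPHIs() { return reset(Property::NoPHIs); }

private:
  static std::size_t index(Property P) {
    assert(P != Property::Count && "Count is a sentinel, not a property");
    return static_cast<std::size_t>(P);
  }

  std::bitset<static_cast<std::size_t>(Property::Count)> Bits;
};
```

Call sites can then write `Props.hasNoVRegs()` instead of spelling out the enum value each time.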
2025-04-29 | [CodeGen][NewPM] Port "PrologEpilogInserter" to NPM (#130550) | Vikram Hegde | 1 | -40/+65
2025-04-25 | Reland [AMDGPU] Support block load/store for CSR #130013 (#137169) | Diana Picus | 1 | -29/+6
Add support for using the existing SCRATCH_STORE_BLOCK and SCRATCH_LOAD_BLOCK instructions for saving and restoring callee-saved VGPRs. This is controlled by a new subtarget feature, block-vgpr-csr. It does not include WWM registers - those will be saved and restored individually, just like before. This patch does not change the ABI. Use of this feature may lead to slightly increased stack usage, because the memory is not compacted if certain registers don't have to be transferred (this will happen in practice for calling conventions where the callee and caller saved registers are interleaved in groups of 8). However, if the registers at the end of the block of 32 don't have to be transferred, we don't need to use a whole 128-byte stack slot - we can trim some space off the end of the range. In order to implement this feature, we need to rely less on the target-independent code in the PrologEpilogInserter, so we override several new methods in SIFrameLowering. We also add new pseudos, SI_BLOCK_SPILL_V1024_SAVE/RESTORE. One peculiarity is that both the SI_BLOCK_V1024_RESTORE pseudo and the SCRATCH_LOAD_BLOCK instructions will have all the registers that are not transferred added as implicit uses. This is done in order to inform LiveRegUnits that those registers are not available before the restore (since we're not really restoring them - so we can't afford to scavenge them). Unfortunately, this trick doesn't work with the save, so before the save all the registers in the block will be unavailable (see the unit test). This was reverted due to failures in the builds with expensive checks on, now fixed by always updating LiveIntervals and SlotIndexes in SILowerSGPRSpills.
2025-04-23 | Revert "[AMDGPU] Support block load/store for CSR" (#136846) | Diana Picus | 1 | -6/+29
Reverts llvm/llvm-project#130013 due to failures with expensive checks on.
2025-04-23 | [AMDGPU] Support block load/store for CSR (#130013) | Diana Picus | 1 | -29/+6
Add support for using the existing `SCRATCH_STORE_BLOCK` and `SCRATCH_LOAD_BLOCK` instructions for saving and restoring callee-saved VGPRs. This is controlled by a new subtarget feature, `block-vgpr-csr`. It does not include WWM registers - those will be saved and restored individually, just like before. This patch does not change the ABI. Use of this feature may lead to slightly increased stack usage, because the memory is not compacted if certain registers don't have to be transferred (this will happen in practice for calling conventions where the callee and caller saved registers are interleaved in groups of 8). However, if the registers at the end of the block of 32 don't have to be transferred, we don't need to use a whole 128-byte stack slot - we can trim some space off the end of the range. In order to implement this feature, we need to rely less on the target-independent code in the PrologEpilogInserter, so we override several new methods in `SIFrameLowering`. We also add new pseudos, `SI_BLOCK_SPILL_V1024_SAVE/RESTORE`. One peculiarity is that both the SI_BLOCK_V1024_RESTORE pseudo and the SCRATCH_LOAD_BLOCK instructions will have all the registers that are not transferred added as implicit uses. This is done in order to inform LiveRegUnits that those registers are not available before the restore (since we're not really restoring them - so we can't afford to scavenge them). Unfortunately, this trick doesn't work with the save, so before the save all the registers in the block will be unavailable (see the unit test).
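The stack-usage trade-off described in this message can be worked through with a little arithmetic. The sketch below assumes 4 bytes per register (128 bytes divided by the 32 registers in a block) and uses a hypothetical helper, `blockSlotSizeInBytes`; it is not code from SIFrameLowering.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: a 128-byte slot for a block of 32 registers means 4 bytes each.
constexpr unsigned BytesPerRegisterSketch = 4;

// Bit I of TransferMask is set when register I of the block must be
// saved/restored. Gaps in the middle are not compacted, but untransferred
// registers at the end of the block can be trimmed off the slot.
unsigned blockSlotSizeInBytes(std::uint32_t TransferMask) {
  unsigned RegistersCovered = 0;
  for (unsigned I = 0; I < 32; ++I)
    if (TransferMask & (std::uint32_t{1} << I))
      RegistersCovered = I + 1;
  return RegistersCovered * BytesPerRegisterSketch;
}
```

For a mask such as `0x00FF00FF` (registers interleaved in groups of 8) the slot still spans 24 registers, so it takes 96 bytes even though only 16 registers are transferred.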
2025-03-02 | [CodeGen] Use MCRegister in CalleeSavedInfo. NFC | Craig Topper | 1 | -1/+1
2025-02-20 | [FrameLowering] Use MCRegister instead of Register in CalleeSavedInfo. NFC (#128095) | Craig Topper | 1 | -4/+4
Callee saved registers should always be physical registers. They are often passed directly to other functions that take MCRegister like getMinimalPhysRegClass or TargetRegisterClass::contains. Unfortunately, sometimes the MCRegister is compared to a Register which gave an ambiguous comparison error when the MCRegister is on the LHS. Adding a MCRegister==Register comparison operator created more ambiguous comparison errors elsewhere. These cases were usually comparing against a base or frame pointer register that is a physical register in a Register. For those I added an explicit conversion of Register to MCRegister to fix the error.
2025-01-27 | [CodeGen] Avoid repeated hash lookups (NFC) (#124506) | Kazu Hirata | 1 | -2/+2
2025-01-19 | [CodeGen] Remove some implicit conversions of MCRegister to unsigned by using(). NFC | Craig Topper | 1 | -6/+6
Many of these are indexing BitVectors or something where we can't use MCRegister and need the register number.
2024-11-12 | [CodeGen] Remove unused includes (NFC) (#115996) | Kazu Hirata | 1 | -3/+0
Identified with misc-include-cleaner.
2024-08-21 | Revert "[LLVM] [X86] Fix integer overflows in frame layout for huge frames (#101840)" | Hans Wennborg | 1 | -1/+1
This causes assertion failures targeting 32-bit x86:
lib/Target/X86/X86RegisterInfo.cpp:989: virtual bool llvm::X86RegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator, int, unsigned int, RegScavenger *) const: Assertion `(Is64Bit || FitsIn32Bits) && "Requesting 64-bit offset in 32-bit immediate!"' failed.
See comment on the PR.
> Fix 32-bit integer overflows in the X86 target frame layout when dealing
> with frames larger than 4gb. When this occurs, we'll scavenge a scratch
> register to be able to hold the correct stack offset for frame locals.
>
> This completes reapplying #84114.
>
> Fixes #48911
> Fixes #75944
> Fixes #87154
This reverts commit 0abb7791614947bc24931dd851ade31d02496977.
2024-08-19 | [LLVM] [X86] Fix integer overflows in frame layout for huge frames (#101840) | Wesley Wiser | 1 | -1/+1
Fix 32-bit integer overflows in the X86 target frame layout when dealing with frames larger than 4gb. When this occurs, we'll scavenge a scratch register to be able to hold the correct stack offset for frame locals. This completes reapplying #84114. Fixes #48911 Fixes #75944 Fixes #87154
2024-08-06 | Spill/restore FP/BP around instructions in which they are clobbered (#81048) | weiguozhi | 1 | -0/+5
This patch fixes https://github.com/llvm/llvm-project/issues/17204. If a base pointer is used in a function and it is clobbered by an instruction (typically inline asm), the current register allocator can't handle the situation, so BP holds garbage after those instructions. In theory the same can happen to FP. We can spill and reload the FP/BP registers around those instructions. But normal spill/reload instructions also use FP/BP, so we can't spill them into normal spill slots; instead we spill them to the top of the stack using the SP register.
2024-08-06 | [AArch64] Add streaming-mode stack hazard optimization remarks (#101695) | Hari Limaye | 1 | -0/+3
Emit an optimization remark when objects in the stack frame may cause hazards in a streaming mode function. The analysis requires either the `aarch64-stack-hazard-size` or `aarch64-stack-hazard-remark-size` flag to be set by the user, with the former flag taking precedence.
2024-07-23 | [LLVM] [MC] Update frame layout & CFI generation to handle frames larger than 2gb (#99263) | Wesley Wiser | 1 | -2/+2
Rebase of #84114. I've only included the core changes to frame layout calculation & CFI generation which sidesteps the regressions found after merging #84114. Since these changes are a necessary precursor to the overall fix and are themselves slightly beneficial as CFI is now generated correctly, I think it is reasonable to merge this first step.
---
For very large stack frames, the offset from the stack pointer to a local can be more than 2^31 which overflows various `int` offsets in the frame lowering code. This patch updates the frame lowering code to calculate the offsets as 64-bit values and fixes CFI to use the corrected sizes. After this patch, additional work is needed to fix offset truncations in each target's codegen.
2024-07-09 | [CodeGen][NewPM] Port `machine-loops` to new pass manager (#97793) | paperchalice | 1 | -2/+2
- Add `MachineLoopAnalysis`.
- Add `MachineLoopPrinterPass`.
- Convert to `MachineLoopInfoWrapperPass` in legacy pass manager.
2024-06-11 | [CodeGen][NewPM] Split `MachineDominatorTree` into a concrete analysis result (#94571) | paperchalice | 1 | -2/+2
Prepare for new pass manager version of `MachineDominatorTreeAnalysis`. We may need a machine dominator tree version of `DomTreeUpdater` to handle `SplitCriticalEdge` in some CodeGen passes.
2024-05-15 | Fix typo "indicies" (#92232) | Jay Foad | 1 | -1/+1
2024-03-27 | Revert rG58de1e2c5eee548a9b365e3b1554d87317072ad9 "Fix stack layout for frames larger than 2gb (#84114)" | Simon Pilgrim | 1 | -2/+2
This is failing on some EXPENSIVE_CHECKS buildbots
2024-03-27 | Fix stack layout for frames larger than 2gb (#84114) | Wesley Wiser | 1 | -2/+2
For very large stack frames, the offset from the stack pointer to a local can be more than 2^31 which overflows various `int` offsets in the frame lowering code. This patch updates the frame lowering code to calculate the offsets as 64-bit values and resolves the overflows, resulting in the correct codegen for very large frames. Fixes #48911
2024-03-20 | Revert "Move assertion for AdjustsStack from PEI to MachineVerifier. (#85698)" | Jonas Paulsson | 1 | -0/+2
This reverts commit 05bde30585710a51592eee0a6cf6df8184d09c92. Reverting due to verifier complaints with expensive checks on build-bot.
2024-03-20 | Move assertion for AdjustsStack from PEI to MachineVerifier. (#85698) | Jonas Paulsson | 1 | -2/+0
Have the verifier report a missing AdjustsStack flag rather than waiting until PEI asserts.
2024-03-18 | [CodeGen] Fix -Wunused-variable in PrologEpilogInserter.cpp (NFC) | Jie Fu | 1 | -1/+1
llvm-project/llvm/lib/CodeGen/PrologEpilogInserter.cpp:369:12: error: unused variable 'MaxCFSIn' [-Werror,-Wunused-variable]
  uint32_t MaxCFSIn =
           ^
1 error generated.
2024-03-18 | [MachineFrameInfo] Refactoring around computeMaxCallFrameSize() (NFC) (#78001) | Jonas Paulsson | 1 | -28/+12
- Use computeMaxCallFrameSize() in PEI::calculateCallFrameInfo() instead of duplicating the code.
- Set AdjustsStack in FinalizeISel instead of in computeMaxCallFrameSize().
2023-11-08 | [RegScavenger] Simplify state tracking for backwards scavenging (#71202) | Jay Foad | 1 | -6/+1
Track the live register state immediately before, instead of after, MBBI. This makes it simple to track the state at the start or end of a basic block without a separate (and poorly named) Tracking flag. This changes the API of the backward(MachineBasicBlock::iterator I) method, which now recedes to the state just before, instead of just after, *I. Some clients are simplified by this change. There is one small functional change shown in the lit tests where multiple spilled registers all need to be reloaded before the same instruction. The reloads will now be inserted in the opposite order. This should not affect correctness.
2023-10-22 | [CodeGen][Remarks] Add the function name to the stack size remark (#69346) | Jon Roelofs | 1 | -1/+3
It is already present in the yaml, but missing from the printed diagnostics.
2023-09-14 | [NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295) | Arthur Eubanks | 1 | -2/+2
This will make it easy for callers to see issues with and fix up calls to createTargetMachine after a future change to the params of TargetMachine. This matches other nearby enums. For downstream users, this should be a fairly straightforward replacement, e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive or s/CGFT_/CodeGenFileType::
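What moving to a scoped enum buys callers can be illustrated with a small stand-in. `CodeGenOptLevelSketch` and both helper functions are hypothetical and only mirror the naming in the message; the point is that a scoped enumerator must be qualified and does not convert to an integer implicitly, so a mismatched argument becomes a compile error rather than a silent conversion.

```cpp
#include <cassert>

// Hypothetical scoped enum; enumerators must be written as
// CodeGenOptLevelSketch::Aggressive and do not decay to int.
enum class CodeGenOptLevelSketch { None = 0, Less = 1, Default = 2, Aggressive = 3 };

// Converting to a number now requires an explicit cast.
int optLevelNumber(CodeGenOptLevelSketch Level) {
  return static_cast<int>(Level);
}

// Comparisons between values of the same scoped enum still work directly.
bool isOptimizing(CodeGenOptLevelSketch Level) {
  return Level != CodeGenOptLevelSketch::None;
}
```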
2023-09-08 | [PEI][PowerPC] Fix false alarm of stack size limit (#65559) | bzEq | 1 | -1/+1
PPC64 allows stack size up to ((2^63)-1) bytes. Currently llc reports
```
warning: stack frame size (4294967568) exceeds limit (4294967295) in function 'main'
```
if the stack allocated is larger than 4G.
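The false alarm comes down to which limit the frame size is compared against. The sketch below replays the numbers from the warning with a hypothetical helper and two hypothetical constants; it is not the check in PrologEpilogInserter.cpp.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical limits: a 32-bit ceiling versus the (2^63)-1 bytes the
// message says PPC64 allows.
constexpr std::uint64_t Limit32Sketch = std::numeric_limits<std::uint32_t>::max();
constexpr std::uint64_t Limit63Sketch =
    static_cast<std::uint64_t>(std::numeric_limits<std::int64_t>::max());

// A warning is only warranted when the frame exceeds the target's real limit.
bool exceedsStackLimit(std::uint64_t StackSize, std::uint64_t Limit) {
  return StackSize > Limit;
}
```

The 4294967568-byte frame from the warning exceeds the 32-bit ceiling of 4294967295 but is far below the 64-bit limit, so with the larger limit no warning would be issued.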
2023-08-04 | [PEI] Remove support for register scavenging during forwards frame index elimination | Jay Foad | 1 | -10/+3
All targets that require register scavenging now use backwards frame index elimination. Differential Revision: https://reviews.llvm.org/D156986
2023-08-03 | [PEI] Switch to backwards frame index elimination by default | Jay Foad | 1 | -1/+1
Also rename the flag from supportsBackwardScavenger to eliminateFrameIndicesBackwards to reflect what it actually does. X86 is the only target still using forwards frame index elimination. This will not block removing support for forwards register scavenging, because X86 does not use the register scavenger. Differential Revision: https://reviews.llvm.org/D156983
2023-08-01 | [PEI][PowerPC] Switch to backwards frame index elimination | Jay Foad | 1 | -15/+17
This adds support for reprocessing new instructions that were generated by the target's eliminateFrameIndex. Backwards frame index elimination uses backwards register scavenging, which is preferred because it does not rely on accurate kill flags. Differential Revision: https://reviews.llvm.org/D156690
2023-07-28 | [PEI] Don't zero out noreg operands | Arthur Eubanks | 1 | -2/+7
A tail call may have $noreg operands. Fixes a crash. Reviewed By: xgupta Differential Revision: https://reviews.llvm.org/D156485
2023-07-28 | [PEI][ARM] Switch to backwards frame index elimination | Jay Foad | 1 | -18/+44
This adds better support for call frame pseudos that adjust SP in PEI::replaceFrameIndicesBackward. Running frame index elimination backwards is preferred because it can do backwards register scavenging (on targets that require scavenging) which does not rely on accurate kill flags. Differential Revision: https://reviews.llvm.org/D156434
2023-07-27 | [CodeGen] Store call frame size in MachineBasicBlock | Jay Foad | 1 | -30/+16
Record the call frame size on entry to each basic block. This is usually zero except when a basic block has been split in the middle of a call sequence. This simplifies PEI::replaceFrameIndices which previously had to visit basic blocks in a specific order and had special handling for unreachable blocks. More importantly it paves the way for an equally simple implementation of a backwards version of replaceFrameIndices, which is required to fully convert PrologEpilogInserter to backwards register scavenging, which is preferred because it does not rely on accurate kill flags. Differential Revision: https://reviews.llvm.org/D156113
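Why a per-block entry value removes the need to visit blocks in a particular order can be seen in a small model. `InstrSketch`, `BlockSketch` and `callFrameSizeBefore` are hypothetical names for illustration and are unrelated to the MachineBasicBlock API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction: +N for a call frame setup, -N for the matching
// destroy, 0 for everything else.
struct InstrSketch {
  int CallFrameDelta;
};

// Hypothetical block that records the call frame size on entry.
struct BlockSketch {
  int CallFrameSizeOnEntry;
  std::vector<InstrSketch> Instrs;
};

// The call frame size just before instruction Index depends only on the
// block itself, so blocks can be processed in any order, forwards or backwards.
int callFrameSizeBefore(const BlockSketch &MBB, std::size_t Index) {
  assert(Index <= MBB.Instrs.size() && "index past the end of the block");
  int Size = MBB.CallFrameSizeOnEntry;
  for (std::size_t I = 0; I < Index; ++I)
    Size += MBB.Instrs[I].CallFrameDelta;
  return Size;
}
```

A block that starts in the middle of a call sequence simply carries a nonzero entry value, where previously that value had to be propagated from its predecessors.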
2023-07-13 | Revert "[CodeGen] Store SP adjustment in MachineBasicBlock. NFCI." | Oliver Stannard | 1 | -10/+28
This reverts commit 58d1eaa3b6ce4f7285c51f83faff7a3ac374c746.
2023-07-12 | [CodeGen] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D154281 | Fangrui Song | 1 | -1/+1
2023-07-12 | [CodeGen] Store SP adjustment in MachineBasicBlock. NFCI. | Jay Foad | 1 | -28/+10
Record the SP adjustment on entry to each basic block. This is almost always zero except on targets like ARM which can split a basic block in the middle of a call sequence. This simplifies PEI::replaceFrameIndices which previously had to visit basic blocks in a specific order and had special handling for unreachable blocks. More importantly it paves the way for an equally simple implementation of a backwards version of replaceFrameIndices, which is required to fully convert PrologEpilogInserter to backwards register scavenging, which is preferred because it does not rely on accurate kill flags. Differential Revision: https://reviews.llvm.org/D154281
2023-07-07 | [PEI][Mips] Switch to backwards frame index elimination | Jay Foad | 1 | -5/+18
This adds support for running PEI::replaceFrameIndicesBackward with no RegisterScavenger, and basic support for eliminating call frame pseudo instructions. Differential Revision: https://reviews.llvm.org/D154347
2023-07-07 | [PEI] Simplify iterator handling in replaceFrameIndicesBackward. NFCI. | Jay Foad | 1 | -40/+8
Differential Revision: https://reviews.llvm.org/D154346
2023-07-05 | Weaken MFI Max Call Frame Size Assertion | Oskar Wirga | 1 | -2/+2
A year ago when I was not invested at all into compilers, I found an assertion error when building an AArch64 debug build with LTO + CFI, among other combinations. It was posted as a github issue here: https://github.com/llvm/llvm-project/issues/54088 I took it upon myself to revisit the issue now that I have spent some more time working on LLVM. Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D151276
2023-06-16 | [MC] Add MCRegisterInfo::regunits for iteration over register units | Sergei Barannikov | 1 | -2/+2
Reviewed By: foad Differential Revision: https://reviews.llvm.org/D152098
2023-05-09 | PrologEpilogInserter: Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds | Fangrui Song | 1 | -9/+9
2023-05-09 | Wrap debug code with the LLVM_DEBUG macro; NFC | Aaron Ballman | 1 | -5/+6
While investigating a bug in Clang, I noticed that -Wframe-larger-than was emitting extra debug information along with the diagnostic. It turns out that 2e1e2f52f357768186ecfcc5ac53d5fa53d1b094 fixed an issue with the diagnostic, but accidentally left in some debug code that was exposed in all builds. So now we no longer emit things like: 8/4294967304 (0.00%) spills, 4294967296/4294967304 (100.00%) variables along with the diagnostic
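The effect of guarding debug output behind a macro can be sketched with a simplified stand-in. `DEBUG_SKETCH`, `DebugEnabledSketch` and `reportFrame` are hypothetical and the message format is only illustrative; this is not how `LLVM_DEBUG` is implemented. It only shows that guarded code produces nothing unless debugging is switched on.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical runtime switch standing in for a debug flag.
static bool DebugEnabledSketch = false;

// Hypothetical guard macro: the wrapped statement runs only when enabled.
#define DEBUG_SKETCH(X)                                                        \
  do {                                                                         \
    if (DebugEnabledSketch) {                                                  \
      X;                                                                       \
    }                                                                          \
  } while (false)

// Builds the diagnostic text; the spill breakdown is debug-only detail.
std::string reportFrame(std::uint64_t Spills, std::uint64_t Total) {
  std::ostringstream OS;
  OS << "stack frame size (" << Total << ")";
  DEBUG_SKETCH(OS << " " << Spills << "/" << Total << " spills");
  return OS.str();
}
```

With the switch off, the diagnostic contains only the frame size, matching the intent of the fix described above.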
2023-04-20 | Fix uninitialized class members | Akshay Khadse | 1 | -2/+2
Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D148692