Age | Commit message (Collapse) | Author | Files | Lines |
|
This is a low level utility to parse the MCInstrInfo and should
not depend on the state of the function.
|
|
Remove the Native Client support now that it has finally reached end of life.
|
|
/llvm-project/llvm/lib/Target/ARM/ARMISelLowering.cpp:2769:19: error: loop variable '[Reg, N]' creates a copy from type 'std::pair<unsigned int, llvm::SDValue> const' [-Werror,-Wrange-loop-construct]
for (const auto [Reg, N] : RegsToPass) {
^
/llvm-project/llvm/lib/Target/ARM/ARMISelLowering.cpp:2769:8: note: use reference type 'std::pair<unsigned int, llvm::SDValue> const &' to prevent copying
for (const auto [Reg, N] : RegsToPass) {
^~~~~~~~~~~~~~~~~~~~~
&
/llvm-project/llvm/lib/Target/ARM/ARMISelLowering.cpp:2954:19: error: loop variable '[Reg, N]' creates a copy from type 'std::pair<unsigned int, llvm::SDValue> const' [-Werror,-Wrange-loop-construct]
for (const auto [Reg, N] : RegsToPass)
^
/llvm-project/llvm/lib/Target/ARM/ARMISelLowering.cpp:2954:8: note: use reference type 'std::pair<unsigned int, llvm::SDValue> const &' to prevent copying
for (const auto [Reg, N] : RegsToPass)
^~~~~~~~~~~~~~~~~~~~~
&
2 errors generated.
|
|
|
|
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
|
|
|
|
FPSCR and FPEXC will be stored in FPStatusRegs, after GPRCS2 has been
saved.
- GPRCS1
- GPRCS2
- FPStatusRegs (new)
- DPRCS
- GPRCS3
- DPRCS2
FPSCR is present on all targets with a VFP, but the FPEXC register is
not present on Cortex-M devices, so different amounts of bytes are
being pushed onto the stack depending on our target, which would
affect alignment for subsequent saves.
DPRCS1 will sum up all previous bytes that were saved, and will emit
extra instructions to ensure that its alignment is correct. My
assumption is that if DPRCS1 is able to correct its alignment to be
correct, then all subsequent saves will also have correct alignment.
Avoid annotating the saving of FPSCR and FPEXC for functions marked
with the interrupt_save_fp attribute, even though this is done as part
of frame setup. Since these are status registers, there really is no
viable way of annotating this. Since these aren't GPRs or DPRs, they
can't be used with .save or .vsave directives. Instead, just record
that the intermediate registers r4 and r5 are saved to the stack
again.
Co-authored-by: Jake Vossen <jake@vossen.dev>
Co-authored-by: Alan Phipps <a-phipps@ti.com>
|
|
Similar to #135845.
PR: https://github.com/llvm/llvm-project/pull/135994
|
|
We can use *Set::insert_range to collapse:
for (auto Elem : Range)
Set.insert(E);
down to:
Set.insert_range(Range);
In some cases, we can further fold that into the set declaration.
|
|
This reverts commit 1f05703176d43a339b41a474f51c0e8b1a83c9bb.
|
|
FPSCR and FPEXC will be stored in FPStatusRegs, after GPRCS2 has been
saved.
- GPRCS1
- GPRCS2
- FPStatusRegs (new)
- DPRCS
- GPRCS3
- DPRCS2
FPSCR is present on all targets with a VFP, but the FPEXC register is
not present on Cortex-M devices, so different amounts of bytes are
being pushed onto the stack depending on our target, which would
affect alignment for subsequent saves.
DPRCS1 will sum up all previous bytes that were saved, and will emit
extra instructions to ensure that its alignment is correct. My
assumption is that if DPRCS1 is able to correct its alignment to be
correct, then all subsequent saves will also have correct alignment.
Avoid annotating the saving of FPSCR and FPEXC for functions marked
with the interrupt_save_fp attribute, even though this is done as part
of frame setup. Since these are status registers, there really is no
viable way of annotating this. Since these aren't GPRs or DPRs, they
can't be used with .save or .vsave directives. Instead, just record
that the intermediate registers r4 and r5 are saved to the stack
again.
Co-authored-by: Jake Vossen <jake@vossen.dev>
Co-authored-by: Alan Phipps <a-phipps@ti.com>
|
|
(#128095)
Callee saved registers should always be phyiscal registers. They are
often passed directly to other functions that take MCRegister like
getMinimalPhysRegClass or TargetRegisterClass::contains.
Unfortunately, sometimes the MCRegister is compared to a Register which
gave an ambiguous comparison error when the MCRegister is on the LHS.
Adding a MCRegister==Register comparison operator created more ambiguous
comparison errors elsewhere. These cases were usually comparing against
a base or frame pointer register that is a physical register in a
Register. For those I added an explicit conversion of Register to
MCRegister to fix the error.
|
|
Primarily around uses of getSubReg/getSuperReg.
|
|
This seems like an oversight when copying code from other backends.
|
|
This patch fixes:
llvm/lib/Target/ARM/ARMFrameLowering.cpp:1404:39: error: unused
variable 'PushPopSplit' [-Werror,-Wunused-variable]
|
|
Currently, the relative position of GPRCS2 (with respect to other
instructions in the prologue of a function) can be different depending
on the type of ARMSubtarget::PushPopSplitVariant.
When the PushPopSpiltVariant is SplitR11WindowsSEH, GPRCS2 comes
after both GPRCS1 and DPRCS2:
GPRCS1
DPRCS1
GPRCS2
However, in all other cases, GPRCS2 comes before DPRCS1, like so:
GPRCS1
GPRCS2
DPRCS1
This makes the MI walking code in ARMFrameLowering::emitPrologue a bit
confusing. If GPRCS2Size is non-zero, we also have to check the
PushPopSplitVariant to know if we will encounter the DPRCS1 push
instruction first or the GPRCS2 push instruction first.
This commit changes to SplitR11WindowsSEH such that the spill area is
as follows:
GPRCS1
DPRCS1
GPRCS3
This disambiguates a lot of the ARMFrameLowering.cpp MI traversal
code.
|
|
Identified with misc-include-cleaner.
|
|
With -fomit-frame-pointer, even if we set up a frame pointer for other
reasons (e.g. variable-sized or over-aligned stack allocations), we
don't need to create an ABI-compliant frame record. This means that we
can save all of the general-purpose registers in one push, instead of
splitting it to ensure that the frame pointer and link register are
adjacent on the stack, saving two instructions per function.
|
|
When using AAPCS-compliant frame chains with PACBTI return address
signing, there ware a number of bugs in the generation of the frame
pointer and function prologues. The most obvious was that we sometimes
would modify r11 before pushing it to the stack, so it wasn't preserved
as required by the PCS. We also sometimes did not push R11 and LR
adjacent to one another on the stack, or used R11 as a frame pointer
without pointing it at the saved value of R11, both of which are
required to have an AAPCS compliant frame chain.
The original work of this patch was done by James Westwood, reviewed as
#82801 and #81249, with some tidy-ups done by Mark Murray and myself.
|
|
Reverting because this is causing failures with MSan:
https://lab.llvm.org/buildbot/#/builders/169/builds/4378
This reverts commit e1f8f84acec05997893c305c78fbf7feecf44dd7.
|
|
`TargetFrameLowering::hasFP()` (#106014)
Some targets (e.g. PPC and Hexagon) already did this. I think it's best
to do this consistently so that frontend authors don't run into
inconsistent results when they emit `naked` functions. For example, in
Zig, we had to change our emit code to also set `frame-pointer=none` to
get reliable results across targets.
Note: I don't have commit access.
|
|
(#109628)
The -mno-omit-leaf-frame-pointer flag works on 32-bit ARM architectures
and addresses the bug reported in #108019
|
|
/llvm-project/llvm/lib/Target/ARM/ARMFrameLowering.cpp:1028:9:
error: unused variable 'FPOffset' [-Werror,-Wunused-variable]
int FPOffset = MFI.getObjectOffset(FramePtrSpillFI);
^
1 error generated.
|
|
When using AAPCS-compliant frame chains with PACBTI return address
signing, there ware a number of bugs in the generation of the frame
pointer and function prologues. The most obvious was that we sometimes
would modify r11 before pushing it to the stack, so it wasn't preserved
as required by the PCS. We also sometimes did not push R11 and LR
adjacent to one another on the stack, or used R11 as a frame pointer
without pointing it at the saved value of R11, both of which are
required to have an AAPCS compliant frame chain.
The original work of this patch was done by James Westwood, reviewed as
#82801 and #81249, with some tidy-ups done by Mark Murray and myself.
|
|
|
|
These used a set of callback functions to check which callee-save area a
register is in, refactor them to use the same data as other parts of
ARMFrameLowering. This will make it easier to add a new variant to the
register splitting.
|
|
There were multiple loops in ARMFrameLowering which sort the callee
saved registers into spill areas, which were hard to understand and
modify. This splits the information about which register is in which
save area into a separate function.
|
|
We have two different ways of splitting the pushes of callee-saved
registers onto the stack, controlled by the confusingly similar names
STI.splitFramePushPop() and STI.splitFramePointerPush(). This removes
those functions and replaces them with a single function which returns
an enum. This is in preparation for adding another value to that enum.
The original work of this patch was done by James Westwood, reviewed as
#82801 and #81249, with some tidy-ups done by Mark Murray and myself.
|
|
than 2gb (#99263)
Rebase of #84114. I've only included the core changes to frame layout
calculation & CFI generation which sidesteps the regressions found after
merging #84114. Since these changes are a necessary precursor to the
overall fix and are themselves slightly beneficial as CFI is now
generated correctly, I think it is reasonable to merge this first step.
---
For very large stack frames, the offset from the stack pointer to a
local can be more than 2^31 which overflows various `int` offsets in the
frame lowering code.
This patch updates the frame lowering code to calculate the offsets as
64-bit values and fixes CFI to use the corrected sizes.
After this patch, additional work is needed to fix offset truncations in
each target's codegen.
|
|
MachineFunction's probably should not include a backreference to
the owning MachineModuleInfo. Most of these references were used
just to query the MCContext, which MachineFunction already directly
stores. Other contexts are using it to query the LLVMContext, which
can already be accessed through the IR function reference.
|
|
This iteration drops hunks where the loop body adds more elements.
|
|
This reverts commit 3614f65a7ba9d925010e3316a1d93bcebc632178.
fixupImmediateBr seems to resize ImmBranches.
|
|
|
|
When using the -mframe-chain=aapcs or -mframe-chain=aapcs-leaf options,
we cannot use r11 as an allocatable register, even if
-fomit-frame-pointer is also used. This is so that r11 will always point
to a valid frame record, even if we don't create one in every function.
|
|
When compiling for thumbv8.1m with +pacbti and making an indirect tail
call, the compiler was free to put the function pointer into R12.
This is incorrect because R12 is restored to contain authentication code
for the caller's return address.
This patch excludes R12 from the set of registers the compiler can put
the function pointer in.
Fixes https://github.com/llvm/llvm-project/issues/75998
|
|
Fixes #82659
There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue #82411.
Following @RKSimon 's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact.
After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
|
|
frames larger than 2gb (#84114)"
This is failing on some EXPENSIVE_CHECKS buildbots
|
|
For very large stack frames, the offset from the stack pointer to a local can be more than 2^31 which overflows various `int` offsets in the frame lowering code.
This patch updates the frame lowering code to calculate the offsets as 64-bit values and resolves the overflows, resulting in the correct codegen for very large frames.
Fixes #48911
|
|
(#84019)
… AAPCS frame chain fix (#82801)"
This reverts commit 00e4a4197137410129d4725ffb82bae9ce44bdde. This patch
was found to cause miscompilations and compilation failures.
|
|
chain fix (#82801)
When code for M class architecture was compiled with AAPCS and PAC
enabled, the frame pointer, r11, was not pushed to the stack adjacent to
the link register. Due to PAC being enabled, r12 was placed between r11
and lr. This patch fixes this by adding an extra case to the already
existing code that splits the GPR push in two when R11 is the frame
pointer and certain paremeters are met. The differential revision for
this previous change can be found here:
https://reviews.llvm.org/D125649. This now ensures that r11 and lr are
pushed in a separate push instruction to the other GPRs when PAC and
AAPCS are enabled, meaning the frame pointer and link register are now
pushed onto the stack adjacent to each other.
|
|
PR #75527 fixed ARMFrameLowering to set the IsRestored flag for LR based
on all of the return instructions in the function, not just one.
However, there is also code in ARMLoadStoreOptimizer which changes
return instructions, but it set IsRestored based on the one instruction
it changed, not the whole function.
The fix is to factor out the code added in #75527, and also call it from
ARMLoadStoreOptimizer if it made a change to return instructions.
Fixes #80287.
|
|
|
|
emitPopInst checks a single function exit MBB. If other paths also exit
the function and any of there terminators uses LR implicitly, it is not
save to clear the Restored bit.
Check all terminators for the function before clearing Restored.
This fixes a mis-compile in outlined-fn-may-clobber-lr-in-caller.ll
where the machine-outliner previously introduced BLs that clobbered LR
which in turn is used by the tail call return.
Alternative to #73553
|
|
R12 is callee-saved in functions with pacbti-m enabled, but this is
done in assignCalleeSavedSpillSlots, meaning that in
determineCalleeSaves we have to manually set CanEliminateFrame.
This fixes a bug where in leaf functions with no other callee-saved
registers the aut instruction wouldn't be emitted and stack offsets
of arguments passed on the stack would be incorrect.
Differential Revision: https://reviews.llvm.org/D157865
|
|
When resolving a frame index with a large offset for v6M execute-only,
we emit a tMOVimm32 pseudo-instruction, which later gets lowered to a
sequence of instructions, all of which are flag-setting. However, a
frame index may be generated for a register spill or reload instruction,
which can be inserted at a point where CPSR is live. This patch inserts
MRS and MSR instructions around the tMOVimm32 to save and restore the
value of CPSR, if CPSR is live at that point.
This may need up to two virtual registers (one to build the immediate
value, one to save CPSR) during frame index lowering, which happens
after register allocation, so we need to ensure two spill slots are
avilable to the register scavenger to ensure it can free up enough
registers for this.
There is no test for the emission (or not) of the MRS/MSR pair, because
it requires a spill or reload to be inserted at a point where CPSR is
live, which requires a large, complex function and is fragile enough
that any optimisation changes will break the test. This bug was easily
found by csmith with -verify-machineinstrs, which I now run regularly on
v6M execute-only (and many other combinations).
Patch by John Brawn and myself.
Reviewed By: stuij
Differential Revision: https://reviews.llvm.org/D158404
|
|
Using segmented stacks with execute-only mostly works, but we need to
use the correct movi32 opcode in 6-M, and there's one place where for
thumb1 (i.e. 6-M and 8-M.base) a constant pool was unconditionally
used which needed to be fixed.
Differential Revision: https://reviews.llvm.org/D156339
|
|
Assuming that the stack grows downwards, it is fine if the stack
pointer is exactly at the stacklet boundary. We should use
less-or-equal condition when deciding whether to skip new memory
allocation.
Differential Revision: https://reviews.llvm.org/D149315
|
|
|
|
to 64 bits" warning. NFC.
|
|
This fixes compiling some uncommon cases.
Differential Revision: https://reviews.llvm.org/D147212
|