aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/CodeGen/PrologEpilogInserter.cpp
AgeCommit message (Collapse)AuthorFilesLines
2016-04-18[NFC] Header cleanupMehdi Amini1-2/+0
Removed some unused headers, replaced some headers with forward class declarations. Found using simple scripts like this one: clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap' Patch by Eugene Kosov <claprix@yandex.ru> Differential Revision: http://reviews.llvm.org/D19219 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266595
2016-04-12CodeGen: Clear the MFI's save and restore point after PrologEpilogInserterJustin Bogner1-0/+2
This state is no longer useful and not guaranteed to be valid in later codegen passes. For example, see the added test, which would print a savepoint of %bb.-1 without this change, and crashes with a use-after-free error under ASan if you apply the recycling allocator patch from llvm.org/PR26808. llvm-svn: 266150
2016-04-06RegisterScavenger: Take a reference as enterBasicBlock() argument.Matthias Braun1-2/+2
Make it obvious that the argument cannot be nullptr. Remove an unnecessary nullptr check in initRegState. llvm-svn: 265511
2016-03-31Change eliminateCallFramePseudoInstr() to return an iteratorHans Wennborg1-9/+1
This will become necessary in a subsequent change to make this method merge adjacent stack adjustments, i.e. it might erase the previous and/or next instruction. It also greatly simplifies the calls to this function from Prolog- EpilogInserter. Previously, that had a bunch of logic to resume iteration after the call; now it just continues with the returned iterator. Note that this changes the behaviour of PEI a little. Previously, it attempted to re-visit the new instruction created by eliminateCallFramePseudoInstr(). That code was added in r36625, but I can't see any reason for it: the new instructions will obviously not be pseudo instructions, they will not have FrameIndex operands, and we have already accounted for the stack adjustment. Differential Revision: http://reviews.llvm.org/D18627 llvm-svn: 265036
2016-03-28Introduce MachineFunctionProperties and the AllVRegsAllocated propertyDerek Schuff1-0/+5
MachineFunctionProperties represents a set of properties that a MachineFunction can have at particular points in time. Existing examples of this idea are MachineRegisterInfo::isSSA() and MachineRegisterInfo::tracksLiveness() which will eventually be switched to use this mechanism. This change introduces the AllVRegsAllocated property; i.e. the property that all virtual registers have been allocated and there are no VReg operands left. With this mechanism, passes can declare that they require a particular property to be set, or that they set or clear properties by implementing e.g. MachineFunctionPass::getRequiredProperties(). The MachineFunctionPass base class verifies that the requirements are met, and handles the setting and clearing based on the delcarations. Passes can also directly query and update the current properties of the MF if they want to have conditional behavior. This change annotates the target-independent post-regalloc passes; future changes will also annotate target-specific ones. Reviewers: qcolombet, hfinkel Differential Revision: http://reviews.llvm.org/D18421 llvm-svn: 264593
2016-03-19CodeGen: use range based for loopSaleem Abdulrasool1-5/+4
Convert a loop to use a range based style loop. NFC. llvm-svn: 263884
2016-03-01[WinEH] Allocate the registration node before the catch objectsDavid Majnemer1-0/+12
The CatchObjOffset is relative to the end of the EH registration node for 32-bit x86 WinEH targets. A special sentinel value, 0, is used to indicate that no catch object should be initialized. This means that a catch object allocated immediately before the registration node would be assigned a CatchObjOffset of 0, leading the runtime to believe that a catch object should not be initialized. To handle this, allocate the registration node prior to any other frame object. This will ensure that catch objects will not be allocated before the registration node. This fixes PR26757. Differential Revision: http://reviews.llvm.org/D17689 llvm-svn: 262294
2016-02-15Implemented stack symbol table ordering/packing optimization to improve data ↵Zia Ansari1-3/+14
locality and code size from SP/FP offset encoding. Differential Revision: http://reviews.llvm.org/D15393 llvm-svn: 260917
2016-02-01[PrologEpilogInserter] Add some debug output for callee-save frame object ↵Geoff Berry1-0/+2
allocation Reviewers: mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16733 llvm-svn: 259367
2016-01-14Update to use new name alignTo().Rui Ueyama1-5/+5
llvm-svn: 257804
2015-11-10Support for emitting inline stack probesAndy Ayers1-0/+3
For CoreCLR on Windows, stack probes must be emitted as inline sequences that probe successive stack pages between the current stack limit and the desired new stack pointer location. This implements support for the inline expansion on x64. For in-body alloca probes, expansion is done during instruction lowering. For prolog probes, a stub call is initially emitted during prolog creation, and expanded after epilog generation, to avoid complications that arise when introducing new machine basic blocks during prolog and epilog creation. Added a new test case, modified an existing one to exclude non-x64 coreclr (for now). Add test case Fix tests llvm-svn: 252578
2015-10-09CodeGen: Avoid more ilist iterator implicit conversions, NFCDuncan P. N. Exon Smith1-6/+6
llvm-svn: 249903
2015-10-07[WinEH] Undo the effect of r249578 for 32-bitReid Kleckner1-22/+0
The __CxxFrameHandler3 tables for 32-bit are supposed to hold stack offsets relative to EBP, not ESP. I blindly updated the win-catchpad.ll test case, and immediately noticed that 32-bit catching stopped working. While I'm at it, move the frame index to frame offset WinEH table logic out of PEI. PEI shouldn't have to know about WinEHFuncInfo. I realized we can calculate frame index offsets just fine from the table printer. llvm-svn: 249618
2015-10-07[WinEH] Fix two minor issues in __CxxFrameHandler3 tablesReid Kleckner1-3/+2
There was an off-by-one bug in ip2state tables which manifested when one call immediately preceded the try-range of the next. The return address of the previous call would appear to be within the try range of the next scope, resulting in extra destructors or catches running. We also computed the wrong offset for catch parameter stack objects. The offset should be from RSP, not from RBP. llvm-svn: 249578
2015-09-29HHVM calling conventions.Maksim Panchenko1-15/+18
HHVM calling convention, hhvmcc, is used by HHVM JIT for functions in translated cache. We currently support LLVM back end to generate code for X86-64 and may support other architectures in the future. In HHVM calling convention any GP register could be used to pass and return values, with the exception of R12 which is reserved for thread-local area and is callee-saved. Other than R12, we always pass RBX and RBP as args, which are our virtual machine's stack pointer and frame pointer respectively. When we enter translation cache via hhvmcc function, we expect the stack to be aligned at 16 bytes, i.e. skewed by 8 bytes as opposed to standard ABI alignment. This affects stack object alignment and stack adjustments for function calls. One extra calling convention, hhvm_ccc, is used to call C++ helpers from HHVM's translation cache. It is almost identical to standard C calling convention with an exception of first argument which is passed in RBP (before we use RDI, RSI, etc.) Differential Revision: http://reviews.llvm.org/D12681 llvm-svn: 248832
2015-09-25PrologueEpilogInserter: Fix missing live-ins when savepoint equals restorepointMatthias Braun1-3/+6
The algorithm would not modify the live-in list of blocks below the save block point which is correct unless it happens to be a restore point at the same time. Also fixes the benign issue of live-in registers being added twice in some cases. The testcase is based on a test submitted by Kit Barton. Differential Revision: http://reviews.llvm.org/D13176 llvm-svn: 248620
2015-09-25MachineBasicBlock: Factor out common code into isReturnBlock()Matthias Braun1-9/+2
llvm-svn: 248617
2015-09-19[PrologEpilogInserter] Minor refactoring.Maksim Panchenko1-1/+1
Differential Revision: http://reviews.llvm.org/D12924 llvm-svn: 248084
2015-09-17Test commit.Zia Ansari1-1/+1
llvm-svn: 247901
2015-09-16[WinEH] Rip out the landingpad-based C++ EH state numbering codeReid Kleckner1-6/+0
It never really worked, and the new code is working better every day. llvm-svn: 247860
2015-09-16[WinEH] Pull Adjectives and CatchObj out of the catchpad arg listReid Kleckner1-0/+10
Clang now passes the adjectives as an argument to catchpad. Getting the CatchObj working is simply a matter of threading another static alloca through codegen, first as an alloca, then as a frame index, and finally as a frame offset. llvm-svn: 247844
2015-09-12Fix typos.Bruce Mitchener1-2/+2
Summary: This fixes a variety of typos in docs, code and headers. Subscribers: jholewinski, sanjoy, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12626 llvm-svn: 247495
2015-09-08[WinEH] Emit prologues and epilogues for funcletsReid Kleckner1-24/+32
Summary: 32-bit funclets have short prologues that allocate enough stack for the largest call in the whole function. The runtime saves CSRs for the funclet. It doesn't restore CSRs after we finally transfer control back to the parent funciton via a CATCHRET, but that's a separate issue. 32-bit funclets also have to adjust the incoming EBP value, which is what llvm.x86.seh.recoverframe does in the old model. 64-bit funclets need to spill CSRs as normal. For simplicity, this just spills the same set of CSRs as the parent function, rather than trying to compute different CSR sets for the parent function and each funclet. 64-bit funclets also allocate enough stack space for the largest outgoing call frame, like 32-bit. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12546 llvm-svn: 247092
2015-07-14MachineRegisterInfo: Remove UsedPhysReg infrastructureMatthias Braun1-4/+0
We have a detailed def/use lists for every physical register in MachineRegisterInfo anyway, so there is little use in maintaining an additional bitset of which ones are used. Removing it frees us from extra book keeping. This simplifies VirtRegMap. Differential Revision: http://reviews.llvm.org/D10911 llvm-svn: 242173
2015-07-14PrologEpilogInserter: Rewrite API to determine callee save regsiters.Matthias Braun1-27/+15
This changes TargetFrameLowering::processFunctionBeforeCalleeSavedScan(): - Rename the function to determineCalleeSaves() - Pass a bitset of callee saved registers by reference, thus avoiding the function-global PhysRegUsed bitset in MachineRegisterInfo. - Without PhysRegUsed the implementation is fine tuned to not save physcial registers which are only read but never modified. Related to rdar://21539507 Differential Revision: http://reviews.llvm.org/D10909 llvm-svn: 242165
2015-07-10[ShrinkWrap][PEI] Do not insert epilogue for unreachable blocks.Quentin Colombet1-3/+8
Although this is not incorrect to insert such code, it is useless and it hurts the binary size. llvm-svn: 241946
2015-05-18MachineInstr: Change return value of getOpcode() to unsigned.Matthias Braun1-5/+5
This was previously returning int. However there are no negative opcode numbers and more importantly this was needlessly different from MCInstrDesc::getOpcode() (which even is the value returned here) and SDValue::getOpcode()/SDNode::getOpcode(). llvm-svn: 237611
2015-05-05[ShrinkWrap] Add (a simplified version) of shrink-wrapping.Quentin Colombet1-30/+84
This patch introduces a new pass that computes the safe point to insert the prologue and epilogue of the function. The interest is to find safe points that are cheaper than the entry and exits blocks. As an example and to avoid regressions to be introduce, this patch also implements the required bits to enable the shrink-wrapping pass for AArch64. ** Context ** Currently we insert the prologue and epilogue of the method/function in the entry and exits blocks. Although this is correct, we can do a better job when those are not immediately required and insert them at less frequently executed places. The job of the shrink-wrapping pass is to identify such places. ** Motivating example ** Let us consider the following function that perform a call only in one branch of a if: define i32 @f(i32 %a, i32 %b) { %tmp = alloca i32, align 4 %tmp2 = icmp slt i32 %a, %b br i1 %tmp2, label %true, label %false true: store i32 %a, i32* %tmp, align 4 %tmp4 = call i32 @doSomething(i32 0, i32* %tmp) br label %false false: %tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ] ret i32 %tmp.0 } On AArch64 this code generates (removing the cfi directives to ease readabilities): _f: ; @f ; BB#0: stp x29, x30, [sp, #-16]! mov x29, sp sub sp, sp, #16 ; =16 cmp w0, w1 b.ge LBB0_2 ; BB#1: ; %true stur w0, [x29, #-4] sub x1, x29, #4 ; =4 mov w0, wzr bl _doSomething LBB0_2: ; %false mov sp, x29 ldp x29, x30, [sp], #16 ret With shrink-wrapping we could generate: _f: ; @f ; BB#0: cmp w0, w1 b.ge LBB0_2 ; BB#1: ; %true stp x29, x30, [sp, #-16]! mov x29, sp sub sp, sp, #16 ; =16 stur w0, [x29, #-4] sub x1, x29, #4 ; =4 mov w0, wzr bl _doSomething add sp, x29, #16 ; =16 ldp x29, x30, [sp], #16 LBB0_2: ; %false ret Therefore, we would pay the overhead of setting up/destroying the frame only if we actually do the call. ** Proposed Solution ** This patch introduces a new machine pass that perform the shrink-wrapping analysis (See the comments at the beginning of ShrinkWrap.cpp for more details). It then stores the safe save and restore point into the MachineFrameInfo attached to the MachineFunction. This information is then used by the PrologEpilogInserter (PEI) to place the related code at the right place. This pass runs right before the PEI. Unlike the original paper of Chow from PLDI’88, this implementation of shrink-wrapping does not use expensive data-flow analysis and does not need hack to properly avoid frequently executed point. Instead, it relies on dominance and loop properties. The pass is off by default and each target can opt-in by setting the EnableShrinkWrap boolean to true in their derived class of TargetPassConfig. This setting can also be overwritten on the command line by using -enable-shrink-wrap. Before you try out the pass for your target, make sure you properly fix your emitProlog/emitEpilog/adjustForXXX method to cope with basic blocks that are not necessarily the entry block. ** Design Decisions ** 1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but for debugging and clarity I thought it was best to have its own file. 2. Right now, we only support one save point and one restore point. At some point we can expand this to several save point and restore point, the impacted component would then be: - The pass itself: New algorithm needed. - MachineFrameInfo: Hold a list or set of Save/Restore point instead of one pointer. - PEI: Should loop over the save point and restore point. Anyhow, at least for this first iteration, I do not believe this is interesting to support the complex cases. We should revisit that when we motivating examples. Differential Revision: http://reviews.llvm.org/D9210 <rdar://problem/3201744> llvm-svn: 236507
2015-03-30[WinEH] Run cleanup handlers when an exception is thrownDavid Majnemer1-0/+20
Generate tables in the .xdata section representing what actions to take when an exception is thrown. This currently fills in state for cleanups, catch handlers are still unfinished. llvm-svn: 233636
2015-03-19Internalize PEI. NFC.Benjamin Kramer1-1/+48
llvm-svn: 232722
2015-03-05Replace llvm.frameallocate with llvm.frameescapeReid Kleckner1-11/+0
Turns out it's pretty straightforward and simplifies the implementation. Reviewers: andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D8051 llvm-svn: 231386
2015-02-24PrologEpilogInserter: Clean up math in calculateFrameObjectOffsetsDavid Majnemer1-5/+4
There is no need to open-code the alignment calculation, we have a handy RoundUpToAlignment function which "Does The Right Thing (TM)". llvm-svn: 230392
2015-02-01[X86] Convert esp-relative movs of function arguments to pushes, step 2Michael Kuperstein1-8/+12
This moves the transformation introduced in r223757 into a separate MI pass. This allows it to cover many more cases (not only cases where there must be a reserved call frame), and perform rudimentary call folding. It still doesn't have a heuristic, so it is enabled only for optsize/minsize, with stack alignment <= 8, where it ought to be a fairly clear win. (Re-commit of r227728) Differential Revision: http://reviews.llvm.org/D6789 llvm-svn: 227752
2015-02-01Revert r227728 due to bad line endings.Michael Kuperstein1-12/+8
llvm-svn: 227746
2015-02-01[X86] Convert esp-relative movs of function arguments to pushes, step 2Michael Kuperstein1-8/+12
This moves the transformation introduced in r223757 into a separate MI pass. This allows it to cover many more cases (not only cases where there must be a reserved call frame), and perform rudimentary call folding. It still doesn't have a heuristic, so it is enabled only for optsize/minsize, with stack alignment <= 8, where it ought to be a fairly clear win. Differential Revision: http://reviews.llvm.org/D6789 llvm-svn: 227728
2015-01-13Add the llvm.frameallocate and llvm.recoverframeallocation intrinsicsReid Kleckner1-0/+11
These intrinsics allow multiple functions to share a single stack allocation from one function's call frame. The function with the allocation may only perform one allocation, and it must be in the entry block. Functions accessing the allocation call llvm.recoverframeallocation with the function whose frame they are accessing and a frame pointer from an active call frame of that function. These intrinsics are very difficult to inline correctly, so the intention is that they be introduced rarely, or at least very late during EH preparation. Reviewers: echristo, andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D6493 llvm-svn: 225746
2015-01-08Move SPAdj logic from PEI into the targets (NFC)Michael Kuperstein1-11/+11
PEI tries to keep track of how much starting or ending a call sequence adjusts the stack pointer by, so that it can resolve frame-index references. Currently, it takes a very simplistic view of how SP adjustments are done - both FrameStartOpcode and FrameDestroyOpcode adjust it exactly by the amount written in its first argument. This view is in fact incorrect for some targets (e.g. due to stack re-alignment, or because it may want to adjust the stack pointer in multiple steps). However, that doesn't cause breakage, because most targets (the only in-tree exception appears to be 32-bit ARM) rely on being able to simplify the call frame pseudo-instructions earlier, so this code is never hit. Moving the computation into TargetInstrInfo allows targets to override the way the adjustment is computed if they need to have a non-zero SPAdj. Differential Revision: http://reviews.llvm.org/D6863 llvm-svn: 225437
2014-12-01[Statepoints 2/4] Statepoint infrastructure for garbage collection: MI & ↵Philip Reames1-0/+20
x86-64 Backend This is the second patch in a small series. This patch contains the MachineInstruction and x86-64 backend pieces required to lower Statepoints. It does not include the code to actually generate the STATEPOINT machine instruction and as a result, the entire patch is currently dead code. I will be submitting the SelectionDAG parts within the next 24-48 hours. Since those pieces are by far the most complicated, I wanted to minimize the size of that patch. That patch will include the tests which exercise the functionality in this patch. The entire series can be seen as one combined whole in http://reviews.llvm.org/D5683. The STATEPOINT psuedo node is generated after all gc values are explicitly spilled to stack slots. The purpose of this node is to wrap an actual call instruction while recording the spill locations of the meta arguments used for garbage collection and other purposes. The STATEPOINT is modeled as modifing all of those locations to prevent backend optimizations from forwarding the value from before the STATEPOINT to after the STATEPOINT. (Doing so would break relocation semantics for collectors which wish to relocate roots.) The implementation of STATEPOINT is closely modeled on PATCHPOINT. Eventually, much of the code in this patch will be removed. The long term plan is to merge the functionality provided by statepoints and patchpoints. Merging their implementations in the backend is likely to be a good starting point. Reviewed by: atrick, ributzka llvm-svn: 223085
2014-10-14Grab the subtarget and subtarget dependent variables off ofEric Christopher1-5/+4
MachineFunction rather than TargetMachine. llvm-svn: 219670
2014-08-24Use range based for loops to avoid needing to re-mention SmallPtrSet size.Craig Topper1-2/+1
llvm-svn: 216351
2014-08-05Have MachineFunction cache a pointer to the subtarget to make lookupsEric Christopher1-26/+13
shorter/easier and have the DAG use that to do the same lookup. This can be used in the future for TargetMachine based caching lookups from the MachineFunction easily. Update the MIPS subtarget switching machinery to update this pointer at the same time it runs. llvm-svn: 214838
2014-08-04Changed the liveness tracking in the RegisterScavengerPedro Artigas1-2/+7
to use register units instead of registers. reviewed by Jakob Stoklund Olesen. llvm-svn: 214798
2014-08-04Remove the TargetMachine forwards for TargetSubtargetInfo basedEric Christopher1-16/+31
information and update all callers. No functional change. llvm-svn: 214781
2014-06-25Re-apply r211399, "Generate native unwind info on Win64" with a fix to ↵NAKAMURA Takumi1-41/+46
ignore SEH pseudo ops in X86 JIT emitter. -- This patch enables LLVM to emit Win64-native unwind info rather than DWARF CFI. It handles all corner cases (I hope), including stack realignment. Because the unwind info is not flexible enough to describe stack frames with a gap of unknown size in the middle, such as the one caused by stack realignment, I modified register spilling code to place all spills into the fixed frame slots, so that they can be accessed relative to the frame pointer. Patch by Vadim Chugunov! Reviewed By: rnk Differential Revision: http://reviews.llvm.org/D4081 llvm-svn: 211691
2014-06-22Revert r211399, "Generate native unwind info on Win64"NAKAMURA Takumi1-46/+41
It broke Legacy JIT Tests on x86_64-{mingw32|msvc}, aka Windows x64. llvm-svn: 211480
2014-06-20Generate native unwind info on Win64Reid Kleckner1-41/+46
This patch enables LLVM to emit Win64-native unwind info rather than DWARF CFI. It handles all corner cases (I hope), including stack realignment. Because the unwind info is not flexible enough to describe stack frames with a gap of unknown size in the middle, such as the one caused by stack realignment, I modified register spilling code to place all spills into the fixed frame slots, so that they can be accessed relative to the frame pointer. Patch by Vadim Chugunov! Reviewed By: rnk Differential Revision: http://reviews.llvm.org/D4081 llvm-svn: 211399
2014-06-07Fix typosAlp Toker1-1/+1
llvm-svn: 210401
2014-04-22[Modules] Remove potential ODR violations by sinking the DEBUG_TYPEChandler Carruth1-1/+2
define below all header includes in the lib/CodeGen/... tree. While the current modules implementation doesn't check for this kind of ODR violation yet, it is likely to grow support for it in the future. It also removes one layer of macro pollution across all the included headers. Other sub-trees will follow. llvm-svn: 206837
2014-04-14[C++11] More 'nullptr' conversion. In some cases just using a boolean check ↵Craig Topper1-6/+7
instead of comparing to nullptr. llvm-svn: 206142
2014-04-10Move the segmented stack switch to a function attributeReid Kleckner1-1/+1
This removes the -segmented-stacks command line flag in favor of a per-function "split-stack" attribute. Patch by Luqman Aden and Alex Crichton! llvm-svn: 205997