aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/CodeGen
AgeCommit message (Collapse)AuthorFilesLines
2016-04-13Calculate __builtin_object_size when pointer depends on a conditionPetar Jovanovic1-3/+12
This patch fixes calculating of builtin_object_size if it depends on a condition. Before this patch compiler did not know how to calculate the object size when it finds a condition that cannot be eliminated. This patch enables calculating of builtin_object_size even in case when condition cannot be eliminated by choosing minimum or maximum value as a result from condition. Choosing minimum or maximum value from condition is based on the second argument of __builtin_object_size function. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D18438 llvm-svn: 266193
2016-04-13Recommit r265547, and r265610,r265639,r265657 on top of it, plusWei Mi10-533/+749
two fixes with one about error verify-regalloc reported, and another about live range update of phi after rematerialization. r265547: Replace analyzeSiblingValues with new algorithm to fix its compile time issue. The patch is to solve PR17409 and its duplicates. analyzeSiblingValues is a N x N complexity algorithm where N is the number of siblings generated by reg splitting. Although it causes siginificant compile time issue when N is large, it is also important for performance since it removes redundent spills and enables rematerialization. To solve the compile time issue, the patch removes analyzeSiblingValues and replaces it with lower cost alternatives containing two parts. The first part creates a new spill hoisting method in postOptimization of register allocation. It does spill hoisting at once after all the spills are generated instead of inside every instance of selectOrSplit. The second part queries the define expr of the original register for rematerializaiton and keep it always available during register allocation even if it is already dead. It deletes those dead instructions only in postOptimization. With the two parts in the patch, it can remove analyzeSiblingValues without sacrificing performance. Patches on top of r265547: r265610 "Fix the compare-clang diff error introduced by r265547." r265639 "Fix the sanitizer bootstrap error in r265547." r265657 "InlineSpiller.cpp: Escap \@ in r265547. [-Wdocumentation]" Differential Revision: http://reviews.llvm.org/D15302 Differential Revision: http://reviews.llvm.org/D18934 Differential Revision: http://reviews.llvm.org/D18935 Differential Revision: http://reviews.llvm.org/D18936 llvm-svn: 266162
2016-04-12CodeGen: Clear the MFI's save and restore point after PrologEpilogInserterJustin Bogner1-0/+2
This state is no longer useful and not guaranteed to be valid in later codegen passes. For example, see the added test, which would print a savepoint of %bb.-1 without this change, and crashes with a use-after-free error under ASan if you apply the recycling allocator patch from llvm.org/PR26808. llvm-svn: 266150
2016-04-12Pre-fill LibcallRoutineNames with nullptr.James Y Knight1-32/+12
And rearrange InitLibcallNames slightly. llvm-svn: 266142
2016-04-12Add __atomic_* lowering to AtomicExpandPass.James Y Knight2-9/+555
(Recommit of r266002, with r266011, r266016, and not accidentally including an extra unused/uninitialized element in LibcallRoutineNames) AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266115
2016-04-12[CodeGen] Remove constant-folding dead code. NFC.Ahmed Bougacha1-12/+4
This code was specific to vector operations with scalar operands: all the opcodes in FoldValue (via FoldConstantArithmetic) can't match those criteria. Replace it with an assert if that ever changes: at that point, we might need to add back a splat BUILD_VECTOR. llvm-svn: 266100
2016-04-12Introduce an GCRelocateInst class [NFC]Philip Reames3-7/+6
Previously, we were using isGCRelocate predicates. Using a subclass of IntrinsicInst is far more idiomatic. The refactoring also enables a couple of minor simplifications and code sharing. llvm-svn: 266098
2016-04-12[ScheduleDAGInstrs] Handle instructions with multiple MMOsGeoff Berry1-30/+41
Summary: In getUnderlyingObjectsForInstr(): Don't give up on instructions with multiple MMOs, instead look through all the MMOs and if they all meet the conservative criteria previously used for single MMO instructions, then return all of the underlying objects derived from the MMOs. The change to ScheduleDAGInstrs::buildSchedGraph() is needed to avoid the case where multiple underlying objects are present and are related in such a way that successive iterations of the loop end up adding a dependency from an instruction to itself. Reviewers: atrick, hfinkel Subscribers: MatzeB, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18093 llvm-svn: 266084
2016-04-12This reverts commit r266002, r266011 and r266016.Rafael Espindola2-555/+9
They broke the msan bot. Original message: Add __atomic_* lowering to AtomicExpandPass. AtomicExpandPass can now lower atomic load, atomic store, atomicrmw,and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266062
2016-04-12[RegBankSelect] Teach the repairing code how to handle physicalQuentin Colombet1-2/+6
registers. llvm-svn: 266029
2016-04-12[RegisterBankInfo] Do not provide a default mapping for non-reg of phiQuentin Colombet1-0/+7
operations. llvm-svn: 266027
2016-04-12[RegBankSelect] Teach how to repair definitions.Quentin Colombet1-13/+112
Although repairing definitions is not mandatory for correctness (only phis would be impacted because of the RPO traversal), not repairing might go against the cost model. Therefore, just repair when it is possible. llvm-svn: 266025
2016-04-11Replace MachineRegisterInfo::TracksLiveness with a MachineFunctionPropertyDerek Schuff2-9/+7
Use the MachineFunctionProperty mechanism to indicate whether the liveness info is accurate instead of a bool flag on MRI. Keeps the MRI accessor function for convenience. NFC Differential Revision: http://reviews.llvm.org/D18767 llvm-svn: 266020
2016-04-11AtomicExpandPass: mark assert variable as usedJF Bastien1-0/+3
Avoid -Wunused-variable llvm-svn: 266016
2016-04-11Fix compile with GCC after r266002 (Add __atomic_* lowering to AtomicExpandPass)James Y Knight1-8/+8
It doesn't like implicitly calling the ArrayRef constructor with a returned array -- it appears to decays the returned value to a pointer, first, before trying to make an ArrayRef out of it. llvm-svn: 266011
2016-04-11CodeGen: Fix a use-after-free in TailDuplicationJustin Bogner1-2/+0
The call to processPHI already erased MI from its parent, so MI isn't even valid here, making the getParent() call a use-after-free in addition to being redundant. Found by ASan with the ArrayRecycler changes in llvm.org/pr26808. llvm-svn: 266008
2016-04-11[safestack] Add canary to unsafe stack framesEvgeniy Stepanov2-19/+79
Add StackProtector to SafeStack. This adds limited protection against data corruption in the caller frame. Current implementation treats all stack protector levels as -fstack-protector-all. llvm-svn: 266004
2016-04-11Add __atomic_* lowering to AtomicExpandPass.James Y Knight2-9/+552
AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266002
2016-04-11[DAGCombiner] Fold xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) ↵Simon Pilgrim1-2/+2
anytime before LegalizeVectorOprs xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) was only being combined at the AfterLegalizeTypes stage, this patch permits the combine to occur anytime before then as well. The main aim with this to improve the ability to recognise bitmasks that can be converted to shuffles. I had to modify a number of AVX512 mask tests as the basic bitcast to/from scalar pattern was being stripped out, preventing testing of the mmask bitops. By replacing the bitcasts with loads we can get almost the same result. Differential Revision: http://reviews.llvm.org/D18944 llvm-svn: 265998
2016-04-11Fix a couple of redundant conditional expressions (PR27283, PR28282)Hans Wennborg1-2/+2
llvm-svn: 265987
2016-04-11use range-loops; NFCISanjay Patel1-13/+8
llvm-svn: 265985
2016-04-11Combine redundant stack realignment booleans in MachineFrameInfoReid Kleckner1-17/+14
MachineFrameInfo does not need to be able to distinguish between the user asking us not to realign the stack and the target telling us it doesn't support stack realignment. Either way, fixed stack objects have their alignment clamped. llvm-svn: 265971
2016-04-11TargetRegisterInfo: Add getRegAsmName()Tom Stellard1-1/+1
Summary: The motivation for this new function is to move an invalid assumption about the relationship between the names of register definitions in tablegen files and their assembly names into TargetRegisterInfo, so that we can begin working on fixing this assumption. The current problem is that if you have a register definition in TableGen like: def MYReg0 : Register<"r0", 0>; The function TargetLowering::getRegForInlineAsmConstraint() derives the assembly name from the tablegen name: "MyReg0" rather than the given assembly name "r0". This is working, because on most targets the tablegen name and the assembly names are case insensitive matches for each other (e.g. def EAX : X86Reg<"eax", ...> getRegAsmName() will allow targets to override this default assumption and return the correct assembly name. Reviewers: echristo, hfinkel Subscribers: SamWot, echristo, hfinkel, llvm-commits Differential Revision: http://reviews.llvm.org/D15614 llvm-svn: 265955
2016-04-09[CodeGen] Don't assume that fixed stack objects are aligned in a ↵Charles Davis1-5/+16
stack-realigned function. Summary: After we make the adjustment, we can assume that for local allocas, but not for stack parameters, the return address, or any other fixed stack object (which has a negative offset and therefore lies prior to the adjusted SP). Fixes PR26662. Reviewers: hfinkel, qcolombet, rnk Subscribers: rnk, llvm-commits Differential Revision: http://reviews.llvm.org/D18471 llvm-svn: 265886
2016-04-09Drop debug info for DISubprograms that are not referenced by anythingAdrian Prantl4-53/+8
This patch drops the debug info for all DISubprograms that are (a) not attached to an llvm::Function and (b) not indirectly reachable via inline scopes from any surviving Function and (c) not reachable from a type (i.e.: member functions). Background: I'm currently working on a patch to reverse the pointers between DICompileUnit and DISubprogram (for more info check Duncan's RFC on lazy-loading of debug info metadata http://lists.llvm.org/pipermail/llvm-dev/2016-March/097419.html). The idea is to remove the list of subprograms from DICompileUnit and instead point to the owning compile unit from each DISubprogram. After doing this all DISubprograms fulfilling the above criteria will be implicitly dropped unless we go through an extra effort to preserve them. http://reviews.llvm.org/D18477 <rdar://problem/25256815> llvm-svn: 265876
2016-04-09[x86] use BMI 'andn' for logic + compare ops Sanjay Patel1-0/+4
With BMI, we can use 'andn' to save an instruction when the result is only used in a compare. This is related to one of the potential sequences to check 'isfinite' in: https://llvm.org/bugs/show_bug.cgi?id=27164 Differential Revision: http://reviews.llvm.org/D18910 llvm-svn: 265875
2016-04-08Support the Nodebug emission kind for DICompileUnits.Adrian Prantl1-12/+21
Sample-based profiling and optimization remarks currently remove DICompileUnits from llvm.dbg.cu to suppress the emission of debug info from them. This is somewhat of a hack and only borderline legal IR. This patch uses the recently introduced NoDebug emission kind in DICompileUnit to achieve the same result without breaking the Verifier. A nice side-effect of this change is that it is now possible to combine NoDebug and regular compile units under LTO. http://reviews.llvm.org/D18808 <rdar://problem/25427165> llvm-svn: 265861
2016-04-08[SSP] Remove llvm.stackprotectorcheck.Tim Shen5-125/+76
This is a cleanup patch for SSP support in LLVM. There is no functional change. llvm.stackprotectorcheck is not needed, because SelectionDAG isn't actually lowering it in SelectBasicBlock; rather, it adds check code in FinishBasicBlock, ignoring the position where the intrinsic is inserted (See FindSplitPointForStackProtector()). llvm-svn: 265851
2016-04-08Codegen: Factor tail duplication into a utility class. NFCKyle Butt3-949/+923
This is in preparation for tail duplication during block placement. See D18226. This needs to be a utility class for 2 reasons. No passes may run after block placement, and also, tail-duplication affects subsequent layout decisions, so it must be interleaved with placement, and can't be separated out into its own pass. The original pass is still useful, and now runs by delegating to the utility class. llvm-svn: 265842
2016-04-08Fix Load Control Dependence in MemCpy GenerationNirav Dave2-57/+1
In Memcpy lowering we had missed a dependence from the load of the operation to successor operations. This causes us to potentially construct an in initial DAG with a memory dependence not fully represented in the chain sub-DAG but rather require looking at the entire DAG breaking alias analysis by allowing incorrect repositioning of memory operations. To work around this, r200033 changed DAGCombiner::GatherAllAliases to be conservative if any possible issues to happen. Unfortunately this check forbade many non-problematic situations as well. For example, it's common for incoming argument lowering to add a non-aliasing load hanging off of EntryNode. Then, if GatherAllAliases visited EntryNode, it would find that other (unvisited) use of the EntryNode chain, and just give up entirely. Furthermore, the check was incomplete: it would not actually detect all such potentially problematic DAG constructions, because GatherAllAliases did not guarantee to visit all chain nodes going up to the root EntryNode. This is in general fine -- giving up early will just miss a potential optimization, not generate incorrect results. But, for this non-chain dependency detection code, it's possible that you could have a load attached to a higher-up chain node than any which were visited. If that load aliases your store, but the only dependency is through the value operand of a non-aliasing store, it would've been missed by this code, and potentially reordered. With the dependence added, this check can be removed and Alias Analysis can be much more aggressive. This fixes code quality regression in the Consecutive Store Merge cleanup (D14834). Test Change: ppc64-align-long-double.ll now may see multiple serializations of its stores Differential Revision: http://reviews.llvm.org/D18062 llvm-svn: 265836
2016-04-08[RegBankSelect] Use reverse post order traversal.Quentin Colombet1-2/+12
When assigning the register banks of an instruction, it is best to know all the constraints of the input to have a good idea of how this will impact the cost of the whole function. llvm-svn: 265812
2016-04-08[RegisterBankInfo] Change the implementation for the default mapping.Quentin Colombet1-1/+14
Do not give that much importance to the current register bank of an operand. This is likely just a side effect of the current execution and it is properly wise to prefer a register bank that can be extracted from the information available statically (like encoding constraints and type). llvm-svn: 265810
2016-04-08[RegBankSelect] Improve debug output.Quentin Colombet1-1/+10
Add verbose information when checking if the current and the desired register banks match. Detail what happens when we assign a register bank. llvm-svn: 265804
2016-04-08[MIR] Teach the parser how to deal with register banks.Quentin Colombet1-10/+51
llvm-svn: 265802
2016-04-08[MachineVerifier] Teach how to check some of the properties of genericQuentin Colombet1-1/+24
virtual registers. Generic virtual registers: - May not have a register class - May not have a register bank - If they do not have a register class they must have a size - If they have a register bank, the size of the register bank must be greater or equal to the size of the virtual register (basically check that the virtual register will fit into that register class) llvm-svn: 265798
2016-04-08[MIR] Teach the mir printer how to print the register bank.Quentin Colombet1-5/+8
For now, we put the register bank in the Class field since a register may only have one of those at a given time. The downside of that representation is that if a register class and a register bank have the same name, we will not be able to distinguish them. llvm-svn: 265796
2016-04-08Revert r265547 "Recommit r265309 after fixed an invalid memory reference bug ↵Hans Wennborg10-722/+526
happened" It caused PR27275: "ARM: Bad machine code: Using an undefined physical register" Also reverting the following commits that were landed on top: r265610 "Fix the compare-clang diff error introduced by r265547." r265639 "Fix the sanitizer bootstrap error in r265547." r265657 "InlineSpiller.cpp: Escap \@ in r265547. [-Wdocumentation]" llvm-svn: 265790
2016-04-08CXX_FAST_TLS calling convention: performance improvement for PPC64Chuang-Yu Cheng1-2/+5
This is the same change on PPC64 as r255821 on AArch64. I have even borrowed his commit message. The access function has a short entry and a short exit, the initialization block is only run the first time. To improve the performance, we want to have a short frame at the entry and exit. We explicitly handle most of the CSRs via copies. Only the CSRs that are not handled via copies will be in CSR_SaveList. Frame lowering and prologue/epilogue insertion will generate a short frame in the entry and exit according to CSR_SaveList. The majority of the CSRs will be handled by register allcoator. Register allocator will try to spill and reload them in the initialization block. We add CSRsViaCopy, it will be explicitly handled during lowering. 1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target supports it for the given machine function and the function has only return exits). We also call TLI->initializeSplitCSR to perform initialization. 2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to virtual registers at beginning of the entry block and copies from virtual registers to CSRsViaCopy at beginning of the exit blocks. 3> we also need to make sure the explicit copies will not be eliminated. Author: Tom Jablin (tjablin) Reviewers: hfinkel kbarton cycheng http://reviews.llvm.org/D17533 llvm-svn: 265781
2016-04-08Use std::fill to simplify some code. NFCCraig Topper1-2/+3
llvm-svn: 265771
2016-04-08[TargetRegisterInfo] Re-apply r265734.Quentin Colombet1-12/+5
Original commit message: [TargetRegisterInfo] Refactor the code to use BitMaskClassIterator. llvm-svn: 265764
2016-04-08DwarfDebug: Support floating point constants in location lists.Adrian Prantl3-18/+41
This patch closes a gap in the DWARF backend that caused LLVM to drop debug info for floating point variables that were constant for part of their scope. Floating point constants are emitted as one or more DW_OP_constu joined via DW_OP_piece. This fixes a regression caught by the LLDB testsuite that I introduced in r262247 when we stopped blindly expanding the range of singular DBG_VALUEs to span the entire scope and started to emit location lists with accurate ranges instead. Also deletes a now-impossible testcase (debug-loc-empty-entries). <rdar://problem/25448338> llvm-svn: 265760
2016-04-08Revert "[TargetRegisterInfo] Refactor the code to use BitMaskClassIterator."Quentin Colombet1-5/+12
This reverts commit r265734. Looks like ASan is not happy about it. http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/11741 Looking. llvm-svn: 265755
2016-04-08[RegisterBankInfo] Make the debug output more compact.Quentin Colombet1-1/+1
Print the mask of the partial mapping as an hexadecimal instead of a binary value. llvm-svn: 265754
2016-04-07[RegBankSelect] Add a few debug statements.Quentin Colombet1-0/+9
llvm-svn: 265749
2016-04-07[RegisterBankInfo] Add print and dump method to the InstructionMappingQuentin Colombet1-0/+16
helper class. llvm-svn: 265747
2016-04-07[RegisterBankInfo] Add print and dump method to the ValueMapping helperQuentin Colombet1-0/+16
class. llvm-svn: 265746
2016-04-07[MachineInstr] Teach the print method about RegisterBank.Quentin Colombet1-11/+10
Properly print either the register class or the register bank or a virtual register. Get rid of a few ifdefs in the process. llvm-svn: 265745
2016-04-07[RegisterBankInfo] Strengthen getInstrMappingImpl.Quentin Colombet1-14/+21
Teach the target independent code how to take advantage of type information to get the mapping of an instruction. llvm-svn: 265739
2016-04-07[RegisterBankInfo] Add a way to record what register bank covers aQuentin Colombet1-1/+10
specific type. This will be used to find the default mapping of the instruction. Also, this information is recorded, instead of computed, because it is expensive from a type to know which register bank maps it. Indeed, we need to iterate through all the register classes of all the register banks to find the one that maps the given type. llvm-svn: 265736
2016-04-07[RegisterBankInfo] Introduce getRegBankFromConstraints as an helperQuentin Colombet1-20/+34
method. NFC. The refactoring intends to make the code more readable and expose more features to potential derived classes. llvm-svn: 265735