path: root/llvm/lib/Target
Age | Commit message | Author | Files | Lines
2015-01-08 | Add saving and restoring of r30 to the prologue and epilogue, respectively | Justin Hibbits | 2 | -0/+17
Summary: The PIC additions didn't update the prologue and epilogue code to save and restore r30 (PIC base register). This does that. Test Plan: Tests updated. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6876 llvm-svn: 225450
2015-01-08 | Fix large stack alignment codegen for ARM and Thumb2 targets | Kristof Beyls | 2 | -22/+84
This partially fixes PR13007 (ARM CodeGen fails with large stack alignment) for ARM and Thumb2 targets, but not for Thumb1, as it seems stack alignment for Thumb1 targets hasn't been supported at all. An aligned stack pointer is produced by zeroing out the lower bits of the stack pointer. The BIC instruction was used for this. However, the immediate field of the BIC instruction can only encode an immediate that zeroes out at most the 8 lower bits. When a larger alignment is requested, a BIC instruction cannot be used; llvm was silently producing incorrect code in this case. This commit fixes code generation for large stack alignments by using the BFC instruction instead, when the BFC instruction is available. When it is not, it uses 2 instructions: a right shift followed by a left shift to zero out the lower bits. The lowering of ARM::Int_eh_sjlj_dispatchsetup still has code that unconditionally uses BIC to realign the stack pointer, so it very likely has the same problem. However, I wasn't able to produce a test case for that. This commit adds an assert so that the compiler fails the assert instead of silently generating wrong code if this is ever reached. llvm-svn: 225446
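The realignment described above is plain bit arithmetic, so it can be sketched outside the compiler. The following Python sketch is only an illustration, not the ARM backend code: the function names are hypothetical, and the 8-bit limit is taken directly from the commit message rather than from the full BIC encoding rules. It shows that the shift pair clears the same low bits as a single mask.

```python
MASK32 = 0xFFFFFFFF  # model a 32-bit stack pointer

def align_with_mask(sp: int, align: int) -> int:
    """Clear the low bits with one AND-NOT, as BIC or BFC would."""
    return sp & ~(align - 1) & MASK32

def align_with_shifts(sp: int, align: int) -> int:
    """Clear the low bits with a right shift followed by a left shift."""
    n = align.bit_length() - 1  # log2 of a power-of-two alignment
    return ((sp >> n) << n) & MASK32

def fits_bic_low_bits(align: int) -> bool:
    """Assumption from the commit message: BIC can clear at most the 8 low bits."""
    return align - 1 <= 0xFF
```

For a 4096-byte alignment, `fits_bic_low_bits(4096)` is false, which is the case where the fallback to BFC or to the shift pair matters.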
2015-01-08 | R600/SI: Remove SIISelLowering::legalizeOperands() | Tom Stellard | 2 | -176/+1
Its functionality has been replaced by calling SIInstrInfo::legalizeOperands() from SIISelLowering::AdjustInstrPostInstrSelection() and running the SIFoldOperands and SIShrinkInstructions passes. llvm-svn: 225445
2015-01-08 | [X86] Don't try to generate direct calls to TLS globals | Michael Kuperstein | 1 | -1/+2
The call lowering assumes that if the callee is a global, we want to emit a direct call. This is correct for regular globals, but not for TLS ones. Differential Revision: http://reviews.llvm.org/D6862 llvm-svn: 225438
2015-01-08 | [X86] Don't print 'dword ptr' or 'qword ptr' on the operand to some of the LEA variants in Intel syntax. | Craig Topper | 4 | -4/+14
The memory operand is inherently unsized. llvm-svn: 225432
2015-01-08 | [SelectionDAG] Allow targets to specify legality of extloads' result type (in addition to the memory type). | Ahmed Bougacha | 15 | -136/+190
The *LoadExt* legalization handling used to only have one type, the memory type. This forced users to assume that as long as the extload for the memory type was declared legal, and the result type was legal, the whole extload was legal. However, this isn't always the case. For instance, on X86, with AVX, this is legal:
  v4i32 load, zext from v4i8
but this isn't:
  v4i64 load, zext from v4i8
Whereas v4i64 is (arguably) legal, even without AVX2. Note that the same thing was done a while ago for truncstores (r46140), but I assume no one needed it yet for extloads, so here we go. Calls to getLoadExtAction were changed to add the value type, found manually in the surrounding code. Calls to setLoadExtAction were mechanically changed, by wrapping the call in a loop, to match previous behavior. The loop iterates over the MVT subrange corresponding to the memory type (FP vectors, etc...). I also pulled neighboring setTruncStoreActions into some of the loops; those shouldn't make a difference, as the additional types are illegal. (e.g., i128->i1 truncstores on PPC.) No functional change intended. Differential Revision: http://reviews.llvm.org/D6532 llvm-svn: 225421
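The change above amounts to keying the legality table on both the result type and the memory type. A minimal Python sketch of that idea follows; the class, the helper, the string type names, and the "Legal" default are all hypothetical stand-ins for illustration and are not the LLVM API.

```python
LEGAL, EXPAND = "Legal", "Expand"

class ExtLoadActions:
    """Toy legality table keyed on (extension kind, result type, memory type)."""

    def __init__(self):
        self._actions = {}

    def set_action(self, ext, result_vt, mem_vt, action):
        self._actions[(ext, result_vt, mem_vt)] = action

    def get_action(self, ext, result_vt, mem_vt):
        # Assumption of this sketch: unset entries default to Legal.
        return self._actions.get((ext, result_vt, mem_vt), LEGAL)

def set_for_result_types(table, ext, result_vts, mem_vt, action):
    """Mirror the 'wrap the call in a loop' migration: one memory type,
    the same action for every result type in the given range."""
    for result_vt in result_vts:
        table.set_action(ext, result_vt, mem_vt, action)
```

With the extra key, the v4i8 example from the commit message can be expressed directly: the v4i32 result stays legal while the v4i64 result is marked for expansion.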
2015-01-08 | X86: VZeroUpperInserter: shortcut should not trigger if we have any function live-ins. | Matthias Braun | 1 | -8/+12
llvm-svn: 225419
2015-01-07 | R600/SI: Commute instructions to enable more folding opportunities | Tom Stellard | 2 | -19/+51
llvm-svn: 225410
2015-01-07 | R600/SI: Only fold immediates that have one use | Tom Stellard | 1 | -1/+8
Folding the same immediate into multiple instructions will increase program size, which can hurt performance. llvm-svn: 225405
2015-01-07 | [CodeGen] Use MVT iterator_ranges in legality loops. NFC intended. | Ahmed Bougacha | 6 | -84/+45
A few loops do trickier things than just iterating on an MVT subset, so I'll leave them be for now. Follow-up of r225387. llvm-svn: 225392
2015-01-07 | R600/SI: Remove VReg_32 register class | Tom Stellard | 13 | -154/+152
Use VGPR_32 register class instead. These two register classes were identical and having separate classes was causing SIInstrInfo::isLegalOperands() to be overly conservative in some cases. This change is necessary to prevent future patches from missing a folding opportunity in fneg-fabs.ll. llvm-svn: 225382
2015-01-07 | [Hexagon] Fix 225372 USR register is not fully complete. | Colin LeMahieu | 1 | -12/+12
Removing Uses = [USR] maintains existing functionality to old instructions without encodings. llvm-svn: 225377
2015-01-07 | [Hexagon] Adding floating point classification and creation. | Colin LeMahieu | 1 | -0/+45
llvm-svn: 225374
2015-01-07 | R600/SI: Add a V_MOV_B64 pseudo instruction | Tom Stellard | 3 | -0/+38
This is used to simplify the SIFoldOperands pass and make it easier to fold immediates. llvm-svn: 225373
2015-01-07 | [Hexagon] Adding encodings for v5 floating point instructions. | Colin LeMahieu | 1 | -0/+326
llvm-svn: 225372
2015-01-07 | [Hexagon] Adding encoding for popcount, fastcorner, dword asr with rounding. | Colin LeMahieu | 2 | -1/+62
llvm-svn: 225371
2015-01-07 | R600/SI: Teach SIFoldOperands to split 64-bit constants when folding | Tom Stellard | 3 | -25/+51
This allows folding of sequences like:
  s[0:1] = s_mov_b64 4
  v_add_i32 v0, s0, v0
  v_addc_u32 v1, s1, v1
into
  v_add_i32 v0, 4, v0
  v_add_i32 v1, 0, v1
llvm-svn: 225369
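The fold above depends on splitting a 64-bit immediate into the two 32-bit halves that the s0 and s1 sub-registers would hold. A small Python sketch of that split, using a hypothetical helper name and not taken from the SIFoldOperands pass itself:

```python
def split_imm64(value: int) -> tuple:
    """Return (lo, hi): the low and high 32-bit halves of a 64-bit immediate."""
    value &= 0xFFFFFFFFFFFFFFFF  # wrap negative inputs to 64-bit two's complement
    lo = value & 0xFFFFFFFF      # half that would live in s0
    hi = value >> 32             # half that would live in s1
    return lo, hi
```

For the constant 4 used in the commit message this gives the halves 4 and 0, matching the immediates folded into the two add instructions.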
2015-01-07 | [X86] Fix 512->256 typo in comments. NFC. | Ahmed Bougacha | 1 | -2/+2
llvm-svn: 225367
2015-01-07 | X86: Allow the stack probe size to be configurable per function | David Majnemer | 1 | -3/+9
LLVM emits stack probes on Windows targets to ensure that the stack is correctly accessed. However, the amount of stack allocated before emitting such a probe is hardcoded to 4096. It is desirable to have this be configurable so that a function might opt out of stack probes. Our level of granularity is at the function level instead of, say, the module level to permit proper generation of code after LTO. Patch by Andrew H! N.B. The inliner needs to be updated to properly consider what happens after inlining a function with a specific stack-probe-size into another function with a different stack-probe-size. llvm-svn: 225360
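A per-function threshold of this kind can be sketched as a lookup with a default. The Python below is only an illustration: the helper names and the dictionary of attributes are hypothetical, and the assumption that a probe is emitted once the frame size reaches the threshold is a guess at the comparison, not a statement of what the X86 frame lowering does.

```python
DEFAULT_STACK_PROBE_SIZE = 4096  # the previously hardcoded value

def stack_probe_size(fn_attrs: dict) -> int:
    """Read the per-function 'stack-probe-size' attribute, else use the default."""
    return int(fn_attrs.get("stack-probe-size", DEFAULT_STACK_PROBE_SIZE))

def needs_stack_probe(frame_bytes: int, fn_attrs: dict) -> bool:
    # Assumption: a probe is needed once the allocation reaches the threshold.
    return frame_bytes >= stack_probe_size(fn_attrs)
```

A function that sets a threshold larger than its frame size effectively opts out, which is the use case the commit message describes.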
2015-01-07 | R600/SI: Refactor SIFoldOperands to simplify immediate folding | Tom Stellard | 1 | -25/+54
This will make a future patch much less intrusive. llvm-svn: 225358
2015-01-07 | [X86] Teach FCOPYSIGN lowering to recognize constant magnitudes. | Ahmed Bougacha | 1 | -6/+19
For code like:
  float foo(float x) { return copysign(1.0, x); }
We used to generate:
  andps <-0.000000e+00,0,0,0>, %xmm0
  movss <1.000000e+00>, %xmm1
  andps <nan>, %xmm1
  orps %xmm0, %xmm1
Basically doing an abs(1.0f) in the two middle instructions. We now generate:
  andps <-0.000000e+00,0,0,0>, %xmm0
  orps <1.000000e+00,0,0,0>, %xmm0
Builds on cleanups r223415, r223542. rdar://19049548 Differential Revision: http://reviews.llvm.org/D6555 llvm-svn: 225357
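The two-instruction sequence above can be mimicked on raw single-precision bit patterns. This Python sketch uses hypothetical helper names and is not the lowering code: it isolates the sign bit of x (the andps with -0.0) and ORs it into the magnitude, whose absolute value is known ahead of time when the magnitude is a constant.

```python
import struct

SIGN_MASK = 0x80000000  # bit pattern of -0.0f

def f32_bits(x: float) -> int:
    """Bit pattern of x rounded to single precision."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_f32(bits: int) -> float:
    """Single-precision value for a 32-bit pattern."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def copysign_const(magnitude: float, x: float) -> float:
    sign = f32_bits(x) & SIGN_MASK   # andps with the -0.0 mask
    mag = f32_bits(abs(magnitude))   # abs of a constant, foldable up front
    return bits_f32(sign | mag)      # orps with the constant
```

Because `abs(magnitude)` does not depend on x, nothing needs to be computed for it at run time, which is why the two middle instructions in the old sequence are redundant.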
2015-01-07 | Fix regression in r225266. | Asiri Rathnayake | 1 | -1/+1
The change in r225266 was reviewed under D6722. But the commit r225266 has a typo, causing some MCHammer failures. This patch fixes it. Change-Id: I573efcff25003af7478ac02548ebbe929fc7f5fd llvm-svn: 225347
2015-01-07 | [X86] Merge a switch statement inside a default case of another switch statement on the same variable. | Craig Topper | 1 | -160/+155
There was no additional code in the default so this should be no functional change. llvm-svn: 225345
2015-01-07 | [X86] Don't mark the shift by 1 instructions as isConvertibleToThreeAddress. | Craig Topper | 1 | -1/+1
There is no handling for them. llvm-svn: 225344
2015-01-07 | [X86] Remove some unused TYPE enums from the disassembler. | Craig Topper | 3 | -18/+1
llvm-svn: 225343
2015-01-07 | Revert r225165 and r225169 | Karthik Bhat | 1 | -39/+0
Even though gcc produces similar instructions, as Owen pointed out, the two patterns aren't equivalent in the case where the original subtraction could have caused an overflow. Reverting both commits. llvm-svn: 225341
2015-01-07 | R600/SI: Add check for amdgcn triple forgotten in r225276. | Tom Stellard | 1 | -2/+3
llvm-svn: 225331
2015-01-07 | [PowerPC] Transform a README.txt entry into a FIXME | Hal Finkel | 2 | -14/+9
Remove the README.txt entry regarding register allocation of CR logical ops, and replace it with a FIXME in PPCInstrInfo.td. The text in the README.txt was not really accurate, and thanks go to Pat Haugen (and Bill Schmidt) from IBM for clarifying what was intended and highlighting the relevant text in the ISA specification. llvm-svn: 225325
2015-01-06 | Revert r224935 "Refactor duplicated code. No intended functionality change." | Lang Hames | 6 | -8/+30
This is affecting the behavior of some ObjC++ / AArch64 test cases on Darwin. Reverting to get the bots green while I track down the source of the changed behavior. llvm-svn: 225311
2015-01-06 | R600/SI: Add combine for isinfinite pattern | Matt Arsenault | 2 | -0/+57
llvm-svn: 225310
2015-01-06 | R600/SI: Pattern match isinf to v_cmp_class instructions | Matt Arsenault | 2 | -0/+34
llvm-svn: 225307
2015-01-06 | R600/SI: Add basic DAG combines for fp_class | Matt Arsenault | 2 | -1/+50
llvm-svn: 225306
2015-01-06 | R600/SI: Add class intrinsic | Matt Arsenault | 7 | -5/+82
llvm-svn: 225305
2015-01-06 | [PowerPC] Reuse a load operand in int->fp conversions | Hal Finkel | 3 | -41/+142
int->fp conversions on PPC must be done through memory loads and stores. On a modern core, this process begins by storing the int value to memory, then loading it using a (sometimes special) FP load instruction. Unfortunately, we would do this even when the value to be converted was itself a load, and we can just use that same memory location instead of copying it to another first. There is a slight complication when handling int_to_fp(fp_to_int(x)) pairs, because the fp_to_int operand has not been lowered when the int_to_fp is being lowered. We handle this specially by invoking fp_to_int's lowering logic (partially) and getting the necessary memory location (some trivial refactoring was done to make this possible). This is all somewhat ugly, and it would be nice if some later CodeGen stage could just clean this stuff up, but because doing so would involve modifying target-specific nodes (or instructions), it is not immediately clear how that would work. Also, remove a related entry from the README.txt for which we now generate reasonable code. llvm-svn: 225301
2015-01-06 | [Hexagon] Adding compound jump encodings. | Colin LeMahieu | 2 | -0/+266
llvm-svn: 225291
2015-01-06 | R600/SI: Insert s_waitcnt before s_barrier instructions. | Tom Stellard | 1 | -1/+5
This ensures that all memory operations are complete when all threads reach the barrier. llvm-svn: 225290
2015-01-06 | R600/SI: Fix dependency calculation for DS write instructions in SIInsertWaits | Tom Stellard | 1 | -0/+23
In DS write instructions, the address operand comes before the value operand(s), which is the reverse of every other instruction type. The SIInsertWaits pass assumed that the first use for each instruction was the value, so for DS writes it was protecting the address operand with s_waitcnt instructions when it should have been protecting the value operand. llvm-svn: 225289
2015-01-06 | [Hexagon] Adding encoding for misc v4 instructions: boundscheck, tlbmatch, dcfetch. | Colin LeMahieu | 3 | -1/+101
llvm-svn: 225283
2015-01-06 | [Hexagon] Adding encoding information for absolute address loads. | Colin LeMahieu | 1 | -124/+186
llvm-svn: 225279
2015-01-06 | R600/SI: Add a stub GCNTargetMachine | Tom Stellard | 8 | -1/+46
This is equivalent to the AMDGPUTargetMachine now, but it is the starting point for separating R600 and GCN functionality into separate targets. It is recommended that users start using the gcn triple for GCN-based GPUs, because using the r600 triple for these GPUs will be deprecated in the future. llvm-svn: 225277
2015-01-06 | R600/SI: Remove MachineFunction dump from AsmPrinter | Tom Stellard | 1 | -17/+12
The dump was dependent on a feature string, which meant that it couldn't be disabled or enabled on a per-compile basis. llvm-svn: 225275
2015-01-06 | [Hexagon] Fix 225267. GP register is not yet fully implemented. | Colin LeMahieu | 1 | -2/+2
Removing Uses [GP] maintains existing behavior. llvm-svn: 225270
2015-01-06 | [Hexagon] Adding dealloc_return encoding and absolute address stores. | Colin LeMahieu | 5 | -239/+347
llvm-svn: 225267
2015-01-06 | [ARM] Cleanup so_imm* tblgen definitions | Asiri Rathnayake | 2 | -109/+43
No functional changes. Support for ARM's modified immediate syntax was added in r223113 and r223115 (review: D6408). That patch introduced the mod_imm* tblgen definitions, which render the existing so_imm* definitions redundant. This patch gets rid of them completely. Reviewed as: D6722 llvm-svn: 225266
2015-01-06 | [X86] Add OpSize32 to XBEGIN_4. Add XBEGIN_2 with OpSize16. | Craig Topper | 4 | -7/+40
Requires new AsmParserOperand types that detect 16-bit and 32/64-bit mode so that we choose the right instruction based on default sizing without predicates. This is necessary since predicates mess up the disassembler table building. llvm-svn: 225256
2015-01-06 | [X86] Make isel select the 2-byte register form of INC/DEC even in non-64-bit mode. | Craig Topper | 5 | -126/+78
Convert to the 1-byte form in non-64-bit mode as part of MCInst lowering. Overall this seems simpler. It reduces duplication of patterns between both modes and it simplifies the memory folding/unfolding tables as they don't need to create fake instructions just to keep track of 64-bitness. llvm-svn: 225252
2015-01-06 | [PowerPC] Remove old README.txt entry regarding struct passing | Hal Finkel | 1 | -8/+0
Because of how Clang represents structs as arrays (at least on non-Darwin platforms), and what SROA does, etc. this is no longer a problem. llvm-svn: 225251
2015-01-06 | X86: Don't make illegal GOTTPOFF relocations | David Majnemer | 2 | -0/+17
"ELF Handling for Thread-Local Storage" specifies that R_X86_64_GOTTPOFF relocations target a movq or addq instruction. Prohibit the truncation of such loads to movl or addl. This fixes PR22083. Differential Revision: http://reviews.llvm.org/D6839 llvm-svn: 225250
2015-01-06 | [PowerPC] Add some missing names in getTargetNodeName | Hal Finkel | 1 | -0/+7
These are used for debugging output; NFC. llvm-svn: 225249
2015-01-06 | [PowerPC] Improve int_to_fp(fp_to_int(x)) combining | Hal Finkel | 2 | -30/+74
The old target DAG combine that allowed for performing int_to_fp(fp_to_int(x)) without a load/store pair is updated here with support for unsigned integers, and to support single-precision values without a third rounding step, on newer cores with the appropriate instructions. llvm-svn: 225248