path: root/llvm/lib
Age | Commit message | Author | Files | Lines
2016-09-09 | Do not widen load for different variable in GVN. | Dehao Chen | 1 | -37/+1
Summary: Widening load in GVN is too early because it will block other optimizations like PRE, LICM. https://llvm.org/bugs/show_bug.cgi?id=29110

The SPECCPU2006 benchmark impact of this patch:

Reference: o2_nopatch
(1): o2_patched

Benchmark                           Base:Reference   (1)
--------------------------------------------------------
spec/2006/fp/C++/444.namd           25.2             -0.08%
spec/2006/fp/C++/447.dealII         45.92            +1.05%
spec/2006/fp/C++/450.soplex         41.7             -0.26%
spec/2006/fp/C++/453.povray         35.65            +1.68%
spec/2006/fp/C/433.milc             23.79            +0.42%
spec/2006/fp/C/470.lbm              41.88            -1.12%
spec/2006/fp/C/482.sphinx3          47.94            +1.67%
spec/2006/int/C++/471.omnetpp       22.46            -0.36%
spec/2006/int/C++/473.astar         21.19            +0.24%
spec/2006/int/C++/483.xalancbmk     36.09            -0.11%
spec/2006/int/C/400.perlbench       33.28            +1.35%
spec/2006/int/C/401.bzip2           22.76            -0.04%
spec/2006/int/C/403.gcc             32.36            +0.12%
spec/2006/int/C/429.mcf             41.04            -0.41%
spec/2006/int/C/445.gobmk           26.94            +0.04%
spec/2006/int/C/456.hmmer           24.5             -0.20%
spec/2006/int/C/458.sjeng           28               -0.46%
spec/2006/int/C/462.libquantum      55.25            +0.27%
spec/2006/int/C/464.h264ref         45.87            +0.72%
geometric mean                                       +0.23%

For most benchmarks, it's a wash, but we do see stable improvements on some benchmarks, e.g. 447, 453, 482, 400. Reviewers: davidxl, hfinkel, dberlin, sanjoy, reames Subscribers: gberry, junbuml Differential Revision: https://reviews.llvm.org/D24096 llvm-svn: 281074
2016-09-09 | Fix another -Wunused-variable for non-assert build. | Rui Ueyama | 1 | -3/+4
llvm-svn: 281073
2016-09-09 | Fix -Wunused-variable for non-assert build. | Rui Ueyama | 1 | -3/+2
llvm-svn: 281069
2016-09-09 | [pdb] Pass CVRecord's through the visitor as non-const references. | Zachary Turner | 5 | -85/+85
This simplifies a lot of code, and will actually be necessary for an upcoming patch to serialize TPI record hash values. The idea before was that visitors should be examining records, not modifying them. But this is no longer true with a visitor that constructs a CVRecord from Yaml. To handle this until now, we were doing some fixups on CVRecord objects at a higher level, but the code is really awkward, and it makes sense to just have the visitor write the bytes into the CVRecord. In doing so I uncovered a few bugs related to `Data` and `RawData` and fixed those. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D24362 llvm-svn: 281067
2016-09-09 | [libFuzzer] one more puzzle, value_profile cracks it in a second | Kostya Serebryany | 3 | -0/+25
llvm-svn: 281066
2016-09-09 | [pdb] Write PDB TPI Stream from Yaml. | Zachary Turner | 9 | -74/+177
This writes the full sequence of type records described in Yaml to the TPI stream of the PDB file. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D24316 llvm-svn: 281063
2016-09-09 | [codeview] Don't assert if the array element type is incomplete | Reid Kleckner | 1 | -15/+26
This can happen when the frontend knows the debug info will be emitted somewhere else. Usually this happens for dynamic classes with out of line constructors or key functions, but it can also happen when modules are enabled. llvm-svn: 281060
2016-09-09 | [AMDGPU] Assembler: better support for immediate literals in assembler. | Sam Kolton | 14 | -351/+708
Summary: Previously the assembler parsed all literals as either 32-bit integers or 32-bit floating-point values. Because of this we couldn't support f64 literals. E.g. in instruction "v_fract_f64 v[0:1], 0.5", literal 0.5 was encoded as 32-bit literal 0x3f000000, which is incorrect and will be interpreted as 3.0517578125E-5 instead of 0.5. Correct encoding is inline constant 240 (optimal) or 32-bit literal 0x3FE00000 at least.

With this change the way immediate literals are parsed is changed. All literals are always parsed as 64-bit values, either integer or floating-point. Then we convert parsed literals to the correct form based on information about the type of operand parsed (was the literal floating or binary) and the type of the expected instruction operands (is this an f32/64 or b32/64 instruction).

Here are the rules for how we convert literals:
- We parsed fp literal:
  - Instruction expects 64-bit operand:
    - If parsed literal is inlinable (e.g. v_fract_f64_e32 v[0:1], 0.5)
      - then we do nothing with this literal
    - Else if literal is not-inlinable but instruction requires to inline it (e.g. this is e64 encoding, v_fract_f64_e64 v[0:1], 1.5)
      - report error
    - Else literal is not-inlinable but we can encode it as additional 32-bit literal constant
      - If instruction expects fp operand type (f64)
        - Check if low 32 bits of literal are zeroes (e.g. v_fract_f64 v[0:1], 1.5)
          - If so then do nothing
        - Else (e.g. v_fract_f64 v[0:1], 3.1415)
          - report warning that low 32 bits will be set to zeroes and precision will be lost
          - set low 32 bits of literal to zeroes
      - Instruction expects integer operand type (e.g. s_mov_b64_e32 s[0:1], 1.5)
        - report error as it is unclear how to encode this literal
  - Instruction expects 32-bit operand:
    - Convert parsed 64 bit fp literal to 32 bit fp. Allow loss of precision but not overflow or underflow
    - Is this literal inlinable and are we required to inline literal (e.g. v_trunc_f32_e64 v0, 0.5)
      - do nothing
      - Else report error
    - Do nothing. We can encode any other 32-bit fp literal (e.g. v_trunc_f32 v0, 10000000.0)
- Parsed binary literal:
  - Is this literal inlinable (e.g. v_trunc_f32_e32 v0, 35)
    - do nothing
  - Else, are we required to inline this literal (e.g. v_trunc_f32_e64 v0, 35)
    - report error
  - Else, literal is not-inlinable and we are not required to inline it
    - Are high 32 bit of literal zeroes or same as sign bit (32 bit)
      - do nothing (e.g. v_trunc_f32 v0, 0xdeadbeef)
    - Else
      - report error (e.g. v_trunc_f32 v0, 0x123456789abcdef0)

For this change it is required that we know operand types of instruction (are they f32/64 or b32/64). I added several new register operands (they extend previous register operands) and set operand types to corresponding types:
'''
enum OperandType {
  OPERAND_REG_IMM32_INT,
  OPERAND_REG_IMM32_FP,
  OPERAND_REG_INLINE_C_INT,
  OPERAND_REG_INLINE_C_FP,
}
'''

This is not working yet:
- Several tests are failing
- Problems with predicate methods for inline immediates
- LLVM generated assembler parts try to select e64 encoding before e32.

More changes are required for several AsmOperands. Reviewers: vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, artem.tamazov Differential Revision: https://reviews.llvm.org/D22922 llvm-svn: 281050
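The bit patterns quoted in the summary above can be checked with a small standalone C++ sketch. It is not code from the patch: the helper names are hypothetical, and it assumes (as the message implies) that a 32-bit literal used as an f64 operand supplies the high 32 bits of the double while the low 32 bits are zero.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Raw IEEE-754 bit pattern of a 32-bit float.
uint32_t floatBits(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;
}

// Raw IEEE-754 bit pattern of a 64-bit double.
uint64_t doubleBits(double d) {
  uint64_t bits;
  std::memcpy(&bits, &d, sizeof bits);
  return bits;
}

// Hypothetical model of a 32-bit literal feeding an f64 operand:
// the literal becomes the high 32 bits, the low 32 bits are zero.
double f64FromHigh32(uint32_t literal) {
  uint64_t bits = static_cast<uint64_t>(literal) << 32;
  double d;
  std::memcpy(&d, &bits, sizeof d);
  return d;
}

// True when the low 32 bits of the double are zero, i.e. the value
// survives being carried by a 32-bit literal without losing precision.
bool low32IsZero(double d) {
  return (doubleBits(d) & 0xFFFFFFFFull) == 0;
}
```

Under that assumption, the f32 pattern of 0.5 (0x3f000000) reads back as 2^-15 when used as an f64 literal, whereas 0x3FE00000 reads back as 0.5. The `low32IsZero` check mirrors the rule that distinguishes 1.5 (no warning) from 3.1415 (precision lost).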
2016-09-09 | [Sparc][LEON] Removed the parts of the errata fixes implemented using inline assembly as this is not the desired behaviour for end-users. | Chris Dewhurst | 1 | -76/+0
Small change to a unit test to implement this without requiring the inline assembly. llvm-svn: 281047
2016-09-09 | [ARM] ADD with a negative offset can become SUB for free | James Molloy | 1 | -0/+4
So model that directly in TTI::getIntImmCost(). llvm-svn: 281044
2016-09-09 | [ARM] icmp %x, -C can be lowered to a simple ADDS or CMN | James Molloy | 1 | -0/+11
Tell TargetTransformInfo about this so ConstantHoisting is informed. llvm-svn: 281043
2016-09-09 | [SelectionDAG] Ensure DAG::getZeroExtendInReg is called with a scalar type | Simon Pilgrim | 2 | -3/+3
Fixes issue with rL280927 identified by Mikael Holmén llvm-svn: 281042
2016-09-09 | [Thumb] Select (CMPZ X, -C) -> (CMPZ (ADDS X, C), 0) | James Molloy | 1 | -0/+42
The CMPZ #0 disappears during peepholing, leaving just a tADDi3, tADDi8 or t2ADDri. This avoids having to materialize the expensive negative constant in Thumb-1, and allows a shrinking from a 32-bit CMN to a 16-bit ADDS in Thumb-2. llvm-svn: 281040
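The rewrite relies on the identity that, in wrapping 32-bit arithmetic, `x == -C` holds exactly when `x + C == 0`. A minimal C++ sketch of that identity follows; the function names are hypothetical and this is not the selection pattern itself.

```cpp
#include <cassert>
#include <cstdint>

// Direct form: compare x against the negated constant (wrapping negate).
bool equalsNegatedConstant(uint32_t x, uint32_t c) {
  return x == (0u - c);
}

// Rewritten form: add the positive constant, then test the result for zero.
bool addThenTestZero(uint32_t x, uint32_t c) {
  return (x + c) == 0u;
}
```

Both functions agree for every input, which is why the comparison against a negative constant can be replaced by an add of the positive constant followed by a zero test.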
2016-09-09 | GlobalISel: remove G_TYPE and G_PHI | Tim Northover | 5 | -20/+3
These instructions were only necessary when type information was stored in the MachineInstr (because only generic MachineInstrs possessed a type). Now that it's in MachineRegisterInfo, COPY and PHI work fine. llvm-svn: 281037
2016-09-09 | GlobalISel: fix comments and add assertions for valid instructions. | Tim Northover | 1 | -4/+88
llvm-svn: 281036
2016-09-09 | GlobalISel: move type information to MachineRegisterInfo. | Tim Northover | 17 | -383/+275
We want each register to have a canonical type, which means the best place to store this is in MachineRegisterInfo rather than on every MachineInstr that happens to use or define that register. Most changes following from this are pretty simple (you need an MRI anyway if you're going to be doing any transformations, so just check the type there). But legalization doesn't really want to check redundant operands (when, for example, a G_ADD only ever has one type) so I've made use of MCInstrDesc's operand type field to encode these constraints and limit legalization's work. As an added bonus, more validation is possible, both in MachineVerifier and MachineIRBuilder (coming soon). llvm-svn: 281035
2016-09-09 | Revert "[mips] Fix c.<cc>.<fmt> instruction definition." | Simon Dardis | 15 | -539/+209
This reverts commit r281022. Mips buildbot broke, due to unhandled register class FCC. llvm-svn: 281033
2016-09-09 | [AMDGPU] Assembler: rename amd_kernel_code_t asm names according to spec | Sam Kolton | 3 | -242/+85
Summary: Also removed duplicate code from AMDGPUTargetAsmStreamer. This change only changes how amd_kernel_code_t is parsed and printed. No variable names are changed. Reviewers: vpykhtin, tstellarAMD Subscribers: arsenm, wdng, nhaehnle Differential Revision: https://reviews.llvm.org/D24296 llvm-svn: 281028
2016-09-09 | [Thumb1] Teach optimizeCompareInstr about thumb1 compares | James Molloy | 1 | -4/+21
This avoids us doing a completely unneeded "cmp r0, #0" after a flag-setting instruction if we only care about the Z or C flags. Add LSL/LSR to the whitelist while we're here and add testing. This code could really do with a spring clean. llvm-svn: 281027
2016-09-09 | [AMDGPU] Assembler: match e32 VOP instructions before e64. | Sam Kolton | 7 | -32/+126
Summary: Split assembler match table in 4 tables with assembler variants:
- Default - all instructions except VOP3, SDWA and DPP
- VOP3
- SDWA
- DPP

First match Default table then VOP3, SDWA and DPP. Reviewers: tstellarAMD, artem.tamazov, vpykhtin Subscribers: arsenm, wdng, nhaehnle, AMDGPU Differential Revision: https://reviews.llvm.org/D24252 llvm-svn: 281023
2016-09-09 | [mips] Fix c.<cc>.<fmt> instruction definition. | Simon Dardis | 15 | -209/+539
As part of this effort, remove MipsFCmp nodes and use tablegen patterns rather than custom lowering through C++. Unexpectedly, this improves codesize for microMIPS as previous floating point setcc expansions would materialize 0 and 1 into GPRs before using the relevant mov[tf].[sd] instruction. Now $zero is used directly. Reviewers: dsanders, vkalintiris, zoran.jovanovic Differential Review: https://reviews.llvm.org/D23118 llvm-svn: 281022
2016-09-09 | [Coroutines] Part13: Handle single edge PHINodes across suspends | Gor Nishanov | 3 | -4/+28
Summary: If one of the uses of the value is a single edge PHINode, handle it.

Original:

    %val = something
    <suspend>
    %p = PHINode [%val]

After Spill + Part13:

    %val = something
    %slot = gep val.spill.slot
    store %val, %slot
    <suspend>
    %p = load %slot

Plus tiny fixes/changes:
* use correct index for coro.free in CoroCleanup
* fixup id parameter in coro.free to allow authoring coroutine in plain C with __builtins

Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24242 llvm-svn: 281020
2016-09-09 | Rationalise the attribute getter/setter methods on Function and CallSite. | Amaury Sechet | 2 | -40/+4
Summary: While working on mapping attributes in the C API, it clearly appeared that the recent changes in the API on the C++ side left Function and Call/Invoke with an attribute API that grew in an ad hoc manner. This makes it difficult to work with, because one doesn't know which overloads exist and which do not. Make sure that getter/setter functions exist for both the enum and string versions. Remove inconsistent getters/setters, unless they have many callsites. This should make it easier to work with attributes in the future. This doesn't change how attributes work. Reviewers: bkramer, whitequark, mehdi_amini, void Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21514 llvm-svn: 281019
2016-09-09 | [libFuzzer] improve -print_pcs to not print new PCs coming from libFuzzer itself | Kostya Serebryany | 2 | -8/+19
llvm-svn: 281016
2016-09-09 | [libFuzzer] remove unneeded call | Kostya Serebryany | 2 | -9/+0
llvm-svn: 281014
2016-09-09 | [AVX-512] Add VPCMP instructions to the load folding tables and make them commutable. | Craig Topper | 2 | -1/+57
llvm-svn: 281013
2016-09-09 | [libFuzzer] remove use_traces=1 since use_value_profile seems to be strictly better | Kostya Serebryany | 6 | -67/+9
llvm-svn: 281007
2016-09-09 | [X86] Tighten up a comment which confused x64 ABI terminology. | David Majnemer | 1 | -3/+3
The x64 ABI has two major function types:
- frame functions
- leaf functions

A frame function is one which requires a stack frame. A leaf function is one which does not. A frame function may or may not have a frame pointer. A leaf function does not require a stack frame and may never modify SP except via a return (RET, tail call via JMP). A frame function which has a frame pointer is permitted to use the LEA instruction in the epilogue; a frame function which doesn't establish a frame pointer must use ADD to adjust the stack pointer in the epilogue. Fun fact: Leaf functions don't require a function table entry (associated PDATA/XDATA). llvm-svn: 281006
2016-09-08 | Win64: Don't use REX prefix for direct tail calls | Hans Wennborg | 5 | -8/+4
The REX prefix should be used on indirect jmps, but not direct ones. For direct jumps, the unwinder looks at the offset to determine if it's inside the current function. Differential Revision: https://reviews.llvm.org/D24359 llvm-svn: 281003
2016-09-08 | Remove debug info when hoisting instruction from then/else branch. | Dehao Chen | 1 | -0/+8
Summary: The hoisted instruction is executed speculatively. It could affect the debugging experience as user would see gdb go into code that may not be expected to execute. It will also affect sample profile accuracy by assigning incorrect frequency to source within then/else branch. Reviewers: davidxl, dblaikie, chandlerc, kcc, echristo Subscribers: mehdi_amini, probinson, eric_niebler, andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D24164 llvm-svn: 280995
2016-09-08 | [LV] Ensure proper handling of multi-use case when collecting uniforms | Matthew Simpson | 1 | -5/+5
The test case included in r280979 wasn't checking what it was supposed to be checking for the predicated store case. Fixing the test revealed that the multi-use case (when a pointer is used by both vectorized and scalarized memory accesses) wasn't being handled properly. We can't skip over non-consecutive-like pointers since they may have looked consecutive-like with a different memory access. llvm-svn: 280992
2016-09-08 | [RDF] Further improve handling of multiple phis reached from shadows | Krzysztof Parzyszek | 1 | -31/+16
llvm-svn: 280987
2016-09-08 | [LV] Don't mark pointers used by scalarized memory accesses uniform | Matthew Simpson | 1 | -42/+143
Previously, all consecutive pointers were marked uniform after vectorization. However, if a consecutive pointer is used by a memory access that is eventually scalarized, the pointer won't remain uniform after all. An example is predicated stores. Even though a predicated store may be consecutive, it will still be scalarized, making its pointer operand non-uniform. This patch updates the logic in collectLoopUniforms to consider the cases where a memory access may be scalarized. If a memory access may be scalarized, its pointer operand is not marked uniform. The determination of whether a given memory instruction will be scalarized or not has been moved into a common function that is used by the vectorizer, cost model, and legality analysis. Differential Revision: https://reviews.llvm.org/D24271 llvm-svn: 280979
2016-09-08 | [YAMLIO] Add the ability to map with context. | Zachary Turner | 1 | -1/+2
Mapping a yaml field to an object in code has always been a stateless operation. You could still pass state by using the `setContext` function of the YAMLIO object, but this represented global state for the entire yaml input. In order to have context-sensitive state, it is necessary to pass this state in at the granularity of an individual mapping. This patch adds support for this type of context-sensitive state. You simply pass an additional argument of type T to the `mapRequired` or `mapOptional` functions, and provided you have specialized a `MappingContextTraits<U, T>` class with the appropriate mapping function, you can pass this context into the mapping function. Reviewed By: chandlerc Differential Revision: https://reviews.llvm.org/D24162 llvm-svn: 280977
2016-09-08 | AMDGPU: Sign extend constants when splitting them | Matt Arsenault | 1 | -3/+2
This will confuse later passes which try to look at the immediate value and don't truncate first. llvm-svn: 280974
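As a rough illustration of what sign-extending the halves means, the following standalone C++ sketch splits a 64-bit constant into two 32-bit pieces and stores each piece sign-extended in a 64-bit immediate. The helper name and the pair layout are assumptions made for the example; the actual change lives in the AMDGPU backend and is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper: split a 64-bit constant into (low, high) halves,
// each held as a 64-bit immediate sign-extended from 32 bits.
std::pair<int64_t, int64_t> splitSignExtended(uint64_t imm) {
  int64_t lo = static_cast<int32_t>(static_cast<uint32_t>(imm));
  int64_t hi = static_cast<int32_t>(static_cast<uint32_t>(imm >> 32));
  return {lo, hi};
}
```

With this convention a low half of 0xFFFFFFFF is stored as -1 rather than 4294967295, so a consumer that inspects the immediate without truncating it to 32 bits first still sees the value a 32-bit operand would have.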
2016-09-08 | [Hexagon] Expand sext- and zextloads of vector types, not just extloads | Krzysztof Parzyszek | 1 | -1/+5
Recent change exposed this issue, breaking the Hexagon buildbots. llvm-svn: 280973
2016-09-08 | AMDGPU: Try to commute when selecting s_addk_i32/s_mulk_i32 | Matt Arsenault | 1 | -9/+14
llvm-svn: 280972
2016-09-08 | AArch64 .arch directive - Include default arch attributes with extensions. | Eric Christopher | 1 | -3/+39
Fix the .arch asm parser to use the full set of features for the architecture and any extensions on the command line. Add and update testcases accordingly as well as add an extension that was used but not supported. llvm-svn: 280971
2016-09-08 | AMDGPU: Support commuting with immediate in src0 | Matt Arsenault | 2 | -98/+81
llvm-svn: 280970
2016-09-08 | Revert "[XRay] ARM 32-bit no-Thumb support in LLVM" | Renato Golin | 11 | -213/+61
And associated commits, as they broke the Thumb bots. This reverts commit r280935. This reverts commit r280891. This reverts commit r280888. llvm-svn: 280967
2016-09-08 | [LoopDataPrefetch] Use range based for loop; NFCI | Balaram Makam | 1 | -17/+12
Switch to range based for loop. No functional change, but more readable code. llvm-svn: 280966
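For readers unfamiliar with the idiom, here is a generic before/after in C++. It is an illustrative sketch only and does not reproduce the LoopDataPrefetch code; both function names are made up for the example.

```cpp
#include <cassert>
#include <vector>

// Explicit iterator loop, the style being replaced.
int sumWithIterators(const std::vector<int> &values) {
  int total = 0;
  for (std::vector<int>::const_iterator it = values.begin(),
                                        end = values.end();
       it != end; ++it)
    total += *it;
  return total;
}

// Equivalent range-based for loop: same traversal, less boilerplate.
int sumWithRangeFor(const std::vector<int> &values) {
  int total = 0;
  for (int value : values)
    total += value;
  return total;
}
```

The two functions visit the same elements in the same order, which is why such a change can be marked as having no functional change intended.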
2016-09-08 | [InstCombine] return a vector-safe true/false constant | Sanjay Patel | 1 | -2/+2
I introduced this potential bug by missing this diff in: https://reviews.llvm.org/rL280873 ...however, I'm not sure how to reach this code path with a regression test. We may be able to remove this code and assume that the transform to a constant is always handled by InstSimplify? llvm-svn: 280964
2016-09-08 | revert r280427 | Dehao Chen | 2 | -6/+4
Refactor replaceDominatedUsesWith to have a flag to control whether to replace uses in BB itself. Summary: This is in preparation for LoopSink pass which calls replaceDominatedUsesWith to update after sinking. llvm-svn: 280949
2016-09-08 | [ARM XRay] Try to fix Thumb-only failure | Renato Golin | 1 | -1/+1
I missed the check that it had to support ARM to work. This commit tries to fix that, to make sure we don't emit ARM code in Thumb-only mode. llvm-svn: 280935
2016-09-08 | [SDAGBuilder] Don't create a binary tree for switches in minsize mode | James Molloy | 1 | -1/+2
This bloats codesize - all of the non-leaf nodes are extra code. llvm-svn: 280932
2016-09-08 | [Thumb1] AND with a constant operand can be converted into BIC | James Molloy | 1 | -0/+4
So model the cost of materializing the constant operand C as the minimum of C and ~C. llvm-svn: 280929
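A toy C++ model of "the minimum of C and ~C" is sketched below. The cost function is a deliberately simplified assumption (1 if the immediate fits in 8 bits, 2 otherwise) and is not the real Thumb-1 cost table; both function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Toy cost of materializing an immediate (assumption for illustration):
// 1 if it fits in 8 bits, otherwise 2.
unsigned toyImmCost(uint32_t imm) {
  return imm < 256 ? 1u : 2u;
}

// AND x, C can be done as BIC x, ~C, so charge whichever of C and ~C
// is cheaper to materialize.
unsigned toyAndImmCost(uint32_t c) {
  return std::min(toyImmCost(c), toyImmCost(~c));
}
```

For example, the mask 0xFFFFFF00 is expensive on its own in this toy model, but its complement 0xFF is cheap, so the AND is costed as cheap.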
2016-09-08 | [Thumb1] Fix cost calculation for complemented immediates | James Molloy | 1 | -1/+1
Materializing something like "-3" can be done as 2 instructions:

    MOV r0, #3
    MVN r0, r0

This has a cost of 2, not 3. It looks like we were already trying to detect this pattern in TII::getIntImmCost(), but were taking the complement of the zero-extended value instead of the sign-extended value which is unlikely to ever produce a number < 256. There were no tests failing after changing this... :/ llvm-svn: 280928
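The difference between complementing the zero-extended and the sign-extended value can be seen in a short standalone C++ sketch. The helper names are hypothetical and the 64-bit widening is an assumption about how the immediate was held; this is not the code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Complement of the zero-extended 32-bit immediate, checked against 256.
bool complementIsSmallZExt(int32_t imm) {
  uint64_t zext = static_cast<uint32_t>(imm); // upper 32 bits are zero
  return ~zext < 256;                         // upper 32 bits become ones
}

// Complement of the sign-extended 32-bit immediate, checked against 256.
bool complementIsSmallSExt(int32_t imm) {
  int64_t sext = imm;                         // upper bits copy the sign
  return static_cast<uint64_t>(~sext) < 256;
}
```

For a small negative immediate such as -3, the sign-extended complement is 2 and passes the check, while the zero-extended complement has all of its upper 32 bits set and never does.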
2016-09-08 | [SelectionDAG] Add BUILD_VECTOR support to computeKnownBits and SimplifyDemandedBits | Simon Pilgrim | 2 | -0/+47
Add the ability to computeKnownBits and SimplifyDemandedBits to extract the known zero/one bits from BUILD_VECTOR, returning the known bits that are shared by every vector element. This is an initial step towards determining the sign bits of a vector (PR29079). Differential Revision: https://reviews.llvm.org/D24253 llvm-svn: 280927
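The idea of "known bits shared by every vector element" can be sketched for the simplest case, where every element is a constant. The C++ below is a standalone illustration with hypothetical names and 8-bit elements; it does not use the SelectionDAG API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Bits known to be zero and bits known to be one across a whole vector.
struct KnownBits8 {
  uint8_t zero;
  uint8_t one;
};

// For a non-empty vector of constant elements, a bit is known one only if
// it is set in every element, and known zero only if it is clear in every
// element. Intersect the per-element knowledge with AND.
KnownBits8 knownBitsOfConstantVector(const std::vector<uint8_t> &elts) {
  assert(!elts.empty() && "need at least one element");
  KnownBits8 known{0xFF, 0xFF};
  for (uint8_t e : elts) {
    known.one &= e;
    known.zero &= static_cast<uint8_t>(~e);
  }
  return known;
}
```

For the elements 0x0C and 0x0E, bits 2 and 3 are set in both (known one 0x0C), bit 1 differs and is therefore unknown, and every remaining bit is clear in both (known zero 0xF1).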
2016-09-08 | [DAGCombiner] Enable AND combines of splatted constant vectors | Simon Pilgrim | 1 | -7/+7
Allow AND combines to use a vector splatted constant as well as a constant scalar. Preliminary part of D24253. llvm-svn: 280926
2016-09-08 | Revert "[ARM] Lower UDIV+UREM to UDIV+MLS (and the same for SREM)" | Pablo Barrio | 1 | -18/+1
This reverts commit r280808. It is possible that this change results in an infinite loop. This is causing timeouts in some tests on ARM, and a Chromebook bot is failing. llvm-svn: 280918