path: root/llvm/lib/CodeGen
Age  Commit message  Author  Files  Lines
2018-04-24[DAGCombiner][X86] When promoting loads don't use ZEXTLOAD even if it's legalCraig Topper1-8/+4
We were previously preferring ZEXTLOAD over EXTLOAD if it is legal. This triggers during X86's promotion of i16->i32. Not sure about other targets. Using ZEXTLOAD can prevent folding it to SEXTLOAD later if we were to promote a sign extended operand like we would need for SRA. However, X86 doesn't currently promote i16 SRA. I was looking into doing that, which is how I found this issue. This is also blocking our ability to fold 4 byte aligned EXTLOADs with "loadi32". This is what caused most of the test changes here. Differential Revision: https://reviews.llvm.org/D45585#inline-402825 llvm-svn: 330781
2018-04-24[X86] Account for partial stack slot spills (PR30821)Warren Ristow1-3/+7
Previously, _any_ store or load instruction was considered to be operating on a spill if it had a frameindex as an operand, and thus was fair game for optimisations such as "StackSlotColoring". This usually works, except on architectures where spills can be partially restored, for example on X86 where a spilt vector can have a single component loaded (zeroing the rest of the target register). This can be mis-interpreted and the zero extension unsoundly eliminated, see pr30821. To avoid this, this commit optionally provides the caller to isLoadFromStackSlot and isStoreToStackSlot with the number of bytes spilt/loaded by the given instruction. Optimisations can then determine that a full spill followed by a partial load (or vice versa), for example, cannot necessarily be commuted. Patch by Jeremy Morse! Differential Revision: https://reviews.llvm.org/D44782 llvm-svn: 330778
2018-04-24[CodeGen] Print user-friendly debug locations as MI commentsFrancis Visoiu Mistrih1-1/+14
If available, print the file, line and column of the DebugLoc attached to the MachineInstr:

  MOV16mr $rbp, 1, $noreg, -112, $noreg, killed renamable $ax, debug-location !56 :: (store 2 into %ir.._value12); stepping.swift:10:17
  renamable $edx = MOVZX32rm16 $rbp, 1, $noreg, -112, $noreg, debug-location !62 :: (dereferenceable load 2 from %ir.._value13); stepping.swift:10:17

Differential Revision: https://reviews.llvm.org/D45992
llvm-svn: 330709
2018-04-24Correct dwarf unwind information in function epiloguePetar Jovanovic5-12/+368
This patch aims to provide correct dwarf unwind information in function epilogue for X86. It consists of two parts.

The first part inserts CFI instructions that set appropriate cfa offset and cfa register in emitEpilogue() in X86FrameLowering. This part is X86 specific.

The second part is platform independent and ensures that:
* CFI instructions do not affect code generation (they are not counted as instructions when tail duplicating or tail merging)
* Unwind information remains correct when a function is modified by different passes. This is done in a late pass by analyzing information about cfa offset and cfa register in BBs and inserting additional CFI directives where necessary.

Added CFIInstrInserter pass:
* analyzes each basic block to determine which cfa offset and register are valid at its entry and exit
* verifies that outgoing cfa offset and register of predecessor blocks match incoming values of their successors
* inserts additional CFI directives at basic block beginning to correct the rule for calculating CFA

Having CFI instructions in function epilogue can cause incorrect CFA calculation rule for some basic blocks. This can happen if, due to basic block reordering, or the existence of multiple epilogue blocks, some of the blocks have wrong cfa offset and register values set by the epilogue block above them.

CFIInstrInserter is currently run only on X86, but can be used by any target that implements support for adding CFI instructions in epilogue.

Patch by Violeta Vukobrat.

Differential Revision: https://reviews.llvm.org/D42848
llvm-svn: 330706
2018-04-24[CodeGen] Do not allow opt-bisect-limit to skip ScalarizeMaskedMemIntrin.Andrei Elovikov1-3/+0
Summary: The pass is supposed to scalarize such intrinsics if the target does not support them natively, so if the scalarization does not happen instruction selection crashes due to inability to lower these intrinsics. Reviewers: andrew.w.kaylor, craig.topper Reviewed By: andrew.w.kaylor Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45947 llvm-svn: 330700
2018-04-23[DAGCombiner] Unfold scalar masked merge if profitableRoman Lebedev1-0/+67
Summary: This is [[ https://bugs.llvm.org/show_bug.cgi?id=37104 | PR37104 ]]. [[ https://bugs.llvm.org/show_bug.cgi?id=6773 | PR6773 ]] will introduce an IR canonicalization that is likely bad for the end assembly. Previously, `andl`+`andn` / `andps`+`andnps` / `bic`/`bsl` would be generated (see `@out`). Now, they would no longer be generated (see `@in`). So we need to make sure that they are still generated. If the mask is constant, we do nothing; InstCombine should have unfolded it. Otherwise, I use the `hasAndNot()` TLI hook. For now, only handle scalars. https://rise4fun.com/Alive/bO6 ---- I *really* don't like the code I wrote in `DAGCombiner::unfoldMaskedMerge()`. It is super fragile. Is there something like IR Pattern Matchers for this? Reviewers: spatel, craig.topper, RKSimon, javed.absar Reviewed By: spatel Subscribers: andreadb, courbet, kristof.beyls, javed.absar, rengolin, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D45733 llvm-svn: 330646
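The two shapes of a scalar masked merge discussed above can be compared in plain C++. This is a minimal sketch, not the DAGCombiner code: the function names are hypothetical, and it assumes the folded form is the XOR-based expression while the unfolded form is the AND / AND-NOT / OR expression that maps onto and-not style instructions.

```cpp
#include <cassert>
#include <cstdint>

// Folded form (assumed canonical shape): one AND and two XORs.
constexpr uint32_t maskedMergeFolded(uint32_t x, uint32_t y, uint32_t m) {
  return ((x ^ y) & m) ^ y;
}

// Unfolded form: bits of x where m is set, bits of y elsewhere.
// The (y & ~m) half is what an and-not instruction computes in one step.
constexpr uint32_t maskedMergeUnfolded(uint32_t x, uint32_t y, uint32_t m) {
  return (x & m) | (y & ~m);
}

// Both forms select the same bits for any inputs.
void checkMaskedMerge(uint32_t x, uint32_t y, uint32_t m) {
  assert(maskedMergeFolded(x, y, m) == maskedMergeUnfolded(x, y, m));
}
```

Whether the unfolded form is actually cheaper depends on the target having an and-not instruction, which is the question the `hasAndNot()` hook answers in the commit.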
2018-04-23[SelectionDAG] Dump debug locs in SDNodesVedant Kumar1-0/+4
This helps debug issues where selection-dag assigns the wrong location to an instruction. Differential Revision: https://reviews.llvm.org/D45913 llvm-svn: 330618
2018-04-23StackSlotColoring: Fix missing skipFunction checkMatt Arsenault1-0/+3
llvm-svn: 330606
2018-04-23[SelectionDAG] Refactor lowering of atomic memory intrinsics.Daniel Neilson2-91/+150
Summary: This just refactors the lowering of the atomic memory intrinsics to more closely match the code patterns used in the lowering of the non-atomic memory intrinsics. Specifically, we encapsulate the lowering in SelectionDAG::getAtomicMem*() functions rather than embedding the code directly in the SelectionDAGBuilder code. llvm-svn: 330603
2018-04-21[AArch64] Don't crash trying to resolve __stack_chk_guard.Eli Friedman1-2/+5
In certain cases, the compiler might try to merge __stack_chk_guard with another global variable. (Or someone could theoretically define __stack_chk_guard as an alias.) In that case, make sure we don't crash. Differential Revision: https://reviews.llvm.org/D45746 llvm-svn: 330495
2018-04-20Remove unused argument from emitModuleMetadata.Eric Christopher2-7/+7
NFCI. llvm-svn: 330470
2018-04-20[DAGCombine] (float)((int) f) --> ftrunc (PR36617)Sanjay Patel1-0/+18
This was originally committed at rL328921 and reverted at rL329920 to investigate failures in Chrome. This time I've added to the ReleaseNotes to warn users of the potential of exposing UB and let me repeat that here for more exposure:

Optimization of floating-point casts is improved. This may cause surprising results for code that is relying on undefined behavior. Code sanitizers can be used to detect affected patterns such as this:

  int main() {
    float x = 4294967296.0f;
    x = (float)((int)x);
    printf("junk in the ftrunc: %f\n", x);
    return 0;
  }

  $ clang -O1 ftrunc.c -fsanitize=undefined ; ./a.out
  ftrunc.c:5:15: runtime error: 4.29497e+09 is outside the range of representable values of type 'int'
  junk in the ftrunc: 0.000000

Original commit message:

fptosi / fptoui round towards zero, and that's the same behavior as ISD::FTRUNC, so replace a pair of casts with the equivalent node. We don't have to account for special cases (NaN, INF) because out-of-range casts are undefined.

Differential Revision: https://reviews.llvm.org/D44909
llvm-svn: 330437
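The equivalence behind this fold can be checked with standard C++ for in-range inputs. This is a minimal sketch with hypothetical helper names: `std::trunc` stands in for ISD::FTRUNC, and the cast pair is only called with values whose truncation fits in `int`, since anything else is undefined behavior.

```cpp
#include <cassert>
#include <cmath>

// The cast pair as written in source code. Only defined when the
// truncated value fits in int.
float truncViaInt(float f) { return static_cast<float>(static_cast<int>(f)); }

// Rounds toward zero without leaving floating point.
float truncDirect(float f) { return std::trunc(f); }

// For in-range inputs the two agree, which is what justifies the fold.
void checkInRange(float f) {
  assert(truncViaInt(f) == truncDirect(f));
}
```

For an out-of-range input such as 4294967296.0f, `truncDirect` simply returns the input unchanged, while the cast pair has no defined result. That gap is where the "surprising results" in the release note come from.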
2018-04-20Move a dump() implementation out of line.Amara Emerson1-0/+11
Fixes some link issues. llvm-svn: 330384
2018-04-19[MachineOutliner] NFC: Move EnableLinkOnceODROutlining into MachineOutliner.cppJessica Paquette2-10/+20
This moves the EnableLinkOnceODROutlining flag from TargetPassConfig.cpp into MachineOutliner.cpp. It also removes OutlineFromLinkOnceODRs from the MachineOutliner constructor. This is now handled by the moved command-line flag. llvm-svn: 330373
2018-04-19[if-converter] Handle BBs that terminate in ret during diamond conversionKrzysztof Parzyszek1-11/+28
This fixes https://llvm.org/PR36825. Original patch by Valentin Churavy (D45218). Differential Revision: https://reviews.llvm.org/D45731 llvm-svn: 330345
2018-04-18[DEBUG] Initial adaptation of NVPTX target for debug info emission.Alexey Bataev1-7/+15
Summary: Patch adds initial emission of the debug info for NVPTX target. Currently, only .file and .loc directives are emitted, everything else is commented out to not break the compilation of Cuda. Reviewers: echristo, jlebar, tra, jholewinski Subscribers: mgorny, aprantl, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D41827 llvm-svn: 330271
2018-04-18[AMDGPU] Fix issues for backend divergence trackingDavid Stuttard1-0/+1
Summary: A change to use divergence analysis in the AMDGPU backend was getting formal arguments incorrect (not tagged as divergent) unless they were VGPR0, VGPR1 or VGPR2. For graphics shaders it is possible to have more than these passed in as VGPRs. Modified the checking code to check for any VGPR registers passed in as formal arguments. Also, some intrinsics that are sources of divergence may have been lowered during instruction selection and are missed on subsequent calls to isSDNodeSourceOfDivergence; added the relevant AMDGPUISD checks as well. Finally, the FunctionLoweringInfo tracks virtual registers that are live across basic block boundaries. This is used to check for divergence of CopyFromRegister registers using the DivergenceAnalysis analysis. For multiple blocks the lazily evaluated inverted map VirtReg2Value was not cleared when the ValueMap map was. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45372 Change-Id: I112f3bd6dfe0f62e63ce9b43b893982778e4bee3 llvm-svn: 330257
2018-04-18[CodeGen/Dwarf] Make debug_names compatible with split-dwarfPavel Labath3-7/+17
Summary: Previously we crashed for the combination of the two features because we tried to reference the dwo CU from the main object file. The fix consists of two items: - reference the skeleton CU from the name index (the consumer is expected to use the skeleton CU to find the real data). - use the main object file string pool for the strings in the index Reviewers: JDevlieghere, aprantl, dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45566 llvm-svn: 330249
2018-04-17[XRay] Typed event logging intrinsicKeith Wyss3-0/+72
Summary: Add an LLVM intrinsic for type discriminated event logging with XRay. Similar to the existing intrinsic for custom events, but also accepts a type tag argument to allow plugins to be aware of different types and semantically interpret logged events they know about without choking on those they don't. Relies on a symbol defined in compiler-rt patch D43668. I may wait to submit until I can demo everything working together, including a still-to-come clang patch. Reviewers: dberris, pelikan, eizan, rSerge, timshen Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45633 llvm-svn: 330219
2018-04-17Fix incorrect choice of callee-saved registers save/restore pointsMomchil Velikov1-2/+14
Make the shrink wrapping pass pay attention to uses/defs of the stack pointer. Differential revision: https://reviews.llvm.org/D45524 llvm-svn: 330183
2018-04-17[DAGCombiner] Fix for oss-fuzz bugGerolf Hoflehner1-1/+2
llvm-svn: 330178
2018-04-16[CodeView] Initial support for emitting S_THUNK32 symbols for compiler...Brock Wyma2-1/+62
When emitting CodeView debug information, compiler-generated thunk routines should be emitted using S_THUNK32 symbols instead of S_GPROC32_ID symbols so Visual Studio can properly step into the user code. This initial support only handles standard thunk ordinals. Differential Revision: https://reviews.llvm.org/D43838 llvm-svn: 330132
2018-04-16[MIR-Canon] Adding ISA-Agnostic COPY Folding.Puyan Lotfi1-0/+43
Transforms the following:

  %vreg1234:gpr32 = COPY %42
  %vreg1235:gpr32 = COPY %vreg1234
  %vreg1236:gpr32 = COPY %vreg1235
  $w0 = COPY %vreg1236

into:

  $w0 = COPY %42

Assuming %42 is also a gpr32.

llvm-svn: 330113
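The chain-collapsing idea can be sketched outside of LLVM with a map from each COPY destination to its source. This is an illustrative model only: the function name and the string-keyed map are hypothetical, register classes are ignored, and the chain is assumed to be acyclic (as it is for SSA virtual registers).

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// copyOf maps the destination of a COPY to the register it copies from.
// Follows the chain until reaching a register that no COPY defines.
std::string resolveCopySource(
    const std::unordered_map<std::string, std::string> &copyOf,
    std::string reg) {
  auto it = copyOf.find(reg);
  while (it != copyOf.end()) {
    reg = it->second;        // step one COPY back along the chain
    it = copyOf.find(reg);
  }
  return reg;
}

void exampleCopyChain() {
  std::unordered_map<std::string, std::string> copyOf = {
      {"%vreg1234", "%42"},
      {"%vreg1235", "%vreg1234"},
      {"%vreg1236", "%vreg1235"},
  };
  // $w0 = COPY %vreg1236 can be rewritten to $w0 = COPY %42.
  assert(resolveCopySource(copyOf, "%vreg1236") == "%42");
}
```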
2018-04-16[NFC][MIR-Canon] clang-format cleanup of Mir Canonicalizer Pass.Puyan Lotfi1-66/+60
llvm-svn: 330111
2018-04-15[X86] Use APInt::isSubsetOf instead of APInt::intersects to avoid a negation of an APInt value. NFCCraig Topper1-2/+2
llvm-svn: 330105
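The identity behind this change can be shown with plain 64-bit integers standing in for APInt. This is a minimal sketch: the free functions below are hypothetical stand-ins, not the APInt members, and they only model the bit semantics. On a real APInt, writing `~B` presumably creates a temporary value, which the subset query avoids.

```cpp
#include <cassert>
#include <cstdint>

// True if a and b share at least one set bit.
constexpr bool bitsIntersect(uint64_t a, uint64_t b) { return (a & b) != 0; }

// True if every set bit of a is also set in b.
constexpr bool bitsAreSubsetOf(uint64_t a, uint64_t b) { return (a & ~b) == 0; }

// "a does not intersect ~b" and "a is a subset of b" are the same question.
void checkSubsetIdentity(uint64_t a, uint64_t b) {
  assert(bitsAreSubsetOf(a, b) == !bitsIntersect(a, ~b));
}
```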
2018-04-15[SelectionDAG][NFC] haveNoCommonBitsSet(): add FIXME notesRoman Lebedev1-0/+2
As suggested in https://reviews.llvm.org/D45631#1068338 llvm-svn: 330102
2018-04-15[MC] Moved all the remaining logic that computed instruction latency and reciprocal throughput from TargetSchedModel to MCSchedModel.Andrea Di Biagio2-37/+12
TargetSchedModel now always delegates to MCSchedModel the computation of instruction latency and reciprocal throughput. No functional change intended. llvm-svn: 330099
2018-04-15[DAGCombiner, PowerPC] allow X - fpext(-Y) --> X + fpext(Y) with multiple usesSanjay Patel1-6/+6
This is a transform that I limited in instcombine in rL329821 because it was creating more instructions in IR when the cast has multiple uses. But if the cast is free, then we can do the transform regardless of other uses because it improves the potential throughput of the calculation by removing a dependency on the fneg. Differential Revision: https://reviews.llvm.org/D45598 llvm-svn: 330098
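The algebra of this transform can be checked in standard C++, with `float` to `double` conversion standing in for fpext. This is a minimal sketch with hypothetical function names. It only shows that the two expressions produce the same value; the profitability argument (the extension being free on the target) is not modeled.

```cpp
#include <cassert>

// X - fpext(-Y): the subtraction has to wait for the negation.
double subOfExtendedNeg(double x, float y) {
  return x - static_cast<double>(-y);
}

// X + fpext(Y): same value, with no dependency on a negate.
double addOfExtended(double x, float y) {
  return x + static_cast<double>(y);
}

void checkFpextNegFold(double x, float y) {
  assert(subOfExtendedNeg(x, y) == addOfExtended(x, y));
}
```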
2018-04-13[PostRASink]Add register dependency check for implicit operandsJun Bum Lim1-23/+103
Summary: This change extends the register dependency check for implicit operands in Copy instructions. Fixes PR36902. Reviewers: thegameg, sebpop, uweigand, jnspaulsson, gberry, mcrosier, qcolombet, MatzeB Reviewed By: thegameg Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44958 llvm-svn: 330018
2018-04-13[NFC] fix trivial typos in commentsHiroshi Inoue1-1/+1
"the the" -> "the", "we we" -> "we", etc llvm-svn: 330006
2018-04-12[DAGCombiner] simplify code; NFCSanjay Patel1-3/+2
llvm-svn: 329964
2018-04-12revert r328921 - [DAGCombine] (float)((int) f) --> ftrunc (PR36617)Sanjay Patel1-18/+0
This change is exposing UB in source code - as was warned/predicted. :) See D44909 for discussion. Reverting while we figure out how to fix things. llvm-svn: 329920
2018-04-12[Pipeliner] Use std::stable_sort when ordering NodeSetsKrzysztof Parzyszek1-1/+1
There are cases when individual NodeSets can be equal with respect to the ordering criteria. Since they are stored in an ordered container, use stable_sort to preserve the relative order of equal NodeSets. This should remove non-determinism discovered by shuffling done in llvm::sort with expensive checks enabled. llvm-svn: 329915
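The effect of a stable sort on ties can be seen in a small standalone example. This is a sketch under assumptions: `NodeSetLike` and its single `priority` key are hypothetical stand-ins for the pipeliner's NodeSet and its real ordering criteria.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a NodeSet: one sort key plus a label.
struct NodeSetLike {
  int priority;
  std::string name;
};

// Orders by descending priority. Elements with equal priority keep
// their original relative order because the sort is stable.
std::vector<std::string> orderNodeSets(std::vector<NodeSetLike> sets) {
  std::stable_sort(sets.begin(), sets.end(),
                   [](const NodeSetLike &a, const NodeSetLike &b) {
                     return a.priority > b.priority;
                   });
  std::vector<std::string> names;
  for (const NodeSetLike &s : sets)
    names.push_back(s.name);
  return names;
}

void exampleTies() {
  std::vector<std::string> names =
      orderNodeSets({{1, "A"}, {2, "B"}, {1, "C"}, {2, "D"}});
  assert((names == std::vector<std::string>{"B", "D", "A", "C"}));
}
```

With `std::sort`, the relative order of "B"/"D" and of "A"/"C" would be unspecified, which is the kind of non-determinism the commit removes.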
2018-04-12[CodeGen] Allow printing MachineMemOperands with less context in SDAGDumperFrancis Visoiu Mistrih1-8/+21
Don't assume SelectionDAG is non-null as the targets can use it with a null pointer. Differential Revision: https://reviews.llvm.org/D44611 llvm-svn: 329908
2018-04-12[MachineScheduler] NFC refactoringJonas Paulsson1-21/+25
This patch makes tryCandidate() virtual and some utility functions like tryLess(), tryGreater(), ... externally available (used to be static). This makes it possible for a target to derive a new MachineSchedStrategy from GenericScheduler and reuse most parts. It was necessary to wrap functions with the same names in AMDGPU/SIMachineScheduler in a local namespace. Review: Andy Trick, Florian Hahn https://reviews.llvm.org/D43329 llvm-svn: 329884
2018-04-12[LegalizeTypes] Remove unnecessary type action check on the type of operand 0 when promoting shift result type. NFCCraig Topper1-11/+5
Operand 0 should have the same type as the result. So if the result type needs to be promoted, operand 0 needs to be promoted unconditionally. llvm-svn: 329883
2018-04-12[NFC] fix trivial typos in documents and commentsHiroshi Inoue1-1/+1
"is is" -> "is", "if if" -> "if", "or or" -> "or" llvm-svn: 329878
2018-04-11CodeGen: Don't try to canonicalize Unix-style paths in CodeView debug info.Peter Collingbourne1-0/+10
Most importantly, we should not replace slashes with backslashes because that would invalidate the path. Differential Revision: https://reviews.llvm.org/D45473 llvm-svn: 329838
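The behavior described here can be sketched as a guard in front of the slash rewriting. This is an illustrative assumption, not the CodeView emitter: the function name is hypothetical, a leading '/' is assumed to be how a Unix-style path is recognized, and the only canonicalization modeled is the slash-to-backslash replacement.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: rewrites separators for Windows-style paths only.
std::string canonicalizePathSketch(std::string path) {
  // Assumed heuristic: a leading '/' marks a Unix-style path, which
  // would become invalid if its slashes were replaced.
  if (!path.empty() && path.front() == '/')
    return path;
  std::replace(path.begin(), path.end(), '/', '\\');
  return path;
}

void examplePaths() {
  assert(canonicalizePathSketch("/usr/src/a.c") == "/usr/src/a.c");
  assert(canonicalizePathSketch("C:/src/a.c") == "C:\\src\\a.c");
}
```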
2018-04-11[FastISel] Disable local value sinking by defaultReid Kleckner1-1/+8
This is causing compilation timeouts on code with long sequences of local values and calls (i.e. foo(1); foo(2); foo(3); ...). It turns out that code coverage instrumentation is a great way to create sequences like this, which is how our users ran into the issue in practice. Intel has a tool that detects these kinds of non-linear compile time issues, and Andy Kaylor reported it as PR37010. The current sinking code scans the whole basic block once per local value sink, which happens before emitting each call. In theory, local values should only be introduced to be used by instructions between the current flush point and the last flush point, so we should only need to scan those instructions. llvm-svn: 329822
2018-04-10[MachO] Emit Weak ReadOnlyWithRel to ConstDataSectionSteven Wu1-0/+2
Summary: The Darwin dynamic linker can handle weak symbols in ConstDataSection. ReadOnlyWithRel symbols should be emitted in ConstDataSection instead of the normal DataSection. rdar://problem/39298457 Reviewers: dexonsmith, kledzik Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45472 llvm-svn: 329752
2018-04-10[CodeGen] Fix printing bundles in MIR outputKrzysztof Parzyszek2-4/+7
Delay printing the newline until after the opening bracket was printed, e.g.

  BUNDLE implicit-def $r1, implicit-def $r21, implicit $r1 {
    renamable $r1 = S2_asr_i_r renamable $r1, 1
    renamable $r21 = A2_tfrsi 0
  }

instead of

  BUNDLE implicit-def $r1, implicit-def $r21, implicit $r1
  {  renamable $r1 = S2_asr_i_r renamable $r1, 1
    renamable $r21 = A2_tfrsi 0
  }

llvm-svn: 329719
2018-04-10[CodeGen/Dwarf] Rename the "sizetype" synthetic type and add it to the accelerator tablePavel Labath1-1/+3
Summary: This type is created on-demand and used as the base type for array ranges. Since it is "special", its construction did not go through the createTypeDIE function and so it was never inserted into the accelerator table, although it clearly belongs there. I add an explicit addAccelType call to insert it into the table. During review, we also decided to rename the type to something more unique to avoid confusion in case the user has their own "sizetype" type. The new name for the type is __ARRAY_SIZE_TYPE__. Reviewers: JDevlieghere, aprantl, dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45445 llvm-svn: 329705
2018-04-10[x86] Introduce a pass to begin more systematically fixing PR36028 and similar issues.Chandler Carruth1-0/+8
The key idea is to lower COPY nodes populating EFLAGS by scanning the uses of EFLAGS and introducing dedicated code to preserve the necessary state in a GPR. In the vast majority of cases, these uses are cmovCC and jCC instructions. For such cases, we can very easily save and restore the necessary information by simply inserting a setCC into a GPR where the original flags are live, and then testing that GPR directly to feed the cmov or conditional branch.

However, things are a bit more tricky if arithmetic is using the flags. This patch handles the vast majority of cases that seem to come up in practice: adc, adcx, adox, rcl, and rcr; all without taking advantage of partially preserved EFLAGS as LLVM doesn't currently model that at all.

There are a large number of operations that technically observe EFLAGS currently but shouldn't in this case -- they typically are using DF. Currently, they will not be handled by this approach. However, I have never seen this issue come up in practice. It is already pretty rare to have these patterns come up in practical code with LLVM. I had to resort to writing MIR tests to cover most of the logic in this pass already. I suspect even with its current amount of coverage of arithmetic users of EFLAGS it will be a significant improvement over the current use of pushf/popf. It will also produce substantially faster code in most of the common patterns.

This patch also removes all of the old lowering for EFLAGS copies, and the hack that forced us to use a frame pointer when EFLAGS copies were found anywhere in a function so that the dynamic stack adjustment wasn't a problem. None of this is needed as we now lower all of these copies directly in MI and without requiring stack adjustments.

Lots of thanks to Reid who came up with several aspects of this approach, and Craig who helped me work out a couple of things tripping me up while working on this.

Differential Revision: https://reviews.llvm.org/D45146
llvm-svn: 329657
2018-04-09[globalisel][legalizerinfo] Add support for the Lower action in getActionDefinitionsBuilder() and use it in AArch64.Daniel Sanders2-0/+8
Lower is slightly odd. It often doesn't change the type but the lowerings do use the new type to decide what code to create. Treat it like a mutation but provide convenience functions that re-use the existing type.

Re-uses the existing tests:
  test/CodeGen/AArch64/GlobalISel/legalize-rem.mir
  test/CodeGen/AArch64/GlobalISel/legalize-mul.mir
  test/CodeGen/AArch64/GlobalISel/legalize-cmpxchg-with-success.mir

llvm-svn: 329623
2018-04-09Fix printing of stack id in MachineFrameInfoMatt Arsenault1-1/+1
uint8_t is printed as a char, so it needs to be cast to do the right thing. llvm-svn: 329622
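The pitfall is easy to reproduce with standard streams. This is a minimal sketch that uses `std::ostringstream` as a stand-in for LLVM's own output stream, with hypothetical function names; it assumes `uint8_t` is an alias for `unsigned char`, as on mainstream platforms.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Streams the byte directly: the character overload is selected.
std::string printStackIdRaw(uint8_t id) {
  std::ostringstream os;
  os << id;
  return os.str();
}

// Widens to unsigned first so the numeric overload is selected.
std::string printStackIdAsNumber(uint8_t id) {
  std::ostringstream os;
  os << static_cast<unsigned>(id);
  return os.str();
}

void exampleStackId() {
  assert(printStackIdRaw(65) == "A");        // prints the character 'A'
  assert(printStackIdAsNumber(65) == "65");  // prints the number
}
```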
2018-04-09[Debuginfo][COFF] Minimal serialization support for precompiled types recordsAlexandre Ganea2-2/+3
This change adds support for the LF_PRECOMP and LF_ENDPRECOMP records required to read/write Microsoft precompiled types .objs. See https://en.wikipedia.org/wiki/Precompiled_header#Microsoft_Visual_C_and_C++ This also adds handling for the .debug$P section, which is actually a .debug$T section in disguise, found only in precompiled .objs. Differential Revision: https://reviews.llvm.org/D45283 llvm-svn: 329613
2018-04-09Fix type mismatch between MachineMemOperand constructor and accessors. NFCDaniel Sanders1-1/+1
This allows MachineMemOperand::getSize()'s result to be fed directly into MachineMemOperand::MachineMemOperand() without a narrowing type conversion warning. llvm-svn: 329602
2018-04-09[GISel] Refactor MachineIRBuilder to allow transformations while building.Aditya Nandakumar1-257/+250
https://reviews.llvm.org/D45067

This change attempts to do two things:

1) It separates out the state that is stored in the MachineIRBuilder (InsertionPt, MF, MRI, InsertFunction etc) into a separate object called MachineIRBuilderState.

2) Add the ability to constant fold operations while building instructions (optionally). MachineIRBuilder is now refactored into a MachineIRBuilderBase which contains lots of non foldable build methods and their implementation. Instructions which can be constant folded/transformed are now in a class called FoldableInstructionBuilder which uses CRTP to use the implementation of the derived class for buildBinaryOps. Additionally buildInstr in the derived class can be used to implement other kinds of transformations.

Also because of separation of state, given a MachineIRBuilder in an API, if one wishes to use another MachineIRBuilder, a new one can be constructed from the state locally. For eg,

  void doFoo(MachineIRBuilder &B) {
    MyCustomBuilder CustomB(B.getState());
    // Use CustomB for building.
  }

reviewed by : aemerson

llvm-svn: 329596
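The CRTP arrangement described above can be illustrated with a toy builder. This is a sketch under loud assumptions: none of the types below are the GlobalISel classes, `Value`, `FoldableBuilderSketch`, `PlainBuilder` and `ConstFoldingBuilder` are hypothetical, and "emitting an instruction" is reduced to incrementing a counter.

```cpp
#include <cassert>

// Toy value: either a known constant or an opaque result.
struct Value {
  bool isConstant;
  int constant;
};

// CRTP base: buildAdd forwards to whatever buildBinaryOp the derived
// class provides, without virtual dispatch.
template <typename Derived> struct FoldableBuilderSketch {
  Value buildAdd(Value a, Value b) {
    return static_cast<Derived *>(this)->buildBinaryOp(a, b);
  }
};

// Always emits an instruction.
struct PlainBuilder : FoldableBuilderSketch<PlainBuilder> {
  int emitted = 0;
  Value buildBinaryOp(Value, Value) {
    ++emitted;
    return Value{false, 0};
  }
};

// Folds when both operands are constants, otherwise emits.
struct ConstFoldingBuilder : FoldableBuilderSketch<ConstFoldingBuilder> {
  int emitted = 0;
  Value buildBinaryOp(Value a, Value b) {
    if (a.isConstant && b.isConstant)
      return Value{true, a.constant + b.constant};
    ++emitted;
    return Value{false, 0};
  }
};

void exampleFolding() {
  ConstFoldingBuilder folding;
  Value v = folding.buildAdd(Value{true, 2}, Value{true, 3});
  assert(v.isConstant && v.constant == 5 && folding.emitted == 0);

  PlainBuilder plain;
  plain.buildAdd(Value{true, 2}, Value{true, 3});
  assert(plain.emitted == 1);
}
```

The design point is that callers write `buildAdd` once and the choice between folding and emitting is made by the concrete builder type they hold.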
2018-04-09Support generic expansion of ordered vector reduction (PR36732)Simon Pilgrim1-6/+9
Without the fast math flags, the llvm.experimental.vector.reduce.fadd/fmul intrinsic expansions must be expanded in order. This patch scalarizes the reduction, applying the accumulator at the start of the sequence: ((((Acc + Scl[0]) + Scl[1]) + Scl[2]) + ) ... + Scl[NumElts-1] Differential Revision: https://reviews.llvm.org/D45366 llvm-svn: 329585
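The ordered expansion amounts to a left-to-right fold that starts from the accumulator. This is a minimal sketch with a hypothetical function name, using a `std::vector<float>` in place of the vector operand; it also shows why reassociation is not allowed without fast-math flags.

```cpp
#include <cassert>
#include <vector>

// Computes (((acc + elts[0]) + elts[1]) + ...) strictly in order.
float orderedFAddReduce(float acc, const std::vector<float> &elts) {
  for (float e : elts)
    acc = acc + e;
  return acc;
}

void exampleOrderMatters() {
  // 1e8f + 1.0f rounds back to 1e8f in single precision, so the
  // position of the small term changes the result.
  assert(orderedFAddReduce(1e8f, {-1e8f, 1.0f}) == 1.0f);
  assert(orderedFAddReduce(1e8f, {1.0f, -1e8f}) == 0.0f);
}
```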
2018-04-09[MachineLICM] Re-enable hoisting of constant storesZaara Syeda1-2/+9
This patch fixes an issue exposed on the SystemZ build bots when committing https://reviews.llvm.org/rL327856. The hoisting was temporarily disabled with an option. This patch now re-enables hoisting and checks that we only hoist a store instruction when all its operands are either constant caller preserved registers or immediates. Differential Revision: https://reviews.llvm.org/D45286 llvm-svn: 329577