path: root/llvm/lib
Age  Commit message  Author  Files  Lines
2016-09-08[mips][microMIPS] Implement DBITSWAP, DLSA and LWUPC and add tests for AUI ↵Hrvoje Varga8-12/+126
instructions Differential Revision: https://reviews.llvm.org/D16452 llvm-svn: 280909
2016-09-08[asan] Avoid lifetime analysis for allocas with can be in ambiguous stateVitaly Buka1-0/+75
Summary: C allows jumping over variable declarations, so lifetime.start can be skipped before a variable is used. To avoid false positives in such rare cases, we detect them and remove them from lifetime analysis. PR27453 PR28267 Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24321 llvm-svn: 280907
2016-09-08Revert "[LoopUnroll] Properly update loop-info when cloning prologues and ↵Michael Zolotukhin1-54/+11
epilogues." This reverts commit r280901. This caused a bunch of failures, reverting it until I investigate them. llvm-svn: 280905
2016-09-08[LoopUnroll] Properly update loop-info when cloning prologues and epilogues.Michael Zolotukhin1-11/+54
Summary: When cloning blocks for prologue/epilogue we need to replicate the loop structure from the original loop. It wasn't a problem for the innermost loops, but it led to an incorrect loop info when we unrolled a loop with a child loop - in this case created prologue-loop had a child loop, but loop info didn't reflect that. This fixes PR28888. Reviewers: chandlerc, sanjoy, hfinkel Subscribers: llvm-commits, silvas Differential Revision: https://reviews.llvm.org/D24203 llvm-svn: 280901
2016-09-08[CGP] Be less conservative about tail-duplicating a ret to allow tail callsMichael Kuperstein2-26/+34
CGP tail-duplicates rets into blocks that end with a call that feeds the ret. This puts the call in tail position, potentially allowing the DAG builder to lower it as a tail call. To avoid tail duplication in cases where we won't form the tail call, CGP tries to predict whether this is going to be possible, and avoids doing it when lowering as a tail call will definitely fail. However, it was being too conservative by always throwing away calls to functions with a signext/zeroext attribute on the return type. Instead, we can use the same logic the builder uses to determine whether the attributes work out. Differential Revision: https://reviews.llvm.org/D24315 llvm-svn: 280894
2016-09-08[XRay] Remove unused variableDean Michael Berris1-2/+2
llvm-svn: 280891
2016-09-08[XRay] ARM 32-bit no-Thumb support in LLVMDean Michael Berris11-61/+213
This is a port of XRay to ARM 32-bit, without Thumb support yet. The XRay instrumentation support is moving up to AsmPrinter. This is one of 3 commits to different repositories of XRay ARM port. The other 2 are: 1. https://reviews.llvm.org/D23932 (Clang test) 2. https://reviews.llvm.org/D23933 (compiler-rt) Differential Revision: https://reviews.llvm.org/D23931 llvm-svn: 280888
2016-09-07IR: Remove Value::intersectOptionalDataWith, replace all calls with calls to ↵Peter Collingbourne4-6/+6
Instruction::andIRFlags. The two functions are functionally equivalent. Differential Revision: https://reviews.llvm.org/D22830 llvm-svn: 280884
2016-09-07Revert "[asan] Avoid lifetime analysis for allocas with can be in ambiguous ↵Vitaly Buka1-74/+0
state" Fails on Windows. This reverts commit r280880. llvm-svn: 280883
2016-09-07[asan] Avoid lifetime analysis for allocas with can be in ambiguous stateVitaly Buka1-0/+74
Summary: C allows jumping over variable declarations, so lifetime.start can be skipped before a variable is used. To avoid false positives in such rare cases, we detect them and remove them from lifetime analysis. PR27453 PR28267 Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24321 llvm-svn: 280880
2016-09-07[InstCombine] use m_APInt to allow icmp (and (sh X, Y), C2), C1 folds for ↵Sanjay Patel2-52/+22
splat constant vectors llvm-svn: 280873
2016-09-07[SimplifyCFG] Don't try to create metadata-valued PHIsHal Finkel1-0/+4
We can't create metadata-valued PHIs; don't try to do so when sinking. I created a test case for this using the @llvm.type.test intrinsic, because it takes a metadata parameter and does not have severe side effects (thus SimplifyCFG is willing to otherwise sink it). Previously, running the test case would crash with: Invalid use of metadata! %.sink = select i1 %flag, metadata <...>, metadata <0x4e45dc0> LLVM ERROR: Broken function found, compilation aborted! llvm-svn: 280866
2016-09-07[LoopUnroll] Correct a debug message. NFC.Haicheng Wu1-1/+1
Differential Revision: https://reviews.llvm.org/D24299 llvm-svn: 280865
2016-09-07Shift-left (ISD::SHL) operation crashes on "DAG Legalization" phase.Elena Demikhovsky1-21/+27
https://llvm.org/bugs/show_bug.cgi?id=29058. During node legalization we tried to legalize the node's operands. If an operand node is replaced during legalization, the user node may be destroyed. Differential Revision: https://reviews.llvm.org/D24244 llvm-svn: 280862
2016-09-07[InstCombine] allow icmp (and X, C2), C1 folds for splat constant vectorsSanjay Patel1-43/+33
This is a revert of r280676 which was a revert of r280637; ie, this is r280637 again. It was speculatively reverted to help debug buildbot failures. llvm-svn: 280861
2016-09-07[RDF] Fix liveness analysis for phi nodes with shadow usesKrzysztof Parzyszek2-37/+82
Shadow uses need to be analyzed together, since each individual shadow will only have a partial reaching def. All shadows together may cover a given register ref, while each individual shadow may not. llvm-svn: 280855
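The coverage argument above can be sketched with a toy model. This is only an illustration, not the RDF implementation: it assumes, hypothetically, that a register ref and each shadow's partial reaching def can be described as bitmasks of sub-register lanes, and the function name `covers` is made up for the example.

```python
def covers(ref_mask, def_masks):
    """Return True if the union of the partial defs covers every lane of the ref.

    ref_mask and def_masks are hypothetical lane bitmasks (one bit per lane).
    """
    combined = 0
    for mask in def_masks:
        combined |= mask  # analyse all shadows together, not one at a time
    return (combined & ref_mask) == ref_mask
```

Under this model, a ref with lanes 0b11 is covered by the shadows 0b01 and 0b10 taken together, while neither shadow covers it alone.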
2016-09-07Don't reuse a variable name in a nested scope. NFC.Michael Kuperstein1-6/+6
llvm-svn: 280853
2016-09-07[RDF] Introduce "undef" flag for ref nodesKrzysztof Parzyszek3-24/+74
llvm-svn: 280851
2016-09-07AMDGPU: Remove a useless variable which caused build failure for lld.Yaxun Liu1-1/+1
llvm-svn: 280841
2016-09-07Don't reduce the width of vector mul if the target doesn't support SSE2.Wei Mi1-1/+2
The patch is to fix PR30298, which is caused by rL272694. The solution is to bail out if the target has no SSE2. Differential Revision: https://reviews.llvm.org/D24288 llvm-svn: 280837
2016-09-07Typo. NFC.Chad Rosier1-1/+1
llvm-svn: 280834
2016-09-07CodeGen: ensure that libcalls are always AAPCS CCSaleem Abdulrasool1-7/+169
The original commit was too aggressive about marking LibCalls as AAPCS. The libcalls contain libc/libm/libunwind calls which are not AAPCS, but C. llvm-svn: 280833
2016-09-07X86: Fold tail calls into conditional branches where possible (PR26302)Hans Wennborg6-17/+158
When branching to a block that immediately tail calls, it is possible to fold the call directly into the branch if the call is direct and there is no stack adjustment, saving one byte. Example: define void @f(i32 %x, i32 %y) { entry: %p = icmp eq i32 %x, %y br i1 %p, label %bb1, label %bb2 bb1: tail call void @foo() ret void bb2: tail call void @bar() ret void } before: f: movl 4(%esp), %eax cmpl 8(%esp), %eax jne .LBB0_2 jmp foo .LBB0_2: jmp bar after: f: movl 4(%esp), %eax cmpl 8(%esp), %eax jne bar .LBB0_1: jmp foo I don't expect any significant size savings from this (on a Clang bootstrap I saw 288 bytes), but it does make the code a little tighter. This patch only does 32-bit, but 64-bit would work similarly. Differential Revision: https://reviews.llvm.org/D24108 llvm-svn: 280832
2016-09-07[lib/LTO] Add a way to run a custom pipelineDavide Italiano2-1/+46
Differential Revision: https://reviews.llvm.org/D24095 llvm-svn: 280830
2016-09-07AMDGPU: Add hidden kernel arguments to runtime metadataYaxun Liu2-83/+157
OpenCL kernels have hidden kernel arguments for global offset and printf buffer. For consistency, these hidden arguments should be included in the runtime metadata. Also updated kernel argument kind metadata. Differential Revision: https://reviews.llvm.org/D23424 llvm-svn: 280829
2016-09-07[codeview] Add new directives to record inlined call site line infoReid Kleckner6-140/+325
Summary: Previously we were trying to represent this with the "contains" list of the .cv_inline_linetable directive, which was not enough information. Now we directly represent the chain of inlined call sites, so we know what location to emit when we encounter a .cv_loc directive of an inner inlined call site while emitting the line table of an outer function or inlined call site. Fixes PR29146. Also fixes PR29147, where we would crash when .cv_loc directives crossed sections. Now we write down the section of the first .cv_loc directive, and emit an error if any other .cv_loc directive for that function is in a different section. Also fixes issues with discontiguous inlined source locations, like in this example: volatile int unlikely_cond = 0; extern void __declspec(noreturn) abort(); __forceinline void f() { if (!unlikely_cond) abort(); } int main() { unlikely_cond = 0; f(); unlikely_cond = 0; } Previously our tables gave bad location information for the 'abort' call, and the debugger wouldn't show the inlined stack frame for 'f'. It is important to emit good line tables for this code pattern, because it comes up whenever an asan bug occurs in an inlined function. The __asan_report* stubs are generally placed after the normal function epilogue, leading to discontiguous regions of inlined code. Reviewers: majnemer, amccarth Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24014 llvm-svn: 280822
2016-09-07[LoopInterchange] Improve debug output. NFC.Chad Rosier1-6/+6
llvm-svn: 280820
2016-09-07[LoopInterchange] Improve debug output. NFC.Chad Rosier1-4/+6
llvm-svn: 280819
2016-09-07[LSV] Use the original loads' names for the extractelement instructions.Justin Lebar1-2/+4
Summary: LSV replaces multiple adjacent loads with one vectorized load and a bunch of extractelement instructions. This patch makes the extractelement instructions' names match those of the original loads, for (hopefully) improved readability. Reviewers: asbirlea, tstellarAMD Subscribers: arsenm, mzolotukhin Differential Revision: https://reviews.llvm.org/D23748 llvm-svn: 280818
2016-09-07[x86] move combines of 'select of 2 constants' to its own function; NFCSanjay Patel1-92/+103
There are missing folds here and possibly folds that could be made generic. llvm-svn: 280817
2016-09-07[ARM] Lower UDIV+UREM to UDIV+MLS (and the same for SREM)Pablo Barrio1-1/+18
Summary: This saves a library call to __aeabi_uidivmod. However, the processor must feature hardware division in order to benefit from the transformation. Reviewers: scott-0, jmolloy, compnerd, rengolin Subscribers: t.p.northover, compnerd, aemerson, rengolin, samparker, llvm-commits Differential Revision: https://reviews.llvm.org/D24133 llvm-svn: 280808
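The arithmetic behind this lowering can be sketched as follows. This is a minimal model, not the ARM backend code: it relies only on the identity r = n - (n / d) * d, treats values as unsigned 32-bit integers, and uses hypothetical helper names (`udiv`, `mls`, `urem_via_udiv_mls`) that stand in for the UDIV and MLS instructions.

```python
MASK32 = 0xFFFFFFFF

def udiv(n, d):
    """Unsigned 32-bit division, standing in for a hardware UDIV."""
    return (n & MASK32) // (d & MASK32)

def mls(a, b, acc):
    """Multiply-subtract modelled on MLS: acc - a * b, truncated to 32 bits."""
    return (acc - a * b) & MASK32

def urem_via_udiv_mls(n, d):
    """Compute n % d as one division plus one multiply-subtract."""
    q = udiv(n, d)
    return mls(q, d, n)
```

When the quotient is needed as well, `q` is already available, which is why a single UDIV followed by MLS can replace the combined divide-and-remainder library call on cores with hardware division.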
2016-09-07[InstCombine][SSE4a] Fix assertion failure in the insertq/insertqi combining ↵Andrea Di Biagio1-3/+3
logic. This fixes a similar issue to the one already fixed by r280804 (reviewed in D24256). Revision 280804 fixed the problem with unsafe dyn_casts in the extrq/extrqi combining logic. However, it turns out that even the insertq/insertqi logic was affected by the same problem. llvm-svn: 280807
2016-09-07[InstCombine][SSE4a] Fix assertion failure caused by unsafe dyn_casts on the ↵Andrea Di Biagio1-3/+3
operands of extrq/extrqi intrinsic calls. This patch fixes an assertion failure caused by unsafe dynamic casts on the constant operands of sse4a intrinsic calls to extrq/extrqi. The combine logic that simplifies sse4a extrq/extrqi intrinsic calls currently checks if the input operands are constants. Internally, that logic relies on dyn_casts of values returned by calls to method Constant::getAggregateElement. However, method getAggregateElement may return nullptr if the constant element cannot be retrieved. So, all the dyn_casts can potentially fail. This is what happens, for example, if a constexpr value is passed as input to an extrq/extrqi intrinsic call. This patch fixes the problem by using a dyn_cast_or_null (instead of a simple dyn_cast) on the result of each call to Constant::getAggregateElement. Added reproducible test cases to x86-sse4a.ll. Differential Revision: https://reviews.llvm.org/D24256 llvm-svn: 280804
2016-09-07Revert "[EfficiencySanitizer] Adds shadow memory parameters for 40-bit ↵Renato Golin1-34/+9
virtual memory address." This reverts commit r280796, as it broke the AArch64 bots for no reason. The tests were passing and we should try to keep them passing, so a proper review should make that happen. llvm-svn: 280802
2016-09-07[mips] Disable the TImode shift libcalls for 32-bit targets.Vasileios Kalintiris1-0/+7
Summary: The o32 ABI does not support the TImode helpers. For the time being, disable just the shift libcalls as they break recursive builds on MIPS. Reviewers: sdardis Subscribers: llvm-commits, sdardis Differential Revision: https://reviews.llvm.org/D24259 llvm-svn: 280798
2016-09-07[EfficiencySanitizer] Adds shadow memory parameters for 40-bit virtual ↵Sagar Thakur1-9/+34
memory address. Adding 40-bit shadow memory parameters because MIPS64 uses 40-bit virtual memory addresses. Reviewed by bruening Differential: D23801 llvm-svn: 280796
2016-09-07[SimplifyCFG] Followup fix to r280790James Molloy1-1/+3
In failure cases it's not guaranteed that the PHI we're inspecting is actually in the successor block! In this case we need to bail out early, and never query getIncomingValueForBlock() as that will cause an assert. llvm-svn: 280794
2016-09-07[SimplifyCFG] Update workaround for PR30188 to also include loadsJames Molloy1-2/+7
I should have realised this the first time around, but if we're avoiding sinking stores where the operands come from allocas so they don't create selects, we also have to do the same for loads because SROA will be just as defective looking at loads of selected addresses as stores. Fixes PR30188 (again). llvm-svn: 280792
2016-09-07[SimplifyCFG] Check PHI uses more accuratelyJames Molloy1-1/+3
PR30292 showed a case where our PHI checking wasn't correct. We were checking that all values were used by the same PHI before deciding to sink, but we weren't checking that the incoming values for that PHI were what we expected. As a result, we had to bail out after block splitting which caused us to never reach a steady state in SimplifyCFG. Fixes PR30292. llvm-svn: 280790
2016-09-07[PowerPC] Fix address-offset folding for plain addiHal Finkel1-15/+38
When folding an addi into a memory access that can take an immediate offset, we were implicitly assuming that the existing offset was zero. This was incorrect. If we're dealing with an addi with a plain constant, we can add it to the existing offset (assuming that doesn't overflow the immediate, etc.), but if we have anything else (i.e. something that will become a relocation expression), we'll go back to requiring the existing immediate offset to be zero (because we don't know what the requirements on that relocation expression might be - e.g. maybe it is paired with some addis in some relevant way). On the other hand, when dealing with a plain addi with a regular constant immediate, the alignment restrictions (from the TOC base pointer, etc.) are irrelevant. I've added the test case from PR30280, which demonstrated the bug, but also demonstrates a missed optimization opportunity (i.e. we don't need the memory accesses at all). Fixes PR30280. llvm-svn: 280789
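The folding rule described above can be sketched as a small decision function. This is a simplified model, not the PowerPC backend code: the function names are hypothetical, the signed 16-bit immediate range is an assumption made for the example, and a relocation expression is represented by an opaque string.

```python
def fits_simm16(value):
    """Assumed immediate range for the example: signed 16 bits."""
    return -0x8000 <= value <= 0x7FFF

def fold_addi_offset(existing_offset, addi_operand, is_plain_constant):
    """Return the folded offset, or None when the fold must be skipped."""
    if is_plain_constant:
        # A plain constant can be added to whatever offset is already there,
        # provided the sum still fits in the immediate field.
        combined = existing_offset + addi_operand
        return combined if fits_simm16(combined) else None
    # Anything that becomes a relocation expression: fold only when the
    # existing offset is zero, since its requirements are unknown.
    return addi_operand if existing_offset == 0 else None
```

The earlier behaviour corresponds to ignoring `existing_offset` entirely, which is only correct when it happens to be zero.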
2016-09-07AVX512F: FMA intrinsic + FNEG - sequence optimizationElena Demikhovsky1-90/+102
The previous commit (r280368 - https://reviews.llvm.org/D23313) does not cover AVX-512F, KNL set. FNEG(x) operation is lowered to (bitcast (vpxor (bitcast x), (bitcast constfp(0x80000000))). It happens because FP XOR is not supported for 512-bit data types on KNL and we use integer XOR instead. I added pattern match for integer XOR. Differential Revision: https://reviews.llvm.org/D24221 llvm-svn: 280785
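The equivalence that makes the integer-XOR pattern match valid can be checked on a single-precision value. This is an illustrative sketch only: it works on one scalar rather than a 512-bit vector, and the helper names are hypothetical.

```python
import struct

SIGN_BIT = 0x80000000  # sign bit of an IEEE-754 single-precision float

def float_to_bits(x):
    """Reinterpret a float as its 32-bit pattern (the 'bitcast' to integer)."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as a float (the 'bitcast' back)."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def fneg_via_xor(x):
    """Negate x by flipping only its sign bit with an integer XOR."""
    return bits_to_float(float_to_bits(x) ^ SIGN_BIT)
```

Because only the sign bit changes, the result matches a floating-point negation for ordinary values and for zero, where 0.0 becomes -0.0.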
2016-09-07AMDGPU: Make some scalar instructions commutableMatt Arsenault1-2/+9
llvm-svn: 280784
2016-09-07Remove unnecessary call to getAllocatableRegClassMatt Arsenault3-18/+19
This reapplies r252565 and r252674, effectively reverting r252956. This allows VS_32/VS_64 to be unallocatable like they should be. llvm-svn: 280783
2016-09-07[X86] Add hasSideEffects=0 to some instructions.Craig Topper2-3/+5
llvm-svn: 280782
2016-09-07[AVX-512] Add support for commuting masked instructions in ↵Craig Topper1-1/+23
findCommutedOpIndices. The default implementation doesn't skip the mask input or the preserved input. llvm-svn: 280781
2016-09-07Revert "CodeGen: ensure that libcalls are always AAPCS CC"Saleem Abdulrasool1-6/+7
This reverts SVN r280683. Revert until I figure out why this is breaking lli tests. llvm-svn: 280778
2016-09-07Fix typo in comment, NFCNick Lewycky1-1/+1
llvm-svn: 280774
2016-09-07[LTO] Rename variables to be more explicative.Davide Italiano1-29/+30
Thanks to Mehdi for the suggestion! llvm-svn: 280772
2016-09-06[DAGCombine] More fixups to SETCC legality checking (visitANDLike/visitORLike)Hal Finkel1-28/+46
I might have called this "r246507, the sequel". It fixes the same issue, as the issue has cropped up in a few more places. The underlying problem is that isSetCCEquivalent can pick up select_cc nodes with a result type that is not legal for a setcc node to have, and if we use that type to create new setcc nodes, nothing fixes that (and so we've violated the contract that the infrastructure has with the backend regarding setcc node types). Fixes PR30276. For convenience, here's the commit message from r246507, which explains the problem in greater detail: [DAGCombine] Fixup SETCC legality checking SETCC is one of those special node types for which operation actions (legality, etc.) are keyed off of an operand type, not the node's value type. This makes sense because the value type of a legal SETCC node is determined by its operands' value type (via the TLI function getSetCCResultType). When the SDAGBuilder creates SETCC nodes, it either creates them with an MVT::i1 value type, or directly with the value type provided by TLI.getSetCCResultType. The first problem being fixed here is that DAGCombine had several places querying TLI.isOperationLegal on SETCC, but providing the return of getSetCCResultType, instead of the operand type directly. This does not mean what the author thought, and "luckily", most in-tree targets have SETCC with Custom lowering, instead of marking them Legal, so these checks return false anyway.
The second problem being fixed here is that two of the DAGCombines could create SETCC nodes with arbitrary (integer) value types; specifically, those that would simplify: (setcc a, b, op1) and|or (setcc a, b, op2) -> setcc a, b, op3 (which is possible for some combinations of (op1, op2)) If the operands of the and|or node are actual setcc nodes, then this is not an issue (because the and|or must share the same type), but, the relevant code in DAGCombiner::visitANDLike and DAGCombiner::visitORLike actually calls DAGCombiner::isSetCCEquivalent on each operand, and that function will recognise setcc-like select_cc nodes with other return types. And, thus, when creating new SETCC nodes, we need to be careful to respect the value-type constraint. This is even true before type legalization, because it is quite possible for the SELECT_CC node to have a legal type that does not happen to match the corresponding TLI.getSetCCResultType type. To be explicit, there is nothing that later fixes the value types of SETCC nodes (if the type is legal, but does not happen to match TLI.getSetCCResultType). Creating SETCCs with an MVT::i1 value type seems to work only because, either MVT::i1 is not legal, or it is what TLI.getSetCCResultType returns if it is legal. Fixing that is a larger change, however. For the time being, restrict the relevant transformations to produce only SETCC nodes with a value type matching TLI.getSetCCResultType (or MVT::i1 prior to type legalization). Fixes PR24636. llvm-svn: 280767
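The simplification "(setcc a, b, op1) and|or (setcc a, b, op2) -> setcc a, b, op3" can be illustrated with a toy model for integer comparisons. This is not the DAGCombiner code and says nothing about the value-type constraint the commit fixes; it only shows why some (op1, op2) pairs merge into a single predicate. The table and the `combine` function are hypothetical, with predicate names borrowed from the ISD condition codes.

```python
# Each predicate is the set of orderings of (a, b) for which it is true.
PREDICATES = {
    "setlt": frozenset({"lt"}),
    "setle": frozenset({"lt", "eq"}),
    "seteq": frozenset({"eq"}),
    "setge": frozenset({"eq", "gt"}),
    "setgt": frozenset({"gt"}),
    "setne": frozenset({"lt", "gt"}),
}
BY_OUTCOMES = {outcomes: name for name, outcomes in PREDICATES.items()}

def combine(op1, op2, is_and):
    """Return op3 for (a op1 b) and|or (a op2 b), or None if no single
    predicate in the table matches (the result is always true or false)."""
    s1, s2 = PREDICATES[op1], PREDICATES[op2]
    merged = (s1 & s2) if is_and else (s1 | s2)
    return BY_OUTCOMES.get(merged)
```

For example, `combine("setle", "setge", True)` yields "seteq". In the real combine, the new node must additionally be given a value type that matches TLI.getSetCCResultType, which is the constraint described above.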
2016-09-06Explicitly require DominatorTreeAnalysis pass for instsimplify pass.Dehao Chen1-5/+6
Summary: DominatorTreeAnalysis is always required by instsimplify. Reviewers: danielcdh, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24173 llvm-svn: 280760