path: root/llvm/lib
Age  Commit message  Author  Files  Lines
2014-10-31PR20557: Fix the bug that bogus cpu parameter crashes llc on AArch64 backend.Hao Liu1-1/+5
Initial patch by Oleg Ranevskyy. llvm-svn: 220945
2014-10-30[SelectionDAG] When scalarizing trunc, don't assert for legal operands.Ahmed Bougacha1-1/+17
r212242 introduced a legalizer hook, originally to let AArch64 widen v1i{32,16,8} rather than scalarize, because the legalizer expected, when scalarizing the result of a conversion operation, to already have scalarized the operands. On AArch64, v1i64 is legal, so that commit ensured operations such as v1i32 = trunc v1i64 wouldn't assert. It did that by choosing to widen v1 types whenever possible. However, v1i1 types, for which there's no legal widened type, would still trigger the assert. This commit fixes that, by only scalarizing a trunc's result when the operand has already been scalarized, and introducing an extract_elt otherwise. This is similar to r205625. Fixes PR20777. llvm-svn: 220937
2014-10-30Speculative fix for Windows build after r220932Hans Wennborg1-0/+5
llvm-svn: 220936
2014-10-30Fix incorrect invariant check in DAG CombineLouis Gerbarg1-1/+1
Earlier this summer I fixed an issue where we were incorrectly combining multiple loads that had different constraints such as alignment, invariance, temporality, etc. Apparently in one case I made a copy-paste error and swapped alignment and invariance. Tests included. rdar://18816719 llvm-svn: 220933
2014-10-30Removing the static initializer in ManagedStatic.cpp by using llvm_call_once ↵Chris Bieneman5-4/+44
to initialize the ManagedStatic mutex. Summary: This patch adds an llvm_call_once which is a wrapper around std::call_once on platforms where it is available and devoid of bugs. The patch also migrates the ManagedStatic mutex to be allocated using llvm_call_once. These changes are philosophically equivalent to the changes added in r219638, which were reverted due to a hang on Win32 which was the result of a bug in the Windows implementation of std::call_once. Reviewers: aaron.ballman, chapuni, chandlerc, rnk Reviewed By: rnk Subscribers: majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D5922 llvm-svn: 220932
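The idea of replacing a static initializer with a once-only lazy initialization can be sketched with plain `std::call_once`. This is a minimal illustration, not the LLVM implementation: `MutexInitFlag`, `GlobalMutex`, `InitCount`, `initializeMutex` and `getGlobalMutex` are hypothetical names, and the sketch assumes a platform where `std::call_once` works correctly.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical names throughout; this is not the code from the patch.
// Both globals are constant-initialized, so no static initializer runs.
static std::once_flag MutexInitFlag;
static std::recursive_mutex *GlobalMutex = nullptr;
static int InitCount = 0; // counts how often the initializer actually ran

static void initializeMutex() {
  GlobalMutex = new std::recursive_mutex();
  ++InitCount;
}

// Returns the lazily allocated mutex. The initializer runs at most once,
// even if several threads reach this function concurrently.
std::recursive_mutex *getGlobalMutex() {
  std::call_once(MutexInitFlag, initializeMutex);
  assert(GlobalMutex && "initializer must have run");
  return GlobalMutex;
}
```

Every caller goes through `getGlobalMutex()`, so the allocation happens on first use instead of at program start-up.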
2014-10-30Fix the merging of the constantness of declarations.Rafael Espindola1-3/+2
The langref says: LLVM explicitly allows declarations of global variables to be marked constant, even if the final definition of the global is not. This capability can be used to enable slightly better optimization of the program, but requires the language definition to guarantee that optimizations based on the ‘constantness’ are valid for the translation units that do not include the definition. Given that definition, when merging two declarations, we have to drop constantness if one of them is not marked constant, since the Module without the constant marker might not have the necessary guarantees. llvm-svn: 220927
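The merge rule described above amounts to a logical AND of the two markers. A minimal sketch, assuming a hypothetical `DeclInfo` struct in place of the real IR classes:

```cpp
#include <cassert>

// Hypothetical stand-in for a global variable declaration in one module.
struct DeclInfo {
  bool IsConstant;
};

// When merging two declarations, keep 'constant' only if both carry it:
// the module without the marker may lack the guarantees it implies.
bool mergedConstantness(const DeclInfo &A, const DeclInfo &B) {
  return A.IsConstant && B.IsConstant;
}
```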
2014-10-30Add handling for range metadata in ValueTracking isKnownNonZeroPhilip Reames1-0/+29
If we load from a location with range metadata, we can use information about the ranges of the loaded value for optimization purposes. This helps to remove redundant checks and canonicalize checks for other optimization passes. This particular patch checks whether a value is known to be non-zero from the range metadata. Currently, these tests are against InstCombine. In theory, all of these should be InstSimplify since we're not inserting any new instructions. Moving the code may follow in a separate change. Reviewed by: Hal Differential Revision: http://reviews.llvm.org/D5947 llvm-svn: 220925
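The non-zero check can be illustrated without any LLVM types. The sketch below assumes ranges are half-open `[Lo, Hi)` pairs of signed 64-bit values and that `Lo > Hi` denotes a wrapped range; `Range`, `rangeContainsZero` and `isKnownNonZeroFromRanges` are hypothetical names, not the functions added by the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// One half-open [Lo, Hi) pair. Lo > Hi models a range that wraps around.
using Range = std::pair<int64_t, int64_t>;

static bool rangeContainsZero(const Range &R) {
  int64_t Lo = R.first, Hi = R.second;
  assert(Lo != Hi && "empty or full ranges are not modeled here");
  if (Lo < Hi)
    return Lo <= 0 && 0 < Hi; // ordinary range
  return Lo <= 0 || 0 < Hi;   // wrapped range: [Lo, max] plus [min, Hi)
}

// The loaded value is known non-zero if no listed range admits zero.
bool isKnownNonZeroFromRanges(const std::vector<Range> &Ranges) {
  if (Ranges.empty())
    return false; // no metadata, nothing is known
  for (const Range &R : Ranges)
    if (rangeContainsZero(R))
      return false;
  return true;
}
```

For example, a single range `[1, 10)` proves the value non-zero, while `[0, 2)` does not.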
2014-10-30PR21408: Workaround the appearance of duplicate variables due to problems ↵David Blaikie1-1/+6
when inlining two calls to the same function from the same call site. llvm-svn: 220923
2014-10-30Fix Twine corruption problem with diagnostics.Diego Novillo1-2/+1
This fixes the autobuilders I broke with a recent patch. Thanks echristo and dblaikie for beating me with a clue stick. llvm-svn: 220918
2014-10-30Add profile writing capabilities for sampling profiles.Diego Novillo5-40/+381
Summary: This patch finishes up support for handling sampling profiles in both text and binary formats. The new binary format uses uleb128 encoding to represent numeric values. This makes profiles files about 25% smaller. The profile writer class can write profiles in the existing text and the new binary format. In subsequent patches, I will add the capability to read (and perhaps write) profiles in the gcov format used by GCC. Additionally, I will be adding support in llvm-profdata to manipulate sampling profiles. There was a bit of refactoring needed to separate some code that was in the reader files, but is actually common to both the reader and writer. The new test checks that reading the same profile encoded as text or raw, produces the same results. Reviewers: bogner, dexonsmith Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6000 llvm-svn: 220915
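The size saving comes from ULEB128 storing small numbers in few bytes: each byte carries seven payload bits, and the high bit marks that more bytes follow. A standalone sketch of the encoding follows; `writeUleb128` and `readUleb128` are hypothetical helpers written for illustration, not the routines the profile writer uses.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encodes Value as ULEB128: 7 payload bits per byte, least significant
// group first, high bit set on every byte except the last.
std::vector<uint8_t> writeUleb128(uint64_t Value) {
  std::vector<uint8_t> Out;
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // more bytes follow
    Out.push_back(Byte);
  } while (Value != 0);
  return Out;
}

// Decodes one ULEB128 value from the start of Bytes.
uint64_t readUleb128(const std::vector<uint8_t> &Bytes) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  for (uint8_t Byte : Bytes) {
    assert(Shift < 64 && "encoding too long for 64 bits");
    Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    Shift += 7;
    if ((Byte & 0x80) == 0)
      break; // last byte of this value
  }
  return Value;
}
```

Values below 128 take a single byte, which is why sample counts and line offsets that are mostly small shrink well under this scheme.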
2014-10-30[AVX512] Added VBROADCAST{SS/SD} encoding for VL subset.Robert Khasanov1-26/+51
Refactored through AVX512_maskable llvm-svn: 220908
2014-10-30[dfsan] New calling convention for custom functions with variadic arguments.Peter Collingbourne1-9/+22
Summary: The previous calling convention prevented custom functions from being able to access argument labels unless it knew how many variadic arguments there were, and of which type. This restriction made it impossible to correctly model functions in the printf family, as it is legal to pass more arguments than required to those functions. We now pass arguments in the following order: non-vararg arguments labels for non-vararg arguments [if vararg function, pointer to array of labels for vararg arguments] [if non-void function, pointer to label for return value] vararg arguments Differential Revision: http://reviews.llvm.org/D6028 llvm-svn: 220906
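The argument order listed above can be illustrated with a toy wrapper for a variadic function. Everything here is an assumption made for illustration: `Label` stands in for the real label type, `customSumInts` is a hypothetical wrapper, and bitwise OR stands in for the real label-union operation.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdint>

using Label = uint16_t; // hypothetical stand-in for the dfsan label type

// Hypothetical custom wrapper for: int sum_ints(int Count, ...)
int customSumInts(int Count,           // non-vararg arguments
                  Label CountLabel,    // labels for non-vararg arguments
                  Label *VarargLabels, // array of labels for vararg arguments
                  Label *RetLabel,     // label for the return value
                  ...) {               // vararg arguments
  va_list Args;
  va_start(Args, RetLabel);
  int Sum = 0;
  Label Combined = CountLabel;
  for (int I = 0; I < Count; ++I) {
    Sum += va_arg(Args, int);
    Combined |= VarargLabels[I]; // toy union; real label unions differ
  }
  va_end(Args);
  *RetLabel = Combined;
  return Sum;
}
```

Because the vararg labels arrive as an array ahead of the varargs themselves, the wrapper reads only as many labels as it consumes arguments, so a caller passing surplus arguments (as is legal for the printf family) does no harm.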
2014-10-29Untabify.NAKAMURA Takumi1-2/+2
llvm-svn: 220884
2014-10-29Do not simplifyLatch for loops where hoisting increments could result in ↵Yi Jiang1-3/+30
extra live range interference llvm-svn: 220872
2014-10-29[AVX512] Implemented AVX512VL FP binary packed instructions (VADDP*, VSUBP*, ↵Robert Khasanov1-107/+47
VMULP*, VDIVP*, VMAXP*, VMINP*) Refactored through AVX512_maskable Added encoding tests for them. llvm-svn: 220858
2014-10-29Whitespace.NAKAMURA Takumi5-40/+40
llvm-svn: 220857
2014-10-29Fix build with CMake if LLVM_USE_INTEL_JITEVENTS option is enabledMichael Kuperstein1-0/+1
* Added LLVM libraries required for IntelJITEvents to LLVMBuild.txt. * Removed 'jit' library from llvm-jitlistener. * Added support for OptionalLibraries to llvm-build cmake files generator. Patch by aleksey.a.bader@intel.com Differential Revision: http://reviews.llvm.org/D5646 llvm-svn: 220848
2014-10-28[C API] PR19859: Add functions to query and modify branches.Peter Zotov1-0/+28
Patch by Gabriel Radanne <drupyog@zoho.com>. llvm-svn: 220817
2014-10-28[C API] PR19859: Add LLVMGetFCmpPredicate and LLVMConstRealGetDouble.Peter Zotov1-1/+31
Patch by Gabriel Radanne <drupyog@zoho.com>. llvm-svn: 220814
2014-10-28Transforms: reapply SVN r219899Saleem Abdulrasool2-11/+18
This restores the commit from SVN r219899 with an additional change to ensure that the CodeGen is correct for the case that was identified as being incorrect (originally PR7272). In the case that during inlining we need to synthesize a value on the stack (i.e. for passing a value byval), then any function involving that alloca must be stripped of its tailness as the restriction that it does not access the parent's stack no longer holds. Unfortunately, a single alloca can cause a rippling effect throughout the inlining as the value may be aliased or may be mutated through an escaped external call. As such, we simply track if an alloca has been introduced in the frame during inlining, and strip any tail calls. llvm-svn: 220811
2014-10-28[AVX512] Fix VSQRT packed instructions internal names.Robert Khasanov2-9/+9
No functional change llvm-svn: 220808
2014-10-28[AVX512] Extended avx512_sqrt_packed (sqrt instructions) to VL subset.Robert Khasanov1-28/+44
Refactored through AVX512_maskable llvm-svn: 220806
2014-10-28[AVX-512] Expanded rsqrt/rcp instructions to VL subset.Robert Khasanov1-20/+47
Refactored multiclass through AVX512_maskable llvm-svn: 220783
2014-10-28[AVX512] Removed special case for cmp instructions in getVectorMaskingNode. ↵Robert Khasanov1-15/+4
Now cmp intrinsics lower as other intrinsics through VSELECT, and then VSELECT transforms to AND in PerformSELECTCombine. No functional change. llvm-svn: 220779
2014-10-28[x86] Simplify vector selection if condition value type matches vselect ↵Robert Khasanov1-10/+10
value type and true value is all ones or false value is all zeros. This transformation previously applied only when the selector was produced by SETCC; however, SETCC is needed only if we consider swapping the operands, so I replaced the SETCC check for this case. Added tests for vselect of <X x i1> values. llvm-svn: 220777
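The simplification relies on two identities that hold when the condition and the selected values have the same type: selecting all ones on true reduces to an OR, and selecting all zeros on false reduces to an AND. A small sketch that models a `<16 x i1>` vector as a 16-bit mask; `Mask16` and the three functions are hypothetical names for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// A <16 x i1> vector modeled as a 16-bit mask, one bit per lane.
using Mask16 = uint16_t;

// Generic per-lane select: take TrueVal where Cond is set, else FalseVal.
Mask16 vselect(Mask16 Cond, Mask16 TrueVal, Mask16 FalseVal) {
  return static_cast<Mask16>((Cond & TrueVal) | (~Cond & FalseVal));
}

// vselect(Cond, all-ones, X) simplifies to Cond | X.
Mask16 selectAllOnesTrue(Mask16 Cond, Mask16 FalseVal) {
  return static_cast<Mask16>(Cond | FalseVal);
}

// vselect(Cond, X, all-zeros) simplifies to Cond & X.
Mask16 selectAllZerosFalse(Mask16 Cond, Mask16 TrueVal) {
  return static_cast<Mask16>(Cond & TrueVal);
}
```

Neither identity depends on how the condition was computed, which is consistent with the message's point that a SETCC producer matters only when operands might be swapped.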
2014-10-28Silencing an "enumeral and non-enumeral type in conditional expression" ↵Aaron Ballman1-1/+2
warning; NFC. llvm-svn: 220775
2014-10-28[AVX512] Bring back vector-shuffle lowering support through broadcastsRobert Khasanov2-8/+17
After the commit at rev219046, lowering of 512-bit broadcasts became non-optimal. Most of the tests on broadcasting and embedded broadcasting were changed and they don’t produce efficient code. The example below is from the commit changes (it’s the first test from test/CodeGen/X86/avx512-vbroadcast.ll):
 define <16 x i32> @_inreg16xi32(i32 %a) {
 ; CHECK-LABEL: _inreg16xi32:
 ; CHECK: ## BB#0:
-; CHECK-NEXT: vpbroadcastd %edi, %zmm0
+; CHECK-NEXT: vmovd %edi, %xmm0
+; CHECK-NEXT: vpbroadcastd %xmm0, %ymm0
+; CHECK-NEXT: vinserti64x4 $1, %ymm0, %zmm0, %zmm0
 ; CHECK-NEXT: retq
 %b = insertelement <16 x i32> undef, i32 %a, i32 0
 %c = shufflevector <16 x i32> %b, <16 x i32> undef, <16 x i32> zeroinitializer
 ret <16 x i32> %c
 }
Here, a 256-bit broadcast was generated instead of a 512-bit one. In this patch 1) I added vector-shuffle lowering through broadcasts 2) Removed asserts and branches like the following, because this is incorrect - assert(Subtarget->hasDQI() && "We can only lower v8i64 with AVX-512-DQI"); 3) Fixed lowering tests llvm-svn: 220774
2014-10-28Reformat partially, where I touched for whitespace changes.NAKAMURA Takumi5-21/+18
llvm-svn: 220773
2014-10-28LoopRerollPass.cpp: Use range-based loop. NFC.NAKAMURA Takumi1-11/+9
llvm-svn: 220772
2014-10-28Untabify and whitespace cleanups.NAKAMURA Takumi7-48/+46
llvm-svn: 220771
2014-10-28Minimize the scope of some variables, NFC.David Blaikie1-2/+2
llvm-svn: 220759
2014-10-28X86: Implement the vectorcall calling conventionReid Kleckner7-35/+129
This is a Microsoft calling convention that supports both x86 and x86_64 subtargets. It passes vector and floating point arguments in XMM0-XMM5, and passes them indirectly once they are consumed. Homogeneous vector aggregates of up to four elements can be passed in sequential vector registers, but this part is not implemented in LLVM and will be handled in Clang. On 32-bit x86, it is similar to fastcall in that it uses ecx:edx as integer register parameters and is callee cleanup. On x86_64, it delegates to the normal win64 calling convention. Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D5943 llvm-svn: 220745
2014-10-28AArch64: enable Cortex-A57 FP balancing on Cortex-A53.Tim Northover1-1/+2
Benchmarks have shown that it's harmless to the performance there, and having a unified set of passes between the two cores where possible helps big.LITTLE deployment. Patch by Z. Zheng. llvm-svn: 220744
2014-10-28Remove the PreserveSource linker mode.Rafael Espindola1-29/+20
I noticed that it was untested, and forcing it on caused some tests to fail: LLVM :: Linker/metadata-a.ll LLVM :: Linker/prefixdata.ll LLVM :: Linker/type-unique-odr-a.ll LLVM :: Linker/type-unique-simple-a.ll LLVM :: Linker/type-unique-simple2-a.ll LLVM :: Linker/type-unique-simple2.ll LLVM :: Linker/type-unique-type-array-a.ll LLVM :: Linker/unnamed-addr1-a.ll LLVM :: Linker/visibility1.ll If it is to be resurrected, it has to be fixed and we should probably have a -preserve-source command line option in llvm-mc and run tests with and without it. llvm-svn: 220741
2014-10-27AArch64InstrInfo.h: Fix a warning introduced in clang r220703. ↵NAKAMURA Takumi1-1/+1
[-Winconsistent-missing-override] llvm-svn: 220739
2014-10-27[AVX512] Add vpermil variable versionAdam Nemet1-2/+25
This is implemented via a multiclass that derives from the vperm imm multiclass. Fixes <rdar://problem/18426089> llvm-svn: 220737
2014-10-27[AVX512] Clean up avx512_perm_imm to use X86VectorVTInfoAdam Nemet1-25/+22
No functionality change. No change in X86.td.expanded except that we only set the CD8 attributes for the memory variants. (This shouldn't be used unless we have a memory operand.) llvm-svn: 220736
2014-10-27[AVX512] Derive vpermil* from avx512_perm_immAdam Nemet1-14/+14
This used to derive from avx512_pshuf_imm which is confusing. NFC. Compared X86.td.expanded. llvm-svn: 220735
2014-10-27[AVX512] Fix copy-and-paste bugs in vpermilAdam Nemet1-3/+3
1) i512mem -> f512mem (this is the packed FP input being permuted) 2) element size is 64 bits in EVEX_CD8 for PD. (A good illustration why X86VectorVTInfo is useful) llvm-svn: 220734
2014-10-27Make it easier to pass a custom diagnostic handler to the IR linker.Rafael Espindola1-27/+27
llvm-svn: 220732
2014-10-27Fix a stackmap bug introduced in r220710.Pete Cooper1-4/+14
For a call to not return in to the stackmap shadow, the shadow must end with the call. To do this, we must insert any required nops *before* the call, and not after it. llvm-svn: 220728
2014-10-27Fix bug where sys::Wait could wait on wrong pid.Rafael Espindola1-1/+0
Setting ChildPid to -1 would cause waitpid to wait for any child process. Patch by Daniel Reynaud! llvm-svn: 220717
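The hazard is a property of POSIX `waitpid`: a pid argument of -1 means "any child", so a call meant for one process can reap another. A minimal sketch of waiting on a specific child; `spawnExiting` and `waitForChild` are hypothetical helpers, not the `sys::Wait` code.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that exits immediately with ExitCode; returns its pid.
pid_t spawnExiting(int ExitCode) {
  pid_t Pid = fork();
  assert(Pid >= 0 && "fork failed");
  if (Pid == 0)
    _exit(ExitCode); // child: terminate without running parent cleanup
  return Pid;
}

// Waits for exactly the given child and returns its exit code,
// or -1 if waiting failed or the child did not exit normally.
int waitForChild(pid_t Pid) {
  assert(Pid > 0 && "a pid of -1 would make waitpid reap any child");
  int Status = 0;
  if (waitpid(Pid, &Status, 0) != Pid)
    return -1;
  return WIFEXITED(Status) ? WEXITSTATUS(Status) : -1;
}
```

With two children alive, passing each real pid collects each exit status reliably, regardless of the order in which the children terminate.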
2014-10-27[FastISel][AArch64] Emit immediate version of icmp (subs) for null pointer ↵Juergen Ributzka1-2/+6
check. This is a minor change to use the immediate version when the operand is a null value. This should get rid of an unnecessary 'mov' instruction in debug builds and align the code more with the one generated by SelectionDAG. This fixes rdar://problem/18785125. llvm-svn: 220713
2014-10-27[FastISel][AArch64] Optimize compare-and-branch for i1 to use 'tbz'.Juergen Ributzka1-0/+4
Minor enhancement to use 'tbz' for i1 compare-and-branch to get rid of an 'and' instruction. This fixes rdar://problem/18784953. llvm-svn: 220712
2014-10-27Stackmap shadows should consider call returns a branch target.Pete Cooper1-0/+6
To avoid emitting too many nops, a stackmap shadow can include emitted instructions in the shadow, but these must not include branch targets. A return from a call should count as a branch target as patching over the instructions after the call would lead to incorrect behaviour for threads currently making that call, when they return. llvm-svn: 220710
2014-10-27[FastISel][AArch64] Use 'cbz' also for null values (pointers).Juergen Ributzka1-15/+12
The pattern matching for a 'ConstantInt' value was too restrictive. Checking for a 'Constant' with a null value is sufficient for using a 'cbz/cbnz' instruction. This fixes rdar://problem/18784732. llvm-svn: 220709
2014-10-27[FastISel][AArch64] Don't fold the 'and' instruction into the 'tbz/tbnz' ↵Juergen Ributzka1-2/+2
instruction if it is in a different basic block. This fixes a bug where the input register was not defined for the 'tbz/tbnz' instruction. This happened, because we folded the 'and' instruction from a different basic block. This fixes rdar://problem/18784013. llvm-svn: 220704
2014-10-27[FastISel][AArch64] Fix load/store with frame indices.Juergen Ributzka1-23/+20
At higher optimization levels the LLVM IR may contain more complex patterns for loads/stores from/to frame indices. The 'computeAddress' function wasn't able to handle this and triggered an assertion. This fix extends the possible addressing modes for frame indices. This fixes rdar://problem/18783298. llvm-svn: 220700
2014-10-27[asan] experimental tracing for indirect calls, llvm part.Kostya Serebryany1-4/+44
llvm-svn: 220699
2014-10-27[PBQP] Unique allowed-sets for nodes in the PBQP graph and use pairs of theseLang Hames2-37/+58
sets as keys into a cache of interference matrix values in the Interference constraint adder. Creating interference matrices was one of the large remaining time-sinks in PBQP. Caching them reduces the total compile time (when using PBQP) on the nightly test suite by ~10%. llvm-svn: 220688
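The caching scheme can be sketched as two steps: give each distinct allowed-set a unique id, then memoize matrices under the pair of ids. The sketch below is a simplified model, not the PBQP code: `InterferenceCache`, `AllowedSet` and `Matrix` are hypothetical names, and a cost of 1 stands in for whatever cost a real interference entry carries.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

using AllowedSet = std::vector<unsigned>;     // allowed physical registers
using Matrix = std::vector<std::vector<int>>; // toy interference costs

class InterferenceCache {
  std::map<AllowedSet, unsigned> Ids; // set contents -> uniqued id
  std::vector<AllowedSet> Sets;       // uniqued id -> set contents
  std::map<std::pair<unsigned, unsigned>, Matrix> Cache;
  unsigned Builds = 0; // how many matrices were actually constructed

public:
  // Returns the same id for equal sets, so equal sets share cache entries.
  unsigned unique(const AllowedSet &S) {
    auto Ins = Ids.emplace(S, static_cast<unsigned>(Sets.size()));
    if (Ins.second)
      Sets.push_back(S);
    return Ins.first->second;
  }

  // Returns the matrix for the pair (A, B), building it only on a miss.
  const Matrix &get(unsigned A, unsigned B) {
    assert(A < Sets.size() && B < Sets.size() && "ids must come from unique()");
    auto Key = std::make_pair(A, B);
    auto It = Cache.find(Key);
    if (It != Cache.end())
      return It->second;
    ++Builds;
    const AllowedSet &RowSet = Sets[A], &ColSet = Sets[B];
    Matrix M(RowSet.size(), std::vector<int>(ColSet.size(), 0));
    for (std::size_t I = 0; I < RowSet.size(); ++I)
      for (std::size_t J = 0; J < ColSet.size(); ++J)
        if (RowSet[I] == ColSet[J])
          M[I][J] = 1; // toy cost: the same register cannot be shared
    return Cache.emplace(Key, std::move(M)).first->second;
  }

  unsigned builds() const { return Builds; }
};
```

Many virtual registers share the same register class and therefore the same allowed-set, so a small number of cached matrices can serve a large number of interference edges.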