vector shuffles. They were removed from clang previously but accidentally left in the backend.
llvm-svn: 281300
intrinsics and upgrade to native IR.
llvm-svn: 280633
native IR.
llvm-svn: 280611
llvm-svn: 280603
them with native IR.
llvm-svn: 280466
This reverts commit 32fc6488e48eafc0ca1bac1bd9cbf0008224d530.
llvm-svn: 278609
This reverts commit r276447.
llvm-svn: 278608
No functionality change is intended.
llvm-svn: 278417
a sufficiently low alignment for the IR load created.
There is no test case because we don't have any test cases for the *IR*
produced by the autoupgrade, only the x86 assembly, and the x86 assembly
for this intrinsic, as it is exercised on the autoupgrade path, happens
not to produce a separate load instruction where we might have observed
the alignment.
I'm going to follow up on the original commit to suggest getting
IR-level testing in addition to the asm-level testing here so that we
can see and test these kinds of issues. We might never get an x86
instruction out with an alignment constraint, but we could still
miscompile code by folding against the alignment marked on (or, in this
case, inferred for) the load.
llvm-svn: 278203
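For illustration, a minimal IR sketch of the concern (the types, values, and alignment here are assumptions, not text from the patch): the load created by the upgrade should carry the conservative alignment, because later folds are entitled to trust whatever alignment is marked on (or inferred for) it.

    ; load emitted by an autoupgrade; align 1 is the conservative choice
    %v = load <4 x float>, <4 x float>* %p, align 1
    ; if this were marked (or inferred) as align 16, folding against that
    ; alignment could miscompile callers that pass an unaligned pointer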
Summary:
The llvm.invariant.start and llvm.invariant.end intrinsics currently
support specifying invariant memory objects only in the default address
space.
With this change, these intrinsics are overloaded for any address space
for memory objects, and we can use these llvm invariant intrinsics in
non-default address spaces.
Example: llvm.invariant.start.p1i8(i64 4, i8 addrspace(1)* %ptr)
This overloaded intrinsic is needed for representing final or invariant
memory in managed languages.
Reviewers: apilipenko, reames
Subscribers: llvm-commits
llvm-svn: 276447
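A minimal IR sketch of the overloaded form described above (the wrapper function, size, and attributes are illustrative assumptions):

    declare {}* @llvm.invariant.start.p1i8(i64, i8 addrspace(1)* nocapture)

    define void @mark_invariant(i8 addrspace(1)* %ptr) {
      ; mark 4 bytes in address space 1 as invariant from this point on
      %tok = call {}* @llvm.invariant.start.p1i8(i64 4, i8 addrspace(1)* %ptr)
      ret void
    }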
(reapplied)
As reported on PR26235, we don't currently make use of the VBROADCASTF128/VBROADCASTI128 instructions (or the AVX512 equivalents) to load+splat a 128-bit vector to both lanes of a 256-bit vector.
This patch enables lowering from subvector insertion/concatenation patterns and auto-upgrades the llvm.x86.avx.vbroadcastf128.pd.256 / llvm.x86.avx.vbroadcastf128.ps.256 intrinsics to match.
We could possibly investigate using VBROADCASTF128/VBROADCASTI128 to load repeated constants as well (similar to how we already do for scalar broadcasts).
Reapplied with fix for PR28657 - removed intrinsic definitions (clang companion patch to be submitted shortly).
Differential Revision: https://reviews.llvm.org/D22460
llvm-svn: 276416
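For illustration, the splat pattern involved, sketched in IR for the pd.256 case (types, alignment, and value names are assumptions, not copied from the patch):

    ; load a 128-bit vector and repeat it in both 128-bit lanes of a
    ; 256-bit vector; this concatenation pattern can lower to VBROADCASTF128
    %v   = load <2 x double>, <2 x double>* %p, align 1
    %cat = shufflevector <2 x double> %v, <2 x double> undef,
                         <4 x i32> <i32 0, i32 1, i32 0, i32 1>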
It caused PR28657.
This reverts commit r276281.
llvm-svn: 276405
This reverts commit r276316.
llvm-svn: 276320
Summary:
The llvm.invariant.start and llvm.invariant.end intrinsics currently
support specifying invariant memory objects only in the default address space.
With this change, these intrinsics are overloaded for any address space for memory objects
and we can use these llvm invariant intrinsics in non-default address spaces.
Example: llvm.invariant.start.p1i8(i64 4, i8 addrspace(1)* %ptr)
This overloaded intrinsic is needed for representing final or invariant memory in managed languages.
Reviewers: tstellarAMD, reames, apilipenko
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D22519
llvm-svn: 276316
As reported on PR26235, we don't currently make use of the VBROADCASTF128/VBROADCASTI128 instructions (or the AVX512 equivalents) to load+splat a 128-bit vector to both lanes of a 256-bit vector.
This patch enables lowering from subvector insertion/concatenation patterns and auto-upgrades the llvm.x86.avx.vbroadcastf128.pd.256 / llvm.x86.avx.vbroadcastf128.ps.256 intrinsics to match.
We could possibly investigate using VBROADCASTF128/VBROADCASTI128 to load repeated constants as well (similar to how we already do for scalar broadcasts).
Differential Revision: https://reviews.llvm.org/D22460
llvm-svn: 276281
generic IR
D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead.
It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems: INF/NaN/out-of-range values are guaranteed to result in a 0x80000000 value, which plays havoc with constant folding, which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match).
This patch changes both scalar and packed versions back to using x86-specific builtins.
It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding.
A companion clang patch is at D22105
Differential Revision: https://reviews.llvm.org/D22106
llvm-svn: 275981
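To illustrate the semantic gap (a sketch; the vector width and intrinsic shown are one example of the affected conversions):

    declare <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float>)

    ; x86 semantics: INF/NaN/out-of-range lanes become 0x80000000
    %r1 = call <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float> %x)
    ; generic IR: such lanes are undefined, so the constant folder is free
    ; to turn them into zero or undef instead
    %r2 = fptosi <4 x float> %x to <4 x i32>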
llvm-svn: 275155
X86 intrinsic upgrade functions.
llvm-svn: 275138
llvm-svn: 274966
llvm-svn: 274862
original Name is overridden.
Summary: lib/IR/AutoUpgrade.cpp:348 and lib/IR/AutoUpgrade.cpp:350 upset the sanitizer.
Reviewers: bkramer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D22140
llvm-svn: 274861
intrinsics.
I'm not sure if clang ever used these builtin names or not.
llvm-svn: 274827
the autoupgrade code. This currently results in worse codegen but is needed for correctness.
llvm-svn: 274736
llvm-svn: 274550
we can shorten the length of the comparison strings and avoid repeatedly comparing the common prefix. No functional change intended.
llvm-svn: 274522
llvm-svn: 274506
llvm-svn: 274498
llvm-svn: 274439
This is a resubmission of the 263158 change after fixing the existing problem with intrinsics mangling (see the LTO and intrinsics mangling llvm-dev thread for details).
This patch fixes the problem which occurs when loop-vectorize tries to use the @llvm.masked.load/store intrinsics for a non-default addrspace pointer. It fails with a "Calling a function with a bad signature!" assertion in the CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument, which has the default addrspace.
The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.
Reviewed By: reames
Differential Revision: http://reviews.llvm.org/D17270
llvm-svn: 274043
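A sketch of the overloaded signature this enables (the mangled name and operand order follow the masked load form of that time and are stated as assumptions):

    ; the pointer type is now an overloaded parameter, so an addrspace(1)
    ; pointer gets its own mangled variant
    declare <4 x i32> @llvm.masked.load.v4i32.p1v4i32(
        <4 x i32> addrspace(1)*, i32, <4 x i1>, <4 x i32>)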
intrinsics" since some of the clang tests don't expect to see the updated signatures.
llvm-svn: 273895
This is a resubmission of the 263158 change after fixing the existing problem with intrinsics mangling (see the LTO and intrinsics mangling llvm-dev thread for details).
This patch fixes the problem which occurs when loop-vectorize tries to use the @llvm.masked.load/store intrinsics for a non-default addrspace pointer. It fails with a "Calling a function with a bad signature!" assertion in the CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument, which has the default addrspace.
The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.
Reviewed By: reames
Differential Revision: http://reviews.llvm.org/D17270
llvm-svn: 273892
and selects.
llvm-svn: 273543
native icmps.
llvm-svn: 273240
Required better annotation of the instruction defs upon removal of the builtin intrinsic pattern.
llvm-svn: 273077
This will (hopefully very temporarily) break clang.
The clang side of this should be the next commit.
llvm-svn: 272932
Follow-up to:
http://reviews.llvm.org/rL272806
http://reviews.llvm.org/rL272807
llvm-svn: 272907
llvm-svn: 272848
autoupgrade them to selects and shufflevector.
llvm-svn: 272527
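For illustration, the general shape such an upgrade emits, sketched in IR (the particular shuffle and vector widths are assumptions, not taken from this commit):

    ; perform the operation with a generic shuffle, then blend the result
    ; with the passthru operand under the mask
    %shuf = shufflevector <4 x i32> %a, <4 x i32> %b,
                          <4 x i32> <i32 0, i32 4, i32 1, i32 5>
    %res  = select <4 x i1> %mask, <4 x i32> %shuf, <4 x i32> %passthru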
select generation into routines that can be reused for future intrinsic upgrades. NFC
llvm-svn: 272526
shufflevector.
llvm-svn: 272510
code instead of calling push_back in a loop. This removes the need to check if the vector needs to grow on each iteration.
llvm-svn: 272501
fully derive everything from the types of the intrinsic arguments rather than writing separate loops for each intrinsic. NFC
llvm-svn: 272496
ArrayRef<uint32_t> to avoid the need to manually create a bunch of Constants and a ConstantVector. NFC
llvm-svn: 272493
for one of the signatures of CreateShuffleVector. This better emphasises that you can't use it for the -1-as-undef behavior.
llvm-svn: 272491
Auto-upgrade to generic shuffles, like the sse/avx2 implementations, now that we can lower to VPSLLDQ/VPSRLDQ
llvm-svn: 272308
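As a sketch of the upgraded form (a byte shift right by 4 on one 128-bit vector; the exact operands and mask are illustrative):

    ; psrldq-by-4 expressed as a generic byte shuffle against zeros;
    ; indices 16-19 select zero bytes from the second operand
    %r = shufflevector <16 x i8> %a, <16 x i8> zeroinitializer,
           <16 x i32> <i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10,
                       i32 11, i32 12, i32 13, i32 14, i32 15, i32 16,
                       i32 17, i32 18, i32 19>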
of vector shuffle and select.
llvm-svn: 271872
This patch begins adding support for lowering to the XOP VPERMIL2PD/VPERMIL2PS shuffle instructions - adding the X86ISD::VPERMIL2 opcode and cleaning up the usage.
The internal llvm intrinsics were assuming the shuffle mask operand was the same type as the float/double input operands (I guess to simplify the intrinsic definitions in X86InstrXOP.td to a single value type). These needed changing to integer types (matching the clang builtin and the AMD intrinsics definitions); an auto-upgrade path is added to convert old calls.
Mask decoding/target shuffle support will be added in future patches.
Differential Revision: http://reviews.llvm.org/D20049
llvm-svn: 271633
f32/f64 to i32 with generic IR (llvm)
This patch removes the llvm intrinsics (V)CVTTPS2DQ and VCVTTPD2DQ truncation (round to zero) conversions and auto-upgrades to FP_TO_SINT calls instead.
Note: I looked at updating CVTTPD2DQ as well but this still requires a lot more work to correctly lower.
Differential Revision: http://reviews.llvm.org/D20860
llvm-svn: 271510
intrinsics instead.
The intrinsics will be autoupgraded to the same generic masked loads.
llvm-svn: 271478
generic masked load intrinsics instead."
Looks like something isn't quite right still. Also forgot to move the test cases to an autoupgrade test.
llvm-svn: 271363