path: root/llvm/lib/CodeGen/ScalarizeMaskedMemIntrin.cpp
2019-08-02  [ScalarizeMaskedMemIntrin] Bitcast the mask to the scalar domain and use scalar bit tests for the branches for expandload/compressstore.  (Craig Topper, 1 file changed, -3/+32)

Same as what was done for gather/scatter/load/store in r367489. Expandload/compressstore were delayed due to the lack of constant mask handling, which has since been fixed.

llvm-svn: 367738

2019-08-02  [ScalarizeMaskedMemIntrin] Add constant mask support to expandload and compressstore scalarization.  (Craig Topper, 1 file changed, -0/+34)

This adds support for generating all the loads or stores for a constant mask in a single basic block with no conditionals.

Differential Revision: https://reviews.llvm.org/D65613
llvm-svn: 367715

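As a rough sketch (hypothetical values, modern opaque-pointer syntax; expandload's scalar loads assumed align 1), an expandload with the constant mask <true, false, true, true> can be laid out straight-line like this:

    %res = call <4 x i32> @llvm.masked.expandload.v4i32(ptr %p, <4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x i32> %passthru)

    ; becomes, with no branches:
    %l0 = load i32, ptr %p, align 1
    %v0 = insertelement <4 x i32> %passthru, i32 %l0, i64 0
    ; lane 1 is disabled: its passthru element is kept and the
    ; pointer is not advanced
    %p1 = getelementptr i32, ptr %p, i32 1
    %l2 = load i32, ptr %p1, align 1
    %v2 = insertelement <4 x i32> %v0, i32 %l2, i64 2
    %p2 = getelementptr i32, ptr %p1, i32 1
    %l3 = load i32, ptr %p2, align 1
    %v3 = insertelement <4 x i32> %v2, i32 %l3, i64 3
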
2019-07-31  [ScalarizeMaskedMemIntrin] Bitcast the mask to the scalar domain and use scalar bit tests for the branches.  (Craig Topper, 1 file changed, -11/+72)

X86 at least is able to use movmsk or kmov to move the mask to the scalar domain. Then we can just use test instructions to test individual bits. This is more efficient than extracting each mask element individually. I special-cased v1i1 to use the previous behavior, which avoids poor type legalization of a bitcast of v1i1 to i1. I've skipped expandload/compressstore, as I think we need to handle constant masks for those better first.

Many tests end up with duplicate test instructions due to tail duplication in the branch folding pass, but the same thing happens when constructing similar code in C, so it's not unique to the scalarization.

Not sure if this lowering code will also be good for other targets, but we're only testing X86 today.

Differential Revision: https://reviews.llvm.org/D65319
llvm-svn: 367489

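To make the shape concrete, here is a sketch (hypothetical values) of the lane-0 check before and after this change:

    ; before: extract each mask element individually
    %m0 = extractelement <4 x i1> %mask, i64 0
    br i1 %m0, label %cond.load, label %else

    ; after: one bitcast to the scalar domain, then cheap bit tests
    %scalar = bitcast <4 x i1> %mask to i4
    %b0 = and i4 %scalar, 1
    %c0 = icmp ne i4 %b0, 0
    br i1 %c0, label %cond.load, label %else
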
2019-06-02  [X86] Add test cases for masked store and masked scatter with an all-zeroes mask. Fix bug in ScalarizeMaskedMemIntrin.  (Craig Topper, 1 file changed, -1/+1)

We need to cast only to Constant instead of ConstantVector to allow ConstantAggregateZero.

llvm-svn: 362341

2019-03-21  [ScalarizeMaskedMemIntrin] Add support for scalarizing expandload and compressstore intrinsics.  (Craig Topper, 1 file changed, -0/+158)

This adds support for scalarizing these intrinsics, as well as the X86TargetTransformInfo support to avoid scalarizing them in the cases X86 can handle. I've omitted handling special cases for constant masks for this first pass, though CodeGenPrepare can constant fold the branch conditions and remove some of the control flow anyway.

Fixes PR40994 and covers most of PR3666. Might want to implement constant masks to close that.

Differential Revision: https://reviews.llvm.org/D59180
llvm-svn: 356687

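A sketch of the per-lane control flow this produces for a compressstore, shown for lane 0 only (hypothetical names; the real expansion chains one such diamond per lane):

    entry:
      %m0 = extractelement <4 x i1> %mask, i64 0
      br i1 %m0, label %cond.store, label %else

    cond.store:
      %e0 = extractelement <4 x i32> %val, i64 0
      store i32 %e0, ptr %p, align 1
      %p.inc = getelementptr i32, ptr %p, i32 1
      br label %else

    else:
      %p.1 = phi ptr [ %p.inc, %cond.store ], [ %p, %entry ]
      ; ...lanes 1-3 repeat the pattern, storing through %p.1...
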
2019-03-21  [ScalarizeMaskedMemIntrinsics] Reverse some if conditions to reduce indentation and remove curly braces.  (Craig Topper, 1 file changed, -20/+16)

Pre-commit for D59180.

llvm-svn: 356646

2019-03-09  [ScalarizeMaskedMemIntrin] Use IRBuilder functions that take uint32_t/uint64_t for getelementptr, extractelement, and insertelement.  (Craig Topper, 1 file changed, -43/+29)

This saves needing to call getInt32 ourselves, making the code a little shorter. The test changes are because insert/extract use getInt64 internally; this shouldn't be a functional issue.

This is cleanup because I plan to write similar code for expandload/compressstore.

llvm-svn: 355767

2019-03-08  [ScalarizeMaskedMemIntrin] Only set the ModifiedDT flag if new basic blocks were added.  (Craig Topper, 1 file changed, -12/+16)

There are special cases in the scalarization for constant masks. If we hit one of the special cases, we don't need to reset the iteration.

Noticed while starting work on adding expandload/compressstore to this pass.

llvm-svn: 355754

2019-02-01  [opaque pointer types] Pass value type to LoadInst creation.  (James Y Knight, 1 file changed, -5/+6)

This cleans up all LoadInst creation in LLVM to explicitly pass the value type rather than deriving it from the pointer's element type.

Differential Revision: https://reviews.llvm.org/D57172
llvm-svn: 352911

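Illustration (hypothetical values): with opaque pointers the loaded type must be spelled on the instruction itself, since it can no longer be derived from a pointer element type:

    ; typed-pointer era: i32 could be recovered from the i32* operand
    ; opaque-pointer era: ptr carries no element type, so the load
    ; instruction states the value type explicitly
    %v = load i32, ptr %p, align 4
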
2019-01-19  Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.  (Chandler Carruth, 1 file changed, -4/+3)

We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.

llvm-svn: 351636

2018-10-30  [ScalarizeMaskedMemIntrin] Limit the scope of some variables that are only used inside loops.  (Craig Topper, 1 file changed, -8/+5)

llvm-svn: 345638

2018-09-28  [ScalarizeMaskedMemIntrin] Use MinAlign to calculate the alignment for the scalar loads/stores to handle element types that are byte-sized but not powers of 2.  (Craig Topper, 1 file changed, -2/+2)

This pass doesn't handle non-byte-sized types correctly at all, but at least we can make byte-sized types work.

llvm-svn: 343294

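For instance (hypothetical numbers): when scalarizing a masked store of <4 x i48> (6-byte elements) with original alignment 8, a plain min(8, 6) = 6 would not be a power of two; MinAlign instead yields the largest power of two dividing both:

    ; one scalarized lane of the <4 x i48> store:
    ;   min(8, 6)      = 6   ; invalid alignment, not a power of two
    ;   MinAlign(8, 6) = 2   ; largest power of two dividing both
    store i48 %e0, ptr %p0, align 2
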
2018-09-28  [ScalarizeMaskedMemIntrin] Fix the alignment calculation for the scalar stores of a masked store expansion.  (Craig Topper, 1 file changed, -1/+1)

It should be the minimum of the original alignment and the scalar size.

llvm-svn: 343284

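For example (hypothetical IR, modern intrinsic mangling): if the masked store only promises align 2, the scalar stores must be capped at 2 rather than using the 4-byte element size:

    call void @llvm.masked.store.v4i32.p0(<4 x i32> %v, ptr %p, i32 2, <4 x i1> %mask)

    ; each scalarized lane:
    store i32 %e0, ptr %p0, align 2   ; min(2, 4), not align 4
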
2018-09-27  [ScalarizeMaskedMemIntrin] Ensure the mask is a vector of ConstantInts before generating the expansion without control flow.  (Craig Topper, 1 file changed, -4/+19)

It's possible the mask itself or one of its elements is a ConstantExpr, and we shouldn't optimize in that case.

llvm-svn: 343278

2018-09-27  [ScalarizeMaskedMemIntrin] Use cast instead of dyn_cast checked by an assert.  (Craig Topper, 1 file changed, -10/+6)

cast will take care of asserting internally. Also consistently make use of the element type variable we already have. NFCI.

llvm-svn: 343277

2018-09-27  [ScalarizeMaskedMemIntrin] When expanding masked gathers, start with the passthru vector and insert the new load results into it.  (Craig Topper, 1 file changed, -22/+11)

Previously we started with undef and did a final merge with the passthru at the end.

llvm-svn: 343273

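A sketch of one gather lane under the new scheme (hypothetical values): the pointer comes out of a vector of pointers, and the loaded value is inserted straight over the corresponding passthru lane:

    cond.load:
      %ptr0 = extractelement <4 x ptr> %ptrs, i64 0
      %l0 = load i32, ptr %ptr0, align 4
      %v0 = insertelement <4 x i32> %passthru, i32 %l0, i64 0
      br label %else
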
2018-09-27  [ScalarizeMaskedMemIntrin] When expanding masked loads, start with the passthru value and insert each conditional load result over its element.  (Craig Topper, 1 file changed, -22/+12)

Previously we started with undef and did one final merge at the end with a select.

llvm-svn: 343271

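The same shape as the gather change above; a lane-0 sketch for a masked load (hypothetical values) showing why no final select against the passthru is needed:

    cond.load:
      %l0 = load i32, ptr %p, align 4
      %v0 = insertelement <4 x i32> %passthru, i32 %l0, i64 0
      br label %else

    else:
      ; a disabled lane simply keeps flowing the prior value
      %res.0 = phi <4 x i32> [ %v0, %cond.load ], [ %passthru, %entry ]
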
2018-09-27  [ScalarizeMaskedMemIntrin] Handle the case where the mask is an all-zero vector.  (Craig Topper, 1 file changed, -8/+8)

This shouldn't really happen in practice, I hope, but we tried to handle other constant cases. We missed this one because we checked for ConstantVector without realizing that zero becomes ConstantAggregateZero instead. So instead just check for Constant and use getAggregateElement, which will do the dirty work for us.

llvm-svn: 343270

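In IR terms (hypothetical example, modern intrinsic mangling): an all-zeroes i1 vector prints as zeroinitializer and is represented as ConstantAggregateZero rather than ConstantVector, which is exactly what the old check missed:

    ; the mask parses to ConstantAggregateZero, not a ConstantVector
    call void @llvm.masked.store.v4i32.p0(<4 x i32> %v, ptr %p, i32 4, <4 x i1> zeroinitializer)
    ; getAggregateElement reports every lane as false, so the whole
    ; call can be dropped without emitting any scalar stores
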
2018-09-27  [ScalarizeMaskedMemIntrin] Remove some temporary variables that are only used by a single if condition.  (Craig Topper, 1 file changed, -14/+5)

llvm-svn: 343268

2018-09-27  [ScalarizeMaskedMemIntrin] Clean up comments. NFC.  (Craig Topper, 1 file changed, -58/+49)

llvm-svn: 343267

2018-09-27  [ScalarizeMaskedMemIntrin] Don't emit 'icmp eq i1 %x, 1' to check mask values.  (Craig Topper, 1 file changed, -23/+9)

That's just %x, so use that directly. Had we emitted this IR earlier, InstCombine would have removed the icmp, so I'm going to assume using the i1 directly would be considered canonical.

llvm-svn: 343244

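A sketch of the simplification on a hypothetical lane-0 check:

    ; before
    %m0 = extractelement <16 x i1> %mask, i64 0
    %c0 = icmp eq i1 %m0, 1
    br i1 %c0, label %cond.load, label %else

    ; after: %c0 is just %m0, so branch on it directly
    br i1 %m0, label %cond.load, label %else
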
2018-04-24  [CodeGen] Do not allow opt-bisect-limit to skip ScalarizeMaskedMemIntrin.  (Andrei Elovikov, 1 file changed, -3/+0)

Summary: The pass is supposed to scalarize such intrinsics if the target does not support them natively, so if the scalarization does not happen, instruction selection crashes due to the inability to lower these intrinsics.

Reviewers: andrew.w.kaylor, craig.topper
Reviewed By: andrew.w.kaylor
Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45947
llvm-svn: 330700

2017-11-17  Fix a bunch more layering of CodeGen headers that are in Target.  (David Blaikie, 1 file changed, -1/+1)

All these headers already depend on CodeGen headers, so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around).

llvm-svn: 318490

2017-09-27  [CodeGen] Fix some Clang-tidy modernize-use-default-member-init and Include What You Use warnings; other minor fixes (NFC).  (Eugene Zelenko, 1 file changed, -17/+29)

llvm-svn: 314363

2017-09-07  Sink some IntrinsicInst.h and Intrinsics.h out of llvm/include.  (Reid Kleckner, 1 file changed, -0/+1)

Many of these uses can get by with forward declarations. Hopefully this speeds up compilation after adding a single intrinsic.

llvm-svn: 312759

2017-05-25  CodeGen: Rename DEBUG_TYPE to match pass names.  (Matthias Braun, 1 file changed, -6/+2)

Rename the DEBUG_TYPE to match the names of the corresponding passes where it makes sense. Also establish the pattern of simply referencing DEBUG_TYPE instead of repeating the pass name where possible.

llvm-svn: 303921

2017-05-15  [X86] Relocate the code that replaces subtarget-unsupported masked memory intrinsics so that it also runs at -O0.  (Ayman Musa, 1 file changed, -0/+660)

Currently, when masked load, store, gather, or scatter intrinsics are used, we check in the CodeGenPrepare pass whether the subtarget supports these intrinsics; if not, we replace them with scalar code. This is a functional transformation, not an optimization (it is not optional). The CodeGenPrepare pass does not run when the optimization level is set to CodeGenOpt::None (-O0), but a functional transformation should run at all optimization levels. So this change creates a new pass that runs at all optimization levels and does no more than this transformation.

Differential Revision: https://reviews.llvm.org/D32487
llvm-svn: 303050