path: root/llvm/lib/Transforms/Utils/LoopUnrollRuntime.cpp
author Sanjay Patel <spatel@rotateright.com> 2019-04-12 16:31:56 +0000
committer Sanjay Patel <spatel@rotateright.com> 2019-04-12 16:31:56 +0000
commit 5e4ad39af7c257d5d426368efd1bd07f5c2be772
tree 55412e2e5c78993415b8f7b928775fc08765a40f /llvm/lib/Transforms/Utils/LoopUnrollRuntime.cpp
parent 7bd8c37b17730a1a6e76d5543fba0939238b7640
[DAGCombiner] narrow shuffle of concatenated vectors

// shuffle (concat X, undef), (concat Y, undef), Mask -->
// concat (shuffle X, Y, Mask0), (shuffle X, Y, Mask1)

The ARM changes with 'vtrn' and narrowed 'vuzp' are improvements.

The x86 changes look neutral or better. There's one test with an extra
instruction, but that could be reversed for a subtarget with the right
attributes. But by default, we want to avoid the 256-bit op when possible
(in my motivating benchmark, a handful of ymm ops sprinkled into a sequence
of xmm ops are triggering frequency throttling on Haswell, resulting in
significantly worse perf).

Differential Revision: https://reviews.llvm.org/D60545

llvm-svn: 358291

Diffstat (limited to 'llvm/lib/Transforms/Utils/LoopUnrollRuntime.cpp')
0 files changed, 0 insertions, 0 deletions