author | Sanjay Patel <spatel@rotateright.com> | 2018-08-25 14:56:05 +0000 |
---|---|---|
committer | Sanjay Patel <spatel@rotateright.com> | 2018-08-25 14:56:05 +0000 |
commit | 8a84c747d2de2e99e035d8e072a00795b406ca6e (patch) | |
tree | cf484c68d117ba1ba4c8e4ab63a737b067dce94b /llvm/lib/Transforms/Utils/DemoteRegToStack.cpp | |
parent | 904343f879b34b44185f60d277ab568342d62bf8 (diff) | |
download | llvm-8a84c747d2de2e99e035d8e072a00795b406ca6e.zip llvm-8a84c747d2de2e99e035d8e072a00795b406ca6e.tar.gz llvm-8a84c747d2de2e99e035d8e072a00795b406ca6e.tar.bz2 |
[x86] try harder to use broadcast to load a scalar into vector reg
This is a preliminary step for a preliminary step for D50992.
I noticed that x86 often misses chances to load a scalar directly
into a vector register.
So this patch is just allowing more of those cases to match a
broadcast op in lowerBuildVectorAsBroadcast(). The old code comment
said it doesn't make sense to use a broadcast when we're loading a
single element and everything else is undef, but I think that's the
best case in the improved tests in insert-loaded-scalar.ll. We avoid
a scalar-to-vector-register move and/or less efficient shuffling.
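To illustrate the matching change described above, here is a minimal standalone sketch. It is not the code in lowerBuildVectorAsBroadcast(); the `Lane` type, the `findBroadcastOperand` helper, and the `allowSingleDefinedLane` flag are hypothetical names invented for this example. It models a build vector as lanes that either name a scalar operand or are undef, and shows how relaxing the "only one defined lane" restriction lets the single-loaded-element case be treated as a broadcast candidate.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical model: each lane of a build vector either holds the id of a
// scalar operand or is undef (std::nullopt). These are not LLVM's types.
using Lane = std::optional<int>;

// Returns the operand id to broadcast if every defined lane uses the same
// operand, otherwise std::nullopt. When allowSingleDefinedLane is false, a
// vector with exactly one defined lane is rejected, mimicking the old
// restriction mentioned in the commit message.
std::optional<int> findBroadcastOperand(const std::vector<Lane> &lanes,
                                        bool allowSingleDefinedLane) {
  std::optional<int> splat;
  std::size_t definedLanes = 0;
  for (const Lane &lane : lanes) {
    if (!lane)
      continue; // undef lanes may take any value, so they never block a splat
    if (splat && *splat != *lane)
      return std::nullopt; // two different operands: not a splat
    splat = lane;
    ++definedLanes;
  }
  if (definedLanes == 0)
    return std::nullopt; // nothing to broadcast
  if (definedLanes == 1 && !allowSingleDefinedLane)
    return std::nullopt; // old behavior: single element + undef is skipped
  return splat;
}
```

In this model, a vector such as {x, undef, undef, undef} is rejected when the flag is false and accepted when it is true, which corresponds to the case the improved tests in insert-loaded-scalar.ll exercise.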
Note that there are some existing types that were already producing
a broadcast, but that happens semi-accidentally. I.e., it's not
happening as part of lowerBuildVectorAsBroadcast(). The build vector
gets expanded into load + shuffle, and then shuffle lowering produces
the broadcast.
Description of the other test diffs:
1. avx-basic.ll - replacing load+shuffle is a win.
2. sse3-avx-addsub-2.ll - vmovddup vs. vbroadcastss is neutral
3. sse41.ll - don't care - we convert that intrinsic to generic IR now, so this test is deprecated
4. vector-shuffle-128-v8.ll / vector-shuffle-256-v16.ll - pshufb alternatives with an extra instruction are not obviously bad
Differential Revision: https://reviews.llvm.org/D51125
llvm-svn: 340685
Diffstat (limited to 'llvm/lib/Transforms/Utils/DemoteRegToStack.cpp')
0 files changed, 0 insertions, 0 deletions