path: root/llvm/lib/Transforms/Utils/InlineFunction.cpp
author: Sanjay Patel <spatel@rotateright.com> 2018-01-06 16:16:04 +0000
committer: Sanjay Patel <spatel@rotateright.com> 2018-01-06 16:16:04 +0000
commit: 5a48aef3f0dbb0934e266dbd068ff46dff5c4dbe
tree: b3339387d653e097d5942262ae56c44ece95f984 /llvm/lib/Transforms/Utils/InlineFunction.cpp
parent: b77bc6bb8b1df9b05a9cda0555d3c58655aba5ae
[x86, MemCmpExpansion] allow 2 pairs of loads per block (PR33325)

This is the last step needed to fix PR33325:
https://bugs.llvm.org/show_bug.cgi?id=33325

We're trading branch and compares for loads and logic ops.
This makes the code smaller and hopefully faster in most cases.

The 24-byte test shows an interesting construct: we load the trailing scalar
elements into vector registers and generate the same pcmpeq+movmsk code that
we expected for a pair of full vector elements (see the 32- and 64-byte tests).

Differential Revision: https://reviews.llvm.org/D41714

llvm-svn: 321934
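
As a rough source-level illustration of "trading branch and compares for loads and logic ops", the sketch below compares 16 bytes for equality in two ways. It is not the code from the patch, which operates on LLVM IR inside the expansion pass; it assumes the equality-only case (only whether the buffers match matters), and the names `load64`, `equal16_branchy`, and `equal16_merged` are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: unaligned 8-byte load expressed with memcpy.
static inline std::uint64_t load64(const unsigned char *p) {
  std::uint64_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}

// One pair of loads per block: each 8-byte pair gets its own
// compare and its own branch.
bool equal16_branchy(const void *a, const void *b) {
  const unsigned char *pa = static_cast<const unsigned char *>(a);
  const unsigned char *pb = static_cast<const unsigned char *>(b);
  if (load64(pa) != load64(pb))
    return false;
  return load64(pa + 8) == load64(pb + 8);
}

// Two pairs of loads per block: XOR each pair, OR the differences
// together, and test the combined value once.
bool equal16_merged(const void *a, const void *b) {
  const unsigned char *pa = static_cast<const unsigned char *>(a);
  const unsigned char *pb = static_cast<const unsigned char *>(b);
  std::uint64_t diff0 = load64(pa) ^ load64(pb);
  std::uint64_t diff1 = load64(pa + 8) ^ load64(pb + 8);
  return (diff0 | diff1) == 0;
}
```

The merged form always performs both pairs of loads, even when the first pair already differs, which is the cost the commit message accepts in exchange for fewer branches.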
Diffstat (limited to 'llvm/lib/Transforms/Utils/InlineFunction.cpp')
0 files changed, 0 insertions, 0 deletions