author | Sanjay Patel <spatel@rotateright.com> | 2019-11-27 13:33:11 -0500 |
---|---|---|
committer | Sanjay Patel <spatel@rotateright.com> | 2019-11-27 14:08:56 -0500 |
commit | 5c166f1d1969e9c1e5b72aa672add429b9c22b53 (patch) | |
tree | adf6302c8508cb2d3cf48fcf5e53eab409bfa65f /clang/lib/Frontend/CompilerInvocation.cpp | |
parent | 5c5e860535d8924a3d6eb950bb8a4945df01e9b7 (diff) | |
[x86] make SLM extract vector element more expensive than default
I'm not sure what effect this change will have on all of the affected
tests or on a larger benchmark, but it fixes the horizontal add/sub problems
noted here:
https://reviews.llvm.org/D59710?vs=227972&id=228095&whitespace=ignore-most#toc
The costs are based on the reciprocal throughput numbers in Agner Fog's
instruction tables for PEXTR*; these appear to be very slow ops on
Silvermont (SLM).
This is a small step towards the larger motivation discussed in PR43605:
https://bugs.llvm.org/show_bug.cgi?id=43605
Also, it seems likely that insert/extract is the source of perf regressions on
other CPUs (up to 30%) that were cited as part of the reason to revert D59710,
so maybe we'll extend the table-based approach to other subtargets.
Differential Revision: https://reviews.llvm.org/D70607
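
For readers unfamiliar with the table-based approach mentioned above, the following is a minimal, self-contained sketch of the idea, not the code from this patch. Every name in it (`Op`, `ElemTy`, `CostEntry`, `kSlowExtractTable`, `lookupCost`) is hypothetical, and the cost values are placeholders rather than the numbers the commit takes from the instruction tables. A subtarget with slow element extraction consults its own table first and falls back to a default cost when no entry matches.

```cpp
// Hypothetical stand-ins for illustration only; these are not LLVM's types.
enum class Op { ExtractElement, InsertElement };
enum class ElemTy { I8, I16, I32, I64 };

// One row of a per-subtarget cost table: (operation, element type) -> cost.
struct CostEntry {
  Op op;
  ElemTy ty;
  unsigned cost;
};

// Placeholder costs, NOT the values used in the actual commit.
constexpr CostEntry kSlowExtractTable[] = {
    {Op::ExtractElement, ElemTy::I8, 4},
    {Op::ExtractElement, ElemTy::I16, 4},
    {Op::ExtractElement, ElemTy::I32, 4},
    {Op::ExtractElement, ElemTy::I64, 7},
};

// Assumed generic cost used when no subtarget-specific entry applies.
constexpr unsigned kDefaultCost = 1;

// Returns the table cost when the subtarget has slow extracts and an entry
// matches; otherwise returns the default cost.
unsigned lookupCost(bool hasSlowExtract, Op op, ElemTy ty) {
  if (hasSlowExtract) {
    for (const CostEntry &e : kSlowExtractTable) {
      if (e.op == op && e.ty == ty)
        return e.cost;
    }
  }
  return kDefaultCost;
}
```

In this sketch, extending the approach to another subtarget would mean adding a second table and another check before the default fallback. Operations without an entry, such as inserts here, keep the default cost.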