diff options
author | Han Zhu <zhuhan7737@gmail.com> | 2023-02-13 17:16:25 -0800 |
---|---|---|
committer | Han Zhu <zhuhan7737@gmail.com> | 2023-04-12 17:11:18 -0700 |
commit | ef38880ce03bc1f1fb3606c5a629151f3d0e975e (patch) | |
tree | 84685077c48960a6c6b25b09631e83358cd52642 /llvm/lib/CodeGen/MachineBasicBlock.cpp | |
parent | 9638da200e00bd069e6dd63604e14cbafede9324 (diff) | |
download | llvm-ef38880ce03bc1f1fb3606c5a629151f3d0e975e.zip llvm-ef38880ce03bc1f1fb3606c5a629151f3d0e975e.tar.gz llvm-ef38880ce03bc1f1fb3606c5a629151f3d0e975e.tar.bz2 |
[X86 isel] Remove lane requirement from lowerShuffleAsUNPCKAndPermute
`lowerShuffleAsUNPCKAndPermute` requires the shuffle mask element to be
in the same lane in both the input and output vectors. This prevents it from
matching certain patterns for example in [GHI
61964](https://github.com/llvm/llvm-project/issues/61964). Removing the lane
requirement fixes the issue.
The change I'm targeting is in the test llvm/test/CodeGen/X86/pr61964.ll. The
codegen has improved notably with this patch. Otherwise, looks like some
broadcast instructions are replaced with unpck and perm. To check if there's
any other performance change, I ran llvm-test-suite benchmarks from the
SingleSource, MultiSource, and MicroBenchmarks directories:
```
Tests: 2665
Short Running: 2009 (filtered out)
Same hash: 140 (filtered out)
In Blacklist: 513 (filtered out)
Remaining: 3
Metric: exec_time
Program exec_time
lhs rhs diff
test-suite :: MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000.test 1.64 1.64 0.1%
test-suite :: SingleSource/Benchmarks/Adobe-C++/loop_unroll.test 1.06 1.06 0.0%
test-suite :: MultiSource/Applications/JM/lencod/lencod.test 5.25 5.25 0.0%
Geomean difference nan nan 0.0%
exec_time
l/r lhs rhs diff
count 3.000000 3.000000 3.000000
mean 2.648300 2.649100 0.000462
std 2.269035 2.268849 0.000415
min 1.055500 1.055900 0.000095
25% 1.349300 1.350250 0.000237
50% 1.643100 1.644600 0.000379
75% 3.444700 3.445700 0.000646
max 5.246300 5.246800 0.000913
```
The patch only hits three cases and the result is neutral. (The 513 blacklisted
benchmarks are the ones under MicroBenchmarks, which `--filter-hash` does
not work and I manually verified their code did not change).
Differential Revision: https://reviews.llvm.org/D147668
Diffstat (limited to 'llvm/lib/CodeGen/MachineBasicBlock.cpp')
0 files changed, 0 insertions, 0 deletions