author | Jeffrey Byrnes <jeffrey.byrnes@amd.com> | 2024-05-21 09:21:36 -0700 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-05-21 09:21:36 -0700 |
commit | ea43a30899df5c3c36412392c8f4db79973a1c43 (patch) | |
tree | 631df57d306ae446ca12a4d055f2010e815a42b5 /llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp | |
parent | f52d29c9ab7d3c712d36c28d00adc95fe7d52805 (diff) | |

[AMDGPU] Vectorize more 16 bit shuffles (#90648)

In the case of larger vectors, we should still prefer the vectorized
version (i.e. shufflevector vs extract/insert chains).

In arithmetic chains, vectorization results in chains of packed math
instructions (as opposed to unpack/repack & scalarized arithmetic):
https://godbolt.org/z/c5onaf6G5

In chains with PHIs, vectorization again removes the unnecessary pack /
repack code around BBs: https://godbolt.org/z/vz7zYzvhs
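
To illustrate the "shufflevector vs extract/insert chains" distinction mentioned in the message, here is a minimal sketch that models both forms and checks that they produce the same lanes. It is not code from this commit: plain Python lists stand in for 16-bit vectors, and the function names, input values, and mask are all hypothetical.

```python
# Illustrative model only: Python lists stand in for <N x i16> vectors.
# Nothing here is taken from the commit; all names and values are assumptions.

def shufflevector(a, b, mask):
    """Vectorized form: select lanes from the concatenation of a and b."""
    combined = a + b
    return [combined[src] for src in mask]

def extract_insert_chain(a, b, mask):
    """Scalarized form: build the same result one lane at a time."""
    combined = a + b
    result = [None] * len(mask)   # stands in for an undefined starting vector
    for lane, src in enumerate(mask):
        element = combined[src]   # models one extractelement
        result[lane] = element    # models one insertelement
    return result

a = [10, 11, 12, 13]
b = [20, 21, 22, 23]
mask = [1, 0, 5, 4]  # hypothetical mask: lanes 1 and 0 of a, then 1 and 0 of b

vectorized = shufflevector(a, b, mask)
scalarized = extract_insert_chain(a, b, mask)
```

Both forms compute the same value; the scalarized one simply needs one extract and one insert per lane, which is the chain the message says should lose out to the single shuffle for larger vectors.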
Diffstat (limited to 'llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp')
0 files changed, 0 insertions, 0 deletions