path: root/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
author: Jeffrey Byrnes <jeffrey.byrnes@amd.com> 2024-05-21 09:21:36 -0700
committer: GitHub <noreply@github.com> 2024-05-21 09:21:36 -0700
commit: ea43a30899df5c3c36412392c8f4db79973a1c43
tree: 631df57d306ae446ca12a4d055f2010e815a42b5 /llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
parent: f52d29c9ab7d3c712d36c28d00adc95fe7d52805
[AMDGPU] Vectorize more 16 bit shuffles (#90648)

In the case of larger vectors, we should still prefer the vectorized version (i.e. shufflevector vs extract/insert chains).

In arithmetic chains, vectorization results in chains of packed math instructions (as opposed to unpack/repack & scalarized arithmetic):
https://godbolt.org/z/c5onaf6G5

In chains with PHIs, vectorization again removes the unnecessary pack/repack code around BBs:
https://godbolt.org/z/vz7zYzvhs

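
To make the "shufflevector vs extract/insert chains" distinction concrete, here is a minimal C++ sketch of what the two IR forms compute for an 8-lane vector of 16-bit elements. It models only the semantics of the two forms, not the AMDGPU cost model or the vectorizer change in this commit. The names Vec8i16, reverseByChain, and shuffleVector are hypothetical and exist only for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an <8 x i16> IR value; not an LLVM type.
using Vec8i16 = std::array<std::uint16_t, 8>;

// Scalarized form: a chain of extractelement/insertelement pairs,
// one pair per lane. Here the chain reverses the lane order.
Vec8i16 reverseByChain(const Vec8i16 &src) {
  Vec8i16 result{};
  for (std::size_t lane = 0; lane < result.size(); ++lane) {
    std::uint16_t element = src[result.size() - 1 - lane]; // extractelement
    result[lane] = element;                                // insertelement
  }
  return result;
}

// Vectorized form: a single shuffle of two inputs. Mask values 0..7
// select a lane from `a`, values 8..15 select a lane from `b`.
Vec8i16 shuffleVector(const Vec8i16 &a, const Vec8i16 &b,
                      const std::array<int, 8> &mask) {
  Vec8i16 result{};
  for (std::size_t lane = 0; lane < result.size(); ++lane) {
    assert(mask[lane] >= 0 && mask[lane] < 16);
    std::size_t index = static_cast<std::size_t>(mask[lane]);
    result[lane] = index < 8 ? a[index] : b[index - 8];
  }
  return result;
}
```

Calling shuffleVector(v, v, mask) with the mask {7, 6, 5, 4, 3, 2, 1, 0} yields the same lanes as reverseByChain(v). The shuffle expresses the whole permutation as one operation on the vector, whereas the chain spells it out as eight per-lane steps.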
Diffstat (limited to 'llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp')
0 files changed, 0 insertions, 0 deletions