author | Alex MacLean <amaclean@nvidia.com> | 2025-07-13 15:06:53 -0700 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-07-13 15:06:53 -0700 |
commit | 86203b6b33e49cc1a8ce6d7d69e7df4970d8f7bd | |
tree | a0f3a0a83e911b0cceff84157f224a71a1100973 /llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp | |
parent | c384ec431dd7f771c9dd7c462cec5301ac0f32bb | |
[NVPTX] Use PRMT more widely, and improve folding around this instruction (#148261)
Replace uses of BFE with PRMT when lowering v4i8 vectors. This generally
leads to equivalent or better SASS (https://cuda.godbolt.org/z/M75W6f8xd)
and reduces the number of target-specific operations we need to represent.

Also implement KnownBits tracking for PRMT, which allows redundant AND
instructions to be eliminated when lowering various i8 operations.
Diffstat (limited to 'llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp')
0 files changed, 0 insertions, 0 deletions