author | Justin Fargnoli <jfargnoli@nvidia.com> | 2024-10-31 16:09:20 -0700 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-10-31 16:09:20 -0700 |
commit | a1987beac58765b7df9690eb898c14f629449210 (patch) | |
tree | 631118d3b2ee65da3319426d907821594f4f1c72 /clang/lib/Serialization/ASTWriter.cpp | |
parent | 19b4f17d4c0ae12725050d09f04f85bccc686d8e (diff) | |
Reland "[NVPTX] Prefer prmt.b32 over bfi.b32" (#114326)
Fix
[failure](https://github.com/llvm/llvm-project/pull/110766#discussion_r1796832635)
identified by @akuegel.
---
In [[NVPTX] Improve lowering of
v4i8](https://github.com/llvm/llvm-project/commit/cbafb6f2f5c99474164dcc725820cbbeb2e02e14)
@Artem-B added the ability to lower ISD::BUILD_VECTOR with bfi PTX
instructions. @Artem-B did this because:
> (https://github.com/llvm/llvm-project/pull/67866#discussion_r1343066911)
> Under the hood byte extraction/insertion ends up as BFI/BFE
> instructions, so we may as well do that in PTX, too.
> https://godbolt.org/z/Tb3zWbj9b
However, the example that @Artem-B linked was targeting sm_52. On modern
architectures, ptxas uses prmt.b32.
[Example](https://godbolt.org/z/Ye4W1n84o).
Thus, remove uses of NVPTXISD::BFI in favor of NVPTXISD::PRMT.
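
To illustrate why the two approaches are interchangeable for building a v4i8, here is a minimal host-side sketch in C++ that emulates the documented semantics of `bfi.b32` and `prmt.b32` (default mode) and packs four bytes both ways. The helper names (`bfi_b32`, `prmt_b32`, `pack_bfi`, `pack_prmt`) and the selector constants are illustrative assumptions, not code or values taken from this patch.

```cpp
#include <cstdint>

// Emulates PTX bfi.b32: insert the low `len` bits of `a` into `b` at bit `pos`.
constexpr uint32_t bfi_b32(uint32_t a, uint32_t b, uint32_t pos, uint32_t len) {
  pos &= 0xFF;
  len &= 0xFF;
  if (pos >= 32 || len == 0) return b;  // nothing to insert
  uint32_t field = (len >= 32) ? 0xFFFFFFFFu : ((1u << len) - 1u);
  uint32_t mask = field << pos;
  return (b & ~mask) | ((a << pos) & mask);
}

// Emulates PTX prmt.b32 in default mode: bytes 0..3 come from `a`,
// bytes 4..7 from `b`; each 4-bit nibble of `c` selects one result byte.
// If bit 3 of a nibble is set, the selected byte's sign bit is replicated.
constexpr uint32_t prmt_b32(uint32_t a, uint32_t b, uint32_t c) {
  uint64_t src = (static_cast<uint64_t>(b) << 32) | a;
  uint32_t d = 0;
  for (int i = 0; i < 4; ++i) {
    uint32_t sel = (c >> (4 * i)) & 0xF;
    uint32_t byte = static_cast<uint32_t>(src >> (8 * (sel & 0x7))) & 0xFF;
    if (sel & 0x8) byte = (byte & 0x80) ? 0xFFu : 0x00u;
    d |= byte << (8 * i);
  }
  return d;
}

// Pack the low byte of each element with a chain of four bit-field inserts.
constexpr uint32_t pack_bfi(uint32_t e0, uint32_t e1, uint32_t e2, uint32_t e3) {
  uint32_t r = bfi_b32(e0, 0, 0, 8);
  r = bfi_b32(e1, r, 8, 8);
  r = bfi_b32(e2, r, 16, 8);
  return bfi_b32(e3, r, 24, 8);
}

// Pack the same bytes with three byte permutes (hypothetical selectors).
constexpr uint32_t pack_prmt(uint32_t e0, uint32_t e1, uint32_t e2, uint32_t e3) {
  uint32_t lo = prmt_b32(e0, e1, 0x0040);  // bytes 0,1 = e0.b0, e1.b0
  uint32_t hi = prmt_b32(e2, e3, 0x0040);  // bytes 0,1 = e2.b0, e3.b0
  return prmt_b32(lo, hi, 0x5410);         // lo.b0, lo.b1, hi.b0, hi.b1
}

static_assert(pack_bfi(0x11, 0x22, 0x33, 0x44) ==
                  pack_prmt(0x11, 0x22, 0x33, 0x44),
              "both lowerings should produce the same packed value");
```

In this sketch the permute-based version needs three operations where the insert-based version needs four, and the two independent `prmt` steps for the low and high halves have no data dependence on each other. Whether that translates into better SASS depends on the target architecture and on ptxas.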
Diffstat (limited to 'clang/lib/Serialization/ASTWriter.cpp')
0 files changed, 0 insertions, 0 deletions