riscv-gnu-toolchain/llvm.git

author	Justin Fargnoli <jfargnoli@nvidia.com>	2024-10-31 16:09:20 -0700
committer	GitHub <noreply@github.com>	2024-10-31 16:09:20 -0700
commit	a1987beac58765b7df9690eb898c14f629449210 (patch)
tree	631118d3b2ee65da3319426d907821594f4f1c72 /clang/lib/Serialization/ASTWriter.cpp
parent	19b4f17d4c0ae12725050d09f04f85bccc686d8e (diff)
download	llvm-a1987beac58765b7df9690eb898c14f629449210.zip llvm-a1987beac58765b7df9690eb898c14f629449210.tar.gz llvm-a1987beac58765b7df9690eb898c14f629449210.tar.bz2

Reland "[NVPTX] Prefer prmt.b32 over bfi.b32" (#114326)

Fix [failure](https://github.com/llvm/llvm-project/pull/110766#discussion_r1796832635) identified by @akuegel.

---

In [[NVPTX] Improve lowering of v4i8](https://github.com/llvm/llvm-project/commit/cbafb6f2f5c99474164dcc725820cbbeb2e02e14), @Artem-B added the ability to lower ISD::BUILD_VECTOR with bfi PTX instructions. @Artem-B did this because (https://github.com/llvm/llvm-project/pull/67866#discussion_r1343066911):

> Under the hood byte extraction/insertion ends up as BFI/BFE instructions, so we may as well do that in PTX, too. https://godbolt.org/z/Tb3zWbj9b

However, the example that @Artem-B linked was targeting sm_52. On modern architectures, ptxas uses prmt.b32. [Example](https://godbolt.org/z/Ye4W1n84o).

Thus, remove uses of NVPTXISD::BFI in favor of NVPTXISD::PRMT.
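To make the trade-off concrete, here is a minimal host-side C++ sketch that emulates the two PTX instructions and packs four bytes into a 32-bit word both ways. The helper names (`bfi_b32`, `prmt_b32`, `pack_bfi`, `pack_prmt`) are hypothetical, and the selector values `0x3340` and `0x5410` are illustrative choices that produce the desired byte order; they are not quoted from the patch. The emulation follows the documented default-mode semantics of `prmt.b32` and the insert semantics of `bfi.b32` as best understood here, so treat it as a sketch rather than a reference model.

```cpp
#include <cstdint>

// Emulates PTX bfi.b32: insert the low `len` bits of `a` into `b` at bit `pos`.
uint32_t bfi_b32(uint32_t a, uint32_t b, uint32_t pos, uint32_t len) {
    pos &= 0xff;
    len &= 0xff;
    if (len == 0 || pos >= 32) return b;   // nothing to insert
    if (pos + len > 32) len = 32 - pos;    // clamp the field to the word
    uint32_t field = (len >= 32) ? 0xffffffffu : ((1u << len) - 1u);
    uint32_t mask = field << pos;
    return (b & ~mask) | ((a << pos) & mask);
}

// Emulates PTX prmt.b32 in default mode: each 4-bit nibble of `c` picks one
// of the 8 bytes of {b, a} (a = bytes 0..3, b = bytes 4..7). If bit 3 of the
// nibble is set, the selected byte's sign bit is replicated instead.
uint32_t prmt_b32(uint32_t a, uint32_t b, uint32_t c) {
    uint64_t src = (static_cast<uint64_t>(b) << 32) | a;
    uint32_t d = 0;
    for (int i = 0; i < 4; ++i) {
        uint32_t sel = (c >> (4 * i)) & 0xf;
        uint32_t byte = static_cast<uint32_t>(src >> (8 * (sel & 7))) & 0xff;
        if (sel & 8) byte = (byte & 0x80) ? 0xffu : 0x00u;
        d |= byte << (8 * i);
    }
    return d;
}

// Packs the low bytes of x0..x3 with a chain of three inserts.
uint32_t pack_bfi(uint32_t x0, uint32_t x1, uint32_t x2, uint32_t x3) {
    uint32_t r = bfi_b32(x1, x0, 8, 8);
    r = bfi_b32(x2, r, 16, 8);
    return bfi_b32(x3, r, 24, 8);
}

// Packs the low bytes of x0..x3 with three permutes: two pairwise merges,
// then one merge of the two half-results.
uint32_t pack_prmt(uint32_t x0, uint32_t x1, uint32_t x2, uint32_t x3) {
    uint32_t lo = prmt_b32(x0, x1, 0x3340);  // bytes 0,1 = x0.b0, x1.b0
    uint32_t hi = prmt_b32(x2, x3, 0x3340);  // bytes 0,1 = x2.b0, x3.b0
    return prmt_b32(lo, hi, 0x5410);         // lo.b0, lo.b1, hi.b0, hi.b1
}
```

Both helpers use three instructions, but the bfi chain is fully serial while the two pairwise permutes are independent of each other, which is one plausible reason a permute-based form maps well onto what ptxas emits on newer targets.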

Diffstat (limited to 'clang/lib/Serialization/ASTWriter.cpp')

0 files changed, 0 insertions, 0 deletions
