author | Justin Fargnoli <jfargnoli@nvidia.com> | 2024-10-10 10:24:02 -0700 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-10-10 10:24:02 -0700 |
commit | 3f9998af4f79e95fe8be615df9d6b898008044b9 (patch) | |
tree | 1d0999815b8c6abb1e60d106789fb9ad096453c2 /lldb/packages/Python/lldbsuite/test/configuration.py | |
parent | 43ba97e7079525a9686e15a6963508dfbd493f81 (diff) | |
[NVPTX] Prefer prmt.b32 over bfi.b32 (#110766)
In [[NVPTX] Improve lowering of v4i8](https://github.com/llvm/llvm-project/commit/cbafb6f2f5c99474164dcc725820cbbeb2e02e14),
@Artem-B added the ability to lower ISD::BUILD_VECTOR with bfi PTX
instructions. @Artem-B did this because
([source](https://github.com/llvm/llvm-project/pull/67866#discussion_r1343066911)):
> Under the hood byte extraction/insertion ends up as BFI/BFE
> instructions, so we may as well do that in PTX, too.
> https://godbolt.org/z/Tb3zWbj9b
However, the example that @Artem-B linked targeted sm_52. On modern
architectures, ptxas uses prmt.b32 instead
([example](https://godbolt.org/z/Ye4W1n84o)).
Thus, remove uses of NVPTXISD::BFI in favor of NVPTXISD::PRMT.
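
To illustrate why the two forms are interchangeable for building a v4i8, here is a minimal host-side C++ sketch that models the documented semantics of `bfi.b32` and `prmt.b32` (default mode) in software. It is not the LLVM lowering code: the function names (`bfi_b32`, `prmt_b32`, `build_v4i8_bfi`, `build_v4i8_prmt`) are hypothetical, and the selector values `0x3340` and `0x5410` are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Software model of PTX bfi.b32: insert the low `len` bits of `a`
// into `b`, starting at bit `pos`. Hypothetical helper name.
uint32_t bfi_b32(uint32_t a, uint32_t b, uint32_t pos, uint32_t len) {
    pos &= 0xff;
    len &= 0xff;
    if (len == 0 || pos >= 32) return b;   // nothing to insert
    if (len > 32 - pos) len = 32 - pos;    // clamp field to the word
    uint32_t mask = (len == 32) ? 0xffffffffu : ((1u << len) - 1u);
    mask <<= pos;
    return (b & ~mask) | ((a << pos) & mask);
}

// Software model of PTX prmt.b32 in default mode. The eight source
// bytes are numbered 0..3 from `a` and 4..7 from `b`. Each selector
// nibble picks one byte; bit 3 of the nibble replicates its sign bit.
uint32_t prmt_b32(uint32_t a, uint32_t b, uint32_t sel) {
    uint64_t src = (static_cast<uint64_t>(b) << 32) | a;
    uint32_t out = 0;
    for (int i = 0; i < 4; ++i) {
        uint32_t nib = (sel >> (4 * i)) & 0xf;
        uint32_t byte = static_cast<uint32_t>(src >> (8 * (nib & 7))) & 0xff;
        if (nib & 8) byte = (byte & 0x80) ? 0xffu : 0x00u;
        out |= byte << (8 * i);
    }
    return out;
}

// Build a v4i8 with a chain of three bit-field inserts.
uint32_t build_v4i8_bfi(uint32_t e0, uint32_t e1, uint32_t e2, uint32_t e3) {
    uint32_t v = e0;            // byte 0; upper bits get overwritten below
    v = bfi_b32(e1, v, 8, 8);   // byte 1
    v = bfi_b32(e2, v, 16, 8);  // byte 2
    v = bfi_b32(e3, v, 24, 8);  // byte 3
    return v;
}

// Build the same v4i8 with three byte permutes.
uint32_t build_v4i8_prmt(uint32_t e0, uint32_t e1, uint32_t e2, uint32_t e3) {
    uint32_t lo = prmt_b32(e0, e1, 0x3340);  // bytes 0,1 = e0,e1; rest don't-care
    uint32_t hi = prmt_b32(e2, e3, 0x3340);  // bytes 0,1 = e2,e3; rest don't-care
    return prmt_b32(lo, hi, 0x5410);         // {hi.b1, hi.b0, lo.b1, lo.b0}
}
```

Under this model, both builders pack the low byte of each element into the same 32-bit word, for example `build_v4i8_bfi(0x11, 0x22, 0x33, 0x44)` and `build_v4i8_prmt(0x11, 0x22, 0x33, 0x44)` both yield `0x44332211`, and any garbage in the upper bits of the inputs is discarded in either case.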