author | weiwei chen <weiwei.chen@modular.com> | 2024-06-24 22:15:58 -0400 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-06-24 22:15:58 -0400 |
commit | b0e9b00ce7d623175c5e60e82afe24e7f8a200be (patch) | |
tree | 0661e08ecd34a9c2adb6b0b48adb24f57e3d49ca /clang/unittests/Format/ConfigParseTest.cpp | |
parent | 7ea63b9db4198688873036f3b0b81f9124076f7a (diff) | |
[NVPTX] Make nvptx mma instructions convergent. (#96521)
The NVPTX backend generates wrong code for the following input:
```
%0 = llvm.nvvm.mma.m?n?k?.row.col.??? (...)
if laneid == 0:
  ret
else:
  store %0
```
The backend reorders the instructions (as a side effect of the `MachineSink`
pass) to:
```
if laneid == 0:
  ret
else:
  %0 = llvm.nvvm.mma.m?n?k?.row.col.??? (...)
  store %0
```
This is incorrect because `mma` is a warp-level instruction: all threads in
the warp must synchronize before the operation is performed, so it must not
be guarded by a condition on a specific thread id. In terms of warp-level
synchronization it should be treated like the shuffle instruction `shfl`,
which is already marked `isConvergent = true`.
Apply `isConvergent = true` to `mma` instructions.
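The effect of the reordering can be illustrated with a toy model. The sketch below is not NVPTX or hardware semantics: `warp_collective` is a hypothetical stand-in for a warp-level operation such as `mma`, modeled here as a sum over the participating lanes, and a 32-lane warp is assumed. On real hardware, executing `mma` with a lane missing is undefined rather than a well-defined partial sum.

```python
WARP_SIZE = 32  # assumed warp width for this toy model


def warp_collective(inputs, active):
    """Hypothetical stand-in for a warp-level op such as mma.

    Every active lane receives a value that depends on the inputs
    of all active lanes (modeled as a plain sum).
    """
    total = sum(inputs[lane] for lane in active)
    return {lane: total for lane in active}


def original_order(inputs):
    # All lanes execute the collective before the branch diverges.
    result = warp_collective(inputs, range(WARP_SIZE))
    # Lane 0 returns; the remaining lanes store their result.
    return {lane: result[lane] for lane in range(1, WARP_SIZE)}


def sunk_order(inputs):
    # The collective was sunk into the else-branch, so lane 0 has
    # already returned and no longer participates.
    active = range(1, WARP_SIZE)
    return warp_collective(inputs, active)


inputs = [lane + 1 for lane in range(WARP_SIZE)]
stored_before = original_order(inputs)
stored_after = sunk_order(inputs)
print(stored_before[1], stored_after[1])
```

In this model the stored values differ between the two orderings because lane 0's contribution is lost once the operation sits behind the branch. Marking an instruction as convergent tells passes such as `MachineSink` not to move it into a position with different control dependencies.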