rocket-tools/riscv-gnu-toolchain/llvm.git

author	weiwei chen <weiwei.chen@modular.com>	2024-06-24 22:15:58 -0400
committer	GitHub <noreply@github.com>	2024-06-24 22:15:58 -0400
commit	b0e9b00ce7d623175c5e60e82afe24e7f8a200be (patch)
tree	0661e08ecd34a9c2adb6b0b48adb24f57e3d49ca /clang/unittests/Format/ConfigParseTest.cpp
parent	7ea63b9db4198688873036f3b0b81f9124076f7a (diff)
download	llvm-b0e9b00ce7d623175c5e60e82afe24e7f8a200be.zip llvm-b0e9b00ce7d623175c5e60e82afe24e7f8a200be.tar.gz llvm-b0e9b00ce7d623175c5e60e82afe24e7f8a200be.tar.bz2

[NVPTX] Make nvptx mma instructions convergent. (#96521)

We are running into the NVPTX backend generating wrong code for an input like:

```
%0 = llvm.nvvm.mma.m?n?k?.row.col.??? (...)
if laneid == 0:
    ret
else:
    store %0
```

The backend reorders the instruction (as an effect of the `MachineSink` pass) to:

```
if laneid == 0:
    ret
else:
    %0 = llvm.nvvm.mma.m?n?k?.row.col.??? (...)
    store %0
```

This is incorrect because `mma` is a warp-level instruction: all threads in the warp need to sync before performing the operation, so it must not be guarded by a check on a specific thread id. In terms of warp-level sync it is similar to the shuffle instruction `shfl`, and `shfl` is marked as `isConvergent = true`.

Apply `isConvergent = true` to the `mma` instructions.
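To illustrate why sinking a warp-level operation into a divergent branch changes behavior, here is a toy Python model. It is not NVPTX or CUDA code: `warp_op`, `original_order`, `after_sinking`, and the 4-lane warp are hypothetical stand-ins, and the "sum over participating lanes" is only an assumed placeholder for a cooperative operation such as `mma`. On real hardware the outcome of running `mma` with some lanes missing is not this well defined; the sketch only shows that the result depends on which lanes participate.

```python
# Toy model of a warp-wide (convergent) operation. All names here are
# hypothetical; a real warp has 32 lanes and mma computes a matrix product.

def warp_op(values, active):
    """Each active lane receives the sum of the inputs of all active lanes."""
    total = sum(v for v, is_active in zip(values, active) if is_active)
    return [total if is_active else None for is_active in active]


def original_order(values):
    """The op runs before the branch, while every lane is still active."""
    all_active = [True] * len(values)
    result = warp_op(values, all_active)
    # Lane 0 returns early; the remaining lanes store their result.
    return {lane: result[lane] for lane in range(1, len(values))}


def after_sinking(values):
    """The op was sunk into the else branch, so lane 0 no longer takes part."""
    active = [lane != 0 for lane in range(len(values))]
    result = warp_op(values, active)
    return {lane: result[lane] for lane in range(1, len(values))}


values = [10, 1, 2, 3]
print(original_order(values))  # {1: 16, 2: 16, 3: 16}
print(after_sinking(values))   # {1: 6, 2: 6, 3: 6}
```

The stored values differ between the two orderings even though no single lane's own code changed, which is the kind of dependence that marking an instruction convergent is meant to protect from control-flow-changing transformations such as sinking.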

Diffstat (limited to 'clang/unittests/Format/ConfigParseTest.cpp')

0 files changed, 0 insertions, 0 deletions

