rocket-tools/riscv-gnu-toolchain/llvm.git

author	Durgadoss R <durgadossr@nvidia.com>	2024-10-26 11:15:50 +0530
committer	GitHub <noreply@github.com>	2024-10-26 11:15:50 +0530
commit	13d6233e77982f2a596922a79365373e1466a968 (patch)
tree	2246258f1ada07fd18e24868f3dc7ba4e0312bf5 /clang/lib/CodeGen/CodeGenFunction.cpp
parent	bb00f5b1edd0ed77b7e7b0113dad223505564b18 (diff)

[MLIR][NVGPU] Fix nvgpu_arrive syntax in matmulBuilder.py (#113713)

This patch updates the syntax for nvgpu_arrive Op in matmulBuilder.py. This fixes the compilation error for this test.

For the warp-specialized matmul_kernel implementation, removing the WaitGroupSyncOp (after the mma-main-loop) fixes the hang observed.

With these two fixes, the test compiles and executes successfully on an sm90a machine.

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>

Diffstat (limited to 'clang/lib/CodeGen/CodeGenFunction.cpp')

0 files changed, 0 insertions, 0 deletions
