riscv-gnu-toolchain/llvm.git

author	Guray Ozen <guray.ozen@gmail.com>	2023-08-09 17:56:59 +0200
committer	Guray Ozen <guray.ozen@gmail.com>	2023-08-09 23:08:00 +0200
commit	18e161f9e15b036faf48bfd8813d9330e06e2ee3 (patch)
tree	32a5c253dd30b3766729e033a2118b42c2e0f78a /llvm/lib/Transforms/Utils/InlineFunction.cpp
parent	90dbd5860bf650c769e84172e0835901396d81a2 (diff)

[MLIR][NVVM] Introduction of the `wgmma.mma_async` Op

This work introduces the `wgmma.mma_async` Op along with PTX generation using `BasicPtxBuilderOpInterface`. The Op is designed to execute the matrix multiply-and-accumulate operation across a warpgroup (128 threads). It's important to note that this operation works for devices with the sm_90a capability.

The matrix multiply-and-accumulate operation can take one of the following forms. In both cases, matrix D is referred to as the accumulator:

D = A * B + D : Result is added to the accumulator matrix D.
D = A * B : The input from the accumulator matrix D is not utilized.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D157370
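
The two forms described in the commit message can be illustrated with a small scalar reference model. This is only a sketch of the arithmetic, not the Op, its PTX, or its warpgroup execution: the function name `mma_reference` and the `accumulate` flag are hypothetical and do not come from the patch.

```python
def mma_reference(a, b, d, accumulate=True):
    """Scalar model of D = A * B + D (accumulate=True) or D = A * B (False).

    a is M x K, b is K x N, d is M x N, all as nested lists.
    Hypothetical helper for illustration; it is not part of MLIR or NVVM.
    """
    m, k, n = len(a), len(b), len(b[0])
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            # Start from the accumulator value only in the D = A * B + D form.
            acc = d[i][j] if accumulate else 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            out[i][j] = acc
    return out


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
d = [[1, 1], [1, 1]]

with_acc = mma_reference(a, b, d, accumulate=True)      # D = A * B + D
without_acc = mma_reference(a, b, d, accumulate=False)  # D = A * B
```

In the second call the incoming contents of `d` are ignored, which matches the "input from the accumulator matrix D is not utilized" case.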

Diffstat (limited to 'llvm/lib/Transforms/Utils/InlineFunction.cpp')

0 files changed, 0 insertions, 0 deletions

