author: Guray Ozen <guray.ozen@gmail.com> 2023-08-09 17:56:59 +0200
committer: Guray Ozen <guray.ozen@gmail.com> 2023-08-09 23:08:00 +0200
commit: 18e161f9e15b036faf48bfd8813d9330e06e2ee3
tree: 32a5c253dd30b3766729e033a2118b42c2e0f78a /llvm/lib/Transforms/Utils/InlineFunction.cpp
parent: 90dbd5860bf650c769e84172e0835901396d81a2
[MLIR][NVVM] Introduction of the `wgmma.mma_async` Op
This work introduces the `wgmma.mma_async` Op along with PTX generation using `BasicPtxBuilderOpInterface`.

The Op is designed to execute the matrix multiply-and-accumulate operation across a warpgroup (128 threads). Note that this operation works for devices with the sm_90a capability.

The matrix multiply-and-accumulate operation can take one of the following forms. In both cases, matrix D is referred to as the accumulator:

  D = A * B + D : Result is added to the accumulator matrix D.
  D = A * B     : The input from the accumulator matrix D is not utilized.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D157370
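
To make the two forms concrete, here is a minimal scalar reference sketch in Python. It only illustrates the arithmetic (D = A * B + D versus D = A * B) on small nested lists. It does not model how the Op is lowered to PTX or how work is distributed across a warpgroup. The function name `mma` and the `accumulate` flag are hypothetical names chosen for this illustration and are not part of the NVVM dialect.

```python
def mma(a, b, d, accumulate=True):
    """Reference matrix multiply-and-accumulate on nested lists.

    accumulate=True  computes D = A * B + D
    accumulate=False computes D = A * B (incoming D values are ignored)

    `mma` and `accumulate` are illustrative names, not NVVM dialect names.
    """
    m, k, n = len(a), len(b), len(b[0])
    out = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            # Start from the accumulator value only in the accumulating form.
            acc = d[i][j] if accumulate else 0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            out[i][j] = acc
    return out


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
d = [[1, 1], [1, 1]]

with_acc = mma(a, b, d, accumulate=True)      # D = A * B + D
without_acc = mma(a, b, d, accumulate=False)  # D = A * B
```

In the second call the contents of `d` have no effect on the result, which mirrors the statement that the input from the accumulator matrix D is not utilized.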
Diffstat (limited to 'llvm/lib/Transforms/Utils/InlineFunction.cpp')
0 files changed, 0 insertions, 0 deletions