path: root/clang/lib/ExtractAPI/Serialization/SymbolGraphSerializer.cpp
author: Guray Ozen <guray.ozen@gmail.com> 2023-08-22 12:33:36 +0200
committer: Guray Ozen <guray.ozen@gmail.com> 2023-08-22 16:12:25 +0200
commit: cce3e8ed895b2d4c1396929c363c071e15fdbf8b
tree: 84cea5a4ac49e0519e2527678ecbb5f621ec56e7 /clang/lib/ExtractAPI/Serialization/SymbolGraphSerializer.cpp
parent: f740bcb3707a17ed4ccd52157089011a586cc2a6
[MLIR][NVGPU] Introduction of wgmma.generate.descriptor Op
This patch introduces a new Op, `wgmma.generate.descriptor`, which creates a wgmma descriptor for the inputs of matrix multiply-and-accumulate operations that use the `wgmma.mma_async` PTX instruction.

The descriptor format is specified at:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#asynchronous-warpgroup-level-matrix-shared-memory-layout-matrix-descriptor

The op is in its initial phase and has limitations: it supports only 128b swizzling and does not support interleaving. Other descriptor calculations will be added in separate follow-up patches.

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D157382
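
For readers unfamiliar with the descriptor, the following is a minimal C++ sketch of how a 64-bit wgmma matrix descriptor with 128b swizzling could be packed. It is not the lowering added by this patch. The helper names (`encodeField`, `makeWgmmaDescriptor`) and the parameter names are hypothetical. The bit positions, the `(x & 0x3FFFF) >> 4` field encoding, and the swizzle value follow my reading of the PTX documentation linked in the commit message and should be checked against it.

```cpp
#include <cstdint>

// Hypothetical helper: addresses and byte offsets are stored in 14-bit
// fields with the 4 least significant bits dropped (assumed from the PTX docs).
constexpr std::uint64_t encodeField(std::uint64_t x) {
  return (x & 0x3FFFF) >> 4;
}

// Assumed encoding of the swizzle-mode field for 128-byte swizzling.
constexpr std::uint64_t kSwizzle128B = 1;

// Hypothetical helper that packs the descriptor fields into one 64-bit value.
constexpr std::uint64_t makeWgmmaDescriptor(std::uint64_t smemAddr,
                                            std::uint64_t leadingByteOffset,
                                            std::uint64_t strideByteOffset,
                                            std::uint64_t baseOffset = 0) {
  std::uint64_t desc = 0;
  desc |= encodeField(smemAddr);                 // bits [0, 14): start address
  desc |= encodeField(leadingByteOffset) << 16;  // bits [16, 30): leading dimension offset
  desc |= encodeField(strideByteOffset) << 32;   // bits [32, 46): stride dimension offset
  desc |= (baseOffset & 0x7) << 49;              // bits [49, 52): base offset
  desc |= kSwizzle128B << 62;                    // bits [62, 64): swizzle mode
  return desc;
}
```

For example, `makeWgmmaDescriptor(0x1000, 16, 1024)` stores `0x100` in the address field, `1` in the leading-dimension field, and `64` in the stride field. Interleaving and the other swizzle modes are left out, matching the limitations stated in the commit message.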
Diffstat (limited to 'clang/lib/ExtractAPI/Serialization/SymbolGraphSerializer.cpp')
0 files changed, 0 insertions, 0 deletions