author | Kirill Vedernikov <kvedernikov@nvidia.com> | 2025-10-06 10:21:49 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-10-06 13:51:49 +0530 |
commit | bd8a7f9ef394c7f722fc8ae3f852311550669e56 (patch) | |
tree | a0cee32151966eadd399b98798e6262200f5db24 /clang/lib/Frontend/CompilerInvocation.cpp | |
parent | e573c795e4938440aa1ddb0371568be69eb08390 (diff) | |
[NVPTX] Added more MMA intrinsics for F8F6F4 and FP64 types. (#156040)
This change adds more MMA intrinsics for F8F6F4 and FP64 types. The implementation is based on [PTX ISA version 9.0](https://docs.nvidia.com/cuda/parallel-thread-execution/#warp-level-matrix-instructions-mma). New restrictions were added for dtype/ctype combinations for MMA and sparse MMA intrinsics. MLIR restrictions for dtype/ctype MMA intrinsics were aligned with NVVM IR.
Diffstat (limited to 'clang/lib/Frontend/CompilerInvocation.cpp')
0 files changed, 0 insertions, 0 deletions