riscv-gnu-toolchain/llvm.git

author	Kirill Vedernikov <kvedernikov@nvidia.com>	2025-10-06 10:21:49 +0200
committer	GitHub <noreply@github.com>	2025-10-06 13:51:49 +0530
commit	bd8a7f9ef394c7f722fc8ae3f852311550669e56 (patch)
tree	a0cee32151966eadd399b98798e6262200f5db24 /clang/lib/Frontend/CompilerInvocation.cpp
parent	e573c795e4938440aa1ddb0371568be69eb08390 (diff)

[NVPTX] Added more MMA intrinsics for F8F6F4 and FP64 types. (#156040)

This change adds more MMA intrinsics for F8F6F4 and FP64 types. The implementation is based on [PTX ISA version 9.0](https://docs.nvidia.com/cuda/parallel-thread-execution/#warp-level-matrix-instructions-mma). New restrictions were added for dtype/ctype combinations for MMA and sparse MMA intrinsics. MLIR restrictions for dtype/ctype MMA intrinsics were aligned with NVVM IR.

Diffstat (limited to 'clang/lib/Frontend/CompilerInvocation.cpp')

0 files changed, 0 insertions, 0 deletions

