path: root/clang/lib/Frontend/CompilerInvocation.cpp
author: Kirill Vedernikov <kvedernikov@nvidia.com> 2025-10-06 10:21:49 +0200
committer: GitHub <noreply@github.com> 2025-10-06 13:51:49 +0530
commit: bd8a7f9ef394c7f722fc8ae3f852311550669e56
tree: a0cee32151966eadd399b98798e6262200f5db24 /clang/lib/Frontend/CompilerInvocation.cpp
parent: e573c795e4938440aa1ddb0371568be69eb08390
[NVPTX] Added more MMA intrinsics for F8F6F4 and FP64 types. (#156040)
This change adds more MMA intrinsics for F8F6F4 and FP64 types. The implementation is based on [PTX ISA version 9.0](https://docs.nvidia.com/cuda/parallel-thread-execution/#warp-level-matrix-instructions-mma). New restrictions were added for dtype/ctype combinations for MMA and sparse MMA intrinsics. MLIR restrictions for dtype/ctype MMA intrinsics were aligned with NVVM IR.
Diffstat (limited to 'clang/lib/Frontend/CompilerInvocation.cpp')
0 files changed, 0 insertions, 0 deletions