author | Pradeep Kumar <pradeepku@nvidia.com> | 2025-03-17 20:44:52 +0530 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-03-17 20:44:52 +0530 |
commit | 52e7ca9279b4cbe30cacca67548347ef5f96b120 (patch) | |
tree | 6cfe601d068211fd6f055c30879c69990f896d59 /llvm/lib/Transforms/Utils/LoopUtils.cpp | |
parent | 269c40fafc80576ab4efcd7fba954fd5588ea118 (diff) | |
[LLVM][NVPTX] Add support for ldmatrix extensions introduced in PTX 8.6 (#124899)
This commit adds support for the following ldmatrix extensions
introduced in PTX 8.6:
- Support for m16n16 with b8 type with mandatory transpose
- Support for m16n16 and m8n16 with source and destination formats
The above extensions are only supported on sm_100a, sm_101a, and sm_120a.
Please refer to the PTX ISA for more information:
https://docs.nvidia.com/cuda/parallel-thread-execution/#warp-level-matrix-instructions-ldmatrix
Diffstat (limited to 'llvm/lib/Transforms/Utils/LoopUtils.cpp')
0 files changed, 0 insertions, 0 deletions