author | Durgadoss R <durgadossr@nvidia.com> | 2024-11-15 11:22:48 +0530 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-11-15 11:22:48 +0530 |
commit | 1b23ebe0770aaf85f37e085b53067066d2d99cc8 (patch) | |
tree | 0fde912d07e66e158ea9fe9f699c45a0f2a0dc19 /llvm/lib/CodeGen/MachineModuleSlotTracker.cpp | |
parent | 7b54976d11a5fc6aa1f22e9d96bcb4c81bbf2abf (diff) | |
[MLIR][NVVM] Add Op for TMA Prefetch (#116232)
PR #115527 adds intrinsics for TMA prefetch.
This patch adds the corresponding NVVM Dialect Op.
It also adds lit tests that verify the lowering to LLVM intrinsics,
as well as verifier tests for invalid cases.
PTX Spec reference:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-cp-async-bulk-prefetch-tensor
Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
Diffstat (limited to 'llvm/lib/CodeGen/MachineModuleSlotTracker.cpp')
0 files changed, 0 insertions, 0 deletions