path: root/clang/lib/AST/ByteCode/Compiler.cpp
author: Durgadoss R <durgadossr@nvidia.com> 2025-05-15 16:08:01 +0530
committer: GitHub <noreply@github.com> 2025-05-15 16:08:01 +0530
commit: c507a0830df2e4fd0c234eee035aac2109de6d6e
tree: 98cda74dc70e29430000ac58897f8a1488a22fb0 /clang/lib/AST/ByteCode/Compiler.cpp
parent: d5da557782dd47395fb41e03d7663df6319d7ea6
[NVPTX] Add TMA Bulk Copy Intrinsics (#138679)

This patch adds a new variant of TMA Bulk Copy intrinsics introduced in sm100+. This variant has an additional byte_mask to select the bytes for the copy operation.

* Selection is all done through table-gen now. So, this patch removes the corresponding SelectCpAsyncBulkS2G() function.
* lit tests are verified with a cuda-12.8 ptxas executable.

PTX Spec link: https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-bulk-copy

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
Diffstat (limited to 'clang/lib/AST/ByteCode/Compiler.cpp')
0 files changed, 0 insertions, 0 deletions