author | Srinivasa Ravi <srinivasar@nvidia.com> | 2025-02-14 11:11:44 +0530 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-02-14 11:11:44 +0530 |
commit | bd860f986406b6630e49b1836b3c208acd721d3e | |
tree | 73d79c6cf0d0abee39867b70d2079c167fc61e70 /clang/unittests/Interpreter/InterpreterTest.cpp | |
parent | 8a0914c24530c98c5ff65bce3710552ce3ebf7d7 | |
[NVPTX] Add intrinsics for redux.sync f32 instructions (#126664)
Adds NVVM intrinsics, NVPTX codegen, and Clang builtins for the `redux.sync`
f32 instructions introduced in PTX 8.6 for sm_100a.
Tests are added in `CodeGen/NVPTX/redux-sync.ll` and
`CodeGenCUDA/redux-builtins.cu`, and were verified with ptxas 12.8.0.
PTX Spec Reference:
https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-redux-sync
Diffstat (limited to 'clang/unittests/Interpreter/InterpreterTest.cpp')
0 files changed, 0 insertions, 0 deletions