path: root/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
author Srinivasa Ravi <srinivasar@nvidia.com> 2025-01-29 10:57:51 +0530
committer GitHub <noreply@github.com> 2025-01-29 10:57:51 +0530
commit d4159e2a1d1d640077b2e5cde66b0a284049955f
tree c698938b09661761a9abb33d17d9ef539394b409 /llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
parent ab6d41eae12b29bcb99ed66ca8edb48296ccfe42
[MLIR][NVVM] Add support for griddepcontrol Ops (#124603)

Adds `griddepcontrol.wait` and `griddepcontrol.launch.dependents` MLIR Ops to generate griddepcontrol instructions.

`griddepcontrol` - Allows dependent and prerequisite grids, as defined by the runtime, to control execution in the following ways:
- `griddepcontrol.wait` - causes the executing thread to wait until all prerequisite grids in flight have completed and all the memory operations from the prerequisite grids are performed and made visible to the current grid.
- `griddepcontrol.launch.dependents` - signals that specific dependents the runtime system designated to react to this instruction can be scheduled as soon as all other CTAs in the grid issue the same instruction or have completed.

PTX Spec Reference: https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-griddepcontrol
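
For illustration only, the two instructions described above can be exercised from CUDA C++ with inline PTX. This is a minimal sketch and not part of this commit: the kernel names `producer` and `consumer`, the buffer sizes, and the launch shape are all hypothetical. It assumes an sm_90 or newer target for the PTX path, and it uses the PTX spelling `griddepcontrol.launch_dependents` rather than the MLIR Op name. The consumer is launched with the programmatic stream serialization attribute so that the runtime may schedule it before the producer grid has fully finished.

```cuda
#include <cuda_runtime.h>

// Hypothetical prerequisite grid: writes buf, then signals that dependent
// grids may be scheduled once every CTA has issued the instruction or exited.
__global__ void producer(float *buf, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) buf[i] = 2.0f * i;
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 900
  asm volatile("griddepcontrol.launch_dependents;");
#endif
}

// Hypothetical dependent grid: waits until prerequisite grids have completed
// and their memory operations are visible before reading buf.
__global__ void consumer(const float *buf, float *out, int n) {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 900
  asm volatile("griddepcontrol.wait;" ::: "memory");
#endif
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) out[i] = buf[i] + 1.0f;
}

int main() {
  const int n = 1024, threads = 256, blocks = n / threads;
  float *buf = nullptr, *out = nullptr;
  cudaMalloc(&buf, n * sizeof(float));
  cudaMalloc(&out, n * sizeof(float));

  cudaStream_t stream;
  cudaStreamCreate(&stream);

  producer<<<blocks, threads, 0, stream>>>(buf, n);

  // Allow the consumer to be scheduled early relative to the producer
  // in the same stream (programmatic dependent launch).
  cudaLaunchAttribute attr;
  attr.id = cudaLaunchAttributeProgrammaticStreamSerialization;
  attr.val.programmaticStreamSerializationAllowed = 1;

  cudaLaunchConfig_t cfg = {};
  cfg.gridDim = dim3(blocks);
  cfg.blockDim = dim3(threads);
  cfg.dynamicSmemBytes = 0;
  cfg.stream = stream;
  cfg.attrs = &attr;
  cfg.numAttrs = 1;

  cudaLaunchKernelEx(&cfg, consumer, (const float *)buf, out, n);

  cudaStreamSynchronize(stream);
  cudaStreamDestroy(stream);
  cudaFree(buf);
  cudaFree(out);
  return 0;
}
```

On targets older than sm_90 the guarded instructions compile away and ordinary stream ordering applies, so the sketch stays correct in both cases.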
Diffstat (limited to 'llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp')
0 files changed, 0 insertions, 0 deletions