author | Srinivasa Ravi <srinivasar@nvidia.com> | 2025-01-29 10:57:51 +0530 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-01-29 10:57:51 +0530 |
commit | d4159e2a1d1d640077b2e5cde66b0a284049955f (patch) | |
tree | c698938b09661761a9abb33d17d9ef539394b409 /llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp | |
parent | ab6d41eae12b29bcb99ed66ca8edb48296ccfe42 (diff) | |
[MLIR][NVVM] Add support for griddepcontrol Ops (#124603)

Adds `griddepcontrol.wait` and `griddepcontrol.launch.dependents`
MLIR Ops to generate griddepcontrol instructions.

`griddepcontrol` - Allows dependent and prerequisite grids, as defined by
the runtime, to control execution in the following ways:

- `griddepcontrol.wait` - causes the executing thread to wait until all
  prerequisite grids in flight have completed and all the memory
  operations from the prerequisite grids are performed and made visible
  to the current grid.
- `griddepcontrol.launch.dependents` - signals that specific dependents
  the runtime system designated to react to this instruction can be
  scheduled as soon as all other CTAs in the grid issue the same
  instruction or have completed.

PTX Spec Reference:
https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-griddepcontrol
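
A minimal sketch of how the two Ops might appear in NVVM dialect IR. It assumes they are spelled with the `nvvm.` dialect prefix and take no operands or results; the function name `@griddepcontrol_sketch` and the op placement are illustrative only, not taken from this patch.

```mlir
// Hypothetical function; assumes both ops take no operands and return nothing.
llvm.func @griddepcontrol_sketch() {
  // Wait until all prerequisite grids in flight have completed and their
  // memory operations are visible to the current grid.
  nvvm.griddepcontrol.wait

  // ... work that consumes data written by the prerequisite grids ...

  // Signal that the dependents designated by the runtime can be scheduled
  // once every other CTA in the grid issues this op or has completed.
  nvvm.griddepcontrol.launch.dependents
  llvm.return
}
```

Placing the wait before any read of data produced by a prerequisite grid, and the launch signal once the dependents no longer need to be held back, follows the semantics described above. The ordering shown here is one plausible arrangement, not a requirement stated by the patch.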
Diffstat (limited to 'llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp')
0 files changed, 0 insertions, 0 deletions