diff options
author | Diana Picus <Diana-Magda.Picus@amd.com> | 2025-07-21 10:39:09 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-07-21 10:39:09 +0200 |
commit | 20d8398825a799008ae508d8463dbb9b11df81e7 (patch) | |
tree | a77c9a724dbd4a3bd37e21653ac3f21e20dc21c0 /llvm/tools/llvm-objdump/llvm-objdump.cpp | |
parent | e87d3904f693b9e13c54b87d0f2b749e1d818809 (diff) | |
download | llvm-20d8398825a799008ae508d8463dbb9b11df81e7.zip llvm-20d8398825a799008ae508d8463dbb9b11df81e7.tar.gz llvm-20d8398825a799008ae508d8463dbb9b11df81e7.tar.bz2 |
[AMDGPU] ISel & PEI for whole wave functions (#145858)
Whole wave functions are functions that will run with a full EXEC mask.
They will not be invoked directly, but instead will be launched by way
of a new intrinsic, `llvm.amdgcn.call.whole.wave` (to be added in
a future patch). These functions are meant as an alternative to the
`llvm.amdgcn.init.whole.wave` or `llvm.amdgcn.strict.wwm` intrinsics.
Whole wave functions will set EXEC to -1 in the prologue and restore the
original value of EXEC in the epilogue. They must have a special first
argument, `i1 %active`, that is going to be mapped to EXEC. They may
have either the default calling convention or amdgpu_gfx. The inactive
lanes need to be preserved for all registers used, active lanes only for
the CSRs.
At the IR level, arguments to a whole wave function (other than
`%active`) contain poison in their inactive lanes. Likewise, the return
value for the inactive lanes is poison.
This patch contains the following work:
* 2 new pseudos, SI_SETUP_WHOLE_WAVE_FUNC and SI_WHOLE_WAVE_FUNC_RETURN
used for managing the EXEC mask. SI_SETUP_WHOLE_WAVE_FUNC will return
a SReg_1 representing `%active`, which needs to be passed into
SI_WHOLE_WAVE_FUNC_RETURN.
* SelectionDAG support for generating these 2 new pseudos and the
special handling of %active. Since the return may be in a different
basic block, it's difficult to add the virtual reg for %active to
SI_WHOLE_WAVE_FUNC_RETURN, so we initially generate an IMPLICIT_DEF
which is later replaced via a custom inserter.
* Expansion of the 2 pseudos during prolog/epilog insertion. PEI also
marks any used VGPRs as WWM registers, which are then spilled and
restored with the usual logic.
Future patches will include the `llvm.amdgcn.call.whole.wave` intrinsic
and a lot of optimization work (especially in order to reduce spills
around function calls).
---------
Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
Co-authored-by: Shilei Tian <i@tianshilei.me>
Diffstat (limited to 'llvm/tools/llvm-objdump/llvm-objdump.cpp')
0 files changed, 0 insertions, 0 deletions