aboutsummaryrefslogtreecommitdiff
path: root/llvm/tools/llvm-objdump/llvm-objdump.cpp
diff options
context:
space:
mode:
authorDiana Picus <Diana-Magda.Picus@amd.com>2025-07-21 10:39:09 +0200
committerGitHub <noreply@github.com>2025-07-21 10:39:09 +0200
commit20d8398825a799008ae508d8463dbb9b11df81e7 (patch)
treea77c9a724dbd4a3bd37e21653ac3f21e20dc21c0 /llvm/tools/llvm-objdump/llvm-objdump.cpp
parente87d3904f693b9e13c54b87d0f2b749e1d818809 (diff)
downloadllvm-20d8398825a799008ae508d8463dbb9b11df81e7.zip
llvm-20d8398825a799008ae508d8463dbb9b11df81e7.tar.gz
llvm-20d8398825a799008ae508d8463dbb9b11df81e7.tar.bz2
[AMDGPU] ISel & PEI for whole wave functions (#145858)
Whole wave functions are functions that will run with a full EXEC mask. They will not be invoked directly, but instead will be launched by way of a new intrinsic, `llvm.amdgcn.call.whole.wave` (to be added in a future patch). These functions are meant as an alternative to the `llvm.amdgcn.init.whole.wave` or `llvm.amdgcn.strict.wwm` intrinsics. Whole wave functions will set EXEC to -1 in the prologue and restore the original value of EXEC in the epilogue. They must have a special first argument, `i1 %active`, that is going to be mapped to EXEC. They may have either the default calling convention or amdgpu_gfx. The inactive lanes need to be preserved for all registers used, active lanes only for the CSRs. At the IR level, arguments to a whole wave function (other than `%active`) contain poison in their inactive lanes. Likewise, the return value for the inactive lanes is poison. This patch contains the following work: * 2 new pseudos, SI_SETUP_WHOLE_WAVE_FUNC and SI_WHOLE_WAVE_FUNC_RETURN used for managing the EXEC mask. SI_SETUP_WHOLE_WAVE_FUNC will return a SReg_1 representing `%active`, which needs to be passed into SI_WHOLE_WAVE_FUNC_RETURN. * SelectionDAG support for generating these 2 new pseudos and the special handling of %active. Since the return may be in a different basic block, it's difficult to add the virtual reg for %active to SI_WHOLE_WAVE_FUNC_RETURN, so we initially generate an IMPLICIT_DEF which is later replaced via a custom inserter. * Expansion of the 2 pseudos during prolog/epilog insertion. PEI also marks any used VGPRs as WWM registers, which are then spilled and restored with the usual logic. Future patches will include the `llvm.amdgcn.call.whole.wave` intrinsic and a lot of optimization work (especially in order to reduce spills around function calls). --------- Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com> Co-authored-by: Shilei Tian <i@tianshilei.me>
Diffstat (limited to 'llvm/tools/llvm-objdump/llvm-objdump.cpp')
0 files changed, 0 insertions, 0 deletions