author | Diana Picus <Diana-Magda.Picus@amd.com> | 2025-07-21 10:39:09 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-07-21 10:39:09 +0200 |
commit | 20d8398825a799008ae508d8463dbb9b11df81e7 (patch) | |
tree | a77c9a724dbd4a3bd37e21653ac3f21e20dc21c0 /llvm/lib/IR/Verifier.cpp | |
parent | e87d3904f693b9e13c54b87d0f2b749e1d818809 (diff) | |
[AMDGPU] ISel & PEI for whole wave functions (#145858)
Whole wave functions are functions that will run with a full EXEC mask.
They will not be invoked directly, but instead will be launched by way
of a new intrinsic, `llvm.amdgcn.call.whole.wave` (to be added in
a future patch). These functions are meant as an alternative to the
`llvm.amdgcn.init.whole.wave` or `llvm.amdgcn.strict.wwm` intrinsics.
Whole wave functions will set EXEC to -1 in the prologue and restore the
original value of EXEC in the epilogue. They must have a special first
argument, `i1 %active`, that is going to be mapped to EXEC. They may
have either the default calling convention or amdgpu_gfx. Inactive lanes
must be preserved for every register the function uses; active lanes
need to be preserved only for the callee-saved registers (CSRs).
At the IR level, arguments to a whole wave function (other than
`%active`) contain poison in their inactive lanes. Likewise, the return
value for the inactive lanes is poison.
This patch contains the following work:
* 2 new pseudos, SI_SETUP_WHOLE_WAVE_FUNC and SI_WHOLE_WAVE_FUNC_RETURN
used for managing the EXEC mask. SI_SETUP_WHOLE_WAVE_FUNC will return
a SReg_1 representing `%active`, which needs to be passed into
SI_WHOLE_WAVE_FUNC_RETURN.
* SelectionDAG support for generating these 2 new pseudos and the
special handling of %active. Since the return may be in a different
basic block, it's difficult to add the virtual reg for %active to
SI_WHOLE_WAVE_FUNC_RETURN, so we initially generate an IMPLICIT_DEF
which is later replaced via a custom inserter.
* Expansion of the 2 pseudos during prolog/epilog insertion. PEI also
marks any used VGPRs as WWM registers, which are then spilled and
restored with the usual logic.
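
The EXEC handling described for the two pseudos can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the backend's expansion: `Wave`, `setupWholeWaveFunc` and `wholeWaveFuncReturn` are hypothetical names, a 64-lane wave is assumed, and the spilling of WWM registers is left out entirely.

```cpp
#include <cstdint>

// Hypothetical model of a wave's EXEC mask: one bit per lane (wave64 assumed).
struct Wave {
  uint64_t exec;
};

// Models the role of SI_SETUP_WHOLE_WAVE_FUNC: remember the incoming EXEC
// (the value that %active is mapped to), then enable every lane.
inline uint64_t setupWholeWaveFunc(Wave &w) {
  uint64_t active = w.exec;
  w.exec = ~uint64_t{0}; // EXEC = -1
  return active;
}

// Models the role of SI_WHOLE_WAVE_FUNC_RETURN: restore the EXEC value that
// the setup step handed back.
inline void wholeWaveFuncReturn(Wave &w, uint64_t active) {
  w.exec = active;
}
```

In this model the value returned by the setup step must reach the return step unchanged, which mirrors why the real `%active` register has to be threaded into SI_WHOLE_WAVE_FUNC_RETURN even when the return sits in a different basic block.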
Future patches will include the `llvm.amdgcn.call.whole.wave` intrinsic
and a lot of optimization work (especially in order to reduce spills
around function calls).
---------
Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
Co-authored-by: Shilei Tian <i@tianshilei.me>
Diffstat (limited to 'llvm/lib/IR/Verifier.cpp')
-rw-r--r-- | llvm/lib/IR/Verifier.cpp | 10 |
1 files changed, 10 insertions, 0 deletions
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 9bd573e..e7b491e 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -2979,6 +2979,16 @@ void Verifier::visitFunction(const Function &F) {
           "perfect forwarding!", &F);
     break;
+  case CallingConv::AMDGPU_Gfx_WholeWave:
+    Check(!F.arg_empty() && F.arg_begin()->getType()->isIntegerTy(1),
+          "Calling convention requires first argument to be i1", &F);
+    Check(!F.arg_begin()->hasInRegAttr(),
+          "Calling convention requires first argument to not be inreg", &F);
+    Check(!F.isVarArg(),
+          "Calling convention does not support varargs or "
+          "perfect forwarding!",
+          &F);
+    break;
   }
 
   // Check that the argument values match the function type for this function...
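
The three checks added for `CallingConv::AMDGPU_Gfx_WholeWave` can be exercised outside LLVM with a small model. This is a sketch only: `ArgInfo`, `FuncInfo` and `firstWholeWaveCCError` are hypothetical stand-ins rather than LLVM APIs, and the model assumes the verifier's `Check` macro stops at the first failing condition, so it reports a single message.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the properties of llvm::Argument used by the checks.
struct ArgInfo {
  bool isI1;  // argument type is i1
  bool inReg; // argument carries the inreg attribute
};

// Hypothetical stand-in for the properties of llvm::Function used by the checks.
struct FuncInfo {
  std::vector<ArgInfo> args;
  bool isVarArg = false;
};

// Returns the message of the first failed check, or an empty string if the
// signature is acceptable. The order matches the checks in the patch.
std::string firstWholeWaveCCError(const FuncInfo &f) {
  if (f.args.empty() || !f.args.front().isI1)
    return "Calling convention requires first argument to be i1";
  if (f.args.front().inReg)
    return "Calling convention requires first argument to not be inreg";
  if (f.isVarArg)
    return "Calling convention does not support varargs or perfect forwarding!";
  return "";
}
```

A function whose first parameter is a plain `i1` and that is not variadic passes all three checks; dropping the parameter, marking it `inreg`, or making the function variadic produces the corresponding message.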