author | Diana Picus <Diana-Magda.Picus@amd.com> | 2025-08-15 10:12:47 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-08-15 10:12:47 +0200 |
commit | ac005e16f617451ad2dc0c794661159cb8111f72 (patch) | |
tree | e9f8ad6b910ca90dacbb7aaf2b3c170b7303f1cb /llvm/lib/IR/Verifier.cpp | |
parent | fdd2d4df1212ef6b7c8e0dfbba8f2a24343d2d9d (diff) | |

Reapply "[AMDGPU] Intrinsic for launching whole wave functions" (#153584)

This reverts commit 14cd1339318b16e08c1363ec6896bd7d1e4ae281. The
buildbot failure seems to have been a cmake issue which has been
discussed in more detail in this Discourse post:
https://discourse.llvm.org/t/cmake-doesnt-regenerate-all-tablegen-target-files/87901

If any buildbots fail to select arbitrary intrinsics with this patch,
it's worth considering using clean builds with ccache instead of
incremental builds, as recommended here:
https://llvm.org/docs/HowToAddABuilder.html#:~:text=Use%20CCache%20and%20NOT%20incremental%20builds

The original commit message for this patch:

Add the llvm.amdgcn.call.whole.wave intrinsic for calling whole wave
functions. This will take as its first argument the callee with the
amdgpu_gfx_whole_wave calling convention, followed by the call
parameters which must match the signature of the callee except for the
first function argument (the i1 original EXEC mask, which doesn't need
to be passed in). Indirect calls are not allowed.

Make direct calls to amdgpu_gfx_whole_wave functions a verifier error.

Tail calls are handled in a future patch.

Diffstat (limited to 'llvm/lib/IR/Verifier.cpp')
-rw-r--r-- | llvm/lib/IR/Verifier.cpp | 30 |
1 files changed, 30 insertions, 0 deletions
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 1d3c379..5a93228 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -6639,6 +6639,36 @@ void Verifier::visitIntrinsicCall(Intrinsic::ID ID, CallBase &Call) {
           "Value for inactive lanes must be a VGPR function argument", &Call);
     break;
   }
+  case Intrinsic::amdgcn_call_whole_wave: {
+    auto F = dyn_cast<Function>(Call.getArgOperand(0));
+    Check(F, "Indirect whole wave calls are not allowed", &Call);
+
+    CallingConv::ID CC = F->getCallingConv();
+    Check(CC == CallingConv::AMDGPU_Gfx_WholeWave,
+          "Callee must have the amdgpu_gfx_whole_wave calling convention",
+          &Call);
+
+    Check(!F->isVarArg(), "Variadic whole wave calls are not allowed", &Call);
+
+    Check(Call.arg_size() == F->arg_size(),
+          "Call argument count must match callee argument count", &Call);
+
+    // The first argument of the call is the callee, and the first argument of
+    // the callee is the active mask. The rest of the arguments must match.
+    Check(F->arg_begin()->getType()->isIntegerTy(1),
+          "Callee must have i1 as its first argument", &Call);
+    for (auto [CallArg, FuncArg] :
+         drop_begin(zip_equal(Call.args(), F->args()))) {
+      Check(CallArg->getType() == FuncArg.getType(),
+            "Argument types must match", &Call);
+
+      // Check that inreg attributes match between call site and function
+      Check(Call.paramHasAttr(FuncArg.getArgNo(), Attribute::InReg) ==
+                FuncArg.hasInRegAttr(),
+            "Argument inreg attributes must match", &Call);
+    }
+    break;
+  }
   case Intrinsic::amdgcn_s_prefetch_data: {
     Check(
         AMDGPU::isFlatGlobalAddrSpace(
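
The order of the checks in the hunk above can be easier to follow outside the verifier. The following is a minimal standalone sketch of the same rules, not the LLVM implementation: it uses no LLVM headers, and `CallConv`, `Param`, `Callee`, `WholeWaveCall` and `verifyWholeWaveCall` are hypothetical names made up for illustration. Types are modelled as plain strings, and the call's argument list is assumed to exclude the callee operand, which is why the count check adds one.

```cpp
// Standalone model of the amdgcn_call_whole_wave verifier rules.
// Every name in this file is hypothetical; nothing here is LLVM API.
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

enum class CallConv { C, AMDGPU_Gfx_WholeWave };

struct Param {
  std::string Type;   // e.g. "i1", "i32"
  bool InReg = false; // models the inreg attribute
};

struct Callee {
  CallConv CC = CallConv::C;
  bool IsVarArg = false;
  std::vector<Param> Params; // Params[0] is expected to be the i1 EXEC mask
};

struct WholeWaveCall {
  const Callee *Target = nullptr; // nullptr models an indirect call
  std::vector<Param> Args;        // arguments after the callee operand
};

// Returns the first failing diagnostic, or std::nullopt if the call is valid.
std::optional<std::string> verifyWholeWaveCall(const WholeWaveCall &Call) {
  const Callee *F = Call.Target;
  if (!F)
    return std::string("Indirect whole wave calls are not allowed");
  if (F->CC != CallConv::AMDGPU_Gfx_WholeWave)
    return std::string(
        "Callee must have the amdgpu_gfx_whole_wave calling convention");
  if (F->IsVarArg)
    return std::string("Variadic whole wave calls are not allowed");

  // The intrinsic call carries the callee as operand 0 and omits the i1 mask,
  // so (callee operand + Args) must equal the callee's parameter count.
  if (Call.Args.size() + 1 != F->Params.size())
    return std::string("Call argument count must match callee argument count");
  assert(!F->Params.empty() && "count check guarantees at least one param");

  if (F->Params[0].Type != "i1")
    return std::string("Callee must have i1 as its first argument");

  // Skip the mask: call argument I-1 pairs with callee parameter I.
  for (std::size_t I = 1; I < F->Params.size(); ++I) {
    const Param &CallArg = Call.Args[I - 1];
    const Param &FuncArg = F->Params[I];
    if (CallArg.Type != FuncArg.Type)
      return std::string("Argument types must match");
    if (CallArg.InReg != FuncArg.InReg)
      return std::string("Argument inreg attributes must match");
  }
  return std::nullopt;
}
```

In this model, a callee with parameters `(i1, i32 inreg)` accepts a call that passes a single `i32 inreg` argument, while dropping `inreg` at the call site or passing a null target yields the corresponding diagnostic. Unlike the real `Check` macro, which reports the failure through the verifier and returns from the visitor, this sketch simply returns the message.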