aboutsummaryrefslogtreecommitdiff
path: root/clang/lib/CodeGen/CodeGenModule.h
diff options
context:
space:
mode:
authorJohannes Doerfert <johannes@jdoerfert.de>2023-03-02 18:35:15 -0800
committerJohannes Doerfert <johannes@jdoerfert.de>2023-07-24 22:04:45 -0700
commitef9ec4bbcca2fa4f64df47bc426f1d1c59ea47e2 (patch)
treeb3f8edde0b515fde75774548bf3dbb07eb98e640 /clang/lib/CodeGen/CodeGenModule.h
parentfb2a971c015fa991b47aa8d93bd97379c012cb68 (diff)
downloadllvm-ef9ec4bbcca2fa4f64df47bc426f1d1c59ea47e2.zip
llvm-ef9ec4bbcca2fa4f64df47bc426f1d1c59ea47e2.tar.gz
llvm-ef9ec4bbcca2fa4f64df47bc426f1d1c59ea47e2.tar.bz2
[OpenMP] Add the `ompx_attribute` clause for target directives
CUDA and HIP have kernel attributes to tune the code generation (in the backend). To reuse this functionality for OpenMP target regions we introduce the `ompx_attribute` clause that takes these kernel attributes and emits code as if they had been attached to the kernel fuction (which is implicitly generated). To limit the impact, we only support three kernel attributes: `amdgpu_waves_per_eu`, for AMDGPU `amdgpu_flat_work_group_size`, for AMDGPU `launch_bounds`, for NVPTX The existing implementations of those attributes are used for error checking and code generation. `ompx_attribute` can be attached to any executable target region and it can hold more than one kernel attribute. Differential Revision: https://reviews.llvm.org/D156184
Diffstat (limited to 'clang/lib/CodeGen/CodeGenModule.h')
-rw-r--r--clang/lib/CodeGen/CodeGenModule.h15
1 files changed, 15 insertions, 0 deletions
diff --git a/clang/lib/CodeGen/CodeGenModule.h b/clang/lib/CodeGen/CodeGenModule.h
index 05cb217..f5fd944 100644
--- a/clang/lib/CodeGen/CodeGenModule.h
+++ b/clang/lib/CodeGen/CodeGenModule.h
@@ -1557,6 +1557,21 @@ public:
/// because we'll lose all important information after each repl.
void moveLazyEmissionStates(CodeGenModule *NewBuilder);
+ /// Emit the IR encoding to attach the CUDA launch bounds attribute to \p F.
+ void handleCUDALaunchBoundsAttr(llvm::Function *F,
+ const CUDALaunchBoundsAttr *A);
+
+ /// Emit the IR encoding to attach the AMD GPU flat-work-group-size attribute
+ /// to \p F. Alternatively, the work group size can be taken from a \p
+ /// ReqdWGS.
+ void handleAMDGPUFlatWorkGroupSizeAttr(
+ llvm::Function *F, const AMDGPUFlatWorkGroupSizeAttr *A,
+ const ReqdWorkGroupSizeAttr *ReqdWGS = nullptr);
+
+ /// Emit the IR encoding to attach the AMD GPU waves-per-eu attribute to \p F.
+ void handleAMDGPUWavesPerEUAttr(llvm::Function *F,
+ const AMDGPUWavesPerEUAttr *A);
+
private:
llvm::Constant *GetOrCreateLLVMFunction(
StringRef MangledName, llvm::Type *Ty, GlobalDecl D, bool ForVTable,