author: Giorgi Gvalia <49309634+gvalson@users.noreply.github.com> 2025-07-07 15:26:16 -0400
committer: GitHub <noreply@github.com> 2025-07-07 15:26:16 -0400
commit: 5110ac4113b5969315a38e0cffe7580a4ca847a1 (patch)
tree: 83e9343c08b7ca3e21c2ba30f8915e3e680b5486 /llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
parent: 3ea636e2f6d6225aa38bab1f8708483dbf6a0a1d (diff)
[Offload] Allow CUDA Kernels to use arbitrarily large shared memory (#145963)
Previously, the user was not able to use more than 48 KB of shared memory on NVIDIA GPUs. Doing so requires setting the function attribute `CU_FUNC_ATTRIBUTE_MAX_DYNAMIC_SHARED_SIZE_BYTES`, which was not set anywhere in the code base. This commit adds the ability to set this attribute, allowing the user to utilize the full power of their GPU.

To avoid resetting the function attribute on each launch of the same kernel, we keep track of the maximum memory limit (as the variable `MaxDynCGroupMemLimit`) and only set the attribute if the requested amount exceeds the limit. By default, this limit is set to 48 KB.

Feedback is greatly appreciated, especially around making the new variable mutable. I did this because the `launchImpl` method is const and I am not able to modify the variable otherwise.

---------

Co-authored-by: Giorgi Gvalia <ggvalia@login33.chn.perlmutter.nersc.gov>
Co-authored-by: Giorgi Gvalia <ggvalia@login07.chn.perlmutter.nersc.gov>
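The caching scheme described in the message can be illustrated with a minimal sketch. This is not the code from the commit: `KernelSketch`, `setMaxDynamicSharedBytes`, and `NumAttributeSets` are hypothetical names, and the helper is a stub standing in for the driver call so the example runs without a GPU. Only `MaxDynCGroupMemLimit` and the const-launch constraint come from the message itself.

```cpp
#include <cstdint>

// Hypothetical stand-in for the CUDA driver call that raises the limit, i.e.
// cuFuncSetAttribute(Func, CU_FUNC_ATTRIBUTE_MAX_DYNAMIC_SHARED_SIZE_BYTES, Bytes).
// It only counts invocations, so the sketch needs no CUDA toolkit.
static int NumAttributeSets = 0;

static bool setMaxDynamicSharedBytes(uint32_t Bytes) {
  (void)Bytes; // A real implementation would forward this to the driver.
  ++NumAttributeSets;
  return true;
}

// Hypothetical kernel wrapper; the real plugin class is not reproduced here.
struct KernelSketch {
  // Highest dynamic shared memory size configured so far, 48 KB by default.
  // Declared mutable because the launch method is const.
  mutable uint32_t MaxDynCGroupMemLimit = 48u * 1024u;

  bool launch(uint32_t DynCGroupMem) const {
    // Touch the function attribute only when the request exceeds the
    // cached limit; repeated launches at or below it skip the call.
    if (DynCGroupMem > MaxDynCGroupMemLimit) {
      if (!setMaxDynamicSharedBytes(DynCGroupMem))
        return false;
      MaxDynCGroupMemLimit = DynCGroupMem;
    }
    // The actual kernel launch would happen here.
    return true;
  }
};
```

In this sketch, launching with 16 KB leaves the attribute untouched, a first launch with 64 KB raises the limit once, and a later launch with 56 KB reuses the cached 64 KB limit. The `mutable` member is one way to update cached state from a const method; whether that is the right trade-off is the open question the author raises.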
Diffstat (limited to 'llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp')
0 files changed, 0 insertions, 0 deletions