path: root/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
author    Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>  2021-12-14 18:47:09 +0000
committer Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>  2021-12-14 20:12:23 +0000
commit    c57b2a0635df9eae0b1d699f83b9b158d5a89135 (patch)
tree      64e1e8d61b62043cf2990417e0039ac2a05c8e5d /llvm/lib/Bitcode/Reader/BitcodeReader.cpp
parent    100863ccd8d41091f90749ba76d91f6dfafdde57 (diff)
[MLIR][GPU] Make max flat work group size for ROCDL kernels configurable
While the default value for the amdgpu-flat-work-group-size attribute, "1, 256", matches the defaults from Clang, some users of the ROCDL dialect, namely TensorFlow, use larger workgroups, such as 1024. Therefore, instead of hardcoding this value, we add a rocdl.max_flat_work_group_size attribute that can be set on GPU kernels to override the default value.

Reviewed By: whchung

Differential Revision: https://reviews.llvm.org/D115741
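A minimal sketch of how a kernel might opt into a larger workgroup via the new attribute. Only the attribute name rocdl.max_flat_work_group_size and the default "1, 256" come from the commit message; the module name, kernel name, body, and the integer attribute's type are illustrative assumptions:

```mlir
gpu.module @kernels {
  // Hypothetical kernel. The attribute below asks the GPU-to-ROCDL lowering
  // to use a max flat work group size of 1024 instead of the default 256.
  // (The exact integer type of the attribute is an assumption here.)
  gpu.func @example_kernel() kernel
      attributes {rocdl.max_flat_work_group_size = 1024 : index} {
    gpu.return
  }
}
```

Kernels that omit the attribute would keep the previous hardcoded default behavior.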
Diffstat (limited to 'llvm/lib/Bitcode/Reader/BitcodeReader.cpp')
0 files changed, 0 insertions, 0 deletions