path: root/clang/lib/Serialization/ModuleManager.cpp
author: Changpeng Fang <changpeng.fang@amd.com> 2022-09-20 17:25:52 -0700
committer: Changpeng Fang <changpeng.fang@amd.com> 2022-09-20 17:25:52 -0700
commit: 3ae4c3589ec7336d363fc1779c4a99360164c8f4
tree: 3f9890c66762823d3bbc6ae991232f5a1ecc5154 /clang/lib/Serialization/ModuleManager.cpp
parent: 2d3b54feb25b36f8c76333718adb17e47d9953ca
AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup-size=true
Summary:
Under code object version 5, ockl_get_local_size returns the value computed by the expression:

  workgroup_id < hidden_block_count ? hidden_group_size : hidden_remainder

For functions with the attribute uniform-work-group-size=true, we can evaluate workgroup_id < hidden_block_count as true, so ockl_get_local_size returns hidden_group_size. With uniform-work-group-size=true, this work also sets all remainders to zero and, if reqd_work_group_size is present, sets the work-group size to the required value from the metadata.

Reviewers: arsenm and bcahoon

Differential Revision: https://reviews.llvm.org/D131276
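
The folding described in the summary can be illustrated with a small standalone sketch. This is not the code from the patch: the struct HiddenArgs, the function names localSizeGeneral and localSizeUniform, and the 32-bit field types are hypothetical stand-ins chosen for illustration, and only a single dimension is modeled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the hidden kernel arguments for one dimension.
// Field names follow the summary; the types are an assumption.
struct HiddenArgs {
  std::uint32_t block_count; // number of full-size workgroups
  std::uint32_t group_size;  // size of a full workgroup
  std::uint32_t remainder;   // size of the trailing partial workgroup
};

// General form: the expression quoted in the summary.
constexpr std::uint32_t localSizeGeneral(std::uint32_t workgroup_id,
                                         const HiddenArgs &a) {
  return workgroup_id < a.block_count ? a.group_size : a.remainder;
}

// With uniform-work-group-size=true the comparison is taken to be true,
// so the select collapses to the group size and the remainder is unused.
constexpr std::uint32_t localSizeUniform(const HiddenArgs &a) {
  return a.group_size;
}
```

For a uniform launch every workgroup id is below block_count, so both functions agree for all workgroups; they differ only for a trailing partial workgroup, which a uniform launch does not have.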
Diffstat (limited to 'clang/lib/Serialization/ModuleManager.cpp')
0 files changed, 0 insertions, 0 deletions