rocket-tools/riscv-gnu-toolchain/llvm.git

author	Changpeng Fang <changpeng.fang@amd.com>	2022-09-20 17:25:52 -0700
committer	Changpeng Fang <changpeng.fang@amd.com>	2022-09-20 17:25:52 -0700
commit	3ae4c3589ec7336d363fc1779c4a99360164c8f4 (patch)
tree	3f9890c66762823d3bbc6ae991232f5a1ecc5154 /clang/lib/Serialization/ModuleManager.cpp
parent	2d3b54feb25b36f8c76333718adb17e47d9953ca (diff)

AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup-size=true

Summary:
Under code object version 5, ockl_get_local_size returns the value computed by the expression:

    workgroup_id < hidden_block_count ? hidden_group_size : hidden_remainder

For functions with the attribute uniform-work-group-size=true, we can evaluate workgroup_id < hidden_block_count as true, and thus hidden_group_size is returned for ockl_get_local_size.

With uniform-work-group-size=true, this work also sets all remainders to zero, and if there is reqd_work_group_size, we also set the work-group size to the required value from the metadata.

Reviewers: arsenm and bcahoon

Differential Revision: https://reviews.llvm.org/D131276
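The selection described in the summary can be illustrated with a small standalone C++ model. This is only a sketch of the arithmetic, not the code in the patch: the `ImplicitArgs` struct, the function names, and the field widths are hypothetical, and it assumes that `hidden_block_count` counts the full work-groups in one dimension while `hidden_remainder` holds the size of a trailing partial group.

```cpp
#include <cstdint>

// Hypothetical model of the implicit kernel arguments for one dimension.
// Field names follow the commit message; the struct itself is an assumption.
struct ImplicitArgs {
  uint32_t hidden_block_count; // assumed: number of full work-groups
  uint32_t hidden_group_size;  // size of a full work-group
  uint32_t hidden_remainder;   // assumed: size of the trailing partial group
};

// General case: the expression quoted in the commit message.
constexpr uint32_t localSizeGeneral(uint32_t workgroup_id,
                                    const ImplicitArgs &args) {
  return workgroup_id < args.hidden_block_count ? args.hidden_group_size
                                                : args.hidden_remainder;
}

// With uniform-work-group-size=true the comparison is treated as always
// true, so the select collapses to the group size.
constexpr uint32_t localSizeUniform(uint32_t /*workgroup_id*/,
                                    const ImplicitArgs &args) {
  return args.hidden_group_size;
}

// Uniform dispatch: 256 work-items in groups of 64, so no partial group.
constexpr ImplicitArgs UniformDispatch{4, 64, 0};
static_assert(localSizeGeneral(3, UniformDispatch) ==
                  localSizeUniform(3, UniformDispatch),
              "folded form matches the general form for a uniform dispatch");

// Non-uniform dispatch: 200 work-items in groups of 64 leaves a group of 8.
constexpr ImplicitArgs PartialDispatch{3, 64, 8};
static_assert(localSizeGeneral(3, PartialDispatch) == 8,
              "the last group of a non-uniform dispatch uses the remainder");
```

The two forms agree whenever every work-group is full, which is what the attribute asserts. For a dispatch with a partial trailing group the folded form would be wrong, so under this model the fold is only valid when the attribute is present.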

Diffstat (limited to 'clang/lib/Serialization/ModuleManager.cpp')

0 files changed, 0 insertions, 0 deletions

