rocket-tools/riscv-gnu-toolchain/llvm.git

author	Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>	2022-08-29 12:16:52 -0700
committer	Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>	2022-08-30 12:22:08 -0700
commit	fd1f8c85f2c0b989d1ac3a05b920f5b5fa355645 (patch)
tree	04f0b669d3e4b33ac41d1c9a2b86c9967b237fba /clang/lib/Lex/ModuleMap.cpp
parent	c9aba600745131fca4f7333d7c2e21556c2577cc (diff)

[AMDGPU] Limit TID / wavefrontsize uniformness to 1D kernels

If a kernel has uneven dimensions, the value of workitem-id-x divided by the wavefront size can be non-uniform. For example, with dimensions (65, 2), the workitems with IDs (64, 0) and (0, 1) are packed into the same wave, and division by 64 yields 1 and 0 respectively.

Unfortunately, this limits the optimization to OpenCL only, and only if the reqd_work_group_size attribute is set. This patch limits it to 1D kernels. It should also be possible to perform this optimization if the size of the X dimension is a power of 2, but we do not currently have the infrastructure to query it.

Note that the presence of the amdgpu-no-workitem-id-y attribute does not help: it only hints at the lack of a workitem-id-y query, not the absence of an actual second dimension, and therefore affects only SGPR allocation.

Differential Revision: https://reviews.llvm.org/D132879
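The (65, 2) case from the commit message can be checked with a small standalone model. This is only an illustrative sketch, not code from the patch: the function name isQuotientUniformPerWave is hypothetical, and it assumes workitems are linearized X-first (flat id = x + y * sizeX) and packed into waves in that order, which is what the example in the message implies.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of LLVM.
// Returns true if (workitem-id-x / waveSize) has one single value within every
// wave of a sizeX x sizeY workgroup.
// Assumption: workitems are linearized X-first (flat = x + y * sizeX) and
// consecutive flat ids are packed into the same wave.
bool isQuotientUniformPerWave(uint32_t sizeX, uint32_t sizeY, uint32_t waveSize) {
  assert(sizeX > 0 && sizeY > 0 && waveSize > 0);
  const uint32_t total = sizeX * sizeY;
  for (uint32_t waveStart = 0; waveStart < total; waveStart += waveSize) {
    // Quotient seen by the first lane of this wave.
    const uint32_t first = (waveStart % sizeX) / waveSize;
    for (uint32_t flat = waveStart; flat < waveStart + waveSize && flat < total; ++flat) {
      const uint32_t x = flat % sizeX; // workitem-id-x of this lane
      if (x / waveSize != first)
        return false; // two lanes of one wave disagree on the quotient
    }
  }
  return true;
}
```

Under these assumptions, isQuotientUniformPerWave(65, 2, 64) is false because the second wave starts at (64, 0) and continues with (0, 1), while the 1D shape (65, 1) and a power-of-2 X size such as (128, 2) both stay uniform.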

Diffstat (limited to 'clang/lib/Lex/ModuleMap.cpp')

0 files changed, 0 insertions, 0 deletions

