author: Lucas Ramirez <11032120+lucas-rami@users.noreply.github.com> 2025-05-01 13:22:23 +0200
committer: GitHub <noreply@github.com> 2025-05-01 13:22:23 +0200
commit: e377dc4d38b69050a3301c68637d1b6dacaee3a9
tree: 30925f848fe55ec04f240157761e6a2bd11fd776 /clang/unittests/Frontend/CompilerInstanceTest.cpp
parent: 212f2456fcde822fad37bfa4e69ced1a51a4c19d
[AMDGPU] Max. WG size-induced occupancy limits max. waves/EU (#137807)
The default maximum waves/EU returned by the `AMDGPUSubtarget::getWavesPerEU` family of methods is currently the maximum number of waves/EU supported by the subtarget (only a valid occupancy range in "amdgpu-waves-per-eu" may lower that maximum). This ignores the maximum achievable occupancy imposed by flat workgroup size and LDS usage, so `AMDGPUSubtarget::getWavesPerEU` can produce a maximum higher than the one from `AMDGPUSubtarget::getOccupancyWithWorkGroupSizes`.

This change limits the maximum of the waves/EU range to the maximum achievable occupancy derived from flat workgroup sizes and LDS usage. It only affects functions that restrict flat workgroup size with "amdgpu-flat-work-group-size", since the default range of flat workgroup sizes achieves the maximum number of waves/EU supported by the subtarget.

Improvements to the handling of "amdgpu-waves-per-eu" are left for a follow-up PR (e.g., I think the attribute should be able to lower the full range of waves/EU produced by these methods).
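The clamping described above can be sketched roughly as follows. This is a toy model, not the code from the patch: `defaultWavesPerEU` and its parameters are hypothetical names, the occupancy bound is passed in as a plain number rather than computed from flat workgroup size and LDS usage, and the lower bound of 1 is an assumption made for illustration.

```cpp
#include <algorithm>
#include <utility>

// Hypothetical helper (not an LLVM API): returns a default {min, max}
// waves/EU range.
//   SubtargetMaxWaves: maximum waves/EU the subtarget supports.
//   MaxOccupancy:      maximum occupancy achievable given the function's
//                      flat workgroup size and LDS usage (assumed to be
//                      computed elsewhere).
std::pair<unsigned, unsigned> defaultWavesPerEU(unsigned SubtargetMaxWaves,
                                                unsigned MaxOccupancy) {
  // The range's maximum may not exceed what occupancy limits allow.
  unsigned Max = std::min(SubtargetMaxWaves, MaxOccupancy);
  // A lower bound of 1 is assumed here purely for illustration.
  return {1u, Max};
}
```

With illustrative numbers, a subtarget supporting 10 waves/EU and a function whose workgroup size caps occupancy at 4 would get a maximum of 4 in this model, while a function with no such restriction would keep the subtarget maximum of 10.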
Diffstat (limited to 'clang/unittests/Frontend/CompilerInstanceTest.cpp')
0 files changed, 0 insertions, 0 deletions