path: root/llvm/lib/CodeGen/TwoAddressInstructionPass.cpp
author: Jay Foad <jay.foad@amd.com> 2021-10-12 15:39:43 +0100
committer: Jay Foad <jay.foad@amd.com> 2021-10-12 16:09:04 +0100
commit: 66e13c7f439cf162d7ed1d25883e71a5755ac7ec
tree: 0cc1f94faba9cefe58a9bad84c76732777ee6dde /llvm/lib/CodeGen/TwoAddressInstructionPass.cpp
parent: 838b4a533e6853d44e0c6d1977bcf0b06557d4ab
[AMDGPU] Enable load clustering in the post-RA scheduler

This has a couple of benefits:
1. It can sometimes fix clusters that got broken apart when the register
   allocator inserted a copy.
2. Post-RA scheduling does not have to worry about increasing register
   pressure, which in some cases gives it more freedom to reorder
   instructions.

Testing on a collection of 10,000 graphics shaders compiled for gfx1010
showed:
- The average length of each run of one or more load instructions
  increased by about 1%.
- The number of runs of two or more load instructions increased by
  about 4%.

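
The commit message does not say how the two run statistics were gathered. As a rough illustration only, the sketch below shows one way such numbers could be computed for a single instruction stream, assuming each instruction has already been classified as a load or not. The function name `load_run_stats` and the boolean-list input are hypothetical and are not part of LLVM or of this commit.

```python
from itertools import groupby


def load_run_stats(is_load):
    """Summarise runs of consecutive loads in one instruction stream.

    is_load: iterable of booleans, one per instruction in program order
             (True means the instruction is a load). This input format is
             an assumption made for illustration.
    Returns (average run length, number of runs of two or more loads).
    """
    # groupby collapses consecutive equal flags into groups; keep only the
    # groups of loads and record how long each one is.
    runs = [sum(1 for _ in group) for flag, group in groupby(is_load) if flag]
    if not runs:
        return 0.0, 0
    average_length = sum(runs) / len(runs)
    clustered_runs = sum(1 for length in runs if length >= 2)
    return average_length, clustered_runs
```

Under this reading, a scheduler change that merges two separate single loads into one run of two raises both figures, which matches the direction of the reported results.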
Diffstat (limited to 'llvm/lib/CodeGen/TwoAddressInstructionPass.cpp')
0 files changed, 0 insertions, 0 deletions