author | Jay Foad <jay.foad@amd.com> | 2021-10-12 15:39:43 +0100 |
---|---|---|
committer | Jay Foad <jay.foad@amd.com> | 2021-10-12 16:09:04 +0100 |
commit | 66e13c7f439cf162d7ed1d25883e71a5755ac7ec (patch) | |
tree | 0cc1f94faba9cefe58a9bad84c76732777ee6dde /llvm/lib/CodeGen/TwoAddressInstructionPass.cpp | |
parent | 838b4a533e6853d44e0c6d1977bcf0b06557d4ab (diff) | |
[AMDGPU] Enable load clustering in the post-RA scheduler

This has a couple of benefits:
1. It can sometimes fix clusters that got broken apart when the register
   allocator inserted a copy.
2. Post-RA scheduling does not have to worry about increasing register
   pressure, which in some cases gives it more freedom to reorder
   instructions.

Testing on a collection of 10,000 graphics shaders compiled for gfx1010
showed:
- The average length of each run of one or more load instructions
  increased by about 1%.
- The number of runs of two or more load instructions increased by
  about 4%.
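
To make the two metrics above concrete, here is a minimal sketch of how runs of load instructions might be counted over a linear instruction stream. It is not the tooling used for the measurement in this commit: `LoadRunStats` and `countLoadRuns` are hypothetical names, and the input is assumed to be a per-instruction flag saying whether that instruction is a load.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical summary of the maximal runs of consecutive loads in one
// instruction stream (not part of LLVM).
struct LoadRunStats {
  std::size_t numRuns = 0;      // runs of one or more loads
  std::size_t numMultiRuns = 0; // runs of two or more loads
  std::size_t totalLoads = 0;   // loads summed over all runs

  // Average length of a run of one or more loads; 0 if there are no loads.
  double averageRunLength() const {
    return numRuns == 0 ? 0.0 : static_cast<double>(totalLoads) / numRuns;
  }
};

// Assumption: isLoad[i] is true when the i-th instruction, in final program
// order, is a load.
LoadRunStats countLoadRuns(const std::vector<bool> &isLoad) {
  LoadRunStats stats;
  std::size_t current = 0; // length of the run currently being scanned

  // Record the run that just ended, if any, and reset the counter.
  auto closeRun = [&] {
    if (current == 0)
      return;
    ++stats.numRuns;
    if (current >= 2)
      ++stats.numMultiRuns;
    stats.totalLoads += current;
    current = 0;
  };

  for (bool load : isLoad) {
    if (load)
      ++current;
    else
      closeRun();
  }
  closeRun(); // the stream may end in the middle of a run
  return stats;
}
```

For the stream load, load, other, load, other, other, load, load, load, this yields three runs (lengths 2, 1 and 3), two of which contain two or more loads, and an average run length of 2. Better clustering shows up as a higher average run length and more runs of two or more loads.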
Diffstat (limited to 'llvm/lib/CodeGen/TwoAddressInstructionPass.cpp')
0 files changed, 0 insertions, 0 deletions