author | Matt Arsenault <Matthew.Arsenault@amd.com> | 2015-10-01 21:43:15 +0000 |
---|---|---|
committer | Matt Arsenault <Matthew.Arsenault@amd.com> | 2015-10-01 21:43:15 +0000 |
commit | d1d499aa56e1c7dd6f2f4341932948359f84b57a (patch) | |
tree | e05e8b11bdf325a79bca4232ae1fa8e88cd3d866 /llvm/tools/llvm-objdump/llvm-objdump.cpp | |
parent | fc64fae6e35cf024bd07a364c6268cef5c54c56c (diff) | |
AMDGPU: Make SIInsertWaits about a factor of 4 faster

This was the slowest target custom pass, and it was spending 80%
of its time in getMinimalPhysRegClass, which was called
for every register operand.

Try to use the statically known register class from the
instruction's MCOperandInfo when possible. A few pseudo instructions
are not well behaved and have unknown register classes, so they still
require the expensive physical register class search.

There are a few other possibilities for making this even faster,
such as not inspecting implicit operands. For now those are still
checked because it is technically possible to have a scalar load into
exec or vcc, which can be implicitly used.
llvm-svn: 249079
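
The fast-path/slow-path idea described in the message can be illustrated with a small standalone sketch. This is not the code from the commit and does not use LLVM's API: RegClassTable, InstrDesc, OperandDesc, getOperandRegClass and the UnknownClass sentinel are all hypothetical names invented for illustration, and the linear search merely stands in for an expensive lookup such as getMinimalPhysRegClass.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: every type and function here is hypothetical.
constexpr int UnknownClass = -1;

// A register class covering a contiguous range of physical registers.
struct RegClass {
  int Id;
  unsigned FirstReg; // inclusive
  unsigned LastReg;  // inclusive
};

// Static per-operand description, playing the role of the operand info
// attached to an instruction description.
struct OperandDesc {
  int StaticClass; // class ID, or UnknownClass if not statically known
};

struct InstrDesc {
  std::vector<OperandDesc> Operands; // explicit operands only
};

struct RegClassTable {
  std::vector<RegClass> Classes;
  mutable std::size_t SlowLookups = 0; // counts uses of the slow path

  // Slow path: search every class for one containing Reg.
  int findClassForPhysReg(unsigned Reg) const {
    ++SlowLookups;
    for (const RegClass &RC : Classes)
      if (Reg >= RC.FirstReg && Reg <= RC.LastReg)
        return RC.Id;
    return UnknownClass;
  }
};

// Prefer the statically known class; fall back to the search only when the
// operand has no description (for example an implicit operand) or its class
// is unknown (for example on a pseudo instruction).
int getOperandRegClass(const RegClassTable &Table, const InstrDesc &Desc,
                       std::size_t OpIdx, unsigned Reg) {
  if (OpIdx < Desc.Operands.size()) {
    int Static = Desc.Operands[OpIdx].StaticClass;
    if (Static != UnknownClass)
      return Static; // fast path: no search needed
  }
  return Table.findClassForPhysReg(Reg);
}
```

In this sketch, an operand whose description carries a class ID never touches the table, so SlowLookups only grows for operands with unknown or missing descriptions, which mirrors the reason the pass no longer needs the search for most operands.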
Diffstat (limited to 'llvm/tools/llvm-objdump/llvm-objdump.cpp')
0 files changed, 0 insertions, 0 deletions