author Pierre van Houtryve <pierre.vanhoutryve@amd.com> 2024-06-14 11:20:45 +0200
committer GitHub <noreply@github.com> 2024-06-14 11:20:45 +0200
commit ab0d01a5f0f17f20b106b0f6cc6d1b7d13cf4d65
tree 4571ee06c5a8e40c9edacd3f733493dcbf340faf /clang/lib/CodeGen/CodeGenModule.cpp
parent 88e42c6779067c4b65624939be74db2d56ee017b
[MC] Cache MCRegAliasIterator (#93510)
AMDGPU has a lot of registers, almost 9000. Many of those registers have aliases. For instance, SGPR0 has a ton of aliases due to the presence of register tuples. It's even worse if you query the aliases of a register tuple itself. A large register tuple can have hundreds of aliases because it may include 16 registers, and each of those registers has its own tuples as well.

The current implementation of MCRegAliasIterator does not handle this well. In some extreme cases it can iterate 7000 more times than necessary, returning the same duplicates over and over again and going through a lot of expensive iterators.

This patch implements a cache for MCRegAliasIterator. The expensive part is done only once and the result is stored, so later iterations over that register's aliases are just a map lookup. Furthermore, the cached data is uniqued (and sorted). This speeds up code in two ways: the iterator itself is faster, and users of the iterator run fewer loop iterations.
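
To illustrate the idea, here is a minimal standalone sketch of such a cache. It is not the code from this patch: AliasCache, computeSlow and the toy tuple table are hypothetical names made up for the example, and the real MCRegisterInfo data structures are different. It only shows the three steps described above: do the expensive enumeration once, sort and unique the result, and serve later queries from a map.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>
#include <utility>
#include <vector>

using Reg = unsigned;

// Hypothetical toy model: each tuple is just a list of member registers.
class AliasCache {
public:
  explicit AliasCache(std::vector<std::vector<Reg>> Tuples)
      : Tuples(std::move(Tuples)) {}

  // Returns the sorted, duplicate-free alias list of R.
  // The slow enumeration runs at most once per register.
  const std::vector<Reg> &aliases(Reg R) {
    auto It = Cache.find(R);
    if (It != Cache.end())
      return It->second; // Cache hit: a single map lookup.

    std::vector<Reg> Result = computeSlow(R);
    assert(!Result.empty() && "register is not part of any tuple");
    std::sort(Result.begin(), Result.end());
    Result.erase(std::unique(Result.begin(), Result.end()), Result.end());
    ++SlowComputations;
    // References to unordered_map elements stay valid after later inserts.
    return Cache.emplace(R, std::move(Result)).first->second;
  }

  unsigned slowComputations() const { return SlowComputations; }

private:
  // Stand-in for the expensive walk: emits every member of every tuple
  // that contains R, so the raw output may contain duplicates.
  std::vector<Reg> computeSlow(Reg R) const {
    std::vector<Reg> Out;
    for (const std::vector<Reg> &T : Tuples)
      if (std::find(T.begin(), T.end(), R) != T.end())
        Out.insert(Out.end(), T.begin(), T.end());
    return Out;
  }

  std::vector<std::vector<Reg>> Tuples;
  std::unordered_map<Reg, std::vector<Reg>> Cache;
  unsigned SlowComputations = 0;
};
```

With the tuples {0,1}, {1,2} and {2,3}, the raw enumeration for register 1 yields 0, 1, 1, 2. The cached list is 0, 1, 2, and a second query for register 1 does not run the enumeration again.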