author | Pierre van Houtryve <pierre.vanhoutryve@amd.com> | 2024-06-14 11:20:45 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-06-14 11:20:45 +0200 |
commit | ab0d01a5f0f17f20b106b0f6cc6d1b7d13cf4d65 (patch) | |
tree | 4571ee06c5a8e40c9edacd3f733493dcbf340faf /clang/lib/CodeGen/CodeGenModule.cpp | |
parent | 88e42c6779067c4b65624939be74db2d56ee017b (diff) | |
[MC] Cache MCRegAliasIterator (#93510)
AMDGPU has a lot of registers, almost 9000. Many of those registers have
aliases. For instance, SGPR0 has a ton of aliases due to the presence of
register tuples. It's even worse if you query the aliases of a register
tuple itself. A large register tuple can have hundreds of aliases
because it may include 16 registers, and each of those registers has
its own tuples as well.
The current implementation of MCRegAliasIterator handles this poorly. In
some extreme cases it can iterate 7000 more times than
necessary, returning the same duplicates over and over again and going
through a lot of expensive iterators.
This patch implements a cache for MCRegAliasIterator. The expensive
computation is done only once and its result is saved, so later iterations
over that register's aliases are just a map lookup.
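The caching idea can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: `Reg`, `AliasCache`, `aliasesOf` and `computeAliasesSlow` are hypothetical names, and the "group of four" aliasing rule is a toy stand-in for the real, expensive alias walk.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a physical register number.
using Reg = std::uint16_t;

// Hypothetical cache: maps a register to its precomputed alias list.
class AliasCache {
public:
  // Returns the aliases of R. The slow walk runs only on the first query
  // for R; every later query is a single map lookup.
  const std::vector<Reg> &aliasesOf(Reg R) {
    auto It = Cache.find(R);
    if (It == Cache.end()) {
      ++SlowWalks;
      It = Cache.emplace(R, computeAliasesSlow(R)).first;
    }
    return It->second;
  }

  // Number of times the slow path has run so far.
  unsigned slowWalks() const { return SlowWalks; }

private:
  // Toy placeholder for the expensive walk (an assumption for this sketch):
  // a register aliases every register in its aligned group of four.
  static std::vector<Reg> computeAliasesSlow(Reg R) {
    std::vector<Reg> Aliases;
    const Reg Base = static_cast<Reg>(R - R % 4);
    for (Reg I = 0; I < 4; ++I)
      Aliases.push_back(static_cast<Reg>(Base + I));
    return Aliases;
  }

  std::unordered_map<Reg, std::vector<Reg>> Cache;
  unsigned SlowWalks = 0;
};
```

In this sketch, calling `aliasesOf(5)` twice runs the slow walk once; the second call only pays for the `find`. Returning a reference is safe here because `std::unordered_map` keeps references to its elements valid when new entries are inserted.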
Furthermore, the cached data is uniqued (and sorted). This speeds
up code both by making the iterator itself faster and by minimizing
the number of loop iterations that users of the iterator perform.
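To illustrate why uniquing matters to callers, here is a small sketch of a sort-and-deduplicate step. It assumes the raw alias walk can yield the same register several times; `Reg` and `uniqueSorted` are hypothetical names chosen for this example and do not come from the patch.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a physical register number.
using Reg = std::uint16_t;

// Sorts the alias list and drops duplicates, so a caller looping over the
// result visits each alias exactly once.
inline std::vector<Reg> uniqueSorted(std::vector<Reg> Aliases) {
  std::sort(Aliases.begin(), Aliases.end());
  // std::unique only removes adjacent duplicates, hence the sort above.
  Aliases.erase(std::unique(Aliases.begin(), Aliases.end()), Aliases.end());
  return Aliases;
}
```

For a raw list such as `{7, 3, 7, 1, 3, 7}`, a loop over the uniqued result runs three times instead of six. If this step is done once when a cache entry is filled, every later user of that entry gets the shorter list for free.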
Diffstat (limited to 'clang/lib/CodeGen/CodeGenModule.cpp')
0 files changed, 0 insertions, 0 deletions