author | Kazu Hirata <kazu@google.com> | 2024-11-14 15:54:55 -0800 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-11-14 15:54:55 -0800 |
commit | 59da1afd2ad74af2a8b8475412353c5d54a7d7f5 (patch) | |
tree | 25842b3ff84fc28031b031019fcaef7526690d4e /llvm/lib/CodeGen/MachineModuleSlotTracker.cpp | |
parent | d761b7485dbf0d951db34abcca270c405be1e93a (diff) | |
[memprof] Speed up caller-callee pair extraction (#116184)
We know that the MemProf profile has a lot of duplicate call stacks.
Extracting caller-callee pairs from a call stack we've seen before is
wasted effort.
This patch makes the extraction more efficient by first coming up with
a work list of linear call stack IDs -- the set of starting positions
in the radix tree array -- and then extracting caller-callee pairs from
each call stack in the work list.
We implement the work list as a bit vector because we expect the work
list to be dense in the range [0, RadixTreeSize). Also, we want the
set insertion to be cheap.
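
The work-list idea can be sketched roughly as follows. This is a simplified
illustration, not the code from the patch: it uses std::vector<bool> as a
stand-in for a bit vector, and the helper names buildWorkList and
uniqueCallStackIds are hypothetical. It assumes each linear call stack ID is
an index in the range [0, RadixTreeSize).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: collapse a list of linear call stack IDs (which may
// contain many duplicates) into a bit vector indexed by ID.
std::vector<bool> buildWorkList(const std::vector<uint32_t> &CallStackIds,
                                size_t RadixTreeSize) {
  std::vector<bool> WorkList(RadixTreeSize, false);
  for (uint32_t Id : CallStackIds) {
    assert(Id < RadixTreeSize && "call stack ID out of range");
    WorkList[Id] = true; // Constant-time insertion; duplicates are absorbed.
  }
  return WorkList;
}

// Hypothetical helper: list each distinct call stack ID once, in ascending
// order, so that the expensive extraction runs once per unique call stack.
std::vector<uint32_t> uniqueCallStackIds(const std::vector<bool> &WorkList) {
  std::vector<uint32_t> Ids;
  for (size_t I = 0; I < WorkList.size(); ++I)
    if (WorkList[I])
      Ids.push_back(static_cast<uint32_t>(I));
  return Ids;
}
```

Setting a bit costs the same whether or not the ID was seen before, so
deduplication needs no hashing, and a dense range keeps the vector compact.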
Without this patch, it takes 25 seconds to extract caller-callee pairs
from a large MemProf profile. This patch shortens that to 4
seconds.
Diffstat (limited to 'llvm/lib/CodeGen/MachineModuleSlotTracker.cpp')
0 files changed, 0 insertions, 0 deletions