rocket-tools/riscv-gnu-toolchain/llvm.git - Unnamed repository; edit this file 'description' to name the repository.

diff options

author	Kazu Hirata <kazu@google.com>	2024-11-15 15:33:23 -0800
committer	GitHub <noreply@github.com>	2024-11-15 15:33:23 -0800
commit	57ed628fb397c6427f820fb217c8a58e67f8e10a (patch)
tree	1a56f15d9c644290e50831d46fb76e82d586bd20 /llvm/tools/llvm-profdata/llvm-profdata.cpp
parent	b1fa9d154b3765cab951162f5e4777a824bc9fa7 (diff)
download	llvm-57ed628fb397c6427f820fb217c8a58e67f8e10a.zip llvm-57ed628fb397c6427f820fb217c8a58e67f8e10a.tar.gz llvm-57ed628fb397c6427f820fb217c8a58e67f8e10a.tar.bz2

[memprof] Speed up caller-callee pair extraction (Part 2) (#116441)

This patch further speeds up the extraction of caller-callee pairs from the profile. Recall that we reconstruct a call stack by traversing the radix tree from one of its leaf nodes toward a root. The implication is that when we decode many different call stacks, we end up visiting nodes near the root(s) repeatedly. That in turn adds many duplicates to our data structure: DenseMap<uint64_t, SmallVector<CallEdgeTy, 0>> Calls; only to be deduplicated later with sort+unique for each vector. This patch makes the extraction process more efficient by keeping track of indices of the radix tree array we've visited so far and terminating traversal as soon as we encounter an element previously visited. Note that even with this improvement, we still add at least one caller-callee pair to the data structure above for each call stack because we do need to add a caller-callee pair for the leaf node with the callee GUID being 0. Without this patch, it takes 4 seconds to extract caller-callee pairs from a large MemProf profile. This patch shortenes that down to 900ms.

Diffstat (limited to 'llvm/tools/llvm-profdata/llvm-profdata.cpp')

0 files changed, 0 insertions, 0 deletions


context:
space:
mode: