author: wlei <wlei@fb.com> 2021-01-11 12:47:22 -0800
committer: wlei <wlei@fb.com> 2021-02-03 18:50:14 -0800
commit: 1714ad2336293f351b15dd4b518f9e8618ec38f2
tree: fd7efa7d21eba929cb45d6b5c3ac503373fed08a /clang/lib/Frontend/CompilerInvocation.cpp
parent: 0609f257dc2e2c3e4c7cd30fe2ffd520117e706b
[CSSPGO][llvm-profgen] Aggregate samples on call frame trie to speed up profile generation
For CS profile generation, call stack unwinding is time-consuming because each LBR entry needs linear time to generate its context (hash, compression, string concatenation). This change speeds this up by grouping all the call frames within one LBR sample into a trie and aggregating the result (sample counter) on it, deferring context compression and string generation to the end of unwinding.

Specifically, it uses `StackLeaf` as the top frame on the stack and manipulates it dynamically (popping or pushing a trie node) during virtual unwinding, so that the raw sample can simply be recorded on the leaf node; the path from root to leaf represents its calling context. At the end, it traverses the trie and generates the context on the fly.

Results: Our internal branch shows about 5X speed-up on some large workloads in the SPEC06 benchmark.

Differential Revision: https://reviews.llvm.org/D94110
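
The aggregation idea described above can be illustrated with a minimal sketch. This is not the code from the patch: apart from `StackLeaf`, which the message names, every type, method, and the " @ " separator here (`FrameNode`, `FrameTrie`, `pushFrame`, `popFrame`, `recordSample`, `collectContexts`) is a hypothetical name chosen for illustration, and frames are plain strings rather than addresses. It only shows how samples can be counted on a trie leaf while unwinding, with context strings built once at the end.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical trie node: one call frame, identified by an illustrative name.
struct FrameNode {
  std::string Frame;
  FrameNode *Parent = nullptr;
  uint64_t SampleCount = 0; // samples recorded while this node was the leaf
  std::map<std::string, std::unique_ptr<FrameNode>> Children;
};

// Hypothetical call frame trie. StackLeaf tracks the current top frame.
class FrameTrie {
public:
  FrameTrie() : StackLeaf(&Root) {}
  FrameTrie(const FrameTrie &) = delete;
  FrameTrie &operator=(const FrameTrie &) = delete;

  // Entering a callee: move the leaf down, creating the child if needed.
  void pushFrame(const std::string &Frame) {
    std::unique_ptr<FrameNode> &Child = StackLeaf->Children[Frame];
    if (!Child) {
      Child = std::make_unique<FrameNode>();
      Child->Frame = Frame;
      Child->Parent = StackLeaf;
    }
    StackLeaf = Child.get();
  }

  // Returning to the caller: move the leaf up one level.
  void popFrame() {
    assert(StackLeaf->Parent && "cannot pop the root frame");
    StackLeaf = StackLeaf->Parent;
  }

  // Constant-time recording: no context string is built here.
  void recordSample(uint64_t Count = 1) { StackLeaf->SampleCount += Count; }

  // Deferred step: walk the trie once and build each context string.
  std::map<std::string, uint64_t> collectContexts() const {
    std::map<std::string, uint64_t> Result;
    std::vector<std::string> Path;
    collect(Root, Path, Result);
    return Result;
  }

private:
  static void collect(const FrameNode &Node, std::vector<std::string> &Path,
                      std::map<std::string, uint64_t> &Result) {
    if (Node.SampleCount > 0) {
      std::string Context;
      for (std::size_t I = 0; I < Path.size(); ++I) {
        if (I != 0)
          Context += " @ "; // illustrative separator between frames
        Context += Path[I];
      }
      Result[Context] += Node.SampleCount;
    }
    for (const auto &Entry : Node.Children) {
      Path.push_back(Entry.first);
      collect(*Entry.second, Path, Result);
      Path.pop_back();
    }
  }

  FrameNode Root;       // sentinel root, not a real frame
  FrameNode *StackLeaf; // current top frame during unwinding
};
```

In this sketch, pushing `main` then `foo`, recording two samples, popping, pushing `bar`, and recording three samples leaves two counted leaves; `collectContexts()` then produces one string per root-to-leaf path. The per-sample cost is a pointer move and a counter increment, and string concatenation happens once per distinct context instead of once per sample.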
Diffstat (limited to 'clang/lib/Frontend/CompilerInvocation.cpp')
0 files changed, 0 insertions, 0 deletions