rocket-tools/riscv-gnu-toolchain/llvm.git

author	wlei <wlei@fb.com>	2021-01-11 12:47:22 -0800
committer	wlei <wlei@fb.com>	2021-02-03 18:50:14 -0800
commit	1714ad2336293f351b15dd4b518f9e8618ec38f2 (patch)
tree	fd7efa7d21eba929cb45d6b5c3ac503373fed08a /clang/lib/Frontend/CompilerInvocation.cpp
parent	0609f257dc2e2c3e4c7cd30fe2ffd520117e706b (diff)

[CSSPGO][llvm-profgen] Aggregate samples on call frame trie to speed up profile generation

For CS profile generation, call stack unwinding is time-consuming because each LBR entry needs linear time to generate its context (hashing, compression, string concatenation). This change speeds this up by grouping all the call frames within one LBR sample into a trie and aggregating the results (sample counters) on it, deferring context compression and string generation to the end of unwinding.

Specifically, it uses `StackLeaf` as the top frame on the stack and manipulates it dynamically (popping or pushing a trie node) during virtual unwinding, so that a raw sample can simply be recorded on the leaf node; the path from root to leaf represents its calling context. At the end, it traverses the trie and generates the contexts on the fly.

Results: Our internal branch shows about a 5X speed-up on some large workloads in the SPEC06 benchmark.

Differential Revision: https://reviews.llvm.org/D94110
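
The idea described above can be illustrated with a minimal, standalone C++ sketch. It is not the code from this patch: `FrameNode`, `FrameTrie`, `push`, `pop`, `record`, and `collect` are hypothetical names, frames are plain strings rather than addresses or probes, and the `" @ "` separator is only an assumed context format. The `Leaf` pointer plays the role the message gives to `StackLeaf`: samples are counted on the current leaf, and context strings are built once, during a final traversal.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical trie node: one call frame plus the samples recorded while
// that frame was the top of the stack.
struct FrameNode {
  std::string Name;
  FrameNode *Parent = nullptr;
  uint64_t Count = 0;
  std::map<std::string, std::unique_ptr<FrameNode>> Children;
};

// Hypothetical call frame trie. Leaf tracks the current top frame, so
// recording a sample is a counter increment with no string work.
class FrameTrie {
  FrameNode Root;
  FrameNode *Leaf = &Root;

public:
  // Enter a callee: reuse the existing child node or create a new one.
  void push(const std::string &Name) {
    std::unique_ptr<FrameNode> &Child = Leaf->Children[Name];
    if (!Child) {
      Child = std::make_unique<FrameNode>();
      Child->Name = Name;
      Child->Parent = Leaf;
    }
    Leaf = Child.get();
  }

  // Return to the caller frame.
  void pop() {
    assert(Leaf != &Root && "pop on an empty stack");
    Leaf = Leaf->Parent;
  }

  // Aggregate N samples on the current leaf.
  void record(uint64_t N = 1) { Leaf->Count += N; }

  // Deferred step: walk the trie once and build each context string from
  // its root-to-node path. Nodes without samples are skipped.
  std::vector<std::pair<std::string, uint64_t>> collect() const {
    std::vector<std::pair<std::string, uint64_t>> Out;
    walk(Root, "", Out);
    return Out;
  }

private:
  static void walk(const FrameNode &N, const std::string &Prefix,
                   std::vector<std::pair<std::string, uint64_t>> &Out) {
    for (const auto &KV : N.Children) {
      const FrameNode &C = *KV.second;
      std::string Ctx = Prefix.empty() ? C.Name : Prefix + " @ " + C.Name;
      if (C.Count != 0)
        Out.emplace_back(Ctx, C.Count);
      walk(C, Ctx, Out);
    }
  }
};
```

In this sketch the cost of `record` does not depend on stack depth, and each distinct context string is built exactly once in `collect`, however many samples landed on it. That is the saving the message attributes to deferring context generation to the end of unwinding.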

Diffstat (limited to 'clang/lib/Frontend/CompilerInvocation.cpp')

0 files changed, 0 insertions, 0 deletions
