author: Hongtao Yu <hoy@fb.com> 2021-12-14 10:03:05 -0800
committer: Hongtao Yu <hoy@fb.com> 2021-12-14 14:40:25 -0800
commit: 5740bb801a14efd5239a0e521395c09e71e61f5c
tree: a6e8186dc9a82c72c34fc3d95b1f07e6a696faea /llvm/tools/llvm-profgen/PerfReader.cpp
parent: ea15b862d77ed0f163d97735f45b7683ad86ffc9
[CSSPGO] Use nested context-sensitive profile.
CSSPGO currently employs a flat profile format for context-sensitive profiles. Such a flat profile allows for precisely manipulating contexts that are either inlined or not inlined. This is a benefit over the nested profile format used by non-CS AutoFDO. A downside is the longer build time due to parsing and indexing the full CS contexts. For a CS flat profile, only the context profiles relevant to a module are loaded when that module is compiled, but the cost of figuring out which profiles are relevant is noticeably high when there are many contexts, since the sample reader needs to scan all context strings anyway. In contrast, a nested function profile has its related inline subcontexts isolated from other unrelated contexts. Therefore, when compiling a set of functions, unrelated contexts never need to be scanned.

In this change we are exploring using the nested profile format for CSSPGO. This is expected to work based on the assumption that with a preinliner-computed profile all contexts are precomputed and expected to be inlined by the compiler. Contexts not expected to be inlined will be cut off and returned to the corresponding base profiles (for top-level outlined functions). This naturally forms a nested profile where all nested contexts are expected to be inlined. The compiler is less likely to optimize on derived contexts that are not precomputed.

A CS-nested profile will look exactly the same as a regular nested profile except that each nested profile can come with attributes. With pseudo probes, a nested profile shown as below can also have a CFG checksum.
```
main:1968679:12
 2: 24
 3: 28 _Z5funcAi:18
 3.1: 28 _Z5funcBi:30
 3: _Z5funcAi:1467398
  0: 10
  1: 10 _Z8funcLeafi:11
  3: 24
  1: _Z8funcLeafi:1467299
   0: 6
   1: 6
   3: 287884
   4: 287864 _Z3fibi:315608
   15: 23
   !CFGChecksum: 138828622701
   !Attributes: 2
  !CFGChecksum: 281479271677951
  !Attributes: 2
```

Specific work included in this change:
- A recursive profile converter to convert a CS flat profile to a nested profile.
- Extend function checksum and attribute metadata to be stored in a nested way for text profile and extbinary profile.
- Unify the sample loader inliner path for CS and preinlined nested profiles.
- Changes in the sample loader to support probe-based nested profiles.

I've seen promising results regarding build time. A nested profile can result in a 20% shorter build time than a CS flat profile while keeping on-par performance. This is with -duplicate-contexts-into-base=1.

Test Plan:

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D115205
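To make the flat-to-nested idea in the commit message concrete, here is a minimal sketch of such a conversion. It is not the converter added by this change: `NestedProfile`, `Frame`, `parseContext`, `getOrCreate`, and `convertFlatToNested` are hypothetical names, the ` @ `-separated context string syntax is an assumption made for illustration, and only total sample counts are modeled (no body samples, checksums, or attributes).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a function profile. Inlined callees
// are keyed by (callsite location, callee name), mirroring the nesting of the
// text profile. This is not LLVM's FunctionSamples.
struct NestedProfile {
  uint64_t TotalSamples = 0;
  std::map<std::pair<std::string, std::string>, NestedProfile> Callsites;
};

// One frame of a context: a function plus the callsite location inside it
// that leads to the next frame (empty for the leaf frame).
struct Frame {
  std::string Func;
  std::string Callsite;
};

// Split an assumed context string such as
// "main:3 @ _Z5funcAi:1 @ _Z8funcLeafi" into frames.
std::vector<Frame> parseContext(const std::string &Ctx) {
  std::vector<Frame> Frames;
  const std::string Sep = " @ ";
  std::size_t Pos = 0;
  for (;;) {
    std::size_t End = Ctx.find(Sep, Pos);
    std::string Tok = Ctx.substr(
        Pos, End == std::string::npos ? std::string::npos : End - Pos);
    std::size_t Colon = Tok.find(':');
    if (Colon == std::string::npos)
      Frames.push_back({Tok, ""});
    else
      Frames.push_back({Tok.substr(0, Colon), Tok.substr(Colon + 1)});
    if (End == std::string::npos)
      break;
    Pos = End + Sep.size();
  }
  return Frames;
}

// Recursively descend one frame per call, creating nested nodes on demand,
// and return the node that corresponds to the leaf frame.
NestedProfile &getOrCreate(NestedProfile &Node,
                           const std::vector<Frame> &Frames, std::size_t Idx) {
  if (Idx + 1 == Frames.size())
    return Node;
  auto Key = std::make_pair(Frames[Idx].Callsite, Frames[Idx + 1].Func);
  return getOrCreate(Node.Callsites[Key], Frames, Idx + 1);
}

// Fold a flat map (context string -> total samples) into nested profiles
// keyed by the top-level function name.
std::map<std::string, NestedProfile>
convertFlatToNested(const std::map<std::string, uint64_t> &Flat) {
  std::map<std::string, NestedProfile> TopLevel;
  for (const auto &Entry : Flat) {
    std::vector<Frame> Frames = parseContext(Entry.first);
    assert(!Frames.empty() && "parseContext always yields at least one frame");
    NestedProfile &Root = TopLevel[Frames.front().Func];
    getOrCreate(Root, Frames, 0).TotalSamples += Entry.second;
  }
  return TopLevel;
}
```

With this layout, every context rooted at `main` ends up under the single top-level `main` entry, so a lookup for one function never has to touch contexts rooted elsewhere. That is the isolation property the message attributes to nested profiles; the real converter also has to carry body samples and per-function metadata, which this sketch leaves out.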
Diffstat (limited to 'llvm/tools/llvm-profgen/PerfReader.cpp')
-rw-r--r-- llvm/tools/llvm-profgen/PerfReader.cpp 8
1 files changed, 4 insertions, 4 deletions
diff --git a/llvm/tools/llvm-profgen/PerfReader.cpp b/llvm/tools/llvm-profgen/PerfReader.cpp
index 6f6926e..1e9f0f4 100644
--- a/llvm/tools/llvm-profgen/PerfReader.cpp
+++ b/llvm/tools/llvm-profgen/PerfReader.cpp
@@ -728,7 +728,7 @@ void PerfScriptReader::writeUnsymbolizedProfile(raw_fd_ostream &OS) {
   for (auto &CI : OrderedCounters) {
     uint32_t Indent = 0;
-    if (ProfileIsCS) {
+    if (ProfileIsCSFlat) {
       // Context string key
       OS << "[" << CI.first << "]\n";
       Indent = 2;
@@ -815,7 +815,7 @@ void UnsymbolizedProfileReader::readUnsymbolizedProfile(StringRef FileName) {
     StringRef Line = TraceIt.getCurrentLine();
     // Read context stack for CS profile.
     if (Line.startswith("[")) {
-      ProfileIsCS = true;
+      ProfileIsCSFlat = true;
       auto I = ContextStrSet.insert(Line.str());
       SampleContext::createCtxVectorFromStr(*I.first, Key->Context);
       TraceIt.advance();
@@ -1026,8 +1026,8 @@ PerfContent PerfScriptReader::checkPerfScriptType(StringRef FileName) {
 }
 
 void HybridPerfReader::generateUnsymbolizedProfile() {
-  ProfileIsCS = !IgnoreStackSamples;
-  if (ProfileIsCS)
+  ProfileIsCSFlat = !IgnoreStackSamples;
+  if (ProfileIsCSFlat)
     unwindSamples();
   else
     PerfScriptReader::generateUnsymbolizedProfile();