aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/Support/Signposts.cpp
diff options
context:
space:
mode:
authorspupyrev <spupyrev@fb.com>2021-06-11 21:45:47 -0700
committerWenlei He <aktoon@gmail.com>2021-06-11 21:46:04 -0700
commit0a0800c4d10c250ffb152b5f059d6f9a19ed8efe (patch)
tree44ca29663e60fb3abccf6da0f584eee8fc28f015 /llvm/lib/Support/Signposts.cpp
parentdbc262968f8ed4b25281d5855eda8f5a699b95c6 (diff)
downloadllvm-0a0800c4d10c250ffb152b5f059d6f9a19ed8efe.zip
llvm-0a0800c4d10c250ffb152b5f059d6f9a19ed8efe.tar.gz
llvm-0a0800c4d10c250ffb152b5f059d6f9a19ed8efe.tar.bz2
A post-processing for BFI inference
The current implementation for computing relative block frequencies does not handle correctly control-flow graphs containing irreducible loops. This results in suboptimally generated binaries, whose perf can be up to 5% worse than optimal. To resolve the problem, we apply a post-processing step, which iteratively updates block frequencies based on the frequencies of their predesessors. This corresponds to finding the stationary point of the Markov chain by an iterative method aka "PageRank computation". The algorithm takes at most O(|E| * IterativeBFIMaxIterations) steps but typically converges faster. It is turned on by passing option `use-iterative-bfi-inference` and applied only for functions containing profile data and irreducible loops. Tested on SPEC06/17, where it is helping to get correct profile counts for one of the binaries (403.gcc). In prod binaries, we've seen a speedup of up to 2%-5% for binaries containing functions with hot irreducible loops. Reviewed By: hoy, wenlei, davidxl Differential Revision: https://reviews.llvm.org/D103289
Diffstat (limited to 'llvm/lib/Support/Signposts.cpp')
0 files changed, 0 insertions, 0 deletions