riscv-gnu-toolchain/llvm.git

author	spupyrev <spupyrev@fb.com>	2021-11-23 08:47:23 -0800
committer	Hongtao Yu <hoy@fb.com>	2021-11-23 09:08:30 -0800
commit	b00fc198224efa038a7469e068dd920b3f1aba75 (patch)
tree	ef08128782d1675055c75d74719d4f3868761a71 /llvm/lib/Transforms/Utils/InlineFunction.cpp
parent	38211bbab1d949f682271abba0171424a5a335ab (diff)

profi - a flow-based profile inference algorithm: Part I (out of 3)

The benefits of sampling-based PGO crucially depend on the quality of profile
data. This diff implements a flow-based algorithm, called profi, that helps to
overcome the inaccuracies in a profile after it is collected.

Profi is an extended and significantly re-engineered version of the classic
MCMF (min-cost max-flow) approach suggested by Levin, Newman, and Haber [2008,
Complementing missing and inaccurate profiling using a minimum cost circulation
algorithm]. It models profile inference as an optimization problem on a
control-flow graph, with the objectives and constraints capturing the desired
properties of profile data. Profi addresses three important challenges:
- "fixing" errors in profiles caused by sampling;
- converting basic block counts to edge frequencies (branch probabilities);
- dealing with "dangling" blocks having no samples in the profile.

The main implementation (and required docs) are in SampleProfileInference.cpp.
The worst-case time complexity is quadratic in the number of blocks in a
function, O(|V|^2). However, careful engineering and extensive evaluation show
that the running time is (slightly) super-linear in practice. In particular,
instances with 1000 blocks are solved within 0.1 second.

The algorithm has been extensively tested internally on prod workloads,
significantly improving the quality of generated profile data and providing
speedups in the range from 0% to 5%. For "smaller" benchmarks (SPEC06/17), it
generally improves performance (with a few outliers), but extra work in the
compiler might be needed to re-tune existing optimization passes that rely on
profile counts.

Reviewed By: wenlei, hoy

Differential Revision: https://reviews.llvm.org/D109860
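
The flow-conservation property that a corrected profile is expected to satisfy can be illustrated with a small sketch. This is not the code in SampleProfileInference.cpp and not the min-cost flow solver itself. The `Profile` struct and the functions `isFlowConsistent` and `branchProbability` are hypothetical names introduced here for illustration. The sketch only checks whether block counts and edge counts agree, and shows how a branch probability follows from consistent edge counts.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical types for illustration only; not LLVM's data structures.
using Block = int;
using Edge = std::pair<Block, Block>; // (source block, target block)

struct Profile {
  std::map<Block, uint64_t> BlockCount; // execution count per basic block
  std::map<Edge, uint64_t> EdgeCount;   // execution count per CFG edge
};

// Returns true if every block's count equals the sum of its incoming edge
// counts (except for the entry block) and the sum of its outgoing edge
// counts (except for the exit block).
bool isFlowConsistent(const Profile &P, Block Entry, Block Exit) {
  std::map<Block, uint64_t> In, Out;
  for (const auto &KV : P.EdgeCount) {
    Out[KV.first.first] += KV.second;
    In[KV.first.second] += KV.second;
  }
  for (const auto &KV : P.BlockCount) {
    if (KV.first != Entry && In[KV.first] != KV.second)
      return false;
    if (KV.first != Exit && Out[KV.first] != KV.second)
      return false;
  }
  return true;
}

// Branch probability of edge E: its count divided by the total count of
// all edges leaving the same source block.
double branchProbability(const Profile &P, Edge E) {
  uint64_t Total = 0;
  for (const auto &KV : P.EdgeCount)
    if (KV.first.first == E.first)
      Total += KV.second;
  assert(Total > 0 && "source block has no outgoing flow");
  return static_cast<double>(P.EdgeCount.at(E)) / static_cast<double>(Total);
}
```

For a diamond-shaped CFG (block 0 branches to blocks 1 and 2, which both jump to block 3), counts of 100, 75, 25, and 100 pass the check and give a probability of 0.75 for the edge from 0 to 1. A sampled profile that reports 80 for block 1 fails the check. Repairing inconsistencies of that kind is what the commit message means by "fixing" errors caused by sampling.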

Diffstat (limited to 'llvm/lib/Transforms/Utils/InlineFunction.cpp')

0 files changed, 0 insertions, 0 deletions

