author Diego Novillo <dnovillo@google.com> 2015-11-27 23:14:49 +0000
committer Diego Novillo <dnovillo@google.com> 2015-11-27 23:14:49 +0000
commit b5792408751cfb3ad0a61c6f328b5b5cb1f1b99d (patch)
tree ba84a58e2a36e1e0911984ec95c291c35ada4116 /llvm/lib/Transforms/IPO/SampleProfile.cpp
parent 138f895655517694522a669fe8c09a76a006b28b (diff)
SamplePGO - Fix default threshold for hot callsites.

Based on testing of internal benchmarks, I'm lowering this threshold to a value of 0.1%. This means that SamplePGO will respect 99.9% of the original inline decisions when following a profile.

The performance difference is noticeable in some tests. With the previous threshold, the speedup over baseline -O2 was about 0.63%. With the new default, the speedup is around 3% on average.

The point of this threshold is not to do more aggressive inlining. When an inlined callsite crosses this threshold, SamplePGO will redo the inline decision so that it can better apply the input profile.

By respecting most of the original inline decisions, we can apply more of the input profile because the shape of the code follows the profile more closely.

In the next series, I'll be looking at adding some inline hints for the cold callsites and for toplevel functions that are hot/cold as well.

llvm-svn: 254211

Diffstat (limited to 'llvm/lib/Transforms/IPO/SampleProfile.cpp')
-rw-r--r-- llvm/lib/Transforms/IPO/SampleProfile.cpp 7
1 files changed, 4 insertions, 3 deletions
diff --git a/llvm/lib/Transforms/IPO/SampleProfile.cpp b/llvm/lib/Transforms/IPO/SampleProfile.cpp
index 1f18fb7..69194ea 100644
--- a/llvm/lib/Transforms/IPO/SampleProfile.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfile.cpp
@@ -71,8 +71,8 @@ static cl::opt<unsigned> SampleProfileSampleCoverage(
     "sample-profile-check-sample-coverage", cl::init(0), cl::value_desc("N"),
     cl::desc("Emit a warning if less than N% of samples in the input profile "
              "are matched to the IR."));
-static cl::opt<unsigned> SampleProfileHotThreshold(
-    "sample-profile-inline-hot-threshold", cl::init(5), cl::value_desc("N"),
+static cl::opt<double> SampleProfileHotThreshold(
+    "sample-profile-inline-hot-threshold", cl::init(0.1), cl::value_desc("N"),
     cl::desc("Inlined functions that account for more than N% of all samples "
              "collected in the parent function, will be inlined again."));
@@ -262,7 +262,8 @@ bool callsiteIsHot(const FunctionSamples *CallerFS,
   if (CallsiteTotalSamples == 0)
     return false; // Callsite is trivially cold.
-  uint64_t PercentSamples = CallsiteTotalSamples * 100 / ParentTotalSamples;
+  double PercentSamples =
+      (double)CallsiteTotalSamples / (double)ParentTotalSamples * 100.0;
   return PercentSamples >= SampleProfileHotThreshold;
 }