author Amir Ayupov <aaupov@fb.com> 2023-06-06 17:53:15 -0700
committer Amir Ayupov <aaupov@fb.com> 2023-06-08 04:17:07 -0700
commit c061f755546e04b5804294111943b0f139a5ab60
tree 07f1dabbd9a374d21bb58204463e9ebb61c83681 /bolt
parent f5f6daf00f3dfd9560ce69455aac2a7d9743e3c9
[BOLT] Handle recursive calls as inter-branches in DataAggregator

Align yaml and fdata profiles by applying the same treatment to recursive calls
(direct, indirect, tail). fdata profile increments entry count when handling
recursive calls. Make perf/pre-aggregated perf reader (DataAggregator) do the same.

Test Plan:
In pre-aggregated-perf.test, add a dummy pre-aggregated branch entry between
an indirect call in `frame_dummy` function and its entry point. Check that YAML
profile gets incremented entry count for this function.

End-to-end test: https://github.com/rafaelauler/bolt-tests/pull/24

Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D152338

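The rule described in the commit message can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not BOLT's implementation: `FuncInfo`, `ToyAggregator`, `BranchKind`, and the function size used in the usage note are all hypothetical, and the real `doBranch` records considerably more state. The point it shows is that a branch whose source and target lie in the same function is treated as an intra-function branch only when the target is not the function's entry address; otherwise it is handled like a call and bumps the entry count.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical, simplified function record (not BOLT's BinaryFunction).
struct FuncInfo {
  std::string Name;
  uint64_t Address;       // entry point
  uint64_t Size;          // size in bytes
  uint64_t ExecCount;     // number of recorded entries into the function
  uint64_t IntraBranches; // number of recorded intra-function branches
};

enum class BranchKind { Unknown, Intra, Inter };

// Hypothetical toy aggregator that mirrors only the classification rule.
class ToyAggregator {
public:
  void addFunction(const std::string &Name, uint64_t Address, uint64_t Size) {
    assert(Size > 0 && "function must be non-empty");
    Funcs[Address] = FuncInfo{Name, Address, Size, 0, 0};
  }

  // Return the function whose [Address, Address + Size) range contains Addr.
  FuncInfo *lookup(uint64_t Addr) {
    auto It = Funcs.upper_bound(Addr);
    if (It == Funcs.begin())
      return nullptr;
    --It;
    FuncInfo &F = It->second;
    return Addr < F.Address + F.Size ? &F : nullptr;
  }

  BranchKind doBranch(uint64_t From, uint64_t To, uint64_t Count) {
    FuncInfo *FromFunc = lookup(From);
    FuncInfo *ToFunc = lookup(To);
    if (!FromFunc && !ToFunc)
      return BranchKind::Unknown;

    // Same function and the target is not the entry point: intra-branch.
    if (FromFunc == ToFunc && To != ToFunc->Address) {
      FromFunc->IntraBranches += Count;
      return BranchKind::Intra;
    }

    // Everything else is an inter-branch. A transfer that lands on a
    // function's entry point (including a recursive one) counts as an entry.
    if (ToFunc && To == ToFunc->Address)
      ToFunc->ExecCount += Count;
    return BranchKind::Inter;
  }

private:
  std::map<uint64_t, FuncInfo> Funcs; // keyed by entry address
};
```

With a function registered at `0x400d90` (the size `0x30` is an assumption made for the example), `doBranch(0x400dae, 0x400d90, 1)` is classified as `BranchKind::Inter` and raises that function's `ExecCount` to 1, while a branch to any other address inside the same function stays `BranchKind::Intra`.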
Diffstat (limited to 'bolt')
-rw-r--r--  bolt/lib/Profile/DataAggregator.cpp      3
-rw-r--r--  bolt/test/X86/Inputs/pre-aggregated.txt  1
-rw-r--r--  bolt/test/X86/pre-aggregated-perf.test   5
3 files changed, 8 insertions, 1 deletion
diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index 22bb9b7..9ca3eef 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -777,7 +777,8 @@ bool DataAggregator::doBranch(uint64_t From, uint64_t To, uint64_t Count,
   if (!FromFunc && !ToFunc)
     return false;
 
-  if (FromFunc == ToFunc) {
+  // Treat recursive control transfers as inter-branches.
+  if (FromFunc == ToFunc && (To != ToFunc->getAddress())) {
     recordBranch(*FromFunc, From - FromFunc->getAddress(),
                  To - FromFunc->getAddress(), Count, Mispreds);
     return doIntraBranch(*FromFunc, From, To, Count, Mispreds);
diff --git a/bolt/test/X86/Inputs/pre-aggregated.txt b/bolt/test/X86/Inputs/pre-aggregated.txt
index 788ceb4..5851509 100644
--- a/bolt/test/X86/Inputs/pre-aggregated.txt
+++ b/bolt/test/X86/Inputs/pre-aggregated.txt
@@ -6,3 +6,4 @@ B 4005f0 X:7f36d18f2ce0 1 0
 B 4011a0 4011a9 33 4
 B 4011ad 401180 58 0
 F 401170 4011b2 22
+B 400dae 400d90 1 0
diff --git a/bolt/test/X86/pre-aggregated-perf.test b/bolt/test/X86/pre-aggregated-perf.test
index c737034..9868914 100644
--- a/bolt/test/X86/pre-aggregated-perf.test
+++ b/bolt/test/X86/pre-aggregated-perf.test
@@ -43,6 +43,11 @@ PERF2BOLT: 1 usqrt 3d 1 usqrt 10 0 58
 PERF2BOLT: 1 usqrt 3d 1 usqrt 3f 0 22
 PERF2BOLT: 1 usqrt a 1 usqrt 10 0 22
 
+NEWFORMAT: - name: 'frame_dummy/1'
+NEWFORMAT: fid: 3
+NEWFORMAT: hash: 0x24496F7F9594E89F
+NEWFORMAT: exec: 1
+
 NEWFORMAT: - name: usqrt
 NEWFORMAT: fid: 7
 NEWFORMAT: exec: 0
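
For reference, the entry `B 400dae 400d90 1 0` added to pre-aggregated.txt can be read with a small parser like the one below. It is a sketch only and makes assumptions about the format: that a `B` line carries a source address, a target address, a count, and a misprediction count; that addresses are hexadecimal without a `0x` prefix; and that an `X:` prefix marks an address in an unknown object outside the main binary. `BranchEntry`, `parseAddr`, and `parseBranchLine` are hypothetical names and not part of BOLT.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <exception>
#include <sstream>
#include <string>

// Hypothetical representation of one "B" line of a pre-aggregated profile.
struct BranchEntry {
  uint64_t From;
  uint64_t To;
  uint64_t Count;
  uint64_t Mispreds;
  bool FromUnknownObj; // source address carried an "X:" prefix
  bool ToUnknownObj;   // target address carried an "X:" prefix
};

// Parse a hexadecimal address with an optional "X:" prefix (assumed format).
static bool parseAddr(std::string Tok, uint64_t &Addr, bool &UnknownObj) {
  UnknownObj = Tok.rfind("X:", 0) == 0;
  if (UnknownObj)
    Tok.erase(0, 2);
  if (Tok.empty())
    return false;
  try {
    std::size_t Pos = 0;
    Addr = std::stoull(Tok, &Pos, 16);
    return Pos == Tok.size(); // reject trailing garbage
  } catch (const std::exception &) {
    return false;
  }
}

// Parse "B <from> <to> <count> <mispreds>"; return false for any other line.
bool parseBranchLine(const std::string &Line, BranchEntry &Out) {
  std::istringstream IS(Line);
  std::string Type, From, To;
  if (!(IS >> Type >> From >> To >> Out.Count >> Out.Mispreds))
    return false;
  if (Type != "B")
    return false;
  return parseAddr(From, Out.From, Out.FromUnknownObj) &&
         parseAddr(To, Out.To, Out.ToUnknownObj);
}
```

Parsing the new entry yields a source of `0x400dae`, a target of `0x400d90`, a count of 1, and no mispredictions. Per the test plan, the target is the entry point of `frame_dummy`, which is why the YAML profile is expected to show `exec: 1` for that function.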