author | Jan Hubicka <hubicka@ucw.cz> | 2025-07-06 14:42:54 +0200 |
---|---|---|
committer | Jan Hubicka <hubicka@ucw.cz> | 2025-07-06 14:42:54 +0200 |
commit | 5c0758c174c596215857427092e33353f4c1fa72 (patch) | |
tree | 016abd196a179792371bee7b01fe4045e5be115a /gcc/rust/hir/tree/rust-hir-pattern-abstract.h | |
parent | 1757c320badc92c0628eafcd07d54585659692ed (diff) | |
Add cutoff information to profile_info and use it when forcing non-zero value
The main difference between normal profile feedback and auto-fdo is that with profile
feedback every basic block with a non-zero profile has an incoming edge with a non-zero
profile. With auto-profile it is possible that none of the predecessors was sampled,
and the tool also has a cutoff parameter which makes it ignore small counts.
This becomes a problem when one tries to specialize code and scale the profile.
For example, if an inline function happens to have a hot loop with non-zero counts
but its entry count is zero, and we want to inline it into a call with a non-zero
count X, we want to scale the body by X/0, which we currently turn into X/1.
This is a problem since I added logic to scale up the auto-profiles (to get
some extra bits of precision), so X is often a large value and multiplying by X
is not the right answer at all. The multiply factor should be <= 1.
Iterating this a few times will make the counts cap and we will lose any useful info.
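The arithmetic behind this can be sketched in a few lines. This is only an illustration under stated assumptions: scale_body_count is a hypothetical helper working on plain integers, not GCC's profile_count API, and the numbers in the usage note are made up.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not GCC's profile_count API): scale a count from the
// inlined body by call_count / entry_count, turning a zero denominator
// into 1 as the message describes.
uint64_t scale_body_count(uint64_t body_count, uint64_t call_count,
                          uint64_t entry_count)
{
  // Guard against overflow of the intermediate product in this sketch.
  assert(call_count == 0 || body_count <= UINT64_MAX / call_count);
  uint64_t den = entry_count == 0 ? 1 : entry_count;
  return body_count * call_count / den;
}
```

With an entry count of 0, a loop count of 500 and a call count of 1000000, the loop count is multiplied by the full 1000000. If the entry had instead been sampled at 2000000, the factor would be 1/2 and the loop count would become 250.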
The original implementation avoided this by doing all inlines before AFDO readback,
but this is not possible with LTO (unless we move AFDO readback to WPA or add
support for context sensitive profiles). I think I can get the scaling to work
reasonably well, and then we can look into the possible benefits of context sensitive
profiling, which can be implemented atop both AFDO and FDO.
This patch adds a cutoff value to profile_info, which is initialized by profile
feedback to 1 and by auto-profile to the scale factor (since we do not know the
cutoff create_gcov used; llvm's tool streams it and we probably should too).
Then force_nonzero forces every value smaller than cutoff/2 to cutoff/2, which
should keep scaling factors in reasonable ranges.
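A minimal sketch of the clamping rule described above, assuming plain integer counts. The names profile_summary_sketch and force_nonzero_sketch are hypothetical and do not reproduce the real profile_count::force_nonzero. The extra floor of 1 is also an assumption, added so that a cutoff of 1 (where cutoff/2 is 0 in integer arithmetic) still yields a non-zero value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the profile summary; only the field the
// message describes is modeled here.
struct profile_summary_sketch
{
  uint64_t cutoff; // 1 for profile feedback, the scale factor for auto-profile
};

// Any value below cutoff/2 is raised to cutoff/2.
// Assumption: the result is never allowed to drop below 1.
uint64_t force_nonzero_sketch(uint64_t value,
                              const profile_summary_sketch &info)
{
  assert(info.cutoff >= 1);
  uint64_t floor = info.cutoff / 2;
  if (floor == 0)
    floor = 1;
  return value < floor ? floor : value;
}
```

With an assumed auto-profile scale factor of 1024, a zero entry count becomes 512, so a call count X scales the body by X/512 rather than by X/1.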
gcc/ChangeLog:
* auto-profile.cc
(autofdo_source_profile::read): Scale cutoff.
(read_autofdo_file): Initialize cutoff.
* coverage.cc (read_counts_file): Initialize cutoff to 1.
* gcov-io.h (struct gcov_summary): Add cutoff field.
* ipa-inline.cc (inline_small_functions): max_count can be non-zero
also with auto_profile.
* lto-cgraph.cc (output_profile_summary): Write cutoff
and sum_max.
(input_profile_summary): Read cutoff and sum_max.
(merge_profile_summaries): Initialize and scale global cutoffs
and sum_max.
* profile-count.cc: Include profile.h.
(profile_count::force_nonzero): Move here from ...; use cutoff.
* profile-count.h (profile_count::force_nonzero): ... here.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-prof/clone-merge-1.c: