author | Xiong Hu Luo <luoxhu@linux.ibm.com> | 2019-04-24 00:10:44 -0500 |
---|---|---|
committer | Xiong Hu Luo <luoxhu@linux.vnet.ibm.com> | 2020-01-13 19:10:46 -0600 |
commit | f1ba88b1b20cb579b3b7ce6ce65470205742be7e (patch) | |
tree | 98070d8a50651dab76f87ca3470e7b1fad571a44 /gcc/ipa-profile.c | |
parent | 64378144aabf65bf3df2313191250accc042170e (diff) | |
Missed function specialization + partial devirtualization
v8:
1. Rebase to master, merging Martin's static function (r280043) comments.
Bootstrap/testsuite/SPEC2017 tests pass on Power8-LE.
2. TODO:
2.1. C++ devirt for multiple speculative call targets.
2.2. Refine ipa-icf ipa_merge_profiles with a COMDAT inline testcase.
This patch aims to fix PR69678, which is caused by performance issues in
PGO indirect call profiling.
The bug that kept profiling data from ever working was fixed by Martin's
pull back of the topN patches; performance improved by ~1% GEOMEAN (+24% for
511.povray_r specifically).
Still, the default profile currently generates only a SINGLE indirect target,
one that is called more than 75% of the time. This patch makes use of
MULTIPLE indirect targets in the LTO-WPA and LTO-LTRANS stages; as a result,
function specialization, profiling, partial devirtualization, inlining and
cloning can all be done based on them.
Run time improves from 0.70 sec to 0.38 sec on simple tests.
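As a rough illustration of what partial devirtualization with multiple speculative targets amounts to, here is a minimal C++ sketch. It is not GCC code: the callees one, two and rare and both helper functions are hypothetical names, and the real transformation is performed by the compiler on its internal representation rather than written by hand.

```cpp
// Hypothetical callees standing in for the targets seen in the profile.
static int one (int x) { return x + 1; }
static int two (int x) { return x + 2; }
static int rare (int x) { return x * 10; }

typedef int (*fn_t) (int);

// Before: a plain indirect call; the callee is unknown at compile time.
static int
call_indirect (fn_t fp, int x)
{
  return fp (x);
}

// After (conceptually): each profiled target gets a guarded direct call,
// which can then be inlined or cloned; anything else falls back to the
// original indirect call.
static int
call_speculative (fn_t fp, int x)
{
  if (fp == one)
    return one (x);   // first speculative target
  if (fp == two)
    return two (x);   // second speculative target
  return fp (x);      // remaining indirect call
}
```

Both versions return the same value for every callee; only the guarded direct calls give the optimizer something it can inline.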
Details are:
1. PGO with topn is enabled by default now, but only one indirect
target edge is generated in the ipa-profile pass, so add variables to enable
multiple speculative edges through the passes: speculative_id records the
index of the direct edge bound to the indirect edge, and the length of
indirect_call_targets records how many direct edges the indirect edge owns.
Postpone gimple_ic to ipa-profile, as is the default, since the inline pass
decides whether transforming the indirect call is beneficial.
2. Use speculative_id to track and search for the reference node that matches
the direct edge's callee when there are multiple targets. It is the
caller's responsibility to handle the direct edges mapped to the same indirect
edge. speculative_call_info returns the one direct edge specified, so
most of the current IPA edge processing framework is reused.
3. Enable multiple indirect call target analysis in the LTO WPA/LTRANS stages,
for full profile support in the ipa passes and cgraph_edge functions.
speculative_id is set through make_speculative when multiple targets are
bound to one indirect edge, and is cloned when the edge is cloned.
speculative_id is streamed out and streamed in by lto, like lto_stmt_uid.
4. Create and duplicate the call summaries of all speculative direct edges
in ipa-fnsummary.c with auto_vec.
5. Add 1 in-module testcase and 2 cross-module testcases.
6. Bootstrap and regression tests passed on Power8-LE. No functional
or performance regressions on SPEC2017.
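The per-edge stream layout used by ipa_profile_write_edge_summary and ipa_profile_read_edge_summary can be pictured as a length followed by (target_id, target_probability) pairs. The sketch below models that layout with a std::vector instead of the LTO streamer. The names kTopN, target, write_edge_summary and read_edge_summary are hypothetical, and the value 4 for kTopN is an assumption standing in for GCOV_TOPN_VALUES.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed upper bound on targets per edge; stands in for GCOV_TOPN_VALUES.
constexpr std::size_t kTopN = 4;

// Hypothetical mirror of speculative_call_target.
struct target
{
  unsigned id;    // profile id of the callee, never 0 in the stream
  unsigned prob;  // probability scaled to a fixed base
};

// Append one edge summary: the length, then each (id, prob) pair.
void
write_edge_summary (std::vector<std::int64_t> &stream,
                    const std::vector<target> &targets)
{
  assert (targets.size () <= kTopN);
  stream.push_back (static_cast<std::int64_t> (targets.size ()));
  for (const target &t : targets)
    {
      assert (t.id != 0);
      stream.push_back (t.id);
      stream.push_back (t.prob);
    }
}

// Read one edge summary starting at POS and advance POS past it.
std::vector<target>
read_edge_summary (const std::vector<std::int64_t> &stream, std::size_t &pos)
{
  std::size_t len = static_cast<std::size_t> (stream.at (pos++));
  assert (len <= kTopN);
  std::vector<target> targets;
  for (std::size_t i = 0; i < len; i++)
    {
      // Read the two fields in separate statements so the order is explicit.
      unsigned id = static_cast<unsigned> (stream.at (pos++));
      unsigned prob = static_cast<unsigned> (stream.at (pos++));
      targets.push_back ({id, prob});
    }
  return targets;
}
```

An edge with no profiled targets still writes a length of 0, so the reader can walk every indirect edge of a function in order without extra markers.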
gcc/ChangeLog
2020-01-14 Xiong Hu Luo <luoxhu@linux.ibm.com>
PR ipa/69678
* cgraph.c (symbol_table::create_edge): Init speculative_id and
target_prob.
(cgraph_edge::make_speculative): Add param for setting speculative_id
and target_prob.
(cgraph_edge::speculative_call_info): Update comments and find reference
by speculative_id for multiple indirect targets.
(cgraph_edge::resolve_speculation): Decrease the speculations
for indirect edge, drop its speculative flag if no direct target is
left. Update comments.
(cgraph_edge::redirect_call_stmt_to_callee): Likewise.
(cgraph_node::dump): Print num_speculative_call_targets.
(cgraph_node::verify_node): Don't report error if speculative
edge does not include statement.
(cgraph_edge::num_speculative_call_targets_p): New function.
* cgraph.h (int common_target_id): Remove.
(int common_target_probability): Remove.
(num_speculative_call_targets): New variable.
(make_speculative): Add param for setting speculative_id.
(cgraph_edge::num_speculative_call_targets_p): New declaration.
(target_prob): New variable.
(speculative_id): New variable.
* ipa-fnsummary.c (analyze_function_body): Create and duplicate
call summaries for multiple speculative call targets.
* cgraphclones.c (cgraph_node::create_clone): Clone speculative_id.
* ipa-profile.c (struct speculative_call_target): New struct.
(class speculative_call_summary): New class.
(class speculative_call_summaries): New class.
(call_sums): New variable.
(ipa_profile_generate_summary): Generate indirect multiple targets summaries.
(ipa_profile_write_edge_summary): New function.
(ipa_profile_write_summary): Stream out indirect multiple targets summaries.
(ipa_profile_dump_all_summaries): New function.
(ipa_profile_read_edge_summary): New function.
(ipa_profile_read_summary_section): New function.
(ipa_profile_read_summary): Stream in indirect multiple targets summaries.
(ipa_profile): Generate num_speculative_call_targets from
profile summaries.
* ipa-ref.h (speculative_id): New variable.
* ipa-utils.c (ipa_merge_profiles): Update with target_prob.
* lto-cgraph.c (lto_output_edge): Remove indirect common_target_id and
common_target_probability. Stream out speculative_id and
num_speculative_call_targets.
(input_edge): Likewise.
* predict.c (dump_prediction): Remove edges count assert to be
precise.
* symtab.c (symtab_node::create_reference): Init speculative_id.
(symtab_node::clone_references): Clone speculative_id.
(symtab_node::clone_referring): Clone speculative_id.
(symtab_node::clone_reference): Clone speculative_id.
(symtab_node::clear_stmts_in_references): Clear speculative_id.
* tree-inline.c (copy_bb): Duplicate all the speculative edges
if indirect call contains multiple speculative targets.
* value-prof.h (check_ic_target): Remove.
* value-prof.c (gimple_value_profile_transformations):
Use void function gimple_ic_transform.
* value-prof.c (gimple_ic_transform): Handle topn case.
Fix comment typos. Change it to a void function.
gcc/testsuite/ChangeLog
2020-01-14 Xiong Hu Luo <luoxhu@linux.ibm.com>
PR ipa/69678
* gcc.dg/tree-prof/indir-call-prof-topn.c: New testcase.
* gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c: New testcase.
* gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c: New testcase.
* gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c: New testcase.
* lib/scandump.exp: Dump executable file name.
* lib/scanwpaipa.exp: New scan-pgo-wpa-ipa-dump.
Diffstat (limited to 'gcc/ipa-profile.c')
-rw-r--r-- | gcc/ipa-profile.c | 353 |
1 files changed, 323 insertions, 30 deletions
diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c
index 017f63e..fc231c9 100644
--- a/gcc/ipa-profile.c
+++ b/gcc/ipa-profile.c
@@ -159,7 +159,99 @@ dump_histogram (FILE *file, vec<histogram_entry *> histogram)
     }
 }
 
-/* Collect histogram from CFG profiles. */
+/* Structure containing speculative target information from profile. */
+
+struct speculative_call_target
+{
+  speculative_call_target (unsigned int id = 0, int prob = 0)
+    : target_id (id), target_probability (prob)
+  {
+  }
+
+  /* Profile_id of target obtained from profile. */
+  unsigned int target_id;
+  /* Probability that call will land in function with target_id. */
+  unsigned int target_probability;
+};
+
+class speculative_call_summary
+{
+public:
+  speculative_call_summary () : speculative_call_targets ()
+  {}
+
+  auto_vec<speculative_call_target> speculative_call_targets;
+
+  void dump (FILE *f);
+
+};
+
+  /* Class to manage call summaries. */
+
+class ipa_profile_call_summaries
+  : public call_summary<speculative_call_summary *>
+{
+public:
+  ipa_profile_call_summaries (symbol_table *table)
+    : call_summary<speculative_call_summary *> (table)
+  {}
+
+  /* Duplicate info when an edge is cloned. */
+  virtual void duplicate (cgraph_edge *, cgraph_edge *,
+                          speculative_call_summary *old_sum,
+                          speculative_call_summary *new_sum);
+};
+
+static ipa_profile_call_summaries *call_sums = NULL;
+
+/* Dump all information in speculative call summary to F. */
+
+void
+speculative_call_summary::dump (FILE *f)
+{
+  cgraph_node *n2;
+
+  unsigned spec_count = speculative_call_targets.length ();
+  for (unsigned i = 0; i < spec_count; i++)
+    {
+      speculative_call_target item = speculative_call_targets[i];
+      n2 = find_func_by_profile_id (item.target_id);
+      if (n2)
+        fprintf (f, " The %i speculative target is %s with prob %3.2f\n", i,
+                 n2->dump_name (),
+                 item.target_probability / (float) REG_BR_PROB_BASE);
+      else
+        fprintf (f, " The %i speculative target is %u with prob %3.2f\n", i,
+                 item.target_id,
+                 item.target_probability / (float) REG_BR_PROB_BASE);
+    }
+}
+
+/* Duplicate info when an edge is cloned. */
+
+void
+ipa_profile_call_summaries::duplicate (cgraph_edge *, cgraph_edge *,
+                                       speculative_call_summary *old_sum,
+                                       speculative_call_summary *new_sum)
+{
+  if (!old_sum)
+    return;
+
+  unsigned old_count = old_sum->speculative_call_targets.length ();
+  if (!old_count)
+    return;
+
+  new_sum->speculative_call_targets.reserve_exact (old_count);
+  new_sum->speculative_call_targets.quick_grow_cleared (old_count);
+
+  for (unsigned i = 0; i < old_count; i++)
+    {
+      new_sum->speculative_call_targets[i]
+        = old_sum->speculative_call_targets[i];
+    }
+}
+
+/* Collect histogram and speculative target summaries from CFG profiles. */
 
 static void
 ipa_profile_generate_summary (void)
@@ -169,7 +261,10 @@ ipa_profile_generate_summary (void)
   basic_block bb;
 
   hash_table<histogram_hash> hashtable (10);
-
+
+  gcc_checking_assert (!call_sums);
+  call_sums = new ipa_profile_call_summaries (symtab);
+
   FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
     if (ENTRY_BLOCK_PTR_FOR_FN (DECL_STRUCT_FUNCTION (node->decl))->count.ipa_p ())
       FOR_EACH_BB_FN (bb, DECL_STRUCT_FUNCTION (node->decl))
@@ -191,23 +286,35 @@ ipa_profile_generate_summary (void)
                   if (h)
                     {
                       gcov_type val, count, all;
-                      if (get_nth_most_common_value (NULL, "indirect call", h,
-                                                     &val, &count, &all))
+                      struct cgraph_edge *e = node->get_edge (stmt);
+                      if (e && !e->indirect_unknown_callee)
+                        continue;
+
+                      speculative_call_summary *csum
+                        = call_sums->get_create (e);
+
+                      for (unsigned j = 0; j < GCOV_TOPN_VALUES; j++)
                         {
-                          struct cgraph_edge * e = node->get_edge (stmt);
-                          if (e && !e->indirect_unknown_callee)
+                          if (!get_nth_most_common_value (NULL, "indirect call",
+                                                          h, &val, &count, &all,
+                                                          j))
+                            continue;
+
+                          if (val == 0)
                             continue;
 
-                          e->indirect_info->common_target_id = val;
-                          e->indirect_info->common_target_probability
-                            = GCOV_COMPUTE_SCALE (count, all);
-                          if (e->indirect_info->common_target_probability > REG_BR_PROB_BASE)
+                          speculative_call_target item (
+                            val, GCOV_COMPUTE_SCALE (count, all));
+                          if (item.target_probability > REG_BR_PROB_BASE)
                             {
                               if (dump_file)
-                                fprintf (dump_file, "Probability capped to 1\n");
-                              e->indirect_info->common_target_probability = REG_BR_PROB_BASE;
+                                fprintf (dump_file,
+                                         "Probability capped to 1\n");
+                              item.target_probability = REG_BR_PROB_BASE;
                             }
+                          csum->speculative_call_targets.safe_push (item);
                         }
+
                       gimple_remove_histogram_value (DECL_STRUCT_FUNCTION (node->decl),
                                                      stmt, h);
                     }
@@ -222,6 +329,33 @@ ipa_profile_generate_summary (void)
   histogram.qsort (cmp_counts);
 }
 
+/* Serialize the speculative summary info for LTO. */
+
+static void
+ipa_profile_write_edge_summary (lto_simple_output_block *ob,
+                                speculative_call_summary *csum)
+{
+  unsigned len = 0;
+
+  len = csum->speculative_call_targets.length ();
+
+  gcc_assert (len <= GCOV_TOPN_VALUES);
+
+  streamer_write_hwi_stream (ob->main_stream, len);
+
+  if (len)
+    {
+      unsigned spec_count = csum->speculative_call_targets.length ();
+      for (unsigned i = 0; i < spec_count; i++)
+        {
+          speculative_call_target item = csum->speculative_call_targets[i];
+          gcc_assert (item.target_id);
+          streamer_write_hwi_stream (ob->main_stream, item.target_id);
+          streamer_write_hwi_stream (ob->main_stream, item.target_probability);
+        }
+    }
+}
+
 /* Serialize the ipa info for lto. */
 
 static void
@@ -238,10 +372,122 @@ ipa_profile_write_summary (void)
       streamer_write_uhwi_stream (ob->main_stream, histogram[i]->time);
       streamer_write_uhwi_stream (ob->main_stream, histogram[i]->size);
     }
+
+  if (!call_sums)
+    return;
+
+  /* Serialize speculative targets information. */
+  unsigned int count = 0;
+  lto_symtab_encoder_t encoder = ob->decl_state->symtab_node_encoder;
+  lto_symtab_encoder_iterator lsei;
+  cgraph_node *node;
+
+  for (lsei = lsei_start_function_in_partition (encoder); !lsei_end_p (lsei);
+       lsei_next_function_in_partition (&lsei))
+    {
+      node = lsei_cgraph_node (lsei);
+      if (node->definition && node->has_gimple_body_p ()
+          && node->indirect_calls)
+        count++;
+    }
+
+  streamer_write_uhwi_stream (ob->main_stream, count);
+
+  /* Process all of the functions. */
+  for (lsei = lsei_start_function_in_partition (encoder);
+       !lsei_end_p (lsei) && count; lsei_next_function_in_partition (&lsei))
+    {
+      cgraph_node *node = lsei_cgraph_node (lsei);
+      if (node->definition && node->has_gimple_body_p ()
+          && node->indirect_calls)
+        {
+          int node_ref = lto_symtab_encoder_encode (encoder, node);
+          streamer_write_uhwi_stream (ob->main_stream, node_ref);
+
+          for (cgraph_edge *e = node->indirect_calls; e; e = e->next_callee)
+            {
+              speculative_call_summary *csum = call_sums->get_create (e);
+              ipa_profile_write_edge_summary (ob, csum);
+            }
+        }
+    }
+
   lto_destroy_simple_output_block (ob);
 }
 
-/* Deserialize the ipa info for lto. */
+/* Dump all profile summary data for all cgraph nodes and edges to file F. */
+
+static void
+ipa_profile_dump_all_summaries (FILE *f)
+{
+  fprintf (dump_file,
+           "\n========== IPA-profile speculative targets: ==========\n");
+  cgraph_node *node;
+  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
+    {
+      fprintf (f, "\nSummary for node %s:\n", node->dump_name ());
+      for (cgraph_edge *e = node->indirect_calls; e; e = e->next_callee)
+        {
+          fprintf (f, " Summary for %s of indirect edge %d:\n",
+                   e->caller->dump_name (), e->lto_stmt_uid);
+          speculative_call_summary *csum = call_sums->get_create (e);
+          csum->dump (f);
+        }
+    }
+  fprintf (f, "\n\n");
+}
+
+/* Read speculative targets information about edge for LTO WPA. */
+
+static void
+ipa_profile_read_edge_summary (class lto_input_block *ib, cgraph_edge *edge)
+{
+  unsigned i, len;
+
+  len = streamer_read_hwi (ib);
+  gcc_assert (len <= GCOV_TOPN_VALUES);
+
+  speculative_call_summary *csum = call_sums->get_create (edge);
+
+  for (i = 0; i < len; i++)
+    {
+      speculative_call_target item (streamer_read_hwi (ib),
+                                    streamer_read_hwi (ib));
+      csum->speculative_call_targets.safe_push (item);
+    }
+}
+
+/* Read profile speculative targets section information for LTO WPA. */
+
+static void
+ipa_profile_read_summary_section (struct lto_file_decl_data *file_data,
+                                  class lto_input_block *ib)
+{
+  if (!ib)
+    return;
+
+  lto_symtab_encoder_t encoder = file_data->symtab_node_encoder;
+
+  unsigned int count = streamer_read_uhwi (ib);
+
+  unsigned int i;
+  unsigned int index;
+  cgraph_node * node;
+
+  for (i = 0; i < count; i++)
+    {
+      index = streamer_read_uhwi (ib);
+      encoder = file_data->symtab_node_encoder;
+      node
+        = dyn_cast<cgraph_node *> (lto_symtab_encoder_deref (encoder, index));
+
+      for (cgraph_edge *e = node->indirect_calls; e; e = e->next_callee)
+        ipa_profile_read_edge_summary (ib, e);
+    }
+}
+
+/* Deserialize the IPA histogram and speculative targets summary info for LTO.
+   */
 
 static void
 ipa_profile_read_summary (void)
@@ -253,6 +499,9 @@ ipa_profile_read_summary (void)
 
   hash_table<histogram_hash> hashtable (10);
 
+  gcc_checking_assert (!call_sums);
+  call_sums = new ipa_profile_call_summaries (symtab);
+
   while ((file_data = file_data_vec[j++]))
     {
       const char *data;
@@ -273,6 +522,9 @@ ipa_profile_read_summary (void)
               account_time_size (&hashtable, histogram,
                                  count, time, size);
             }
+
+          ipa_profile_read_summary_section (file_data, ib);
+
           lto_destroy_simple_input_block (file_data,
                                           LTO_section_ipa_profile,
                                           ib, data, len);
@@ -512,6 +764,7 @@ ipa_profile (void)
   int nindirect = 0, ncommon = 0, nunknown = 0, nuseless = 0, nconverted = 0;
   int nmismatch = 0, nimpossible = 0;
   bool node_map_initialized = false;
+  gcov_type threshold;
 
   if (dump_file)
     dump_histogram (dump_file, histogram);
@@ -520,14 +773,12 @@ ipa_profile (void)
       overall_time += histogram[i]->count * histogram[i]->time;
       overall_size += histogram[i]->size;
     }
+  threshold = 0;
   if (overall_time)
     {
-      gcov_type threshold;
-
       gcc_assert (overall_size);
 
       cutoff = (overall_time * param_hot_bb_count_ws_permille + 500) / 1000;
-      threshold = 0;
       for (i = 0; cumulated < cutoff; i++)
         {
           cumulated += histogram[i]->count * histogram[i]->time;
@@ -563,10 +814,21 @@ ipa_profile (void)
   histogram.release ();
   histogram_pool.release ();
 
-  /* Produce speculative calls: we saved common target from porfiling into
-     e->common_target_id. Now, at link time, we can look up corresponding
+  /* Produce speculative calls: we saved common target from profiling into
+     e->target_id. Now, at link time, we can look up corresponding
      function node and produce speculative call. */
 
+  gcc_checking_assert (call_sums);
+
+  if (dump_file)
+    {
+      if (!node_map_initialized)
+        init_node_map (false);
+      node_map_initialized = true;
+
+      ipa_profile_dump_all_summaries (dump_file);
+    }
+
   FOR_EACH_DEFINED_FUNCTION (n)
     {
       bool update = false;
@@ -578,13 +840,35 @@ ipa_profile (void)
             {
               if (n->count.initialized_p ())
                 nindirect++;
-              if (e->indirect_info->common_target_id)
+
+              speculative_call_summary *csum = call_sums->get_create (e);
+              unsigned spec_count = csum->speculative_call_targets.length ();
+              if (spec_count)
                 {
                   if (!node_map_initialized)
-                    init_node_map (false);
+                    init_node_map (false);
                   node_map_initialized = true;
                   ncommon++;
-                  n2 = find_func_by_profile_id (e->indirect_info->common_target_id);
+
+                  if (in_lto_p)
+                    {
+                      if (dump_file)
+                        {
+                          fprintf (dump_file,
+                                   "Updating hotness threshold in LTO mode.\n");
+                          fprintf (dump_file, "Updated min count: %" PRId64 "\n",
+                                   (int64_t) threshold / spec_count);
+                        }
+                      set_hot_bb_threshold (threshold / spec_count);
+                    }
+
+                  unsigned speculative_id = 0;
+                  bool speculative_found = false;
+                  for (unsigned i = 0; i < spec_count; i++)
+                    {
+                    speculative_call_target item
+                      = csum->speculative_call_targets[i];
+                    n2 = find_func_by_profile_id (item.target_id);
                   if (n2)
                     {
                       if (dump_file)
@@ -593,11 +877,10 @@ ipa_profile (void)
                                " other module %s => %s, prob %3.2f\n",
                                n->dump_name (),
                                n2->dump_name (),
-                               e->indirect_info->common_target_probability
-                               / (float)REG_BR_PROB_BASE);
+                               item.target_probability
+                                 / (float) REG_BR_PROB_BASE);
                     }
-                  if (e->indirect_info->common_target_probability
-                      < REG_BR_PROB_BASE / 2)
+                  if (item.target_probability < REG_BR_PROB_BASE / 2)
                     {
                       nuseless++;
                       if (dump_file)
@@ -653,20 +936,26 @@ ipa_profile (void)
                           n2 = alias;
                         }
                       nconverted++;
-                      e->make_speculative
-                        (n2,
-                         e->count.apply_probability
-                           (e->indirect_info->common_target_probability));
+                      e->make_speculative (n2,
+                                           e->count.apply_probability (
+                                             item.target_probability),
+                                           speculative_id,
+                                           item.target_probability);
                       update = true;
+                      speculative_id++;
+                      speculative_found = true;
                     }
                 }
               else
                 {
                   if (dump_file)
                     fprintf (dump_file, "Function with profile-id %i not found.\n",
-                             e->indirect_info->common_target_id);
+                             item.target_id);
                   nunknown++;
                 }
+               }
+             if (speculative_found)
+               e->indirect_info->num_speculative_call_targets = speculative_id;
             }
         }
       if (update)
@@ -729,6 +1018,10 @@ ipa_profile (void)
         }
     }
   free (order);
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    symtab->dump (dump_file);
+
   return 0;
 }
 