path: root/libgcc
author: Martin Jambor <mjambor@suse.cz> 2021-10-27 14:49:01 +0200
committer: Martin Jambor <mjambor@suse.cz> 2021-10-27 14:49:56 +0200
commit: d1e2e4f9ce4df50564f1244dcea9befc3066faa8
tree: b29359c07351b1beba8dab42f6566b1da7de89e2 /libgcc
parent: b528e226d19335796c355d202c8e8686506680cd
ipa-cp: Fix updating of profile counts and self-gen value evaluation

IPA-CP does not do a reasonable job when it is updating profile counts after it has created clones of recursive functions. This patch addresses that by:

1. Only updating counts for special-context clones. When a clone is created for all contexts, the original is going to be dead and the cgraph machinery has copied counts to the new node, which is the right thing to do. Therefore updating counts has been moved from create_specialized_node to decide_about_value and decide_whether_version_node.

2. The current profile updating code artificially increased the assumed old count when the sum of counts of incoming edges to both the original and the new node was bigger than the count of the original node. This always happened when a self-recursive edge from the clone was also redirected to the clone, because both the original edge and its clone had the original high counts. This crutch was removed and replaced by the next point.

3. When cloning also redirects a self-recursive edge to the clone itself, new logic has been added to divide the counts brought by such recursive edges between the original node and the clone. This is impossible to do well without special knowledge about the function and which non-recursive entry calls are responsible for what portion of recursion depth, so the approach taken is rather crude. For local nodes, we detect the case when the original node is never called (in the training run at least) with another value and, if so, steal all its counts as if it were dead. If that is not the case, we try to divide the count brought by recursive edges (or rather not brought by direct edges) proportionally to the counts brought by non-recursive edges, but with artificial limits in place so that we do not take too many or too few, because that was happening with detrimental effect in mcf_r.

4. When cloning creates extra clones for values brought by a formerly self-recursive edge with an arithmetic pass-through jump function on it, such as it does in exchange2_r, all such clones are processed at once rather than one after another. The counts of all such nodes are distributed evenly (modulo even-formerly-non-recursive-edges) and the whole situation is then fixed up so that the edge counts fit. This is what the new function update_counts_for_self_gen_clones does.

5. Values brought by a formerly self-recursive edge with an arithmetic pass-through jump function on it are now evaluated by a heuristic which assumes that the vast majority of node counts are the result of recursive calls, so we simply divide those counts by the number of clones there would be if we created another one.

6. The mechanisms in init_caller_stats, gather_caller_stats and get_info_about_necessary_edges were enhanced to gather the data required for the above, and a missing check not to count dead incoming edges was also added.

gcc/ChangeLog:

2021-10-15  Martin Jambor  <mjambor@suse.cz>

	* ipa-cp.c (struct caller_statistics): New fields rec_count_sum, n_nonrec_calls and itself, document all fields.
	(init_caller_stats): Initialize the above new fields.
	(gather_caller_stats): Gather self-recursive counts and calls number.
	(get_info_about_necessary_edges): Gather counts of self-recursive and other edges bringing in the requested value separately.
	(dump_profile_updates): Rework to dump info about a single node only.
	(lenient_count_portion_handling): New function.
	(struct gather_other_count_struct): New type.
	(gather_count_of_non_rec_edges): New function.
	(struct desc_incoming_count_struct): New type.
	(analyze_clone_icoming_counts): New function.
	(adjust_clone_incoming_counts): Likewise.
	(update_counts_for_self_gen_clones): Likewise.
	(update_profiling_info): Rewritten.
	(update_specialized_profile): Adjust call to dump_profile_updates.
	(create_specialized_node): Do not update profiling info.
	(decide_about_value): New parameter self_gen_clones, either push new clones into it or update their profile counts. For self-recursively generated values, use a portion of the node count instead of count from self-recursive edges to estimate goodness.
	(decide_whether_version_node): Gather clones for self-generated values in a new vector, update their profiles at once at the end.
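
The arithmetic described in points 3 and 5 of the message can be illustrated with a minimal sketch. This is not the code from ipa-cp.c: plain 64-bit integers stand in for GCC's profile_count, the names rec_split, split_recursive_count and self_gen_value_estimate are hypothetical, and the percentage limits are assumed parameters, since the message does not state which "artificial limits" the patch actually uses. The check that the node is local is also left out.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: plain integers stand in for GCC's profile_count,
   and none of these names exist in ipa-cp.c.  */

typedef struct
{
  uint64_t to_orig;   /* recursive count left with the original node */
  uint64_t to_clone;  /* recursive count moved to the clone */
} rec_split;

/* Point 3: divide REC_COUNT, the count brought by recursive edges,
   between the original node and its clone.  ORIG_NONREC and CLONE_NONREC
   are the counts brought by non-recursive edges.  MIN_PCT and MAX_PCT
   are assumed clamping limits in percent.  Assumes the intermediate
   products fit in 64 bits.  */
static rec_split
split_recursive_count (uint64_t rec_count, uint64_t orig_nonrec,
                       uint64_t clone_nonrec, unsigned min_pct,
                       unsigned max_pct)
{
  assert (min_pct <= max_pct && max_pct <= 100);

  rec_split r;
  if (orig_nonrec == 0)
    {
      /* The original was never entered with another value in the
         training run, so the clone takes everything.  */
      r.to_clone = rec_count;
      r.to_orig = 0;
      return r;
    }

  /* Proportional share, then clamped so that the clone takes neither
     too much nor too little.  */
  uint64_t share = rec_count * clone_nonrec / (orig_nonrec + clone_nonrec);
  uint64_t lo = rec_count * min_pct / 100;
  uint64_t hi = rec_count * max_pct / 100;
  if (share < lo)
    share = lo;
  if (share > hi)
    share = hi;

  r.to_clone = share;
  r.to_orig = rec_count - share;
  return r;
}

/* Point 5: estimate the count attributable to one self-generated value
   by dividing the node count by the number of clones there would be
   after creating one more.  */
static uint64_t
self_gen_value_estimate (uint64_t node_count, unsigned existing_clones)
{
  return node_count / ((uint64_t) existing_clones + 1);
}
```

For example, with a recursive count of 1000, non-recursive counts of 1 for the original and 999 for the clone, and assumed limits of 10% and 90%, the proportional share of 999 is clamped to 900, so the original keeps 100.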
Diffstat (limited to 'libgcc')
0 files changed, 0 insertions, 0 deletions