This patch follows Martin's suggestion here[1] to support
range-based for loops for iterating over loops, analogously to the
patch for vec[2].
For example, use the range-based for loop below
  for (auto loop : loops_list (cfun, 0))
to replace the previous macro FOR_EACH_LOOP
  FOR_EACH_LOOP (loop, 0)
[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573424.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572315.html
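The idea can be sketched outside of GCC as follows. This is only a minimal
illustration of how a container with begin ()/end () enables a range-based
for loop; toy_loop, toy_loops_list and count_visited_loops are hypothetical
names, not the classes added to cfgloop.h, and the skip_root flag is an
assumed simplification of the real flags argument.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for GCC's loop structure; only a number is kept.
struct toy_loop { int num; };

// Hypothetical, simplified analogue of loops_list: it collects the loops to
// visit up front and exposes begin ()/end () so range-based for works.
class toy_loops_list
{
public:
  // skip_root (an assumption of this sketch) omits the loop numbered 0.
  toy_loops_list (std::vector<toy_loop> &all, bool skip_root)
  {
    for (toy_loop &l : all)
      if (!(skip_root && l.num == 0))
        m_to_visit.push_back (&l);
  }

  std::vector<toy_loop *>::const_iterator begin () const
  { return m_to_visit.begin (); }
  std::vector<toy_loop *>::const_iterator end () const
  { return m_to_visit.end (); }

private:
  std::vector<toy_loop *> m_to_visit;
};

// Count the loops visited by a range-based for over a temporary list,
// mirroring the shape of "for (auto loop : loops_list (cfun, 0))".
int count_visited_loops (std::vector<toy_loop> &all, bool skip_root)
{
  int count = 0;
  for (auto loop : toy_loops_list (all, skip_root))
    if (loop != nullptr)
      ++count;
  return count;
}
```

The temporary list lives for the whole loop because a range-based for binds
its range expression to a reference before iterating.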
gcc/ChangeLog:
* cfgloop.h (as_const): New function.
(class loop_iterator): Rename to ...
(class loops_list): ... this.
(loop_iterator::next): Rename to ...
(loops_list::Iter::fill_curr_loop): ... this and adjust.
(loop_iterator::loop_iterator): Rename to ...
(loops_list::loops_list): ... this and adjust.
(loops_list::Iter): New class.
(loops_list::iterator): New type.
(loops_list::const_iterator): New type.
(loops_list::begin): New function.
(loops_list::end): Likewise.
(loops_list::begin const): Likewise.
(loops_list::end const): Likewise.
(FOR_EACH_LOOP): Remove.
(FOR_EACH_LOOP_FN): Remove.
* cfgloop.c (flow_loops_dump): Replace FOR_EACH_LOOP* with a range-based
for loop over a loops_list instance.
(sort_sibling_loops): Likewise.
(disambiguate_loops_with_multiple_latches): Likewise.
(verify_loop_structure): Likewise.
* cfgloopmanip.c (create_preheaders): Likewise.
(force_single_succ_latches): Likewise.
* config/aarch64/falkor-tag-collision-avoidance.c
(execute_tag_collision_avoidance): Likewise.
* config/mn10300/mn10300.c (mn10300_scan_for_setlb_lcc): Likewise.
* config/s390/s390.c (s390_adjust_loops): Likewise.
* doc/loop.texi: Likewise.
* gimple-loop-interchange.cc (pass_linterchange::execute): Likewise.
* gimple-loop-jam.c (tree_loop_unroll_and_jam): Likewise.
* gimple-loop-versioning.cc (loop_versioning::analyze_blocks): Likewise.
(loop_versioning::make_versioning_decisions): Likewise.
* gimple-ssa-split-paths.c (split_paths): Likewise.
* graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl): Likewise.
* graphite.c (canonicalize_loop_form): Likewise.
(graphite_transform_loops): Likewise.
* ipa-fnsummary.c (analyze_function_body): Likewise.
* ipa-pure-const.c (analyze_function): Likewise.
* loop-doloop.c (doloop_optimize_loops): Likewise.
* loop-init.c (loop_optimizer_finalize): Likewise.
(fix_loop_structure): Likewise.
* loop-invariant.c (calculate_loop_reg_pressure): Likewise.
(move_loop_invariants): Likewise.
* loop-unroll.c (decide_unrolling): Likewise.
(unroll_loops): Likewise.
* modulo-sched.c (sms_schedule): Likewise.
* predict.c (predict_loops): Likewise.
(pass_profile::execute): Likewise.
* profile.c (branch_prob): Likewise.
* sel-sched-ir.c (sel_finish_pipelining): Likewise.
(sel_find_rgns): Likewise.
* tree-cfg.c (replace_loop_annotate): Likewise.
(replace_uses_by): Likewise.
(move_sese_region_to_fn): Likewise.
* tree-if-conv.c (pass_if_conversion::execute): Likewise.
* tree-loop-distribution.c (loop_distribution::execute): Likewise.
* tree-parloops.c (parallelize_loops): Likewise.
* tree-predcom.c (tree_predictive_commoning): Likewise.
* tree-scalar-evolution.c (scev_initialize): Likewise.
(scev_reset): Likewise.
* tree-ssa-dce.c (find_obviously_necessary_stmts): Likewise.
* tree-ssa-live.c (remove_unused_locals): Likewise.
* tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise.
* tree-ssa-loop-im.c (analyze_memory_references): Likewise.
(tree_ssa_lim_initialize): Likewise.
* tree-ssa-loop-ivcanon.c (canonicalize_induction_variables): Likewise.
* tree-ssa-loop-ivopts.c (tree_ssa_iv_optimize): Likewise.
* tree-ssa-loop-manip.c (get_loops_exits): Likewise.
* tree-ssa-loop-niter.c (estimate_numbers_of_iterations): Likewise.
(free_numbers_of_iterations_estimates): Likewise.
* tree-ssa-loop-prefetch.c (tree_ssa_prefetch_arrays): Likewise.
* tree-ssa-loop-split.c (tree_ssa_split_loops): Likewise.
* tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Likewise.
* tree-ssa-loop.c (gate_oacc_kernels): Likewise.
(pass_scev_cprop::execute): Likewise.
* tree-ssa-propagate.c (clean_up_loop_closed_phi): Likewise.
* tree-ssa-sccvn.c (do_rpo_vn): Likewise.
* tree-ssa-threadupdate.c
(jump_thread_path_registry::thread_through_all_blocks): Likewise.
* tree-vectorizer.c (vectorize_loops): Likewise.
* tree-vrp.c (vrp_asserts::find_assert_locations): Likewise.
|
|
gcc/c-family/ChangeLog:
* c-common.c (c_build_shufflevector): Adjust by-value argument to
by-const-reference.
* c-common.h (c_build_shufflevector): Same.
gcc/c/ChangeLog:
* c-tree.h (c_build_function_call_vec): Adjust by-value argument to
by-const-reference.
* c-typeck.c (c_build_function_call_vec): Same.
gcc/ChangeLog:
* cfgloop.h (single_likely_exit): Adjust by-value argument to
by-const-reference.
* cfgloopanal.c (single_likely_exit): Same.
* cgraph.h (struct cgraph_node): Same.
* cgraphclones.c (cgraph_node::create_virtual_clone): Same.
* genautomata.c (merge_states): Same.
* genextract.c (VEC_char_to_string): Same.
* genmatch.c (dt_node::gen_kids_1): Same.
(walk_captures): Adjust by-value argument to by-reference.
* gimple-ssa-store-merging.c (check_no_overlap): Adjust by-value argument
to by-const-reference.
* gimple.c (gimple_build_call_vec): Same.
(gimple_build_call_internal_vec): Same.
(gimple_build_switch): Same.
(sort_case_labels): Same.
(preprocess_case_label_vec_for_gimple): Adjust by-value argument to
by-reference.
* gimple.h (gimple_build_call_vec): Adjust by-value argument to
by-const-reference.
(gimple_build_call_internal_vec): Same.
(gimple_build_switch): Same.
(sort_case_labels): Same.
(preprocess_case_label_vec_for_gimple): Adjust by-value argument to
by-reference.
* haifa-sched.c (calc_priorities): Adjust by-value argument to
by-const-reference.
(sched_init_luids): Same.
(haifa_init_h_i_d): Same.
* ipa-cp.c (ipa_get_indirect_edge_target_1): Same.
(adjust_callers_for_value_intersection): Adjust by-value argument to
by-reference.
(find_more_scalar_values_for_callers_subset): Adjust by-value argument to
by-const-reference.
(find_more_contexts_for_caller_subset): Same.
(find_aggregate_values_for_callers_subset): Same.
(copy_useful_known_contexts): Same.
* ipa-fnsummary.c (remap_edge_summaries): Same.
(remap_freqcounting_predicate): Same.
* ipa-inline.c (add_new_edges_to_heap): Adjust by-value argument to
by-reference.
* ipa-predicate.c (predicate::remap_after_inlining): Adjust by-value argument
to by-const-reference.
* ipa-predicate.h (predicate::remap_after_inlining): Same.
* ipa-prop.c (ipa_find_agg_cst_for_param): Same.
* ipa-prop.h (ipa_find_agg_cst_for_param): Same.
* ira-build.c (ira_loop_tree_body_rev_postorder): Same.
* read-rtl.c (add_overload_instance): Same.
* rtl.h (native_decode_rtx): Same.
(native_decode_vector_rtx): Same.
* sched-int.h (sched_init_luids): Same.
(haifa_init_h_i_d): Same.
* simplify-rtx.c (native_decode_vector_rtx): Same.
(native_decode_rtx): Same.
* tree-call-cdce.c (gen_shrink_wrap_conditions): Same.
(shrink_wrap_one_built_in_call_with_conds): Same.
(shrink_wrap_conditional_dead_built_in_calls): Same.
* tree-data-ref.c (create_runtime_alias_checks): Same.
(compute_all_dependences): Same.
* tree-data-ref.h (compute_all_dependences): Same.
(create_runtime_alias_checks): Same.
(index_in_loop_nest): Same.
* tree-if-conv.c (mask_exists): Same.
* tree-loop-distribution.c (class loop_distribution): Same.
(loop_distribution::create_rdg_vertices): Same.
(dump_rdg_partitions): Same.
(debug_rdg_partitions): Same.
(partition_contains_all_rw): Same.
(loop_distribution::distribute_loop): Same.
* tree-parloops.c (oacc_entry_exit_ok_1): Same.
(oacc_entry_exit_single_gang): Same.
* tree-ssa-loop-im.c (hoist_memory_references): Same.
(loop_suitable_for_sm): Same.
* tree-ssa-loop-niter.c (bound_index): Same.
* tree-ssa-reassoc.c (update_ops): Same.
(swap_ops_for_binary_stmt): Same.
(rewrite_expr_tree): Same.
(rewrite_expr_tree_parallel): Same.
* tree-ssa-sccvn.c (ao_ref_init_from_vn_reference): Same.
* tree-ssa-sccvn.h (ao_ref_init_from_vn_reference): Same.
* tree-ssa-structalias.c (process_all_all_constraints): Same.
(make_constraints_to): Same.
(handle_lhs_call): Same.
(find_func_aliases_for_builtin_call): Same.
(sort_fieldstack): Same.
(check_for_overlaps): Same.
* tree-vect-loop-manip.c (vect_create_cond_for_align_checks): Same.
(vect_create_cond_for_unequal_addrs): Same.
(vect_create_cond_for_lower_bounds): Same.
(vect_create_cond_for_alias_checks): Same.
* tree-vect-slp-patterns.c (vect_validate_multiplication): Same.
* tree-vect-slp.c (vect_analyze_slp_instance): Same.
(vect_make_slp_decision): Same.
(vect_slp_bbs): Same.
(duplicate_and_interleave): Same.
(vect_transform_slp_perm_load): Same.
(vect_schedule_slp): Same.
* tree-vectorizer.h (vect_transform_slp_perm_load): Same.
(vect_schedule_slp): Same.
(duplicate_and_interleave): Same.
* tree.c (build_vector_from_ctor): Same.
(build_vector): Same.
(check_vector_cst): Same.
(check_vector_cst_duplicate): Same.
(check_vector_cst_fill): Same.
(check_vector_cst_stepped): Same.
* tree.h (build_vector_from_ctor): Same.
|
|
This patch converts the remaining users of get_range_info and
get_ptr_nonnull to the get_range_query API.
No effort was made to move passes away from VR_ANTI_RANGE, or any other
use of deprecated methods. This was a straight up conversion to the new
API, nothing else.
gcc/ChangeLog:
* builtins.c (check_nul_terminated_array): Convert to get_range_query.
(expand_builtin_strnlen): Same.
(determine_block_size): Same.
* fold-const.c (expr_not_equal_to): Same.
* gimple-fold.c (size_must_be_zero_p): Same.
* gimple-match-head.c: Include gimple-range.h.
* gimple-pretty-print.c (dump_ssaname_info): Convert to get_range_query.
* gimple-ssa-warn-restrict.c
(builtin_memref::extend_offset_range): Same.
* graphite-sese-to-poly.c (add_param_constraints): Same.
* internal-fn.c (get_min_precision): Same.
* ipa-fnsummary.c (set_switch_stmt_execution_predicate): Same.
* ipa-prop.c (ipa_compute_jump_functions_for_edge): Same.
* match.pd: Same.
* tree-data-ref.c (split_constant_offset): Same.
(dr_step_indicator): Same.
* tree-dfa.c (get_ref_base_and_extent): Same.
* tree-scalar-evolution.c (iv_can_overflow_p): Same.
* tree-ssa-loop-niter.c (refine_value_range_using_guard): Same.
(determine_value_range): Same.
(record_nonwrapping_iv): Same.
(infer_loop_bounds_from_signedness): Same.
(scev_var_range_cant_overflow): Same.
* tree-ssa-phiopt.c (two_value_replacement): Same.
* tree-ssa-pre.c (insert_into_preds_of_block): Same.
* tree-ssa-reassoc.c (optimize_range_tests_to_bit_test): Same.
* tree-ssa-strlen.c (handle_builtin_stxncpy_strncat): Same.
(get_range): Same.
(dump_strlen_info): Same.
(set_strlen_range): Same.
(maybe_diag_stxncpy_trunc): Same.
(get_len_or_size): Same.
(handle_integral_assign): Same.
* tree-ssa-structalias.c (find_what_p_points_to): Same.
* tree-ssa-uninit.c (find_var_cmp_const): Same.
* tree-switch-conversion.c (bit_test_cluster::emit): Same.
* tree-vect-patterns.c (vect_get_range_info): Same.
(vect_recog_divmod_pattern): Same.
* tree-vrp.c (intersect_range_with_nonzero_bits): Same.
(register_edge_assert_for_2): Same.
(determine_value_range_1): Same.
* tree.c (get_range_pos_neg): Same.
* vr-values.c (vr_values::get_lattice_entry): Same.
(vr_values::update_value_range): Same.
(simplify_conversion_using_ranges): Same.
|
|
The node and edge summaries defined in ipa-prop.h are probably the
oldest in GCC and so it happened that they are the only ones using
macros to look them up and create them. With Honza and Martin we
agreed it is ugly and the macros should be removed and the ipa-prop
summaries should be accessed like all the other ones but somehow I
never got to it until now.
The patch is mostly mechanical. Because the lookup machinery was much
simpler in the old times (something like the fast summaries we have
today), a lot of code queried for the summary multiple times for no
good reason, and I fixed that in places where it was easy.
Also, before we switched to hash based summaries, new summary pointers
had to be obtained whenever the underlying array could be reallocated
because of new cgraph nodes/edges. This is no longer necessary and so
I removed the instances which I found.
Both kinds of these non-mechanical changes should be specifically called
out in the ChangeLog.
I also removed the IS_VALID_JUMP_FUNC_INDEX macro because it is not used
anywhere.
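A minimal sketch of the access pattern described above, assuming a
hash-based summary with get and get_create members. The toy_* types and
param_count_or_minus_one are hypothetical stand-ins written for this
illustration; they are not the ipa-prop.h declarations.

```cpp
#include <cassert>
#include <unordered_map>

struct toy_node { int uid; };                // stand-in for a call graph node
struct toy_params { int param_count = 0; };  // stand-in for a node summary

// Hypothetical, simplified hash-based summary keyed by node pointer.
class toy_summary
{
public:
  // Return the existing summary or nullptr; never allocates.
  toy_params *get (const toy_node *n)
  {
    auto it = m_map.find (n);
    return it == m_map.end () ? nullptr : &it->second;
  }

  // Return the existing summary, creating an empty one if needed.
  toy_params *get_create (const toy_node *n) { return &m_map[n]; }

private:
  // std::unordered_map keeps element addresses stable across insertions,
  // so a pointer obtained earlier stays valid when new nodes are added.
  std::unordered_map<const toy_node *, toy_params> m_map;
};

// Query the summary once and reuse the result instead of looking it up
// again for every field access.
int param_count_or_minus_one (toy_summary &sum, const toy_node *n)
{
  toy_params *info = sum.get (n);
  if (!info)
    return -1;
  return info->param_count;
}
```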
gcc/ChangeLog:
2021-05-07 Martin Jambor <mjambor@suse.cz>
* ipa-prop.h (IPA_NODE_REF): Removed.
(IPA_NODE_REF_GET_CREATE): Likewise.
(IPA_EDGE_REF): Likewise.
(IPA_EDGE_REF_GET_CREATE): Likewise.
(IS_VALID_JUMP_FUNC_INDEX): Likewise.
* ipa-cp.c (print_all_lattices): Replaced IPA_NODE_REF with a direct
use of ipa_node_params_sum.
(ipcp_versionable_function_p): Likewise.
(push_node_to_stack): Likewise.
(pop_node_from_stack): Likewise.
(set_single_call_flag): Replaced two IPA_NODE_REF with one single
direct use of ipa_node_params_sum.
(initialize_node_lattices): Replaced IPA_NODE_REF with a direct use of
ipa_node_params_sum.
(ipa_context_from_jfunc): Replaced IPA_EDGE_REF with a direct use of
ipa_edge_args_sum.
(ipcp_verify_propagated_values): Replaced IPA_NODE_REF with a direct
use of ipa_node_params_sum.
(self_recursively_generated_p): Likewise.
(propagate_scalar_across_jump_function): Likewise.
(propagate_context_across_jump_function): Replaced IPA_EDGE_REF with a
direct use of ipa_edge_args_sum, moved the lookup after the early
exit. Replaced IPA_NODE_REF with a direct use of ipa_node_params_sum.
(propagate_bits_across_jump_function): Replaced IPA_NODE_REF with
direct uses of ipa_node_params_sum.
(propagate_vr_across_jump_function): Likewise.
(propagate_aggregate_lattice): Likewise.
(propagate_aggs_across_jump_function): Likewise.
(propagate_constants_across_call): Likewise, also replaced
IPA_EDGE_REF with a direct use of ipa_edge_args_sum.
(good_cloning_opportunity_p): Replaced IPA_NODE_REF with a direct use
of ipa_node_params_sum.
(estimate_local_effects): Likewise.
(add_all_node_vals_to_toposort): Likewise.
(propagate_constants_topo): Likewise.
(ipcp_propagate_stage): Likewise.
(ipcp_discover_new_direct_edges): Likewise.
(calls_same_node_or_its_all_contexts_clone_p): Likewise.
(cgraph_edge_brings_value_p): Likewise (in both overloaded functions).
(get_info_about_necessary_edges): Likewise.
(want_remove_some_param_p): Likewise.
(create_specialized_node): Likewise.
(self_recursive_pass_through_p): Likewise.
(self_recursive_agg_pass_through_p): Likewise.
(find_more_scalar_values_for_callers_subset): Likewise and also
replaced IPA_EDGE_REF with direct uses of ipa_edge_args_sum, in one
case replacing two of those with a single query.
(find_more_contexts_for_caller_subset): Likewise for the
ipa_polymorphic_call_context overload.
(intersect_aggregates_with_edge): Replaced IPA_EDGE_REF with a direct
use of ipa_edge_args_sum. Replaced IPA_NODE_REF with direct uses of
ipa_node_params_sum.
(find_aggregate_values_for_callers_subset): Likewise, also reusing
results of ipa_edge_args_sum->get.
(cgraph_edge_brings_all_scalars_for_node): Replaced IPA_NODE_REF with
direct uses of ipa_node_params_sum, replaced IPA_EDGE_REF with a
direct use of ipa_edge_args_sum.
(cgraph_edge_brings_all_agg_vals_for_node): Likewise, moved node
summary query after the early exit and reused the result later.
(decide_about_value): Replaced IPA_NODE_REF with a direct use of
ipa_node_params_sum.
(decide_whether_version_node): Likewise. Removed re-querying for
summaries after cloning.
(spread_undeadness): Replaced IPA_NODE_REF with a direct use of
ipa_node_params_sum.
(has_undead_caller_from_outside_scc_p): Likewise, reusing results of
some queries.
(identify_dead_nodes): Likewise.
(ipcp_store_bits_results): Replaced IPA_NODE_REF with direct uses of
ipa_node_params_sum.
(ipcp_store_vr_results): Likewise.
* ipa-fnsummary.c (evaluate_properties_for_edge): Likewise.
(ipa_fn_summary_t::duplicate): Likewise.
(analyze_function_body): Likewise.
(estimate_calls_size_and_time): Likewise.
(ipa_cached_call_context::duplicate_from): Likewise.
(ipa_call_context::equal_to): Likewise.
(remap_edge_params): Likewise.
(ipa_merge_fn_summary_after_inlining): Likewise.
(inline_read_section): Likewise.
* ipa-icf.c (sem_function::param_used_p): Likewise.
* ipa-modref.c (compute_parm_map): Likewise.
(compute_parm_map): Replaced IPA_EDGE_REF with a direct use of
ipa_edge_args_sum.
(get_access_for_fnspec): Replaced IPA_NODE_REF with a direct use of
ipa_node_params_sum and replaced IPA_EDGE_REF with a direct use of
ipa_edge_args_sum.
* ipa-profile.c (check_argument_count): Likewise.
* ipa-prop.c (ipa_alloc_node_params): Replaced IPA_NODE_REF_GET_CREATE
with a direct use of ipa_node_params_sum.
(ipa_initialize_node_params): Likewise.
(ipa_print_node_jump_functions_for_edge): Replaced IPA_EDGE_REF with a
direct use of ipa_edge_args_sum and reused the query result.
(ipa_compute_jump_functions_for_edge): Replaced IPA_NODE_REF with a
direct use of ipa_node_params_sum and replaced IPA_EDGE_REF with a
direct use of ipa_edge_args_sum.
(ipa_note_param_call): Replaced IPA_NODE_REF with a direct use of
ipa_node_params_sum and reused the result of the query.
(ipa_analyze_node): Likewise.
(ipa_analyze_controlled_uses): Replaced IPA_NODE_REF with a direct use
of ipa_node_params_sum.
(update_jump_functions_after_inlining): Replaced IPA_EDGE_REF with
direct uses of ipa_edge_args_sum.
(update_indirect_edges_after_inlining): Replaced IPA_NODE_REF with
direct uses of ipa_node_params_sum and replaced IPA_EDGE_REF with a
direct use of ipa_edge_args_sum. Removed superfluous re-querying of the
top edge summary.
(propagate_controlled_uses): Replaced IPA_NODE_REF with direct uses of
ipa_node_params_sum and replaced IPA_EDGE_REF with a direct use of
ipa_edge_args_sum.
(ipa_propagate_indirect_call_infos): Replaced IPA_EDGE_REF with a
direct use of ipa_edge_args_sum.
(ipa_edge_args_sum_t::duplicate): Replaced IPA_NODE_REF with a direct
use of ipa_node_params_sum.
(ipa_print_node_params): Likewise.
(ipa_write_node_info): Likewise and also replaced IPA_EDGE_REF with
direct uses of ipa_edge_args_sum.
(ipa_read_edge_info): Replaced IPA_EDGE_REF with a direct use of
ipa_edge_args_sum.
(ipa_read_node_info): Replaced IPA_NODE_REF with a direct use of
ipa_node_params_sum.
(ipa_prop_write_jump_functions): Likewise. Move variable node to the
scopes where it is used.
|
|
PR ipa/98338
* ipa-fnsummary.c (compute_fn_summary): Fix sanity check.
|
|
The following instructs IPA not to inline calls with VLA parameters
and adjusts inlining not to create invalid view-converted VLA
parameters on mismatch and makes the error_mark paths with debug
stmts actually work.
The first part avoids the ICEs with the testcases already.
2021-02-18 Richard Biener <rguenther@suse.de>
PR middle-end/99122
* ipa-fnsummary.c (analyze_function_body): Set
CIF_FUNCTION_NOT_INLINABLE for VLA parameter calls.
* tree-inline.c (insert_init_debug_bind): Pass NULL for
error_mark_node values.
(force_value_to_type): Do not build V_C_Es for WITH_SIZE_EXPR
values.
(setup_one_parameter): Delay force_value_to_type until when
it's needed.
* gcc.dg/pr99122-1.c: New testcase.
* gcc.dg/pr99122-2.c: Likewise.
|
|
The walk_aliased_vdefs calls do not update the walking budget until
it is hit by a single call (and then in one case it resumes with
no limit at all). The following rectifies this in multiple places.
It also makes the updates more consistent and fixes
determine_known_aggregate_parts to account for its own alias queries.
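The accounting scheme can be sketched as follows. This is a hedged
illustration only: toy_walk and walk_with_budget are hypothetical helpers,
and the convention that the walker returns the number of steps taken, or a
negative value when the limit is hit, is an assumption of this sketch.

```cpp
#include <cassert>

// Hypothetical walker: pretends to visit `needed` definitions, but gives up
// and returns -1 when that would take more than `limit` steps.
int toy_walk (int needed, int limit)
{
  return needed > limit ? -1 : needed;
}

// Caller-side accounting: every query is charged against one shared budget,
// and no walk is started once the budget is used up.
bool walk_with_budget (int needed, int &budget)
{
  if (budget <= 0)
    return false;          // budget exhausted: do not walk at all
  int walked = toy_walk (needed, budget);
  if (walked < 0)
    {
      budget = 0;          // limit hit: later queries must not resume
      return false;
    }
  budget -= walked;        // charge this query against the shared budget
  return true;
}
```

Without the per-call update, each query would see the full budget again, so
many individually cheap walks could add up to an unbounded total.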
2021-02-12 Richard Biener <rguenther@suse.de>
PR middle-end/38474
* ipa-fnsummary.c (unmodified_parm_1): Only walk when
fbi->aa_walk_budget is bigger than zero. Update
fbi->aa_walk_budget.
(param_change_prob): Likewise.
* ipa-prop.c (detect_type_change_from_memory_writes):
Properly account walk_aliased_vdefs.
(parm_preserved_before_stmt_p): Canonicalize updates.
(parm_ref_data_preserved_p): Likewise.
(parm_ref_data_pass_through_p): Likewise.
(determine_known_aggregate_parts): Account own alias queries.
|
|
gcc/:
* attr-fnspec.h (attr_fnspec::get_str): New accessor
* ipa-fnsummary.c (read_ipa_call_summary): Store also parm info
for builtins.
* ipa-modref.c (class fnspec_summary): New type.
(class fnspec_summaries_t): New type.
(modref_summary::modref_summary): Initialize writes_errno.
(struct modref_summary_lto): Add writes_errno.
(modref_summary_lto::modref_summary_lto): Initialize writes_errno.
(modref_summary::dump): Check for NULL pointers.
(modref_summary_lto::dump): Dump writes_errno.
(collapse_loads): Move up in source file.
(collapse_stores): New function.
(process_fnspec): Handle also internal calls.
(analyze_call): Likewise.
(analyze_stmt): Store fnspec string if needed.
(analyze_function): Initialize fnspec_summaries.
(modref_summaries_lto::duplicate): Copy writes_errno.
(modref_write): Store writes_errno and fnspec summaries.
(read_section): Read writes_errno and fnspec summaries.
(modref_read): Initialize fnspec summaries.
(update_signature): Fix formatting.
(compute_parm_map): Return true if successful.
(get_parm_type): New function.
(get_access_for_fnspec): New function.
(propagate_unknown_call): New function.
(modref_propagate_in_scc): Use it.
(pass_ipa_modref::execute): Delete fnspec_summaries.
(ipa_modref_c_finalize): Delete fnspec_summaries.
* ipa-prop.c: Include attr-fnspec.h.
(ipa_compute_jump_functions_for_bb): Also compute jump functions
for functions with fnspecs.
(ipa_read_edge_info): Read jump functions for builtins.
gcc/testsuite/ChangeLog:
* gcc.dg/ipa/modref-2.c: New test.
* gcc.dg/lto/modref-2_0.c: New test.
|
|
This patch moves the size/time tables out of GGC-allocated memory. This makes
the sources a bit cleaner and saves about 60MB of GGC memory, which turns into
about 45MB of heap memory for a cc1plus LTO build.
* ipa-fnsummary.h (class size_time_entry): Do not GTY annotate.
(class ipa_fnsummary): Turn size_time_table to auto_vec and
call_size_time_table to efficient vec; update constructors.
* ipa-fnsummary.c (ipa_fn_summary::account_size_time): Update.
(ipa_fn_summary::~ipa_fn_summary): Update.
(ipa_fn_summary_t::duplicate): Update.
(ipa_dump_fn_summary): Update.
(set_switch_stmt_execution_predicate): Update.
(analyze_function_body): Update.
(estimate_calls_size_and_time): Update.
(ipa_call_context::estimate_size_and_time): Update.
(ipa_merge_fn_summary_after_inlining): Update.
(ipa_update_overall_fn_summary): Update.
(inline_read_section): Update.
(ipa_fn_summary_write): Update.
|
|
* Makefile.in (OBJS): Add symtab-clones.o
(GTFILES): Add symtab-clones.h
* cgraph.c: Include symtab-clones.h.
(cgraph_edge::resolve_speculation): Fix formatting.
(cgraph_edge::redirect_call_stmt_to_callee): Update.
(cgraph_update_edges_for_call_stmt): Update.
(release_function_body): Fix formatting.
(cgraph_node::remove): Fix formatting.
(cgraph_node::dump): Fix formatting.
(cgraph_node::get_availability): Fix formatting.
(cgraph_node::call_for_symbol_thunks_and_aliases): Fix formatting.
(set_const_flag_1): Fix formatting.
(set_pure_flag_1): Fix formatting.
(cgraph_node::can_remove_if_no_direct_calls_p): Fix formatting.
(collect_callers_of_node_1): Fix formatting.
(clone_of_p): Update.
(cgraph_node::verify_node): Update.
(cgraph_c_finalize): Call clone_info::release ().
* cgraph.h (struct cgraph_clone_info): Move to symtab-clones.h.
(cgraph_node): Remove clone_info.
(symbol_table): Add m_clones.
* cgraphclones.c: Include symtab-clones.h.
(duplicate_thunk_for_node): Update.
(cgraph_node::create_clone): Update.
(cgraph_node::create_virtual_clone): Update.
(cgraph_node::find_replacement): Update.
(cgraph_node::materialize_clone): Update.
* gengtype.c (open_base_files): Include symtab-clones.h.
* ipa-cp.c: Include symtab-clones.h.
(initialize_node_lattices): Update.
(want_remove_some_param_p): Update.
(create_specialized_node): Update.
* ipa-fnsummary.c: Include symtab-clones.h.
(ipa_fn_summary_t::duplicate): Update.
* ipa-modref.c: Include symtab-clones.h.
(update_signature): Update.
* ipa-param-manipulation.c: Include symtab-clones.h.
(ipa_param_body_adjustments::common_initialization): Update.
* ipa-prop.c: Include symtab-clones.h.
(adjust_agg_replacement_values): Update.
(ipcp_get_parm_bits): Update.
(ipcp_update_bits): Update.
(ipcp_update_vr): Update.
* ipa-sra.c: Include symtab-clones.h.
(process_isra_node_results): Update.
(disable_unavailable_parameters): Update.
* lto-cgraph.c: Include symtab-clones.h.
(output_cgraph_opt_summary_p): Update.
(output_node_opt_summary): Update.
(input_node_opt_summary): Update.
* symtab-clones.cc: New file.
* symtab-clones.h: New file.
* tree-inline.c (expand_call_inline): Update.
(update_clone_info): Update.
(tree_function_versioning): Update.
|
|
gcc/ChangeLog:
PR lto/97508
* langhooks.c (lhd_begin_section): Call get_section with
not_existing = true.
* output.h (get_section): Add new argument.
* varasm.c (get_section): Fail when NOT_EXISTING is true
and a section already exists.
* ipa-cp.c (ipcp_write_summary): Remove.
(ipcp_read_summary): Likewise.
* ipa-fnsummary.c (ipa_fn_summary_read): Always read jump
functions summary.
(ipa_fn_summary_write): Always stream it.
|
|
This patch moves thunk_info out of cgraph_node into a symbol summary.
I also moved it to a separate header file since cgraph.h became really too
fat. I plan to continue with similar breakups in order to clean up interfaces
and reduce the WPA memory footprint (the symbol table now consumes more memory
than trees).
gcc/ChangeLog:
2020-10-23 Jan Hubicka <hubicka@ucw.cz>
* Makefile.in: Add symtab-thunks.o
(GTFILES): Add symtab-thunks.h and symtab-thunks.cc; remove cgraphunit.c
* cgraph.c: Include symtab-thunks.h.
(cgraph_node::create_thunk): Update
(symbol_table::create_edge): Update
(cgraph_node::dump): Update
(cgraph_node::call_for_symbol_thunks_and_aliases): Update
(set_nothrow_flag_1): Update
(set_malloc_flag_1): Update
(set_const_flag_1): Update
(collect_callers_of_node_1): Update
(clone_of_p): Update
(cgraph_node::verify_node): Update
(cgraph_node::function_symbol): Update
(cgraph_c_finalize): Call thunk_info::release.
(cgraph_node::has_thunk_p): Update
(cgraph_node::former_thunk_p): Move here from cgraph.h; reimplement.
* cgraph.h (struct cgraph_thunk_info): Rename to symtab-thunks.h.
(cgraph_node): Remove thunk field; add thunk bitfield.
(cgraph_node::expand_thunk): Move to symtab-thunks.h
(symtab_thunks_cc_finalize): Declare.
(cgraph_node::has_gimple_body_p): Update.
(cgraph_node::former_thunk_p): Update.
* cgraphclones.c: Include symtab-thunks.h.
(duplicate_thunk_for_node): Update.
(cgraph_edge::redirect_callee_duplicating_thunks): Update.
(cgraph_node::expand_all_artificial_thunks): Update.
(cgraph_node::create_edge_including_clones): Update.
* cgraphunit.c: Include symtab-thunks.h.
(vtable_entry_type): Move to symtab-thunks.c.
(cgraph_node::analyze): Update.
(analyze_functions): Update.
(mark_functions_to_output): Update.
(thunk_adjust): Move to symtab-thunks.c
(cgraph_node::expand_thunk): Move to symtab-thunks.c
(cgraph_node::assemble_thunks_and_aliases): Update.
(output_in_order): Update.
(cgraphunit_c_finalize): Do not clear vtable_entry_type.
(cgraph_node::create_wrapper): Update.
* gengtype.c (open_base_files): Add symtab-thunks.h
* ipa-comdats.c (propagate_comdat_group): Update.
(ipa_comdats): Update.
* ipa-cp.c (determine_versionability): Update.
(gather_caller_stats): Update.
(count_callers): Update
(set_single_call_flag): Update
(initialize_node_lattices): Update
(call_passes_through_thunk_p): Update
(call_passes_through_thunk): Update
(propagate_constants_across_call): Update
(find_more_scalar_values_for_callers_subset): Update
(has_undead_caller_from_outside_scc_p): Update
* ipa-fnsummary.c (evaluate_properties_for_edge): Update.
(compute_fn_summary): Update.
(inline_analyze_function): Update.
* ipa-icf.c: Include symtab-thunks.h.
(sem_function::equals_wpa): Update.
(redirect_all_callers): Update.
(sem_function::init): Update.
(sem_function::parse): Update.
* ipa-inline-transform.c: Include symtab-thunks.h.
(inline_call): Update.
(save_inline_function_body): Update.
(preserve_function_body_p): Update.
* ipa-inline.c (inline_small_functions): Update.
* ipa-polymorphic-call.c: Include alloc-pool.h, symbol-summary.h,
symtab-thunks.h
(ipa_polymorphic_call_context::ipa_polymorphic_call_context): Update.
* ipa-pure-const.c: Include symtab-thunks.h.
(analyze_function): Update.
* ipa-sra.c (check_for_caller_issues): Update.
* ipa-utils.c (ipa_reverse_postorder): Update.
(ipa_merge_profiles): Update.
* ipa-visibility.c (non_local_p): Update.
(cgraph_node::local_p): Update.
(function_and_variable_visibility): Update.
* ipa.c (symbol_table::remove_unreachable_nodes): Update.
* lto-cgraph.c: Include alloc-pool.h, symbol-summary.h and
symtab-thunks.h
(lto_output_edge): Update.
(lto_output_node): Update.
(compute_ltrans_boundary): Update.
(output_symtab): Update.
(verify_node_partition): Update.
(input_overwrite_node): Update.
(input_node): Update.
* lto-streamer-in.c (fixup_call_stmt_edges): Update.
* symtab-thunks.cc: New file.
* symtab-thunks.h: New file.
* toplev.c (toplev::finalize): Call symtab_thunks_cc_finalize.
* trans-mem.c (ipa_tm_mayenterirr_function): Update.
(ipa_tm_execute): Update.
* tree-inline.c (expand_call_inline): Update.
* tree-nested.c (create_nesting_tree): Update.
(convert_all_function_calls): Update.
(gimplify_all_functions): Update.
* tree-profile.c (tree_profiling): Update.
* tree-ssa-structalias.c (associate_varinfo_to_alias): Update.
* tree.c (free_lang_data_in_decl): Update.
* value-prof.c (init_node_map): Update.
gcc/c-family/ChangeLog:
2020-10-23 Jan Hubicka <hubicka@ucw.cz>
* c-common.c (c_common_finalize_early_debug): Update for new thunk api.
gcc/d/ChangeLog:
2020-10-23 Jan Hubicka <hubicka@ucw.cz>
* decl.cc (finish_thunk): Update for new thunk api.
gcc/lto/ChangeLog:
2020-10-23 Jan Hubicka <hubicka@ucw.cz>
* lto-partition.c (add_symbol_to_partition_1): Update for new thunk
api.
|
|
This patch implements a heuristic that increases inline limits (through the
hints mechanism) for inline functions that use builtin_constant_p on a
parameter. Those are very likely intended to be always inlined and to simplify
after inlining.
The PR is about a function that we used to inline with
--param inline-insns-single=200, but with the new default of 70 for -O2 we no
longer do so. Hints are currently configured to double the bound, so we
get a limit of 140, which is still not enough to inline the particular testcase
but should help in general. I can implement a stronger bump if that seems
useful (maybe it is). The example is a bit operation written as a decision
chain with 64 conditions.
This exceeds the limit on the number of conditions we track per function (which
is 30), and thus the size/time estimates do not work that well.
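For illustration, the kind of function the heuristic targets might look like
the sketch below. It assumes a compiler that provides __builtin_constant_p
(GCC and Clang do); scale and twice are hypothetical names and this is not
the testcase from the PR.

```cpp
#include <cassert>

// Hypothetical helper: the __builtin_constant_p test is only useful once the
// call has been inlined with a constant argument, at which point the whole
// condition can fold and the unused path disappears.
static inline unsigned scale (unsigned x, unsigned factor)
{
  if (__builtin_constant_p (factor) && factor == 2)
    return x << 1;     // specialized path for a known constant factor
  return x * factor;   // generic path; computes the same value
}

// A caller passing a literal, which is the case that benefits from inlining.
unsigned twice (unsigned x)
{
  return scale (x, 2);
}
```

Both paths return the same value, so the result does not depend on whether
the compiler inlines the call; only the generated code differs.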
gcc/ChangeLog:
PR ipa/97445
* ipa-fnsummary.c (ipa_dump_hints): Add INLINE_HINT_builtin_constant_p.
(ipa_fn_summary::~ipa_fn_summary): Free builtin_constant_p_parms.
(ipa_fn_summary_t::duplicate): Duplicate builtin_constant_p_parms.
(ipa_dump_fn_summary): Dump builtin_constant_p_parms.
(add_builtin_constant_p_parm): New function
(set_cond_stmt_execution_predicate): Update builtin_constant_p_parms.
(ipa_call_context::estimate_size_and_time): Set
INLINE_HINT_builtin_constant_p.
(ipa_merge_fn_summary_after_inlining): Merge builtin_constant_p_parms.
(inline_read_section): Read builtin_constant_p_parms.
(ipa_fn_summary_write): Write builtin_constant_p_parms.
* ipa-fnsummary.h (enum ipa_hints_vals): Add
INLINE_HINT_builtin_constant_p.
* ipa-inline.c (want_inline_small_function_p): Use
INLINE_HINT_builtin_constant_p.
(edge_badness): Use INLINE_HINT_builtin_constant_p.
gcc/testsuite/ChangeLog:
PR ipa/97445
* gcc.dg/ipa/inlinehint-5.c: New test.
|
|
gcc/ChangeLog:
2020-10-14 Jan Hubicka <hubicka@ucw.cz>
* ipa-fnsummary.c (remap_edge_summaries): Make offset_map HOST_WIDE_INT.
(remap_freqcounting_predicate): Likewise.
(ipa_merge_fn_summary_after_inlining): Likewise.
* ipa-predicate.c (predicate::remap_after_inlining): Likewise
* ipa-predicate.h (remap_after_inlining): Update.
|
|
This patch enhances the ability of IPA to reason under what conditions
loops in a function have known iteration counts or strides because it
replaces single predicates which currently hold conjunction of
predicates for all loops with vectors capable of holding multiple
predicates, each with a cumulative frequency of loops with the
property.
This second property is then used by IPA-CP to much more aggressively
boost its heuristic score for cloning opportunities which make
iteration counts or strides of frequent loops compile time constant.
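A toy model of the difference, not the actual ipa-fnsummary data structures: toy_freq_predicate, all_loops_known and sum_known_loop_freq are hypothetical names, and a "predicate" is reduced here to "parameter N is a known constant".

```cpp
#include <vector>

// Hypothetical stand-in for a predicate with an attached loop frequency.
struct toy_freq_predicate
{
  unsigned param_index;  // The loop property is known if this parameter is.
  double freq;           // Cumulative frequency of loops with the property.
};

// Old scheme: a single conjunction, so the hint fires only if every
// loop predicate holds.
bool
all_loops_known (const std::vector<toy_freq_predicate> &preds,
                 const std::vector<bool> &param_known)
{
  for (const auto &p : preds)
    if (p.param_index >= param_known.size () || !param_known[p.param_index])
      return false;
  return true;
}

// New scheme: add up the frequencies of the predicates that hold, so a
// caller can scale its bonus by how hot the affected loops are.
double
sum_known_loop_freq (const std::vector<toy_freq_predicate> &preds,
                     const std::vector<bool> &param_known)
{
  double total = 0;
  for (const auto &p : preds)
    if (p.param_index < param_known.size () && param_known[p.param_index])
      total += p.freq;
  return total;
}
```

With predicates on parameters 0 and 1 and only parameter 0 known, the conjunction yields nothing while the summed form still reports the frequency of the loops governed by parameter 0.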
gcc/ChangeLog:
2020-09-03 Martin Jambor <mjambor@suse.cz>
* ipa-fnsummary.h (ipa_freqcounting_predicate): New type.
(ipa_fn_summary): Change the type of loop_iterations and loop_strides
to vectors of ipa_freqcounting_predicate.
(ipa_fn_summary::ipa_fn_summary): Construct the new vectors.
(ipa_call_estimates): New fields loops_with_known_iterations and
loops_with_known_strides.
* ipa-cp.c (hint_time_bonus): Multiply param_ipa_cp_loop_hint_bonus
with the expected frequencies of loops with known iteration count or
stride.
* ipa-fnsummary.c (add_freqcounting_predicate): New function.
(ipa_fn_summary::~ipa_fn_summary): Release the new vectors instead of
just two predicates.
(remap_hint_predicate_after_duplication): Replace with function
remap_freqcounting_preds_after_dup.
(ipa_fn_summary_t::duplicate): Use it or duplicate new vectors.
(ipa_dump_fn_summary): Dump the new vectors.
(analyze_function_body): Compute the loop property vectors.
(ipa_call_context::estimate_size_and_time): Calculate also
loops_with_known_iterations and loops_with_known_strides. Adjusted
dumping accordingly.
(remap_hint_predicate): Replace with function
remap_freqcounting_predicate.
(ipa_merge_fn_summary_after_inlining): Use it.
(inline_read_section): Stream loopcounting vectors instead of two
simple predicates.
(ipa_fn_summary_write): Likewise.
* params.opt (ipa-max-loop-predicates): New parameter.
* doc/invoke.texi (ipa-max-loop-predicates): Document new param.
gcc/testsuite/ChangeLog:
2020-09-03 Martin Jambor <mjambor@suse.cz>
* gcc.dg/ipa/ipcp-loophint-1.c: New test.
|
|
A subsequent patch adds another two estimates that the code in
ipa_call_context::estimate_size_and_time computes, and the fact that
the function has a special output parameter for each thing it computes
would make it have just too many. Therefore, this patch collapses all
those output parameters into one output structure.
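The shape of the change can be sketched as follows. This is a simplified illustration: toy_call_estimates, toy_estimate_old and toy_estimate are hypothetical, and the formulas are placeholders rather than anything ipa-fnsummary computes.

```cpp
// Hypothetical bundle of results; the field set is an assumption.
struct toy_call_estimates
{
  int size = 0;
  double time = 0;
  double nonspecialized_time = 0;
  unsigned hints = 0;
};

// Before: one output pointer per computed value.
void
toy_estimate_old (int n_insns, int *ret_size, double *ret_time,
                  double *ret_nonspec_time, unsigned *ret_hints)
{
  if (ret_size)
    *ret_size = n_insns;
  if (ret_time)
    *ret_time = n_insns * 1.5;
  if (ret_nonspec_time)
    *ret_nonspec_time = n_insns * 2.0;
  if (ret_hints)
    *ret_hints = n_insns > 10 ? 1u : 0u;
}

// After: a single output structure, so adding another estimate later
// does not change the signature or any call site.
void
toy_estimate (int n_insns, toy_call_estimates *estimates)
{
  estimates->size = n_insns;
  estimates->time = n_insns * 1.5;
  estimates->nonspecialized_time = n_insns * 2.0;
  estimates->hints = n_insns > 10 ? 1u : 0u;
}
```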
gcc/ChangeLog:
2020-09-02 Martin Jambor <mjambor@suse.cz>
* ipa-inline-analysis.c (do_estimate_edge_time): Adjusted to use
ipa_call_estimates.
(do_estimate_edge_size): Likewise.
(do_estimate_edge_hints): Likewise.
* ipa-fnsummary.h (struct ipa_call_estimates): New type.
(ipa_call_context::estimate_size_and_time): Adjusted declaration.
(estimate_ipcp_clone_size_and_time): Likewise.
* ipa-cp.c (hint_time_bonus): Changed the type of the second argument
to ipa_call_estimates.
(perform_estimation_of_a_value): Adjusted to use ipa_call_estimates.
(estimate_local_effects): Likewise.
* ipa-fnsummary.c (ipa_call_context::estimate_size_and_time): Adjusted
to return estimates in a single ipa_call_estimates parameter.
(estimate_ipcp_clone_size_and_time): Likewise.
|
|
Hi,
as we discussed with Honza on the mailing list last week, making
cached call context structure distinct from the normal one may make it
clearer that the cached data need to be explicitly deallocated.
This patch does that division. It is not mandatory for the overall
main goals of the patch set and can be dropped if deemed superfluous.
gcc/ChangeLog:
2020-09-02 Martin Jambor <mjambor@suse.cz>
* ipa-fnsummary.h (ipa_cached_call_context): New forward declaration
and class.
(class ipa_call_context): Make friend ipa_cached_call_context. Moved
methods duplicate_from and release to it too.
* ipa-fnsummary.c (ipa_call_context::duplicate_from): Moved to class
ipa_cached_call_context.
(ipa_call_context::release): Likewise, removed the parameter.
* ipa-inline-analysis.c (node_context_cache_entry): Change the type of
ctx to ipa_cached_call_context.
(do_estimate_edge_time): Remove parameter from the call to
ipa_cached_call_context::release.
|
|
Hi,
this large patch is a mostly mechanical change which aims to replace
uses of separate vectors about known scalar values (usually called
known_vals or known_csts), known aggregate values (known_aggs), known
virtual call contexts (known_contexts) and known value
ranges (known_value_ranges) with uses of either new type
ipa_call_arg_values or ipa_auto_call_arg_values, both of which simply
contain these vectors inside them.
The need for two distinct types comes from the fact that when the vectors
are constructed from jump functions or lattices, we really should use
auto_vecs with embedded storage allocated on stack. On the other hand,
the bundle in ipa_call_context can be allocated on heap when in cache,
one time for each call_graph node.
ipa_call_context is constructible from ipa_auto_call_arg_values but
then its vectors must not be resized, otherwise the vectors will stop
pointing to the stack ones. Unfortunately, I don't think the
structure embedded in ipa_call_context can be made constant because we
need to manipulate and deallocate it when in cache.
gcc/ChangeLog:
2020-09-01 Martin Jambor <mjambor@suse.cz>
* ipa-prop.h (ipa_auto_call_arg_values): New type.
(class ipa_call_arg_values): Likewise.
(ipa_get_indirect_edge_target): Replaced vector arguments with
ipa_call_arg_values in declaration. Added an overload for
ipa_auto_call_arg_values.
* ipa-fnsummary.h (ipa_call_context): Removed members m_known_vals,
m_known_contexts, m_known_aggs, duplicate_from, release and equal_to,
new members m_avals, store_to_cache and equivalent_to_p. Adjusted
constructor arguments.
(estimate_ipcp_clone_size_and_time): Replaced vector arguments
with ipa_auto_call_arg_values in declaration.
(evaluate_properties_for_edge): Likewise.
* ipa-cp.c (ipa_get_indirect_edge_target): Adjusted to work on
ipa_call_arg_values rather than on separate vectors. Added an
overload for ipa_auto_call_arg_values.
(devirtualization_time_bonus): Adjusted to work on
ipa_auto_call_arg_values rather than on separate vectors.
(gather_context_independent_values): Adjusted to work on
ipa_auto_call_arg_values rather than on separate vectors.
(perform_estimation_of_a_value): Likewise.
(estimate_local_effects): Likewise.
(modify_known_vectors_with_val): Adjusted both variants to work on
ipa_auto_call_arg_values and rename them to
copy_known_vectors_add_val.
(decide_about_value): Adjusted to work on ipa_call_arg_values rather
than on separate vectors.
(decide_whether_version_node): Likewise.
* ipa-fnsummary.c (evaluate_conditions_for_known_args): Likewise.
(evaluate_properties_for_edge): Likewise.
(ipa_fn_summary_t::duplicate): Likewise.
(estimate_edge_devirt_benefit): Adjusted to work on
ipa_call_arg_values rather than on separate vectors.
(estimate_edge_size_and_time): Likewise.
(estimate_calls_size_and_time_1): Likewise.
(summarize_calls_size_and_time): Adjusted calls to
estimate_edge_size_and_time.
(estimate_calls_size_and_time): Adjusted to work on
ipa_call_arg_values rather than on separate vectors.
(ipa_call_context::ipa_call_context): Construct from a pointer to
ipa_auto_call_arg_values instead of individual vectors.
(ipa_call_context::duplicate_from): Adjusted to access vectors within
m_avals.
(ipa_call_context::release): Likewise.
(ipa_call_context::equal_to): Likewise.
(ipa_call_context::estimate_size_and_time): Adjusted to work on
ipa_call_arg_values rather than on separate vectors.
(estimate_ipcp_clone_size_and_time): Adjusted to work with
ipa_auto_call_arg_values rather than on separate vectors.
(ipa_merge_fn_summary_after_inlining): Likewise. Adjusted call to
estimate_edge_size_and_time.
(ipa_update_overall_fn_summary): Adjusted call to
estimate_edge_size_and_time.
* ipa-inline-analysis.c (do_estimate_edge_time): Adjusted to work with
ipa_auto_call_arg_values rather than with separate vectors.
(do_estimate_edge_size): Likewise.
(do_estimate_edge_hints): Likewise.
* ipa-prop.c (ipa_auto_call_arg_values::~ipa_auto_call_arg_values):
New destructor.
|
|
PR ipa/97244
* ipa-fnsummary.c (pass_free_fnsummary::execute): Free
also indirect inlining datastructure.
* ipa-modref.c (pass_ipa_modref::execute): Do not free them here.
* ipa-prop.c (ipa_free_all_node_params): Do not crash when info does
not exist.
(ipa_unregister_cgraph_hooks): Likewise.
|
|
This patch implements tracking of whether an argument points to readonly memory. This
is useful for ipa-modref as well as for inline heuristics. It is desirable
to inline functions that dereference pointers to local variables in order
to support SRA. We always used the opposite heuristic (guessing that the
dereferences will be optimized out with 50% probability) but here we could
increase the probability for cases where we can track that the argument is indeed
a local memory (or readonly, which is also good).
* ipa-fnsummary.c (dump_ipa_call_summary): Dump
points_to_local_or_readonly_memory flag.
(analyze_function_body): Compute points_to_local_or_readonly_memory
flag.
(remap_edge_change_prob): Rename to ...
(remap_edge_params): ... this one; update
points_to_local_or_readonly_memory.
(remap_edge_summaries): Update.
(read_ipa_call_summary): Stream the new flag.
(write_ipa_call_summary): Likewise.
* ipa-predicate.h (struct inline_param_summary): Add
points_to_local_or_readonly_memory.
(inline_param_summary::equal_to): Update.
(inline_param_summary::useless_p): Update.
|
|
This adds a move CTOR to auto_vec<T, 0> and makes use of an
auto_vec<edge> return value for get_loop_exit_edges denoting
that lifetime management of the vector is handed to the caller.
The move CTOR prompted the hash_table change because it apparently
makes the copy CTOR implicitly deleted (good) and hash-table
expansion of the odr_enum_map which is
hash_map <nofree_string_hash, odr_enum> where odr_enum has an
auto_vec<odr_enum_val, 0> member triggers this. Not sure if
there's a latent bug there before this (I think we're not
invoking DTORs, but we're invoking copy-CTORs).
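As a rough sketch of the ownership pattern described above, using a stand-in class rather than GCC's vec.h: toy_auto_vec and toy_get_exits are hypothetical names. Declaring the move CTOR makes the copy CTOR implicitly deleted, and returning by value hands the storage to the caller.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical owning vector; not the real auto_vec.
class toy_auto_vec
{
public:
  toy_auto_vec () = default;
  // Move CTOR: steal the storage and leave the source empty.
  toy_auto_vec (toy_auto_vec &&other) noexcept
    : m_data (other.m_data), m_len (other.m_len), m_cap (other.m_cap)
  {
    other.m_data = nullptr;
    other.m_len = 0;
    other.m_cap = 0;
  }
  // Move assignment is deleted, as in the ChangeLog entry below.
  toy_auto_vec &operator= (toy_auto_vec &&) = delete;
  ~toy_auto_vec () { delete[] m_data; }

  void push (int v)
  {
    if (m_len == m_cap)
      {
        std::size_t new_cap = m_cap ? m_cap * 2 : 4;
        int *new_data = new int[new_cap];
        for (std::size_t i = 0; i < m_len; i++)
          new_data[i] = m_data[i];
        delete[] m_data;
        m_data = new_data;
        m_cap = new_cap;
      }
    m_data[m_len++] = v;
  }
  std::size_t length () const { return m_len; }
  int operator[] (std::size_t i) const { return m_data[i]; }

private:
  int *m_data = nullptr;
  std::size_t m_len = 0;
  std::size_t m_cap = 0;
};

// The caller owns the result; no explicit release call is needed.
toy_auto_vec
toy_get_exits (int n)
{
  toy_auto_vec exits;
  for (int i = 0; i < n; i++)
    exits.push (i);
  return exits;
}
```

A container that relocates such elements, like the hash table expansion mentioned above, then has to move them because copying no longer compiles.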
2020-08-06 Richard Biener <rguenther@suse.de>
* vec.h (auto_vec<T, 0>::auto_vec (auto_vec &&)): New move CTOR.
(auto_vec<T, 0>::operator=(auto_vec &&)): Delete.
* hash-table.h (hash_table::expand): Use std::move when expanding.
* cfgloop.h (get_loop_exit_edges): Return auto_vec<edge>.
* cfgloop.c (get_loop_exit_edges): Adjust.
* cfgloopmanip.c (fix_loop_placement): Likewise.
* ipa-fnsummary.c (analyze_function_body): Likewise.
* ira-build.c (create_loop_tree_nodes): Likewise.
(create_loop_tree_node_allocnos): Likewise.
(loop_with_complex_edge_p): Likewise.
* ira-color.c (ira_loop_edge_freq): Likewise.
* loop-unroll.c (analyze_insns_in_loop): Likewise.
* predict.c (predict_loops): Likewise.
* tree-predcom.c (last_always_executed_block): Likewise.
* tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise.
* tree-ssa-loop-im.c (store_motion_loop): Likewise.
* tree-ssa-loop-ivcanon.c (loop_edge_to_cancel): Likewise.
(canonicalize_loop_induction_variables): Likewise.
* tree-ssa-loop-manip.c (get_loops_exits): Likewise.
* tree-ssa-loop-niter.c (find_loop_niter): Likewise.
(finite_loop_p): Likewise.
(find_loop_niter_by_eval): Likewise.
(estimate_numbers_of_iterations): Likewise.
* tree-ssa-loop-prefetch.c (emit_mfence_after_loop): Likewise.
(may_use_storent_in_loop_p): Likewise.
|
|
* ipa-fnsummary.c (refs_local_or_readonly_memory_p): New function.
(points_to_local_or_readonly_memory_p): New function.
* ipa-fnsummary.h (refs_local_or_readonly_memory_p): Declare.
(points_to_local_or_readonly_memory_p): Declare.
* ipa-modref.c (record_access_p): Use refs_local_or_readonly_memory_p.
* ipa-pure-const.c (check_op): Likewise.
* gcc.dg/tree-ssa/local-pure-const.c: Update template.
|
|
gcc/ada/ChangeLog:
* gcc-interface/trans.c (gigi): Set exact argument of a vector
growth function to true.
(Attribute_to_gnu): Likewise.
gcc/ChangeLog:
* alias.c (init_alias_analysis): Set exact argument of a vector
growth function to true.
* calls.c (internal_arg_pointer_based_exp_scan): Likewise.
* cfgbuild.c (find_many_sub_basic_blocks): Likewise.
* cfgexpand.c (expand_asm_stmt): Likewise.
* cfgrtl.c (rtl_create_basic_block): Likewise.
* combine.c (combine_split_insns): Likewise.
(combine_instructions): Likewise.
* config/aarch64/aarch64-sve-builtins.cc (function_expander::add_output_operand): Likewise.
(function_expander::add_input_operand): Likewise.
(function_expander::add_integer_operand): Likewise.
(function_expander::add_address_operand): Likewise.
(function_expander::add_fixed_operand): Likewise.
* df-core.c (df_worklist_dataflow_doublequeue): Likewise.
* dwarf2cfi.c (update_row_reg_save): Likewise.
* early-remat.c (early_remat::init_block_info): Likewise.
(early_remat::finalize_candidate_indices): Likewise.
* except.c (sjlj_build_landing_pads): Likewise.
* final.c (compute_alignments): Likewise.
(grow_label_align): Likewise.
* function.c (temp_slots_at_level): Likewise.
* fwprop.c (build_single_def_use_links): Likewise.
(update_uses): Likewise.
* gcc.c (insert_wrapper): Likewise.
* genautomata.c (create_state_ainsn_table): Likewise.
(add_vect): Likewise.
(output_dead_lock_vect): Likewise.
* genmatch.c (capture_info::capture_info): Likewise.
(parser::finish_match_operand): Likewise.
* genrecog.c (optimize_subroutine_group): Likewise.
(merge_pattern_info::merge_pattern_info): Likewise.
(merge_into_decision): Likewise.
(print_subroutine_start): Likewise.
(main): Likewise.
* gimple-loop-versioning.cc (loop_versioning::loop_versioning): Likewise.
* gimple.c (gimple_set_bb): Likewise.
* graphite-isl-ast-to-gimple.c (translate_isl_ast_node_user): Likewise.
* haifa-sched.c (sched_extend_luids): Likewise.
(extend_h_i_d): Likewise.
* insn-addr.h (insn_addresses_new): Likewise.
* ipa-cp.c (gather_context_independent_values): Likewise.
(find_more_contexts_for_caller_subset): Likewise.
* ipa-devirt.c (final_warning_record::grow_type_warnings): Likewise.
(ipa_odr_read_section): Likewise.
* ipa-fnsummary.c (evaluate_properties_for_edge): Likewise.
(ipa_fn_summary_t::duplicate): Likewise.
(analyze_function_body): Likewise.
(ipa_merge_fn_summary_after_inlining): Likewise.
(read_ipa_call_summary): Likewise.
* ipa-icf.c (sem_function::bb_dict_test): Likewise.
* ipa-prop.c (ipa_alloc_node_params): Likewise.
(parm_bb_aa_status_for_bb): Likewise.
(ipa_compute_jump_functions_for_edge): Likewise.
(ipa_analyze_node): Likewise.
(update_jump_functions_after_inlining): Likewise.
(ipa_read_edge_info): Likewise.
(read_ipcp_transformation_info): Likewise.
(ipcp_transform_function): Likewise.
* ipa-reference.c (ipa_reference_write_optimization_summary): Likewise.
* ipa-split.c (execute_split_functions): Likewise.
* ira.c (find_moveable_pseudos): Likewise.
* lower-subreg.c (decompose_multiword_subregs): Likewise.
* lto-streamer-in.c (input_eh_regions): Likewise.
(input_cfg): Likewise.
(input_struct_function_base): Likewise.
(input_function): Likewise.
* modulo-sched.c (set_node_sched_params): Likewise.
(extend_node_sched_params): Likewise.
(schedule_reg_moves): Likewise.
* omp-general.c (omp_construct_simd_compare): Likewise.
* passes.c (pass_manager::create_pass_tab): Likewise.
(enable_disable_pass): Likewise.
* predict.c (determine_unlikely_bbs): Likewise.
* profile.c (compute_branch_probabilities): Likewise.
* read-rtl-function.c (function_reader::parse_block): Likewise.
* read-rtl.c (rtx_reader::read_rtx_code): Likewise.
* reg-stack.c (stack_regs_mentioned): Likewise.
* regrename.c (regrename_init): Likewise.
* rtlanal.c (T>::add_single_to_queue): Likewise.
* sched-deps.c (init_deps_data_vector): Likewise.
* sel-sched-ir.c (sel_extend_global_bb_info): Likewise.
(extend_region_bb_info): Likewise.
(extend_insn_data): Likewise.
* symtab.c (symtab_node::create_reference): Likewise.
* tracer.c (tail_duplicate): Likewise.
* trans-mem.c (tm_region_init): Likewise.
(get_bb_regions_instrumented): Likewise.
* tree-cfg.c (init_empty_tree_cfg_for_function): Likewise.
(build_gimple_cfg): Likewise.
(create_bb): Likewise.
(move_block_to_fn): Likewise.
* tree-complex.c (tree_lower_complex): Likewise.
* tree-if-conv.c (predicate_rhs_code): Likewise.
* tree-inline.c (copy_bb): Likewise.
* tree-into-ssa.c (get_ssa_name_ann): Likewise.
(mark_phi_for_rewrite): Likewise.
* tree-object-size.c (compute_builtin_object_size): Likewise.
(init_object_sizes): Likewise.
* tree-predcom.c (initialize_root_vars_store_elim_1): Likewise.
(initialize_root_vars_store_elim_2): Likewise.
(prepare_initializers_chain_store_elim): Likewise.
* tree-ssa-address.c (addr_for_mem_ref): Likewise.
(multiplier_allowed_in_address_p): Likewise.
* tree-ssa-coalesce.c (ssa_conflicts_new): Likewise.
* tree-ssa-forwprop.c (simplify_vector_constructor): Likewise.
* tree-ssa-loop-ivopts.c (addr_offset_valid_p): Likewise.
(get_address_cost_ainc): Likewise.
* tree-ssa-loop-niter.c (discover_iteration_bound_by_body_walk): Likewise.
* tree-ssa-pre.c (add_to_value): Likewise.
(phi_translate_1): Likewise.
(do_pre_regular_insertion): Likewise.
(do_pre_partial_partial_insertion): Likewise.
(init_pre): Likewise.
* tree-ssa-propagate.c (ssa_prop_init): Likewise.
(update_call_from_tree): Likewise.
* tree-ssa-reassoc.c (optimize_range_tests_cmp_bitwise): Likewise.
* tree-ssa-sccvn.c (vn_reference_lookup_3): Likewise.
(vn_reference_lookup_pieces): Likewise.
(eliminate_dom_walker::eliminate_push_avail): Likewise.
* tree-ssa-strlen.c (set_strinfo): Likewise.
(get_stridx_plus_constant): Likewise.
(zero_length_string): Likewise.
(find_equal_ptrs): Likewise.
(printf_strlen_execute): Likewise.
* tree-ssa-threadedge.c (set_ssa_name_value): Likewise.
* tree-ssanames.c (make_ssa_name_fn): Likewise.
* tree-streamer-in.c (streamer_read_tree_bitfields): Likewise.
* tree-vect-loop.c (vect_record_loop_mask): Likewise.
(vect_get_loop_mask): Likewise.
(vect_record_loop_len): Likewise.
(vect_get_loop_len): Likewise.
* tree-vect-patterns.c (vect_recog_mask_conversion_pattern): Likewise.
* tree-vect-slp.c (vect_slp_convert_to_external): Likewise.
(vect_bb_slp_scalar_cost): Likewise.
(vect_bb_vectorization_profitable_p): Likewise.
(vectorizable_slp_permutation): Likewise.
* tree-vect-stmts.c (vectorizable_call): Likewise.
(vectorizable_simd_clone_call): Likewise.
(scan_store_can_perm_p): Likewise.
(vectorizable_store): Likewise.
* expr.c: Likewise.
* vec.c (test_safe_grow_cleared): Likewise.
* vec.h (vec_safe_grow): Likewise.
(vec_safe_grow_cleared): Likewise.
(vl_ptr>::safe_grow): Likewise.
(vl_ptr>::safe_grow_cleared): Likewise.
* config/c6x/c6x.c (insn_set_clock): Likewise.
gcc/c/ChangeLog:
* gimple-parser.c (c_parser_gimple_compound_statement): Set exact argument of a vector
growth function to true.
gcc/cp/ChangeLog:
* class.c (build_vtbl_initializer): Set exact argument of a vector
growth function to true.
* constraint.cc (get_mapped_args): Likewise.
* decl.c (cp_maybe_mangle_decomp): Likewise.
(cp_finish_decomp): Likewise.
* parser.c (cp_parser_omp_for_loop): Likewise.
* pt.c (canonical_type_parameter): Likewise.
* rtti.c (get_pseudo_ti_init): Likewise.
gcc/fortran/ChangeLog:
* trans-openmp.c (gfc_trans_omp_do): Set exact argument of a vector
growth function to true.
gcc/lto/ChangeLog:
* lto-common.c (lto_file_finalize): Set exact argument of a vector
growth function to true.
|
|
gcc/ChangeLog:
* ipa-fnsummary.c (evaluate_conditions_for_known_args): Use vec<>
instead of std::vector<>.
(evaluate_properties_for_edge): Same.
(ipa_fn_summary_t::duplicate): Same.
(estimate_ipcp_clone_size_and_time): Same.
* vec.h (<T, A, vl_embed>::embedded_size): Change vec_embedded
type to contain a char[].
|
|
This fixes a bootstrap error with clang 10 that would complain
/usr/include/c++/v1/typeinfo:346:5: error: no member named
'fancy_abort' in namespace 'std::__1'; did you mean simply
'fancy_abort'?
It mirrors how this is handled in gcov.c and indirectly includes
<vector> via system.h.
gcc/ChangeLog:
* ipa-fnsummary.c (INCLUDE_VECTOR): Define.
Remove direct inclusion of <vector>.
|
|
Implement class irange, a generic multi-range implementation for
value ranges. This class is API compatible with value_range, and is meant
to seamlessly coexist with it.
gcc/ChangeLog:
* Makefile.in (GTFILES): Move value-range.h up.
* gengtype-lex.l: Set yylval to handle GTY markers on templates.
* ipa-cp.c (initialize_node_lattices): Call value_range
constructor.
(ipcp_propagate_stage): Use in-place new so value_range construct
is called.
* ipa-fnsummary.c (evaluate_conditions_for_known_args): Use std
vec instead of GCC's vec<>.
(evaluate_properties_for_edge): Adjust for std vec.
(ipa_fn_summary_t::duplicate): Same.
(estimate_ipcp_clone_size_and_time): Same.
* ipa-prop.c (ipa_get_value_range): Use in-place new for
value_range.
* ipa-prop.h (struct GTY): Remove class keyword for m_vr.
* range-op.cc (empty_range_check): Rename to...
(empty_range_varying): ...this and adjust for varying.
(undefined_shift_range_check): Adjust for irange.
(range_operator::wi_fold): Same.
(range_operator::fold_range): Adjust for irange. Special case
single pairs for performance.
(range_operator::op1_range): Adjust for irange.
(range_operator::op2_range): Same.
(value_range_from_overflowed_bounds): Same.
(value_range_with_overflow): Same.
(create_possibly_reversed_range): Same.
(range_true): Same.
(range_false): Same.
(range_true_and_false): Same.
(get_bool_state): Adjust for irange and tweak for performance.
(operator_equal::fold_range): Adjust for irange.
(operator_equal::op1_range): Same.
(operator_equal::op2_range): Same.
(operator_not_equal::fold_range): Same.
(operator_not_equal::op1_range): Same.
(operator_not_equal::op2_range): Same.
(build_lt): Same.
(build_le): Same.
(build_gt): Same.
(build_ge): Same.
(operator_lt::fold_range): Same.
(operator_lt::op1_range): Same.
(operator_lt::op2_range): Same.
(operator_le::fold_range): Same.
(operator_le::op1_range): Same.
(operator_le::op2_range): Same.
(operator_gt::fold_range): Same.
(operator_gt::op1_range): Same.
(operator_gt::op2_range): Same.
(operator_ge::fold_range): Same.
(operator_ge::op1_range): Same.
(operator_ge::op2_range): Same.
(operator_plus::wi_fold): Same.
(operator_plus::op1_range): Same.
(operator_plus::op2_range): Same.
(operator_minus::wi_fold): Same.
(operator_minus::op1_range): Same.
(operator_minus::op2_range): Same.
(operator_min::wi_fold): Same.
(operator_max::wi_fold): Same.
(cross_product_operator::wi_cross_product): Same.
(operator_mult::op1_range): New.
(operator_mult::op2_range): New.
(operator_mult::wi_fold): Adjust for irange.
(operator_div::wi_fold): Same.
(operator_exact_divide::op1_range): Same.
(operator_lshift::fold_range): Same.
(operator_lshift::wi_fold): Same.
(operator_lshift::op1_range): New.
(operator_rshift::op1_range): New.
(operator_rshift::fold_range): Adjust for irange.
(operator_rshift::wi_fold): Same.
(operator_cast::truncating_cast_p): Abstract out from
operator_cast::fold_range.
(operator_cast::fold_range): Adjust for irange and tweak for
performance.
(operator_cast::inside_domain_p): Abstract out from fold_range.
(operator_cast::fold_pair): Same.
(operator_cast::op1_range): Use abstracted methods above. Adjust
for irange and tweak for performance.
(operator_logical_and::fold_range): Adjust for irange.
(operator_logical_and::op1_range): Same.
(operator_logical_and::op2_range): Same.
(unsigned_singleton_p): New.
(operator_bitwise_and::remove_impossible_ranges): New.
(operator_bitwise_and::fold_range): New.
(wi_optimize_and_or): Adjust for irange.
(operator_bitwise_and::wi_fold): Same.
(set_nonzero_range_from_mask): New.
(operator_bitwise_and::simple_op1_range_solver): New.
(operator_bitwise_and::op1_range): Adjust for irange.
(operator_bitwise_and::op2_range): Same.
(operator_logical_or::fold_range): Same.
(operator_logical_or::op1_range): Same.
(operator_logical_or::op2_range): Same.
(operator_bitwise_or::wi_fold): Same.
(operator_bitwise_or::op1_range): Same.
(operator_bitwise_or::op2_range): Same.
(operator_bitwise_xor::wi_fold): Same.
(operator_bitwise_xor::op1_range): New.
(operator_bitwise_xor::op2_range): New.
(operator_trunc_mod::wi_fold): Adjust for irange.
(operator_logical_not::fold_range): Same.
(operator_logical_not::op1_range): Same.
(operator_bitwise_not::fold_range): Same.
(operator_bitwise_not::op1_range): Same.
(operator_cst::fold_range): Same.
(operator_identity::fold_range): Same.
(operator_identity::op1_range): Same.
(class operator_unknown): New.
(operator_unknown::fold_range): New.
(class operator_abs): Adjust for irange.
(operator_abs::wi_fold): Same.
(operator_abs::op1_range): Same.
(operator_absu::wi_fold): Same.
(class operator_negate): Same.
(operator_negate::fold_range): Same.
(operator_negate::op1_range): Same.
(operator_addr_expr::fold_range): Same.
(operator_addr_expr::op1_range): Same.
(pointer_plus_operator::wi_fold): Same.
(pointer_min_max_operator::wi_fold): Same.
(pointer_and_operator::wi_fold): Same.
(pointer_or_operator::op1_range): New.
(pointer_or_operator::op2_range): New.
(pointer_or_operator::wi_fold): Adjust for irange.
(integral_table::integral_table): Add entries for IMAGPART_EXPR
and POINTER_DIFF_EXPR.
(range_cast): Adjust for irange.
(build_range3): New.
(range3_tests): New.
(widest_irange_tests): New.
(multi_precision_range_tests): New.
(operator_tests): New.
(range_tests): New.
* range-op.h (class range_operator): Adjust for irange.
(range_cast): Same.
* tree-vrp.c (range_fold_binary_symbolics_p): Adjust for irange and
tweak for performance.
(range_fold_binary_expr): Same.
(masked_increment): Change to extern.
* tree-vrp.h (masked_increment): New.
* tree.c (cache_wide_int_in_type_cache): New function abstracted
out from wide_int_to_tree_1.
(wide_int_to_tree_1): Cache 0, 1, and MAX for pointers.
* value-range-equiv.cc (value_range_equiv::deep_copy): Use kind
method.
(value_range_equiv::move): Same.
(value_range_equiv::check): Adjust for irange.
(value_range_equiv::intersect): Same.
(value_range_equiv::union_): Same.
(value_range_equiv::dump): Same.
* value-range.cc (irange::operator=): Same.
(irange::maybe_anti_range): New.
(irange::copy_legacy_range): New.
(irange::set_undefined): Adjust for irange.
(irange::swap_out_of_order_endpoints): Abstract out from set().
(irange::set_varying): Adjust for irange.
(irange::irange_set): New.
(irange::irange_set_anti_range): New.
(irange::set): Adjust for irange.
(value_range::set_nonzero): Move to header file.
(value_range::set_zero): Move to header file.
(value_range::check): Rename to...
(irange::verify_range): ...this.
(value_range::num_pairs): Rename to...
(irange::legacy_num_pairs): ...this, and adjust for irange.
(value_range::lower_bound): Rename to...
(irange::legacy_lower_bound): ...this, and adjust for irange.
(value_range::upper_bound): Rename to...
(irange::legacy_upper_bound): ...this, and adjust for irange.
(value_range::equal_p): Rename to...
(irange::legacy_equal_p): ...this.
(value_range::operator==): Move to header file.
(irange::equal_p): New.
(irange::symbolic_p): Adjust for irange.
(irange::constant_p): Same.
(irange::singleton_p): Same.
(irange::value_inside_range): Same.
(irange::may_contain_p): Same.
(irange::contains_p): Same.
(irange::normalize_addresses): Same.
(irange::normalize_symbolics): Same.
(irange::legacy_intersect): Same.
(irange::legacy_union): Same.
(irange::union_): Same.
(irange::intersect): Same.
(irange::irange_union): New.
(irange::irange_intersect): New.
(subtract_one): New.
(irange::invert): Adjust for irange.
(dump_bound_with_infinite_markers): New.
(irange::dump): Adjust for irange.
(debug): Add irange versions.
(range_has_numeric_bounds_p): Adjust for irange.
(vrp_val_max): Move to header file.
(vrp_val_min): Move to header file.
(DEFINE_INT_RANGE_GC_STUBS): New.
(DEFINE_INT_RANGE_INSTANCE): New.
* value-range.h (class irange): New.
(class int_range): New.
(class value_range): Rename to an instantiation of int_range.
(irange::legacy_mode_p): New.
(value_range::value_range): Remove.
(irange::kind): New.
(irange::num_pairs): Adjust for irange.
(irange::type): Adjust for irange.
(irange::tree_lower_bound): New.
(irange::tree_upper_bound): New.
(irange::type): Adjust for irange.
(irange::min): Same.
(irange::max): Same.
(irange::varying_p): Same.
(irange::undefined_p): Same.
(irange::zero_p): Same.
(irange::nonzero_p): Same.
(irange::supports_type_p): Same.
(range_includes_zero_p): Same.
(gt_ggc_mx): New.
(gt_pch_nx): New.
(irange::irange): New.
(int_range::int_range): New.
(int_range::operator=): New.
(irange::set): Moved from value-range.cc and adjusted for irange.
(irange::set_undefined): Same.
(irange::set_varying): Same.
(irange::operator==): Same.
(irange::lower_bound): Same.
(irange::upper_bound): Same.
(irange::union_): Same.
(irange::intersect): Same.
(irange::set_nonzero): Same.
(irange::set_zero): Same.
(irange::normalize_min_max): New.
(vrp_val_max): Move from value-range.cc.
(vrp_val_min): Same.
* vr-values.c (vr_values::get_lattice_entry): Call value_range
constructor.
|
|
The following testcase ICEs since r10-3199.
There is a switch with default label, where the controlling expression has
range just 0..7 and there are case labels for all those 8 values, but
nothing has yet optimized away the default.
Since r10-3199, set_switch_stmt_execution_predicate sets the switch to
default label's edge's predicate to a false predicate and then
compute_bb_predicates propagates the predicates through the cfg, but false
predicates aren't really added. The caller of compute_bb_predicates
in one place handles NULL bb->aux as false predicate:
  if (fbi.info)
    {
      if (bb->aux)
        bb_predicate = *(predicate *) bb->aux;
      else
        bb_predicate = false;
    }
  else
    bb_predicate = true;
but then in two further spots that the patch below is changing
it assumes bb->aux must be non-NULL. Those two spots are guarded by a
condition that is only true if fbi.info is non-NULL, so I think the right
fix is to treat NULL aux as false predicate in those spots too.
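The idea of the fix, reduced to a self-contained toy: toy_predicate, toy_bb and bb_predicate_or_false are hypothetical stand-ins and not the code changed in ipa-fnsummary.c.

```cpp
// Hypothetical stand-ins for predicate and basic_block.
struct toy_predicate
{
  bool value;
};

struct toy_bb
{
  void *aux = nullptr;  // Points to a toy_predicate, or is NULL.
};

// A NULL aux means no predicate was propagated to the block, which is
// read as the false predicate instead of being dereferenced.
toy_predicate
bb_predicate_or_false (const toy_bb &bb)
{
  if (bb.aux)
    return *static_cast<const toy_predicate *> (bb.aux);
  return toy_predicate{false};
}
```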
2020-07-13 Jakub Jelinek <jakub@redhat.com>
PR ipa/96130
* ipa-fnsummary.c (analyze_function_body): Treat NULL bb->aux
as false predicate.
* gcc.dg/torture/pr96130.c: New test.
|
|
gcc/ChangeLog:
2020-04-04 Jan Hubicka <hubicka@ucw.cz>
PR ipa/93940
* ipa-fnsummary.c (vrp_will_run_p): New function.
(fre_will_run_p): New function.
(evaluate_properties_for_edge): Use it.
* ipa-inline.c (can_inline_edge_by_limits_p): Do not inline
!optimize_debug to optimize_debug.
gcc/testsuite/ChangeLog:
2020-04-04 Jan Hubicka <hubicka@ucw.cz>
* g++.dg/tree-ssa/pr93940.C: New test.
|
|
This patch fixes wrong code on a testcase where the inliner predicts
builtin_constant_p to be true but we fail to optimize its parameter to a constant
because FRE is not run and the value is passed by an aggregate.
This patch makes the inline predicates disable aggregate tracking
when FRE is not going to be run and similarly value range tracking when VRP is not
going to be run.
This is just a partial fix. Even with it we can arrange FRE/VRP to fail and
produce wrong code, unfortunately.
I think for GCC11 I will need to implement the transformation in ipa-inline
but this is a bit hard to do: predicates only track that a value will be constant
and do not track what the constant will be.
Optimizing builtin_constant_p in a conditional is not going to do a good job
when the value is used later in a place that expects it to be constant.
This is a pre-existing problem that is not limited to inline tracking. For example,
FRE may do the transform at one place but not in another due to alias oracle
walking limits.
So I am not sure what a full fix would be :(
gcc/ChangeLog:
2020-04-04 Jan Hubicka <hubicka@ucw.cz>
PR ipa/93940
* ipa-fnsummary.c (vrp_will_run_p): New function.
(fre_will_run_p): New function.
(evaluate_properties_for_edge): Use it.
* ipa-inline.c (can_inline_edge_by_limits_p): Do not inline
!optimize_debug to optimize_debug.
gcc/testsuite/ChangeLog:
2020-04-04 Jan Hubicka <hubicka@ucw.cz>
* g++.dg/tree-ssa/pr93940.C: New test.
|
|
gcc/ChangeLog:
2020-03-20 Jan Hubicka <hubicka@ucw.cz>
PR ipa/93347
* cgraph.c (symbol_table::create_edge): Update calls_comdat_local flag.
(cgraph_edge::redirect_callee): Move here; likewise.
(cgraph_node::remove_callees): Update calls_comdat_local flag.
(cgraph_node::verify_node): Verify that calls_comdat_local flag match
reality.
(cgraph_node::check_calls_comdat_local_p): New member function.
* cgraph.h (cgraph_node::check_calls_comdat_local_p): Declare.
(cgraph_edge::redirect_callee): Move offline.
* ipa-fnsummary.c (compute_fn_summary): Do not compute
calls_comdat_local flag here.
* ipa-inline-transform.c (inline_call): Fix updating of
calls_comdat_local flag.
* ipa-split.c (split_function): Use true instead of 1 to set the flag.
* symtab.c (symtab_node::add_to_same_comdat_group): Update
calls_comdat_local flag.
gcc/testsuite/ChangeLog:
2020-03-20 Jan Hubicka <hubicka@ucw.cz>
* g++.dg/torture/pr93347.C: New test.
|
|
This patch started as work to resolve Richard's comment on quadratic lookups
in resolve_speculation. While doing it, however, I noticed multiple problems
in the new speculative call code, which made the patch quite big. In
particular:
1) Before applying speculation we consider only targets with at least
probability 1/2.
If the profile is sane, at most two targets can have probability greater than or
equal to 1/2. So the new multi-target speculation code got enabled only
in the very special scenario when there are precisely two targets with
probability exactly 1/2 (which is tested by the single testcase).
As a consequence the multiple target logic got minimal test coverage and
this made us miss several ICEs.
2) Profile updating in profile merging, tree-inline and indirect call
expansion was wrong, which led to inconsistent profiles (as already seen
on the testcase).
3) Code responsible for turning a speculative call into a direct call was broken for
anything with more than one target.
4) There were multiple cases where call_site_hash went out of sync, which
eventually leads to an ICE.
5) Some code expects that all speculative call targets form a sequence in
the callee linked list, but there is no code to maintain that invariant
nor a verifier.
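The arithmetic behind point 1 can be checked with a small sketch. It is not GCC code; `targets_at_least` is a hypothetical helper that counts how many targets of one indirect call reach a given share of the total call count.

```cpp
#include <cstddef>
#include <vector>

// Count targets whose share of the total call count is at least num/den.
// The comparison count/total >= num/den is done in integers as
// count*den >= total*num to avoid floating-point rounding.
// Assumes the counts are small enough that the products do not overflow.
std::size_t targets_at_least(const std::vector<unsigned long long> &counts,
                             unsigned long long num, unsigned long long den)
{
  unsigned long long total = 0;
  for (unsigned long long c : counts)
    total += c;

  std::size_t n = 0;
  for (unsigned long long c : counts)
    if (total != 0 && c * den >= total * num)
      ++n;
  return n;
}
```

With a 1/2 threshold, counts of {50, 50} give two targets, {60, 40} give one, and {40, 30, 30} give none, so a three-way split never qualifies. Lowering the threshold to 1/4 admits all three targets of the last profile.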
While fixing this it became obvious that the current API of speculative_call_info is
not useful because it really builds on the fact that there are precisely three
components (direct call, ref and indirect call) in every speculative call
sequence. I ended up replacing it with an iterator API for direct calls
(first_speculative_call_target, next_speculative_call_target) and accessors for
the other components, updating the comment in cgraph.h.
Finally I made the work with the call site hash more effective by updating edge
manipulation to keep them in sequence. So the first one can be looked up from the
hash and then they can be iterated by callee.
There are other things that can be improved (for example the speculation should
start with the most common target first), but I will try to keep that for the next
stage1. This patch is mostly about getting rid of ICEs and profile corruption,
which is a regression from GCC 9.
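The lookup-then-iterate scheme described above can be modelled with a toy data structure. This is only a sketch under assumptions: `toy_edge`, `site_hash`, `first_target`, `next_target` and `targets_of` are invented names, not the cgraph_edge interface, and the real invariants are more involved.

```cpp
#include <unordered_map>
#include <vector>

// Toy stand-in for a call graph edge (not GCC's cgraph_edge).
struct toy_edge {
  int call_site;         // identifies the call statement
  int target;            // speculated callee
  bool speculative;      // true for speculative direct edges
  toy_edge *next_callee; // next edge in the caller's callee list
};

// The hash stores only the first edge of each speculative sequence.
using site_hash = std::unordered_map<int, toy_edge *>;

// Look up the first speculative target of a call site, or nullptr.
toy_edge *first_target(const site_hash &hash, int call_site)
{
  auto it = hash.find(call_site);
  return it == hash.end() ? nullptr : it->second;
}

// Step to the next target; relies on the targets of one call site being
// adjacent in the callee list.
toy_edge *next_target(const toy_edge *e)
{
  toy_edge *n = e->next_callee;
  if (n && n->speculative && n->call_site == e->call_site)
    return n;
  return nullptr;
}

// Collect all speculated callees of one call site.
std::vector<int> targets_of(const site_hash &hash, int call_site)
{
  std::vector<int> out;
  for (toy_edge *e = first_target(hash, call_site); e; e = next_target(e))
    out.push_back(e->target);
  return out;
}
```

Only one hash entry per call site is needed, which is why keeping the edges adjacent matters: if another edge were spliced into the middle of a sequence, `next_target` would stop early.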
gcc/ChangeLog:
PR lto/93318
* cgraph.c (cgraph_add_edge_to_call_site_hash): Update call site
hash only when edge is first within the sequence.
(cgraph_edge::set_call_stmt): Update handling of speculative calls.
(symbol_table::create_edge): Do not set target_prob.
(cgraph_edge::remove_caller): Watch for speculative calls when updating
the call site hash.
(cgraph_edge::make_speculative): Drop target_prob parameter.
(cgraph_edge::speculative_call_info): Remove.
(cgraph_edge::first_speculative_call_target): New member function.
(update_call_stmt_hash_for_removing_direct_edge): New function.
(cgraph_edge::resolve_speculation): Rewrite to new API.
(cgraph_edge::speculative_call_for_target): New member function.
(cgraph_edge::make_direct): Rewrite to new API; fix handling of
multiple speculation targets.
(cgraph_edge::redirect_call_stmt_to_callee): Likewise; fix updating
of profile.
(verify_speculative_call): Verify that targets form an interval.
* cgraph.h (cgraph_edge::speculative_call_info): Remove.
(cgraph_edge::first_speculative_call_target): New member function.
(cgraph_edge::next_speculative_call_target): New member function.
(cgraph_edge::speculative_call_target_ref): New member function.
(cgraph_edge::speculative_call_indirect_edge): New member function.
(cgraph_edge): Remove target_prob.
* cgraphclones.c (cgraph_node::set_call_stmt_including_clones):
Fix handling of speculative calls.
* ipa-devirt.c (ipa_devirt): Fix handling of speculative calls.
* ipa-fnsummary.c (analyze_function_body): Likewise.
* ipa-inline.c (speculation_useful_p): Use new speculative call API.
* ipa-profile.c (dump_histogram): Fix formatting.
(ipa_profile_generate_summary): Watch for overflows.
(ipa_profile): Do not require probability to be 1/2; update to new API.
* ipa-prop.c (ipa_make_edge_direct_to_target): Update to new API.
(update_indirect_edges_after_inlining): Update to new API.
* ipa-utils.c (ipa_merge_profiles): Rewrite merging of speculative call
profiles.
* profile-count.h (profile_probability::adjusted): New.
* tree-inline.c (copy_bb): Update to new speculative call API; fix
updating of profile.
* value-prof.c (gimple_ic_transform): Rename to ...
(dump_ic_profile): ... this one; update dumping.
(stream_in_histogram_value): Fix formatting.
(gimple_value_profile_transformations): Update.
gcc/testsuite/ChangeLog:
* g++.dg/tree-prof/indir-call-prof.C: Update template.
* gcc.dg/tree-prof/crossmodule-indircall-1.c: Add more targets.
* gcc.dg/tree-prof/crossmodule-indircall-1a.c: Add more targets.
* gcc.dg/tree-prof/indir-call-prof.c: Update template.
|
|
While analyzing a code size regression in the SPEC2k GCC binary I noticed that we
make some inline decisions because we think that the number of executions is
very high.
In particular there was an inline decision inlining gen_rtx_fmt_ee into find_reloads
believing that it is called 4 billion times. This turned out to be an accumulation
of roundoff errors in propagate_freq, which was a bit mechanically updated from
the original sreals to C++ sreals and later to the new probabilities.
This led us to estimate that a loopback edge is reached with probability 2.3,
which was capped to 1-1/10000, and since this happened in a nested loop it quickly
escalated to large values.
Originally capping to REG_BR_PROB_BASE avoided such problems but now we have
a much higher range.
This patch avoids going from probabilities to REG_BR_PROB_BASE so precision is
kept. In addition it makes the propagation not estimate more than
param-max-predicted-loop-iterations. The first change makes the cap not
be triggered on the gcc build, but it is still better to be safe than sorry.
* ipa-fnsummary.c (estimate_calls_size_and_time): Fix formatting of
dump.
* params.opt (max-predicted-iterations): Set bounds.
* predict.c (real_almost_one, real_br_prob_base,
real_inv_br_prob_base, real_one_half, real_bb_freq_max): Remove.
(propagate_freq): Add max_cyclic_prob parameter; cap cyclic
probabilities; do not truncate to reg_br_prob_bases.
(estimate_loops_at_level): Pass max_cyclic_prob.
(estimate_loops): Compute max_cyclic_prob.
(estimate_bb_frequencies): Do not initialize real_*; update calculation
of back edge prob.
* profile-count.c (profile_probability::to_sreal): New.
* profile-count.h (class sreal): Move up in file.
(profile_probability::to_sreal): Declare.
|
|
v8:
1. Rebase to master with Martin's static function (r280043) comments merged.
Bootstrap/testsuite/SPEC2017 tests pass on Power8-LE.
2. TODO:
2.1. C++ devirt for multiple speculative call targets.
2.2. ipa-icf ipa_merge_profiles refinement with a COMDAT inline testcase.
This patch aims to fix PR69678, caused by PGO indirect call profiling
performance issues.
The bug that profiling data never worked was fixed by Martin's pull
back of the topN patches; performance got a GEOMEAN ~1% improvement (+24% for
511.povray_r specifically).
Still, currently the default profile only generates a SINGLE indirect target
that is called more than 75% of the time. This patch leverages MULTIPLE indirect
targets in the LTO-WPA and LTO-LTRANS stages; as a result, function
specialization, profiling, partial devirtualization, inlining and
cloning can be done successfully based on it.
Run time improves from 0.70 sec to 0.38 sec on simple tests.
Details are:
1. PGO with topn is enabled by default now, but only one indirect
target edge will be generated in the ipa-profile pass, so add variables to enable
multiple speculative edges through passes. speculative_id records the
index of the direct edge bound to the indirect edge, and the indirect_call_targets
length records how many direct edges are owned by the indirect edge. Postpone gimple_ic
to ipa-profile like the default, as the inline pass will decide whether it is beneficial
to transform the indirect call.
2. Use speculative_id to track and search the reference node matched
with the direct edge's callee for multiple targets. Actually, it is the
caller's responsibility to handle the direct edges mapped to the same indirect
edge. speculative_call_info will return the specified direct edge;
this mostly leverages the current IPA edge processing framework.
3. Enable LTO WPA/LTRANS stage multiple indirect call targets analysis for
full profile support in ipa passes and cgraph_edge functions. speculative_id
can be set through make_speculative when multiple targets are bound to
one indirect edge, and is cloned if a new edge is cloned. speculative_id
is streamed out and streamed in by lto like lto_stmt_uid.
4. Create and duplicate all speculative direct edges' call summaries
in ipa-fnsummary.c with auto_vec.
5. Add 1 in-module testcase and 2 cross-module testcases.
6. Bootstrap and regression tests passed on Power8-LE. No functional
or performance regression for SPEC2017.
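As a rough model of the bookkeeping in point 1, the sketch below picks the hottest targets of one indirect call and numbers them. It is an illustration only: `profiled_target`, `toy_speculation` and `pick_targets` are invented for this example and do not mirror the ipa-profile implementation or its thresholds.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One histogram entry of an indirect call: callee id and its call count.
struct profiled_target { int callee; unsigned long long count; };

// A speculative direct edge in this toy model; speculative_id is the index
// of the edge among those attached to the same indirect call.
struct toy_speculation { int callee; unsigned speculative_id; };

// Keep at most max_targets callees, hottest first, skipping zero counts.
std::vector<toy_speculation>
pick_targets(std::vector<profiled_target> hist, std::size_t max_targets)
{
  std::stable_sort(hist.begin(), hist.end(),
                   [](const profiled_target &a, const profiled_target &b) {
                     return a.count > b.count;
                   });

  std::vector<toy_speculation> out;
  for (std::size_t i = 0; i < hist.size() && i < max_targets; ++i)
    if (hist[i].count != 0)
      out.push_back({hist[i].callee, static_cast<unsigned>(out.size())});
  return out;
}
```

The size of the returned vector plays the role of the per-call target count, and each id lets a direct edge be matched back to its reference later.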
gcc/ChangeLog
2020-01-14 Xiong Hu Luo <luoxhu@linux.ibm.com>
PR ipa/69678
* cgraph.c (symbol_table::create_edge): Init speculative_id and
target_prob.
(cgraph_edge::make_speculative): Add param for setting speculative_id
and target_prob.
(cgraph_edge::speculative_call_info): Update comments and find reference
by speculative_id for multiple indirect targets.
(cgraph_edge::resolve_speculation): Decrease the speculations
for indirect edge, drop its speculative flag if no direct target
is left. Update comments.
(cgraph_edge::redirect_call_stmt_to_callee): Likewise.
(cgraph_node::dump): Print num_speculative_call_targets.
(cgraph_node::verify_node): Don't report error if speculative
edge not include statement.
(cgraph_edge::num_speculative_call_targets_p): New function.
* cgraph.h (int common_target_id): Remove.
(int common_target_probability): Remove.
(num_speculative_call_targets): New variable.
(make_speculative): Add param for setting speculative_id.
(cgraph_edge::num_speculative_call_targets_p): New declare.
(target_prob): New variable.
(speculative_id): New variable.
* ipa-fnsummary.c (analyze_function_body): Create and duplicate
call summaries for multiple speculative call targets.
* cgraphclones.c (cgraph_node::create_clone): Clone speculative_id.
* ipa-profile.c (struct speculative_call_target): New struct.
(class speculative_call_summary): New class.
(class speculative_call_summaries): New class.
(call_sums): New variable.
(ipa_profile_generate_summary): Generate indirect multiple targets summaries.
(ipa_profile_write_edge_summary): New function.
(ipa_profile_write_summary): Stream out indirect multiple targets summaries.
(ipa_profile_dump_all_summaries): New function.
(ipa_profile_read_edge_summary): New function.
(ipa_profile_read_summary_section): New function.
(ipa_profile_read_summary): Stream in indirect multiple targets summaries.
(ipa_profile): Generate num_speculative_call_targets from
profile summaries.
* ipa-ref.h (speculative_id): New variable.
* ipa-utils.c (ipa_merge_profiles): Update with target_prob.
* lto-cgraph.c (lto_output_edge): Remove indirect common_target_id and
common_target_probability. Stream out speculative_id and
num_speculative_call_targets.
(input_edge): Likewise.
* predict.c (dump_prediction): Remove edges count assert to be
precise.
* symtab.c (symtab_node::create_reference): Init speculative_id.
(symtab_node::clone_references): Clone speculative_id.
(symtab_node::clone_referring): Clone speculative_id.
(symtab_node::clone_reference): Clone speculative_id.
(symtab_node::clear_stmts_in_references): Clear speculative_id.
* tree-inline.c (copy_bb): Duplicate all the speculative edges
if indirect call contains multiple speculative targets.
* value-prof.h (check_ic_target): Remove.
* value-prof.c (gimple_value_profile_transformations):
Use void function gimple_ic_transform.
* value-prof.c (gimple_ic_transform): Handle topn case.
Fix comment typos. Change it to a void function.
gcc/testsuite/ChangeLog
2020-01-14 Xiong Hu Luo <luoxhu@linux.ibm.com>
PR ipa/69678
* gcc.dg/tree-prof/indir-call-prof-topn.c: New testcase.
* gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c: New testcase.
* gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c: New testcase.
* gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c: New testcase.
* lib/scandump.exp: Dump executable file name.
* lib/scanwpaipa.exp: New scan-pgo-wpa-ipa-dump.
|
|
2020-01-09 Martin Jambor <mjambor@suse.cz>
* cgraph.h (cgraph_edge): Make remove, set_call_stmt, make_direct,
resolve_speculation and redirect_call_stmt_to_callee static. Change
return type of set_call_stmt to cgraph_edge *.
* auto-profile.c (afdo_indirect_call): Adjust call to
redirect_call_stmt_to_callee.
* cgraph.c (cgraph_edge::set_call_stmt): Make return cgraph_edge *,
make the this pointer explicit, adjust self-recursive calls and the
call to make_direct. Return the resulting edge.
(cgraph_edge::remove): Make this pointer explicit.
(cgraph_edge::resolve_speculation): Likewise, adjust call to remove.
(cgraph_edge::make_direct): Likewise, adjust call to
resolve_speculation.
(cgraph_edge::redirect_call_stmt_to_callee): Likewise, also adjust
call to set_call_stmt.
(cgraph_update_edges_for_call_stmt_node): Update call to
set_call_stmt and remove.
* cgraphclones.c (cgraph_node::set_call_stmt_including_clones):
Renamed edge to master_edge. Adjusted calls to set_call_stmt.
(cgraph_node::create_edge_including_clones): Moved "first" definition
of edge to the block where it was used. Adjusted calls to
set_call_stmt.
(cgraph_node::remove_symbol_and_inline_clones): Adjust call to
cgraph_edge::remove.
* cgraphunit.c (walk_polymorphic_call_targets): Adjusted calls to
make_direct and redirect_call_stmt_to_callee.
* ipa-fnsummary.c (redirect_to_unreachable): Adjust calls to
resolve_speculation and make_direct.
* ipa-inline-transform.c (inline_transform): Adjust call to
redirect_call_stmt_to_callee.
(check_speculations_1): Adjust call to resolve_speculation.
* ipa-inline.c (resolve_noninline_speculation): Adjust call to
resolve_speculation.
(inline_small_functions): Adjust call to resolve_speculation.
(ipa_inline): Likewise.
* ipa-prop.c (ipa_make_edge_direct_to_target): Adjust call to
make_direct.
* ipa-visibility.c (function_and_variable_visibility): Make iteration
safe with regards to edge removal, adjust calls to
redirect_call_stmt_to_callee.
* ipa.c (walk_polymorphic_call_targets): Adjust calls to make_direct
and redirect_call_stmt_to_callee.
* multiple_target.c (create_dispatcher_calls): Adjust call to
redirect_call_stmt_to_callee.
(redirect_to_specific_clone): Likewise.
* tree-cfgcleanup.c (delete_unreachable_blocks_update_callgraph):
Adjust calls to cgraph_edge::remove.
* tree-inline.c (copy_bb): Adjust call to set_call_stmt.
(redirect_all_calls): Adjust call to redirect_call_stmt_to_callee.
(expand_call_inline): Adjust call to cgraph_edge::remove.
From-SVN: r280043
|
|
2020-01-09 Martin Liska <mliska@suse.cz>
* auto-profile.c (auto_profile): Use opt_for_fn
for a parameter.
* ipa-cp.c (ipcp_lattice::add_value): Likewise.
(propagate_vals_across_arith_jfunc): Likewise.
(hint_time_bonus): Likewise.
(incorporate_penalties): Likewise.
(good_cloning_opportunity_p): Likewise.
(perform_estimation_of_a_value): Likewise.
(estimate_local_effects): Likewise.
(ipcp_propagate_stage): Likewise.
* ipa-fnsummary.c (decompose_param_expr): Likewise.
(set_switch_stmt_execution_predicate): Likewise.
(analyze_function_body): Likewise.
* ipa-inline-analysis.c (offline_size): Likewise.
* ipa-inline.c (early_inliner): Likewise.
* ipa-prop.c (ipa_analyze_node): Likewise.
(ipcp_transform_function): Likewise.
* ipa-sra.c (process_scan_results): Likewise.
(ipa_sra_summarize_function): Likewise.
* params.opt: Rename ipcp-unit-growth to
ipa-cp-unit-growth. Add Optimization for various
IPA-related parameters.
From-SVN: r280040
|
|
2020-01-08 Martin Liska <mliska@suse.cz>
* cgraph.c (cgraph_node::dump): Use ::dump_name or
::dump_asm_name instead of (::name or ::asm_name).
* cgraphclones.c (symbol_table::materialize_all_clones): Likewise.
* cgraphunit.c (walk_polymorphic_call_targets): Likewise.
(analyze_functions): Likewise.
(expand_all_functions): Likewise.
* ipa-cp.c (ipcp_cloning_candidate_p): Likewise.
(propagate_bits_across_jump_function): Likewise.
(dump_profile_updates): Likewise.
(ipcp_store_bits_results): Likewise.
(ipcp_store_vr_results): Likewise.
* ipa-devirt.c (dump_targets): Likewise.
* ipa-fnsummary.c (analyze_function_body): Likewise.
* ipa-hsa.c (check_warn_node_versionable): Likewise.
(process_hsa_functions): Likewise.
* ipa-icf.c (sem_item_optimizer::merge_classes): Likewise.
(set_alias_uids): Likewise.
* ipa-inline-transform.c (save_inline_function_body): Likewise.
* ipa-inline.c (recursive_inlining): Likewise.
(inline_to_all_callers_1): Likewise.
(ipa_inline): Likewise.
* ipa-profile.c (ipa_propagate_frequency_1): Likewise.
(ipa_propagate_frequency): Likewise.
* ipa-prop.c (ipa_make_edge_direct_to_target): Likewise.
(remove_described_reference): Likewise.
* ipa-pure-const.c (worse_state): Likewise.
(check_retval_uses): Likewise.
(analyze_function): Likewise.
(propagate_pure_const): Likewise.
(propagate_nothrow): Likewise.
(dump_malloc_lattice): Likewise.
(propagate_malloc): Likewise.
(pass_local_pure_const::execute): Likewise.
* ipa-visibility.c (optimize_weakref): Likewise.
(function_and_variable_visibility): Likewise.
* ipa.c (symbol_table::remove_unreachable_nodes): Likewise.
(ipa_discover_variable_flags): Likewise.
* lto-streamer-out.c (output_function): Likewise.
(output_constructor): Likewise.
* tree-inline.c (copy_bb): Likewise.
* tree-ssa-structalias.c (ipa_pta_execute): Likewise.
* varpool.c (symbol_table::remove_unreferenced_decls): Likewise.
2020-01-08 Martin Liska <mliska@suse.cz>
* lto-partition.c (add_symbol_to_partition_1): Use ::dump_name or
::dump_asm_name instead of (::name or ::asm_name).
(lto_balanced_map): Likewise.
(promote_symbol): Likewise.
(rename_statics): Likewise.
* lto.c (lto_wpa_write_files): Likewise.
2020-01-08 Martin Liska <mliska@suse.cz>
* gcc.dg/ipa/ipa-icf-1.c: Update expected scanned output.
* gcc.dg/ipa/ipa-icf-10.c: Likewise.
* gcc.dg/ipa/ipa-icf-11.c: Likewise.
* gcc.dg/ipa/ipa-icf-12.c: Likewise.
* gcc.dg/ipa/ipa-icf-13.c: Likewise.
* gcc.dg/ipa/ipa-icf-16.c: Likewise.
* gcc.dg/ipa/ipa-icf-18.c: Likewise.
* gcc.dg/ipa/ipa-icf-2.c: Likewise.
* gcc.dg/ipa/ipa-icf-20.c: Likewise.
* gcc.dg/ipa/ipa-icf-21.c: Likewise.
* gcc.dg/ipa/ipa-icf-23.c: Likewise.
* gcc.dg/ipa/ipa-icf-25.c: Likewise.
* gcc.dg/ipa/ipa-icf-26.c: Likewise.
* gcc.dg/ipa/ipa-icf-27.c: Likewise.
* gcc.dg/ipa/ipa-icf-3.c: Likewise.
* gcc.dg/ipa/ipa-icf-35.c: Likewise.
* gcc.dg/ipa/ipa-icf-36.c: Likewise.
* gcc.dg/ipa/ipa-icf-37.c: Likewise.
* gcc.dg/ipa/ipa-icf-38.c: Likewise.
* gcc.dg/ipa/ipa-icf-5.c: Likewise.
* gcc.dg/ipa/ipa-icf-7.c: Likewise.
* gcc.dg/ipa/ipa-icf-8.c: Likewise.
* gcc.dg/ipa/ipa-icf-merge-1.c: Likewise.
* gcc.dg/ipa/pr64307.c: Likewise.
* gcc.dg/ipa/pr90555.c: Likewise.
* gcc.dg/ipa/propmalloc-1.c: Likewise.
* gcc.dg/ipa/propmalloc-2.c: Likewise.
* gcc.dg/ipa/propmalloc-3.c: Likewise.
From-SVN: r280009
|
|
2020-01-08 Martin Liska <mliska@suse.cz>
* ipa-fnsummary.c (dump_ipa_call_summary): Use symtab_node::dump_name.
(ipa_call_context::estimate_size_and_time): Likewise.
(inline_analyze_function): Likewise.
2020-01-08 Martin Liska <mliska@suse.cz>
* lto-partition.c (lto_balanced_map): Use symtab_node::dump_name.
From-SVN: r279999
|
|
From-SVN: r279813
|
|
* ipa-fnsummary.h (ipa_size_summary): Remove copy constructor.
(ipa_size_summary_t): Add duplicate method; move to heap.
* ipa-fnsummary.c (ipa_fn_summary_alloc): Fix allocation.
From-SVN: r279563
|
|
PR ipa/92357
* ipa-fnsummary.c (ipa_fn_summary_write): Use
lto_symtab_encoder_iterator with lsei_start_function_in_partition and
lsei_next_function_in_partition instead of walking all cgraph nodes
in encoder.
From-SVN: r279395
|
|
* ipa-fnsummary.c: Include tree-into-ssa.h.
(compute_fn_summary): Call update_ssa.
From-SVN: r278946
|
|
* cgraph.c (cgraph_node::dump): Dump unit_id and merged_extern_inline.
* cgraph.h (cgraph_node): Add unit_id and
merged_extern_inline.
(symbol_table): Add max_unit.
(symbol_table::symbol_table): Initialize it.
* cgraphclones.c (duplicate_thunk_for_node): Copy unit_id,
merged_comdat, merged_extern_inline.
(cgraph_node::create_clone): Likewise.
(cgraph_node::create_version_clone): Likewise.
* ipa-fnsummary.c (dump_ipa_call_summary): Dump info about cross module
calls.
* ipa-fnsummary.h (cross_module_call_p): New inline function.
* ipa-inline-analysis.c (simple_edge_hints): Use it.
* ipa-inline.c (inline_small_functions): Likewise.
* lto-symtab.c (lto_cgraph_replace_node): Record merged_extern_inline;
copy merged_comdat and merged_extern_inline.
* lto-cgraph.c (lto_output_node): Stream out merged_comdat,
merged_extern_inline and unit_id.
(input_overwrite_node): Stream in these.
(input_cgraph_1): Set unit_base.
* lto-streamer.h (lto_file_decl_data): Add unit_base.
* symtab.c (symtab_node::make_decl_local): Record former_comdat.
* g++.dg/lto/inline-crossmodule-1.h: New testcase.
* g++.dg/lto/inline-crossmodule-1_0.C: New testcase.
* g++.dg/lto/inline-crossmodule-1_1.C: New testcase.
From-SVN: r278876
|
|
2019-11-25 Martin Liska <mliska@suse.cz>
PR bootstrap/92653
* ipa-fnsummary.c (ipa_fn_summary::account_size_time): Comment out
too strict checking assert.
From-SVN: r278686
|
|
This patch adds opt_for_fn for all cross module params used by the inliner
so they can be modified at function granularity. With inlining there are almost always
three functions to consider (callee and caller of the inlined edge
and the outer function the caller is inlined to).
I always use the outer function params since that is how local parameters
behave. I hope it is kind of what is also expected in most cases: it is better
to inline aggressively into -O3 compiled code rather than inline -O3
functions aggressively into their callers.
The new params infrastructure is nice. One drawback is that it is very hard to
search for individual param uses since they all occupy the global namespace.
In the C++ world we had a chance to do something like params.param_flag_name
or params::param_flag_name instead...
Bootstrapped/regtested x86_64-linux, committed.
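The "use the outer function's params" rule can be pictured with a toy model. This is a sketch under assumptions, not the GCC implementation: `toy_params`, `toy_function`, `outer_function` and `inline_limit_for_edge` are hypothetical, and the toy walks a chain of `inlined_to` links, which may differ from how the real call graph records the outer function.

```cpp
// Per-function parameter set in this toy model.
struct toy_params { int max_inline_insns_single; };

// A function body; inlined_to is null unless the body has been inlined
// into another function.
struct toy_function {
  toy_params params;
  const toy_function *inlined_to;
};

// Find the function whose body will finally contain the code.
const toy_function &outer_function(const toy_function &f)
{
  const toy_function *cur = &f;
  while (cur->inlined_to)
    cur = cur->inlined_to;
  return *cur;
}

// The limit applied to an edge whose caller is `caller`: taken from the
// outer function, not from the caller or the callee.
int inline_limit_for_edge(const toy_function &caller)
{
  return outer_function(caller).params.max_inline_insns_single;
}
```

With an outer function whose limit is 200 and a caller already inlined into it whose own limit is 70, the edge is judged against 200. That matches the stated intent of inlining aggressively into code compiled at the higher optimization level.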
* cif-code.def (MAX_INLINE_INSNS_SINGLE_O2_LIMIT): Remove.
* doc/invoke.texi (max-inline-insns-single-O2,
inline-heuristics-hint-percent-O2, inline-min-speedup-O2,
early-inlining-insns-O2): Remove documentation.
* ipa-fnsummary.c (analyze_function_body,
compute_fn_summary): Use opt_for_fn when accessing parameters.
* ipa-inline.c (caller_growth_limits, can_inline_edge_p,
inline_insns_auto, can_inline_edge_by_limits_p,
want_early_inline_function_p, big_speedup_p,
want_inline_small_function_p, want_inline_self_recursive_call_p,
recursive_inlining, compute_max_insns, inline_small_functions):
Likewise.
* opts.c (default_options): Add -O3 defaults for
OPT__param_early_inlining_insns_,
OPT__param_inline_heuristics_hint_percent_,
OPT__param_inline_min_speedup_, OPT__param_max_inline_insns_single_.
* params.opt (-param=early-inlining-insns-O2=,
-param=inline-heuristics-hint-percent-O2=,
-param=inline-min-speedup-O2=, -param=max-inline-insns-single-O2=,
-param=early-inlining-insns=, -param=inline-heuristics-hint-percent=,
-param=inline-min-speedup=, -param=inline-unit-growth=,
-param=large-function-growth=, -param=large-stack-frame=,
-param=large-stack-frame-growth=, -param=large-unit-insns=,
-param=max-inline-insns-recursive=,
-param=max-inline-insns-recursive-auto=,
-param=max-inline-insns-single=,
-param=max-inline-insns-size=, -param=max-inline-insns-small=,
-param=max-inline-recursive-depth=,
-param=max-inline-recursive-depth-auto=,
-param=min-inline-recursive-probability=,
-param=partial-inlining-entry-probability=,
-param=uninlined-function-insns=, -param=uninlined-function-time=,
-param=uninlined-thunk-insns=, -param=uninlined-thunk-time=): Add
Optimization.
* g++.dg/tree-ssa/pr53844.C: Drop -O2 from param name.
* g++.dg/tree-ssa/pr61034.C: Likewise.
* g++.dg/tree-ssa/pr8781.C: Likewise.
* g++.dg/warn/Wstringop-truncation-1.C: Likewise.
* gcc.dg/ipa/pr63416.c: Likewise.
* gcc.dg/tree-ssa/ssa-thread-12.c: Likewise.
* gcc.dg/vect/pr66142.c: Likewise.
* gcc.dg/winline-3.c: Likewise.
* gcc.target/powerpc/pr72804.c: Likewise.
From-SVN: r278644
|
|
* ipa-fnsummary.c: Fix comment typos.
* ipa-ref.h: Likewise.
* ipa-predicate.h: Likewise.
* ipa-split.c: Likewise.
* ipa-inline-analysis.c: Likewise.
* ipa-predicate.c: Likewise.
* ipa-devirt.c: Likewise.
* ipa-icf.h: Likewise.
* profile-count.c: Likewise.
* ipa-icf.c: Likewise.
(sem_function::equals_wpa): Fix typos in dump messages.
* ipa-icf-gimple.h: Fix comment typos.
* ipa-inline-transform.c: Likewise.
* ipa-polymorphic-call.c: Likewise.
* ipa-fnsummary.h: Likewise.
* ipa-inline.c: Likewise.
(dump_inline_stats): Fix typo in debug dump message.
* profile-count.h: Fix comment typos.
From-SVN: r278643
|
|
vectors to not be allocated.
* ipa-fnsummary.c (evaluate_conditions_for_known_args): Be
ready for some vectors to not be allocated.
(evaluate_properties_for_edge): Document better; make
known_vals and known_aggs caller allocated; avoid determining
values of parameters which are not used.
(ipa_merge_fn_summary_after_inlining): Pre allocate known_vals and
known_aggs.
* ipa-inline-analysis.c (do_estimate_edge_time): Likewise.
(do_estimate_edge_size): Likewise.
(do_estimate_edge_hints): Likewise.
* ipa-cp.c (ipa_get_indirect_edge_target_1): Do not early exit when
values are not known.
(ipa_release_agg_values): Add option to not release vector itself.
From-SVN: r278553
|
|
* ipa-fnsummary.c (ipa_fn_summary::account_size_time): Allow
negative time in calls summary; correct roundoff errors
leading to negative times.
(ipa_merge_fn_summary_after_inlining): Update calls size time table
if present.
(ipa_update_overall_fn_summary): Add RESET parameter.
* ipa-fnsummary.h (ipa_update_overall_fn_summary): Update prototype.
* ipa-inline-transform.c (inline_call): Enable incremental updates.
From-SVN: r278541
|
|
* ipa-fnsummary.c (ipa_fn_summary::account_size_time): Add CALL
parameter and update call_size_time_table.
(ipa_fn_summary::max_size_time_table_size): New constant.
(estimate_calls_size_and_time_1): Break out from ...
(estimate_calls_size_and_time): ... here; implement summary production.
(summarize_calls_size_and_time): New function.
(ipa_call_context::estimate_size_and_time): Bypass
estimate_calls_size_and_time for leaf functions.
(ipa_update_overall_fn_summary): Likewise.
* ipa-fnsummary.h (call_size_time_table): New.
(ipa_fn_summary::account_size_time): Update prototype.
From-SVN: r278513
|
|
* ipa-fnsummary.c (estimate_edge_size_and_time): Drop parameter PROB.
(estimate_calls_size_and_time): Update.
From-SVN: r278460
|