Age | Commit message (Collapse) | Author | Files | Lines |
|
This patch fixes the last bit of PR 99122 where various bits of IPA
infrastructure are presented with a program with type mismatches that
make it have undefined behavior, and when inlining or performing
IPA-CP, and encountering such mismatch, we basically try to
VIEW_CONVERT_EXPR whatever the caller has into whatever the callee has
or simply use an empty constructor if that cannot be done. This
however does not work when the callee has VLA parameters because we
ICE in the process.
Richi has already disabled inlining for such cases, this patch avoids
the issue in IPA-CP. It adds checks that whatever constant the
propagation arrived at is actually compatible or fold_convertible to
the callees formal parameer type. Unlike in the past, we now have
types of all parameters of functions that we have analyzed, even with
LTO, and so can do it.
This should prevent only bogus propagations. I have looked at the
effect of the patch on WPA of Firefox and did not have any.
I have bootstrapped and LTO bootstrapped and tested the patch on
x86_64-linux. OK for trunk? And perhaps later for GCC 10 too?
Thanks
gcc/ChangeLog:
2021-02-26 Martin Jambor <mjambor@suse.cz>
PR ipa/99122
* ipa-cp.c (initialize_node_lattices): Mark as bottom all
parameters with unknown type.
(ipacp_value_safe_for_type): New function.
(propagate_vals_across_arith_jfunc): Verify that the constant type
can be used for a type of the formal parameter.
(propagate_vals_across_ancestor): Likewise.
(propagate_scalar_across_jump_function): Likewise. Pass the type
also to propagate_vals_across_ancestor.
gcc/testsuite/ChangeLog:
2021-02-26 Martin Jambor <mjambor@suse.cz>
PR ipa/99122
* gcc.dg/pr99122-3.c: Remove -fno-ipa-cp from options.
|
|
|
|
When looking at the testcase of PR 97816 I realized that the reason
why we were hitting overflows in size growth estimates in IPA-CP is
not because the chains of how lattices feed values to each other are
so long but mainly because we add estimates in callee lattices to
caller lattices for each value source, which roughly corresponds to a
call graph edge, and therefore if there are multiple calls between two
functions passing the same value in a parameter we end up doing it
more than once, sometimes actually quite many times.
This patch avoids it by using a has_set to remember the source values
we have already updated and not increasing their size again.
Furhtermore, to improve estimation of times we scale the propagated
time benefits with edge frequencies as we accumulate them.
This should make any overflows very unlikely but not impossible, so I
still included checks for overflows but decided to restructure the
code to only need it in the propagate_effects function and modified it
so that it does not need to perform the check before each sum.
This is because I decided to add local estimates to propagated
estimates already in propagate_effects and not at the evaluation time.
The function can then do the sums in a wide type and discard them in
the unlikely case of an overflow. I also decided to use the
opportunity to make propagated effect stats now include stats from
other values in the same SCCs. In the dumps I have seen this tended
to increase size cost a tiny bit more than the estimated time benefit
but both increases were small.
Martin
gcc/ChangeLog:
2020-11-20 Martin Jambor <mjambor@suse.cz>
PR ipa/97816
* ipa-cp.c (safe_add): Removed.
(good_cloning_opportunity_p): Remove special handling of INT_MAX.
(value_topo_info<valtype>::propagate_effects): Take care not to
propagate from size one value to another through more sources. Scale
propagated times with edge frequencies. Include local time and size
in propagates ones here. Take care not to overflow size.
(decide_about_value): Do not add local and propagated effects when
passing them to good_cloning_opportunity_p.
|
|
when Fortran functions pass array descriptors they receive as a
parameter to another function, they actually rebuild it. Thanks to
work done mainly by Feng, IPA-CP can already handle the cases when
they pass directly the values loaded from the original descriptor.
Unfortunately, perhaps the most important one, stride, is first
checked against zero and is replaced with one in that case:
_12 = *a_11(D).dim[0].stride;
if (_12 != 0)
goto <bb 4>; [50.00%]
else
goto <bb 3>; [50.00%]
<bb 3>
// empty BB
<bb 4>
# iftmp.22_9 = PHI <_12(2), 1(3)>
...
parm.6.dim[0].stride = iftmp.22_9;
...
__x_MOD_foo (&parm.6, b_31(D));
in the most important and hopefully common cases, the incoming value
is already 1 and we fail to propagate it.
I would therefore like to propose the following way of encoding this
situation in pass-through jump functions using using ASSERTT_EXPR
operation code meaning that if the incoming value is the same as the
"operand" in the jump function, it is passed on, otherwise the result
is unknown. This of course captures only the single (but most
important) case but is an improvement and does not need enlarging the
jump function structure and is simple to pattern match. Encoding that
zero needs to be changed to one would need another field and matching
it would be slightly more complicated too.
gcc/
2020-06-12 Martin Jambor <mjambor@suse.cz>
* ipa-prop.h (ipa_pass_through_data): Expand comment describing
operation.
* ipa-prop.c (analyze_agg_content_value): Detect new special case and
encode it as ASSERT_EXPR.
* ipa-cp.c (values_equal_for_ipcp_p): Move before
ipa_get_jf_arith_result.
(ipa_get_jf_arith_result): Special case ASSERT_EXPR.
gcc/testsuite/
2020-06-12 Martin Jambor <mjambor@suse.cz>
* gfortran.dg/ipcp-array-2.f90: New test.
|
|
The new behavior of safe_add triggered an ICE because of one use where
it had not been used instead of a simple addition. I'll fix it with the
following obvious patch so that periodic benchmarkers can continue
working because a proper fix (see below) will need a review.
The testcase showed me, however, that we can propagate time and cost
from one lattice to another more than once even when that was not the
intent. I'll address that as a follow-up after I verify it does not
affect the IPA-CP heuristics too much or change the corresponding
params accordingly.
Bootstrapped and tested on x86_64-linux.
gcc/ChangeLog:
2020-11-13 Martin Jambor <mjambor@suse.cz>
PR ipa/97816
* ipa-cp.c (value_topo_info<valtype>::propagate_effects): Use
safe_add instead of a simple addition.
|
|
This patch converts the variables that hold time benefits and
frequencies in IPA-CP from plain integers to sreals, avoiding the need
to cap them to avoid overflows and also fixing a potential underflows.
Size costs corresponding to individual constants are left as ints so
that they do not take up too much space. Care must be taken that
adding it up does not overflow, especially in the case of
prop_size_cost, because in cases of extremely long chains of lattice
dependencies it can overflow (e.g. in testsuite/gcc.dg/ipa/pr50744.c).
The overall size is already tracked in long ints.
gcc/ChangeLog:
2020-11-11 Martin Jambor <mjambor@suse.cz>
* ipa-cp.c (class ipcp_value_base): Change the type of
local_time_benefit and prop_time_benefit to sreal. Adjust the
constructor initializer.
(ipcp_lattice::print): Dump sreals.
(struct caller_statistics): Change the type of freq_sum to sreal.
(gather_caller_stats): Work with sreal freq_sum.
(incorporate_penalties): Work with sreal evaluation.
(good_cloning_opportunity_p): Adjusted for sreal sreal time_benefit
and freq_sum. Bail out if size_cost is INT_MAX.
(perform_estimation_of_a_value): Work with sreal time_benefit. Avoid
unnecessary capping.
(estimate_local_effects): Pass sreal time benefit to
good_cloning_opportunity_p without capping it. Adjust dumping.
(safe_add): If there can be overflow, return INT_MAX.
(propagate_effects): Work with sreal times.
(get_info_about_necessary_edges): Work with sreal frequencies.
(decide_about_value): Likewise and with sreal time benefits.
|
|
Martin Liška has been asking me to add debug counters to the IPA-CP pass so
that testcase reductions are easier. The pass already has one for the bit
value propagation, so this patch adds one for value_range propagation
and one for the actual constant propagation.
gcc/ChangeLog:
2020-10-30 Martin Jambor <mjambor@suse.cz>
* dbgcnt.def (ipa_cp_values): New counter.
(ipa_cp_vr): Likewise.
* ipa-cp.c (decide_about_value): Check and bump ipa_cp_values debug
counter.
(decide_whether_version_node): Likewise.
(ipcp_store_vr_results):Check and bump ipa_cp_vr debug counter.
|
|
* Makefile.in: (OBJS): Add symtab-clones.o
(GTFILES): Add symtab-clones.h
* cgraph.c: Include symtab-clones.h.
(cgraph_edge::resolve_speculation): Fix formating
(cgraph_edge::redirect_call_stmt_to_callee): Update.
(cgraph_update_edges_for_call_stmt): Update
(release_function_body): Fix formating.
(cgraph_node::remove): Fix formating.
(cgraph_node::dump): Fix formating.
(cgraph_node::get_availability): Fix formating.
(cgraph_node::call_for_symbol_thunks_and_aliases): Fix formating.
(set_const_flag_1): Fix formating.
(set_pure_flag_1): Fix formating.
(cgraph_node::can_remove_if_no_direct_calls_p): Fix formating.
(collect_callers_of_node_1): Fix formating.
(clone_of_p): Update.
(cgraph_node::verify_node): Update.
(cgraph_c_finalize): Call clone_info::release ().
* cgraph.h (struct cgraph_clone_info): Move to symtab-clones.h.
(cgraph_node): Remove clone_info.
(symbol_table): Add m_clones.
* cgraphclones.c: Include symtab-clone.h.
(duplicate_thunk_for_node): Update.
(cgraph_node::create_clone): Update.
(cgraph_node::create_virtual_clone): Update.
(cgraph_node::find_replacement): Update.
(cgraph_node::materialize_clone): Update.
* gengtype.c (open_base_files): Include symtab-clones.h.
* ipa-cp.c: Include symtab-clones.h.
(initialize_node_lattices): Update.
(want_remove_some_param_p): Update.
(create_specialized_node): Update.
* ipa-fnsummary.c: Include symtab-clones.h.
(ipa_fn_summary_t::duplicate): Update.
* ipa-modref.c: Include symtab-clones.h.
(update_signature): Update.
* ipa-param-manipulation.c: Include symtab-clones.h.
(ipa_param_body_adjustments::common_initialization): Update.
* ipa-prop.c: Include symtab-clones.h.
(adjust_agg_replacement_values): Update.
(ipcp_get_parm_bits): Update.
(ipcp_update_bits): Update.
(ipcp_update_vr): Update.
* ipa-sra.c: Include symtab-clones.h.
(process_isra_node_results): Update.
(disable_unavailable_parameters): Update.
* lto-cgraph.c: Include symtab-clone.h.
(output_cgraph_opt_summary_p): Update.
(output_node_opt_summary): Update.
(input_node_opt_summary): Update.
* symtab-clones.cc: New file.
* symtab-clones.h: New file.
* tree-inline.c (expand_call_inline): Update.
(update_clone_info): Update.
(tree_function_versioning): Update.
|
|
gcc/ChangeLog:
PR lto/97508
* langhooks.c (lhd_begin_section): Call get_section with
not_existing = true.
* output.h (get_section): Add new argument.
* varasm.c (get_section): Fail when NOT_EXISTING is true
and a section already exists.
* ipa-cp.c (ipcp_write_summary): Remove.
(ipcp_read_summary): Likewise.
* ipa-fnsummary.c (ipa_fn_summary_read): Always read jump
functions summary.
(ipa_fn_summary_write): Always stream it.
|
|
this patch moves thunk_info out of cgraph_node into a symbol summary.
I also moved it to separate hearder file since cgraph.h became really too
fat. I plan to contiue with similar breakup in order to cleanup interfaces
and reduce WPA memory footprint (symbol table now consumes more memory than
trees)
gcc/ChangeLog:
2020-10-23 Jan Hubicka <hubicka@ucw.cz>
* Makefile.in: Add symtab-thunks.o
(GTFILES): Add symtab-thunks.h and symtab-thunks.cc; remove cgraphunit.c
* cgraph.c: Include symtab-thunks.h.
(cgraph_node::create_thunk): Update
(symbol_table::create_edge): Update
(cgraph_node::dump): Update
(cgraph_node::call_for_symbol_thunks_and_aliases): Update
(set_nothrow_flag_1): Update
(set_malloc_flag_1): Update
(set_const_flag_1): Update
(collect_callers_of_node_1): Update
(clone_of_p): Update
(cgraph_node::verify_node): Update
(cgraph_node::function_symbol): Update
(cgraph_c_finalize): Call thunk_info::release.
(cgraph_node::has_thunk_p): Update
(cgraph_node::former_thunk_p): Move here from cgraph.h; reimplement.
* cgraph.h (struct cgraph_thunk_info): Rename to symtab-thunks.h.
(cgraph_node): Remove thunk field; add thunk bitfield.
(cgraph_node::expand_thunk): Move to symtab-thunks.h
(symtab_thunks_cc_finalize): Declare.
(cgraph_node::has_gimple_body_p): Update.
(cgraph_node::former_thunk_p): Update.
* cgraphclones.c: Include symtab-thunks.h.
(duplicate_thunk_for_node): Update.
(cgraph_edge::redirect_callee_duplicating_thunks): Update.
(cgraph_node::expand_all_artificial_thunks): Update.
(cgraph_node::create_edge_including_clones): Update.
* cgraphunit.c: Include symtab-thunks.h.
(vtable_entry_type): Move to symtab-thunks.c.
(cgraph_node::analyze): Update.
(analyze_functions): Update.
(mark_functions_to_output): Update.
(thunk_adjust): Move to symtab-thunks.c
(cgraph_node::expand_thunk): Move to symtab-thunks.c
(cgraph_node::assemble_thunks_and_aliases): Update.
(output_in_order): Update.
(cgraphunit_c_finalize): Do not clear vtable_entry_type.
(cgraph_node::create_wrapper): Update.
* gengtype.c (open_base_files): Add symtab-thunks.h
* ipa-comdats.c (propagate_comdat_group): UPdate.
(ipa_comdats): Update.
* ipa-cp.c (determine_versionability): UPdate.
(gather_caller_stats): Update.
(count_callers): Update
(set_single_call_flag): Update
(initialize_node_lattices): Update
(call_passes_through_thunk_p): Update
(call_passes_through_thunk): Update
(propagate_constants_across_call): Update
(find_more_scalar_values_for_callers_subset): Update
(has_undead_caller_from_outside_scc_p): Update
* ipa-fnsummary.c (evaluate_properties_for_edge): Update.
(compute_fn_summary): Update.
(inline_analyze_function): Update.
* ipa-icf.c: Include symtab-thunks.h.
(sem_function::equals_wpa): Update.
(redirect_all_callers): Update.
(sem_function::init): Update.
(sem_function::parse): Update.
* ipa-inline-transform.c: Include symtab-thunks.h.
(inline_call): Update.
(save_inline_function_body): Update.
(preserve_function_body_p): Update.
* ipa-inline.c (inline_small_functions): Update.
* ipa-polymorphic-call.c: Include alloc-pool.h, symbol-summary.h,
symtab-thunks.h
(ipa_polymorphic_call_context::ipa_polymorphic_call_context): Update.
* ipa-pure-const.c: Include symtab-thunks.h.
(analyze_function): Update.
* ipa-sra.c (check_for_caller_issues): Update.
* ipa-utils.c (ipa_reverse_postorder): Update.
(ipa_merge_profiles): Update.
* ipa-visibility.c (non_local_p): Update.
(cgraph_node::local_p): Update.
(function_and_variable_visibility): Update.
* ipa.c (symbol_table::remove_unreachable_nodes): Update.
* lto-cgraph.c: Include alloc-pool.h, symbol-summary.h and
symtab-thunks.h
(lto_output_edge): Update.
(lto_output_node): Update.
(compute_ltrans_boundary): Update.
(output_symtab): Update.
(verify_node_partition): Update.
(input_overwrite_node): Update.
(input_node): Update.
* lto-streamer-in.c (fixup_call_stmt_edges): Update.
* symtab-thunks.cc: New file.
* symtab-thunks.h: New file.
* toplev.c (toplev::finalize): Call symtab_thunks_cc_finalize.
* trans-mem.c (ipa_tm_mayenterirr_function): Update.
(ipa_tm_execute): Update.
* tree-inline.c (expand_call_inline): Update.
* tree-nested.c (create_nesting_tree): Update.
(convert_all_function_calls): Update.
(gimplify_all_functions): Update.
* tree-profile.c (tree_profiling): Update.
* tree-ssa-structalias.c (associate_varinfo_to_alias): Update.
* tree.c (free_lang_data_in_decl): Update.
* value-prof.c (init_node_map): Update.
gcc/c-family/ChangeLog:
2020-10-23 Jan Hubicka <hubicka@ucw.cz>
* c-common.c (c_common_finalize_early_debug): Update for new thunk api.
gcc/d/ChangeLog:
2020-10-23 Jan Hubicka <hubicka@ucw.cz>
* decl.cc (finish_thunk): Update for new thunk api.
gcc/lto/ChangeLog:
2020-10-23 Jan Hubicka <hubicka@ucw.cz>
* lto-partition.c (add_symbol_to_partition_1): Update for new thunk
api.
|
|
A previous patch in the series has taught IPA-CP to identify the
important cloning opportunities in 548.exchange2_r as worthwhile on
their own, but the optimization is still prevented from taking place
because of the overall unit-growh limit. This patches raises that
limit so that it takes place and the benchmark runs 30% faster (on AMD
Zen2 CPU at least).
Before this patch, IPA-CP uses the following formulae to arrive at the
overall_size limit:
base = MAX(orig_size, param_large_unit_insns)
unit_growth_limit = base + base * param_ipa_cp_unit_growth / 100
since param_ipa_cp_unit_growth has default 10, param_large_unit_insns
has default value 10000.
The problem with exchange2 (at least on zen2 but I have had a quick
look on aarch64 too) is that the original estimated unit size is 10513
and so param_large_unit_insns does not apply and the default limit is
therefore 11564 which is good enough only for one of the ideal 8
clonings, we need the limit to be at least 16291.
I would like to raise param_ipa_cp_unit_growth a little bit more soon
too, but most certainly not to 55. Therefore, the large_unit must be
increased. In this patch, I decided to decouple the inlining and
ipa-cp large-unit parameters. It also makes sense because IPA-CP uses
it only at -O3 while inlining also at -O2 (IIUC). But if we agree we
can try raising param_large_unit_insns to 13-14 thousand
"instructions," perhaps it is not necessary. But then again, it may
make sense to actually increase the IPA-CP limit further.
I plan to experiment with IPA-CP tuning on a larger set of programs.
Meanwhile, mainly to address the 548.exchange2_r regression, I'm
suggesting this simple change.
gcc/ChangeLog:
2020-09-07 Martin Jambor <mjambor@suse.cz>
* params.opt (ipa-cp-large-unit-insns): New parameter.
* ipa-cp.c (get_max_overall_size): Use the new parameter.
|
|
When experimenting with IPA-CP parameters, especially when looking
into exchange2_r, it has been very useful to know what the value of
overall_size is at different stages of the decision process. This
patch therefore adds it to the generated dumps.
gcc/ChangeLog:
2020-09-07 Martin Jambor <mjambor@suse.cz>
* ipa-cp.c (estimate_local_effects): Add overeall_size to dumped
string.
(decide_about_value): Add dumping new overall_size.
|
|
This patch enhances the ability of IPA to reason under what conditions
loops in a function have known iteration counts or strides because it
replaces single predicates which currently hold conjunction of
predicates for all loops with vectors capable of holding multiple
predicates, each with a cumulative frequency of loops with the
property.
This second property is then used by IPA-CP to much more aggressively
boost its heuristic score for cloning opportunities which make
iteration counts or strides of frequent loops compile time constant.
gcc/ChangeLog:
2020-09-03 Martin Jambor <mjambor@suse.cz>
* ipa-fnsummary.h (ipa_freqcounting_predicate): New type.
(ipa_fn_summary): Change the type of loop_iterations and loop_strides
to vectors of ipa_freqcounting_predicate.
(ipa_fn_summary::ipa_fn_summary): Construct the new vectors.
(ipa_call_estimates): New fields loops_with_known_iterations and
loops_with_known_strides.
* ipa-cp.c (hint_time_bonus): Multiply param_ipa_cp_loop_hint_bonus
with the expected frequencies of loops with known iteration count or
stride.
* ipa-fnsummary.c (add_freqcounting_predicate): New function.
(ipa_fn_summary::~ipa_fn_summary): Release the new vectors instead of
just two predicates.
(remap_hint_predicate_after_duplication): Replace with function
remap_freqcounting_preds_after_dup.
(ipa_fn_summary_t::duplicate): Use it or duplicate new vectors.
(ipa_dump_fn_summary): Dump the new vectors.
(analyze_function_body): Compute the loop property vectors.
(ipa_call_context::estimate_size_and_time): Calculate also
loops_with_known_iterations and loops_with_known_strides. Adjusted
dumping accordinly.
(remap_hint_predicate): Replace with function
remap_freqcounting_predicate.
(ipa_merge_fn_summary_after_inlining): Use it.
(inline_read_section): Stream loopcounting vectors instead of two
simple predicates.
(ipa_fn_summary_write): Likewise.
* params.opt (ipa-max-loop-predicates): New parameter.
* doc/invoke.texi (ipa-max-loop-predicates): Document new param.
gcc/testsuite/ChangeLog:
2020-09-03 Martin Jambor <mjambor@suse.cz>
* gcc.dg/ipa/ipcp-loophint-1.c: New test.
|
|
A subsequent patch adds another two estimates that the code in
ipa_call_context::estimate_size_and_time computes, and the fact that
the function has a special output parameter for each thing it computes
would make it have just too many. Therefore, this patch collapses all
those ouptut parameters into one output structure.
gcc/ChangeLog:
2020-09-02 Martin Jambor <mjambor@suse.cz>
* ipa-inline-analysis.c (do_estimate_edge_time): Adjusted to use
ipa_call_estimates.
(do_estimate_edge_size): Likewise.
(do_estimate_edge_hints): Likewise.
* ipa-fnsummary.h (struct ipa_call_estimates): New type.
(ipa_call_context::estimate_size_and_time): Adjusted declaration.
(estimate_ipcp_clone_size_and_time): Likewise.
* ipa-cp.c (hint_time_bonus): Changed the type of the second argument
to ipa_call_estimates.
(perform_estimation_of_a_value): Adjusted to use ipa_call_estimates.
(estimate_local_effects): Likewise.
* ipa-fnsummary.c (ipa_call_context::estimate_size_and_time): Adjusted
to return estimates in a single ipa_call_estimates parameter.
(estimate_ipcp_clone_size_and_time): Likewise.
|
|
Hi,
this large patch is mostly mechanical change which aims to replace
uses of separate vectors about known scalar values (usually called
known_vals or known_csts), known aggregate values (known_aggs), known
virtual call contexts (known_contexts) and known value
ranges (known_value_ranges) with uses of either new type
ipa_call_arg_values or ipa_auto_call_arg_values, both of which simply
contain these vectors inside them.
The need for two distinct comes from the fact that when the vectors
are constructed from jump functions or lattices, we really should use
auto_vecs with embedded storage allocated on stack. On the other hand,
the bundle in ipa_call_context can be allocated on heap when in cache,
one time for each call_graph node.
ipa_call_context is constructible from ipa_auto_call_arg_values but
then its vectors must not be resized, otherwise the vectors will stop
pointing to the stack ones. Unfortunately, I don't think the
structure embedded in ipa_call_context can be made constant because we
need to manipulate and deallocate it when in cache.
gcc/ChangeLog:
2020-09-01 Martin Jambor <mjambor@suse.cz>
* ipa-prop.h (ipa_auto_call_arg_values): New type.
(class ipa_call_arg_values): Likewise.
(ipa_get_indirect_edge_target): Replaced vector arguments with
ipa_call_arg_values in declaration. Added an overload for
ipa_auto_call_arg_values.
* ipa-fnsummary.h (ipa_call_context): Removed members m_known_vals,
m_known_contexts, m_known_aggs, duplicate_from, release and equal_to,
new members m_avals, store_to_cache and equivalent_to_p. Adjusted
construcotr arguments.
(estimate_ipcp_clone_size_and_time): Replaced vector arguments
with ipa_auto_call_arg_values in declaration.
(evaluate_properties_for_edge): Likewise.
* ipa-cp.c (ipa_get_indirect_edge_target): Adjusted to work on
ipa_call_arg_values rather than on separate vectors. Added an
overload for ipa_auto_call_arg_values.
(devirtualization_time_bonus): Adjusted to work on
ipa_auto_call_arg_values rather than on separate vectors.
(gather_context_independent_values): Adjusted to work on
ipa_auto_call_arg_values rather than on separate vectors.
(perform_estimation_of_a_value): Likewise.
(estimate_local_effects): Likewise.
(modify_known_vectors_with_val): Adjusted both variants to work on
ipa_auto_call_arg_values and rename them to
copy_known_vectors_add_val.
(decide_about_value): Adjusted to work on ipa_call_arg_values rather
than on separate vectors.
(decide_whether_version_node): Likewise.
* ipa-fnsummary.c (evaluate_conditions_for_known_args): Likewise.
(evaluate_properties_for_edge): Likewise.
(ipa_fn_summary_t::duplicate): Likewise.
(estimate_edge_devirt_benefit): Adjusted to work on
ipa_call_arg_values rather than on separate vectors.
(estimate_edge_size_and_time): Likewise.
(estimate_calls_size_and_time_1): Likewise.
(summarize_calls_size_and_time): Adjusted calls to
estimate_edge_size_and_time.
(estimate_calls_size_and_time): Adjusted to work on
ipa_call_arg_values rather than on separate vectors.
(ipa_call_context::ipa_call_context): Construct from a pointer to
ipa_auto_call_arg_values instead of inividual vectors.
(ipa_call_context::duplicate_from): Adjusted to access vectors within
m_avals.
(ipa_call_context::release): Likewise.
(ipa_call_context::equal_to): Likewise.
(ipa_call_context::estimate_size_and_time): Adjusted to work on
ipa_call_arg_values rather than on separate vectors.
(estimate_ipcp_clone_size_and_time): Adjusted to work with
ipa_auto_call_arg_values rather than on separate vectors.
(ipa_merge_fn_summary_after_inlining): Likewise. Adjusted call to
estimate_edge_size_and_time.
(ipa_update_overall_fn_summary): Adjusted call to
estimate_edge_size_and_time.
* ipa-inline-analysis.c (do_estimate_edge_time): Adjusted to work with
ipa_auto_call_arg_values rather than with separate vectors.
(do_estimate_edge_size): Likewise.
(do_estimate_edge_hints): Likewise.
* ipa-prop.c (ipa_auto_call_arg_values::~ipa_auto_call_arg_values):
New destructor.
|
|
2020-08-31 Feng Xue <fxue@os.amperecomputing.com>
gcc/
PR tree-optimization/96806
* ipa-cp.c (decide_about_value): Use safe_add to avoid cost addition
overflow.
gcc/testsuite/
PR tree-optimization/96806
* g++.dg/ipa/pr96806.C: New test.
|
|
gcc/ada/ChangeLog:
* gcc-interface/trans.c (gigi): Set exact argument of a vector
growth function to true.
(Attribute_to_gnu): Likewise.
gcc/ChangeLog:
* alias.c (init_alias_analysis): Set exact argument of a vector
growth function to true.
* calls.c (internal_arg_pointer_based_exp_scan): Likewise.
* cfgbuild.c (find_many_sub_basic_blocks): Likewise.
* cfgexpand.c (expand_asm_stmt): Likewise.
* cfgrtl.c (rtl_create_basic_block): Likewise.
* combine.c (combine_split_insns): Likewise.
(combine_instructions): Likewise.
* config/aarch64/aarch64-sve-builtins.cc (function_expander::add_output_operand): Likewise.
(function_expander::add_input_operand): Likewise.
(function_expander::add_integer_operand): Likewise.
(function_expander::add_address_operand): Likewise.
(function_expander::add_fixed_operand): Likewise.
* df-core.c (df_worklist_dataflow_doublequeue): Likewise.
* dwarf2cfi.c (update_row_reg_save): Likewise.
* early-remat.c (early_remat::init_block_info): Likewise.
(early_remat::finalize_candidate_indices): Likewise.
* except.c (sjlj_build_landing_pads): Likewise.
* final.c (compute_alignments): Likewise.
(grow_label_align): Likewise.
* function.c (temp_slots_at_level): Likewise.
* fwprop.c (build_single_def_use_links): Likewise.
(update_uses): Likewise.
* gcc.c (insert_wrapper): Likewise.
* genautomata.c (create_state_ainsn_table): Likewise.
(add_vect): Likewise.
(output_dead_lock_vect): Likewise.
* genmatch.c (capture_info::capture_info): Likewise.
(parser::finish_match_operand): Likewise.
* genrecog.c (optimize_subroutine_group): Likewise.
(merge_pattern_info::merge_pattern_info): Likewise.
(merge_into_decision): Likewise.
(print_subroutine_start): Likewise.
(main): Likewise.
* gimple-loop-versioning.cc (loop_versioning::loop_versioning): Likewise.
* gimple.c (gimple_set_bb): Likewise.
* graphite-isl-ast-to-gimple.c (translate_isl_ast_node_user): Likewise.
* haifa-sched.c (sched_extend_luids): Likewise.
(extend_h_i_d): Likewise.
* insn-addr.h (insn_addresses_new): Likewise.
* ipa-cp.c (gather_context_independent_values): Likewise.
(find_more_contexts_for_caller_subset): Likewise.
* ipa-devirt.c (final_warning_record::grow_type_warnings): Likewise.
(ipa_odr_read_section): Likewise.
* ipa-fnsummary.c (evaluate_properties_for_edge): Likewise.
(ipa_fn_summary_t::duplicate): Likewise.
(analyze_function_body): Likewise.
(ipa_merge_fn_summary_after_inlining): Likewise.
(read_ipa_call_summary): Likewise.
* ipa-icf.c (sem_function::bb_dict_test): Likewise.
* ipa-prop.c (ipa_alloc_node_params): Likewise.
(parm_bb_aa_status_for_bb): Likewise.
(ipa_compute_jump_functions_for_edge): Likewise.
(ipa_analyze_node): Likewise.
(update_jump_functions_after_inlining): Likewise.
(ipa_read_edge_info): Likewise.
(read_ipcp_transformation_info): Likewise.
(ipcp_transform_function): Likewise.
* ipa-reference.c (ipa_reference_write_optimization_summary): Likewise.
* ipa-split.c (execute_split_functions): Likewise.
* ira.c (find_moveable_pseudos): Likewise.
* lower-subreg.c (decompose_multiword_subregs): Likewise.
* lto-streamer-in.c (input_eh_regions): Likewise.
(input_cfg): Likewise.
(input_struct_function_base): Likewise.
(input_function): Likewise.
* modulo-sched.c (set_node_sched_params): Likewise.
(extend_node_sched_params): Likewise.
(schedule_reg_moves): Likewise.
* omp-general.c (omp_construct_simd_compare): Likewise.
* passes.c (pass_manager::create_pass_tab): Likewise.
(enable_disable_pass): Likewise.
* predict.c (determine_unlikely_bbs): Likewise.
* profile.c (compute_branch_probabilities): Likewise.
* read-rtl-function.c (function_reader::parse_block): Likewise.
* read-rtl.c (rtx_reader::read_rtx_code): Likewise.
* reg-stack.c (stack_regs_mentioned): Likewise.
* regrename.c (regrename_init): Likewise.
* rtlanal.c (T>::add_single_to_queue): Likewise.
* sched-deps.c (init_deps_data_vector): Likewise.
* sel-sched-ir.c (sel_extend_global_bb_info): Likewise.
(extend_region_bb_info): Likewise.
(extend_insn_data): Likewise.
* symtab.c (symtab_node::create_reference): Likewise.
* tracer.c (tail_duplicate): Likewise.
* trans-mem.c (tm_region_init): Likewise.
(get_bb_regions_instrumented): Likewise.
* tree-cfg.c (init_empty_tree_cfg_for_function): Likewise.
(build_gimple_cfg): Likewise.
(create_bb): Likewise.
(move_block_to_fn): Likewise.
* tree-complex.c (tree_lower_complex): Likewise.
* tree-if-conv.c (predicate_rhs_code): Likewise.
* tree-inline.c (copy_bb): Likewise.
* tree-into-ssa.c (get_ssa_name_ann): Likewise.
(mark_phi_for_rewrite): Likewise.
* tree-object-size.c (compute_builtin_object_size): Likewise.
(init_object_sizes): Likewise.
* tree-predcom.c (initialize_root_vars_store_elim_1): Likewise.
(initialize_root_vars_store_elim_2): Likewise.
(prepare_initializers_chain_store_elim): Likewise.
* tree-ssa-address.c (addr_for_mem_ref): Likewise.
(multiplier_allowed_in_address_p): Likewise.
* tree-ssa-coalesce.c (ssa_conflicts_new): Likewise.
* tree-ssa-forwprop.c (simplify_vector_constructor): Likewise.
* tree-ssa-loop-ivopts.c (addr_offset_valid_p): Likewise.
(get_address_cost_ainc): Likewise.
* tree-ssa-loop-niter.c (discover_iteration_bound_by_body_walk): Likewise.
* tree-ssa-pre.c (add_to_value): Likewise.
(phi_translate_1): Likewise.
(do_pre_regular_insertion): Likewise.
(do_pre_partial_partial_insertion): Likewise.
(init_pre): Likewise.
* tree-ssa-propagate.c (ssa_prop_init): Likewise.
(update_call_from_tree): Likewise.
* tree-ssa-reassoc.c (optimize_range_tests_cmp_bitwise): Likewise.
* tree-ssa-sccvn.c (vn_reference_lookup_3): Likewise.
(vn_reference_lookup_pieces): Likewise.
(eliminate_dom_walker::eliminate_push_avail): Likewise.
* tree-ssa-strlen.c (set_strinfo): Likewise.
(get_stridx_plus_constant): Likewise.
(zero_length_string): Likewise.
(find_equal_ptrs): Likewise.
(printf_strlen_execute): Likewise.
* tree-ssa-threadedge.c (set_ssa_name_value): Likewise.
* tree-ssanames.c (make_ssa_name_fn): Likewise.
* tree-streamer-in.c (streamer_read_tree_bitfields): Likewise.
* tree-vect-loop.c (vect_record_loop_mask): Likewise.
(vect_get_loop_mask): Likewise.
(vect_record_loop_len): Likewise.
(vect_get_loop_len): Likewise.
* tree-vect-patterns.c (vect_recog_mask_conversion_pattern): Likewise.
* tree-vect-slp.c (vect_slp_convert_to_external): Likewise.
(vect_bb_slp_scalar_cost): Likewise.
(vect_bb_vectorization_profitable_p): Likewise.
(vectorizable_slp_permutation): Likewise.
* tree-vect-stmts.c (vectorizable_call): Likewise.
(vectorizable_simd_clone_call): Likewise.
(scan_store_can_perm_p): Likewise.
(vectorizable_store): Likewise.
* expr.c: Likewise.
* vec.c (test_safe_grow_cleared): Likewise.
* vec.h (vec_safe_grow): Likewise.
(vec_safe_grow_cleared): Likewise.
(vl_ptr>::safe_grow): Likewise.
(vl_ptr>::safe_grow_cleared): Likewise.
* config/c6x/c6x.c (insn_set_clock): Likewise.
gcc/c/ChangeLog:
* gimple-parser.c (c_parser_gimple_compound_statement): Set exact argument of a vector
growth function to true.
gcc/cp/ChangeLog:
* class.c (build_vtbl_initializer): Set exact argument of a vector
growth function to true.
* constraint.cc (get_mapped_args): Likewise.
* decl.c (cp_maybe_mangle_decomp): Likewise.
(cp_finish_decomp): Likewise.
* parser.c (cp_parser_omp_for_loop): Likewise.
* pt.c (canonical_type_parameter): Likewise.
* rtti.c (get_pseudo_ti_init): Likewise.
gcc/fortran/ChangeLog:
* trans-openmp.c (gfc_trans_omp_do): Set exact argument of a vector
growth function to true.
gcc/lto/ChangeLog:
* lto-common.c (lto_file_finalize): Set exact argument of a vector
growth function to true.
|
|
The patch aligns code with ipcp_bits_lattice::set_to_constant
where we properly mask m_value with m_mask. The same should
be done here.
gcc/ChangeLog:
PR ipa/96482
* ipa-cp.c (ipcp_bits_lattice::meet_with_1): Mask m_value
with m_mask.
gcc/testsuite/ChangeLog:
PR ipa/96482
* gcc.dg/ipa/pr96482-2.c: New test.
|
|
As mentioned in the PR, let's consider the following example:
int
__attribute__((noinline))
foo(int arg)
{
if (arg == 3)
return 1;
if (arg == 4)
return 123;
__builtin_unreachable ();
}
during WPA we find all calls of the function
(yes the call with value 5 is UBSAN):
Node: foo/0:
param [0]: 5 [loc_time: 4, loc_size: 2, prop_time: 0, prop_size: 0]
3 [loc_time: 3, loc_size: 3, prop_time: 0, prop_size: 0]
ctxs: VARIABLE
Bits: value = 0x5, mask = 0x6
in LTRANS we have the following VRP info:
# RANGE [3, 3] NONZERO 3
when we AND masks in get_default_value we end up with 6 & 3 = 2 (0x010).
That means the only second (least significant bit) is unknown and
value (5 = 0x101) & ~mask gives us either 7 (0x111) or 5 (0x101).
That's why if (arg_2(D) == 3) gets optimized to false.
gcc/ChangeLog:
PR ipa/96482
* ipa-cp.c (ipcp_bits_lattice::meet_with_1): Drop value bits
for bits that are unknown.
(ipcp_bits_lattice::set_to_constant): Likewise.
* tree-ssa-ccp.c (get_default_value): Add sanity check that
IPA CP bit info has all bits set to zero in bits that
are unknown.
gcc/testsuite/ChangeLog:
PR ipa/96482
* gcc.dg/ipa/pr96482.c: New test.
|
|
gcc/ChangeLog:
* dbgcnt.def (DEBUG_COUNTER): Add ipa_cp_bits.
* ipa-cp.c (ipcp_store_bits_results): Use it when we store known
bits for parameters.
|
|
Implement class irange, a generic multi-range implementation for
value ranges. This class is API compatible with value_range, and is meant
to seamlessly coexist with it.
gcc/ChangeLog:
* Makefile.in (GTFILES): Move value-range.h up.
* gengtype-lex.l: Set yylval to handle GTY markers on templates.
* ipa-cp.c (initialize_node_lattices): Call value_range
constructor.
(ipcp_propagate_stage): Use in-place new so value_range construct
is called.
* ipa-fnsummary.c (evaluate_conditions_for_known_args): Use std
vec instead of GCC's vec<>.
(evaluate_properties_for_edge): Adjust for std vec.
(ipa_fn_summary_t::duplicate): Same.
(estimate_ipcp_clone_size_and_time): Same.
* ipa-prop.c (ipa_get_value_range): Use in-place new for
value_range.
* ipa-prop.h (struct GTY): Remove class keyword for m_vr.
* range-op.cc (empty_range_check): Rename to...
(empty_range_varying): ...this and adjust for varying.
(undefined_shift_range_check): Adjust for irange.
(range_operator::wi_fold): Same.
(range_operator::fold_range): Adjust for irange. Special case
single pairs for performance.
(range_operator::op1_range): Adjust for irange.
(range_operator::op2_range): Same.
(value_range_from_overflowed_bounds): Same.
(value_range_with_overflow): Same.
(create_possibly_reversed_range): Same.
(range_true): Same.
(range_false): Same.
(range_true_and_false): Same.
(get_bool_state): Adjust for irange and tweak for performance.
(operator_equal::fold_range): Adjust for irange.
(operator_equal::op1_range): Same.
(operator_equal::op2_range): Same.
(operator_not_equal::fold_range): Same.
(operator_not_equal::op1_range): Same.
(operator_not_equal::op2_range): Same.
(build_lt): Same.
(build_le): Same.
(build_gt): Same.
(build_ge): Same.
(operator_lt::fold_range): Same.
(operator_lt::op1_range): Same.
(operator_lt::op2_range): Same.
(operator_le::fold_range): Same.
(operator_le::op1_range): Same.
(operator_le::op2_range): Same.
(operator_gt::fold_range): Same.
(operator_gt::op1_range): Same.
(operator_gt::op2_range): Same.
(operator_ge::fold_range): Same.
(operator_ge::op1_range): Same.
(operator_ge::op2_range): Same.
(operator_plus::wi_fold): Same.
(operator_plus::op1_range): Same.
(operator_plus::op2_range): Same.
(operator_minus::wi_fold): Same.
(operator_minus::op1_range): Same.
(operator_minus::op2_range): Same.
(operator_min::wi_fold): Same.
(operator_max::wi_fold): Same.
(cross_product_operator::wi_cross_product): Same.
(operator_mult::op1_range): New.
(operator_mult::op2_range): New.
(operator_mult::wi_fold): Adjust for irange.
(operator_div::wi_fold): Same.
(operator_exact_divide::op1_range): Same.
(operator_lshift::fold_range): Same.
(operator_lshift::wi_fold): Same.
(operator_lshift::op1_range): New.
(operator_rshift::op1_range): New.
(operator_rshift::fold_range): Adjust for irange.
(operator_rshift::wi_fold): Same.
(operator_cast::truncating_cast_p): Abstract out from
operator_cast::fold_range.
(operator_cast::fold_range): Adjust for irange and tweak for
performance.
(operator_cast::inside_domain_p): Abstract out from fold_range.
(operator_cast::fold_pair): Same.
(operator_cast::op1_range): Use abstracted methods above. Adjust
for irange and tweak for performance.
(operator_logical_and::fold_range): Adjust for irange.
(operator_logical_and::op1_range): Same.
(operator_logical_and::op2_range): Same.
(unsigned_singleton_p): New.
(operator_bitwise_and::remove_impossible_ranges): New.
(operator_bitwise_and::fold_range): New.
(wi_optimize_and_or): Adjust for irange.
(operator_bitwise_and::wi_fold): Same.
(set_nonzero_range_from_mask): New.
(operator_bitwise_and::simple_op1_range_solver): New.
(operator_bitwise_and::op1_range): Adjust for irange.
(operator_bitwise_and::op2_range): Same.
(operator_logical_or::fold_range): Same.
(operator_logical_or::op1_range): Same.
(operator_logical_or::op2_range): Same.
(operator_bitwise_or::wi_fold): Same.
(operator_bitwise_or::op1_range): Same.
(operator_bitwise_or::op2_range): Same.
(operator_bitwise_xor::wi_fold): Same.
(operator_bitwise_xor::op1_range): New.
(operator_bitwise_xor::op2_range): New.
(operator_trunc_mod::wi_fold): Adjust for irange.
(operator_logical_not::fold_range): Same.
(operator_logical_not::op1_range): Same.
(operator_bitwise_not::fold_range): Same.
(operator_bitwise_not::op1_range): Same.
(operator_cst::fold_range): Same.
(operator_identity::fold_range): Same.
(operator_identity::op1_range): Same.
(class operator_unknown): New.
(operator_unknown::fold_range): New.
(class operator_abs): Adjust for irange.
(operator_abs::wi_fold): Same.
(operator_abs::op1_range): Same.
(operator_absu::wi_fold): Same.
(class operator_negate): Same.
(operator_negate::fold_range): Same.
(operator_negate::op1_range): Same.
(operator_addr_expr::fold_range): Same.
(operator_addr_expr::op1_range): Same.
(pointer_plus_operator::wi_fold): Same.
(pointer_min_max_operator::wi_fold): Same.
(pointer_and_operator::wi_fold): Same.
(pointer_or_operator::op1_range): New.
(pointer_or_operator::op2_range): New.
(pointer_or_operator::wi_fold): Adjust for irange.
(integral_table::integral_table): Add entries for IMAGPART_EXPR
and POINTER_DIFF_EXPR.
(range_cast): Adjust for irange.
(build_range3): New.
(range3_tests): New.
(widest_irange_tests): New.
(multi_precision_range_tests): New.
(operator_tests): New.
(range_tests): New.
* range-op.h (class range_operator): Adjust for irange.
(range_cast): Same.
* tree-vrp.c (range_fold_binary_symbolics_p): Adjust for irange and
tweak for performance.
(range_fold_binary_expr): Same.
(masked_increment): Change to extern.
* tree-vrp.h (masked_increment): New.
* tree.c (cache_wide_int_in_type_cache): New function abstracted
out from wide_int_to_tree_1.
(wide_int_to_tree_1): Cache 0, 1, and MAX for pointers.
* value-range-equiv.cc (value_range_equiv::deep_copy): Use kind
method.
(value_range_equiv::move): Same.
(value_range_equiv::check): Adjust for irange.
(value_range_equiv::intersect): Same.
(value_range_equiv::union_): Same.
(value_range_equiv::dump): Same.
* value-range.cc (irange::operator=): Same.
(irange::maybe_anti_range): New.
(irange::copy_legacy_range): New.
(irange::set_undefined): Adjust for irange.
(irange::swap_out_of_order_endpoints): Abstract out from set().
(irange::set_varying): Adjust for irange.
(irange::irange_set): New.
(irange::irange_set_anti_range): New.
(irange::set): Adjust for irange.
(value_range::set_nonzero): Move to header file.
(value_range::set_zero): Move to header file.
(value_range::check): Rename to...
(irange::verify_range): ...this.
(value_range::num_pairs): Rename to...
(irange::legacy_num_pairs): ...this, and adjust for irange.
(value_range::lower_bound): Rename to...
(irange::legacy_lower_bound): ...this, and adjust for irange.
(value_range::upper_bound): Rename to...
(irange::legacy_upper_bound): ...this, and adjust for irange.
(value_range::equal_p): Rename to...
(irange::legacy_equal_p): ...this.
(value_range::operator==): Move to header file.
(irange::equal_p): New.
(irange::symbolic_p): Adjust for irange.
(irange::constant_p): Same.
(irange::singleton_p): Same.
(irange::value_inside_range): Same.
(irange::may_contain_p): Same.
(irange::contains_p): Same.
(irange::normalize_addresses): Same.
(irange::normalize_symbolics): Same.
(irange::legacy_intersect): Same.
(irange::legacy_union): Same.
(irange::union_): Same.
(irange::intersect): Same.
(irange::irange_union): New.
(irange::irange_intersect): New.
(subtract_one): New.
(irange::invert): Adjust for irange.
(dump_bound_with_infinite_markers): New.
(irange::dump): Adjust for irange.
(debug): Add irange versions.
(range_has_numeric_bounds_p): Adjust for irange.
(vrp_val_max): Move to header file.
(vrp_val_min): Move to header file.
(DEFINE_INT_RANGE_GC_STUBS): New.
(DEFINE_INT_RANGE_INSTANCE): New.
* value-range.h (class irange): New.
(class int_range): New.
(class value_range): Rename to a instantiation of int_range.
(irange::legacy_mode_p): New.
(value_range::value_range): Remove.
(irange::kind): New.
(irange::num_pairs): Adjust for irange.
(irange::type): Adjust for irange.
(irange::tree_lower_bound): New.
(irange::tree_upper_bound): New.
(irange::type): Adjust for irange.
(irange::min): Same.
(irange::max): Same.
(irange::varying_p): Same.
(irange::undefined_p): Same.
(irange::zero_p): Same.
(irange::nonzero_p): Same.
(irange::supports_type_p): Same.
(range_includes_zero_p): Same.
(gt_ggc_mx): New.
(gt_pch_nx): New.
(irange::irange): New.
(int_range::int_range): New.
(int_range::operator=): New.
(irange::set): Moved from value-range.cc and adjusted for irange.
(irange::set_undefined): Same.
(irange::set_varying): Same.
(irange::operator==): Same.
(irange::lower_bound): Same.
(irange::upper_bound): Same.
(irange::union_): Same.
(irange::intersect): Same.
(irange::set_nonzero): Same.
(irange::set_zero): Same.
(irange::normalize_min_max): New.
(vrp_val_max): Move from value-range.cc.
(vrp_val_min): Same.
* vr-values.c (vr_values::get_lattice_entry): Call value_range
constructor.
|
|
In PR ipa/96291 the test contained an SCC with one
unoptimized function. This tricked ipa-cp into NULL dereference.
has_undead_caller_from_outside_scc_p() did not take into account
that unoptimized funtions don't have IPA summary analysis. And
dereferenced NULL pointer causing an ICE.
gcc/
PR ipa/96291
* ipa-cp.c (has_undead_caller_from_outside_scc_p): Consider
unoptimized callers as undead.
gcc/testsuite/
PR ipa/96291
* gcc.dg/lto/pr96291_0.c: New testcase.
* gcc.dg/lto/pr96291_1.c: Support file.
* gcc.dg/lto/pr96291_2.c: Likewise.
* gcc.dg/lto/pr96291.h: Likewise.
|
|
2020-06-01 Feng Xue <fxue@os.amperecomputing.com>
gcc/
PR ipa/93429
* ipa-cp.c (propagate_aggs_across_jump_function): Check aggregate
lattice for simple pass-through by-ref argument.
gcc/testsuite/
PR ipa/93429
* gcc.dg/ipa/ipcp-agg-8.c: Change dump string.
* gcc.dg/ipa/ipcp-agg-13.c: New test.
|
|
PR ipa/94232
* ipa-cp.c (ipa_get_jf_ancestor_result): Use offset in bytes. Previously
build_ref_for_offset function was used and it transforms off to bytes
from bits.
|
|
This avoids using build_ref_for_offset and build_fold_addr_expr
where type mixup easily results in something not IP invariant.
2020-03-19 Richard Biener <rguenther@suse.de>
PR ipa/94217
* ipa-cp.c (ipa_get_jf_ancestor_result): Avoid build_fold_addr_expr
and build_ref_for_offset.
|
|
2020-02-27 Martin Jambor <mjambor@suse.cz>
Feng Xue <fxue@os.amperecomputing.com>
PR ipa/93707
* ipa-cp.c (same_node_or_its_all_contexts_clone_p): Replaced with
new function calls_same_node_or_its_all_contexts_clone_p.
(cgraph_edge_brings_value_p): Use it.
(cgraph_edge_brings_value_p): Likewise.
(self_recursive_pass_through_p): Return false if caller is a clone.
(self_recursive_agg_pass_through_p): Likewise.
testsuite/
* gcc.dg/ipa/pr93707.c: New test.
|
|
PR ipa/93763
* ipa-cp.c (self_recursively_generated_p): Mark self-dependent value as
self-recursively generated.
|
|
Besides simple pass-through (aggregate) jump function, arithmetic (aggregate)
jump function could also bring same (aggregate) value as parameter passed-in
for self-feeding recursive call. For example,
f1 (int i) /* normal jump function */
{
f1 (i & 1);
}
Suppose i is 0, recursive propagation via (i & 1) also gets 0, which
can be seen as a simple pass-through of i.
f2 (int *p) /* aggregate jump function */
{
int t = *p & 1;
f2 (&t);
}
Likewise, if *p is 0, (*p & 1) is also 0, and &t is an aggregate simple
pass-through of p.
2020-02-10 Feng Xue <fxue@os.amperecomputing.com>
PR ipa/93203
* ipa-cp.c (ipcp_lattice::add_value): Add source with same call edge
but different source value.
(adjust_callers_for_value_intersection): New function.
(gather_edges_for_value): Adjust order of callers to let a
non-self-recursive caller be the first element.
(self_recursive_pass_through_p): Add a new parameter "simple", and
check generalized self-recursive pass-through jump function.
(self_recursive_agg_pass_through_p): Likewise.
(find_more_scalar_values_for_callers_subset): Compute value from
pass-through jump function for self-recursive.
(intersect_with_plats): Cleanup previous implementation code for value
itersection with self-recursive call edge.
(intersect_with_agg_replacements): Likewise.
(intersect_aggregates_with_edge): Deduce value from pass-through jump
function for self-recursive call edge. Cleanup previous implementation
code for value intersection with self-recursive call edge.
(decide_whether_version_node): Remove dead callers and adjust order
to let a non-self-recursive caller be the first element.
PR ipa/93203
* g++.dg/ipa/pr93203.C: New test.
|
|
2020-01-25 Feng Xue <fxue@os.amperecomputing.com>
PR ipa/93166
* ipa-cp.c (get_info_about_necessary_edges): Remove value
check assertion.
PR ipa/93166
* g++.dg/pr93166.C: New test.
|
|
2020-01-13 Martin Jambor <mjambor@suse.cz>
PR ipa/93223
* ipa-cp.c (devirtualization_time_bonus): Check whether isummary is
NULL.
testsuite/
* g++.dg/ipa/pr93223.C: New test.
|
|
* ipa-cp.c (get_max_overall_size): Use newly
renamed param param_ipa_cp_unit_growth.
* params.opt: Remove legacy param name.
|
|
2020-01-10 Martin Jambor <mjambor@suse.cz>
* params.opt (param_ipcp_unit_growth): Mark as Optimization.
* ipa-cp.c (max_new_size): Removed.
(orig_overall_size): New variable.
(get_max_overall_size): New function.
(estimate_local_effects): Use it. Adjust dump.
(decide_about_value): Likewise.
(ipcp_propagate_stage): Do not calculate max_new_size, just store
orig_overall_size. Adjust dump.
(ipa_cp_c_finalize): Clear orig_overall_size instead of max_new_size.
From-SVN: r280099
|
|
2020-01-10 Martin Jambor <mjambor@suse.cz>
* params.opt (param_ipa_max_agg_items): Mark as Optimization
* ipa-cp.c (merge_agg_lats_step): New parameter max_agg_items, use
instead of param_ipa_max_agg_items.
(merge_aggregate_lattices): Extract param_ipa_max_agg_items from
optimization info for the callee.
From-SVN: r280098
|
|
2020-01-09 Martin Liska <mliska@suse.cz>
* auto-profile.c (auto_profile): Use opt_for_fn
for a parameter.
* ipa-cp.c (ipcp_lattice::add_value): Likewise.
(propagate_vals_across_arith_jfunc): Likewise.
(hint_time_bonus): Likewise.
(incorporate_penalties): Likewise.
(good_cloning_opportunity_p): Likewise.
(perform_estimation_of_a_value): Likewise.
(estimate_local_effects): Likewise.
(ipcp_propagate_stage): Likewise.
* ipa-fnsummary.c (decompose_param_expr): Likewise.
(set_switch_stmt_execution_predicate): Likewise.
(analyze_function_body): Likewise.
* ipa-inline-analysis.c (offline_size): Likewise.
* ipa-inline.c (early_inliner): Likewise.
* ipa-prop.c (ipa_analyze_node): Likewise.
(ipcp_transform_function): Likewise.
* ipa-sra.c (process_scan_results): Likewise.
(ipa_sra_summarize_function): Likewise.
* params.opt: Rename ipcp-unit-growth to
ipa-cp-unit-growth. Add Optimization for various
IPA-related parameters.
From-SVN: r280040
|
|
2020-01-08 Martin Liska <mliska@suse.cz>
* cgraph.c (cgraph_node::dump): Use ::dump_name or
::dump_asm_name instead of (::name or ::asm_name).
* cgraphclones.c (symbol_table::materialize_all_clones): Likewise.
* cgraphunit.c (walk_polymorphic_call_targets): Likewise.
(analyze_functions): Likewise.
(expand_all_functions): Likewise.
* ipa-cp.c (ipcp_cloning_candidate_p): Likewise.
(propagate_bits_across_jump_function): Likewise.
(dump_profile_updates): Likewise.
(ipcp_store_bits_results): Likewise.
(ipcp_store_vr_results): Likewise.
* ipa-devirt.c (dump_targets): Likewise.
* ipa-fnsummary.c (analyze_function_body): Likewise.
* ipa-hsa.c (check_warn_node_versionable): Likewise.
(process_hsa_functions): Likewise.
* ipa-icf.c (sem_item_optimizer::merge_classes): Likewise.
(set_alias_uids): Likewise.
* ipa-inline-transform.c (save_inline_function_body): Likewise.
* ipa-inline.c (recursive_inlining): Likewise.
(inline_to_all_callers_1): Likewise.
(ipa_inline): Likewise.
* ipa-profile.c (ipa_propagate_frequency_1): Likewise.
(ipa_propagate_frequency): Likewise.
* ipa-prop.c (ipa_make_edge_direct_to_target): Likewise.
(remove_described_reference): Likewise.
* ipa-pure-const.c (worse_state): Likewise.
(check_retval_uses): Likewise.
(analyze_function): Likewise.
(propagate_pure_const): Likewise.
(propagate_nothrow): Likewise.
(dump_malloc_lattice): Likewise.
(propagate_malloc): Likewise.
(pass_local_pure_const::execute): Likewise.
* ipa-visibility.c (optimize_weakref): Likewise.
(function_and_variable_visibility): Likewise.
* ipa.c (symbol_table::remove_unreachable_nodes): Likewise.
(ipa_discover_variable_flags): Likewise.
* lto-streamer-out.c (output_function): Likewise.
(output_constructor): Likewise.
* tree-inline.c (copy_bb): Likewise.
* tree-ssa-structalias.c (ipa_pta_execute): Likewise.
* varpool.c (symbol_table::remove_unreferenced_decls): Likewise.
2020-01-08 Martin Liska <mliska@suse.cz>
* lto-partition.c (add_symbol_to_partition_1): Use ::dump_name or
::dump_asm_name instead of (::name or ::asm_name).
(lto_balanced_map): Likewise.
(promote_symbol): Likewise.
(rename_statics): Likewise.
* lto.c (lto_wpa_write_files): Likewise.
2020-01-08 Martin Liska <mliska@suse.cz>
* gcc.dg/ipa/ipa-icf-1.c: Update expected scanned output.
* gcc.dg/ipa/ipa-icf-10.c: Likewise.
* gcc.dg/ipa/ipa-icf-11.c: Likewise.
* gcc.dg/ipa/ipa-icf-12.c: Likewise.
* gcc.dg/ipa/ipa-icf-13.c: Likewise.
* gcc.dg/ipa/ipa-icf-16.c: Likewise.
* gcc.dg/ipa/ipa-icf-18.c: Likewise.
* gcc.dg/ipa/ipa-icf-2.c: Likewise.
* gcc.dg/ipa/ipa-icf-20.c: Likewise.
* gcc.dg/ipa/ipa-icf-21.c: Likewise.
* gcc.dg/ipa/ipa-icf-23.c: Likewise.
* gcc.dg/ipa/ipa-icf-25.c: Likewise.
* gcc.dg/ipa/ipa-icf-26.c: Likewise.
* gcc.dg/ipa/ipa-icf-27.c: Likewise.
* gcc.dg/ipa/ipa-icf-3.c: Likewise.
* gcc.dg/ipa/ipa-icf-35.c: Likewise.
* gcc.dg/ipa/ipa-icf-36.c: Likewise.
* gcc.dg/ipa/ipa-icf-37.c: Likewise.
* gcc.dg/ipa/ipa-icf-38.c: Likewise.
* gcc.dg/ipa/ipa-icf-5.c: Likewise.
* gcc.dg/ipa/ipa-icf-7.c: Likewise.
* gcc.dg/ipa/ipa-icf-8.c: Likewise.
* gcc.dg/ipa/ipa-icf-merge-1.c: Likewise.
* gcc.dg/ipa/pr64307.c: Likewise.
* gcc.dg/ipa/pr90555.c: Likewise.
* gcc.dg/ipa/propmalloc-1.c: Likewise.
* gcc.dg/ipa/propmalloc-2.c: Likewise.
* gcc.dg/ipa/propmalloc-3.c: Likewise.
From-SVN: r280009
|
|
2020-01-08 Feng Xue <fxue@os.amperecomputing.com>
PR ipa/93084
* ipa-cp.c (self_recursively_generated_p): Find matched aggregate
lattice for a value to check.
(propagate_vals_across_arith_jfunc): Add an assertion to ensure
finite propagation in self-recursive scc.
2020-01-08 Feng Xue <fxue@os.amperecomputing.com>
PR ipa/93084
* gcc.dg/ipa/ipa-clone-3.c: New test.
From-SVN: r279987
|
|
2020-01-03 Martin Jambor <mjambor@suse.cz>
PR ipa/92917
* ipa-cp.c (print_all_lattices): Skip functions without info.
From-SVN: r279859
|
|
From-SVN: r279813
|
|
2019-12-21 Martin Jambor <mjambor@suse.cz>
PR ipa/93015
* ipa-cp.c (ipcp_store_vr_results): Check that info exists
testsuite/
* gcc.dg/lto/pr93015_0.c: New test.
From-SVN: r279695
|
|
2019-12-19 Feng Xue <fxue@os.amperecomputing.com>
PR ipa/92794
* ipa-cp.c (self_recursive_agg_pass_through_p): New function.
(intersect_with_plats): Use error_mark_node as place holder
when aggregate jump function is simple pass-through for
self-recursive call.
(intersect_with_agg_replacements): Likewise.
(intersect_aggregates_with_edge): Likewise.
(find_aggregate_values_for_callers_subset): Likewise.
2019-12-19 Feng Xue <fxue@os.amperecomputing.com>
PR ipa/92794
* gcc.dg/ipa/92794.c: New test.
From-SVN: r279561
|
|
2019-12-18 Martin Jambor <mjambor@suse.cz>
PR ipa/92971
* ipa-cp.c (cgraph_edge_brings_all_agg_vals_for_node): Fix
definition of values, release memory on exit.
testsuite/
* gcc.dg/ipa/ipcp-agg-12.c: New test.
From-SVN: r279525
|
|
PR ipa/92883
* ipa-cp.c (propagate_vr_across_jump_function): Pass jvr rather
than *jfunc->m_vr to intersect. Formatting fix.
* gcc.dg/ipa/pr92883.c: New test.
From-SVN: r279194
|
|
* cgraphclones.c (localize_profile): New function.
(cgraph_node::create_clone): Use it for partial profiles.
* common.opt (fprofile-partial-training): New flag.
* doc/invoke.texi (-fprofile-partial-training): Document.
* ipa-cp.c (update_profiling_info): For partial profiles do not
set function profile to zero.
* profile.c (compute_branch_probabilities): With partial profile
watch if edge count is zero and turn all probabilities to guessed.
(compute_branch_probabilities): For partial profiles do not apply
profile when entry count is zero.
* tree-profile.c (tree_profiling): Only do value_profile_transformations
when profile is read.
From-SVN: r279013
|
|
2019-12-02 Feng Xue <fxue@os.amperecomputing.com>
PR ipa/92133
* doc/invoke.texi (ipa-cp-max-recursive-depth): Document new option.
(ipa-cp-min-recursive-probability): Likewise.
* params.opt (ipa-cp-max-recursive-depth): New.
(ipa-cp-min-recursive-probability): Likewise.
* ipa-cp.c (ipcp_lattice<valtype>::add_value): Add two new parameters
val_p and unlimited.
(self_recursively_generated_p): New function.
(get_val_across_arith_op): Likewise.
(propagate_vals_across_arith_jfunc): Add constant propagation for
self-recursive function.
(incorporate_penalties): Do not penalize pure self-recursive function.
(good_cloning_opportunity_p): Dump node_is_self_scc flag.
(propagate_constants_topo): Set node_is_self_scc flag for cgraph node.
(get_info_about_necessary_edges): Relax hotness check for edge to
self-recursive function.
* ipa-prop.h (ipa_node_params): Add new field node_is_self_scc.
2019-12-02 Feng Xue <fxue@os.amperecomputing.com>
PR ipa/92133
* gcc.dg/ipa/ipa-clone-2.c: New test.
From-SVN: r278893
|
|
2019-11-29 Martin Jambor <mjambor@suse.cz>
PR ipa/92476
* ipa-cp.c (set_single_call_flag): Set node_calling_single_call in
the summary only if the summary exists.
(find_more_scalar_values_for_callers_subset): Check node_dead in
the summary only if the summary exists.
(ipcp_store_bits_results): Ignore nodes without lattices.
(ipcp_store_vr_results): Likewise.
* cgraphclones.c: Include ipa-fnsummary.h and ipa-prop.h and the
header files required by them.
(cgraph_node::expand_all_artificial_thunks): Analyze expanded thunks.
From-SVN: r278841
|
|
From-SVN: r278808
|
|
vectors to not be allocated.
* ipa-fnsummary.c (evaluate_conditions_for_known_args): Be
ready for some vectors to not be allocated.
(evaluate_properties_for_edge): Document better; make
known_vals and known_aggs caller allocated; avoid determining
values of parameters which are not used.
(ipa_merge_fn_summary_after_inlining): Pre allocate known_vals and
known_aggs.
* ipa-inline-analysis.c (do_estimate_edge_time): Likewise.
(do_estimate_edge_size): Likewise.
(do_estimate_edge_hints): Likewise.
* ipa-cp.c (ipa_get_indirect_edge_target_1): Do not early exit when
values are not known.
(ipa_release_agg_values): Add option to not release vector itself.
From-SVN: r278553
|
|
* ipa-cp.c (ipa_vr_operation_and_type_effects): Move up in file.
(ipa_value_range_from_jfunc): New function.
* ipa-fnsummary.c (evaluate_conditions_for_known_args): Add
known_value_ranges parameter; use it to evalulate conditions.
(evaluate_properties_for_edge): Compute known value ranges.
(ipa_fn_summary_t::duplicate): Update use of
evaluate_conditions_for_known_args.
(estimate_ipcp_clone_size_and_time): Likewise.
(ipa_merge_fn_summary_after_inlining): Likewise.
* ipa-prop.h (ipa_value_range_from_jfunc): Declare.
* gcc.dg/ipa/inline-9.c: New testcase.
From-SVN: r278220
|
|
2019-11-14 Martin Liska <mliska@suse.cz>
* ipa-cp.c (devirtualization_time_bonus): Use opt_for_fn
of a callee to get value of the param.
* ipa-inline.c (inline_insns_auto): Use proper
opt_for_fn.
* opts.c (maybe_default_option): Do not overwrite param
value if optimization level does not match. Note that
params usually have default value set via Init() keyword.
* params.opt: Remove -param=max-inline-insns-auto-O2.
* cif-code.def (MAX_INLINE_INSNS_AUTO_O2_LIMIT): Remove.
* doc/invoke.texi: Remove documentation of
max-inline-insns-auto-O2.
2019-11-14 Martin Liska <mliska@suse.cz>
* c-c++-common/asan/memcmp-1.c: Update expected backtrace.
From-SVN: r278218
|
|
2019-11-14 Feng Xue <fxue@os.amperecomputing.com>
PR ipa/91682
* ipa-prop.h (jump_func_type): New value IPA_JF_LOAD_AGG.
(ipa_load_agg_data, ipa_agg_value, ipa_agg_value_set): New structs.
(ipa_agg_jf_item): Add new field jftype and type, redefine field value.
(ipa_agg_jump_function): Remove member function equal_to.
(ipa_agg_jump_function_p): Remove typedef.
(ipa_copy_agg_values, ipa_release_agg_values): New functions.
* ipa-prop.c (ipa_print_node_jump_functions_for_edge): Dump
information for aggregate jump function.
(get_ssa_def_if_simple_copy): Add new parameter rhs_stmt to
record last definition statement.
(load_from_unmodified_param_or_agg): New function.
(ipa_known_agg_contents_list): Add new field type and value, remove
field constant.
(build_agg_jump_func_from_list): Rename parameter const_count to
value_count, build aggregate jump function from ipa_load_agg_data.
(analyze_agg_content_value): New function.
(extract_mem_content): Analyze memory store assignment to prepare
information for aggregate jump function generation.
(determine_known_aggregate_parts): Add new parameter fbi, remove
parameter aa_walk_budeget_p.
(update_jump_functions_after_inlining): Update aggregate jump function.
(ipa_find_agg_cst_for_param): Change type of parameter agg.
(try_make_edge_direct_simple_call): Add new parameter new_root.
(try_make_edge_direct_virtual_call): Add new parameter new_root and
new_root_info.
(update_indirect_edges_after_inlining): Pass new argument to
try_make_edge_direct_simple_call and try_make_edge_direct_virtual_call.
(ipa_write_jump_function): Write aggregate jump function to file.
(ipa_read_jump_function): Read aggregate jump function from file.
(ipa_agg_value::equal_to): Migrate from ipa_agg_jf_item::equal_to.
* ipa-cp.c (ipa_get_jf_arith_result): New function.
(ipa_agg_value_from_node): Likewise.
(ipa_agg_value_set_from_jfunc): Likewise.
(propagate_vals_across_arith_jfunc): Likewise.
(propagate_aggregate_lattice): Likewise.
(ipa_get_jf_pass_through_result): Call ipa_get_jf_arith_result.
(propagate_vals_across_pass_through): Call
propagate_vals_across_arith_jfunc.
(get_clone_agg_value): Move forward.
(propagate_aggs_across_jump_function): Handle value propagation for
aggregate jump function.
(agg_jmp_p_vec_for_t_vec): Remove.
(context_independent_aggregate_values): Replace vec<ipa_agg_jf_item>
with vec<ipa_agg_value>.
(copy_plats_to_inter, intersect_with_plats): Likewise.
(agg_replacements_to_vector, intersect_with_agg_replacements): Likewise.
(intersect_aggregate_with_edge): Likewise.
(find_aggregate_values_for_callers_subset): Likewise.
(cgraph_edge_brings_all_agg_vals_for_node): Likewise.
(estimate_local_effects): Replace vec<ipa_agg_jump_function> and
vec<ipa_agg_jump_function_p> with vec<ipa_agg_value_set>.
(gather_context_independent_values): Likewise.
(perform_estimation_of_a_value, decide_whether_version_node): Likewise.
* ipa-fnsummary.c (evaluate_conditions_for_known_args): Replace
vec<ipa_agg_jump_function_p> with vec<ipa_agg_value_set>.
(evaluate_properties_for_edge): Likewise.
(estimate_edge_devirt_benefit): Likewise.
(estimate_edge_size_and_time): Likewise.
(estimate_calls_size_and_time): Likewise.
(ipa_call_context::ipa_call_context): Likewise.
(estimate_ipcp_clone_size_and_time): Likewise.
* ipa-fnsummary.h (ipa_call_context): Replace
vec<ipa_agg_jump_function_p> with vec<ipa_agg_value_set>.
* ipa-inline-analysis.c (do_estimate_edge_time): Replace
vec<ipa_agg_jump_function_p> with vec<ipa_agg_value_set>.
(do_estimate_edge_size): Likewise.
(do_estimate_edge_hints): Likewise.
2019-11-14 Feng Xue <fxue@os.amperecomputing.com>
PR ipa/91682
* gcc.dg/ipa/ipcp-agg-10.c: Change dg-scan string.
* gcc.dg/ipa/ipcp-agg-11.c: New test.
From-SVN: r278193
|