Age | Commit message (Collapse) | Author | Files | Lines |
|
In spring I added code eliminating any statements using parameters
removed by IPA passes (to fix PR 93385). That patch fixed issues such
as divisions by zero that such code could perform but it only reset
all affected debug bind statements, this one updates them with
expressions which can allow the debugger to print the removed value -
see the added test-case for an example.
Even though I originally did not want to create DEBUG_EXPR_DECLs for
intermediate values, I ended up doing so, because otherwise the code
started creating statements like
# DEBUG __aD.198693 => &MEM[(const struct _Alloc_nodeD.171110 *)D#195]._M_tD.184726->_M_implD.171154
which not only is a bit scary but also gimple-fold ICEs on
it. Therefore I decided they are probably quite necessary.
The patch simply notes each removed SSA name present in a debug
statement and then works from it backwards, looking if it can
reconstruct the expression it represents (which can fail if a
non-degenerate PHI node is in the way). If it can, it populates two
hash maps with those expressions so that 1) removed assignments are
replaced with a debug bind defining a new intermediate debug_decl_expr
and 2) existing debug binds that refer to SSA names that are bing
removed now refer to corresponding debug_decl_exprs.
If a removed parameter is passed to another function, the debugging
information still cannot describe its value there - see the xfailed
test in the testcase. I sort of know what needs to be done but that
needs a little bit more of IPA infrastructure on top of this patch and
so I would like to get this patch reviewed first.
Bootstrapped and tested on x86_64-linux, i686-linux and (long time
ago) on aarch64-linux. Also LTO-bootstrapped and on x86_64-linux.
Perhaps it is good to go to trunk?
Thanks,
Martin
gcc/ChangeLog:
2021-03-29 Martin Jambor <mjambor@suse.cz>
PR ipa/93385
* ipa-param-manipulation.h (class ipa_param_body_adjustments): New
members remap_with_debug_expressions, m_dead_ssa_debug_equiv,
m_dead_stmt_debug_equiv and prepare_debug_expressions. Added
parameter to mark_dead_statements.
* ipa-param-manipulation.c: Include tree-phinodes.h and cfgexpand.h.
(ipa_param_body_adjustments::mark_dead_statements): New parameter
debugstack, push into it all SSA names used in debug statements,
produce m_dead_ssa_debug_equiv mapping for the removed param.
(replace_with_mapped_expr): New function.
(ipa_param_body_adjustments::remap_with_debug_expressions): Likewise.
(ipa_param_body_adjustments::prepare_debug_expressions): Likewise.
(ipa_param_body_adjustments::common_initialization): Gather and
procecc SSA which will be removed but are in debug statements. Simplify.
(ipa_param_body_adjustments::ipa_param_body_adjustments): Initialize
new members.
* tree-inline.c (remap_gimple_stmt): Create a debug bind when possible
when avoiding a copy of an unnecessary statement. Remap removed SSA
names in existing debug statements.
(tree_function_versioning): Do not create DEBUG_EXPR_DECL for removed
parameters if we have already done so.
gcc/testsuite/ChangeLog:
2021-03-29 Martin Jambor <mjambor@suse.cz>
PR ipa/93385
* gcc.dg/guality/ipa-sra-1.c: New test.
|
|
We have to be careful to not break the argument space calculation.
If there's not enough arguments just do not append any.
2021-10-15 Richard Biener <rguenther@suse.de>
PR ipa/102762
* tree-inline.c (copy_bb): Avoid underflowing nargs.
* gcc.dg/torture/pr102762.c: New testcase.
|
|
When inlining we have to avoid mapping a non-lvalue parameter
value into a context that prevents the parameter to be a register.
Formerly the register were TREE_ADDRESSABLE but now it can be
just DECL_NOT_GIMPLE_REG_P.
2021-09-30 Richard Biener <rguenther@suse.de>
PR middle-end/102518
* tree-inline.c (setup_one_parameter): Avoid substituting
an invariant into contexts where a GIMPLE register is not valid.
* gcc.dg/torture/pr102518.c: New testcase.
|
|
I've hit a bootstrap-debug error involving large subprograms in
gcc/ada/sem_ch12.adb. I'm afraid I couldn't narrow it down to a
reasonable testcase.
thread1 made different decisions about a block containing a
builtin_eh_filter call because in one compilation, estimate_num_insns
found a cgraph_node for the builtin and could thus get to the
is_simple_builtin test, but in the other it didn't. With different
insn counts, one stage jump-threaded and the other didn't, and the
resulting code diverged quite a bit.
The reason the builtin had a cgraph_node in one case but not the other
was that modref got a chance to analyze the builtin call when it was
the first stmt in the block, and that created the cgraph_node.
However, when it was preceded by debug stmts, the loop in
analyze_function was cut short after the first debug stmt, because the
summary so far was not useful.
This patch fixes both issues: skip debug stmts in the analyze_function
loop, so as to prevent them from affecting any decisions in the loop,
and enable the insn count estimator to get to the is_simple_builtin
test when a cgraph_node has not been created for the builtin.
for gcc/ChangeLog
* ipa-modref.c (analyze_function): Skip debug stmts.
* tree-inline.c (estimate_num_insn): Consider builtins even
without a cgraph_node.
|
|
We iterate over debug stmts from the last one in new_bb, and we insert
them before the first post-label stmt in each dest block, without
moving the insertion iterator, so they end up reversed. Moving the
insertion iterator fixes this.
for gcc/ChangeLog
* tree-inline.c (maybe_move_debug_stmts_to_successors): Don't
reverse debug stmts.
|
|
This patch implements the OpenMP 5.1 scope construct, which is similar
to worksharing constructs in many regards, but isn't one of them.
The body of the construct is encountered by all threads though, it can
be nested in itself or intermixed with taskgroup and worksharing etc.
constructs can appear inside of it (but it can't be nested in
worksharing etc. constructs). The main purpose of the construct
is to allow reductions (normal and task ones) without the need to
close the parallel and reopen another one.
If it doesn't have task reductions, it can be implemented without
any new library support, with nowait it just does the privatizations
at the start if any and reductions before the end of the body, with
without nowait emits a normal GOMP_barrier{,_cancel} at the end too.
For task reductions, we need to ensure only one thread initializes
the task reduction library data structures and other threads copy from that,
so a new GOMP_scope_start routine is added to the library for that.
It acts as if the start of the scope construct is a nowait worksharing
construct (that is ok, it can't be nested in other worksharing
constructs and all threads need to encounter the start in the same
order) which does the task reduction initialization, but as the body
can have other scope constructs and/or worksharing constructs, that is
all where we use this dummy worksharing construct. With task reductions,
the construct must not have nowait and ends with a GOMP_barrier{,_cancel},
followed by task reductions followed by GOMP_workshare_task_reduction_unregister.
Only C/C++ FE support is done.
2021-08-17 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree.def (OMP_SCOPE): New tree code.
* tree.h (OMP_SCOPE_BODY, OMP_SCOPE_CLAUSES): Define.
* tree-nested.c (convert_nonlocal_reference_stmt,
convert_local_reference_stmt, convert_gimple_call): Handle
GIMPLE_OMP_SCOPE.
* tree-pretty-print.c (dump_generic_node): Handle OMP_SCOPE.
* gimple.def (GIMPLE_OMP_SCOPE): New gimple code.
* gimple.c (gimple_build_omp_scope): New function.
(gimple_copy): Handle GIMPLE_OMP_SCOPE.
* gimple.h (gimple_build_omp_scope): Declare.
(gimple_has_substatements): Handle GIMPLE_OMP_SCOPE.
(gimple_omp_scope_clauses, gimple_omp_scope_clauses_ptr,
gimple_omp_scope_set_clauses): New inline functions.
(CASE_GIMPLE_OMP): Add GIMPLE_OMP_SCOPE.
* gimple-pretty-print.c (dump_gimple_omp_scope): New function.
(pp_gimple_stmt_1): Handle GIMPLE_OMP_SCOPE.
* gimple-walk.c (walk_gimple_stmt): Likewise.
* gimple-low.c (lower_stmt): Likewise.
* gimplify.c (is_gimple_stmt): Handle OMP_MASTER.
(gimplify_scan_omp_clauses): For task reductions, handle OMP_SCOPE
like ORT_WORKSHARE constructs. Adjust diagnostics for %<scope%>
allowing task reductions. Reject inscan reductions on scope.
(omp_find_stores_stmt): Handle GIMPLE_OMP_SCOPE.
(gimplify_omp_workshare, gimplify_expr): Handle OMP_SCOPE.
* tree-inline.c (remap_gimple_stmt): Handle GIMPLE_OMP_SCOPE.
(estimate_num_insns): Likewise.
* omp-low.c (build_outer_var_ref): Look through GIMPLE_OMP_SCOPE
contexts if var isn't privatized there.
(check_omp_nesting_restrictions): Handle GIMPLE_OMP_SCOPE.
(scan_omp_1_stmt): Likewise.
(maybe_add_implicit_barrier_cancel): Look through outer
scope constructs.
(lower_omp_scope): New function.
(lower_omp_task_reductions): Handle OMP_SCOPE.
(lower_omp_1): Handle GIMPLE_OMP_SCOPE.
(diagnose_sb_1, diagnose_sb_2): Likewise.
* omp-expand.c (expand_omp_single): Support also GIMPLE_OMP_SCOPE.
(expand_omp): Handle GIMPLE_OMP_SCOPE.
(omp_make_gimple_edges): Likewise.
* omp-builtins.def (BUILT_IN_GOMP_SCOPE_START): New built-in.
gcc/c-family/
* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_SCOPE.
* c-pragma.c (omp_pragmas): Add scope construct.
* c-omp.c (omp_directives): Uncomment scope directive entry.
gcc/c/
* c-parser.c (OMP_SCOPE_CLAUSE_MASK): Define.
(c_parser_omp_scope): New function.
(c_parser_omp_construct): Handle PRAGMA_OMP_SCOPE.
gcc/cp/
* parser.c (OMP_SCOPE_CLAUSE_MASK): Define.
(cp_parser_omp_scope): New function.
(cp_parser_omp_construct, cp_parser_pragma): Handle PRAGMA_OMP_SCOPE.
* pt.c (tsubst_expr): Handle OMP_SCOPE.
gcc/testsuite/
* c-c++-common/gomp/nesting-2.c (foo): Add scope and masked
construct tests.
* c-c++-common/gomp/scan-1.c (f3): Add scope construct test..
* c-c++-common/gomp/cancel-1.c (f2): Add scope and masked
construct tests.
* c-c++-common/gomp/reduction-task-2.c (bar): Add scope construct
test. Adjust diagnostics for the addition of scope.
* c-c++-common/gomp/loop-1.c (f5): Add master, masked and scope
construct tests.
* c-c++-common/gomp/clause-dups-1.c (f1): Add scope construct test.
* gcc.dg/gomp/nesting-1.c (f1, f2, f3): Add scope construct tests.
* c-c++-common/gomp/scope-1.c: New test.
* c-c++-common/gomp/scope-2.c: New test.
* g++.dg/gomp/attrs-1.C (bar): Add scope construct tests.
* g++.dg/gomp/attrs-2.C (bar): Likewise.
* gfortran.dg/gomp/reduction4.f90: Adjust expected diagnostics.
* gfortran.dg/gomp/reduction7.f90: Likewise.
libgomp/
* Makefile.am (libgomp_la_SOURCES): Add scope.c
* Makefile.in: Regenerated.
* libgomp_g.h (GOMP_scope_start): Declare.
* libgomp.map: Add GOMP_scope_start@@GOMP_5.1.
* scope.c: New file.
* testsuite/libgomp.c-c++-common/scope-1.c: New test.
* testsuite/libgomp.c-c++-common/task-reduction-16.c: New test.
|
|
This construct has been introduced as a replacement for master
construct, but unlike that construct is slightly more general,
has an optional clause which allows to choose which thread
will be the one running the region, it can be some other thread
than the master (primary) thread with number 0, or it could be no
threads or multiple threads (then of course one needs to be careful
about data races).
It is way too early to deprecate the master construct though, we don't
even have OpenMP 5.0 fully implemented, it has been deprecated in 5.1,
will be also in 5.2 and removed in 6.0. But even then it will likely
be a good idea to just -Wdeprecated warn about it and still accept it.
The patch also contains something I should have done much earlier,
for clauses that accept some integral expression where we only care
about the value, forces during gimplification that value into
either a min invariant (as before), SSA_NAME or a fresh temporary,
but never e.g. a user VAR_DECL, so that for those clauses we don't
need to worry about adjusting it.
2021-08-12 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree.def (OMP_MASKED): New tree code.
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_FILTER.
* tree.h (OMP_MASKED_BODY, OMP_MASKED_CLAUSES, OMP_MASKED_COMBINED,
OMP_CLAUSE_FILTER_EXPR): Define.
* tree.c (omp_clause_num_ops): Add OMP_CLAUSE_FILTER entry.
(omp_clause_code_name): Likewise.
(walk_tree_1): Handle OMP_CLAUSE_FILTER.
* tree-nested.c (convert_nonlocal_omp_clauses,
convert_local_omp_clauses): Handle OMP_CLAUSE_FILTER.
(convert_nonlocal_reference_stmt, convert_local_reference_stmt,
convert_gimple_call): Handle GIMPLE_OMP_MASTER.
* tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE_FILTER.
(dump_generic_node): Handle OMP_MASTER.
* gimple.def (GIMPLE_OMP_MASKED): New gimple code.
* gimple.c (gimple_build_omp_masked): New function.
(gimple_copy): Handle GIMPLE_OMP_MASKED.
* gimple.h (gimple_build_omp_masked): Declare.
(gimple_has_substatements): Handle GIMPLE_OMP_MASKED.
(gimple_omp_masked_clauses, gimple_omp_masked_clauses_ptr,
gimple_omp_masked_set_clauses): New inline functions.
(CASE_GIMPLE_OMP): Add GIMPLE_OMP_MASKED.
* gimple-pretty-print.c (dump_gimple_omp_masked): New function.
(pp_gimple_stmt_1): Handle GIMPLE_OMP_MASKED.
* gimple-walk.c (walk_gimple_stmt): Likewise.
* gimple-low.c (lower_stmt): Likewise.
* gimplify.c (is_gimple_stmt): Handle OMP_MASTER.
(gimplify_scan_omp_clauses): Handle OMP_CLAUSE_FILTER. For clauses
that take one expression rather than decl or constant, force
gimplification of that into a SSA_NAME or temporary unless min
invariant.
(gimplify_adjust_omp_clauses): Handle OMP_CLAUSE_FILTER.
(gimplify_expr): Handle OMP_MASKED.
* tree-inline.c (remap_gimple_stmt): Handle GIMPLE_OMP_MASKED.
(estimate_num_insns): Likewise.
* omp-low.c (scan_sharing_clauses): Handle OMP_CLAUSE_FILTER.
(check_omp_nesting_restrictions): Handle GIMPLE_OMP_MASKED. Adjust
diagnostics for existence of masked construct.
(scan_omp_1_stmt, lower_omp_master, lower_omp_1, diagnose_sb_1,
diagnose_sb_2): Handle GIMPLE_OMP_MASKED.
* omp-expand.c (expand_omp_synch, expand_omp, omp_make_gimple_edges):
Likewise.
gcc/c-family/
* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_MASKED.
(enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_FILTER.
* c-pragma.c (omp_pragmas_simd): Add masked construct.
* c-common.h (enum c_omp_clause_split): Add C_OMP_CLAUSE_SPLIT_MASKED
enumerator.
(c_finish_omp_masked): Declare.
* c-omp.c (c_finish_omp_masked): New function.
(c_omp_split_clauses): Handle combined masked constructs.
gcc/c/
* c-parser.c (c_parser_omp_clause_name): Parse filter clause name.
(c_parser_omp_clause_filter): New function.
(c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_FILTER.
(OMP_MASKED_CLAUSE_MASK): Define.
(c_parser_omp_masked): New function.
(c_parser_omp_parallel): Handle parallel masked.
(c_parser_omp_construct): Handle PRAGMA_OMP_MASKED.
* c-typeck.c (c_finish_omp_clauses): Handle OMP_CLAUSE_FILTER.
gcc/cp/
* parser.c (cp_parser_omp_clause_name): Parse filter clause name.
(cp_parser_omp_clause_filter): New function.
(cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_FILTER.
(OMP_MASKED_CLAUSE_MASK): Define.
(cp_parser_omp_masked): New function.
(cp_parser_omp_parallel): Handle parallel masked.
(cp_parser_omp_construct, cp_parser_pragma): Handle PRAGMA_OMP_MASKED.
* semantics.c (finish_omp_clauses): Handle OMP_CLAUSE_FILTER.
* pt.c (tsubst_omp_clauses): Likewise.
(tsubst_expr): Handle OMP_MASKED.
gcc/testsuite/
* c-c++-common/gomp/clauses-1.c (bar): Add tests for combined masked
constructs with clauses.
* c-c++-common/gomp/clauses-5.c (foo): Add testcase for filter clause.
* c-c++-common/gomp/clause-dups-1.c (f1): Likewise.
* c-c++-common/gomp/masked-1.c: New test.
* c-c++-common/gomp/masked-2.c: New test.
* c-c++-common/gomp/masked-combined-1.c: New test.
* c-c++-common/gomp/masked-combined-2.c: New test.
* c-c++-common/goacc/uninit-if-clause.c: Remove xfails.
* g++.dg/gomp/block-11.C: New test.
* g++.dg/gomp/tpl-masked-1.C: New test.
* g++.dg/gomp/attrs-1.C (bar): Add tests for masked construct and
combined masked constructs with clauses in attribute syntax.
* g++.dg/gomp/attrs-2.C (bar): Likewise.
* gcc.dg/gomp/nesting-1.c (f1, f2): Add tests for masked construct
nesting.
* gfortran.dg/goacc/host_data-tree.f95: Allow also SSA_NAMEs in if
clause.
* gfortran.dg/goacc/kernels-tree.f95: Likewise.
libgomp/
* testsuite/libgomp.c-c++-common/masked-1.c: New test.
|
|
This gets rid of a few gimple_expr_type uses.
2021-07-16 Richard Biener <rguenther@suse.de>
* gimple-fold.c (gimple_fold_stmt_to_constant_1): Use
the type of the LHS.
(gimple_assign_nonnegative_warnv_p): Likewise.
(gimple_call_nonnegative_warnv_p): Likewise. Return false
if the call has no LHS.
* gimple.c (gimple_could_trap_p_1): Use the type of the LHS.
* tree-eh.c (stmt_could_throw_1_p): Likewise.
* tree-inline.c (insert_init_stmt): Likewise.
* tree-ssa-loop-niter.c (get_val_for): Likewise.
* tree-outof-ssa.c (ssa_is_replaceable_p): Use the type of
the def.
* tree-ssa-sccvn.c (init_vn_nary_op_from_stmt): Take a
gassign *. Use the type of the lhs.
(vn_nary_op_lookup_stmt): Adjust.
(vn_nary_op_insert_stmt): Likewise.
|
|
I was asked by Richi to split my fix for PR 93385 for easier review
into IPA-SRA materialization refactoring and the actual DCE addition.
This is the second part that actually contains the DCE of statements
that IPA-SRA should not leave behind because they can have problematic
side effects, even if they are useless, so that we do not depend on
tree-dce to remove them for correctness.
The patch fixes the problem by doing a def-use walk when materializing
clones, marking which statements should not be copied and which
SSA_NAMEs do not need to be computed because eventually they would be
DCEd. We do this on the original function body and tree-inline simply
does not copy statements which are "dead."
The only complication is removing dead argument calls because that
needs to be communicated to callee redirection code using the
infrastructure introduced by the previous patch.
I added all testcases of the original patch to this one, although some
probably test behavior introduced in the previous patch.
gcc/ChangeLog:
2021-05-12 Martin Jambor <mjambor@suse.cz>
PR ipa/93385
* ipa-param-manipulation.h (class ipa_param_body_adjustments): New
members m_dead_stmts and m_dead_ssas.
* ipa-param-manipulation.c
(ipa_param_body_adjustments::mark_dead_statements): New function.
(ipa_param_body_adjustments::common_initialization): Call it on
all removed but not split parameters.
(ipa_param_body_adjustments::ipa_param_body_adjustments): Initialize
new mwmbers.
(ipa_param_body_adjustments::modify_call_stmt): Remove arguments that
are dead.
* tree-inline.c (remap_gimple_stmt): Do not copy dead statements, reset
dead debug statements.
(copy_phis_for_bb): Do not copy dead PHI nodes.
gcc/testsuite/ChangeLog:
2021-03-22 Martin Jambor <mjambor@suse.cz>
PR ipa/93385
* gcc.dg/ipa/pr93385.c: New test.
* gcc.dg/ipa/ipa-sra-23.c: Likewise.
* gcc.dg/ipa/ipa-sra-24.c: Likewise.
* g++.dg/ipa/ipa-sra-4.C: Likewise.
|
|
I was asked by Richi to split my fix for PR 93385 for easier review
into IPA-SRA materialization refactoring and the actual DCE addition.
Fortunately it was mostly natural except for a temporary weird
condition in ipa_param_body_adjustments::modify_call_stmt.
Additionally. In addition to the patch I posted previously, this one
also deallocated the newly added summary in toplev::finalize and fixes
a mistakenly uninitialized field.
This is the first part which basically replaces performed_splits in
clone_info and the code which generates it, keeps it up-to-date and
consumes it with new edge summaries which are much nicer. It simply
contains 1) a mapping from the original argument indices to the actual
indices in the call statement as it is now, 2) information needed to
identify arguments representing pass-through IPA-SRA splits with which
have been added to the call arguments in place of an original
argument/reference and 3) a delta to the index where va_args may start
- so basically directly all the information that the consumer of
performed_splits had to compute and we also do not need the weird
dummy declarations.
The main disadvantage is that the information has to be created (and
kept up-to-date) for all call graph edges associated with the given
statement from all clones (including inline clones) of the clone where
splitting or removal happened first. But all of this happens during
clone materialization so the only effect on WPA memory consumption is
the removal of a pointer from clone_info.
The statement modification code also has to know the statement from
the original function in order to be able to locate the edge summaries
which at this point are still keyed to these. However, the code is
already quite heavily dependant on how things are structured in
tree-inline.c and in order to fix bugs like these it probably has to
be.
The subsequent patch needs this new information to be able to remove
arguments from calls during materialization and communicate this
information to the call redirection.
gcc/ChangeLog:
2021-05-17 Martin Jambor <mjambor@suse.cz>
PR ipa/93385
* symtab-clones.h (clone_info): Removed member param_adjustments.
* ipa-param-manipulation.h: Adjust initial comment to reflect how we
deal with pass-through splits now.
(ipa_param_performed_split): Removed.
(ipa_param_adjustments::modify_call): Adjusted parameters.
(class ipa_param_body_adjustments): Adjusted parameters of
register_replacement, modify_gimple_stmt and modify_call_stmt.
(ipa_verify_edge_has_no_modifications): Declare.
(ipa_edge_modifications_finalize): Declare.
* cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Remove
performed_splits processing, pas only edge to padjs->modify_call,
check that call arguments were not modified if they should not have
been.
* cgraphclones.c (cgraph_node::create_clone): Do not copy performed
splits.
* ipa-param-manipulation.c (struct pass_through_split_map): New type.
(ipa_edge_modification_info): Likewise.
(ipa_edge_modification_sum): Likewise.
(ipa_edge_modifications): New edge summary.
(ipa_verify_edge_has_no_modifications): New function.
(transitive_split_p): Removed.
(transitive_split_map): Likewise.
(init_transitive_splits): Likewise.
(ipa_param_adjustments::modify_call): Adjusted to use the new edge
summary instead of performed_splits.
(ipa_param_body_adjustments::register_replacement): Drop dummy
parameter, set base_index of the created ipa_param_body_replacement.
(phi_arg_will_live_p): New function.
(ipa_param_body_adjustments::common_initialization): Do not create
IPA_SRA dummy decls.
(simple_tree_swap_info): Removed.
(remap_split_decl_to_dummy): Likewise.
(record_argument_state_1): New function.
(record_argument_state): Likewise.
(ipa_param_body_adjustments::modify_call_stmt): New parameter
orig_stmt. Do not work with dummy decls, save necessary info about
changes to ipa_edge_modifications.
(ipa_param_body_adjustments::modify_gimple_stmt): New parameter
orig_stmt, pass it to modify_call_stmt.
(ipa_param_body_adjustments::modify_cfun_body): Adjust call to
modify_gimple_stmt.
(ipa_edge_modifications_finalize): New function.
* tree-inline.c (remap_gimple_stmt): Pass original statement to
modify_gimple_stmt.
(copy_phis_for_bb): Do not copy dead PHI nodes.
(expand_call_inline): Do not remap performed_splits.
(update_clone_info): Likewise.
* toplev.c: Include ipa-param-manipulation.h.
(toplev::finalize): Call ipa_edge_modifications_finalize.
|
|
gcc/ChangeLog:
* builtins.c (warn_string_no_nul): Replace uses of TREE_NO_WARNING,
gimple_no_warning_p and gimple_set_no_warning with
warning_suppressed_p, and suppress_warning.
(c_strlen): Same.
(maybe_warn_for_bound): Same.
(warn_for_access): Same.
(check_access): Same.
(expand_builtin_strncmp): Same.
(fold_builtin_varargs): Same.
* calls.c (maybe_warn_nonstring_arg): Same.
(maybe_warn_rdwr_sizes): Same.
* cfgexpand.c (expand_call_stmt): Same.
* cgraphunit.c (check_global_declaration): Same.
* fold-const.c (fold_undefer_overflow_warnings): Same.
(fold_truth_not_expr): Same.
(fold_unary_loc): Same.
(fold_checksum_tree): Same.
* gimple-array-bounds.cc (array_bounds_checker::check_array_ref): Same.
(array_bounds_checker::check_mem_ref): Same.
(array_bounds_checker::check_addr_expr): Same.
(array_bounds_checker::check_array_bounds): Same.
* gimple-expr.c (copy_var_decl): Same.
* gimple-fold.c (gimple_fold_builtin_strcpy): Same.
(gimple_fold_builtin_strncat): Same.
(gimple_fold_builtin_stxcpy_chk): Same.
(gimple_fold_builtin_stpcpy): Same.
(gimple_fold_builtin_sprintf): Same.
(fold_stmt_1): Same.
* gimple-ssa-isolate-paths.c (diag_returned_locals): Same.
* gimple-ssa-nonnull-compare.c (do_warn_nonnull_compare): Same.
* gimple-ssa-sprintf.c (handle_printf_call): Same.
* gimple-ssa-store-merging.c (imm_store_chain_info::output_merged_store): Same.
* gimple-ssa-warn-restrict.c (maybe_diag_overlap): Same.
* gimple-ssa-warn-restrict.h: Adjust declarations.
(maybe_diag_access_bounds): Replace uses of TREE_NO_WARNING,
gimple_no_warning_p and gimple_set_no_warning with
warning_suppressed_p, and suppress_warning.
(check_call): Same.
(check_bounds_or_overlap): Same.
* gimple.c (gimple_build_call_from_tree): Same.
* gimplify.c (gimplify_return_expr): Same.
(gimplify_cond_expr): Same.
(gimplify_modify_expr_complex_part): Same.
(gimplify_modify_expr): Same.
(gimple_push_cleanup): Same.
(gimplify_expr): Same.
* omp-expand.c (expand_omp_for_generic): Same.
(expand_omp_taskloop_for_outer): Same.
* omp-low.c (lower_rec_input_clauses): Same.
(lower_lastprivate_clauses): Same.
(lower_send_clauses): Same.
(lower_omp_target): Same.
* tree-cfg.c (pass_warn_function_return::execute): Same.
* tree-complex.c (create_one_component_var): Same.
* tree-inline.c (remap_gimple_op_r): Same.
(copy_tree_body_r): Same.
(declare_return_variable): Same.
(expand_call_inline): Same.
* tree-nested.c (lookup_field_for_decl): Same.
* tree-sra.c (create_access_replacement): Same.
(generate_subtree_copies): Same.
* tree-ssa-ccp.c (pass_post_ipa_warn::execute): Same.
* tree-ssa-forwprop.c (combine_cond_expr_cond): Same.
* tree-ssa-loop-ch.c (ch_base::copy_headers): Same.
* tree-ssa-loop-im.c (execute_sm): Same.
* tree-ssa-phiopt.c (cond_store_replacement): Same.
* tree-ssa-strlen.c (maybe_warn_overflow): Same.
(handle_builtin_strcpy): Same.
(maybe_diag_stxncpy_trunc): Same.
(handle_builtin_stxncpy_strncat): Same.
(handle_builtin_strcat): Same.
* tree-ssa-uninit.c (get_no_uninit_warning): Same.
(set_no_uninit_warning): Same.
(uninit_undefined_value_p): Same.
(warn_uninit): Same.
(maybe_warn_operand): Same.
* tree-vrp.c (compare_values_warnv): Same.
* vr-values.c (vr_values::extract_range_for_var_from_comparison_expr): Same.
(test_for_singularity): Same.
* gimple.h (warning_suppressed_p): New function.
(suppress_warning): Same.
(copy_no_warning): Same.
(gimple_set_block): Call gimple_set_location.
(gimple_set_location): Call copy_warning.
|
|
tree-inline leaves behind VAR_DECLs which are TREE_READONLY (because
they are copies of const parameters) but are written to because they
need to be initialized. This patch resets the flag unconditionally so
that this does not happen.
There are other sources of variables which are incorrectly marked as
TREE_READOLY, but with this patch and a verifier catching them I can
at least compile the Ada run-time library.
gcc/ChangeLog:
2021-06-22 Richard Biener <rguenther@suse.de>
Martin Jambor <mjambor@suse.cz>
* tree-inline.c (setup_one_parameter): Set TREE_READONLY of the
param replacement unconditionally. Adjust comment.
|
|
This changes users of FOR_EACH_VEC_ELT to use range based for loops,
where the index variables are otherwise unused. As such the index
variables are all deleted, producing shorter and simpler code.
Signed-off-by: Trevor Saunders <tbsaunde@tbsaunde.org>
gcc/analyzer/ChangeLog:
* call-string.cc (call_string::call_string): Use range based for
to iterate over vec<>.
(call_string::to_json): Likewise.
(call_string::hash): Likewise.
(call_string::calc_recursion_depth): Likewise.
* checker-path.cc (checker_path::fixup_locations): Likewise.
* constraint-manager.cc (equiv_class::equiv_class): Likewise.
(equiv_class::to_json): Likewise.
(equiv_class::hash): Likewise.
(constraint_manager::to_json): Likewise.
* engine.cc (impl_region_model_context::on_svalue_leak):
Likewise.
(on_liveness_change): Likewise.
(impl_region_model_context::on_unknown_change): Likewise.
* program-state.cc (sm_state_map::set_state): Likewise.
* region-model.cc (test_canonicalization_4): Likewise.
gcc/ChangeLog:
* attribs.c (find_attribute_namespace): Iterate over vec<> with
range based for.
* auto-profile.c (afdo_find_equiv_class): Likewise.
* gcc.c (do_specs_vec): Likewise.
(do_spec_1): Likewise.
(driver::set_up_specs): Likewise.
* gimple-loop-jam.c (any_access_function_variant_p): Likewise.
* gimple-ssa-store-merging.c (compatible_load_p): Likewise.
(imm_store_chain_info::try_coalesce_bswap): Likewise.
(imm_store_chain_info::coalesce_immediate_stores): Likewise.
(get_location_for_stmts): Likewise.
* graphite-poly.c (print_iteration_domains): Likewise.
(free_poly_bb): Likewise.
(remove_gbbs_in_scop): Likewise.
(free_scop): Likewise.
(dump_gbb_cases): Likewise.
(dump_gbb_conditions): Likewise.
(print_pdrs): Likewise.
(print_scop): Likewise.
* ifcvt.c (cond_move_process_if_block): Likewise.
* lower-subreg.c (decompose_multiword_subregs): Likewise.
* regcprop.c (pass_cprop_hardreg::execute): Likewise.
* sanopt.c (sanitize_rewrite_addressable_params): Likewise.
* sel-sched-dump.c (dump_insn_vector): Likewise.
* store-motion.c (store_ops_ok): Likewise.
(store_killed_in_insn): Likewise.
* timevar.c (timer::named_items::print): Likewise.
* tree-cfgcleanup.c (cleanup_control_flow_pre): Likewise.
(cleanup_tree_cfg_noloop): Likewise.
* tree-data-ref.c (dump_data_references): Likewise.
(print_dir_vectors): Likewise.
(print_dist_vectors): Likewise.
(dump_data_dependence_relations): Likewise.
(dump_dist_dir_vectors): Likewise.
(dump_ddrs): Likewise.
(create_runtime_alias_checks): Likewise.
(free_subscripts): Likewise.
(save_dist_v): Likewise.
(save_dir_v): Likewise.
(invariant_access_functions): Likewise.
(same_access_functions): Likewise.
(access_functions_are_affine_or_constant_p): Likewise.
(find_data_references_in_stmt): Likewise.
(graphite_find_data_references_in_stmt): Likewise.
(free_dependence_relations): Likewise.
(free_data_refs): Likewise.
* tree-inline.c (copy_debug_stmts): Likewise.
* tree-into-ssa.c (dump_currdefs): Likewise.
(rewrite_update_phi_arguments): Likewise.
* tree-ssa-propagate.c (clean_up_loop_closed_phi): Likewise.
* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
Likewise.
(vect_slp_analyze_node_dependences): Likewise.
(vect_slp_analyze_instance_dependence): Likewise.
(vect_record_base_alignments): Likewise.
(vect_get_peeling_costs_all_drs): Likewise.
(vect_peeling_supportable): Likewise.
* tree-vectorizer.c (vec_info::~vec_info): Likewise.
(vec_info::free_stmt_vec_infos): Likewise.
gcc/cp/ChangeLog:
* constexpr.c (cxx_eval_call_expression): Iterate over vec<>
with range based for.
(cxx_eval_store_expression): Likewise.
(cxx_eval_loop_expr): Likewise.
* decl.c (wrapup_namespace_globals): Likewise.
(cp_finish_decl): Likewise.
(cxx_simulate_enum_decl): Likewise.
* parser.c (cp_parser_postfix_expression): Likewise.
|
|
The depend(source) clause has NULL OMP_CLAUSE_DECL, it has just the
depend kind specified and no arguments. So copy_tree_body_r shouldn't
check TREE_CODE on it without checking it is non-NULL.
2021-06-08 Jakub Jelinek <jakub@redhat.com>
PR c++/100957
* tree-inline.c (copy_tree_body_r): For OMP_CLAUSE_DEPEND don't
check TREE_CODE if OMP_CLAUSE_DECL is NULL.
* g++.dg/gomp/doacross-2.C: New test.
|
|
The following testcase ICEs, because gimple_call_arg_ptr (..., 0)
asserts that there is at least one argument, while we were using
it even if we didn't copy anything just to get a pointer from/to which
the zero arguments should be copied.
Fixed by guarding the memcpy calls. Also, the code was calling
gimple_call_num_args too many times - 5 times instead of 2, so the patch
adds two temporaries for those.
2021-06-07 Jakub Jelinek <jakub@redhat.com>
PR middle-end/100898
* tree-inline.c (copy_bb): Only use gimple_call_arg_ptr if memcpy
should copy any arguments. Don't call gimple_call_num_args
on id->call_stmt or call_stmt more than once.
* g++.dg/ext/va-arg-pack-3.C: New test.
|
|
The return part has a major performance impact in Ada where variable-sized
types are first-class citizens, but it turns out that it is not exercized
in the testsuite yet, so back it out for now.
gcc/
PR ipa/99122
* tree-inline.c (inline_forbidden_p): Remove test on return type.
gcc/testsuite/
* gnat.dg/inline22.adb: New test.
|
|
The depend-iterator-3.C testcases shows various bugs.
1) tsubst_omp_clauses didn't handle OMP_CLAUSE_AFFINITY (should be
handled like OMP_CLAUSE_DEPEND)
2) because locators can be arbitrary lvalue expressions, we need
to allow for C++ array section base (especially when array section
is just an array reference) FIELD_DECLs, handle them as this->member,
but don't need to privatize in any way
3) similarly for this as base
4) depend(inout: this) is invalid, but for different reason than the reported
one, again this is an expression, but not lvalue expression, so that
should be reported
5) the ctor/dtor cloning in the C++ FE (which is using walk_tree with
copy_tree_body_r) didn't handle iterators correctly, walk_tree normally
doesn't walk TREE_PURPOSE of TREE_LIST, and in the iterator case
that TREE_VEC contains also a BLOCK that needs special handling during
copy_tree_body_r
2021-06-03 Jakub Jelinek <jakub@redhat.com>
PR c++/100859
gcc/
* tree-inline.c (copy_tree_body_r): Handle iterators on
OMP_CLAUSE_AFFINITY or OMP_CLAUSE_DEPEND.
gcc/c/
* c-typeck.c (c_finish_omp_clauses): Move OMP_CLAUSE_AFFINITY
after depend only cases.
gcc/cp/
* semantics.c (handle_omp_array_sections_1): For
OMP_CLAUSE_{AFFINITY,DEPEND} handle FIELD_DECL base using
finish_non_static_data_member and allow this as base.
(finish_omp_clauses): Move OMP_CLAUSE_AFFINITY
after depend only cases. Let this be diagnosed by !lvalue_p
case for OMP_CLAUSE_{AFFINITY,DEPEND} and remove useless
assert.
* pt.c (tsubst_omp_clauses): Handle OMP_CLAUSE_AFFINITY.
gcc/testsuite/
* g++.dg/gomp/depend-iterator-3.C: New test.
* g++.dg/gomp/this-1.C: Don't expect any diagnostics for
this as base expression of depend array section, expect a different
error wording for this as depend locator and add testcases
for affinity clauses.
|
|
This missing copying exposed a type mismatch in the IL.
2021-05-28 Richard Biener <rguenther@suse.de>
PR ipa/100791
* tree-inline.c (copy_bb): When processing __builtin_va_arg_pack
copy fntype from original call.
* gcc.dg/pr100791.c: New testcase.
|
|
gcc/
* tree-inline.c (setup_one_parameter): Fix thinko in new condition.
|
|
While doing bogus call LHS removal I noticed that cloning with
dropping a return value creates a bogus replacement for a
DECL_BY_REFERENCE DECL_RESULT, resulting in MEM_REFs of
aggregates rather than pointers. The following fixes this
latent issue.
2021-05-06 Richard Biener <rguenther@suse.de>
* tree-inline.c (tree_function_versioning): Fix DECL_BY_REFERENCE
return variable removal.
|
|
When a call to a function is inlined and takes a parameter whose type is not
gimple_reg, a local variable is created in the caller to hold a copy of the
argument passed in the call with the following comment:
/* We may produce non-gimple trees by adding NOPs or introduce
invalid sharing when operand is not really constant.
It is not big deal to prohibit constant propagation here as
we will constant propagate in DOM1 pass anyway. *
Of course the second sentence of the comment does not apply to non-gimple_reg
values, unless they get SRAed later, because we don't do constant propagation
for them. This for example prevents two identical calls to a pure function
from being merged in the attached Ada testcase.
Therefore the attached patch attempts to reuse a read-only or non-addressable
local DECL of the caller, the hitch being that expand_call_inline needs to be
prevented from creating a CLOBBER for the case where it ends uo being reused.
gcc/
* tree-inline.c (insert_debug_decl_map): Delete.
(copy_debug_stmt): Minor tweak.
(setup_one_parameter): Do not use a variable if the value is either
a read-only DECL or a non-addressable local variable in the caller.
In this case, insert the debug-only variable in the map manually.
(expand_call_inline): Do not generate a CLOBBER for these values.
* tree-inline.h (debug_map): Minor tweak.
|
|
After g:1a2a7096e5e20d736c6138179470b21aa5a74864 we forbid inlining
for a VLA types. What we miss is setting inline_forbidden_reason
variable.
Fixes:
./xgcc -B. -O3 -c /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/pr99122-2.c -Winline
during GIMPLE pass: local-fnsummary
/home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/pr99122-2.c: In function ‘foo’:
/home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/pr99122-2.c:21:1: internal compiler error: Segmentation fault
21 | }
| ^
0xe8b2ca crash_signal
/home/marxin/Programming/gcc/gcc/toplev.c:327
0x1a92733 pp_format(pretty_printer*, text_info*)
/home/marxin/Programming/gcc/gcc/pretty-print.c:1096
0x1a76b90 diagnostic_report_diagnostic(diagnostic_context*, diagnostic_info*)
/home/marxin/Programming/gcc/gcc/diagnostic.c:1244
0x1a79994 diagnostic_impl
/home/marxin/Programming/gcc/gcc/diagnostic.c:1406
0x1a79994 warning(int, char const*, ...)
/home/marxin/Programming/gcc/gcc/diagnostic.c:1527
0xf1bb16 tree_inlinable_function_p(tree_node*)
/home/marxin/Programming/gcc/gcc/tree-inline.c:4123
0xc3f1c5 compute_fn_summary(cgraph_node*, bool)
/home/marxin/Programming/gcc/gcc/ipa-fnsummary.c:3110
0xc3f937 compute_fn_summary_for_current
/home/marxin/Programming/gcc/gcc/ipa-fnsummary.c:3160
0xc3f937 execute
/home/marxin/Programming/gcc/gcc/ipa-fnsummary.c:4768
gcc/ChangeLog:
* tree-inline.c (inline_forbidden_p): Set
inline_forbidden_reason.
|
|
This avoids declaring a function with VLA arguments or return values
as inlineable. IPA CP still ICEs, so the testcase has that disabled.
2021-02-19 Richard Biener <rguenther@suse.de>
PR middle-end/99122
* tree-inline.c (inline_forbidden_p): Do not inline functions
with VLA arguments or return value.
* gcc.dg/pr99122-3.c: New testcase.
|
|
The following instructs IPA not to inline calls with VLA parameters
and adjusts inlining not to create invalid view-converted VLA
parameters on mismatch and makes the error_mark paths with debug
stmts actually work.
The first part avoids the ICEs with the testcases already.
2021-02-18 Richard Biener <rguenther@suse.de>
PR middle-end/99122
* ipa-fnsummary.c (analyze_function_body): Set
CIF_FUNCTION_NOT_INLINABLE for VLA parameter calls.
* tree-inline.c (insert_init_debug_bind): Pass NULL for
error_mark_node values.
(force_value_to_type): Do not build V_C_Es for WITH_SIZE_EXPR
values.
(setup_one_parameter): Delay force_value_to_type until when
it's needed.
* gcc.dg/pr99122-1.c: New testcase.
* gcc.dg/pr99122-2.c: Likewise.
|
|
This fixes input_location leaking with an invalid BLOCK from
expand_call_inline to tree_function_versioning via clone
materialization.
2021-01-19 Richard Biener <rguenther@suse.de>
PR ipa/97673
* tree-inline.c (tree_function_versioning): Set input_location
to UNKNOWN_LOCATION throughout the function.
* gfortran.dg/pr97673.f90: New testcase.
|
|
This is just a precautionary fix.
2021-01-05 Bernd Edlinger <bernd.edlinger@hotmail.de>
* tree-inline.c (expand_call_inline): Restore input_location.
Return result from recursive call.
|
|
|
|
This removes gimple_debug_begin_stmts without block info which remain
after a gimple block originating from an inline function is unused.
The line numbers from these stmts are from the inline function,
but since the inline function is completely optimized away,
there will be no DW_TAG_inlined_subroutine so the debugger has
no callstack available at this point, and therefore those
line table entries are not helpful to the user.
2020-12-10 Bernd Edlinger <bernd.edlinger@hotmail.de>
* cfgexpand.c (expand_gimple_basic_block): Remove special handling
of debug_inline_entries without block info.
* tree-inline.c (remap_gimple_stmt): Drop debug_nonbind_markers when
the call statement has no block info.
(copy_debug_stmt): Remove debug_nonbind_markers when inlining
and the block info is mapped to NULL.
* tree-ssa-live.c (clear_unused_block_pointer): Remove
debug_nonbind_markers originating from removed inline functions.
|
|
Add widening add, subtract patterns to tree-vect-patterns. Update the
widened code of patterns that detect PLUS_EXPR to also detect
WIDEN_PLUS_EXPR. These patterns take 2 vectors with N elements of size
S and perform an add/subtract on the elements, storing the results as N
elements of size 2*S (in 2 result vectors). This is implemented in the
aarch64 backend as addl,addl2 and subl,subl2 respectively. Add aarch64
tests for patterns.
gcc/ChangeLog:
* doc/generic.texi: Document new widen_plus/minus_lo/hi tree codes.
* doc/md.texi: Document new widenening add/subtract hi/lo optabs.
* expr.c (expand_expr_real_2): Add widen_add, widen_subtract cases.
* optabs-tree.c (optab_for_tree_code): Add case for widening optabs.
* optabs.def (OPTAB_D): Define vectorized widen add, subtracts.
* tree-cfg.c (verify_gimple_assign_binary): Add case for widening adds,
subtracts.
* tree-inline.c (estimate_operator_cost): Add case for widening adds,
subtracts.
* tree-vect-generic.c (expand_vector_operations_1): Add case for
widening adds, subtracts
* tree-vect-patterns.c (vect_recog_widen_add_pattern): New recog
pattern.
(vect_recog_widen_sub_pattern): New recog pattern.
(vect_recog_average_pattern): Update widened add code.
(vect_recog_average_pattern): Update widened add code.
* tree-vect-stmts.c (vectorizable_conversion): Add case for widened add,
subtract.
(supportable_widening_operation): Add case for widened add, subtract.
* tree.def
(WIDEN_PLUS_EXPR): New tree code.
(WIDEN_MINUS_EXPR): New tree code.
(VEC_WIDEN_ADD_HI_EXPR): New tree code.
(VEC_WIDEN_PLUS_LO_EXPR): New tree code.
(VEC_WIDEN_MINUS_HI_EXPR): New tree code.
(VEC_WIDEN_MINUS_LO_EXPR): New tree code.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/vect-widen-add.c: New test.
* gcc.target/aarch64/vect-widen-sub.c: New test.
|
|
* Makefile.in: (OBJS): Add symtab-clones.o
(GTFILES): Add symtab-clones.h
* cgraph.c: Include symtab-clones.h.
(cgraph_edge::resolve_speculation): Fix formating
(cgraph_edge::redirect_call_stmt_to_callee): Update.
(cgraph_update_edges_for_call_stmt): Update
(release_function_body): Fix formating.
(cgraph_node::remove): Fix formating.
(cgraph_node::dump): Fix formating.
(cgraph_node::get_availability): Fix formating.
(cgraph_node::call_for_symbol_thunks_and_aliases): Fix formating.
(set_const_flag_1): Fix formating.
(set_pure_flag_1): Fix formating.
(cgraph_node::can_remove_if_no_direct_calls_p): Fix formating.
(collect_callers_of_node_1): Fix formating.
(clone_of_p): Update.
(cgraph_node::verify_node): Update.
(cgraph_c_finalize): Call clone_info::release ().
* cgraph.h (struct cgraph_clone_info): Move to symtab-clones.h.
(cgraph_node): Remove clone_info.
(symbol_table): Add m_clones.
* cgraphclones.c: Include symtab-clone.h.
(duplicate_thunk_for_node): Update.
(cgraph_node::create_clone): Update.
(cgraph_node::create_virtual_clone): Update.
(cgraph_node::find_replacement): Update.
(cgraph_node::materialize_clone): Update.
* gengtype.c (open_base_files): Include symtab-clones.h.
* ipa-cp.c: Include symtab-clones.h.
(initialize_node_lattices): Update.
(want_remove_some_param_p): Update.
(create_specialized_node): Update.
* ipa-fnsummary.c: Include symtab-clones.h.
(ipa_fn_summary_t::duplicate): Update.
* ipa-modref.c: Include symtab-clones.h.
(update_signature): Update.
* ipa-param-manipulation.c: Include symtab-clones.h.
(ipa_param_body_adjustments::common_initialization): Update.
* ipa-prop.c: Include symtab-clones.h.
(adjust_agg_replacement_values): Update.
(ipcp_get_parm_bits): Update.
(ipcp_update_bits): Update.
(ipcp_update_vr): Update.
* ipa-sra.c: Include symtab-clones.h.
(process_isra_node_results): Update.
(disable_unavailable_parameters): Update.
* lto-cgraph.c: Include symtab-clone.h.
(output_cgraph_opt_summary_p): Update.
(output_node_opt_summary): Update.
(input_node_opt_summary): Update.
* symtab-clones.cc: New file.
* symtab-clones.h: New file.
* tree-inline.c (expand_call_inline): Update.
(update_clone_info): Update.
(tree_function_versioning): Update.
|
|
this patch moves thunk_info out of cgraph_node into a symbol summary.
I also moved it to separate hearder file since cgraph.h became really too
fat. I plan to contiue with similar breakup in order to cleanup interfaces
and reduce WPA memory footprint (symbol table now consumes more memory than
trees)
gcc/ChangeLog:
2020-10-23 Jan Hubicka <hubicka@ucw.cz>
* Makefile.in: Add symtab-thunks.o
(GTFILES): Add symtab-thunks.h and symtab-thunks.cc; remove cgraphunit.c
* cgraph.c: Include symtab-thunks.h.
(cgraph_node::create_thunk): Update
(symbol_table::create_edge): Update
(cgraph_node::dump): Update
(cgraph_node::call_for_symbol_thunks_and_aliases): Update
(set_nothrow_flag_1): Update
(set_malloc_flag_1): Update
(set_const_flag_1): Update
(collect_callers_of_node_1): Update
(clone_of_p): Update
(cgraph_node::verify_node): Update
(cgraph_node::function_symbol): Update
(cgraph_c_finalize): Call thunk_info::release.
(cgraph_node::has_thunk_p): Update
(cgraph_node::former_thunk_p): Move here from cgraph.h; reimplement.
* cgraph.h (struct cgraph_thunk_info): Rename to symtab-thunks.h.
(cgraph_node): Remove thunk field; add thunk bitfield.
(cgraph_node::expand_thunk): Move to symtab-thunks.h
(symtab_thunks_cc_finalize): Declare.
(cgraph_node::has_gimple_body_p): Update.
(cgraph_node::former_thunk_p): Update.
* cgraphclones.c: Include symtab-thunks.h.
(duplicate_thunk_for_node): Update.
(cgraph_edge::redirect_callee_duplicating_thunks): Update.
(cgraph_node::expand_all_artificial_thunks): Update.
(cgraph_node::create_edge_including_clones): Update.
* cgraphunit.c: Include symtab-thunks.h.
(vtable_entry_type): Move to symtab-thunks.c.
(cgraph_node::analyze): Update.
(analyze_functions): Update.
(mark_functions_to_output): Update.
(thunk_adjust): Move to symtab-thunks.c
(cgraph_node::expand_thunk): Move to symtab-thunks.c
(cgraph_node::assemble_thunks_and_aliases): Update.
(output_in_order): Update.
(cgraphunit_c_finalize): Do not clear vtable_entry_type.
(cgraph_node::create_wrapper): Update.
* gengtype.c (open_base_files): Add symtab-thunks.h
* ipa-comdats.c (propagate_comdat_group): UPdate.
(ipa_comdats): Update.
* ipa-cp.c (determine_versionability): UPdate.
(gather_caller_stats): Update.
(count_callers): Update
(set_single_call_flag): Update
(initialize_node_lattices): Update
(call_passes_through_thunk_p): Update
(call_passes_through_thunk): Update
(propagate_constants_across_call): Update
(find_more_scalar_values_for_callers_subset): Update
(has_undead_caller_from_outside_scc_p): Update
* ipa-fnsummary.c (evaluate_properties_for_edge): Update.
(compute_fn_summary): Update.
(inline_analyze_function): Update.
* ipa-icf.c: Include symtab-thunks.h.
(sem_function::equals_wpa): Update.
(redirect_all_callers): Update.
(sem_function::init): Update.
(sem_function::parse): Update.
* ipa-inline-transform.c: Include symtab-thunks.h.
(inline_call): Update.
(save_inline_function_body): Update.
(preserve_function_body_p): Update.
* ipa-inline.c (inline_small_functions): Update.
* ipa-polymorphic-call.c: Include alloc-pool.h, symbol-summary.h,
symtab-thunks.h
(ipa_polymorphic_call_context::ipa_polymorphic_call_context): Update.
* ipa-pure-const.c: Include symtab-thunks.h.
(analyze_function): Update.
* ipa-sra.c (check_for_caller_issues): Update.
* ipa-utils.c (ipa_reverse_postorder): Update.
(ipa_merge_profiles): Update.
* ipa-visibility.c (non_local_p): Update.
(cgraph_node::local_p): Update.
(function_and_variable_visibility): Update.
* ipa.c (symbol_table::remove_unreachable_nodes): Update.
* lto-cgraph.c: Include alloc-pool.h, symbol-summary.h and
symtab-thunks.h
(lto_output_edge): Update.
(lto_output_node): Update.
(compute_ltrans_boundary): Update.
(output_symtab): Update.
(verify_node_partition): Update.
(input_overwrite_node): Update.
(input_node): Update.
* lto-streamer-in.c (fixup_call_stmt_edges): Update.
* symtab-thunks.cc: New file.
* symtab-thunks.h: New file.
* toplev.c (toplev::finalize): Call symtab_thunks_cc_finalize.
* trans-mem.c (ipa_tm_mayenterirr_function): Update.
(ipa_tm_execute): Update.
* tree-inline.c (expand_call_inline): Update.
* tree-nested.c (create_nesting_tree): Update.
(convert_all_function_calls): Update.
(gimplify_all_functions): Update.
* tree-profile.c (tree_profiling): Update.
* tree-ssa-structalias.c (associate_varinfo_to_alias): Update.
* tree.c (free_lang_data_in_decl): Update.
* value-prof.c (init_node_map): Update.
gcc/c-family/ChangeLog:
2020-10-23 Jan Hubicka <hubicka@ucw.cz>
* c-common.c (c_common_finalize_early_debug): Update for new thunk api.
gcc/d/ChangeLog:
2020-10-23 Jan Hubicka <hubicka@ucw.cz>
* decl.cc (finish_thunk): Update for new thunk api.
gcc/lto/ChangeLog:
2020-10-23 Jan Hubicka <hubicka@ucw.cz>
* lto-partition.c (add_symbol_to_partition_1): Update for new thunk
api.
|
|
gcc/ada/ChangeLog:
* gcc-interface/trans.c (gigi): Set exact argument of a vector
growth function to true.
(Attribute_to_gnu): Likewise.
gcc/ChangeLog:
* alias.c (init_alias_analysis): Set exact argument of a vector
growth function to true.
* calls.c (internal_arg_pointer_based_exp_scan): Likewise.
* cfgbuild.c (find_many_sub_basic_blocks): Likewise.
* cfgexpand.c (expand_asm_stmt): Likewise.
* cfgrtl.c (rtl_create_basic_block): Likewise.
* combine.c (combine_split_insns): Likewise.
(combine_instructions): Likewise.
* config/aarch64/aarch64-sve-builtins.cc (function_expander::add_output_operand): Likewise.
(function_expander::add_input_operand): Likewise.
(function_expander::add_integer_operand): Likewise.
(function_expander::add_address_operand): Likewise.
(function_expander::add_fixed_operand): Likewise.
* df-core.c (df_worklist_dataflow_doublequeue): Likewise.
* dwarf2cfi.c (update_row_reg_save): Likewise.
* early-remat.c (early_remat::init_block_info): Likewise.
(early_remat::finalize_candidate_indices): Likewise.
* except.c (sjlj_build_landing_pads): Likewise.
* final.c (compute_alignments): Likewise.
(grow_label_align): Likewise.
* function.c (temp_slots_at_level): Likewise.
* fwprop.c (build_single_def_use_links): Likewise.
(update_uses): Likewise.
* gcc.c (insert_wrapper): Likewise.
* genautomata.c (create_state_ainsn_table): Likewise.
(add_vect): Likewise.
(output_dead_lock_vect): Likewise.
* genmatch.c (capture_info::capture_info): Likewise.
(parser::finish_match_operand): Likewise.
* genrecog.c (optimize_subroutine_group): Likewise.
(merge_pattern_info::merge_pattern_info): Likewise.
(merge_into_decision): Likewise.
(print_subroutine_start): Likewise.
(main): Likewise.
* gimple-loop-versioning.cc (loop_versioning::loop_versioning): Likewise.
* gimple.c (gimple_set_bb): Likewise.
* graphite-isl-ast-to-gimple.c (translate_isl_ast_node_user): Likewise.
* haifa-sched.c (sched_extend_luids): Likewise.
(extend_h_i_d): Likewise.
* insn-addr.h (insn_addresses_new): Likewise.
* ipa-cp.c (gather_context_independent_values): Likewise.
(find_more_contexts_for_caller_subset): Likewise.
* ipa-devirt.c (final_warning_record::grow_type_warnings): Likewise.
(ipa_odr_read_section): Likewise.
* ipa-fnsummary.c (evaluate_properties_for_edge): Likewise.
(ipa_fn_summary_t::duplicate): Likewise.
(analyze_function_body): Likewise.
(ipa_merge_fn_summary_after_inlining): Likewise.
(read_ipa_call_summary): Likewise.
* ipa-icf.c (sem_function::bb_dict_test): Likewise.
* ipa-prop.c (ipa_alloc_node_params): Likewise.
(parm_bb_aa_status_for_bb): Likewise.
(ipa_compute_jump_functions_for_edge): Likewise.
(ipa_analyze_node): Likewise.
(update_jump_functions_after_inlining): Likewise.
(ipa_read_edge_info): Likewise.
(read_ipcp_transformation_info): Likewise.
(ipcp_transform_function): Likewise.
* ipa-reference.c (ipa_reference_write_optimization_summary): Likewise.
* ipa-split.c (execute_split_functions): Likewise.
* ira.c (find_moveable_pseudos): Likewise.
* lower-subreg.c (decompose_multiword_subregs): Likewise.
* lto-streamer-in.c (input_eh_regions): Likewise.
(input_cfg): Likewise.
(input_struct_function_base): Likewise.
(input_function): Likewise.
* modulo-sched.c (set_node_sched_params): Likewise.
(extend_node_sched_params): Likewise.
(schedule_reg_moves): Likewise.
* omp-general.c (omp_construct_simd_compare): Likewise.
* passes.c (pass_manager::create_pass_tab): Likewise.
(enable_disable_pass): Likewise.
* predict.c (determine_unlikely_bbs): Likewise.
* profile.c (compute_branch_probabilities): Likewise.
* read-rtl-function.c (function_reader::parse_block): Likewise.
* read-rtl.c (rtx_reader::read_rtx_code): Likewise.
* reg-stack.c (stack_regs_mentioned): Likewise.
* regrename.c (regrename_init): Likewise.
* rtlanal.c (T>::add_single_to_queue): Likewise.
* sched-deps.c (init_deps_data_vector): Likewise.
* sel-sched-ir.c (sel_extend_global_bb_info): Likewise.
(extend_region_bb_info): Likewise.
(extend_insn_data): Likewise.
* symtab.c (symtab_node::create_reference): Likewise.
* tracer.c (tail_duplicate): Likewise.
* trans-mem.c (tm_region_init): Likewise.
(get_bb_regions_instrumented): Likewise.
* tree-cfg.c (init_empty_tree_cfg_for_function): Likewise.
(build_gimple_cfg): Likewise.
(create_bb): Likewise.
(move_block_to_fn): Likewise.
* tree-complex.c (tree_lower_complex): Likewise.
* tree-if-conv.c (predicate_rhs_code): Likewise.
* tree-inline.c (copy_bb): Likewise.
* tree-into-ssa.c (get_ssa_name_ann): Likewise.
(mark_phi_for_rewrite): Likewise.
* tree-object-size.c (compute_builtin_object_size): Likewise.
(init_object_sizes): Likewise.
* tree-predcom.c (initialize_root_vars_store_elim_1): Likewise.
(initialize_root_vars_store_elim_2): Likewise.
(prepare_initializers_chain_store_elim): Likewise.
* tree-ssa-address.c (addr_for_mem_ref): Likewise.
(multiplier_allowed_in_address_p): Likewise.
* tree-ssa-coalesce.c (ssa_conflicts_new): Likewise.
* tree-ssa-forwprop.c (simplify_vector_constructor): Likewise.
* tree-ssa-loop-ivopts.c (addr_offset_valid_p): Likewise.
(get_address_cost_ainc): Likewise.
* tree-ssa-loop-niter.c (discover_iteration_bound_by_body_walk): Likewise.
* tree-ssa-pre.c (add_to_value): Likewise.
(phi_translate_1): Likewise.
(do_pre_regular_insertion): Likewise.
(do_pre_partial_partial_insertion): Likewise.
(init_pre): Likewise.
* tree-ssa-propagate.c (ssa_prop_init): Likewise.
(update_call_from_tree): Likewise.
* tree-ssa-reassoc.c (optimize_range_tests_cmp_bitwise): Likewise.
* tree-ssa-sccvn.c (vn_reference_lookup_3): Likewise.
(vn_reference_lookup_pieces): Likewise.
(eliminate_dom_walker::eliminate_push_avail): Likewise.
* tree-ssa-strlen.c (set_strinfo): Likewise.
(get_stridx_plus_constant): Likewise.
(zero_length_string): Likewise.
(find_equal_ptrs): Likewise.
(printf_strlen_execute): Likewise.
* tree-ssa-threadedge.c (set_ssa_name_value): Likewise.
* tree-ssanames.c (make_ssa_name_fn): Likewise.
* tree-streamer-in.c (streamer_read_tree_bitfields): Likewise.
* tree-vect-loop.c (vect_record_loop_mask): Likewise.
(vect_get_loop_mask): Likewise.
(vect_record_loop_len): Likewise.
(vect_get_loop_len): Likewise.
* tree-vect-patterns.c (vect_recog_mask_conversion_pattern): Likewise.
* tree-vect-slp.c (vect_slp_convert_to_external): Likewise.
(vect_bb_slp_scalar_cost): Likewise.
(vect_bb_vectorization_profitable_p): Likewise.
(vectorizable_slp_permutation): Likewise.
* tree-vect-stmts.c (vectorizable_call): Likewise.
(vectorizable_simd_clone_call): Likewise.
(scan_store_can_perm_p): Likewise.
(vectorizable_store): Likewise.
* expr.c: Likewise.
* vec.c (test_safe_grow_cleared): Likewise.
* vec.h (vec_safe_grow): Likewise.
(vec_safe_grow_cleared): Likewise.
(vl_ptr>::safe_grow): Likewise.
(vl_ptr>::safe_grow_cleared): Likewise.
* config/c6x/c6x.c (insn_set_clock): Likewise.
gcc/c/ChangeLog:
* gimple-parser.c (c_parser_gimple_compound_statement): Set exact argument of a vector
growth function to true.
gcc/cp/ChangeLog:
* class.c (build_vtbl_initializer): Set exact argument of a vector
growth function to true.
* constraint.cc (get_mapped_args): Likewise.
* decl.c (cp_maybe_mangle_decomp): Likewise.
(cp_finish_decomp): Likewise.
* parser.c (cp_parser_omp_for_loop): Likewise.
* pt.c (canonical_type_parameter): Likewise.
* rtti.c (get_pseudo_ti_init): Likewise.
gcc/fortran/ChangeLog:
* trans-openmp.c (gfc_trans_omp_do): Set exact argument of a vector
growth function to true.
gcc/lto/ChangeLog:
* lto-common.c (lto_file_finalize): Set exact argument of a vector
growth function to true.
|
|
I mistook the opportunity to also "fix" the [VEC_]COND_EXPR case
for PR95171 but I was wrong in that it doesn't need the fix and
in the actual fix as well. The following just reverts that part.
2020-05-20 Richard Biener <rguenther@suse.de>
PR middle-end/95231
* tree-inline.c (remap_gimple_stmt): Revert adjusting
COND_EXPR and VEC_COND_EXPR for a -fnon-call-exception boundary.
* g++.dg/other/pr95231.C: New testcase.
|
|
This fixes always-inlining across -fnon-call-exception boundaries
for conditions which we do not allow to throw.
2020-05-18 Richard Biener <rguenther@suse.de>
PR middle-end/95171
* tree-inline.c (remap_gimple_stmt): Split out trapping compares
when inlining into a non-call EH function.
* gcc.dg/pr95171.c: New testcase.
|
|
This is a new version of the
https://gcc.gnu.org/legacy-ml/gcc-patches/2019-11/msg01493.html
patch. Unlike the previous version, this one actually works properly
except for LTO, bootstrapped/regtested on x86_64-linux and i686-linux
too.
In short, #pragma omp declare variant is a directive which allows
redirection of direct calls to certain function to other calls with a
scoring system and some of those decisions need to be deferred until after
IPA. The patch represents them with calls to an artificial FUNCTION_DECL
with declare_variant_alt in the cgraph_node set.
For LTO, the patch only saves/restores the two cgraph_node bits added in the
patch, but doesn't yet stream out and back in the on the side info for the
declare_variant_alt. For the LTO partitioning, I believe those artificial
FUNCTION_DECLs with declare_variant_alt need to go into partition together
with anything that calls them (possibly duplicated), any way how to achieve
that? Say if declare variant artificial fn foobar is directly
called from all of foo, bar and baz and not from qux and we want 4
partitions, one for each of foo, bar, baz, qux, then foobar is needed in the
first 3 partitions, and the IPA_REF_ADDRs recorded for foobar that right
after IPA the foobar call will be replaced with calls to foobar1, foobar2,
foobar3 or foobar (non-artificial) can of course stay in different
partitions if needed.
2020-05-14 Jakub Jelinek <jakub@redhat.com>
* Makefile.in (GTFILES): Add omp-general.c.
* cgraph.h (struct cgraph_node): Add declare_variant_alt and
calls_declare_variant_alt members and initialize them in the
ctor.
* ipa.c (symbol_table::remove_unreachable_nodes): Handle direct
calls to declare_variant_alt nodes.
* lto-cgraph.c (lto_output_node): Write declare_variant_alt
and calls_declare_variant_alt.
(input_overwrite_node): Read them back.
* omp-simd-clone.c (simd_clone_create): Copy calls_declare_variant_alt
bit.
* tree-inline.c (expand_call_inline): Or in calls_declare_variant_alt
bit.
(tree_function_versioning): Copy calls_declare_variant_alt bit.
* omp-offload.c (execute_omp_device_lower): Call
omp_resolve_declare_variant on direct function calls.
(pass_omp_device_lower::gate): Also enable for
calls_declare_variant_alt functions.
* omp-general.c (omp_maybe_offloaded): Return false after inlining.
(omp_context_selector_matches): Handle the case when
cfun->curr_properties has PROP_gimple_any bit set.
(struct omp_declare_variant_entry): New type.
(struct omp_declare_variant_base_entry): New type.
(struct omp_declare_variant_hasher): New type.
(omp_declare_variant_hasher::hash, omp_declare_variant_hasher::equal):
New methods.
(omp_declare_variants): New variable.
(struct omp_declare_variant_alt_hasher): New type.
(omp_declare_variant_alt_hasher::hash,
omp_declare_variant_alt_hasher::equal): New methods.
(omp_declare_variant_alt): New variables.
(omp_resolve_late_declare_variant): New function.
(omp_resolve_declare_variant): Call omp_resolve_late_declare_variant
when called late. Create a magic declare_variant_alt fndecl and
cgraph node and return that if decision needs to be deferred until
after gimplification.
* cgraph.c (symbol_table::create_edge): Or in calls_declare_variant_alt
bit.
* c-c++-common/gomp/declare-variant-14.c: New test.
|
|
This extends DECL_GIMPLE_REG_P to all types so we can clear
TREE_ADDRESSABLE even for integers with partial defs, not just
complex and vector variables. To make that transition easier
the patch inverts DECL_GIMPLE_REG_P to DECL_NOT_GIMPLE_REG_P
since that makes the default the current state for all other
types besides complex and vectors.
For the testcase in PR94703 we're able to expand the partial
def'ed local integer to a register then, producing a single
movl rather than going through the stack.
On i?86 this execute FAILs gcc.dg/torture/pr71522.c because
we now expand a round-trip through a long double automatic var
to a register fld/fst which normalizes the value. For that
during RTL expansion we're looking for problematic punnings
of decls and avoid pseudos for those - I chose integer or
BLKmode accesses on decls with modes where precision doesn't
match bitsize which covers the XFmode case.
2020-05-07 Richard Biener <rguenther@suse.de>
PR middle-end/94703
* tree-core.h (tree_decl_common::gimple_reg_flag): Rename ...
(tree_decl_common::not_gimple_reg_flag): ... to this.
* tree.h (DECL_GIMPLE_REG_P): Rename ...
(DECL_NOT_GIMPLE_REG_P): ... to this.
* gimple-expr.c (copy_var_decl): Copy DECL_NOT_GIMPLE_REG_P.
(create_tmp_reg): Simplify.
(create_tmp_reg_fn): Likewise.
(is_gimple_reg): Check DECL_NOT_GIMPLE_REG_P for all regs.
* gimplify.c (create_tmp_from_val): Simplify.
(gimplify_bind_expr): Likewise.
(gimplify_compound_literal_expr): Likewise.
(gimplify_function_tree): Likewise.
(prepare_gimple_addressable): Set DECL_NOT_GIMPLE_REG_P.
* asan.c (create_odr_indicator): Do not clear DECL_GIMPLE_REG_P.
(asan_add_global): Copy it.
* cgraphunit.c (cgraph_node::expand_thunk): Force args
to be GIMPLE regs.
* function.c (gimplify_parameters): Copy
DECL_NOT_GIMPLE_REG_P.
* ipa-param-manipulation.c
(ipa_param_body_adjustments::common_initialization): Simplify.
(ipa_param_body_adjustments::reset_debug_stmts): Copy
DECL_NOT_GIMPLE_REG_P.
* omp-low.c (lower_omp_for_scan): Do not set DECL_GIMPLE_REG_P.
* sanopt.c (sanitize_rewrite_addressable_params): Likewise.
* tree-cfg.c (make_blocks_1): Simplify.
(verify_address): Do not verify DECL_GIMPLE_REG_P setting.
* tree-eh.c (lower_eh_constructs_2): Simplify.
* tree-inline.c (declare_return_variable): Adjust and
generalize.
(copy_decl_to_var): Copy DECL_NOT_GIMPLE_REG_P.
(copy_result_decl_to_var): Likewise.
* tree-into-ssa.c (pass_build_ssa::execute): Adjust comment.
* tree-nested.c (create_tmp_var_for): Simplify.
* tree-parloops.c (separate_decls_in_region_name): Copy
DECL_NOT_GIMPLE_REG_P.
* tree-sra.c (create_access_replacement): Adjust and
generalize partial def support.
* tree-ssa-forwprop.c (pass_forwprop::execute): Set
DECL_NOT_GIMPLE_REG_P on decls we introduce partial defs on.
* tree-ssa.c (maybe_optimize_var): Handle clearing of
TREE_ADDRESSABLE and setting/clearing DECL_NOT_GIMPLE_REG_P
independently.
* lto-streamer-out.c (hash_tree): Hash DECL_NOT_GIMPLE_REG_P.
* tree-streamer-out.c (pack_ts_decl_common_value_fields): Stream
DECL_NOT_GIMPLE_REG_P.
* tree-streamer-in.c (unpack_ts_decl_common_value_fields): Likewise.
* cfgexpand.c (avoid_type_punning_on_regs): New.
(discover_nonconstant_array_refs): Call
avoid_type_punning_on_regs to avoid unsupported mode punning.
lto/
* lto-common.c (compare_tree_sccs_1): Compare
DECL_NOT_GIMPLE_REG_P.
c/
* gimple-parser.c (c_parser_parse_ssa_name): Do not set
DECL_GIMPLE_REG_P.
cp/
* optimize.c (update_cloned_parm): Copy DECL_NOT_GIMPLE_REG_P.
* gcc.dg/tree-ssa/pr94703.c: New testcase.
|
|
when callers and callees do not quite agree on the type of a
parameter, usually with ill-defined K&R or with smilarly wrong LTO
input, IPA-CP can attempt to try and substitute a wrong type for a
parameter (see e.g. gcc.dg/torture/pr48063.c). Function
tree_function_versioning attempts to handle this in order not to
create invalid gimple but it then creates the mapping using
setup_one_parameter which also handles the same situation to avoid
similar problems when inlining and in more defined way.
So this patch simply removes the conversion attempts in
tree_function_versioning itself. It is helpful for my upcoming fix of
PR 93385 because then I do not need to teach
ipa_param_body_adjustments to distinguish between successful and
unsuccessful mappings - setup_one_parameter uses force_value_to_type
for conversions which simly maps the worst cases to zero.
2020-05-04 Martin Jambor <mjambor@suse.cz>
PR ipa/93385
* tree-inline.c (tree_function_versioning): Leave any type conversion
of replacements to setup_one_parameter and its friend
force_value_to_type.
|
|
PR ipa/94582
* tree-inline.c (optimize_inline_calls): Recompute calls_comdat_local
flag.
* g++.dg/torture/pr94582.C: New test.
|
|
When I've added the VLA tweak for OpenMP to avoid error_mark_nodes in the IL in
type, I forgot that TYPE_DOMAIN could be NULL. Furthermore, as an optimization,
this patch checks the hopefully cheapest condition that is very likely false
most of the time (enabled only during OpenMP handling) first.
2020-04-17 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94621
* tree-inline.c (remap_type_1): Don't dereference NULL TYPE_DOMAIN.
Move id->adjust_array_error_bounds check first in the condition.
* gcc.c-torture/compile/pr94621.c: New test.
|
|
The following testcase fails with -fcompare-debug. The problem is that
bar is marked as address_taken only with -g and not without.
I've tracked it down to insert_init_stmt calling gimple_regimplify_operands
even on DEBUG_STMTs. That function will just insert normal stmts before
the DEBUG_STMT if the DEBUG_STMT operand isn't gimple val or invariant.
While DCE will turn those statements into debug temporaries, it can cause
differences in SSA_NAMEs and more importantly, the ipa references are
generated from those before the DCE happens.
On the testcase, the DEBUG_STMT value is (int)bar.
We could generate DEBUG_STMTs with debug temporaries instead, but I fail to
see the reason to do that, DEBUG_STMTs allow other expressions and all we
want to ensure is that the expressions aren't too large (arbitrarily
complex), but during inlining/function versioning I don't see why something
would queue a DEBUG_STMT with arbitrarily complex expressions in there.
2020-03-16 Jakub Jelinek <jakub@redhat.com>
PR debug/94167
* tree-inline.c (insert_init_stmt): Don't gimple_regimplify_operands
DEBUG_STMTs.
* gcc.dg/pr94167.c: New test.
|
|
In the following testcase we emit wrong debug info for the karg
parameter in the DW_TAG_inlined_subroutine into main.
The problem is that the karg PARM_DECL is DECL_BY_REFERENCE and thus
in the IL has const K & type, but in the source just const K.
When the function is inlined, we create a VAR_DECL for it, but don't
set DECL_BY_REFERENCE, so when emitting DW_AT_location, we treat it like
a const K & typed variable, but it has DW_AT_abstract_origin which has
just the const K type and thus the debugger thinks the variable has
const K type.
Fixed by copying the DECL_BY_REFERENCE flag. Not doing it in
copy_decl_for_dup_finish, because copy_decl_no_change already copies
that flag through copy_node and in copy_result_decl_to_var it is
undesirable, as we handle DECL_BY_REFERENCE in that case instead
by changing the type.
2020-03-04 Jakub Jelinek <jakub@redhat.com>
PR debug/93888
* tree-inline.c (copy_decl_to_var): Copy DECL_BY_REFERENCE flag.
* g++.dg/guality/pr93888.C: New test.
|
|
The inliner folds stmts delayed, the following arranges things so
to not fold stmts that are obviously not reachable to avoid warnings
from those code regions.
2020-02-07 Richard Biener <rguenther@suse.de>
PR middle-end/93519
* tree-inline.c (fold_marked_statements): Do a PRE walk,
skipping unreachable regions.
(optimize_inline_calls): Skip folding stmts when we didn't
inline.
* gcc.dg/Wrestrict-21.c: New testcase.
|
|
This patch started as work to resole Richard's comment on quadratic lookups
in resolve_speculation. While doing it I however noticed multiple problems
in the new speuclative call code which made the patch quite big. In
particular:
1) Before applying speculation we consider only targets with at lest
probability 1/2.
If profile is sane at most two targets can have probability greater or
equal to 1/2. So the new multi-target speculation code got enabled only
in very special scenario when there ae precisely two target with precise
probability 1/2 (which is tested by the single testcase).
As a conseuqence the multiple target logic got minimal test coverage and
this made us to miss several ICEs.
2) Profile updating in profile merging, tree-inline and indirect call
expansion was wrong which led to inconsistent profiles (as already seen
on the testcase).
3) Code responsible to turn speculative call to direct call was broken for
anything with more than one target.
4) There were multiple cases where call_site_hash went out of sync which
eventually leads to an ICE..
5) Some code expects that all speculative call targets forms a sequence in
the callee linked list but there is no code to maintain that invariant
nor a verifier.
Fixing this it became obvious that the current API of speculative_call_info is
not useful because it really builds on fact tht there are precisely three
components (direct call, ref and indirect call) in every speculative call
sequence. I ended up replacing it with iterator API for direct call
(first_speculative_call_target, next_speculative_call_target) and accessors for
the other coponents updating comment in cgraph.h.
Finally I made the work with call site hash more effetive by updating edge
manipulation to keep them in sequence. So first one can be looked up from the
hash and then they can be iterated by callee.
There are other things that can be improved (for example the speculation should
start with most common target first), but I will try to keep that for next
stage1. This patch is mostly about getting rid of ICE and profile corruption
which is a regression from GCC 9.
gcc/ChangeLog:
PR lto/93318
* cgraph.c (cgraph_add_edge_to_call_site_hash): Update call site
hash only when edge is first within the sequence.
(cgraph_edge::set_call_stmt): Update handling of speculative calls.
(symbol_table::create_edge): Do not set target_prob.
(cgraph_edge::remove_caller): Watch for speculative calls when updating
the call site hash.
(cgraph_edge::make_speculative): Drop target_prob parameter.
(cgraph_edge::speculative_call_info): Remove.
(cgraph_edge::first_speculative_call_target): New member function.
(update_call_stmt_hash_for_removing_direct_edge): New function.
(cgraph_edge::resolve_speculation): Rewrite to new API.
(cgraph_edge::speculative_call_for_target): New member function.
(cgraph_edge::make_direct): Rewrite to new API; fix handling of
multiple speculation targets.
(cgraph_edge::redirect_call_stmt_to_callee): Likewise; fix updating
of profile.
(verify_speculative_call): Verify that targets form an interval.
* cgraph.h (cgraph_edge::speculative_call_info): Remove.
(cgraph_edge::first_speculative_call_target): New member function.
(cgraph_edge::next_speculative_call_target): New member function.
(cgraph_edge::speculative_call_target_ref): New member function.
(cgraph_edge;:speculative_call_indirect_edge): New member funtion.
(cgraph_edge): Remove target_prob.
* cgraphclones.c (cgraph_node::set_call_stmt_including_clones):
Fix handling of speculative calls.
* ipa-devirt.c (ipa_devirt): Fix handling of speculative cals.
* ipa-fnsummary.c (analyze_function_body): Likewise.
* ipa-inline.c (speculation_useful_p): Use new speculative call API.
* ipa-profile.c (dump_histogram): Fix formating.
(ipa_profile_generate_summary): Watch for overflows.
(ipa_profile): Do not require probablity to be 1/2; update to new API.
* ipa-prop.c (ipa_make_edge_direct_to_target): Update to new API.
(update_indirect_edges_after_inlining): Update to new API.
* ipa-utils.c (ipa_merge_profiles): Rewrite merging of speculative call
profiles.
* profile-count.h: (profile_probability::adjusted): New.
* tree-inline.c (copy_bb): Update to new speculative call API; fix
updating of profile.
* value-prof.c (gimple_ic_transform): Rename to ...
(dump_ic_profile): ... this one; update dumping.
(stream_in_histogram_value): Fix formating.
(gimple_value_profile_transformations): Update.
gcc/testsuite/ChangeLog:
* g++.dg/tree-prof/indir-call-prof.C: Update template.
* gcc.dg/tree-prof/crossmodule-indircall-1.c: Add more targets.
* gcc.dg/tree-prof/crossmodule-indircall-1a.c: Add more targets.
* gcc.dg/tree-prof/indir-call-prof.c: Update template.
|
|
v8:
1. Rebase to master with Martin's static function (r280043) comments merge.
Boostrap/testsuite/SPEC2017 tested pass on Power8-LE.
2. TODO:
2.1. C++ devirt for multiple speculative call targets.
2.2. ipa-icf ipa_merge_profiles refine with COMDAT inline testcase.
This patch aims to fix PR69678 caused by PGO indirect call profiling
performance issues.
The bug that profiling data is never working was fixed by Martin's pull
back of topN patches, performance got GEOMEAN ~1% improvement(+24% for
511.povray_r specifically).
Still, currently the default profile only generates SINGLE indirect target
that called more than 75%. This patch leverages MULTIPLE indirect
targets use in LTO-WPA and LTO-LTRANS stage, as a result, function
specialization, profiling, partial devirtualization, inlining and
cloning could be done successfully based on it.
Performance can get improved from 0.70 sec to 0.38 sec on simple tests.
Details are:
1. PGO with topn is enabled by default now, but only one indirect
target edge will be generated in ipa-profile pass, so add variables to enable
multiple speculative edges through passes, speculative_id will record the
direct edge index bind to the indirect edge, indirect_call_targets length
records how many direct edges owned by the indirect edge, postpone gimple_ic
to ipa-profile like default as inline pass will decide whether it is benefit
to transform indirect call.
2. Use speculative_id to track and search the reference node matched
with the direct edge's callee for multiple targets. Actually, it is the
caller's responsibility to handle the direct edges mapped to same indirect
edge. speculative_call_info will return one of the direct edge specified,
this will leverage current IPA edge process framework mostly.
3. Enable LTO WPA/LTRANS stage multiple indirect call targets analysis for
profile full support in ipa passes and cgraph_edge functions. speculative_id
can be set by make_speculative id when multiple targets are binded to
one indirect edge, and cloned if new edge is cloned. speculative_id
is streamed out and stream int by lto like lto_stmt_uid.
4. Create and duplicate all speculative direct edge's call summary
in ipa-fnsummary.c with auto_vec.
5. Add 1 in module testcase and 2 cross module testcases.
6. Bootstrap and regression test passed on Power8-LE. No function
and performance regression for SPEC2017.
gcc/ChangeLog
2020-01-14 Xiong Hu Luo <luoxhu@linux.ibm.com>
PR ipa/69678
* cgraph.c (symbol_table::create_edge): Init speculative_id and
target_prob.
(cgraph_edge::make_speculative): Add param for setting speculative_id
and target_prob.
(cgraph_edge::speculative_call_info): Update comments and find reference
by speculative_id for multiple indirect targets.
(cgraph_edge::resolve_speculation): Decrease the speculations
for indirect edge, drop it's speculative if not direct target
left. Update comments.
(cgraph_edge::redirect_call_stmt_to_callee): Likewise.
(cgraph_node::dump): Print num_speculative_call_targets.
(cgraph_node::verify_node): Don't report error if speculative
edge not include statement.
(cgraph_edge::num_speculative_call_targets_p): New function.
* cgraph.h (int common_target_id): Remove.
(int common_target_probability): Remove.
(num_speculative_call_targets): New variable.
(make_speculative): Add param for setting speculative_id.
(cgraph_edge::num_speculative_call_targets_p): New declare.
(target_prob): New variable.
(speculative_id): New variable.
* ipa-fnsummary.c (analyze_function_body): Create and duplicate
call summaries for multiple speculative call targets.
* cgraphclones.c (cgraph_node::create_clone): Clone speculative_id.
* ipa-profile.c (struct speculative_call_target): New struct.
(class speculative_call_summary): New class.
(class speculative_call_summaries): New class.
(call_sums): New variable.
(ipa_profile_generate_summary): Generate indirect multiple targets summaries.
(ipa_profile_write_edge_summary): New function.
(ipa_profile_write_summary): Stream out indirect multiple targets summaries.
(ipa_profile_dump_all_summaries): New function.
(ipa_profile_read_edge_summary): New function.
(ipa_profile_read_summary_section): New function.
(ipa_profile_read_summary): Stream in indirect multiple targets summaries.
(ipa_profile): Generate num_speculative_call_targets from
profile summaries.
* ipa-ref.h (speculative_id): New variable.
* ipa-utils.c (ipa_merge_profiles): Update with target_prob.
* lto-cgraph.c (lto_output_edge): Remove indirect common_target_id and
common_target_probability. Stream out speculative_id and
num_speculative_call_targets.
(input_edge): Likewise.
* predict.c (dump_prediction): Remove edges count assert to be
precise.
* symtab.c (symtab_node::create_reference): Init speculative_id.
(symtab_node::clone_references): Clone speculative_id.
(symtab_node::clone_referring): Clone speculative_id.
(symtab_node::clone_reference): Clone speculative_id.
(symtab_node::clear_stmts_in_references): Clear speculative_id.
* tree-inline.c (copy_bb): Duplicate all the speculative edges
if indirect call contains multiple speculative targets.
* value-prof.h (check_ic_target): Remove.
* value-prof.c (gimple_value_profile_transformations):
Use void function gimple_ic_transform.
* value-prof.c (gimple_ic_transform): Handle topn case.
Fix comment typos. Change it to a void function.
gcc/testsuite/ChangeLog
2020-01-14 Xiong Hu Luo <luoxhu@linux.ibm.com>
PR ipa/69678
* gcc.dg/tree-prof/indir-call-prof-topn.c: New testcase.
* gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c: New testcase.
* gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c: New testcase.
* gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c: New testcase.
* lib/scandump.exp: Dump executable file name.
* lib/scanwpaipa.exp: New scan-pgo-wap-ipa-dump.
|
|
2020-01-09 Martin Jambor <mjambor@suse.cz>
* cgraph.h (cgraph_edge): Make remove, set_call_stmt, make_direct,
resolve_speculation and redirect_call_stmt_to_callee static. Change
return type of set_call_stmt to cgraph_edge *.
* auto-profile.c (afdo_indirect_call): Adjust call to
redirect_call_stmt_to_callee.
* cgraph.c (cgraph_edge::set_call_stmt): Make return cgraph-edge *,
make the this pointer explicit, adjust self-recursive calls and the
call top make_direct. Return the resulting edge.
(cgraph_edge::remove): Make this pointer explicit.
(cgraph_edge::resolve_speculation): Likewise, adjust call to remove.
(cgraph_edge::make_direct): Likewise, adjust call to
resolve_speculation.
(cgraph_edge::redirect_call_stmt_to_callee): Likewise, also adjust
call to set_call_stmt.
(cgraph_update_edges_for_call_stmt_node): Update call to
set_call_stmt and remove.
* cgraphclones.c (cgraph_node::set_call_stmt_including_clones):
Renamed edge to master_edge. Adjusted calls to set_call_stmt.
(cgraph_node::create_edge_including_clones): Moved "first" definition
of edge to the block where it was used. Adjusted calls to
set_call_stmt.
(cgraph_node::remove_symbol_and_inline_clones): Adjust call to
cgraph_edge::remove.
* cgraphunit.c (walk_polymorphic_call_targets): Adjusted calls to
make_direct and redirect_call_stmt_to_callee.
* ipa-fnsummary.c (redirect_to_unreachable): Adjust calls to
resolve_speculation and make_direct.
* ipa-inline-transform.c (inline_transform): Adjust call to
redirect_call_stmt_to_callee.
(check_speculations_1):: Adjust call to resolve_speculation.
* ipa-inline.c (resolve_noninline_speculation): Adjust call to
resolve-speculation.
(inline_small_functions): Adjust call to resolve_speculation.
(ipa_inline): Likewise.
* ipa-prop.c (ipa_make_edge_direct_to_target): Adjust call to
make_direct.
* ipa-visibility.c (function_and_variable_visibility): Make iteration
safe with regards to edge removal, adjust calls to
redirect_call_stmt_to_callee.
* ipa.c (walk_polymorphic_call_targets): Adjust calls to make_direct
and redirect_call_stmt_to_callee.
* multiple_target.c (create_dispatcher_calls): Adjust call to
redirect_call_stmt_to_callee
(redirect_to_specific_clone): Likewise.
* tree-cfgcleanup.c (delete_unreachable_blocks_update_callgraph):
Adjust calls to cgraph_edge::remove.
* tree-inline.c (copy_bb): Adjust call to set_call_stmt.
(redirect_all_calls): Adjust call to redirect_call_stmt_to_callee.
(expand_call_inline): Adjust call to cgraph_edge::remove.
From-SVN: r280043
|
|
2020-01-08 Martin Liska <mliska@suse.cz>
* cgraph.c (cgraph_node::dump): Use ::dump_name or
::dump_asm_name instead of (::name or ::asm_name).
* cgraphclones.c (symbol_table::materialize_all_clones): Likewise.
* cgraphunit.c (walk_polymorphic_call_targets): Likewise.
(analyze_functions): Likewise.
(expand_all_functions): Likewise.
* ipa-cp.c (ipcp_cloning_candidate_p): Likewise.
(propagate_bits_across_jump_function): Likewise.
(dump_profile_updates): Likewise.
(ipcp_store_bits_results): Likewise.
(ipcp_store_vr_results): Likewise.
* ipa-devirt.c (dump_targets): Likewise.
* ipa-fnsummary.c (analyze_function_body): Likewise.
* ipa-hsa.c (check_warn_node_versionable): Likewise.
(process_hsa_functions): Likewise.
* ipa-icf.c (sem_item_optimizer::merge_classes): Likewise.
(set_alias_uids): Likewise.
* ipa-inline-transform.c (save_inline_function_body): Likewise.
* ipa-inline.c (recursive_inlining): Likewise.
(inline_to_all_callers_1): Likewise.
(ipa_inline): Likewise.
* ipa-profile.c (ipa_propagate_frequency_1): Likewise.
(ipa_propagate_frequency): Likewise.
* ipa-prop.c (ipa_make_edge_direct_to_target): Likewise.
(remove_described_reference): Likewise.
* ipa-pure-const.c (worse_state): Likewise.
(check_retval_uses): Likewise.
(analyze_function): Likewise.
(propagate_pure_const): Likewise.
(propagate_nothrow): Likewise.
(dump_malloc_lattice): Likewise.
(propagate_malloc): Likewise.
(pass_local_pure_const::execute): Likewise.
* ipa-visibility.c (optimize_weakref): Likewise.
(function_and_variable_visibility): Likewise.
* ipa.c (symbol_table::remove_unreachable_nodes): Likewise.
(ipa_discover_variable_flags): Likewise.
* lto-streamer-out.c (output_function): Likewise.
(output_constructor): Likewise.
* tree-inline.c (copy_bb): Likewise.
* tree-ssa-structalias.c (ipa_pta_execute): Likewise.
* varpool.c (symbol_table::remove_unreferenced_decls): Likewise.
2020-01-08 Martin Liska <mliska@suse.cz>
* lto-partition.c (add_symbol_to_partition_1): Use ::dump_name or
::dump_asm_name instead of (::name or ::asm_name).
(lto_balanced_map): Likewise.
(promote_symbol): Likewise.
(rename_statics): Likewise.
* lto.c (lto_wpa_write_files): Likewise.
2020-01-08 Martin Liska <mliska@suse.cz>
* gcc.dg/ipa/ipa-icf-1.c: Update expected scanned output.
* gcc.dg/ipa/ipa-icf-10.c: Likewise.
* gcc.dg/ipa/ipa-icf-11.c: Likewise.
* gcc.dg/ipa/ipa-icf-12.c: Likewise.
* gcc.dg/ipa/ipa-icf-13.c: Likewise.
* gcc.dg/ipa/ipa-icf-16.c: Likewise.
* gcc.dg/ipa/ipa-icf-18.c: Likewise.
* gcc.dg/ipa/ipa-icf-2.c: Likewise.
* gcc.dg/ipa/ipa-icf-20.c: Likewise.
* gcc.dg/ipa/ipa-icf-21.c: Likewise.
* gcc.dg/ipa/ipa-icf-23.c: Likewise.
* gcc.dg/ipa/ipa-icf-25.c: Likewise.
* gcc.dg/ipa/ipa-icf-26.c: Likewise.
* gcc.dg/ipa/ipa-icf-27.c: Likewise.
* gcc.dg/ipa/ipa-icf-3.c: Likewise.
* gcc.dg/ipa/ipa-icf-35.c: Likewise.
* gcc.dg/ipa/ipa-icf-36.c: Likewise.
* gcc.dg/ipa/ipa-icf-37.c: Likewise.
* gcc.dg/ipa/ipa-icf-38.c: Likewise.
* gcc.dg/ipa/ipa-icf-5.c: Likewise.
* gcc.dg/ipa/ipa-icf-7.c: Likewise.
* gcc.dg/ipa/ipa-icf-8.c: Likewise.
* gcc.dg/ipa/ipa-icf-merge-1.c: Likewise.
* gcc.dg/ipa/pr64307.c: Likewise.
* gcc.dg/ipa/pr90555.c: Likewise.
* gcc.dg/ipa/propmalloc-1.c: Likewise.
* gcc.dg/ipa/propmalloc-2.c: Likewise.
* gcc.dg/ipa/propmalloc-3.c: Likewise.
From-SVN: r280009
|
|
2020-01-07 Martin Liska <mliska@suse.cz>
PR tree-optimization/92860
* common.opt: Make in Optimization option
as it is affected by -O0, which is an Optimization
option.
* tree-inline.c (tree_inlinable_function_p):
Use opt_for_fn for warn_inline.
(expand_call_inline): Likewise.
2020-01-07 Martin Liska <mliska@suse.cz>
PR tree-optimization/92860
* gcc.dg/pr92860-2.c: New test.
From-SVN: r279947
|
|
From-SVN: r279813
|
|
when compiled with GCC compared to Clang)
2019-11-28 Richard Biener <rguenther@suse.de>
PR tree-optimization/92645
* tree-inline.c (remap_gimple_stmt): When the return value
is not wanted, elide GIMPLE_RETURN.
* gcc.dg/tree-ssa/inline-12.c: New testcase.
From-SVN: r278807
|
|
2019-11-27 Richard Biener <rguenther@suse.de>
PR middle-end/92674
* tree-inline.c (expand_call_inline): Delay purging EH/abnormal
edges and instead record blocks in bitmap.
(gimple_expand_calls_inline): Adjust.
(fold_marked_statements): Delay EH cleanup until all folding is
done.
(optimize_inline_calls): Do EH/abnormal cleanup for calls after
inlining finished.
From-SVN: r278757
|