Age | Commit message (Collapse) | Author | Files | Lines |
|
incompatible types in PHI argument 0))
2018-11-05 Richard Biener <rguenther@suse.de>
PR tree-optimization/87873
* tree-ssa-loop-manip.h (split_loop_exit_edge): Add copy_constants_p
argument.
* tree-ssa-loop-manip.c (split_loop_exit_edge): Likewise.
* tree-vect-loop.c (vect_transform_loop): When splitting the
loop exit also create forwarder PHIs for constants.
* tree-vect-loop-manip.c (slpeel_duplicate_current_defs_from_edges):
Handle constant to_arg, add extra checking we match up the correct
PHIs.
* gcc.dg/pr87873.c: New testcase.
From-SVN: r265812
|
|
This patch introduces a verbosity level to dump messages:
"user-facing" vs "internals".
By default, messages at the top-level dump scope are "user-facing",
whereas those that are in nested scopes are implicitly "internals",
and are filtered out by -fopt-info unless a new "-internals" sub-option
of "-fopt-info" is supplied (intended purely for use by GCC developers).
Dumpfiles are unaffected by the change.
Given that the vectorizer is the only subsystem using AUTO_DUMP_SCOPE
(via DUMP_VECT_SCOPE), this only affects the vectorizer.
Filtering out these implementation-detail messages goes a long way
towards making -fopt-info-vec-all more accessible to advanced end-users;
the follow-up patch restores the most pertinent missing details.
gcc/ChangeLog:
* doc/invoke.texi (-fopt-info): Document new "internals"
sub-option.
* dump-context.h (dump_context::apply_dump_filter_p): New decl.
* dumpfile.c (dump_options): Update for renaming of MSG_ALL to
MSG_ALL_KINDS.
(optinfo_verbosity_options): Add "internals".
(kind_as_string): Update for renaming of MSG_ALL to MSG_ALL_KINDS.
(dump_context::apply_dump_filter_p): New member function.
(dump_context::dump_loc): Use apply_dump_filter_p rather than
explicitly masking the dump_kind.
(dump_context::begin_scope): Increment the scope depth first. Use
apply_dump_filter_p rather than explicitly masking the dump_kind.
(dump_context::emit_item): Use apply_dump_filter_p rather than
explicitly masking the dump_kind.
(dump_dec): Likewise.
(dump_hex): Likewise.
(dump_switch_p_1): Default to MSG_ALL_PRIORITIES.
(opt_info_switch_p_1): Default to MSG_PRIORITY_USER_FACING.
(opt_info_switch_p): Update handling of default
MSG_OPTIMIZED_LOCATIONS to cope with default of
MSG_PRIORITY_USER_FACING.
(dump_basic_block): Use apply_dump_filter_p rather than explicitly
masking the dump_kind.
(selftest::test_capture_of_dump_calls): Update test_dump_context
instances to use MSG_ALL_KINDS | MSG_PRIORITY_USER_FACING rather
than MSG_ALL. Generalize scope test to be run at all four
combinations of with/without MSG_PRIORITY_USER_FACING and
MSG_PRIORITY_INTERNALS, adding examples of explicit priority
for each of the two values.
* dumpfile.h (enum dump_flag): Add comment about the MSG_* flags.
Rename MSG_ALL to MSG_ALL_KINDS. Add MSG_PRIORITY_USER_FACING,
MSG_PRIORITY_INTERNALS, and MSG_ALL_PRIORITIES, updating the
values for TDF_COMPARE_DEBUG and TDF_ALL_VALUES.
(AUTO_DUMP_SCOPE): Add a note to the comment about the interaction
with MSG_PRIORITY_*.
* tree-vect-loop-manip.c (vect_loop_versioning): Mark versioning
dump messages as MSG_PRIORITY_USER_FACING.
* tree-vectorizer.h (DUMP_VECT_SCOPE): Add a note to the comment
about the interaction with MSG_PRIORITY_*.
gcc/testsuite/ChangeLog:
* gcc.dg/plugin/dump-1.c: Update expected output for test_scopes
due to "-internals" not being selected.
* gcc.dg/plugin/dump-2.c: New test, based on dump-1.c, with
"-internals" added to re-enable the output from test_scopes.
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add dump-2.c.
From-SVN: r264851
|
|
As promised at Cauldron, this patch uses %T and %G with dump_printf and
dump_printf_loc calls to eliminate calls to
dump_generic_expr (MSG_*, arg, TDF_SLIM) (via %T)
and
dump_gimple_stmt (MSG_*, TDF_SLIM, stmt, 0) (via %G)
throughout the middle-end, simplifying numerous dump callsites.
A few calls to these functions didn't match the above pattern; I didn't
touch these. I wasn't able to use %E anywhere.
gcc/ChangeLog:
* tree-data-ref.c (runtime_alias_check_p): Use formatted printing
with %T in place of calls to dump_generic_expr.
(prune_runtime_alias_test_list): Likewise.
(create_runtime_alias_checks): Likewise.
* tree-vect-data-refs.c (vect_check_nonzero_value): Likewise.
(vect_analyze_data_ref_dependence): Likewise.
(vect_slp_analyze_data_ref_dependence): Likewise.
(vect_record_base_alignment): Likewise. Use %G in place of call
to dump_gimple_stmt.
(vect_compute_data_ref_alignment): Likewise.
(verify_data_ref_alignment): Likewise.
(vect_find_same_alignment_drs): Likewise.
(vect_analyze_group_access_1): Likewise.
(vect_analyze_data_ref_accesses): Likewise.
(dependence_distance_ge_vf): Likewise.
(dump_lower_bound): Likewise.
(vect_prune_runtime_alias_test_list): Likewise.
(vect_find_stmt_data_reference): Likewise.
(vect_analyze_data_refs): Likewise.
(vect_create_addr_base_for_vector_ref): Likewise.
(vect_create_data_ref_ptr): Likewise.
* tree-vect-loop-manip.c (vect_set_loop_condition): Likewise.
(vect_can_advance_ivs_p): Likewise.
(vect_update_ivs_after_vectorizer): Likewise.
(vect_gen_prolog_loop_niters): Likewise.
(vect_prepare_for_masked_peels): Likewise.
* tree-vect-loop.c (vect_determine_vf_for_stmt): Likewise.
(vect_determine_vectorization_factor): Likewise.
(vect_is_simple_iv_evolution): Likewise.
(vect_analyze_scalar_cycles_1): Likewise.
(vect_analyze_loop_operations): Likewise.
(report_vect_op): Likewise.
(vect_is_slp_reduction): Likewise.
(check_reduction_path): Likewise.
(vect_is_simple_reduction): Likewise.
(vect_create_epilog_for_reduction): Likewise.
(vect_finalize_reduction:): Likewise.
(vectorizable_induction): Likewise.
(vect_transform_loop_stmt): Likewise.
(vect_transform_loop): Likewise.
(optimize_mask_stores): Likewise.
* tree-vect-patterns.c (vect_pattern_detected): Likewise.
(vect_split_statement): Likewise.
(vect_recog_over_widening_pattern): Likewise.
(vect_recog_average_pattern): Likewise.
(vect_determine_min_output_precision_1): Likewise.
(vect_determine_precisions_from_range): Likewise.
(vect_determine_precisions_from_users): Likewise.
(vect_mark_pattern_stmts): Likewise.
(vect_pattern_recog_1): Likewise.
* tree-vect-slp.c (vect_get_and_check_slp_defs): Likewise.
(vect_record_max_nunits): Likewise.
(vect_build_slp_tree_1): Likewise.
(vect_build_slp_tree_2): Likewise.
(vect_print_slp_tree): Likewise.
(vect_analyze_slp_instance): Likewise.
(vect_detect_hybrid_slp_stmts): Likewise.
(vect_detect_hybrid_slp_1): Likewise.
(vect_slp_analyze_operations): Likewise.
(vect_slp_analyze_bb_1): Likewise.
(vect_transform_slp_perm_load): Likewise.
(vect_schedule_slp_instance): Likewise.
* tree-vect-stmts.c (vect_mark_relevant): Likewise.
(vect_mark_stmts_to_be_vectorized): Likewise.
(vect_init_vector_1): Likewise.
(vect_get_vec_def_for_operand): Likewise.
(vect_finish_stmt_generation_1): Likewise.
(vect_check_load_store_mask): Likewise.
(vectorizable_call): Likewise.
(vectorizable_conversion): Likewise.
(vectorizable_operation): Likewise.
(vectorizable_load): Likewise.
(vect_analyze_stmt): Likewise.
(vect_is_simple_use): Likewise.
(vect_get_vector_types_for_stmt): Likewise.
(vect_get_mask_type_for_stmt): Likewise.
* tree-vectorizer.c (increase_alignment): Likewise.
From-SVN: r264424
|
|
This patch adds a new helper function for permanently removing a
statement and its associated stmt_vec_info.
2018-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (vec_info::remove_stmt): Declare.
* tree-vectorizer.c (vec_info::remove_stmt): New function.
* tree-vect-loop-manip.c (vect_set_loop_condition): Use it.
* tree-vect-loop.c (vect_transform_loop): Likewise.
* tree-vect-slp.c (vect_schedule_slp): Likewise.
* tree-vect-stmts.c (vect_remove_stores): Likewise.
From-SVN: r263156
|
|
This patch replaces DR_VECT_AUX and vect_dr_stmt with a new
vec_info::lookup_dr function, so that the lookup is relative
to a particular vec_info rather than to global state.
2018-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (vec_info::lookup_dr): New member function.
(vect_dr_stmt): Delete.
* tree-vectorizer.c (vec_info::lookup_dr): New function.
* tree-vect-loop-manip.c (vect_update_inits_of_drs): Use it instead
of DR_VECT_AUX.
* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr)
(vect_analyze_data_ref_dependence, vect_record_base_alignments)
(vect_verify_datarefs_alignment, vect_peeling_supportable)
(vect_analyze_data_ref_accesses, vect_prune_runtime_alias_test_list)
(vect_analyze_data_refs): Likewise.
(vect_slp_analyze_data_ref_dependence): Likewise. Take a vec_info
argument.
(vect_find_same_alignment_drs): Likewise.
(vect_slp_analyze_node_dependences): Update calls accordingly.
(vect_analyze_data_refs_alignment): Likewise. Use vec_info::lookup_dr
instead of DR_VECT_AUX.
(vect_get_peeling_costs_all_drs): Take a loop_vec_info instead
of a vector data references. Use vec_info::lookup_dr instead of
DR_VECT_AUX.
(vect_peeling_hash_get_lowest_cost): Update calls accordingly.
(vect_enhance_data_refs_alignment): Likewise. Use vec_info::lookup_dr
instead of DR_VECT_AUX.
From-SVN: r263155
|
|
After previous changes, it makes more sense for STMT_VINFO_UNALIGNED_DR
to be dr_vec_info rather than a data_reference.
2018-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (_loop_vec_info::unaligned_dr): Change to
dr_vec_info.
* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Update
accordingly.
* tree-vect-loop.c (vect_analyze_loop_2): Likewise.
* tree-vect-loop-manip.c (get_misalign_in_elems): Likewise.
(vect_gen_prolog_loop_niters): Likewise.
From-SVN: r263154
|
|
This patch makes various routines (mostly in tree-vect-data-refs.c)
take dr_vec_infos rather than data_references. The affected routines
are really dealing with the way that an access is going to vectorised,
rather than with the original scalar access described by the
data_reference.
2018-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (set_dr_misalignment, dr_misalignment)
(DR_TARGET_ALIGNMENT, aligned_access_p, known_alignment_for_access_p)
(vect_known_alignment_in_bytes, vect_dr_behavior)
(vect_get_scalar_dr_size): Take references as dr_vec_infos
instead of data_references. Update calls to other routines for
which the same change has been made.
* tree-vect-data-refs.c (vect_preserves_scalar_order_p): Take
dr_vec_infos instead of stmt_vec_infos.
(vect_analyze_data_ref_dependence): Update call accordingly.
(vect_slp_analyze_data_ref_dependence)
(vect_record_base_alignments): Use DR_VECT_AUX.
(vect_calculate_target_alignment, vect_compute_data_ref_alignment)
(vect_update_misalignment_for_peel, verify_data_ref_alignment)
(vector_alignment_reachable_p, vect_get_data_access_cost)
(vect_peeling_supportable, vect_analyze_group_access_1)
(vect_analyze_group_access, vect_analyze_data_ref_access)
(vect_vfa_segment_size, vect_vfa_access_size, vect_vfa_align)
(vect_compile_time_alias, vect_small_gap_p)
(vectorizable_with_step_bound_p, vect_duplicate_ssa_name_ptr_info):
(vect_supportable_dr_alignment): Take references as dr_vec_infos
instead of data_references. Update calls to other routines for
which the same change has been made.
(vect_verify_datarefs_alignment, vect_get_peeling_costs_all_drs)
(vect_find_same_alignment_drs, vect_analyze_data_refs_alignment)
(vect_slp_analyze_and_verify_node_alignment)
(vect_analyze_data_ref_accesses, vect_prune_runtime_alias_test_list)
(vect_create_addr_base_for_vector_ref, vect_create_data_ref_ptr)
(vect_setup_realignment): Use dr_vec_infos. Update calls after
above changes.
(_vect_peel_info::dr): Replace with...
(_vect_peel_info::dr_info): ...this new field.
(vect_peeling_hash_get_most_frequent)
(vect_peeling_hash_choose_best_peeling): Update accordingly.
(vect_peeling_hash_get_lowest_cost):
(vect_enhance_data_refs_alignment): Likewise. Update calls to other
routines for which the same change has been made.
(vect_peeling_hash_insert): Likewise. Take a dr_vec_info instead of a
data_reference.
* tree-vect-loop-manip.c (get_misalign_in_elems)
(vect_gen_prolog_loop_niters): Use dr_vec_infos. Update calls after
above changes.
* tree-vect-loop.c (vect_analyze_loop_2): Likewise.
* tree-vect-stmts.c (vect_get_store_cost, vect_get_load_cost)
(vect_truncate_gather_scatter_offset, compare_step_with_zero)
(get_group_load_store_type, get_negative_load_store_type)
(vect_get_data_ptr_increment, vectorizable_store)
(vectorizable_load): Likewise.
(ensure_base_align): Take a dr_vec_info instead of a data_reference.
Update calls to other routines for which the same change has been made.
From-SVN: r263153
|
|
This first (less mechanical) part handles cases that involve changes in
the callers or non-trivial changes in the functions themselves.
2018-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-data-refs.c (vect_describe_gather_scatter_call): Take
a stmt_vec_info instead of a gcall.
(vect_check_gather_scatter): Update call accordingly.
* tree-vect-loop-manip.c (iv_phi_p): Take a stmt_vec_info instead
of a gphi.
(vect_can_advance_ivs_p, vect_update_ivs_after_vectorizer)
(slpeel_update_phi_nodes_for_loops):): Update calls accordingly.
* tree-vect-loop.c (vect_transform_loop_stmt): Take a stmt_vec_info
instead of a gimple stmt.
(vect_transform_loop): Update calls accordingly.
* tree-vect-slp.c (vect_split_slp_store_group): Take and return
stmt_vec_infos instead of gimple stmts.
(vect_analyze_slp_instance): Update use accordingly.
* tree-vect-stmts.c (read_vector_array, write_vector_array)
(vect_clobber_variable, vect_stmt_relevant_p, permute_vec_elements)
(vect_use_strided_gather_scatters_p, vect_build_all_ones_mask)
(vect_build_zero_merge_argument, vect_get_gather_scatter_ops)
(vect_gen_widened_results_half, vect_get_loop_based_defs)
(vect_create_vectorized_promotion_stmts, can_vectorize_live_stmts):
Take a stmt_vec_info instead of a gimple stmt and pass stmt_vec_infos
down to subroutines.
From-SVN: r263146
|
|
This first part makes functions use stmt_vec_infos instead of
gimple stmts in cases where the stmt_vec_info was already available
and where the change is mechanical. Most of it is just replacing
"stmt" with "stmt_info".
2018-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-data-refs.c (vect_slp_analyze_node_dependences):
(vect_check_gather_scatter, vect_create_data_ref_ptr, bump_vector_ptr)
(vect_permute_store_chain, vect_setup_realignment)
(vect_permute_load_chain, vect_shift_permute_load_chain)
(vect_transform_grouped_load): Use stmt_vec_info rather than gimple
stmts internally, and when passing values to other vectorizer routines.
* tree-vect-loop-manip.c (vect_can_advance_ivs_p): Likewise.
* tree-vect-loop.c (vect_analyze_scalar_cycles_1)
(vect_analyze_loop_operations, get_initial_def_for_reduction)
(vect_create_epilog_for_reduction, vectorize_fold_left_reduction)
(vectorizable_reduction, vectorizable_induction)
(vectorizable_live_operation, vect_transform_loop_stmt)
(vect_transform_loop): Likewise.
* tree-vect-patterns.c (vect_reassociating_reduction_p)
(vect_recog_widen_op_pattern, vect_recog_mixed_size_cond_pattern)
(vect_recog_bool_pattern, vect_recog_gather_scatter_pattern): Likewise.
* tree-vect-slp.c (vect_analyze_slp_instance): Likewise.
(vect_slp_analyze_node_operations_1): Likewise.
* tree-vect-stmts.c (vect_mark_relevant, process_use)
(exist_non_indexing_operands_for_use_p, vect_init_vector_1)
(vect_mark_stmts_to_be_vectorized, vect_get_vec_def_for_operand)
(vect_finish_stmt_generation_1, get_group_load_store_type)
(get_load_store_type, vect_build_gather_load_calls)
(vectorizable_bswap, vectorizable_call, vectorizable_simd_clone_call)
(vect_create_vectorized_demotion_stmts, vectorizable_conversion)
(vectorizable_assignment, vectorizable_shift, vectorizable_operation)
(vectorizable_store, vectorizable_load, vectorizable_condition)
(vectorizable_comparison, vect_analyze_stmt, vect_transform_stmt)
(supportable_widening_operation): Likewise.
(vect_get_vector_types_for_stmt): Likewise.
* tree-vectorizer.h (vect_dr_behavior): Likewise.
From-SVN: r263143
|
|
Various places called vect_dr_stmt or vinfo_for_stmt multiple times
on the same input. This patch makes them reuse the earlier result.
It also splits a couple of single vinfo_for_stmt calls out into
separate statements so that they can be reused in later patches.
2018-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-data-refs.c (vect_analyze_data_ref_dependence)
(vect_slp_analyze_node_dependences, vect_analyze_data_ref_accesses)
(vect_permute_store_chain, vect_permute_load_chain)
(vect_shift_permute_load_chain, vect_transform_grouped_load): Avoid
repeated stmt_vec_info lookups.
* tree-vect-loop-manip.c (vect_can_advance_ivs_p): Likewise.
(vect_update_ivs_after_vectorizer): Likewise.
* tree-vect-loop.c (vect_is_simple_reduction): Likewise.
(vect_create_epilog_for_reduction, vectorizable_reduction): Likewise.
* tree-vect-patterns.c (adjust_bool_stmts): Likewise.
* tree-vect-slp.c (vect_analyze_slp_instance): Likewise.
(vect_bb_slp_scalar_cost): Likewise.
* tree-vect-stmts.c (get_group_alias_ptr_type): Likewise.
From-SVN: r263142
|
|
This patch changes LOOP_VINFO_MAY_MISALIGN_STMTS from an
auto_vec<gimple *> to an auto_vec<stmt_vec_info>.
2018-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (_loop_vec_info::may_misalign_stmts): Change
from an auto_vec<gimple *> to an auto_vec<stmt_vec_info>.
* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Update
accordingly.
* tree-vect-loop-manip.c (vect_create_cond_for_align_checks): Likewise.
From-SVN: r263138
|
|
This patch makes vect_dr_stmt return a stmt_vec_info instead of a
gimple stmt. Rather than retain a separate gimple stmt variable
in cases where both existed, the patch replaces uses of the gimple
variable with the uses of the stmt_vec_info. Later patches do this
more generally.
Many things that are keyed off a data_reference would these days
be better keyed off a stmt_vec_info, but it's more convenient
to do that later in the series. The vect_dr_size calls that are
left over do still benefit from this patch.
2018-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (vect_dr_stmt): Return a stmt_vec_info rather
than a gimple stmt.
* tree-vect-data-refs.c (vect_analyze_data_ref_dependence)
(vect_slp_analyze_data_ref_dependence, vect_record_base_alignments)
(vect_calculate_target_alignmentm, vect_compute_data_ref_alignment)
(vect_update_misalignment_for_peel, vect_verify_datarefs_alignment)
(vector_alignment_reachable_p, vect_get_data_access_cost)
(vect_get_peeling_costs_all_drs, vect_peeling_hash_get_lowest_cost)
(vect_peeling_supportable, vect_enhance_data_refs_alignment)
(vect_find_same_alignment_drs, vect_analyze_data_refs_alignment)
(vect_analyze_group_access_1, vect_analyze_group_access)
(vect_analyze_data_ref_access, vect_analyze_data_ref_accesses)
(vect_vfa_access_size, vect_small_gap_p, vect_analyze_data_refs)
(vect_supportable_dr_alignment): Remove vinfo_for_stmt from the
result of vect_dr_stmt and use the stmt_vec_info instead of
the associated gimple stmt.
* tree-vect-loop-manip.c (get_misalign_in_elems): Likewise.
(vect_gen_prolog_loop_niters): Likewise.
* tree-vect-loop.c (vect_analyze_loop_2): Likewise.
From-SVN: r263134
|
|
This patch turns stmt_vec_info into an unspeakably bad wrapper class
and adds an implicit conversion to the associated gimple stmt.
Having this conversion makes the rest of the series easier to write,
but since the class goes away again at the end of the series, I've
not bothered adding any comments or tried to make it pretty.
2018-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (stmt_vec_info): Temporarily change from
a typedef to a wrapper class.
(NULL_STMT_VEC_INFO): New macro.
(vec_info::stmt_infos): Change to vec<stmt_vec_info>.
(stmt_vec_info::operator*): New function.
(stmt_vec_info::operator gimple *): Likewise.
(set_vinfo_for_stmt): Use NULL_STMT_VEC_INFO.
(add_stmt_costs): Likewise.
* tree-vect-loop-manip.c (iv_phi_p): Likewise.
* tree-vect-loop.c (vect_compute_single_scalar_iteration_cost)
(vect_get_known_peeling_cost): Likewise.
(vect_estimate_min_profitable_iters): Likewise.
* tree-vect-patterns.c (vect_init_pattern_stmt): Likewise.
* tree-vect-slp.c (vect_remove_slp_scalar_calls): Likewise.
* tree-vect-stmts.c (vect_build_gather_load_calls): Likewise.
(vectorizable_store, free_stmt_vec_infos): Likewise.
(new_stmt_vec_info): Change return type of xcalloc to
_stmt_vec_info *.
From-SVN: r263125
|
|
vect_recog_rotate_pattern had code to prevent operations
on invariants being vectorised unnecessarily:
if (dt == vect_external_def
&& TREE_CODE (oprnd1) == SSA_NAME
&& is_a <loop_vec_info> (vinfo))
{
struct loop *loop = as_a <loop_vec_info> (vinfo)->loop;
ext_def = loop_preheader_edge (loop);
if (!SSA_NAME_IS_DEFAULT_DEF (oprnd1))
{
basic_block bb = gimple_bb (SSA_NAME_DEF_STMT (oprnd1));
if (bb == NULL
|| !dominated_by_p (CDI_DOMINATORS, ext_def->dest, bb))
ext_def = NULL;
}
}
[..]
if (ext_def)
{
basic_block new_bb
= gsi_insert_on_edge_immediate (ext_def, def_stmt);
gcc_assert (!new_bb);
}
This patch reuses the same idea for casts of invariants created
during widening optimisations.
One hitch was that vect_loop_versioning asserted that the vector loop
preheader was still empty, although the cfg transformation it's doing
should be correct either way.
2018-06-30 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-patterns.c (vect_get_external_def_edge): New function,
split out from...
(vect_recog_rotate_pattern): ...here.
(vect_convert_input): Try to insert casts of invariants in the
preheader.
* tree-vect-loop-manip.c (vect_loop_versioning): Don't require the
preheader to be empty.
gcc/testsuite/
* gcc.dg/vect/vect-widen-mult-extern-1.c: New test.
From-SVN: r262277
|
|
gcc/ChangeLog:
* cfgloop.c (get_loop_location): Convert return type from
location_t to dump_user_location_t, replacing INSN_LOCATION lookups
by implicit construction from rtx_insn *, and using
dump_user_location_t::from_function_decl for the fallback case.
* cfgloop.h (get_loop_location): Convert return type from
location_t to dump_user_location_t.
* cgraphunit.c (walk_polymorphic_call_targets): Update call to
dump_printf_loc to pass in a dump_location_t rather than a
location_t, via the gimple stmt.
* coverage.c (get_coverage_counts): Update calls to
dump_printf_loc to pass in dump_location_t rather than a
location_t.
* doc/optinfo.texi (Dump types): Convert example of
dump_printf_loc from taking "locus" to taking "insn". Update
description of the "_loc" calls to cover dump_location_t.
* dumpfile.c: Include "backend.h", "gimple.h", "rtl.h", and
"selftest.h".
(dump_user_location_t::dump_user_location_t): New constructors,
from gimple *stmt and rtx_insn *.
(dump_user_location_t::from_function_decl): New function.
(dump_loc): Make static.
(dump_gimple_stmt_loc): Convert param "loc" from location_t to
const dump_location_t &.
(dump_generic_expr_loc): Delete.
(dump_printf_loc): Convert param "loc" from location_t to
const dump_location_t &.
(selftest::test_impl_location): New function.
(selftest::dumpfile_c_tests): New function.
* dumpfile.h: Include "profile-count.h".
(class dump_user_location_t): New class.
(struct dump_impl_location_t): New struct.
(class dump_location_t): New class.
(dump_printf_loc): Convert 2nd param from source_location to
const dump_location_t &.
(dump_generic_expr_loc): Delete.
(dump_gimple_stmt_loc): Convert 2nd param from source_location to
const dump_location_t &.
* gimple-fold.c (fold_gimple_assign): Update call to
dump_printf_loc to pass in a dump_location_t rather than a
location_t, via the gimple stmt.
(gimple_fold_call): Likewise.
* gimple-loop-interchange.cc
(loop_cand::analyze_iloop_reduction_var): Update for change to
check_reduction_path.
(tree_loop_interchange::interchange): Update for change to
find_loop_location.
* graphite-isl-ast-to-gimple.c (scop_to_isl_ast): Update for
change in return-type of find_loop_location.
(graphite_regenerate_ast_isl): Likewise.
* graphite-optimize-isl.c (optimize_isl): Likewise.
* graphite.c (graphite_transform_loops): Likewise.
* ipa-devirt.c (ipa_devirt): Update call to dump_printf_loc to
pass in a dump_location_t rather than a location_t, via the
gimple stmt.
* ipa-prop.c (ipa_make_edge_direct_to_target): Likewise.
* ipa.c (walk_polymorphic_call_targets): Likewise.
* loop-unroll.c (report_unroll): Convert "locus" param from
location_t to dump_location_t.
(decide_unrolling): Update for change to get_loop_location's
return type.
* omp-grid.c (struct grid_prop): Convert field "target_loc" from
location_t to dump_user_location_t.
(grid_find_single_omp_among_assignments_1): Updates calls to
dump_printf_loc to pass in a dump_location_t rather than a
location_t, via the gimple stmt.
(grid_parallel_clauses_gridifiable): Convert "tloc" from
location_t to dump_location_t. Updates calls to dump_printf_loc
to pass in a dump_location_t rather than a location_t, via the
gimple stmt.
(grid_inner_loop_gridifiable_p): Likewise.
(grid_dist_follows_simple_pattern): Likewise.
(grid_gfor_follows_tiling_pattern): Likewise.
(grid_target_follows_gridifiable_pattern): Likewise.
(grid_attempt_target_gridification): Convert initialization
of local "grid" from memset to zero-initialization; FIXME: does
this require C++11? Update call to dump_printf_loc to pass in a
optinfo_location rather than a location_t, via the gimple stmt.
* profile.c (read_profile_edge_counts): Updates call to
dump_printf_loc to pass in a dump_location_t rather than a
location_t
(compute_branch_probabilities): Likewise.
* selftest-run-tests.c (selftest::run_tests): Call
dumpfile_c_tests.
* selftest.h (dumpfile_c_tests): New decl.
* tree-loop-distribution.c (pass_loop_distribution::execute):
Update for change in return type of find_loop_location.
* tree-parloops.c (parallelize_loops): Likewise.
* tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Convert
"locus" from location_t to dump_user_location_t.
(canonicalize_loop_induction_variables): Likewise.
* tree-ssa-loop-ivopts.c (tree_ssa_iv_optimize_loop): Update
for change in return type of find_loop_location.
* tree-ssa-loop-niter.c (number_of_iterations_exit): Update call
to dump_printf_loc to pass in a dump_location_t rather than a
location_t, via the stmt.
* tree-ssa-sccvn.c (eliminate_dom_walker::before_dom_children):
Likewise.
* tree-vect-loop-manip.c (find_loop_location): Convert return
type from source_location to dump_user_location_t.
(vect_do_peeling): Update for above change.
(vect_loop_versioning): Update for change in type of
vect_location.
* tree-vect-loop.c (check_reduction_path): Convert "loc" param
from location_t to dump_user_location_t.
(vect_estimate_min_profitable_iters): Update for change in type
of vect_location.
* tree-vect-slp.c (vect_print_slp_tree): Convert param "loc" from
location_t to dump_location_t.
(vect_slp_bb): Update for change in type of vect_location.
* tree-vectorizer.c (vect_location): Convert from source_location
to dump_user_location_t.
(try_vectorize_loop_1): Update for change in vect_location's type.
(vectorize_loops): Likewise.
(increase_alignment): Likewise.
* tree-vectorizer.h (vect_location): Convert from source_location
to dump_user_location_t.
(find_loop_location): Convert return type from source_location to
dump_user_location_t.
(check_reduction_path): Convert 1st param from location_t to
dump_user_location_t.
* value-prof.c (check_counter): Update call to dump_printf_loc to
pass in a dump_user_location_t rather than a location_t; update
call to error_at for change in type of "locus".
(check_ic_target): Update call to dump_printf_loc to
pass in a dump_user_location_t rather than a location_t, via the
call_stmt.
From-SVN: r262149
|
|
2018-06-21 Richard Biener <rguenther@suse.de>
* tree-data-ref.c (dr_step_indicator): Handle NULL DR_STEP.
* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
Avoid calling vect_mark_for_runtime_alias_test with gathers or scatters.
(vect_analyze_data_ref_dependence): Re-order checks to deal with
NULL DR_STEP.
(vect_record_base_alignments): Do not record base alignment
for gathers or scatters.
(vect_compute_data_ref_alignment): Drop return value that is always
true. Bail out early for gathers or scatters.
(vect_enhance_data_refs_alignment): Bail out early for gathers
or scatters.
(vect_find_same_alignment_drs): Likewise.
(vect_analyze_data_refs_alignment): Remove dead code.
(vect_slp_analyze_and_verify_node_alignment): Likewise.
(vect_analyze_data_refs): For possible gathers or scatters do
not create an alternate DR, just check their possible validity
and mark them. Adjust DECL_NONALIASED handling to not rely
on DR_BASE_ADDRESS.
* tree-vect-loop-manip.c (vect_update_inits_of_drs): Do not
update inits of gathers or scatters.
* tree-vect-patterns.c (vect_recog_mask_conversion_pattern):
Also copy gather/scatter flag to pattern vinfo.
From-SVN: r261834
|
|
gcc/ChangeLog:
* tree-vect-data-refs.c (vect_analyze_data_ref_dependences):
Replace dump_printf_loc call with DUMP_VECT_SCOPE.
(vect_slp_analyze_instance_dependence): Likewise.
(vect_enhance_data_refs_alignment): Likewise.
(vect_analyze_data_refs_alignment): Likewise.
(vect_slp_analyze_and_verify_instance_alignment
(vect_analyze_data_ref_accesses): Likewise.
(vect_prune_runtime_alias_test_list): Likewise.
(vect_analyze_data_refs): Likewise.
* tree-vect-loop-manip.c (vect_update_inits_of_drs): Likewise.
* tree-vect-loop.c (vect_determine_vectorization_factor): Likewise.
(vect_analyze_scalar_cycles_1): Likewise.
(vect_get_loop_niters): Likewise.
(vect_analyze_loop_form_1): Likewise.
(vect_update_vf_for_slp): Likewise.
(vect_analyze_loop_operations): Likewise.
(vect_analyze_loop): Likewise.
(vectorizable_induction): Likewise.
(vect_transform_loop): Likewise.
* tree-vect-patterns.c (vect_pattern_recog): Likewise.
* tree-vect-slp.c (vect_analyze_slp): Likewise.
(vect_make_slp_decision): Likewise.
(vect_detect_hybrid_slp): Likewise.
(vect_slp_analyze_operations): Likewise.
(vect_slp_bb): Likewise.
* tree-vect-stmts.c (vect_mark_stmts_to_be_vectorized): Likewise.
(vectorizable_bswap): Likewise.
(vectorizable_call): Likewise.
(vectorizable_simd_clone_call): Likewise.
(vectorizable_conversion): Likewise.
(vectorizable_assignment): Likewise.
(vectorizable_shift): Likewise.
(vectorizable_operation): Likewise.
* tree-vectorizer.h (DUMP_VECT_SCOPE): New macro.
From-SVN: r261710
|
|
2018-06-01 Richard Biener <rguenther@suse.de>
* tree-vectorizer.h (vect_dr_stmt): New function.
(vect_get_load_cost): Adjust.
(vect_get_store_cost): Likewise.
* tree-vect-data-refs.c (vect_analyze_data_ref_dependence):
Use vect_dr_stmt instead of DR_SMTT.
(vect_record_base_alignments): Likewise.
(vect_calculate_target_alignment): Likewise.
(vect_compute_data_ref_alignment): Likewise and make static.
(vect_update_misalignment_for_peel): Likewise.
(vect_verify_datarefs_alignment): Likewise.
(vector_alignment_reachable_p): Likewise.
(vect_get_data_access_cost): Likewise. Pass down
vinfo to vect_get_load_cost/vect_get_store_cost instead of DR.
(vect_get_peeling_costs_all_drs): Likewise.
(vect_peeling_hash_get_lowest_cost): Likewise.
(vect_enhance_data_refs_alignment): Likewise.
(vect_find_same_alignment_drs): Likewise.
(vect_analyze_data_refs_alignment): Likewise.
(vect_analyze_group_access_1): Likewise.
(vect_analyze_group_access): Likewise.
(vect_analyze_data_ref_access): Likewise.
(vect_analyze_data_ref_accesses): Likewise.
(vect_vfa_segment_size): Likewise.
(vect_small_gap_p): Likewise.
(vectorizable_with_step_bound_p): Likewise.
(vect_prune_runtime_alias_test_list): Likewise.
(vect_analyze_data_refs): Likewise.
(vect_supportable_dr_alignment): Likewise.
* tree-vect-loop-manip.c (get_misalign_in_elems): Likewise.
(vect_gen_prolog_loop_niters): Likewise.
* tree-vect-loop.c (vect_analyze_loop_2): Likewise.
* tree-vect-patterns.c (vect_recog_bool_pattern): Do not
modify DR_STMT.
(vect_recog_mask_conversion_pattern): Likewise.
(vect_try_gather_scatter_pattern): Likewise.
* tree-vect-stmts.c (vect_model_store_cost): Pass stmt_info
to vect_get_store_cost.
(vect_get_store_cost): Get stmt_info instead of DR.
(vect_model_load_cost): Pass stmt_info to vect_get_load_cost.
(vect_get_load_cost): Get stmt_info instead of DR.
From-SVN: r261062
|
|
PR c/84909
* c-warn.c (conversion_warning): Replace "to to" with "to" in
diagnostics.
* hsa-gen.c (mem_type_for_type): Fix comment typo.
* tree-vect-loop-manip.c (vect_create_cond_for_niters_checks):
Likewise.
* gimple-ssa-warn-restrict.c (builtin_memref::set_base_and_offset):
Likewise.
From-SVN: r258609
|
|
I hadn't realised that on big-endian targets, VEC_UNPACK*HI_EXPR unpacks
the low-numbered lanes and VEC_UNPACK*LO_EXPR unpacks the high-numbered
lanes. This meant that both the SVE patterns and the handling of
fully-masked loops were wrong.
The patch deals with that by making sure that all vec_unpack* optabs
are define_expands, using BYTES_BIG_ENDIAN to choose the appropriate
define_insn. This in turn meant that we can get rid of the duplication
between the signed and unsigned patterns for predicates. (We provide
implementations of both the signed and unsigned optabs because the sign
doesn't matter for predicates: every element contains only one
significant bit.)
Also, the float unpacks need to unpack one half of the input vector,
but the unpacked upper bits are "don't care". There are two obvious
ways of handling that: use an unpack (filling with zeros) or use a ZIP
(filling with a duplicate of the low bits). The code previously used
unpacks, but the sequence involved a subreg that is semantically an
element reverse on big-endian targets. Using the ZIP patterns avoids
that, and at the moment there's no reason to prefer one over the other
for performance reasons, so the patch switches to ZIP unconditionally.
As the comment says, it would be easy to optimise this later if UUNPK
turns out to be better for some implementations.
2018-03-13 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-vect-loop-manip.c (vect_maybe_permute_loop_masks):
Reverse the choice between VEC_UNPACK_LO_EXPR and VEC_UNPACK_HI_EXPR
for big-endian.
* config/aarch64/iterators.md (hi_lanes_optab): New int attribute.
* config/aarch64/aarch64-sve.md
(*aarch64_sve_<perm_insn><perm_hilo><mode>): Rename to...
(aarch64_sve_<perm_insn><perm_hilo><mode>): ...this.
(*extend<mode><Vwide>2): Rename to...
(aarch64_sve_extend<mode><Vwide>2): ...this.
(vec_unpack<su>_<perm_hilo>_<mode>): Turn into a define_expand,
renaming the old pattern to...
(aarch64_sve_punpk<perm_hilo>_<mode>): ...this. Only define
unsigned packs.
(vec_unpack<su>_<perm_hilo>_<SVE_BHSI:mode>): Turn into a
define_expand, renaming the old pattern to...
(aarch64_sve_<su>unpk<perm_hilo>_<SVE_BHSI:mode>): ...this.
(*vec_unpacku_<perm_hilo>_<mode>_no_convert): Delete.
(vec_unpacks_<perm_hilo>_<mode>): Take BYTES_BIG_ENDIAN into
account when deciding which SVE instruction the optab should use.
(vec_unpack<su_optab>_float_<perm_hilo>_vnx4si): Likewise.
gcc/testsuite/
* gcc.target/aarch64/sve/unpack_fcvt_signed_1.c: Expect zips rather
than unpacks.
* gcc.target/aarch64/sve/unpack_fcvt_unsigned_1.c: Likewise.
* gcc.target/aarch64/sve/unpack_float_1.c: Likewise.
From-SVN: r258489
|
|
since r256644)
2018-02-07 Richard Biener <rguenther@suse.de>
PR tree-optimization/84037
* tree-vectorizer.h (struct _loop_vec_info): Add ivexpr_map member.
(cse_and_gimplify_to_preheader): Declare.
(vect_get_place_in_interleaving_chain): Likewise.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
ivexpr_map.
(_loop_vec_info::~_loop_vec_info): Delete it.
(cse_and_gimplify_to_preheader): New function.
* tree-vect-slp.c (vect_get_place_in_interleaving_chain): Export.
* tree-vect-stmts.c (vectorizable_store): CSE base and steps.
(vectorizable_load): Likewise. For grouped stores always base
the IV on the first element.
* tree-vect-loop-manip.c (vect_loop_versioning): Unshare versioning
condition before gimplifying.
From-SVN: r257453
|
|
This patch adds runtime alias checks for loops with variable strides,
so that we can vectorise them even without a restrict qualifier.
There are several parts to doing this:
1) For accesses like:
x[i * n] += 1;
we need to check whether n (and thus the DR_STEP) is nonzero.
vect_analyze_data_ref_dependence records values that need to be
checked in this way, then prune_runtime_alias_test_list records a
bounds check on DR_STEP being outside the range [0, 0].
2) For accesses like:
x[i * n] = x[i * n + 1] + 1;
we simply need to test whether abs (n) >= 2.
prune_runtime_alias_test_list looks for cases like this and tries
to guess whether it is better to use this kind of check or a check
for non-overlapping ranges. (We could do an OR of the two conditions
at runtime, but that isn't implemented yet.)
3) Checks for overlapping ranges need to cope with variable strides.
At present the "length" of each segment in a range check is
represented as an offset from the base that lies outside the
touched range, in the same direction as DR_STEP. The length
can therefore be negative and is sometimes conservative.
With variable steps it's easier to reaon about if we split
this into two:
seg_len:
distance travelled from the first iteration of interest
to the last, e.g. DR_STEP * (VF - 1)
access_size:
the number of bytes accessed in each iteration
with access_size always being a positive constant and seg_len
possibly being variable. We can then combine alias checks
for two accesses that are a constant number of bytes apart by
adjusting the access size to account for the gap. This leaves
the segment length unchanged, which allows the check to be combined
with further accesses.
When seg_len is positive, the runtime alias check has the form:
base_a >= base_b + seg_len_b + access_size_b
|| base_b >= base_a + seg_len_a + access_size_a
In many accesses the base will be aligned to the access size, which
allows us to skip the addition:
base_a > base_b + seg_len_b
|| base_b > base_a + seg_len_a
A similar saving is possible with "negative" lengths.
The patch therefore tracks the alignment in addition to seg_len
and access_size.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vectorizer.h (vec_lower_bound): New structure.
(_loop_vec_info): Add check_nonzero and lower_bounds.
(LOOP_VINFO_CHECK_NONZERO): New macro.
(LOOP_VINFO_LOWER_BOUNDS): Likewise.
(LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Check lower_bounds too.
* tree-data-ref.h (dr_with_seg_len): Add access_size and align
fields. Make seg_len the distance travelled, not including the
access size.
(dr_direction_indicator): Declare.
(dr_zero_step_indicator): Likewise.
(dr_known_forward_stride_p): Likewise.
* tree-data-ref.c: Include stringpool.h, tree-vrp.h and
tree-ssanames.h.
(runtime_alias_check_p): Allow runtime alias checks with
variable strides.
(operator ==): Compare access_size and align.
(prune_runtime_alias_test_list): Rework for new distinction between
the access_size and seg_len.
(create_intersect_range_checks_index): Likewise. Cope with polynomial
segment lengths.
(get_segment_min_max): New function.
(create_intersect_range_checks): Use it.
(dr_step_indicator): New function.
(dr_direction_indicator): Likewise.
(dr_zero_step_indicator): Likewise.
(dr_known_forward_stride_p): Likewise.
* tree-loop-distribution.c (data_ref_segment_size): Return
DR_STEP * (niters - 1).
(compute_alias_check_pairs): Update call to the dr_with_seg_len
constructor.
* tree-vect-data-refs.c (vect_check_nonzero_value): New function.
(vect_preserves_scalar_order_p): New function, split out from...
(vect_analyze_data_ref_dependence): ...here. Check for zero steps.
(vect_vfa_segment_size): Return DR_STEP * (length_factor - 1).
(vect_vfa_access_size): New function.
(vect_vfa_align): Likewise.
(vect_compile_time_alias): Take access_size_a and access_b arguments.
(dump_lower_bound): New function.
(vect_check_lower_bound): Likewise.
(vect_small_gap_p): Likewise.
(vectorizable_with_step_bound_p): Likewise.
(vect_prune_runtime_alias_test_list): Ignore cross-iteration
depencies if the vectorization factor is 1. Convert the checks
for nonzero steps into checks on the bounds of DR_STEP. Try using
a bunds check for variable steps if the minimum required step is
relatively small. Update calls to the dr_with_seg_len
constructor and to vect_compile_time_alias.
* tree-vect-loop-manip.c (vect_create_cond_for_lower_bounds): New
function.
(vect_loop_versioning): Call it.
* tree-vect-loop.c (vect_analyze_loop_2): Clear LOOP_VINFO_LOWER_BOUNDS
when retrying.
(vect_estimate_min_profitable_iters): Account for any bounds checks.
gcc/testsuite/
* gcc.dg/vect/bb-slp-cond-1.c: Expect loop vectorization rather
than SLP vectorization.
* gcc.dg/vect/vect-alias-check-10.c: New test.
* gcc.dg/vect/vect-alias-check-11.c: Likewise.
* gcc.dg/vect/vect-alias-check-12.c: Likewise.
* gcc.dg/vect/vect-alias-check-8.c: Likewise.
* gcc.dg/vect/vect-alias-check-9.c: Likewise.
* gcc.target/aarch64/sve/strided_load_8.c: Likewise.
* gcc.target/aarch64/sve/var_stride_1.c: Likewise.
* gcc.target/aarch64/sve/var_stride_1.h: Likewise.
* gcc.target/aarch64/sve/var_stride_1_run.c: Likewise.
* gcc.target/aarch64/sve/var_stride_2.c: Likewise.
* gcc.target/aarch64/sve/var_stride_2_run.c: Likewise.
* gcc.target/aarch64/sve/var_stride_3.c: Likewise.
* gcc.target/aarch64/sve/var_stride_3_run.c: Likewise.
* gcc.target/aarch64/sve/var_stride_4.c: Likewise.
* gcc.target/aarch64/sve/var_stride_4_run.c: Likewise.
* gcc.target/aarch64/sve/var_stride_5.c: Likewise.
* gcc.target/aarch64/sve/var_stride_5_run.c: Likewise.
* gcc.target/aarch64/sve/var_stride_6.c: Likewise.
* gcc.target/aarch64/sve/var_stride_6_run.c: Likewise.
* gcc.target/aarch64/sve/var_stride_7.c: Likewise.
* gcc.target/aarch64/sve/var_stride_7_run.c: Likewise.
* gcc.target/aarch64/sve/var_stride_8.c: Likewise.
* gcc.target/aarch64/sve/var_stride_8_run.c: Likewise.
* gfortran.dg/vect/vect-alias-check-1.F90: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256644
|
|
This patch adds support for fully-masking loops that require peeling
for gaps. It peels exactly one scalar iteration and uses the masked
loop to handle the rest. Previously we would fall back on using a
standard unmasked loop instead.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vect-loop-manip.c (vect_gen_scalar_loop_niters): Replace
vfm1 with a bound_epilog parameter.
(vect_do_peeling): Update calls accordingly, and move the prologue
call earlier in the function. Treat the base bound_epilog as 0 for
fully-masked loops and retain vf - 1 for other loops. Add 1 to
this base when peeling for gaps.
* tree-vect-loop.c (vect_analyze_loop_2): Allow peeling for gaps
with fully-masked loops.
(vect_estimate_min_profitable_iters): Handle the single peeled
iteration in that case.
gcc/testsuite/
* gcc.target/aarch64/sve/struct_vect_18.c: Check the number
of branches.
* gcc.target/aarch64/sve/struct_vect_19.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_20.c: New test.
* gcc.target/aarch64/sve/struct_vect_20_run.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_21.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_21_run.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_22.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_22_run.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_23.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_23_run.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256635
|
|
This patch adds support for aligning vectors by using a partial
first iteration. E.g. if the start pointer is 3 elements beyond
an aligned address, the first iteration will have a mask in which
the first three elements are false.
On SVE, the optimisation is only useful for vector-length-specific
code. Vector-length-agnostic code doesn't try to align vectors
since the vector length might not be a power of 2.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vectorizer.h (_loop_vec_info::mask_skip_niters): New field.
(LOOP_VINFO_MASK_SKIP_NITERS): New macro.
(vect_use_loop_mask_for_alignment_p): New function.
(vect_prepare_for_masked_peels, vect_gen_while_not): Declare.
* tree-vect-loop-manip.c (vect_set_loop_masks_directly): Add an
niters_skip argument. Make sure that the first niters_skip elements
of the first iteration are inactive.
(vect_set_loop_condition_masked): Handle LOOP_VINFO_MASK_SKIP_NITERS.
Update call to vect_set_loop_masks_directly.
(get_misalign_in_elems): New function, split out from...
(vect_gen_prolog_loop_niters): ...here.
(vect_update_init_of_dr): Take a code argument that specifies whether
the adjustment should be added or subtracted.
(vect_update_init_of_drs): Likewise.
(vect_prepare_for_masked_peels): New function.
(vect_do_peeling): Skip prologue peeling if we're using a mask
instead. Update call to vect_update_inits_of_drs.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
mask_skip_niters.
(vect_analyze_loop_2): Allow fully-masked loops with peeling for
alignment. Do not include the number of peeled iterations in
the minimum threshold in that case.
(vectorizable_induction): Adjust the start value down by
LOOP_VINFO_MASK_SKIP_NITERS iterations.
(vect_transform_loop): Call vect_prepare_for_masked_peels.
Take the number of skipped iterations into account when calculating
the loop bounds.
* tree-vect-stmts.c (vect_gen_while_not): New function.
gcc/testsuite/
* gcc.target/aarch64/sve/nopeel_1.c: New test.
* gcc.target/aarch64/sve/peel_ind_1.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_1_run.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_2.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_2_run.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_3.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_3_run.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_4.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_4_run.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256630
|
|
This patch adds support for using a single fully-predicated loop instead
of a vector loop and a scalar tail. An SVE WHILELO instruction generates
the predicate for each iteration of the loop, given the current scalar
iv value and the loop bound. This operation is wrapped up in a new internal
function called WHILE_ULT. E.g.:
WHILE_ULT (0, 3, { 0, 0, 0, 0}) -> { 1, 1, 1, 0 }
WHILE_ULT (UINT_MAX - 1, UINT_MAX, { 0, 0, 0, 0 }) -> { 1, 0, 0, 0 }
The third WHILE_ULT argument is needed to make the operation
unambiguous: without it, WHILE_ULT (0, 3) for one vector type would
seem equivalent to WHILE_ULT (0, 3) for another, even if the types have
different numbers of elements.
Note that the patch uses "mask" and "fully-masked" instead of
"predicate" and "fully-predicated", to follow existing GCC terminology.
This patch just handles the simple cases, punting for things like
reductions and live-out values. Later patches remove most of these
restrictions.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* optabs.def (while_ult_optab): New optab.
* doc/md.texi (while_ult@var{m}@var{n}): Document.
* internal-fn.def (WHILE_ULT): New internal function.
* internal-fn.h (direct_internal_fn_supported_p): New override
that takes two types as argument.
* internal-fn.c (while_direct): New macro.
(expand_while_optab_fn): New function.
(convert_optab_supported_p): Likewise.
(direct_while_optab_supported_p): New macro.
* wide-int.h (wi::udiv_ceil): New function.
* tree-vectorizer.h (rgroup_masks): New structure.
(vec_loop_masks): New typedef.
(_loop_vec_info): Add masks, mask_compare_type, can_fully_mask_p
and fully_masked_p.
(LOOP_VINFO_CAN_FULLY_MASK_P, LOOP_VINFO_FULLY_MASKED_P)
(LOOP_VINFO_MASKS, LOOP_VINFO_MASK_COMPARE_TYPE): New macros.
(vect_max_vf): New function.
(slpeel_make_loop_iterate_ntimes): Delete.
(vect_set_loop_condition, vect_get_loop_mask_type, vect_gen_while)
(vect_halve_mask_nunits, vect_double_mask_nunits): Declare.
(vect_record_loop_mask, vect_get_loop_mask): Likewise.
* tree-vect-loop-manip.c: Include tree-ssa-loop-niter.h,
internal-fn.h, stor-layout.h and optabs-query.h.
(vect_set_loop_mask): New function.
(add_preheader_seq): Likewise.
(add_header_seq): Likewise.
(interleave_supported_p): Likewise.
(vect_maybe_permute_loop_masks): Likewise.
(vect_set_loop_masks_directly): Likewise.
(vect_set_loop_condition_masked): Likewise.
(vect_set_loop_condition_unmasked): New function, split out from
slpeel_make_loop_iterate_ntimes.
(slpeel_make_loop_iterate_ntimes): Rename to..
(vect_set_loop_condition): ...this. Use vect_set_loop_condition_masked
for fully-masked loops and vect_set_loop_condition_unmasked otherwise.
(vect_do_peeling): Update call accordingly.
(vect_gen_vector_loop_niters): Use VF as the step for fully-masked
loops.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
mask_compare_type, can_fully_mask_p and fully_masked_p.
(release_vec_loop_masks): New function.
(_loop_vec_info): Use it to free the loop masks.
(can_produce_all_loop_masks_p): New function.
(vect_get_max_nscalars_per_iter): Likewise.
(vect_verify_full_masking): Likewise.
(vect_analyze_loop_2): Save LOOP_VINFO_CAN_FULLY_MASK_P around
retries, and free the mask rgroups before retrying. Check loop-wide
reasons for disallowing fully-masked loops. Make the final decision
about whether use a fully-masked loop or not.
(vect_estimate_min_profitable_iters): Do not assume that peeling
for the number of iterations will be needed for fully-masked loops.
(vectorizable_reduction): Disable fully-masked loops.
(vectorizable_live_operation): Likewise.
(vect_halve_mask_nunits): New function.
(vect_double_mask_nunits): Likewise.
(vect_record_loop_mask): Likewise.
(vect_get_loop_mask): Likewise.
(vect_transform_loop): Handle the case in which the final loop
iteration might handle a partial vector. Call vect_set_loop_condition
instead of slpeel_make_loop_iterate_ntimes.
* tree-vect-stmts.c: Include tree-ssa-loop-niter.h and gimple-fold.h.
(check_load_store_masking): New function.
(prepare_load_store_mask): Likewise.
(vectorizable_store): Handle fully-masked loops.
(vectorizable_load): Likewise.
(supportable_widening_operation): Use vect_halve_mask_nunits for
booleans.
(supportable_narrowing_operation): Likewise vect_double_mask_nunits.
(vect_gen_while): New function.
* config/aarch64/aarch64.md (umax<mode>3): New expander.
(aarch64_uqdec<mode>): New insn.
gcc/testsuite/
* gcc.dg/tree-ssa/cunroll-10.c: Disable vectorization.
* gcc.dg/tree-ssa/peel1.c: Likewise.
* gcc.dg/vect/vect-load-lanes-peeling-1.c: Remove XFAIL for
variable-length vectors.
* gcc.target/aarch64/sve/vcond_6.c: XFAIL test for AND.
* gcc.target/aarch64/sve/vec_bool_cmp_1.c: Expect BIC instead of NOT.
* gcc.target/aarch64/sve/slp_1.c: Check for a fully-masked loop.
* gcc.target/aarch64/sve/slp_2.c: Likewise.
* gcc.target/aarch64/sve/slp_3.c: Likewise.
* gcc.target/aarch64/sve/slp_4.c: Likewise.
* gcc.target/aarch64/sve/slp_6.c: Likewise.
* gcc.target/aarch64/sve/slp_8.c: New test.
* gcc.target/aarch64/sve/slp_8_run.c: Likewise.
* gcc.target/aarch64/sve/slp_9.c: Likewise.
* gcc.target/aarch64/sve/slp_9_run.c: Likewise.
* gcc.target/aarch64/sve/slp_10.c: Likewise.
* gcc.target/aarch64/sve/slp_10_run.c: Likewise.
* gcc.target/aarch64/sve/slp_11.c: Likewise.
* gcc.target/aarch64/sve/slp_11_run.c: Likewise.
* gcc.target/aarch64/sve/slp_12.c: Likewise.
* gcc.target/aarch64/sve/slp_12_run.c: Likewise.
* gcc.target/aarch64/sve/ld1r_2.c: Likewise.
* gcc.target/aarch64/sve/ld1r_2_run.c: Likewise.
* gcc.target/aarch64/sve/while_1.c: Likewise.
* gcc.target/aarch64/sve/while_2.c: Likewise.
* gcc.target/aarch64/sve/while_3.c: Likewise.
* gcc.target/aarch64/sve/while_4.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256625
|
|
From-SVN: r256169
|
|
This patch changes the type of the vectorisation factor and SLP
unrolling factor to poly_uint64. This in turn required some knock-on
changes in signedness elsewhere.
Cost decisions are generally based on estimated_poly_value,
which for VF is wrapped up as vect_vf_for_cost.
The patch doesn't on its own enable variable-length vectorisation.
It just makes the minimum changes necessary for the code to build
with the new VF and UF types. Later patches also make the
vectoriser cope with variable TYPE_VECTOR_SUBPARTS and variable
GET_MODE_NUNITS, at which point the code really does handle
variable-length vectors.
The patch also changes MAX_VECTORIZATION_FACTOR to INT_MAX,
to avoid hard-coding a particular architectural limit.
The patch includes a new test because a development version of the patch
accidentally used file print routines instead of dump_*, which would
fail with -fopt-info.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vectorizer.h (_slp_instance::unrolling_factor): Change
from an unsigned int to a poly_uint64.
(_loop_vec_info::slp_unrolling_factor): Likewise.
(_loop_vec_info::vectorization_factor): Change from an int
to a poly_uint64.
(MAX_VECTORIZATION_FACTOR): Bump from 64 to INT_MAX.
(vect_get_num_vectors): New function.
(vect_update_max_nunits, vect_vf_for_cost): Likewise.
(vect_get_num_copies): Use vect_get_num_vectors.
(vect_analyze_data_ref_dependences): Change max_vf from an int *
to an unsigned int *.
(vect_analyze_data_refs): Change min_vf from an int * to a
poly_uint64 *.
(vect_transform_slp_perm_load): Take the vf as a poly_uint64 rather
than an unsigned HOST_WIDE_INT.
* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr)
(vect_analyze_data_ref_dependence): Change max_vf from an int *
to an unsigned int *.
(vect_analyze_data_ref_dependences): Likewise.
(vect_compute_data_ref_alignment): Handle polynomial vf.
(vect_enhance_data_refs_alignment): Likewise.
(vect_prune_runtime_alias_test_list): Likewise.
(vect_shift_permute_load_chain): Likewise.
(vect_supportable_dr_alignment): Likewise.
(dependence_distance_ge_vf): Take the vectorization factor as a
poly_uint64 rather than an unsigned HOST_WIDE_INT.
(vect_analyze_data_refs): Change min_vf from an int * to a
poly_uint64 *.
* tree-vect-loop-manip.c (vect_gen_scalar_loop_niters): Take
vfm1 as a poly_uint64 rather than an int. Make the same change
for the returned bound_scalar.
(vect_gen_vector_loop_niters): Handle polynomial vf.
(vect_do_peeling): Likewise. Update call to
vect_gen_scalar_loop_niters and handle polynomial bound_scalars.
(vect_gen_vector_loop_niters_mult_vf): Assert that the vf must
be constant.
* tree-vect-loop.c (vect_determine_vectorization_factor)
(vect_update_vf_for_slp, vect_analyze_loop_2): Handle polynomial vf.
(vect_get_known_peeling_cost): Likewise.
(vect_estimate_min_profitable_iters, vectorizable_reduction): Likewise.
(vect_worthwhile_without_simd_p, vectorizable_induction): Likewise.
(vect_transform_loop): Likewise. Use the lowest possible VF when
updating the upper bounds of the loop.
(vect_min_worthwhile_factor): Make static. Return an unsigned int
rather than an int.
* tree-vect-slp.c (vect_attempt_slp_rearrange_stmts): Cope with
polynomial unroll factors.
(vect_analyze_slp_cost_1, vect_analyze_slp_instance): Likewise.
(vect_make_slp_decision): Likewise.
(vect_supported_load_permutation_p): Likewise, and polynomial
vf too.
(vect_analyze_slp_cost): Handle polynomial vf.
(vect_slp_analyze_node_operations): Likewise.
(vect_slp_analyze_bb_1): Likewise.
(vect_transform_slp_perm_load): Take the vf as a poly_uint64 rather
than an unsigned HOST_WIDE_INT.
* tree-vect-stmts.c (vectorizable_simd_clone_call, vectorizable_store)
(vectorizable_load): Handle polynomial vf.
* tree-vectorizer.c (simduid_to_vf::vf): Change from an int to
a poly_uint64.
(adjust_simduid_builtins, shrink_simd_arrays): Update accordingly.
gcc/testsuite/
* gcc.dg/vect-opt-info-1.c: New test.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256126
|
|
Normally we adjust the vector loop so that it iterates:
(original number of scalar iterations - number of peels) / VF
times, enforcing this using an IV that starts at zero and increments
by one each iteration. However, dividing by VF would be expensive
for variable VF, so this patch adds an alternative in which the IV
increments by VF each iteration instead. We then need to take care
to handle possible overflow in the IV.
The new mechanism isn't used yet; a later patch replaces the
"if (1)" with a check for variable VF.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-vect-loop-manip.c: Include gimple-fold.h.
(slpeel_make_loop_iterate_ntimes): Add step, final_iv and
niters_maybe_zero parameters. Handle other cases besides a step of 1.
(vect_gen_vector_loop_niters): Add a step_vector_ptr parameter.
Add a path that uses a step of VF instead of 1, but disable it
for now.
(vect_do_peeling): Add step_vector, niters_vector_mult_vf_var
and niters_no_overflow parameters. Update calls to
slpeel_make_loop_iterate_ntimes and vect_gen_vector_loop_niters.
Create a new SSA name if the latter choses to use a ste other
than zero, and return it via niters_vector_mult_vf_var.
* tree-vect-loop.c (vect_transform_loop): Update calls to
vect_do_peeling, vect_gen_vector_loop_niters and
slpeel_make_loop_iterate_ntimes.
* tree-vectorizer.h (slpeel_make_loop_iterate_ntimes, vect_do_peeling)
(vect_gen_vector_loop_niters): Update declarations after above changes.
From-SVN: r256124
|
|
This patch splits the loop versioning threshold out from the
cost model threshold so that the former can become a poly_uint64.
We still use a single test to enforce both limits where possible.
2017-12-21 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vectorizer.h (_loop_vec_info): Add a versioning_threshold
field.
(LOOP_VINFO_VERSIONING_THRESHOLD): New macro
(vect_loop_versioning): Take the loop versioning threshold as a
separate parameter.
* tree-vect-loop-manip.c (vect_loop_versioning): Likewise.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
versioning_threshold.
(vect_analyze_loop_2): Compute the loop versioning threshold
whenever loop versioning is needed, and store it in the new
field rather than combining it with the cost model threshold.
(vect_transform_loop): Update call to vect_loop_versioning.
Try to combine the loop versioning and cost thresholds here.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r255934
|
|
This patch introduces a number of new macros and functions that will
be used to distinguish between different kinds of debug stmts, insns
and notes, namely, preexisting debug bind ones and to-be-introduced
nonbind markers.
In a seemingly mechanical way, it adjusts several uses of the macros
and functions, so that they refer to narrower categories when
appropriate.
These changes, by themselves, should not have any visible effect in
the compiler behavior, since the upcoming debug markers are never
created with this patch alone.
for gcc/ChangeLog
* gimple.h (enum gimple_debug_subcode): Add
GIMPLE_DEBUG_BEGIN_STMT.
(gimple_debug_begin_stmt_p): New.
(gimple_debug_nonbind_marker_p): New.
* tree.h (MAY_HAVE_DEBUG_MARKER_STMTS): New.
(MAY_HAVE_DEBUG_BIND_STMTS): Renamed from....
(MAY_HAVE_DEBUG_STMTS): ... this. Check both.
* insn-notes.def (BEGIN_STMT): New.
* rtl.h (MAY_HAVE_DEBUG_MARKER_INSNS): New.
(MAY_HAVE_DEBUG_BIND_INSNS): Renamed from....
(MAY_HAVE_DEBUG_INSNS): ... this. Check both.
(NOTE_MARKER_LOCATION, NOTE_MARKER_P): New.
(DEBUG_BIND_INSN_P, DEBUG_MARKER_INSN_P): New.
(INSN_DEBUG_MARKER_KIND): New.
(GEN_RTX_DEBUG_MARKER_BEGIN_STMT_PAT): New.
(INSN_VAR_LOCATION): Check for VAR_LOCATION.
(INSN_VAR_LOCATION_PTR): New.
* cfgexpand.c (expand_debug_locations): Handle debug bind insns
only.
(expand_gimple_basic_block): Likewise. Emit debug temps for TER
deps only if debug bind insns are enabled.
(pass_expand::execute): Avoid deep TER and expand
debug locations for debug bind insns only.
* cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Narrow
debug stmts special handling down to debug bind stmts.
* combine.c (try_combine): Narrow debug insns special handling
down to debug bind insns.
* cse.c (delete_trivially_dead_insns): Handle debug bindings.
Narrow debug insns preexisting special handling down to debug
bind insns.
* dce.c (rest_of_handle_ud_dce): Narrow debug insns special
handling down to debug bind insns.
* function.c (instantiate_virtual_regs): Skip debug markers,
adjust handling of debug binds.
* gimple-ssa-backprop.c (backprop::prepare_change): Try debug
temp insertion iff MAY_HAVE_DEBUG_BIND_STMTS.
* haifa-sched.c (schedule_insn): Narrow special handling of debug
insns to debug bind insns.
* ipa-param-manipulation.c (ipa_modify_call_arguments): Narrow
special handling of debug stmts to debug bind stmts.
* ipa-split.c (split_function): Likewise.
* ira.c (combine_and_move_insns): Adjust debug bind insns only.
* loop-unroll.c (apply_opt_in_copies): Adjust tests on bind
debug insns.
* reg-stack.c (convert_regs_1): Use DEBUG_BIND_INSN_P.
* regrename.c (build_def_use): Likewise.
* regcprop.c (copyprop_hardreg_forward_1): Likewise.
(pass_cprop_hardreg): Narrow special casing of debug insns to
debug bind insns.
* regstat.c (regstat_init_n_sets_and_refs): Likewise.
* reload1.c (reload): Likewise.
* sese.c (sese_insert_phis_for_liveouts): Narrow special
casing of debug stmts to debug bind stmts.
* shrink-wrap.c (move_insn_for_shrink_wrap): Likewise.
* ssa-iterators.h (num_imm_uses): Likewise.
* tree-cfg.c (gimple_merge_blocks): Narrow special casing of
debug stmts to debug bind stmts.
* tree-inline.c (tree_function_versioning): Narrow special casing
of debug stmts to debug bind stmts.
* tree-loop-distribution.c (generate_loops_for_partition):
Narrow special casing of debug stmts to debug bind stmts.
* tree-sra.c (analyze_access_subtree): Narrow special casing
of debug stmts to debug bind stmts.
* tree-ssa-dce.c (remove_dead_stmt): Narrow special casing of debug
stmts to debug bind stmts.
* tree-ssa-loop-ivopt.c (remove_unused_ivs): Narrow special
casing of debug stmts to debug bind stmts.
* tree-ssa-reassoc.c (reassoc_remove_stmt): Likewise.
* tree-ssa-tail-merge.c (tail_merge_optimize): Narrow special
casing of debug stmts to debug bind stmts.
* tree-ssa-threadedge.c (propagate_threaded_block_debug_info):
Likewise.
* tree-ssa.c (flush_pending_stmts): Narrow special casing of
debug stmts to debug bind stmts.
(gimple_replace_ssa_lhs): Likewise.
(insert_debug_temp_for_var_def): Likewise.
(insert_debug_temps_for_defs): Likewise.
(reset_debug_uses): Likewise.
* tree-ssanames.c (release_ssa_name_fn): Likewise.
* tree-vect-loop-manip.c (adjust_debug_stmts_now): Likewise.
(adjust_debug_stmts): Likewise.
(adjust_phi_and_debug_stmts): Likewise.
(vect_do_peeling): Likewise.
* tree-vect-loop.c (vect_transform_loop): Likewise.
* valtrack.c (propagate_for_debug): Use BIND_DEBUG_INSN_P.
* var-tracking.c (adjust_mems): Narrow special casing of debug
insns to debug bind insns.
(dv_onepart_p, dataflow_set_clar_at_call, use_type): Likewise.
(compute_bb_dataflow, vt_find_locations): Likewise.
(vt_expand_loc, emit_notes_for_changes): Likewise.
(vt_init_cfa_base): Likewise.
(vt_emit_notes): Likewise.
(vt_initialize): Likewise.
(vt_finalize): Likewise.
From-SVN: r255565
|
|
be set)
2017-11-24 Richard Biener <rguenther@suse.de>
PR tree-optimization/82402
* tree-vect-loop-manip.c (create_lcssa_for_virtual_phi): Properly
set SSA_NAME_OCCURS_IN_ABNORMAL_PHI.
* gcc.dg/torture/pr82402.c: New testcase.
From-SVN: r255140
|
|
* tree-vect-loop-manip.c (vect_do_peeling): Do not use
scale_bbs_frequencies_int.
From-SVN: r254810
|
|
* asan.c (create_cond_insert_point): Maintain profile.
* ipa-utils.c (ipa_merge_profiles): Be sure only IPA profiles are
merged.
* basic-block.h (struct basic_block_def): Remove frequency.
(EDGE_FREQUENCY): Use to_frequency
* bb-reorder.c (push_to_next_round_p): Use only IPA counts for global
heuristics.
(find_traces): Update to use to_frequency.
(find_traces_1_round): Likewise; use only IPA counts.
(bb_to_key): Likewise.
(connect_traces): Use IPA counts only.
(copy_bb_p): Update to use to_frequency.
(fix_up_crossing_landing_pad): Likewise.
(sanitize_hot_paths): Likewise.
* bt-load.c (basic_block_freq): Likewise.
* cfg.c (init_flow): Set count_max to uninitialized.
(check_bb_profile): Remove frequencies; check counts.
(dump_bb_info): Do not dump frequencies.
(update_bb_profile_for_threading): Update counts only.
(scale_bbs_frequencies_int): Likewise.
(MAX_SAFE_MULTIPLIER): Remove.
(scale_bbs_frequencies_gcov_type): Update counts only.
(scale_bbs_frequencies_profile_count): Update counts only.
(scale_bbs_frequencies): Update counts only.
* cfg.h (struct control_flow_graph): Add count-max.
(update_bb_profile_for_threading): Update prototype.
* cfgbuild.c (find_bb_boundaries): Do not update frequencies.
(find_many_sub_basic_blocks): Likewise.
* cfgcleanup.c (try_forward_edges): Likewise.
(try_crossjump_to_edge): Likewise.
* cfgexpand.c (expand_gimple_cond): Likewise.
(expand_gimple_tailcall): Likewise.
(construct_init_block): Likewise.
(construct_exit_block): Likewise.
* cfghooks.c (verify_flow_info): Check consistency of counts.
(dump_bb_for_graph): Do not dump frequencies.
(split_block_1): Do not update frequencies.
(split_edge): Do not update frequencies.
(make_forwarder_block): Do not update frequencies.
(duplicate_block): Do not update frequencies.
(account_profile_record): Do not update frequencies.
* cfgloop.c (find_subloop_latch_edge_by_profile): Use IPA counts
for global heuristics.
* cfgloopanal.c (average_num_loop_insns): Update to use to_frequency.
(expected_loop_iterations_unbounded): Use counts only.
* cfgloopmanip.c (scale_loop_profile): Simplify.
(create_empty_loop_on_edge): Simplify
(loopify): Simplify
(duplicate_loop_to_header_edge): Simplify
* cfgrtl.c (force_nonfallthru_and_redirect): Update profile.
(update_br_prob_note): Take care of removing note when profile
becomes undefined.
(relink_block_chain): Do not dump frequency.
(rtl_account_profile_record): Use to_frequency.
* cgraph.c (symbol_table::create_edge): Convert count to ipa count.
(cgraph_edge::redirect_call_stmt_to_calle): Conver tcount to ipa count.
(cgraph_update_edges_for_call_stmt_node): Likewise.
(cgraph_edge::verify_count_and_frequency): Update.
(cgraph_node::verify_node): Temporarily disable frequency verification.
* cgraphbuild.c (compute_call_stmt_bb_frequency): Use
to_cgraph_frequency.
(cgraph_edge::rebuild_edges): Convert to ipa counts.
* cgraphunit.c (init_lowered_empty_function): Do not initialize
frequencies.
(cgraph_node::expand_thunk): Update profile.
* except.c (dw2_build_landing_pads): Do not update frequency.
* final.c (compute_alignments): Use to_frequency.
(dump_basic_block_info): Do not dump frequency.
* gimple-pretty-print.c (dump_profile): Do not dump frequency.
(dump_gimple_bb_header): Do not dump frequency.
* gimple-ssa-isolate-paths.c (isolate_path): Do not update frequency;
do update count.
* gimple-streamer-in.c (input_bb): Do not stream frequency.
* gimple-streamer-out.c (output_bb): Do not stream frequency.
* haifa-sched.c (sched_pressure_start_bb): Use to_freuqency.
(init_before_recovery): Do not update frequency.
(sched_create_recovery_edges): Do not update frequency.
* hsa-gen.c (convert_switch_statements): Do not update frequency.
* ipa-cp.c (ipcp_propagate_stage): Update search for max_count.
(ipa_cp_c_finalize): Set max_count to uninitialized.
* ipa-fnsummary.c (get_minimal_bb): Use counts.
(param_change_prob): Use counts.
* ipa-profile.c (ipa_profile_generate_summary): Do not summarize
local profiles.
* ipa-split.c (consider_split): Use to_frequency.
(split_function): Use to_frequency.
* ira-build.c (loop_compare_func): Likewise.
(mark_loops_for_removal): Likewise.
(mark_all_loops_for_removal): Likewise.
* loop-doloop.c (doloop_modify): Do not update frequency.
* loop-unroll.c (unroll_loop_runtime_iterations): Do not update
frequency.
* lto-streamer-in.c (input_function): Update count_max.
* omp-expand.c (expand_omp_taskreg): Update count_max.
* omp-simd-clone.c (simd_clone_adjust): Update profile.
* predict.c (maybe_hot_frequency_p): Use to_frequency.
(maybe_hot_count_p): Use ipa counts only.
(maybe_hot_bb_p): Simplify.
(maybe_hot_edge_p): Simplify.
(probably_never_executed): Do not take frequency argument.
(probably_never_executed_bb_p): Do not pass frequency.
(probably_never_executed_edge_p): Likewise.
(combine_predictions_for_bb): Check that profile is nonzero.
(propagate_freq): Do not set frequency.
(drop_profile): Simplify.
(counts_to_freqs): Simplify.
(expensive_function_p): Use to_frequency.
(propagate_unlikely_bbs_forward): Simplify.
(determine_unlikely_bbs): Simplify.
(estimate_bb_frequencies): Add hack to silence graphite issues.
(compute_function_frequency): Use ipa counts.
(pass_profile::execute): Update.
(rebuild_frequencies): Use counts only.
(force_edge_cold): Use counts only.
* profile-count.c (profile_count::dump): Dump new count types.
(profile_count::differs_from_p): Check compatiblity.
(profile_count::to_frequency): New function.
(profile_count::to_cgraph_frequency): New function.
* profile-count.h (struct function): Declare.
(enum profile_quality): Add profile_guessed_local and
profile_guessed_global0.
(class profile_proability): Decrease number of bits to 29;
update from_reg_br_prob_note and to_reg_br_prob_note.
(class profile_count: Update comment; decrease number of bits
to 61. Check compatibility.
(profile_count::compatible_p): New private member function.
(profile_count::ipa_p): New member function.
(profile_count::operator<): Handle global zero correctly.
(profile_count::operator>): Handle global zero correctly.
(profile_count::operator<=): Handle global zero correctly.
(profile_count::operator>=): Handle global zero correctly.
(profile_count::nonzero_p): New member function.
(profile_count::force_nonzero): New member function.
(profile_count::max): New member function.
(profile_count::apply_scale): Handle IPA scalling.
(profile_count::guessed_local): New member function.
(profile_count::global0): New member function.
(profile_count::ipa): New member function.
(profile_count::to_frequency): Declare.
(profile_count::to_cgraph_frequency): Declare.
* profile.c (OVERLAP_BASE): Delete.
(compute_frequency_overlap): Delete.
(compute_branch_probabilities): Do not use compute_frequency_overlap.
* regs.h (REG_FREQ_FROM_BB): Use to_frequency.
* sched-ebb.c (rank): Use counts only.
* shrink-wrap.c (handle_simple_exit): Use counts only.
(try_shrink_wrapping): Use counts only.
(place_prologue_for_one_component): Use counts only.
* tracer.c (find_best_predecessor): Use to_frequency.
(find_trace): Use to_frequency.
(tail_duplicate): Use to_frequency.
* trans-mem.c (expand_transaction): Do not update frequency.
* tree-call-cdce.c: Do not update frequency.
* tree-cfg.c (gimple_find_sub_bbs): Likewise.
(gimple_merge_blocks): Likewise.
(gimple_split_edge): Likewise.
(gimple_duplicate_sese_region): Likewise.
(gimple_duplicate_sese_tail): Likewise.
(move_sese_region_to_fn): Likewise.
(gimple_account_profile_record): Likewise.
(insert_cond_bb): Likewise.
* tree-complex.c (expand_complex_div_wide): Likewise.
* tree-eh.c (lower_resx): Update profile.
* tree-inline.c (copy_bb): Simplify count scaling; do not scale
frequencies.
(initialize_cfun): Do not initialize frequencies
(freqs_to_counts): Delete.
(copy_cfg_body): Ignore count parameter.
(copy_body): Update.
(expand_call_inline): Update count_max.
(optimize_inline_calls): Update count_max.
(tree_function_versioning): Update count_max.
* tree-ssa-coalesce.c (coalesce_cost_bb): Use to_frequency.
* tree-ssa-ifcombine.c (update_profile_after_ifcombine): Do not update
frequency.
* tree-ssa-loop-im.c (execute_sm_if_changed): Use counts only.
* tree-ssa-loop-ivcanon.c (unloop_loops): Do not update freuqency.
(try_peel_loop): Likewise.
* tree-ssa-loop-ivopts.c (get_scaled_computation_cost_at): Use
to_frequency.
* tree-ssa-loop-manip.c (niter_for_unrolled_loop): Pass -1.
(tree_transform_and_unroll_loop): Do not use frequencies
* tree-ssa-loop-niter.c (estimate_numbers_of_iterations):
Use reliable prediction only.
* tree-ssa-loop-unswitch.c (hoist_guard): Do not use frequencies.
* tree-ssa-sink.c (select_best_block): Use to_frequency.
* tree-ssa-tail-merge.c (replace_block_by): Temporarily disable
probability scaling.
* tree-ssa-threadupdate.c (create_block_for_threading): Do
not update frequency
(any_remaining_duplicated_blocks): Likewise.
(update_profile): Likewise.
(estimated_freqs_path): Delete.
(freqs_to_counts_path): Delete.
(clear_counts_path): Delete.
(ssa_fix_duplicate_block_edges): Likewise.
(duplicate_thread_path): Likewise.
* tree-switch-conversion.c (gen_inbound_check): Use counts.
* tree-tailcall.c (decrease_profile): Do not update frequency.
(eliminate_tail_call): Likewise.
* tree-vect-loop-manip.c (vect_do_peeling): Likewise.
* tree-vect-loop.c (scale_profile_for_vect_loop): Likewise.
(optimize_mask_stores): Likewise.
* tree-vect-stmts.c (vectorizable_simd_clone_call): Likewise.
* ubsan.c (ubsan_expand_null_ifn): Update profile.
(ubsan_expand_ptr_ifn): Update profile.
* value-prof.c (gimple_ic): Simplify.
* value-prof.h (gimple_ic): Update prototype.
* ipa-inline-transform.c (inline_transform): Fix scaling conditoins.
* ipa-inline.c (compute_uninlined_call_time): Be sure that
counts are nonzero.
(want_inline_self_recursive_call_p): Likewise.
(resolve_noninline_speculation): Only cummulate defined counts.
(inline_small_functions): Use nonzero_p.
(ipa_inline): Do not access freed node.
Unknown ChangeLog:
2017-11-02 Jan Hubicka <hubicka@ucw.cz>
* testsuite/gcc.dg/no-strict-overflow-3.c (foo): Update magic
value to not clash with frequency.
* testsuite/gcc.dg/strict-overflow-3.c (foo): Likewise.
* testsuite/gcc.dg/tree-ssa/builtin-sprintf-2.c: Update template.
* testsuite/gcc.dg/tree-ssa/dump-2.c: Update template.
* testsuite/gcc.dg/tree-ssa/ifc-10.c: Update template.
* testsuite/gcc.dg/tree-ssa/ifc-11.c: Update template.
* testsuite/gcc.dg/tree-ssa/ifc-12.c: Update template.
* testsuite/gcc.dg/tree-ssa/ifc-20040816-1.c: Update template.
* testsuite/gcc.dg/tree-ssa/ifc-20040816-2.c: Update template.
* testsuite/gcc.dg/tree-ssa/ifc-5.c: Update template.
* testsuite/gcc.dg/tree-ssa/ifc-8.c: Update template.
* testsuite/gcc.dg/tree-ssa/ifc-9.c: Update template.
* testsuite/gcc.dg/tree-ssa/ifc-cd.c: Update template.
* testsuite/gcc.dg/tree-ssa/ifc-pr56541.c: Update template.
* testsuite/gcc.dg/tree-ssa/ifc-pr68583.c: Update template.
* testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c: Update template.
* testsuite/gcc.dg/tree-ssa/ifc-pr69489-2.c: Update template.
* testsuite/gcc.target/i386/pr61403.c: Update template.
From-SVN: r254379
|
|
* asan.c (create_cond_insert_point): Do not update edge count.
* auto-profile.c (afdo_propagate_edge): Update for edge count removal.
(afdo_propagate_circuit): Likewise.
(afdo_calculate_branch_prob): Likewise.
(afdo_annotate_cfg): Likewise.
* basic-block.h (struct edge_def): Remove count.
(edge_def::count): New accessor.
* bb-reorder.c (rotate_loop): Update.
(find_traces_1_round): Update.
(connect_traces): Update.
(sanitize_hot_paths): Update.
* cfg.c (unchecked_make_edge): Update.
(make_single_succ_edge): Update.
(check_bb_profile): Update.
(dump_edge_info): Update.
(update_bb_profile_for_threading): Update.
(scale_bbs_frequencies_int): Update.
(scale_bbs_frequencies_gcov_type): Update.
(scale_bbs_frequencies_profile_count): Update.
(scale_bbs_frequencies): Update.
* cfganal.c (connect_infinite_loops_to_exit): Update.
* cfgbuild.c (compute_outgoing_frequencies): Update.
(find_many_sub_basic_blocks): Update.
* cfgcleanup.c (try_forward_edges): Update.
(try_crossjump_to_edge): Update
* cfgexpand.c (expand_gimple_cond): Update
(expand_gimple_tailcall): Update
(construct_exit_block): Update
* cfghooks.c (verify_flow_info): Update
(redirect_edge_succ_nodup): Update
(split_edge): Update
(make_forwarder_block): Update
(duplicate_block): Update
(account_profile_record): Update
* cfgloop.c (find_subloop_latch_edge_by_profile): Update.
* cfgloopanal.c (expected_loop_iterations_unbounded): Update.
* cfgloopmanip.c (scale_loop_profile): Update.
(loopify): Update.
(lv_adjust_loop_entry_edge): Update.
* cfgrtl.c (try_redirect_by_replacing_jump): Update.
(force_nonfallthru_and_redirect): Update.
(purge_dead_edges): Update.
(rtl_flow_call_edges_add): Update.
* cgraphunit.c (init_lowered_empty_function): Update.
(cgraph_node::expand_thunk): Update.
* gimple-pretty-print.c (dump_probability): Update.
(dump_edge_probability): Update.
* gimple-ssa-isolate-paths.c (isolate_path): Update.
* haifa-sched.c (sched_create_recovery_edges): Update.
* hsa-gen.c (convert_switch_statements): Update.
* ifcvt.c (dead_or_predicable): Update.
* ipa-inline-transform.c (inline_transform): Update.
* ipa-split.c (split_function): Update.
* ipa-utils.c (ipa_merge_profiles): Update.
* loop-doloop.c (add_test): Update.
* loop-unroll.c (unroll_loop_runtime_iterations): Update.
* lto-streamer-in.c (input_cfg): Update.
(input_function): Update.
* lto-streamer-out.c (output_cfg): Update.
* modulo-sched.c (sms_schedule): Update.
* postreload-gcse.c (eliminate_partially_redundant_load): Update.
* predict.c (maybe_hot_edge_p): Update.
(unlikely_executed_edge_p): Update.
(probably_never_executed_edge_p): Update.
(dump_prediction): Update.
(drop_profile): Update.
(propagate_unlikely_bbs_forward): Update.
(determine_unlikely_bbs): Update.
(force_edge_cold): Update.
* profile.c (compute_branch_probabilities): Update.
* reg-stack.c (better_edge): Update.
* shrink-wrap.c (handle_simple_exit): Update.
* tracer.c (better_p): Update.
* trans-mem.c (expand_transaction): Update.
(split_bb_make_tm_edge): Update.
* tree-call-cdce.c: Update.
* tree-cfg.c (gimple_find_sub_bbs): Update.
(gimple_split_edge): Update.
(gimple_duplicate_sese_region): Update.
(gimple_duplicate_sese_tail): Update.
(gimple_flow_call_edges_add): Update.
(insert_cond_bb): Update.
(execute_fixup_cfg): Update.
* tree-cfgcleanup.c (cleanup_control_expr_graph): Update.
* tree-complex.c (expand_complex_div_wide): Update.
* tree-eh.c (lower_resx): Update.
(unsplit_eh): Update.
(cleanup_empty_eh_move_lp): Update.
* tree-inline.c (copy_edges_for_bb): Update.
(freqs_to_counts): Update.
(copy_cfg_body): Update.
* tree-ssa-dce.c (remove_dead_stmt): Update.
* tree-ssa-ifcombine.c (update_profile_after_ifcombine): Update.
* tree-ssa-loop-im.c (execute_sm_if_changed): Update.
* tree-ssa-loop-ivcanon.c (remove_exits_and_undefined_stmts): Update.
(unloop_loops): Update.
* tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Update.
* tree-ssa-loop-split.c (connect_loops): Update.
(split_loop): Update.
* tree-ssa-loop-unswitch.c (hoist_guard): Update.
* tree-ssa-phionlycprop.c (propagate_rhs_into_lhs): Update.
* tree-ssa-phiopt.c (replace_phi_edge_with_variable): Update.
* tree-ssa-reassoc.c (branch_fixup): Update.
* tree-ssa-tail-merge.c (replace_block_by): Update.
* tree-ssa-threadupdate.c (remove_ctrl_stmt_and_useless_edges): Update.
(compute_path_counts): Update.
(update_profile): Update.
(recompute_probabilities): Update.
(update_joiner_offpath_counts): Update.
(estimated_freqs_path): Update.
(freqs_to_counts_path): Update.
(clear_counts_path): Update.
(ssa_fix_duplicate_block_edges): Update.
(duplicate_thread_path): Update.
* tree-switch-conversion.c (hoist_edge_and_branch_if_true): Update.
(case_bit_test_cmp): Update.
(collect_switch_conv_info): Update.
(gen_inbound_check): Update.
(do_jump_if_equal): Update.
(emit_cmp_and_jump_insns): Update.
* tree-tailcall.c (decrease_profile): Update.
(eliminate_tail_call): Update.
* tree-vect-loop-manip.c (slpeel_add_loop_guard): Update.
(vect_do_peeling): Update.
* tree-vect-loop.c (scale_profile_for_vect_loop): Update.
* ubsan.c (ubsan_expand_null_ifn): Update.
(ubsan_expand_ptr_ifn): Update.
* value-prof.c (gimple_divmod_fixed_value): Update.
(gimple_mod_pow2): Update.
(gimple_mod_subtract): Update.
(gimple_ic): Update.
(gimple_stringop_fixed_value): Update.
From-SVN: r253910
|
|
The wide_int routines allow things like:
wi::add (t, 1)
to add 1 to an INTEGER_CST T in its native precision. But we also have:
wi::to_offset (t) // Treat T as an offset_int
wi::to_widest (t) // Treat T as a widest_int
Recently we also gained:
wi::to_wide (t, prec) // Treat T as a wide_int in preccision PREC
This patch therefore requires:
wi::to_wide (t)
when operating on INTEGER_CSTs in their native precision. This is
just as efficient, and makes it clearer that a deliberate choice is
being made to treat the tree as a wide_int in its native precision.
This also removes the inconsistency that
a) INTEGER_CSTs in their native precision can be used without an accessor
but must use wi:: functions instead of C++ operators
b) the other forms need an explicit accessor but the result can be used
with C++ operators.
It also helps with SVE, where there's the additional possibility
that the tree could be a runtime value.
2017-10-10 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* wide-int.h (wide_int_ref_storage): Make host_dependent_precision
a template parameter.
(WIDE_INT_REF_FOR): Update accordingly.
* tree.h (wi::int_traits <const_tree>): Delete.
(wi::tree_to_widest_ref, wi::tree_to_offset_ref): New typedefs.
(wi::to_widest, wi::to_offset): Use them. Expand commentary.
(wi::tree_to_wide_ref): New typedef.
(wi::to_wide): New function.
* calls.c (get_size_range): Use wi::to_wide when operating on
trees as wide_ints.
* cgraph.c (cgraph_node::create_thunk): Likewise.
* config/i386/i386.c (ix86_data_alignment): Likewise.
(ix86_local_alignment): Likewise.
* dbxout.c (stabstr_O): Likewise.
* dwarf2out.c (add_scalar_info, gen_enumeration_type_die): Likewise.
* expr.c (const_vector_from_tree): Likewise.
* fold-const-call.c (host_size_t_cst_p, fold_const_call_1): Likewise.
* fold-const.c (may_negate_without_overflow_p, negate_expr_p)
(fold_negate_expr_1, int_const_binop_1, const_binop)
(fold_convert_const_int_from_real, optimize_bit_field_compare)
(all_ones_mask_p, sign_bit_p, unextend, extract_muldiv_1)
(fold_div_compare, fold_single_bit_test, fold_plusminus_mult_expr)
(pointer_may_wrap_p, expr_not_equal_to, fold_binary_loc)
(fold_ternary_loc, multiple_of_p, fold_negate_const, fold_abs_const)
(fold_not_const, round_up_loc): Likewise.
* gimple-fold.c (gimple_fold_indirect_ref): Likewise.
* gimple-ssa-warn-alloca.c (alloca_call_type_by_arg): Likewise.
(alloca_call_type): Likewise.
* gimple.c (preprocess_case_label_vec_for_gimple): Likewise.
* godump.c (go_output_typedef): Likewise.
* graphite-sese-to-poly.c (tree_int_to_gmp): Likewise.
* internal-fn.c (get_min_precision): Likewise.
* ipa-cp.c (ipcp_store_vr_results): Likewise.
* ipa-polymorphic-call.c
(ipa_polymorphic_call_context::ipa_polymorphic_call_context): Likewise.
* ipa-prop.c (ipa_print_node_jump_functions_for_edge): Likewise.
(ipa_modify_call_arguments): Likewise.
* match.pd: Likewise.
* omp-low.c (scan_omp_1_op, lower_omp_ordered_clauses): Likewise.
* print-tree.c (print_node_brief, print_node): Likewise.
* stmt.c (expand_case): Likewise.
* stor-layout.c (layout_type): Likewise.
* tree-affine.c (tree_to_aff_combination): Likewise.
* tree-cfg.c (group_case_labels_stmt): Likewise.
* tree-data-ref.c (dr_analyze_indices): Likewise.
(prune_runtime_alias_test_list): Likewise.
* tree-dump.c (dequeue_and_dump): Likewise.
* tree-inline.c (remap_gimple_op_r, copy_tree_body_r): Likewise.
* tree-predcom.c (is_inv_store_elimination_chain): Likewise.
* tree-pretty-print.c (dump_generic_node): Likewise.
* tree-scalar-evolution.c (iv_can_overflow_p): Likewise.
(simple_iv_with_niters): Likewise.
* tree-ssa-address.c (addr_for_mem_ref): Likewise.
* tree-ssa-ccp.c (ccp_finalize, evaluate_stmt): Likewise.
* tree-ssa-loop-ivopts.c (constant_multiple_of): Likewise.
* tree-ssa-loop-niter.c (split_to_var_and_offset)
(refine_value_range_using_guard, number_of_iterations_ne_max)
(number_of_iterations_lt_to_ne, number_of_iterations_lt)
(get_cst_init_from_scev, record_nonwrapping_iv)
(scev_var_range_cant_overflow): Likewise.
* tree-ssa-phiopt.c (minmax_replacement): Likewise.
* tree-ssa-pre.c (compute_avail): Likewise.
* tree-ssa-sccvn.c (vn_reference_fold_indirect): Likewise.
(vn_reference_maybe_forwprop_address, valueized_wider_op): Likewise.
* tree-ssa-structalias.c (get_constraint_for_ptr_offset): Likewise.
* tree-ssa-uninit.c (is_pred_expr_subset_of): Likewise.
* tree-ssanames.c (set_nonzero_bits, get_nonzero_bits): Likewise.
* tree-switch-conversion.c (collect_switch_conv_info, array_value_type)
(dump_case_nodes, try_switch_expansion): Likewise.
* tree-vect-loop-manip.c (vect_gen_vector_loop_niters): Likewise.
(vect_do_peeling): Likewise.
* tree-vect-patterns.c (vect_recog_bool_pattern): Likewise.
* tree-vect-stmts.c (vectorizable_load): Likewise.
* tree-vrp.c (compare_values_warnv, vrp_int_const_binop): Likewise.
(zero_nonzero_bits_from_vr, ranges_from_anti_range): Likewise.
(extract_range_from_binary_expr_1, adjust_range_with_scev): Likewise.
(overflow_comparison_p_1, register_edge_assert_for_2): Likewise.
(is_masked_range_test, find_switch_asserts, maybe_set_nonzero_bits)
(vrp_evaluate_conditional_warnv_with_ops, intersect_ranges): Likewise.
(range_fits_type_p, two_valued_val_range_p, vrp_finalize): Likewise.
(evrp_dom_walker::before_dom_children): Likewise.
* tree.c (cache_integer_cst, real_value_from_int_cst, integer_zerop)
(integer_all_onesp, integer_pow2p, integer_nonzerop, tree_log2)
(tree_floor_log2, tree_ctz, mem_ref_offset, tree_int_cst_sign_bit)
(tree_int_cst_sgn, get_unwidened, int_fits_type_p): Likewise.
(get_type_static_bounds, num_ending_zeros, drop_tree_overflow)
(get_range_pos_neg): Likewise.
* ubsan.c (ubsan_expand_ptr_ifn): Likewise.
* config/darwin.c (darwin_mergeable_constant_section): Likewise.
* config/aarch64/aarch64.c (aapcs_vfp_sub_candidate): Likewise.
* config/arm/arm.c (aapcs_vfp_sub_candidate): Likewise.
* config/avr/avr.c (avr_fold_builtin): Likewise.
* config/bfin/bfin.c (bfin_local_alignment): Likewise.
* config/msp430/msp430.c (msp430_attr): Likewise.
* config/nds32/nds32.c (nds32_insert_attributes): Likewise.
* config/powerpcspe/powerpcspe-c.c
(altivec_resolve_overloaded_builtin): Likewise.
* config/powerpcspe/powerpcspe.c (rs6000_aggregate_candidate)
(rs6000_expand_ternop_builtin): Likewise.
* config/rs6000/rs6000-c.c
(altivec_resolve_overloaded_builtin): Likewise.
* config/rs6000/rs6000.c (rs6000_aggregate_candidate): Likewise.
(rs6000_expand_ternop_builtin): Likewise.
* config/s390/s390.c (s390_handle_hotpatch_attribute): Likewise.
gcc/ada/
* gcc-interface/decl.c (annotate_value): Use wi::to_wide when
operating on trees as wide_ints.
gcc/c/
* c-parser.c (c_parser_cilk_clause_vectorlength): Use wi::to_wide when
operating on trees as wide_ints.
* c-typeck.c (build_c_cast, c_finish_omp_clauses): Likewise.
(c_tree_equal): Likewise.
gcc/c-family/
* c-ada-spec.c (dump_generic_ada_node): Use wi::to_wide when
operating on trees as wide_ints.
* c-common.c (pointer_int_sum): Likewise.
* c-pretty-print.c (pp_c_integer_constant): Likewise.
* c-warn.c (match_case_to_enum_1): Likewise.
(c_do_switch_warnings): Likewise.
(maybe_warn_shift_overflow): Likewise.
gcc/cp/
* cvt.c (ignore_overflows): Use wi::to_wide when
operating on trees as wide_ints.
* decl.c (check_array_designated_initializer): Likewise.
* mangle.c (write_integer_cst): Likewise.
* semantics.c (cp_finish_omp_clause_depend_sink): Likewise.
gcc/fortran/
* target-memory.c (gfc_interpret_logical): Use wi::to_wide when
operating on trees as wide_ints.
* trans-const.c (gfc_conv_tree_to_mpz): Likewise.
* trans-expr.c (gfc_conv_cst_int_power): Likewise.
* trans-intrinsic.c (trans_this_image): Likewise.
(gfc_conv_intrinsic_bound): Likewise.
(conv_intrinsic_cobound): Likewise.
gcc/lto/
* lto.c (compare_tree_sccs_1): Use wi::to_wide when
operating on trees as wide_ints.
gcc/objc/
* objc-act.c (objc_decl_method_attributes): Use wi::to_wide when
operating on trees as wide_ints.
From-SVN: r253595
|
|
copying loop nest with only one inner loop.
* tree-vect-loop-manip.c (rename_variables_in_bb): Rename PHI nodes
when copying loop nest with only one inner loop.
gcc/testsuite
* gcc.dg/tree-ssa/ldist-34.c: New test.
From-SVN: r253586
|
|
renaming variables in new preheader if it's deleted.
* tree-vect-loop-manip.c (slpeel_tree_duplicate_loop_to_edge_cfg): Skip
renaming variables in new preheader if it's deleted.
From-SVN: r253579
|
|
The vectoriser aligned vectors to TYPE_ALIGN unconditionally, although
there was also a hard-coded assumption that this was equal to the type
size. This was inconvenient for SVE for two reasons:
- When compiling for a specific power-of-2 SVE vector length, we might
want to align to a full vector. However, the TYPE_ALIGN is governed
by the ABI alignment, which is 128 bits regardless of size.
- For vector-length-agnostic code it doesn't usually make sense to align,
since the runtime vector length might not be a power of two. Even for
power of two sizes, there's no guarantee that aligning to the previous
16 bytes will be an improveent.
This patch therefore adds a target hook to control the preferred
vectoriser (as opposed to ABI) alignment.
2017-09-22 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* target.def (preferred_vector_alignment): New hook.
* doc/tm.texi.in (TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): New
hook.
* doc/tm.texi: Regenerate.
* targhooks.h (default_preferred_vector_alignment): Declare.
* targhooks.c (default_preferred_vector_alignment): New function.
* tree-vectorizer.h (dataref_aux): Add a target_alignment field.
Expand commentary.
(DR_TARGET_ALIGNMENT): New macro.
(aligned_access_p): Update commentary.
(vect_known_alignment_in_bytes): New function.
* tree-vect-data-refs.c (vect_calculate_required_alignment): New
function.
(vect_compute_data_ref_alignment): Set DR_TARGET_ALIGNMENT.
Calculate the misalignment based on the target alignment rather than
the vector size.
(vect_update_misalignment_for_peel): Use DR_TARGET_ALIGMENT
rather than TYPE_ALIGN / BITS_PER_UNIT to update the misalignment.
(vect_enhance_data_refs_alignment): Mask the byte misalignment with
the target alignment, rather than masking the element misalignment
with the number of elements in a vector. Also use the target
alignment when calculating the maximum number of peels.
(vect_find_same_alignment_drs): Use vect_calculate_required_alignment
instead of TYPE_ALIGN_UNIT.
(vect_duplicate_ssa_name_ptr_info): Remove stmt_info parameter.
Measure DR_MISALIGNMENT relative to DR_TARGET_ALIGNMENT.
(vect_create_addr_base_for_vector_ref): Update call accordingly.
(vect_create_data_ref_ptr): Likewise.
(vect_setup_realignment): Realign by ANDing with
-DR_TARGET_MISALIGNMENT.
* tree-vect-loop-manip.c (vect_gen_prolog_loop_niters): Calculate
the number of peels based on DR_TARGET_ALIGNMENT.
* tree-vect-stmts.c (get_group_load_store_type): Compare the gap
with the guaranteed alignment boundary when deciding whether
overrun is OK.
(vectorizable_mask_load_store): Interpret DR_MISALIGNMENT
relative to DR_TARGET_ALIGNMENT instead of TYPE_ALIGN_UNIT.
(ensure_base_align): Remove stmt_info parameter. Get the
target base alignment from DR_TARGET_ALIGNMENT.
(vectorizable_store): Update call accordingly. Interpret
DR_MISALIGNMENT relative to DR_TARGET_ALIGNMENT instead of
TYPE_ALIGN_UNIT.
(vectorizable_load): Likewise.
gcc/testsuite/
* gcc.dg/vect/vect-outer-3a.c: Adjust dump scan for new wording
of alignment message.
* gcc.dg/vect/vect-outer-3a-big-array.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r253101
|
|
This patch checks whether two data references x and y cannot
partially overlap and so are independent whenever &x != &y.
We can then use this in the vectoriser to optimise alias checks.
gcc/
2016-08-04 Richard Sandiford <richard.sandiford@linaro.org>
* hash-traits.h (pair_hash): New struct.
* tree-data-ref.h (data_dependence_relation): Add object_a and
object_b fields.
(DDR_OBJECT_A, DDR_OBJECT_B): New macros.
* tree-data-ref.c (initialize_data_dependence_relation): Initialize
DDR_OBJECT_A and DDR_OBJECT_B.
* tree-vectorizer.h (vec_object_pair): New type.
(_loop_vec_info): Add a check_unequal_addrs field.
(LOOP_VINFO_CHECK_UNEQUAL_ADDRS): New macro.
(LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Return true if there is an
entry in check_unequal_addrs. Check comp_alias_ddrs instead of
may_alias_ddrs.
* tree-vect-loop.c (destroy_loop_vec_info): Release
LOOP_VINFO_CHECK_UNEQUAL_ADDRS.
(vect_analyze_loop_2): Likewise, when restarting.
(vect_estimate_min_profitable_iters): Estimate the cost of
LOOP_VINFO_CHECK_UNEQUAL_ADDRS.
* tree-vect-data-refs.c: Include tree-hash-traits.h.
(vect_prune_runtime_alias_test_list): Try to handle conflicts
using LOOP_VINFO_CHECK_UNEQUAL_ADDRS, if the data dependence allows.
Count such tests in the final summary.
* tree-vect-loop-manip.c (chain_cond_expr): New function.
(vect_create_cond_for_align_checks): Use it.
(vect_create_cond_for_unequal_addrs): New function.
(vect_loop_versioning): Call it.
gcc/testsuite/
* gcc.dg/vect/vect-alias-check-6.c: New test.
From-SVN: r250868
|
|
2017-07-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/81303
* tree-vect-loop-manip.c (vect_loop_versioning): Build
profitability check against LOOP_VINFO_NITERSM1.
From-SVN: r250503
|
|
From-SVN: r249929
|
|
The variables that claimed to be the "minimum number of iterations" for
which vectorisation was profitable were actually the maximum number of
iterations for which vectorisation wasn't profitable. The loop threshold
was too.
This patch makes the values be what the variable names suggest.
2017-07-03 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-vect-loop.c (vect_analyze_loop_2): Treat min_scalar_loop_bound,
min_profitable_iters, and th as inclusive lower bounds.
Fix LOOP_VINFO_PEELING_FOR_GAPS condition.
(vect_estimate_min_profitable_iters): Return inclusive lower bounds
for min_profitable_iters and min_profitable_estimate.
(vect_transform_loop): Treat th as an inclusive lower bound.
* tree-vect-loop-manip.c (vect_loop_versioning): Likewise.
From-SVN: r249927
|
|
This patch replaces the individual stmt_vinfo dr_* fields with
an innermost_loop_behavior, so that the changes in later patches
get picked up automatically. It also adds a helper function for
getting the behavior of a data reference wrt the vectorised loop.
2017-07-03 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-vectorizer.h (_stmt_vec_info): Replace individual dr_*
fields with dr_wrt_vec_loop.
(STMT_VINFO_DR_BASE_ADDRESS, STMT_VINFO_DR_INIT, STMT_VINFO_DR_OFFSET)
(STMT_VINFO_DR_STEP, STMT_VINFO_DR_ALIGNED_TO): Update accordingly.
(STMT_VINFO_DR_WRT_VEC_LOOP): New macro.
(vect_dr_behavior): New function.
(vect_create_addr_base_for_vector_ref): Remove loop parameter.
* tree-vect-data-refs.c (vect_compute_data_ref_alignment): Use
vect_dr_behavior. Use a step_preserves_misalignment_p boolean to
track whether the step preserves the misalignment.
(vect_create_addr_base_for_vector_ref): Remove loop parameter.
Use vect_dr_behavior.
(vect_setup_realignment): Update call accordingly.
(vect_create_data_ref_ptr): Likewise. Use vect_dr_behavior.
* tree-vect-loop-manip.c (vect_gen_prolog_loop_niters): Update
call to vect_create_addr_base_for_vector_ref.
(vect_create_cond_for_align_checks): Likewise.
* tree-vect-patterns.c (vect_recog_bool_pattern): Copy
STMT_VINFO_DR_WRT_VEC_LOOP as a block.
(vect_recog_mask_conversion_pattern): Likewise.
* tree-vect-stmts.c (compare_step_with_zero): Use vect_dr_behavior.
(new_stmt_vec_info): Remove redundant zeroing.
From-SVN: r249911
|
|
* cfg.c (scale_bbs_frequencies): New function.
* cfg.h (scale_bbs_frequencies): Declare it.
* cfgloopanal.c (single_likely_exit): Cleanup.
* cfgloopmanip.c (scale_loop_frequencies): Take profile_probability
as parameter.
(scale_loop_profile): Likewise.
(loop_version): Likewise.
(create_empty_loop_on_edge): Update.
* cfgloopmanip.h (scale_loop_frequencies, scale_loop_profile,
scale_loop_frequencies, scale_loop_profile, loopify,
loop_version): Update prototypes.
* modulo-sched.c (sms_schedule): Update.
* predict.c (unlikely_executed_edge_p): Also check probability.
(probably_never_executed_edge_p): Fix typo.
* tree-if-conv.c (version_loop_for_if_conversion): Update.
* tree-parloops.c (gen_parallel_loop): Update.
* tree-ssa-loop-ivcanon.c (try_peel_loop): Update.
* tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Update.
* tree-ssa-loop-split.c (split_loop): Update.
* tree-ssa-loop-unswitch.c (tree_unswitch_loop): Update.
* tree-vect-loop-manip.c (vect_do_peeling): Update.
(vect_loop_versioning): Update.
* tree-vect-loop.c (scale_profile_for_vect_loop): Update.
From-SVN: r249872
|
|
* asan.c (asan_emit_stack_protection): Update.
(create_cond_insert_point): Update.
* auto-profile.c (afdo_propagate_circuit): Update.
* basic-block.h (struct edge_def): Turn probability to
profile_probability.
(EDGE_FREQUENCY): Update.
* bb-reorder.c (find_traces_1_round): Update.
(better_edge_p): Update.
(sanitize_hot_paths): Update.
* cfg.c (unchecked_make_edge): Initialize probability to uninitialized.
(make_single_succ_edge): Update.
(check_bb_profile): Update.
(dump_edge_info): Update.
(update_bb_profile_for_threading): Update.
* cfganal.c (connect_infinite_loops_to_exit): Initialize new edge
probabilitycount to 0.
* cfgbuild.c (compute_outgoing_frequencies): Update.
* cfgcleanup.c (try_forward_edges): Update.
(outgoing_edges_match): Update.
(try_crossjump_to_edge): Update.
* cfgexpand.c (expand_gimple_cond): Update make_single_succ_edge.
(expand_gimple_tailcall): Update.
(construct_init_block): Use make_single_succ_edge.
(construct_exit_block): Use make_single_succ_edge.
* cfghooks.c (verify_flow_info): Update.
(redirect_edge_succ_nodup): Update.
(split_edge): Update.
(account_profile_record): Update.
* cfgloopanal.c (single_likely_exit): Update.
* cfgloopmanip.c (scale_loop_profile): Update.
(set_zero_probability): Remove.
(duplicate_loop_to_header_edge): Update.
* cfgloopmanip.h (loop_version): Update prototype.
* cfgrtl.c (try_redirect_by_replacing_jump): Update.
(force_nonfallthru_and_redirect): Update.
(update_br_prob_note): Update.
(rtl_verify_edges): Update.
(purge_dead_edges): Update.
(rtl_lv_add_condition_to_bb): Update.
* cgraph.c: (cgraph_edge::redirect_call_stmt_to_calle): Update.
* cgraphunit.c (init_lowered_empty_function): Update.
(cgraph_node::expand_thunk): Update.
* cilk-common.c: Include profile-count.h
* dojump.c (inv): Remove.
(jumpifnot): Update.
(jumpifnot_1): Update.
(do_jump_1): Update.
(do_jump): Update.
(do_jump_by_parts_greater_rtx): Update.
(do_compare_rtx_and_jump): Update.
* dojump.h (jumpifnot, jumpifnot_1, jumpif_1, jumpif, do_jump,
do_jump_1. do_compare_rtx_and_jump): Update prototype.
* dwarf2cfi.c: Include profile-count.h
* except.c (dw2_build_landing_pads): Use make_single_succ_edge.
(sjlj_emit_dispatch_table): Likewise.
* explow.c: Include profile-count.h
* expmed.c (emit_store_flag_force): Update.
(do_cmp_and_jump): Update.
* expr.c (compare_by_pieces_d::generate): Update.
(compare_by_pieces_d::finish_mode): Update.
(emit_block_move_via_loop): Update.
(store_expr_with_bounds): Update.
(store_constructor): Update.
(expand_expr_real_2): Update.
(expand_expr_real_1): Update.
* expr.h (try_casesi, try_tablejump): Update prototypes.
* gimple-pretty-print.c (dump_probability): Update.
(dump_profile): New.
(dump_gimple_label): Update.
(dump_gimple_bb_header): Update.
* graph.c (draw_cfg_node_succ_edges): Update.
* hsa-gen.c (convert_switch_statements): Update.
* ifcvt.c (cheap_bb_rtx_cost_p): Update.
(find_if_case_1): Update.
(find_if_case_2): Update.
* internal-fn.c (expand_arith_overflow_result_store): Update.
(expand_addsub_overflow): Update.
(expand_neg_overflow): Update.
(expand_mul_overflow): Update.
(expand_vector_ubsan_overflow): Update.
* ipa-cp.c (good_cloning_opportunity_p): Update.
* ipa-split.c (split_function): Use make_single_succ_edge.
* ipa-utils.c (ipa_merge_profiles): Update.
* loop-doloop.c (add_test): Update.
(doloop_modify): Update.
* loop-unroll.c (compare_and_jump_seq): Update.
(unroll_loop_runtime_iterations): Update.
* lra-constraints.c (lra_inheritance): Update.
* lto-streamer-in.c (input_cfg): Update.
* lto-streamer-out.c (output_cfg): Update.
* mcf.c (adjust_cfg_counts): Update.
* modulo-sched.c (sms_schedule): Update.
* omp-expand.c (expand_omp_for_init_counts): Update.
(extract_omp_for_update_vars): Update.
(expand_omp_ordered_sink): Update.
(expand_omp_for_ordered_loops): Update.
(expand_omp_for_generic): Update.
(expand_omp_for_static_nochunk): Update.
(expand_omp_for_static_chunk): Update.
(expand_cilk_for): Update.
(expand_omp_simd): Update.
(expand_omp_taskloop_for_outer): Update.
(expand_omp_taskloop_for_inner): Update.
* omp-simd-clone.c (simd_clone_adjust): Update.
* optabs.c (expand_doubleword_shift): Update.
(expand_abs): Update.
(emit_cmp_and_jump_insn_1): Update.
(expand_compare_and_swap_loop): Update.
* optabs.h (emit_cmp_and_jump_insns): Update prototype.
* predict.c (predictable_edge_p): Update.
(edge_probability_reliable_p): Update.
(set_even_probabilities): Update.
(combine_predictions_for_insn): Update.
(combine_predictions_for_bb): Update.
(propagate_freq): Update.
(estimate_bb_frequencies): Update.
(force_edge_cold): Update.
* profile-count.c (profile_count::dump): Add missing space into dump.
(profile_count::debug): Add newline.
(profile_count::differs_from_p): Explicitly convert to unsigned.
(profile_count::stream_in): Update.
(profile_probability::dump): New member function.
(profile_probability::debug): New member function.
(profile_probability::differs_from_p): New member function.
(profile_probability::differs_lot_from_p): New member function.
(profile_probability::stream_in): New member function.
(profile_probability::stream_out): New member function.
* profile-count.h (profile_count_quality): Rename to ...
(profile_quality): ... this one.
(profile_probability): New.
(profile_count): Update.
* profile.c (compute_branch_probabilities): Update.
* recog.c (peep2_attempt): Update.
* sched-ebb.c (schedule_ebbs): Update.
* sched-rgn.c (find_single_block_region): Update.
(compute_dom_prob_ps): Update.
(schedule_region): Update.
* sel-sched-ir.c (compute_succs_info): Update.
* stmt.c (struct case_node): Update.
(do_jump_if_equal): Update.
(get_outgoing_edge_probs): Update.
(conditional_probability): Update.
(emit_case_dispatch_table): Update.
(expand_case): Update.
(expand_sjlj_dispatch_table): Update.
(emit_case_nodes): Update.
* targhooks.c: Update.
* tracer.c (better_p): Update.
(find_best_successor): Update.
* trans-mem.c (expand_transaction): Update.
* tree-call-cdce.c: Update.
* tree-cfg.c (gimple_split_edge): Upate.
(move_sese_region_to_fn): Upate.
* tree-cfgcleanup.c (cleanup_control_expr_graph): Upate.
* tree-eh.c (lower_resx): Upate.
(cleanup_empty_eh_move_lp): Upate.
* tree-if-conv.c (version_loop_for_if_conversion): Update.
* tree-inline.c (copy_edges_for_bb): Update.
(copy_cfg_body): Update.
* tree-parloops.c (gen_parallel_loop): Update.
* tree-profile.c (gimple_gen_ic_func_profiler): Update.
(gimple_gen_time_profiler): Update.
* tree-ssa-dce.c (remove_dead_stmt): Update.
* tree-ssa-ifcombine.c (update_profile_after_ifcombine): Update.
* tree-ssa-loop-im.c (execute_sm_if_changed): Update.
* tree-ssa-loop-ivcanon.c (remove_exits_and_undefined_stmts): Update.
(unloop_loops): Update.
(try_peel_loop): Update.
* tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Update.
* tree-ssa-loop-split.c (connect_loops): Update.
(split_loop): Update.
* tree-ssa-loop-unswitch.c (tree_unswitch_loop): Update.
(hoist_guard): Update.
* tree-ssa-phionlycprop.c (propagate_rhs_into_lhs): Update.
* tree-ssa-phiopt.c (replace_phi_edge_with_variable): Update.
(value_replacement): Update.
* tree-ssa-reassoc.c (branch_fixup): Update.
* tree-ssa-tail-merge.c (replace_block_by): Update.
* tree-ssa-threadupdate.c (remove_ctrl_stmt_and_useless_edges): Update.
(create_edge_and_update_destination_phis): Update.
(compute_path_counts): Update.
(recompute_probabilities): Update.
(update_joiner_offpath_counts): Update.
(freqs_to_counts_path): Update.
(duplicate_thread_path): Update.
* tree-switch-conversion.c (hoist_edge_and_branch_if_true): Update.
(struct switch_conv_info): Update.
(gen_inbound_check): Update.
* tree-vect-loop-manip.c (slpeel_add_loop_guard): Update.
(vect_do_peeling): Update.
(vect_loop_versioning): Update.
* tree-vect-loop.c (scale_profile_for_vect_loop): Update.
(optimize_mask_stores): Update.
* ubsan.c (ubsan_expand_null_ifn): Update.
* value-prof.c (gimple_divmod_fixed_value): Update.
(gimple_divmod_fixed_value_transform): Update.
(gimple_mod_pow2): Update.
(gimple_mod_pow2_value_transform): Update.
(gimple_mod_subtract): Update.
(gimple_mod_subtract_transform): Update.
(gimple_ic): Update.
(gimple_stringop_fixed_value): Update.
(gimple_stringops_transform): Update.
* value-prof.h: Update.
From-SVN: r249800
|
|
versioning is required.
* tree-vect-loop-manip.c (vect_do_peeling): Don't skip vector loop
if versioning is required.
* tree-vect-loop.c (vect_analyze_loop_2): Merge niter check for loop
peeling with the check for versioning.
From-SVN: r248959
|
|
* tree-vectorizer.h (vect_build_loop_niters): New parameter.
* tree-vect-loop-manip.c (vect_build_loop_niters): New parameter.
Set true to new parameter if new ssa variable is defined.
(vect_gen_vector_loop_niters): Refactor. Set range information
for the new vector loop bound variable.
(vect_do_peeling): Ditto.
From-SVN: r248958
|
|
2017-05-23 Jan Hubicka <hubicka@ucw.cz>
* config/i386/i386.c (make_resolver_func): Update.
* Makefile.in: Add profile-count.h and profile-count.o
* auto-profile.c (afdo_indirect_call): Update to new API.
(afdo_set_bb_count): Update.
(afdo_propagate_edge): Update.
(afdo_propagate_circuit): Update.
(afdo_calculate_branch_prob): Update.
(afdo_annotate_cfg): Update.
* basic-block.h: Include profile-count.h
(struct edge_def): Turn count to profile_count.
(struct basic_block_def): Likewie.
(REG_BR_PROB_BASE): Move to profile-count.h
(RDIV): Move to profile-count.h
* bb-reorder.c (max_entry_count): Turn to profile_count.
(find_traces): Update.
(rotate_loop):Update.
(connect_traces):Update.
(sanitize_hot_paths):Update.
* bt-load.c (migrate_btr_defs): Update.
* cfg.c (RDIV): Remove.
(init_flow): Use alloc_block.
(alloc_block): Uninitialize count.
(unchecked_make_edge): Uninitialize count.
(check_bb_profile): Update.
(dump_edge_info): Update.
(dump_bb_info): Update.
(update_bb_profile_for_threading): Update.
(scale_bbs_frequencies_int): Update.
(scale_bbs_frequencies_gcov_type): Update.
(scale_bbs_frequencies_profile_count): New.
* cfg.h (update_bb_profile_for_threading): Update.
(scale_bbs_frequencies_profile_count): Declare.
* cfgbuild.c (compute_outgoing_frequencies): Update.
(find_many_sub_basic_blocks): Update.
* cfgcleanup.c (try_forward_edges): Update.
(try_crossjump_to_edge): Update.
* cfgexpand.c (expand_gimple_tailcall): Update.
(construct_exit_block): Update.
* cfghooks.c (verify_flow_info): Update.
(dump_bb_for_graph): Update.
(split_edge): Update.
(make_forwarder_block): Update.
(duplicate_block): Update.
(account_profile_record): Update.
* cfgloop.c (find_subloop_latch_edge_by_profile): Update.
(get_estimated_loop_iterations): Update.
* cfgloopanal.c (expected_loop_iterations_unbounded): Update.
(single_likely_exit): Update.
* cfgloopmanip.c (scale_loop_profile): Update.
(loopify): Update.
(set_zero_probability): Update.
(lv_adjust_loop_entry_edge): Update.
* cfgrtl.c (force_nonfallthru_and_redirect): Update.
(purge_dead_edges): Update.
(rtl_account_profile_record): Update.
* cgraph.c (cgraph_node::create): Uninitialize count.
(symbol_table::create_edge): Uninitialize count.
(cgraph_update_edges_for_call_stmt_node): Update.
(cgraph_edge::dump_edge_flags): Update.
(cgraph_node::dump): Update.
(cgraph_edge::maybe_hot_p): Update.
* cgraph.h: Include profile-count.h
(create_clone), create_edge, create_indirect_edge): Update.
(cgraph_node): Turn count to profile_count.
(cgraph_edge0: Likewise.
(make_speculative, clone): Update.
(create_edge): Update.
(init_lowered_empty_function): Update.
* cgraphclones.c (cgraph_edge::clone): Update.
(duplicate_thunk_for_node): Update.
(cgraph_node::create_clone): Update.
* cgraphunit.c (cgraph_node::analyze): Update.
(cgraph_node::expand_thunk): Update.
* final.c (dump_basic_block_info): Update.
* gimple-streamer-in.c (input_bb): Update.
* gimple-streamer-out.c (output_bb): Update.
* graphite.c (print_global_statistics): Update.
(print_graphite_scop_statistics): Update.
* hsa-brig.c: Include basic-block.h.
* hsa-dump.c: Include basic-block.h.
* hsa-gen.c (T sum_slice): Update.
(convert_switch_statements):Update.
* hsa-regalloc.c: Include basic-block.h.
* ipa-chkp.c (chkp_produce_thunks): Update.
* ipa-cp.c (struct caller_statistics): Update.
(init_caller_stats): Update.
(gather_caller_stats): Update.
(ipcp_cloning_candidate_p): Update.
(good_cloning_opportunity_p): Update.
(get_info_about_necessary_edges): Update.
(dump_profile_updates): Update.
(update_profiling_info): Update.
(update_specialized_profile): Update.
(perhaps_add_new_callers): Update.
(decide_about_value): Update.
(ipa_cp_c_finalize): Update.
* ipa-devirt.c (struct odr_type_warn_count): Update.
(struct decl_warn_count): Update.
(struct final_warning_record): Update.
(possible_polymorphic_call_targets): Update.
(ipa_devirt): Update.
* ipa-fnsummary.c (redirect_to_unreachable): Update.
* ipa-icf.c (sem_function::merge): Update.
* ipa-inline-analysis.c (do_estimate_edge_time): Update.
* ipa-inline.c (compute_uninlined_call_time): Update.
(compute_inlined_call_time): Update.
(want_inline_small_function_p): Update.
(want_inline_self_recursive_call_p): Update.
(edge_badness): Update.
(lookup_recursive_calls): Update.
(recursive_inlining): Update.
(inline_small_functions): Update.
(dump_overall_stats): Update.
(dump_inline_stats): Update.
* ipa-profile.c (ipa_profile_generate_summary): Update.
(ipa_propagate_frequency): Update.
(ipa_profile): Update.
* ipa-prop.c (ipa_make_edge_direct_to_target): Update.
* ipa-utils.c (ipa_merge_profiles): Update.
* loop-doloop.c (doloop_modify): Update.
* loop-unroll.c (report_unroll): Update.
(unroll_loop_runtime_iterations): Update.
* lto-cgraph.c (lto_output_edge): Update.
(lto_output_node): Update.
(input_node): Update.
(input_edge): Update.
(merge_profile_summaries): Update.
* lto-streamer-in.c (input_cfg): Update.
* lto-streamer-out.c (output_cfg): Update.
* mcf.c (create_fixup_graph): Update.
(adjust_cfg_counts): Update.
(sum_edge_counts): Update.
* modulo-sched.c (sms_schedule): Update.
* postreload-gcse.c (eliminate_partially_redundant_load): Update.
* predict.c (maybe_hot_count_p): Update.
(probably_never_executed): Update.
(dump_prediction): Update.
(combine_predictions_for_bb): Update.
(propagate_freq): Update.
(handle_missing_profiles): Update.
(counts_to_freqs): Update.
(rebuild_frequencies): Update.
(force_edge_cold): Update.
* predict.h: Include profile-count.h
(maybe_hot_count_p, counts_to_freqs): UPdate.
* print-rtl-function.c: Do not include cfg.h
* print-rtl.c: Include basic-block.h
* profile-count.c: New file.
* profile-count.h: New file.
* profile.c (is_edge_inconsistent): Update.
(correct_negative_edge_counts): Update.
(is_inconsistent): Update.
(set_bb_counts): Update.
(read_profile_edge_counts): Update.
(compute_frequency_overlap): Update.
(compute_branch_probabilities): Update; Initialize and deinitialize
gcov_count tables.
(branch_prob): Update.
* profile.h (bb_gcov_counts, edge_gcov_counts): New.
(edge_gcov_count): New.
(bb_gcov_count): New.
* shrink-wrap.c (try_shrink_wrapping): Update.
* tracer.c (better_p): Update.
* trans-mem.c (expand_transaction): Update.
(ipa_tm_insert_irr_call): Update.
(ipa_tm_insert_gettmclone_call): Update.
* tree-call-cdce.c: Update.
* tree-cfg.c (gimple_duplicate_sese_region): Update.
(gimple_duplicate_sese_tail): Update.
(gimple_account_profile_record): Update.
(execute_fixup_cfg): Update.
* tree-inline.c (copy_bb): Update.
(copy_edges_for_bb): Update.
(initialize_cfun): Update.
(freqs_to_counts): Update.
(copy_cfg_body): Update.
(expand_call_inline): Update.
* tree-ssa-ifcombine.c (update_profile_after_ifcombine): Update.
* tree-ssa-loop-ivcanon.c (unloop_loops): Update.
(try_unroll_loop_completely): Update.
(try_peel_loop): Update.
* tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Update.
* tree-ssa-loop-niter.c (estimate_numbers_of_iterations_loop): Update.
* tree-ssa-loop-split.c (connect_loops): Update.
* tree-ssa-loop-unswitch.c (hoist_guard): Update.
* tree-ssa-reassoc.c (branch_fixup): Update.
* tree-ssa-tail-merge.c (replace_block_by): Update.
* tree-ssa-threadupdate.c (create_block_for_threading): Update.
(compute_path_counts): Update.
(update_profile): Update.
(recompute_probabilities): Update.
(update_joiner_offpath_counts): Update.
(estimated_freqs_path): Update.
(freqs_to_counts_path): Update.
(clear_counts_path): Update.
(ssa_fix_duplicate_block_edges): Update.
(duplicate_thread_path): Update.
* tree-switch-conversion.c (case_bit_test_cmp): Update.
(struct switch_conv_info): Update.
* tree-tailcall.c (decrease_profile): Update.
* tree-vect-loop-manip.c (slpeel_add_loop_guard): Update.
* tree-vect-loop.c (scale_profile_for_vect_loop): Update.
* value-prof.c (check_counter): Update.
(gimple_divmod_fixed_value): Update.
(gimple_mod_pow2): Update.
(gimple_mod_subtract): Update.
(gimple_ic_transform): Update.
(gimple_stringop_fixed_value): Update.
* value-prof.h (gimple_ic): Update.
* gcc.dg/tree-ssa/attr-hotcold-2.c: Update template.
From-SVN: r248863
|
|
(create_intersect_range_checks): Move from ...
* tree-data-ref.c (create_intersect_range_checks_index)
(create_intersect_range_checks): ... to here.
(create_runtime_alias_checks): New function factored from ...
* tree-vect-loop-manip.c (vect_create_cond_for_alias_checks): ...
here. Call above function.
* tree-data-ref.h (create_runtime_alias_checks): New function.
From-SVN: r248726
|
|
parameter loop, rather than loop_vinfo.
* tree-vect-loop-manip.c (create_intersect_range_checks_index): Pass
in parameter loop, rather than loop_vinfo.
(create_intersect_range_checks): Ditto.
(vect_create_cond_for_alias_checks): Update call to above functions.
From-SVN: r248513
|