Age | Commit message (Collapse) | Author | Files | Lines |
|
When versioning is run the IL is already mangled and finding
a VECTORIZED_CALL IFN can fail.
2020-01-20 Richard Biener <rguenther@suse.de>
PR tree-optimization/93094
* tree-vectorizer.h (vect_loop_versioning): Adjust.
(vect_transform_loop): Likewise.
* tree-vectorizer.c (try_vectorize_loop_1): Pass down
loop_vectorized_call to vect_transform_loop.
* tree-vect-loop.c (vect_transform_loop): Pass down
loop_vectorized_call to vect_loop_versioning.
* tree-vect-loop-manip.c (vect_loop_versioning): Use
the earlier discovered loop_vectorized_call.
* gcc.dg/vect/pr93094.c: New testcase.
|
|
gcc/ChangeLog:
2020-01-10 Andre Vieira <andre.simoesdiasvieira@arm.com>
* tree-vect-data-refs.c (vect_create_addr_base_for_vector_ref): Use
get_dr_vinfo_offset
* tree-vect-loop.c (update_epilogue_loop_vinfo): Remove orig_drs_init
parameter and its use to reset DR_OFFSET's.
(vect_transform_loop): Remove orig_drs_init argument.
* tree-vect-loop-manip.c (vect_update_init_of_dr): Update the offset
member of dr_vec_info rather than the offset of the associated
data_reference's innermost_loop_behavior.
(vect_update_init_of_dr): Pass dr_vec_info instead of data_reference.
(vect_do_peeling): Remove orig_drs_init parameter and its construction.
* tree-vect-stmts.c (check_scan_store): Replace use of DR_OFFSET with
get_dr_vinfo_offset.
(vectorizable_store): Likewise.
(vectorizable_load): Likewise.
From-SVN: r280107
|
|
From-SVN: r279813
|
|
This patch fixes an awkward corner case in which:
(a) we apply if-conversion to a loop;
(b) the original scalar loop doesn't have a vdef, and thus doesn't
need a virtual phi;
(c) the vectorised main loop does need a vdef and a virtual phi (see below);
(d) we also vectorise the epilogue; and
(e) the vectorised epilogue still needs a scalar epilogue
The specific case in which (c) applies is if a read-only loop is
vectorised using IFN_LOAD_LANES, which uses clobber statements to
mark the lifetime of the temporary array.
The vectoriser relies on the SSA renamer to update virtual operands.
All would probably be well if it postponed this update until after
it had vectorised both the main loop and the epilogue loop. However,
when vectorising the epilogue, vect_do_peeling does:
create_lcssa_for_virtual_phi (loop);
update_ssa (TODO_update_ssa_only_virtuals);
(with "loop" in this case being the to-be-vectorised epilogue loop).
So the vectoriser puts the virtual operand into SSA form for the
vectorised main loop as a separate step, during the early stages
of vectorising the epilogue.
I wasn't sure at first why that update_ssa was there. It looked
initially like it was related to create_lcssa_for_virtual_phi,
which seemed strange when create_lcssa_for_virtual_phi keeps the
SSA form up-to-date. But before r241099 it had the following comment,
which AFAICT is still the reason:
/* We might have a queued need to update virtual SSA form. As we
delete the update SSA machinery below after doing a regular
incremental SSA update during loop copying make sure we don't
lose that fact.
??? Needing to update virtual SSA form by renaming is unfortunate
but not all of the vectorizer code inserting new loads / stores
properly assigns virtual operands to those statements. */
The patch restores that comment since IMO it's helpful.
(a), (d) and (e) mean that we copy the original un-if-converted scalar
loop to act as the scalar epilogue. The update_ssa above means that this
copying needs to cope with any new virtual SSA names in the main loop.
The code to do that (reasonably) assumed that one of two things was true:
(1) the scalar loop and the vector loops don't have vdefs, and so no
virtual operand update is needed. The definition that applies
on entry to the loops is the same in all cases.
(2) the scalar loop and the vector loops have virtual phis, and so --
after applying create_lcssa_for_virtual_phi on the to-be-vectorised
epilogue loop -- the virtual operand update can be handled in the
same way as for normal SSA names.
But (b) and (c) together mean that the scalar loop and the
still-to-be-vectorised epilogue loop have no virtual phi that (2)
can use. We'd therefore keep the original vuses when duplicating,
rather than updating them to the definition that applies on exit
from the epilogue loop. (Since the epilogue is still unvectorised
and has no vdefs, the definition that applies on exit is the same
as the one that applies on entry.)
This patch therefore adds a third case: the scalar loop and
to-be-vectorised epilogue have no virtual defs, but the main loop does.
2019-12-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop-manip.c (create_lcssa_for_virtual_phi): Return
the incoming virtual operand definition.
(vect_do_peeling): When vectorizing an epilogue loop, handle the
case in which the main loop has a virtual phi and the epilogue
and scalar loops don't. Restore an earlier comment about the
update_ssa call.
gcc/testsuite/
* gcc.dg/vect/vect-epilogues-2.c: New test.
From-SVN: r279802
|
|
This patch replaces vec_info::vector_size with vec_info::vector_mode,
but for now continues to use it as a way of specifying a single
vector size. This makes it easier for later patches to use
related_vector_mode instead.
2019-11-14 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (vec_info::vector_size): Replace with...
(vec_info::vector_mode): ...this new field.
* tree-vect-loop.c (vect_update_vf_for_slp): Update accordingly.
(vect_analyze_loop, vect_transform_loop): Likewise.
* tree-vect-loop-manip.c (vect_do_peeling): Likewise.
* tree-vect-slp.c (can_duplicate_and_interleave_p): Likewise.
(vect_make_slp_decision, vect_slp_bb_region): Likewise.
* tree-vect-stmts.c (get_vectype_for_scalar_type): Likewise.
* tree-vectorizer.c (try_vectorize_loop_1): Likewise.
gcc/testsuite/
* gcc.dg/vect/vect-tail-nomask-1.c: Update expected epilogue
vectorization message.
From-SVN: r278237
|
|
Callers of vect_halve_mask_nunits and vect_double_mask_nunits
already know what mode the resulting vector type should have,
so we might as well create the vector type directly with that mode,
just like build_vector_type_for_mode lets us build normal vectors
with a known mode. This avoids the current awkwardness of having
to recompute the mode starting from vec_info::vector_size, which
hard-codes the assumption that all vectors have to be the same size.
A later patch gets rid of build_truth_vector_type and
build_same_sized_truth_vector_type, so the net effect of the
series is to reduce the number of type functions by one.
2019-11-14 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree.h (build_truth_vector_type_for_mode): Declare.
* tree.c (build_truth_vector_type_for_mode): New function,
split out from...
(build_truth_vector_type): ...here.
(build_opaque_vector_type): Fix head comment.
* tree-vectorizer.h (supportable_narrowing_operation): Remove
vec_info parameter.
(vect_halve_mask_nunits): Replace vec_info parameter with the
mode of the new vector.
(vect_double_mask_nunits): Likewise.
* tree-vect-loop.c (vect_halve_mask_nunits): Likewise.
(vect_double_mask_nunits): Likewise.
* tree-vect-loop-manip.c: Include insn-config.h, rtl.h and recog.h.
(vect_maybe_permute_loop_masks): Remove vinfo parameter. Update call
to vect_halve_mask_nunits, getting the required mode from the unpack
patterns.
(vect_set_loop_condition_masked): Update call accordingly.
* tree-vect-stmts.c (supportable_narrowing_operation): Remove vec_info
parameter and update call to vect_double_mask_nunits.
(vectorizable_conversion): Update call accordingly.
(simple_integer_narrowing): Likewise. Remove vec_info parameter.
(vectorizable_call): Update call accordingly.
(supportable_widening_operation): Update call to
vect_halve_mask_nunits.
* config/aarch64/aarch64-sve-builtins.cc (register_builtin_types):
Use build_truth_vector_type_mode instead of build_truth_vector_type.
From-SVN: r278231
|
|
vect_analyze_loop_costing uses two profitability thresholds: a runtime
one and a static compile-time one. The runtime one is simply the point
at which the vector loop is cheaper than the scalar loop, while the
static one also takes into account the cost of choosing between the
scalar and vector loops at runtime. We compare this static cost against
the expected execution frequency to decide whether it's worth generating
any vector code at all.
However, we never reclaimed the cost of applying the runtime threshold
if it turned out that the vector code can always be used. And we only
know whether that's true once we've calculated what the runtime
threshold would be.
2019-11-13 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (vect_apply_runtime_profitability_check_p):
New function.
* tree-vect-loop-manip.c (vect_loop_versioning): Use it.
* tree-vect-loop.c (vect_analyze_loop_2): Likewise.
(vect_transform_loop): Likewise.
(vect_analyze_loop_costing): Don't take the cost of versioning
into account for the static profitability threshold if it turns
out that no versioning is needed.
From-SVN: r278124
|
|
niters for epilogue
gcc/ChangeLog:
2019-11-11 Andre Vieira <andre.simoesdiasvieira@arm.com>
* tree-vect-loop-manip.c (vect_do_peeling): Take epilogue gaps into
account when checking if there are enough iterations to vectorize
epilogue.
gcc/testsuite/ChangeLog:
2019-11-11 Andre Vieira <andre.simoesdiasvieira@arm.com>
* gcc.dg/vect/vect-reduc-epilogue-gaps.c: New test.
From-SVN: r278049
|
|
gcc/ChangeLog:
2019-11-06 Andre Vieira <andre.simoesdiasvieira@arm.com>
PR tree-optimization/92317
* tree-vect-loop-manip.c (slpeel_update_phi_nodes_for_guard2): Also
update phi's with constant phi arguments.
gcc/testsuite/ChangeLog:
2019-11-06 Andre Vieira <andre.simoesdiasvieira@arm.com>
PR tree-optimization/92317
* gcc/testsuite/g++.dg/opt/pr92317.C: New test.
From-SVN: r277877
|
|
dominate use in block 15 since r277566)
2019-10-30 Richard Biener <rguenther@suse.de>
PR tree-optimization/92275
* tree-vect-loop-manip.c (slpeel_update_phi_nodes_for_loops):
Copy all loop-closed PHIs.
* gcc.dg/torture/pr92275.c: New testcase.
From-SVN: r277621
|
|
gcc/ChangeLog:
2019-10-29 Andre Vieira <andre.simoesdiasvieira@arm.com>
PR 88915
* tree-ssa-loop-niter.h (simplify_replace_tree): Change declaration.
* tree-ssa-loop-niter.c (simplify_replace_tree): Add context parameter
and make the valueize function pointer also take a void pointer.
* gcc/tree-ssa-sccvn.c (vn_valueize_wrapper): New function to wrap
around vn_valueize, to call it without a context.
(process_bb): Use vn_valueize_wrapper instead of vn_valueize.
* tree-vect-loop.c (_loop_vec_info): Initialize epilogue_vinfos.
(~_loop_vec_info): Release epilogue_vinfos.
(vect_analyze_loop_costing): Use knowledge of main VF to estimate
number of iterations of epilogue.
(vect_analyze_loop_2): Adapt to analyse main loop for all supported
vector sizes when vect-epilogues-nomask=1. Also keep track of lowest
versioning threshold needed for main loop.
(vect_analyze_loop): Likewise.
(find_in_mapping): New helper function.
(update_epilogue_loop_vinfo): New function.
(vect_transform_loop): When vectorizing epilogues re-use analysis done
on main loop and call update_epilogue_loop_vinfo to update it.
* tree-vect-loop-manip.c (vect_update_inits_of_drs): No longer insert
stmts on loop preheader edge.
(vect_do_peeling): Enable skip-vectors when doing loop versioning if
we decided to vectorize epilogues. Update epilogues NITERS and
construct ADVANCE to update epilogues data references where needed.
* tree-vectorizer.h (_loop_vec_info): Add epilogue_vinfos.
(vect_do_peeling, vect_update_inits_of_drs,
determine_peel_for_niter, vect_analyze_loop): Add or update
declarations.
* tree-vectorizer.c (try_vectorize_loop_1): Make sure to use already
created loop_vec_info's for epilogues when available. Otherwise analyse
epilogue separately.
From-SVN: r277569
|
|
2019-10-21 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (vect_halve_mask_nunits): Take a vec_info.
* tree-vect-loop.c (vect_halve_mask_nunits): Likewise.
* tree-vect-loop-manip.c (vect_maybe_permute_loop_masks): Update
call accordingly.
* tree-vect-stmts.c (supportable_widening_operation): Likewise.
From-SVN: r277233
|
|
2019-10-21 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop-manip.c (vect_maybe_permute_loop_masks): Take
a loop_vec_info.
(vect_set_loop_condition_masked): Update call accordingly.
From-SVN: r277232
|
|
gcc/ChangeLog:
2019-10-17 Andre Vieira <andre.simoesdiasvieira@arm.com>
* tree-vect-loop.c (vect_transform_loop): Move code from here...
* tree-vect-loop-manip.c (vect_loop_versioning): ... to here.
* tree-vectorizer.h (vect_loop_versioning): Remove unused parameters.
From-SVN: r277101
|
|
non-bugs
gcc/c/ChangeLog:
PR c++/61339
* c-decl.c (xref_tag): Change class-key of PODs to struct and others
to class.
(field_decl_cmp): Same.
* c-parser.c (c_parser_struct_or_union_specifier): Same.
* c-tree.h: Same.
* gimple-parser.c (c_parser_gimple_compound_statement): Same.
gcc/c-family/ChangeLog:
PR c++/61339
* c-opts.c (handle_deferred_opts): : Change class-key of PODs to struct
and others to class.
* c-pretty-print.h: Same.
gcc/cp/ChangeLog:
PR c++/61339
* cp-tree.h: Change class-key of PODs to struct and others to class.
* search.c: Same.
* semantics.c (finalize_nrv_r): Same.
gcc/lto/ChangeLog:
PR c++/61339
* lto-common.c (lto_splay_tree_new): : Change class-key of PODs
to struct and others to class.
(mentions_vars_p): Same.
(register_resolution): Same.
(lto_register_var_decl_in_symtab): Same.
(lto_register_function_decl_in_symtab): Same.
(cmp_tree): Same.
(lto_read_decls): Same.
gcc/ChangeLog:
PR c++/61339
* auto-profile.c: Change class-key of PODs to struct and others
to class.
* basic-block.h: Same.
* bitmap.c (bitmap_alloc): Same.
* bitmap.h: Same.
* builtins.c (expand_builtin_prefetch): Same.
(expand_builtin_interclass_mathfn): Same.
(expand_builtin_strlen): Same.
(expand_builtin_mempcpy_args): Same.
(expand_cmpstr): Same.
(expand_builtin___clear_cache): Same.
(expand_ifn_atomic_bit_test_and): Same.
(expand_builtin_thread_pointer): Same.
(expand_builtin_set_thread_pointer): Same.
* caller-save.c (setup_save_areas): Same.
(replace_reg_with_saved_mem): Same.
(insert_restore): Same.
(insert_save): Same.
(add_used_regs): Same.
* cfg.c (get_bb_copy): Same.
(set_loop_copy): Same.
* cfg.h: Same.
* cfganal.h: Same.
* cfgexpand.c (alloc_stack_frame_space): Same.
(add_stack_var): Same.
(add_stack_var_conflict): Same.
(add_scope_conflicts_1): Same.
(update_alias_info_with_stack_vars): Same.
(expand_used_vars): Same.
* cfghooks.c (redirect_edge_and_branch_force): Same.
(delete_basic_block): Same.
(split_edge): Same.
(make_forwarder_block): Same.
(force_nonfallthru): Same.
(duplicate_block): Same.
(lv_flush_pending_stmts): Same.
* cfghooks.h: Same.
* cfgloop.c (flow_loops_cfg_dump): Same.
(flow_loop_nested_p): Same.
(superloop_at_depth): Same.
(get_loop_latch_edges): Same.
(flow_loop_dump): Same.
(flow_loops_dump): Same.
(flow_loops_free): Same.
(flow_loop_nodes_find): Same.
(establish_preds): Same.
(flow_loop_tree_node_add): Same.
(flow_loop_tree_node_remove): Same.
(flow_loops_find): Same.
(find_subloop_latch_edge_by_profile): Same.
(find_subloop_latch_edge_by_ivs): Same.
(mfb_redirect_edges_in_set): Same.
(form_subloop): Same.
(merge_latch_edges): Same.
(disambiguate_multiple_latches): Same.
(disambiguate_loops_with_multiple_latches): Same.
(flow_bb_inside_loop_p): Same.
(glb_enum_p): Same.
(get_loop_body_with_size): Same.
(get_loop_body): Same.
(fill_sons_in_loop): Same.
(get_loop_body_in_dom_order): Same.
(get_loop_body_in_custom_order): Same.
(release_recorded_exits): Same.
(get_loop_exit_edges): Same.
(num_loop_branches): Same.
(remove_bb_from_loops): Same.
(find_common_loop): Same.
(delete_loop): Same.
(cancel_loop): Same.
(verify_loop_structure): Same.
(loop_preheader_edge): Same.
(loop_exit_edge_p): Same.
(single_exit): Same.
(loop_exits_to_bb_p): Same.
(loop_exits_from_bb_p): Same.
(get_loop_location): Same.
(record_niter_bound): Same.
(get_estimated_loop_iterations_int): Same.
(max_stmt_executions_int): Same.
(likely_max_stmt_executions_int): Same.
(get_estimated_loop_iterations): Same.
(get_max_loop_iterations): Same.
(get_max_loop_iterations_int): Same.
(get_likely_max_loop_iterations): Same.
* cfgloop.h (simple_loop_desc): Same.
(get_loop): Same.
(loop_depth): Same.
(loop_outer): Same.
(loop_iterator::next): Same.
(loop_outermost): Same.
* cfgloopanal.c (mark_irreducible_loops): Same.
(num_loop_insns): Same.
(average_num_loop_insns): Same.
(expected_loop_iterations_unbounded): Same.
(expected_loop_iterations): Same.
(mark_loop_exit_edges): Same.
(single_likely_exit): Same.
* cfgloopmanip.c (fix_bb_placement): Same.
(fix_bb_placements): Same.
(remove_path): Same.
(place_new_loop): Same.
(add_loop): Same.
(scale_loop_frequencies): Same.
(scale_loop_profile): Same.
(create_empty_if_region_on_edge): Same.
(create_empty_loop_on_edge): Same.
(loopify): Same.
(unloop): Same.
(fix_loop_placements): Same.
(copy_loop_info): Same.
(duplicate_loop): Same.
(duplicate_subloops): Same.
(loop_redirect_edge): Same.
(can_duplicate_loop_p): Same.
(duplicate_loop_to_header_edge): Same.
(mfb_keep_just): Same.
(has_preds_from_loop): Same.
(create_preheader): Same.
(create_preheaders): Same.
(lv_adjust_loop_entry_edge): Same.
(loop_version): Same.
* cfgloopmanip.h: Same.
* cgraph.h: Same.
* cgraphbuild.c: Same.
* combine.c (make_extraction): Same.
* config/i386/i386-features.c: Same.
* config/i386/i386-features.h: Same.
* config/i386/i386.c (ix86_emit_outlined_ms2sysv_save): Same.
(ix86_emit_outlined_ms2sysv_restore): Same.
(ix86_noce_conversion_profitable_p): Same.
(ix86_init_cost): Same.
(ix86_simd_clone_usable): Same.
* configure.ac: Same.
* coretypes.h: Same.
* data-streamer-in.c (string_for_index): Same.
(streamer_read_indexed_string): Same.
(streamer_read_string): Same.
(bp_unpack_indexed_string): Same.
(bp_unpack_string): Same.
(streamer_read_uhwi): Same.
(streamer_read_hwi): Same.
(streamer_read_gcov_count): Same.
(streamer_read_wide_int): Same.
* data-streamer.h (streamer_write_bitpack): Same.
(bp_unpack_value): Same.
(streamer_write_char_stream): Same.
(streamer_write_hwi_in_range): Same.
(streamer_write_record_start): Same.
* ddg.c (create_ddg_dep_from_intra_loop_link): Same.
(add_cross_iteration_register_deps): Same.
(build_intra_loop_deps): Same.
* df-core.c (df_analyze): Same.
(loop_post_order_compute): Same.
(loop_inverted_post_order_compute): Same.
* df-problems.c (df_rd_alloc): Same.
(df_rd_simulate_one_insn): Same.
(df_rd_local_compute): Same.
(df_rd_init_solution): Same.
(df_rd_confluence_n): Same.
(df_rd_transfer_function): Same.
(df_rd_free): Same.
(df_rd_dump_defs_set): Same.
(df_rd_top_dump): Same.
(df_lr_alloc): Same.
(df_lr_reset): Same.
(df_lr_local_compute): Same.
(df_lr_init): Same.
(df_lr_confluence_n): Same.
(df_lr_free): Same.
(df_lr_top_dump): Same.
(df_lr_verify_transfer_functions): Same.
(df_live_alloc): Same.
(df_live_reset): Same.
(df_live_init): Same.
(df_live_confluence_n): Same.
(df_live_finalize): Same.
(df_live_free): Same.
(df_live_top_dump): Same.
(df_live_verify_transfer_functions): Same.
(df_mir_alloc): Same.
(df_mir_reset): Same.
(df_mir_init): Same.
(df_mir_confluence_n): Same.
(df_mir_free): Same.
(df_mir_top_dump): Same.
(df_word_lr_alloc): Same.
(df_word_lr_reset): Same.
(df_word_lr_init): Same.
(df_word_lr_confluence_n): Same.
(df_word_lr_free): Same.
(df_word_lr_top_dump): Same.
(df_md_alloc): Same.
(df_md_simulate_one_insn): Same.
(df_md_reset): Same.
(df_md_init): Same.
(df_md_free): Same.
(df_md_top_dump): Same.
* df-scan.c (df_insn_delete): Same.
(df_insn_rescan): Same.
(df_notes_rescan): Same.
(df_sort_and_compress_mws): Same.
(df_install_mws): Same.
(df_refs_add_to_chains): Same.
(df_ref_create_structure): Same.
(df_ref_record): Same.
(df_def_record_1): Same.
(df_find_hard_reg_defs): Same.
(df_uses_record): Same.
(df_get_conditional_uses): Same.
(df_get_call_refs): Same.
(df_recompute_luids): Same.
(df_get_entry_block_def_set): Same.
(df_entry_block_defs_collect): Same.
(df_get_exit_block_use_set): Same.
(df_exit_block_uses_collect): Same.
(df_mws_verify): Same.
(df_bb_verify): Same.
* df.h (df_scan_get_bb_info): Same.
* doc/tm.texi: Same.
* dse.c (record_store): Same.
* dumpfile.h: Same.
* emit-rtl.c (const_fixed_hasher::equal): Same.
(set_mem_attributes_minus_bitpos): Same.
(change_address): Same.
(adjust_address_1): Same.
(offset_address): Same.
* emit-rtl.h: Same.
* except.c (dw2_build_landing_pads): Same.
(sjlj_emit_dispatch_table): Same.
* explow.c (allocate_dynamic_stack_space): Same.
(emit_stack_probe): Same.
(probe_stack_range): Same.
* expmed.c (store_bit_field_using_insv): Same.
(store_bit_field_1): Same.
(store_integral_bit_field): Same.
(extract_bit_field_using_extv): Same.
(extract_bit_field_1): Same.
(emit_cstore): Same.
* expr.c (emit_block_move_via_cpymem): Same.
(expand_cmpstrn_or_cmpmem): Same.
(set_storage_via_setmem): Same.
(emit_single_push_insn_1): Same.
(expand_assignment): Same.
(store_constructor): Same.
(expand_expr_real_2): Same.
(expand_expr_real_1): Same.
(try_casesi): Same.
* flags.h: Same.
* function.c (try_fit_stack_local): Same.
(assign_stack_local_1): Same.
(assign_stack_local): Same.
(cut_slot_from_list): Same.
(insert_slot_to_list): Same.
(max_slot_level): Same.
(move_slot_to_level): Same.
(temp_address_hasher::equal): Same.
(remove_unused_temp_slot_addresses): Same.
(assign_temp): Same.
(combine_temp_slots): Same.
(update_temp_slot_address): Same.
(preserve_temp_slots): Same.
* function.h: Same.
* fwprop.c: Same.
* gcc-rich-location.h: Same.
* gcov.c: Same.
* genattrtab.c (check_attr_test): Same.
(check_attr_value): Same.
(convert_set_attr_alternative): Same.
(convert_set_attr): Same.
(check_defs): Same.
(copy_boolean): Same.
(get_attr_value): Same.
(expand_delays): Same.
(make_length_attrs): Same.
(min_fn): Same.
(make_alternative_compare): Same.
(simplify_test_exp): Same.
(tests_attr_p): Same.
(get_attr_order): Same.
(clear_struct_flag): Same.
(gen_attr): Same.
(compares_alternatives_p): Same.
(gen_insn): Same.
(gen_delay): Same.
(find_attrs_to_cache): Same.
(write_test_expr): Same.
(walk_attr_value): Same.
(write_attr_get): Same.
(eliminate_known_true): Same.
(write_insn_cases): Same.
(write_attr_case): Same.
(write_attr_valueq): Same.
(write_attr_value): Same.
(write_dummy_eligible_delay): Same.
(next_comma_elt): Same.
(find_attr): Same.
(make_internal_attr): Same.
(copy_rtx_unchanging): Same.
(gen_insn_reserv): Same.
(check_tune_attr): Same.
(make_automaton_attrs): Same.
(handle_arg): Same.
* genextract.c (gen_insn): Same.
(VEC_char_to_string): Same.
* genmatch.c (print_operand): Same.
(lower): Same.
(parser::parse_operation): Same.
(parser::parse_capture): Same.
(parser::parse_c_expr): Same.
(parser::parse_simplify): Same.
(main): Same.
* genoutput.c (output_operand_data): Same.
(output_get_insn_name): Same.
(compare_operands): Same.
(place_operands): Same.
(process_template): Same.
(validate_insn_alternatives): Same.
(validate_insn_operands): Same.
(gen_expand): Same.
(note_constraint): Same.
* genpreds.c (write_one_predicate_function): Same.
(add_constraint): Same.
(process_define_register_constraint): Same.
(write_lookup_constraint_1): Same.
(write_lookup_constraint_array): Same.
(write_insn_constraint_len): Same.
(write_reg_class_for_constraint_1): Same.
(write_constraint_satisfied_p_array): Same.
* genrecog.c (optimize_subroutine_group): Same.
* gensupport.c (process_define_predicate): Same.
(queue_pattern): Same.
(remove_from_queue): Same.
(process_rtx): Same.
(is_predicable): Same.
(change_subst_attribute): Same.
(subst_pattern_match): Same.
(alter_constraints): Same.
(alter_attrs_for_insn): Same.
(shift_output_template): Same.
(alter_output_for_subst_insn): Same.
(process_one_cond_exec): Same.
(subst_dup): Same.
(process_define_cond_exec): Same.
(mnemonic_htab_callback): Same.
(gen_mnemonic_attr): Same.
(read_md_rtx): Same.
* ggc-page.c: Same.
* gimple-loop-interchange.cc (dump_reduction): Same.
(dump_induction): Same.
(loop_cand::~loop_cand): Same.
(free_data_refs_with_aux): Same.
(tree_loop_interchange::interchange_loops): Same.
(tree_loop_interchange::map_inductions_to_loop): Same.
(tree_loop_interchange::move_code_to_inner_loop): Same.
(compute_access_stride): Same.
(compute_access_strides): Same.
(proper_loop_form_for_interchange): Same.
(tree_loop_interchange_compute_ddrs): Same.
(prune_datarefs_not_in_loop): Same.
(prepare_data_references): Same.
(pass_linterchange::execute): Same.
* gimple-loop-jam.c (bb_prevents_fusion_p): Same.
(unroll_jam_possible_p): Same.
(fuse_loops): Same.
(adjust_unroll_factor): Same.
(tree_loop_unroll_and_jam): Same.
* gimple-loop-versioning.cc (loop_versioning::~loop_versioning): Same.
(loop_versioning::expensive_stmt_p): Same.
(loop_versioning::version_for_unity): Same.
(loop_versioning::dump_inner_likelihood): Same.
(loop_versioning::find_per_loop_multiplication): Same.
(loop_versioning::analyze_term_using_scevs): Same.
(loop_versioning::record_address_fragment): Same.
(loop_versioning::analyze_expr): Same.
(loop_versioning::analyze_blocks): Same.
(loop_versioning::prune_conditions): Same.
(loop_versioning::merge_loop_info): Same.
(loop_versioning::add_loop_to_queue): Same.
(loop_versioning::decide_whether_loop_is_versionable): Same.
(loop_versioning::make_versioning_decisions): Same.
(loop_versioning::implement_versioning_decisions): Same.
* gimple-ssa-evrp-analyze.c
(evrp_range_analyzer::record_ranges_from_phis): Same.
* gimple-ssa-store-merging.c (split_store::split_store): Same.
(count_multiple_uses): Same.
(split_group): Same.
(imm_store_chain_info::output_merged_store): Same.
(pass_store_merging::process_store): Same.
* gimple-ssa-strength-reduction.c (slsr_process_phi): Same.
* gimple-ssa-warn-alloca.c (adjusted_warn_limit): Same.
(is_max): Same.
(alloca_call_type): Same.
(pass_walloca::execute): Same.
* gimple-streamer-in.c (input_phi): Same.
(input_gimple_stmt): Same.
* gimple-streamer.h: Same.
* godump.c (go_force_record_alignment): Same.
(go_format_type): Same.
(go_output_type): Same.
(go_output_fndecl): Same.
(go_output_typedef): Same.
(keyword_hash_init): Same.
(find_dummy_types): Same.
* graph.c (draw_cfg_nodes_no_loops): Same.
(draw_cfg_nodes_for_loop): Same.
* hard-reg-set.h (hard_reg_set_iter_next): Same.
* hsa-brig.c: Same.
* hsa-common.h (hsa_internal_fn_hasher::equal): Same.
* hsa-dump.c (dump_hsa_cfun): Same.
* hsa-gen.c (gen_function_def_parameters): Same.
* hsa-regalloc.c (dump_hsa_cfun_regalloc): Same.
* input.c (dump_line_table_statistics): Same.
(test_lexer): Same.
* input.h: Same.
* internal-fn.c (get_multi_vector_move): Same.
(expand_load_lanes_optab_fn): Same.
(expand_GOMP_SIMT_ENTER_ALLOC): Same.
(expand_GOMP_SIMT_EXIT): Same.
(expand_GOMP_SIMT_LAST_LANE): Same.
(expand_GOMP_SIMT_ORDERED_PRED): Same.
(expand_GOMP_SIMT_VOTE_ANY): Same.
(expand_GOMP_SIMT_XCHG_BFLY): Same.
(expand_GOMP_SIMT_XCHG_IDX): Same.
(expand_addsub_overflow): Same.
(expand_neg_overflow): Same.
(expand_mul_overflow): Same.
(expand_call_mem_ref): Same.
(expand_mask_load_optab_fn): Same.
(expand_scatter_store_optab_fn): Same.
(expand_gather_load_optab_fn): Same.
* ipa-cp.c (ipa_get_parm_lattices): Same.
(print_all_lattices): Same.
(ignore_edge_p): Same.
(build_toporder_info): Same.
(free_toporder_info): Same.
(push_node_to_stack): Same.
(ipcp_lattice<valtype>::set_contains_variable): Same.
(set_agg_lats_to_bottom): Same.
(ipcp_bits_lattice::meet_with): Same.
(set_single_call_flag): Same.
(initialize_node_lattices): Same.
(ipa_get_jf_ancestor_result): Same.
(ipcp_verify_propagated_values): Same.
(propagate_scalar_across_jump_function): Same.
(propagate_context_across_jump_function): Same.
(propagate_bits_across_jump_function): Same.
(ipa_vr_operation_and_type_effects): Same.
(propagate_vr_across_jump_function): Same.
(set_check_aggs_by_ref): Same.
(set_chain_of_aglats_contains_variable): Same.
(merge_aggregate_lattices): Same.
(agg_pass_through_permissible_p): Same.
(propagate_aggs_across_jump_function): Same.
(call_passes_through_thunk_p): Same.
(propagate_constants_across_call): Same.
(devirtualization_time_bonus): Same.
(good_cloning_opportunity_p): Same.
(context_independent_aggregate_values): Same.
(gather_context_independent_values): Same.
(perform_estimation_of_a_value): Same.
(estimate_local_effects): Same.
(value_topo_info<valtype>::add_val): Same.
(add_all_node_vals_to_toposort): Same.
(value_topo_info<valtype>::propagate_effects): Same.
(ipcp_propagate_stage): Same.
(ipcp_discover_new_direct_edges): Same.
(same_node_or_its_all_contexts_clone_p): Same.
(cgraph_edge_brings_value_p): Same.
(gather_edges_for_value): Same.
(create_specialized_node): Same.
(find_more_scalar_values_for_callers_subset): Same.
(find_more_contexts_for_caller_subset): Same.
(copy_plats_to_inter): Same.
(intersect_aggregates_with_edge): Same.
(find_aggregate_values_for_callers_subset): Same.
(cgraph_edge_brings_all_agg_vals_for_node): Same.
(decide_about_value): Same.
(decide_whether_version_node): Same.
(spread_undeadness): Same.
(identify_dead_nodes): Same.
(ipcp_store_vr_results): Same.
* ipa-devirt.c (final_warning_record::grow_type_warnings): Same.
* ipa-fnsummary.c (ipa_fn_summary::account_size_time): Same.
(redirect_to_unreachable): Same.
(edge_set_predicate): Same.
(evaluate_conditions_for_known_args): Same.
(evaluate_properties_for_edge): Same.
(ipa_fn_summary_t::duplicate): Same.
(ipa_call_summary_t::duplicate): Same.
(dump_ipa_call_summary): Same.
(ipa_dump_fn_summary): Same.
(eliminated_by_inlining_prob): Same.
(set_cond_stmt_execution_predicate): Same.
(set_switch_stmt_execution_predicate): Same.
(compute_bb_predicates): Same.
(will_be_nonconstant_expr_predicate): Same.
(phi_result_unknown_predicate): Same.
(analyze_function_body): Same.
(compute_fn_summary): Same.
(estimate_edge_devirt_benefit): Same.
(estimate_edge_size_and_time): Same.
(estimate_calls_size_and_time): Same.
(estimate_node_size_and_time): Same.
(remap_edge_change_prob): Same.
(remap_edge_summaries): Same.
(ipa_merge_fn_summary_after_inlining): Same.
(ipa_fn_summary_generate): Same.
(inline_read_section): Same.
(ipa_fn_summary_read): Same.
(ipa_fn_summary_write): Same.
* ipa-fnsummary.h: Same.
* ipa-hsa.c (ipa_hsa_read_section): Same.
* ipa-icf-gimple.c (func_checker::compare_loops): Same.
* ipa-icf.c (sem_function::param_used_p): Same.
* ipa-inline-analysis.c (do_estimate_edge_time): Same.
* ipa-inline.c (edge_badness): Same.
(inline_small_functions): Same.
* ipa-polymorphic-call.c
(ipa_polymorphic_call_context::stream_out): Same.
* ipa-predicate.c (predicate::remap_after_duplication): Same.
(predicate::remap_after_inlining): Same.
(predicate::stream_out): Same.
* ipa-predicate.h: Same.
* ipa-profile.c (ipa_profile_read_summary): Same.
* ipa-prop.c (ipa_get_param_decl_index_1): Same.
(count_formal_params): Same.
(ipa_dump_param): Same.
(ipa_alloc_node_params): Same.
(ipa_print_node_jump_functions_for_edge): Same.
(ipa_print_node_jump_functions): Same.
(ipa_load_from_parm_agg): Same.
(get_ancestor_addr_info): Same.
(ipa_compute_jump_functions_for_edge): Same.
(ipa_analyze_virtual_call_uses): Same.
(ipa_analyze_stmt_uses): Same.
(ipa_analyze_params_uses_in_bb): Same.
(update_jump_functions_after_inlining): Same.
(try_decrement_rdesc_refcount): Same.
(ipa_impossible_devirt_target): Same.
(update_indirect_edges_after_inlining): Same.
(combine_controlled_uses_counters): Same.
(ipa_edge_args_sum_t::duplicate): Same.
(ipa_write_jump_function): Same.
(ipa_write_indirect_edge_info): Same.
(ipa_write_node_info): Same.
(ipa_read_edge_info): Same.
(ipa_prop_read_section): Same.
(read_replacements_section): Same.
* ipa-prop.h (ipa_get_param_count): Same.
(ipa_get_param): Same.
(ipa_get_type): Same.
(ipa_get_param_move_cost): Same.
(ipa_set_param_used): Same.
(ipa_get_controlled_uses): Same.
(ipa_set_controlled_uses): Same.
(ipa_get_cs_argument_count): Same.
* ipa-pure-const.c (analyze_function): Same.
(pure_const_read_summary): Same.
* ipa-ref.h: Same.
* ipa-reference.c (ipa_reference_read_optimization_summary): Same.
* ipa-split.c (test_nonssa_use): Same.
(dump_split_point): Same.
(dominated_by_forbidden): Same.
(split_part_set_ssa_name_p): Same.
(find_split_points): Same.
* ira-build.c (finish_loop_tree_nodes): Same.
(low_pressure_loop_node_p): Same.
* ira-color.c (ira_reuse_stack_slot): Same.
* ira-int.h: Same.
* ira.c (setup_reg_equiv): Same.
(print_insn_chain): Same.
(ira): Same.
* loop-doloop.c (doloop_condition_get): Same.
(add_test): Same.
(record_reg_sets): Same.
(doloop_optimize): Same.
* loop-init.c (loop_optimizer_init): Same.
(fix_loop_structure): Same.
* loop-invariant.c (merge_identical_invariants): Same.
(compute_always_reached): Same.
(find_exits): Same.
(may_assign_reg_p): Same.
(find_invariants_bb): Same.
(find_invariants_body): Same.
(replace_uses): Same.
(can_move_invariant_reg): Same.
(free_inv_motion_data): Same.
(move_single_loop_invariants): Same.
(change_pressure): Same.
(mark_ref_regs): Same.
(calculate_loop_reg_pressure): Same.
* loop-iv.c (biv_entry_hasher::equal): Same.
(iv_extend_to_rtx_code): Same.
(check_iv_ref_table_size): Same.
(clear_iv_info): Same.
(latch_dominating_def): Same.
(iv_get_reaching_def): Same.
(iv_constant): Same.
(iv_subreg): Same.
(iv_extend): Same.
(iv_neg): Same.
(iv_add): Same.
(iv_mult): Same.
(get_biv_step): Same.
(record_iv): Same.
(analyzed_for_bivness_p): Same.
(record_biv): Same.
(iv_analyze_biv): Same.
(iv_analyze_expr): Same.
(iv_analyze_def): Same.
(iv_analyze_op): Same.
(iv_analyze): Same.
(iv_analyze_result): Same.
(biv_p): Same.
(eliminate_implied_conditions): Same.
(simplify_using_initial_values): Same.
(shorten_into_mode): Same.
(canonicalize_iv_subregs): Same.
(determine_max_iter): Same.
(check_simple_exit): Same.
(find_simple_exit): Same.
(get_simple_loop_desc): Same.
* loop-unroll.c (report_unroll): Same.
(decide_unrolling): Same.
(unroll_loops): Same.
(loop_exit_at_end_p): Same.
(decide_unroll_constant_iterations): Same.
(unroll_loop_constant_iterations): Same.
(compare_and_jump_seq): Same.
(unroll_loop_runtime_iterations): Same.
(decide_unroll_stupid): Same.
(unroll_loop_stupid): Same.
(referenced_in_one_insn_in_loop_p): Same.
(reset_debug_uses_in_loop): Same.
(analyze_iv_to_split_insn): Same.
* lra-eliminations.c (lra_debug_elim_table): Same.
(setup_can_eliminate): Same.
(form_sum): Same.
(lra_get_elimination_hard_regno): Same.
(lra_eliminate_regs_1): Same.
(eliminate_regs_in_insn): Same.
(update_reg_eliminate): Same.
(init_elimination): Same.
(lra_eliminate): Same.
* lra-int.h: Same.
* lra-lives.c (initiate_live_solver): Same.
* lra-remat.c (create_remat_bb_data): Same.
* lra-spills.c (lra_spill): Same.
* lra.c (lra_set_insn_recog_data): Same.
(lra_set_used_insn_alternative_by_uid): Same.
(init_reg_info): Same.
(expand_reg_info): Same.
* lto-cgraph.c (output_symtab): Same.
(read_identifier): Same.
(get_alias_symbol): Same.
(input_node): Same.
(input_varpool_node): Same.
(input_ref): Same.
(input_edge): Same.
(input_cgraph_1): Same.
(input_refs): Same.
(input_symtab): Same.
(input_offload_tables): Same.
(output_cgraph_opt_summary): Same.
(input_edge_opt_summary): Same.
(input_cgraph_opt_section): Same.
* lto-section-in.c (lto_free_raw_section_data): Same.
(lto_create_simple_input_block): Same.
(lto_free_function_in_decl_state_for_node): Same.
* lto-streamer-in.c (lto_tag_check_set): Same.
(lto_location_cache::revert_location_cache): Same.
(lto_location_cache::input_location): Same.
(lto_input_location): Same.
(stream_input_location_now): Same.
(lto_input_tree_ref): Same.
(lto_input_eh_catch_list): Same.
(input_eh_region): Same.
(lto_init_eh): Same.
(make_new_block): Same.
(input_cfg): Same.
(fixup_call_stmt_edges): Same.
(input_struct_function_base): Same.
(input_function): Same.
(lto_read_body_or_constructor): Same.
(lto_read_tree_1): Same.
(lto_read_tree): Same.
(lto_input_scc): Same.
(lto_input_tree_1): Same.
(lto_input_toplevel_asms): Same.
(lto_input_mode_table): Same.
(lto_reader_init): Same.
(lto_data_in_create): Same.
* lto-streamer-out.c (output_cfg): Same.
* lto-streamer.h: Same.
* modulo-sched.c (duplicate_insns_of_cycles): Same.
(generate_prolog_epilog): Same.
(mark_loop_unsched): Same.
(dump_insn_location): Same.
(loop_canon_p): Same.
(sms_schedule): Same.
* omp-expand.c (expand_omp_for_ordered_loops): Same.
(expand_omp_for_generic): Same.
(expand_omp_for_static_nochunk): Same.
(expand_omp_for_static_chunk): Same.
(expand_omp_simd): Same.
(expand_omp_taskloop_for_inner): Same.
(expand_oacc_for): Same.
(expand_omp_atomic_pipeline): Same.
(mark_loops_in_oacc_kernels_region): Same.
* omp-offload.c (oacc_xform_loop): Same.
* omp-simd-clone.c (simd_clone_adjust): Same.
* optabs-query.c (get_traditional_extraction_insn): Same.
* optabs.c (expand_vector_broadcast): Same.
(expand_binop_directly): Same.
(expand_twoval_unop): Same.
(expand_twoval_binop): Same.
(expand_unop_direct): Same.
(emit_indirect_jump): Same.
(emit_conditional_move): Same.
(emit_conditional_neg_or_complement): Same.
(emit_conditional_add): Same.
(vector_compare_rtx): Same.
(expand_vec_perm_1): Same.
(expand_vec_perm_const): Same.
(expand_vec_cond_expr): Same.
(expand_vec_series_expr): Same.
(maybe_emit_atomic_exchange): Same.
(maybe_emit_sync_lock_test_and_set): Same.
(expand_atomic_compare_and_swap): Same.
(expand_atomic_load): Same.
(expand_atomic_store): Same.
(maybe_emit_op): Same.
(valid_multiword_target_p): Same.
(create_integer_operand): Same.
(maybe_legitimize_operand_same_code): Same.
(maybe_legitimize_operand): Same.
(create_convert_operand_from_type): Same.
(can_reuse_operands_p): Same.
(maybe_legitimize_operands): Same.
(maybe_gen_insn): Same.
(maybe_expand_insn): Same.
(maybe_expand_jump_insn): Same.
(expand_insn): Same.
* optabs.h (create_expand_operand): Same.
(create_fixed_operand): Same.
(create_output_operand): Same.
(create_input_operand): Same.
(create_convert_operand_to): Same.
(create_convert_operand_from): Same.
* optinfo.h: Same.
* poly-int.h: Same.
* predict.c (optimize_insn_for_speed_p): Same.
(optimize_loop_for_size_p): Same.
(optimize_loop_for_speed_p): Same.
(optimize_loop_nest_for_speed_p): Same.
(get_base_value): Same.
(predicted_by_loop_heuristics_p): Same.
(predict_extra_loop_exits): Same.
(predict_loops): Same.
(predict_paths_for_bb): Same.
(predict_paths_leading_to): Same.
(propagate_freq): Same.
(pass_profile::execute): Same.
* predict.h: Same.
* profile-count.c (profile_count::differs_from_p): Same.
(profile_probability::differs_lot_from_p): Same.
* profile-count.h: Same.
* profile.c (branch_prob): Same.
* regrename.c (free_chain_data): Same.
(mark_conflict): Same.
(create_new_chain): Same.
(merge_overlapping_regs): Same.
(init_rename_info): Same.
(merge_chains): Same.
(regrename_analyze): Same.
(regrename_do_replace): Same.
(scan_rtx_reg): Same.
(record_out_operands): Same.
(build_def_use): Same.
* regrename.h: Same.
* reload.h: Same.
* reload1.c (init_reload): Same.
(maybe_fix_stack_asms): Same.
(copy_reloads): Same.
(count_pseudo): Same.
(count_spilled_pseudo): Same.
(find_reg): Same.
(find_reload_regs): Same.
(select_reload_regs): Same.
(spill_hard_reg): Same.
(fixup_eh_region_note): Same.
(set_reload_reg): Same.
(allocate_reload_reg): Same.
(compute_reload_subreg_offset): Same.
(reload_adjust_reg_for_icode): Same.
(emit_input_reload_insns): Same.
(emit_output_reload_insns): Same.
(do_input_reload): Same.
(inherit_piecemeal_p): Same.
* rtl.h: Same.
* sanopt.c (maybe_get_dominating_check): Same.
(maybe_optimize_ubsan_ptr_ifn): Same.
(can_remove_asan_check): Same.
(maybe_optimize_asan_check_ifn): Same.
(sanopt_optimize_walker): Same.
* sched-deps.c (add_dependence_list): Same.
(chain_to_prev_insn): Same.
(add_insn_mem_dependence): Same.
(create_insn_reg_set): Same.
(maybe_extend_reg_info_p): Same.
(sched_analyze_reg): Same.
(sched_analyze_1): Same.
(get_implicit_reg_pending_clobbers): Same.
(chain_to_prev_insn_p): Same.
(deps_analyze_insn): Same.
(deps_start_bb): Same.
(sched_free_deps): Same.
(init_deps): Same.
(init_deps_reg_last): Same.
(free_deps): Same.
* sched-ebb.c: Same.
* sched-int.h: Same.
* sched-rgn.c (add_branch_dependences): Same.
(concat_insn_mem_list): Same.
(deps_join): Same.
(sched_rgn_compute_dependencies): Same.
* sel-sched-ir.c (reset_target_context): Same.
(copy_deps_context): Same.
(init_id_from_df): Same.
(has_dependence_p): Same.
(change_loops_latches): Same.
(bb_top_order_comparator): Same.
(make_region_from_loop_preheader): Same.
(sel_init_pipelining): Same.
(get_loop_nest_for_rgn): Same.
(make_regions_from_the_rest): Same.
(sel_is_loop_preheader_p): Same.
* sel-sched-ir.h (inner_loop_header_p): Same.
(get_all_loop_exits): Same.
* selftest.h: Same.
* sese.c (sese_build_liveouts): Same.
(sese_insert_phis_for_liveouts): Same.
* sese.h (defined_in_sese_p): Same.
* sreal.c (sreal::stream_out): Same.
* sreal.h: Same.
* streamer-hooks.h: Same.
* target-globals.c (save_target_globals): Same.
* target-globals.h: Same.
* target.def: Same.
* target.h: Same.
* targhooks.c (default_has_ifunc_p): Same.
(default_empty_mask_is_expensive): Same.
(default_init_cost): Same.
* targhooks.h: Same.
* toplev.c: Same.
* tree-affine.c (aff_combination_mult): Same.
(aff_combination_expand): Same.
(aff_combination_constant_multiple_p): Same.
* tree-affine.h: Same.
* tree-cfg.c (build_gimple_cfg): Same.
(replace_loop_annotate_in_block): Same.
(replace_uses_by): Same.
(remove_bb): Same.
(dump_cfg_stats): Same.
(gimple_duplicate_sese_region): Same.
(gimple_duplicate_sese_tail): Same.
(move_block_to_fn): Same.
(replace_block_vars_by_duplicates): Same.
(move_sese_region_to_fn): Same.
(print_loops_bb): Same.
(print_loop): Same.
(print_loops): Same.
(debug): Same.
(debug_loops): Same.
* tree-cfg.h: Same.
* tree-chrec.c (chrec_fold_plus_poly_poly): Same.
(chrec_fold_multiply_poly_poly): Same.
(chrec_evaluate): Same.
(chrec_component_in_loop_num): Same.
(reset_evolution_in_loop): Same.
(is_multivariate_chrec): Same.
(chrec_contains_symbols): Same.
(nb_vars_in_chrec): Same.
(chrec_convert_1): Same.
(chrec_convert_aggressive): Same.
* tree-chrec.h: Same.
* tree-core.h: Same.
* tree-data-ref.c (dump_data_dependence_relation): Same.
(canonicalize_base_object_address): Same.
(data_ref_compare_tree): Same.
(prune_runtime_alias_test_list): Same.
(get_segment_min_max): Same.
(create_intersect_range_checks): Same.
(conflict_fn_no_dependence): Same.
(object_address_invariant_in_loop_p): Same.
(analyze_ziv_subscript): Same.
(analyze_siv_subscript_cst_affine): Same.
(analyze_miv_subscript): Same.
(analyze_overlapping_iterations): Same.
(build_classic_dist_vector_1): Same.
(add_other_self_distances): Same.
(same_access_functions): Same.
(build_classic_dir_vector): Same.
(subscript_dependence_tester_1): Same.
(subscript_dependence_tester): Same.
(access_functions_are_affine_or_constant_p): Same.
(get_references_in_stmt): Same.
(loop_nest_has_data_refs): Same.
(graphite_find_data_references_in_stmt): Same.
(find_data_references_in_bb): Same.
(get_base_for_alignment): Same.
(find_loop_nest_1): Same.
(find_loop_nest): Same.
* tree-data-ref.h (dr_alignment): Same.
(ddr_dependence_level): Same.
* tree-if-conv.c (fold_build_cond_expr): Same.
(add_to_predicate_list): Same.
(add_to_dst_predicate_list): Same.
(phi_convertible_by_degenerating_args): Same.
(idx_within_array_bound): Same.
(all_preds_critical_p): Same.
(pred_blocks_visited_p): Same.
(predicate_bbs): Same.
(build_region): Same.
(if_convertible_loop_p_1): Same.
(is_cond_scalar_reduction): Same.
(predicate_scalar_phi): Same.
(remove_conditions_and_labels): Same.
(combine_blocks): Same.
(version_loop_for_if_conversion): Same.
(versionable_outer_loop_p): Same.
(ifcvt_local_dce): Same.
(tree_if_conversion): Same.
(pass_if_conversion::gate): Same.
* tree-if-conv.h: Same.
* tree-inline.c (maybe_move_debug_stmts_to_successors): Same.
* tree-loop-distribution.c (bb_top_order_cmp): Same.
(free_rdg): Same.
(stmt_has_scalar_dependences_outside_loop): Same.
(copy_loop_before): Same.
(create_bb_after_loop): Same.
(const_with_all_bytes_same): Same.
(generate_memset_builtin): Same.
(generate_memcpy_builtin): Same.
(destroy_loop): Same.
(build_rdg_partition_for_vertex): Same.
(compute_access_range): Same.
(data_ref_segment_size): Same.
(latch_dominated_by_data_ref): Same.
(compute_alias_check_pairs): Same.
(fuse_memset_builtins): Same.
(finalize_partitions): Same.
(find_seed_stmts_for_distribution): Same.
(prepare_perfect_loop_nest): Same.
* tree-parloops.c (lambda_transform_legal_p): Same.
(loop_parallel_p): Same.
(reduc_stmt_res): Same.
(add_field_for_name): Same.
(create_call_for_reduction_1): Same.
(replace_uses_in_bb_by): Same.
(transform_to_exit_first_loop_alt): Same.
(try_transform_to_exit_first_loop_alt): Same.
(transform_to_exit_first_loop): Same.
(num_phis): Same.
(gen_parallel_loop): Same.
(gather_scalar_reductions): Same.
(get_omp_data_i_param): Same.
(try_create_reduction_list): Same.
(oacc_entry_exit_single_gang): Same.
(parallelize_loops): Same.
* tree-pass.h: Same.
* tree-predcom.c (determine_offset): Same.
(last_always_executed_block): Same.
(split_data_refs_to_components): Same.
(suitable_component_p): Same.
(valid_initializer_p): Same.
(find_looparound_phi): Same.
(insert_looparound_copy): Same.
(add_looparound_copies): Same.
(determine_roots_comp): Same.
(predcom_tmp_var): Same.
(initialize_root_vars): Same.
(initialize_root_vars_store_elim_1): Same.
(initialize_root_vars_store_elim_2): Same.
(finalize_eliminated_stores): Same.
(initialize_root_vars_lm): Same.
(remove_stmt): Same.
(determine_unroll_factor): Same.
(execute_pred_commoning_cbck): Same.
(base_names_in_chain_on): Same.
(combine_chains): Same.
(pcom_stmt_dominates_stmt_p): Same.
(try_combine_chains): Same.
(prepare_initializers_chain_store_elim): Same.
(prepare_initializers_chain): Same.
(prepare_initializers): Same.
(prepare_finalizers_chain): Same.
(prepare_finalizers): Same.
(insert_init_seqs): Same.
* tree-scalar-evolution.c (loop_phi_node_p): Same.
(compute_overall_effect_of_inner_loop): Same.
(add_to_evolution_1): Same.
(add_to_evolution): Same.
(follow_ssa_edge_binary): Same.
(follow_ssa_edge_expr): Same.
(backedge_phi_arg_p): Same.
(follow_ssa_edge_in_condition_phi_branch): Same.
(follow_ssa_edge_in_condition_phi): Same.
(follow_ssa_edge_inner_loop_phi): Same.
(follow_ssa_edge): Same.
(analyze_evolution_in_loop): Same.
(analyze_initial_condition): Same.
(interpret_loop_phi): Same.
(interpret_condition_phi): Same.
(interpret_rhs_expr): Same.
(interpret_expr): Same.
(interpret_gimple_assign): Same.
(analyze_scalar_evolution_1): Same.
(analyze_scalar_evolution): Same.
(analyze_scalar_evolution_for_address_of): Same.
(get_instantiated_value_entry): Same.
(loop_closed_phi_def): Same.
(instantiate_scev_name): Same.
(instantiate_scev_poly): Same.
(instantiate_scev_binary): Same.
(instantiate_scev_convert): Same.
(instantiate_scev_not): Same.
(instantiate_scev_r): Same.
(instantiate_scev): Same.
(resolve_mixers): Same.
(initialize_scalar_evolutions_analyzer): Same.
(scev_reset_htab): Same.
(scev_reset): Same.
(derive_simple_iv_with_niters): Same.
(simple_iv_with_niters): Same.
(expression_expensive_p): Same.
(final_value_replacement_loop): Same.
* tree-scalar-evolution.h (block_before_loop): Same.
* tree-ssa-address.h: Same.
* tree-ssa-dce.c (find_obviously_necessary_stmts): Same.
* tree-ssa-dom.c (edge_info::record_simple_equiv): Same.
(record_edge_info): Same.
* tree-ssa-live.c (var_map_base_fini): Same.
(remove_unused_locals): Same.
* tree-ssa-live.h: Same.
* tree-ssa-loop-ch.c (should_duplicate_loop_header_p): Same.
(pass_ch_vect::execute): Same.
(pass_ch::process_loop_p): Same.
* tree-ssa-loop-im.c (mem_ref_hasher::hash): Same.
(movement_possibility): Same.
(outermost_invariant_loop): Same.
(stmt_cost): Same.
(determine_max_movement): Same.
(invariantness_dom_walker::before_dom_children): Same.
(move_computations): Same.
(may_move_till): Same.
(force_move_till_op): Same.
(force_move_till): Same.
(memref_free): Same.
(record_mem_ref_loc): Same.
(set_ref_stored_in_loop): Same.
(mark_ref_stored): Same.
(sort_bbs_in_loop_postorder_cmp): Same.
(sort_locs_in_loop_postorder_cmp): Same.
(analyze_memory_references): Same.
(mem_refs_may_alias_p): Same.
(find_ref_loc_in_loop_cmp): Same.
(rewrite_mem_ref_loc::operator): Same.
(first_mem_ref_loc_1::operator): Same.
(sm_set_flag_if_changed::operator): Same.
(execute_sm_if_changed_flag_set): Same.
(execute_sm): Same.
(hoist_memory_references): Same.
(ref_always_accessed::operator): Same.
(refs_independent_p): Same.
(record_dep_loop): Same.
(ref_indep_loop_p_1): Same.
(ref_indep_loop_p): Same.
(can_sm_ref_p): Same.
(find_refs_for_sm): Same.
(loop_suitable_for_sm): Same.
(store_motion_loop): Same.
(store_motion): Same.
(fill_always_executed_in): Same.
* tree-ssa-loop-ivcanon.c (constant_after_peeling): Same.
(estimated_unrolled_size): Same.
(loop_edge_to_cancel): Same.
(remove_exits_and_undefined_stmts): Same.
(remove_redundant_iv_tests): Same.
(unloop_loops): Same.
(estimated_peeled_sequence_size): Same.
(try_peel_loop): Same.
(canonicalize_loop_induction_variables): Same.
(canonicalize_induction_variables): Same.
* tree-ssa-loop-ivopts.c (iv_inv_expr_hasher::equal): Same.
(name_info): Same.
(stmt_after_inc_pos): Same.
(contains_abnormal_ssa_name_p): Same.
(niter_for_exit): Same.
(find_bivs): Same.
(mark_bivs): Same.
(find_givs_in_bb): Same.
(find_induction_variables): Same.
(find_interesting_uses_cond): Same.
(outermost_invariant_loop_for_expr): Same.
(idx_find_step): Same.
(add_candidate_1): Same.
(add_iv_candidate_derived_from_uses): Same.
(alloc_use_cost_map): Same.
(prepare_decl_rtl): Same.
(generic_predict_doloop_p): Same.
(computation_cost): Same.
(determine_common_wider_type): Same.
(get_computation_aff_1): Same.
(get_use_type): Same.
(determine_group_iv_cost_address): Same.
(iv_period): Same.
(difference_cannot_overflow_p): Same.
(may_eliminate_iv): Same.
(determine_set_costs): Same.
(cheaper_cost_pair): Same.
(compare_cost_pair): Same.
(iv_ca_cand_for_group): Same.
(iv_ca_recount_cost): Same.
(iv_ca_set_remove_invs): Same.
(iv_ca_set_no_cp): Same.
(iv_ca_set_add_invs): Same.
(iv_ca_set_cp): Same.
(iv_ca_add_group): Same.
(iv_ca_cost): Same.
(iv_ca_compare_deps): Same.
(iv_ca_delta_reverse): Same.
(iv_ca_delta_commit): Same.
(iv_ca_cand_used_p): Same.
(iv_ca_delta_free): Same.
(iv_ca_new): Same.
(iv_ca_free): Same.
(iv_ca_dump): Same.
(iv_ca_extend): Same.
(iv_ca_narrow): Same.
(iv_ca_prune): Same.
(cheaper_cost_with_cand): Same.
(iv_ca_replace): Same.
(try_add_cand_for): Same.
(get_initial_solution): Same.
(try_improve_iv_set): Same.
(find_optimal_iv_set_1): Same.
(create_new_iv): Same.
(rewrite_use_compare): Same.
(remove_unused_ivs): Same.
(determine_scaling_factor): Same.
* tree-ssa-loop-ivopts.h: Same.
* tree-ssa-loop-manip.c (create_iv): Same.
(compute_live_loop_exits): Same.
(add_exit_phi): Same.
(add_exit_phis): Same.
(find_uses_to_rename_use): Same.
(find_uses_to_rename_def): Same.
(find_uses_to_rename_in_loop): Same.
(rewrite_into_loop_closed_ssa): Same.
(check_loop_closed_ssa_bb): Same.
(split_loop_exit_edge): Same.
(ip_end_pos): Same.
(ip_normal_pos): Same.
(copy_phi_node_args): Same.
(gimple_duplicate_loop_to_header_edge): Same.
(can_unroll_loop_p): Same.
(determine_exit_conditions): Same.
(scale_dominated_blocks_in_loop): Same.
(niter_for_unrolled_loop): Same.
(tree_transform_and_unroll_loop): Same.
(rewrite_all_phi_nodes_with_iv): Same.
* tree-ssa-loop-manip.h: Same.
* tree-ssa-loop-niter.c (number_of_iterations_ne_max): Same.
(number_of_iterations_ne): Same.
(assert_no_overflow_lt): Same.
(assert_loop_rolls_lt): Same.
(number_of_iterations_lt): Same.
(adjust_cond_for_loop_until_wrap): Same.
(tree_simplify_using_condition): Same.
(simplify_using_initial_conditions): Same.
(simplify_using_outer_evolutions): Same.
(loop_only_exit_p): Same.
(ssa_defined_by_minus_one_stmt_p): Same.
(number_of_iterations_popcount): Same.
(number_of_iterations_exit): Same.
(find_loop_niter): Same.
(finite_loop_p): Same.
(chain_of_csts_start): Same.
(get_val_for): Same.
(loop_niter_by_eval): Same.
(derive_constant_upper_bound_ops): Same.
(do_warn_aggressive_loop_optimizations): Same.
(record_estimate): Same.
(get_cst_init_from_scev): Same.
(record_nonwrapping_iv): Same.
(idx_infer_loop_bounds): Same.
(infer_loop_bounds_from_ref): Same.
(infer_loop_bounds_from_array): Same.
(infer_loop_bounds_from_pointer_arith): Same.
(infer_loop_bounds_from_signedness): Same.
(bound_index): Same.
(discover_iteration_bound_by_body_walk): Same.
(maybe_lower_iteration_bound): Same.
(estimate_numbers_of_iterations): Same.
(estimated_loop_iterations): Same.
(estimated_loop_iterations_int): Same.
(max_loop_iterations): Same.
(max_loop_iterations_int): Same.
(likely_max_loop_iterations): Same.
(likely_max_loop_iterations_int): Same.
(estimated_stmt_executions_int): Same.
(max_stmt_executions): Same.
(likely_max_stmt_executions): Same.
(estimated_stmt_executions): Same.
(stmt_dominates_stmt_p): Same.
(nowrap_type_p): Same.
(loop_exits_before_overflow): Same.
(scev_var_range_cant_overflow): Same.
(scev_probably_wraps_p): Same.
(free_numbers_of_iterations_estimates): Same.
* tree-ssa-loop-niter.h: Same.
* tree-ssa-loop-prefetch.c (release_mem_refs): Same.
(idx_analyze_ref): Same.
(analyze_ref): Same.
(gather_memory_references_ref): Same.
(mark_nontemporal_store): Same.
(emit_mfence_after_loop): Same.
(may_use_storent_in_loop_p): Same.
(mark_nontemporal_stores): Same.
(should_unroll_loop_p): Same.
(volume_of_dist_vector): Same.
(add_subscript_strides): Same.
(self_reuse_distance): Same.
(insn_to_prefetch_ratio_too_small_p): Same.
* tree-ssa-loop-split.c (split_at_bb_p): Same.
(patch_loop_exit): Same.
(find_or_create_guard_phi): Same.
(easy_exit_values): Same.
(connect_loop_phis): Same.
(connect_loops): Same.
(compute_new_first_bound): Same.
(split_loop): Same.
(tree_ssa_split_loops): Same.
* tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Same.
(is_maybe_undefined): Same.
(tree_may_unswitch_on): Same.
(simplify_using_entry_checks): Same.
(tree_unswitch_single_loop): Same.
(tree_unswitch_loop): Same.
(tree_unswitch_outer_loop): Same.
(empty_bb_without_guard_p): Same.
(used_outside_loop_p): Same.
(get_vop_from_header): Same.
(hoist_guard): Same.
* tree-ssa-loop.c (gate_oacc_kernels): Same.
(get_lsm_tmp_name): Same.
* tree-ssa-loop.h: Same.
* tree-ssa-reassoc.c (add_repeat_to_ops_vec): Same.
(build_and_add_sum): Same.
(no_side_effect_bb): Same.
(get_ops): Same.
(linearize_expr): Same.
(should_break_up_subtract): Same.
(linearize_expr_tree): Same.
* tree-ssa-scopedtables.c: Same.
* tree-ssa-scopedtables.h: Same.
* tree-ssa-structalias.c (condense_visit): Same.
(label_visit): Same.
(dump_pred_graph): Same.
(perform_var_substitution): Same.
(move_complex_constraints): Same.
(remove_preds_and_fake_succs): Same.
* tree-ssa-threadupdate.c (dbds_continue_enumeration_p): Same.
(determine_bb_domination_status): Same.
(duplicate_thread_path): Same.
(thread_through_all_blocks): Same.
* tree-ssa-threadupdate.h: Same.
* tree-streamer-in.c (streamer_read_string_cst): Same.
(input_identifier): Same.
(unpack_ts_type_common_value_fields): Same.
(unpack_ts_block_value_fields): Same.
(unpack_ts_translation_unit_decl_value_fields): Same.
(unpack_ts_omp_clause_value_fields): Same.
(streamer_read_tree_bitfields): Same.
(streamer_alloc_tree): Same.
(lto_input_ts_common_tree_pointers): Same.
(lto_input_ts_vector_tree_pointers): Same.
(lto_input_ts_poly_tree_pointers): Same.
(lto_input_ts_complex_tree_pointers): Same.
(lto_input_ts_decl_minimal_tree_pointers): Same.
(lto_input_ts_decl_common_tree_pointers): Same.
(lto_input_ts_decl_non_common_tree_pointers): Same.
(lto_input_ts_decl_with_vis_tree_pointers): Same.
(lto_input_ts_field_decl_tree_pointers): Same.
(lto_input_ts_function_decl_tree_pointers): Same.
(lto_input_ts_type_common_tree_pointers): Same.
(lto_input_ts_type_non_common_tree_pointers): Same.
(lto_input_ts_list_tree_pointers): Same.
(lto_input_ts_vec_tree_pointers): Same.
(lto_input_ts_exp_tree_pointers): Same.
(lto_input_ts_block_tree_pointers): Same.
(lto_input_ts_binfo_tree_pointers): Same.
(lto_input_ts_constructor_tree_pointers): Same.
(lto_input_ts_omp_clause_tree_pointers): Same.
(streamer_read_tree_body): Same.
* tree-streamer.h: Same.
* tree-switch-conversion.c (bit_test_cluster::is_beneficial): Same.
* tree-vect-data-refs.c (vect_get_smallest_scalar_type): Same.
(vect_analyze_possibly_independent_ddr): Same.
(vect_analyze_data_ref_dependence): Same.
(vect_compute_data_ref_alignment): Same.
(vect_enhance_data_refs_alignment): Same.
(vect_analyze_data_ref_access): Same.
(vect_check_gather_scatter): Same.
(vect_find_stmt_data_reference): Same.
(vect_create_addr_base_for_vector_ref): Same.
(vect_setup_realignment): Same.
(vect_supportable_dr_alignment): Same.
* tree-vect-loop-manip.c (rename_variables_in_bb): Same.
(adjust_phi_and_debug_stmts): Same.
(vect_set_loop_mask): Same.
(add_preheader_seq): Same.
(vect_maybe_permute_loop_masks): Same.
(vect_set_loop_masks_directly): Same.
(vect_set_loop_condition_masked): Same.
(vect_set_loop_condition_unmasked): Same.
(slpeel_duplicate_current_defs_from_edges): Same.
(slpeel_add_loop_guard): Same.
(slpeel_can_duplicate_loop_p): Same.
(create_lcssa_for_virtual_phi): Same.
(iv_phi_p): Same.
(vect_update_ivs_after_vectorizer): Same.
(vect_gen_vector_loop_niters_mult_vf): Same.
(slpeel_update_phi_nodes_for_loops): Same.
(slpeel_update_phi_nodes_for_guard1): Same.
(find_guard_arg): Same.
(slpeel_update_phi_nodes_for_guard2): Same.
(slpeel_update_phi_nodes_for_lcssa): Same.
(vect_do_peeling): Same.
(vect_create_cond_for_alias_checks): Same.
(vect_loop_versioning): Same.
* tree-vect-loop.c (vect_determine_vf_for_stmt): Same.
(vect_inner_phi_in_double_reduction_p): Same.
(vect_analyze_scalar_cycles_1): Same.
(vect_fixup_scalar_cycles_with_patterns): Same.
(vect_get_loop_niters): Same.
(bb_in_loop_p): Same.
(vect_get_max_nscalars_per_iter): Same.
(vect_verify_full_masking): Same.
(vect_compute_single_scalar_iteration_cost): Same.
(vect_analyze_loop_form_1): Same.
(vect_analyze_loop_form): Same.
(vect_active_double_reduction_p): Same.
(vect_analyze_loop_operations): Same.
(neutral_op_for_slp_reduction): Same.
(vect_is_simple_reduction): Same.
(vect_model_reduction_cost): Same.
(get_initial_def_for_reduction): Same.
(get_initial_defs_for_reduction): Same.
(vect_create_epilog_for_reduction): Same.
(vectorize_fold_left_reduction): Same.
(vectorizable_reduction): Same.
(vectorizable_induction): Same.
(vectorizable_live_operation): Same.
(loop_niters_no_overflow): Same.
(vect_get_loop_mask): Same.
(vect_transform_loop_stmt): Same.
(vect_transform_loop): Same.
* tree-vect-patterns.c (vect_reassociating_reduction_p): Same.
(vect_determine_precisions): Same.
(vect_pattern_recog_1): Same.
* tree-vect-slp.c (vect_analyze_slp_instance): Same.
* tree-vect-stmts.c (stmt_vectype): Same.
(process_use): Same.
(vect_init_vector_1): Same.
(vect_truncate_gather_scatter_offset): Same.
(get_group_load_store_type): Same.
(vect_build_gather_load_calls): Same.
(vect_get_strided_load_store_ops): Same.
(vectorizable_simd_clone_call): Same.
(vectorizable_store): Same.
(permute_vec_elements): Same.
(vectorizable_load): Same.
(vect_transform_stmt): Same.
(supportable_widening_operation): Same.
* tree-vectorizer.c (vec_info::replace_stmt): Same.
(vec_info::free_stmt_vec_info): Same.
(vect_free_loop_info_assumptions): Same.
(vect_loop_vectorized_call): Same.
(set_uid_loop_bbs): Same.
(vectorize_loops): Same.
* tree-vectorizer.h (STMT_VINFO_BB_VINFO): Same.
* tree.c (add_tree_to_fld_list): Same.
(fld_type_variant_equal_p): Same.
(fld_decl_context): Same.
(fld_incomplete_type_of): Same.
(free_lang_data_in_binfo): Same.
(need_assembler_name_p): Same.
(find_decls_types_r): Same.
(get_eh_types_for_runtime): Same.
(find_decls_types_in_eh_region): Same.
(find_decls_types_in_node): Same.
(assign_assembler_name_if_needed): Same.
* value-prof.c (stream_out_histogram_value): Same.
* value-prof.h: Same.
* var-tracking.c (use_narrower_mode): Same.
(prepare_call_arguments): Same.
(vt_expand_loc_callback): Same.
(resolve_expansions_pending_recursion): Same.
(vt_expand_loc): Same.
* varasm.c (const_hash_1): Same.
(compare_constant): Same.
(tree_output_constant_def): Same.
(simplify_subtraction): Same.
(get_pool_constant): Same.
(output_constant_pool_2): Same.
(output_constant_pool_1): Same.
(mark_constants_in_pattern): Same.
(mark_constant_pool): Same.
(get_section_anchor): Same.
* vr-values.c (compare_range_with_value): Same.
(vr_values::extract_range_from_phi_node): Same.
* vr-values.h: Same.
* web.c (unionfind_union): Same.
* wide-int.h: Same.
From-SVN: r273311
|
|
2019-07-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/90911
* tree-vectorizer.h (_loop_vec_info::scalar_loop_scaling): New field.
(LOOP_VINFO_SCALAR_LOOP_SCALING): new.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
scalar_loop_scaling.
(vect_transform_loop): Scale scalar loop profile if needed.
* tree-vect-loop-manip.c (vect_loop_versioning): When re-using
the loop copy from if-conversion adjust edge probabilities
and scale the vectorized loop body profile, queue the scalar
profile for updating after peeling.
From-SVN: r273082
|
|
r272239)
2019-06-21 Richard Biener <rguenther@suse.de>
PR tree-optimization/90913
* tree-vect-loop-manip.c (vect_loop_versioning): Do not re-use
the scalar variant of if-conversion versioning.
* gfortran.dg/vect/pr90913.f90: New testcase.
From-SVN: r272545
|
|
r272233 introduced a large number of execution failures on SVE.
The patch hard-coded an IV step of VF, but for SLP groups it needs
to be VF * group size.
Also, iv_precision had type widest_int but only needs to be unsigned int.
2019-06-18 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop-manip.c (vect_set_loop_masks_directly): Remove
vf parameter. Restore the previous iv step of nscalars_step,
but give it iv_type rather than compare_type. Tweak code order
to match the comments.
(vect_set_loop_condition_masked): Update accordingly.
* tree-vect-loop.c (vect_verify_full_masking): Use "unsigned int"
for iv_precision. Tweak comment formatting.
From-SVN: r272411
|
|
2019-06-13 Richard Biener <rguenther@suse.de>
* tree-vectorizer.h (vect_loop_vectorized_call): Declare.
* tree-vectorizer.c (vect_loop_vectorized_call): Export and
also return the condition stmt.
* tree-vect-loop-manip.c (vect_loop_versioning): Compute outermost
loop we can version and version that, reusing the loop version
created by if-conversion instead of versioning again.
* gcc.dg/vect/vect-version-1.c: New testcase.
* gcc.dg/vect/vect-version-2.c: Likewise.
From-SVN: r272239
|
|
gcc/ChangeLog:
2019-06-13 Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
PR target/88838
* tree-vect-loop-manip.c (vect_set_loop_masks_directly): If the
compare_type is not with Pmode size, we will create an IV with
Pmode size with truncated use (i.e. converted to the correct type).
* tree-vect-loop.c (vect_verify_full_masking): Find IV type.
(vect_iv_limit_for_full_masking): New. Factored out of
vect_set_loop_condition_masked.
* tree-vectorizer.h (LOOP_VINFO_MASK_IV_TYPE): New.
(vect_iv_limit_for_full_masking): Declare.
gcc/testsuite/ChangeLog:
2019-06-13 Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
PR target/88838
* gcc.target/aarch64/pr88838.c: New test.
* gcc.target/aarch64/sve/while_1.c: Adjust.
From-SVN: r272233
|
|
expression...
* omp-low.c (lower_rec_input_clauses): If OMP_CLAUSE_IF
has non-constant expression, force sctx.lane and use two
argument IFN_GOMP_SIMD_LANE instead of single argument.
* tree-ssa-dce.c (eliminate_unnecessary_stmts): Don't DCE
two argument IFN_GOMP_SIMD_LANE without lhs.
* tree-vectorizer.h (struct _loop_vec_info): Add simd_if_cond
member.
(LOOP_VINFO_SIMD_IF_COND, LOOP_REQUIRES_VERSIONING_FOR_SIMD_IF_COND):
Define.
(LOOP_REQUIRES_VERSIONING): Or in
LOOP_REQUIRES_VERSIONING_FOR_SIMD_IF_COND.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
simd_if_cond.
(vect_analyze_loop_2): Punt if LOOP_VINFO_SIMD_IF_COND is constant 0.
* tree-vect-loop-manip.c (vect_loop_versioning): Add runtime check
from simd if clause if needed.
* gcc.dg/vect/vect-simd-1.c: New test.
* gcc.dg/vect/vect-simd-2.c: New test.
* gcc.dg/vect/vect-simd-3.c: New test.
* gcc.dg/vect/vect-simd-4.c: New test.
From-SVN: r271298
|
|
scan-assembler-not vmovaps)
2019-03-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/89649
* tree-vectorizer.h (vect_loop_versioning): Adjust prototype.
* tree-vect-loop-manip.c (vect_do_peeling): Unset force_vectorize
on the prolog and epilog loops.
(vect_loop_versioning): Return copy of loop.
* tree-vect-loop.c (vect_transform_loop): Unset force_vectorize
on the non-vectorized version of the loop.
From-SVN: r269578
|
|
From-SVN: r267494
|
|
Historically GCC used location_t, while libcpp used source_location.
This inconsistency has been annoying me for a while, so this patch
removes source_location in favor of location_t throughout
(as the latter is shorter).
gcc/ChangeLog:
* builtins.c: Replace "source_location" with "location_t".
* diagnostic-show-locus.c: Likewise.
* diagnostic.c: Likewise.
* dumpfile.c: Likewise.
* gcc-rich-location.h: Likewise.
* genmatch.c: Likewise.
* gimple.h: Likewise.
* gimplify.c: Likewise.
* input.c: Likewise.
* input.h: Likewise. Eliminate the typedef.
* omp-expand.c: Likewise.
* selftest.h: Likewise.
* substring-locations.h (get_source_location_for_substring):
Rename to..
(get_location_within_string): ...this.
* tree-cfg.c: Replace "source_location" with "location_t".
* tree-cfgcleanup.c: Likewise.
* tree-diagnostic.c: Likewise.
* tree-into-ssa.c: Likewise.
* tree-outof-ssa.c: Likewise.
* tree-parloops.c: Likewise.
* tree-phinodes.c: Likewise.
* tree-phinodes.h: Likewise.
* tree-ssa-loop-ivopts.c: Likewise.
* tree-ssa-loop-manip.c: Likewise.
* tree-ssa-phiopt.c: Likewise.
* tree-ssa-phiprop.c: Likewise.
* tree-ssa-threadupdate.c: Likewise.
* tree-ssa.c: Likewise.
* tree-ssa.h: Likewise.
* tree-vect-loop-manip.c: Likewise.
gcc/c-family/ChangeLog:
* c-common.c (c_get_substring_location): Update for renaming of
get_source_location_for_substring to get_location_within_string.
* c-lex.c: Replace "source_location" with "location_t".
* c-opts.c: Likewise.
* c-ppoutput.c: Likewise.
gcc/c/ChangeLog:
* c-decl.c: Replace "source_location" with "location_t".
* c-tree.h: Likewise.
* c-typeck.c: Likewise.
* gimple-parser.c: Likewise.
gcc/cp/ChangeLog:
* call.c: Replace "source_location" with "location_t".
* cp-tree.h: Likewise.
* cvt.c: Likewise.
* name-lookup.c: Likewise.
* parser.c: Likewise.
* typeck.c: Likewise.
gcc/fortran/ChangeLog:
* cpp.c: Replace "source_location" with "location_t".
* gfortran.h: Likewise.
gcc/go/ChangeLog:
* go-gcc-diagnostics.cc: Replace "source_location" with "location_t".
* go-gcc.cc: Likewise.
* go-linemap.cc: Likewise.
* go-location.h: Likewise.
* gofrontend/README: Likewise.
gcc/jit/ChangeLog:
* jit-playback.c: Replace "source_location" with "location_t".
gcc/testsuite/ChangeLog:
* g++.dg/plugin/comment_plugin.c: Replace "source_location" with
"location_t".
* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: Likewise.
libcc1/ChangeLog:
* libcc1plugin.cc: Replace "source_location" with "location_t".
(plugin_context::get_source_location): Rename to...
(plugin_context::get_location_t): ...this.
* libcp1plugin.cc: Likewise.
libcpp/ChangeLog:
* charset.c: Replace "source_location" with "location_t".
* directives-only.c: Likewise.
* directives.c: Likewise.
* errors.c: Likewise.
* expr.c: Likewise.
* files.c: Likewise.
* include/cpplib.h: Likewise. Rename MAX_SOURCE_LOCATION to
MAX_LOCATION_T.
* include/line-map.h: Likewise.
* init.c: Likewise.
* internal.h: Likewise.
* lex.c: Likewise.
* line-map.c: Likewise.
* location-example.txt: Likewise.
* macro.c: Likewise.
* pch.c: Likewise.
* traditional.c: Likewise.
From-SVN: r266085
|
|
This patch enables targets to describe DR_TARGET_ALIGNMENT as a compile-time
variable. It does so by turning the variable into a 'poly_uint64'.
gcc/ChangeLog:
2018-11-13 Andre Vieira <andre.simoesdiasvieira@arm.com>
* config/aarch64/aarch64.c
(aarch64_vectorize_preferred_vector_alignment): Change return type to
poly_uint64.
(aarch64_simd_vector_alignment_reachable): Adapt to preferred vector
alignment being a poly int.
* doc/tm.texi (TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): Change
return type to poly_uint64.
* target.def (default_preferred_vector_alignment): Likewise.
* targhooks.c (default_preferred_vector_alignment): Likewise.
* targhooks.h (default_preferred_vector_alignment): Likewise.
* tree-vect-data-refs.c (vect_calculate_target_alignment): Likewise.
(vect_compute_data_ref_alignment): Adapt to vector alignment being a
poly int.
(vect_update_misalignment_for_peel): Likewise.
(vect_enhance_data_refs_alignment): Likewise.
(vect_find_same_alignment_drs): Likewise.
(vect_duplicate_ssa_name_ptr_info): Likewise.
(vect_setup_realignment): Likewise.
(vect_can_force_dr_alignment_p): Change alignment parameter type to
poly_uint64.
* tree-vect-loop-manip.c (get_misalign_in_elems): Learn to construct a
mask with a compile time variable vector alignment.
(vect_gen_prolog_loop_niters): Adapt to vector alignment being a poly
int.
(vect_do_peeling): Exit early if vector alignment is not constant.
* tree-vect-stmts.c (ensure_base_align): Adapt to vector alignment being
a poly int.
(vectorizable_store): Likewise.
(vectorizable_load): Likweise.
* tree-vectorizer.h (struct dr_vec_info): Make target_alignment field a
poly_uint64.
(vect_known_alignment_in_bytes): Adapt to vector alignment being a
poly int.
(vect_can_force_dr_alignment_p): Change alignment parameter type to
poly_uint64.
From-SVN: r266072
|
|
2018-11-06 Richard Biener <rguenther@suse.de>
PR tree-optimization/87889
* tree-vect-loop-manip.c (slpeel_duplicate_current_defs_from_edges):
Do nothing if old and new arg are the same
* gcc.dg/pr87894.c: New testcase.
From-SVN: r265833
|
|
incompatible types in PHI argument 0))
2018-11-05 Richard Biener <rguenther@suse.de>
PR tree-optimization/87873
* tree-ssa-loop-manip.h (split_loop_exit_edge): Add copy_constants_p
argument.
* tree-ssa-loop-manip.c (split_loop_exit_edge): Likewise.
* tree-vect-loop.c (vect_transform_loop): When splitting the
loop exit also create forwarder PHIs for constants.
* tree-vect-loop-manip.c (slpeel_duplicate_current_defs_from_edges):
Handle constant to_arg, add extra checking we match up the correct
PHIs.
* gcc.dg/pr87873.c: New testcase.
From-SVN: r265812
|
|
This patch introduces a verbosity level to dump messages:
"user-facing" vs "internals".
By default, messages at the top-level dump scope are "user-facing",
whereas those that are in nested scopes are implicitly "internals",
and are filtered out by -fopt-info unless a new "-internals" sub-option
of "-fopt-info" is supplied (intended purely for use by GCC developers).
Dumpfiles are unaffected by the change.
Given that the vectorizer is the only subsystem using AUTO_DUMP_SCOPE
(via DUMP_VECT_SCOPE), this only affects the vectorizer.
Filtering out these implementation-detail messages goes a long way
towards making -fopt-info-vec-all more accessible to advanced end-users;
the follow-up patch restores the most pertinent missing details.
gcc/ChangeLog:
* doc/invoke.texi (-fopt-info): Document new "internals"
sub-option.
* dump-context.h (dump_context::apply_dump_filter_p): New decl.
* dumpfile.c (dump_options): Update for renaming of MSG_ALL to
MSG_ALL_KINDS.
(optinfo_verbosity_options): Add "internals".
(kind_as_string): Update for renaming of MSG_ALL to MSG_ALL_KINDS.
(dump_context::apply_dump_filter_p): New member function.
(dump_context::dump_loc): Use apply_dump_filter_p rather than
explicitly masking the dump_kind.
(dump_context::begin_scope): Increment the scope depth first. Use
apply_dump_filter_p rather than explicitly masking the dump_kind.
(dump_context::emit_item): Use apply_dump_filter_p rather than
explicitly masking the dump_kind.
(dump_dec): Likewise.
(dump_hex): Likewise.
(dump_switch_p_1): Default to MSG_ALL_PRIORITIES.
(opt_info_switch_p_1): Default to MSG_PRIORITY_USER_FACING.
(opt_info_switch_p): Update handling of default
MSG_OPTIMIZED_LOCATIONS to cope with default of
MSG_PRIORITY_USER_FACING.
(dump_basic_block): Use apply_dump_filter_p rather than explicitly
masking the dump_kind.
(selftest::test_capture_of_dump_calls): Update test_dump_context
instances to use MSG_ALL_KINDS | MSG_PRIORITY_USER_FACING rather
than MSG_ALL. Generalize scope test to be run at all four
combinations of with/without MSG_PRIORITY_USER_FACING and
MSG_PRIORITY_INTERNALS, adding examples of explicit priority
for each of the two values.
* dumpfile.h (enum dump_flag): Add comment about the MSG_* flags.
Rename MSG_ALL to MSG_ALL_KINDS. Add MSG_PRIORITY_USER_FACING,
MSG_PRIORITY_INTERNALS, and MSG_ALL_PRIORITIES, updating the
values for TDF_COMPARE_DEBUG and TDF_ALL_VALUES.
(AUTO_DUMP_SCOPE): Add a note to the comment about the interaction
with MSG_PRIORITY_*.
* tree-vect-loop-manip.c (vect_loop_versioning): Mark versioning
dump messages as MSG_PRIORITY_USER_FACING.
* tree-vectorizer.h (DUMP_VECT_SCOPE): Add a note to the comment
about the interaction with MSG_PRIORITY_*.
gcc/testsuite/ChangeLog:
* gcc.dg/plugin/dump-1.c: Update expected output for test_scopes
due to "-internals" not being selected.
* gcc.dg/plugin/dump-2.c: New test, based on dump-1.c, with
"-internals" added to re-enable the output from test_scopes.
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add dump-2.c.
From-SVN: r264851
|
|
As promised at Cauldron, this patch uses %T and %G with dump_printf and
dump_printf_loc calls to eliminate calls to
dump_generic_expr (MSG_*, arg, TDF_SLIM) (via %T)
and
dump_gimple_stmt (MSG_*, TDF_SLIM, stmt, 0) (via %G)
throughout the middle-end, simplifying numerous dump callsites.
A few calls to these functions didn't match the above pattern; I didn't
touch these. I wasn't able to use %E anywhere.
gcc/ChangeLog:
* tree-data-ref.c (runtime_alias_check_p): Use formatted printing
with %T in place of calls to dump_generic_expr.
(prune_runtime_alias_test_list): Likewise.
(create_runtime_alias_checks): Likewise.
* tree-vect-data-refs.c (vect_check_nonzero_value): Likewise.
(vect_analyze_data_ref_dependence): Likewise.
(vect_slp_analyze_data_ref_dependence): Likewise.
(vect_record_base_alignment): Likewise. Use %G in place of call
to dump_gimple_stmt.
(vect_compute_data_ref_alignment): Likewise.
(verify_data_ref_alignment): Likewise.
(vect_find_same_alignment_drs): Likewise.
(vect_analyze_group_access_1): Likewise.
(vect_analyze_data_ref_accesses): Likewise.
(dependence_distance_ge_vf): Likewise.
(dump_lower_bound): Likewise.
(vect_prune_runtime_alias_test_list): Likewise.
(vect_find_stmt_data_reference): Likewise.
(vect_analyze_data_refs): Likewise.
(vect_create_addr_base_for_vector_ref): Likewise.
(vect_create_data_ref_ptr): Likewise.
* tree-vect-loop-manip.c (vect_set_loop_condition): Likewise.
(vect_can_advance_ivs_p): Likewise.
(vect_update_ivs_after_vectorizer): Likewise.
(vect_gen_prolog_loop_niters): Likewise.
(vect_prepare_for_masked_peels): Likewise.
* tree-vect-loop.c (vect_determine_vf_for_stmt): Likewise.
(vect_determine_vectorization_factor): Likewise.
(vect_is_simple_iv_evolution): Likewise.
(vect_analyze_scalar_cycles_1): Likewise.
(vect_analyze_loop_operations): Likewise.
(report_vect_op): Likewise.
(vect_is_slp_reduction): Likewise.
(check_reduction_path): Likewise.
(vect_is_simple_reduction): Likewise.
(vect_create_epilog_for_reduction): Likewise.
(vect_finalize_reduction:): Likewise.
(vectorizable_induction): Likewise.
(vect_transform_loop_stmt): Likewise.
(vect_transform_loop): Likewise.
(optimize_mask_stores): Likewise.
* tree-vect-patterns.c (vect_pattern_detected): Likewise.
(vect_split_statement): Likewise.
(vect_recog_over_widening_pattern): Likewise.
(vect_recog_average_pattern): Likewise.
(vect_determine_min_output_precision_1): Likewise.
(vect_determine_precisions_from_range): Likewise.
(vect_determine_precisions_from_users): Likewise.
(vect_mark_pattern_stmts): Likewise.
(vect_pattern_recog_1): Likewise.
* tree-vect-slp.c (vect_get_and_check_slp_defs): Likewise.
(vect_record_max_nunits): Likewise.
(vect_build_slp_tree_1): Likewise.
(vect_build_slp_tree_2): Likewise.
(vect_print_slp_tree): Likewise.
(vect_analyze_slp_instance): Likewise.
(vect_detect_hybrid_slp_stmts): Likewise.
(vect_detect_hybrid_slp_1): Likewise.
(vect_slp_analyze_operations): Likewise.
(vect_slp_analyze_bb_1): Likewise.
(vect_transform_slp_perm_load): Likewise.
(vect_schedule_slp_instance): Likewise.
* tree-vect-stmts.c (vect_mark_relevant): Likewise.
(vect_mark_stmts_to_be_vectorized): Likewise.
(vect_init_vector_1): Likewise.
(vect_get_vec_def_for_operand): Likewise.
(vect_finish_stmt_generation_1): Likewise.
(vect_check_load_store_mask): Likewise.
(vectorizable_call): Likewise.
(vectorizable_conversion): Likewise.
(vectorizable_operation): Likewise.
(vectorizable_load): Likewise.
(vect_analyze_stmt): Likewise.
(vect_is_simple_use): Likewise.
(vect_get_vector_types_for_stmt): Likewise.
(vect_get_mask_type_for_stmt): Likewise.
* tree-vectorizer.c (increase_alignment): Likewise.
From-SVN: r264424
|
|
This patch adds a new helper function for permanently removing a
statement and its associated stmt_vec_info.
2018-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (vec_info::remove_stmt): Declare.
* tree-vectorizer.c (vec_info::remove_stmt): New function.
* tree-vect-loop-manip.c (vect_set_loop_condition): Use it.
* tree-vect-loop.c (vect_transform_loop): Likewise.
* tree-vect-slp.c (vect_schedule_slp): Likewise.
* tree-vect-stmts.c (vect_remove_stores): Likewise.
From-SVN: r263156
|
|
This patch replaces DR_VECT_AUX and vect_dr_stmt with a new
vec_info::lookup_dr function, so that the lookup is relative
to a particular vec_info rather than to global state.
2018-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (vec_info::lookup_dr): New member function.
(vect_dr_stmt): Delete.
* tree-vectorizer.c (vec_info::lookup_dr): New function.
* tree-vect-loop-manip.c (vect_update_inits_of_drs): Use it instead
of DR_VECT_AUX.
* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr)
(vect_analyze_data_ref_dependence, vect_record_base_alignments)
(vect_verify_datarefs_alignment, vect_peeling_supportable)
(vect_analyze_data_ref_accesses, vect_prune_runtime_alias_test_list)
(vect_analyze_data_refs): Likewise.
(vect_slp_analyze_data_ref_dependence): Likewise. Take a vec_info
argument.
(vect_find_same_alignment_drs): Likewise.
(vect_slp_analyze_node_dependences): Update calls accordingly.
(vect_analyze_data_refs_alignment): Likewise. Use vec_info::lookup_dr
instead of DR_VECT_AUX.
(vect_get_peeling_costs_all_drs): Take a loop_vec_info instead
of a vector data references. Use vec_info::lookup_dr instead of
DR_VECT_AUX.
(vect_peeling_hash_get_lowest_cost): Update calls accordingly.
(vect_enhance_data_refs_alignment): Likewise. Use vec_info::lookup_dr
instead of DR_VECT_AUX.
From-SVN: r263155
|
|
After previous changes, it makes more sense for STMT_VINFO_UNALIGNED_DR
to be dr_vec_info rather than a data_reference.
2018-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (_loop_vec_info::unaligned_dr): Change to
dr_vec_info.
* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Update
accordingly.
* tree-vect-loop.c (vect_analyze_loop_2): Likewise.
* tree-vect-loop-manip.c (get_misalign_in_elems): Likewise.
(vect_gen_prolog_loop_niters): Likewise.
From-SVN: r263154
|
|
This patch makes various routines (mostly in tree-vect-data-refs.c)
take dr_vec_infos rather than data_references. The affected routines
are really dealing with the way that an access is going to vectorised,
rather than with the original scalar access described by the
data_reference.
2018-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (set_dr_misalignment, dr_misalignment)
(DR_TARGET_ALIGNMENT, aligned_access_p, known_alignment_for_access_p)
(vect_known_alignment_in_bytes, vect_dr_behavior)
(vect_get_scalar_dr_size): Take references as dr_vec_infos
instead of data_references. Update calls to other routines for
which the same change has been made.
* tree-vect-data-refs.c (vect_preserves_scalar_order_p): Take
dr_vec_infos instead of stmt_vec_infos.
(vect_analyze_data_ref_dependence): Update call accordingly.
(vect_slp_analyze_data_ref_dependence)
(vect_record_base_alignments): Use DR_VECT_AUX.
(vect_calculate_target_alignment, vect_compute_data_ref_alignment)
(vect_update_misalignment_for_peel, verify_data_ref_alignment)
(vector_alignment_reachable_p, vect_get_data_access_cost)
(vect_peeling_supportable, vect_analyze_group_access_1)
(vect_analyze_group_access, vect_analyze_data_ref_access)
(vect_vfa_segment_size, vect_vfa_access_size, vect_vfa_align)
(vect_compile_time_alias, vect_small_gap_p)
(vectorizable_with_step_bound_p, vect_duplicate_ssa_name_ptr_info):
(vect_supportable_dr_alignment): Take references as dr_vec_infos
instead of data_references. Update calls to other routines for
which the same change has been made.
(vect_verify_datarefs_alignment, vect_get_peeling_costs_all_drs)
(vect_find_same_alignment_drs, vect_analyze_data_refs_alignment)
(vect_slp_analyze_and_verify_node_alignment)
(vect_analyze_data_ref_accesses, vect_prune_runtime_alias_test_list)
(vect_create_addr_base_for_vector_ref, vect_create_data_ref_ptr)
(vect_setup_realignment): Use dr_vec_infos. Update calls after
above changes.
(_vect_peel_info::dr): Replace with...
(_vect_peel_info::dr_info): ...this new field.
(vect_peeling_hash_get_most_frequent)
(vect_peeling_hash_choose_best_peeling): Update accordingly.
(vect_peeling_hash_get_lowest_cost):
(vect_enhance_data_refs_alignment): Likewise. Update calls to other
routines for which the same change has been made.
(vect_peeling_hash_insert): Likewise. Take a dr_vec_info instead of a
data_reference.
* tree-vect-loop-manip.c (get_misalign_in_elems)
(vect_gen_prolog_loop_niters): Use dr_vec_infos. Update calls after
above changes.
* tree-vect-loop.c (vect_analyze_loop_2): Likewise.
* tree-vect-stmts.c (vect_get_store_cost, vect_get_load_cost)
(vect_truncate_gather_scatter_offset, compare_step_with_zero)
(get_group_load_store_type, get_negative_load_store_type)
(vect_get_data_ptr_increment, vectorizable_store)
(vectorizable_load): Likewise.
(ensure_base_align): Take a dr_vec_info instead of a data_reference.
Update calls to other routines for which the same change has been made.
From-SVN: r263153
|
|
This first (less mechanical) part handles cases that involve changes in
the callers or non-trivial changes in the functions themselves.
2018-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-data-refs.c (vect_describe_gather_scatter_call): Take
a stmt_vec_info instead of a gcall.
(vect_check_gather_scatter): Update call accordingly.
* tree-vect-loop-manip.c (iv_phi_p): Take a stmt_vec_info instead
of a gphi.
(vect_can_advance_ivs_p, vect_update_ivs_after_vectorizer)
(slpeel_update_phi_nodes_for_loops):): Update calls accordingly.
* tree-vect-loop.c (vect_transform_loop_stmt): Take a stmt_vec_info
instead of a gimple stmt.
(vect_transform_loop): Update calls accordingly.
* tree-vect-slp.c (vect_split_slp_store_group): Take and return
stmt_vec_infos instead of gimple stmts.
(vect_analyze_slp_instance): Update use accordingly.
* tree-vect-stmts.c (read_vector_array, write_vector_array)
(vect_clobber_variable, vect_stmt_relevant_p, permute_vec_elements)
(vect_use_strided_gather_scatters_p, vect_build_all_ones_mask)
(vect_build_zero_merge_argument, vect_get_gather_scatter_ops)
(vect_gen_widened_results_half, vect_get_loop_based_defs)
(vect_create_vectorized_promotion_stmts, can_vectorize_live_stmts):
Take a stmt_vec_info instead of a gimple stmt and pass stmt_vec_infos
down to subroutines.
From-SVN: r263146
|
|
This first part makes functions use stmt_vec_infos instead of
gimple stmts in cases where the stmt_vec_info was already available
and where the change is mechanical. Most of it is just replacing
"stmt" with "stmt_info".
2018-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-data-refs.c (vect_slp_analyze_node_dependences):
(vect_check_gather_scatter, vect_create_data_ref_ptr, bump_vector_ptr)
(vect_permute_store_chain, vect_setup_realignment)
(vect_permute_load_chain, vect_shift_permute_load_chain)
(vect_transform_grouped_load): Use stmt_vec_info rather than gimple
stmts internally, and when passing values to other vectorizer routines.
* tree-vect-loop-manip.c (vect_can_advance_ivs_p): Likewise.
* tree-vect-loop.c (vect_analyze_scalar_cycles_1)
(vect_analyze_loop_operations, get_initial_def_for_reduction)
(vect_create_epilog_for_reduction, vectorize_fold_left_reduction)
(vectorizable_reduction, vectorizable_induction)
(vectorizable_live_operation, vect_transform_loop_stmt)
(vect_transform_loop): Likewise.
* tree-vect-patterns.c (vect_reassociating_reduction_p)
(vect_recog_widen_op_pattern, vect_recog_mixed_size_cond_pattern)
(vect_recog_bool_pattern, vect_recog_gather_scatter_pattern): Likewise.
* tree-vect-slp.c (vect_analyze_slp_instance): Likewise.
(vect_slp_analyze_node_operations_1): Likewise.
* tree-vect-stmts.c (vect_mark_relevant, process_use)
(exist_non_indexing_operands_for_use_p, vect_init_vector_1)
(vect_mark_stmts_to_be_vectorized, vect_get_vec_def_for_operand)
(vect_finish_stmt_generation_1, get_group_load_store_type)
(get_load_store_type, vect_build_gather_load_calls)
(vectorizable_bswap, vectorizable_call, vectorizable_simd_clone_call)
(vect_create_vectorized_demotion_stmts, vectorizable_conversion)
(vectorizable_assignment, vectorizable_shift, vectorizable_operation)
(vectorizable_store, vectorizable_load, vectorizable_condition)
(vectorizable_comparison, vect_analyze_stmt, vect_transform_stmt)
(supportable_widening_operation): Likewise.
(vect_get_vector_types_for_stmt): Likewise.
* tree-vectorizer.h (vect_dr_behavior): Likewise.
From-SVN: r263143
|
|
Various places called vect_dr_stmt or vinfo_for_stmt multiple times
on the same input. This patch makes them reuse the earlier result.
It also splits a couple of single vinfo_for_stmt calls out into
separate statements so that they can be reused in later patches.
2018-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-data-refs.c (vect_analyze_data_ref_dependence)
(vect_slp_analyze_node_dependences, vect_analyze_data_ref_accesses)
(vect_permute_store_chain, vect_permute_load_chain)
(vect_shift_permute_load_chain, vect_transform_grouped_load): Avoid
repeated stmt_vec_info lookups.
* tree-vect-loop-manip.c (vect_can_advance_ivs_p): Likewise.
(vect_update_ivs_after_vectorizer): Likewise.
* tree-vect-loop.c (vect_is_simple_reduction): Likewise.
(vect_create_epilog_for_reduction, vectorizable_reduction): Likewise.
* tree-vect-patterns.c (adjust_bool_stmts): Likewise.
* tree-vect-slp.c (vect_analyze_slp_instance): Likewise.
(vect_bb_slp_scalar_cost): Likewise.
* tree-vect-stmts.c (get_group_alias_ptr_type): Likewise.
From-SVN: r263142
|
|
This patch changes LOOP_VINFO_MAY_MISALIGN_STMTS from an
auto_vec<gimple *> to an auto_vec<stmt_vec_info>.
2018-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (_loop_vec_info::may_misalign_stmts): Change
from an auto_vec<gimple *> to an auto_vec<stmt_vec_info>.
* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Update
accordingly.
* tree-vect-loop-manip.c (vect_create_cond_for_align_checks): Likewise.
From-SVN: r263138
|
|
This patch makes vect_dr_stmt return a stmt_vec_info instead of a
gimple stmt. Rather than retain a separate gimple stmt variable
in cases where both existed, the patch replaces uses of the gimple
variable with the uses of the stmt_vec_info. Later patches do this
more generally.
Many things that are keyed off a data_reference would these days
be better keyed off a stmt_vec_info, but it's more convenient
to do that later in the series. The vect_dr_size calls that are
left over do still benefit from this patch.
2018-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (vect_dr_stmt): Return a stmt_vec_info rather
than a gimple stmt.
* tree-vect-data-refs.c (vect_analyze_data_ref_dependence)
(vect_slp_analyze_data_ref_dependence, vect_record_base_alignments)
(vect_calculate_target_alignmentm, vect_compute_data_ref_alignment)
(vect_update_misalignment_for_peel, vect_verify_datarefs_alignment)
(vector_alignment_reachable_p, vect_get_data_access_cost)
(vect_get_peeling_costs_all_drs, vect_peeling_hash_get_lowest_cost)
(vect_peeling_supportable, vect_enhance_data_refs_alignment)
(vect_find_same_alignment_drs, vect_analyze_data_refs_alignment)
(vect_analyze_group_access_1, vect_analyze_group_access)
(vect_analyze_data_ref_access, vect_analyze_data_ref_accesses)
(vect_vfa_access_size, vect_small_gap_p, vect_analyze_data_refs)
(vect_supportable_dr_alignment): Remove vinfo_for_stmt from the
result of vect_dr_stmt and use the stmt_vec_info instead of
the associated gimple stmt.
* tree-vect-loop-manip.c (get_misalign_in_elems): Likewise.
(vect_gen_prolog_loop_niters): Likewise.
* tree-vect-loop.c (vect_analyze_loop_2): Likewise.
From-SVN: r263134
|
|
This patch turns stmt_vec_info into an unspeakably bad wrapper class
and adds an implicit conversion to the associated gimple stmt.
Having this conversion makes the rest of the series easier to write,
but since the class goes away again at the end of the series, I've
not bothered adding any comments or tried to make it pretty.
2018-07-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (stmt_vec_info): Temporarily change from
a typedef to a wrapper class.
(NULL_STMT_VEC_INFO): New macro.
(vec_info::stmt_infos): Change to vec<stmt_vec_info>.
(stmt_vec_info::operator*): New function.
(stmt_vec_info::operator gimple *): Likewise.
(set_vinfo_for_stmt): Use NULL_STMT_VEC_INFO.
(add_stmt_costs): Likewise.
* tree-vect-loop-manip.c (iv_phi_p): Likewise.
* tree-vect-loop.c (vect_compute_single_scalar_iteration_cost)
(vect_get_known_peeling_cost): Likewise.
(vect_estimate_min_profitable_iters): Likewise.
* tree-vect-patterns.c (vect_init_pattern_stmt): Likewise.
* tree-vect-slp.c (vect_remove_slp_scalar_calls): Likewise.
* tree-vect-stmts.c (vect_build_gather_load_calls): Likewise.
(vectorizable_store, free_stmt_vec_infos): Likewise.
(new_stmt_vec_info): Change return type of xcalloc to
_stmt_vec_info *.
From-SVN: r263125
|
|
vect_recog_rotate_pattern had code to prevent operations
on invariants being vectorised unnecessarily:
if (dt == vect_external_def
&& TREE_CODE (oprnd1) == SSA_NAME
&& is_a <loop_vec_info> (vinfo))
{
struct loop *loop = as_a <loop_vec_info> (vinfo)->loop;
ext_def = loop_preheader_edge (loop);
if (!SSA_NAME_IS_DEFAULT_DEF (oprnd1))
{
basic_block bb = gimple_bb (SSA_NAME_DEF_STMT (oprnd1));
if (bb == NULL
|| !dominated_by_p (CDI_DOMINATORS, ext_def->dest, bb))
ext_def = NULL;
}
}
[..]
if (ext_def)
{
basic_block new_bb
= gsi_insert_on_edge_immediate (ext_def, def_stmt);
gcc_assert (!new_bb);
}
This patch reuses the same idea for casts of invariants created
during widening optimisations.
One hitch was that vect_loop_versioning asserted that the vector loop
preheader was still empty, although the cfg transformation it's doing
should be correct either way.
2018-06-30 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-patterns.c (vect_get_external_def_edge): New function,
split out from...
(vect_recog_rotate_pattern): ...here.
(vect_convert_input): Try to insert casts of invariants in the
preheader.
* tree-vect-loop-manip.c (vect_loop_versioning): Don't require the
preheader to be empty.
gcc/testsuite/
* gcc.dg/vect/vect-widen-mult-extern-1.c: New test.
From-SVN: r262277
|
|
gcc/ChangeLog:
* cfgloop.c (get_loop_location): Convert return type from
location_t to dump_user_location_t, replacing INSN_LOCATION lookups
by implicit construction from rtx_insn *, and using
dump_user_location_t::from_function_decl for the fallback case.
* cfgloop.h (get_loop_location): Convert return type from
location_t to dump_user_location_t.
* cgraphunit.c (walk_polymorphic_call_targets): Update call to
dump_printf_loc to pass in a dump_location_t rather than a
location_t, via the gimple stmt.
* coverage.c (get_coverage_counts): Update calls to
dump_printf_loc to pass in dump_location_t rather than a
location_t.
* doc/optinfo.texi (Dump types): Convert example of
dump_printf_loc from taking "locus" to taking "insn". Update
description of the "_loc" calls to cover dump_location_t.
* dumpfile.c: Include "backend.h", "gimple.h", "rtl.h", and
"selftest.h".
(dump_user_location_t::dump_user_location_t): New constructors,
from gimple *stmt and rtx_insn *.
(dump_user_location_t::from_function_decl): New function.
(dump_loc): Make static.
(dump_gimple_stmt_loc): Convert param "loc" from location_t to
const dump_location_t &.
(dump_generic_expr_loc): Delete.
(dump_printf_loc): Convert param "loc" from location_t to
const dump_location_t &.
(selftest::test_impl_location): New function.
(selftest::dumpfile_c_tests): New function.
* dumpfile.h: Include "profile-count.h".
(class dump_user_location_t): New class.
(struct dump_impl_location_t): New struct.
(class dump_location_t): New class.
(dump_printf_loc): Convert 2nd param from source_location to
const dump_location_t &.
(dump_generic_expr_loc): Delete.
(dump_gimple_stmt_loc): Convert 2nd param from source_location to
const dump_location_t &.
* gimple-fold.c (fold_gimple_assign): Update call to
dump_printf_loc to pass in a dump_location_t rather than a
location_t, via the gimple stmt.
(gimple_fold_call): Likewise.
* gimple-loop-interchange.cc
(loop_cand::analyze_iloop_reduction_var): Update for change to
check_reduction_path.
(tree_loop_interchange::interchange): Update for change to
find_loop_location.
* graphite-isl-ast-to-gimple.c (scop_to_isl_ast): Update for
change in return-type of find_loop_location.
(graphite_regenerate_ast_isl): Likewise.
* graphite-optimize-isl.c (optimize_isl): Likewise.
* graphite.c (graphite_transform_loops): Likewise.
* ipa-devirt.c (ipa_devirt): Update call to dump_printf_loc to
pass in a dump_location_t rather than a location_t, via the
gimple stmt.
* ipa-prop.c (ipa_make_edge_direct_to_target): Likewise.
* ipa.c (walk_polymorphic_call_targets): Likewise.
* loop-unroll.c (report_unroll): Convert "locus" param from
location_t to dump_location_t.
(decide_unrolling): Update for change to get_loop_location's
return type.
* omp-grid.c (struct grid_prop): Convert field "target_loc" from
location_t to dump_user_location_t.
(grid_find_single_omp_among_assignments_1): Updates calls to
dump_printf_loc to pass in a dump_location_t rather than a
location_t, via the gimple stmt.
(grid_parallel_clauses_gridifiable): Convert "tloc" from
location_t to dump_location_t. Updates calls to dump_printf_loc
to pass in a dump_location_t rather than a location_t, via the
gimple stmt.
(grid_inner_loop_gridifiable_p): Likewise.
(grid_dist_follows_simple_pattern): Likewise.
(grid_gfor_follows_tiling_pattern): Likewise.
(grid_target_follows_gridifiable_pattern): Likewise.
(grid_attempt_target_gridification): Convert initialization
of local "grid" from memset to zero-initialization; FIXME: does
this require C++11? Update call to dump_printf_loc to pass in a
optinfo_location rather than a location_t, via the gimple stmt.
* profile.c (read_profile_edge_counts): Updates call to
dump_printf_loc to pass in a dump_location_t rather than a
location_t
(compute_branch_probabilities): Likewise.
* selftest-run-tests.c (selftest::run_tests): Call
dumpfile_c_tests.
* selftest.h (dumpfile_c_tests): New decl.
* tree-loop-distribution.c (pass_loop_distribution::execute):
Update for change in return type of find_loop_location.
* tree-parloops.c (parallelize_loops): Likewise.
* tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Convert
"locus" from location_t to dump_user_location_t.
(canonicalize_loop_induction_variables): Likewise.
* tree-ssa-loop-ivopts.c (tree_ssa_iv_optimize_loop): Update
for change in return type of find_loop_location.
* tree-ssa-loop-niter.c (number_of_iterations_exit): Update call
to dump_printf_loc to pass in a dump_location_t rather than a
location_t, via the stmt.
* tree-ssa-sccvn.c (eliminate_dom_walker::before_dom_children):
Likewise.
* tree-vect-loop-manip.c (find_loop_location): Convert return
type from source_location to dump_user_location_t.
(vect_do_peeling): Update for above change.
(vect_loop_versioning): Update for change in type of
vect_location.
* tree-vect-loop.c (check_reduction_path): Convert "loc" param
from location_t to dump_user_location_t.
(vect_estimate_min_profitable_iters): Update for change in type
of vect_location.
* tree-vect-slp.c (vect_print_slp_tree): Convert param "loc" from
location_t to dump_location_t.
(vect_slp_bb): Update for change in type of vect_location.
* tree-vectorizer.c (vect_location): Convert from source_location
to dump_user_location_t.
(try_vectorize_loop_1): Update for change in vect_location's type.
(vectorize_loops): Likewise.
(increase_alignment): Likewise.
* tree-vectorizer.h (vect_location): Convert from source_location
to dump_user_location_t.
(find_loop_location): Convert return type from source_location to
dump_user_location_t.
(check_reduction_path): Convert 1st param from location_t to
dump_user_location_t.
* value-prof.c (check_counter): Update call to dump_printf_loc to
pass in a dump_user_location_t rather than a location_t; update
call to error_at for change in type of "locus".
(check_ic_target): Update call to dump_printf_loc to
pass in a dump_user_location_t rather than a location_t, via the
call_stmt.
From-SVN: r262149
|
|
2018-06-21 Richard Biener <rguenther@suse.de>
* tree-data-ref.c (dr_step_indicator): Handle NULL DR_STEP.
* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
Avoid calling vect_mark_for_runtime_alias_test with gathers or scatters.
(vect_analyze_data_ref_dependence): Re-order checks to deal with
NULL DR_STEP.
(vect_record_base_alignments): Do not record base alignment
for gathers or scatters.
(vect_compute_data_ref_alignment): Drop return value that is always
true. Bail out early for gathers or scatters.
(vect_enhance_data_refs_alignment): Bail out early for gathers
or scatters.
(vect_find_same_alignment_drs): Likewise.
(vect_analyze_data_refs_alignment): Remove dead code.
(vect_slp_analyze_and_verify_node_alignment): Likewise.
(vect_analyze_data_refs): For possible gathers or scatters do
not create an alternate DR, just check their possible validity
and mark them. Adjust DECL_NONALIASED handling to not rely
on DR_BASE_ADDRESS.
* tree-vect-loop-manip.c (vect_update_inits_of_drs): Do not
update inits of gathers or scatters.
* tree-vect-patterns.c (vect_recog_mask_conversion_pattern):
Also copy gather/scatter flag to pattern vinfo.
From-SVN: r261834
|
|
gcc/ChangeLog:
* tree-vect-data-refs.c (vect_analyze_data_ref_dependences):
Replace dump_printf_loc call with DUMP_VECT_SCOPE.
(vect_slp_analyze_instance_dependence): Likewise.
(vect_enhance_data_refs_alignment): Likewise.
(vect_analyze_data_refs_alignment): Likewise.
(vect_slp_analyze_and_verify_instance_alignment
(vect_analyze_data_ref_accesses): Likewise.
(vect_prune_runtime_alias_test_list): Likewise.
(vect_analyze_data_refs): Likewise.
* tree-vect-loop-manip.c (vect_update_inits_of_drs): Likewise.
* tree-vect-loop.c (vect_determine_vectorization_factor): Likewise.
(vect_analyze_scalar_cycles_1): Likewise.
(vect_get_loop_niters): Likewise.
(vect_analyze_loop_form_1): Likewise.
(vect_update_vf_for_slp): Likewise.
(vect_analyze_loop_operations): Likewise.
(vect_analyze_loop): Likewise.
(vectorizable_induction): Likewise.
(vect_transform_loop): Likewise.
* tree-vect-patterns.c (vect_pattern_recog): Likewise.
* tree-vect-slp.c (vect_analyze_slp): Likewise.
(vect_make_slp_decision): Likewise.
(vect_detect_hybrid_slp): Likewise.
(vect_slp_analyze_operations): Likewise.
(vect_slp_bb): Likewise.
* tree-vect-stmts.c (vect_mark_stmts_to_be_vectorized): Likewise.
(vectorizable_bswap): Likewise.
(vectorizable_call): Likewise.
(vectorizable_simd_clone_call): Likewise.
(vectorizable_conversion): Likewise.
(vectorizable_assignment): Likewise.
(vectorizable_shift): Likewise.
(vectorizable_operation): Likewise.
* tree-vectorizer.h (DUMP_VECT_SCOPE): New macro.
From-SVN: r261710
|
|
2018-06-01 Richard Biener <rguenther@suse.de>
* tree-vectorizer.h (vect_dr_stmt): New function.
(vect_get_load_cost): Adjust.
(vect_get_store_cost): Likewise.
* tree-vect-data-refs.c (vect_analyze_data_ref_dependence):
Use vect_dr_stmt instead of DR_SMTT.
(vect_record_base_alignments): Likewise.
(vect_calculate_target_alignment): Likewise.
(vect_compute_data_ref_alignment): Likewise and make static.
(vect_update_misalignment_for_peel): Likewise.
(vect_verify_datarefs_alignment): Likewise.
(vector_alignment_reachable_p): Likewise.
(vect_get_data_access_cost): Likewise. Pass down
vinfo to vect_get_load_cost/vect_get_store_cost instead of DR.
(vect_get_peeling_costs_all_drs): Likewise.
(vect_peeling_hash_get_lowest_cost): Likewise.
(vect_enhance_data_refs_alignment): Likewise.
(vect_find_same_alignment_drs): Likewise.
(vect_analyze_data_refs_alignment): Likewise.
(vect_analyze_group_access_1): Likewise.
(vect_analyze_group_access): Likewise.
(vect_analyze_data_ref_access): Likewise.
(vect_analyze_data_ref_accesses): Likewise.
(vect_vfa_segment_size): Likewise.
(vect_small_gap_p): Likewise.
(vectorizable_with_step_bound_p): Likewise.
(vect_prune_runtime_alias_test_list): Likewise.
(vect_analyze_data_refs): Likewise.
(vect_supportable_dr_alignment): Likewise.
* tree-vect-loop-manip.c (get_misalign_in_elems): Likewise.
(vect_gen_prolog_loop_niters): Likewise.
* tree-vect-loop.c (vect_analyze_loop_2): Likewise.
* tree-vect-patterns.c (vect_recog_bool_pattern): Do not
modify DR_STMT.
(vect_recog_mask_conversion_pattern): Likewise.
(vect_try_gather_scatter_pattern): Likewise.
* tree-vect-stmts.c (vect_model_store_cost): Pass stmt_info
to vect_get_store_cost.
(vect_get_store_cost): Get stmt_info instead of DR.
(vect_model_load_cost): Pass stmt_info to vect_get_load_cost.
(vect_get_load_cost): Get stmt_info instead of DR.
From-SVN: r261062
|
|
PR c/84909
* c-warn.c (conversion_warning): Replace "to to" with "to" in
diagnostics.
* hsa-gen.c (mem_type_for_type): Fix comment typo.
* tree-vect-loop-manip.c (vect_create_cond_for_niters_checks):
Likewise.
* gimple-ssa-warn-restrict.c (builtin_memref::set_base_and_offset):
Likewise.
From-SVN: r258609
|
|
I hadn't realised that on big-endian targets, VEC_UNPACK*HI_EXPR unpacks
the low-numbered lanes and VEC_UNPACK*LO_EXPR unpacks the high-numbered
lanes. This meant that both the SVE patterns and the handling of
fully-masked loops were wrong.
The patch deals with that by making sure that all vec_unpack* optabs
are define_expands, using BYTES_BIG_ENDIAN to choose the appropriate
define_insn. This in turn meant that we can get rid of the duplication
between the signed and unsigned patterns for predicates. (We provide
implementations of both the signed and unsigned optabs because the sign
doesn't matter for predicates: every element contains only one
significant bit.)
Also, the float unpacks need to unpack one half of the input vector,
but the unpacked upper bits are "don't care". There are two obvious
ways of handling that: use an unpack (filling with zeros) or use a ZIP
(filling with a duplicate of the low bits). The code previously used
unpacks, but the sequence involved a subreg that is semantically an
element reverse on big-endian targets. Using the ZIP patterns avoids
that, and at the moment there's no reason to prefer one over the other
for performance reasons, so the patch switches to ZIP unconditionally.
As the comment says, it would be easy to optimise this later if UUNPK
turns out to be better for some implementations.
2018-03-13 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-vect-loop-manip.c (vect_maybe_permute_loop_masks):
Reverse the choice between VEC_UNPACK_LO_EXPR and VEC_UNPACK_HI_EXPR
for big-endian.
* config/aarch64/iterators.md (hi_lanes_optab): New int attribute.
* config/aarch64/aarch64-sve.md
(*aarch64_sve_<perm_insn><perm_hilo><mode>): Rename to...
(aarch64_sve_<perm_insn><perm_hilo><mode>): ...this.
(*extend<mode><Vwide>2): Rename to...
(aarch64_sve_extend<mode><Vwide>2): ...this.
(vec_unpack<su>_<perm_hilo>_<mode>): Turn into a define_expand,
renaming the old pattern to...
(aarch64_sve_punpk<perm_hilo>_<mode>): ...this. Only define
unsigned packs.
(vec_unpack<su>_<perm_hilo>_<SVE_BHSI:mode>): Turn into a
define_expand, renaming the old pattern to...
(aarch64_sve_<su>unpk<perm_hilo>_<SVE_BHSI:mode>): ...this.
(*vec_unpacku_<perm_hilo>_<mode>_no_convert): Delete.
(vec_unpacks_<perm_hilo>_<mode>): Take BYTES_BIG_ENDIAN into
account when deciding which SVE instruction the optab should use.
(vec_unpack<su_optab>_float_<perm_hilo>_vnx4si): Likewise.
gcc/testsuite/
* gcc.target/aarch64/sve/unpack_fcvt_signed_1.c: Expect zips rather
than unpacks.
* gcc.target/aarch64/sve/unpack_fcvt_unsigned_1.c: Likewise.
* gcc.target/aarch64/sve/unpack_float_1.c: Likewise.
From-SVN: r258489
|
|
since r256644)
2018-02-07 Richard Biener <rguenther@suse.de>
PR tree-optimization/84037
* tree-vectorizer.h (struct _loop_vec_info): Add ivexpr_map member.
(cse_and_gimplify_to_preheader): Declare.
(vect_get_place_in_interleaving_chain): Likewise.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
ivexpr_map.
(_loop_vec_info::~_loop_vec_info): Delete it.
(cse_and_gimplify_to_preheader): New function.
* tree-vect-slp.c (vect_get_place_in_interleaving_chain): Export.
* tree-vect-stmts.c (vectorizable_store): CSE base and steps.
(vectorizable_load): Likewise. For grouped stores always base
the IV on the first element.
* tree-vect-loop-manip.c (vect_loop_versioning): Unshare versioning
condition before gimplifying.
From-SVN: r257453
|
|
This patch adds runtime alias checks for loops with variable strides,
so that we can vectorise them even without a restrict qualifier.
There are several parts to doing this:
1) For accesses like:
x[i * n] += 1;
we need to check whether n (and thus the DR_STEP) is nonzero.
vect_analyze_data_ref_dependence records values that need to be
checked in this way, then prune_runtime_alias_test_list records a
bounds check on DR_STEP being outside the range [0, 0].
2) For accesses like:
x[i * n] = x[i * n + 1] + 1;
we simply need to test whether abs (n) >= 2.
prune_runtime_alias_test_list looks for cases like this and tries
to guess whether it is better to use this kind of check or a check
for non-overlapping ranges. (We could do an OR of the two conditions
at runtime, but that isn't implemented yet.)
3) Checks for overlapping ranges need to cope with variable strides.
At present the "length" of each segment in a range check is
represented as an offset from the base that lies outside the
touched range, in the same direction as DR_STEP. The length
can therefore be negative and is sometimes conservative.
With variable steps it's easier to reaon about if we split
this into two:
seg_len:
distance travelled from the first iteration of interest
to the last, e.g. DR_STEP * (VF - 1)
access_size:
the number of bytes accessed in each iteration
with access_size always being a positive constant and seg_len
possibly being variable. We can then combine alias checks
for two accesses that are a constant number of bytes apart by
adjusting the access size to account for the gap. This leaves
the segment length unchanged, which allows the check to be combined
with further accesses.
When seg_len is positive, the runtime alias check has the form:
base_a >= base_b + seg_len_b + access_size_b
|| base_b >= base_a + seg_len_a + access_size_a
In many accesses the base will be aligned to the access size, which
allows us to skip the addition:
base_a > base_b + seg_len_b
|| base_b > base_a + seg_len_a
A similar saving is possible with "negative" lengths.
The patch therefore tracks the alignment in addition to seg_len
and access_size.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vectorizer.h (vec_lower_bound): New structure.
(_loop_vec_info): Add check_nonzero and lower_bounds.
(LOOP_VINFO_CHECK_NONZERO): New macro.
(LOOP_VINFO_LOWER_BOUNDS): Likewise.
(LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Check lower_bounds too.
* tree-data-ref.h (dr_with_seg_len): Add access_size and align
fields. Make seg_len the distance travelled, not including the
access size.
(dr_direction_indicator): Declare.
(dr_zero_step_indicator): Likewise.
(dr_known_forward_stride_p): Likewise.
* tree-data-ref.c: Include stringpool.h, tree-vrp.h and
tree-ssanames.h.
(runtime_alias_check_p): Allow runtime alias checks with
variable strides.
(operator ==): Compare access_size and align.
(prune_runtime_alias_test_list): Rework for new distinction between
the access_size and seg_len.
(create_intersect_range_checks_index): Likewise. Cope with polynomial
segment lengths.
(get_segment_min_max): New function.
(create_intersect_range_checks): Use it.
(dr_step_indicator): New function.
(dr_direction_indicator): Likewise.
(dr_zero_step_indicator): Likewise.
(dr_known_forward_stride_p): Likewise.
* tree-loop-distribution.c (data_ref_segment_size): Return
DR_STEP * (niters - 1).
(compute_alias_check_pairs): Update call to the dr_with_seg_len
constructor.
* tree-vect-data-refs.c (vect_check_nonzero_value): New function.
(vect_preserves_scalar_order_p): New function, split out from...
(vect_analyze_data_ref_dependence): ...here. Check for zero steps.
(vect_vfa_segment_size): Return DR_STEP * (length_factor - 1).
(vect_vfa_access_size): New function.
(vect_vfa_align): Likewise.
(vect_compile_time_alias): Take access_size_a and access_b arguments.
(dump_lower_bound): New function.
(vect_check_lower_bound): Likewise.
(vect_small_gap_p): Likewise.
(vectorizable_with_step_bound_p): Likewise.
(vect_prune_runtime_alias_test_list): Ignore cross-iteration
depencies if the vectorization factor is 1. Convert the checks
for nonzero steps into checks on the bounds of DR_STEP. Try using
a bunds check for variable steps if the minimum required step is
relatively small. Update calls to the dr_with_seg_len
constructor and to vect_compile_time_alias.
* tree-vect-loop-manip.c (vect_create_cond_for_lower_bounds): New
function.
(vect_loop_versioning): Call it.
* tree-vect-loop.c (vect_analyze_loop_2): Clear LOOP_VINFO_LOWER_BOUNDS
when retrying.
(vect_estimate_min_profitable_iters): Account for any bounds checks.
gcc/testsuite/
* gcc.dg/vect/bb-slp-cond-1.c: Expect loop vectorization rather
than SLP vectorization.
* gcc.dg/vect/vect-alias-check-10.c: New test.
* gcc.dg/vect/vect-alias-check-11.c: Likewise.
* gcc.dg/vect/vect-alias-check-12.c: Likewise.
* gcc.dg/vect/vect-alias-check-8.c: Likewise.
* gcc.dg/vect/vect-alias-check-9.c: Likewise.
* gcc.target/aarch64/sve/strided_load_8.c: Likewise.
* gcc.target/aarch64/sve/var_stride_1.c: Likewise.
* gcc.target/aarch64/sve/var_stride_1.h: Likewise.
* gcc.target/aarch64/sve/var_stride_1_run.c: Likewise.
* gcc.target/aarch64/sve/var_stride_2.c: Likewise.
* gcc.target/aarch64/sve/var_stride_2_run.c: Likewise.
* gcc.target/aarch64/sve/var_stride_3.c: Likewise.
* gcc.target/aarch64/sve/var_stride_3_run.c: Likewise.
* gcc.target/aarch64/sve/var_stride_4.c: Likewise.
* gcc.target/aarch64/sve/var_stride_4_run.c: Likewise.
* gcc.target/aarch64/sve/var_stride_5.c: Likewise.
* gcc.target/aarch64/sve/var_stride_5_run.c: Likewise.
* gcc.target/aarch64/sve/var_stride_6.c: Likewise.
* gcc.target/aarch64/sve/var_stride_6_run.c: Likewise.
* gcc.target/aarch64/sve/var_stride_7.c: Likewise.
* gcc.target/aarch64/sve/var_stride_7_run.c: Likewise.
* gcc.target/aarch64/sve/var_stride_8.c: Likewise.
* gcc.target/aarch64/sve/var_stride_8_run.c: Likewise.
* gfortran.dg/vect/vect-alias-check-1.F90: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256644
|
|
This patch adds support for fully-masking loops that require peeling
for gaps. It peels exactly one scalar iteration and uses the masked
loop to handle the rest. Previously we would fall back on using a
standard unmasked loop instead.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vect-loop-manip.c (vect_gen_scalar_loop_niters): Replace
vfm1 with a bound_epilog parameter.
(vect_do_peeling): Update calls accordingly, and move the prologue
call earlier in the function. Treat the base bound_epilog as 0 for
fully-masked loops and retain vf - 1 for other loops. Add 1 to
this base when peeling for gaps.
* tree-vect-loop.c (vect_analyze_loop_2): Allow peeling for gaps
with fully-masked loops.
(vect_estimate_min_profitable_iters): Handle the single peeled
iteration in that case.
gcc/testsuite/
* gcc.target/aarch64/sve/struct_vect_18.c: Check the number
of branches.
* gcc.target/aarch64/sve/struct_vect_19.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_20.c: New test.
* gcc.target/aarch64/sve/struct_vect_20_run.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_21.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_21_run.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_22.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_22_run.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_23.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_23_run.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256635
|
|
This patch adds support for aligning vectors by using a partial
first iteration. E.g. if the start pointer is 3 elements beyond
an aligned address, the first iteration will have a mask in which
the first three elements are false.
On SVE, the optimisation is only useful for vector-length-specific
code. Vector-length-agnostic code doesn't try to align vectors
since the vector length might not be a power of 2.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vectorizer.h (_loop_vec_info::mask_skip_niters): New field.
(LOOP_VINFO_MASK_SKIP_NITERS): New macro.
(vect_use_loop_mask_for_alignment_p): New function.
(vect_prepare_for_masked_peels, vect_gen_while_not): Declare.
* tree-vect-loop-manip.c (vect_set_loop_masks_directly): Add an
niters_skip argument. Make sure that the first niters_skip elements
of the first iteration are inactive.
(vect_set_loop_condition_masked): Handle LOOP_VINFO_MASK_SKIP_NITERS.
Update call to vect_set_loop_masks_directly.
(get_misalign_in_elems): New function, split out from...
(vect_gen_prolog_loop_niters): ...here.
(vect_update_init_of_dr): Take a code argument that specifies whether
the adjustment should be added or subtracted.
(vect_update_init_of_drs): Likewise.
(vect_prepare_for_masked_peels): New function.
(vect_do_peeling): Skip prologue peeling if we're using a mask
instead. Update call to vect_update_inits_of_drs.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
mask_skip_niters.
(vect_analyze_loop_2): Allow fully-masked loops with peeling for
alignment. Do not include the number of peeled iterations in
the minimum threshold in that case.
(vectorizable_induction): Adjust the start value down by
LOOP_VINFO_MASK_SKIP_NITERS iterations.
(vect_transform_loop): Call vect_prepare_for_masked_peels.
Take the number of skipped iterations into account when calculating
the loop bounds.
* tree-vect-stmts.c (vect_gen_while_not): New function.
gcc/testsuite/
* gcc.target/aarch64/sve/nopeel_1.c: New test.
* gcc.target/aarch64/sve/peel_ind_1.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_1_run.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_2.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_2_run.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_3.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_3_run.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_4.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_4_run.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256630
|