path: root/gcc/tree-vect-loop.c
Age | Commit message | Author | Files | Lines
2019-12-10 | Add missing conversion in vect_create_epilog_for_reduction | Richard Sandiford | 1 | -0/+2
The direct_slp_reduc code in vect_create_epilog_for_reduction was still assuming that all types involved in a reduction are the same (up to types_compatible_p), whereas we now support differences in sign. This was causing an ICE in gcc.dg/vect/pr92324-4.c for SVE. 2019-12-10 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-loop.c (vect_create_epilog_for_reduction): When handling direct_slp_reduc, allow the PHI arguments to have a different type from the vector elements. From-SVN: r279164
2019-12-10 | Disallow EXTRACT_LAST_REDUCTION for reduction chains | Richard Sandiford | 1 | -2/+3
gcc.dg/vect/vect-cond-reduc-5.c was ICEing for SVE because we tried to use an extract-last reduction for a chain of COND_EXPRs. Adding support for the chained case would be too invasive for stage 3 so this patch explicitly forbids it instead. I've filed PR92884 for the possible future work. 2019-12-10 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-loop.c (vectorizable_reduction): Don't use EXTRACT_LAST_REDUCTION for chained reductions. From-SVN: r279161
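As a rough source-level illustration of a chain of COND_EXPRs in a reduction, consider the sketch below. It is a hypothetical example (the function name `chained_select` and the constants are assumptions), not the contents of vect-cond-reduc-5.c: the reduction variable `res` passes through two conditional selects in every iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example of a chained conditional reduction: RES flows
   through two conditional selects in each iteration.  */
int
chained_select (const int *a, const int *b, size_t n)
{
  assert (n == 0 || (a != NULL && b != NULL));
  int res = 0;
  for (size_t i = 0; i < n; ++i)
    {
      res = a[i] < 0 ? 1 : res;  /* first select in the chain */
      res = b[i] < 0 ? 2 : res;  /* second select, fed by the first */
    }
  return res;
}
```

Whether a given compiler version actually treats such a loop as a reduction chain depends on the target and options, so this only shows the general shape.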
2019-12-05 | [Patch, GCC] Fix a condition post r278611 | Sudakshina Das | 1 | -1/+1
gcc/ChangeLog 2019-12-05 Sudakshina Das <sudi.das@arm.com> * tree-vect-loop.c (vect_model_reduction_cost): Remove reduction_type check from if condition. From-SVN: r279012
2019-12-02 | re PR tree-optimization/92742 (ICE in info_for_reduction, at tree-vect-loop.c:4367) | Richard Biener | 1 | -1/+2
2019-12-02 Richard Biener <rguenther@suse.de> PR tree-optimization/92742 * tree-vect-loop.c (vect_fixup_reduc_chain): Do not touch the def-type but verify it is consistent with the original stmts. * gcc.dg/torture/pr92742.c: New testcase. From-SVN: r278896
2019-11-29 | Fix DR_GROUP_GAP for strided accesses (PR 92677) | Richard Sandiford | 1 | -1/+4
When dissolving an SLP-only group of accesses, we should only set the gap to group_size - 1 for normal non-strided groups. 2019-11-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ PR tree-optimization/92677 * tree-vect-loop.c (vect_dissolve_slp_only_groups): Set the gap to zero when dissolving a group of strided accesses. gcc/testsuite/ PR tree-optimization/92677 * gcc.dg/vect/pr92677.c: New test. From-SVN: r278852
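The rule described above can be sketched as a small helper. This is only an illustration of the stated rule: `dissolved_gap` and its parameters are hypothetical names, not functions or fields from the GCC sources.

```c
#include <assert.h>

/* Hypothetical helper: the gap to record for each access once an
   SLP-only group of GROUP_SIZE accesses has been dissolved into
   single-element groups.  Strided accesses get a gap of zero; normal
   non-strided accesses get GROUP_SIZE - 1.  */
unsigned
dissolved_gap (unsigned group_size, int strided_p)
{
  assert (group_size >= 1);
  return strided_p ? 0 : group_size - 1;
}
```

For a former group of four accesses, this gives a gap of 3 in the non-strided case and 0 in the strided case.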
2019-11-29 | Don't defer choice of vector type for bools (PR 92596) | Richard Sandiford | 1 | -30/+8
Now that stmt_vec_info records the choice between vector mask types and normal nonmask types, we can use that information in vect_get_vector_types_for_stmt instead of deferring the choice of vector type till later.

vect_get_mask_type_for_stmt used to check whether the boolean inputs to an operation:
(a) consistently used mask types or consistently used nonmask types; and
(b) agreed on the number of elements.

(b) shouldn't be a problem when (a) is met. If the operation consistently uses mask types, tree-vect-patterns.c will have corrected any mismatches in mask precision. (This is because we only use mask types for a small well-known set of operations and tree-vect-patterns.c knows how to handle any that could have different mask precisions.) And if the operation consistently uses normal nonmask types, there's no reason why booleans should need extra vector compatibility checks compared to ordinary integers.

So the potential difficulties all seem to come from (a). Now that we've chosen the result type ahead of time, we also have to consider whether the outputs and inputs consistently use mask types.

Taking each vectorizable_* routine in turn:

- vectorizable_call
  vect_get_vector_types_for_stmt only handled booleans specially for gassigns, so vect_get_mask_type_for_stmt never had chance to handle calls. I'm not sure we support any calls that operate on booleans, but as things stand, a boolean result would always have a nonmask type. Presumably any vector argument would also need to use nonmask types, unless it corresponds to internal_fn_mask_index (which is already a special case). For safety, I've added a check for mask/nonmask combinations here even though we didn't check this previously.
- vectorizable_simd_clone_call
  Again, vect_get_mask_type_for_stmt never had chance to handle calls. The result of the call will always be a nonmask type and the patch for PR 92710 rejects mask arguments. So all booleans should consistently use nonmask types here.
- vectorizable_conversion
  The function already rejects any conversion between booleans in which one type isn't a mask type.
- vectorizable_operation
  This function definitely needs a consistency check, e.g. to handle & and | in which one operand is loaded from memory and the other is a comparison result. Ideally we'd handle this via pattern stmts instead (like we do for the all-mask case), but that's future work.
- vectorizable_assignment
  VECT_SCALAR_BOOLEAN_TYPE_P requires single-bit precision, so the current code already rejects problematic cases.
- vectorizable_load
  Loads always produce nonmask types and there are no relevant inputs to check against.
- vectorizable_store
  vect_check_store_rhs already rejects mask/nonmask combinations via useless_type_conversion_p.
- vectorizable_reduction
- vectorizable_lc_phi
  PHIs always have nonmask types. After the change above, attempts to combine the PHI result with a mask type would be rejected by vectorizable_operation. (Again, it would be better to handle this using pattern stmts.)
- vectorizable_induction
  We don't generate inductions for booleans.
- vectorizable_shift
  The function already rejects boolean shifts via type_has_mode_precision_p.
- vectorizable_condition
  The function already rejects mismatches via useless_type_conversion_p.
- vectorizable_comparison
  The function already rejects comparisons between mask and nonmask types. The result is always a mask type.

2019-11-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ PR tree-optimization/92596 * tree-vect-stmts.c (vectorizable_call): Punt on hybrid mask/nonmask operations. (vectorizable_operation): Likewise, instead of relying on vect_get_mask_type_for_stmt to do this. (vect_get_vector_types_for_stmt): Always return a vector type immediately, rather than deferring the choice for boolean results. Use a vector mask type instead of a normal vector if vect_use_mask_type_p. (vect_get_mask_type_for_stmt): Delete.
* tree-vect-loop.c (vect_determine_vf_for_stmt_1): Remove mask_producers argument and special boolean_type_node handling. (vect_determine_vf_for_stmt): Remove mask_producers argument and update calls to vect_determine_vf_for_stmt_1. Remove doubled call. (vect_determine_vectorization_factor): Update call accordingly. * tree-vect-slp.c (vect_build_slp_tree_1): Remove special boolean_type_node handling. (vect_slp_analyze_node_operations_1): Likewise. gcc/testsuite/ PR tree-optimization/92596 * gcc.dg/vect/bb-slp-pr92596.c: New test. * gcc.dg/vect/bb-slp-43.c: Likewise. From-SVN: r278851
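The vectorizable_operation case in the message above (an & whose operands are a boolean loaded from memory and a comparison result) has roughly the following source-level shape. This is a hypothetical sketch, not one of the new testcases; the name `and_with_compare` is an assumption, and how the vectoriser classifies each operand depends on the target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical example: one operand of '&' is a boolean loaded from
   memory, the other is the result of a comparison.  */
void
and_with_compare (bool *out, const bool *flags,
                  const int *x, const int *y, size_t n)
{
  assert (n == 0 || (out && flags && x && y));
  for (size_t i = 0; i < n; ++i)
    out[i] = flags[i] & (x[i] < y[i]);
}
```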
2019-11-22 | Move EXTRACT_LAST_REDUCTION costing to vectorizable_condition | Richard Sandiford | 1 | -2/+5
gcc.target/aarch64/sve/clastb_[57].c started failing after the increase in the cost of vec_to_scalar (r278452). The problem is that we were double-counting the cost of the CLASTB: once in vect_model_reduction_cost as a vec_to_scalar and once in vectorizable_condition as a plain vector_stmt. Based on the TODO above vect_model_reduction_cost, I think the preferred long-term direction is for vectorizable_* to cost these things itself, so that's what the patch does (for this one case only). 2019-11-22 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-stmts.c (vect_model_simple_cost): Take an optional vect_cost_for_stmt. (vectorizable_condition): Calculate the cost of EXTRACT_LAST_REDUCTION here rather than... * tree-vect-loop.c (vect_model_reduction_cost): ...here. From-SVN: r278611
2019-11-19 | re PR tree-optimization/92581 (condition chains vectorized wrongly) | Richard Biener | 1 | -26/+32
2019-11-19 Richard Biener <rguenther@suse.de> PR tree-optimization/92581 * tree-vect-loop.c (vect_create_epilog_for_reduction): For condition reduction chains gather all conditions involved for computing the index reduction vector. * gcc.dg/vect/vect-cond-reduc-5.c: New testcase. From-SVN: r278445
2019-11-19 | re PR tree-optimization/92554 (ICE in vect_create_epilog_for_reduction, at tree-vect-loop.c:4325) | Richard Biener | 1 | -11/+21
2019-11-19 Richard Biener <rguenther@suse.de> PR tree-optimization/92554 * tree-vect-loop.c (vect_create_epilog_for_reduction): Look for the actual condition stmt and deal with sign-changes. * gcc.dg/vect/pr92554.c: New testcase. From-SVN: r278431
2019-11-19 | re PR tree-optimization/92555 (ICE in exact_div, at poly-int.h:2162) | Richard Biener | 1 | -0/+12
2019-09-19 Richard Biener <rguenther@suse.de> PR tree-optimization/92555 * tree-vect-loop.c (vect_update_vf_for_slp): Also scan PHIs for non-SLP stmts. * gcc.dg/vect/pr92555.c: New testcase. From-SVN: r278430
2019-11-18 | re PR tree-optimization/92558 (Miscompare of 554.roms_r with -Ofast -march=znver2 -flto since r278289) | Richard Biener | 1 | -0/+1
2019-11-18 Richard Biener <rguenther@suse.de> PR tree-optimization/92558 * tree-vect-loop.c (vect_create_epilog_for_reduction): When reducing the width of a reduction vector def update new_phis. * gcc.dg/vect/pr92558.c: New testcase. From-SVN: r278400
2019-11-16 | Optionally pick the cheapest loop_vec_info | Richard Sandiford | 1 | -7/+141
This patch adds a mode in which the vectoriser tries each available base vector mode and picks the one with the lowest cost. The new behaviour is selected by autovectorize_vector_modes. The patch keeps the current behaviour of preferring a VF of loop->simdlen over any larger or smaller VF, regardless of costs or target preferences. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * target.h (VECT_COMPARE_COSTS): New constant. * target.def (autovectorize_vector_modes): Return a bitmask of flags. * doc/tm.texi: Regenerate. * targhooks.h (default_autovectorize_vector_modes): Update accordingly. * targhooks.c (default_autovectorize_vector_modes): Likewise. * config/aarch64/aarch64.c (aarch64_autovectorize_vector_modes): Likewise. * config/arc/arc.c (arc_autovectorize_vector_modes): Likewise. * config/arm/arm.c (arm_autovectorize_vector_modes): Likewise. * config/i386/i386.c (ix86_autovectorize_vector_modes): Likewise. * config/mips/mips.c (mips_autovectorize_vector_modes): Likewise. * tree-vectorizer.h (_loop_vec_info::vec_outside_cost) (_loop_vec_info::vec_inside_cost): New member variables. * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize them. (vect_better_loop_vinfo_p, vect_joust_loop_vinfos): New functions. (vect_analyze_loop): When autovectorize_vector_modes returns VECT_COMPARE_COSTS, try vectorizing the loop with each available vector mode and picking the one with the lowest cost. (vect_estimate_min_profitable_iters): Record the computed costs in the loop_vec_info. From-SVN: r278336
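The idea of trying each candidate and keeping the cheapest can be sketched as follows. This is a simplified illustration under stated assumptions, not the logic of vect_better_loop_vinfo_p: the `struct candidate` layout, the function names, and the comparison rule (inside cost per scalar iteration first, outside cost as a tie-break) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one vectorisation attempt.  */
struct candidate
{
  const char *mode_name;   /* label only, e.g. "V16QI" */
  unsigned vf;             /* vectorisation factor */
  unsigned inside_cost;    /* cost of one vector iteration */
  unsigned outside_cost;   /* one-off cost outside the loop */
};

/* Assumed rule: A beats B if it is cheaper per scalar iteration
   (inside_cost / vf, compared by cross-multiplying), with the outside
   cost breaking ties.  */
static int
better_candidate_p (const struct candidate *a, const struct candidate *b)
{
  unsigned long a_scaled = (unsigned long) a->inside_cost * b->vf;
  unsigned long b_scaled = (unsigned long) b->inside_cost * a->vf;
  if (a_scaled != b_scaled)
    return a_scaled < b_scaled;
  return a->outside_cost < b->outside_cost;
}

/* Return the index of the cheapest of the N candidates in C.  */
size_t
pick_cheapest (const struct candidate *c, size_t n)
{
  assert (c != NULL && n > 0);
  size_t best = 0;
  for (size_t i = 1; i < n; ++i)
    if (better_candidate_p (&c[i], &c[best]))
      best = i;
  return best;
}
```

The sketch leaves out the loop->simdlen preference mentioned in the message, which takes priority over cost.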
2019-11-16 | Extend can_duplicate_and_interleave_p to mixed-size vectors | Richard Sandiford | 1 | -2/+1
This patch makes can_duplicate_and_interleave_p cope with mixtures of vector sizes, by using queries based on get_vectype_for_scalar_type instead of directly querying GET_MODE_SIZE (vinfo->vector_mode). int_mode_for_size is now the first check we do for a candidate mode, so it seemed better to restrict it to MAX_FIXED_MODE_SIZE. This avoids unnecessary work and avoids trying to create scalar types that the target might not support. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (can_duplicate_and_interleave_p): Take an element type rather than an element mode. * tree-vect-slp.c (can_duplicate_and_interleave_p): Likewise. Use get_vectype_for_scalar_type to query the natural types for a given element type rather than basing everything on GET_MODE_SIZE (vinfo->vector_mode). Limit int_mode_for_size query to MAX_FIXED_MODE_SIZE. (duplicate_and_interleave): Update call accordingly. * tree-vect-loop.c (vectorizable_reduction): Likewise. From-SVN: r278335
2019-11-15 | re PR tree-optimization/92512 (ICE in gimple_op, at gimple.h:2436) | Richard Biener | 1 | -4/+17
2019-11-15 Richard Biener <rguenther@suse.de> PR tree-optimization/92512 * tree-vect-loop.c (check_reduction_path): Fix operand index computability check. Add check for second use in COND_EXPRs. * gcc.dg/torture/pr92512.c: New testcase. From-SVN: r278293
2019-11-15 | re PR tree-optimization/92324 (ICE in expand_direct_optab_fn, at internal-fn.c:2890) | Richard Biener | 1 | -30/+38
2019-11-15 Richard Biener <rguenther@suse.de> PR tree-optimization/92324 * tree-vect-loop.c (vect_create_epilog_for_reduction): Fix signedness of SLP reduction epilogue operations. Also reduce the vector width for SLP reductions before doing elementwise operations if possible. * gcc.dg/vect/pr92324-4.c: New testcase. From-SVN: r278289
2019-11-14 | Avoid retrying with the same vector modes | Richard Sandiford | 1 | -0/+13
A later patch makes the AArch64 port add four entries to autovectorize_vector_modes. Each entry describes a different vector mode assignment for vector code that mixes 8-bit, 16-bit, 32-bit and 64-bit elements. But if (as usual) the vector code has fewer element sizes than that, we could end up trying the same combination of vector modes multiple times. This patch adds a check to prevent that. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vec_info::mode_set): New typedef. (vec_info::used_vector_mode): New member variable. (vect_chooses_same_modes_p): Declare. * tree-vect-stmts.c (get_vectype_for_scalar_type): Record each chosen vector mode in vec_info::used_vector_mode. (vect_chooses_same_modes_p): New function. * tree-vect-loop.c (vect_analyze_loop): Use it to avoid trying the same vector statements multiple times. * tree-vect-slp.c (vect_slp_bb_region): Likewise. From-SVN: r278242
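The redundancy check described above can be sketched like this. It is a hypothetical model, not the implementation of vect_chooses_same_modes_p: each "approach" is modelled as a table from element size to an integer mode id, and two approaches count as equivalent for a piece of code if they agree on every element size that the code actually uses.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM_ELT_SIZES = 4 };   /* 8-, 16-, 32- and 64-bit elements */

/* Hypothetical model of one entry in the target's list: the vector
   mode (as an arbitrary integer id) chosen for each element size.  */
struct approach
{
  int mode_for_elt[NUM_ELT_SIZES];
};

/* Return nonzero if NEXT would choose the same vector modes as PREV
   for every element size marked as used in USED, in which case
   retrying with NEXT would repeat the earlier attempt.  */
int
chooses_same_modes_p (const struct approach *prev,
                      const struct approach *next, const int *used)
{
  assert (prev != NULL && next != NULL && used != NULL);
  for (int k = 0; k < NUM_ELT_SIZES; ++k)
    if (used[k] && prev->mode_for_elt[k] != next->mode_for_elt[k])
      return 0;
  return 1;
}
```

With two approaches that differ only in the mode for 8-bit elements, code using just 16-bit and 32-bit elements would be analysed identically under both, so the second attempt can be skipped.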
2019-11-14 | Support vectorisation with mixed vector sizes | Richard Sandiford | 1 | -14/+40
After previous patches, it's now possible to make the vectoriser support multiple vector sizes in the same vector region, using related_vector_mode to pick the right vector mode for a given element mode. No port yet takes advantage of this, but I have a follow-on patch for AArch64. This patch also seemed like a good opportunity to add some more dump messages: one to make it clear which vector size/mode was being used when analysis passed or failed, and another to say when we've decided to skip a redundant vector size/mode. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * machmode.h (opt_machine_mode::operator==): New function. (opt_machine_mode::operator!=): Likewise. * tree-vectorizer.h (vec_info::vector_mode): Update comment. (get_related_vectype_for_scalar_type): Delete. (get_vectype_for_scalar_type_and_size): Declare. * tree-vect-slp.c (vect_slp_bb_region): Print dump messages to say whether analysis passed or failed, and with what vector modes. Use related_vector_mode to check whether trying a particular vector mode would be redundant with the autodetected mode, and print a dump message if we decide to skip it. * tree-vect-loop.c (vect_analyze_loop): Likewise. (vect_create_epilog_for_reduction): Use get_related_vectype_for_scalar_type instead of get_vectype_for_scalar_type_and_size. * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Replace with... (get_related_vectype_for_scalar_type): ...this new function. Take a starting/"prevailing" vector mode rather than a vector size. Take an optional nunits argument, with the same meaning as for related_vector_mode. Use related_vector_mode when not auto-detecting a mode, falling back to mode_for_vector if no target mode exists. (get_vectype_for_scalar_type): Update accordingly. (get_same_sized_vectype): Likewise. * tree-vectorizer.c (get_vec_alignment_for_array_type): Likewise. From-SVN: r278240
2019-11-14 | Replace vec_info::vector_size with vec_info::vector_mode | Richard Sandiford | 1 | -19/+13
This patch replaces vec_info::vector_size with vec_info::vector_mode, but for now continues to use it as a way of specifying a single vector size. This makes it easier for later patches to use related_vector_mode instead. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vec_info::vector_size): Replace with... (vec_info::vector_mode): ...this new field. * tree-vect-loop.c (vect_update_vf_for_slp): Update accordingly. (vect_analyze_loop, vect_transform_loop): Likewise. * tree-vect-loop-manip.c (vect_do_peeling): Likewise. * tree-vect-slp.c (can_duplicate_and_interleave_p): Likewise. (vect_make_slp_decision, vect_slp_bb_region): Likewise. * tree-vect-stmts.c (get_vectype_for_scalar_type): Likewise. * tree-vectorizer.c (try_vectorize_loop_1): Likewise. gcc/testsuite/ * gcc.dg/vect/vect-tail-nomask-1.c: Update expected epilogue vectorization message. From-SVN: r278237
2019-11-14 | Replace autovectorize_vector_sizes with autovectorize_vector_modes | Richard Sandiford | 1 | -18/+15
This is another patch in the series to remove the assumption that all modes involved in vectorisation have to be the same size. Rather than have the target provide a list of vector sizes, it makes the target provide a list of vector "approaches", with each approach represented by a mode. A later patch will pass this mode to targetm.vectorize.related_mode to get the vector mode for a given element mode. Until then, the modes simply act as an alternative way of specifying the vector size. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * target.h (vector_sizes, auto_vector_sizes): Delete. (vector_modes, auto_vector_modes): New typedefs. * target.def (autovectorize_vector_sizes): Replace with... (autovectorize_vector_modes): ...this new hook. * doc/tm.texi.in (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Replace with... (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): ...this new hook. * doc/tm.texi: Regenerate. * targhooks.h (default_autovectorize_vector_sizes): Delete. (default_autovectorize_vector_modes): New function. * targhooks.c (default_autovectorize_vector_sizes): Delete. (default_autovectorize_vector_modes): New function. * omp-general.c (omp_max_vf): Use autovectorize_vector_modes instead of autovectorize_vector_sizes. Use the number of units in the mode to calculate the maximum VF. * omp-low.c (omp_clause_aligned_alignment): Use autovectorize_vector_modes instead of autovectorize_vector_sizes. Use a loop based on related_mode to iterate through all supported vector modes for a given scalar mode. * optabs-query.c (can_vec_mask_load_store_p): Use autovectorize_vector_modes instead of autovectorize_vector_sizes. * tree-vect-loop.c (vect_analyze_loop, vect_transform_loop): Likewise. * tree-vect-slp.c (vect_slp_bb_region): Likewise. * config/aarch64/aarch64.c (aarch64_autovectorize_vector_sizes): Replace with... (aarch64_autovectorize_vector_modes): ...this new function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete. 
(TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define. * config/arc/arc.c (arc_autovectorize_vector_sizes): Replace with... (arc_autovectorize_vector_modes): ...this new function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define. * config/arm/arm.c (arm_autovectorize_vector_sizes): Replace with... (arm_autovectorize_vector_modes): ...this new function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define. * config/i386/i386.c (ix86_autovectorize_vector_sizes): Replace with... (ix86_autovectorize_vector_modes): ...this new function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define. * config/mips/mips.c (mips_autovectorize_vector_sizes): Replace with... (mips_autovectorize_vector_modes): ...this new function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define. From-SVN: r278236
2019-11-14 | Remove build_{same_sized_,}truth_vector_type | Richard Sandiford | 1 | -5/+4
build_same_sized_truth_vector_type was confusingly named, since for SVE and AVX512 the returned vector isn't the same byte size (although it does have the same number of elements). What it really returns is the "truth" vector type for a given data vector type. The more general truth_type_for provides the same thing when passed a vector and IMO has a more descriptive name, so this patch replaces all uses of build_same_sized_truth_vector_type with that. It does the same for a call to build_truth_vector_type, leaving truth_type_for itself as the only remaining caller. It's then more natural to pass build_truth_vector_type the original vector type rather than its size and nunits, especially since the given size isn't the size of the returned vector. This in turn allows a future patch to simplify the interface of get_mask_mode. Doing this also fixes a bug in which truth_type_for would pass a size of zero for BLKmode vector types. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree.h (build_truth_vector_type): Delete. (build_same_sized_truth_vector_type): Likewise. * tree.c (build_truth_vector_type): Rename to... (build_truth_vector_type_for): ...this. Make static and take a vector type as argument. (truth_type_for): Update accordingly. (build_same_sized_truth_vector_type): Delete. * tree-vect-generic.c (expand_vector_divmod): Use truth_type_for instead of build_same_sized_truth_vector_type. * tree-vect-loop.c (vect_create_epilog_for_reduction): Likewise. (vect_record_loop_mask, vect_get_loop_mask): Likewise. * tree-vect-patterns.c (build_mask_conversion): Likewise. * tree-vect-slp.c (vect_get_constant_vectors): Likewise. * tree-vect-stmts.c (vect_get_vec_def_for_operand): Likewise. (vect_build_gather_load_calls, vectorizable_call): Likewise. (scan_store_can_perm_p, vectorizable_scan_store): Likewise. (vectorizable_store, vectorizable_condition): Likewise. (get_mask_type_for_scalar_type, get_same_sized_vectype): Likewise.
(vect_get_mask_type_for_stmt): Use truth_type_for instead of build_truth_vector_type. * config/aarch64/aarch64-sve-builtins.cc (gimple_folder::convert_pred): Use truth_type_for instead of build_same_sized_truth_vector_type. * config/rs6000/rs6000-call.c (fold_build_vec_cmp): Likewise. gcc/c/ * c-typeck.c (build_conditional_expr): Use truth_type_for instead of build_same_sized_truth_vector_type. (build_vec_cmp): Likewise. gcc/cp/ * call.c (build_conditional_expr_1): Use truth_type_for instead of build_same_sized_truth_vector_type. * typeck.c (build_vec_cmp): Likewise. gcc/d/ * d-codegen.cc (build_boolop): Use truth_type_for instead of build_same_sized_truth_vector_type. From-SVN: r278232
2019-11-14 | Add build_truth_vector_type_for_mode | Richard Sandiford | 1 | -8/+10
Callers of vect_halve_mask_nunits and vect_double_mask_nunits already know what mode the resulting vector type should have, so we might as well create the vector type directly with that mode, just like build_vector_type_for_mode lets us build normal vectors with a known mode. This avoids the current awkwardness of having to recompute the mode starting from vec_info::vector_size, which hard-codes the assumption that all vectors have to be the same size. A later patch gets rid of build_truth_vector_type and build_same_sized_truth_vector_type, so the net effect of the series is to reduce the number of type functions by one. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree.h (build_truth_vector_type_for_mode): Declare. * tree.c (build_truth_vector_type_for_mode): New function, split out from... (build_truth_vector_type): ...here. (build_opaque_vector_type): Fix head comment. * tree-vectorizer.h (supportable_narrowing_operation): Remove vec_info parameter. (vect_halve_mask_nunits): Replace vec_info parameter with the mode of the new vector. (vect_double_mask_nunits): Likewise. * tree-vect-loop.c (vect_halve_mask_nunits): Likewise. (vect_double_mask_nunits): Likewise. * tree-vect-loop-manip.c: Include insn-config.h, rtl.h and recog.h. (vect_maybe_permute_loop_masks): Remove vinfo parameter. Update call to vect_halve_mask_nunits, getting the required mode from the unpack patterns. (vect_set_loop_condition_masked): Update call accordingly. * tree-vect-stmts.c (supportable_narrowing_operation): Remove vec_info parameter and update call to vect_double_mask_nunits. (vectorizable_conversion): Update call accordingly. (simple_integer_narrowing): Likewise. Remove vec_info parameter. (vectorizable_call): Update call accordingly. (supportable_widening_operation): Update call to vect_halve_mask_nunits. * config/aarch64/aarch64-sve-builtins.cc (register_builtin_types): Use build_truth_vector_type_mode instead of build_truth_vector_type. From-SVN: r278231
2019-11-13 | Account for the cost of generating loop masks | Richard Sandiford | 1 | -0/+26
We didn't take the cost of generating loop masks into account, and so tended to underestimate the cost of loops that need multiple masks. 2019-11-13 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-loop.c (vect_estimate_min_profitable_iters): Include the cost of generating loop masks. gcc/testsuite/ * gcc.target/aarch64/sve/mask_struct_store_3.c: Add -fno-vect-cost-model. * gcc.target/aarch64/sve/mask_struct_store_3_run.c: Likewise. * gcc.target/aarch64/sve/peel_ind_2.c: Likewise. * gcc.target/aarch64/sve/peel_ind_2_run.c: Likewise. * gcc.target/aarch64/sve/peel_ind_3.c: Likewise. * gcc.target/aarch64/sve/peel_ind_3_run.c: Likewise. From-SVN: r278125
2019-11-13 | Avoid accounting for non-existent vector loop versioning | Richard Sandiford | 1 | -9/+25
vect_analyze_loop_costing uses two profitability thresholds: a runtime one and a static compile-time one. The runtime one is simply the point at which the vector loop is cheaper than the scalar loop, while the static one also takes into account the cost of choosing between the scalar and vector loops at runtime. We compare this static cost against the expected execution frequency to decide whether it's worth generating any vector code at all. However, we never reclaimed the cost of applying the runtime threshold if it turned out that the vector code can always be used. And we only know whether that's true once we've calculated what the runtime threshold would be. 2019-11-13 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vect_apply_runtime_profitability_check_p): New function. * tree-vect-loop-manip.c (vect_loop_versioning): Use it. * tree-vect-loop.c (vect_analyze_loop_2): Likewise. (vect_transform_loop): Likewise. (vect_analyze_loop_costing): Don't take the cost of versioning into account for the static profitability threshold if it turns out that no versioning is needed. From-SVN: r278124
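The relationship between the two thresholds can be sketched with a deliberately simple linear cost model. Everything here is an assumption for illustration (the function names, the parameters, and the model itself); it is not how vect_analyze_loop_costing computes its values.

```c
#include <assert.h>

/* Assumed model: N scalar iterations cost N * SCALAR_ITER_COST, and the
   vector version costs OVERHEAD + N * VECTOR_ITER_COST, where
   VECTOR_ITER_COST is expressed per scalar iteration.  Return the
   smallest N for which the vector version is strictly cheaper.  */
unsigned
min_profitable_iters (unsigned scalar_iter_cost, unsigned vector_iter_cost,
                      unsigned overhead)
{
  assert (scalar_iter_cost > vector_iter_cost);
  return overhead / (scalar_iter_cost - vector_iter_cost) + 1;
}

/* The static threshold additionally pays CHECK_COST for choosing
   between the scalar and vector loops at run time, but only when that
   check is actually needed.  */
unsigned
static_threshold (unsigned scalar_iter_cost, unsigned vector_iter_cost,
                  unsigned outside_cost, unsigned check_cost,
                  int check_needed_p)
{
  unsigned overhead = outside_cost + (check_needed_p ? check_cost : 0);
  return min_profitable_iters (scalar_iter_cost, vector_iter_cost, overhead);
}
```

Under this model, dropping the check cost when no runtime check is required lowers the static threshold, which is the effect the patch aims for.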
2019-11-13 | Don't assign a cost to vectorizable_assignment | Richard Sandiford | 1 | -1/+3
vectorizable_assignment handles true SSA-to-SSA copies (which hopefully we don't see in practice) and no-op conversions that are required to maintain correct gimple, such as changes between signed and unsigned types. These cases shouldn't generate any code and so shouldn't count against either the scalar or vector costs. Later patches test this, but it seemed worth splitting out. 2019-11-13 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vect_nop_conversion_p): Declare. * tree-vect-stmts.c (vect_nop_conversion_p): New function. (vectorizable_assignment): Don't add a cost for nop conversions. * tree-vect-loop.c (vect_compute_single_scalar_iteration_cost): Likewise. * tree-vect-slp.c (vect_bb_slp_scalar_cost): Likewise. From-SVN: r278122
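A source-level example of the kind of no-op conversion mentioned above is a cast between signed and unsigned types of the same width. The function below is a hypothetical illustration (the name `sum_as_unsigned` is an assumption); the cast changes the type but typically needs no instruction of its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: the (unsigned) cast is a sign-changing
   conversion between types of the same width, so it should not add to
   the cost of either the scalar or the vector loop.  */
unsigned
sum_as_unsigned (const int *a, size_t n)
{
  assert (n == 0 || a != NULL);
  unsigned sum = 0;
  for (size_t i = 0; i < n; ++i)
    sum += (unsigned) a[i];
  return sum;
}
```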
2019-11-13 | re PR target/92473 (test pr92324-2.c fails on arm and aarch64) | Richard Biener | 1 | -24/+6
2019-11-13 Richard Biener <rguenther@suse.de> PR tree-optimization/92473 * tree-vect-loop.c (vect_create_epilog_for_reduction): Perform direct optab reduction in the correct type. From-SVN: r278113
2019-11-12 | re PR tree-optimization/92461 (ICE: verify_ssa failed (error: excess use operand for statement)) | Richard Biener | 1 | -2/+5
2019-11-12 Richard Biener <rguenther@suse.de> PR tree-optimization/92461 * tree-vect-loop.c (vect_create_epilog_for_reduction): Update stmt after propagation. * gcc.dg/torture/pr92461.c: New testcase. From-SVN: r278093
2019-11-12 | Remove gcc/params.* files. | Martin Liska | 1 | -1/+0
2019-11-12 Martin Liska <mliska@suse.cz> * Makefile.in: Remove PARAMS_H and params.list and params.options. * params-enum.h: Remove. * params-list.h: Remove. * params-options.h: Remove. * params.c: Remove. * params.def: Remove. * params.h: Remove. * asan.c: Do not include params.h. * auto-profile.c: Likewise. * bb-reorder.c: Likewise. * builtins.c: Likewise. * cfgcleanup.c: Likewise. * cfgexpand.c: Likewise. * cfgloopanal.c: Likewise. * cgraph.c: Likewise. * combine.c: Likewise. * common/config/aarch64/aarch64-common.c: Likewise. * common/config/gcn/gcn-common.c: Likewise. * common/config/ia64/ia64-common.c: Likewise. * common/config/powerpcspe/powerpcspe-common.c: Likewise. * common/config/rs6000/rs6000-common.c: Likewise. * common/config/sh/sh-common.c: Likewise. * config/aarch64/aarch64.c: Likewise. * config/alpha/alpha.c: Likewise. * config/arm/arm.c: Likewise. * config/avr/avr.c: Likewise. * config/csky/csky.c: Likewise. * config/i386/i386-builtins.c: Likewise. * config/i386/i386-expand.c: Likewise. * config/i386/i386-features.c: Likewise. * config/i386/i386-options.c: Likewise. * config/i386/i386.c: Likewise. * config/ia64/ia64.c: Likewise. * config/rs6000/rs6000-logue.c: Likewise. * config/rs6000/rs6000.c: Likewise. * config/s390/s390.c: Likewise. * config/sparc/sparc.c: Likewise. * config/visium/visium.c: Likewise. * coverage.c: Likewise. * cprop.c: Likewise. * cse.c: Likewise. * cselib.c: Likewise. * dse.c: Likewise. * emit-rtl.c: Likewise. * explow.c: Likewise. * final.c: Likewise. * fold-const.c: Likewise. * gcc.c: Likewise. * gcse.c: Likewise. * ggc-common.c: Likewise. * ggc-page.c: Likewise. * gimple-loop-interchange.cc: Likewise. * gimple-loop-jam.c: Likewise. * gimple-loop-versioning.cc: Likewise. * gimple-ssa-split-paths.c: Likewise. * gimple-ssa-sprintf.c: Likewise. * gimple-ssa-store-merging.c: Likewise. * gimple-ssa-strength-reduction.c: Likewise. * gimple-ssa-warn-alloca.c: Likewise. * gimple-ssa-warn-restrict.c: Likewise. 
* graphite-isl-ast-to-gimple.c: Likewise. * graphite-optimize-isl.c: Likewise. * graphite-scop-detection.c: Likewise. * graphite-sese-to-poly.c: Likewise. * graphite.c: Likewise. * haifa-sched.c: Likewise. * hsa-gen.c: Likewise. * ifcvt.c: Likewise. * ipa-cp.c: Likewise. * ipa-fnsummary.c: Likewise. * ipa-inline-analysis.c: Likewise. * ipa-inline.c: Likewise. * ipa-polymorphic-call.c: Likewise. * ipa-profile.c: Likewise. * ipa-prop.c: Likewise. * ipa-split.c: Likewise. * ipa-sra.c: Likewise. * ira-build.c: Likewise. * ira-conflicts.c: Likewise. * loop-doloop.c: Likewise. * loop-invariant.c: Likewise. * loop-unroll.c: Likewise. * lra-assigns.c: Likewise. * lra-constraints.c: Likewise. * modulo-sched.c: Likewise. * opt-suggestions.c: Likewise. * opts.c: Likewise. * postreload-gcse.c: Likewise. * predict.c: Likewise. * reload.c: Likewise. * reorg.c: Likewise. * resource.c: Likewise. * sanopt.c: Likewise. * sched-deps.c: Likewise. * sched-ebb.c: Likewise. * sched-rgn.c: Likewise. * sel-sched-ir.c: Likewise. * sel-sched.c: Likewise. * shrink-wrap.c: Likewise. * stmt.c: Likewise. * targhooks.c: Likewise. * toplev.c: Likewise. * tracer.c: Likewise. * trans-mem.c: Likewise. * tree-chrec.c: Likewise. * tree-data-ref.c: Likewise. * tree-if-conv.c: Likewise. * tree-inline.c: Likewise. * tree-loop-distribution.c: Likewise. * tree-parloops.c: Likewise. * tree-predcom.c: Likewise. * tree-profile.c: Likewise. * tree-scalar-evolution.c: Likewise. * tree-sra.c: Likewise. * tree-ssa-ccp.c: Likewise. * tree-ssa-dom.c: Likewise. * tree-ssa-dse.c: Likewise. * tree-ssa-ifcombine.c: Likewise. * tree-ssa-loop-ch.c: Likewise. * tree-ssa-loop-im.c: Likewise. * tree-ssa-loop-ivcanon.c: Likewise. * tree-ssa-loop-ivopts.c: Likewise. * tree-ssa-loop-manip.c: Likewise. * tree-ssa-loop-niter.c: Likewise. * tree-ssa-loop-prefetch.c: Likewise. * tree-ssa-loop-unswitch.c: Likewise. * tree-ssa-math-opts.c: Likewise. * tree-ssa-phiopt.c: Likewise. * tree-ssa-pre.c: Likewise. 
* tree-ssa-reassoc.c: Likewise. * tree-ssa-sccvn.c: Likewise. * tree-ssa-scopedtables.c: Likewise. * tree-ssa-sink.c: Likewise. * tree-ssa-strlen.c: Likewise. * tree-ssa-structalias.c: Likewise. * tree-ssa-tail-merge.c: Likewise. * tree-ssa-threadbackward.c: Likewise. * tree-ssa-threadedge.c: Likewise. * tree-ssa-uninit.c: Likewise. * tree-switch-conversion.c: Likewise. * tree-vect-data-refs.c: Likewise. * tree-vect-loop.c: Likewise. * tree-vect-slp.c: Likewise. * tree-vrp.c: Likewise. * tree.c: Likewise. * value-prof.c: Likewise. * var-tracking.c: Likewise. 2019-11-12 Martin Liska <mliska@suse.cz> * gimple-parser.c: Do not include params.h. 2019-11-12 Martin Liska <mliska@suse.cz> * name-lookup.c: Do not include params.h. * typeck.c: Likewise. 2019-11-12 Martin Liska <mliska@suse.cz> * lto-common.c: Do not include params.h. * lto-partition.c: Likewise. * lto.c: Likewise. From-SVN: r278086
2019-11-12Apply mechanical replacement (generated patch).Martin Liska1-3/+3
2019-11-12 Martin Liska <mliska@suse.cz> * asan.c (asan_sanitize_stack_p): Replace old parameter syntax with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET macro. (asan_sanitize_allocas_p): Likewise. (asan_emit_stack_protection): Likewise. (asan_protect_global): Likewise. (instrument_derefs): Likewise. (instrument_builtin_call): Likewise. (asan_expand_mark_ifn): Likewise. * auto-profile.c (auto_profile): Likewise. * bb-reorder.c (copy_bb_p): Likewise. (duplicate_computed_gotos): Likewise. * builtins.c (inline_expand_builtin_string_cmp): Likewise. * cfgcleanup.c (try_crossjump_to_edge): Likewise. (try_crossjump_bb): Likewise. * cfgexpand.c (defer_stack_allocation): Likewise. (stack_protect_classify_type): Likewise. (pass_expand::execute): Likewise. * cfgloopanal.c (expected_loop_iterations_unbounded): Likewise. (estimate_reg_pressure_cost): Likewise. * cgraph.c (cgraph_edge::maybe_hot_p): Likewise. * combine.c (combine_instructions): Likewise. (record_value_for_reg): Likewise. * common/config/aarch64/aarch64-common.c (aarch64_option_validate_param): Likewise. (aarch64_option_default_params): Likewise. * common/config/ia64/ia64-common.c (ia64_option_default_params): Likewise. * common/config/powerpcspe/powerpcspe-common.c (rs6000_option_default_params): Likewise. * common/config/rs6000/rs6000-common.c (rs6000_option_default_params): Likewise. * common/config/sh/sh-common.c (sh_option_default_params): Likewise. * config/aarch64/aarch64.c (aarch64_output_probe_stack_range): Likewise. (aarch64_allocate_and_probe_stack_space): Likewise. (aarch64_expand_epilogue): Likewise. (aarch64_override_options_internal): Likewise. * config/alpha/alpha.c (alpha_option_override): Likewise. * config/arm/arm.c (arm_option_override): Likewise. (arm_valid_target_attribute_p): Likewise. * config/i386/i386-options.c (ix86_option_override_internal): Likewise. * config/i386/i386.c (get_probe_interval): Likewise. (ix86_adjust_stack_and_probe_stack_clash): Likewise. 
(ix86_max_noce_ifcvt_seq_cost): Likewise. * config/ia64/ia64.c (ia64_adjust_cost): Likewise. * config/rs6000/rs6000-logue.c (get_stack_clash_protection_probe_interval): Likewise. (get_stack_clash_protection_guard_size): Likewise. * config/rs6000/rs6000.c (rs6000_option_override_internal): Likewise. * config/s390/s390.c (allocate_stack_space): Likewise. (s390_emit_prologue): Likewise. (s390_option_override_internal): Likewise. * config/sparc/sparc.c (sparc_option_override): Likewise. * config/visium/visium.c (visium_option_override): Likewise. * coverage.c (get_coverage_counts): Likewise. (coverage_compute_profile_id): Likewise. (coverage_begin_function): Likewise. (coverage_end_function): Likewise. * cse.c (cse_find_path): Likewise. (cse_extended_basic_block): Likewise. (cse_main): Likewise. * cselib.c (cselib_invalidate_mem): Likewise. * dse.c (dse_step1): Likewise. * emit-rtl.c (set_new_first_and_last_insn): Likewise. (get_max_insn_count): Likewise. (make_debug_insn_raw): Likewise. (init_emit): Likewise. * explow.c (compute_stack_clash_protection_loop_data): Likewise. * final.c (compute_alignments): Likewise. * fold-const.c (fold_range_test): Likewise. (fold_truth_andor): Likewise. (tree_single_nonnegative_warnv_p): Likewise. (integer_valued_real_single_p): Likewise. * gcse.c (want_to_gcse_p): Likewise. (prune_insertions_deletions): Likewise. (hoist_code): Likewise. (gcse_or_cprop_is_too_expensive): Likewise. * ggc-common.c: Likewise. * ggc-page.c (ggc_collect): Likewise. * gimple-loop-interchange.cc (MAX_NUM_STMT): Likewise. (MAX_DATAREFS): Likewise. (OUTER_STRIDE_RATIO): Likewise. * gimple-loop-jam.c (tree_loop_unroll_and_jam): Likewise. * gimple-loop-versioning.cc (loop_versioning::max_insns_for_loop): Likewise. * gimple-ssa-split-paths.c (is_feasible_trace): Likewise. * gimple-ssa-store-merging.c (imm_store_chain_info::try_coalesce_bswap): Likewise. (imm_store_chain_info::coalesce_immediate_stores): Likewise. 
(imm_store_chain_info::output_merged_store): Likewise. (pass_store_merging::process_store): Likewise. * gimple-ssa-strength-reduction.c (find_basis_for_base_expr): Likewise. * graphite-isl-ast-to-gimple.c (class translate_isl_ast_to_gimple): Likewise. (scop_to_isl_ast): Likewise. * graphite-optimize-isl.c (get_schedule_for_node_st): Likewise. (optimize_isl): Likewise. * graphite-scop-detection.c (build_scops): Likewise. * haifa-sched.c (set_modulo_params): Likewise. (rank_for_schedule): Likewise. (model_add_to_worklist): Likewise. (model_promote_insn): Likewise. (model_choose_insn): Likewise. (queue_to_ready): Likewise. (autopref_multipass_dfa_lookahead_guard): Likewise. (schedule_block): Likewise. (sched_init): Likewise. * hsa-gen.c (init_prologue): Likewise. * ifcvt.c (bb_ok_for_noce_convert_multiple_sets): Likewise. (cond_move_process_if_block): Likewise. * ipa-cp.c (ipcp_lattice::add_value): Likewise. (merge_agg_lats_step): Likewise. (devirtualization_time_bonus): Likewise. (hint_time_bonus): Likewise. (incorporate_penalties): Likewise. (good_cloning_opportunity_p): Likewise. (ipcp_propagate_stage): Likewise. * ipa-fnsummary.c (decompose_param_expr): Likewise. (set_switch_stmt_execution_predicate): Likewise. (analyze_function_body): Likewise. (compute_fn_summary): Likewise. * ipa-inline-analysis.c (estimate_growth): Likewise. * ipa-inline.c (caller_growth_limits): Likewise. (inline_insns_single): Likewise. (inline_insns_auto): Likewise. (can_inline_edge_by_limits_p): Likewise. (want_early_inline_function_p): Likewise. (big_speedup_p): Likewise. (want_inline_small_function_p): Likewise. (want_inline_self_recursive_call_p): Likewise. (edge_badness): Likewise. (recursive_inlining): Likewise. (compute_max_insns): Likewise. (early_inliner): Likewise. * ipa-polymorphic-call.c (csftc_abort_walking_p): Likewise. * ipa-profile.c (ipa_profile): Likewise. * ipa-prop.c (determine_known_aggregate_parts): Likewise. (ipa_analyze_node): Likewise. 
(ipcp_transform_function): Likewise. * ipa-split.c (consider_split): Likewise. * ipa-sra.c (allocate_access): Likewise. (process_scan_results): Likewise. (ipa_sra_summarize_function): Likewise. (pull_accesses_from_callee): Likewise. * ira-build.c (loop_compare_func): Likewise. (mark_loops_for_removal): Likewise. * ira-conflicts.c (build_conflict_bit_table): Likewise. * loop-doloop.c (doloop_optimize): Likewise. * loop-invariant.c (gain_for_invariant): Likewise. (move_loop_invariants): Likewise. * loop-unroll.c (decide_unroll_constant_iterations): Likewise. (decide_unroll_runtime_iterations): Likewise. (decide_unroll_stupid): Likewise. (expand_var_during_unrolling): Likewise. * lra-assigns.c (spill_for): Likewise. * lra-constraints.c (EBB_PROBABILITY_CUTOFF): Likewise. * modulo-sched.c (sms_schedule): Likewise. (DFA_HISTORY): Likewise. * opts.c (default_options_optimization): Likewise. (finish_options): Likewise. (common_handle_option): Likewise. * postreload-gcse.c (eliminate_partially_redundant_load): Likewise. (if): Likewise. * predict.c (get_hot_bb_threshold): Likewise. (maybe_hot_count_p): Likewise. (probably_never_executed): Likewise. (predictable_edge_p): Likewise. (predict_loops): Likewise. (expr_expected_value_1): Likewise. (tree_predict_by_opcode): Likewise. (handle_missing_profiles): Likewise. * reload.c (find_equiv_reg): Likewise. * reorg.c (redundant_insn): Likewise. * resource.c (mark_target_live_regs): Likewise. (incr_ticks_for_insn): Likewise. * sanopt.c (pass_sanopt::execute): Likewise. * sched-deps.c (sched_analyze_1): Likewise. (sched_analyze_2): Likewise. (sched_analyze_insn): Likewise. (deps_analyze_insn): Likewise. * sched-ebb.c (schedule_ebbs): Likewise. * sched-rgn.c (find_single_block_region): Likewise. (too_large): Likewise. (haifa_find_rgns): Likewise. (extend_rgns): Likewise. (new_ready): Likewise. (schedule_region): Likewise. (sched_rgn_init): Likewise. * sel-sched-ir.c (make_region_from_loop): Likewise. 
* sel-sched-ir.h (MAX_WS): Likewise. * sel-sched.c (process_pipelined_exprs): Likewise. (sel_setup_region_sched_flags): Likewise. * shrink-wrap.c (try_shrink_wrapping): Likewise. * targhooks.c (default_max_noce_ifcvt_seq_cost): Likewise. * toplev.c (print_version): Likewise. (process_options): Likewise. * tracer.c (tail_duplicate): Likewise. * trans-mem.c (tm_log_add): Likewise. * tree-chrec.c (chrec_fold_plus_1): Likewise. * tree-data-ref.c (split_constant_offset): Likewise. (compute_all_dependences): Likewise. * tree-if-conv.c (MAX_PHI_ARG_NUM): Likewise. * tree-inline.c (remap_gimple_stmt): Likewise. * tree-loop-distribution.c (MAX_DATAREFS_NUM): Likewise. * tree-parloops.c (MIN_PER_THREAD): Likewise. (create_parallel_loop): Likewise. * tree-predcom.c (determine_unroll_factor): Likewise. * tree-scalar-evolution.c (instantiate_scev_r): Likewise. * tree-sra.c (analyze_all_variable_accesses): Likewise. * tree-ssa-ccp.c (fold_builtin_alloca_with_align): Likewise. * tree-ssa-dse.c (setup_live_bytes_from_ref): Likewise. (dse_optimize_redundant_stores): Likewise. (dse_classify_store): Likewise. * tree-ssa-ifcombine.c (ifcombine_ifandif): Likewise. * tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise. * tree-ssa-loop-im.c (LIM_EXPENSIVE): Likewise. * tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Likewise. (try_peel_loop): Likewise. (tree_unroll_loops_completely): Likewise. * tree-ssa-loop-ivopts.c (avg_loop_niter): Likewise. (CONSIDER_ALL_CANDIDATES_BOUND): Likewise. (MAX_CONSIDERED_GROUPS): Likewise. (ALWAYS_PRUNE_CAND_SET_BOUND): Likewise. * tree-ssa-loop-manip.c (can_unroll_loop_p): Likewise. * tree-ssa-loop-niter.c (MAX_ITERATIONS_TO_TRACK): Likewise. * tree-ssa-loop-prefetch.c (PREFETCH_BLOCK): Likewise. (L1_CACHE_SIZE_BYTES): Likewise. (L2_CACHE_SIZE_BYTES): Likewise. (should_issue_prefetch_p): Likewise. (schedule_prefetches): Likewise. (determine_unroll_factor): Likewise. (volume_of_references): Likewise. (add_subscript_strides): Likewise. 
(self_reuse_distance): Likewise. (mem_ref_count_reasonable_p): Likewise. (insn_to_prefetch_ratio_too_small_p): Likewise. (loop_prefetch_arrays): Likewise. (tree_ssa_prefetch_arrays): Likewise. * tree-ssa-loop-unswitch.c (tree_unswitch_single_loop): Likewise. * tree-ssa-math-opts.c (gimple_expand_builtin_pow): Likewise. (convert_mult_to_fma): Likewise. (math_opts_dom_walker::after_dom_children): Likewise. * tree-ssa-phiopt.c (cond_if_else_store_replacement): Likewise. (hoist_adjacent_loads): Likewise. (gate_hoist_loads): Likewise. * tree-ssa-pre.c (translate_vuse_through_block): Likewise. (compute_partial_antic_aux): Likewise. * tree-ssa-reassoc.c (get_reassociation_width): Likewise. * tree-ssa-sccvn.c (vn_reference_lookup_pieces): Likewise. (vn_reference_lookup): Likewise. (do_rpo_vn): Likewise. * tree-ssa-scopedtables.c (avail_exprs_stack::lookup_avail_expr): Likewise. * tree-ssa-sink.c (select_best_block): Likewise. * tree-ssa-strlen.c (new_stridx): Likewise. (new_addr_stridx): Likewise. (get_range_strlen_dynamic): Likewise. (class ssa_name_limit_t): Likewise. * tree-ssa-structalias.c (push_fields_onto_fieldstack): Likewise. (create_variable_info_for_1): Likewise. (init_alias_vars): Likewise. * tree-ssa-tail-merge.c (find_clusters_1): Likewise. (tail_merge_optimize): Likewise. * tree-ssa-threadbackward.c (thread_jumps::profitable_jump_thread_path): Likewise. (thread_jumps::fsm_find_control_statement_thread_paths): Likewise. (thread_jumps::find_jump_threads_backwards): Likewise. * tree-ssa-threadedge.c (record_temporary_equivalences_from_stmts_at_dest): Likewise. * tree-ssa-uninit.c (compute_control_dep_chain): Likewise. * tree-switch-conversion.c (switch_conversion::check_range): Likewise. (jump_table_cluster::can_be_handled): Likewise. * tree-switch-conversion.h (jump_table_cluster::case_values_threshold): Likewise. (SWITCH_CONVERSION_BRANCH_RATIO): Likewise. (param_switch_conversion_branch_ratio): Likewise. 
* tree-vect-data-refs.c (vect_mark_for_runtime_alias_test): Likewise. (vect_enhance_data_refs_alignment): Likewise. (vect_prune_runtime_alias_test_list): Likewise. * tree-vect-loop.c (vect_analyze_loop_costing): Likewise. (vect_get_datarefs_in_loop): Likewise. (vect_analyze_loop): Likewise. * tree-vect-slp.c (vect_slp_bb): Likewise. * tree-vectorizer.h: Likewise. * tree-vrp.c (find_switch_asserts): Likewise. (vrp_prop::check_mem_ref): Likewise. * tree.c (wide_int_to_tree_1): Likewise. (cache_integer_cst): Likewise. * var-tracking.c (EXPR_USE_DEPTH): Likewise. (reverse_op): Likewise. (vt_find_locations): Likewise. 2019-11-12 Martin Liska <mliska@suse.cz> * gimple-parser.c (c_parser_parse_gimple_body): Replace old parameter syntax with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET macro. 2019-11-12 Martin Liska <mliska@suse.cz> * name-lookup.c (namespace_hints::namespace_hints): Replace old parameter syntax with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET macro. * typeck.c (comptypes): Likewise. 2019-11-12 Martin Liska <mliska@suse.cz> * lto-partition.c (lto_balanced_map): Replace old parameter syntax with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET macro. * lto.c (do_whole_program_analysis): Likewise. From-SVN: r278085
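The "set if unset" idea behind the macro named in this entry can be sketched as follows. This is a minimal illustration and not GCC's definition: `struct demo_options`, `DEMO_SET_OPTION_IF_UNSET`, `param_demo_limit`, and `demo_apply_target_default` are all hypothetical names invented for the example. The assumption is that one structure holds the current values and a second, parallel structure records which values the user set explicitly.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an options structure: one field per parameter.  */
struct demo_options
{
  int x_param_demo_limit;
};

/* Hypothetical macro modelled on the "set if unset" idea.  OPTS holds the
   current values; OPTS_SET holds a nonzero field for every parameter that
   was set explicitly.  The default is applied only when the parameter was
   not set explicitly.  */
#define DEMO_SET_OPTION_IF_UNSET(OPTS, OPTS_SET, OPTION, VALUE) \
  do                                                            \
    {                                                           \
      if (!(OPTS_SET)->x_##OPTION)                              \
        (OPTS)->x_##OPTION = (VALUE);                           \
    }                                                           \
  while (false)

/* Apply a target-specific default and return the resulting value.  */
static int
demo_apply_target_default (struct demo_options *opts,
                           const struct demo_options *opts_set,
                           int target_default)
{
  DEMO_SET_OPTION_IF_UNSET (opts, opts_set, param_demo_limit, target_default);
  return opts->x_param_demo_limit;
}
```

With this sketch, a target default overrides the built-in value but never an explicit user setting, which is the property a mechanical conversion of this kind has to preserve.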
2019-11-12re PR tree-optimization/92347 (ICE in vect_get_vec_def_for_operand_1, at tree-vect-stmts.c:1537)Andre Vieira1-1/+0
2019-11-11 Andre Vieira <andre.simoesdiasvieira@arm.com> PR tree-optimization/92347 * tree-vect-loop.c (vect_transform_loop): Don't overwrite epilogue's safelen with 0. * gcc.dg/vect/pr92347.c: New test. From-SVN: r278079
2019-11-08Use correct vector type in neutral_op_for_slp_reductionRichard Sandiford1-13/+16
With the new reduction vectype handling, neutral_op_for_slp_reduction needs to know whether the caller is using STMT_VINFO_REDUC_VECTYPE (for an epilogue value) or STMT_VINFO_VECTYPE (for a PHI argument). This fixes various gcc.target/aarch64/sve/slp_* tests. 2019-11-08 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-loop.c (neutral_op_for_slp_reduction): Take the vector type as an argument rather than reading it from the stmt_vec_info. (vect_create_epilog_for_reduction): Update accordingly. (vectorizable_reduction): Likewise. (vect_transform_cycle_phi): Likewise. From-SVN: r277977
2019-11-08[vect] Disable vectorization of epilogues for loops with SIMDUID setAndre Vieira1-2/+6
gcc/ChangeLog: 2019-11-08 Andre Vieira <andre.simoesdiasvieira@arm.com> * tree-vect-loop.c (vect_analyze_loop): Disable epilogue vectorization for loops with SIMDUID set. Enable epilogue vectorization for loops with SIMDLEN set after finding a main loop with a VF that matches it. From-SVN: r277964
2019-11-08re PR tree-optimization/92324 (ICE in expand_direct_optab_fn, at internal-fn.c:2890)Richard Biener1-85/+94
2019-11-08 Richard Biener <rguenther@suse.de> PR tree-optimization/92324 * tree-vect-loop.c (vect_create_epilog_for_reduction): Use STMT_VINFO_REDUC_VECTYPE for all computations, inserting sign-conversions as necessary. (vectorizable_reduction): Reject conversions in the chain that are not sign-conversions, base analysis on a non-converting stmt and its operation sign. Set STMT_VINFO_REDUC_VECTYPE. * tree-vect-stmts.c (vect_stmt_relevant_p): Don't dump anything for debug stmts. * tree-vectorizer.h (_stmt_vec_info::reduc_vectype): New. (STMT_VINFO_REDUC_VECTYPE): Likewise. * gcc.dg/vect/pr92205.c: XFAIL. * gcc.dg/vect/pr92324-1.c: New testcase. * gcc.dg/vect/pr92324-2.c: Likewise. From-SVN: r277955
2019-11-07re PR tree-optimization/92405 (ICE in vect_get_vec_def_for_stmt_copy, at tree-vect-stmts.c:1683)Richard Biener1-0/+12
2019-11-07 Richard Biener <rguenther@suse.de> PR tree-optimization/92405 * tree-vect-loop.c (vectorizable_reduction): Appropriately restrict lane-reducing ops to single stmt chains. From-SVN: r277921
2019-11-06Don't vectorise single-iteration epiloguesRichard Sandiford1-0/+1
With a later patch I saw a case in which we peeled a single iteration for gaps but didn't need to peel further iterations to make up a full vector. We then tried to vectorise the single-iteration epilogue. 2019-11-06 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-loop.c (vect_analyze_loop): Only try to vectorize the epilogue if there are peeled iterations for it to handle. From-SVN: r277886
2019-11-06tree-vect-loop.c (vectorizable_reduction): Remember reduction PHI.Richard Biener1-16/+8
2019-11-06 Richard Biener <rguenther@suse.de> * tree-vect-loop.c (vectorizable_reduction): Remember reduction PHI. Use STMT_VINFO_REDUC_IDX to skip the reduction operand. Simplify single_defuse_cycle condition. From-SVN: r277882
2019-11-06Check the VF is small enough for an epilogue loopRichard Sandiford1-0/+10
The number of iterations of an epilogue loop is always smaller than the VF of the main loop. vect_analyze_loop_costing was taking this into account when deciding whether the loop is cheap enough to vectorise, but that has no effect with the unlimited cost model. We need to use a separate check for correctness as well. This can happen if the sizes returned by autovectorize_vector_sizes happen to be out of order, e.g. because the target prefers smaller vectors. It can also happen with later patches if two vectorisation attempts happen to end up with the same VF. 2019-11-06 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-loop.c (vect_analyze_loop_2): When vectorizing an epilogue loop, make sure that the VF is small enough or that the epilogue loop can be fully-masked. From-SVN: r277880
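The correctness condition described in this message can be sketched as a small predicate. This is an illustration under simplifying assumptions, not the code from the patch: the VFs are plain compile-time-style integers (no variable-length vectors), peeling for gaps is ignored, `main_vf` is assumed to be at least 1, and the names `demo_max_epilogue_iters` and `demo_epilogue_vf_ok` are hypothetical.

```c
#include <stdbool.h>

/* Upper bound on the number of scalar iterations left for the epilogue:
   always fewer than the main loop's VF.  Assumes main_vf >= 1.  */
static unsigned
demo_max_epilogue_iters (unsigned main_vf)
{
  return main_vf - 1;
}

/* Return true if an epilogue loop with vectorization factor EPILOGUE_VF
   can be useful after a main loop with factor MAIN_VF.  A fully-masked
   epilogue can handle a partial vector, so it is always acceptable;
   otherwise at least one full vector iteration must be possible.  */
static bool
demo_epilogue_vf_ok (unsigned main_vf, unsigned epilogue_vf,
                     bool fully_masked)
{
  if (fully_masked)
    return true;
  return epilogue_vf <= demo_max_epilogue_iters (main_vf);
}
```

For example, a main loop with VF 8 leaves at most 7 iterations, so an unmasked epilogue with VF 4 passes the check, while an unmasked epilogue with VF 8 could never run a full vector iteration and is rejected.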
2019-11-06Restructure vect_analyze_loopRichard Sandiford1-69/+68
Once vect_analyze_loop has found a valid loop_vec_info X, we carry on searching for alternatives if (1) X doesn't satisfy simdlen or (2) we want to vectorize the epilogue of X. I have a patch that optionally adds a third reason: we want to see if there are cheaper alternatives to X. This patch restructures vect_analyze_loop so that it's easier to add more reasons for continuing. There's supposed to be no behavioural change. If we wanted to, we could allow vectorisation of epilogues once loop->simdlen has been reached by changing "loop->simdlen" to "simdlen" in the new vect_epilogues condition. That should be a separate change though. 2019-11-06 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-loop.c (vect_analyze_loop): Break out of the main loop when we've finished, rather than returning directly from the loop. Use a local variable to track whether we're still searching for the preferred simdlen. Make vect_epilogues record whether the next iteration should try to treat the loop as an epilogue. From-SVN: r277879
2019-11-05re PR tree-optimization/92371 (ICE in info_for_reduction, at tree-vect-loop.c:4106)Richard Biener1-7/+8
2019-11-05 Richard Biener <rguenther@suse.de> PR tree-optimization/92371 * tree-vect-loop.c (vectorizable_reduction): Set STMT_VINFO_REDUC_DEF on the original stmt of live stmts in the chain. (vectorizable_live_operation): Look at the original stmt when checking STMT_VINFO_REDUC_DEF. * gcc.dg/torture/pr92371.c: New testcase. From-SVN: r277850
2019-11-05re PR tree-optimization/92324 (ICE in expand_direct_optab_fn, at internal-fn.c:2890)Richard Biener1-1/+12
2019-11-05 Richard Biener <rguenther@suse.de> PR tree-optimization/92324 * tree-vect-loop.c (check_reduction_path): For MIN/MAX require all signed or unsigned operations. * gcc.dg/vect/pr92324-3.c: New testcase. From-SVN: r277822
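A small sketch of why MIN/MAX reductions are sensitive to signedness while additions are not. This is an illustration of the underlying arithmetic only, not code from the vectorizer; all `demo_*` names are hypothetical, and the arrays are assumed to be non-empty with sums that do not overflow `int`.

```c
#include <limits.h>

/* Maximum computed with signed comparisons.  */
static int
demo_max_signed (const int *v, int n)
{
  int m = v[0];
  for (int i = 1; i < n; ++i)
    if (v[i] > m)
      m = v[i];
  return m;
}

/* Maximum computed after viewing every element as unsigned: negative
   values become very large, so the winner can change.  */
static unsigned
demo_max_unsigned_view (const int *v, int n)
{
  unsigned m = (unsigned) v[0];
  for (int i = 1; i < n; ++i)
    if ((unsigned) v[i] > m)
      m = (unsigned) v[i];
  return m;
}

/* Sum computed in signed arithmetic (assumed not to overflow).  */
static int
demo_sum_signed (const int *v, int n)
{
  int s = 0;
  for (int i = 0; i < n; ++i)
    s += v[i];
  return s;
}

/* Sum computed in unsigned (modular) arithmetic: the low bits of the
   result match the signed sum, so a sign change is harmless here.  */
static unsigned
demo_sum_unsigned_view (const int *v, int n)
{
  unsigned s = 0;
  for (int i = 0; i < n; ++i)
    s += (unsigned) v[i];
  return s;
}
```

For the input `{ -1, 3, 2 }` the signed maximum is 3, but the unsigned view picks `UINT_MAX` (the bit pattern of -1), whereas both sums agree on 4. That difference is the reason a MIN/MAX chain that mixes signed and unsigned operations has to be rejected rather than treated as a harmless sign conversion.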
2019-11-04[vect] Clean up orig_loop_vinfo from vect_analyze_loopAndre Vieira1-7/+3
gcc/ChangeLog: 2019-11-04 Andre Vieira <andre.simoesdiasvieira@arm.com> * tree-vect-loop.c (vect_analyze_loop): Remove orig_loop_vinfo parameter. * tree-vectorizer.h (vect_analyze_loop): Update declaration. * tree-vectorizer.c (try_vectorize_loop_1): Update calls to vect_analyze_loop. From-SVN: r277785
2019-11-04re PR tree-optimization/92345 (ICE in vec<_stmt_vec_info*, va_heap, vl_embed>::space (vect_get_and_check_slp_defs))Richard Biener1-5/+8
2019-11-04 Richard Biener <rguenther@suse.de> PR tree-optimization/92345 * tree-vect-loop.c (vect_is_simple_reduction): Return whether we produced a reduction chain. (vect_analyze_scalar_cycles_1): Do not add reduction chains to LOOP_VINFO_REDUCTIONS. * gcc.dg/torture/pr92345.c: New testcase. From-SVN: r277782
2019-10-30re PR tree-optimization/65930 (Reduction with sign-change not handled)Richard Biener1-4/+11
2019-10-30 Richard Biener <rguenther@suse.de> PR tree-optimization/65930 * tree-vect-loop.c (vect_is_simple_reduction): For reduction chains also allow a leading and trailing conversion. * tree-vect-slp.c (vect_get_and_check_slp_defs): Handle intermediate reduction chains. (vect_analyze_slp_instance): Likewise. Build a SLP node for a trailing conversion manually. * gcc.dg/vect/pr65930-2.c: New testcase. From-SVN: r277603
2019-10-29[vect]PR 88915: Vectorize epilogues when versioning loopsAndre Vieira1-60/+272
gcc/ChangeLog: 2019-10-29 Andre Vieira <andre.simoesdiasvieira@arm.com> PR 88915 * tree-ssa-loop-niter.h (simplify_replace_tree): Change declaration. * tree-ssa-loop-niter.c (simplify_replace_tree): Add context parameter and make the valueize function pointer also take a void pointer. * tree-ssa-sccvn.c (vn_valueize_wrapper): New function to wrap around vn_valueize, to call it without a context. (process_bb): Use vn_valueize_wrapper instead of vn_valueize. * tree-vect-loop.c (_loop_vec_info): Initialize epilogue_vinfos. (~_loop_vec_info): Release epilogue_vinfos. (vect_analyze_loop_costing): Use knowledge of main VF to estimate number of iterations of epilogue. (vect_analyze_loop_2): Adapt to analyse main loop for all supported vector sizes when vect-epilogues-nomask=1. Also keep track of lowest versioning threshold needed for main loop. (vect_analyze_loop): Likewise. (find_in_mapping): New helper function. (update_epilogue_loop_vinfo): New function. (vect_transform_loop): When vectorizing epilogues re-use analysis done on main loop and call update_epilogue_loop_vinfo to update it. * tree-vect-loop-manip.c (vect_update_inits_of_drs): No longer insert stmts on loop preheader edge. (vect_do_peeling): Enable skip-vectors when doing loop versioning if we decided to vectorize epilogues. Update the epilogues' NITERS and construct ADVANCE to update the epilogues' data references where needed. * tree-vectorizer.h (_loop_vec_info): Add epilogue_vinfos. (vect_do_peeling, vect_update_inits_of_drs, determine_peel_for_niter, vect_analyze_loop): Add or update declarations. * tree-vectorizer.c (try_vectorize_loop_1): Make sure to use already created loop_vec_info's for epilogues when available. Otherwise analyse epilogue separately. From-SVN: r277569
2019-10-29re PR tree-optimization/65930 (Reduction with sign-change not handled)Richard Biener1-17/+38
2019-10-29 Richard Biener <rguenther@suse.de> PR tree-optimization/65930 * tree-vect-loop.c (check_reduction_path): Relax single-use check allowing out-of-loop uses. (vect_is_simple_reduction): SLP reduction chains cannot have intermediate stmts used outside of the loop. (vect_create_epilog_for_reduction): The adjustment might need to be converted. (vectorizable_reduction): Annotate live stmts of the reduction chain with STMT_VINFO_REDUC_DEF. * tree-vect-stmts.c (process_use): Remove no longer true asserts. * gcc.dg/vect/pr65930-1.c: New testcase. From-SVN: r277566
2019-10-28re PR tree-optimization/92241 (ICE in vect_mark_pattern_stmts, at tree-vect-patterns.c:5175)Richard Biener1-5/+14
2019-10-28 Richard Biener <rguenther@suse.de> PR tree-optimization/92241 * tree-vect-loop.c (vect_fixup_scalar_cycles_with_patterns): When we failed to update the reduction index do not use the pattern stmts for the reduction chain. (vectorizable_reduction): When the reduction chain is corrupt, fail. * tree-vect-patterns.c (vect_mark_pattern_stmts): Stop when we fail to update the reduction chain. * gcc.dg/torture/pr92241.c: New testcase. From-SVN: r277516
2019-10-28tree-vect-loop.c (vect_create_epilog_for_reduction): Use STMT_VINFO_REDUC_IDX from the actual stmt.Richard Biener1-36/+17
2019-10-28 Richard Biener <rguenther@suse.de> * tree-vect-loop.c (vect_create_epilog_for_reduction): Use STMT_VINFO_REDUC_IDX from the actual stmt. (vect_transform_reduction): Likewise. (vectorizable_reduction): Compute the reduction chain length, do not recompute the reduction operand index. Remove no longer necessary restriction for condition reduction chains. From-SVN: r277513
2019-10-25Fix reductions for fully-masked loopsRichard Sandiford1-30/+21
Now that vectorizable_operation vectorises most loop stmts involved in a reduction, it needs to be aware of reductions in fully-masked loops. The LOOP_VINFO_CAN_FULLY_MASK_P parts of vectorizable_reduction now only apply to cases that use vect_transform_reduction. This new way of doing things is definitely an improvement for SVE though, since it means we can lift the old restriction of not using fully-masked loops for reduction chains. 2019-10-25 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-loop.c (vectorizable_reduction): Restrict the LOOP_VINFO_CAN_FULLY_MASK_P handling to cases that will be handled by vect_transform_reduction. Allow fully-masked loops to be used with reduction chains. * tree-vect-stmts.c (vectorizable_operation): Handle reduction operations in fully-masked loops. (vectorizable_condition): Reject EXTRACT_LAST_REDUCTION operations in fully-masked loops. gcc/testsuite/ * gcc.dg/vect/pr65947-1.c: No longer expect doubled dump lines for FOLD_EXTRACT_LAST reductions. * gcc.dg/vect/pr65947-2.c: Likewise. * gcc.dg/vect/pr65947-3.c: Likewise. * gcc.dg/vect/pr65947-4.c: Likewise. * gcc.dg/vect/pr65947-5.c: Likewise. * gcc.dg/vect/pr65947-6.c: Likewise. * gcc.dg/vect/pr65947-9.c: Likewise. * gcc.dg/vect/pr65947-10.c: Likewise. * gcc.dg/vect/pr65947-12.c: Likewise. * gcc.dg/vect/pr65947-13.c: Likewise. * gcc.dg/vect/pr65947-14.c: Likewise. * gcc.dg/vect/pr80631-1.c: Likewise. * gcc.dg/vect/pr80631-2.c: Likewise. * gcc.dg/vect/vect-cond-reduc-3.c: Likewise. * gcc.dg/vect/vect-cond-reduc-4.c: Likewise. From-SVN: r277438
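What a reduction has to do in a fully-masked loop can be sketched in scalar C. This is a conceptual model only, not what the vectorizer emits: `DEMO_VF` and `demo_masked_sum` are hypothetical, the vector is modelled as a fixed-size array of lanes, and the mask is modelled as a per-lane `active` flag. The point it illustrates is that inactive lanes in the final iteration must keep their previous accumulator value.

```c
#define DEMO_VF 4   /* hypothetical number of lanes per vector */

/* Sum N elements of A using DEMO_VF lane accumulators and a per-lane
   mask instead of a scalar tail loop.  */
static int
demo_masked_sum (const int *a, int n)
{
  int acc[DEMO_VF] = { 0 };

  for (int i = 0; i < n; i += DEMO_VF)
    for (int lane = 0; lane < DEMO_VF; ++lane)
      {
        /* The lane is active only while it still maps to a real element.  */
        int active = i + lane < n;
        /* Inactive lanes keep their old value and do not read memory.  */
        acc[lane] = active ? acc[lane] + a[i + lane] : acc[lane];
      }

  /* Final cross-lane reduction, as an epilogue would do.  */
  int sum = 0;
  for (int lane = 0; lane < DEMO_VF; ++lane)
    sum += acc[lane];
  return sum;
}
```

With six elements `{ 1, 2, 3, 4, 5, 6 }` the second iteration has only two active lanes, and the result is still 21 because the two inactive lanes pass their accumulators through unchanged.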
2019-10-25tree-vect-loop.c (vectorizable_reduction): Verify STMT_VINFO_REDUC_IDX on the to be vectorized stmts is set up correctly.Richard Biener1-1/+13
2019-10-25 Richard Biener <rguenther@suse.de> * tree-vect-loop.c (vectorizable_reduction): Verify STMT_VINFO_REDUC_IDX on the to be vectorized stmts is set up correctly. * tree-vect-patterns.c (vect_mark_pattern_stmts): Transfer STMT_VINFO_REDUC_IDX from the original stmts to the pattern stmts. From-SVN: r277437
2019-10-24re PR tree-optimization/92205 (ICE in vect_get_vec_def_for_stmt_copy, at tree-vect-stmts.c:1688 since r277322)Richard Biener1-4/+6
2019-10-24 Richard Biener <rguenther@suse.de> PR tree-optimization/92205 * tree-vect-loop.c (vectorizable_reduction): Restrict search for alternate vectype_in to lane-reducing patterns we support. * gcc.dg/vect/pr92205.c: New testcase. From-SVN: r277375
2019-10-23re PR tree-optimization/65930 (Reduction with sign-change not handled)Richard Biener1-14/+5
2019-10-23 Richard Biener <rguenther@suse.de> PR tree-optimization/65930 * tree-vect-loop.c (check_reduction_path): Allow conversions that only change the sign. (vectorizable_reduction): Relax latch def stmts we handle further. * gcc.dg/vect/vect-reduc-2char-big-array.c: Adjust. * gcc.dg/vect/vect-reduc-2char.c: Likewise. * gcc.dg/vect/vect-reduc-2short.c: Likewise. * gcc.dg/vect/vect-reduc-dot-s8b.c: Likewise. * gcc.dg/vect/vect-reduc-pattern-2c.c: Likewise. From-SVN: r277322