riscv-gnu-toolchain/gcc.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author	Files	Lines
2020-04-20	vect: Tweak vect_better_loop_vinfo_p handling of variable VFs	Richard Sandiford	1	-1/+30
	This patch fixes a large lmbench performance regression with 128-bit SVE, compiled in length-agnostic mode. vect_better_loop_vinfo_p (new in GCC 10) tries to estimate whether a new loop_vinfo is cheaper than a previous one, with an in-built preference for the old one. For variable VF it prefers the old loop_vinfo if it is cheaper for at least one VF. However, we have no idea how likely that VF is in practice. Another extreme would be to do what most of the rest of the vectoriser does, and rely solely on the constant estimated VF. But as noted in the comment, this means that a one-unit cost difference would be enough to pick the new loop_vinfo, despite the target generally preferring the old loop_vinfo where possible. The cost model just isn't accurate enough for that to produce good results as things stand: there might not be any practical benefit to the new loop_vinfo at the estimated VF, and it would be significantly worse for higher VFs. The patch instead goes for a hacky compromise: make sure that the new loop_vinfo is also no worse than the old loop_vinfo at double the estimated VF. For all but trivial loops, this ensures that the new loop_vinfo is only chosen if it is better than the old one by a non-trivial amount at the estimated VF. It also avoids putting too much faith in the VF estimate. I realise this isn't great, but it's supposed to be a conservative fix suitable for stage 4. The only affected testcases are the ones for pr89007-.c, where Advanced SIMD is indeed preferred for 128-bit SVE and is no worse for 256-bit SVE. Part of the problem here is that if the new loop_vinfo is better, we discard the old one and never consider using it even as an epilogue loop. This means that if we choose Advanced SIMD over SVE, we're much more likely to have left-over scalar elements. Another is that the estimate provided by estimated_poly_value might have different probabilities attached. E.g. when tuning for a particular core, the estimate is probably accurate, but when tuning for generic code, the estimate is more of a guess. Relying solely on the estimate is probably correct for the former but not for the latter. Hopefully those are things that we could tackle in GCC 11. 2020-04-20 Richard Sandiford <richard.sandiford@arm.com> gcc/ tree-vect-loop.c (vect_better_loop_vinfo_p): If old_loop_vinfo has a variable VF, prefer new_loop_vinfo if it is cheaper for the estimated VF and is no worse at double the estimated VF. gcc/testsuite/ * gcc.target/aarch64/sve/cost_model_8.c: New test. * gcc.target/aarch64/sve/cost_model_9.c: Likewise. * gcc.target/aarch64/sve/pr89007-1.c: Add -msve-vector-bits=512. * gcc.target/aarch64/sve/pr89007-2.c: Likewise.
2020-04-03	Fix PR94443 with gsi_insert_seq_before [PR94443]	Kewen Lin	1	-2/+2
	This patch is to fix the stupid mistake by using gsi_insert_seq_before instead of gsi_insert_before. BTW, the regression testing on one x86_64 machine from CFarm is unable to reveal it (I guess due to native arch sandybridge?), so I specified additional option -march=znver2 and verified the coverage. Bootstrapped/regtested on powerpc64le-linux-gnu (P9) and x86_64-pc-linux-gnu, also verified the fail cases in related PRs. 2020-04-03 Kewen Lin <linkw@gcc.gnu.org> gcc/ PR tree-optimization/94443 * tree-vect-loop.c (vectorizable_live_operation): Use gsi_insert_seq_before to replace gsi_insert_before. gcc/testsuite/ PR tree-optimization/94443 * gcc.dg/vect/pr94443.c: New test.
2020-04-01	Fix PR94043 by making vect_live_op generate lc-phi	Kewen Lin	1	-6/+44
	As PR94043 shows, my commit r10-4524 exposed one issue in vectorizable_live_operation, which inserts one extra BB before the single exit, leading unexpected operand expansion and unexpected loop depth assertion. As Richi suggested, this patch is to teach vectorizable_live_operation to generate loop closed phi for vec_lhs, it looks like: loop; # lhs' = PHI <lhs> => loop; # vec_lhs' = PHI <vec_lhs> new_tree = BIT_FIELD_REF <vec_lhs', ...>; lhs' = new_tree; I noticed that there are some SLP cases that have same lhs and vec_lhs but different offsets, which can make us have more PHIs for the same vec_lhs there. But I think it would be fine since only one of them is actually live, the others should be eliminated by the following dce. So the patch doesn't check whether there is one phi for vec_lhs, just create one directly instead. Bootstrapped/regtested on powerpc64le-linux-gnu (LE) P8. 2020-04-01 Kewen Lin <linkw@gcc.gnu.org> gcc/ChangeLog PR tree-optimization/94043 * tree-vect-loop.c (vectorizable_live_operation): Generate loop-closed phi for vec_lhs and use it for lane extraction. gcc/testsuite/ChangeLog PR tree-optimization/94043 * gfortran.dg/graphite/vect-pr94043.f90: New test.
2020-03-17	Fix up duplicated duplicated words mostly in comments	Jakub Jelinek	1	-1/+1
	In the r10-7197-gbae7b38cf8a21e068ad5c0bab089dedb78af3346 commit I've noticed duplicated word in a message, which lead me to grep for those and we have a tons of them. I've used grep -v 'long long\\|optab optab\\|template template\\|double double' .[chS] /.[chS] .def config// 2>/dev/null \| grep ' \([a-zA-Z]\+\) \1 ' Note, the command will not detect the doubled words at the start or end of line or when one of the words is at the end of line and the next one at the start of another one. Some of it is fairly obvious, e.g. all the "the the" cases which is something I've posted and committed patch for already e.g. in 2016, other cases are often valid, e.g. "that that" seems to look mostly ok to me. Some cases are quite hard to figure out, I've left out some of them from the patch (e.g. "and and" in some cases isn't talking about bitwise/logical and and so looks incorrect, but in other cases it is talking about those operations). In most cases the right solution seems to be to remove one of the duplicated words, but not always. I think most important are the ones with user visible messages (in the patch 3 of the first 4 hunks), the rest is just comments (and internal documentation; for that see the doc/tm.texi changes). 2020-03-17 Jakub Jelinek <jakub@redhat.com> * lra-spills.c (remove_pseudos): Fix up duplicated word issue in a dump message. * tree-sra.c (create_access_replacement): Fix up duplicated word issue in a comment. * read-rtl-function.c (find_param_by_name, function_reader::parse_enum_value, function_reader::get_insn_by_uid): Likewise. * spellcheck.c (get_edit_distance_cutoff): Likewise. * tree-data-ref.c (create_ifn_alias_checks): Likewise. * tree.def (SWITCH_EXPR): Likewise. * selftest.c (assert_str_contains): Likewise. * ipa-param-manipulation.h (class ipa_param_body_adjustments): Likewise. * tree-ssa-math-opts.c (convert_expand_mult_copysign): Likewise. * tree-ssa-loop-split.c (find_vdef_in_loop): Likewise. * langhooks.h (struct lang_hooks_for_decls): Likewise. * ipa-prop.h (struct ipa_param_descriptor): Likewise. * tree-ssa-strlen.c (handle_builtin_string_cmp, handle_store): Likewise. * tree-ssa-dom.c (simplify_stmt_for_jump_threading): Likewise. * tree-ssa-reassoc.c (reassociate_bb): Likewise. * tree.c (component_ref_size): Likewise. * hsa-common.c (hsa_init_compilation_unit_data): Likewise. * gimple-ssa-sprintf.c (get_string_length, format_string, format_directive): Likewise. * omp-grid.c (grid_process_kernel_body_copy): Likewise. * input.c (string_concat_db::get_string_concatenation, test_lexer_string_locations_ucn4): Likewise. * cfgexpand.c (pass_expand::execute): Likewise. * gimple-ssa-warn-restrict.c (builtin_memref::offset_out_of_bounds, maybe_diag_overlap): Likewise. * rtl.c (RTX_CODE_HWINT_P_1): Likewise. * shrink-wrap.c (spread_components): Likewise. * tree-ssa-dse.c (initialize_ao_ref_for_dse, valid_ao_ref_for_dse): Likewise. * tree-call-cdce.c (shrink_wrap_one_built_in_call_with_conds): Likewise. * dwarf2out.c (dwarf2out_early_finish): Likewise. * gimple-ssa-store-merging.c: Likewise. * ira-costs.c (record_operand_costs): Likewise. * tree-vect-loop.c (vectorizable_reduction): Likewise. * target.def (dispatch): Likewise. (validate_dims, gen_ccmp_first): Fix up duplicated word issue in documentation text. * doc/tm.texi: Regenerated. * config/i386/x86-tune.def (X86_TUNE_PARTIAL_FLAG_REG_STALL): Fix up duplicated word issue in a comment. * config/i386/i386.c (ix86_test_loading_unspec): Likewise. * config/i386/i386-features.c (remove_partial_avx_dependency): Likewise. * config/msp430/msp430.c (msp430_select_section): Likewise. * config/gcn/gcn-run.c (load_image): Likewise. * config/aarch64/aarch64-sve.md (sve_ld1r<mode>): Likewise. * config/aarch64/aarch64.c (aarch64_gen_adjusted_ldpstp): Likewise. * config/aarch64/falkor-tag-collision-avoidance.c (single_dest_per_chain): Likewise. * config/nvptx/nvptx.c (nvptx_record_fndecl): Likewise. * config/fr30/fr30.c (fr30_arg_partial_bytes): Likewise. * config/rs6000/rs6000-string.c (expand_cmp_vec_sequence): Likewise. * config/rs6000/rs6000-p8swap.c (replace_swapped_load_constant): Likewise. * config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Likewise. * config/rs6000/rs6000.c (rs6000_option_override_internal): Likewise. * config/rs6000/rs6000-logue.c (rs6000_emit_probe_stack_range_stack_clash): Likewise. * config/nds32/nds32-md-auxiliary.c (nds32_split_ashiftdi3): Likewise. Fix various other issues in the comment. c-family/ * c-common.c (resolve_overloaded_builtin): Fix up duplicated word issue in a diagnostic message. cp/ * pt.c (tsubst): Fix up duplicated word issue in a diagnostic message. (lookup_template_class_1, tsubst_expr): Fix up duplicated word issue in a comment. * parser.c (cp_parser_statement, cp_parser_linkage_specification, cp_parser_placeholder_type_specifier, cp_parser_constraint_requires_parens): Likewise. * name-lookup.c (suggest_alternative_in_explicit_scope): Likewise. fortran/ * array.c (gfc_check_iter_variable): Fix up duplicated word issue in a comment. * arith.c (gfc_arith_concat): Likewise. * resolve.c (gfc_resolve_ref): Likewise. * frontend-passes.c (matmul_lhs_realloc): Likewise. * module.c (gfc_match_submodule, load_needed): Likewise. * trans-expr.c (gfc_init_se): Likewise.
2020-01-28	vect: Pattern-matched calls in reduction chains	Richard Sandiford	1	-3/+11
	gcc.dg/pr56350.c started ICEing for SVE in GCC 10 because we pattern-matched a division reduction: a /= 8; into a signed shift with division semantics: ... = IFN_SDIV_POW2 (..., 3); whereas the reduction code expected it still to be a gassign. One fix would be to check for a reduction in the pattern matcher (but current patterns don't generally do that). Another would be to fail gracefully for reductions involving calls. Since we can't vectorise the reduction either way, and probably have a better shot with the shift form, this patch goes for the "fail gracefully" approach. 2020-01-28 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-loop.c (vectorizable_reduction): Fail gracefully for reduction chains that (now) include a call.
2020-01-20	tree-optimization/93094 pass down VECTORIZED_CALL to versioning	Richard Biener	1	-2/+2
	When versioning is run the IL is already mangled and finding a VECTORIZED_CALL IFN can fail. 2020-01-20 Richard Biener <rguenther@suse.de> PR tree-optimization/93094 * tree-vectorizer.h (vect_loop_versioning): Adjust. (vect_transform_loop): Likewise. * tree-vectorizer.c (try_vectorize_loop_1): Pass down loop_vectorized_call to vect_transform_loop. * tree-vect-loop.c (vect_transform_loop): Pass down loop_vectorized_call to vect_loop_versioning. * tree-vect-loop-manip.c (vect_loop_versioning): Use the earlier discovered loop_vectorized_call. * gcc.dg/vect/pr93094.c: New testcase.
2020-01-16	PR tree-optimization/92429 do not fold when updating epilogue statements	Andre Vieira	1	-1/+6
	This patch addresses the problem reported in PR92429. When creating an epilogue for vectorization we have to replace the SSA_NAMEs in the PATTERN_DEF_SEQs and RELATED_STMTs of the epilogue's loop_vec_infos. When doing this we were using simplify_replace_tree which always folds the replacement. This may lead to a different tree-node than the one which was analyzed in vect_loop_analyze. In turn the new tree-node may require a different vectorization than the one we had prepared for which caused the ICE in question. gcc/ChangeLog: 2020-01-16 Andre Vieira <andre.simoesdiasvieira@arm.com> PR tree-optimization/92429 * tree-ssa-loop-niter.h (simplify_replace_tree): Add parameter. * tree-ssa-loop-niter.c (simplify_replace_tree): Add parameter to control folding. * tree-vect-loop.c (update_epilogue_vinfo): Do not fold when replacing tree. gcc/testsuite/ChangeLog: 2020-01-16 Andre Vieira <andre.simoesdiasvieira@arm.com> PR tree-optimization/92429 * gcc.dg/vect/pr92429.c: New test.
2020-01-15	PR tree-optimization/93247 - ICE in get_load_store_type	Richard Sandiford	1	-1/+2
	My earlier update_epilogue_loop_vinfo patch introduced an ICE on these tests for AVX512. If we use pattern stmts, STMT_VINFO_GATHER_SCATTER_P is valid for both the original stmt and the pattern stmt, but STMT_VINFO_MEMORY_ACCESS_TYPE is valid only for the latter. 2020-01-15 Richard Sandiford <richard.sandiford@arm.com> gcc/ PR tree-optimization/93247 * tree-vect-loop.c (update_epilogue_loop_vinfo): Check the access type of the stmt that we're going to vectorize. gcc/testsuite/ PR tree-optimization/93247 * gcc.dg/vect/pr93247-1.c: New test. * gcc.dg/vect/pr93247-2.c: Likewise.
2020-01-10	Use get_related_vectype_for_scalar_type for reduction indices	Richard Sandiford	1	-2/+3
	The related_vector_mode series missed this case in vect_create_epilog_for_reduction, where we want to create the unsigned integer equivalent of another vector. Without it we could mix SVE and Advanced SIMD vectors in the same operation. This showed up on existing tests when testing with fixed-length -msve-vector-bits=128. 2020-01-10 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-loop.c (vect_create_epilog_for_reduction): Use get_related_vectype_for_scalar_type rather than build_vector_type to create the index type for a conditional reduction. From-SVN: r280112
2020-01-10	Fix gather/scatter check when updating a vector epilogue loop	Richard Sandiford	1	-1/+1
	update_epilogue_loop_vinfo applies SSA renmaing to the DR_REF of a gather or scatter, so that vect_check_gather_scatter continues to work. However, we sometimes also rely on vect_check_gather_scatter when using gathers and scatters to implement strided accesses. This showed up on existing tests when testing with fixed-length -msve-vector-bits=128. 2020-01-10 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-loop.c (update_epilogue_loop_vinfo): Update DR_REF for any type of gather or scatter, including strided accesses. From-SVN: r280111
2020-01-10	[vect] Keep track of DR_OFFSET advance in dr_vec_info rather than data_reference	Andre Vieira	1	-11/+4
	gcc/ChangeLog: 2020-01-10 Andre Vieira <andre.simoesdiasvieira@arm.com> * tree-vect-data-refs.c (vect_create_addr_base_for_vector_ref): Use get_dr_vinfo_offset * tree-vect-loop.c (update_epilogue_loop_vinfo): Remove orig_drs_init parameter and its use to reset DR_OFFSET's. (vect_transform_loop): Remove orig_drs_init argument. * tree-vect-loop-manip.c (vect_update_init_of_dr): Update the offset member of dr_vec_info rather than the offset of the associated data_reference's innermost_loop_behavior. (vect_update_init_of_dr): Pass dr_vec_info instead of data_reference. (vect_do_peeling): Remove orig_drs_init parameter and its construction. * tree-vect-stmts.c (check_scan_store): Replace use of DR_OFFSET with get_dr_vinfo_offset. (vectorizable_store): Likewise. (vectorizable_load): Likewise. From-SVN: r280107
2020-01-01	Update copyright years.	Jakub Jelinek	1	-1/+1
	From-SVN: r279813
2019-12-27	Add missing target check for fully-masked fold-left reductions	Richard Sandiford	1	-0/+12
	The fold-left reduction code has a (rarely-used) fallback that handles cases in which the loop is fully-masked and the target has no native support for the reduction. The fallback includea a VEC_COND_EXPR between the reduction vector and a safe value, so we should check whether that VEC_COND_EXPR is supported. 2019-12-27 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-loop.c (vectorizable_reduction): Check whether the target supports the required VEC_COND_EXPR operation before allowing the fallback handling of masked fold-left reductions. gcc/testsuite/ * gcc.target/aarch64/sve/mixed_size_10.c: New test. From-SVN: r279742
2019-12-17	Add pointer to PR92772	Andrew Stubbs	1	-1/+4
	2019-12-17 Andrew Stubbs <ams@codesourcery.com> * tree-vect-loop.c (vect_create_epilog_for_reduction): Mention pr92772 in the comments. From-SVN: r279460
2019-12-10	Add missing conversion in vect_create_epilog_for_reduction	Richard Sandiford	1	-0/+2
	The direct_slp_reduc code in vect_create_epilog_for_reduction was still assuming that all types involved in a reduction are the same (up to types_compatible_p), whereas we now support differences in sign. This was causing an ICE in gcc.dg/vect/pr92324-4.c for SVE. 2019-12-10 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-loop.c (vect_create_epilog_for_reduction): When handling direct_slp_reduc, allow the PHI arguments to have a different type from the vector elements. From-SVN: r279164
2019-12-10	Disallow EXTRACT_LAST_REDUCTION for reduction chains	Richard Sandiford	1	-2/+3
	gcc.dg/vect/vect-cond-reduc-5.c was ICEing for SVE because we tried to use an extract-last reduction for a chain of COND_EXPRs. Adding support for the chained case would be too invasive for stage 3 so this patch explicitly forbids it instead. I've filed PR92884 for the possible future work. 2019-12-10 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-loop.c (vectorizable_reduction): Don't use EXTRACT_LAST_REDUCTION for chained reductions. From-SVN: r279161
2019-12-05	[Patch, GCC] Fix a condition post r278611	Sudakshina Das	1	-1/+1
	gcc/ChangeLog 2019-12-05 Sudakshina Das <sudi.das@arm.com> * tree-vect-loop.c (vect_model_reduction_cost): Remove reduction_type check from if condition. From-SVN: r279012
2019-12-02	re PR tree-optimization/92742 (ICE in info_for_reduction, at ↵	Richard Biener	1	-1/+2
	tree-vect-loop.c:4367) 2019-12-02 Richard Biener <rguenther@suse.de> PR tree-optimization/92742 * tree-vect-loop.c (vect_fixup_reduc_chain): Do not touch the def-type but verify it is consistent with the original stmts. * gcc.dg/torture/pr92742.c: New testcase. From-SVN: r278896
2019-11-29	Fix DR_GROUP_GAP for strided accesses (PR 92677)	Richard Sandiford	1	-1/+4
	When dissolving an SLP-only group of accesses, we should only set the gap to group_size - 1 for normal non-strided groups. 2019-11-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ PR tree-optimization/92677 * tree-vect-loop.c (vect_dissolve_slp_only_groups): Set the gap to zero when dissolving a group of strided accesses. gcc/testsuite/ PR tree-optimization/92677 * gcc.dg/vect/pr92677.c: New test. From-SVN: r278852
2019-11-29	Don't defer choice of vector type for bools (PR 92596)	Richard Sandiford	1	-30/+8
	Now that stmt_vec_info records the choice between vector mask types and normal nonmask types, we can use that information in vect_get_vector_types_for_stmt instead of deferring the choice of vector type till later. vect_get_mask_type_for_stmt used to check whether the boolean inputs to an operation: (a) consistently used mask types or consistently used nonmask types; and (b) agreed on the number of elements. (b) shouldn't be a problem when (a) is met. If the operation consistently uses mask types, tree-vect-patterns.c will have corrected any mismatches in mask precision. (This is because we only use mask types for a small well-known set of operations and tree-vect-patterns.c knows how to handle any that could have different mask precisions.) And if the operation consistently uses normal nonmask types, there's no reason why booleans should need extra vector compatibility checks compared to ordinary integers. So the potential difficulties all seem to come from (a). Now that we've chosen the result type ahead of time, we also have to consider whether the outputs and inputs consistently use mask types. Taking each vectorizable_* routine in turn: - vectorizable_call vect_get_vector_types_for_stmt only handled booleans specially for gassigns, so vect_get_mask_type_for_stmt never had chance to handle calls. I'm not sure we support any calls that operate on booleans, but as things stand, a boolean result would always have a nonmask type. Presumably any vector argument would also need to use nonmask types, unless it corresponds to internal_fn_mask_index (which is already a special case). For safety, I've added a check for mask/nonmask combinations here even though we didn't check this previously. - vectorizable_simd_clone_call Again, vect_get_mask_type_for_stmt never had chance to handle calls. The result of the call will always be a nonmask type and the patch for PR 92710 rejects mask arguments. So all booleans should consistently use nonmask types here. - vectorizable_conversion The function already rejects any conversion between booleans in which one type isn't a mask type. - vectorizable_operation This function definitely needs a consistency check, e.g. to handle & and \| in which one operand is loaded from memory and the other is a comparison result. Ideally we'd handle this via pattern stmts instead (like we do for the all-mask case), but that's future work. - vectorizable_assignment VECT_SCALAR_BOOLEAN_TYPE_P requires single-bit precision, so the current code already rejects problematic cases. - vectorizable_load Loads always produce nonmask types and there are no relevant inputs to check against. - vectorizable_store vect_check_store_rhs already rejects mask/nonmask combinations via useless_type_conversion_p. - vectorizable_reduction - vectorizable_lc_phi PHIs always have nonmask types. After the change above, attempts to combine the PHI result with a mask type would be rejected by vectorizable_operation. (Again, it would be better to handle this using pattern stmts.) - vectorizable_induction We don't generate inductions for booleans. - vectorizable_shift The function already rejects boolean shifts via type_has_mode_precision_p. - vectorizable_condition The function already rejects mismatches via useless_type_conversion_p. - vectorizable_comparison The function already rejects comparisons between mask and nonmask types. The result is always a mask type. 2019-11-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ PR tree-optimization/92596 * tree-vect-stmts.c (vectorizable_call): Punt on hybrid mask/nonmask operations. (vectorizable_operation): Likewise, instead of relying on vect_get_mask_type_for_stmt to do this. (vect_get_vector_types_for_stmt): Always return a vector type immediately, rather than deferring the choice for boolean results. Use a vector mask type instead of a normal vector if vect_use_mask_type_p. (vect_get_mask_type_for_stmt): Delete. * tree-vect-loop.c (vect_determine_vf_for_stmt_1): Remove mask_producers argument and special boolean_type_node handling. (vect_determine_vf_for_stmt): Remove mask_producers argument and update calls to vect_determine_vf_for_stmt_1. Remove doubled call. (vect_determine_vectorization_factor): Update call accordingly. * tree-vect-slp.c (vect_build_slp_tree_1): Remove special boolean_type_node handling. (vect_slp_analyze_node_operations_1): Likewise. gcc/testsuite/ PR tree-optimization/92596 * gcc.dg/vect/bb-slp-pr92596.c: New test. * gcc.dg/vect/bb-slp-43.c: Likewise. From-SVN: r278851
2019-11-22	Move EXTRACT_LAST_REDUCTION costing to vectorizable_condition	Richard Sandiford	1	-2/+5
	gcc.target/aarch64/sve/clastb_[57].c started failing after the increase in the cost of vec_to_scalar (r278452). The problem is that we were double-counting the cost of the CLASTB: once in vect_model_reduction_cost as a vec_to_scalar and once in vectorizable_condition as a plain vector_stmt. Based on the TODO above vect_model_reduction_cost, I think the preferred long-term direction is for vectorizable_* to cost these things itself, so that's what the patch does (for this one case only). 2019-11-22 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-stmts.c (vect_model_simple_cost): Take an optional vect_cost_for_stmt. (vectorizable_condition): Calculate the cost of EXTRACT_LAST_REDUCTION here rather than... * tree-vect-loop.c (vect_model_reduction_cost): ...here. From-SVN: r278611
2019-11-19	re PR tree-optimization/92581 (condition chains vectorized wrongly)	Richard Biener	1	-26/+32
	2019-11-19 Richard Biener <rguenther@suse.de> PR tree-optimization/92581 * tree-vect-loop.c (vect_create_epilog_for_reduction): For condition reduction chains gather all conditions involved for computing the index reduction vector. * gcc.dg/vect/vect-cond-reduc-5.c: New testcase. From-SVN: r278445
2019-11-19	re PR tree-optimization/92554 (ICE in vect_create_epilog_for_reduction, at ↵	Richard Biener	1	-11/+21
	tree-vect-loop.c:4325) 2019-11-19 Richard Biener <rguenther@suse.de> PR tree-optimization/92554 * tree-vect-loop.c (vect_create_epilog_for_reduction): Look for the actual condition stmt and deal with sign-changes. * gcc.dg/vect/pr92554.c: New testcase. From-SVN: r278431
2019-11-19	re PR tree-optimization/92555 (ICE in exact_div, at poly-int.h:2162)	Richard Biener	1	-0/+12
	2019-09-19 Richard Biener <rguenther@suse.de> PR tree-optimization/92555 * tree-vect-loop.c (vect_update_vf_for_slp): Also scan PHIs for non-SLP stmts. * gcc.dg/vect/pr92555.c: New testcase. From-SVN: r278430
2019-11-18	re PR tree-optimization/92558 (Miscompare of 554.roms_r with -Ofast ↵	Richard Biener	1	-0/+1
	-march=znver2 -flto since r278289) 2019-11-18 Richard Biener <rguenther@suse.de> PR tree-optimization/92558 * tree-vect-loop.c (vect_create_epilog_for_reduction): When reducting the width of a reduction vector def update new_phis. * gcc.dg/vect/pr92558.c: New testcase. From-SVN: r278400
2019-11-16	Optionally pick the cheapest loop_vec_info	Richard Sandiford	1	-7/+141
	This patch adds a mode in which the vectoriser tries each available base vector mode and picks the one with the lowest cost. The new behaviour is selected by autovectorize_vector_modes. The patch keeps the current behaviour of preferring a VF of loop->simdlen over any larger or smaller VF, regardless of costs or target preferences. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * target.h (VECT_COMPARE_COSTS): New constant. * target.def (autovectorize_vector_modes): Return a bitmask of flags. * doc/tm.texi: Regenerate. * targhooks.h (default_autovectorize_vector_modes): Update accordingly. * targhooks.c (default_autovectorize_vector_modes): Likewise. * config/aarch64/aarch64.c (aarch64_autovectorize_vector_modes): Likewise. * config/arc/arc.c (arc_autovectorize_vector_modes): Likewise. * config/arm/arm.c (arm_autovectorize_vector_modes): Likewise. * config/i386/i386.c (ix86_autovectorize_vector_modes): Likewise. * config/mips/mips.c (mips_autovectorize_vector_modes): Likewise. * tree-vectorizer.h (_loop_vec_info::vec_outside_cost) (_loop_vec_info::vec_inside_cost): New member variables. * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize them. (vect_better_loop_vinfo_p, vect_joust_loop_vinfos): New functions. (vect_analyze_loop): When autovectorize_vector_modes returns VECT_COMPARE_COSTS, try vectorizing the loop with each available vector mode and picking the one with the lowest cost. (vect_estimate_min_profitable_iters): Record the computed costs in the loop_vec_info. From-SVN: r278336
2019-11-16	Extend can_duplicate_and_interleave_p to mixed-size vectors	Richard Sandiford	1	-2/+1
	This patch makes can_duplicate_and_interleave_p cope with mixtures of vector sizes, by using queries based on get_vectype_for_scalar_type instead of directly querying GET_MODE_SIZE (vinfo->vector_mode). int_mode_for_size is now the first check we do for a candidate mode, so it seemed better to restrict it to MAX_FIXED_MODE_SIZE. This avoids unnecessary work and avoids trying to create scalar types that the target might not support. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (can_duplicate_and_interleave_p): Take an element type rather than an element mode. * tree-vect-slp.c (can_duplicate_and_interleave_p): Likewise. Use get_vectype_for_scalar_type to query the natural types for a given element type rather than basing everything on GET_MODE_SIZE (vinfo->vector_mode). Limit int_mode_for_size query to MAX_FIXED_MODE_SIZE. (duplicate_and_interleave): Update call accordingly. * tree-vect-loop.c (vectorizable_reduction): Likewise. From-SVN: r278335
2019-11-15	re PR tree-optimization/92512 (ICE in gimple_op, at gimple.h:2436)	Richard Biener	1	-4/+17
	2019-11-15 Richard Biener <rguenther@suse.de> PR tree-optimization/92512 * tree-vect-loop.c (check_reduction_path): Fix operand index computability check. Add check for second use in COND_EXPRs. * gcc.dg/torture/pr92512.c: New testcase. From-SVN: r278293
2019-11-15	re PR tree-optimization/92324 (ICE in expand_direct_optab_fn, at ↵	Richard Biener	1	-30/+38
	internal-fn.c:2890) 2019-11-15 Richard Biener <rguenther@suse.de> PR tree-optimization/92324 * tree-vect-loop.c (vect_create_epilog_for_reduction): Fix singedness of SLP reduction epilouge operations. Also reduce the vector width for SLP reductions before doing elementwise operations if possible. * gcc.dg/vect/pr92324-4.c: New testcase. From-SVN: r278289
2019-11-14	Avoid retrying with the same vector modes	Richard Sandiford	1	-0/+13
	A later patch makes the AArch64 port add four entries to autovectorize_vector_modes. Each entry describes a different vector mode assignment for vector code that mixes 8-bit, 16-bit, 32-bit and 64-bit elements. But if (as usual) the vector code has fewer element sizes than that, we could end up trying the same combination of vector modes multiple times. This patch adds a check to prevent that. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vec_info::mode_set): New typedef. (vec_info::used_vector_mode): New member variable. (vect_chooses_same_modes_p): Declare. * tree-vect-stmts.c (get_vectype_for_scalar_type): Record each chosen vector mode in vec_info::used_vector_mode. (vect_chooses_same_modes_p): New function. * tree-vect-loop.c (vect_analyze_loop): Use it to avoid trying the same vector statements multiple times. * tree-vect-slp.c (vect_slp_bb_region): Likewise. From-SVN: r278242
2019-11-14	Support vectorisation with mixed vector sizes	Richard Sandiford	1	-14/+40
	After previous patches, it's now possible to make the vectoriser support multiple vector sizes in the same vector region, using related_vector_mode to pick the right vector mode for a given element mode. No port yet takes advantage of this, but I have a follow-on patch for AArch64. This patch also seemed like a good opportunity to add some more dump messages: one to make it clear which vector size/mode was being used when analysis passed or failed, and another to say when we've decided to skip a redundant vector size/mode. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * machmode.h (opt_machine_mode::operator==): New function. (opt_machine_mode::operator!=): Likewise. * tree-vectorizer.h (vec_info::vector_mode): Update comment. (get_related_vectype_for_scalar_type): Delete. (get_vectype_for_scalar_type_and_size): Declare. * tree-vect-slp.c (vect_slp_bb_region): Print dump messages to say whether analysis passed or failed, and with what vector modes. Use related_vector_mode to check whether trying a particular vector mode would be redundant with the autodetected mode, and print a dump message if we decide to skip it. * tree-vect-loop.c (vect_analyze_loop): Likewise. (vect_create_epilog_for_reduction): Use get_related_vectype_for_scalar_type instead of get_vectype_for_scalar_type_and_size. * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Replace with... (get_related_vectype_for_scalar_type): ...this new function. Take a starting/"prevailing" vector mode rather than a vector size. Take an optional nunits argument, with the same meaning as for related_vector_mode. Use related_vector_mode when not auto-detecting a mode, falling back to mode_for_vector if no target mode exists. (get_vectype_for_scalar_type): Update accordingly. (get_same_sized_vectype): Likewise. * tree-vectorizer.c (get_vec_alignment_for_array_type): Likewise. From-SVN: r278240
2019-11-14	Replace vec_info::vector_size with vec_info::vector_mode	Richard Sandiford	1	-19/+13
	This patch replaces vec_info::vector_size with vec_info::vector_mode, but for now continues to use it as a way of specifying a single vector size. This makes it easier for later patches to use related_vector_mode instead. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vec_info::vector_size): Replace with... (vec_info::vector_mode): ...this new field. * tree-vect-loop.c (vect_update_vf_for_slp): Update accordingly. (vect_analyze_loop, vect_transform_loop): Likewise. * tree-vect-loop-manip.c (vect_do_peeling): Likewise. * tree-vect-slp.c (can_duplicate_and_interleave_p): Likewise. (vect_make_slp_decision, vect_slp_bb_region): Likewise. * tree-vect-stmts.c (get_vectype_for_scalar_type): Likewise. * tree-vectorizer.c (try_vectorize_loop_1): Likewise. gcc/testsuite/ * gcc.dg/vect/vect-tail-nomask-1.c: Update expected epilogue vectorization message. From-SVN: r278237
2019-11-14	Replace autovectorize_vector_sizes with autovectorize_vector_modes	Richard Sandiford	1	-18/+15
	This is another patch in the series to remove the assumption that all modes involved in vectorisation have to be the same size. Rather than have the target provide a list of vector sizes, it makes the target provide a list of vector "approaches", with each approach represented by a mode. A later patch will pass this mode to targetm.vectorize.related_mode to get the vector mode for a given element mode. Until then, the modes simply act as an alternative way of specifying the vector size. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * target.h (vector_sizes, auto_vector_sizes): Delete. (vector_modes, auto_vector_modes): New typedefs. * target.def (autovectorize_vector_sizes): Replace with... (autovectorize_vector_modes): ...this new hook. * doc/tm.texi.in (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Replace with... (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): ...this new hook. * doc/tm.texi: Regenerate. * targhooks.h (default_autovectorize_vector_sizes): Delete. (default_autovectorize_vector_modes): New function. * targhooks.c (default_autovectorize_vector_sizes): Delete. (default_autovectorize_vector_modes): New function. * omp-general.c (omp_max_vf): Use autovectorize_vector_modes instead of autovectorize_vector_sizes. Use the number of units in the mode to calculate the maximum VF. * omp-low.c (omp_clause_aligned_alignment): Use autovectorize_vector_modes instead of autovectorize_vector_sizes. Use a loop based on related_mode to iterate through all supported vector modes for a given scalar mode. * optabs-query.c (can_vec_mask_load_store_p): Use autovectorize_vector_modes instead of autovectorize_vector_sizes. * tree-vect-loop.c (vect_analyze_loop, vect_transform_loop): Likewise. * tree-vect-slp.c (vect_slp_bb_region): Likewise. * config/aarch64/aarch64.c (aarch64_autovectorize_vector_sizes): Replace with... (aarch64_autovectorize_vector_modes): ...this new function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define. * config/arc/arc.c (arc_autovectorize_vector_sizes): Replace with... (arc_autovectorize_vector_modes): ...this new function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define. * config/arm/arm.c (arm_autovectorize_vector_sizes): Replace with... (arm_autovectorize_vector_modes): ...this new function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define. * config/i386/i386.c (ix86_autovectorize_vector_sizes): Replace with... (ix86_autovectorize_vector_modes): ...this new function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define. * config/mips/mips.c (mips_autovectorize_vector_sizes): Replace with... (mips_autovectorize_vector_modes): ...this new function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define. From-SVN: r278236
2019-11-14	Remove build_{same_sized_,}truth_vector_type	Richard Sandiford	1	-5/+4
	build_same_sized_truth_vector_type was confusingly named, since for SVE and AVX512 the returned vector isn't the same byte size (although it does have the same number of elements). What it really returns is the "truth" vector type for a given data vector type. The more general truth_type_for provides the same thing when passed a vector and IMO has a more descriptive name, so this patch replaces all uses of build_same_sized_truth_vector_type with that. It does the same for a call to build_truth_vector_type, leaving truth_type_for itself as the only remaining caller. It's then more natural to pass build_truth_vector_type the original vector type rather than its size and nunits, especially since the given size isn't the size of the returned vector. This in turn allows a future patch to simplify the interface of get_mask_mode. Doing this also fixes a bug in which truth_type_for would pass a size of zero for BLKmode vector types. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree.h (build_truth_vector_type): Delete. (build_same_sized_truth_vector_type): Likewise. * tree.c (build_truth_vector_type): Rename to... (build_truth_vector_type_for): ...this. Make static and take a vector type as argument. (truth_type_for): Update accordingly. (build_same_sized_truth_vector_type): Delete. * tree-vect-generic.c (expand_vector_divmod): Use truth_type_for instead of build_same_sized_truth_vector_type. * tree-vect-loop.c (vect_create_epilog_for_reduction): Likewise. (vect_record_loop_mask, vect_get_loop_mask): Likewise. * tree-vect-patterns.c (build_mask_conversion): Likeise. * tree-vect-slp.c (vect_get_constant_vectors): Likewise. * tree-vect-stmts.c (vect_get_vec_def_for_operand): Likewise. (vect_build_gather_load_calls, vectorizable_call): Likewise. (scan_store_can_perm_p, vectorizable_scan_store): Likewise. (vectorizable_store, vectorizable_condition): Likewise. (get_mask_type_for_scalar_type, get_same_sized_vectype): Likewise. (vect_get_mask_type_for_stmt): Use truth_type_for instead of build_truth_vector_type. * config/aarch64/aarch64-sve-builtins.cc (gimple_folder::convert_pred): Use truth_type_for instead of build_same_sized_truth_vector_type. * config/rs6000/rs6000-call.c (fold_build_vec_cmp): Likewise. gcc/c/ * c-typeck.c (build_conditional_expr): Use truth_type_for instead of build_same_sized_truth_vector_type. (build_vec_cmp): Likewise. gcc/cp/ * call.c (build_conditional_expr_1): Use truth_type_for instead of build_same_sized_truth_vector_type. * typeck.c (build_vec_cmp): Likewise. gcc/d/ * d-codegen.cc (build_boolop): Use truth_type_for instead of build_same_sized_truth_vector_type. From-SVN: r278232
2019-11-14	Add build_truth_vector_type_for_mode	Richard Sandiford	1	-8/+10
	Callers of vect_halve_mask_nunits and vect_double_mask_nunits already know what mode the resulting vector type should have, so we might as well create the vector type directly with that mode, just like build_vector_type_for_mode lets us build normal vectors with a known mode. This avoids the current awkwardness of having to recompute the mode starting from vec_info::vector_size, which hard-codes the assumption that all vectors have to be the same size. A later patch gets rid of build_truth_vector_type and build_same_sized_truth_vector_type, so the net effect of the series is to reduce the number of type functions by one. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree.h (build_truth_vector_type_for_mode): Declare. * tree.c (build_truth_vector_type_for_mode): New function, split out from... (build_truth_vector_type): ...here. (build_opaque_vector_type): Fix head comment. * tree-vectorizer.h (supportable_narrowing_operation): Remove vec_info parameter. (vect_halve_mask_nunits): Replace vec_info parameter with the mode of the new vector. (vect_double_mask_nunits): Likewise. * tree-vect-loop.c (vect_halve_mask_nunits): Likewise. (vect_double_mask_nunits): Likewise. * tree-vect-loop-manip.c: Include insn-config.h, rtl.h and recog.h. (vect_maybe_permute_loop_masks): Remove vinfo parameter. Update call to vect_halve_mask_nunits, getting the required mode from the unpack patterns. (vect_set_loop_condition_masked): Update call accordingly. * tree-vect-stmts.c (supportable_narrowing_operation): Remove vec_info parameter and update call to vect_double_mask_nunits. (vectorizable_conversion): Update call accordingly. (simple_integer_narrowing): Likewise. Remove vec_info parameter. (vectorizable_call): Update call accordingly. (supportable_widening_operation): Update call to vect_halve_mask_nunits. * config/aarch64/aarch64-sve-builtins.cc (register_builtin_types): Use build_truth_vector_type_mode instead of build_truth_vector_type. From-SVN: r278231
2019-11-13	Account for the cost of generating loop masks	Richard Sandiford	1	-0/+26
	We didn't take the cost of generating loop masks into account, and so tended to underestimate the cost of loops that need multiple masks. 2019-11-13 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-loop.c (vect_estimate_min_profitable_iters): Include the cost of generating loop masks. gcc/testsuite/ * gcc.target/aarch64/sve/mask_struct_store_3.c: Add -fno-vect-cost-model. * gcc.target/aarch64/sve/mask_struct_store_3_run.c: Likewise. * gcc.target/aarch64/sve/peel_ind_2.c: Likewise. * gcc.target/aarch64/sve/peel_ind_2_run.c: Likewise. * gcc.target/aarch64/sve/peel_ind_3.c: Likewise. * gcc.target/aarch64/sve/peel_ind_3_run.c: Likewise. From-SVN: r278125
2019-11-13	Avoid accounting for non-existent vector loop versioning	Richard Sandiford	1	-9/+25
	vect_analyze_loop_costing uses two profitability thresholds: a runtime one and a static compile-time one. The runtime one is simply the point at which the vector loop is cheaper than the scalar loop, while the static one also takes into account the cost of choosing between the scalar and vector loops at runtime. We compare this static cost against the expected execution frequency to decide whether it's worth generating any vector code at all. However, we never reclaimed the cost of applying the runtime threshold if it turned out that the vector code can always be used. And we only know whether that's true once we've calculated what the runtime threshold would be. 2019-11-13 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vect_apply_runtime_profitability_check_p): New function. * tree-vect-loop-manip.c (vect_loop_versioning): Use it. * tree-vect-loop.c (vect_analyze_loop_2): Likewise. (vect_transform_loop): Likewise. (vect_analyze_loop_costing): Don't take the cost of versioning into account for the static profitability threshold if it turns out that no versioning is needed. From-SVN: r278124
2019-11-13	Don't assign a cost to vectorizable_assignment	Richard Sandiford	1	-1/+3
	vectorizable_assignment handles true SSA-to-SSA copies (which hopefully we don't see in practice) and no-op conversions that are required to maintain correct gimple, such as changes between signed and unsigned types. These cases shouldn't generate any code and so shouldn't count against either the scalar or vector costs. Later patches test this, but it seemed worth splitting out. 2019-11-13 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vect_nop_conversion_p): Declare. * tree-vect-stmts.c (vect_nop_conversion_p): New function. (vectorizable_assignment): Don't add a cost for nop conversions. * tree-vect-loop.c (vect_compute_single_scalar_iteration_cost): Likewise. * tree-vect-slp.c (vect_bb_slp_scalar_cost): Likewise. From-SVN: r278122
2019-11-13	re PR target/92473 (test pr92324-2.c fails on arm and aarch64)	Richard Biener	1	-24/+6
	2019-11-13 Richard Biener <rguenther@suse.de> PR tree-optimization/92473 * tree-vect-loop.c (vect_create_epilog_for_reduction): Perform direct optab reduction in the correct type. From-SVN: r278113
2019-11-12	re PR tree-optimization/92461 (ICE: verify_ssa failed (error: excess use ↵	Richard Biener	1	-2/+5
	operand for statement)) 2019-11-12 Richard Biener <rguenther@suse.de> PR tree-optimization/92461 * tree-vect-loop.c (vect_create_epilog_for_reduction): Update stmt after propagation. * gcc.dg/torture/pr92461.c: New testcase. From-SVN: r278093
2019-11-12	Remove gcc/params.* files.	Martin Liska	1	-1/+0
	2019-11-12 Martin Liska <mliska@suse.cz> * Makefile.in: Remove PARAMS_H and params.list and params.options. * params-enum.h: Remove. * params-list.h: Remove. * params-options.h: Remove. * params.c: Remove. * params.def: Remove. * params.h: Remove. * asan.c: Do not include params.h. * auto-profile.c: Likewise. * bb-reorder.c: Likewise. * builtins.c: Likewise. * cfgcleanup.c: Likewise. * cfgexpand.c: Likewise. * cfgloopanal.c: Likewise. * cgraph.c: Likewise. * combine.c: Likewise. * common/config/aarch64/aarch64-common.c: Likewise. * common/config/gcn/gcn-common.c: Likewise. * common/config/ia64/ia64-common.c: Likewise. * common/config/powerpcspe/powerpcspe-common.c: Likewise. * common/config/rs6000/rs6000-common.c: Likewise. * common/config/sh/sh-common.c: Likewise. * config/aarch64/aarch64.c: Likewise. * config/alpha/alpha.c: Likewise. * config/arm/arm.c: Likewise. * config/avr/avr.c: Likewise. * config/csky/csky.c: Likewise. * config/i386/i386-builtins.c: Likewise. * config/i386/i386-expand.c: Likewise. * config/i386/i386-features.c: Likewise. * config/i386/i386-options.c: Likewise. * config/i386/i386.c: Likewise. * config/ia64/ia64.c: Likewise. * config/rs6000/rs6000-logue.c: Likewise. * config/rs6000/rs6000.c: Likewise. * config/s390/s390.c: Likewise. * config/sparc/sparc.c: Likewise. * config/visium/visium.c: Likewise. * coverage.c: Likewise. * cprop.c: Likewise. * cse.c: Likewise. * cselib.c: Likewise. * dse.c: Likewise. * emit-rtl.c: Likewise. * explow.c: Likewise. * final.c: Likewise. * fold-const.c: Likewise. * gcc.c: Likewise. * gcse.c: Likewise. * ggc-common.c: Likewise. * ggc-page.c: Likewise. * gimple-loop-interchange.cc: Likewise. * gimple-loop-jam.c: Likewise. * gimple-loop-versioning.cc: Likewise. * gimple-ssa-split-paths.c: Likewise. * gimple-ssa-sprintf.c: Likewise. * gimple-ssa-store-merging.c: Likewise. * gimple-ssa-strength-reduction.c: Likewise. * gimple-ssa-warn-alloca.c: Likewise. * gimple-ssa-warn-restrict.c: Likewise. * graphite-isl-ast-to-gimple.c: Likewise. * graphite-optimize-isl.c: Likewise. * graphite-scop-detection.c: Likewise. * graphite-sese-to-poly.c: Likewise. * graphite.c: Likewise. * haifa-sched.c: Likewise. * hsa-gen.c: Likewise. * ifcvt.c: Likewise. * ipa-cp.c: Likewise. * ipa-fnsummary.c: Likewise. * ipa-inline-analysis.c: Likewise. * ipa-inline.c: Likewise. * ipa-polymorphic-call.c: Likewise. * ipa-profile.c: Likewise. * ipa-prop.c: Likewise. * ipa-split.c: Likewise. * ipa-sra.c: Likewise. * ira-build.c: Likewise. * ira-conflicts.c: Likewise. * loop-doloop.c: Likewise. * loop-invariant.c: Likewise. * loop-unroll.c: Likewise. * lra-assigns.c: Likewise. * lra-constraints.c: Likewise. * modulo-sched.c: Likewise. * opt-suggestions.c: Likewise. * opts.c: Likewise. * postreload-gcse.c: Likewise. * predict.c: Likewise. * reload.c: Likewise. * reorg.c: Likewise. * resource.c: Likewise. * sanopt.c: Likewise. * sched-deps.c: Likewise. * sched-ebb.c: Likewise. * sched-rgn.c: Likewise. * sel-sched-ir.c: Likewise. * sel-sched.c: Likewise. * shrink-wrap.c: Likewise. * stmt.c: Likewise. * targhooks.c: Likewise. * toplev.c: Likewise. * tracer.c: Likewise. * trans-mem.c: Likewise. * tree-chrec.c: Likewise. * tree-data-ref.c: Likewise. * tree-if-conv.c: Likewise. * tree-inline.c: Likewise. * tree-loop-distribution.c: Likewise. * tree-parloops.c: Likewise. * tree-predcom.c: Likewise. * tree-profile.c: Likewise. * tree-scalar-evolution.c: Likewise. * tree-sra.c: Likewise. * tree-ssa-ccp.c: Likewise. * tree-ssa-dom.c: Likewise. * tree-ssa-dse.c: Likewise. * tree-ssa-ifcombine.c: Likewise. * tree-ssa-loop-ch.c: Likewise. * tree-ssa-loop-im.c: Likewise. * tree-ssa-loop-ivcanon.c: Likewise. * tree-ssa-loop-ivopts.c: Likewise. * tree-ssa-loop-manip.c: Likewise. * tree-ssa-loop-niter.c: Likewise. * tree-ssa-loop-prefetch.c: Likewise. * tree-ssa-loop-unswitch.c: Likewise. * tree-ssa-math-opts.c: Likewise. * tree-ssa-phiopt.c: Likewise. * tree-ssa-pre.c: Likewise. * tree-ssa-reassoc.c: Likewise. * tree-ssa-sccvn.c: Likewise. * tree-ssa-scopedtables.c: Likewise. * tree-ssa-sink.c: Likewise. * tree-ssa-strlen.c: Likewise. * tree-ssa-structalias.c: Likewise. * tree-ssa-tail-merge.c: Likewise. * tree-ssa-threadbackward.c: Likewise. * tree-ssa-threadedge.c: Likewise. * tree-ssa-uninit.c: Likewise. * tree-switch-conversion.c: Likewise. * tree-vect-data-refs.c: Likewise. * tree-vect-loop.c: Likewise. * tree-vect-slp.c: Likewise. * tree-vrp.c: Likewise. * tree.c: Likewise. * value-prof.c: Likewise. * var-tracking.c: Likewise. 2019-11-12 Martin Liska <mliska@suse.cz> * gimple-parser.c: Do not include params.h. 2019-11-12 Martin Liska <mliska@suse.cz> * name-lookup.c: Do not include params.h. * typeck.c: Likewise. 2019-11-12 Martin Liska <mliska@suse.cz> * lto-common.c: Do not include params.h. * lto-partition.c: Likewise. * lto.c: Likewise. From-SVN: r278086
2019-11-12	Apply mechanical replacement (generated patch).	Martin Liska	1	-3/+3
	2019-11-12 Martin Liska <mliska@suse.cz> * asan.c (asan_sanitize_stack_p): Replace old parameter syntax with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET macro. (asan_sanitize_allocas_p): Likewise. (asan_emit_stack_protection): Likewise. (asan_protect_global): Likewise. (instrument_derefs): Likewise. (instrument_builtin_call): Likewise. (asan_expand_mark_ifn): Likewise. * auto-profile.c (auto_profile): Likewise. * bb-reorder.c (copy_bb_p): Likewise. (duplicate_computed_gotos): Likewise. * builtins.c (inline_expand_builtin_string_cmp): Likewise. * cfgcleanup.c (try_crossjump_to_edge): Likewise. (try_crossjump_bb): Likewise. * cfgexpand.c (defer_stack_allocation): Likewise. (stack_protect_classify_type): Likewise. (pass_expand::execute): Likewise. * cfgloopanal.c (expected_loop_iterations_unbounded): Likewise. (estimate_reg_pressure_cost): Likewise. * cgraph.c (cgraph_edge::maybe_hot_p): Likewise. * combine.c (combine_instructions): Likewise. (record_value_for_reg): Likewise. * common/config/aarch64/aarch64-common.c (aarch64_option_validate_param): Likewise. (aarch64_option_default_params): Likewise. * common/config/ia64/ia64-common.c (ia64_option_default_params): Likewise. * common/config/powerpcspe/powerpcspe-common.c (rs6000_option_default_params): Likewise. * common/config/rs6000/rs6000-common.c (rs6000_option_default_params): Likewise. * common/config/sh/sh-common.c (sh_option_default_params): Likewise. * config/aarch64/aarch64.c (aarch64_output_probe_stack_range): Likewise. (aarch64_allocate_and_probe_stack_space): Likewise. (aarch64_expand_epilogue): Likewise. (aarch64_override_options_internal): Likewise. * config/alpha/alpha.c (alpha_option_override): Likewise. * config/arm/arm.c (arm_option_override): Likewise. (arm_valid_target_attribute_p): Likewise. * config/i386/i386-options.c (ix86_option_override_internal): Likewise. * config/i386/i386.c (get_probe_interval): Likewise. (ix86_adjust_stack_and_probe_stack_clash): Likewise. (ix86_max_noce_ifcvt_seq_cost): Likewise. * config/ia64/ia64.c (ia64_adjust_cost): Likewise. * config/rs6000/rs6000-logue.c (get_stack_clash_protection_probe_interval): Likewise. (get_stack_clash_protection_guard_size): Likewise. * config/rs6000/rs6000.c (rs6000_option_override_internal): Likewise. * config/s390/s390.c (allocate_stack_space): Likewise. (s390_emit_prologue): Likewise. (s390_option_override_internal): Likewise. * config/sparc/sparc.c (sparc_option_override): Likewise. * config/visium/visium.c (visium_option_override): Likewise. * coverage.c (get_coverage_counts): Likewise. (coverage_compute_profile_id): Likewise. (coverage_begin_function): Likewise. (coverage_end_function): Likewise. * cse.c (cse_find_path): Likewise. (cse_extended_basic_block): Likewise. (cse_main): Likewise. * cselib.c (cselib_invalidate_mem): Likewise. * dse.c (dse_step1): Likewise. * emit-rtl.c (set_new_first_and_last_insn): Likewise. (get_max_insn_count): Likewise. (make_debug_insn_raw): Likewise. (init_emit): Likewise. * explow.c (compute_stack_clash_protection_loop_data): Likewise. * final.c (compute_alignments): Likewise. * fold-const.c (fold_range_test): Likewise. (fold_truth_andor): Likewise. (tree_single_nonnegative_warnv_p): Likewise. (integer_valued_real_single_p): Likewise. * gcse.c (want_to_gcse_p): Likewise. (prune_insertions_deletions): Likewise. (hoist_code): Likewise. (gcse_or_cprop_is_too_expensive): Likewise. * ggc-common.c: Likewise. * ggc-page.c (ggc_collect): Likewise. * gimple-loop-interchange.cc (MAX_NUM_STMT): Likewise. (MAX_DATAREFS): Likewise. (OUTER_STRIDE_RATIO): Likewise. * gimple-loop-jam.c (tree_loop_unroll_and_jam): Likewise. * gimple-loop-versioning.cc (loop_versioning::max_insns_for_loop): Likewise. * gimple-ssa-split-paths.c (is_feasible_trace): Likewise. * gimple-ssa-store-merging.c (imm_store_chain_info::try_coalesce_bswap): Likewise. (imm_store_chain_info::coalesce_immediate_stores): Likewise. (imm_store_chain_info::output_merged_store): Likewise. (pass_store_merging::process_store): Likewise. * gimple-ssa-strength-reduction.c (find_basis_for_base_expr): Likewise. * graphite-isl-ast-to-gimple.c (class translate_isl_ast_to_gimple): Likewise. (scop_to_isl_ast): Likewise. * graphite-optimize-isl.c (get_schedule_for_node_st): Likewise. (optimize_isl): Likewise. * graphite-scop-detection.c (build_scops): Likewise. * haifa-sched.c (set_modulo_params): Likewise. (rank_for_schedule): Likewise. (model_add_to_worklist): Likewise. (model_promote_insn): Likewise. (model_choose_insn): Likewise. (queue_to_ready): Likewise. (autopref_multipass_dfa_lookahead_guard): Likewise. (schedule_block): Likewise. (sched_init): Likewise. * hsa-gen.c (init_prologue): Likewise. * ifcvt.c (bb_ok_for_noce_convert_multiple_sets): Likewise. (cond_move_process_if_block): Likewise. * ipa-cp.c (ipcp_lattice::add_value): Likewise. (merge_agg_lats_step): Likewise. (devirtualization_time_bonus): Likewise. (hint_time_bonus): Likewise. (incorporate_penalties): Likewise. (good_cloning_opportunity_p): Likewise. (ipcp_propagate_stage): Likewise. * ipa-fnsummary.c (decompose_param_expr): Likewise. (set_switch_stmt_execution_predicate): Likewise. (analyze_function_body): Likewise. (compute_fn_summary): Likewise. * ipa-inline-analysis.c (estimate_growth): Likewise. * ipa-inline.c (caller_growth_limits): Likewise. (inline_insns_single): Likewise. (inline_insns_auto): Likewise. (can_inline_edge_by_limits_p): Likewise. (want_early_inline_function_p): Likewise. (big_speedup_p): Likewise. (want_inline_small_function_p): Likewise. (want_inline_self_recursive_call_p): Likewise. (edge_badness): Likewise. (recursive_inlining): Likewise. (compute_max_insns): Likewise. (early_inliner): Likewise. * ipa-polymorphic-call.c (csftc_abort_walking_p): Likewise. * ipa-profile.c (ipa_profile): Likewise. * ipa-prop.c (determine_known_aggregate_parts): Likewise. (ipa_analyze_node): Likewise. (ipcp_transform_function): Likewise. * ipa-split.c (consider_split): Likewise. * ipa-sra.c (allocate_access): Likewise. (process_scan_results): Likewise. (ipa_sra_summarize_function): Likewise. (pull_accesses_from_callee): Likewise. * ira-build.c (loop_compare_func): Likewise. (mark_loops_for_removal): Likewise. * ira-conflicts.c (build_conflict_bit_table): Likewise. * loop-doloop.c (doloop_optimize): Likewise. * loop-invariant.c (gain_for_invariant): Likewise. (move_loop_invariants): Likewise. * loop-unroll.c (decide_unroll_constant_iterations): Likewise. (decide_unroll_runtime_iterations): Likewise. (decide_unroll_stupid): Likewise. (expand_var_during_unrolling): Likewise. * lra-assigns.c (spill_for): Likewise. * lra-constraints.c (EBB_PROBABILITY_CUTOFF): Likewise. * modulo-sched.c (sms_schedule): Likewise. (DFA_HISTORY): Likewise. * opts.c (default_options_optimization): Likewise. (finish_options): Likewise. (common_handle_option): Likewise. * postreload-gcse.c (eliminate_partially_redundant_load): Likewise. (if): Likewise. * predict.c (get_hot_bb_threshold): Likewise. (maybe_hot_count_p): Likewise. (probably_never_executed): Likewise. (predictable_edge_p): Likewise. (predict_loops): Likewise. (expr_expected_value_1): Likewise. (tree_predict_by_opcode): Likewise. (handle_missing_profiles): Likewise. * reload.c (find_equiv_reg): Likewise. * reorg.c (redundant_insn): Likewise. * resource.c (mark_target_live_regs): Likewise. (incr_ticks_for_insn): Likewise. * sanopt.c (pass_sanopt::execute): Likewise. * sched-deps.c (sched_analyze_1): Likewise. (sched_analyze_2): Likewise. (sched_analyze_insn): Likewise. (deps_analyze_insn): Likewise. * sched-ebb.c (schedule_ebbs): Likewise. * sched-rgn.c (find_single_block_region): Likewise. (too_large): Likewise. (haifa_find_rgns): Likewise. (extend_rgns): Likewise. (new_ready): Likewise. (schedule_region): Likewise. (sched_rgn_init): Likewise. * sel-sched-ir.c (make_region_from_loop): Likewise. * sel-sched-ir.h (MAX_WS): Likewise. * sel-sched.c (process_pipelined_exprs): Likewise. (sel_setup_region_sched_flags): Likewise. * shrink-wrap.c (try_shrink_wrapping): Likewise. * targhooks.c (default_max_noce_ifcvt_seq_cost): Likewise. * toplev.c (print_version): Likewise. (process_options): Likewise. * tracer.c (tail_duplicate): Likewise. * trans-mem.c (tm_log_add): Likewise. * tree-chrec.c (chrec_fold_plus_1): Likewise. * tree-data-ref.c (split_constant_offset): Likewise. (compute_all_dependences): Likewise. * tree-if-conv.c (MAX_PHI_ARG_NUM): Likewise. * tree-inline.c (remap_gimple_stmt): Likewise. * tree-loop-distribution.c (MAX_DATAREFS_NUM): Likewise. * tree-parloops.c (MIN_PER_THREAD): Likewise. (create_parallel_loop): Likewise. * tree-predcom.c (determine_unroll_factor): Likewise. * tree-scalar-evolution.c (instantiate_scev_r): Likewise. * tree-sra.c (analyze_all_variable_accesses): Likewise. * tree-ssa-ccp.c (fold_builtin_alloca_with_align): Likewise. * tree-ssa-dse.c (setup_live_bytes_from_ref): Likewise. (dse_optimize_redundant_stores): Likewise. (dse_classify_store): Likewise. * tree-ssa-ifcombine.c (ifcombine_ifandif): Likewise. * tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise. * tree-ssa-loop-im.c (LIM_EXPENSIVE): Likewise. * tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Likewise. (try_peel_loop): Likewise. (tree_unroll_loops_completely): Likewise. * tree-ssa-loop-ivopts.c (avg_loop_niter): Likewise. (CONSIDER_ALL_CANDIDATES_BOUND): Likewise. (MAX_CONSIDERED_GROUPS): Likewise. (ALWAYS_PRUNE_CAND_SET_BOUND): Likewise. * tree-ssa-loop-manip.c (can_unroll_loop_p): Likewise. * tree-ssa-loop-niter.c (MAX_ITERATIONS_TO_TRACK): Likewise. * tree-ssa-loop-prefetch.c (PREFETCH_BLOCK): Likewise. (L1_CACHE_SIZE_BYTES): Likewise. (L2_CACHE_SIZE_BYTES): Likewise. (should_issue_prefetch_p): Likewise. (schedule_prefetches): Likewise. (determine_unroll_factor): Likewise. (volume_of_references): Likewise. (add_subscript_strides): Likewise. (self_reuse_distance): Likewise. (mem_ref_count_reasonable_p): Likewise. (insn_to_prefetch_ratio_too_small_p): Likewise. (loop_prefetch_arrays): Likewise. (tree_ssa_prefetch_arrays): Likewise. * tree-ssa-loop-unswitch.c (tree_unswitch_single_loop): Likewise. * tree-ssa-math-opts.c (gimple_expand_builtin_pow): Likewise. (convert_mult_to_fma): Likewise. (math_opts_dom_walker::after_dom_children): Likewise. * tree-ssa-phiopt.c (cond_if_else_store_replacement): Likewise. (hoist_adjacent_loads): Likewise. (gate_hoist_loads): Likewise. * tree-ssa-pre.c (translate_vuse_through_block): Likewise. (compute_partial_antic_aux): Likewise. * tree-ssa-reassoc.c (get_reassociation_width): Likewise. * tree-ssa-sccvn.c (vn_reference_lookup_pieces): Likewise. (vn_reference_lookup): Likewise. (do_rpo_vn): Likewise. * tree-ssa-scopedtables.c (avail_exprs_stack::lookup_avail_expr): Likewise. * tree-ssa-sink.c (select_best_block): Likewise. * tree-ssa-strlen.c (new_stridx): Likewise. (new_addr_stridx): Likewise. (get_range_strlen_dynamic): Likewise. (class ssa_name_limit_t): Likewise. * tree-ssa-structalias.c (push_fields_onto_fieldstack): Likewise. (create_variable_info_for_1): Likewise. (init_alias_vars): Likewise. * tree-ssa-tail-merge.c (find_clusters_1): Likewise. (tail_merge_optimize): Likewise. * tree-ssa-threadbackward.c (thread_jumps::profitable_jump_thread_path): Likewise. (thread_jumps::fsm_find_control_statement_thread_paths): Likewise. (thread_jumps::find_jump_threads_backwards): Likewise. * tree-ssa-threadedge.c (record_temporary_equivalences_from_stmts_at_dest): Likewise. * tree-ssa-uninit.c (compute_control_dep_chain): Likewise. * tree-switch-conversion.c (switch_conversion::check_range): Likewise. (jump_table_cluster::can_be_handled): Likewise. * tree-switch-conversion.h (jump_table_cluster::case_values_threshold): Likewise. (SWITCH_CONVERSION_BRANCH_RATIO): Likewise. (param_switch_conversion_branch_ratio): Likewise. * tree-vect-data-refs.c (vect_mark_for_runtime_alias_test): Likewise. (vect_enhance_data_refs_alignment): Likewise. (vect_prune_runtime_alias_test_list): Likewise. * tree-vect-loop.c (vect_analyze_loop_costing): Likewise. (vect_get_datarefs_in_loop): Likewise. (vect_analyze_loop): Likewise. * tree-vect-slp.c (vect_slp_bb): Likewise. * tree-vectorizer.h: Likewise. * tree-vrp.c (find_switch_asserts): Likewise. (vrp_prop::check_mem_ref): Likewise. * tree.c (wide_int_to_tree_1): Likewise. (cache_integer_cst): Likewise. * var-tracking.c (EXPR_USE_DEPTH): Likewise. (reverse_op): Likewise. (vt_find_locations): Likewise. 2019-11-12 Martin Liska <mliska@suse.cz> * gimple-parser.c (c_parser_parse_gimple_body): Replace old parameter syntax with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET macro. 2019-11-12 Martin Liska <mliska@suse.cz> * name-lookup.c (namespace_hints::namespace_hints): Replace old parameter syntax with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET macro. * typeck.c (comptypes): Likewise. 2019-11-12 Martin Liska <mliska@suse.cz> * lto-partition.c (lto_balanced_map): Replace old parameter syntax with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET macro. * lto.c (do_whole_program_analysis): Likewise. From-SVN: r278085
2019-11-12	re PR tree-optimization/92347 (ICE in vect_get_vec_def_for_operand_1, at ↵	Andre Vieira	1	-1/+0
	tree-vect-stmts.c:1537) 2019-11-11 Andre Vieira <andre.simoesdiasvieira@arm.com> PR tree-optimization/92347 * tree-vect-loop.c (vect_transform_loop): Don't overwrite epilogues safelen with 0. * gcc.dg/vect/pr92347.c: New test. From-SVN: r278079
2019-11-08	Use correct vector type in neutral_op_for_slp_reduction	Richard Sandiford	1	-13/+16
	With the new reduction vectype handling, neutral_op_for_slp_reduction needs to know whether the caller is using STMT_VINFO_REDUC_VECTYPE (for an epilogue value) or STMT_VINFO_VECTYPE (for a PHI argument). This fixes various gcc.target/aarch64/sve/slp_* tests. 2019-11-08 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-loop.c (neutral_op_for_slp_reduction): Take the vector type as an argument rather than reading it from the stmt_vec_info. (vect_create_epilog_for_reduction): Update accordingly. (vectorizable_reduction): Likewise. (vect_transform_cycle_phi): Likewise. From-SVN: r277977
2019-11-08	[vect] Disable vectorization of epilogues for loops with SIMDUID set	Andre Vieira	1	-2/+6
	gcc/ChangeLog: 2019-11-08 Andre Vieira <andre.simoesdiasvieira@arm.com> * tree-vect-loop.c (vect_analyze_loop): Disable epilogue vectorization for loops with SIMDUID set. Enable epilogue vectorization for loops with SIMDLEN set after finding a main loop with a VF that matches it. From-SVN: r277964
2019-11-08	re PR tree-optimization/92324 (ICE in expand_direct_optab_fn, at ↵	Richard Biener	1	-85/+94
	internal-fn.c:2890) 2019-11-08 Richard Biener <rguenther@suse.de> PR tree-optimization/92324 * tree-vect-loop.c (vect_create_epilog_for_reduction): Use STMT_VINFO_REDUC_VECTYPE for all computations, inserting sign-conversions as necessary. (vectorizable_reduction): Reject conversions in the chain that are not sign-conversions, base analysis on a non-converting stmt and its operation sign. Set STMT_VINFO_REDUC_VECTYPE. * tree-vect-stmts.c (vect_stmt_relevant_p): Don't dump anything for debug stmts. * tree-vectorizer.h (_stmt_vec_info::reduc_vectype): New. (STMT_VINFO_REDUC_VECTYPE): Likewise. * gcc.dg/vect/pr92205.c: XFAIL. * gcc.dg/vect/pr92324-1.c: New testcase. * gcc.dg/vect/pr92324-2.c: Likewise. From-SVN: r277955
2019-11-07	re PR tree-optimization/92405 (ICE in vect_get_vec_def_for_stmt_copy, at ↵	Richard Biener	1	-0/+12
	tree-vect-stmts.c:1683) 2019-11-07 Richard Biener <rguenther@suse.de> PR tree-optimization/92405 * tree-vect-loop.c (vectorizable_reduction): Appropriately restrict lane-reducing ops to single stmt chains. From-SVN: r277921
2019-11-06	Don't vectorise single-iteration epilogues	Richard Sandiford	1	-0/+1
	With a later patch I saw a case in which we peeled a single iteration for gaps but didn't need to peel further iterations to make up a full vector. We then tried to vectorise the single-iteration epilogue. 2019-11-06 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-loop.c (vect_analyze_loop): Only try to vectorize the epilogue if there are peeled iterations for it to handle. From-SVN: r277886
2019-11-06	tree-vect-loop.c (vectorizable_reduction): Remember reduction PHI.	Richard Biener	1	-16/+8
	2019-11-06 Richard Biener <rguenther@suse.de> * tree-vect-loop.c (vectorizable_reduction): Remember reduction PHI. Use STMT_VINFO_REDUC_IDX to skip the reduction operand. Simplify single_defuse_cycle condition. From-SVN: r277882
2019-11-06	Check the VF is small enough for an epilogue loop	Richard Sandiford	1	-0/+10
	The number of iterations of an epilogue loop is always smaller than the VF of the main loop. vect_analyze_loop_costing was taking this into account when deciding whether the loop is cheap enough to vectorise, but that has no effect with the unlimited cost model. We need to use a separate check for correctness as well. This can happen if the sizes returned by autovectorize_vector_sizes happen to be out of order, e.g. because the target prefers smaller vectors. It can also happen with later patches if two vectorisation attempts happen to end up with the same VF. 2019-11-06 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-loop.c (vect_analyze_loop_2): When vectorizing an epilogue loop, make sure that the VF is small enough or that the epilogue loop can be fully-masked. From-SVN: r277880