path: root/gcc/tree-vect-data-refs.c
Age | Commit message | Author | Files | Lines
2017-10-06 | re PR tree-optimization/82397 (qsort comparator non-negative on sorted output: 1 in vect_analyze_data_ref_accesses) | Richard Biener | 1 | -31/+18
2017-10-06 Richard Biener <rguenther@suse.de> PR tree-optimization/82397 * tree-vect-data-refs.c (dr_group_sort_cmp): Do not use operand_equal_p but rely on data_ref_compare_tree for detecting equalities. (vect_analyze_data_ref_accesses): Use data_ref_compare_tree to match up with dr_group_sort_cmp. * gfortran.dg/pr82397.f: New testcase. From-SVN: r253482
2017-09-22 | PR82289: Computing peeling costs for irrelevant drs | Richard Sandiford | 1 | -0/+3
This PR shows that we weren't filtering out irrelevant stmts in vect_get_peeling_costs_all_drs (unlike related loops in which we iterate over all datarefs). 2017-09-22 Richard Sandiford <richard.sandiford@linaro.org> gcc/ PR tree-optimization/82289 * tree-vect-data-refs.c (vect_get_peeling_costs_all_drs): Check STMT_VINFO_RELEVANT_P. gcc/testsuite/ PR tree-optimization/82289 * gcc.dg/vect/pr82289.c: New test. From-SVN: r253103
2017-09-22 | Let the target choose a vectorisation alignment | Richard Sandiford | 1 | -40/+52
The vectoriser aligned vectors to TYPE_ALIGN unconditionally, although there was also a hard-coded assumption that this was equal to the type size. This was inconvenient for SVE for two reasons: - When compiling for a specific power-of-2 SVE vector length, we might want to align to a full vector. However, the TYPE_ALIGN is governed by the ABI alignment, which is 128 bits regardless of size. - For vector-length-agnostic code it doesn't usually make sense to align, since the runtime vector length might not be a power of two. Even for power of two sizes, there's no guarantee that aligning to the previous 16 bytes will be an improvement. This patch therefore adds a target hook to control the preferred vectoriser (as opposed to ABI) alignment. 2017-09-22 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * target.def (preferred_vector_alignment): New hook. * doc/tm.texi.in (TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): New hook. * doc/tm.texi: Regenerate. * targhooks.h (default_preferred_vector_alignment): Declare. * targhooks.c (default_preferred_vector_alignment): New function. * tree-vectorizer.h (dataref_aux): Add a target_alignment field. Expand commentary. (DR_TARGET_ALIGNMENT): New macro. (aligned_access_p): Update commentary. (vect_known_alignment_in_bytes): New function. * tree-vect-data-refs.c (vect_calculate_required_alignment): New function. (vect_compute_data_ref_alignment): Set DR_TARGET_ALIGNMENT. Calculate the misalignment based on the target alignment rather than the vector size. (vect_update_misalignment_for_peel): Use DR_TARGET_ALIGNMENT rather than TYPE_ALIGN / BITS_PER_UNIT to update the misalignment. (vect_enhance_data_refs_alignment): Mask the byte misalignment with the target alignment, rather than masking the element misalignment with the number of elements in a vector. Also use the target alignment when calculating the maximum number of peels. 
(vect_find_same_alignment_drs): Use vect_calculate_required_alignment instead of TYPE_ALIGN_UNIT. (vect_duplicate_ssa_name_ptr_info): Remove stmt_info parameter. Measure DR_MISALIGNMENT relative to DR_TARGET_ALIGNMENT. (vect_create_addr_base_for_vector_ref): Update call accordingly. (vect_create_data_ref_ptr): Likewise. (vect_setup_realignment): Realign by ANDing with -DR_TARGET_MISALIGNMENT. * tree-vect-loop-manip.c (vect_gen_prolog_loop_niters): Calculate the number of peels based on DR_TARGET_ALIGNMENT. * tree-vect-stmts.c (get_group_load_store_type): Compare the gap with the guaranteed alignment boundary when deciding whether overrun is OK. (vectorizable_mask_load_store): Interpret DR_MISALIGNMENT relative to DR_TARGET_ALIGNMENT instead of TYPE_ALIGN_UNIT. (ensure_base_align): Remove stmt_info parameter. Get the target base alignment from DR_TARGET_ALIGNMENT. (vectorizable_store): Update call accordingly. Interpret DR_MISALIGNMENT relative to DR_TARGET_ALIGNMENT instead of TYPE_ALIGN_UNIT. (vectorizable_load): Likewise. gcc/testsuite/ * gcc.dg/vect/vect-outer-3a.c: Adjust dump scan for new wording of alignment message. * gcc.dg/vect/vect-outer-3a-big-array.c: Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r253101
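The arithmetic described in the entry above (masking a byte misalignment with the target alignment, deriving a peel count from it, and realigning by ANDing with the negated alignment) can be illustrated with a minimal sketch. The function names below are hypothetical and are not GCC's; the sketch assumes a power-of-two target alignment in bytes and a positive step of one element per iteration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: byte misalignment of ADDR relative to
   TARGET_ALIGN, which is assumed to be a power of two (in bytes).  */
static unsigned int
byte_misalignment (uintptr_t addr, unsigned int target_align)
{
  assert (target_align != 0 && (target_align & (target_align - 1)) == 0);
  return (unsigned int) (addr & (target_align - 1));
}

/* Hypothetical helper: number of scalar iterations to peel so that an
   access with byte misalignment MISALIGN becomes aligned, assuming the
   address advances by ELEM_SIZE bytes per iteration.  */
static unsigned int
peel_iterations (unsigned int misalign, unsigned int target_align,
                 unsigned int elem_size)
{
  assert (elem_size != 0 && misalign % elem_size == 0
          && target_align % elem_size == 0);
  /* Masking maps the "already aligned" case (misalign == 0) to 0.  */
  return ((target_align - misalign) & (target_align - 1)) / elem_size;
}

/* Hypothetical helper: round ADDR down to the previous alignment
   boundary, i.e. ADDR & -TARGET_ALIGN.  */
static uintptr_t
align_down (uintptr_t addr, unsigned int target_align)
{
  return addr & ~((uintptr_t) target_align - 1);
}
```

For example, with a 32-byte target alignment and 4-byte elements, an address ending in 0x08 has a byte misalignment of 8 and needs six peeled iterations to reach the next boundary.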
2017-09-22 | Add a vect_get_scalar_dr_size helper function | Richard Sandiford | 1 | -7/+4
This patch adds a helper function for getting the number of bytes accessed by an unvectorised data reference, which helps when general modes have a variable size. 2017-09-22 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * tree-vectorizer.h (vect_get_scalar_dr_size): New function. * tree-vect-data-refs.c (vect_update_misalignment_for_peel): Use it. (vect_enhance_data_refs_alignment): Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r253099
2017-09-18 | Fix an SVE failure in the Fortran matmul* tests | Richard Sandiford | 1 | -0/+5
The vectoriser was calling vect_get_smallest_scalar_type without having proven that the type actually is a scalar. This seems to be the intended behaviour: the ultimate test of whether the type is interesting (and hence scalar) is whether an associated vector type exists, but this is only tested later. The patch simply makes the function cope gracefully with non-scalar inputs. 2017-09-18 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * tree-vect-data-refs.c (vect_get_smallest_scalar_type): Cope with types that aren't in fact scalar. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r252934
2017-09-14 | Add LOOP_VINFO_MAX_VECT_FACTOR | Richard Sandiford | 1 | -1/+1
Epilogue vectorisation uses the vectorisation factor of the main loop as the maximum vectorisation factor allowed for correctness. That makes sense as a conservatively correct value, since the chosen vectorisation factor will be strictly less than that anyway. However, once the VF itself becomes variable, it's easier to carry across the original maximum VF instead. 2017-09-14 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * tree-vectorizer.h (_loop_vec_info): Add max_vectorization_factor. (LOOP_VINFO_MAX_VECT_FACTOR): New macro. (LOOP_VINFO_ORIG_VECT_FACTOR): Replace with... (LOOP_VINFO_ORIG_MAX_VECT_FACTOR): ...this new macro. * tree-vect-data-refs.c (vect_analyze_data_ref_dependences): Update accordingly. * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize max_vectorization_factor. (vect_analyze_loop_2): Set LOOP_VINFO_MAX_VECT_FACTOR. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r252766
2017-09-14 | Add a vect_get_num_copies helper routine | Richard Sandiford | 1 | -3/+6
This patch adds a vectoriser helper routine to calculate how many copies of a vector statement we need. At present this is always: LOOP_VINFO_VECT_FACTOR (loop_vinfo) / TYPE_VECTOR_SUBPARTS (vectype) but later patches add other cases. Another benefit of using a helper routine is that it can assert that the division is exact (which it must be). 2017-09-14 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * tree-vectorizer.h (vect_get_num_copies): New function. * tree-vect-data-refs.c (vect_get_data_access_cost): Use it. * tree-vect-loop.c (vectorizable_reduction): Likewise. (vectorizable_induction): Likewise. (vectorizable_live_operation): Likewise. * tree-vect-stmts.c (vectorizable_mask_load_store): Likewise. (vectorizable_bswap): Likewise. (vectorizable_call): Likewise. (vectorizable_conversion): Likewise. (vectorizable_assignment): Likewise. (vectorizable_shift): Likewise. (vectorizable_operation): Likewise. (vectorizable_store): Likewise. (vectorizable_load): Likewise. (vectorizable_condition): Likewise. (vectorizable_comparison): Likewise. (vect_analyze_stmt): Pass the slp node to vectorizable_live_operation. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r252764
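The calculation named in the entry above is small enough to show directly. The following is only a sketch of the idea with plain integers standing in for LOOP_VINFO_VECT_FACTOR and TYPE_VECTOR_SUBPARTS; the function name and signature are assumptions, not the actual vect_get_num_copies interface.

```c
#include <assert.h>

/* Hypothetical stand-in for the helper: number of vector statement
   copies needed for a vectorisation factor VECT_FACTOR when each
   vector holds VECTOR_SUBPARTS elements.  The division must be exact,
   which the assertion checks.  */
static unsigned int
num_copies (unsigned int vect_factor, unsigned int vector_subparts)
{
  assert (vector_subparts != 0 && vect_factor % vector_subparts == 0);
  return vect_factor / vector_subparts;
}
```

Centralising the division in one place means every caller gets the exactness check without repeating it.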
2017-09-14 | Use vec<> for constant permute masks | Richard Sandiford | 1 | -27/+35
This patch makes can_vec_perm_p & co. take a vec<>, wrapped in new typedefs vec_perm_indices and auto_vec_perm_indices. There are two reasons for doing this for SVE: (1) it means that the number of elements is bundled with the elements themselves, and is obviously constant. (2) it makes it easier to change the "unsigned char" element type to something wider. Changing the target hook is left as follow-on work. 2017-09-14 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * target.h (vec_perm_indices): New typedef. (auto_vec_perm_indices): Likewise. * optabs-query.h: Include target.h (can_vec_perm_p): Take a vec_perm_indices *. * optabs-query.c (can_vec_perm_p): Likewise. (can_mult_highpart_p): Update accordingly. Use auto_vec_perm_indices. * tree-ssa-forwprop.c (simplify_vector_constructor): Likewise. * tree-vect-generic.c (lower_vec_perm): Likewise. * tree-vect-data-refs.c (vect_grouped_store_supported): Likewise. (vect_grouped_load_supported): Likewise. (vect_shift_permute_load_chain): Likewise. (vect_permute_store_chain): Use auto_vec_perm_indices. (vect_permute_load_chain): Likewise. * fold-const.c (fold_vec_perm): Take vec_perm_indices. (fold_ternary_loc): Update accordingly. Use auto_vec_perm_indices. Update uses of can_vec_perm_p. * tree-vect-loop.c (calc_vec_perm_mask_for_shift): Replace the mode with a number of elements. Take a vec_perm_indices *. (vect_create_epilog_for_reduction): Update accordingly. Use auto_vec_perm_indices. (have_whole_vector_shift): Likewise. Update call to can_vec_perm_p. * tree-vect-slp.c (vect_build_slp_tree_1): Likewise. (vect_transform_slp_perm_load): Likewise. (vect_schedule_slp_instance): Use auto_vec_perm_indices. * tree-vectorizer.h (vect_gen_perm_mask_any): Take a vec_perm_indices. (vect_gen_perm_mask_checked): Likewise. * tree-vect-stmts.c (vect_gen_perm_mask_any): Take a vec_perm_indices. (vect_gen_perm_mask_checked): Likewise. 
(vectorizable_mask_load_store): Use auto_vec_perm_indices. (vectorizable_store): Likewise. (vectorizable_load): Likewise. (perm_mask_for_reverse): Likewise. Update call to can_vec_perm_p. (vectorizable_bswap): Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r252761
2017-08-30 | [17/77] Add an int_mode_for_size helper function | Richard Sandiford | 1 | -5/+4
This patch adds a wrapper around mode_for_size for cases in which the mode class is MODE_INT (the commonest case). The return type can then be an opt_scalar_int_mode instead of a machine_mode. 2017-08-30 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * machmode.h (int_mode_for_size): New function. * builtins.c (set_builtin_user_assembler_name): Use int_mode_for_size instead of mode_for_size. * calls.c (save_fixed_argument_area): Likewise. Make use of BLKmode explicit. * combine.c (expand_field_assignment): Use int_mode_for_size instead of mode_for_size. (make_extraction): Likewise. (simplify_shift_const_1): Likewise. (simplify_comparison): Likewise. * dojump.c (do_jump): Likewise. * dwarf2out.c (mem_loc_descriptor): Likewise. * emit-rtl.c (init_derived_machine_modes): Likewise. * expmed.c (flip_storage_order): Likewise. (convert_extracted_bit_field): Likewise. * expr.c (copy_blkmode_from_reg): Likewise. * graphite-isl-ast-to-gimple.c (max_mode_int_precision): Likewise. * internal-fn.c (expand_mul_overflow): Likewise. * lower-subreg.c (simple_move): Likewise. * optabs-libfuncs.c (init_optabs): Likewise. * simplify-rtx.c (simplify_unary_operation_1): Likewise. * tree.c (vector_type_mode): Likewise. * tree-ssa-strlen.c (handle_builtin_memcmp): Likewise. * tree-vect-data-refs.c (vect_lanes_optab_supported_p): Likewise. * tree-vect-generic.c (expand_vector_parallel): Likewise. * tree-vect-stmts.c (vectorizable_load): Likewise. (vectorizable_store): Likewise. gcc/ada/ * gcc-interface/decl.c (gnat_to_gnu_entity): Use int_mode_for_size instead of mode_for_size. (gnat_to_gnu_subprog_type): Likewise. * gcc-interface/utils.c (make_type_from_size): Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r251469
2017-08-04 | Pool alignment information for common bases | Richard Sandiford | 1 | -3/+77
This patch is a follow-on to the fix for PR81136. The testcase for that PR shows that we can (correctly) calculate different base alignments for two data_references but still tell that their misalignments wrt the vector size are equal. This is because we calculate the base alignments for each dr individually, without looking at the other drs, and in general the alignment we calculate is only guaranteed if the dr's DR_REF actually occurs. This is working as designed, but it does expose a missed opportunity. We know that if a vectorised loop is reached, all statements in that loop execute at least once, so it should be safe to pool the alignment information for all the statements we're vectorising. The only catch is that DR_REFs for masked loads and stores only occur if the mask value is nonzero. For example, in: struct s __attribute__((aligned(32))) { int misaligner; int array[N]; }; int *ptr; for (int i = 0; i < n; ++i) ptr[i] = c[i] ? ((struct s *) (ptr - 1))->array[i] : 0; we can only guarantee that ptr points to a "struct s" if at least one c[i] is true. This patch adds a DR_IS_CONDITIONAL_IN_STMT flag to record whether the DR_REF is guaranteed to occur every time that the statement executes to completion. It then pools the alignment information for references that aren't conditional in this sense. 2017-08-04 Richard Sandiford <richard.sandiford@linaro.org> gcc/ PR tree-optimization/81136 * tree-vectorizer.h: Include tree-hash-traits.h. (vec_base_alignments): New typedef. (vec_info): Add a base_alignments field. (vect_record_base_alignments): Declare. * tree-data-ref.h (data_reference): Add an is_conditional_in_stmt field. (DR_IS_CONDITIONAL_IN_STMT): New macro. (create_data_ref): Add an is_conditional_in_stmt argument. * tree-data-ref.c (create_data_ref): Likewise. Use it to initialize the is_conditional_in_stmt field. (data_ref_loc): Add an is_conditional_in_stmt field. (get_references_in_stmt): Set the is_conditional_in_stmt field. 
(find_data_references_in_stmt): Update call to create_data_ref. (graphite_find_data_references_in_stmt): Likewise. * tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Likewise. * tree-vect-data-refs.c (vect_analyze_data_refs): Likewise. (vect_record_base_alignment): New function. (vect_record_base_alignments): Likewise. (vect_compute_data_ref_alignment): Adjust base_addr and aligned_to for nested statements even if we fail to compute a misalignment. Use pooled base alignments for unconditional references. (vect_find_same_alignment_drs): Compare base addresses instead of base objects. (vect_analyze_data_refs_alignment): Call vect_record_base_alignments. * tree-vect-slp.c (vect_slp_analyze_bb_1): Likewise. gcc/testsuite/ PR tree-optimization/81136 * gcc.dg/vect/pr81136.c: Add scan test. From-SVN: r250870
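The pooling idea in the entry above can be sketched as a small table keyed by base address. Everything here is hypothetical (an integer base_id standing in for a base address, a fixed-size array standing in for a hash map); it only illustrates that unconditional references contribute to the pool while conditional ones do not, and it does not try to model which references consult the pool afterwards.

```c
#include <assert.h>

#define MAX_BASES 16

/* Hypothetical pool: best known alignment (in bytes) per base.  */
struct base_pool
{
  int base_id[MAX_BASES];
  unsigned int align[MAX_BASES];
  int count;
};

/* Record ALIGN for BASE_ID.  References that are conditional within
   their statement are skipped, because their alignment guarantee only
   holds if the access actually happens.  */
static void
pool_record (struct base_pool *pool, int base_id, unsigned int align,
             int is_conditional)
{
  if (is_conditional)
    return;
  for (int i = 0; i < pool->count; ++i)
    if (pool->base_id[i] == base_id)
      {
        if (align > pool->align[i])
          pool->align[i] = align;
        return;
      }
  assert (pool->count < MAX_BASES);
  pool->base_id[pool->count] = base_id;
  pool->align[pool->count] = align;
  pool->count++;
}

/* Return the better of OWN_ALIGN and the pooled alignment for BASE_ID.  */
static unsigned int
pool_lookup (const struct base_pool *pool, int base_id,
             unsigned int own_align)
{
  for (int i = 0; i < pool->count; ++i)
    if (pool->base_id[i] == base_id && pool->align[i] > own_align)
      return pool->align[i];
  return own_align;
}
```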
2017-08-04 | Use base inequality for some vector alias checks | Richard Sandiford | 1 | -7/+32
This patch checks whether two data references x and y cannot partially overlap and so are independent whenever &x != &y. We can then use this in the vectoriser to optimise alias checks. gcc/ 2016-08-04 Richard Sandiford <richard.sandiford@linaro.org> * hash-traits.h (pair_hash): New struct. * tree-data-ref.h (data_dependence_relation): Add object_a and object_b fields. (DDR_OBJECT_A, DDR_OBJECT_B): New macros. * tree-data-ref.c (initialize_data_dependence_relation): Initialize DDR_OBJECT_A and DDR_OBJECT_B. * tree-vectorizer.h (vec_object_pair): New type. (_loop_vec_info): Add a check_unequal_addrs field. (LOOP_VINFO_CHECK_UNEQUAL_ADDRS): New macro. (LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Return true if there is an entry in check_unequal_addrs. Check comp_alias_ddrs instead of may_alias_ddrs. * tree-vect-loop.c (destroy_loop_vec_info): Release LOOP_VINFO_CHECK_UNEQUAL_ADDRS. (vect_analyze_loop_2): Likewise, when restarting. (vect_estimate_min_profitable_iters): Estimate the cost of LOOP_VINFO_CHECK_UNEQUAL_ADDRS. * tree-vect-data-refs.c: Include tree-hash-traits.h. (vect_prune_runtime_alias_test_list): Try to handle conflicts using LOOP_VINFO_CHECK_UNEQUAL_ADDRS, if the data dependence allows. Count such tests in the final summary. * tree-vect-loop-manip.c (chain_cond_expr): New function. (vect_create_cond_for_align_checks): Use it. (vect_create_cond_for_unequal_addrs): New function. (vect_loop_versioning): Call it. gcc/testsuite/ * gcc.dg/vect/vect-alias-check-6.c: New test. From-SVN: r250868
2017-08-04 | Handle data dependence relations with different bases | Richard Sandiford | 1 | -4/+107
This patch tries to calculate conservatively-correct distance vectors for two references whose base addresses are not the same. It sets a new flag DDR_COULD_BE_INDEPENDENT_P if the dependence isn't guaranteed to occur. The motivating example is: struct s { int x[8]; }; void f (struct s *a, struct s *b) { for (int i = 0; i < 8; ++i) a->x[i] += b->x[i]; } in which the "a" and "b" accesses are either independent or have a dependence distance of 0 (assuming -fstrict-aliasing). Neither case prevents vectorisation, so we can vectorise without an alias check. I'd originally wanted to do the same thing for arrays as well, e.g.: void f (int a[][8], int b[][8]) { for (int i = 0; i < 8; ++i) a[0][i] += b[0][i]; } I think this is valid because C11 6.7.6.2/6 says: For two array types to be compatible, both shall have compatible element types, and if both size specifiers are present, and are integer constant expressions, then both size specifiers shall have the same constant value. So if we access an array through an int (*)[8], it must have type X[8] or X[], where X is compatible with int. It doesn't seem possible in either case for "a[0]" and "b[0]" to overlap when "a != b". However, as the comment above "if (same_base_p)" explains, GCC is more forgiving: it supports arbitrary overlap of arrays and allows arrays to be accessed with different dimensionality. There are examples of this in PR50067. The patch therefore only handles references that end in a structure field access. There are two ways of handling these dependences in the vectoriser: use them to limit VF, or check at runtime as before. I've gone for the approach of checking at runtime if we can, to avoid limiting VF unnecessarily, but falling back to a VF cap when runtime checks aren't allowed. The patch tests whether we queued an alias check with a dependence distance of X and then picked a VF <= X, in which case it's safe to drop the alias check. 
Since vect_prune_runtime_alias_check_list can be called twice with different VF for the same loop, it's no longer safe to clear may_alias_ddrs on exit. Instead we should use comp_alias_ddrs to check whether versioning is necessary. 2017-08-04 Richard Sandiford <richard.sandiford@linaro.org> gcc/ * tree-data-ref.h (subscript): Add access_fn field. (data_dependence_relation): Add could_be_independent_p. (SUB_ACCESS_FN, DDR_COULD_BE_INDEPENDENT_P): New macros. (same_access_functions): Move to tree-data-ref.c. * tree-data-ref.c (ref_contains_union_access_p): New function. (access_fn_component_p): Likewise. (access_fn_components_comparable_p): Likewise. (dr_analyze_indices): Add a reference to access_fn_component_p. (dump_data_dependence_relation): Use SUB_ACCESS_FN instead of DR_ACCESS_FN. (constant_access_functions): Likewise. (add_other_self_distances): Likewise. (same_access_functions): Likewise. (Moved from tree-data-ref.h.) (initialize_data_dependence_relation): Use XCNEW and remove explicit zeroing of DDR_REVERSED_P. Look for a subsequence of access functions that have the same type. Allow the subsequence to end with different bases in some circumstances. Record the chosen access functions in SUB_ACCESS_FN. (build_classic_dist_vector_1): Replace ddr_a and ddr_b with a_index and b_index. Use SUB_ACCESS_FN instead of DR_ACCESS_FN. (subscript_dependence_tester_1): Likewise dra and drb. (build_classic_dist_vector): Update calls accordingly. (subscript_dependence_tester): Likewise. * tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Check DDR_COULD_BE_INDEPENDENT_P. * tree-vectorizer.h (LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Test comp_alias_ddrs instead of may_alias_ddrs. * tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr): New function. (vect_analyze_data_ref_dependence): Use it if DDR_COULD_BE_INDEPENDENT_P, but fall back to using the recorded distance vectors if that fails. (dependence_distance_ge_vf): New function. 
(vect_prune_runtime_alias_test_list): Use it. Don't clear LOOP_VINFO_MAY_ALIAS_DDRS. gcc/testsuite/ * gcc.dg/vect/vect-alias-check-3.c: New test. * gcc.dg/vect/vect-alias-check-4.c: Likewise. * gcc.dg/vect/vect-alias-check-5.c: Likewise. From-SVN: r250867
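The rule stated in the entry above (an alias check queued for a dependence distance of X can be dropped once the chosen VF is <= X, and a distance of 0 never blocks vectorisation) can be sketched as a simple predicate. This is a simplified illustration with a hypothetical name and a flat array of distances; the real dependence_distance_ge_vf works on GCC's distance vectors and handles details that are not modelled here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical predicate: return true if every recorded dependence
   distance is either 0 or at least VF in magnitude, in which case a
   runtime alias check for this pair would be redundant.  */
static bool
distances_allow_dropping_check (const int *distances, size_t n,
                                unsigned int vf)
{
  for (size_t i = 0; i < n; ++i)
    {
      /* Widen before negating so that INT_MIN is handled safely.  */
      long long d = distances[i];
      if (d < 0)
        d = -d;
      if (d != 0 && d < (long long) vf)
        return false;
    }
  return true;
}
```

With distances {0, 8}, the check can be dropped for VF 4 or 8 but must be kept for VF 16.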
2017-07-21 | re PR tree-optimization/81303 (410.bwaves regression caused by r249919) | Richard Biener | 1 | -18/+14
2017-07-21 Richard Biener <rguenther@suse.de> PR tree-optimization/81303 * tree-vect-data-refs.c (vect_get_peeling_costs_all_drs): Pass in datarefs vector. Allow NULL dr0 for no peeling cost estimate. (vect_peeling_hash_get_lowest_cost): Adjust. (vect_enhance_data_refs_alignment): Likewise. Use vect_get_peeling_costs_all_drs to compute the penalty for no peeling to match up costs. From-SVN: r250424
2017-07-18 | Fix PR81362: Vector peeling | Robin Dapp | 1 | -22/+8
npeel was erroneously overwritten by vect_peeling_hash_get_lowest_cost although the corresponding dataref is not used afterwards. It should be safe to get rid of the npeel parameter since we use the returned peeling_info's npeel anyway. Also removed the body_cost_vec parameter which is not used elsewhere. gcc/ChangeLog: 2017-07-18 Robin Dapp <rdapp@linux.vnet.ibm.com> * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Remove body_cost_vec from _vect_peel_extended_info. (vect_peeling_hash_get_lowest_cost): Do not set body_cost_vec. (vect_peeling_hash_choose_best_peeling): Remove body_cost_vec and npeel. From-SVN: r250300
2017-07-03 | Add a helper for getting the overall alignment of a DR | Richard Sandiford | 1 | -7/+0
This combines the information from previous patches to give a guaranteed alignment for the DR as a whole. This should be a bit safer than using base_element_aligned, since that only really took the base into account (not the init or offset). 2017-07-03 Richard Sandiford <richard.sandiford@linaro.org> gcc/ * tree-data-ref.h (dr_alignment): Declare. * tree-data-ref.c (dr_alignment): New function. * tree-vectorizer.h (dataref_aux): Remove base_element_aligned. * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Don't set it. * tree-vect-stmts.c (vectorizable_store): Use dr_alignment. From-SVN: r249917
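One way to combine the pieces mentioned in the entry above (base alignment and misalignment, init, offset alignment, step alignment) into a single guaranteed alignment is sketched below. The struct and function names are hypothetical and the fields are plain integers in bytes; this is an illustration of the idea under those assumptions, not a copy of dr_alignment.

```c
#include <assert.h>

/* Hypothetical summary of a data reference's alignment facts, all in
   bytes.  Alignments are assumed to be powers of two.  */
struct dr_align_info
{
  unsigned int base_alignment;
  unsigned int base_misalignment;
  unsigned int init;
  unsigned int offset_alignment;
  unsigned int step_alignment;
};

static unsigned int
min_uint (unsigned int a, unsigned int b)
{
  return a < b ? a : b;
}

/* Alignment guaranteed for every address the reference can take.  */
static unsigned int
whole_dr_alignment (const struct dr_align_info *info)
{
  assert (info->base_alignment != 0);
  unsigned int alignment = info->base_alignment;
  unsigned int misalignment = info->base_misalignment + info->init;
  /* A nonzero constant displacement limits the alignment to its lowest
     set bit.  */
  if (misalignment != 0)
    alignment = min_uint (alignment, misalignment & -misalignment);
  alignment = min_uint (alignment, info->offset_alignment);
  alignment = min_uint (alignment, info->step_alignment);
  return alignment;
}
```

Taking the minimum over all components is what makes the result account for the init and offset as well as the base.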
2017-07-03 | Add DR_BASE_ALIGNMENT and DR_BASE_MISALIGNMENT | Richard Sandiford | 1 | -142/+39
This patch records the base alignment and misalignment in innermost_loop_behavior, to avoid the second-guessing that was previously done in vect_compute_data_ref_alignment. It also makes vect_analyze_data_refs use dr_analyze_innermost, instead of having an almost-copy of the same code. I wasn't sure whether the alignments should be measured in bits (for consistency with most other interfaces) or in bytes (for consistency with DR_ALIGNED_TO, now DR_OFFSET_ALIGNMENT, and with *_ptr_info_alignment). I went for bytes because: - I think in practice most consumers are going to want bytes. E.g. using bytes avoids having to mix TYPE_ALIGN and TYPE_ALIGN_UNIT in vect_compute_data_ref_alignment. - It means that any bit-level paranoia is dealt with when building the innermost_loop_behavior and doesn't get pushed down to consumers. 2017-07-03 Richard Sandiford <richard.sandiford@linaro.org> gcc/ * tree-data-ref.h (innermost_loop_behavior): Add base_alignment and base_misalignment fields. (DR_BASE_ALIGNMENT, DR_BASE_MISALIGNMENT): New macros. * tree-data-ref.c: Include builtins.h. (dr_analyze_innermost): Set up the new innermost_loop_behavior fields. * tree-vectorizer.h (STMT_VINFO_DR_BASE_ALIGNMENT): New macro. (STMT_VINFO_DR_BASE_MISALIGNMENT): Likewise. * tree-vect-data-refs.c: Include tree-cfg.h. (vect_compute_data_ref_alignment): Use the new innermost_loop_behavior fields instead of calculating an alignment here. (vect_analyze_data_refs): Use dr_analyze_innermost. Dump the new innermost_loop_behavior fields. From-SVN: r249916
2017-07-03 | Add DR_STEP_ALIGNMENT | Richard Sandiford | 1 | -8/+12
A later patch adds base alignment information to innermost_loop_behavior. After that, the only remaining piece of alignment information that wasn't immediately obvious was the step alignment. Adding that allows a minor simplification to vect_compute_data_ref_alignment, and also potentially improves the handling of variable strides for outer loop vectorisation. A later patch will also use it to give the alignment of the DR as a whole. 2017-07-03 Richard Sandiford <richard.sandiford@linaro.org> gcc/ * tree-data-ref.h (innermost_loop_behavior): Add a step_alignment field. (DR_STEP_ALIGNMENT): New macro. * tree-vectorizer.h (STMT_VINFO_DR_STEP_ALIGNMENT): Likewise. * tree-data-ref.c (dr_analyze_innermost): Initialize step_alignment. (create_data_ref): Print it. * tree-vect-stmts.c (vectorizable_load): Use the step alignment to tell whether the step preserves vector (mis)alignment. * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Likewise. Move the check for an integer step and generalise to all INTEGER_CST. (vect_analyze_data_refs): Set DR_STEP_ALIGNMENT when setting DR_STEP. Print the outer step alignment. From-SVN: r249915
2017-07-03 | Rename DR_ALIGNED_TO to DR_OFFSET_ALIGNMENT | Richard Sandiford | 1 | -9/+7
This patch renames DR_ALIGNED_TO to DR_OFFSET_ALIGNMENT, to avoid confusion with the upcoming DR_BASE_ALIGNMENT. Nothing needed the value as a tree, and the value is clipped to BIGGEST_ALIGNMENT (maybe it should be MAX_OFILE_ALIGNMENT?) so we might as well use an unsigned int instead. 2017-07-03 Richard Sandiford <richard.sandiford@linaro.org> gcc/ * tree-data-ref.h (innermost_loop_behavior): Replace aligned_to with offset_alignment. (DR_ALIGNED_TO): Delete. (DR_OFFSET_ALIGNMENT): New macro. * tree-vectorizer.h (STMT_VINFO_DR_ALIGNED_TO): Delete. (STMT_VINFO_DR_OFFSET_ALIGNMENT): New macro. * tree-data-ref.c (dr_analyze_innermost): Update after above changes. (create_data_ref): Likewise. * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Likewise. (vect_analyze_data_refs): Likewise. * tree-if-conv.c (if_convertible_loop_p_1): Use memset before creating dummy innermost behavior. From-SVN: r249914
2017-07-03 | Use innermost_loop_behavior for outer loop vectorisation | Richard Sandiford | 1 | -70/+42
This patch replaces the individual stmt_vinfo dr_* fields with an innermost_loop_behavior, so that the changes in later patches get picked up automatically. It also adds a helper function for getting the behavior of a data reference wrt the vectorised loop. 2017-07-03 Richard Sandiford <richard.sandiford@linaro.org> gcc/ * tree-vectorizer.h (_stmt_vec_info): Replace individual dr_* fields with dr_wrt_vec_loop. (STMT_VINFO_DR_BASE_ADDRESS, STMT_VINFO_DR_INIT, STMT_VINFO_DR_OFFSET) (STMT_VINFO_DR_STEP, STMT_VINFO_DR_ALIGNED_TO): Update accordingly. (STMT_VINFO_DR_WRT_VEC_LOOP): New macro. (vect_dr_behavior): New function. (vect_create_addr_base_for_vector_ref): Remove loop parameter. * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Use vect_dr_behavior. Use a step_preserves_misalignment_p boolean to track whether the step preserves the misalignment. (vect_create_addr_base_for_vector_ref): Remove loop parameter. Use vect_dr_behavior. (vect_setup_realignment): Update call accordingly. (vect_create_data_ref_ptr): Likewise. Use vect_dr_behavior. * tree-vect-loop-manip.c (vect_gen_prolog_loop_niters): Update call to vect_create_addr_base_for_vector_ref. (vect_create_cond_for_align_checks): Likewise. * tree-vect-patterns.c (vect_recog_bool_pattern): Copy STMT_VINFO_DR_WRT_VEC_LOOP as a block. (vect_recog_mask_conversion_pattern): Likewise. * tree-vect-stmts.c (compare_step_with_zero): Use vect_dr_behavior. (new_stmt_vec_info): Remove redundant zeroing. From-SVN: r249911
2017-07-02 | PR81136: ICE from inconsistent DR_MISALIGNMENTs | Richard Sandiford | 1 | -2/+4
The test case triggered this assert in vect_update_misalignment_for_peel: gcc_assert (DR_MISALIGNMENT (dr) / dr_size == DR_MISALIGNMENT (dr_peel) / dr_peel_size); The problem was that: - one memory reference guaranteed a high base alignment, when considering that reference in isolation. This meant that we could calculate the vector misalignment for its DR at compile time. - the other memory reference only guaranteed a low base alignment, when considering that reference in isolation. We therefore couldn't calculate the vector misalignment for its DR at compile time. - when looking at the values of the two addresses as a pair (rather than the memory references), it was obvious that they had the same misalignment, whatever that misalignment happened to be. This is working as designed, so the patch restricts the assert to cases in which both addresses have a compile-time misalignment. In the test case this looks like a missed opportunity. Both references are unconditional, so it should be possible to use the highest of the available base alignment guarantees when analyzing each reference. A later patch does this, but the problem would still remain for conditional references. 2017-07-02 Richard Sandiford <richard.sandiford@linaro.org> gcc/ PR tree-optimization/81136 * tree-vect-data-refs.c (vect_update_misalignment_for_peel): Only assert that two references with the same misalignment have the same compile-time misalignment if those compile-time misalignments are known. gcc/testsuite/ PR tree-optimization/81136 * gcc.dg/vect/pr81136.c: New test. From-SVN: r249878
2017-06-07 | tree-vect-data-refs.c (vect_mark_for_runtime_alias_test): Factor out code checking if runtime alias check is possible to below ... | Bin Cheng | 1 | -39/+3
* tree-vect-data-refs.c (vect_mark_for_runtime_alias_test): Factor out code checking if runtime alias check is possible to below ... Call the new function. * tree-data-ref.c (runtime_alias_check_p): ... to new function. * tree-data-ref.h (runtime_alias_check_p): New declaration. From-SVN: r248962
2017-05-31 | Alternative check for vector refs with same alignment | Richard Sandiford | 1 | -48/+29
vect_find_same_alignment_drs uses the ddr dependence distance to tell whether two references have the same alignment. Although that's safe with the current code, there's no particular reason why a dependence distance of 0 should mean that the accesses start on the same byte. E.g. a reference to a full complex value could in principle depend on a reference to the imaginary component. A later patch adds support for this kind of dependence. On the other side, checking modulo vf is pessimistic when the step divided by the element size is a factor of 2. This patch instead looks for cases in which the drs have the same base, offset and step, and for which the difference in their constant initial values is a multiple of the alignment. 2017-05-03 Richard Sandiford <richard.sandiford@linaro.org> gcc/ * tree-vect-data-refs.c (vect_find_same_alignment_drs): Remove loop_vinfo argument and use of dependence distance vectors. Check instead whether the two references differ only in their initial value and assume that they have the same alignment if the difference is a multiple of the vector alignment. (vect_analyze_data_refs_alignment): Update call accordingly. gcc/testsuite/ * gcc.dg/vect/vect-103.c: Update wording of dump message. From-SVN: r248730
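The replacement test described in the entry above (same base, offset and step, with the difference in constant initial values a multiple of the alignment) can be sketched as follows. The struct, its integer fields and the function name are hypothetical simplifications; in GCC these quantities are trees and the comparison is done on them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flattened description of a data reference.  BASE_ID
   stands in for the base address; the other fields are in bytes.  */
struct dr_desc
{
  int base_id;
  long offset;
  long step;
  long init;
};

/* Return true if A and B are known to have the same alignment with
   respect to VECT_ALIGN bytes: they must differ only in their constant
   initial value, and by a multiple of the alignment.  */
static bool
same_alignment_p (const struct dr_desc *a, const struct dr_desc *b,
                  long vect_align)
{
  assert (vect_align > 0);
  if (a->base_id != b->base_id
      || a->offset != b->offset
      || a->step != b->step)
    return false;
  return (a->init - b->init) % vect_align == 0;
}
```

Unlike a test on dependence distance, this does not assume that a distance of 0 means the accesses start on the same byte.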
2017-05-30Vector peeling cost model 6/6Robin Dapp1-42/+71
gcc/ChangeLog: 2017-05-24 Robin Dapp <rdapp@linux.vnet.ibm.com> * tree-vect-data-refs.c (vect_get_peeling_costs_all_drs): Introduce unknown_misalignment parameter and remove vf. (vect_peeling_hash_get_lowest_cost): Pass unknown_misalignment parameter. (vect_enhance_data_refs_alignment): Fix unsupportable data ref treatment. From-SVN: r248680
2017-05-30Vector peeling cost model 4/6Robin Dapp1-83/+103
gcc/ChangeLog: 2017-05-30 Robin Dapp <rdapp@linux.vnet.ibm.com> * tree-vect-data-refs.c (vect_get_data_access_cost): Workaround for SLP handling. (vect_enhance_data_refs_alignment): Compute costs for doing no peeling at all, compare to the best peeling costs so far and avoid peeling if cheaper. From-SVN: r248678
2017-05-30Vector peeling cost model 3/6Robin Dapp1-97/+101
gcc/ChangeLog: 2017-05-30 Robin Dapp <rdapp@linux.vnet.ibm.com> * tree-vect-data-refs.c (vect_peeling_hash_choose_best_peeling): Return peeling info and set costs to zero for unlimited cost model. (vect_enhance_data_refs_alignment): Also inspect all datarefs with unknown misalignment. Compute costs for unknown misalignment, compare them to the costs for known misalignment and choose the cheapest for peeling. From-SVN: r248677
2017-05-30Vector peeling cost model 2/6Robin Dapp1-56/+103
gcc/ChangeLog: 2017-05-30 Robin Dapp <rdapp@linux.vnet.ibm.com> * tree-vect-data-refs.c (vect_update_misalignment_for_peel): Rename. (vect_get_peeling_costs_all_drs): Create function. (vect_peeling_hash_get_lowest_cost): Use vect_get_peeling_costs_all_drs. (vect_peeling_supportable): Create function. (vect_enhance_data_refs_alignment): Use vect_peeling_supportable. From-SVN: r248676
2017-05-30Vector peeling cost model 1/6Robin Dapp1-21/+19
gcc/ChangeLog: 2017-05-30 Robin Dapp <rdapp@linux.vnet.ibm.com> * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Create DR_HAS_NEGATIVE_STEP. (vect_update_misalignment_for_peel): Define DR_MISALIGNMENT. (vect_enhance_data_refs_alignment): Use. (vect_duplicate_ssa_name_ptr_info): Use. * tree-vectorizer.h (dr_misalignment): Use. (known_alignment_for_access_p): Use. From-SVN: r248675
2017-05-26tree-vect-data-refs.c (Operator==, [...]): Move from ...Bin Cheng1-232/+2
* tree-vect-data-refs.c (Operator==, comp_dr_with_seg_len_pair): Move from ... * tree-data-ref.c (Operator==, comp_dr_with_seg_len_pair): To here. * tree-vect-data-refs.c (vect_prune_runtime_alias_test_list): Factor out code pruning runtime alias checks. * tree-data-ref.c (prune_runtime_alias_test_list): New function factored out from above. * tree-vectorizer.h (struct dr_with_seg_len, dr_with_seg_len_pair_t): Move from ... * tree-data-ref.h (struct dr_with_seg_len, dr_with_seg_len_pair_t): ... to here. (prune_runtime_alias_test_list): New declaration. From-SVN: r248511
2017-05-26tree-vect-data-refs.c (compare_tree): Rename and move ...Bin Cheng1-97/+29
* tree-vect-data-refs.c (compare_tree): Rename and move ... * tree-data-ref.c (data_ref_compare_tree): ... to here. * tree-data-ref.h (data_ref_compare_tree): New declaration. * tree-vect-data-refs.c (dr_group_sort_cmp): Update uses. (operator==, comp_dr_with_seg_len_pair): Ditto. (vect_prune_runtime_alias_test_list): Ditto. From-SVN: r248510
2017-05-11re PR tree-optimization/80705 (Incorrect code generated for profile counter updates due to SLP+LIM)Richard Biener1-0/+21
2017-05-11 Richard Biener <rguenther@suse.de> PR tree-optimization/80705 * tree-vect-data-refs.c (vect_analyze_data_refs): DECL_NONALIASED bases are not vectorizable. * gcc.dg/vect/bb-slp-pr80705.c: New testcase. From-SVN: r247906
2017-05-03tree-vect-data-refs.c (vect_enhance_data_refs_alignment): When all DRs have unknown misalignment do not always peel when...Richard Biener1-7/+7
2017-05-03 Richard Biener <rguenther@suse.de> * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): When all DRs have unknown misalignment do not always peel when there is a store but apply the same costing model as if there were only loads. * gcc.dg/vect/costmodel/x86_64/costmodel-alignpeel.c: New testcase. From-SVN: r247544
2017-04-03Fix numerous typos in commentsJonathan Wakely1-1/+1
gcc: * alias.c (base_alias_check): Fix typo in comment. * cgraph.h (class ipa_polymorphic_call_context): Likewise. * cgraphunit.c (symbol_table::compile): Likewise. * collect2.c (maybe_run_lto_and_relink): Likewise. * config/arm/arm.c (arm_thumb1_mi_thunk): Likewise. * config/avr/avr-arch.h (avr_arch_info_t): Likewise. * config/avr/avr.c (avr_map_op_t): Likewise. * config/cr16/cr16.h (DATA_ALIGNMENT): Likewise. * config/epiphany/epiphany.c (TARGET_ARG_PARTIAL_BYTES): Likewise. * config/epiphany/epiphany.md (movcc): Likewise. * config/i386/i386.c (legitimize_pe_coff_extern_decl): Likewise. * config/m68k/m68k.c (struct _sched_ib, m68k_sched_variable_issue): Likewise. * config/mips/mips.c (mips_save_restore_reg): Likewise. * config/rx/rx.c (rx_is_restricted_memory_address): Likewise. * config/s390/s390.c (Z10_EARLYLOAD_DISTANCE): Likewise. * config/sh/sh.c (sh_rtx_costs): Likewise. * fold-const.c (fold_truth_andor): Likewise. * genautomata.c (collapse_flag): Likewise. * gengtype.h (struct type::u::s): Likewise. * gensupport.c (has_subst_attribute, add_mnemonic_string): Likewise. * input.c (FORMAT_AMOUNT): Likewise. * ipa-cp.c (class ipcp_lattice, agg_replacements_to_vector) (known_aggs_to_agg_replacement_list): Likewise. * ipa-inline-analysis.c: Likewise. * ipa-inline.h (estimate_edge_time, estimate_edge_hints): Likewise. * ipa-polymorphic-call.c (ipa_polymorphic_call_context::restrict_to_inner_class): Likewise. * loop-unroll.c (analyze_insn_to_expand_var): Likewise. * lra.c (lra_optional_reload_pseudos, lra_subreg_reload_pseudos): Likewise. * modulo-sched.c (apply_reg_moves): Likewise. * omp-expand.c (build_omp_regions_1): Likewise. * trans-mem.c (struct tm_wrapper_hasher): Likewise. * tree-ssa-loop-ivopts.c (may_eliminate_iv): Likewise. * tree-ssa-loop-niter.c (maybe_lower_iteration_bound): Likewise. * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Likewise. * value-prof.c: Likewise. * var-tracking.c (val_reset): Likewise. 
gcc/ada: * doc/gnat_ugn/gnat_and_program_execution.rst: Fix typo. * g-socket.adb (To_Host_Entry): Fix typo in comment. * gnat_ugn.texi: Fix typo. * raise.c (_gnat_builtin_longjmp): Fix capitalization in comment. * s-stposu.adb (Allocate_Any_Controlled): Fix typo in comment. * sem_ch3.adb (Build_Derived_Record_Type): Likewise. * sem_util.adb (Mark_Coextensions): Likewise. * sem_util.ads (Available_Full_View_Of_Component): Likewise. gcc/c: * c-array-notation.c: Fix typo in comment. gcc/c-family: * c-warn.c (do_warn_double_promotion): Fix typo in comment. gcc/cp: * class.c (update_vtable_entry_for_fn): Fix typo in comment. * decl2.c (one_static_initialization_or_destruction): Likewise. * name-lookup.c (store_bindings): Likewise. * parser.c (make_call_declarator): Likewise. * pt.c (check_explicit_specialization): Likewise. gcc/testsuite: * g++.old-deja/g++.benjamin/scope02.C: Fix typo in comment. * gcc.dg/20031012-1.c: Likewise. * gcc.dg/ipa/ipcp-1.c: Likewise. * gcc.dg/torture/matrix-3.c: Likewise. * gcc.target/powerpc/ppc-spe.c: Likewise. * gcc.target/rx/zero-width-bitfield.c: Likewise. libcpp: * include/line-map.h (LINEMAPS_MACRO_MAPS): Fix typo in comment. * lex.c (search_line_fast): Likewise. * pch.h (cpp_valid_state): Likewise. libdecnumber: * decCommon.c (decFloatFromPackedChecked): Fix typo in comment. * decNumber.c (decNumberPower, decMultiplyOp): Likewise. libgcc: * config/c6x/pr-support.c (__gnu_unwind_execute): Fix typo in comment. libitm: * libitm_i.h (struct gtm_thread): Fix typo in comment. From-SVN: r246664
2017-03-27re PR tree-optimization/80170 (SLP vectorization creates aligned access)Richard Biener1-3/+12
2017-03-27 Richard Biener <rguenther@suse.de> PR tree-optimization/80170 * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Make sure DR/SCEV didn't fold in constants we do not see when looking at the reference base alignment. * gcc.dg/pr80170.c: New testcase. From-SVN: r246491
2017-03-14alias.c (struct alias_set_entry): Pack properly.Richard Biener1-1/+1
2017-03-14 Richard Biener <rguenther@suse.de> * alias.c (struct alias_set_entry): Pack properly. * cfgloop.h (struct loop): Likewise. * cse.c (struct set): Likewise. * ipa-utils.c (struct searchc_env): Likewise. * loop-invariant.c (struct invariant): Likewise. * lra-remat.c (struct cand): Likewise. * recog.c (struct change_t): Likewise. * rtl.h (struct address_info): Likewise. * symbol-summary.h (function_summary): Likewise. * tree-loop-distribution.c (struct partition): Likewise. * tree-object-size.c (struct object_size_info): Likewise. * tree-ssa-loop-ivopts.c (struct cost_pair): Likewise. * tree-ssa-threadupdate.c (struct ssa_local_info_t): Likewise. * tree-vect-data-refs.c (struct _vect_peel_info): Likewise. * tree-vect-slp.c (struct _slp_oprnd_info): Likewise. * tree-vect-stmts.c (struct simd_call_arg_info): Likewise. * tree-vectorizer.h (struct _loop_vec_info): Likewise. (struct _stmt_vec_info): Likewise. From-SVN: r246121
2017-01-25re PR target/69264 (ICE building spidermonkey -mcpu=970 -maltivec -O3: rs6000_builtin_vectorization_cost, at config/rs6000/rs6000.c:4350)Richard Biener1-14/+7
2017-01-25 Richard Biener <rguenther@suse.de> PR tree-optimization/69264 * target.def (vector_alignment_reachable): Improve documentation. * doc/tm.texi: Regenerate. * targhooks.c (default_builtin_vector_alignment_reachable): Simplify and add a comment. * tree-vect-data-refs.c (vect_supportable_dr_alignment): Revert earlier changes with respect to TYPE_USER_ALIGN. (vector_alignment_reachable_p): Likewise. Improve dumping. * g++.dg/torture/pr69264.C: New testcase. From-SVN: r244897
2017-01-01Update copyright years.Jakub Jelinek1-1/+1
From-SVN: r243994
2016-12-13re PR tree-optimization/78699 (ICE (segfault) on powerpc64le-linux-gnu (memory-hog))Richard Biener1-1/+3
2016-12-13 Richard Biener <rguenther@suse.de> PR tree-optimization/78699 * tree-vect-data-refs.c (vect_analyze_group_access_1): Limit group size. From-SVN: r243599
2016-11-16Support non-masked epilogue vectorizationYuri Rumyantsev1-3/+9
gcc/ 2016-11-16 Yuri Rumyantsev <ysrumyan@gmail.com> * params.def (PARAM_VECT_EPILOGUES_NOMASK): New. * tree-if-conv.c (tree_if_conversion): Make public. * tree-if-conv.h: New file. * tree-vect-data-refs.c (vect_analyze_data_ref_dependences): Avoid dynamic alias checks for epilogues. * tree-vect-loop-manip.c (vect_do_peeling): Return created epilogue. * tree-vect-loop.c: Include tree-if-conv.h. (new_loop_vec_info): Add zeroing orig_loop_info field. (vect_analyze_loop_2): Don't try to enhance alignment for epilogues. (vect_analyze_loop): Add argument ORIG_LOOP_INFO which is not NULL if epilogue is vectorized, set up orig_loop_info field of loop_vinfo using passed argument. (vect_transform_loop): Check if created epilogue should be returned for further vectorization with less vf. If-convert epilogue if required. Print vectorization success for epilogue. * tree-vectorizer.c (vectorize_loops): Add epilogue vectorization if it is required, pass loop_vinfo produced during vectorization of loop body to vect_analyze_loop. * tree-vectorizer.h (struct _loop_vec_info): Add new field orig_loop_info. (LOOP_VINFO_ORIG_LOOP_INFO): New. (LOOP_VINFO_EPILOGUE_P): New. (LOOP_VINFO_ORIG_VECT_FACTOR): New. (vect_do_peeling): Change prototype to return epilogue. (vect_analyze_loop): Add argument of loop_vec_info type. (vect_transform_loop): Return created loop. gcc/testsuite/ 2016-11-16 Yuri Rumyantsev <ysrumyan@gmail.com> * lib/target-supports.exp (check_avx2_hw_available): New. (check_effective_target_avx2_runtime): New. * gcc.dg/vect/vect-tail-nomask-1.c: New test. From-SVN: r242501
2016-11-09tree-vect-data-refs.c (vect_compute_data_ref_alignment): Look at the DR_BASE_ADDRESS object for forcing alignment.Richard Biener1-6/+3
2016-11-09 Richard Biener <rguenther@suse.de> * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Look at the DR_BASE_ADDRESS object for forcing alignment. From-SVN: r241991
2016-11-08tree-vect-stmts.c (get_group_load_store_type): If the access is aligned do not trigger peeling for gaps.Richard Biener1-0/+13
2016-11-08 Richard Biener <rguenther@suse.de> * tree-vect-stmts.c (get_group_load_store_type): If the access is aligned do not trigger peeling for gaps. * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Do not force alignment of vars with DECL_USER_ALIGN. * gcc.dg/vect/vect-nb-iter-ub-2.c: Adjust. From-SVN: r241959
2016-11-07re PR tree-optimization/78189 (movaps generated for unaligned store in aligned struct, when struct is referenced via unaligned member.)Richard Biener1-3/+18
2016-11-07 Richard Biener <rguenther@suse.de> PR tree-optimization/78189 * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Fix alignment computation. * g++.dg/torture/pr78189.C: New testcase. From-SVN: r241892
2016-10-31tree-vect-data-refs.c (vect_slp_analyze_node_dependences): Skip unnecessary data dependence check after visited store stmt.Bin Cheng1-10/+10
* tree-vect-data-refs.c (vect_slp_analyze_node_dependences): Skip unnecessary data dependence check after visited store stmt. From-SVN: r241696
2016-10-13Move MEMMODEL_* from coretypes.h to memmodel.hThomas Preud'homme1-0/+1
2016-10-13 Thomas Preud'homme <thomas.preudhomme@arm.com> gcc/ * coretypes.h: Move MEMMODEL_* macros and enum memmodel definition into ... * memmodel.h: This file. * alias.c, asan.c, auto-inc-dec.c, bb-reorder.c, bt-load.c, caller-save.c, calls.c, ccmp.c, cfgbuild.c, cfgcleanup.c, cfgexpand.c, cfgloopanal.c, cfgrtl.c, cilk-common.c, combine.c, combine-stack-adj.c, common/config/aarch64/aarch64-common.c, common/config/arm/arm-common.c, common/config/bfin/bfin-common.c, common/config/c6x/c6x-common.c, common/config/i386/i386-common.c, common/config/ia64/ia64-common.c, common/config/nvptx/nvptx-common.c, compare-elim.c, config/aarch64/aarch64-builtins.c, config/aarch64/aarch64-c.c, config/aarch64/cortex-a57-fma-steering.c, config/arc/arc.c, config/arc/arc-c.c, config/arm/arm-builtins.c, config/arm/arm-c.c, config/avr/avr.c, config/avr/avr-c.c, config/avr/avr-log.c, config/bfin/bfin.c, config/c6x/c6x.c, config/cr16/cr16.c, config/cris/cris.c, config/darwin-c.c, config/darwin.c, config/epiphany/epiphany.c, config/epiphany/mode-switch-use.c, config/epiphany/resolve-sw-modes.c, config/fr30/fr30.c, config/frv/frv.c, config/ft32/ft32.c, config/h8300/h8300.c, config/i386/i386-c.c, config/i386/winnt.c, config/iq2000/iq2000.c, config/lm32/lm32.c, config/m32c/m32c.c, config/m32r/m32r.c, config/m68k/m68k.c, config/mcore/mcore.c, config/microblaze/microblaze.c, config/mmix/mmix.c, config/mn10300/mn10300.c, config/moxie/moxie.c, config/msp430/msp430.c, config/nds32/nds32-cost.c, config/nds32/nds32-intrinsic.c, config/nds32/nds32-md-auxiliary.c, config/nds32/nds32-memory-manipulation.c, config/nds32/nds32-predicates.c, config/nds32/nds32.c, config/nios2/nios2.c, config/nvptx/nvptx.c, config/pa/pa.c, config/pdp11/pdp11.c, config/rl78/rl78.c, config/rs6000/rs6000-c.c, config/rx/rx.c, config/s390/s390-c.c, config/s390/s390.c, config/sh/sh.c, config/sh/sh-c.c, config/sh/sh-mem.cc, config/sh/sh_treg_combine.cc, config/sol2.c, config/spu/spu.c, config/stormy16/stormy16.c, 
config/tilegx/tilegx.c, config/tilepro/tilepro.c, config/v850/v850.c, config/vax/vax.c, config/visium/visium.c, config/vms/vms-c.c, config/xtensa/xtensa.c, coverage.c, cppbuiltin.c, cprop.c, cse.c, cselib.c, dbxout.c, dce.c, df-core.c, df-problems.c, df-scan.c, dojump.c, dse.c, dwarf2asm.c, dwarf2cfi.c, dwarf2out.c, emit-rtl.c, except.c, explow.c, expmed.c, expr.c, final.c, fold-const.c, function.c, fwprop.c, gcse.c, ggc-page.c, haifa-sched.c, hsa-brig.c, hsa-gen.c, hw-doloop.c, ifcvt.c, init-regs.c, internal-fn.c, ira-build.c, ira-color.c, ira-conflicts.c, ira-costs.c, ira-emit.c, ira-lives.c, ira.c, jump.c, loop-doloop.c, loop-invariant.c, loop-iv.c, loop-unroll.c, lower-subreg.c, lra.c, lra-assigns.c, lra-coalesce.c, lra-constraints.c, lra-eliminations.c, lra-lives.c, lra-remat.c, lra-spills.c, mode-switching.c, modulo-sched.c, omp-low.c, passes.c, postreload-gcse.c, postreload.c, predict.c, print-rtl-function.c, recog.c, ree.c, reg-stack.c, regcprop.c, reginfo.c, regrename.c, reload.c, reload1.c, reorg.c, resource.c, rtl-chkp.c, rtl-tests.c, rtlanal.c, rtlhooks.c, sched-deps.c, sched-rgn.c, sdbout.c, sel-sched-ir.c, sel-sched.c, shrink-wrap.c, simplify-rtx.c, stack-ptr-mod.c, stmt.c, stor-layout.c, target-globals.c, targhooks.c, toplev.c, tree-nested.c, tree-outof-ssa.c, tree-profile.c, tree-ssa-coalesce.c, tree-ssa-ifcombine.c, tree-ssa-loop-ivopts.c, tree-ssa-loop.c, tree-ssa-reassoc.c, tree-ssa-sccvn.c, tree-vect-data-refs.c, ubsan.c, valtrack.c, var-tracking.c, varasm.c: Include memmodel.h. * genattrtab.c (write_header): Include memmodel.h in generated file. * genautomata.c (main): Likewise. * gengtype.c (open_base_files): Likewise. * genopinit.c (main): Likewise. * genconditions.c (write_header): Include memmodel.h earlier in generated file. * genemit.c (main): Likewise. * genoutput.c (output_prologue): Likewise. * genpeep.c (main): Likewise. * genpreds.c (write_insn_preds_c): Likewise. * genrecog.c (write_header): Likewise. 
* Makefile.in (PLUGIN_HEADERS): Include memmodel.h gcc/ada/ * gcc-interface/utils2.c: Include memmodel.h. gcc/c-family/ * c-cppbuiltin.c: Include memmodel.h. * c-opts.c: Likewise. * c-pragma.c: Likewise. * c-warn.c: Likewise. gcc/c/ * c-typeck.c: Include memmodel.h. gcc/cp/ * decl2.c: Include memmodel.h. * rtti.c: Likewise. gcc/fortran/ * trans-intrinsic.c: Include memmodel.h. gcc/go/ * go-backend.c: Include memmodel.h. libgcc/ * libgcov-profiler.c: Replace MEMMODEL_* macros by their __ATOMIC_* equivalent. * config/tilepro/atomic.c: Likewise and stop casting model to enum memmodel. From-SVN: r241121
2016-10-09tree-ssa.c (target_for_debug_bind, [...]): Use VAR_P and/or ↵Jakub Jelinek1-1/+1
VAR_OR_FUNCTION_DECL_P macros. * tree-ssa.c (target_for_debug_bind, verify_phi_args, ssa_undefined_value_p, maybe_optimize_var): Use VAR_P and/or VAR_OR_FUNCTION_DECL_P macros. * tree-chkp.c (chkp_register_var_initializer, chkp_make_static_bounds, chkp_get_bounds_for_decl_addr, chkp_parse_array_and_component_ref, chkp_find_bounds_1): Likewise. * ipa-polymorphic-call.c (decl_maybe_in_construction_p): Likewise. * hsa-gen.c (get_symbol_for_decl): Likewise. * cgraphunit.c (check_global_declaration, analyze_functions, handle_alias_pairs, thunk_adjust, cgraph_node::expand_thunk): Likewise. * gimple-fold.c (can_refer_decl_in_current_unit_p, canonicalize_constructor_val, gimple_get_virt_method_for_vtable): Likewise. * tree.c (set_decl_section_name, copy_node_stat, need_assembler_name_p, free_lang_data_in_decl, find_decls_types_r, merge_dllimport_decl_attributes, handle_dll_attribute, decl_init_priority_insert, auto_var_in_fn_p, array_at_struct_end_p, verify_type): Likewise. * gimple-ssa-isolate-paths.c (find_implicit_erroneous_behavior, find_explicit_erroneous_behavior): Likewise. * sdbout.c (sdbout_toplevel_data, sdbout_late_global_decl): Likewise. * ipa.c (process_references): Likewise. * tree-chkp-opt.c (chkp_get_check_result): Likewise. * varasm.c (get_block_for_decl, use_blocks_for_decl_p, make_decl_rtl, notice_global_symbol, assemble_variable, mark_decl_referenced, build_constant_desc, output_constant_def_contents, do_assemble_alias, make_decl_one_only, default_section_type_flags, categorize_decl_for_section, default_encode_section_info): Likewise. * trans-mem.c (requires_barrier): Likewise. * gimple-expr.c (mark_addressable): Likewise. * cfgexpand.c (add_scope_conflicts_1, expand_one_var, expand_used_vars_for_block, clear_tree_used, stack_protect_decl_p, expand_debug_expr): Likewise. * tree-dump.c (dequeue_and_dump): Likewise. * ubsan.c (instrument_bool_enum_load): Likewise. * tree-pretty-print.c (print_declaration): Likewise. 
* simplify-rtx.c (delegitimize_mem_from_attrs): Likewise. * tree-ssa-uninit.c (warn_uninitialized_vars): Likewise. * asan.c (asan_protect_global, instrument_derefs): Likewise. * tree-into-ssa.c (rewrite_stmt, maybe_register_def, pass_build_ssa::execute): Likewise. * var-tracking.c (var_debug_decl, track_expr_p): Likewise. * tree-ssa-loop-ivopts.c (force_expr_to_var_cost, split_address_cost): Likewise. * ipa-split.c (test_nonssa_use, consider_split, mark_nonssa_use): Likewise. * tree-inline.c (insert_debug_decl_map, remap_ssa_name, can_be_nonlocal, remap_decls, copy_debug_stmt, initialize_inlined_parameters, add_local_variables, reset_debug_binding, replace_locals_op): Likewise. * dse.c (can_escape): Likewise. * ipa-devirt.c (compare_virtual_tables, referenced_from_vtable_p): Likewise. * tree-diagnostic.c (default_tree_printer): Likewise. * tree-streamer-in.c (unpack_ts_decl_common_value_fields, unpack_ts_decl_with_vis_value_fields, lto_input_ts_decl_common_tree_pointers): Likewise. * builtins.c (builtin_save_expr, fold_builtin_expect, readonly_data_expr): Likewise. * tree-ssa-structalias.c (new_var_info, get_constraint_for_ssa_var, create_variable_info_for, set_uids_in_ptset, visit_loadstore): Likewise. * gimple-streamer-out.c (output_gimple_stmt): Likewise. * gimplify.c (force_constant_size, gimplify_bind_expr, gimplify_decl_expr, gimplify_var_or_parm_decl, gimplify_compound_lval, gimplify_init_constructor, gimplify_modify_expr, gimplify_asm_expr, gimplify_oacc_declare, gimplify_type_sizes): Likewise. * cgraphbuild.c (record_reference, record_type_list, mark_address, mark_load, mark_store, pass_build_cgraph_edges::execute): Likewise. * tree-ssa-live.c (mark_all_vars_used_1, remove_unused_scope_block_p, remove_unused_locals): Likewise. * tree-ssa-alias.c (ptr_deref_may_alias_decl_p, ptrs_compare_unequal, ref_maybe_used_by_call_p_1, call_may_clobber_ref_p_1): Likewise. 
* function.c (instantiate_expr, instantiate_decls_1, setjmp_vars_warning, add_local_decl): Likewise. * alias.c (ao_ref_from_mem, get_alias_set, compare_base_symbol_refs): Likewise. * tree-stdarg.c (find_va_list_reference, va_list_counter_struct_op, va_list_ptr_read, va_list_ptr_write, check_all_va_list_escapes, optimize_va_list_gpr_fpr_size): Likewise. * tree-nrv.c (pass_nrv::execute): Likewise. * tsan.c (instrument_expr): Likewise. * tree-ssa-dce.c (remove_dead_stmt): Likewise. * vtable-verify.c (verify_bb_vtables): Likewise. * tree-dfa.c (ssa_default_def, set_ssa_default_def, get_ref_base_and_extent): Likewise. * toplev.c (wrapup_global_declaration_1, wrapup_global_declaration_2): Likewise. * tree-sra.c (static bool constant_decl_p, find_var_candidates, analyze_all_variable_accesses): Likewise. * tree-nested.c (get_nonlocal_debug_decl, convert_nonlocal_omp_clauses, note_nonlocal_vla_type, note_nonlocal_block_vlas, convert_nonlocal_reference_stmt, get_local_debug_decl, convert_local_omp_clauses, convert_local_reference_stmt, nesting_copy_decl, remap_vla_decls): Likewise. * tree-vect-data-refs.c (vect_can_force_dr_alignment_p): Likewise. * stmt.c (decl_overlaps_hard_reg_set_p): Likewise. * dbxout.c (dbxout_late_global_decl, dbxout_type_fields, dbxout_symbol, dbxout_common_check): Likewise. * expr.c (expand_assignment, expand_expr_real_2, expand_expr_real_1, string_constant): Likewise. * hsa.c (hsa_get_declaration_name): Likewise. * passes.c (rest_of_decl_compilation): Likewise. * tree-ssanames.c (make_ssa_name_fn): Likewise. * tree-streamer-out.c (pack_ts_decl_common_value_fields, pack_ts_decl_with_vis_value_fields, write_ts_decl_common_tree_pointers): Likewise. * stor-layout.c (place_field): Likewise. * symtab.c (symtab_node::maybe_create_reference, symtab_node::verify_base, symtab_node::make_decl_local, symtab_node::copy_visibility_from, symtab_node::can_increase_alignment_p): Likewise. 
* dwarf2out.c (add_var_loc_to_decl, tls_mem_loc_descriptor, decl_by_reference_p, reference_to_unused, rtl_for_decl_location, fortran_common, add_location_or_const_value_attribute, add_scalar_info, add_linkage_name, set_block_abstract_flags, local_function_static, gen_variable_die, dwarf2out_late_global_decl, optimize_one_addr_into_implicit_ptr, optimize_location_into_implicit_ptr): Likewise. * gimple-low.c (record_vars_into): Likewise. * ipa-visibility.c (update_vtable_references): Likewise. * tree-ssa-address.c (fixed_address_object_p, copy_ref_info): Likewise. * lto-streamer-out.c (tree_is_indexable, get_symbol_initial_value, DFS::DFS_write_tree_body, write_symbol): Likewise. * langhooks.c (lhd_warn_unused_global_decl, lhd_set_decl_assembler_name): Likewise. * attribs.c (decl_attributes): Likewise. * except.c (output_ttype): Likewise. * varpool.c (varpool_node::get_create, ctor_for_folding, varpool_node::assemble_decl, varpool_node::create_alias): Likewise. * fold-const.c (fold_unary_loc): Likewise. * ipa-prop.c (ipa_compute_jump_functions_for_edge, ipa_find_agg_cst_from_init): Likewise. * omp-low.c (expand_omp_regimplify_p, expand_omp_taskreg, expand_omp_target, lower_omp_regimplify_p, grid_reg_assignment_to_local_var_p, grid_remap_prebody_decls, find_link_var_op): Likewise. * tree-chrec.c (chrec_contains_symbols): Likewise. * tree-cfg.c (verify_address, verify_expr, verify_expr_location_1, gimple_duplicate_bb, move_stmt_op, replace_block_vars_by_duplicates, execute_fixup_cfg): Likewise. From-SVN: r240900
2016-09-21re PR tree-optimization/77621 (Internal compiler error for mtune=atom + msse2)Richard Biener1-0/+7
2016-09-21 Richard Biener <rguenther@suse.de> Jakub Jelinek <jakub@redhat.com> PR tree-optimization/77621 * tree-vect-data-refs.c (vect_analyze_data_ref_accesses): Split group at non-vectorizable stmts. * gcc.dg/pr77621.c: New testcase. Co-Authored-By: Jakub Jelinek <jakub@redhat.com> From-SVN: r240302
2016-09-16Add inline functions for various bitwise operations.Jason Merrill1-7/+7
* hwint.h (least_bit_hwi, pow2_or_zerop, pow2p_hwi, ctz_or_zero): New. * hwint.c (exact_log2): Use pow2p_hwi. (ctz_hwi, ffs_hwi): Use least_bit_hwi. * alias.c (memrefs_conflict_p): Use pow2_or_zerop. * builtins.c (get_object_alignment_2, get_object_alignment) (get_pointer_alignment, fold_builtin_atomic_always_lock_free): Use least_bit_hwi. * calls.c (compute_argument_addresses, store_one_arg): Use least_bit_hwi. * cfgexpand.c (expand_one_stack_var_at): Use least_bit_hwi. * combine.c (force_to_mode): Use least_bit_hwi. * emit-rtl.c (set_mem_attributes_minus_bitpos, adjust_address_1): Use least_bit_hwi. * expmed.c (synth_mult, expand_divmod): Use ctz_or_zero, ctz_hwi. (init_expmed_one_conv): Use pow2p_hwi. * fold-const.c (round_up_loc, round_down_loc): Use pow2_or_zerop. (fold_binary_loc): Use pow2p_hwi. * function.c (assign_parm_find_stack_rtl): Use least_bit_hwi. * gimple-fold.c (gimple_fold_builtin_memory_op): Use pow2p_hwi. * gimple-ssa-strength-reduction.c (replace_ref): Use least_bit_hwi. * hsa-gen.c (gen_hsa_addr_with_align, hsa_bitmemref_alignment): Use least_bit_hwi. * ipa-cp.c (ipcp_alignment_lattice::meet_with_1): Use least_bit_hwi. * ipa-prop.c (ipa_modify_call_arguments): Use least_bit_hwi. * omp-low.c (oacc_loop_fixed_partitions) (oacc_loop_auto_partitions): Use least_bit_hwi. * rtlanal.c (nonzero_bits1): Use ctz_or_zero. * stor-layout.c (place_field): Use least_bit_hwi. * tree-pretty-print.c (dump_generic_node): Use pow2p_hwi. * tree-sra.c (build_ref_for_offset): Use least_bit_hwi. * tree-ssa-ccp.c (ccp_finalize): Use least_bit_hwi. * tree-ssa-math-opts.c (bswap_replace): Use least_bit_hwi. * tree-ssa-strlen.c (handle_builtin_memcmp): Use pow2p_hwi. * tree-vect-data-refs.c (vect_analyze_group_access_1) (vect_grouped_store_supported, vect_grouped_load_supported) (vect_permute_load_chain, vect_shift_permute_load_chain) (vect_transform_grouped_load): Use pow2p_hwi. * tree-vect-generic.c (expand_vector_divmod): Use ctz_or_zero. 
* tree-vect-patterns.c (vect_recog_divmod_pattern): Use ctz_or_zero. * tree-vect-stmts.c (vectorizable_mask_load_store): Use least_bit_hwi. * tsan.c (instrument_expr): Use least_bit_hwi. * var-tracking.c (negative_power_of_two_p): Use pow2_or_zerop. From-SVN: r240194
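Illustrative sketch only: the bit tricks behind helpers of this kind can be written as below. The semantics are inferred from the names in the ChangeLog above; the actual definitions in hwint.h operate on HOST_WIDE_INT and may differ in detail. The toy_ names are hypothetical and uint64_t stands in for the host-wide integer type.

```c
#include <stdint.h>

/* Isolate the least significant set bit of X (0 if X is 0).
   Negating an unsigned value is well defined and wraps modulo 2^64.  */
static uint64_t
toy_least_bit (uint64_t x)
{
  return x & -x;
}

/* True if X is zero or has exactly one bit set.  */
static int
toy_pow2_or_zerop (uint64_t x)
{
  return toy_least_bit (x) == x;
}

/* True if X is a power of two, i.e. nonzero with exactly one bit set.  */
static int
toy_pow2p (uint64_t x)
{
  return x != 0 && toy_pow2_or_zerop (x);
}
```

For example, a group size of 8 satisfies toy_pow2p while 12 does not, and toy_least_bit (12) yields 4.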
2016-07-27re PR tree-optimization/72517 (436.cactusADM: More than 40% regression in O3 and Ofast on AMD bdver4 m/c.)Richard Biener1-1/+2
2016-07-27 Richard Biener <rguenther@suse.de> PR tree-optimization/72517 * tree-vect-data-refs.c (vect_analyze_data_ref_dependences): Revert change to not compute read-read dependences. From-SVN: r238783
2016-07-13tree-vect-data-refs.c (vect_no_alias_p): New function.Bin Cheng1-5/+85
* tree-vect-data-refs.c (vect_no_alias_p): New function. (vect_prune_runtime_alias_test_list): Call vect_no_alias_p to resolve alias checks which are known at compilation time. Truncate vector LOOP_VINFO_MAY_ALIAS_DDRS(loop_vinfo) if all alias checks are resolved. Move dump info for too many runtime alias checks to here... * tree-vect-loop.c (vect_analyze_loop_2): ...From here. gcc/testsuite * gcc.dg/vect/vect-35-big-array.c: Refine comment and test. * gcc.dg/vect/vect-35.c: Ditto. * gcc.dg/vect/vect-alias-check-2.c: New test. From-SVN: r238301
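Illustrative sketch only (not the GCC source): when both segment starts and lengths are compile-time constants, an alias check of this kind reduces to an interval-overlap test. The function below assumes positive lengths and half-open byte ranges; the name toy_no_alias_p and its signature are hypothetical and do not mirror vect_no_alias_p.

```c
#include <assert.h>

/* Return 1 if the byte ranges [START_A, START_A + LEN_A) and
   [START_B, START_B + LEN_B) are known not to overlap.
   Assumption: both lengths are positive.  */
static int
toy_no_alias_p (long start_a, long len_a, long start_b, long len_b)
{
  assert (len_a > 0 && len_b > 0);
  long end_a = start_a + len_a;
  long end_b = start_b + len_b;
  return end_a <= start_b || end_b <= start_a;
}
```

Pairs for which such a test succeeds need no runtime check, which is why the list of remaining checks can shrink or become empty.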
2016-07-11Convert TYPE_ALIGN_OK to a TYPE_LANG_FLAG.Bernd Edlinger1-2/+2
2016-07-11 Bernd Edlinger <bernd.edlinger@hotmail.de> Convert TYPE_ALIGN_OK to a TYPE_LANG_FLAG. * tree-core.h (tree_base::nothrow_flag): Adjust comment. (tree_type_common::lang_flag_7): New. (tree_type_common::spare): Reduce size. * tree.h (TYPE_ALIGN_OK): Remove. (TYPE_LANG_FLAG_7): New. (get_inner_reference): Adjust header. * print-tree.c (print_node): Adjust. * expr.c (get_inner_reference): Remove parameter keep_aligning. (get_bit_range, expand_assignment, expand_expr_addr_expr_1): Adjust calls to get_inner_reference. (expand_expr_real_1): Adjust call to get_inner_reference. Remove handling of TYPE_ALIGN_OK. * builtins.c (get_object_alignment_2): Adjust call to get_inner_reference. Remove handling of VIEW_CONVERT_EXPR. * emit-rtl.c (set_mem_attributes_minus_bitpos): Remove handling of TYPE_ALIGN_OK. * asan.c (instrument_derefs): Adjust calls to get_inner_reference. * cfgexpand.c (expand_debug_expr): Likewise. * dbxout.c (dbxout_expand_expr): Likewise. * dwarf2out.c (loc_list_for_address_of_addr_expr_of_indirect_ref, loc_list_from_tree, fortran_common): Likewise. * fold-const.c (optimize_bit_field_compare, decode_field_reference, fold_unary_loc, fold_comparison, split_address_to_core_and_offset): Likewise. * gimple-laddress.c (execute): Likewise. * gimple-ssa-strength-reduction.c (slsr_process_ref): Likewise. * gimplify.c (gimplify_scan_omp_clauses): Likewise. * hsa-gen.c (gen_hsa_addr): Likewise. * simplify-rtx.c (delegitimize_mem_from_attrs): Likewise. * tsan.c (instrument_expr): Likewise. * ubsan.c (instrument_bool_enum_load, instrument_object_size): Likewise. * tree.c (verify_type_variant): Remove handling of TYPE_ALIGN_OK. * tree-affine.c (tree_to_aff_combination, get_inner_reference_aff): Adjust calls to get_inner_reference. * tree-data-ref.c (split_constant_offset_1, dr_analyze_innermost): Likewise. * tree-scalar-evolution.c (interpret_rhs_expr): Likewise. * tree-sra.c (ipa_sra_check_caller): Likewise. 
* tree-ssa-loop-ivopts.c (split_address_cost): Likewise. * tree-ssa-math-opts.c (find_bswap_or_nop_load, bswap_replace): Likewise. * tree-vect-data-refs.c (vect_check_gather, vect_analyze_data_refs): Likewise. * config/mips/mips.c (r10k_safe_mem_expr_p): Likewise. * config/pa/pa.c (pa_emit_move_sequence): Remove handling of TYPE_ALIGN_OK. ada: 2016-07-11 Bernd Edlinger <bernd.edlinger@hotmail.de> Convert TYPE_ALIGN_OK to a TYPE_LANG_FLAG. * gcc-interface/ada-tree.h (TYPE_ALIGN_OK): Define. * gcc-interface/trans.c (Attribute_to_gnu): Adjust call to get_inner_reference. * gcc-interface/utils2.c (build_unary_op): Likewise. From-SVN: r238210
2016-07-06re PR tree-optimization/71518 (wrong code at -O3 on x86_64-linux-gnu in 64-bit mode (not in 32-bit mode))Yuri Rumyantsev1-3/+8
gcc/ 2016-07-06 Yuri Rumyantsev <ysrumyan@gmail.com> PR tree-optimization/71518 * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Adjust misalign also for outer loops with negative step. gcc/testsuite/ 2016-07-06 Yuri Rumyantsev <ysrumyan@gmail.com> PR tree-optimization/71518 * gcc.dg/pr71518.c: New test. From-SVN: r238055