Age | Commit message (Collapse) | Author | Files | Lines |
|
The direct_slp_reduc code in vect_create_epilog_for_reduction was
still assuming that all types involved in a reduction are the same
(up to types_compatible_p), whereas we now support differences in
sign. This was causing an ICE in gcc.dg/vect/pr92324-4.c for SVE.
2019-12-10 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop.c (vect_create_epilog_for_reduction): When
handling direct_slp_reduc, allow the PHI arguments to have
a different type from the vector elements.
From-SVN: r279164
|
|
gcc.dg/vect/vect-cond-reduc-5.c was ICEing for SVE because we
tried to use an extract-last reduction for a chain of COND_EXPRs.
Adding support for the chained case would be too invasive for stage 3
so this patch explicitly forbids it instead. I've filed PR92884 for
the possible future work.
2019-12-10 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop.c (vectorizable_reduction): Don't use
EXTRACT_LAST_REDUCTION for chained reductions.
From-SVN: r279161
|
|
gcc/ChangeLog
2019-12-05 Sudakshina Das <sudi.das@arm.com>
* tree-vect-loop.c (vect_model_reduction_cost): Remove reduction_type
check from if condition.
From-SVN: r279012
|
|
tree-vect-loop.c:4367)
2019-12-02 Richard Biener <rguenther@suse.de>
PR tree-optimization/92742
* tree-vect-loop.c (vect_fixup_reduc_chain): Do not
touch the def-type but verify it is consistent with the
original stmts.
* gcc.dg/torture/pr92742.c: New testcase.
From-SVN: r278896
|
|
When dissolving an SLP-only group of accesses, we should only set
the gap to group_size - 1 for normal non-strided groups.
2019-11-29 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/92677
* tree-vect-loop.c (vect_dissolve_slp_only_groups): Set the gap
to zero when dissolving a group of strided accesses.
gcc/testsuite/
PR tree-optimization/92677
* gcc.dg/vect/pr92677.c: New test.
From-SVN: r278852
|
|
Now that stmt_vec_info records the choice between vector mask
types and normal nonmask types, we can use that information in
vect_get_vector_types_for_stmt instead of deferring the choice
of vector type till later.
vect_get_mask_type_for_stmt used to check whether the boolean inputs
to an operation:
(a) consistently used mask types or consistently used nonmask types; and
(b) agreed on the number of elements.
(b) shouldn't be a problem when (a) is met. If the operation
consistently uses mask types, tree-vect-patterns.c will have corrected
any mismatches in mask precision. (This is because we only use mask
types for a small well-known set of operations and tree-vect-patterns.c
knows how to handle any that could have different mask precisions.)
And if the operation consistently uses normal nonmask types, there's
no reason why booleans should need extra vector compatibility checks
compared to ordinary integers.
So the potential difficulties all seem to come from (a). Now that
we've chosen the result type ahead of time, we also have to consider
whether the outputs and inputs consistently use mask types.
Taking each vectorizable_* routine in turn:
- vectorizable_call
vect_get_vector_types_for_stmt only handled booleans specially
for gassigns, so vect_get_mask_type_for_stmt never had chance to
handle calls. I'm not sure we support any calls that operate on
booleans, but as things stand, a boolean result would always have
a nonmask type. Presumably any vector argument would also need to
use nonmask types, unless it corresponds to internal_fn_mask_index
(which is already a special case).
For safety, I've added a check for mask/nonmask combinations here
even though we didn't check this previously.
- vectorizable_simd_clone_call
Again, vect_get_mask_type_for_stmt never had chance to handle calls.
The result of the call will always be a nonmask type and the patch
for PR 92710 rejects mask arguments. So all booleans should
consistently use nonmask types here.
- vectorizable_conversion
The function already rejects any conversion between booleans in which
one type isn't a mask type.
- vectorizable_operation
This function definitely needs a consistency check, e.g. to handle
& and | in which one operand is loaded from memory and the other is
a comparison result. Ideally we'd handle this via pattern stmts
instead (like we do for the all-mask case), but that's future work.
- vectorizable_assignment
VECT_SCALAR_BOOLEAN_TYPE_P requires single-bit precision, so the
current code already rejects problematic cases.
- vectorizable_load
Loads always produce nonmask types and there are no relevant inputs
to check against.
- vectorizable_store
vect_check_store_rhs already rejects mask/nonmask combinations
via useless_type_conversion_p.
- vectorizable_reduction
- vectorizable_lc_phi
PHIs always have nonmask types. After the change above, attempts
to combine the PHI result with a mask type would be rejected by
vectorizable_operation. (Again, it would be better to handle
this using pattern stmts.)
- vectorizable_induction
We don't generate inductions for booleans.
- vectorizable_shift
The function already rejects boolean shifts via type_has_mode_precision_p.
- vectorizable_condition
The function already rejects mismatches via useless_type_conversion_p.
- vectorizable_comparison
The function already rejects comparisons between mask and nonmask types.
The result is always a mask type.
2019-11-29 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/92596
* tree-vect-stmts.c (vectorizable_call): Punt on hybrid mask/nonmask
operations.
(vectorizable_operation): Likewise, instead of relying on
vect_get_mask_type_for_stmt to do this.
(vect_get_vector_types_for_stmt): Always return a vector type
immediately, rather than deferring the choice for boolean results.
Use a vector mask type instead of a normal vector if
vect_use_mask_type_p.
(vect_get_mask_type_for_stmt): Delete.
* tree-vect-loop.c (vect_determine_vf_for_stmt_1): Remove
mask_producers argument and special boolean_type_node handling.
(vect_determine_vf_for_stmt): Remove mask_producers argument and
update calls to vect_determine_vf_for_stmt_1. Remove doubled call.
(vect_determine_vectorization_factor): Update call accordingly.
* tree-vect-slp.c (vect_build_slp_tree_1): Remove special
boolean_type_node handling.
(vect_slp_analyze_node_operations_1): Likewise.
gcc/testsuite/
PR tree-optimization/92596
* gcc.dg/vect/bb-slp-pr92596.c: New test.
* gcc.dg/vect/bb-slp-43.c: Likewise.
From-SVN: r278851
|
|
gcc.target/aarch64/sve/clastb_[57].c started failing after the increase
in the cost of vec_to_scalar (r278452). The problem is that we were
double-counting the cost of the CLASTB: once in vect_model_reduction_cost
as a vec_to_scalar and once in vectorizable_condition as a plain
vector_stmt.
Based on the TODO above vect_model_reduction_cost, I think the
preferred long-term direction is for vectorizable_* to cost these
things itself, so that's what the patch does (for this one case only).
2019-11-22 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-stmts.c (vect_model_simple_cost): Take an optional
vect_cost_for_stmt.
(vectorizable_condition): Calculate the cost of EXTRACT_LAST_REDUCTION
here rather than...
* tree-vect-loop.c (vect_model_reduction_cost): ...here.
From-SVN: r278611
|
|
2019-11-19 Richard Biener <rguenther@suse.de>
PR tree-optimization/92581
* tree-vect-loop.c (vect_create_epilog_for_reduction): For
condition reduction chains gather all conditions involved
for computing the index reduction vector.
* gcc.dg/vect/vect-cond-reduc-5.c: New testcase.
From-SVN: r278445
|
|
tree-vect-loop.c:4325)
2019-11-19 Richard Biener <rguenther@suse.de>
PR tree-optimization/92554
* tree-vect-loop.c (vect_create_epilog_for_reduction): Look
for the actual condition stmt and deal with sign-changes.
* gcc.dg/vect/pr92554.c: New testcase.
From-SVN: r278431
|
|
2019-09-19 Richard Biener <rguenther@suse.de>
PR tree-optimization/92555
* tree-vect-loop.c (vect_update_vf_for_slp): Also scan PHIs
for non-SLP stmts.
* gcc.dg/vect/pr92555.c: New testcase.
From-SVN: r278430
|
|
-march=znver2 -flto since r278289)
2019-11-18 Richard Biener <rguenther@suse.de>
PR tree-optimization/92558
* tree-vect-loop.c (vect_create_epilog_for_reduction): When
reducting the width of a reduction vector def update new_phis.
* gcc.dg/vect/pr92558.c: New testcase.
From-SVN: r278400
|
|
This patch adds a mode in which the vectoriser tries each available
base vector mode and picks the one with the lowest cost. The new
behaviour is selected by autovectorize_vector_modes.
The patch keeps the current behaviour of preferring a VF of
loop->simdlen over any larger or smaller VF, regardless of costs
or target preferences.
2019-11-16 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* target.h (VECT_COMPARE_COSTS): New constant.
* target.def (autovectorize_vector_modes): Return a bitmask of flags.
* doc/tm.texi: Regenerate.
* targhooks.h (default_autovectorize_vector_modes): Update accordingly.
* targhooks.c (default_autovectorize_vector_modes): Likewise.
* config/aarch64/aarch64.c (aarch64_autovectorize_vector_modes):
Likewise.
* config/arc/arc.c (arc_autovectorize_vector_modes): Likewise.
* config/arm/arm.c (arm_autovectorize_vector_modes): Likewise.
* config/i386/i386.c (ix86_autovectorize_vector_modes): Likewise.
* config/mips/mips.c (mips_autovectorize_vector_modes): Likewise.
* tree-vectorizer.h (_loop_vec_info::vec_outside_cost)
(_loop_vec_info::vec_inside_cost): New member variables.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize them.
(vect_better_loop_vinfo_p, vect_joust_loop_vinfos): New functions.
(vect_analyze_loop): When autovectorize_vector_modes returns
VECT_COMPARE_COSTS, try vectorizing the loop with each available
vector mode and picking the one with the lowest cost.
(vect_estimate_min_profitable_iters): Record the computed costs
in the loop_vec_info.
From-SVN: r278336
|
|
This patch makes can_duplicate_and_interleave_p cope with mixtures of
vector sizes, by using queries based on get_vectype_for_scalar_type
instead of directly querying GET_MODE_SIZE (vinfo->vector_mode).
int_mode_for_size is now the first check we do for a candidate mode,
so it seemed better to restrict it to MAX_FIXED_MODE_SIZE. This avoids
unnecessary work and avoids trying to create scalar types that the
target might not support.
2019-11-16 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (can_duplicate_and_interleave_p): Take an
element type rather than an element mode.
* tree-vect-slp.c (can_duplicate_and_interleave_p): Likewise.
Use get_vectype_for_scalar_type to query the natural types
for a given element type rather than basing everything on
GET_MODE_SIZE (vinfo->vector_mode). Limit int_mode_for_size
query to MAX_FIXED_MODE_SIZE.
(duplicate_and_interleave): Update call accordingly.
* tree-vect-loop.c (vectorizable_reduction): Likewise.
From-SVN: r278335
|
|
2019-11-15 Richard Biener <rguenther@suse.de>
PR tree-optimization/92512
* tree-vect-loop.c (check_reduction_path): Fix operand index
computability check. Add check for second use in COND_EXPRs.
* gcc.dg/torture/pr92512.c: New testcase.
From-SVN: r278293
|
|
internal-fn.c:2890)
2019-11-15 Richard Biener <rguenther@suse.de>
PR tree-optimization/92324
* tree-vect-loop.c (vect_create_epilog_for_reduction): Fix
singedness of SLP reduction epilouge operations. Also reduce
the vector width for SLP reductions before doing elementwise
operations if possible.
* gcc.dg/vect/pr92324-4.c: New testcase.
From-SVN: r278289
|
|
A later patch makes the AArch64 port add four entries to
autovectorize_vector_modes. Each entry describes a different
vector mode assignment for vector code that mixes 8-bit, 16-bit,
32-bit and 64-bit elements. But if (as usual) the vector code has
fewer element sizes than that, we could end up trying the same
combination of vector modes multiple times. This patch adds a
check to prevent that.
2019-11-14 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (vec_info::mode_set): New typedef.
(vec_info::used_vector_mode): New member variable.
(vect_chooses_same_modes_p): Declare.
* tree-vect-stmts.c (get_vectype_for_scalar_type): Record each
chosen vector mode in vec_info::used_vector_mode.
(vect_chooses_same_modes_p): New function.
* tree-vect-loop.c (vect_analyze_loop): Use it to avoid trying
the same vector statements multiple times.
* tree-vect-slp.c (vect_slp_bb_region): Likewise.
From-SVN: r278242
|
|
After previous patches, it's now possible to make the vectoriser
support multiple vector sizes in the same vector region, using
related_vector_mode to pick the right vector mode for a given
element mode. No port yet takes advantage of this, but I have
a follow-on patch for AArch64.
This patch also seemed like a good opportunity to add some more dump
messages: one to make it clear which vector size/mode was being used
when analysis passed or failed, and another to say when we've decided
to skip a redundant vector size/mode.
2019-11-14 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* machmode.h (opt_machine_mode::operator==): New function.
(opt_machine_mode::operator!=): Likewise.
* tree-vectorizer.h (vec_info::vector_mode): Update comment.
(get_related_vectype_for_scalar_type): Delete.
(get_vectype_for_scalar_type_and_size): Declare.
* tree-vect-slp.c (vect_slp_bb_region): Print dump messages to say
whether analysis passed or failed, and with what vector modes.
Use related_vector_mode to check whether trying a particular
vector mode would be redundant with the autodetected mode,
and print a dump message if we decide to skip it.
* tree-vect-loop.c (vect_analyze_loop): Likewise.
(vect_create_epilog_for_reduction): Use
get_related_vectype_for_scalar_type instead of
get_vectype_for_scalar_type_and_size.
* tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Replace
with...
(get_related_vectype_for_scalar_type): ...this new function.
Take a starting/"prevailing" vector mode rather than a vector size.
Take an optional nunits argument, with the same meaning as for
related_vector_mode. Use related_vector_mode when not
auto-detecting a mode, falling back to mode_for_vector if no
target mode exists.
(get_vectype_for_scalar_type): Update accordingly.
(get_same_sized_vectype): Likewise.
* tree-vectorizer.c (get_vec_alignment_for_array_type): Likewise.
From-SVN: r278240
|
|
This patch replaces vec_info::vector_size with vec_info::vector_mode,
but for now continues to use it as a way of specifying a single
vector size. This makes it easier for later patches to use
related_vector_mode instead.
2019-11-14 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (vec_info::vector_size): Replace with...
(vec_info::vector_mode): ...this new field.
* tree-vect-loop.c (vect_update_vf_for_slp): Update accordingly.
(vect_analyze_loop, vect_transform_loop): Likewise.
* tree-vect-loop-manip.c (vect_do_peeling): Likewise.
* tree-vect-slp.c (can_duplicate_and_interleave_p): Likewise.
(vect_make_slp_decision, vect_slp_bb_region): Likewise.
* tree-vect-stmts.c (get_vectype_for_scalar_type): Likewise.
* tree-vectorizer.c (try_vectorize_loop_1): Likewise.
gcc/testsuite/
* gcc.dg/vect/vect-tail-nomask-1.c: Update expected epilogue
vectorization message.
From-SVN: r278237
|
|
This is another patch in the series to remove the assumption that
all modes involved in vectorisation have to be the same size.
Rather than have the target provide a list of vector sizes,
it makes the target provide a list of vector "approaches",
with each approach represented by a mode.
A later patch will pass this mode to targetm.vectorize.related_mode
to get the vector mode for a given element mode. Until then, the modes
simply act as an alternative way of specifying the vector size.
2019-11-14 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* target.h (vector_sizes, auto_vector_sizes): Delete.
(vector_modes, auto_vector_modes): New typedefs.
* target.def (autovectorize_vector_sizes): Replace with...
(autovectorize_vector_modes): ...this new hook.
* doc/tm.texi.in (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES):
Replace with...
(TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): ...this new hook.
* doc/tm.texi: Regenerate.
* targhooks.h (default_autovectorize_vector_sizes): Delete.
(default_autovectorize_vector_modes): New function.
* targhooks.c (default_autovectorize_vector_sizes): Delete.
(default_autovectorize_vector_modes): New function.
* omp-general.c (omp_max_vf): Use autovectorize_vector_modes instead
of autovectorize_vector_sizes. Use the number of units in the mode
to calculate the maximum VF.
* omp-low.c (omp_clause_aligned_alignment): Use
autovectorize_vector_modes instead of autovectorize_vector_sizes.
Use a loop based on related_mode to iterate through all supported
vector modes for a given scalar mode.
* optabs-query.c (can_vec_mask_load_store_p): Use
autovectorize_vector_modes instead of autovectorize_vector_sizes.
* tree-vect-loop.c (vect_analyze_loop, vect_transform_loop): Likewise.
* tree-vect-slp.c (vect_slp_bb_region): Likewise.
* config/aarch64/aarch64.c (aarch64_autovectorize_vector_sizes):
Replace with...
(aarch64_autovectorize_vector_modes): ...this new function.
(TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete.
(TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define.
* config/arc/arc.c (arc_autovectorize_vector_sizes): Replace with...
(arc_autovectorize_vector_modes): ...this new function.
(TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete.
(TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define.
* config/arm/arm.c (arm_autovectorize_vector_sizes): Replace with...
(arm_autovectorize_vector_modes): ...this new function.
(TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete.
(TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define.
* config/i386/i386.c (ix86_autovectorize_vector_sizes): Replace with...
(ix86_autovectorize_vector_modes): ...this new function.
(TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete.
(TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define.
* config/mips/mips.c (mips_autovectorize_vector_sizes): Replace with...
(mips_autovectorize_vector_modes): ...this new function.
(TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete.
(TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define.
From-SVN: r278236
|
|
build_same_sized_truth_vector_type was confusingly named, since for
SVE and AVX512 the returned vector isn't the same byte size (although
it does have the same number of elements). What it really returns
is the "truth" vector type for a given data vector type.
The more general truth_type_for provides the same thing when passed
a vector and IMO has a more descriptive name, so this patch replaces
all uses of build_same_sized_truth_vector_type with that. It does
the same for a call to build_truth_vector_type, leaving truth_type_for
itself as the only remaining caller.
It's then more natural to pass build_truth_vector_type the original
vector type rather than its size and nunits, especially since the
given size isn't the size of the returned vector. This in turn allows
a future patch to simplify the interface of get_mask_mode. Doing this
also fixes a bug in which truth_type_for would pass a size of zero for
BLKmode vector types.
2019-11-14 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree.h (build_truth_vector_type): Delete.
(build_same_sized_truth_vector_type): Likewise.
* tree.c (build_truth_vector_type): Rename to...
(build_truth_vector_type_for): ...this. Make static and take
a vector type as argument.
(truth_type_for): Update accordingly.
(build_same_sized_truth_vector_type): Delete.
* tree-vect-generic.c (expand_vector_divmod): Use truth_type_for
instead of build_same_sized_truth_vector_type.
* tree-vect-loop.c (vect_create_epilog_for_reduction): Likewise.
(vect_record_loop_mask, vect_get_loop_mask): Likewise.
* tree-vect-patterns.c (build_mask_conversion): Likeise.
* tree-vect-slp.c (vect_get_constant_vectors): Likewise.
* tree-vect-stmts.c (vect_get_vec_def_for_operand): Likewise.
(vect_build_gather_load_calls, vectorizable_call): Likewise.
(scan_store_can_perm_p, vectorizable_scan_store): Likewise.
(vectorizable_store, vectorizable_condition): Likewise.
(get_mask_type_for_scalar_type, get_same_sized_vectype): Likewise.
(vect_get_mask_type_for_stmt): Use truth_type_for instead of
build_truth_vector_type.
* config/aarch64/aarch64-sve-builtins.cc (gimple_folder::convert_pred):
Use truth_type_for instead of build_same_sized_truth_vector_type.
* config/rs6000/rs6000-call.c (fold_build_vec_cmp): Likewise.
gcc/c/
* c-typeck.c (build_conditional_expr): Use truth_type_for instead
of build_same_sized_truth_vector_type.
(build_vec_cmp): Likewise.
gcc/cp/
* call.c (build_conditional_expr_1): Use truth_type_for instead
of build_same_sized_truth_vector_type.
* typeck.c (build_vec_cmp): Likewise.
gcc/d/
* d-codegen.cc (build_boolop): Use truth_type_for instead of
build_same_sized_truth_vector_type.
From-SVN: r278232
|
|
Callers of vect_halve_mask_nunits and vect_double_mask_nunits
already know what mode the resulting vector type should have,
so we might as well create the vector type directly with that mode,
just like build_vector_type_for_mode lets us build normal vectors
with a known mode. This avoids the current awkwardness of having
to recompute the mode starting from vec_info::vector_size, which
hard-codes the assumption that all vectors have to be the same size.
A later patch gets rid of build_truth_vector_type and
build_same_sized_truth_vector_type, so the net effect of the
series is to reduce the number of type functions by one.
2019-11-14 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree.h (build_truth_vector_type_for_mode): Declare.
* tree.c (build_truth_vector_type_for_mode): New function,
split out from...
(build_truth_vector_type): ...here.
(build_opaque_vector_type): Fix head comment.
* tree-vectorizer.h (supportable_narrowing_operation): Remove
vec_info parameter.
(vect_halve_mask_nunits): Replace vec_info parameter with the
mode of the new vector.
(vect_double_mask_nunits): Likewise.
* tree-vect-loop.c (vect_halve_mask_nunits): Likewise.
(vect_double_mask_nunits): Likewise.
* tree-vect-loop-manip.c: Include insn-config.h, rtl.h and recog.h.
(vect_maybe_permute_loop_masks): Remove vinfo parameter. Update call
to vect_halve_mask_nunits, getting the required mode from the unpack
patterns.
(vect_set_loop_condition_masked): Update call accordingly.
* tree-vect-stmts.c (supportable_narrowing_operation): Remove vec_info
parameter and update call to vect_double_mask_nunits.
(vectorizable_conversion): Update call accordingly.
(simple_integer_narrowing): Likewise. Remove vec_info parameter.
(vectorizable_call): Update call accordingly.
(supportable_widening_operation): Update call to
vect_halve_mask_nunits.
* config/aarch64/aarch64-sve-builtins.cc (register_builtin_types):
Use build_truth_vector_type_mode instead of build_truth_vector_type.
From-SVN: r278231
|
|
We didn't take the cost of generating loop masks into account, and so
tended to underestimate the cost of loops that need multiple masks.
2019-11-13 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop.c (vect_estimate_min_profitable_iters): Include
the cost of generating loop masks.
gcc/testsuite/
* gcc.target/aarch64/sve/mask_struct_store_3.c: Add
-fno-vect-cost-model.
* gcc.target/aarch64/sve/mask_struct_store_3_run.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_2.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_2_run.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_3.c: Likewise.
* gcc.target/aarch64/sve/peel_ind_3_run.c: Likewise.
From-SVN: r278125
|
|
vect_analyze_loop_costing uses two profitability thresholds: a runtime
one and a static compile-time one. The runtime one is simply the point
at which the vector loop is cheaper than the scalar loop, while the
static one also takes into account the cost of choosing between the
scalar and vector loops at runtime. We compare this static cost against
the expected execution frequency to decide whether it's worth generating
any vector code at all.
However, we never reclaimed the cost of applying the runtime threshold
if it turned out that the vector code can always be used. And we only
know whether that's true once we've calculated what the runtime
threshold would be.
2019-11-13 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (vect_apply_runtime_profitability_check_p):
New function.
* tree-vect-loop-manip.c (vect_loop_versioning): Use it.
* tree-vect-loop.c (vect_analyze_loop_2): Likewise.
(vect_transform_loop): Likewise.
(vect_analyze_loop_costing): Don't take the cost of versioning
into account for the static profitability threshold if it turns
out that no versioning is needed.
From-SVN: r278124
|
|
vectorizable_assignment handles true SSA-to-SSA copies (which hopefully
we don't see in practice) and no-op conversions that are required
to maintain correct gimple, such as changes between signed and
unsigned types. These cases shouldn't generate any code and so
shouldn't count against either the scalar or vector costs.
Later patches test this, but it seemed worth splitting out.
2019-11-13 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (vect_nop_conversion_p): Declare.
* tree-vect-stmts.c (vect_nop_conversion_p): New function.
(vectorizable_assignment): Don't add a cost for nop conversions.
* tree-vect-loop.c (vect_compute_single_scalar_iteration_cost):
Likewise.
* tree-vect-slp.c (vect_bb_slp_scalar_cost): Likewise.
From-SVN: r278122
|
|
2019-11-13 Richard Biener <rguenther@suse.de>
PR tree-optimization/92473
* tree-vect-loop.c (vect_create_epilog_for_reduction): Perform
direct optab reduction in the correct type.
From-SVN: r278113
|
|
operand for statement))
2019-11-12 Richard Biener <rguenther@suse.de>
PR tree-optimization/92461
* tree-vect-loop.c (vect_create_epilog_for_reduction): Update
stmt after propagation.
* gcc.dg/torture/pr92461.c: New testcase.
From-SVN: r278093
|
|
2019-11-12 Martin Liska <mliska@suse.cz>
* Makefile.in: Remove PARAMS_H and params.list
and params.options.
* params-enum.h: Remove.
* params-list.h: Remove.
* params-options.h: Remove.
* params.c: Remove.
* params.def: Remove.
* params.h: Remove.
* asan.c: Do not include params.h.
* auto-profile.c: Likewise.
* bb-reorder.c: Likewise.
* builtins.c: Likewise.
* cfgcleanup.c: Likewise.
* cfgexpand.c: Likewise.
* cfgloopanal.c: Likewise.
* cgraph.c: Likewise.
* combine.c: Likewise.
* common/config/aarch64/aarch64-common.c: Likewise.
* common/config/gcn/gcn-common.c: Likewise.
* common/config/ia64/ia64-common.c: Likewise.
* common/config/powerpcspe/powerpcspe-common.c: Likewise.
* common/config/rs6000/rs6000-common.c: Likewise.
* common/config/sh/sh-common.c: Likewise.
* config/aarch64/aarch64.c: Likewise.
* config/alpha/alpha.c: Likewise.
* config/arm/arm.c: Likewise.
* config/avr/avr.c: Likewise.
* config/csky/csky.c: Likewise.
* config/i386/i386-builtins.c: Likewise.
* config/i386/i386-expand.c: Likewise.
* config/i386/i386-features.c: Likewise.
* config/i386/i386-options.c: Likewise.
* config/i386/i386.c: Likewise.
* config/ia64/ia64.c: Likewise.
* config/rs6000/rs6000-logue.c: Likewise.
* config/rs6000/rs6000.c: Likewise.
* config/s390/s390.c: Likewise.
* config/sparc/sparc.c: Likewise.
* config/visium/visium.c: Likewise.
* coverage.c: Likewise.
* cprop.c: Likewise.
* cse.c: Likewise.
* cselib.c: Likewise.
* dse.c: Likewise.
* emit-rtl.c: Likewise.
* explow.c: Likewise.
* final.c: Likewise.
* fold-const.c: Likewise.
* gcc.c: Likewise.
* gcse.c: Likewise.
* ggc-common.c: Likewise.
* ggc-page.c: Likewise.
* gimple-loop-interchange.cc: Likewise.
* gimple-loop-jam.c: Likewise.
* gimple-loop-versioning.cc: Likewise.
* gimple-ssa-split-paths.c: Likewise.
* gimple-ssa-sprintf.c: Likewise.
* gimple-ssa-store-merging.c: Likewise.
* gimple-ssa-strength-reduction.c: Likewise.
* gimple-ssa-warn-alloca.c: Likewise.
* gimple-ssa-warn-restrict.c: Likewise.
* graphite-isl-ast-to-gimple.c: Likewise.
* graphite-optimize-isl.c: Likewise.
* graphite-scop-detection.c: Likewise.
* graphite-sese-to-poly.c: Likewise.
* graphite.c: Likewise.
* haifa-sched.c: Likewise.
* hsa-gen.c: Likewise.
* ifcvt.c: Likewise.
* ipa-cp.c: Likewise.
* ipa-fnsummary.c: Likewise.
* ipa-inline-analysis.c: Likewise.
* ipa-inline.c: Likewise.
* ipa-polymorphic-call.c: Likewise.
* ipa-profile.c: Likewise.
* ipa-prop.c: Likewise.
* ipa-split.c: Likewise.
* ipa-sra.c: Likewise.
* ira-build.c: Likewise.
* ira-conflicts.c: Likewise.
* loop-doloop.c: Likewise.
* loop-invariant.c: Likewise.
* loop-unroll.c: Likewise.
* lra-assigns.c: Likewise.
* lra-constraints.c: Likewise.
* modulo-sched.c: Likewise.
* opt-suggestions.c: Likewise.
* opts.c: Likewise.
* postreload-gcse.c: Likewise.
* predict.c: Likewise.
* reload.c: Likewise.
* reorg.c: Likewise.
* resource.c: Likewise.
* sanopt.c: Likewise.
* sched-deps.c: Likewise.
* sched-ebb.c: Likewise.
* sched-rgn.c: Likewise.
* sel-sched-ir.c: Likewise.
* sel-sched.c: Likewise.
* shrink-wrap.c: Likewise.
* stmt.c: Likewise.
* targhooks.c: Likewise.
* toplev.c: Likewise.
* tracer.c: Likewise.
* trans-mem.c: Likewise.
* tree-chrec.c: Likewise.
* tree-data-ref.c: Likewise.
* tree-if-conv.c: Likewise.
* tree-inline.c: Likewise.
* tree-loop-distribution.c: Likewise.
* tree-parloops.c: Likewise.
* tree-predcom.c: Likewise.
* tree-profile.c: Likewise.
* tree-scalar-evolution.c: Likewise.
* tree-sra.c: Likewise.
* tree-ssa-ccp.c: Likewise.
* tree-ssa-dom.c: Likewise.
* tree-ssa-dse.c: Likewise.
* tree-ssa-ifcombine.c: Likewise.
* tree-ssa-loop-ch.c: Likewise.
* tree-ssa-loop-im.c: Likewise.
* tree-ssa-loop-ivcanon.c: Likewise.
* tree-ssa-loop-ivopts.c: Likewise.
* tree-ssa-loop-manip.c: Likewise.
* tree-ssa-loop-niter.c: Likewise.
* tree-ssa-loop-prefetch.c: Likewise.
* tree-ssa-loop-unswitch.c: Likewise.
* tree-ssa-math-opts.c: Likewise.
* tree-ssa-phiopt.c: Likewise.
* tree-ssa-pre.c: Likewise.
* tree-ssa-reassoc.c: Likewise.
* tree-ssa-sccvn.c: Likewise.
* tree-ssa-scopedtables.c: Likewise.
* tree-ssa-sink.c: Likewise.
* tree-ssa-strlen.c: Likewise.
* tree-ssa-structalias.c: Likewise.
* tree-ssa-tail-merge.c: Likewise.
* tree-ssa-threadbackward.c: Likewise.
* tree-ssa-threadedge.c: Likewise.
* tree-ssa-uninit.c: Likewise.
* tree-switch-conversion.c: Likewise.
* tree-vect-data-refs.c: Likewise.
* tree-vect-loop.c: Likewise.
* tree-vect-slp.c: Likewise.
* tree-vrp.c: Likewise.
* tree.c: Likewise.
* value-prof.c: Likewise.
* var-tracking.c: Likewise.
2019-11-12 Martin Liska <mliska@suse.cz>
* gimple-parser.c: Do not include params.h.
2019-11-12 Martin Liska <mliska@suse.cz>
* name-lookup.c: Do not include params.h.
* typeck.c: Likewise.
2019-11-12 Martin Liska <mliska@suse.cz>
* lto-common.c: Do not include params.h.
* lto-partition.c: Likewise.
* lto.c: Likewise.
From-SVN: r278086
|
|
2019-11-12 Martin Liska <mliska@suse.cz>
* asan.c (asan_sanitize_stack_p): Replace old parameter syntax
with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET
macro.
(asan_sanitize_allocas_p): Likewise.
(asan_emit_stack_protection): Likewise.
(asan_protect_global): Likewise.
(instrument_derefs): Likewise.
(instrument_builtin_call): Likewise.
(asan_expand_mark_ifn): Likewise.
* auto-profile.c (auto_profile): Likewise.
* bb-reorder.c (copy_bb_p): Likewise.
(duplicate_computed_gotos): Likewise.
* builtins.c (inline_expand_builtin_string_cmp): Likewise.
* cfgcleanup.c (try_crossjump_to_edge): Likewise.
(try_crossjump_bb): Likewise.
* cfgexpand.c (defer_stack_allocation): Likewise.
(stack_protect_classify_type): Likewise.
(pass_expand::execute): Likewise.
* cfgloopanal.c (expected_loop_iterations_unbounded): Likewise.
(estimate_reg_pressure_cost): Likewise.
* cgraph.c (cgraph_edge::maybe_hot_p): Likewise.
* combine.c (combine_instructions): Likewise.
(record_value_for_reg): Likewise.
* common/config/aarch64/aarch64-common.c (aarch64_option_validate_param): Likewise.
(aarch64_option_default_params): Likewise.
* common/config/ia64/ia64-common.c (ia64_option_default_params): Likewise.
* common/config/powerpcspe/powerpcspe-common.c (rs6000_option_default_params): Likewise.
* common/config/rs6000/rs6000-common.c (rs6000_option_default_params): Likewise.
* common/config/sh/sh-common.c (sh_option_default_params): Likewise.
* config/aarch64/aarch64.c (aarch64_output_probe_stack_range): Likewise.
(aarch64_allocate_and_probe_stack_space): Likewise.
(aarch64_expand_epilogue): Likewise.
(aarch64_override_options_internal): Likewise.
* config/alpha/alpha.c (alpha_option_override): Likewise.
* config/arm/arm.c (arm_option_override): Likewise.
(arm_valid_target_attribute_p): Likewise.
* config/i386/i386-options.c (ix86_option_override_internal): Likewise.
* config/i386/i386.c (get_probe_interval): Likewise.
(ix86_adjust_stack_and_probe_stack_clash): Likewise.
(ix86_max_noce_ifcvt_seq_cost): Likewise.
* config/ia64/ia64.c (ia64_adjust_cost): Likewise.
* config/rs6000/rs6000-logue.c (get_stack_clash_protection_probe_interval): Likewise.
(get_stack_clash_protection_guard_size): Likewise.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Likewise.
* config/s390/s390.c (allocate_stack_space): Likewise.
(s390_emit_prologue): Likewise.
(s390_option_override_internal): Likewise.
* config/sparc/sparc.c (sparc_option_override): Likewise.
* config/visium/visium.c (visium_option_override): Likewise.
* coverage.c (get_coverage_counts): Likewise.
(coverage_compute_profile_id): Likewise.
(coverage_begin_function): Likewise.
(coverage_end_function): Likewise.
* cse.c (cse_find_path): Likewise.
(cse_extended_basic_block): Likewise.
(cse_main): Likewise.
* cselib.c (cselib_invalidate_mem): Likewise.
* dse.c (dse_step1): Likewise.
* emit-rtl.c (set_new_first_and_last_insn): Likewise.
(get_max_insn_count): Likewise.
(make_debug_insn_raw): Likewise.
(init_emit): Likewise.
* explow.c (compute_stack_clash_protection_loop_data): Likewise.
* final.c (compute_alignments): Likewise.
* fold-const.c (fold_range_test): Likewise.
(fold_truth_andor): Likewise.
(tree_single_nonnegative_warnv_p): Likewise.
(integer_valued_real_single_p): Likewise.
* gcse.c (want_to_gcse_p): Likewise.
(prune_insertions_deletions): Likewise.
(hoist_code): Likewise.
(gcse_or_cprop_is_too_expensive): Likewise.
* ggc-common.c: Likewise.
* ggc-page.c (ggc_collect): Likewise.
* gimple-loop-interchange.cc (MAX_NUM_STMT): Likewise.
(MAX_DATAREFS): Likewise.
(OUTER_STRIDE_RATIO): Likewise.
* gimple-loop-jam.c (tree_loop_unroll_and_jam): Likewise.
* gimple-loop-versioning.cc (loop_versioning::max_insns_for_loop): Likewise.
* gimple-ssa-split-paths.c (is_feasible_trace): Likewise.
* gimple-ssa-store-merging.c (imm_store_chain_info::try_coalesce_bswap): Likewise.
(imm_store_chain_info::coalesce_immediate_stores): Likewise.
(imm_store_chain_info::output_merged_store): Likewise.
(pass_store_merging::process_store): Likewise.
* gimple-ssa-strength-reduction.c (find_basis_for_base_expr): Likewise.
* graphite-isl-ast-to-gimple.c (class translate_isl_ast_to_gimple): Likewise.
(scop_to_isl_ast): Likewise.
* graphite-optimize-isl.c (get_schedule_for_node_st): Likewise.
(optimize_isl): Likewise.
* graphite-scop-detection.c (build_scops): Likewise.
* haifa-sched.c (set_modulo_params): Likewise.
(rank_for_schedule): Likewise.
(model_add_to_worklist): Likewise.
(model_promote_insn): Likewise.
(model_choose_insn): Likewise.
(queue_to_ready): Likewise.
(autopref_multipass_dfa_lookahead_guard): Likewise.
(schedule_block): Likewise.
(sched_init): Likewise.
* hsa-gen.c (init_prologue): Likewise.
* ifcvt.c (bb_ok_for_noce_convert_multiple_sets): Likewise.
(cond_move_process_if_block): Likewise.
* ipa-cp.c (ipcp_lattice::add_value): Likewise.
(merge_agg_lats_step): Likewise.
(devirtualization_time_bonus): Likewise.
(hint_time_bonus): Likewise.
(incorporate_penalties): Likewise.
(good_cloning_opportunity_p): Likewise.
(ipcp_propagate_stage): Likewise.
* ipa-fnsummary.c (decompose_param_expr): Likewise.
(set_switch_stmt_execution_predicate): Likewise.
(analyze_function_body): Likewise.
(compute_fn_summary): Likewise.
* ipa-inline-analysis.c (estimate_growth): Likewise.
* ipa-inline.c (caller_growth_limits): Likewise.
(inline_insns_single): Likewise.
(inline_insns_auto): Likewise.
(can_inline_edge_by_limits_p): Likewise.
(want_early_inline_function_p): Likewise.
(big_speedup_p): Likewise.
(want_inline_small_function_p): Likewise.
(want_inline_self_recursive_call_p): Likewise.
(edge_badness): Likewise.
(recursive_inlining): Likewise.
(compute_max_insns): Likewise.
(early_inliner): Likewise.
* ipa-polymorphic-call.c (csftc_abort_walking_p): Likewise.
* ipa-profile.c (ipa_profile): Likewise.
* ipa-prop.c (determine_known_aggregate_parts): Likewise.
(ipa_analyze_node): Likewise.
(ipcp_transform_function): Likewise.
* ipa-split.c (consider_split): Likewise.
* ipa-sra.c (allocate_access): Likewise.
(process_scan_results): Likewise.
(ipa_sra_summarize_function): Likewise.
(pull_accesses_from_callee): Likewise.
* ira-build.c (loop_compare_func): Likewise.
(mark_loops_for_removal): Likewise.
* ira-conflicts.c (build_conflict_bit_table): Likewise.
* loop-doloop.c (doloop_optimize): Likewise.
* loop-invariant.c (gain_for_invariant): Likewise.
(move_loop_invariants): Likewise.
* loop-unroll.c (decide_unroll_constant_iterations): Likewise.
(decide_unroll_runtime_iterations): Likewise.
(decide_unroll_stupid): Likewise.
(expand_var_during_unrolling): Likewise.
* lra-assigns.c (spill_for): Likewise.
* lra-constraints.c (EBB_PROBABILITY_CUTOFF): Likewise.
* modulo-sched.c (sms_schedule): Likewise.
(DFA_HISTORY): Likewise.
* opts.c (default_options_optimization): Likewise.
(finish_options): Likewise.
(common_handle_option): Likewise.
* postreload-gcse.c (eliminate_partially_redundant_load): Likewise.
(if): Likewise.
* predict.c (get_hot_bb_threshold): Likewise.
(maybe_hot_count_p): Likewise.
(probably_never_executed): Likewise.
(predictable_edge_p): Likewise.
(predict_loops): Likewise.
(expr_expected_value_1): Likewise.
(tree_predict_by_opcode): Likewise.
(handle_missing_profiles): Likewise.
* reload.c (find_equiv_reg): Likewise.
* reorg.c (redundant_insn): Likewise.
* resource.c (mark_target_live_regs): Likewise.
(incr_ticks_for_insn): Likewise.
* sanopt.c (pass_sanopt::execute): Likewise.
* sched-deps.c (sched_analyze_1): Likewise.
(sched_analyze_2): Likewise.
(sched_analyze_insn): Likewise.
(deps_analyze_insn): Likewise.
* sched-ebb.c (schedule_ebbs): Likewise.
* sched-rgn.c (find_single_block_region): Likewise.
(too_large): Likewise.
(haifa_find_rgns): Likewise.
(extend_rgns): Likewise.
(new_ready): Likewise.
(schedule_region): Likewise.
(sched_rgn_init): Likewise.
* sel-sched-ir.c (make_region_from_loop): Likewise.
* sel-sched-ir.h (MAX_WS): Likewise.
* sel-sched.c (process_pipelined_exprs): Likewise.
(sel_setup_region_sched_flags): Likewise.
* shrink-wrap.c (try_shrink_wrapping): Likewise.
* targhooks.c (default_max_noce_ifcvt_seq_cost): Likewise.
* toplev.c (print_version): Likewise.
(process_options): Likewise.
* tracer.c (tail_duplicate): Likewise.
* trans-mem.c (tm_log_add): Likewise.
* tree-chrec.c (chrec_fold_plus_1): Likewise.
* tree-data-ref.c (split_constant_offset): Likewise.
(compute_all_dependences): Likewise.
* tree-if-conv.c (MAX_PHI_ARG_NUM): Likewise.
* tree-inline.c (remap_gimple_stmt): Likewise.
* tree-loop-distribution.c (MAX_DATAREFS_NUM): Likewise.
* tree-parloops.c (MIN_PER_THREAD): Likewise.
(create_parallel_loop): Likewise.
* tree-predcom.c (determine_unroll_factor): Likewise.
* tree-scalar-evolution.c (instantiate_scev_r): Likewise.
* tree-sra.c (analyze_all_variable_accesses): Likewise.
* tree-ssa-ccp.c (fold_builtin_alloca_with_align): Likewise.
* tree-ssa-dse.c (setup_live_bytes_from_ref): Likewise.
(dse_optimize_redundant_stores): Likewise.
(dse_classify_store): Likewise.
* tree-ssa-ifcombine.c (ifcombine_ifandif): Likewise.
* tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise.
* tree-ssa-loop-im.c (LIM_EXPENSIVE): Likewise.
* tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Likewise.
(try_peel_loop): Likewise.
(tree_unroll_loops_completely): Likewise.
* tree-ssa-loop-ivopts.c (avg_loop_niter): Likewise.
(CONSIDER_ALL_CANDIDATES_BOUND): Likewise.
(MAX_CONSIDERED_GROUPS): Likewise.
(ALWAYS_PRUNE_CAND_SET_BOUND): Likewise.
* tree-ssa-loop-manip.c (can_unroll_loop_p): Likewise.
* tree-ssa-loop-niter.c (MAX_ITERATIONS_TO_TRACK): Likewise.
* tree-ssa-loop-prefetch.c (PREFETCH_BLOCK): Likewise.
(L1_CACHE_SIZE_BYTES): Likewise.
(L2_CACHE_SIZE_BYTES): Likewise.
(should_issue_prefetch_p): Likewise.
(schedule_prefetches): Likewise.
(determine_unroll_factor): Likewise.
(volume_of_references): Likewise.
(add_subscript_strides): Likewise.
(self_reuse_distance): Likewise.
(mem_ref_count_reasonable_p): Likewise.
(insn_to_prefetch_ratio_too_small_p): Likewise.
(loop_prefetch_arrays): Likewise.
(tree_ssa_prefetch_arrays): Likewise.
* tree-ssa-loop-unswitch.c (tree_unswitch_single_loop): Likewise.
* tree-ssa-math-opts.c (gimple_expand_builtin_pow): Likewise.
(convert_mult_to_fma): Likewise.
(math_opts_dom_walker::after_dom_children): Likewise.
* tree-ssa-phiopt.c (cond_if_else_store_replacement): Likewise.
(hoist_adjacent_loads): Likewise.
(gate_hoist_loads): Likewise.
* tree-ssa-pre.c (translate_vuse_through_block): Likewise.
(compute_partial_antic_aux): Likewise.
* tree-ssa-reassoc.c (get_reassociation_width): Likewise.
* tree-ssa-sccvn.c (vn_reference_lookup_pieces): Likewise.
(vn_reference_lookup): Likewise.
(do_rpo_vn): Likewise.
* tree-ssa-scopedtables.c (avail_exprs_stack::lookup_avail_expr): Likewise.
* tree-ssa-sink.c (select_best_block): Likewise.
* tree-ssa-strlen.c (new_stridx): Likewise.
(new_addr_stridx): Likewise.
(get_range_strlen_dynamic): Likewise.
(class ssa_name_limit_t): Likewise.
* tree-ssa-structalias.c (push_fields_onto_fieldstack): Likewise.
(create_variable_info_for_1): Likewise.
(init_alias_vars): Likewise.
* tree-ssa-tail-merge.c (find_clusters_1): Likewise.
(tail_merge_optimize): Likewise.
* tree-ssa-threadbackward.c (thread_jumps::profitable_jump_thread_path): Likewise.
(thread_jumps::fsm_find_control_statement_thread_paths): Likewise.
(thread_jumps::find_jump_threads_backwards): Likewise.
* tree-ssa-threadedge.c (record_temporary_equivalences_from_stmts_at_dest): Likewise.
* tree-ssa-uninit.c (compute_control_dep_chain): Likewise.
* tree-switch-conversion.c (switch_conversion::check_range): Likewise.
(jump_table_cluster::can_be_handled): Likewise.
* tree-switch-conversion.h (jump_table_cluster::case_values_threshold): Likewise.
(SWITCH_CONVERSION_BRANCH_RATIO): Likewise.
(param_switch_conversion_branch_ratio): Likewise.
* tree-vect-data-refs.c (vect_mark_for_runtime_alias_test): Likewise.
(vect_enhance_data_refs_alignment): Likewise.
(vect_prune_runtime_alias_test_list): Likewise.
* tree-vect-loop.c (vect_analyze_loop_costing): Likewise.
(vect_get_datarefs_in_loop): Likewise.
(vect_analyze_loop): Likewise.
* tree-vect-slp.c (vect_slp_bb): Likewise.
* tree-vectorizer.h: Likewise.
* tree-vrp.c (find_switch_asserts): Likewise.
(vrp_prop::check_mem_ref): Likewise.
* tree.c (wide_int_to_tree_1): Likewise.
(cache_integer_cst): Likewise.
* var-tracking.c (EXPR_USE_DEPTH): Likewise.
(reverse_op): Likewise.
(vt_find_locations): Likewise.
2019-11-12 Martin Liska <mliska@suse.cz>
* gimple-parser.c (c_parser_parse_gimple_body): Replace old parameter syntax
with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET
macro.
2019-11-12 Martin Liska <mliska@suse.cz>
* name-lookup.c (namespace_hints::namespace_hints): Replace old parameter syntax
with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET
macro.
* typeck.c (comptypes): Likewise.
2019-11-12 Martin Liska <mliska@suse.cz>
* lto-partition.c (lto_balanced_map): Replace old parameter syntax
with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET
macro.
* lto.c (do_whole_program_analysis): Likewise.
From-SVN: r278085
|
|
tree-vect-stmts.c:1537)
2019-11-11 Andre Vieira <andre.simoesdiasvieira@arm.com>
PR tree-optimization/92347
* tree-vect-loop.c (vect_transform_loop): Don't overwrite epilogues
safelen with 0.
* gcc.dg/vect/pr92347.c: New test.
From-SVN: r278079
|
|
With the new reduction vectype handling, neutral_op_for_slp_reduction
needs to know whether the caller is using STMT_VINFO_REDUC_VECTYPE
(for an epilogue value) or STMT_VINFO_VECTYPE (for a PHI argument).
This fixes various gcc.target/aarch64/sve/slp_* tests.
2019-11-08 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop.c (neutral_op_for_slp_reduction): Take the
vector type as an argument rather than reading it from the
stmt_vec_info.
(vect_create_epilog_for_reduction): Update accordingly.
(vectorizable_reduction): Likewise.
(vect_transform_cycle_phi): Likewise.
From-SVN: r277977
|
|
gcc/ChangeLog:
2019-11-08 Andre Vieira <andre.simoesdiasvieira@arm.com>
* tree-vect-loop.c (vect_analyze_loop): Disable epilogue vectorization
for loops with SIMDUID set. Enable epilogue vectorization for loops
with SIMDLEN set after finding a main loop with a VF that matches it.
From-SVN: r277964
|
|
internal-fn.c:2890)
2019-11-08 Richard Biener <rguenther@suse.de>
PR tree-optimization/92324
* tree-vect-loop.c (vect_create_epilog_for_reduction): Use
STMT_VINFO_REDUC_VECTYPE for all computations, inserting
sign-conversions as necessary.
(vectorizable_reduction): Reject conversions in the chain
that are not sign-conversions, base analysis on a non-converting
stmt and its operation sign. Set STMT_VINFO_REDUC_VECTYPE.
* tree-vect-stmts.c (vect_stmt_relevant_p): Don't dump anything
for debug stmts.
* tree-vectorizer.h (_stmt_vec_info::reduc_vectype): New.
(STMT_VINFO_REDUC_VECTYPE): Likewise.
* gcc.dg/vect/pr92205.c: XFAIL.
* gcc.dg/vect/pr92324-1.c: New testcase.
* gcc.dg/vect/pr92324-2.c: Likewise.
From-SVN: r277955
|
|
tree-vect-stmts.c:1683)
2019-11-07 Richard Biener <rguenther@suse.de>
PR tree-optimization/92405
* tree-vect-loop.c (vectorizable_reduction): Appropriately
restrict lane-reducing ops to single stmt chains.
From-SVN: r277921
|
|
With a later patch I saw a case in which we peeled a single iteration
for gaps but didn't need to peel further iterations to make up a full
vector. We then tried to vectorise the single-iteration epilogue.
2019-11-06 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop.c (vect_analyze_loop): Only try to vectorize
the epilogue if there are peeled iterations for it to handle.
From-SVN: r277886
|
|
2019-11-06 Richard Biener <rguenther@suse.de>
* tree-vect-loop.c (vectorizable_reduction): Remember reduction
PHI. Use STMT_VINFO_REDUC_IDX to skip the reduction operand.
Simplify single_defuse_cycle condition.
From-SVN: r277882
|
|
The number of iterations of an epilogue loop is always smaller than the
VF of the main loop. vect_analyze_loop_costing was taking this into
account when deciding whether the loop is cheap enough to vectorise,
but that has no effect with the unlimited cost model. We need to use
a separate check for correctness as well.
This can happen if the sizes returned by autovectorize_vector_sizes
happen to be out of order, e.g. because the target prefers smaller
vectors. It can also happen with later patches if two vectorisation
attempts happen to end up with the same VF.
2019-11-06 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop.c (vect_analyze_loop_2): When vectorizing an
epilogue loop, make sure that the VF is small enough or that
the epilogue loop can be fully-masked.
From-SVN: r277880
|
|
Once vect_analyze_loop has found a valid loop_vec_info X, we carry
on searching for alternatives if (1) X doesn't satisfy simdlen or
(2) we want to vectorize the epilogue of X. I have a patch that
optionally adds a third reason: we want to see if there are cheaper
alternatives to X.
This patch restructures vect_analyze_loop so that it's easier
to add more reasons for continuing. There's supposed to be no
behavioural change.
If we wanted to, we could allow vectorisation of epilogues once
loop->simdlen has been reached by changing "loop->simdlen" to
"simdlen" in the new vect_epilogues condition. That should be
a separate change though.
2019-11-06 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop.c (vect_analyze_loop): Break out of the main
loop when we've finished, rather than returning directly from
the loop. Use a local variable to track whether we're still
searching for the preferred simdlen. Make vect_epilogues
record whether the next iteration should try to treat the
loop as an epilogue.
From-SVN: r277879
|
|
tree-vect-loop.c:4106)
2019-11-05 Richard Biener <rguenther@suse.de>
PR tree-optimization/92371
* tree-vect-loop.c (vectorizable_reduction): Set STMT_VINFO_REDUC_DEF
on the original stmt of live stmts in the chain.
(vectorizable_live_operation): Look at the original stmt when
checking STMT_VINFO_REDUC_DEF.
* gcc.dg/torture/pr92371.c: New testcase.
From-SVN: r277850
|
|
internal-fn.c:2890)
2019-11-05 Richard Biener <rguenther@suse.de>
PR tree-optimization/92324
* tree-vect-loop.c (check_reduction_path): For MIN/MAX require
all signed or unsigned operations.
* gcc.dg/vect/pr92324-3.c: New testcase.
From-SVN: r277822
|
|
gcc/ChangeLog:
2019-11-04 Andre Vieira <andre.simoesdiasvieira@arm.com>
* tree-vect-loop.c (vect_analyze_loop): Remove orig_loop_vinfo
parameter.
* tree-vectorizer.h (vect_analyze_loop): Update declaration.
* tree-vectorizer.c (try_vectorize_loop_1): Update calls to
vect_analyze_loop.
From-SVN: r277785
|
|
vl_embed>::space (vect_get_and_check_slp_defs))
2019-11-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/92345
* tree-vect-loop.c (vect_is_simple_reduction): Return whether
we produced a reduction chain.
(vect_analyze_scalar_cycles_1): Do not add reduction chains to
LOOP_VINFO_REDUCTIONS.
* gcc.dg/torture/pr92345.c: New testcase.
From-SVN: r277782
|
|
2019-10-30 Richard Biener <rguenther@suse.de>
PR tree-optimization/65930
* tree-vect-loop.c (vect_is_simple_reduction): For reduction
chains also allow a leading and trailing conversion.
* tree-vect-slp.c (vect_get_and_check_slp_defs): Handle
intermediate reduction chains.
(vect_analyze_slp_instance): Likewise. Build a SLP
node for a trailing conversion manually.
* gcc.dg/vect/pr65930-2.c: New testcase.
From-SVN: r277603
|
|
gcc/ChangeLog:
2019-10-29 Andre Vieira <andre.simoesdiasvieira@arm.com>
PR 88915
* tree-ssa-loop-niter.h (simplify_replace_tree): Change declaration.
* tree-ssa-loop-niter.c (simplify_replace_tree): Add context parameter
and make the valueize function pointer also take a void pointer.
* gcc/tree-ssa-sccvn.c (vn_valueize_wrapper): New function to wrap
around vn_valueize, to call it without a context.
(process_bb): Use vn_valueize_wrapper instead of vn_valueize.
* tree-vect-loop.c (_loop_vec_info): Initialize epilogue_vinfos.
(~_loop_vec_info): Release epilogue_vinfos.
(vect_analyze_loop_costing): Use knowledge of main VF to estimate
number of iterations of epilogue.
(vect_analyze_loop_2): Adapt to analyse main loop for all supported
vector sizes when vect-epilogues-nomask=1. Also keep track of lowest
versioning threshold needed for main loop.
(vect_analyze_loop): Likewise.
(find_in_mapping): New helper function.
(update_epilogue_loop_vinfo): New function.
(vect_transform_loop): When vectorizing epilogues re-use analysis done
on main loop and call update_epilogue_loop_vinfo to update it.
* tree-vect-loop-manip.c (vect_update_inits_of_drs): No longer insert
stmts on loop preheader edge.
(vect_do_peeling): Enable skip-vectors when doing loop versioning if
we decided to vectorize epilogues. Update epilogues NITERS and
construct ADVANCE to update epilogues data references where needed.
* tree-vectorizer.h (_loop_vec_info): Add epilogue_vinfos.
(vect_do_peeling, vect_update_inits_of_drs,
determine_peel_for_niter, vect_analyze_loop): Add or update
declarations.
* tree-vectorizer.c (try_vectorize_loop_1): Make sure to use already
created loop_vec_info's for epilogues when available. Otherwise analyse
epilogue separately.
From-SVN: r277569
|
|
2019-10-29 Richard Biener <rguenther@suse.de>
PR tree-optimization/65930
* tree-vect-loop.c (check_reduction_path): Relax single-use
check allowing out-of-loop uses.
(vect_is_simple_reduction): SLP reduction chains cannot have
intermediate stmts used outside of the loop.
(vect_create_epilog_for_reduction): The adjustment might need
to be converted.
(vectorizable_reduction): Annotate live stmts of the reduction
chain with STMT_VINFO_REDUC_DEF.
* tree-vect-stms.c (process_use): Remove no longer true asserts.
* gcc.dg/vect/pr65930-1.c: New testcase.
From-SVN: r277566
|
|
tree-vect-patterns.c:5175)
2019-10-28 Richard Biener <rguenther@suse.de>
PR tree-optimization/92241
* tree-vect-loop.c (vect_fixup_scalar_cycles_with_patterns): When
we failed to update the reduction index do not use the pattern
stmts for the reduction chain.
(vectorizable_reduction): When the reduction chain is corrupt,
fail.
* tree-vect-patterns.c (vect_mark_pattern_stmts): Stop when we
fail to update the reduction chain.
* gcc.dg/torture/pr92241.c: New testcase.
From-SVN: r277516
|
|
STMT_VINFO_REDUC_IDX from the actual stmt.
2019-10-28 Richard Biener <rguenther@suse.de>
* tree-vect-loop.c (vect_create_epilog_for_reduction): Use
STMT_VINFO_REDUC_IDX from the actual stmt.
(vect_transform_reduction): Likewise.
(vectorizable_reduction): Compute the reduction chain length,
do not recompute the reduction operand index. Remove no longer
necessary restriction for condition reduction chains.
From-SVN: r277513
|
|
Now that vectorizable_operation vectorises most loop stmts involved
in a reduction, it needs to be aware of reductions in fully-masked loops.
The LOOP_VINFO_CAN_FULLY_MASK_P parts of vectorizable_reduction now only
apply to cases that use vect_transform_reduction.
This new way of doing things is definitely an improvement for SVE though,
since it means we can lift the old restriction of not using fully-masked
loops for reduction chains.
2019-10-25 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop.c (vectorizable_reduction): Restrict the
LOOP_VINFO_CAN_FULLY_MASK_P handling to cases that will be
handled by vect_transform_reduction. Allow fully-masked loops
to be used with reduction chains.
* tree-vect-stmts.c (vectorizable_operation): Handle reduction
operations in fully-masked loops.
(vectorizable_condition): Reject EXTRACT_LAST_REDUCTION
operations in fully-masked loops.
gcc/testsuite/
* gcc.dg/vect/pr65947-1.c: No longer expect doubled dump lines
for FOLD_EXTRACT_LAST reductions.
* gcc.dg/vect/pr65947-2.c: Likewise.
* gcc.dg/vect/pr65947-3.c: Likewise.
* gcc.dg/vect/pr65947-4.c: Likewise.
* gcc.dg/vect/pr65947-5.c: Likewise.
* gcc.dg/vect/pr65947-6.c: Likewise.
* gcc.dg/vect/pr65947-9.c: Likewise.
* gcc.dg/vect/pr65947-10.c: Likewise.
* gcc.dg/vect/pr65947-12.c: Likewise.
* gcc.dg/vect/pr65947-13.c: Likewise.
* gcc.dg/vect/pr65947-14.c: Likewise.
* gcc.dg/vect/pr80631-1.c: Likewise.
* gcc.dg/vect/pr80631-2.c: Likewise.
* gcc.dg/vect/vect-cond-reduc-3.c: Likewise.
* gcc.dg/vect/vect-cond-reduc-4.c: Likewise.
From-SVN: r277438
|
|
the to be vectorized stmts is set up correctly.
2019-10-25 Richard Biener <rguenther@suse.de>
* tree-vect-loop.c (vectorizable_reduction): Verify
STMT_VINFO_REDUC_IDX on the to be vectorized stmts is set up
correctly.
* tree-vect-patterns.c (vect_mark_pattern_stmts): Transfer
STMT_VINFO_REDUC_IDX from the original stmts to the pattern
stmts.
From-SVN: r277437
|
|
tree-vect-stmts.c:1688 since r277322)
2019-10-24 Richard Biener <rguenther@suse.de>
PR tree-optimization/92205
* tree-vect-loop.c (vectorizable_reduction): Restrict
search for alternate vectype_in to lane-reducing patterns
we support.
* gcc.dg/vect/pr92205.c: New testcase.
From-SVN: r277375
|
|
2019-10-23 Richard Biener <rguenther@suse.de>
PR tree-optimization/65930
* tree-vect-loop.c (check_reduction_path): Allow conversions
that only change the sign.
(vectorizable_reduction): Relax latch def stmts we handle further.
* gcc.dg/vect/vect-reduc-2char-big-array.c: Adjust.
* gcc.dg/vect/vect-reduc-2char.c: Likewise.
* gcc.dg/vect/vect-reduc-2short.c: Likewise.
* gcc.dg/vect/vect-reduc-dot-s8b.c: Likewise.
* gcc.dg/vect/vect-reduc-pattern-2c.c: Likewise.
From-SVN: r277322
|