author | Richard Sandiford <richard.sandiford@linaro.org> | 2018-01-13 18:01:24 +0000 |
---|---|---|
committer | Richard Sandiford <rsandifo@gcc.gnu.org> | 2018-01-13 18:01:24 +0000 |
commit | b781a135a06fc1805c072778d7513df09a32171d (patch) | |
tree | 43af641081da5b462f6d95a1d23ab6b0f16dd13a /gcc/tree-parloops.c | |
parent | b89fa419ca39b13b5ed0f7a23722b394b3af399e (diff) | |
download | gcc-b781a135a06fc1805c072778d7513df09a32171d.zip gcc-b781a135a06fc1805c072778d7513df09a32171d.tar.gz gcc-b781a135a06fc1805c072778d7513df09a32171d.tar.bz2 |
Add support for in-order addition reduction using SVE FADDA
This patch adds support for in-order floating-point addition reductions,
which are suitable even in strict IEEE mode.
Previously vect_is_simple_reduction would reject any cases that forbid
reassociation. The idea is instead to tentatively accept them as
"FOLD_LEFT_REDUCTIONs" and only fail later if there is no support
for them. Although this patch only handles the particular case of plus
and minus on floating-point types, there's no reason in principle why
we couldn't handle other cases.
The reductions use a new fold_left_plus_optab if available, otherwise
they fall back to elementwise additions or subtractions.
The vect_force_simple_reduction change makes it easier for parloops
to read the type of reduction.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* optabs.def (fold_left_plus_optab): New optab.
* doc/md.texi (fold_left_plus_@var{m}): Document.
* internal-fn.def (IFN_FOLD_LEFT_PLUS): New internal function.
* internal-fn.c (fold_left_direct): Define.
(expand_fold_left_optab_fn): Likewise.
(direct_fold_left_optab_supported_p): Likewise.
* fold-const-call.c (fold_const_fold_left): New function.
(fold_const_call): Use it to fold CFN_FOLD_LEFT_PLUS.
* tree-parloops.c (valid_reduction_p): New function.
(gather_scalar_reductions): Use it.
* tree-vectorizer.h (FOLD_LEFT_REDUCTION): New vect_reduction_type.
(vect_finish_replace_stmt): Declare.
* tree-vect-loop.c (fold_left_reduction_fn): New function.
(needs_fold_left_reduction_p): New function, split out from...
(vect_is_simple_reduction): ...here. Accept reductions that
forbid reassociation, but give them type FOLD_LEFT_REDUCTION.
(vect_force_simple_reduction): Also store the reduction type in
the assignment's STMT_VINFO_REDUC_TYPE.
(vect_model_reduction_cost): Handle FOLD_LEFT_REDUCTION.
(merge_with_identity): New function.
(vect_expand_fold_left): Likewise.
(vectorize_fold_left_reduction): Likewise.
(vectorizable_reduction): Handle FOLD_LEFT_REDUCTION. Leave the
scalar phi in place for it. Check for target support and reject
cases that would reassociate the operation. Defer the transform
phase to vectorize_fold_left_reduction.
* config/aarch64/aarch64.md (UNSPEC_FADDA): New unspec.
* config/aarch64/aarch64-sve.md (fold_left_plus_<mode>): New expander.
(*fold_left_plus_<mode>, *pred_fold_left_plus_<mode>): New insns.
gcc/testsuite/
* gcc.dg/vect/no-fast-math-vect16.c: Expect the test to pass and
check for a message about using in-order reductions.
* gcc.dg/vect/pr79920.c: Expect both loops to be vectorized and
check for a message about using in-order reductions.
* gcc.dg/vect/trapv-vect-reduc-4.c: Expect all three loops to be
vectorized and check for a message about using in-order reductions.
Expect targets with variable-length vectors to fall back to the
fixed-length minimum.
* gcc.dg/vect/vect-reduc-6.c: Expect the loop to be vectorized and
check for a message about using in-order reductions.
* gcc.dg/vect/vect-reduc-in-order-1.c: New test.
* gcc.dg/vect/vect-reduc-in-order-2.c: Likewise.
* gcc.dg/vect/vect-reduc-in-order-3.c: Likewise.
* gcc.dg/vect/vect-reduc-in-order-4.c: Likewise.
* gcc.target/aarch64/sve/reduc_strict_1.c: New test.
* gcc.target/aarch64/sve/reduc_strict_1_run.c: Likewise.
* gcc.target/aarch64/sve/reduc_strict_2.c: Likewise.
* gcc.target/aarch64/sve/reduc_strict_2_run.c: Likewise.
* gcc.target/aarch64/sve/reduc_strict_3.c: Likewise.
* gcc.target/aarch64/sve/slp_13.c: Add floating-point types.
* gfortran.dg/vect/vect-8.f90: Expect 22 loops to be vectorized if
vect_fold_left_plus.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256639
Diffstat (limited to 'gcc/tree-parloops.c')
-rw-r--r-- | gcc/tree-parloops.c | 18 |
1 files changed, 16 insertions, 2 deletions
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index a872f8c..e44ad5e 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -2531,6 +2531,19 @@ set_reduc_phi_uids (reduction_info **slot, void *data ATTRIBUTE_UNUSED)
   return 1;
 }
 
+/* Return true if the type of reduction performed by STMT is suitable
+   for this pass.  */
+
+static bool
+valid_reduction_p (gimple *stmt)
+{
+  /* Parallelization would reassociate the operation, which isn't
+     allowed for in-order reductions.  */
+  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
+  vect_reduction_type reduc_type = STMT_VINFO_REDUC_TYPE (stmt_info);
+  return reduc_type != FOLD_LEFT_REDUCTION;
+}
+
 /* Detect all reductions in the LOOP, insert them into REDUCTION_LIST.  */
 
 static void
@@ -2564,7 +2577,7 @@ gather_scalar_reductions (loop_p loop, reduction_info_table_type *reduction_list
       gimple *reduc_stmt
         = vect_force_simple_reduction (simple_loop_info, phi,
                                        &double_reduc, true);
-      if (!reduc_stmt)
+      if (!reduc_stmt || !valid_reduction_p (reduc_stmt))
         continue;
 
       if (double_reduc)
@@ -2610,7 +2623,8 @@ gather_scalar_reductions (loop_p loop, reduction_info_table_type *reduction_list
               = vect_force_simple_reduction (simple_loop_info, inner_phi,
                                              &double_reduc, true);
           gcc_assert (!double_reduc);
-          if (inner_reduc_stmt == NULL)
+          if (inner_reduc_stmt == NULL
+              || !valid_reduction_p (inner_reduc_stmt))
             continue;
 
           build_new_reduction (reduction_list, double_reduc_stmts[i], phi);