diff options
author | Richard Biener <rguenther@suse.de> | 2022-10-06 13:56:09 +0200 |
---|---|---|
committer | Richard Biener <rguenther@suse.de> | 2022-10-17 12:36:35 +0200 |
commit | 46a8e017d048ec3271bbb898942e3b166c4e8ff3 (patch) | |
tree | e6b1a35d4364b763d6ca815781a84d61dd46dff2 /gcc/tree-vectorizer.h | |
parent | acdb24166d13d87c374e578d2ad5d58249171930 (diff) | |
download | gcc-46a8e017d048ec3271bbb898942e3b166c4e8ff3.zip gcc-46a8e017d048ec3271bbb898942e3b166c4e8ff3.tar.gz gcc-46a8e017d048ec3271bbb898942e3b166c4e8ff3.tar.bz2 |
Vectorization of first-order recurrences
The following picks up the prototype by Ju-Zhe Zhong for vectorizing
first order recurrences. That solves two TSVC missed optimization PRs.
There's a new scalar cycle def kind, vect_first_order_recurrence
and it's handling of the backedge value vectorization is complicated
by the fact that the vectorized value isn't the PHI but instead
a (series of) permute(s) shifting in the recurring value from the
previous iteration. I've implemented this by creating both the
single vectorized PHI and the series of permutes when vectorizing
the scalar PHI but leave the backedge values in both unassigned.
The backedge values are (for the testcases) computed by a load
which is also the place after which the permutes are inserted.
That placement also restricts the cases we can handle (without
resorting to code motion).
I added both costing and SLP handling though SLP handling is
restricted to the case where a single vectorized PHI is enough.
Missing is epilogue handling - while prologue peeling would
be handled transparently by adjusting iv_phi_p the epilogue
case doesn't work with just inserting a scalar LC PHI since
that a) keeps the scalar load live and b) that loads is the
wrong one, it has to be the last, much like when we'd vectorize
the LC PHI as live operation. Unfortunately LIVE
compute/analysis happens too early before we decide on
peeling. When using fully masked loop vectorization the
vect-recurr-6.c works as expected though.
I have tested this on x86_64 for now, but since epilogue
handling is missing there's probably no practical cases.
My prototype WHILE_ULT AVX512 patch can handle vect-recurr-6.c
just fine but I didn't feel like running SPEC within SDE nor
is the WHILE_ULT patch complete enough.
PR tree-optimization/99409
PR tree-optimization/99394
* tree-vectorizer.h (vect_def_type::vect_first_order_recurrence): Add.
(stmt_vec_info_type::recurr_info_type): Likewise.
(vectorizable_recurr): New function.
* tree-vect-loop.cc (vect_phi_first_order_recurrence_p): New
function.
(vect_analyze_scalar_cycles_1): Look for first order
recurrences.
(vect_analyze_loop_operations): Handle them.
(vect_transform_loop): Likewise.
(vectorizable_recurr): New function.
(maybe_set_vectorized_backedge_value): Handle the backedge value
setting in the first order recurrence PHI and the permutes.
* tree-vect-stmts.cc (vect_analyze_stmt): Handle first order
recurrences.
(vect_transform_stmt): Likewise.
(vect_is_simple_use): Likewise.
(vect_is_simple_use): Likewise.
* tree-vect-slp.cc (vect_get_and_check_slp_defs): Likewise.
(vect_build_slp_tree_2): Likewise.
(vect_schedule_scc): Handle the backedge value setting in the
first order recurrence PHI and the permutes.
* gcc.dg/vect/vect-recurr-1.c: New testcase.
* gcc.dg/vect/vect-recurr-2.c: Likewise.
* gcc.dg/vect/vect-recurr-3.c: Likewise.
* gcc.dg/vect/vect-recurr-4.c: Likewise.
* gcc.dg/vect/vect-recurr-5.c: Likewise.
* gcc.dg/vect/vect-recurr-6.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s252.c: Un-XFAIL.
* gcc.dg/vect/tsvc/vect-tsvc-s254.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s291.c: Likewise.
Co-authored-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
Diffstat (limited to 'gcc/tree-vectorizer.h')
-rw-r--r-- | gcc/tree-vectorizer.h | 4 |
1 files changed, 4 insertions, 0 deletions
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index 4870c75..016961d 100644 --- a/gcc/tree-vectorizer.h +++ b/gcc/tree-vectorizer.h @@ -65,6 +65,7 @@ enum vect_def_type { vect_reduction_def, vect_double_reduction_def, vect_nested_cycle, + vect_first_order_recurrence, vect_unknown_def_type }; @@ -1027,6 +1028,7 @@ enum stmt_vec_info_type { cycle_phi_info_type, lc_phi_info_type, phi_info_type, + recurr_info_type, loop_exit_ctrl_vec_info_type }; @@ -2331,6 +2333,8 @@ extern bool vectorizable_lc_phi (loop_vec_info, stmt_vec_info, gimple **, slp_tree); extern bool vectorizable_phi (vec_info *, stmt_vec_info, gimple **, slp_tree, stmt_vector_for_cost *); +extern bool vectorizable_recurr (loop_vec_info, stmt_vec_info, + gimple **, slp_tree, stmt_vector_for_cost *); extern bool vect_emulated_vector_p (tree); extern bool vect_can_vectorize_without_simd_p (tree_code); extern bool vect_can_vectorize_without_simd_p (code_helper); |