author | Tamar Christina <tamar.christina@arm.com> | 2024-01-12 15:25:34 +0000 |
---|---|---|
committer | Tamar Christina <tamar.christina@arm.com> | 2024-01-12 15:31:06 +0000 |
commit | 411de96dbf2bdafc7a90ebbfc63e68afd6388d29 (patch) | |
tree | e0e6c70acfa8ec86f0b7facc33865678762dff1a /gcc/tree-vect-loop.cc | |
parent | 6cb155a6cf314232248a12bdd395ed4151ae5a28 (diff) | |
download | gcc-411de96dbf2bdafc7a90ebbfc63e68afd6388d29.zip gcc-411de96dbf2bdafc7a90ebbfc63e68afd6388d29.tar.gz gcc-411de96dbf2bdafc7a90ebbfc63e68afd6388d29.tar.bz2 |
middle-end: maintain LCSSA form when peeled vector iterations have virtual operands
This patch fixes several interconnected issues.
1. When picking an exit we wanted to check for niter_desc.may_be_zero not true.
i.e. we want to pick an exit which we know will iterate at least once.
However niter_desc.may_be_zero is not a boolean. It is a tree that encodes
a boolean value. !niter_desc.may_be_zero is just checking if we have some
information, not what the information is. This leads us to pick a more
difficult to vectorize exit more often than we should.
2. Because we had this bug, we used to pick an alternative exit much more often,
which exposed an issue: when the loop accesses memory and we "invert it" we
would corrupt the VUSE chain. This is because on a peeled vector iteration
every exit restarts the loop (i.e. they're all early) BUT since we may have
performed a store, the VUSE would need to be updated. This version maintains
virtual PHIs correctly in these cases. Note that we can't simply remove all
of them and recreate them because we need the PHI nodes still in the right
order for the skip_vector case.
3. Since we're moving the stores to a safe location I don't think we actually
need to analyze whether the store is in range of the memref, because if we
ever get there, we know that the loads must be in range, and if the loads are
in range and we get to the store we know the early breaks were not taken and
so the scalar loop would have done the VF stores too.
4. Instead of searching for where to move stores to, they should always be in
the exit belonging to the latch. We can only ever delay stores and even if we
pick a different exit than the latch one as the main one, effects still
happen in program order when vectorized. If we don't move the stores to the
latch exit but instead to wherever we pick as the "main" exit then we can
perform incorrect memory accesses (luckily these are trapped by verify_ssa).
5. We only used to analyze loads inside the same BB as an early break, and also
we'd never analyze the ones inside the block where we'd be moving memory
references to. This is obviously bogus and to fix it this patch splits apart
the two constraints. We first validate that all load memory references are
in bounds and only after that do we perform the alias checks for the writes.
This makes the code simpler to understand and more trivially correct.
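The pitfall described in point 1 can be sketched outside of GCC. This is a minimal illustration, not GCC code: `Node`, the `tree` alias, `buggy_never_zero` and `known_never_zero` are hypothetical stand-ins, and treating a null pointer as "no information" is an assumption of the sketch rather than a statement about how `integer_zerop` behaves.

```cpp
#include <cassert>

// Hypothetical stand-in for a GCC tree node that encodes a boolean constant.
struct Node { bool value; };
using tree = const Node *;

// Mirrors the shape of the old test `!niter_desc.may_be_zero`: it only asks
// whether a node is present, never what value the node encodes.
bool buggy_never_zero (tree may_be_zero)
{
  return !may_be_zero;
}

// Mirrors the intent of the fix: the node must be present and must encode
// "false", i.e. the loop is known to iterate at least once.
bool known_never_zero (tree may_be_zero)
{
  return may_be_zero != nullptr && !may_be_zero->value;
}
```

With a node encoding "true" and a node encoding "false", the buggy predicate returns the same answer for both, while the value-based predicate distinguishes them.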
gcc/ChangeLog:
PR tree-optimization/113137
PR tree-optimization/113136
PR tree-optimization/113172
PR tree-optimization/113178
* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
Maintain PHIs on inverted loops.
(vect_do_peeling): Maintain virtual PHIs on inverted loops.
* tree-vect-loop.cc (vec_init_loop_exit_info): Pick exit closest to
latch.
(vect_create_loop_vinfo): Record all conds instead of only alt ones.
gcc/testsuite/ChangeLog:
PR tree-optimization/113137
PR tree-optimization/113136
PR tree-optimization/113172
PR tree-optimization/113178
* g++.dg/vect/vect-early-break_4-pr113137.cc: New test.
* g++.dg/vect/vect-early-break_5-pr113137.cc: New test.
* gcc.dg/vect/vect-early-break_95-pr113137.c: New test.
* gcc.dg/vect/vect-early-break_96-pr113136.c: New test.
* gcc.dg/vect/vect-early-break_97-pr113172.c: New test.
Diffstat (limited to 'gcc/tree-vect-loop.cc')
-rw-r--r-- | gcc/tree-vect-loop.cc | 6 |
1 files changed, 5 insertions, 1 deletions
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 38bd826..0f4a557 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -989,7 +989,11 @@ vec_init_loop_exit_info (class loop *loop)
       if (number_of_iterations_exit_assumptions (loop, exit, &niter_desc, NULL)
           && !chrec_contains_undetermined (niter_desc.niter))
         {
-          if (!niter_desc.may_be_zero || !candidate)
+          tree may_be_zero = niter_desc.may_be_zero;
+          if (integer_zerop (may_be_zero)
+              && (!candidate
+                  || dominated_by_p (CDI_DOMINATORS, exit->src,
+                                     candidate->src)))
             candidate = exit;
         }
     }
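The selection rule in the hunk above can be sketched on a toy model. Everything here is hypothetical and not GCC's data structures: `Block` carries only an immediate-dominator pointer, `Exit::known_nonzero_iters` folds the niter analysis and the `integer_zerop (may_be_zero)` test into one flag, and `dominated_by` stands in for `dominated_by_p (CDI_DOMINATORS, ...)`.

```cpp
#include <cassert>
#include <vector>

// Toy basic block: idom is the immediate dominator, null for the entry block.
struct Block { int id; const Block *idom; };

// Toy loop exit: the block it leaves from, plus a flag standing in for
// "the analysis succeeded and the exit is known to iterate at least once".
struct Exit { const Block *src; bool known_nonzero_iters; };

// True if block a is dominated by block b. Every block dominates itself.
bool dominated_by (const Block *a, const Block *b)
{
  for (const Block *p = a; p != nullptr; p = p->idom)
    if (p == b)
      return true;
  return false;
}

// Keep the first eligible exit, then replace it whenever a later eligible
// exit's source block is dominated by the current candidate's source block.
const Exit *pick_exit (const std::vector<Exit> &exits)
{
  const Exit *candidate = nullptr;
  for (const Exit &e : exits)
    if (e.known_nonzero_iters
        && (candidate == nullptr || dominated_by (e.src, candidate->src)))
      candidate = &e;
  return candidate;
}
```

On a straight-line chain of blocks the sketch settles on the eligible exit deepest in the dominator chain, and it returns null when no exit is eligible.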