author | Richard Biener <rguenther@suse.de> | 2023-12-15 10:32:29 +0100 |
---|---|---|
committer | Richard Biener <rguenther@suse.de> | 2024-01-08 14:45:56 +0100 |
commit | b3cc5a1efead520bc977b4ba51f1328d01b3e516 (patch) | |
tree | 7695ab4fd594ff8ec83b51a549513d47588d267e /gcc/tree-vect-loop.cc | |
parent | 8c0dd8a6ff85d6e7b38957f2da400f5cfa8fef6b (diff) | |
tree-optimization/113026 - avoid vector epilog in more cases
The following avoids creating a niter peeling epilog more consistently,
matching what peeling later uses for the skip_vector condition, in
particular when versioning is required which then also ensures the
vector loop is entered unless the epilog is vectorized. This should
ideally match LOOP_VINFO_VERSIONING_THRESHOLD, which is only computed
later; some refactoring could make the two match more closely.
The patch also makes sure to adjust the upper bound of the epilogues
when we do not have a skip edge around the vector loop.
PR tree-optimization/113026
* tree-vect-loop.cc (vect_need_peeling_or_partial_vectors_p):
Avoid an epilog in more cases.
* tree-vect-loop-manip.cc (vect_do_peeling): Adjust the
epilogues niter upper bounds and estimates.
* gcc.dg/torture/pr113026-1.c: New testcase.
* gcc.dg/torture/pr113026-2.c: Likewise.
Diffstat (limited to 'gcc/tree-vect-loop.cc')
-rw-r--r-- | gcc/tree-vect-loop.cc | 6 |
1 files changed, 5 insertions, 1 deletions
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index a067716..9dd573e 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -1261,7 +1261,11 @@ vect_need_peeling_or_partial_vectors_p (loop_vec_info loop_vinfo)
              the epilogue is unnecessary.  */
           && (!LOOP_REQUIRES_VERSIONING (loop_vinfo)
               || ((unsigned HOST_WIDE_INT) max_niter
-                  > (th / const_vf) * const_vf))))
+                  /* We'd like to use LOOP_VINFO_VERSIONING_THRESHOLD
+                     but that's only computed later based on our result.
+                     The following is the most conservative approximation.  */
+                  > (std::max ((unsigned HOST_WIDE_INT) th,
+                               const_vf) / const_vf) * const_vf))))
     return true;
 
   return false;