field | value | date |
---|---|---|
author | Richard Sandiford <richard.sandiford@arm.com> | 2019-06-18 09:18:17 +0000 |
committer | Richard Sandiford <rsandifo@gcc.gnu.org> | 2019-06-18 09:18:17 +0000 |
commit | fcae0292de06aeb54c44d26cfb80d798df60e339 | |
tree | f98a62e46bca2b433c2c6b950f710a6af3f1183c /gcc/tree-vect-loop.c | |
parent | a9e47ccf267fb088b004461c29e2daf9167bd102 | |
Restore correct iv step for fully-masked loops
r272233 introduced a large number of execution failures on SVE.
That patch hard-coded an IV step of VF, but for SLP groups it needs
to be VF * group size.
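
The step requirement can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the vectorizer's code: `simulate_masked_loop` and `each_scalar_visited_once` are hypothetical helpers, the IV is assumed to count scalars starting from 0, and a lane is assumed to be active when `iv + lane` is below the total scalar count.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of a fully-masked loop (not GCC code).
// The IV counts scalars; lane L of an iteration is active when
// iv + L < total_scalars.  Returns how often each scalar is processed.
// iv_step must be nonzero, otherwise the loop would not terminate.
std::vector<unsigned> simulate_masked_loop(std::uint64_t total_scalars,
                                           std::uint64_t lanes_per_iter,
                                           std::uint64_t iv_step) {
  std::vector<unsigned> visits(total_scalars, 0);
  if (iv_step == 0)
    return visits;  // refuse to model a non-advancing IV
  for (std::uint64_t iv = 0; iv < total_scalars; iv += iv_step)
    for (std::uint64_t lane = 0; lane < lanes_per_iter; ++lane)
      if (iv + lane < total_scalars)  // mask bit for this lane
        ++visits[iv + lane];
  return visits;
}

// True if every scalar was processed exactly once.
bool each_scalar_visited_once(const std::vector<unsigned>& visits) {
  for (unsigned v : visits)
    if (v != 1)
      return false;
  return true;
}
```

With VF = 4 and a group size of 2, each vector iteration in this model covers 8 scalars. Stepping the IV by 8 (VF * group size) visits every scalar exactly once, whereas stepping by 4 (VF alone) makes consecutive iterations overlap.
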
Also, iv_precision had type widest_int but only needs to be unsigned int.
2019-06-18 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop-manip.c (vect_set_loop_masks_directly): Remove
vf parameter. Restore the previous iv step of nscalars_step,
but give it iv_type rather than compare_type. Tweak code order
to match the comments.
(vect_set_loop_condition_masked): Update accordingly.
* tree-vect-loop.c (vect_verify_full_masking): Use "unsigned int"
for iv_precision. Tweak comment formatting.
From-SVN: r272411
Diffstat (limited to 'gcc/tree-vect-loop.c')
-rw-r--r-- | gcc/tree-vect-loop.c | 10 |
1 files changed, 5 insertions, 5 deletions
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index a27eda6..d3facf6 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -1062,7 +1062,7 @@ vect_verify_full_masking (loop_vec_info loop_vinfo)
   tree cmp_type = NULL_TREE;
   tree iv_type = NULL_TREE;
   widest_int iv_limit = vect_iv_limit_for_full_masking (loop_vinfo);
-  widest_int iv_precision = UINT_MAX;
+  unsigned int iv_precision = UINT_MAX;
 
   if (iv_limit != -1)
     iv_precision = wi::min_precision (iv_limit * max_nscalars_per_iter,
@@ -1083,12 +1083,12 @@ vect_verify_full_masking (loop_vec_info loop_vinfo)
                      best choice:
 
                      - An IV that's Pmode or wider is more likely to be reusable
-                     in address calculations than an IV that's narrower than
-                     Pmode.
+                       in address calculations than an IV that's narrower than
+                       Pmode.
 
                      - Doing the comparison in IV_PRECISION or wider allows
-                     a natural 0-based IV, whereas using a narrower comparison
-                     type requires mitigations against wrap-around.
+                       a natural 0-based IV, whereas using a narrower comparison
+                       type requires mitigations against wrap-around.
 
                      Conversely, if the IV limit is variable, doing the comparison
                      in a wider type than the original type can introduce
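
The first hunk stores a bit count, which is why a plain unsigned int is enough. The following is a rough standalone sketch of that computation, not GCC's implementation: `min_unsigned_bits` is a hypothetical stand-in for an unsigned minimum-precision query, `iv_precision_for` is a hypothetical wrapper, the `limit_known` flag stands in for the `iv_limit != -1` test, and the product is assumed to fit in 64 bits.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical helper: number of bits needed to hold `value` as an
// unsigned integer (0 needs 0 bits in this model).
unsigned int min_unsigned_bits(std::uint64_t value) {
  unsigned int bits = 0;
  while (value != 0) {
    ++bits;
    value >>= 1;
  }
  return bits;
}

// Hypothetical wrapper mirroring the shape of the patched code: start
// from UINT_MAX ("no usable bound") and narrow it when a limit is known.
// Assumes iv_limit * max_nscalars_per_iter does not overflow 64 bits.
unsigned int iv_precision_for(bool limit_known, std::uint64_t iv_limit,
                              std::uint64_t max_nscalars_per_iter) {
  unsigned int iv_precision = UINT_MAX;
  if (limit_known)
    iv_precision = min_unsigned_bits(iv_limit * max_nscalars_per_iter);
  return iv_precision;
}
```

In this model, a limit of 1000 iterations with 2 scalars per iteration gives a product of 2000, which needs 11 bits. The result is always a small count or UINT_MAX, so no wide-integer type is required to hold it.
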