From 64432b680eab0bddbe9a4ad4798457cf6a14ad60 Mon Sep 17 00:00:00 2001
From: Kyrylo Tkachov
Date: Thu, 17 Dec 2020 18:02:37 +0000
Subject: vect, aarch64: Extend SVE vs Advanced SIMD costing decisions in
 vect_better_loop_vinfo_p

While experimenting with some backend costs for Advanced SIMD and SVE I
hit many cases where GCC would pick SVE for VLA auto-vectorisation even when
the backend very clearly presented cheaper costs for Advanced SIMD.
For a simple float addition loop the SVE costs were:

vec.c:9:21: note:  Cost model analysis:
  Vector inside of loop cost: 28
  Vector prologue cost: 2
  Vector epilogue cost: 0
  Scalar iteration cost: 10
  Scalar outside cost: 0
  Vector outside cost: 2
  prologue iterations: 0
  epilogue iterations: 0
  Minimum number of vector iterations: 1
  Calculated minimum iters for profitability: 4

and for Advanced SIMD (Neon) they're:

vec.c:9:21: note:  Cost model analysis:
  Vector inside of loop cost: 11
  Vector prologue cost: 0
  Vector epilogue cost: 0
  Scalar iteration cost: 10
  Scalar outside cost: 0
  Vector outside cost: 0
  prologue iterations: 0
  epilogue iterations: 0
  Calculated minimum iters for profitability: 0
vec.c:9:21: note:    Runtime profitability threshold = 4

yet the SVE one was always picked.  With guidance from Richard this seems
to be due to the vinfo comparisons in vect_better_loop_vinfo_p, in
particular the part with the big comment explaining the
estimated_rel_new * 2 <= estimated_rel_old heuristic.

This patch extends the comparisons by introducing a three-way estimate
kind for poly_int values that the backend can distinguish.
This allows vect_better_loop_vinfo_p to ask for minimum, maximum and likely
estimates and pick Advanced SIMD over SVE when it is clearly cheaper.

gcc/
	* target.h (enum poly_value_estimate_kind): Define.
	(estimated_poly_value): Take an estimate kind argument.
	* target.def (estimated_poly_value): Update definition for the above.
	* doc/tm.texi: Regenerate.
	* targhooks.c (estimated_poly_value): Update prototype.
	* tree-vect-loop.c (vect_better_loop_vinfo_p): Use min, max and likely
	estimates of VF to pick between vinfos.
	* config/aarch64/aarch64.c (aarch64_cmp_autovec_modes): Use
	estimated_poly_value instead of aarch64_estimated_poly_value.
	(aarch64_estimated_poly_value): Take a kind argument and handle it.
---
 gcc/target.h | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

(limited to 'gcc/target.h')

diff --git a/gcc/target.h b/gcc/target.h
index 9601880..68ef519 100644
--- a/gcc/target.h
+++ b/gcc/target.h
@@ -252,6 +252,13 @@ enum type_context_kind {
   TCTX_CAPTURE_BY_COPY
 };
 
+enum poly_value_estimate_kind
+{
+  POLY_VALUE_MIN,
+  POLY_VALUE_MAX,
+  POLY_VALUE_LIKELY
+};
+
 extern bool verify_type_context (location_t, type_context_kind, const_tree,
 				 bool = false);
 
@@ -272,12 +279,13 @@ extern struct gcc_target targetm;
    provides a rough guess.  */
 
 static inline HOST_WIDE_INT
-estimated_poly_value (poly_int64 x)
+estimated_poly_value (poly_int64 x,
+		      poly_value_estimate_kind kind = POLY_VALUE_LIKELY)
 {
   if (NUM_POLY_INT_COEFFS == 1)
     return x.coeffs[0];
   else
-    return targetm.estimated_poly_value (x, kind);
+    return targetm.estimated_poly_value (x, kind);
 }
 
 #ifdef GCC_TM_H
-- 
cgit v1.1
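
Illustrative note (not part of the patch): the sketch below shows one way a
three-way estimate kind could be consumed, as a self-contained toy rather
than GCC code.  Only the enum matches the one added to target.h above.
toy_poly, toy_target, toy_estimated_poly_value and
toy_cheaper_per_scalar_iter are hypothetical names invented for this
example, and the cost comparison is a plain cost-per-scalar-iteration check,
not the logic in vect_better_loop_vinfo_p, which this view of the diff does
not show.

```cpp
#include <cstdint>

// Same three kinds as the enum added to gcc/target.h in this patch.
enum poly_value_estimate_kind
{
  POLY_VALUE_MIN,
  POLY_VALUE_MAX,
  POLY_VALUE_LIKELY
};

// Hypothetical stand-in for a two-coefficient poly_int64: the value is
// c0 + c1 * x, where x >= 0 is only known at run time.
struct toy_poly
{
  int64_t c0;
  int64_t c1;
};

// Hypothetical description of what a target knows about x.
struct toy_target
{
  int64_t min_x;
  int64_t max_x;
  int64_t likely_x;
};

constexpr int64_t
toy_eval (toy_poly v, int64_t x)
{
  return v.c0 + v.c1 * x;
}

constexpr int64_t
toy_min (int64_t a, int64_t b)
{
  return a < b ? a : b;
}

constexpr int64_t
toy_max (int64_t a, int64_t b)
{
  return a < b ? b : a;
}

// Toy analogue of the estimated_poly_value hook: fold V to one integer
// according to KIND.  Both ends of the range are evaluated so that a
// negative c1 is handled as well.
constexpr int64_t
toy_estimated_poly_value (toy_poly v, toy_target t,
                          poly_value_estimate_kind kind = POLY_VALUE_LIKELY)
{
  return (kind == POLY_VALUE_MIN
          ? toy_min (toy_eval (v, t.min_x), toy_eval (v, t.max_x))
          : kind == POLY_VALUE_MAX
          ? toy_max (toy_eval (v, t.min_x), toy_eval (v, t.max_x))
          : toy_eval (v, t.likely_x));
}

// True if NEW_COST / NEW_VF < OLD_COST / OLD_VF, i.e. the new loop is
// cheaper per scalar iteration.  Cross-multiplied to stay in integers;
// both vectorization factors are assumed to be positive.
constexpr bool
toy_cheaper_per_scalar_iter (int64_t new_cost, int64_t new_vf,
                             int64_t old_cost, int64_t old_vf)
{
  return new_cost * old_vf < old_cost * new_vf;
}
```

As a usage illustration under assumed numbers: take the inside-of-loop costs
quoted in the commit message (28 for SVE, 11 for Advanced SIMD), assume a
fixed VF of 4 for Advanced SIMD and a VF of 4 + 4x for SVE with x in [0, 15]
and a likely x of 0.  The VF values and the range of x are assumptions made
for this example and are not stated in the commit.  With the minimum and
likely estimates (VF 4) Advanced SIMD is cheaper per scalar iteration; with
the maximum estimate (VF 64) it is not, which is the kind of distinction a
single "likely" estimate cannot express.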