vect: Use sdot for a fallback implementation of usdot

Following a suggestion from Tamar, this patch adds a fallback
implementation of usdot using sdot.  Specifically, for 8-bit
input types:

   acc_2 = DOT_PROD_EXPR <a_unsigned, b_signed, acc_1>;

becomes:

   tmp_1 = DOT_PROD_EXPR <64, b_signed, acc_1>;
   tmp_2 = DOT_PROD_EXPR <64, b_signed, tmp_1>;
   acc_2 = DOT_PROD_EXPR <a_unsigned - 128, b_signed, tmp_2>;

on the basis that (x-128)*y + 64*y + 64*y = x*y.  Doing the two 64*y
operations first should give more time for x to be calculated,
on the off chance that that's useful.

gcc/
	* tree-vect-patterns.cc (vect_convert_input): Expect the input
	type to be signed for optab_vector_mixed_sign.  Update the vectype
	at the same time as type.
	(vect_recog_dot_prod_pattern): Update accordingly.  If usdot isn't
	available, try sdot instead.
	* tree-vect-loop.cc (vect_is_emulated_mixed_dot_prod): New function.
	(vect_model_reduction_cost): Model the cost of implementing usdot
	using sdot.
	(vectorizable_reduction): Likewise.  Skip target support test
	for lane reductions.
	(vect_emulate_mixed_dot_prod): New function.
	(vect_transform_reduction): Use it to emulate usdot via sdot.

gcc/testsuite/
	* gcc.dg/vect/vect-reduc-dot-9.c: Reduce target requirements
	from i8mm to dotprod.
	* gcc.dg/vect/vect-reduc-dot-10.c: Likewise.
	* gcc.dg/vect/vect-reduc-dot-11.c: Likewise.
	* gcc.dg/vect/vect-reduc-dot-12.c: Likewise.
	* gcc.dg/vect/vect-reduc-dot-13.c: Likewise.
	* gcc.dg/vect/vect-reduc-dot-14.c: Likewise.
	* gcc.dg/vect/vect-reduc-dot-15.c: Likewise.
	* gcc.dg/vect/vect-reduc-dot-16.c: Likewise.
	* gcc.dg/vect/vect-reduc-dot-17.c: Likewise.
	* gcc.dg/vect/vect-reduc-dot-18.c: Likewise.
	* gcc.dg/vect/vect-reduc-dot-19.c: Likewise.
	* gcc.dg/vect/vect-reduc-dot-20.c: Likewise.
	* gcc.dg/vect/vect-reduc-dot-21.c: Likewise.
	* gcc.dg/vect/vect-reduc-dot-22.c: Likewise.
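
The rewrite described in the commit message can be checked with a small scalar model. The sketch below is not taken from the patch: the helper names (sdot4, usdot4_ref, usdot4_emulated, count_mismatches) and the 4-lane width are assumptions made for illustration, and the code models only the arithmetic, not the vectorizer or any target instruction. It relies on a_unsigned - 128 always lying in [-128, 127], so the biased input is a valid signed 8-bit operand for the signed dot product.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of a signed 4-lane dot-product step:
   returns acc + sum (a[i] * b[i]) with both inputs signed 8-bit.  */
static int32_t
sdot4 (const int8_t a[4], const int8_t b[4], int32_t acc)
{
  for (int i = 0; i < 4; ++i)
    acc += (int32_t) a[i] * (int32_t) b[i];
  return acc;
}

/* Reference mixed-sign step: unsigned 8-bit a, signed 8-bit b.  */
static int32_t
usdot4_ref (const uint8_t a[4], const int8_t b[4], int32_t acc)
{
  for (int i = 0; i < 4; ++i)
    acc += (int32_t) a[i] * (int32_t) b[i];
  return acc;
}

/* Mixed-sign step built only from sdot4, following the sequence in
   the commit message: two 64*b steps, then (a - 128)*b.  */
static int32_t
usdot4_emulated (const uint8_t a[4], const int8_t b[4], int32_t acc)
{
  static const int8_t c64[4] = { 64, 64, 64, 64 };
  int8_t a_biased[4];

  for (int i = 0; i < 4; ++i)
    {
      int biased = (int) a[i] - 128;
      /* The bias maps [0, 255] onto [-128, 127].  */
      assert (biased >= -128 && biased <= 127);
      a_biased[i] = (int8_t) biased;
    }

  int32_t tmp_1 = sdot4 (c64, b, acc);
  int32_t tmp_2 = sdot4 (c64, b, tmp_1);
  return sdot4 (a_biased, b, tmp_2);
}

/* Compare the emulation with the reference for every (x, y) pair of
   8-bit values, with boundary values in the remaining lanes.
   Returns the number of mismatches found.  */
static int
count_mismatches (void)
{
  int mismatches = 0;

  for (int x = 0; x <= 255; ++x)
    for (int y = -128; y <= 127; ++y)
      {
	const uint8_t a[4] = { (uint8_t) x, (uint8_t) (255 - x), 0, 255 };
	const int8_t b[4] = { (int8_t) y, -128, 127, (int8_t) y };

	if (usdot4_emulated (a, b, 1000) != usdot4_ref (a, b, 1000))
	  ++mismatches;
      }
  return mismatches;
}
```

Calling count_mismatches () from a test driver is expected to return 0, since 64 + 64 + (x - 128) = x for every lane.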
author: Richard Sandiford <richard.sandiford@arm.com> 2022-07-05 08:53:10 +0100
committer: Richard Sandiford <richard.sandiford@arm.com> 2022-07-05 08:53:10 +0100
commit: 76c3041b856cb0495d8f71110cd76f6fe64a0038 (patch)
tree: 36d0189235a320b99eb52b915b227402f1b0b283 /gcc/tree-vect-patterns.cc
parent: b55284f4a1235fccd8254f539ddc6b869580462b (diff)
download: gcc-76c3041b856cb0495d8f71110cd76f6fe64a0038.zip
gcc-76c3041b856cb0495d8f71110cd76f6fe64a0038.tar.gz
gcc-76c3041b856cb0495d8f71110cd76f6fe64a0038.tar.bz2
1 files changed, 30 insertions, 8 deletions
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 8f62486..dfbfb71 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -760,12 +760,16 @@ vect_convert_input (vec_info *vinfo, stmt_vec_info stmt_info, tree type,
 		    vect_unpromoted_value *unprom, tree vectype,
 		    enum optab_subtype subtype = optab_default)
 {
-
   /* Update the type if the signs differ.  */
-  if (subtype == optab_vector_mixed_sign
-      && TYPE_SIGN (type) != TYPE_SIGN (TREE_TYPE (unprom->op)))
-    type = build_nonstandard_integer_type (TYPE_PRECISION (type),
-					   TYPE_SIGN (unprom->type));
+  if (subtype == optab_vector_mixed_sign)
+    {
+      gcc_assert (!TYPE_UNSIGNED (type));
+      if (TYPE_UNSIGNED (TREE_TYPE (unprom->op)))
+	{
+	  type = unsigned_type_for (type);
+	  vectype = unsigned_type_for (vectype);
+	}
+    }
 
   /* Check for a no-op conversion.  */
   if (types_compatible_p (type, TREE_TYPE (unprom->op)))
@@ -1139,16 +1143,34 @@ vect_recog_dot_prod_pattern (vec_info *vinfo,
      is signed; otherwise, the result has the same sign as the operands.  */
   if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
       && (subtype == optab_vector_mixed_sign
-	? TYPE_UNSIGNED (unprom_mult.type)
-	: TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type)))
+	  ? TYPE_UNSIGNED (unprom_mult.type)
+	  : TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type)))
     return NULL;
 
   vect_pattern_detected ("vect_recog_dot_prod_pattern", last_stmt);
 
+  /* If the inputs have mixed signs, canonicalize on using the signed
+     input type for analysis.  This also helps when emulating mixed-sign
+     operations using signed operations.  */
+  if (subtype == optab_vector_mixed_sign)
+    half_type = signed_type_for (half_type);
+
   tree half_vectype;
   if (!vect_supportable_direct_optab_p (vinfo, type, DOT_PROD_EXPR, half_type,
 					type_out, &half_vectype, subtype))
-    return NULL;
+    {
+      /* We can emulate a mixed-sign dot-product using a sequence of
+	 signed dot-products; see vect_emulate_mixed_dot_prod for details.  */
+      if (subtype != optab_vector_mixed_sign
+	  || !vect_supportable_direct_optab_p (vinfo, signed_type_for (type),
+					       DOT_PROD_EXPR, half_type,
+					       type_out, &half_vectype,
+					       optab_vector))
+	return NULL;
+
+      *type_out = signed_or_unsigned_type_for (TYPE_UNSIGNED (type),
+					       *type_out);
+    }
 
   /* Get the inputs in the appropriate types.  */
   tree mult_oprnd[2];