From ab0a6b213abf6843b59cdea6399030e828109551 Mon Sep 17 00:00:00 2001
From: Tamar Christina
Date: Wed, 14 Jul 2021 14:54:26 +0100
Subject: Vect: Add support for dot-product where the sign for the multiplicant changes.

This patch adds support for a dot product where the signs of the
multiplication arguments differ, i.e. one is signed and one is unsigned
but the precisions are the same.

#define N 480
#define SIGNEDNESS_1 unsigned
#define SIGNEDNESS_2 signed
#define SIGNEDNESS_3 signed
#define SIGNEDNESS_4 unsigned

SIGNEDNESS_1 int __attribute__ ((noipa))
f (SIGNEDNESS_1 int res, SIGNEDNESS_3 char *restrict a,
   SIGNEDNESS_4 char *restrict b)
{
  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
    {
      int av = a[i];
      int bv = b[i];
      SIGNEDNESS_2 short mult = av * bv;
      res += mult;
    }
  return res;
}

The operations are performed as if the operands were extended to a 32-bit
value.  As such, this operation isn't valid if there is an intermediate
conversion to an unsigned value, i.e. if SIGNEDNESS_2 is unsigned.

Moreover, if the signs of SIGNEDNESS_3 and SIGNEDNESS_4 are flipped, the same
optab is used but the operands are flipped in the optab expansion.

To support this, the patch extends the dot-product detection to optionally
ignore operands with different signs and stores this information in the optab
subtype, which is now made a bitfield.

The subtype can now additionally control which optab an EXPR can expand to.

gcc/ChangeLog:

	* optabs.def (usdot_prod_optab): New.
	* doc/md.texi: Document it and clarify other dot prod optabs.
	* optabs-tree.h (enum optab_subtype): Add optab_vector_mixed_sign.
	* optabs-tree.c (optab_for_tree_code): Support usdot_prod_optab.
	* optabs.c (expand_widen_pattern_expr): Likewise.
	* tree-cfg.c (verify_gimple_assign_ternary): Likewise.
	* tree-vect-loop.c (vectorizable_reduction): Query dot-product kind.
	* tree-vect-patterns.c (vect_supportable_direct_optab_p): Take optional
	optab subtype.
	(vect_widened_op_tree): Optionally ignore mismatch types.
	(vect_recog_dot_prod_pattern): Support usdot_prod_optab.
---
 gcc/doc/md.texi | 52 ++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 46 insertions(+), 6 deletions(-)

(limited to 'gcc/doc')

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 8225a76..07681e2 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5449,13 +5449,53 @@ Like @samp{fold_left_plus_@var{m}}, but takes an additional mask operand
 
 @cindex @code{sdot_prod@var{m}} instruction pattern
 @item @samp{sdot_prod@var{m}}
+
+Compute the sum of the products of two signed elements.
+Operand 1 and operand 2 are of the same mode. Their
+product, which is of a wider mode, is computed and added to operand 3.
+Operand 3 is of a mode equal or wider than the mode of the product. The
+result is placed in operand 0, which is of the same mode as operand 3.
+
+Semantically the expressions perform the multiplication in the following signs
+
+@smallexample
+sdot ==
+   op0 = sign-ext (op1) * sign-ext (op2) + op3
+@dots{}
+@end smallexample
+
 @cindex @code{udot_prod@var{m}} instruction pattern
-@itemx @samp{udot_prod@var{m}}
-Compute the sum of the products of two signed/unsigned elements.
-Operand 1 and operand 2 are of the same mode. Their product, which is of a
-wider mode, is computed and added to operand 3. Operand 3 is of a mode equal or
-wider than the mode of the product. The result is placed in operand 0, which
-is of the same mode as operand 3.
+@item @samp{udot_prod@var{m}}
+
+Compute the sum of the products of two unsigned elements.
+Operand 1 and operand 2 are of the same mode. Their
+product, which is of a wider mode, is computed and added to operand 3.
+Operand 3 is of a mode equal or wider than the mode of the product. The
+result is placed in operand 0, which is of the same mode as operand 3.
+
+Semantically the expressions perform the multiplication in the following signs
+
+@smallexample
+udot ==
+   op0 = zero-ext (op1) * zero-ext (op2) + op3
+@dots{}
+@end smallexample
+
+@cindex @code{usdot_prod@var{m}} instruction pattern
+@item @samp{usdot_prod@var{m}}
+Compute the sum of the products of elements of different signs.
+Operand 1 must be unsigned and operand 2 signed. Their
+product, which is of a wider mode, is computed and added to operand 3.
+Operand 3 is of a mode equal or wider than the mode of the product. The
+result is placed in operand 0, which is of the same mode as operand 3.
+
+Semantically the expressions perform the multiplication in the following signs
+
+@smallexample
+usdot ==
+   op0 = ((signed-conv) zero-ext (op1)) * sign-ext (op2) + op3
+@dots{}
+@end smallexample
 
 @cindex @code{ssad@var{m}} instruction pattern
 @item @samp{ssad@var{m}}
--
cgit v1.1
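
Reader's note, not part of the patch: the three patterns documented above
differ only in how each multiplicand is extended before the multiply.  The
scalar sketch below models a single lane under the assumption of 8-bit
elements and a 32-bit accumulator, matching the char inputs and the "as if
extended to a 32-bit value" wording of the commit message.  The function
names (sdot_lane, udot_lane, usdot_lane, usdot_reduce) are hypothetical and
do not exist in GCC; a real target implements these as vector instructions.

```c
#include <stddef.h>
#include <stdint.h>

/* sdot: both multiplicands are sign-extended before the multiply.  */
static int32_t sdot_lane(int32_t acc, int8_t a, int8_t b)
{
    return acc + (int32_t)a * (int32_t)b;
}

/* udot: both multiplicands are zero-extended before the multiply.  */
static uint32_t udot_lane(uint32_t acc, uint8_t a, uint8_t b)
{
    return acc + (uint32_t)a * (uint32_t)b;
}

/* usdot: operand 1 is zero-extended and then treated as signed,
   operand 2 is sign-extended.  */
static int32_t usdot_lane(int32_t acc, uint8_t a, int8_t b)
{
    return acc + (int32_t)a * (int32_t)b;
}

/* Hypothetical scalar reduction over n elements using the mixed-sign lane
   model.  The caller is assumed to keep the running sum within int32_t
   range.  */
static int32_t usdot_reduce(int32_t acc, const uint8_t *a, const int8_t *b,
                            size_t n)
{
    for (size_t i = 0; i < n; ++i)
        acc = usdot_lane(acc, a[i], b[i]);
    return acc;
}
```

The mixed-sign case is where the extension choice becomes visible: the bit
pattern 0xff is 255 when zero-extended but -1 when sign-extended, so
usdot_lane (0, 255, -1) gives -255, whereas reading both inputs as signed
would give 1.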