From 1bda738bab8193f0fb4551672d3be928d2015cd2 Mon Sep 17 00:00:00 2001
From: Jakub Jelinek <jakub@redhat.com>
Date: Tue, 29 May 2018 13:58:24 +0200
Subject: re PR target/85918 (Conversions to/from [unsigned] long long are not
 vectorized for AVX512DQ target)

	PR target/85918
	* tree.def (VEC_UNPACK_FIX_TRUNC_HI_EXPR, VEC_UNPACK_FIX_TRUNC_LO_EXPR,
	VEC_PACK_FLOAT_EXPR): New tree codes.
	* tree-pretty-print.c (op_code_prio): Handle
	VEC_UNPACK_FIX_TRUNC_HI_EXPR and VEC_UNPACK_FIX_TRUNC_LO_EXPR.
	(dump_generic_node): Handle VEC_UNPACK_FIX_TRUNC_HI_EXPR,
	VEC_UNPACK_FIX_TRUNC_LO_EXPR and VEC_PACK_FLOAT_EXPR.
	* tree-inline.c (estimate_operator_cost): Likewise.
	* gimple-pretty-print.c (dump_binary_rhs): Handle VEC_PACK_FLOAT_EXPR.
	* fold-const.c (const_binop): Likewise.
	(const_unop): Handle VEC_UNPACK_FIX_TRUNC_HI_EXPR and
	VEC_UNPACK_FIX_TRUNC_LO_EXPR.
	* tree-cfg.c (verify_gimple_assign_unary): Likewise.
	(verify_gimple_assign_binary): Handle VEC_PACK_FLOAT_EXPR.
	* cfgexpand.c (expand_debug_expr): Handle VEC_UNPACK_FIX_TRUNC_HI_EXPR,
	VEC_UNPACK_FIX_TRUNC_LO_EXPR and VEC_PACK_FLOAT_EXPR.
	* expr.c (expand_expr_real_2): Likewise.
	* optabs.def (vec_packs_float_optab, vec_packu_float_optab,
	vec_unpack_sfix_trunc_hi_optab, vec_unpack_sfix_trunc_lo_optab,
	vec_unpack_ufix_trunc_hi_optab, vec_unpack_ufix_trunc_lo_optab): New
	optabs.
	* optabs.c (expand_widen_pattern_expr): For
	VEC_UNPACK_FIX_TRUNC_HI_EXPR and VEC_UNPACK_FIX_TRUNC_LO_EXPR use
	sign from result type rather than operand's type.
	(expand_binop_directly): For vec_packu_float_optab and
	vec_packs_float_optab allow result type to be different from operand's
	type.
	* optabs-tree.c (optab_for_tree_code): Handle
	VEC_UNPACK_FIX_TRUNC_HI_EXPR, VEC_UNPACK_FIX_TRUNC_LO_EXPR and
	VEC_PACK_FLOAT_EXPR.  Formatting fixes.
	* tree-vect-generic.c (expand_vector_operations_1): Handle
	VEC_UNPACK_FIX_TRUNC_HI_EXPR, VEC_UNPACK_FIX_TRUNC_LO_EXPR and
	VEC_PACK_FLOAT_EXPR.
	* tree-vect-stmts.c (supportable_widening_operation): Handle
	FIX_TRUNC_EXPR.
	(supportable_narrowing_operation): Handle FLOAT_EXPR.
	* config/i386/i386.md (fixprefix, floatprefix): New code attributes.
	* config/i386/sse.md (*float<floatunssuffix>v2div2sf2): Rename to ...
	(float<floatunssuffix>v2div2sf2): ... this.  Formatting fix.
	(vpckfloat_concat_mode, vpckfloat_temp_mode, vpckfloat_op_mode): New
	mode attributes.
	(vec_pack<floatprefix>_float_<mode>): New expander.
	(vunpckfixt_mode, vunpckfixt_model, vunpckfixt_extract_mode): New mode
	attributes.
	(vec_unpack_<fixprefix>fix_trunc_lo_<mode>,
	vec_unpack_<fixprefix>fix_trunc_hi_<mode>): New expanders.
	* doc/md.texi (vec_packs_float_@var{m}, vec_packu_float_@var{m},
	vec_unpack_sfix_trunc_hi_@var{m}, vec_unpack_sfix_trunc_lo_@var{m},
	vec_unpack_ufix_trunc_hi_@var{m}, vec_unpack_ufix_trunc_lo_@var{m}):
	Document.
	* doc/generic.texi (VEC_UNPACK_FLOAT_HI_EXPR,
	VEC_UNPACK_FLOAT_LO_EXPR): Fix pasto in description.
	(VEC_UNPACK_FIX_TRUNC_HI_EXPR, VEC_UNPACK_FIX_TRUNC_LO_EXPR,
	VEC_PACK_FLOAT_EXPR): Document.

	* gcc.target/i386/avx512dq-pr85918.c: Add -mprefer-vector-width=512
	and -fno-vect-cost-model options.  Add aligned(64) attribute to the
	arrays.  Add suffix 1 to all functions and use 4 iterations rather
	than N.  Add functions with conversions to and from float.
	Add new set of functions with 8 iterations and another one
	with 16 iterations, expect 24 vectorized loops instead of just 4.
	* gcc.target/i386/avx512dq-pr85918-2.c: New test.

From-SVN: r260893
---
 gcc/doc/md.texi | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

(limited to 'gcc/doc/md.texi')
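
Note for readers of the md.texi hunks below: a minimal scalar sketch of
what the newly documented patterns compute, written under stated
assumptions rather than taken from the patch. The function names, the
fixed vector length, and the choice that "lo" means the lowest-indexed
elements (as on a little-endian element order) are all illustrative
assumptions; the real patterns operate on machine modes, and which
elements count as high or low is target-dependent.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical element count for the illustration; not from the patch.  */
enum { N = 8 };

/* Scalar model of vec_unpack_sfix_trunc_{lo,hi}: the input has N float
   elements of size S; the output has N/2 signed integer elements of
   size 2*S.  HIGH selects which half of the input is converted.
   Assumption: "lo" is the lowest-indexed half.  */
void
unpack_sfix_trunc_half (const float in[N], int64_t out[N / 2], int high)
{
  assert (high == 0 || high == 1);
  int base = high ? N / 2 : 0;
  for (int i = 0; i < N / 2; i++)
    out[i] = (int64_t) in[base + i];	/* Truncates toward zero.  */
}

/* Scalar model of vec_packs_float: two inputs with N/2 signed integer
   elements each are converted to a narrower floating point type and
   concatenated into one output of N elements.
   Assumption: operand A fills the lowest-indexed half of the result.  */
void
packs_float (const int64_t a[N / 2], const int64_t b[N / 2], float out[N])
{
  for (int i = 0; i < N / 2; i++)
    {
      out[i] = (float) a[i];
      out[N / 2 + i] = (float) b[i];
    }
}
```

The unsigned variants (vec_unpack_ufix_trunc_* and vec_packu_float) would
follow the same shape with uint64_t in place of int64_t.
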
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 02fbfb3..be37619 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5371,6 +5371,14 @@ of two vectors.  Operands 1 and 2 are vectors of the same mode having N
 floating point elements of size S@.  Operand 0 is the resulting vector
 in which 2*N elements of size N/2 are concatenated.
 
+@cindex @code{vec_packs_float_@var{m}} instruction pattern
+@cindex @code{vec_packu_float_@var{m}} instruction pattern
+@item @samp{vec_packs_float_@var{m}}, @samp{vec_packu_float_@var{m}}
+Narrow, convert to floating point type and merge the elements
+of two vectors.  Operands 1 and 2 are vectors of the same mode having N
+signed/unsigned integral elements of size S@.  Operand 0 is the resulting
+vector in which 2*N elements of size S/2 are concatenated.
+
 @cindex @code{vec_unpacks_hi_@var{m}} instruction pattern
 @cindex @code{vec_unpacks_lo_@var{m}} instruction pattern
 @item @samp{vec_unpacks_hi_@var{m}}, @samp{vec_unpacks_lo_@var{m}}
@@ -5400,6 +5408,20 @@ has N elements of size S@.  Convert the high/low elements of the vector using
 floating point conversion and place the resulting N/2 values of size 2*S in
 the output vector (operand 0).
 
+@cindex @code{vec_unpack_sfix_trunc_hi_@var{m}} instruction pattern
+@cindex @code{vec_unpack_sfix_trunc_lo_@var{m}} instruction pattern
+@cindex @code{vec_unpack_ufix_trunc_hi_@var{m}} instruction pattern
+@cindex @code{vec_unpack_ufix_trunc_lo_@var{m}} instruction pattern
+@item @samp{vec_unpack_sfix_trunc_hi_@var{m}},
+@itemx @samp{vec_unpack_sfix_trunc_lo_@var{m}}
+@itemx @samp{vec_unpack_ufix_trunc_hi_@var{m}}
+@itemx @samp{vec_unpack_ufix_trunc_lo_@var{m}}
+Extract, convert to signed/unsigned integer type and widen the high/low part of a
+vector of floating point elements.  The input vector (operand 1)
+has N elements of size S@.  Convert the high/low elements of the vector
+to integers and place the resulting N/2 values of size 2*S in
+the output vector (operand 0).
+
 @cindex @code{vec_widen_umult_hi_@var{m}} instruction pattern
 @cindex @code{vec_widen_umult_lo_@var{m}} instruction pattern
 @cindex @code{vec_widen_smult_hi_@var{m}} instruction pattern
-- 
cgit v1.1