From 1bda738bab8193f0fb4551672d3be928d2015cd2 Mon Sep 17 00:00:00 2001
From: Jakub Jelinek <jakub@redhat.com>
Date: Tue, 29 May 2018 13:58:24 +0200
Subject: re PR target/85918 (Conversions to/from [unsigned] long long are not
 vectorized for AVX512DQ target)

	PR target/85918
	* tree.def (VEC_UNPACK_FIX_TRUNC_HI_EXPR, VEC_UNPACK_FIX_TRUNC_LO_EXPR,
	VEC_PACK_FLOAT_EXPR): New tree codes.
	* tree-pretty-print.c (op_code_prio): Handle
	VEC_UNPACK_FIX_TRUNC_HI_EXPR and VEC_UNPACK_FIX_TRUNC_LO_EXPR.
	(dump_generic_node): Handle VEC_UNPACK_FIX_TRUNC_HI_EXPR,
	VEC_UNPACK_FIX_TRUNC_LO_EXPR and VEC_PACK_FLOAT_EXPR.
	* tree-inline.c (estimate_operator_cost): Likewise.
	* gimple-pretty-print.c (dump_binary_rhs): Handle VEC_PACK_FLOAT_EXPR.
	* fold-const.c (const_binop): Likewise.
	(const_unop): Handle VEC_UNPACK_FIX_TRUNC_HI_EXPR and
	VEC_UNPACK_FIX_TRUNC_LO_EXPR.
	* tree-cfg.c (verify_gimple_assign_unary): Likewise.
	(verify_gimple_assign_binary): Handle VEC_PACK_FLOAT_EXPR.
	* cfgexpand.c (expand_debug_expr): Handle VEC_UNPACK_FIX_TRUNC_HI_EXPR,
	VEC_UNPACK_FIX_TRUNC_LO_EXPR and VEC_PACK_FLOAT_EXPR.
	* expr.c (expand_expr_real_2): Likewise.
	* optabs.def (vec_packs_float_optab, vec_packu_float_optab,
	vec_unpack_sfix_trunc_hi_optab, vec_unpack_sfix_trunc_lo_optab,
	vec_unpack_ufix_trunc_hi_optab, vec_unpack_ufix_trunc_lo_optab): New
	optabs.
	* optabs.c (expand_widen_pattern_expr): For
	VEC_UNPACK_FIX_TRUNC_HI_EXPR and VEC_UNPACK_FIX_TRUNC_LO_EXPR use
	sign from result type rather than operand's type.
	(expand_binop_directly): For vec_packu_float_optab and
	vec_packs_float_optab allow result type to be different from operand's
	type.
	* optabs-tree.c (optab_for_tree_code): Handle
	VEC_UNPACK_FIX_TRUNC_HI_EXPR, VEC_UNPACK_FIX_TRUNC_LO_EXPR and
	VEC_PACK_FLOAT_EXPR.  Formatting fixes.
	* tree-vect-generic.c (expand_vector_operations_1): Handle
	VEC_UNPACK_FIX_TRUNC_HI_EXPR, VEC_UNPACK_FIX_TRUNC_LO_EXPR and
	VEC_PACK_FLOAT_EXPR.
	* tree-vect-stmts.c (supportable_widening_operation): Handle
	FIX_TRUNC_EXPR.
	(supportable_narrowing_operation): Handle FLOAT_EXPR.
	* config/i386/i386.md (fixprefix, floatprefix): New code attributes.
	* config/i386/sse.md (*float<floatunssuffix>v2div2sf2): Rename to ...
	(float<floatunssuffix>v2div2sf2): ... this.  Formatting fix.
	(vpckfloat_concat_mode, vpckfloat_temp_mode, vpckfloat_op_mode): New
	mode attributes.
	(vec_pack<floatprefix>_float_<mode>): New expander.
	(vunpckfixt_mode, vunpckfixt_model, vunpckfixt_extract_mode): New mode
	attributes.
	(vec_unpack_<fixprefix>fix_trunc_lo_<mode>,
	vec_unpack_<fixprefix>fix_trunc_hi_<mode>): New expanders.
	* doc/md.texi (vec_packs_float_@var{m}, vec_packu_float_@var{m},
	vec_unpack_sfix_trunc_hi_@var{m}, vec_unpack_sfix_trunc_lo_@var{m},
	vec_unpack_ufix_trunc_hi_@var{m}, vec_unpack_ufix_trunc_lo_@var{m}):
	Document.
	* doc/generic.texi (VEC_UNPACK_FLOAT_HI_EXPR,
	VEC_UNPACK_FLOAT_LO_EXPR): Fix pasto in description.
	(VEC_UNPACK_FIX_TRUNC_HI_EXPR, VEC_UNPACK_FIX_TRUNC_LO_EXPR,
	VEC_PACK_FLOAT_EXPR): Document.

	* gcc.target/i386/avx512dq-pr85918.c: Add -mprefer-vector-width=512
	and -fno-vect-cost-model options.  Add aligned(64) attribute to the
	arrays.  Add suffix 1 to all functions and use 4 iterations rather
	than N.  Add functions with conversions to and from float.
	Add new set of functions with 8 iterations and another one
	with 16 iterations, expect 24 vectorized loops instead of just 4.
	* gcc.target/i386/avx512dq-pr85918-2.c: New test.

From-SVN: r260893
---
 gcc/tree.def | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

(limited to 'gcc/tree.def')
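
Note for readers: the new tree codes can be modelled in scalar C as
sketched here. This is only an illustration, not code from the patch. The
lane count N, the function names, and the choice of float and int64_t as
the narrow and wide element types are assumptions made for the example.
The lane ordering (LO meaning the lowest-indexed half, operand 0 filling
the first half of the packed result) assumes a little-endian target.

```c
#include <assert.h>
#include <stdint.h>

#define N 4  /* hypothetical: number of 64-bit lanes in the wide vector */

/* Scalar model of VEC_UNPACK_FIX_TRUNC_{LO,HI}_EXPR: select one half of
   a 2*N-element float vector and convert each element to a 64-bit
   integer, truncating toward zero.  */
static void
model_unpack_fix_trunc (const float in[2 * N], int hi, int64_t out[N])
{
  const float *half = in + (hi ? N : 0);
  for (int i = 0; i < N; i++)
    out[i] = (int64_t) half[i];
}

/* Scalar model of VEC_PACK_FLOAT_EXPR: convert the 64-bit integer
   elements of two N-element vectors to float and merge them into one
   2*N-element vector.  */
static void
model_pack_float (const int64_t a[N], const int64_t b[N], float out[2 * N])
{
  for (int i = 0; i < N; i++)
    {
      out[i] = (float) a[i];
      out[N + i] = (float) b[i];
    }
}

/* Small self-check: unpack both halves, then pack them back.  */
static int
model_self_check (void)
{
  const float in[2 * N] = { 1.9f, -2.7f, 3.0f, 4.5f, 5.5f, -6.9f, 7.0f, 8.25f };
  int64_t lo[N], hi[N];
  float packed[2 * N];

  model_unpack_fix_trunc (in, 0, lo);
  model_unpack_fix_trunc (in, 1, hi);
  model_pack_float (lo, hi, packed);

  assert (lo[1] == -2 && hi[1] == -6);          /* truncation toward zero */
  assert (packed[0] == 1.0f && packed[N] == 5.0f);
  return 0;
}
```

In source terms, these are the operations the vectorizer needs for loops
such as "long long a[i] = float b[i]" (widening FIX_TRUNC_EXPR) and
"float b[i] = long long a[i]" (narrowing FLOAT_EXPR), which the ChangeLog
entries for tree-vect-stmts.c refer to.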
diff --git a/gcc/tree.def b/gcc/tree.def
index c660b2c..9696fee 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1371,6 +1371,15 @@ DEFTREECODE (VEC_UNPACK_LO_EXPR, "vec_unpack_lo_expr", tcc_unary, 1)
 DEFTREECODE (VEC_UNPACK_FLOAT_HI_EXPR, "vec_unpack_float_hi_expr", tcc_unary, 1)
 DEFTREECODE (VEC_UNPACK_FLOAT_LO_EXPR, "vec_unpack_float_lo_expr", tcc_unary, 1)
 
+/* Unpack (extract) the high/low elements of the input vector, convert
+   floating point values to integer and widen elements into the output
+   vector.  The input vector has twice as many elements as the output
+   vector, that are half the size of the elements of the output vector.  */
+DEFTREECODE (VEC_UNPACK_FIX_TRUNC_HI_EXPR, "vec_unpack_fix_trunc_hi_expr",
+	     tcc_unary, 1)
+DEFTREECODE (VEC_UNPACK_FIX_TRUNC_LO_EXPR, "vec_unpack_fix_trunc_lo_expr",
+	     tcc_unary, 1)
+
 /* Pack (demote/narrow and merge) the elements of the two input vectors
    into the output vector using truncation/saturation.
    The elements of the input vectors are twice the size of the elements of the
@@ -1384,6 +1393,12 @@ DEFTREECODE (VEC_PACK_SAT_EXPR, "vec_pack_sat_expr", tcc_binary, 2)
    the output vector.  */
 DEFTREECODE (VEC_PACK_FIX_TRUNC_EXPR, "vec_pack_fix_trunc_expr", tcc_binary, 2)
 
+/* Convert fixed point values of the two input vectors to floating point
+   and pack (narrow and merge) the elements into the output vector.  The
+   elements of the input vectors are twice the size of the elements of
+   the output vector.  */
+DEFTREECODE (VEC_PACK_FLOAT_EXPR, "vec_pack_float_expr", tcc_binary, 2)
+
 /* Widening vector shift left in bits.
    Operand 0 is a vector to be shifted with N elements of size S.
    Operand 1 is an integer shift amount in bits.
-- 
cgit v1.1