author | Jakub Jelinek <jakub@redhat.com> | 2011-12-23 10:38:03 -0800 |
---|---|---|
committer | Richard Henderson <rth@gcc.gnu.org> | 2011-12-23 10:38:03 -0800 |
commit | 3fcc1b5520fbb2ce2b55a5bd825924a55e044b79 (patch) | |
tree | 896a83936f868d1ebb9685fb7093757d116cbe8b /gcc/config | |
parent | 7dab511cf3331378aaafdeb7676835c0cdb194fa (diff) | |
Delete VEC_INTERLEAVE_*_EXPR.
* tree.def (VEC_INTERLEAVE_HIGH_EXPR, VEC_INTERLEAVE_LOW_EXPR): Remove.
* gimple-pretty-print.c (dump_binary_rhs): Don't handle
VEC_INTERLEAVE_HIGH_EXPR and VEC_INTERLEAVE_LOW_EXPR.
* expr.c (expand_expr_real_2): Likewise.
* tree-cfg.c (verify_gimple_assign_binary): Likewise.
* cfgexpand.c (expand_debug_expr): Likewise.
* tree-inline.c (estimate_operator_cost): Likewise.
* tree-pretty-print.c (dump_generic_node): Likewise.
* tree-vect-generic.c (expand_vector_operations_1): Likewise.
* fold-const.c (fold_binary_loc): Likewise.
* doc/generic.texi (VEC_INTERLEAVE_HIGH_EXPR,
VEC_INTERLEAVE_LOW_EXPR): Remove documentation.
* optabs.c (optab_for_tree_code): Don't handle
VEC_INTERLEAVE_HIGH_EXPR and VEC_INTERLEAVE_LOW_EXPR.
(expand_binop, init_optabs): Remove vec_interleave_high_optab
and vec_interleave_low_optab.
* genopinit.c (optabs): Likewise.
* optabs.h (OTI_vec_interleave_high, OTI_vec_interleave_low): Remove.
(vec_interleave_high_optab, vec_interleave_low_optab): Remove.
* doc/md.texi (vec_interleave_high, vec_interleave_low): Remove
documentation.
* tree-vect-stmts.c (gen_perm_mask): Renamed to...
(vect_gen_perm_mask): ... this. No longer static.
(perm_mask_for_reverse, vectorizable_load): Adjust callers.
* tree-vectorizer.h (vect_gen_perm_mask): New prototype.
* tree-vect-data-refs.c (vect_strided_store_supported): Don't try
VEC_INTERLEAVE_*_EXPR, use can_vec_perm_p instead of
can_vec_perm_for_code_p.
(vect_permute_store_chain): Generate VEC_PERM_EXPR with interleaving
masks instead of VEC_INTERLEAVE_HIGH_EXPR and VEC_INTERLEAVE_LOW_EXPR.
* config/i386/i386.c (expand_vec_perm_interleave2): If
expand_vec_perm_interleave3 would handle it, return false.
(expand_vec_perm_broadcast_1): Don't use vec_interleave_*_optab.
From-SVN: r182663
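
The vect_permute_store_chain entry above replaces the dedicated interleave tree codes with VEC_PERM_EXPR plus interleaving masks. As a rough illustration of what such masks look like, here is a minimal sketch in plain C, not the GCC code itself. It assumes the usual two-input selector convention (indices below nelt pick from the first vector, the rest from the second). The helper names `interleave_masks` and `vec_perm` and the "first half"/"second half" labels are hypothetical; which of the two masks the commit calls "high" or "low" is not visible in this excerpt.

```c
#include <assert.h>

/* Hypothetical helper: build the two selectors that interleave the
   first halves and the second halves of two nelt-element vectors.  */
static void
interleave_masks (unsigned nelt, unsigned char *sel_first,
                  unsigned char *sel_second)
{
  unsigned i;

  assert (nelt % 2 == 0);
  for (i = 0; i < nelt / 2; i++)
    {
      sel_first[2 * i] = i;            /* element i of vector A */
      sel_first[2 * i + 1] = i + nelt; /* element i of vector B */
    }
  /* The second-half selector is the same pattern shifted by nelt / 2.  */
  for (i = 0; i < nelt; i++)
    sel_second[i] = sel_first[i] + nelt / 2;
}

/* Hypothetical model of a two-input permutation: selector values
   below nelt index into A, the others index into B.  */
static void
vec_perm (const int *a, const int *b, const unsigned char *sel,
          unsigned nelt, int *out)
{
  unsigned i;

  for (i = 0; i < nelt; i++)
    out[i] = sel[i] < nelt ? a[sel[i]] : b[sel[i] - nelt];
}
```

For nelt = 4 the selectors come out as {0, 4, 1, 5} and {2, 6, 3, 7}, so a single general permutation primitive can express both interleave variants.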
Diffstat (limited to 'gcc/config')
-rw-r--r-- | gcc/config/i386/i386.c | 26 |
1 files changed, 22 insertions, 4 deletions
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index af58f7c..b8e6396 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -36021,6 +36021,8 @@ expand_vec_perm_palignr (struct expand_vec_perm_d *d)
   return ok;
 }
 
+static bool expand_vec_perm_interleave3 (struct expand_vec_perm_d *d);
+
 /* A subroutine of ix86_expand_vec_perm_builtin_1. Try to simplify
    a two vector permutation into a single vector permutation by using
    an interleave operation to merge the vectors. */
@@ -36047,6 +36049,17 @@ expand_vec_perm_interleave2 (struct expand_vec_perm_d *d)
       /* For 32-byte modes allow even d->op0 == d->op1.
          The lack of cross-lane shuffling in some instructions
          might prevent a single insn shuffle. */
+      dfinal = *d;
+      dfinal.testing_p = true;
+      /* If expand_vec_perm_interleave3 can expand this into
+         a 3 insn sequence, give up and let it be expanded as
+         3 insn sequence. While that is one insn longer,
+         it doesn't need a memory operand and in the common
+         case that both interleave low and high permutations
+         with the same operands are adjacent needs 4 insns
+         for both after CSE. */
+      if (expand_vec_perm_interleave3 (&dfinal))
+        return false;
     }
   else
     return false;
@@ -36886,18 +36899,23 @@ expand_vec_perm_broadcast_1 (struct expand_vec_perm_d *d)
          stopping once we have promoted to V4SImode and then use pshufd. */
       do
         {
-          optab otab = vec_interleave_low_optab;
+          rtx dest;
+          rtx (*gen) (rtx, rtx, rtx)
+            = vmode == V16QImode ? gen_vec_interleave_lowv16qi
+                                 : gen_vec_interleave_lowv8hi;
 
           if (elt >= nelt2)
             {
-              otab = vec_interleave_high_optab;
+              gen = vmode == V16QImode ? gen_vec_interleave_highv16qi
+                                       : gen_vec_interleave_highv8hi;
               elt -= nelt2;
             }
           nelt2 /= 2;
 
-          op0 = expand_binop (vmode, otab, op0, op0, NULL, 0, OPTAB_DIRECT);
+          dest = gen_reg_rtx (vmode);
+          emit_insn (gen (dest, op0, op0));
           vmode = get_mode_wider_vector (vmode);
-          op0 = gen_lowpart (vmode, dest);
+          op0 = gen_lowpart (vmode, dest);
         }
       while (vmode != V4SImode);
 
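
The last hunk keeps the existing broadcast strategy: interleave the vector with itself, reinterpret the result with elements twice as wide, and stop at 32-bit elements so that a final pshufd-style shuffle can replicate the selected element. The following is a minimal scalar model of that loop for the 16-byte, byte-element case only. It is a sketch under stated assumptions, not the back-end code: `interleave_self` and `broadcast_byte` are hypothetical names, the vector is modelled as a plain byte array, and the final replication loop merely stands in for the shuffle instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { VEC_BYTES = 16 };

/* Hypothetical model of interleaving a vector with itself: treat V as
   elements of ESIZE bytes and duplicate each element of the low
   (HIGH == 0) or high (HIGH != 0) half.  */
static void
interleave_self (uint8_t v[VEC_BYTES], size_t esize, int high)
{
  uint8_t out[VEC_BYTES];
  size_t nelt = VEC_BYTES / esize;
  size_t base = high ? nelt / 2 : 0;
  size_t i;

  for (i = 0; i < nelt / 2; i++)
    {
      const uint8_t *src = v + (base + i) * esize;
      memcpy (out + (2 * i) * esize, src, esize);
      memcpy (out + (2 * i + 1) * esize, src, esize);
    }
  memcpy (v, out, VEC_BYTES);
}

/* Broadcast byte ELT of V to every byte position.  */
static void
broadcast_byte (uint8_t v[VEC_BYTES], unsigned elt)
{
  size_t esize = 1;
  unsigned nelt2 = VEC_BYTES / 2;
  uint8_t word[4];
  size_t i;

  assert (elt < VEC_BYTES);
  do
    {
      int high = elt >= nelt2;
      if (high)
        elt -= nelt2;       /* index within the chosen half */
      nelt2 /= 2;
      interleave_self (v, esize, high);
      esize *= 2;           /* view the result with wider elements */
    }
  while (esize != 4);

  /* Stand-in for the final shuffle: replicate 32-bit element ELT.  */
  memcpy (word, v + elt * 4, 4);
  for (i = 0; i < VEC_BYTES / 4; i++)
    memcpy (v + i * 4, word, 4);
}
```

Each interleave doubles the number of copies of the wanted byte within one wider element, so after two rounds a single 32-bit element holds four copies and one more shuffle finishes the broadcast.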