author | Richard Sandiford <richard.sandiford@linaro.org> | 2018-01-02 18:26:47 +0000 |
---|---|---|
committer | Richard Sandiford <rsandifo@gcc.gnu.org> | 2018-01-02 18:26:47 +0000 |
commit | e3342de49cbee48957acc749b9566eee230860be (patch) | |
tree | 32a86a752b83bafed11e1621d738a7fd284a93f7 /gcc/tree-vect-data-refs.c | |
parent | 6da64f1b329f57c07f22ec034bc7bc4b0dc9e87b (diff) | |
Make vec_perm_indices use new vector encoding

This patch changes vec_perm_indices from a plain vec<> to a class
that stores a canonicalized permutation, using the same encoding
as for VECTOR_CSTs. This means that vec_perm_indices now carries
information about the number of vectors being permuted (currently
always 1 or 2) and the number of elements in each input vector.

A new vec_perm_builder class is used to actually build up the vector,
like tree_vector_builder does for trees. vec_perm_indices is the
completed representation, a bit like VECTOR_CST is for trees.

The patch just does a mechanical conversion of the code to
vec_perm_builder: a later patch uses explicit encodings where possible.

The point of all this is that it makes the representation suitable
for variable-length vectors. It's no longer necessary for the
underlying vec<>s to store every element explicitly.

In int-vector-builder.h, "using the same encoding as tree and rtx constants"
describes the endpoint -- adding the rtx encoding comes later.

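
As a rough illustration of why such an encoding need not store every element, here is a minimal standalone sketch. It assumes the VECTOR_CST-style scheme of interleaved patterns with one, two or three stored elements each, where a three-element pattern continues as a linear series. The names `encoded_perm`, `elt` and `expand` are hypothetical and are not GCC's API.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified model of a pattern-based vector encoding.
// It is a sketch for illustration, not the GCC implementation.
struct encoded_perm
{
  unsigned npatterns;          // number of interleaved patterns
  unsigned nelts_per_pattern;  // stored elements per pattern: 1, 2 or 3
  std::vector<long> encoded;   // the first npatterns * nelts_per_pattern elements

  // Return element I of the full vector, extrapolating past the stored part.
  long elt (unsigned i) const
  {
    assert (nelts_per_pattern >= 1 && nelts_per_pattern <= 3);
    assert (encoded.size () == npatterns * nelts_per_pattern);
    if (i < encoded.size ())
      return encoded[i];  // stored explicitly
    unsigned pattern = i % npatterns;  // which pattern the element belongs to
    unsigned count = i / npatterns;    // position within that pattern
    long last = encoded[(nelts_per_pattern - 1) * npatterns + pattern];
    if (nelts_per_pattern < 3)
      return last;  // the final stored element repeats
    // Three stored elements: the second and third define a linear series.
    long step = last - encoded[npatterns + pattern];
    return last + static_cast<long> (count - 2) * step;
  }
};

// Expand the first NELTS elements into an explicit vector.
std::vector<long> expand (const encoded_perm &p, unsigned nelts)
{
  std::vector<long> out;
  for (unsigned i = 0; i < nelts; ++i)
    out.push_back (p.elt (i));
  return out;
}
```

Under these assumptions, the interleave selector { 0, 8, 1, 9, 2, 10, 3, 11 } needs only six stored values (two patterns of three elements), and the same six values would describe a longer selector with the same shape. The `(nelt, nelt, 1)` arguments used throughout this patch correspond to the fully explicit case: nelt patterns of one element each.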
2018-01-02 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* int-vector-builder.h: New file.
* vec-perm-indices.h: Include int-vector-builder.h.
(vec_perm_indices): Redefine as an int_vector_builder.
(auto_vec_perm_indices): Delete.
(vec_perm_builder): Redefine as a stand-alone class.
(vec_perm_indices::vec_perm_indices): New function.
(vec_perm_indices::clamp): Likewise.
* vec-perm-indices.c: Include fold-const.h and tree-vector-builder.h.
(vec_perm_indices::new_vector): New function.
(vec_perm_indices::new_expanded_vector): Update for new
vec_perm_indices class.
(vec_perm_indices::rotate_inputs): New function.
(vec_perm_indices::all_in_range_p): Operate directly on the
encoded form, without computing elided elements.
(tree_to_vec_perm_builder): Operate directly on the VECTOR_CST
encoding. Update for new vec_perm_indices class.
* optabs.c (expand_vec_perm_const): Create a vec_perm_indices for
the given vec_perm_builder.
(expand_vec_perm_var): Update vec_perm_builder constructor.
(expand_mult_highpart): Use vec_perm_builder instead of
auto_vec_perm_indices.
* optabs-query.c (can_mult_highpart_p): Use vec_perm_builder and
vec_perm_indices instead of auto_vec_perm_indices. Use a single
or double series encoding as appropriate.
* fold-const.c (fold_ternary_loc): Use vec_perm_builder and
vec_perm_indices instead of auto_vec_perm_indices.
* tree-ssa-forwprop.c (simplify_vector_constructor): Likewise.
* tree-vect-data-refs.c (vect_grouped_store_supported): Likewise.
(vect_permute_store_chain): Likewise.
(vect_grouped_load_supported): Likewise.
(vect_permute_load_chain): Likewise.
(vect_shift_permute_load_chain): Likewise.
* tree-vect-slp.c (vect_build_slp_tree_1): Likewise.
(vect_transform_slp_perm_load): Likewise.
(vect_schedule_slp_instance): Likewise.
* tree-vect-stmts.c (perm_mask_for_reverse): Likewise.
(vectorizable_mask_load_store): Likewise.
(vectorizable_bswap): Likewise.
(vectorizable_store): Likewise.
(vectorizable_load): Likewise.
* tree-vect-generic.c (lower_vec_perm): Use vec_perm_builder and
vec_perm_indices instead of auto_vec_perm_indices. Use
tree_to_vec_perm_builder to read the vector from a tree.
* tree-vect-loop.c (calc_vec_perm_mask_for_shift): Take a
vec_perm_builder instead of a vec_perm_indices.
(have_whole_vector_shift): Use vec_perm_builder and
vec_perm_indices instead of auto_vec_perm_indices. Leave the
truncation to calc_vec_perm_mask_for_shift.
(vect_create_epilog_for_reduction): Likewise.
* config/aarch64/aarch64.c (expand_vec_perm_d::perm): Change
from auto_vec_perm_indices to vec_perm_indices.
(aarch64_expand_vec_perm_const_1): Use rotate_inputs on d.perm
instead of changing individual elements.
(aarch64_vectorize_vec_perm_const): Use new_vector to install
the vector in d.perm.
* config/arm/arm.c (expand_vec_perm_d::perm): Change
from auto_vec_perm_indices to vec_perm_indices.
(arm_expand_vec_perm_const_1): Use rotate_inputs on d.perm
instead of changing individual elements.
(arm_vectorize_vec_perm_const): Use new_vector to install
the vector in d.perm.
* config/powerpcspe/powerpcspe.c (rs6000_expand_extract_even):
Update vec_perm_builder constructor.
(rs6000_expand_interleave): Likewise.
* config/rs6000/rs6000.c (rs6000_expand_extract_even): Likewise.
(rs6000_expand_interleave): Likewise.
From-SVN: r256095
Diffstat (limited to 'gcc/tree-vect-data-refs.c')
-rw-r--r-- | gcc/tree-vect-data-refs.c | 109 |
1 files changed, 69 insertions, 40 deletions
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 367b085..48673d1 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -4579,7 +4579,7 @@ vect_grouped_store_supported (tree vectype, unsigned HOST_WIDE_INT count)
   if (VECTOR_MODE_P (mode))
     {
       unsigned int i, nelt = GET_MODE_NUNITS (mode);
-      auto_vec_perm_indices sel (nelt);
+      vec_perm_builder sel (nelt, nelt, 1);
       sel.quick_grow (nelt);

       if (count == 3)
@@ -4587,6 +4587,7 @@ vect_grouped_store_supported (tree vectype, unsigned HOST_WIDE_INT count)
           unsigned int j0 = 0, j1 = 0, j2 = 0;
           unsigned int i, j;

+          vec_perm_indices indices;
           for (j = 0; j < 3; j++)
             {
               int nelt0 = ((3 - j) * nelt) % 3;
@@ -4601,7 +4602,8 @@ vect_grouped_store_supported (tree vectype, unsigned HOST_WIDE_INT count)
                   if (3 * i + nelt2 < nelt)
                     sel[3 * i + nelt2] = 0;
                 }
-              if (!can_vec_perm_const_p (mode, sel))
+              indices.new_vector (sel, 2, nelt);
+              if (!can_vec_perm_const_p (mode, indices))
                 {
                   if (dump_enabled_p ())
                     dump_printf (MSG_MISSED_OPTIMIZATION,
@@ -4618,7 +4620,8 @@ vect_grouped_store_supported (tree vectype, unsigned HOST_WIDE_INT count)
                   if (3 * i + nelt2 < nelt)
                     sel[3 * i + nelt2] = nelt + j2++;
                 }
-              if (!can_vec_perm_const_p (mode, sel))
+              indices.new_vector (sel, 2, nelt);
+              if (!can_vec_perm_const_p (mode, indices))
                 {
                   if (dump_enabled_p ())
                     dump_printf (MSG_MISSED_OPTIMIZATION,
@@ -4638,11 +4641,13 @@ vect_grouped_store_supported (tree vectype, unsigned HOST_WIDE_INT count)
               sel[i * 2] = i;
               sel[i * 2 + 1] = i + nelt;
             }
-          if (can_vec_perm_const_p (mode, sel))
+          vec_perm_indices indices (sel, 2, nelt);
+          if (can_vec_perm_const_p (mode, indices))
             {
               for (i = 0; i < nelt; i++)
                 sel[i] += nelt / 2;
-              if (can_vec_perm_const_p (mode, sel))
+              indices.new_vector (sel, 2, nelt);
+              if (can_vec_perm_const_p (mode, indices))
                 return true;
             }
         }
@@ -4744,7 +4749,7 @@ vect_permute_store_chain (vec<tree> dr_chain,
   unsigned int i, n, log_length = exact_log2 (length);
   unsigned int j, nelt = TYPE_VECTOR_SUBPARTS (vectype);

-  auto_vec_perm_indices sel (nelt);
+  vec_perm_builder sel (nelt, nelt, 1);
   sel.quick_grow (nelt);

   result_chain->quick_grow (length);
@@ -4755,6 +4760,7 @@ vect_permute_store_chain (vec<tree> dr_chain,
     {
       unsigned int j0 = 0, j1 = 0, j2 = 0;

+      vec_perm_indices indices;
       for (j = 0; j < 3; j++)
         {
           int nelt0 = ((3 - j) * nelt) % 3;
@@ -4770,7 +4776,8 @@ vect_permute_store_chain (vec<tree> dr_chain,
               if (3 * i + nelt2 < nelt)
                 sel[3 * i + nelt2] = 0;
             }
-          perm3_mask_low = vect_gen_perm_mask_checked (vectype, sel);
+          indices.new_vector (sel, 2, nelt);
+          perm3_mask_low = vect_gen_perm_mask_checked (vectype, indices);

           for (i = 0; i < nelt; i++)
             {
@@ -4781,7 +4788,8 @@ vect_permute_store_chain (vec<tree> dr_chain,
               if (3 * i + nelt2 < nelt)
                 sel[3 * i + nelt2] = nelt + j2++;
             }
-          perm3_mask_high = vect_gen_perm_mask_checked (vectype, sel);
+          indices.new_vector (sel, 2, nelt);
+          perm3_mask_high = vect_gen_perm_mask_checked (vectype, indices);

           vect1 = dr_chain[0];
           vect2 = dr_chain[1];
@@ -4818,11 +4826,13 @@ vect_permute_store_chain (vec<tree> dr_chain,
           sel[i * 2] = i;
           sel[i * 2 + 1] = i + nelt;
         }
-        perm_mask_high = vect_gen_perm_mask_checked (vectype, sel);
+        vec_perm_indices indices (sel, 2, nelt);
+        perm_mask_high = vect_gen_perm_mask_checked (vectype, indices);

         for (i = 0; i < nelt; i++)
           sel[i] += nelt / 2;
-        perm_mask_low = vect_gen_perm_mask_checked (vectype, sel);
+        indices.new_vector (sel, 2, nelt);
+        perm_mask_low = vect_gen_perm_mask_checked (vectype, indices);

         for (i = 0, n = log_length; i < n; i++)
           {
@@ -5167,11 +5177,12 @@ vect_grouped_load_supported (tree vectype, bool single_element_p,
   if (VECTOR_MODE_P (mode))
     {
       unsigned int i, j, nelt = GET_MODE_NUNITS (mode);
-      auto_vec_perm_indices sel (nelt);
+      vec_perm_builder sel (nelt, nelt, 1);
       sel.quick_grow (nelt);

       if (count == 3)
         {
+          vec_perm_indices indices;
           unsigned int k;
           for (k = 0; k < 3; k++)
             {
@@ -5180,7 +5191,8 @@ vect_grouped_load_supported (tree vectype, bool single_element_p,
                   sel[i] = 3 * i + k;
                 else
                   sel[i] = 0;
-              if (!can_vec_perm_const_p (mode, sel))
+              indices.new_vector (sel, 2, nelt);
+              if (!can_vec_perm_const_p (mode, indices))
                 {
                   if (dump_enabled_p ())
                     dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -5193,7 +5205,8 @@ vect_grouped_load_supported (tree vectype, bool single_element_p,
                   sel[i] = i;
                 else
                   sel[i] = nelt + ((nelt + k) % 3) + 3 * (j++);
-              if (!can_vec_perm_const_p (mode, sel))
+              indices.new_vector (sel, 2, nelt);
+              if (!can_vec_perm_const_p (mode, indices))
                 {
                   if (dump_enabled_p ())
                     dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -5208,13 +5221,16 @@ vect_grouped_load_supported (tree vectype, bool single_element_p,
         {
           /* If length is not equal to 3 then only power of 2 is supported. */
           gcc_assert (pow2p_hwi (count));
+
           for (i = 0; i < nelt; i++)
             sel[i] = i * 2;
-          if (can_vec_perm_const_p (mode, sel))
+          vec_perm_indices indices (sel, 2, nelt);
+          if (can_vec_perm_const_p (mode, indices))
             {
               for (i = 0; i < nelt; i++)
                 sel[i] = i * 2 + 1;
-              if (can_vec_perm_const_p (mode, sel))
+              indices.new_vector (sel, 2, nelt);
+              if (can_vec_perm_const_p (mode, indices))
                 return true;
             }
         }
@@ -5329,7 +5345,7 @@ vect_permute_load_chain (vec<tree> dr_chain,
   unsigned int i, j, log_length = exact_log2 (length);
   unsigned nelt = TYPE_VECTOR_SUBPARTS (vectype);

-  auto_vec_perm_indices sel (nelt);
+  vec_perm_builder sel (nelt, nelt, 1);
   sel.quick_grow (nelt);

   result_chain->quick_grow (length);
@@ -5340,6 +5356,7 @@ vect_permute_load_chain (vec<tree> dr_chain,
     {
       unsigned int k;

+      vec_perm_indices indices;
       for (k = 0; k < 3; k++)
         {
           for (i = 0; i < nelt; i++)
@@ -5347,15 +5364,16 @@ vect_permute_load_chain (vec<tree> dr_chain,
               sel[i] = 3 * i + k;
             else
               sel[i] = 0;
-          perm3_mask_low = vect_gen_perm_mask_checked (vectype, sel);
+          indices.new_vector (sel, 2, nelt);
+          perm3_mask_low = vect_gen_perm_mask_checked (vectype, indices);

           for (i = 0, j = 0; i < nelt; i++)
             if (3 * i + k < 2 * nelt)
               sel[i] = i;
             else
               sel[i] = nelt + ((nelt + k) % 3) + 3 * (j++);
-
-          perm3_mask_high = vect_gen_perm_mask_checked (vectype, sel);
+          indices.new_vector (sel, 2, nelt);
+          perm3_mask_high = vect_gen_perm_mask_checked (vectype, indices);

           first_vect = dr_chain[0];
           second_vect = dr_chain[1];
@@ -5387,11 +5405,13 @@ vect_permute_load_chain (vec<tree> dr_chain,

       for (i = 0; i < nelt; ++i)
         sel[i] = i * 2;
-      perm_mask_even = vect_gen_perm_mask_checked (vectype, sel);
+      vec_perm_indices indices (sel, 2, nelt);
+      perm_mask_even = vect_gen_perm_mask_checked (vectype, indices);

       for (i = 0; i < nelt; ++i)
         sel[i] = i * 2 + 1;
-      perm_mask_odd = vect_gen_perm_mask_checked (vectype, sel);
+      indices.new_vector (sel, 2, nelt);
+      perm_mask_odd = vect_gen_perm_mask_checked (vectype, indices);

       for (i = 0; i < log_length; i++)
         {
@@ -5527,7 +5547,7 @@ vect_shift_permute_load_chain (vec<tree> dr_chain,
   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);

-  auto_vec_perm_indices sel (nelt);
+  vec_perm_builder sel (nelt, nelt, 1);
   sel.quick_grow (nelt);

   result_chain->quick_grow (length);
@@ -5541,7 +5561,8 @@ vect_shift_permute_load_chain (vec<tree> dr_chain,
         sel[i] = i * 2;
       for (i = 0; i < nelt / 2; ++i)
         sel[nelt / 2 + i] = i * 2 + 1;
-      if (!can_vec_perm_const_p (TYPE_MODE (vectype), sel))
+      vec_perm_indices indices (sel, 2, nelt);
+      if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices))
         {
           if (dump_enabled_p ())
             dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -5549,13 +5570,14 @@ vect_shift_permute_load_chain (vec<tree> dr_chain,
                               supported by target\n");
           return false;
         }
-      perm2_mask1 = vect_gen_perm_mask_checked (vectype, sel);
+      perm2_mask1 = vect_gen_perm_mask_checked (vectype, indices);

       for (i = 0; i < nelt / 2; ++i)
         sel[i] = i * 2 + 1;
       for (i = 0; i < nelt / 2; ++i)
         sel[nelt / 2 + i] = i * 2;
-      if (!can_vec_perm_const_p (TYPE_MODE (vectype), sel))
+      indices.new_vector (sel, 2, nelt);
+      if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices))
         {
           if (dump_enabled_p ())
             dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -5563,20 +5585,21 @@ vect_shift_permute_load_chain (vec<tree> dr_chain,
                               supported by target\n");
           return false;
         }
-      perm2_mask2 = vect_gen_perm_mask_checked (vectype, sel);
+      perm2_mask2 = vect_gen_perm_mask_checked (vectype, indices);

       /* Generating permutation constant to shift all elements.
          For vector length 8 it is {4 5 6 7 8 9 10 11}. */
       for (i = 0; i < nelt; i++)
         sel[i] = nelt / 2 + i;
-      if (!can_vec_perm_const_p (TYPE_MODE (vectype), sel))
+      indices.new_vector (sel, 2, nelt);
+      if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices))
         {
           if (dump_enabled_p ())
             dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                              "shift permutation is not supported by target\n");
           return false;
         }
-      shift1_mask = vect_gen_perm_mask_checked (vectype, sel);
+      shift1_mask = vect_gen_perm_mask_checked (vectype, indices);

       /* Generating permutation constant to select vector from 2.
          For vector length 8 it is {0 1 2 3 12 13 14 15}. */
@@ -5584,14 +5607,15 @@ vect_shift_permute_load_chain (vec<tree> dr_chain,
         sel[i] = i;
       for (i = nelt / 2; i < nelt; i++)
         sel[i] = nelt + i;
-      if (!can_vec_perm_const_p (TYPE_MODE (vectype), sel))
+      indices.new_vector (sel, 2, nelt);
+      if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices))
         {
           if (dump_enabled_p ())
             dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                              "select is not supported by target\n");
           return false;
         }
-      select_mask = vect_gen_perm_mask_checked (vectype, sel);
+      select_mask = vect_gen_perm_mask_checked (vectype, indices);

       for (i = 0; i < log_length; i++)
         {
@@ -5647,7 +5671,8 @@ vect_shift_permute_load_chain (vec<tree> dr_chain,
           sel[i] = 3 * k + (l % 3);
           k++;
         }
-      if (!can_vec_perm_const_p (TYPE_MODE (vectype), sel))
+      vec_perm_indices indices (sel, 2, nelt);
+      if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices))
         {
           if (dump_enabled_p ())
             dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -5655,59 +5680,63 @@ vect_shift_permute_load_chain (vec<tree> dr_chain,
                               supported by target\n");
           return false;
         }
-      perm3_mask = vect_gen_perm_mask_checked (vectype, sel);
+      perm3_mask = vect_gen_perm_mask_checked (vectype, indices);

       /* Generating permutation constant to shift all elements.
          For vector length 8 it is {6 7 8 9 10 11 12 13}. */
       for (i = 0; i < nelt; i++)
         sel[i] = 2 * (nelt / 3) + (nelt % 3) + i;
-      if (!can_vec_perm_const_p (TYPE_MODE (vectype), sel))
+      indices.new_vector (sel, 2, nelt);
+      if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices))
         {
           if (dump_enabled_p ())
             dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                              "shift permutation is not supported by target\n");
           return false;
         }
-      shift1_mask = vect_gen_perm_mask_checked (vectype, sel);
+      shift1_mask = vect_gen_perm_mask_checked (vectype, indices);

       /* Generating permutation constant to shift all elements.
          For vector length 8 it is {5 6 7 8 9 10 11 12}. */
       for (i = 0; i < nelt; i++)
         sel[i] = 2 * (nelt / 3) + 1 + i;
-      if (!can_vec_perm_const_p (TYPE_MODE (vectype), sel))
+      indices.new_vector (sel, 2, nelt);
+      if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices))
         {
           if (dump_enabled_p ())
             dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                              "shift permutation is not supported by target\n");
           return false;
         }
-      shift2_mask = vect_gen_perm_mask_checked (vectype, sel);
+      shift2_mask = vect_gen_perm_mask_checked (vectype, indices);

       /* Generating permutation constant to shift all elements.
          For vector length 8 it is {3 4 5 6 7 8 9 10}. */
       for (i = 0; i < nelt; i++)
         sel[i] = (nelt / 3) + (nelt % 3) / 2 + i;
-      if (!can_vec_perm_const_p (TYPE_MODE (vectype), sel))
+      indices.new_vector (sel, 2, nelt);
+      if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices))
         {
           if (dump_enabled_p ())
             dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                              "shift permutation is not supported by target\n");
           return false;
         }
-      shift3_mask = vect_gen_perm_mask_checked (vectype, sel);
+      shift3_mask = vect_gen_perm_mask_checked (vectype, indices);

       /* Generating permutation constant to shift all elements.
          For vector length 8 it is {5 6 7 8 9 10 11 12}. */
       for (i = 0; i < nelt; i++)
         sel[i] = 2 * (nelt / 3) + (nelt % 3) / 2 + i;
-      if (!can_vec_perm_const_p (TYPE_MODE (vectype), sel))
+      indices.new_vector (sel, 2, nelt);
+      if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices))
         {
           if (dump_enabled_p ())
             dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                              "shift permutation is not supported by target\n");
           return false;
         }
-      shift4_mask = vect_gen_perm_mask_checked (vectype, sel);
+      shift4_mask = vect_gen_perm_mask_checked (vectype, indices);

       for (k = 0; k < 3; k++)
         {
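
The selector loops touched by this diff can be checked outside GCC. The sketch below copies the index arithmetic from the power-of-two branches into plain functions over `std::vector`; the `selector` type and the function names are hypothetical stand-ins, and nothing here calls real GCC interfaces such as `vec_perm_builder` or `can_vec_perm_const_p`. Indices below nelt select from the first input, indices from nelt upwards from the second, matching the two-input `(sel, 2, nelt)` form used in the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a fully explicit selector: one index per
// output element, in the range [0, 2 * nelt).
using selector = std::vector<unsigned>;

// Store chain, first mask: sel[i * 2] = i; sel[i * 2 + 1] = i + nelt.
selector interleave_first_half (unsigned nelt)
{
  selector sel (nelt);
  for (unsigned i = 0; i < nelt / 2; i++)
    {
      sel[i * 2] = i;
      sel[i * 2 + 1] = i + nelt;
    }
  return sel;
}

// Store chain, second mask: every index advanced by nelt / 2.
selector interleave_second_half (unsigned nelt)
{
  selector sel = interleave_first_half (nelt);
  for (unsigned i = 0; i < nelt; i++)
    sel[i] += nelt / 2;
  return sel;
}

// Load chain: PARITY 0 extracts even elements, PARITY 1 odd elements.
selector extract_even_odd (unsigned nelt, unsigned parity)
{
  assert (parity <= 1);
  selector sel (nelt);
  for (unsigned i = 0; i < nelt; i++)
    sel[i] = i * 2 + parity;
  return sel;
}

// Shift chain: move all elements along by half a vector.
selector shift_half (unsigned nelt)
{
  selector sel (nelt);
  for (unsigned i = 0; i < nelt; i++)
    sel[i] = nelt / 2 + i;
  return sel;
}

// Shift chain: low half of the first input, high half of the second.
selector select_halves (unsigned nelt)
{
  selector sel (nelt);
  for (unsigned i = 0; i < nelt / 2; i++)
    sel[i] = i;
  for (unsigned i = nelt / 2; i < nelt; i++)
    sel[i] = nelt + i;
  return sel;
}
```

For nelt equal to 8, `shift_half` and `select_halves` give the sequences quoted in the source comments, {4 5 6 7 8 9 10 11} and {0 1 2 3 12 13 14 15}.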