Age | Commit message | Author | Files | Lines |
|
This makes sure to use the correct type for the LHS of the scalar
replacement statement.
2020-11-30 Richard Biener <rguenther@suse.de>
PR tree-optimization/98048
* tree-vect-generic.c (expand_vector_operations_1): Use the
correct type for the scalar LHS replacement.
* gcc.dg/vect/pr98048.c: New testcase.
|
|
Add widening add, subtract patterns to tree-vect-patterns. Update the
widened code of patterns that detect PLUS_EXPR to also detect
WIDEN_PLUS_EXPR. These patterns take 2 vectors with N elements of size
S and perform an add/subtract on the elements, storing the results as N
elements of size 2*S (in 2 result vectors). This is implemented in the
aarch64 backend as addl,addl2 and subl,subl2 respectively. Add aarch64
tests for patterns.
gcc/ChangeLog:
* doc/generic.texi: Document new widen_plus/minus_lo/hi tree codes.
* doc/md.texi: Document new widening add/subtract hi/lo optabs.
* expr.c (expand_expr_real_2): Add widen_add, widen_subtract cases.
* optabs-tree.c (optab_for_tree_code): Add case for widening optabs.
* optabs.def (OPTAB_D): Define vectorized widen add, subtracts.
* tree-cfg.c (verify_gimple_assign_binary): Add case for widening adds,
subtracts.
* tree-inline.c (estimate_operator_cost): Add case for widening adds,
subtracts.
* tree-vect-generic.c (expand_vector_operations_1): Add case for
widening adds, subtracts.
* tree-vect-patterns.c (vect_recog_widen_add_pattern): New recog
pattern.
(vect_recog_widen_sub_pattern): New recog pattern.
(vect_recog_average_pattern): Update widened add code.
* tree-vect-stmts.c (vectorizable_conversion): Add case for widened add,
subtract.
(supportable_widening_operation): Add case for widened add, subtract.
* tree.def
(WIDEN_PLUS_EXPR): New tree code.
(WIDEN_MINUS_EXPR): New tree code.
(VEC_WIDEN_PLUS_HI_EXPR): New tree code.
(VEC_WIDEN_PLUS_LO_EXPR): New tree code.
(VEC_WIDEN_MINUS_HI_EXPR): New tree code.
(VEC_WIDEN_MINUS_LO_EXPR): New tree code.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/vect-widen-add.c: New test.
* gcc.target/aarch64/vect-widen-sub.c: New test.
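The semantics described in this commit message can be illustrated with a scalar C model. This is only a sketch: the function name `widen_add_u8` is hypothetical, the element size is fixed at 8 bits for illustration, and the assignment of input elements to the first and second result vector is an assumption (in GCC the lo/hi split depends on the target and its endianness).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a widening add: two inputs of N 8-bit elements give
   N 16-bit sums, stored as two result vectors of N/2 elements each.
   Assumption: FIRST receives the sums of elements 0..N/2-1 and SECOND
   the sums of elements N/2..N-1.  */
static void
widen_add_u8 (const uint8_t *a, const uint8_t *b, size_t n,
              uint16_t *first, uint16_t *second)
{
  assert (n % 2 == 0);
  for (size_t i = 0; i < n / 2; i++)
    {
      /* Widen each operand before adding so the sum cannot wrap at
         8 bits.  */
      first[i] = (uint16_t) ((uint16_t) a[i] + (uint16_t) b[i]);
      second[i] = (uint16_t) ((uint16_t) a[i + n / 2]
                              + (uint16_t) b[i + n / 2]);
    }
}
```

A widening subtract follows the same shape with `-` in place of `+`.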
|
|
This improves the situation somewhat when vector lowering tries
to access vector bools as seen in PR96814.
2020-09-03 Richard Biener <rguenther@suse.de>
* tree-vect-generic.c (tree_vec_extract): Remove odd
special-casing of boolean vectors.
* fold-const.c (fold_ternary_loc): Handle boolean vector
type BIT_FIELD_REFs.
|
|
.VEC_CONVERT is a const internal call, so normally if the lhs is not used,
we'd DCE it far before getting to veclower, but with -O0 (or perhaps
-fno-tree-dce and some other -fno-* options) it can happen.
But as the internal fn needs the lhs to know the type to which the
conversion is done (and I think that is a reasonable representation, having
some magic additional argument and having to create constants with that type
looks overkill to me), we just should DCE those calls ourselves.
During veclower, we can't really remove insns, as the callers would be
upset, so this just replaces it with a GIMPLE_NOP.
2020-08-04 Jakub Jelinek <jakub@redhat.com>
PR middle-end/96426
* tree-vect-generic.c (expand_vector_conversion): Replace .VEC_CONVERT
call with GIMPLE_NOP if there is no lhs.
* gcc.c-torture/compile/pr96426.c: New test.
|
|
gcc/ChangeLog:
PR tree-optimization/96128
* tree-vect-generic.c (expand_vector_comparison): Do not expand
vector comparison with VEC_COND_EXPR.
gcc/testsuite/ChangeLog:
PR tree-optimization/96128
* gcc.target/s390/vector/pr96128.c: New test.
|
|
gcc/ChangeLog:
PR middle-end/95830
* tree-vect-generic.c (expand_vector_condition): Forward declaration.
(expand_vector_comparison): Do not expand a comparison if all
uses are consumed by a VEC_COND_EXPR.
(expand_vector_operation): Change void return type to bool.
(expand_vector_operations_1): Pass dce_ssa_names.
|
|
gcc/ChangeLog:
PR tree-optimization/95745
PR middle-end/95830
* gimple-isel.cc (gimple_expand_vec_cond_exprs): Delete dead
SSA_NAMEs used as the first argument of a VEC_COND_EXPR. Always
return 0.
* tree-vect-generic.c (expand_vector_condition): Remove dead
SSA_NAMEs used as the first argument of a VEC_COND_EXPR.
|
|
gcc/ChangeLog:
* tree-vect-generic.c (expand_vector_condition): Check
for gassign before inspecting RHS.
|
|
gcc/ChangeLog:
* Makefile.in: Add new file.
* expr.c (expand_expr_real_2): Add gcc_unreachable as we should
not meet this condition.
(do_store_flag): Likewise.
* gimplify.c (gimplify_expr): Gimplify first argument of
VEC_COND_EXPR to be a SSA name.
* internal-fn.c (vec_cond_mask_direct): New.
(vec_cond_direct): Likewise.
(vec_condu_direct): Likewise.
(vec_condeq_direct): Likewise.
(expand_vect_cond_optab_fn): New.
(expand_vec_cond_optab_fn): Likewise.
(expand_vec_condu_optab_fn): Likewise.
(expand_vec_condeq_optab_fn): Likewise.
(expand_vect_cond_mask_optab_fn): Likewise.
(expand_vec_cond_mask_optab_fn): Likewise.
(direct_vec_cond_mask_optab_supported_p): Likewise.
(direct_vec_cond_optab_supported_p): Likewise.
(direct_vec_condu_optab_supported_p): Likewise.
(direct_vec_condeq_optab_supported_p): Likewise.
* internal-fn.def (VCOND): New OPTAB.
(VCONDU): Likewise.
(VCONDEQ): Likewise.
(VCOND_MASK): Likewise.
* optabs.c (get_rtx_code): Make it global.
(expand_vec_cond_mask_expr): Removed.
(expand_vec_cond_expr): Removed.
* optabs.h (expand_vec_cond_expr): Likewise.
(vector_compare_rtx): Make it global.
* passes.def: Add new pass_gimple_isel pass.
* tree-cfg.c (verify_gimple_assign_ternary): Add check
for VEC_COND_EXPR about first argument.
* tree-pass.h (make_pass_gimple_isel): New.
* tree-ssa-forwprop.c (pass_forwprop::execute): Prevent
propagation of the first argument of a VEC_COND_EXPR.
* tree-ssa-reassoc.c (ovce_extract_ops): Support SSA_NAME as
first argument of a VEC_COND_EXPR.
(optimize_vec_cond_expr): Likewise.
* tree-vect-generic.c (expand_vector_divmod): Make SSA_NAME
for a first argument of created VEC_COND_EXPR.
(expand_vector_condition): Fix coding style.
* tree-vect-stmts.c (vectorizable_condition): Gimplify
first argument.
* gimple-isel.cc: New file.
gcc/testsuite/ChangeLog:
* g++.dg/vect/vec-cond-expr-eh.C: New test.
|
|
This third patch of three actually fixes the PR. We were using
8-bit BIT_FIELD_REFs to access single-bit elements, and multiplying
the vector index by 8 bits rather than 1 bit.
2020-05-12 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/94980
* tree-vect-generic.c (expand_vector_comparison): Use
vector_element_bits_tree to get the element size in bits,
rather than using TYPE_SIZE.
(expand_vector_condition, vector_element): Likewise.
gcc/testsuite/
PR tree-optimization/94980
* gcc.target/i386/pr94980.c: New test.
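The indexing error described in this commit message can be modelled in plain C. The sketch below is not GCC code: `extract_element` is a hypothetical helper, and the vector is assumed to fit in a single 64-bit word. It shows that the bit offset of element IDX must be IDX times the true element size, so a 1-bit element scaled by 8 reads the wrong bits.

```c
#include <assert.h>
#include <stdint.h>

/* Extract element IDX from a packed vector held in MASK, where every
   element occupies ELT_BITS bits.  The field starts at bit
   IDX * ELT_BITS and is ELT_BITS wide.  */
static unsigned
extract_element (uint64_t mask, unsigned idx, unsigned elt_bits)
{
  assert (elt_bits >= 1 && elt_bits <= 32);
  assert ((idx + 1) * elt_bits <= 64);
  uint64_t field = (UINT64_C (1) << elt_bits) - 1;
  return (unsigned) ((mask >> (idx * elt_bits)) & field);
}
```

With the mask `0x5` (elements 0 and 2 set), `extract_element (0x5, 2, 1)` reads bit 2, whereas treating the elements as 8 bits wide would read bits 16 to 23.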
|
|
This patch makes build_replicated_const take the number of bits
in VALUE rather than calculating the width from the element type.
The callers can then use vector_element_bits to calculate the
correct element size from the vector type.
2020-05-12 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/94980
* tree-vect-generic.c (build_replicated_const): Take the number
of bits as a parameter, instead of the type of the elements.
(do_plus_minus): Update accordingly, using vector_element_bits
to calculate the correct number of bits.
(do_negate): Likewise.
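To show why the element width in bits is the natural parameter here, the following C sketch replicates a WIDTH-bit value across a 64-bit word and uses the result for a word-parallel addition. It is only loosely modelled on the lowering this commit touches: the names `replicate` and `swar_add`, and the fixed 64-bit word, are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Copy the low WIDTH bits of VALUE into every WIDTH-bit lane of a
   64-bit word.  */
static uint64_t
replicate (unsigned width, uint64_t value)
{
  assert (width >= 1 && width <= 64 && 64 % width == 0);
  uint64_t field = width == 64 ? UINT64_MAX : (UINT64_C (1) << width) - 1;
  uint64_t result = 0;
  for (unsigned off = 0; off < 64; off += width)
    result |= (value & field) << off;
  return result;
}

/* Add the WIDTH-bit lanes of A and B independently.  The low bits of
   each lane are added with the top bit masked off, so no carry can
   cross into the next lane; the top bit is then fixed up with XOR.  */
static uint64_t
swar_add (uint64_t a, uint64_t b, unsigned width)
{
  uint64_t max = width == 64 ? UINT64_MAX : (UINT64_C (1) << width) - 1;
  uint64_t low_bits = replicate (width, max >> 1);
  uint64_t high_bits = replicate (width, max & ~(max >> 1));
  uint64_t sum_low = (a & low_bits) + (b & low_bits);
  return sum_low ^ ((a ^ b) & high_bits);
}
```

Passing the width explicitly means the same helper works for sub-byte lanes, where a width derived from a byte-sized element type would be too large.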
|
|
A lot of code that wants to know the number of bits in a vector
element gets that information from the element's TYPE_SIZE,
which is always equal to TYPE_SIZE_UNIT * BITS_PER_UNIT.
This doesn't work for SVE and AVX512-style packed boolean vectors,
where several elements can occupy a single byte.
This patch introduces a new pair of helpers for getting the true
(possibly sub-byte) size. I made a token attempt to convert obvious
element size calculations, but I'm sure I missed some.
2020-05-12 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/94980
* tree.h (vector_element_bits, vector_element_bits_tree): Declare.
* tree.c (vector_element_bits, vector_element_bits_tree): New.
* match.pd: Use the new functions instead of determining the
vector element size directly from TYPE_SIZE(_UNIT).
* tree-vect-data-refs.c (vect_gather_scatter_fn_p): Likewise.
* tree-vect-patterns.c (vect_recog_mask_conversion_pattern): Likewise.
* tree-vect-stmts.c (vect_is_simple_cond): Likewise.
* tree-vect-generic.c (expand_vector_piecewise): Likewise.
(expand_vector_conversion): Likewise.
(expand_vector_addition): Likewise for a TYPE_SIZE_UNIT used as
a divisor. Convert the dividend to bits to compensate.
* tree-vect-loop.c (vectorizable_live_operation): Call
vector_element_bits instead of open-coding it.
|
|
The first testcase below is miscompiled, because for the division part
of the lowering we canonicalize negative divisors to their absolute value
(similarly to how expmed.c canonicalizes it), but when multiplying the division
result back by the VECTOR_CST, we use the original constant, which can
contain negative divisors.
Fixed by computing ABS_EXPR of the VECTOR_CST. Unfortunately, fold-const.c
doesn't support const_unop (ABS_EXPR, VECTOR_CST) and I think it is too late
in GCC 10 cycle to add it now.
Furthermore, while modulo by most negative constant happens to return the
right value, it does that only by invoking UB in the IL, because
we then expand division by that 1U+INT_MAX and say for INT_MIN % INT_MIN
compute the division as -1, and then multiply by INT_MIN, which is signed
integer overflow. We in theory could do the computation in unsigned vector
types instead, but is it worth bothering? People that are doing % INT_MIN
are either testing for standard conformance, or doing something wrong.
So, I've also added punting on % INT_MIN, both in vect lowering and vect
pattern recognition (we punt already for / INT_MIN).
2020-04-08 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94524
* tree-vect-generic.c (expand_vector_divmod): If any elt of op1 is
negative for signed TRUNC_MOD_EXPR, multiply with absolute value of
op1 rather than op1 itself at the end. Punt for signed modulo by
most negative constant.
* tree-vect-patterns.c (vect_recog_divmod_pattern): Punt for signed
modulo by most negative constant.
* gcc.c-torture/execute/pr94524-1.c: New test.
* gcc.c-torture/execute/pr94524-2.c: New test.
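The arithmetic behind this fix can be checked with a scalar C sketch. Both function names are hypothetical, and the code models only the per-element computation, not the vector lowering itself. For truncating division the remainder takes the sign of the dividend, so `x % d` equals `x % |d|`, and the product must be formed with the absolute value of the divisor.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Remainder computed as in the fixed lowering: divide by |d|, then
   multiply the quotient back by |d|.  */
static int
mod_via_abs (int x, int d)
{
  assert (d != 0 && d != INT_MIN);  /* INT_MIN is the punted case.  */
  int ad = abs (d);
  int q = x / ad;
  return x - q * ad;
}

/* The miscompiled form: the quotient comes from |d| but is multiplied
   back by the original, possibly negative, d.  Only meaningful for
   small inputs, since the wrong product can overflow.  */
static int
mod_buggy (int x, int d)
{
  assert (d != 0 && d != INT_MIN);
  int q = x / abs (d);
  return x - q * d;
}
```

For `x = 7` and `d = -3`, `mod_via_abs` returns 1, matching `7 % -3`, while `mod_buggy` returns 13.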
|
|
From-SVN: r279813
|
|
2019-11-27 Richard Biener <rguenther@suse.de>
* target.def (TARGET_VECTORIZE_BUILTIN_CONVERSION): Remove.
* targhooks.c (default_builtin_vectorized_conversion): Likewise.
* targhooks.h (default_builtin_vectorized_conversion): Likewise.
* optabs-tree.c (supportable_convert_operation): Do not call
targetm.vectorize.builtin_conversion. Remove unused decl parameter.
* optabs-tree.h (supportable_convert_operation): Adjust.
* doc/tm.texi.in (TARGET_VECTORIZE_BUILTIN_CONVERSION): Remove.
* doc/tm.texi: Regenerate.
* tree-ssa-forwprop.c (simplify_vector_constructor): Adjust.
* tree-vect-generic.c (expand_vector_conversion): Likewise.
* tree-vect-stmts.c (vect_gen_widened_results_half): Remove
unused decl parameter and adjust.
(vect_create_vectorized_promotion_stmts): Likewise.
(vectorizable_conversion): Adjust.
From-SVN: r278765
|
|
build_same_sized_truth_vector_type was confusingly named, since for
SVE and AVX512 the returned vector isn't the same byte size (although
it does have the same number of elements). What it really returns
is the "truth" vector type for a given data vector type.
The more general truth_type_for provides the same thing when passed
a vector and IMO has a more descriptive name, so this patch replaces
all uses of build_same_sized_truth_vector_type with that. It does
the same for a call to build_truth_vector_type, leaving truth_type_for
itself as the only remaining caller.
It's then more natural to pass build_truth_vector_type the original
vector type rather than its size and nunits, especially since the
given size isn't the size of the returned vector. This in turn allows
a future patch to simplify the interface of get_mask_mode. Doing this
also fixes a bug in which truth_type_for would pass a size of zero for
BLKmode vector types.
2019-11-14 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree.h (build_truth_vector_type): Delete.
(build_same_sized_truth_vector_type): Likewise.
* tree.c (build_truth_vector_type): Rename to...
(build_truth_vector_type_for): ...this. Make static and take
a vector type as argument.
(truth_type_for): Update accordingly.
(build_same_sized_truth_vector_type): Delete.
* tree-vect-generic.c (expand_vector_divmod): Use truth_type_for
instead of build_same_sized_truth_vector_type.
* tree-vect-loop.c (vect_create_epilog_for_reduction): Likewise.
(vect_record_loop_mask, vect_get_loop_mask): Likewise.
* tree-vect-patterns.c (build_mask_conversion): Likewise.
* tree-vect-slp.c (vect_get_constant_vectors): Likewise.
* tree-vect-stmts.c (vect_get_vec_def_for_operand): Likewise.
(vect_build_gather_load_calls, vectorizable_call): Likewise.
(scan_store_can_perm_p, vectorizable_scan_store): Likewise.
(vectorizable_store, vectorizable_condition): Likewise.
(get_mask_type_for_scalar_type, get_same_sized_vectype): Likewise.
(vect_get_mask_type_for_stmt): Use truth_type_for instead of
build_truth_vector_type.
* config/aarch64/aarch64-sve-builtins.cc (gimple_folder::convert_pred):
Use truth_type_for instead of build_same_sized_truth_vector_type.
* config/rs6000/rs6000-call.c (fold_build_vec_cmp): Likewise.
gcc/c/
* c-typeck.c (build_conditional_expr): Use truth_type_for instead
of build_same_sized_truth_vector_type.
(build_vec_cmp): Likewise.
gcc/cp/
* call.c (build_conditional_expr_1): Use truth_type_for instead
of build_same_sized_truth_vector_type.
* typeck.c (build_vec_cmp): Likewise.
gcc/d/
* d-codegen.cc (build_boolop): Use truth_type_for instead of
build_same_sized_truth_vector_type.
From-SVN: r278232
|
|
plus size exceeds size of referenced object in 'bit_field_ref'))
PR tree-optimization/91157
* tree-vect-generic.c (expand_vector_comparison): Handle lhs being
a vector boolean with scalar mode.
(expand_vector_condition): Handle first operand being a vector boolean
with scalar mode.
(expand_vector_operations_1): For comparisons, don't bail out early
if the return type is vector boolean with scalar mode, but comparison
operand type is not.
* gcc.target/i386/avx512f-pr91157.c: New test.
* gcc.target/i386/avx512bw-pr91157.c: New test.
From-SVN: r273545
|
|
plus size exceeds size of referenced object in 'bit_field_ref'))
PR tree-optimization/91157
* tree-vect-generic.c (expand_vector_comparison): Handle lhs being
a vector boolean with scalar mode.
(expand_vector_condition): Handle first operand being a vector boolean
with scalar mode.
(expand_vector_operations_1): For comparisons, don't bail out early
if the return type is vector boolean with scalar mode, but comparison
operand type is not.
* gcc.target/i386/avx512f-pr91157.c: New test.
* gcc.target/i386/avx512bw-pr91157.c: New test.
From-SVN: r273543
|
|
2019-07-03 Martin Liska <mliska@suse.cz>
* lra-eliminations.c (eliminate_regs_in_insn): Remove
dead assignments.
* reg-stack.c (check_asm_stack_operands): Likewise.
* tree-ssa-structalias.c (create_function_info_for): Likewise.
* tree-vect-generic.c (expand_vector_operations_1): Likewise.
* config/i386/i386-expand.c (ix86_expand_sse2_mulvxdi3): Use
force_expand_binop.
2019-07-03 Martin Liska <mliska@suse.cz>
* c-common.c (try_to_locate_new_include_insertion_point): Remove
dead assignments.
2019-07-03 Martin Liska <mliska@suse.cz>
* call.c (build_new_op_1): Remove
dead assignments.
* typeck.c (cp_build_binary_op): Likewise.
2019-07-03 Martin Liska <mliska@suse.cz>
* check.c (gfc_check_c_funloc): Remove
dead assignments.
* decl.c (variable_decl): Likewise.
* resolve.c (resolve_typebound_function): Likewise.
* simplify.c (gfc_simplify_matmul): Likewise.
(gfc_simplify_scan): Likewise.
* trans-array.c (gfc_could_be_alias): Likewise.
* trans-common.c (add_equivalences): Likewise.
* trans-expr.c (trans_class_vptr_len_assignment): Likewise.
(gfc_trans_array_constructor_copy): Likewise.
(gfc_trans_assignment_1): Likewise.
* trans-intrinsic.c (conv_intrinsic_atomic_op): Likewise.
* trans-openmp.c (gfc_omp_finish_clause): Likewise.
* trans-types.c (gfc_get_array_descriptor_base): Likewise.
* trans.c (gfc_build_final_call): Likewise.
2019-07-03 Martin Liska <mliska@suse.cz>
* line-map.c (linemap_get_expansion_filename): Remove
dead assignments.
* mkdeps.c (make_write): Likewise.
From-SVN: r272994
|
|
* doc/md.texi: Document vec_shl_<mode> pattern.
* optabs.def (vec_shl_optab): New optab.
* optabs.c (shift_amt_for_vec_perm_mask): Add shift_optab
argument, if == vec_shl_optab, check for left whole vector shift
pattern rather than right shift.
(expand_vec_perm_const): Add vec_shl_optab support.
* optabs-query.c (can_vec_perm_var_p): Mention also vec_shl optab
in the comment.
* tree-vect-generic.c (lower_vec_perm): Support permutations which
can be handled by vec_shl_optab.
* tree-vect-stmts.c (scan_store_can_perm_p): New function.
(check_scan_store): Use it.
(vectorizable_scan_store): If target can't do normal permutations,
try to use whole vector left shifts and if needed a VEC_COND_EXPR
after it.
* config/i386/sse.md (vec_shl_<mode>): New expander.
* gcc.dg/vect/vect-simd-8.c: If main is defined, don't include
tree-vect.h nor call check_vect.
* gcc.dg/vect/vect-simd-9.c: Likewise.
* gcc.dg/vect/vect-simd-10.c: New test.
* gcc.target/i386/sse2-vect-simd-8.c: New test.
* gcc.target/i386/sse2-vect-simd-9.c: New test.
* gcc.target/i386/sse2-vect-simd-10.c: New test.
* gcc.target/i386/avx2-vect-simd-8.c: New test.
* gcc.target/i386/avx2-vect-simd-9.c: New test.
* gcc.target/i386/avx2-vect-simd-10.c: New test.
* gcc.target/i386/avx512f-vect-simd-8.c: New test.
* gcc.target/i386/avx512f-vect-simd-9.c: New test.
* gcc.target/i386/avx512f-vect-simd-10.c: New test.
From-SVN: r272472
|
|
PR c++/85052
* tree-vect-generic.c: Include insn-config.h and recog.h.
(expand_vector_piecewise): Add defaulted ret_type argument,
if non-NULL, use that in preference to type for the result type.
(expand_vector_parallel): Formatting fix.
(do_vec_conversion, do_vec_narrowing_conversion,
expand_vector_conversion): New functions.
(expand_vector_operations_1): Call expand_vector_conversion
for VEC_CONVERT ifn calls.
* internal-fn.def (VEC_CONVERT): New internal function.
* internal-fn.c (expand_VEC_CONVERT): New function.
* fold-const-call.c (fold_const_vec_convert): New function.
(fold_const_call): Use it for CFN_VEC_CONVERT.
* doc/extend.texi (__builtin_convertvector): Document.
c-family/
* c-common.h (enum rid): Add RID_BUILTIN_CONVERTVECTOR.
(c_build_vec_convert): Declare.
* c-common.c (c_build_vec_convert): New function.
c/
* c-parser.c (c_parser_postfix_expression): Parse
__builtin_convertvector.
cp/
* cp-tree.h (cp_build_vec_convert): Declare.
* parser.c (cp_parser_postfix_expression): Parse
__builtin_convertvector.
* constexpr.c: Include fold-const-call.h.
(cxx_eval_internal_function): Handle IFN_VEC_CONVERT.
(potential_constant_expression_1): Likewise.
* semantics.c (cp_build_vec_convert): New function.
* pt.c (tsubst_copy_and_build): Handle CALL_EXPR to
IFN_VEC_CONVERT.
testsuite/
* c-c++-common/builtin-convertvector-1.c: New test.
* c-c++-common/torture/builtin-convertvector-1.c: New test.
* g++.dg/ext/builtin-convertvector-1.C: New test.
* g++.dg/cpp0x/constexpr-builtin4.C: New test.
From-SVN: r267632
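For readers unfamiliar with the builtin this commit introduces, here is a minimal usage sketch. It assumes a compiler that provides `__builtin_convertvector` and the `vector_size` attribute; the typedef and function names are illustrative only. The builtin converts element by element, as a C cast would, and requires the source and destination vector types to have the same number of elements.

```c
#include <assert.h>

typedef int v4si __attribute__ ((vector_size (16)));
typedef float v4sf __attribute__ ((vector_size (16)));

/* Convert four ints to four floats, one element at a time.  */
static v4sf
ints_to_floats (v4si v)
{
  /* Both types must have the same number of elements.  */
  assert (sizeof (v4si) / sizeof (int) == sizeof (v4sf) / sizeof (float));
  return __builtin_convertvector (v, v4sf);
}

/* Convert four floats to four ints; each element is truncated toward
   zero, as with a scalar (int) cast.  */
static v4si
floats_to_ints (v4sf v)
{
  return __builtin_convertvector (v, v4si);
}
```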
|
|
From-SVN: r267494
|
|
of vec_splat().
[gcc]
2018-09-06 Will Schmidt <will_schmidt@vnet.ibm.com>
* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add support for
early gimple folding of vec_splat().
* tree-vect-generic.c: Remove static from tree_vec_extract() definition.
* gimple-fold.h: Add an extern define for tree_vec_extract().
From-SVN: r264146
|
|
arm-linux-gnueabihf)
2018-06-14 Richard Biener <rguenther@suse.de>
PR middle-end/86139
* tree-vect-generic.c (build_word_mode_vector_type): Remove
duplicate and harmful type_hash_canon.
* tree.c (type_hash_canon): Assert we didn't find ourselves.
From-SVN: r261588
|
|
vectorized for AVX512DQ target)
PR target/85918
* tree.def (VEC_UNPACK_FIX_TRUNC_HI_EXPR, VEC_UNPACK_FIX_TRUNC_LO_EXPR,
VEC_PACK_FLOAT_EXPR): New tree codes.
* tree-pretty-print.c (op_code_prio): Handle
VEC_UNPACK_FIX_TRUNC_HI_EXPR and VEC_UNPACK_FIX_TRUNC_LO_EXPR.
(dump_generic_node): Handle VEC_UNPACK_FIX_TRUNC_HI_EXPR,
VEC_UNPACK_FIX_TRUNC_LO_EXPR and VEC_PACK_FLOAT_EXPR.
* tree-inline.c (estimate_operator_cost): Likewise.
* gimple-pretty-print.c (dump_binary_rhs): Handle VEC_PACK_FLOAT_EXPR.
* fold-const.c (const_binop): Likewise.
(const_unop): Handle VEC_UNPACK_FIX_TRUNC_HI_EXPR and
VEC_UNPACK_FIX_TRUNC_LO_EXPR.
* tree-cfg.c (verify_gimple_assign_unary): Likewise.
(verify_gimple_assign_binary): Handle VEC_PACK_FLOAT_EXPR.
* cfgexpand.c (expand_debug_expr): Handle VEC_UNPACK_FIX_TRUNC_HI_EXPR,
VEC_UNPACK_FIX_TRUNC_LO_EXPR and VEC_PACK_FLOAT_EXPR.
* expr.c (expand_expr_real_2): Likewise.
* optabs.def (vec_packs_float_optab, vec_packu_float_optab,
vec_unpack_sfix_trunc_hi_optab, vec_unpack_sfix_trunc_lo_optab,
vec_unpack_ufix_trunc_hi_optab, vec_unpack_ufix_trunc_lo_optab): New
optabs.
* optabs.c (expand_widen_pattern_expr): For
VEC_UNPACK_FIX_TRUNC_HI_EXPR and VEC_UNPACK_FIX_TRUNC_LO_EXPR use
sign from result type rather than operand's type.
(expand_binop_directly): For vec_packu_float_optab and
vec_packs_float_optab allow result type to be different from operand's
type.
* optabs-tree.c (optab_for_tree_code): Handle
VEC_UNPACK_FIX_TRUNC_HI_EXPR, VEC_UNPACK_FIX_TRUNC_LO_EXPR and
VEC_PACK_FLOAT_EXPR. Formatting fixes.
* tree-vect-generic.c (expand_vector_operations_1): Handle
VEC_UNPACK_FIX_TRUNC_HI_EXPR, VEC_UNPACK_FIX_TRUNC_LO_EXPR and
VEC_PACK_FLOAT_EXPR.
* tree-vect-stmts.c (supportable_widening_operation): Handle
FIX_TRUNC_EXPR.
(supportable_narrowing_operation): Handle FLOAT_EXPR.
* config/i386/i386.md (fixprefix, floatprefix): New code attributes.
* config/i386/sse.md (*float<floatunssuffix>v2div2sf2): Rename to ...
(float<floatunssuffix>v2div2sf2): ... this. Formatting fix.
(vpckfloat_concat_mode, vpckfloat_temp_mode, vpckfloat_op_mode): New
mode attributes.
(vec_pack<floatprefix>_float_<mode>): New expander.
(vunpckfixt_mode, vunpckfixt_model, vunpckfixt_extract_mode): New mode
attributes.
(vec_unpack_<fixprefix>fix_trunc_lo_<mode>,
vec_unpack_<fixprefix>fix_trunc_hi_<mode>): New expanders.
* doc/md.texi (vec_packs_float_@var{m}, vec_packu_float_@var{m},
vec_unpack_sfix_trunc_hi_@var{m}, vec_unpack_sfix_trunc_lo_@var{m},
vec_unpack_ufix_trunc_hi_@var{m}, vec_unpack_ufix_trunc_lo_@var{m}):
Document.
* doc/generic.texi (VEC_UNPACK_FLOAT_HI_EXPR,
VEC_UNPACK_FLOAT_LO_EXPR): Fix pasto in description.
(VEC_UNPACK_FIX_TRUNC_HI_EXPR, VEC_UNPACK_FIX_TRUNC_LO_EXPR,
VEC_PACK_FLOAT_EXPR): Document.
* gcc.target/i386/avx512dq-pr85918.c: Add -mprefer-vector-width=512
and -fno-vect-cost-model options. Add aligned(64) attribute to the
arrays. Add suffix 1 to all functions and use 4 iterations rather
than N. Add functions with conversions to and from float.
Add new set of functions with 8 iterations and another one
with 16 iterations, expect 24 vectorized loops instead of just 4.
* gcc.target/i386/avx512dq-pr85918-2.c: New test.
From-SVN: r260893
|
|
mismatch in vector pack expression))
2018-05-28 Richard Biener <rguenther@suse.de>
PR tree-optimization/85934
* tree-vect-generic.c (expand_vector_operations_1): Hoist
vector boolean check before scalar optimization.
* gcc.target/i386/pr85934.c: New testcase.
From-SVN: r260847
|
|
This patch adds a new mode class to represent vectors of booleans.
GET_MODE_BITSIZE (m) / GET_MODE_NUNITS (m) determines the number
of bits that are used to represent each boolean; this can be 1
for a fully-packed representation or greater than 1 for an unpacked
representation. In the latter case, the value of bits other than
the lowest is not significant.
These are used by the SVE port to represent predicates.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* mode-classes.def (MODE_VECTOR_BOOL): New mode class.
* machmode.h (INTEGRAL_MODE_P, VECTOR_MODE_P): Return true
for MODE_VECTOR_BOOL.
* machmode.def (VECTOR_BOOL_MODE): Document.
* genmodes.c (VECTOR_BOOL_MODE): New macro.
(make_vector_bool_mode): New function.
(complete_mode, emit_mode_wider, emit_mode_adjustments): Handle
MODE_VECTOR_BOOL.
* lto-streamer-in.c (lto_input_mode_table): Likewise.
* rtx-vector-builder.c (rtx_vector_builder::find_cached_value):
Likewise.
* stor-layout.c (int_mode_for_mode): Likewise.
* tree.c (build_vector_type_for_mode): Likewise.
* varasm.c (output_constant_pool_2): Likewise.
* emit-rtl.c (init_emit_once): Make sure that CONST1_RTX (BImode) and
CONSTM1_RTX (BImode) are the same thing. Initialize const_tiny_rtx
for MODE_VECTOR_BOOL.
* expr.c (expand_expr_real_1): Use VECTOR_MODE_P instead of a list
of mode class checks.
* tree-vect-generic.c (expand_vector_operation): Use VECTOR_MODE_P
instead of a list of mode class checks.
(expand_vector_scalar_condition): Likewise.
(type_for_widest_vector_mode): Handle BImode as an inner mode.
gcc/c-family/
* c-common.c (c_common_type_for_mode): Handle MODE_VECTOR_BOOL.
gcc/fortran/
* trans-types.c (gfc_type_for_mode): Handle MODE_VECTOR_BOOL.
gcc/go/
* go-lang.c (go_langhook_type_for_mode): Handle MODE_VECTOR_BOOL.
gcc/lto/
* lto-lang.c (lto_type_for_mode): Handle MODE_VECTOR_BOOL.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256202
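The representation described at the top of this commit message can be sketched in C as follows. This is a model rather than GCC code: `predicate_element` is a hypothetical name, and the predicate is assumed to fit in a 64-bit word. The number of bits per boolean is the total size divided by the number of elements, and only the lowest bit of each boolean is inspected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Read boolean IDX from a predicate of NUNITS booleans stored in the
   low TOTAL_BITS bits of PRED.  Each boolean occupies
   TOTAL_BITS / NUNITS bits; bits other than the lowest one of each
   boolean are ignored.  */
static bool
predicate_element (uint64_t pred, unsigned total_bits, unsigned nunits,
                   unsigned idx)
{
  assert (nunits > 0 && total_bits > 0 && total_bits <= 64);
  assert (total_bits % nunits == 0 && idx < nunits);
  unsigned bits_per_bool = total_bits / nunits;
  return (pred >> (idx * bits_per_bool)) & 1;
}
```

With 16 bits and 4 elements, each boolean is 4 bits wide, so a field of `0xE` reads as false because its lowest bit is clear.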
|
|
This patch changes TYPE_VECTOR_SUBPARTS to a poly_uint64. The value is
encoded in the 10-bit precision field and was previously always stored
as a simple log2 value. The challenge was to use these 10 bits to
encode the number of elements in variable-length vectors, so that
we didn't need to increase the size of the tree.
In practice the number of vector elements should always have the form
N + N * X (where X is the runtime value), and as for constant-length
vectors, N must be a power of 2 (even though X itself might not be).
The patch therefore uses the low 8 bits to encode log2(N) and bit
8 to select between constant-length and variable-length vectors.
Targets without variable-length vectors continue to use the old scheme.
A new valid_vector_subparts_p function tests whether a given number
of elements can be encoded. This is false for the vector modes that
represent an LD3 or ST3 vector triple (which we want to treat as arrays
of vectors rather than single vectors).
Most of the patch is mechanical; previous patches handled the changes
that weren't entirely straightforward.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree.h (TYPE_VECTOR_SUBPARTS): Turn into a function and handle
polynomial numbers of units.
(SET_TYPE_VECTOR_SUBPARTS): Likewise.
(valid_vector_subparts_p): New function.
(build_vector_type): Remove temporary shim and take the number
of units as a poly_uint64 rather than an int.
(build_opaque_vector_type): Take the number of units as a
poly_uint64 rather than an int.
* tree.c (build_vector_from_ctor): Handle polynomial
TYPE_VECTOR_SUBPARTS.
(type_hash_canon_hash, type_cache_hasher::equal): Likewise.
(uniform_vector_p, vector_type_mode, build_vector): Likewise.
(build_vector_from_val): If the number of units is variable,
use build_vec_duplicate_cst for constant operands and
VEC_DUPLICATE_EXPR otherwise.
(make_vector_type): Remove temporary is_constant ().
(build_vector_type, build_opaque_vector_type): Take the number of
units as a poly_uint64 rather than an int.
(check_vector_cst): Handle polynomial TYPE_VECTOR_SUBPARTS and
VECTOR_CST_NELTS.
* cfgexpand.c (expand_debug_expr): Likewise.
* expr.c (count_type_elements, categorize_ctor_elements_1): Likewise.
(store_constructor, expand_expr_real_1): Likewise.
(const_scalar_mask_from_tree): Likewise.
* fold-const-call.c (fold_const_reduction): Likewise.
* fold-const.c (const_binop, const_unop, fold_convert_const): Likewise.
(operand_equal_p, fold_vec_perm, fold_ternary_loc): Likewise.
(native_encode_vector, vec_cst_ctor_to_array): Likewise.
(fold_relational_const): Likewise.
(native_interpret_vector): Likewise. Change the size from an
int to an unsigned int.
* gimple-fold.c (gimple_fold_stmt_to_constant_1): Handle polynomial
TYPE_VECTOR_SUBPARTS.
(gimple_fold_indirect_ref, gimple_build_vector): Likewise.
(gimple_build_vector_from_val): Use VEC_DUPLICATE_EXPR when
duplicating a non-constant operand into a variable-length vector.
* hsa-brig.c (hsa_op_immed::emit_to_buffer): Handle polynomial
TYPE_VECTOR_SUBPARTS and VECTOR_CST_NELTS.
* ipa-icf.c (sem_variable::equals): Likewise.
* match.pd: Likewise.
* omp-simd-clone.c (simd_clone_subparts): Likewise.
* print-tree.c (print_node): Likewise.
* stor-layout.c (layout_type): Likewise.
* targhooks.c (default_builtin_vectorization_cost): Likewise.
* tree-cfg.c (verify_gimple_comparison): Likewise.
(verify_gimple_assign_binary): Likewise.
(verify_gimple_assign_ternary): Likewise.
(verify_gimple_assign_single): Likewise.
* tree-pretty-print.c (dump_generic_node): Likewise.
* tree-ssa-forwprop.c (simplify_vector_constructor): Likewise.
(simplify_bitfield_ref, is_combined_permutation_identity): Likewise.
* tree-vect-data-refs.c (vect_permute_store_chain): Likewise.
(vect_grouped_load_supported, vect_permute_load_chain): Likewise.
(vect_shift_permute_load_chain): Likewise.
* tree-vect-generic.c (nunits_for_known_piecewise_op): Likewise.
(expand_vector_condition, optimize_vector_constructor): Likewise.
(lower_vec_perm, get_compute_type): Likewise.
* tree-vect-loop.c (vect_determine_vectorization_factor): Likewise.
(get_initial_defs_for_reduction, vect_transform_loop): Likewise.
* tree-vect-patterns.c (vect_recog_bool_pattern): Likewise.
(vect_recog_mask_conversion_pattern): Likewise.
* tree-vect-slp.c (vect_supported_load_permutation_p): Likewise.
(vect_get_constant_vectors, vect_transform_slp_perm_load): Likewise.
* tree-vect-stmts.c (perm_mask_for_reverse): Likewise.
(get_group_load_store_type, vectorizable_mask_load_store): Likewise.
(vectorizable_bswap, simd_clone_subparts, vectorizable_assignment)
(vectorizable_shift, vectorizable_operation, vectorizable_store)
(vectorizable_load, vect_is_simple_cond, vectorizable_comparison)
(supportable_widening_operation): Likewise.
(supportable_narrowing_operation): Likewise.
* tree-vector-builder.c (tree_vector_builder::binary_encoded_nelts):
Likewise.
* varasm.c (output_constant): Likewise.
gcc/ada/
* gcc-interface/utils.c (gnat_types_compatible_p): Handle
polynomial TYPE_VECTOR_SUBPARTS.
gcc/brig/
* brigfrontend/brig-to-generic.cc (get_unsigned_int_type): Handle
polynomial TYPE_VECTOR_SUBPARTS.
* brigfrontend/brig-util.h (gccbrig_type_vector_subparts): Likewise.
gcc/c-family/
* c-common.c (vector_types_convertible_p, c_build_vec_perm_expr)
(convert_vector_to_array_for_subscript): Handle polynomial
TYPE_VECTOR_SUBPARTS.
(c_common_type_for_mode): Check valid_vector_subparts_p.
* c-pretty-print.c (pp_c_initializer_list): Handle polynomial
VECTOR_CST_NELTS.
gcc/c/
* c-typeck.c (comptypes_internal, build_binary_op): Handle polynomial
TYPE_VECTOR_SUBPARTS.
gcc/cp/
* constexpr.c (cxx_eval_array_reference): Handle polynomial
VECTOR_CST_NELTS.
(cxx_fold_indirect_ref): Handle polynomial TYPE_VECTOR_SUBPARTS.
* call.c (build_conditional_expr_1): Likewise.
* decl.c (cp_finish_decomp): Likewise.
* mangle.c (write_type): Likewise.
* typeck.c (structural_comptypes): Likewise.
(cp_build_binary_op): Likewise.
* typeck2.c (process_init_constructor_array): Likewise.
gcc/fortran/
* trans-types.c (gfc_type_for_mode): Check valid_vector_subparts_p.
gcc/lto/
* lto-lang.c (lto_type_for_mode): Check valid_vector_subparts_p.
* lto.c (hash_canonical_type): Handle polynomial TYPE_VECTOR_SUBPARTS.
gcc/go/
* go-lang.c (go_langhook_type_for_mode): Check valid_vector_subparts_p.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256197
|
|
This patch changes GET_MODE_NUNITS from unsigned char
to poly_uint16, although it remains a macro when compiling
target code with NUM_POLY_INT_COEFFS == 1.
We can handle permuted loads and stores for variable nunits if
the number of statements is a power of 2, but not otherwise.
The to_constant call in make_vector_type goes away in a later patch.
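As a rough illustration of what a polynomial number of units means, the
sketch below models a value of the form c0 + c1 * x, where x is only known
at run time. It is a toy stand-in rather than GCC's poly_int class; every
name in it (toy_poly_uint16, toy_is_constant, toy_known_eq, toy_eval) is
hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a two-coefficient polynomial integer: the value is
// coeffs[0] + coeffs[1] * x, where x is a nonnegative runtime quantity.
// This is NOT GCC's poly_int; all names here are hypothetical.
struct toy_poly_uint16 {
  uint16_t coeffs[2];
};

// Return true if the value is the same for every x, storing it in *out.
inline bool toy_is_constant(const toy_poly_uint16 &p, uint16_t *out) {
  assert(out != nullptr);
  if (p.coeffs[1] != 0)
    return false;
  *out = p.coeffs[0];
  return true;
}

// Return true if a and b are equal for every possible x.
inline bool toy_known_eq(const toy_poly_uint16 &a, const toy_poly_uint16 &b) {
  return a.coeffs[0] == b.coeffs[0] && a.coeffs[1] == b.coeffs[1];
}

// Evaluate the value once x is known.
inline unsigned toy_eval(const toy_poly_uint16 &p, unsigned x) {
  return p.coeffs[0] + p.coeffs[1] * x;
}
```

With this model, a fixed-length mode has a zero second coefficient and
behaves like a plain integer, which is the case in which a constant can be
returned directly.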
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* machmode.h (mode_nunits): Change from unsigned char to
poly_uint16_pod.
(ONLY_FIXED_SIZE_MODES): New macro.
(pod_mode::measurement_type, scalar_int_mode::measurement_type)
(scalar_float_mode::measurement_type, scalar_mode::measurement_type)
(complex_mode::measurement_type, fixed_size_mode::measurement_type):
New typedefs.
(mode_to_nunits): Return a poly_uint16 rather than an unsigned short.
(GET_MODE_NUNITS): Return a constant if ONLY_FIXED_SIZE_MODES,
or if measurement_type is not polynomial.
* genmodes.c (ZERO_COEFFS): New macro.
(emit_mode_nunits_inline): Make mode_nunits_inline return a
poly_uint16.
(emit_mode_nunits): Change the type of mode_nunits to poly_uint16_pod.
Use ZERO_COEFFS when emitting initializers.
* data-streamer.h (bp_pack_poly_value): New function.
(bp_unpack_poly_value): Likewise.
* lto-streamer-in.c (lto_input_mode_table): Use bp_unpack_poly_value
for GET_MODE_NUNITS.
* lto-streamer-out.c (lto_write_mode_table): Use bp_pack_poly_value
for GET_MODE_NUNITS.
* tree.c (make_vector_type): Remove temporary shim and make
the real function take the number of units as a poly_uint64
rather than an int.
(build_vector_type_for_mode): Handle polynomial nunits.
* dwarf2out.c (loc_descriptor, add_const_value_attribute): Likewise.
* emit-rtl.c (const_vec_series_p_1): Likewise.
(gen_rtx_CONST_VECTOR): Likewise.
* fold-const.c (test_vec_duplicate_folding): Likewise.
* genrecog.c (validate_pattern): Likewise.
* optabs-query.c (can_vec_perm_var_p, can_mult_highpart_p): Likewise.
* optabs-tree.c (expand_vec_cond_expr_p): Likewise.
* optabs.c (expand_vector_broadcast, expand_binop_directly): Likewise.
(shift_amt_for_vec_perm_mask, expand_vec_perm_var): Likewise.
(expand_vec_cond_expr, expand_mult_highpart): Likewise.
* rtlanal.c (subreg_get_info): Likewise.
* tree-vect-data-refs.c (vect_grouped_store_supported): Likewise.
(vect_grouped_load_supported): Likewise.
* tree-vect-generic.c (type_for_widest_vector_mode): Likewise.
* tree-vect-loop.c (have_whole_vector_shift): Likewise.
* simplify-rtx.c (simplify_unary_operation_1): Likewise.
(simplify_const_unary_operation, simplify_binary_operation_1)
(simplify_const_binary_operation, simplify_ternary_operation)
(test_vector_ops_duplicate, test_vector_ops): Likewise.
(simplify_immed_subreg): Use GET_MODE_NUNITS on a fixed_size_mode
instead of CONST_VECTOR_NUNITS.
* varasm.c (output_constant_pool_2): Likewise.
* rtx-vector-builder.c (rtx_vector_builder::build): Only include the
explicit-encoded elements in the XVEC for variable-length vectors.
gcc/ada/
* gcc-interface/misc.c (enumerate_modes): Handle polynomial
GET_MODE_NUNITS.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256195
|
|
From-SVN: r256169
|
|
This patch changes the vec_perm_indices element type from HOST_WIDE_INT
to poly_int64, so that it can represent indices into a variable-length
vector.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* vec-perm-indices.h (vec_perm_builder): Change element type
from HOST_WIDE_INT to poly_int64.
(vec_perm_indices::element_type): Update accordingly.
(vec_perm_indices::clamp): Handle polynomial element_types.
* vec-perm-indices.c (vec_perm_indices::series_p): Likewise.
(vec_perm_indices::all_in_range_p): Likewise.
(tree_to_vec_perm_builder): Check for poly_int64 trees rather
than shwi trees.
* vector-builder.h (vector_builder::stepped_sequence_p): Handle
polynomial vec_perm_indices element types.
* int-vector-builder.h (int_vector_builder::equal_p): Likewise.
* fold-const.c (fold_vec_perm): Likewise.
* optabs.c (shift_amt_for_vec_perm_mask): Likewise.
* tree-vect-generic.c (lower_vec_perm): Likewise.
* tree-vect-slp.c (vect_transform_slp_perm_load): Likewise.
* config/aarch64/aarch64.c (aarch64_evpc_tbl): Cast d->perm
element type to HOST_WIDE_INT.
From-SVN: r256164
|
|
This patch makes tree-vect-generic.c cope with variable-length vectors.
Decomposition is only supported for constant-length vectors, since we
should never generate unsupported variable-length operations.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vect-generic.c (nunits_for_known_piecewise_op): New function.
(expand_vector_piecewise): Use it instead of TYPE_VECTOR_SUBPARTS.
(expand_vector_addition, add_rshift, expand_vector_divmod): Likewise.
(expand_vector_condition, vector_element): Likewise.
(subparts_gt): New function.
(get_compute_type): Use subparts_gt.
(count_type_subparts): Delete.
(expand_vector_operations_1): Use subparts_gt instead of
count_type_subparts.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256143
|
|
This patch makes shift_amt_for_vec_perm_mask use series_p to check
for the simple case of a natural linear series before falling back
to testing each element individually. The series_p test works with
variable-length vectors but testing every individual element doesn't.
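The idea of recognising a linear series can be sketched as follows. This is
a simplified model over a fully expanded selector, so unlike the series_p
described above it still walks the elements; toy_series_p and
toy_shift_amount are hypothetical names, not the functions in optabs.c.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: true if sel[start], sel[start + stride], ... form
// the series base, base + step, base + 2 * step, ...
inline bool toy_series_p(const std::vector<long> &sel, std::size_t start,
                         std::size_t stride, long base, long step) {
  assert(stride > 0);
  long expected = base;
  for (std::size_t i = start; i < sel.size(); i += stride) {
    if (sel[i] != expected)
      return false;
    expected += step;
  }
  return true;
}

// A selector of the form { s, s + 1, s + 2, ... } selects a window of the
// concatenated inputs that starts s elements in.  Return s, or -1 if the
// selector does not have that form.
inline long toy_shift_amount(const std::vector<long> &sel) {
  if (sel.empty() || sel[0] < 0)
    return -1;
  return toy_series_p(sel, 0, 1, sel[0], 1) ? sel[0] : -1;
}
```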
2018-01-02 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* optabs.c (shift_amt_for_vec_perm_mask): Try using series_p
before testing each element individually.
* tree-vect-generic.c (lower_vec_perm): Likewise.
From-SVN: r256099
|
|
This patch changes vec_perm_indices from a plain vec<> to a class
that stores a canonicalized permutation, using the same encoding
as for VECTOR_CSTs. This means that vec_perm_indices now carries
information about the number of vectors being permuted (currently
always 1 or 2) and the number of elements in each input vector.
A new vec_perm_builder class is used to actually build up the vector,
like tree_vector_builder does for trees. vec_perm_indices is the
completed representation, a bit like VECTOR_CST is for trees.
The patch just does a mechanical conversion of the code to
vec_perm_builder: a later patch uses explicit encodings where possible.
The point of all this is that it makes the representation suitable
for variable-length vectors. It's no longer necessary for the
underlying vec<>s to store every element explicitly.
In int-vector-builder.h, "using the same encoding as tree and rtx constants"
describes the endpoint -- adding the rtx encoding comes later.
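To show why the underlying vec<> no longer needs every element, here is a
minimal sketch of decoding an element from a pattern-based encoding in the
style of VECTOR_CST: npatterns interleaved patterns, each with up to three
explicitly encoded elements. The struct and function names are hypothetical
and the code is an approximation of the scheme, not the int_vector_builder
implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of an encoded vector.  "encoded" holds
// npatterns * nelts_per_pattern values, interleaved by pattern.
struct toy_encoded_vector {
  unsigned npatterns;
  unsigned nelts_per_pattern;  // 1 = duplicate, 2 = first then repeat,
                               // 3 = first, then a linear series
  std::vector<long> encoded;
};

// Return element I of the full vector without expanding it.
inline long toy_elt(const toy_encoded_vector &v, std::size_t i) {
  assert(v.npatterns > 0);
  assert(v.nelts_per_pattern >= 1 && v.nelts_per_pattern <= 3);
  assert(v.encoded.size() == std::size_t(v.npatterns) * v.nelts_per_pattern);

  // Explicitly encoded elements are read directly.
  if (i < v.encoded.size())
    return v.encoded[i];

  std::size_t pattern = i % v.npatterns;  // which interleaved pattern
  std::size_t count = i / v.npatterns;    // position within that pattern
  std::size_t final_idx = (v.nelts_per_pattern - 1) * v.npatterns + pattern;
  long final_elt = v.encoded[final_idx];

  // With one or two encoded elements, the last one simply repeats.
  if (v.nelts_per_pattern <= 2)
    return final_elt;

  // With three, the last two define the step of a linear series.
  long step = final_elt - v.encoded[final_idx - v.npatterns];
  return final_elt + static_cast<long>(count - 2) * step;
}
```

For instance, an interleave of two inputs such as { 0, 4, 1, 5, 2, 6, ... }
would be described in this model by two patterns of three elements each,
regardless of how long the full vector is.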
2018-01-02 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* int-vector-builder.h: New file.
* vec-perm-indices.h: Include int-vector-builder.h.
(vec_perm_indices): Redefine as an int_vector_builder.
(auto_vec_perm_indices): Delete.
(vec_perm_builder): Redefine as a stand-alone class.
(vec_perm_indices::vec_perm_indices): New function.
(vec_perm_indices::clamp): Likewise.
* vec-perm-indices.c: Include fold-const.h and tree-vector-builder.h.
(vec_perm_indices::new_vector): New function.
(vec_perm_indices::new_expanded_vector): Update for new
vec_perm_indices class.
(vec_perm_indices::rotate_inputs): New function.
(vec_perm_indices::all_in_range_p): Operate directly on the
encoded form, without computing elided elements.
(tree_to_vec_perm_builder): Operate directly on the VECTOR_CST
encoding. Update for new vec_perm_indices class.
* optabs.c (expand_vec_perm_const): Create a vec_perm_indices for
the given vec_perm_builder.
(expand_vec_perm_var): Update vec_perm_builder constructor.
(expand_mult_highpart): Use vec_perm_builder instead of
auto_vec_perm_indices.
* optabs-query.c (can_mult_highpart_p): Use vec_perm_builder and
vec_perm_indices instead of auto_vec_perm_indices. Use a single
or double series encoding as appropriate.
* fold-const.c (fold_ternary_loc): Use vec_perm_builder and
vec_perm_indices instead of auto_vec_perm_indices.
* tree-ssa-forwprop.c (simplify_vector_constructor): Likewise.
* tree-vect-data-refs.c (vect_grouped_store_supported): Likewise.
(vect_permute_store_chain): Likewise.
(vect_grouped_load_supported): Likewise.
(vect_permute_load_chain): Likewise.
(vect_shift_permute_load_chain): Likewise.
* tree-vect-slp.c (vect_build_slp_tree_1): Likewise.
(vect_transform_slp_perm_load): Likewise.
(vect_schedule_slp_instance): Likewise.
* tree-vect-stmts.c (perm_mask_for_reverse): Likewise.
(vectorizable_mask_load_store): Likewise.
(vectorizable_bswap): Likewise.
(vectorizable_store): Likewise.
(vectorizable_load): Likewise.
* tree-vect-generic.c (lower_vec_perm): Use vec_perm_builder and
vec_perm_indices instead of auto_vec_perm_indices. Use
tree_to_vec_perm_builder to read the vector from a tree.
* tree-vect-loop.c (calc_vec_perm_mask_for_shift): Take a
vec_perm_builder instead of a vec_perm_indices.
(have_whole_vector_shift): Use vec_perm_builder and
vec_perm_indices instead of auto_vec_perm_indices. Leave the
truncation to calc_vec_perm_mask_for_shift.
(vect_create_epilog_for_reduction): Likewise.
* config/aarch64/aarch64.c (expand_vec_perm_d::perm): Change
from auto_vec_perm_indices to vec_perm_indices.
(aarch64_expand_vec_perm_const_1): Use rotate_inputs on d.perm
instead of changing individual elements.
(aarch64_vectorize_vec_perm_const): Use new_vector to install
the vector in d.perm.
* config/arm/arm.c (expand_vec_perm_d::perm): Change
from auto_vec_perm_indices to vec_perm_indices.
(arm_expand_vec_perm_const_1): Use rotate_inputs on d.perm
instead of changing individual elements.
(arm_vectorize_vec_perm_const): Use new_vector to install
the vector in d.perm.
* config/powerpcspe/powerpcspe.c (rs6000_expand_extract_even):
Update vec_perm_builder constructor.
(rs6000_expand_interleave): Likewise.
* config/rs6000/rs6000.c (rs6000_expand_extract_even): Likewise.
(rs6000_expand_interleave): Likewise.
From-SVN: r256095
|
|
One of the changes needed for variable-length VEC_PERM_EXPRs -- and for
long fixed-length VEC_PERM_EXPRs -- is the ability to use constant
selectors that wouldn't fit in the vectors being permuted. E.g. a
permute on two V256QIs can't be done using a V256QI selector.
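The arithmetic behind the V256QI example can be sketched as below: a
permute of two 256-element inputs needs indices up to 511, which an 8-bit
selector element cannot hold. toy_selector_fits is a hypothetical helper
written for this illustration; it is not the selector_fits_mode_p function
that the patch adds.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical check: can every index of a permute over NINPUTS vectors of
// NELTS elements be stored in an unsigned selector element of ELT_BITS bits?
inline bool toy_selector_fits(unsigned nelts, unsigned elt_bits,
                              unsigned ninputs = 2) {
  assert(nelts > 0 && ninputs > 0);
  assert(elt_bits > 0 && elt_bits < 64);
  std::uint64_t max_index = std::uint64_t(nelts) * ninputs - 1;
  std::uint64_t max_value = (std::uint64_t(1) << elt_bits) - 1;
  return max_index <= max_value;
}
```

Under this model toy_selector_fits (256, 8) is false, while 16-bit
elements (as in V256HI) are wide enough.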
At the moment constant permutes use two interfaces:
targetm.vectorizer.vec_perm_const_ok for testing whether a permute is
valid and the vec_perm_const optab for actually emitting the permute.
The former gets passed a vec<> selector and the latter an rtx selector.
Most ports share a lot of code between the hook and the optab, with a
wrapper function for each interface.
We could try to keep that interface and require ports to define wider
vector modes that could be attached to the CONST_VECTOR (e.g. V256HI or
V256SI in the example above). But building a CONST_VECTOR rtx seems a bit
pointless here, since the expand code only creates the CONST_VECTOR in
order to call the optab, and the first thing the target does is take
the CONST_VECTOR apart again.
The easiest approach therefore seemed to be to remove the optab and
reuse the target hook to emit the code. One potential drawback is that
it's no longer possible to use match_operand predicates to force
operands into the required form, but in practice all targets want
register operands anyway.
The patch also changes vec_perm_indices into a class that provides
some simple routines for handling permutations. A later patch will
flesh this out and get rid of auto_vec_perm_indices, but I didn't
want to do all that in this patch and make it more complicated than
it already is.
2018-01-02 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* Makefile.in (OBJS): Add vec-perm-indices.o.
* vec-perm-indices.h: New file.
* vec-perm-indices.c: Likewise.
* target.h (vec_perm_indices): Replace with a forward class
declaration.
(auto_vec_perm_indices): Move to vec-perm-indices.h.
* optabs.h: Include vec-perm-indices.h.
(expand_vec_perm): Delete.
(selector_fits_mode_p, expand_vec_perm_var): Declare.
(expand_vec_perm_const): Declare.
* target.def (vec_perm_const_ok): Replace with...
(vec_perm_const): ...this new hook.
* doc/tm.texi.in (TARGET_VECTORIZE_VEC_PERM_CONST_OK): Replace with...
(TARGET_VECTORIZE_VEC_PERM_CONST): ...this new hook.
* doc/tm.texi: Regenerate.
* optabs.def (vec_perm_const): Delete.
* doc/md.texi (vec_perm_const): Likewise.
(vec_perm): Refer to TARGET_VECTORIZE_VEC_PERM_CONST.
* expr.c (expand_expr_real_2): Use expand_vec_perm_const rather than
expand_vec_perm for constant permutation vectors. Assert that
the mode of variable permutation vectors is the integer equivalent
of the mode that is being permuted.
* optabs-query.h (selector_fits_mode_p): Declare.
* optabs-query.c: Include vec-perm-indices.h.
(selector_fits_mode_p): New function.
(can_vec_perm_const_p): Check whether targetm.vectorize.vec_perm_const
is defined, instead of checking whether the vec_perm_const_optab
exists. Use targetm.vectorize.vec_perm_const instead of
targetm.vectorize.vec_perm_const_ok. Check whether the indices
fit in the vector mode before using a variable permute.
* optabs.c (shift_amt_for_vec_perm_mask): Take a mode and a
vec_perm_indices instead of an rtx.
(expand_vec_perm): Replace with...
(expand_vec_perm_const): ...this new function. Take the selector
as a vec_perm_indices rather than an rtx. Also take the mode of
the selector. Update call to shift_amt_for_vec_perm_mask.
Use targetm.vectorize.vec_perm_const instead of vec_perm_const_optab.
Use vec_perm_indices::new_expanded_vector to expand the original
selector into bytes. Check whether the indices fit in the vector
mode before using a variable permute.
(expand_vec_perm_var): Make global.
(expand_mult_highpart): Use expand_vec_perm_const.
* fold-const.c: Include vec-perm-indices.h.
* tree-ssa-forwprop.c: Likewise.
* tree-vect-data-refs.c: Likewise.
* tree-vect-generic.c: Likewise.
* tree-vect-loop.c: Likewise.
* tree-vect-slp.c: Likewise.
* tree-vect-stmts.c: Likewise.
* config/aarch64/aarch64-protos.h (aarch64_expand_vec_perm_const):
Delete.
* config/aarch64/aarch64-simd.md (vec_perm_const<mode>): Delete.
* config/aarch64/aarch64.c (aarch64_expand_vec_perm_const)
(aarch64_vectorize_vec_perm_const_ok): Fuse into...
(aarch64_vectorize_vec_perm_const): ...this new function.
(TARGET_VECTORIZE_VEC_PERM_CONST_OK): Delete.
(TARGET_VECTORIZE_VEC_PERM_CONST): Redefine.
* config/arm/arm-protos.h (arm_expand_vec_perm_const): Delete.
* config/arm/vec-common.md (vec_perm_const<mode>): Delete.
* config/arm/arm.c (TARGET_VECTORIZE_VEC_PERM_CONST_OK): Delete.
(TARGET_VECTORIZE_VEC_PERM_CONST): Redefine.
(arm_expand_vec_perm_const, arm_vectorize_vec_perm_const_ok): Merge
into...
(arm_vectorize_vec_perm_const): ...this new function. Explicitly
check for NEON modes.
* config/i386/i386-protos.h (ix86_expand_vec_perm_const): Delete.
* config/i386/sse.md (VEC_PERM_CONST, vec_perm_const<mode>): Delete.
* config/i386/i386.c (ix86_expand_vec_perm_const_1): Update comment.
(ix86_expand_vec_perm_const, ix86_vectorize_vec_perm_const_ok): Merge
into...
(ix86_vectorize_vec_perm_const): ...this new function. Incorporate
the old VEC_PERM_CONST conditions.
* config/ia64/ia64-protos.h (ia64_expand_vec_perm_const): Delete.
* config/ia64/vect.md (vec_perm_const<mode>): Delete.
* config/ia64/ia64.c (ia64_expand_vec_perm_const)
(ia64_vectorize_vec_perm_const_ok): Merge into...
(ia64_vectorize_vec_perm_const): ...this new function.
* config/mips/loongson.md (vec_perm_const<mode>): Delete.
* config/mips/mips-msa.md (vec_perm_const<mode>): Delete.
* config/mips/mips-ps-3d.md (vec_perm_constv2sf): Delete.
* config/mips/mips-protos.h (mips_expand_vec_perm_const): Delete.
* config/mips/mips.c (mips_expand_vec_perm_const)
(mips_vectorize_vec_perm_const_ok): Merge into...
(mips_vectorize_vec_perm_const): ...this new function.
* config/powerpcspe/altivec.md (vec_perm_constv16qi): Delete.
* config/powerpcspe/paired.md (vec_perm_constv2sf): Delete.
* config/powerpcspe/spe.md (vec_perm_constv2si): Delete.
* config/powerpcspe/vsx.md (vec_perm_const<mode>): Delete.
* config/powerpcspe/powerpcspe-protos.h (altivec_expand_vec_perm_const)
(rs6000_expand_vec_perm_const): Delete.
* config/powerpcspe/powerpcspe.c (TARGET_VECTORIZE_VEC_PERM_CONST_OK):
Delete.
(TARGET_VECTORIZE_VEC_PERM_CONST): Redefine.
(altivec_expand_vec_perm_const_le): Take each operand individually.
Operate on constant selectors rather than rtxes.
(altivec_expand_vec_perm_const): Likewise. Update call to
altivec_expand_vec_perm_const_le.
(rs6000_expand_vec_perm_const): Delete.
(rs6000_vectorize_vec_perm_const_ok): Delete.
(rs6000_vectorize_vec_perm_const): New function.
(rs6000_do_expand_vec_perm): Take a vec_perm_builder instead of
an element count and rtx array.
(rs6000_expand_extract_even): Update call accordingly.
(rs6000_expand_interleave): Likewise.
* config/rs6000/altivec.md (vec_perm_constv16qi): Delete.
* config/rs6000/paired.md (vec_perm_constv2sf): Delete.
* config/rs6000/vsx.md (vec_perm_const<mode>): Delete.
* config/rs6000/rs6000-protos.h (altivec_expand_vec_perm_const)
(rs6000_expand_vec_perm_const): Delete.
* config/rs6000/rs6000.c (TARGET_VECTORIZE_VEC_PERM_CONST_OK): Delete.
(TARGET_VECTORIZE_VEC_PERM_CONST): Redefine.
(altivec_expand_vec_perm_const_le): Take each operand individually.
Operate on constant selectors rather than rtxes.
(altivec_expand_vec_perm_const): Likewise. Update call to
altivec_expand_vec_perm_const_le.
(rs6000_expand_vec_perm_const): Delete.
(rs6000_vectorize_vec_perm_const_ok): Delete.
(rs6000_vectorize_vec_perm_const): New function. Remove stray
reference to the SPE evmerge instructions.
(rs6000_do_expand_vec_perm): Take a vec_perm_builder instead of
an element count and rtx array.
(rs6000_expand_extract_even): Update call accordingly.
(rs6000_expand_interleave): Likewise.
* config/sparc/sparc.md (vec_perm_constv8qi): Delete in favor of...
* config/sparc/sparc.c (sparc_vectorize_vec_perm_const): ...this
new function.
(TARGET_VECTORIZE_VEC_PERM_CONST): Redefine.
From-SVN: r256093
|
|
This patch splits can_vec_perm_p into two functions: can_vec_perm_var_p
for testing permute operations with variable selection vectors, and
can_vec_perm_const_p for testing permute operations with specific
constant selection vectors. This means that we can pass the constant
selection vector by reference.
Constant permutes can still use a variable permute as a fallback.
A later patch adds a check to make sure that we don't truncate the
vector indices when doing this.
However, have_whole_vector_shift checked:
if (direct_optab_handler (vec_perm_const_optab, mode) == CODE_FOR_nothing)
return false;
which had the effect of disallowing the fallback to variable permutes.
I'm not sure whether that was the intention or whether it was just
supposed to short-cut the loop on targets that don't support permutes.
(But then why bother? The first check in the loop would fail and
we'd bail out straightaway.)
The patch adds a parameter for disallowing the fallback. I think it
makes sense to do this for the following code in the VEC_PERM_EXPR
folder:
/* Some targets are deficient and fail to expand a single
argument permutation while still allowing an equivalent
2-argument version. */
if (need_mask_canon && arg2 == op2
&& !can_vec_perm_p (TYPE_MODE (type), false, &sel)
&& can_vec_perm_p (TYPE_MODE (type), false, &sel2))
since it's really testing whether the expand_vec_perm_const code expects
a particular form.
2018-01-02 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* optabs-query.h (can_vec_perm_p): Delete.
(can_vec_perm_var_p, can_vec_perm_const_p): Declare.
* optabs-query.c (can_vec_perm_p): Split into...
(can_vec_perm_var_p, can_vec_perm_const_p): ...these two functions.
(can_mult_highpart_p): Use can_vec_perm_const_p to test whether a
particular selector is valid.
* tree-ssa-forwprop.c (simplify_vector_constructor): Likewise.
* tree-vect-data-refs.c (vect_grouped_store_supported): Likewise.
(vect_grouped_load_supported): Likewise.
(vect_shift_permute_load_chain): Likewise.
* tree-vect-slp.c (vect_build_slp_tree_1): Likewise.
(vect_transform_slp_perm_load): Likewise.
* tree-vect-stmts.c (perm_mask_for_reverse): Likewise.
(vectorizable_bswap): Likewise.
(vect_gen_perm_mask_checked): Likewise.
* fold-const.c (fold_ternary_loc): Likewise. Don't take
implementations of variable permutation vectors into account
when deciding which selector to use.
* tree-vect-loop.c (have_whole_vector_shift): Don't check whether
vec_perm_const_optab is supported; instead use can_vec_perm_const_p
with a false third argument.
* tree-vect-generic.c (lower_vec_perm): Use can_vec_perm_const_p
to test whether the constant selector is valid and can_vec_perm_var_p
to test whether a variable selector is valid.
From-SVN: r256091
|
|
Similarly to the VEC_DUPLICATE_EXPR, this patch adds a tree code
equivalent of the VEC_SERIES rtx code: VEC_SERIES_EXPR.
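As a reminder of what a series vector holds, the sketch below builds the
fixed-length equivalent: element i is base + i * step. The function name
toy_vec_series is hypothetical and only models the value, not the tree
code or the build_vec_series function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a series vector: element I is BASE + I * STEP.
inline std::vector<long> toy_vec_series(long base, long step,
                                        std::size_t nelts) {
  assert(nelts > 0);
  std::vector<long> result;
  result.reserve(nelts);
  long value = base;
  for (std::size_t i = 0; i < nelts; ++i) {
    result.push_back(value);
    value += step;
  }
  return result;
}
```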
2017-12-16 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/generic.texi (VEC_SERIES_EXPR): Document.
* doc/md.texi (vec_series@var{m}): Document.
* tree.def (VEC_SERIES_EXPR): New tree code.
* tree.h (build_vec_series): Declare.
* tree.c (build_vec_series): New function.
* cfgexpand.c (expand_debug_expr): Handle VEC_SERIES_EXPR.
* tree-pretty-print.c (dump_generic_node): Likewise.
* gimple-pretty-print.c (dump_binary_rhs): Likewise.
* tree-inline.c (estimate_operator_cost): Likewise.
* expr.c (expand_expr_real_2): Likewise.
* optabs-tree.c (optab_for_tree_code): Likewise.
* tree-cfg.c (verify_gimple_assign_binary): Likewise.
* fold-const.c (const_binop): Fold VEC_SERIES_EXPRs of constants.
* expmed.c (make_tree): Handle VEC_SERIES.
* optabs.def (vec_series_optab): New optab.
* optabs.h (expand_vec_series_expr): Declare.
* optabs.c (expand_vec_series_expr): New function.
* tree-vect-generic.c (expand_vector_operations_1): Check that
the operands also have vector type.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r255741
|
|
SVE needs a way of broadcasting a scalar to a variable-length vector.
This patch adds VEC_DUPLICATE_EXPR for when CONSTRUCTOR would be used
for fixed-length vectors; this is the tree equivalent of the existing
rtl code VEC_DUPLICATE.
The patch also adds a vec_duplicate_optab to go with VEC_DUPLICATE_EXPR.
2017-12-16 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/generic.texi (VEC_DUPLICATE_EXPR): Document.
(VEC_COND_EXPR): Add missing @tindex.
* doc/md.texi (vec_duplicate@var{m}): Document.
* tree.def (VEC_DUPLICATE_EXPR): New tree code.
* tree.c (build_vector_from_val): Add stubbed-out handling of
variable-length vectors, using VEC_DUPLICATE_EXPR.
(uniform_vector_p): Handle VEC_DUPLICATE_EXPR.
* cfgexpand.c (expand_debug_expr): Likewise.
* tree-cfg.c (verify_gimple_assign_unary): Likewise.
* tree-inline.c (estimate_operator_cost): Likewise.
* tree-pretty-print.c (dump_generic_node): Likewise.
* tree-vect-generic.c (ssa_uniform_vector_p): Likewise.
* fold-const.c (const_unop): Fold VEC_DUPLICATE_EXPRs of a constant.
(test_vec_duplicate_folding): New function.
(fold_const_c_tests): Call it.
* optabs.def (vec_duplicate_optab): New optab.
* optabs-tree.c (optab_for_tree_code): Handle VEC_DUPLICATE_EXPR.
* optabs.h (expand_vector_broadcast): Declare.
* optabs.c (expand_vector_broadcast): Make non-static. Try using
vec_duplicate_optab.
* expr.c (store_constructor): Try using vec_duplicate_optab for
uniform vectors.
(expand_expr_real_2): Handle VEC_DUPLICATE_EXPR.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r255740
|
|
This patch switches most build_vector calls over to tree_vector_builder,
using explicit encodings where appropriate. Later patches handle
the remaining uses of build_vector.
2017-12-07 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* config/sparc/sparc.c: Include tree-vector-builder.h.
(sparc_fold_builtin): Use tree_vector_builder instead of build_vector.
* expmed.c: Include tree-vector-builder.h.
(make_tree): Use tree_vector_builder instead of build_vector.
* fold-const.c: Include tree-vector-builder.h.
(const_binop): Use tree_vector_builder instead of build_vector.
(const_unop): Likewise.
(native_interpret_vector): Likewise.
(fold_vec_perm): Likewise.
(fold_ternary_loc): Likewise.
* gimple-fold.c: Include tree-vector-builder.h.
(gimple_fold_stmt_to_constant_1): Use tree_vector_builder instead
of build_vector.
* tree-ssa-forwprop.c: Include tree-vector-builder.h.
(simplify_vector_constructor): Use tree_vector_builder instead
of build_vector.
* tree-vect-generic.c: Include tree-vector-builder.h.
(add_rshift): Use tree_vector_builder instead of build_vector.
(expand_vector_divmod): Likewise.
(optimize_vector_constructor): Likewise.
* tree-vect-loop.c: Include tree-vector-builder.h.
(vect_create_epilog_for_reduction): Use tree_vector_builder instead
of build_vector. Explicitly use a stepped encoding for
{ 1, 2, 3, ... }.
* tree-vect-slp.c: Include tree-vector-builder.h.
(vect_get_constant_vectors): Use tree_vector_builder instead
of build_vector.
(vect_transform_slp_perm_load): Likewise.
(vect_schedule_slp_instance): Likewise.
* tree-vect-stmts.c: Include tree-vector-builder.h.
(vectorizable_bswap): Use tree_vector_builder instead of build_vector.
(vect_gen_perm_mask_any): Likewise.
(vectorizable_call): Likewise. Explicitly use a stepped encoding.
* tree.c (build_vector_from_ctor): Use tree_vector_builder instead
of build_vector.
(build_vector_from_val): Likewise. Explicitly use a duplicate
encoding.
From-SVN: r255475
|
|
This patch makes can_vec_perm_p & co. take a vec<>, wrapped in new
typedefs vec_perm_indices and auto_vec_perm_indices. There are two
reasons for doing this for SVE:
(1) it means that the number of elements is bundled with the elements
themselves, and is obviously constant.
(2) it makes it easier to change the "unsigned char" element type to
something wider.
Changing the target hook is left as follow-on work.
2017-09-14 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* target.h (vec_perm_indices): New typedef.
(auto_vec_perm_indices): Likewise.
* optabs-query.h: Include target.h.
(can_vec_perm_p): Take a vec_perm_indices *.
* optabs-query.c (can_vec_perm_p): Likewise.
(can_mult_highpart_p): Update accordingly. Use auto_vec_perm_indices.
* tree-ssa-forwprop.c (simplify_vector_constructor): Likewise.
* tree-vect-generic.c (lower_vec_perm): Likewise.
* tree-vect-data-refs.c (vect_grouped_store_supported): Likewise.
(vect_grouped_load_supported): Likewise.
(vect_shift_permute_load_chain): Likewise.
(vect_permute_store_chain): Use auto_vec_perm_indices.
(vect_permute_load_chain): Likewise.
* fold-const.c (fold_vec_perm): Take vec_perm_indices.
(fold_ternary_loc): Update accordingly. Use auto_vec_perm_indices.
Update uses of can_vec_perm_p.
* tree-vect-loop.c (calc_vec_perm_mask_for_shift): Replace the
mode with a number of elements. Take a vec_perm_indices *.
(vect_create_epilog_for_reduction): Update accordingly.
Use auto_vec_perm_indices.
(have_whole_vector_shift): Likewise. Update call to can_vec_perm_p.
* tree-vect-slp.c (vect_build_slp_tree_1): Likewise.
(vect_transform_slp_perm_load): Likewise.
(vect_schedule_slp_instance): Use auto_vec_perm_indices.
* tree-vectorizer.h (vect_gen_perm_mask_any): Take a vec_perm_indices.
(vect_gen_perm_mask_checked): Likewise.
* tree-vect-stmts.c (vect_gen_perm_mask_any): Take a vec_perm_indices.
(vect_gen_perm_mask_checked): Likewise.
(vectorizable_mask_load_store): Use auto_vec_perm_indices.
(vectorizable_store): Likewise.
(vectorizable_load): Likewise.
(perm_mask_for_reverse): Likewise. Update call to can_vec_perm_p.
(vectorizable_bswap): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r252761
|
|
This patch makes build_vector take the elements as a vec<> rather
than a tree *. This is useful for SVE because it bundles the number
of elements with the elements themselves, and enforces the fact that
the number is constant. Also, I think things like the folds can be used
with any generic GNU vector, not just those that match machine vectors,
so the arguments to XALLOCAVEC had no clear limit.
2017-09-14 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree.h (build_vector): Take a vec<tree> instead of a tree *.
* tree.c (build_vector): Likewise.
(build_vector_from_ctor): Update accordingly.
(build_vector_from_val): Likewise.
* gimple-fold.c (gimple_fold_stmt_to_constant_1): Likewise.
* tree-ssa-forwprop.c (simplify_vector_constructor): Likewise.
* tree-vect-generic.c (add_rshift): Likewise.
(expand_vector_divmod): Likewise.
(optimize_vector_constructor): Likewise.
* tree-vect-slp.c (vect_get_constant_vectors): Likewise.
(vect_transform_slp_perm_load): Likewise.
(vect_schedule_slp_instance): Likewise.
* tree-vect-stmts.c (vectorizable_bswap): Likewise.
(vectorizable_call): Likewise.
(vect_gen_perm_mask_any): Likewise. Add elements in order.
* expmed.c (make_tree): Likewise.
* fold-const.c (fold_negate_expr_1): Use auto_vec<tree> when building
a vector passed to build_vector.
(fold_convert_const): Likewise.
(exact_inverse): Likewise.
(fold_ternary_loc): Likewise.
(fold_relational_const): Likewise.
(const_binop): Likewise. Use VECTOR_CST_ELT directly when operating
on VECTOR_CSTs, rather than going through vec_cst_ctor_to_array.
(const_unop): Likewise. Store the reduction accumulator in a
variable rather than an array.
(vec_cst_ctor_to_array): Take the number of elements as a parameter.
(fold_vec_perm): Update calls accordingly. Use auto_vec<tree> for
the new vector, rather than constructing it after the input arrays.
(native_interpret_vector): Use auto_vec<tree> when building
a vector passed to build_vector. Add elements in order.
* tree-vect-loop.c (get_initial_defs_for_reduction): Use
auto_vec<tree> when building a vector passed to build_vector.
(vect_create_epilog_for_reduction): Likewise.
(vectorizable_induction): Likewise.
(get_initial_def_for_reduction): Likewise. Fix indentation of
case statements.
* config/sparc/sparc.c (sparc_handle_vis_mul8x16): Change n_elts
to a vec<tree> *.
(sparc_fold_builtin): Use auto_vec<tree> when building a vector
passed to build_vector.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r252760
|
|
we cannot scalarize.
2017-09-12 Richard Biener <rguenther@suse.de>
* tree-vect-generic.c (expand_vector_operations_1): Do nothing
for operations we cannot scalarize.
From-SVN: r252002
|
|
This patch adds a wrapper around mode_for_size for cases in which
the mode class is MODE_INT (the commonest case). The return type
can then be an opt_scalar_int_mode instead of a machine_mode.
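The benefit of an optional return type can be sketched with a toy model:
the caller has to ask whether a mode exists before using it, instead of
comparing against a sentinel. The enum, class and function below are
hypothetical simplifications; the real int_mode_for_size has a different
signature and the real opt_scalar_int_mode is richer.

```cpp
#include <cassert>

// Hypothetical set of integer modes, identified only by their size.
enum toy_int_mode { TOY_QI, TOY_HI, TOY_SI, TOY_DI };

// Minimal "optional mode": either holds a mode or holds nothing.
class toy_opt_int_mode {
public:
  toy_opt_int_mode() : m_has_mode(false), m_mode(TOY_QI) {}
  toy_opt_int_mode(toy_int_mode m) : m_has_mode(true), m_mode(m) {}

  // True if a mode is present.
  bool exists() const { return m_has_mode; }

  // Return the mode, asserting that one is present.
  toy_int_mode require() const {
    assert(m_has_mode);
    return m_mode;
  }

private:
  bool m_has_mode;
  toy_int_mode m_mode;
};

// Return the toy integer mode with exactly BITS bits, if there is one.
inline toy_opt_int_mode toy_int_mode_for_size(unsigned bits) {
  switch (bits) {
  case 8:  return TOY_QI;
  case 16: return TOY_HI;
  case 32: return TOY_SI;
  case 64: return TOY_DI;
  default: return toy_opt_int_mode();
  }
}
```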
2017-08-30 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* machmode.h (int_mode_for_size): New function.
* builtins.c (set_builtin_user_assembler_name): Use int_mode_for_size
instead of mode_for_size.
* calls.c (save_fixed_argument_area): Likewise. Make use of BLKmode
explicit.
* combine.c (expand_field_assignment): Use int_mode_for_size
instead of mode_for_size.
(make_extraction): Likewise.
(simplify_shift_const_1): Likewise.
(simplify_comparison): Likewise.
* dojump.c (do_jump): Likewise.
* dwarf2out.c (mem_loc_descriptor): Likewise.
* emit-rtl.c (init_derived_machine_modes): Likewise.
* expmed.c (flip_storage_order): Likewise.
(convert_extracted_bit_field): Likewise.
* expr.c (copy_blkmode_from_reg): Likewise.
* graphite-isl-ast-to-gimple.c (max_mode_int_precision): Likewise.
* internal-fn.c (expand_mul_overflow): Likewise.
* lower-subreg.c (simple_move): Likewise.
* optabs-libfuncs.c (init_optabs): Likewise.
* simplify-rtx.c (simplify_unary_operation_1): Likewise.
* tree.c (vector_type_mode): Likewise.
* tree-ssa-strlen.c (handle_builtin_memcmp): Likewise.
* tree-vect-data-refs.c (vect_lanes_optab_supported_p): Likewise.
* tree-vect-generic.c (expand_vector_parallel): Likewise.
* tree-vect-stmts.c (vectorizable_load): Likewise.
(vectorizable_store): Likewise.
gcc/ada/
* gcc-interface/decl.c (gnat_to_gnu_entity): Use int_mode_for_size
instead of mode_for_size.
(gnat_to_gnu_subprog_type): Likewise.
* gcc-interface/utils.c (make_type_from_size): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r251469
|
|
The new iterators are:
- FOR_EACH_MODE_IN_CLASS: iterate over all the modes in a mode class.
- FOR_EACH_MODE_FROM: iterate over all the modes in a class,
starting at a given mode.
- FOR_EACH_WIDER_MODE: iterate over all the modes in a class,
starting at the next widest mode after a given mode.
- FOR_EACH_2XWIDER_MODE: same, but considering only modes that
are two times wider than the previous mode.
- FOR_EACH_MODE_UNTIL: iterate over all the modes in a class until
a given mode is reached.
- FOR_EACH_MODE: iterate over all the modes in a class between
two given modes, inclusive of the first but not the second.
These help with the stronger type checking added by later patches,
since every new mode will be in the same class as the previous one.
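As a rough illustration of the difference between the "wider" and "2x wider" iterators described above, here is a self-contained sketch over a toy mode class. Everything in it (the toy_bits table, next_wider, next_2xwider and the TOY_ macros) is hypothetical and only mimics the described behaviour; it is not the machmode.h implementation.

```cpp
// Toy "mode class": widths in bits, narrowest first.  Index = mode.
constexpr int toy_bits[] = {8, 16, 24, 32, 64};
constexpr int NUM_TOY_MODES = sizeof (toy_bits) / sizeof (toy_bits[0]);
constexpr int NO_MODE = -1;

// Next mode in the class after M, or NO_MODE at the end of the class.
constexpr int next_wider (int m)
{
  return m + 1 < NUM_TOY_MODES ? m + 1 : NO_MODE;
}

// Mode exactly twice as wide as M, or NO_MODE if the class has none.
constexpr int next_2xwider (int m)
{
  for (int i = m + 1; i < NUM_TOY_MODES; ++i)
    if (toy_bits[i] == 2 * toy_bits[m])
      return i;
  return NO_MODE;
}

// Loop macros in the spirit of the iterators listed above.
#define TOY_FOR_EACH_WIDER_MODE(IT, START) \
  for (int IT = next_wider (START); IT != NO_MODE; IT = next_wider (IT))
#define TOY_FOR_EACH_2XWIDER_MODE(IT, START) \
  for (int IT = next_2xwider (START); IT != NO_MODE; IT = next_2xwider (IT))

// Count the modes each iterator visits when starting from START.
constexpr int count_wider (int start)
{
  int n = 0;
  TOY_FOR_EACH_WIDER_MODE (m, start)
    ++n;
  return n;
}

constexpr int count_2xwider (int start)
{
  int n = 0;
  TOY_FOR_EACH_2XWIDER_MODE (m, start)
    ++n;
  return n;
}
```

Starting from the 8-bit toy mode, the wider iterator visits the 16, 24, 32 and 64-bit modes, while the 2x-wider iterator skips the 24-bit mode and visits only 16, 32 and 64.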
2017-08-30 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* machmode.h (mode_traits): New structure.
(get_narrowest_mode): New function.
(mode_iterator::start): Likewise.
(mode_iterator::iterate_p): Likewise.
(mode_iterator::get_wider): Likewise.
(mode_iterator::get_known_wider): Likewise.
(mode_iterator::get_2xwider): Likewise.
(FOR_EACH_MODE_IN_CLASS): New mode iterator.
(FOR_EACH_MODE): Likewise.
(FOR_EACH_MODE_FROM): Likewise.
(FOR_EACH_MODE_UNTIL): Likewise.
(FOR_EACH_WIDER_MODE): Likewise.
(FOR_EACH_2XWIDER_MODE): Likewise.
* builtins.c (expand_builtin_strlen): Use new mode iterators.
* combine.c (simplify_comparison): Likewise.
* config/i386/i386.c (type_natural_mode): Likewise.
* cse.c (cse_insn): Likewise.
* dse.c (find_shift_sequence): Likewise.
* emit-rtl.c (init_derived_machine_modes): Likewise.
(init_emit_once): Likewise.
* explow.c (hard_function_value): Likewise.
* expmed.c (extract_fixed_bit_field_1): Likewise.
(extract_bit_field_1): Likewise.
(expand_divmod): Likewise.
(emit_store_flag_1): Likewise.
* expr.c (init_expr_target): Likewise.
(convert_move): Likewise.
(alignment_for_piecewise_move): Likewise.
(widest_int_mode_for_size): Likewise.
(emit_block_move_via_movmem): Likewise.
(copy_blkmode_to_reg): Likewise.
(set_storage_via_setmem): Likewise.
(compress_float_constant): Likewise.
* omp-low.c (omp_clause_aligned_alignment): Likewise.
* optabs-query.c (get_best_extraction_insn): Likewise.
* optabs.c (expand_binop): Likewise.
(expand_twoval_unop): Likewise.
(expand_twoval_binop): Likewise.
(widen_leading): Likewise.
(widen_bswap): Likewise.
(expand_parity): Likewise.
(expand_unop): Likewise.
(prepare_cmp_insn): Likewise.
(prepare_float_lib_cmp): Likewise.
(expand_float): Likewise.
(expand_fix): Likewise.
(expand_sfix_optab): Likewise.
* postreload.c (move2add_use_add2_insn): Likewise.
* reg-stack.c (reg_to_stack): Likewise.
* reginfo.c (choose_hard_reg_mode): Likewise.
* rtlanal.c (init_num_sign_bit_copies_in_rep): Likewise.
* stor-layout.c (mode_for_size): Likewise.
(smallest_mode_for_size): Likewise.
(mode_for_vector): Likewise.
(finish_bitfield_representative): Likewise.
* tree-ssa-math-opts.c (target_supports_divmod_p): Likewise.
* tree-vect-generic.c (type_for_widest_vector_mode): Likewise.
* tree-vect-stmts.c (vectorizable_conversion): Likewise.
* var-tracking.c (prepare_call_arguments): Likewise.
gcc/ada/
* gcc-interface/misc.c (fp_prec_to_size): Use new mode iterators.
(fp_size_to_prec): Likewise.
gcc/c-family/
* c-common.c (c_common_fixed_point_type_for_size): Use new mode
iterators.
* c-cppbuiltin.c (c_cpp_builtins): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r251455
|
|
PR tree-optimization/79734
* tree-vect-generic.c (expand_vector_condition): Optimize
AVX512 vector boolean VEC_COND_EXPRs into bitwise operations.
Handle VEC_COND_EXPR where comparison has different inner width from
type's inner width.
* g++.dg/opt/pr79734.C: New test.
From-SVN: r245801
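The entry above mentions turning vector boolean VEC_COND_EXPRs into bitwise operations. When each lane is a single mask bit, a select can be written with AND, OR and NOT. The sketch below checks that identity on a toy 8-lane mask; toy_mask, select_mask and select_lanewise are hypothetical names, and the exact form GCC emits is an assumption, not taken from the source.

```cpp
#include <cstdint>

// One bit per vector lane, as in a mask register (toy 8-lane model).
using toy_mask = std::uint8_t;

// cond ? a : b on every lane at once, using only bitwise operations.
constexpr toy_mask
select_mask (toy_mask cond, toy_mask a, toy_mask b)
{
  return static_cast<toy_mask> ((cond & a) | (~cond & b));
}

// Reference version: pick each lane's bit individually.
constexpr toy_mask
select_lanewise (toy_mask cond, toy_mask a, toy_mask b)
{
  toy_mask result = 0;
  for (int lane = 0; lane < 8; ++lane)
    {
      toy_mask bit = static_cast<toy_mask> (1u << lane);
      result = static_cast<toy_mask> (result | ((cond & bit) ? (a & bit) : (b & bit)));
    }
  return result;
}
```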
|
|
From-SVN: r243994
|
|
PR target/78102
* optabs.def (vcondeq_optab, vec_cmpeq_optab): New optabs.
* optabs.c (expand_vec_cond_expr): For comparison codes
EQ_EXPR and NE_EXPR, attempt vcondeq_optab as fallback.
(expand_vec_cmp_expr): For comparison codes
EQ_EXPR and NE_EXPR, attempt vec_cmpeq_optab as fallback.
* optabs-tree.h (expand_vec_cmp_expr_p, expand_vec_cond_expr_p):
Add enum tree_code argument.
* optabs-query.h (get_vec_cmp_eq_icode, get_vcond_eq_icode): New
inline functions.
* optabs-tree.c (expand_vec_cmp_expr_p): Add CODE argument. For
CODE EQ_EXPR or NE_EXPR, attempt to use vec_cmpeq_optab as
fallback.
(expand_vec_cond_expr_p): Add CODE argument. For CODE EQ_EXPR or
NE_EXPR, attempt to use vcondeq_optab as fallback.
* tree-vect-generic.c (expand_vector_comparison,
expand_vector_divmod, expand_vector_condition): Adjust
expand_vec_cmp_expr_p and expand_vec_cond_expr_p callers.
* tree-vect-stmts.c (vectorizable_condition,
vectorizable_comparison): Likewise.
* tree-vect-patterns.c (vect_recog_mixed_size_cond_pattern,
check_bool_pattern, search_type_for_mask_1): Likewise.
* expr.c (do_store_flag): Likewise.
* doc/md.texi (@code{vec_cmpeq@var{m}@var{n}},
@code{vcondeq@var{m}@var{n}}): Document.
* config/i386/sse.md (vec_cmpeqv2div2di, vcondeq<VI8F_128:mode>v2di):
New expanders.
testsuite/
* gcc.target/i386/pr78102.c: New test.
From-SVN: r241525
|
|
* hwint.h (least_bit_hwi, pow2_or_zerop, pow2p_hwi, ctz_or_zero):
New.
* hwint.c (exact_log2): Use pow2p_hwi.
(ctz_hwi, ffs_hwi): Use least_bit_hwi.
* alias.c (memrefs_conflict_p): Use pow2_or_zerop.
* builtins.c (get_object_alignment_2, get_object_alignment)
(get_pointer_alignment, fold_builtin_atomic_always_lock_free): Use
least_bit_hwi.
* calls.c (compute_argument_addresses, store_one_arg): Use
least_bit_hwi.
* cfgexpand.c (expand_one_stack_var_at): Use least_bit_hwi.
* combine.c (force_to_mode): Use least_bit_hwi.
* emit-rtl.c (set_mem_attributes_minus_bitpos, adjust_address_1):
Use least_bit_hwi.
* expmed.c (synth_mult, expand_divmod): Use ctz_or_zero, ctz_hwi.
(init_expmed_one_conv): Use pow2p_hwi.
* fold-const.c (round_up_loc, round_down_loc): Use pow2_or_zerop.
(fold_binary_loc): Use pow2p_hwi.
* function.c (assign_parm_find_stack_rtl): Use least_bit_hwi.
* gimple-fold.c (gimple_fold_builtin_memory_op): Use pow2p_hwi.
* gimple-ssa-strength-reduction.c (replace_ref): Use least_bit_hwi.
* hsa-gen.c (gen_hsa_addr_with_align, hsa_bitmemref_alignment):
Use least_bit_hwi.
* ipa-cp.c (ipcp_alignment_lattice::meet_with_1): Use least_bit_hwi.
* ipa-prop.c (ipa_modify_call_arguments): Use least_bit_hwi.
* omp-low.c (oacc_loop_fixed_partitions)
(oacc_loop_auto_partitions): Use least_bit_hwi.
* rtlanal.c (nonzero_bits1): Use ctz_or_zero.
* stor-layout.c (place_field): Use least_bit_hwi.
* tree-pretty-print.c (dump_generic_node): Use pow2p_hwi.
* tree-sra.c (build_ref_for_offset): Use least_bit_hwi.
* tree-ssa-ccp.c (ccp_finalize): Use least_bit_hwi.
* tree-ssa-math-opts.c (bswap_replace): Use least_bit_hwi.
* tree-ssa-strlen.c (handle_builtin_memcmp): Use pow2p_hwi.
* tree-vect-data-refs.c (vect_analyze_group_access_1)
(vect_grouped_store_supported, vect_grouped_load_supported)
(vect_permute_load_chain, vect_shift_permute_load_chain)
(vect_transform_grouped_load): Use pow2p_hwi.
* tree-vect-generic.c (expand_vector_divmod): Use ctz_or_zero.
* tree-vect-patterns.c (vect_recog_divmod_pattern): Use ctz_or_zero.
* tree-vect-stmts.c (vectorizable_mask_load_store): Use
least_bit_hwi.
* tsan.c (instrument_expr): Use least_bit_hwi.
* var-tracking.c (negative_power_of_two_p): Use pow2_or_zerop.
From-SVN: r240194
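For readers unfamiliar with the bit tricks these helpers wrap, here is a minimal sketch. The toy_ functions are hypothetical stand-ins whose behaviour is inferred from the helper names in the entry above; the real definitions in hwint.h may differ, in particular for a zero argument.

```cpp
#include <cstdint>

using toy_uhwi = std::uint64_t;  // assumed 64-bit unsigned HOST_WIDE_INT

// Keep only the lowest set bit of X (x & -x in two's complement).
constexpr toy_uhwi toy_least_bit (toy_uhwi x)
{
  return x & (~x + 1);
}

// True if X is zero or has exactly one bit set.
constexpr bool toy_pow2_or_zerop (toy_uhwi x)
{
  return toy_least_bit (x) == x;
}

// True if X has exactly one bit set.
constexpr bool toy_pow2p (toy_uhwi x)
{
  return x != 0 && toy_pow2_or_zerop (x);
}

// Number of trailing zero bits in X; this sketch assumes 0 for X == 0.
constexpr int toy_ctz_or_zero (toy_uhwi x)
{
  if (x == 0)
    return 0;
  int n = 0;
  while ((x & 1) == 0)
    {
      x >>= 1;
      ++n;
    }
  return n;
}
```

Naming the idioms replaces open-coded expressions such as x & -x at each call site, which is what most of the per-file changes in this entry amount to.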
|
|
* cse.c: Use HOST_WIDE_INT_M1 instead of ~(HOST_WIDE_INT) 0.
* combine.c: Use HOST_WIDE_INT_M1U instead of
~(unsigned HOST_WIDE_INT) 0.
* double-int.h: Ditto.
* dse.c: Ditto.
* dwarf2asm.c: Ditto.
* expmed.c: Ditto.
* genmodes.c: Ditto.
* match.pd: Ditto.
* read-rtl.c: Ditto.
* tree-ssa-loop-ivopts.c: Ditto.
* tree-ssa-loop-prefetch.c: Ditto.
* tree-vect-generic.c: Ditto.
* tree-vect-patterns.c: Ditto.
* tree.c: Ditto.
From-SVN: r238529
|
|
* builtins.c: Use HOST_WIDE_INT_1 instead of (HOST_WIDE_INT) 1,
HOST_WIDE_INT_1U instead of (unsigned HOST_WIDE_INT) 1,
HOST_WIDE_INT_M1 instead of (HOST_WIDE_INT) -1 and
HOST_WIDE_INT_M1U instead of (unsigned HOST_WIDE_INT) -1.
* combine.c: Ditto.
* cse.c: Ditto.
* dojump.c: Ditto.
* double-int.c: Ditto.
* dse.c: Ditto.
* dwarf2out.c: Ditto.
* expmed.c: Ditto.
* expr.c: Ditto.
* fold-const.c: Ditto.
* function.c: Ditto.
* fwprop.c: Ditto.
* genmodes.c: Ditto.
* hwint.c: Ditto.
* hwint.h: Ditto.
* ifcvt.c: Ditto.
* loop-doloop.c: Ditto.
* loop-invariant.c: Ditto.
* loop-iv.c: Ditto.
* match.pd: Ditto.
* optabs.c: Ditto.
* real.c: Ditto.
* reload.c: Ditto.
* rtlanal.c: Ditto.
* simplify-rtx.c: Ditto.
* stor-layout.c: Ditto.
* toplev.c: Ditto.
* tree-ssa-loop-ivopts.c: Ditto.
* tree-vect-generic.c: Ditto.
* tree-vect-patterns.c: Ditto.
* tree.c: Ditto.
* tree.h: Ditto.
* ubsan.c: Ditto.
* varasm.c: Ditto.
* wide-int-print.cc: Ditto.
* wide-int.cc: Ditto.
* wide-int.h: Ditto.
From-SVN: r238481
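The point of the named constants is that the 1 or -1 already has the wide type before any shift or comparison happens. The sketch below uses hypothetical TOY_ macros, and assumes a 64-bit HOST_WIDE_INT, to show the pattern; it does not reproduce the actual hwint.h definitions.

```cpp
#include <cstdint>

using toy_hwi = std::int64_t;    // assumed signed HOST_WIDE_INT
using toy_uhwi = std::uint64_t;  // assumed unsigned HOST_WIDE_INT

// Hypothetical equivalents of the named constants in the entry above.
#define TOY_HWI_1   (static_cast<toy_hwi> (1))
#define TOY_HWI_1U  (static_cast<toy_uhwi> (1))
#define TOY_HWI_M1  (static_cast<toy_hwi> (-1))
#define TOY_HWI_M1U (static_cast<toy_uhwi> (-1))

// Mask with the low N bits set.  A plain "1 << n" would be an int shift
// and is undefined for n >= 32 on typical hosts; the wide constant is not.
constexpr toy_uhwi low_bits_mask (unsigned n)
{
  return n >= 64 ? TOY_HWI_M1U : (TOY_HWI_1U << n) - 1;
}
```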
|