Age | Commit message (Collapse) | Author | Files | Lines |
|
This is mostly a mechanical extension of the previous gather load
support to scatter stores. The internal functions in this case are:
IFN_SCATTER_STORE (base, offsets, scale, values)
IFN_MASK_SCATTER_STORE (base, offsets, scale, values, mask)
However, one nonobvious change is to vect_analyze_data_ref_access.
If we're treating an access as a gather load or scatter store
(i.e. if STMT_VINFO_GATHER_SCATTER_P is true), the existing code
would create a dummy data_reference whose step is 0. There's not
really much else it could do, since the whole point is that the
step isn't predictable from iteration to iteration. We then
went into this code in vect_analyze_data_ref_access:
/* Allow loads with zero step in inner-loop vectorization. */
if (loop_vinfo && integer_zerop (step))
{
GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt)) = NULL;
if (!nested_in_vect_loop_p (loop, stmt))
return DR_IS_READ (dr);
I.e. we'd take the step literally and assume that this is a load
or store to an invariant address. Loads from invariant addresses
are supported but stores to them aren't.
The code therefore had the effect of disabling all scatter stores.
AFAICT this is true of AVX too: although tests like avx512f-scatter-1.c
test for the correctness of a scatter-like loop, they don't seem to
check whether a scatter instruction is actually used.
The patch therefore makes vect_analyze_data_ref_access return true
for scatters. We do seem to handle the aliasing correctly;
that's tested by other functions, and is symmetrical to the
already-working gather case.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/sourcebuild.texi (vect_scatter_store): Document.
* optabs.def (scatter_store_optab, mask_scatter_store_optab): New
optabs.
* doc/md.texi (scatter_store@var{m}, mask_scatter_store@var{m}):
Document.
* genopinit.c (main): Add supports_vec_scatter_store and
supports_vec_scatter_store_cached to target_optabs.
* gimple.h (gimple_expr_type): Handle IFN_SCATTER_STORE and
IFN_MASK_SCATTER_STORE.
* internal-fn.def (SCATTER_STORE, MASK_SCATTER_STORE): New internal
functions.
* internal-fn.h (internal_store_fn_p): Declare.
(internal_fn_stored_value_index): Likewise.
* internal-fn.c (scatter_store_direct): New macro.
(expand_scatter_store_optab_fn): New function.
(direct_scatter_store_optab_supported_p): New macro.
(internal_store_fn_p): New function.
(internal_gather_scatter_fn_p): Handle IFN_SCATTER_STORE and
IFN_MASK_SCATTER_STORE.
(internal_fn_mask_index): Likewise.
(internal_fn_stored_value_index): New function.
(internal_gather_scatter_fn_supported_p): Adjust operand numbers
for scatter stores.
* optabs-query.h (supports_vec_scatter_store_p): Declare.
* optabs-query.c (supports_vec_scatter_store_p): New function.
* tree-vectorizer.h (vect_get_store_rhs): Declare.
* tree-vect-data-refs.c (vect_analyze_data_ref_access): Return
true for scatter stores.
(vect_gather_scatter_fn_p): Handle scatter stores too.
(vect_check_gather_scatter): Consider using scatter stores if
supports_vec_scatter_store_p.
* tree-vect-patterns.c (vect_try_gather_scatter_pattern): Handle
scatter stores too.
* tree-vect-stmts.c (exist_non_indexing_operands_for_use_p): Use
internal_fn_stored_value_index.
(check_load_store_masking): Handle scatter stores too.
(vect_get_store_rhs): Make public.
(vectorizable_call): Use internal_store_fn_p.
(vectorizable_store): Handle scatter store internal functions.
(vect_transform_stmt): Compare GROUP_STORE_COUNT with GROUP_SIZE
when deciding whether the end of the group has been reached.
* config/aarch64/aarch64.md (UNSPEC_ST1_SCATTER): New unspec.
* config/aarch64/aarch64-sve.md (scatter_store<mode>): New expander.
(mask_scatter_store<mode>): New insns.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_vect_scatter_store):
New proc.
* gcc.dg/vect/pr25413a.c: Expect both loops to be optimized on
targets with scatter stores.
* gcc.dg/vect/vect-71.c: Restrict XFAIL to targets without scatter
stores.
* gcc.target/aarch64/sve/mask_scatter_store_1.c: New test.
* gcc.target/aarch64/sve/mask_scatter_store_2.c: Likewise.
* gcc.target/aarch64/sve/scatter_store_1.c: Likewise.
* gcc.target/aarch64/sve/scatter_store_2.c: Likewise.
* gcc.target/aarch64/sve/scatter_store_3.c: Likewise.
* gcc.target/aarch64/sve/scatter_store_4.c: Likewise.
* gcc.target/aarch64/sve/scatter_store_5.c: Likewise.
* gcc.target/aarch64/sve/scatter_store_6.c: Likewise.
* gcc.target/aarch64/sve/scatter_store_7.c: Likewise.
* gcc.target/aarch64/sve/strided_store_1.c: Likewise.
* gcc.target/aarch64/sve/strided_store_2.c: Likewise.
* gcc.target/aarch64/sve/strided_store_3.c: Likewise.
* gcc.target/aarch64/sve/strided_store_4.c: Likewise.
* gcc.target/aarch64/sve/strided_store_5.c: Likewise.
* gcc.target/aarch64/sve/strided_store_6.c: Likewise.
* gcc.target/aarch64/sve/strided_store_7.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256643
|
|
This patch adds support for SVE gather loads. It uses the basically
the same analysis code as the AVX gather support, but after that
there are two major differences:
- It uses new internal functions rather than target built-ins.
The interface is:
IFN_GATHER_LOAD (base, offsets scale)
IFN_MASK_GATHER_LOAD (base, offsets scale, mask)
which should be reasonably generic. One of the advantages of
using internal functions is that other passes can understand what
the functions do, but a more immediate advantage is that we can
query the underlying target pattern to see which scales it supports.
- It uses pattern recognition to convert the offset to the right width,
if it was originally narrower than that. This avoids having to do
a widening operation as part of the gather expansion itself.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/md.texi (gather_load@var{m}): Document.
(mask_gather_load@var{m}): Likewise.
* genopinit.c (main): Add supports_vec_gather_load and
supports_vec_gather_load_cached to target_optabs.
* optabs-tree.c (init_tree_optimization_optabs): Use
ggc_cleared_alloc to allocate target_optabs.
* optabs.def (gather_load_optab, mask_gather_laod_optab): New optabs.
* internal-fn.def (GATHER_LOAD, MASK_GATHER_LOAD): New internal
functions.
* internal-fn.h (internal_load_fn_p): Declare.
(internal_gather_scatter_fn_p): Likewise.
(internal_fn_mask_index): Likewise.
(internal_gather_scatter_fn_supported_p): Likewise.
* internal-fn.c (gather_load_direct): New macro.
(expand_gather_load_optab_fn): New function.
(direct_gather_load_optab_supported_p): New macro.
(direct_internal_fn_optab): New function.
(internal_load_fn_p): Likewise.
(internal_gather_scatter_fn_p): Likewise.
(internal_fn_mask_index): Likewise.
(internal_gather_scatter_fn_supported_p): Likewise.
* optabs-query.c (supports_at_least_one_mode_p): New function.
(supports_vec_gather_load_p): Likewise.
* optabs-query.h (supports_vec_gather_load_p): Declare.
* tree-vectorizer.h (gather_scatter_info): Add ifn, element_type
and memory_type field.
(NUM_PATTERNS): Bump to 15.
* tree-vect-data-refs.c: Include internal-fn.h.
(vect_gather_scatter_fn_p): New function.
(vect_describe_gather_scatter_call): Likewise.
(vect_check_gather_scatter): Try using internal functions for
gather loads. Recognize existing calls to a gather load function.
(vect_analyze_data_refs): Consider using gather loads if
supports_vec_gather_load_p.
* tree-vect-patterns.c (vect_get_load_store_mask): New function.
(vect_get_gather_scatter_offset_type): Likewise.
(vect_convert_mask_for_vectype): Likewise.
(vect_add_conversion_to_patterm): Likewise.
(vect_try_gather_scatter_pattern): Likewise.
(vect_recog_gather_scatter_pattern): New pattern recognizer.
(vect_vect_recog_func_ptrs): Add it.
* tree-vect-stmts.c (exist_non_indexing_operands_for_use_p): Use
internal_fn_mask_index and internal_gather_scatter_fn_p.
(check_load_store_masking): Take the gather_scatter_info as an
argument and handle gather loads.
(vect_get_gather_scatter_ops): New function.
(vectorizable_call): Check internal_load_fn_p.
(vectorizable_load): Likewise. Handle gather load internal
functions.
(vectorizable_store): Update call to check_load_store_masking.
* config/aarch64/aarch64.md (UNSPEC_LD1_GATHER): New unspec.
* config/aarch64/iterators.md (SVE_S, SVE_D): New mode iterators.
* config/aarch64/predicates.md (aarch64_gather_scale_operand_w)
(aarch64_gather_scale_operand_d): New predicates.
* config/aarch64/aarch64-sve.md (gather_load<mode>): New expander.
(mask_gather_load<mode>): New insns.
gcc/testsuite/
* gcc.target/aarch64/sve/gather_load_1.c: New test.
* gcc.target/aarch64/sve/gather_load_2.c: Likewise.
* gcc.target/aarch64/sve/gather_load_3.c: Likewise.
* gcc.target/aarch64/sve/gather_load_4.c: Likewise.
* gcc.target/aarch64/sve/gather_load_5.c: Likewise.
* gcc.target/aarch64/sve/gather_load_6.c: Likewise.
* gcc.target/aarch64/sve/gather_load_7.c: Likewise.
* gcc.target/aarch64/sve/mask_gather_load_1.c: Likewise.
* gcc.target/aarch64/sve/mask_gather_load_2.c: Likewise.
* gcc.target/aarch64/sve/mask_gather_load_3.c: Likewise.
* gcc.target/aarch64/sve/mask_gather_load_4.c: Likewise.
* gcc.target/aarch64/sve/mask_gather_load_5.c: Likewise.
* gcc.target/aarch64/sve/mask_gather_load_6.c: Likewise.
* gcc.target/aarch64/sve/mask_gather_load_7.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256640
|
|
This patch changes GET_MODE_SIZE from unsigned short to poly_uint16.
The non-mechanical parts were handled by previous patches.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* machmode.h (mode_size): Change from unsigned short to
poly_uint16_pod.
(mode_to_bytes): Return a poly_uint16 rather than an unsigned short.
(GET_MODE_SIZE): Return a constant if ONLY_FIXED_SIZE_MODES,
or if measurement_type is not polynomial.
(fixed_size_mode::includes_p): Check for constant-sized modes.
* genmodes.c (emit_mode_size_inline): Make mode_size_inline
return a poly_uint16 rather than an unsigned short.
(emit_mode_size): Change the type of mode_size from unsigned short
to poly_uint16_pod. Use ZERO_COEFFS for the initializer.
(emit_mode_adjustments): Cope with polynomial vector sizes.
* lto-streamer-in.c (lto_input_mode_table): Use bp_unpack_poly_value
for GET_MODE_SIZE.
* lto-streamer-out.c (lto_write_mode_table): Use bp_pack_poly_value
for GET_MODE_SIZE.
* auto-inc-dec.c (try_merge): Treat GET_MODE_SIZE as polynomial.
* builtins.c (expand_ifn_atomic_compare_exchange_into_call): Likewise.
* caller-save.c (setup_save_areas): Likewise.
(replace_reg_with_saved_mem): Likewise.
* calls.c (emit_library_call_value_1): Likewise.
* combine-stack-adj.c (combine_stack_adjustments_for_block): Likewise.
* combine.c (simplify_set, make_extraction, simplify_shift_const_1)
(gen_lowpart_for_combine): Likewise.
* convert.c (convert_to_integer_1): Likewise.
* cse.c (equiv_constant, cse_insn): Likewise.
* cselib.c (autoinc_split, cselib_hash_rtx): Likewise.
(cselib_subst_to_values): Likewise.
* dce.c (word_dce_process_block): Likewise.
* df-problems.c (df_word_lr_mark_ref): Likewise.
* dwarf2cfi.c (init_one_dwarf_reg_size): Likewise.
* dwarf2out.c (multiple_reg_loc_descriptor, mem_loc_descriptor)
(concat_loc_descriptor, concatn_loc_descriptor, loc_descriptor)
(rtl_for_decl_location): Likewise.
* emit-rtl.c (gen_highpart, widen_memory_access): Likewise.
* expmed.c (extract_bit_field_1, extract_integral_bit_field): Likewise.
* expr.c (emit_group_load_1, clear_storage_hints): Likewise.
(emit_move_complex, emit_move_multi_word, emit_push_insn): Likewise.
(expand_expr_real_1): Likewise.
* function.c (assign_parm_setup_block_p, assign_parm_setup_block)
(pad_below): Likewise.
* gimple-fold.c (optimize_atomic_compare_exchange_p): Likewise.
* gimple-ssa-store-merging.c (rhs_valid_for_store_merging_p): Likewise.
* ira.c (get_subreg_tracking_sizes): Likewise.
* ira-build.c (ira_create_allocno_objects): Likewise.
* ira-color.c (coalesced_pseudo_reg_slot_compare): Likewise.
(ira_sort_regnos_for_alter_reg): Likewise.
* ira-costs.c (record_operand_costs): Likewise.
* lower-subreg.c (interesting_mode_p, simplify_gen_subreg_concatn)
(resolve_simple_move): Likewise.
* lra-constraints.c (get_reload_reg, operands_match_p): Likewise.
(process_addr_reg, simplify_operand_subreg, curr_insn_transform)
(lra_constraints): Likewise.
(CONST_POOL_OK_P): Reject variable-sized modes.
* lra-spills.c (slot, assign_mem_slot, pseudo_reg_slot_compare)
(add_pseudo_to_slot, lra_spill): Likewise.
* omp-low.c (omp_clause_aligned_alignment): Likewise.
* optabs-query.c (get_best_extraction_insn): Likewise.
* optabs-tree.c (expand_vec_cond_expr_p): Likewise.
* optabs.c (expand_vec_perm_var, expand_vec_cond_expr): Likewise.
(expand_mult_highpart, valid_multiword_target_p): Likewise.
* recog.c (offsettable_address_addr_space_p): Likewise.
* regcprop.c (maybe_mode_change): Likewise.
* reginfo.c (choose_hard_reg_mode, record_subregs_of_mode): Likewise.
* regrename.c (build_def_use): Likewise.
* regstat.c (dump_reg_info): Likewise.
* reload.c (complex_word_subreg_p, push_reload, find_dummy_reload)
(find_reloads, find_reloads_subreg_address): Likewise.
* reload1.c (eliminate_regs_1): Likewise.
* rtlanal.c (for_each_inc_dec_find_inc_dec, rtx_cost): Likewise.
* simplify-rtx.c (avoid_constant_pool_reference): Likewise.
(simplify_binary_operation_1, simplify_subreg): Likewise.
* targhooks.c (default_function_arg_padding): Likewise.
(default_hard_regno_nregs, default_class_max_nregs): Likewise.
* tree-cfg.c (verify_gimple_assign_binary): Likewise.
(verify_gimple_assign_ternary): Likewise.
* tree-inline.c (estimate_move_cost): Likewise.
* tree-ssa-forwprop.c (simplify_vector_constructor): Likewise.
* tree-ssa-loop-ivopts.c (add_autoinc_candidates): Likewise.
(get_address_cost_ainc): Likewise.
* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Likewise.
(vect_supportable_dr_alignment): Likewise.
* tree-vect-loop.c (vect_determine_vectorization_factor): Likewise.
(vectorizable_reduction): Likewise.
* tree-vect-stmts.c (vectorizable_assignment, vectorizable_shift)
(vectorizable_operation, vectorizable_load): Likewise.
* tree.c (build_same_sized_truth_vector_type): Likewise.
* valtrack.c (cleanup_auto_inc_dec): Likewise.
* var-tracking.c (emit_note_insn_var_location): Likewise.
* config/arc/arc.h (ASM_OUTPUT_CASE_END): Use as_a <scalar_int_mode>.
(ADDR_VEC_ALIGN): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256201
|
|
This patch changes GET_MODE_PRECISION from an unsigned short
to a poly_uint16.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* machmode.h (mode_precision): Change from unsigned short to
poly_uint16_pod.
(mode_to_precision): Return a poly_uint16 rather than an unsigned
short.
(GET_MODE_PRECISION): Return a constant if ONLY_FIXED_SIZE_MODES,
or if measurement_type is not polynomial.
(HWI_COMPUTABLE_MODE_P): Turn into a function. Optimize the case
in which the mode is already known to be a scalar_int_mode.
* genmodes.c (emit_mode_precision): Change the type of mode_precision
from unsigned short to poly_uint16_pod. Use ZERO_COEFFS for the
initializer.
* lto-streamer-in.c (lto_input_mode_table): Use bp_unpack_poly_value
for GET_MODE_PRECISION.
* lto-streamer-out.c (lto_write_mode_table): Use bp_pack_poly_value
for GET_MODE_PRECISION.
* combine.c (update_rsp_from_reg_equal): Treat GET_MODE_PRECISION
as polynomial.
(try_combine, find_split_point, combine_simplify_rtx): Likewise.
(expand_field_assignment, make_extraction): Likewise.
(make_compound_operation_int, record_dead_and_set_regs_1): Likewise.
(get_last_value): Likewise.
* convert.c (convert_to_integer_1): Likewise.
* cse.c (cse_insn): Likewise.
* expr.c (expand_expr_real_1): Likewise.
* lra-constraints.c (simplify_operand_subreg): Likewise.
* optabs-query.c (can_atomic_load_p): Likewise.
* optabs.c (expand_atomic_load): Likewise.
(expand_atomic_store): Likewise.
* ree.c (combine_reaching_defs): Likewise.
* rtl.h (partial_subreg_p, paradoxical_subreg_p): Likewise.
* rtlanal.c (nonzero_bits1, lsb_bitfield_op_p): Likewise.
* tree.h (type_has_mode_precision_p): Likewise.
* ubsan.c (instrument_si_overflow): Likewise.
gcc/ada/
* gcc-interface/misc.c (enumerate_modes): Treat GET_MODE_PRECISION
as polynomial.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256198
|
|
This patch changes GET_MODE_NUNITS from unsigned char
to poly_uint16, although it remains a macro when compiling
target code with NUM_POLY_INT_COEFFS == 1.
We can handle permuted loads and stores for variable nunits if
the number of statements is a power of 2, but not otherwise.
The to_constant call in make_vector_type goes away in a later patch.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* machmode.h (mode_nunits): Change from unsigned char to
poly_uint16_pod.
(ONLY_FIXED_SIZE_MODES): New macro.
(pod_mode::measurement_type, scalar_int_mode::measurement_type)
(scalar_float_mode::measurement_type, scalar_mode::measurement_type)
(complex_mode::measurement_type, fixed_size_mode::measurement_type):
New typedefs.
(mode_to_nunits): Return a poly_uint16 rather than an unsigned short.
(GET_MODE_NUNITS): Return a constant if ONLY_FIXED_SIZE_MODES,
or if measurement_type is not polynomial.
* genmodes.c (ZERO_COEFFS): New macro.
(emit_mode_nunits_inline): Make mode_nunits_inline return a
poly_uint16.
(emit_mode_nunits): Change the type of mode_nunits to poly_uint16_pod.
Use ZERO_COEFFS when emitting initializers.
* data-streamer.h (bp_pack_poly_value): New function.
(bp_unpack_poly_value): Likewise.
* lto-streamer-in.c (lto_input_mode_table): Use bp_unpack_poly_value
for GET_MODE_NUNITS.
* lto-streamer-out.c (lto_write_mode_table): Use bp_pack_poly_value
for GET_MODE_NUNITS.
* tree.c (make_vector_type): Remove temporary shim and make
the real function take the number of units as a poly_uint64
rather than an int.
(build_vector_type_for_mode): Handle polynomial nunits.
* dwarf2out.c (loc_descriptor, add_const_value_attribute): Likewise.
* emit-rtl.c (const_vec_series_p_1): Likewise.
(gen_rtx_CONST_VECTOR): Likewise.
* fold-const.c (test_vec_duplicate_folding): Likewise.
* genrecog.c (validate_pattern): Likewise.
* optabs-query.c (can_vec_perm_var_p, can_mult_highpart_p): Likewise.
* optabs-tree.c (expand_vec_cond_expr_p): Likewise.
* optabs.c (expand_vector_broadcast, expand_binop_directly): Likewise.
(shift_amt_for_vec_perm_mask, expand_vec_perm_var): Likewise.
(expand_vec_cond_expr, expand_mult_highpart): Likewise.
* rtlanal.c (subreg_get_info): Likewise.
* tree-vect-data-refs.c (vect_grouped_store_supported): Likewise.
(vect_grouped_load_supported): Likewise.
* tree-vect-generic.c (type_for_widest_vector_mode): Likewise.
* tree-vect-loop.c (have_whole_vector_shift): Likewise.
* simplify-rtx.c (simplify_unary_operation_1): Likewise.
(simplify_const_unary_operation, simplify_binary_operation_1)
(simplify_const_binary_operation, simplify_ternary_operation)
(test_vector_ops_duplicate, test_vector_ops): Likewise.
(simplify_immed_subreg): Use GET_MODE_NUNITS on a fixed_size_mode
instead of CONST_VECTOR_NUNITS.
* varasm.c (output_constant_pool_2): Likewise.
* rtx-vector-builder.c (rtx_vector_builder::build): Only include the
explicit-encoded elements in the XVEC for variable-length vectors.
gcc/ada/
* gcc-interface/misc.c (enumerate_modes): Handle polynomial
GET_MODE_NUNITS.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256195
|
|
From-SVN: r256169
|
|
This patch changes the type of current_vector_size to poly_uint64.
It also changes TARGET_AUTOVECTORIZE_VECTOR_SIZES so that it fills
in a vector of possible sizes (as poly_uint64s) instead of returning
a bitmask. The documentation claimed that the hook didn't need to
include the default vector size (returned by preferred_simd_mode),
but that wasn't consistent with the omp-low.c usage.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* target.h (vector_sizes, auto_vector_sizes): New typedefs.
* target.def (autovectorize_vector_sizes): Return the vector sizes
by pointer, using vector_sizes rather than a bitmask.
* targhooks.h (default_autovectorize_vector_sizes): Update accordingly.
* targhooks.c (default_autovectorize_vector_sizes): Likewise.
* config/aarch64/aarch64.c (aarch64_autovectorize_vector_sizes):
Likewise.
* config/arc/arc.c (arc_autovectorize_vector_sizes): Likewise.
* config/arm/arm.c (arm_autovectorize_vector_sizes): Likewise.
* config/i386/i386.c (ix86_autovectorize_vector_sizes): Likewise.
* config/mips/mips.c (mips_autovectorize_vector_sizes): Likewise.
* omp-general.c (omp_max_vf): Likewise.
* omp-low.c (omp_clause_aligned_alignment): Likewise.
* optabs-query.c (can_vec_mask_load_store_p): Likewise.
* tree-vect-loop.c (vect_analyze_loop): Likewise.
* tree-vect-slp.c (vect_slp_bb): Likewise.
* doc/tm.texi: Regenerate.
* tree-vectorizer.h (current_vector_size): Change from an unsigned int
to a poly_uint64.
* tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Take
the vector size as a poly_uint64 rather than an unsigned int.
(current_vector_size): Change from an unsigned int to a poly_uint64.
(get_vectype_for_scalar_type): Update accordingly.
* tree.h (build_truth_vector_type): Take the size and number of
units as a poly_uint64 rather than an unsigned int.
(build_vector_type): Add a temporary overload that takes
the number of units as a poly_uint64 rather than an unsigned int.
* tree.c (make_vector_type): Likewise.
(build_truth_vector_type): Take the number of units as a poly_uint64
rather than an unsigned int.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256131
|
|
This patch makes users of vec_perm_builders use the compressed encoding
where possible. This means that they work with variable-length vectors.
2018-01-02 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* optabs.c (expand_vec_perm_var): Use an explicit encoding for
the broadcast of the low byte.
(expand_mult_highpart): Use an explicit encoding for the permutes.
* optabs-query.c (can_mult_highpart_p): Likewise.
* tree-vect-loop.c (calc_vec_perm_mask_for_shift): Likewise.
* tree-vect-stmts.c (perm_mask_for_reverse): Likewise.
(vectorizable_bswap): Likewise.
* tree-vect-data-refs.c (vect_grouped_store_supported): Use an
explicit encoding for the power-of-2 permutes.
(vect_permute_store_chain): Likewise.
(vect_grouped_load_supported): Likewise.
(vect_permute_load_chain): Likewise.
From-SVN: r256097
|
|
This patch changes vec_perm_indices from a plain vec<> to a class
that stores a canonicalized permutation, using the same encoding
as for VECTOR_CSTs. This means that vec_perm_indices now carries
information about the number of vectors being permuted (currently
always 1 or 2) and the number of elements in each input vector.
A new vec_perm_builder class is used to actually build up the vector,
like tree_vector_builder does for trees. vec_perm_indices is the
completed representation, a bit like VECTOR_CST is for trees.
The patch just does a mechanical conversion of the code to
vec_perm_builder: a later patch uses explicit encodings where possible.
The point of all this is that it makes the representation suitable
for variable-length vectors. It's no longer necessary for the
underlying vec<>s to store every element explicitly.
In int-vector-builder.h, "using the same encoding as tree and rtx constants"
describes the endpoint -- adding the rtx encoding comes later.
2018-01-02 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* int-vector-builder.h: New file.
* vec-perm-indices.h: Include int-vector-builder.h.
(vec_perm_indices): Redefine as an int_vector_builder.
(auto_vec_perm_indices): Delete.
(vec_perm_builder): Redefine as a stand-alone class.
(vec_perm_indices::vec_perm_indices): New function.
(vec_perm_indices::clamp): Likewise.
* vec-perm-indices.c: Include fold-const.h and tree-vector-builder.h.
(vec_perm_indices::new_vector): New function.
(vec_perm_indices::new_expanded_vector): Update for new
vec_perm_indices class.
(vec_perm_indices::rotate_inputs): New function.
(vec_perm_indices::all_in_range_p): Operate directly on the
encoded form, without computing elided elements.
(tree_to_vec_perm_builder): Operate directly on the VECTOR_CST
encoding. Update for new vec_perm_indices class.
* optabs.c (expand_vec_perm_const): Create a vec_perm_indices for
the given vec_perm_builder.
(expand_vec_perm_var): Update vec_perm_builder constructor.
(expand_mult_highpart): Use vec_perm_builder instead of
auto_vec_perm_indices.
* optabs-query.c (can_mult_highpart_p): Use vec_perm_builder and
vec_perm_indices instead of auto_vec_perm_indices. Use a single
or double series encoding as appropriate.
* fold-const.c (fold_ternary_loc): Use vec_perm_builder and
vec_perm_indices instead of auto_vec_perm_indices.
* tree-ssa-forwprop.c (simplify_vector_constructor): Likewise.
* tree-vect-data-refs.c (vect_grouped_store_supported): Likewise.
(vect_permute_store_chain): Likewise.
(vect_grouped_load_supported): Likewise.
(vect_permute_load_chain): Likewise.
(vect_shift_permute_load_chain): Likewise.
* tree-vect-slp.c (vect_build_slp_tree_1): Likewise.
(vect_transform_slp_perm_load): Likewise.
(vect_schedule_slp_instance): Likewise.
* tree-vect-stmts.c (perm_mask_for_reverse): Likewise.
(vectorizable_mask_load_store): Likewise.
(vectorizable_bswap): Likewise.
(vectorizable_store): Likewise.
(vectorizable_load): Likewise.
* tree-vect-generic.c (lower_vec_perm): Use vec_perm_builder and
vec_perm_indices instead of auto_vec_perm_indices. Use
tree_to_vec_perm_builder to read the vector from a tree.
* tree-vect-loop.c (calc_vec_perm_mask_for_shift): Take a
vec_perm_builder instead of a vec_perm_indices.
(have_whole_vector_shift): Use vec_perm_builder and
vec_perm_indices instead of auto_vec_perm_indices. Leave the
truncation to calc_vec_perm_mask_for_shift.
(vect_create_epilog_for_reduction): Likewise.
* config/aarch64/aarch64.c (expand_vec_perm_d::perm): Change
from auto_vec_perm_indices to vec_perm_indices.
(aarch64_expand_vec_perm_const_1): Use rotate_inputs on d.perm
instead of changing individual elements.
(aarch64_vectorize_vec_perm_const): Use new_vector to install
the vector in d.perm.
* config/arm/arm.c (expand_vec_perm_d::perm): Change
from auto_vec_perm_indices to vec_perm_indices.
(arm_expand_vec_perm_const_1): Use rotate_inputs on d.perm
instead of changing individual elements.
(arm_vectorize_vec_perm_const): Use new_vector to install
the vector in d.perm.
* config/powerpcspe/powerpcspe.c (rs6000_expand_extract_even):
Update vec_perm_builder constructor.
(rs6000_expand_interleave): Likewise.
* config/rs6000/rs6000.c (rs6000_expand_extract_even): Likewise.
(rs6000_expand_interleave): Likewise.
From-SVN: r256095
|
|
The patch to remove the vec_perm_const optab checked whether replacing
a constant permute with a variable permute is safe, or whether it might
truncate the indices. This patch adds a corresponding check for whether
variable permutes can be lowered to QImode-based permutes.
2018-01-02 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* optabs-query.c (can_vec_perm_var_p): Check whether lowering
to qimode could truncate the indices.
* optabs.c (expand_vec_perm_var): Likewise.
From-SVN: r256094
|
|
One of the changes needed for variable-length VEC_PERM_EXPRs -- and for
long fixed-length VEC_PERM_EXPRs -- is the ability to use constant
selectors that wouldn't fit in the vectors being permuted. E.g. a
permute on two V256QIs can't be done using a V256QI selector.
At the moment constant permutes use two interfaces:
targetm.vectorizer.vec_perm_const_ok for testing whether a permute is
valid and the vec_perm_const optab for actually emitting the permute.
The former gets passed a vec<> selector and the latter an rtx selector.
Most ports share a lot of code between the hook and the optab, with a
wrapper function for each interface.
We could try to keep that interface and require ports to define wider
vector modes that could be attached to the CONST_VECTOR (e.g. V256HI or
V256SI in the example above). But building a CONST_VECTOR rtx seems a bit
pointless here, since the expand code only creates the CONST_VECTOR in
order to call the optab, and the first thing the target does is take
the CONST_VECTOR apart again.
The easiest approach therefore seemed to be to remove the optab and
reuse the target hook to emit the code. One potential drawback is that
it's no longer possible to use match_operand predicates to force
operands into the required form, but in practice all targets want
register operands anyway.
The patch also changes vec_perm_indices into a class that provides
some simple routines for handling permutations. A later patch will
flesh this out and get rid of auto_vec_perm_indices, but I didn't
want to do all that in this patch and make it more complicated than
it already is.
2018-01-02 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* Makefile.in (OBJS): Add vec-perm-indices.o.
* vec-perm-indices.h: New file.
* vec-perm-indices.c: Likewise.
* target.h (vec_perm_indices): Replace with a forward class
declaration.
(auto_vec_perm_indices): Move to vec-perm-indices.h.
* optabs.h: Include vec-perm-indices.h.
(expand_vec_perm): Delete.
(selector_fits_mode_p, expand_vec_perm_var): Declare.
(expand_vec_perm_const): Declare.
* target.def (vec_perm_const_ok): Replace with...
(vec_perm_const): ...this new hook.
* doc/tm.texi.in (TARGET_VECTORIZE_VEC_PERM_CONST_OK): Replace with...
(TARGET_VECTORIZE_VEC_PERM_CONST): ...this new hook.
* doc/tm.texi: Regenerate.
* optabs.def (vec_perm_const): Delete.
* doc/md.texi (vec_perm_const): Likewise.
(vec_perm): Refer to TARGET_VECTORIZE_VEC_PERM_CONST.
* expr.c (expand_expr_real_2): Use expand_vec_perm_const rather than
expand_vec_perm for constant permutation vectors. Assert that
the mode of variable permutation vectors is the integer equivalent
of the mode that is being permuted.
* optabs-query.h (selector_fits_mode_p): Declare.
* optabs-query.c: Include vec-perm-indices.h.
(selector_fits_mode_p): New function.
(can_vec_perm_const_p): Check whether targetm.vectorize.vec_perm_const
is defined, instead of checking whether the vec_perm_const_optab
exists. Use targetm.vectorize.vec_perm_const instead of
targetm.vectorize.vec_perm_const_ok. Check whether the indices
fit in the vector mode before using a variable permute.
* optabs.c (shift_amt_for_vec_perm_mask): Take a mode and a
vec_perm_indices instead of an rtx.
(expand_vec_perm): Replace with...
(expand_vec_perm_const): ...this new function. Take the selector
as a vec_perm_indices rather than an rtx. Also take the mode of
the selector. Update call to shift_amt_for_vec_perm_mask.
Use targetm.vectorize.vec_perm_const instead of vec_perm_const_optab.
Use vec_perm_indices::new_expanded_vector to expand the original
selector into bytes. Check whether the indices fit in the vector
mode before using a variable permute.
(expand_vec_perm_var): Make global.
(expand_mult_highpart): Use expand_vec_perm_const.
* fold-const.c: Includes vec-perm-indices.h.
* tree-ssa-forwprop.c: Likewise.
* tree-vect-data-refs.c: Likewise.
* tree-vect-generic.c: Likewise.
* tree-vect-loop.c: Likewise.
* tree-vect-slp.c: Likewise.
* tree-vect-stmts.c: Likewise.
* config/aarch64/aarch64-protos.h (aarch64_expand_vec_perm_const):
Delete.
* config/aarch64/aarch64-simd.md (vec_perm_const<mode>): Delete.
* config/aarch64/aarch64.c (aarch64_expand_vec_perm_const)
(aarch64_vectorize_vec_perm_const_ok): Fuse into...
(aarch64_vectorize_vec_perm_const): ...this new function.
(TARGET_VECTORIZE_VEC_PERM_CONST_OK): Delete.
(TARGET_VECTORIZE_VEC_PERM_CONST): Redefine.
* config/arm/arm-protos.h (arm_expand_vec_perm_const): Delete.
* config/arm/vec-common.md (vec_perm_const<mode>): Delete.
* config/arm/arm.c (TARGET_VECTORIZE_VEC_PERM_CONST_OK): Delete.
(TARGET_VECTORIZE_VEC_PERM_CONST): Redefine.
(arm_expand_vec_perm_const, arm_vectorize_vec_perm_const_ok): Merge
into...
(arm_vectorize_vec_perm_const): ...this new function. Explicitly
check for NEON modes.
* config/i386/i386-protos.h (ix86_expand_vec_perm_const): Delete.
* config/i386/sse.md (VEC_PERM_CONST, vec_perm_const<mode>): Delete.
* config/i386/i386.c (ix86_expand_vec_perm_const_1): Update comment.
(ix86_expand_vec_perm_const, ix86_vectorize_vec_perm_const_ok): Merge
into...
(ix86_vectorize_vec_perm_const): ...this new function. Incorporate
the old VEC_PERM_CONST conditions.
* config/ia64/ia64-protos.h (ia64_expand_vec_perm_const): Delete.
* config/ia64/vect.md (vec_perm_const<mode>): Delete.
* config/ia64/ia64.c (ia64_expand_vec_perm_const)
(ia64_vectorize_vec_perm_const_ok): Merge into...
(ia64_vectorize_vec_perm_const): ...this new function.
* config/mips/loongson.md (vec_perm_const<mode>): Delete.
* config/mips/mips-msa.md (vec_perm_const<mode>): Delete.
* config/mips/mips-ps-3d.md (vec_perm_constv2sf): Delete.
* config/mips/mips-protos.h (mips_expand_vec_perm_const): Delete.
* config/mips/mips.c (mips_expand_vec_perm_const)
(mips_vectorize_vec_perm_const_ok): Merge into...
(mips_vectorize_vec_perm_const): ...this new function.
* config/powerpcspe/altivec.md (vec_perm_constv16qi): Delete.
* config/powerpcspe/paired.md (vec_perm_constv2sf): Delete.
* config/powerpcspe/spe.md (vec_perm_constv2si): Delete.
* config/powerpcspe/vsx.md (vec_perm_const<mode>): Delete.
* config/powerpcspe/powerpcspe-protos.h (altivec_expand_vec_perm_const)
(rs6000_expand_vec_perm_const): Delete.
* config/powerpcspe/powerpcspe.c (TARGET_VECTORIZE_VEC_PERM_CONST_OK):
Delete.
(TARGET_VECTORIZE_VEC_PERM_CONST): Redefine.
(altivec_expand_vec_perm_const_le): Take each operand individually.
Operate on constant selectors rather than rtxes.
(altivec_expand_vec_perm_const): Likewise. Update call to
altivec_expand_vec_perm_const_le.
(rs6000_expand_vec_perm_const): Delete.
(rs6000_vectorize_vec_perm_const_ok): Delete.
(rs6000_vectorize_vec_perm_const): New function.
(rs6000_do_expand_vec_perm): Take a vec_perm_builder instead of
an element count and rtx array.
(rs6000_expand_extract_even): Update call accordingly.
(rs6000_expand_interleave): Likewise.
* config/rs6000/altivec.md (vec_perm_constv16qi): Delete.
* config/rs6000/paired.md (vec_perm_constv2sf): Delete.
* config/rs6000/vsx.md (vec_perm_const<mode>): Delete.
* config/rs6000/rs6000-protos.h (altivec_expand_vec_perm_const)
(rs6000_expand_vec_perm_const): Delete.
* config/rs6000/rs6000.c (TARGET_VECTORIZE_VEC_PERM_CONST_OK): Delete.
(TARGET_VECTORIZE_VEC_PERM_CONST): Redefine.
(altivec_expand_vec_perm_const_le): Take each operand individually.
Operate on constant selectors rather than rtxes.
(altivec_expand_vec_perm_const): Likewise. Update call to
altivec_expand_vec_perm_const_le.
(rs6000_expand_vec_perm_const): Delete.
(rs6000_vectorize_vec_perm_const_ok): Delete.
(rs6000_vectorize_vec_perm_const): New function. Remove stray
reference to the SPE evmerge intructions.
(rs6000_do_expand_vec_perm): Take a vec_perm_builder instead of
an element count and rtx array.
(rs6000_expand_extract_even): Update call accordingly.
(rs6000_expand_interleave): Likewise.
* config/sparc/sparc.md (vec_perm_constv8qi): Delete in favor of...
* config/sparc/sparc.c (sparc_vectorize_vec_perm_const): ...this
new function.
(TARGET_VECTORIZE_VEC_PERM_CONST): Redefine.
From-SVN: r256093
|
|
This patch splits can_vec_perm_p into two functions: can_vec_perm_var_p
for testing permute operations with variable selection vectors, and
can_vec_perm_const_p for testing permute operations with specific
constant selection vectors. This means that we can pass the constant
selection vector by reference.
Constant permutes can still use a variable permute as a fallback.
A later patch adds a check to makre sure that we don't truncate the
vector indices when doing this.
However, have_whole_vector_shift checked:
if (direct_optab_handler (vec_perm_const_optab, mode) == CODE_FOR_nothing)
return false;
which had the effect of disallowing the fallback to variable permutes.
I'm not sure whether that was the intention or whether it was just
supposed to short-cut the loop on targets that don't support permutes.
(But then why bother? The first check in the loop would fail and
we'd bail out straightaway.)
The patch adds a parameter for disallowing the fallback. I think it
makes sense to do this for the following code in the VEC_PERM_EXPR
folder:
/* Some targets are deficient and fail to expand a single
argument permutation while still allowing an equivalent
2-argument version. */
if (need_mask_canon && arg2 == op2
&& !can_vec_perm_p (TYPE_MODE (type), false, &sel)
&& can_vec_perm_p (TYPE_MODE (type), false, &sel2))
since it's really testing whether the expand_vec_perm_const code expects
a particular form.
2018-01-02 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* optabs-query.h (can_vec_perm_p): Delete.
(can_vec_perm_var_p, can_vec_perm_const_p): Declare.
* optabs-query.c (can_vec_perm_p): Split into...
(can_vec_perm_var_p, can_vec_perm_const_p): ...these two functions.
(can_mult_highpart_p): Use can_vec_perm_const_p to test whether a
particular selector is valid.
* tree-ssa-forwprop.c (simplify_vector_constructor): Likewise.
* tree-vect-data-refs.c (vect_grouped_store_supported): Likewise.
(vect_grouped_load_supported): Likewise.
(vect_shift_permute_load_chain): Likewise.
* tree-vect-slp.c (vect_build_slp_tree_1): Likewise.
(vect_transform_slp_perm_load): Likewise.
* tree-vect-stmts.c (perm_mask_for_reverse): Likewise.
(vectorizable_bswap): Likewise.
(vect_gen_perm_mask_checked): Likewise.
* fold-const.c (fold_ternary_loc): Likewise. Don't take
implementations of variable permutation vectors into account
when deciding which selector to use.
* tree-vect-loop.c (have_whole_vector_shift): Don't check whether
vec_perm_const_optab is supported; instead use can_vec_perm_const_p
with a false third argument.
* tree-vect-generic.c (lower_vec_perm): Use can_vec_perm_const_p
to test whether the constant selector is valid and can_vec_perm_var_p
to test whether a variable selector is valid.
From-SVN: r256091
|
|
This patch makes functions take vec_perm_indices by reference rather
than value, since a later patch will turn vec_perm_indices into a class
that would be more expensive to copy.
2018-01-02 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* optabs-query.h (can_vec_perm_p): Take a const vec_perm_indices *.
* optabs-query.c (can_vec_perm_p): Likewise.
* fold-const.c (fold_vec_perm): Take a const vec_perm_indices &
instead of vec_perm_indices.
* tree-vectorizer.h (vect_gen_perm_mask_any): Likewise,
(vect_gen_perm_mask_checked): Likewise,
* tree-vect-stmts.c (vect_gen_perm_mask_any): Likewise,
(vect_gen_perm_mask_checked): Likewise,
From-SVN: r256090
|
|
permutes aren't supported.
qimode_for_vec_perm
The vec_perm code falls back to doing byte-level permutes if
element-level permutes aren't supported. There were two copies
of the code to calculate the mode, and later patches add another,
so this patch splits it out into a helper function.
2018-01-02 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* optabs-query.h (qimode_for_vec_perm): Declare.
* optabs-query.c (can_vec_perm_p): Split out qimode search to...
(qimode_for_vec_perm): ...this new function.
* optabs.c (expand_vec_perm): Use qimode_for_vec_perm.
From-SVN: r256089
|
|
widening_optab_handler had the comment:
/* ??? Why does find_widening_optab_handler_and_mode attempt to
widen things that can't be widened? E.g. add_optab... */
if (op > LAST_CONV_OPTAB)
return CODE_FOR_nothing;
I think it comes from expand_binop using
find_widening_optab_handler_and_mode for two things: to test whether
a "normal" optab like add_optab is supported for a standard binary
operation and to test whether a "convert" optab is supported for a
widening operation like umul_widen_optab. In the former case from_mode
and to_mode must be the same, in the latter from_mode must be narrower
than to_mode.
For the former case, find_widening_optab_handler_and_mode is only really
testing the modes that are passed in. permit_non_widening must be true
here.
For the latter case, find_widening_optab_handler_and_mode should only
really consider new from_modes that are wider than the original
from_mode and narrower than the original to_mode. Logically
permit_non_widening should be false, since widening optabs aren't
supposed to take operands that are the same width as the destination.
We get away with permit_non_widening being true because no target
would/should define a widening .md pattern with matching modes.
But really, it seems better for expand_binop to handle these two
cases itself rather than pushing them down. With that change,
find_widening_optab_handler_and_mode is only ever called with
permit_non_widening set to false and is only ever called with
a "proper" convert optab. We then no longer need widening_optab_handler,
we can just use convert_optab_handler directly.
The patch also passes the instruction code down to expand_binop_directly.
This should be more efficient and removes an extra call to
find_widening_optab_handler_and_mode.
2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* optabs-query.h (convert_optab_p): New function, split out from...
(convert_optab_handler): ...here.
(widening_optab_handler): Delete.
(find_widening_optab_handler): Remove permit_non_widening parameter.
(find_widening_optab_handler_and_mode): Likewise. Provide an
override that operates on mode class wrappers.
* optabs-query.c (widening_optab_handler): Delete.
(find_widening_optab_handler_and_mode): Remove permit_non_widening
parameter. Assert that the two modes are the same class and that
the "from" mode is narrower than the "to" mode. Use
convert_optab_handler instead of widening_optab_handler.
* expmed.c (expmed_mult_highpart_optab): Use convert_optab_handler
instead of widening_optab_handler.
* expr.c (expand_expr_real_2): Update calls to
find_widening_optab_handler.
* optabs.c (expand_widen_pattern_expr): Likewise.
(expand_binop_directly): Take the insn_code as a parameter.
(expand_binop): Only call find_widening_optab_handler for
conversion optabs; use optab_handler otherwise. Update calls
to find_widening_optab_handler and expand_binop_directly.
Use convert_optab_handler instead of widening_optab_handler.
* tree-ssa-math-opts.c (convert_mult_to_widen): Update calls to
find_widening_optab_handler and use scalar_mode rather than
machine_mode.
(convert_plusminus_to_widen): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254302
|
|
This patch makes TARGET_VECTORIZE_VEC_PERM_CONST_OK take the permute
vector in the form of a vec_perm_indices instead of an unsigned char *.
It follows on from the recent patch that did the same in target-independent
code.
It was easy to make ARM and AArch64 use vec_perm_indices internally
as well, and converting AArch64 helps with SVE. I did try doing the same
for the other ports, but the surgery needed was much more invasive and
much less obviously correct.
2017-09-22 Richard Sandiford <richard.sandifird@linaro.org>
gcc/
* target.def (vec_perm_const_ok): Change sel parameter to
vec_perm_indices.
* optabs-query.c (can_vec_perm_p): Update accordingly.
* doc/tm.texi: Regenerate.
* config/aarch64/aarch64.c (expand_vec_perm_d): Change perm to
auto_vec_perm_indices and remove separate nelt field.
(aarch64_evpc_trn, aarch64_evpc_uzp, aarch64_evpc_zip)
(aarch64_evpc_ext, aarch64_evpc_rev, aarch64_evpc_dup)
(aarch64_evpc_tbl, aarch64_expand_vec_perm_const_1)
(aarch64_expand_vec_perm_const): Update accordingly.
(aarch64_vectorize_vec_perm_const_ok): Likewise. Change sel
to vec_perm_indices.
* config/arm/arm.c (expand_vec_perm_d): Change perm to
auto_vec_perm_indices and remove separate nelt field.
(arm_evpc_neon_vuzp, arm_evpc_neon_vzip, arm_evpc_neon_vrev)
(arm_evpc_neon_vtrn, arm_evpc_neon_vext, arm_evpc_neon_vtbl)
(arm_expand_vec_perm_const_1, arm_expand_vec_perm_const): Update
accordingly.
(arm_vectorize_vec_perm_const_ok): Likewise. Change sel
to vec_perm_indices.
* config/i386/i386.c (ix86_vectorize_vec_perm_const_ok): Change
sel to vec_perm_indices.
* config/ia64/ia64.c (ia64_vectorize_vec_perm_const_ok): Likewise.
* config/mips/mips.c (mips_vectorize_vec_perm_const_ok): Likewise.
* config/powerpcspe/powerpcspe.c (rs6000_vectorize_vec_perm_const_ok):
Likewise.
* config/rs6000/rs6000.c (rs6000_vectorize_vec_perm_const_ok):
Likewise.
From-SVN: r253148
|
|
This patch makes can_vec_perm_p & co. take a vec<>, wrapped in new
typedefs vec_perm_indices and auto_vec_perm_indices. There are two
reasons for doing this for SVE:
(1) it means that the number of elements is bundled with the elements
themselves, and is obviously constant.
(2) it makes it easier to change the "unsigned char" element type to
something wider.
Changing the target hook is left as follow-on work.
2017-09-14 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* target.h (vec_perm_indices): New typedef.
(auto_vec_perm_indices): Likewise.
* optabs-query.h: Include target.h
(can_vec_perm_p): Take a vec_perm_indices *.
* optabs-query.c (can_vec_perm_p): Likewise.
(can_mult_highpart_p): Update accordingly. Use auto_vec_perm_indices.
* tree-ssa-forwprop.c (simplify_vector_constructor): Likewise.
* tree-vect-generic.c (lower_vec_perm): Likewise.
* tree-vect-data-refs.c (vect_grouped_store_supported): Likewise.
(vect_grouped_load_supported): Likewise.
(vect_shift_permute_load_chain): Likewise.
(vect_permute_store_chain): Use auto_vec_perm_indices.
(vect_permute_load_chain): Likewise.
* fold-const.c (fold_vec_perm): Take vec_perm_indices.
(fold_ternary_loc): Update accordingly. Use auto_vec_perm_indices.
Update uses of can_vec_perm_p.
* tree-vect-loop.c (calc_vec_perm_mask_for_shift): Replace the
mode with a number of elements. Take a vec_perm_indices *.
(vect_create_epilog_for_reduction): Update accordingly.
Use auto_vec_perm_indices.
(have_whole_vector_shift): Likewise. Update call to can_vec_perm_p.
* tree-vect-slp.c (vect_build_slp_tree_1): Likewise.
(vect_transform_slp_perm_load): Likewise.
(vect_schedule_slp_instance): Use auto_vec_perm_indices.
* tree-vectorizer.h (vect_gen_perm_mask_any): Take a vec_perm_indices.
(vect_gen_perm_mask_checked): Likewise.
* tree-vect-stmts.c (vect_gen_perm_mask_any): Take a vec_perm_indices.
(vect_gen_perm_mask_checked): Likewise.
(vectorizable_mask_load_store): Use auto_vec_perm_indices.
(vectorizable_store): Likewise.
(vectorizable_load): Likewise.
(perm_mask_for_reverse): Likewise. Update call to can_vec_perm_p.
(vectorizable_bswap): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r252761
|
|
...for consistency with mode_for_vector.
2017-09-05 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* target.def (get_mask_mode): Change return type to opt_mode.
Expand commentary.
* doc/tm.texi: Regenerate.
* targhooks.h (default_get_mask_mode): Return an opt_mode.
* targhooks.c (default_get_mask_mode): Likewise.
* config/i386/i386.c (ix86_get_mask_mode): Likewise.
* optabs-query.c (can_vec_mask_load_store_p): Update use of
targetm.get_mask_mode.
* tree.c (build_truth_vector_type): Likewise.
From-SVN: r251731
|
|
...following on from the mode_for_size change. The patch also removes
machmode.h versions of the stor-layout.c comments, since the comments
in the .c file are more complete.
2017-09-05 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* machmode.h (mode_for_vector): Return an opt_mode.
* stor-layout.c (mode_for_vector): Likewise.
(mode_for_int_vector): Update accordingly.
(layout_type): Likewise.
* config/i386/i386.c (emit_memmov): Likewise.
(ix86_expand_set_or_movmem): Likewise.
(ix86_expand_vector_init): Likewise.
(ix86_get_mask_mode): Likewise.
* config/powerpcspe/powerpcspe.c (rs6000_expand_vec_perm_const_1):
Likewise.
* config/rs6000/rs6000.c (rs6000_expand_vec_perm_const_1): Likewise.
* expmed.c (extract_bit_field_1): Likewise.
* expr.c (expand_expr_real_2): Likewise.
* optabs-query.c (can_vec_perm_p): Likewise.
(can_vec_mask_load_store_p): Likewise.
* optabs.c (expand_vec_perm): Likewise.
* targhooks.c (default_get_mask_mode): Likewise.
* tree-vect-stmts.c (vectorizable_store): Likewise.
(vectorizable_load): Likewise.
(get_vectype_for_scalar_type_and_size): Likewise.
From-SVN: r251730
|
|
This patch makes the preferred_simd_mode target hook take a scalar_mode
rather than a machine_mode.
2017-08-30 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* target.def (preferred_simd_mode): Take a scalar_mode
instead of a machine_mode.
* targhooks.h (default_preferred_simd_mode): Likewise.
* targhooks.c (default_preferred_simd_mode): Likewise.
* config/arc/arc.c (arc_preferred_simd_mode): Likewise.
* config/arm/arm.c (arm_preferred_simd_mode): Likewise.
* config/c6x/c6x.c (c6x_preferred_simd_mode): Likewise.
* config/epiphany/epiphany.c (epiphany_preferred_simd_mode): Likewise.
* config/i386/i386.c (ix86_preferred_simd_mode): Likewise.
* config/mips/mips.c (mips_preferred_simd_mode): Likewise.
* config/nvptx/nvptx.c (nvptx_preferred_simd_mode): Likewise.
* config/powerpcspe/powerpcspe.c (rs6000_preferred_simd_mode):
Likewise.
* config/rs6000/rs6000.c (rs6000_preferred_simd_mode): Likewise.
* config/s390/s390.c (s390_preferred_simd_mode): Likewise.
* config/sparc/sparc.c (sparc_preferred_simd_mode): Likewise.
* config/aarch64/aarch64.c (aarch64_preferred_simd_mode): Likewise.
(aarch64_simd_scalar_immediate_valid_for_move): Update accordingly.
* doc/tm.texi: Regenerate.
* optabs-query.c (can_vec_mask_load_store_p): Return false for
non-scalar modes.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r251524
|
|
insv, extv and eztzv modify or read a field in a register or
memory. The field always has a scalar integer mode, while the
register or memory either has a scalar integer mode or BLKmode.
The mode of the bit position is also a scalar integer.
This patch uses the type system to make that explicit.
2017-08-30 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* optabs-query.h (extraction_insn::struct_mode): Change type to
opt_scalar_int_mode and update comment.
(extraction_insn::field_mode): Change type to scalar_int_mode.
(extraction_insn::pos_mode): Likewise.
* combine.c (make_extraction): Update accordingly.
* optabs-query.c (get_traditional_extraction_insn): Likewise.
(get_optab_extraction_insn): Likewise.
* recog.c (simplify_while_replacing): Likewise.
* expmed.c (narrow_bit_field_mem): Change the type of the mode
parameter to opt_scalar_int_mode.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r251492
|
|
This patch adds a wrapper around smallest_mode_for_size
for cases in which the mode class is MODE_INT. Unlike
(int_)mode_for_size, smallest_mode_for_size always returns
a mode of the specified class, asserting if no such mode exists.
smallest_int_mode_for_size therefore returns a scalar_int_mode
rather than an opt_scalar_int_mode.
2017-08-30 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* machmode.h (smallest_mode_for_size): Fix formatting.
(smallest_int_mode_for_size): New function.
* cfgexpand.c (expand_debug_expr): Use smallest_int_mode_for_size
instead of smallest_mode_for_size.
* combine.c (make_extraction): Likewise.
* config/arc/arc.c (arc_expand_movmem): Likewise.
* config/arm/arm.c (arm_expand_divmod_libfunc): Likewise.
* config/i386/i386.c (ix86_get_mask_mode): Likewise.
* config/s390/s390.c (s390_expand_insv): Likewise.
* config/sparc/sparc.c (assign_int_registers): Likewise.
* config/spu/spu.c (spu_function_value): Likewise.
(spu_function_arg): Likewise.
* coverage.c (get_gcov_type): Likewise.
(get_gcov_unsigned_t): Likewise.
* dse.c (find_shift_sequence): Likewise.
* expmed.c (store_bit_field_1): Likewise.
* expr.c (convert_move): Likewise.
(store_field): Likewise.
* internal-fn.c (expand_arith_overflow): Likewise.
* optabs-query.c (get_best_extraction_insn): Likewise.
* optabs.c (expand_twoval_binop_libfunc): Likewise.
* stor-layout.c (layout_type): Likewise.
(initialize_sizetypes): Likewise.
* targhooks.c (default_get_mask_mode): Likewise.
* tree-ssa-loop-manip.c (canonicalize_loop_ivs): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r251471
|
|
GET_MODE_WIDER previously returned VOIDmode if no wider mode existed.
That would cause problems with stricter mode classes, since VOIDmode
isn't for example a valid scalar integer or floating-point mode.
This patch instead makes it return a new opt_mode<T> class, which
holds either a T or nothing.
2017-08-30 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* coretypes.h (opt_mode): New class.
* machmode.h (opt_mode): Likewise.
(opt_mode::else_void): New function.
(opt_mode::require): Likewise.
(opt_mode::exists): Likewise.
(GET_MODE_WIDER_MODE): Turn into a function and return an opt_mode.
(GET_MODE_2XWIDER_MODE): Likewise.
(mode_iterator::get_wider): Update accordingly.
(mode_iterator::get_2xwider): Likewise.
(mode_iterator::get_known_wider): Likewise, turning into a template.
* combine.c (make_extraction): Update use of GET_MODE_WIDER_MODE,
forcing a wider mode to exist.
* config/cr16/cr16.h (LONG_REG_P): Likewise.
* rtlanal.c (init_num_sign_bit_copies_in_rep): Likewise.
* config/c6x/c6x.c (c6x_rtx_costs): Update use of
GET_MODE_2XWIDER_MODE, forcing a wider mode to exist.
* lower-subreg.c (init_lower_subreg): Likewise.
* optabs-libfuncs.c (init_sync_libfuncs_1): Likewise, but not
on the final iteration.
* config/i386/i386.c (ix86_expand_set_or_movmem): Check whether
a wider mode exists before asking for a move pattern.
(get_mode_wider_vector): Update use of GET_MODE_WIDER_MODE,
forcing a wider mode to exist.
(expand_vselect_vconcat): Update use of GET_MODE_2XWIDER_MODE,
returning false if no such mode exists.
* config/ia64/ia64.c (expand_vselect_vconcat): Likewise.
* config/mips/mips.c (mips_expand_vselect_vconcat): Likewise.
* expmed.c (init_expmed_one_mode): Update use of GET_MODE_WIDER_MODE.
Avoid checking for a MODE_INT if we already know the mode is not a
SCALAR_INT_MODE_P.
(extract_high_half): Update use of GET_MODE_WIDER_MODE,
forcing a wider mode to exist.
(expmed_mult_highpart_optab): Likewise.
(expmed_mult_highpart): Likewise.
* expr.c (expand_expr_real_2): Update use of GET_MODE_WIDER_MODE,
using else_void.
* lto-streamer-in.c (lto_input_mode_table): Likewise.
* optabs-query.c (find_widening_optab_handler_and_mode): Likewise.
* stor-layout.c (bit_field_mode_iterator::next_mode): Likewise.
* internal-fn.c (expand_mul_overflow): Update use of
GET_MODE_2XWIDER_MODE.
* omp-low.c (omp_clause_aligned_alignment): Likewise.
* tree-ssa-math-opts.c (convert_mult_to_widen): Update use of
GET_MODE_WIDER_MODE.
(convert_plusminus_to_widen): Likewise.
* tree-switch-conversion.c (array_value_type): Likewise.
* var-tracking.c (emit_note_insn_var_location): Likewise.
* tree-vrp.c (simplify_float_conversion_using_ranges): Likewise.
Return false inside rather than outside the loop if no wider mode
exists
* optabs.c (expand_binop): Update use of GET_MODE_WIDER_MODE
and GET_MODE_2XWIDER_MODE
(can_compare_p): Use else_void.
* gdbhooks.py (OptMachineModePrinter): New class.
(build_pretty_printer): Use it for opt_mode.
gcc/ada/
* gcc-interface/decl.c (validate_size): Update use of
GET_MODE_WIDER_MODE, forcing a wider mode to exist.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r251457
|
|
The new iterators are:
- FOR_EACH_MODE_IN_CLASS: iterate over all the modes in a mode class.
- FOR_EACH_MODE_FROM: iterate over all the modes in a class,
starting at a given mode.
- FOR_EACH_WIDER_MODE: iterate over all the modes in a class,
starting at the next widest mode after a given mode.
- FOR_EACH_2XWIDER_MODE: same, but considering only modes that
are two times wider than the previous mode.
- FOR_EACH_MODE_UNTIL: iterate over all the modes in a class until
a given mode is reached.
- FOR_EACH_MODE: iterate over all the modes in a class between
two given modes, inclusive of the first but not the second.
These help with the stronger type checking added by later patches,
since every new mode will be in the same class as the previous one.
2017-08-30 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* machmode.h (mode_traits): New structure.
(get_narrowest_mode): New function.
(mode_iterator::start): Likewise.
(mode_iterator::iterate_p): Likewise.
(mode_iterator::get_wider): Likewise.
(mode_iterator::get_known_wider): Likewise.
(mode_iterator::get_2xwider): Likewise.
(FOR_EACH_MODE_IN_CLASS): New mode iterator.
(FOR_EACH_MODE): Likewise.
(FOR_EACH_MODE_FROM): Likewise.
(FOR_EACH_MODE_UNTIL): Likewise.
(FOR_EACH_WIDER_MODE): Likewise.
(FOR_EACH_2XWIDER_MODE): Likewise.
* builtins.c (expand_builtin_strlen): Use new mode iterators.
* combine.c (simplify_comparison): Likewise
* config/i386/i386.c (type_natural_mode): Likewise.
* cse.c (cse_insn): Likewise.
* dse.c (find_shift_sequence): Likewise.
* emit-rtl.c (init_derived_machine_modes): Likewise.
(init_emit_once): Likewise.
* explow.c (hard_function_value): Likewise.
* expmed.c (extract_fixed_bit_field_1): Likewise.
(extract_bit_field_1): Likewise.
(expand_divmod): Likewise.
(emit_store_flag_1): Likewise.
* expr.c (init_expr_target): Likewise.
(convert_move): Likewise.
(alignment_for_piecewise_move): Likewise.
(widest_int_mode_for_size): Likewise.
(emit_block_move_via_movmem): Likewise.
(copy_blkmode_to_reg): Likewise.
(set_storage_via_setmem): Likewise.
(compress_float_constant): Likewise.
* omp-low.c (omp_clause_aligned_alignment): Likewise.
* optabs-query.c (get_best_extraction_insn): Likewise.
* optabs.c (expand_binop): Likewise.
(expand_twoval_unop): Likewise.
(expand_twoval_binop): Likewise.
(widen_leading): Likewise.
(widen_bswap): Likewise.
(expand_parity): Likewise.
(expand_unop): Likewise.
(prepare_cmp_insn): Likewise.
(prepare_float_lib_cmp): Likewise.
(expand_float): Likewise.
(expand_fix): Likewise.
(expand_sfix_optab): Likewise.
* postreload.c (move2add_use_add2_insn): Likewise.
* reg-stack.c (reg_to_stack): Likewise.
* reginfo.c (choose_hard_reg_mode): Likewise.
* rtlanal.c (init_num_sign_bit_copies_in_rep): Likewise.
* stor-layout.c (mode_for_size): Likewise.
(smallest_mode_for_size): Likewise.
(mode_for_vector): Likewise.
(finish_bitfield_representative): Likewise.
* tree-ssa-math-opts.c (target_supports_divmod_p): Likewise.
* tree-vect-generic.c (type_for_widest_vector_mode): Likewise.
* tree-vect-stmts.c (vectorizable_conversion): Likewise.
* var-tracking.c (prepare_call_arguments): Likewise.
gcc/ada/
* gcc-interface/misc.c (fp_prec_to_size): Use new mode iterators.
(fp_size_to_prec): Likewise.
gcc/c-family/
* c-common.c (c_common_fixed_point_type_for_size): Use new mode
iterators.
* c-cppbuiltin.c (c_cpp_builtins): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r251455
|
|
gcc/
* builtins.c (fold_builtin_atomic_always_lock_free): Make "lock-free"
conditional on existance of a fast atomic load.
* optabs-query.c (can_atomic_load_p): New function.
* optabs-query.h (can_atomic_load_p): Declare it.
* optabs.c (expand_atomic_exchange): Always delegate to libatomic if
no fast atomic load is available for the particular size of access.
(expand_atomic_compare_and_swap): Likewise.
(expand_atomic_load): Likewise.
(expand_atomic_store): Likewise.
(expand_atomic_fetch_op): Likewise.
* testsuite/lib/target-supports.exp
(check_effective_target_sync_int_128): Remove x86 because it provides
no fast atomic load.
(check_effective_target_sync_int_128_runtime): Likewise.
libatomic/
* acinclude.m4: Add #define FAST_ATOMIC_LDST_*.
* auto-config.h.in: Regenerate.
* config/x86/host-config.h (FAST_ATOMIC_LDST_16): Define to 0.
(atomic_compare_exchange_n): New.
* glfree.c (EXACT, LARGER): Change condition and add comments.
From-SVN: r245098
|
|
From-SVN: r243994
|
|
From-SVN: r232055
|
|
The problem in the PR is that some i386 optabs FAIL when
optimising for size rather than speed. The gimple level generally
needs access to this information before calling the generator,
so this patch adds a new hook to say whether an optab should
be used when optimising for size or speed. It also has a "both"
option for cases where we want code that is optimised for both
size and speed.
I've passed the optab to the target hook because I think in most
cases that's more useful than the instruction code. We could pass
both if there's a use for it though.
At the moment the match-and-simplify code doesn't have direct access
to the target block, so for now I've used "both" there.
Tested on x86_64-linux-gnu and powerpc64-linux-gnu.
gcc/
PR tree-optimization/68432
* coretypes.h (optimization_type): New enum.
* doc/tm.texi.in (TARGET_OPTAB_SUPPORTED_P): New hook.
* doc/tm.texi: Regenerate.
* target.def (optab_supported_p): New hook.
* targhooks.h (default_optab_supported_p): Declare.
* targhooks.c (default_optab_supported_p): New function.
* predict.h (function_optimization_type): Declare.
(bb_optimization_type): Likewise.
* predict.c (function_optimization_type): New function.
(bb_optimization_type): Likewise.
* optabs-query.h (convert_optab_handler): Define an overload
that takes an optimization type.
(direct_optab_handler): Likewise.
* optabs-query.c (convert_optab_handler): Likewise.
(direct_optab_handler): Likewise.
* internal-fn.h (direct_internal_fn_supported_p): Take an
optimization_type argument.
* internal-fn.c (direct_optab_supported_p): Likewise.
(multi_vector_optab_supported_p): Likewise.
(direct_internal_fn_supported_p): Likewise.
* builtins.c (replacement_internal_fn): Update call to
direct_internal_fn_supported_p.
* gimple-match-head.c (build_call_internal): Likewise.
* tree-vect-patterns.c (vect_recog_pow_pattern): Likewise.
* tree-vect-stmts.c (vectorizable_internal_function): Likewise.
* tree.c (maybe_build_call_expr_loc): Likewise.
* config/i386/i386.c (ix86_optab_supported_p): New function.
(TARGET_OPTAB_SUPPORTED_P): Define.
* config/i386/i386.md (asinxf2): Remove optimize_insn_for_size_p check.
(asin<mode>2, acosxf2, acos<mode>2, log1pxf2, log1p<mode>2)
(expNcorexf3, expxf2, exp<mode>2, exp10xf2, exp10<mode>2, exp2xf2)
(exp2<mode>2, expm1xf2, expm1<mode>2, ldexpxf3, ldexp<mode>3)
(scalbxf3, scalb<mode>3, rint<mode>2, round<mode>2)
(<rounding_insn>xf2, <rounding_insn><mode>2): Likewise.
gcc/testsuite/
* gcc.target/i386/pr68432-1.c: New test.
* gcc.target/i386/pr68432-2.c: Likewise.
* gcc.target/i386/pr68432-3.c: Likewise.
From-SVN: r231161
|
|
gcc/
* internal-fn.c (expand_MASK_LOAD): Adjust to maskload optab changes.
(expand_MASK_STORE): Adjust to maskstore optab changes.
* optabs-query.c (can_vec_mask_load_store_p): Add MASK_MODE arg.
Adjust to maskload, maskstore optab changes.
* optabs-query.h (can_vec_mask_load_store_p): Add MASK_MODE arg.
* optabs.def (maskload_optab): Transform into convert optab.
(maskstore_optab): Likewise.
* tree-if-conv.c (ifcvt_can_use_mask_load_store): Adjust to
can_vec_mask_load_store_p signature change.
(predicate_mem_writes): Use boolean mask.
* tree-vect-stmts.c (vectorizable_mask_load_store): Adjust to
can_vec_mask_load_store_p signature change. Allow invariant masks.
(vectorizable_operation): Ignore type precision for boolean vectors.
gcc/testsuite/
* gcc.target/i386/avx2-vec-mask-bit-not.c: New test.
From-SVN: r230099
|
|
optabs.[hc] is a bit of a behemoth. It includes basic functions for querying
what a target can do, related tree- and gimple-level query functions,
related rtl-level query functions, and the functions that actually
generate code. Some gimple optimisations therefore need:
#include "insn-config.h"
#include "expmed.h"
#include "dojump.h"
#include "explow.h"
#include "emit-rtl.h"
#include "varasm.h"
#include "stmt.h"
#include "expr.h"
purely to query whether the target has support for a particular operation.
This patch splits optabs up as follows:
- optabs-query.[hc]: IL-independent functions for querying what a target
can do natively.
- optabs-tree.[hc]: tree and gimple query functions (an extension of
optabs-query.[hc]).
- optabs-libfuncs.[hc]: optabs-specific libfuncs (an extension of
libfuncs.h)
- optabs.h: For now includes optabs-query.h and optabs-libfuncs.h.
Only two files outside optabs need to include both optabs.h and
optabs-tree.h: expr.c and function.c. I think that's expected given
that both are related to expand.
It might be good to split optabs.h further, but this is already quite
a big patch.
I changed can_conditionally_move_p from returning an int to returning
a bool and fixed a few formatting glitches. There should be no other
changes to the functions themselves.
gcc/
* Makefile.in (OBJS): Add optabs-libfuncs.o, optabs-query.o
and optabs-tree.o.
(GTFILES): Replace optabs.c with optabs-libfunc.c.
* genopinit.c (main): Add an include guard to insn-opinit.h.
Protect the rtx_code parts with NUM_RTX_CODE.
* optabs.h: Split parts out to...
* optabs-libfuncs.h, optabs-query.h, optabs-tree.h: ...these
new files.
* optabs.c: Split parts out to...
* optabs-libfuncs.c, optabs-query.c, optabs-tree.c: ...these
new files.
* cilk-common.c: Include optabs-query.h rather than optabs.h.
* fold-const.c: Likewise.
* target-globals.c: Likewise.
* tree-if-conv.c: Likewise.
* tree-ssa-forwprop.c: Likewise.
* tree-ssa-loop-prefetch.c: Likewise.
* tree-ssa-math-opts.c: Include optabs-tree.h rather than
optabs.h. Remove unncessary include files.
* tree-ssa-phiopt.c: Likewise.
* tree-ssa-reassoc.c: Likewise.
* tree-switch-conversion.c: Likewise.
* tree-vect-data-refs.c: Likewise.
* tree-vect-generic.c: Likewise.
* tree-vect-loop.c: Likewise.
* tree-vect-patterns.c: Likewise.
* tree-vect-slp.c: Likewise.
* tree-vect-stmts.c: Likewise.
* tree-vrp.c: Likewise.
* toplev.c: Include optabs-query.h and optabs-libfuncs.h
rather than optabs.h.
* expr.c: Include optabs-tree.h.
* function.c: Likewise.
From-SVN: r227865
|