aboutsummaryrefslogtreecommitdiff
path: root/gcc/combine.c
AgeCommit message (Collapse)AuthorFilesLines
2018-04-10re PR rtl-optimization/85300 (ICE in exact_int_to_float_conversion_p, at ↵Jakub Jelinek1-4/+8
simplify-rtx.c:895) PR rtl-optimization/85300 * combine.c (subst): Handle subst of CONST_SCALAR_INT_P new_rtx also into FLOAT and UNSIGNED_FLOAT like ZERO_EXTEND, return a CLOBBER if simplify_unary_operation fails. * gcc.dg/pr85300.c: New test. From-SVN: r259285
2018-03-14combine: Don't make log_links for pc_rtx (PR84780 #c10)Segher Boessenkool1-0/+3
distribute_links tries to place a log_link for whatever the destination of the modified instruction is. It shouldn't do that when that dest is pc_rtx, which isn't actually a register. * combine.c (distribute_links): Don't make a link based on pc_rtx. From-SVN: r258523
2018-03-12combine: Fix PR84780 (more LOG_LINKS trouble)Segher Boessenkool1-0/+1
There still are situations where we have stale LOG_LINKS. This causes combine to try two-insn combinations I2->I3 where the register set by I2 is used before I3 as well. Not good. This patch fixes it by checking for this situation in can_combine_p (similar to what we already do for three and four insn combinations). From-SVN: r258452
2018-03-06re PR target/84710 (ICE: RTL check: expected code 'reg', have 'subreg' in ↵Jakub Jelinek1-6/+2
rhs_regno, at rtl.h:1896 with -O -fno-forward-propagate) PR target/84710 * combine.c (try_combine): Use reg_or_subregno instead of handling just paradoxical SUBREGs and REGs. * gcc.dg/pr84710.c: New test. From-SVN: r258301
2018-03-05re PR target/84700 (ICE on 32-bit BE powerpc targets w/ -misel -O1)Jakub Jelinek1-1/+5
PR target/84700 * combine.c (combine_simplify_rtx): Don't try to simplify if if_then_else_cond returned non-NULL, but either true_rtx or false_rtx are equal to x. * gcc.target/powerpc/pr84700.c: New test. From-SVN: r258263
2018-03-02re PR rtl-optimization/84614 (wrong code with u16->u128 extension at aarch64 ↵Jakub Jelinek1-2/+2
-fno-split-wide-types -g3 --param=max-combine-insns=3) PR target/84614 * rtl.h (prev_real_nondebug_insn, next_real_nondebug_insn): New prototypes. * emit-rtl.c (next_real_insn, prev_real_insn): Fix up function comments. (next_real_nondebug_insn, prev_real_nondebug_insn): New functions. * cfgcleanup.c (try_head_merge_bb): Use prev_real_nondebug_insn instead of a loop around prev_real_insn. * combine.c (move_deaths): Use prev_real_nondebug_insn instead of prev_real_insn. * gcc.dg/pr84614.c: New test. From-SVN: r258129
2018-02-16combine: Fix problem with RTL checkingSegher Boessenkool1-1/+6
As Jakub found, after my recent combine patch at least on x86 problems show up with RTL checking enabled. This is because the I2 generated by a successful instruction combination can write not only a register but it can also write a paradoxical subreg of one. This fixes it. * combine.c (try_combine): When adjusting LOG_LINKS for the destination that moved to I2, also allow destinations that are a paradoxical subreg (instead of a normal reg). From-SVN: r257736
2018-02-13combine: Update links correctly for new I2 (PR84169)Segher Boessenkool1-26/+31
If there is a LOG_LINK between two insns, this means those two insns can be combined, as far as dataflow is concerned. There never should be a LOG_LINK between two unrelated insns. If there is one, combine will try to combine the insns without doing all the needed checks if the earlier destination is used before the later insn, etc. Unfortunately we do not update the LOG_LINKs correctly in some cases. This patch fixes at least some of those cases. PR rtl-optimization/84169 * combine.c (try_combine): New variable split_i2i3. Set it to true if we generated a parallel as new i3 and we split that to new i2 and i3 instructions. Handle split_i2i3 similar to swap_i2i3: scan the LOG_LINKs of i3 to see which of those need to link to i2 now. Link those to i2, not i1. Partially rewrite this scan code. gcc/testsuite/ PR rtl-optimization/84169 * gcc.c-torture/execute/pr84169.c: New. From-SVN: r257644
2018-02-01re PR rtl-optimization/84157 ([nvptx] ICE: RTL check: expected code 'reg', ↵Uros Bizjak1-2/+2
have 'lshiftrt') PR rtl-optimization/84157 * combine.c (change_zero_ext): Use REG_P predicate in front of HARD_REGISTER_P predicate. From-SVN: r257302
2018-01-31re PR rtl-optimization/84123 (internal compiler error: in gen_rtx_SUBREG, at ↵Uros Bizjak1-2/+15
emit-rtl.c:908, alpha linux.) PR rtl-optimization/84123 * combine.c (change_zero_ext): Check if hard register satisfies can_change_dest_mode before calling gen_lowpart_SUBREG. From-SVN: r257270
2018-01-31re PR rtl-optimization/84071 (wrong elimination of zero-extension after ↵Eric Botcazou1-5/+12
sign-extended load) PR rtl-optimization/84071 * combine.c (record_dead_and_set_regs_1): Record the source unmodified for a paradoxical SUBREG on a WORD_REGISTER_OPERATIONS target. From-SVN: r257224
2018-01-09re PR rtl-optimization/83628 (performance regression when accessing arrays ↵Uros Bizjak1-1/+1
on alpha) PR target/83628 * combine.c (force_int_to_mode) <case ASHIFT>: Use mode instead of op_mode in the force_to_mode call. From-SVN: r256387
2018-01-03poly_int: GET_MODE_SIZERichard Sandiford1-16/+10
This patch changes GET_MODE_SIZE from unsigned short to poly_uint16. The non-mechanical parts were handled by previous patches. 2018-01-03 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * machmode.h (mode_size): Change from unsigned short to poly_uint16_pod. (mode_to_bytes): Return a poly_uint16 rather than an unsigned short. (GET_MODE_SIZE): Return a constant if ONLY_FIXED_SIZE_MODES, or if measurement_type is not polynomial. (fixed_size_mode::includes_p): Check for constant-sized modes. * genmodes.c (emit_mode_size_inline): Make mode_size_inline return a poly_uint16 rather than an unsigned short. (emit_mode_size): Change the type of mode_size from unsigned short to poly_uint16_pod. Use ZERO_COEFFS for the initializer. (emit_mode_adjustments): Cope with polynomial vector sizes. * lto-streamer-in.c (lto_input_mode_table): Use bp_unpack_poly_value for GET_MODE_SIZE. * lto-streamer-out.c (lto_write_mode_table): Use bp_pack_poly_value for GET_MODE_SIZE. * auto-inc-dec.c (try_merge): Treat GET_MODE_SIZE as polynomial. * builtins.c (expand_ifn_atomic_compare_exchange_into_call): Likewise. * caller-save.c (setup_save_areas): Likewise. (replace_reg_with_saved_mem): Likewise. * calls.c (emit_library_call_value_1): Likewise. * combine-stack-adj.c (combine_stack_adjustments_for_block): Likewise. * combine.c (simplify_set, make_extraction, simplify_shift_const_1) (gen_lowpart_for_combine): Likewise. * convert.c (convert_to_integer_1): Likewise. * cse.c (equiv_constant, cse_insn): Likewise. * cselib.c (autoinc_split, cselib_hash_rtx): Likewise. (cselib_subst_to_values): Likewise. * dce.c (word_dce_process_block): Likewise. * df-problems.c (df_word_lr_mark_ref): Likewise. * dwarf2cfi.c (init_one_dwarf_reg_size): Likewise. * dwarf2out.c (multiple_reg_loc_descriptor, mem_loc_descriptor) (concat_loc_descriptor, concatn_loc_descriptor, loc_descriptor) (rtl_for_decl_location): Likewise. * emit-rtl.c (gen_highpart, widen_memory_access): Likewise. * expmed.c (extract_bit_field_1, extract_integral_bit_field): Likewise. * expr.c (emit_group_load_1, clear_storage_hints): Likewise. (emit_move_complex, emit_move_multi_word, emit_push_insn): Likewise. (expand_expr_real_1): Likewise. * function.c (assign_parm_setup_block_p, assign_parm_setup_block) (pad_below): Likewise. * gimple-fold.c (optimize_atomic_compare_exchange_p): Likewise. * gimple-ssa-store-merging.c (rhs_valid_for_store_merging_p): Likewise. * ira.c (get_subreg_tracking_sizes): Likewise. * ira-build.c (ira_create_allocno_objects): Likewise. * ira-color.c (coalesced_pseudo_reg_slot_compare): Likewise. (ira_sort_regnos_for_alter_reg): Likewise. * ira-costs.c (record_operand_costs): Likewise. * lower-subreg.c (interesting_mode_p, simplify_gen_subreg_concatn) (resolve_simple_move): Likewise. * lra-constraints.c (get_reload_reg, operands_match_p): Likewise. (process_addr_reg, simplify_operand_subreg, curr_insn_transform) (lra_constraints): Likewise. (CONST_POOL_OK_P): Reject variable-sized modes. * lra-spills.c (slot, assign_mem_slot, pseudo_reg_slot_compare) (add_pseudo_to_slot, lra_spill): Likewise. * omp-low.c (omp_clause_aligned_alignment): Likewise. * optabs-query.c (get_best_extraction_insn): Likewise. * optabs-tree.c (expand_vec_cond_expr_p): Likewise. * optabs.c (expand_vec_perm_var, expand_vec_cond_expr): Likewise. (expand_mult_highpart, valid_multiword_target_p): Likewise. * recog.c (offsettable_address_addr_space_p): Likewise. * regcprop.c (maybe_mode_change): Likewise. * reginfo.c (choose_hard_reg_mode, record_subregs_of_mode): Likewise. * regrename.c (build_def_use): Likewise. * regstat.c (dump_reg_info): Likewise. * reload.c (complex_word_subreg_p, push_reload, find_dummy_reload) (find_reloads, find_reloads_subreg_address): Likewise. * reload1.c (eliminate_regs_1): Likewise. * rtlanal.c (for_each_inc_dec_find_inc_dec, rtx_cost): Likewise. * simplify-rtx.c (avoid_constant_pool_reference): Likewise. (simplify_binary_operation_1, simplify_subreg): Likewise. * targhooks.c (default_function_arg_padding): Likewise. (default_hard_regno_nregs, default_class_max_nregs): Likewise. * tree-cfg.c (verify_gimple_assign_binary): Likewise. (verify_gimple_assign_ternary): Likewise. * tree-inline.c (estimate_move_cost): Likewise. * tree-ssa-forwprop.c (simplify_vector_constructor): Likewise. * tree-ssa-loop-ivopts.c (add_autoinc_candidates): Likewise. (get_address_cost_ainc): Likewise. * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Likewise. (vect_supportable_dr_alignment): Likewise. * tree-vect-loop.c (vect_determine_vectorization_factor): Likewise. (vectorizable_reduction): Likewise. * tree-vect-stmts.c (vectorizable_assignment, vectorizable_shift) (vectorizable_operation, vectorizable_load): Likewise. * tree.c (build_same_sized_truth_vector_type): Likewise. * valtrack.c (cleanup_auto_inc_dec): Likewise. * var-tracking.c (emit_note_insn_var_location): Likewise. * config/arc/arc.h (ASM_OUTPUT_CASE_END): Use as_a <scalar_int_mode>. (ADDR_VEC_ALIGN): Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256201
2018-01-03poly_int: GET_MODE_BITSIZERichard Sandiford1-5/+8
This patch changes GET_MODE_BITSIZE from an unsigned short to a poly_uint16. 2018-01-03 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * machmode.h (mode_to_bits): Return a poly_uint16 rather than an unsigned short. (GET_MODE_BITSIZE): Return a constant if ONLY_FIXED_SIZE_MODES, or if measurement_type is polynomial. * calls.c (shift_return_value): Treat GET_MODE_BITSIZE as polynomial. * combine.c (make_extraction): Likewise. * dse.c (find_shift_sequence): Likewise. * dwarf2out.c (mem_loc_descriptor): Likewise. * expmed.c (store_integral_bit_field, extract_bit_field_1): Likewise. (extract_bit_field, extract_low_bits): Likewise. * expr.c (convert_move, convert_modes, emit_move_insn_1): Likewise. (optimize_bitfield_assignment_op, expand_assignment): Likewise. (store_expr_with_bounds, store_field, expand_expr_real_1): Likewise. * fold-const.c (optimize_bit_field_compare, merge_ranges): Likewise. * gimple-fold.c (optimize_atomic_compare_exchange_p): Likewise. * reload.c (find_reloads): Likewise. * reload1.c (alter_reg): Likewise. * stor-layout.c (bitwise_mode_for_mode, compute_record_mode): Likewise. * targhooks.c (default_secondary_memory_needed_mode): Likewise. * tree-if-conv.c (predicate_mem_writes): Likewise. * tree-ssa-strlen.c (handle_builtin_memcmp): Likewise. * tree-vect-patterns.c (adjust_bool_pattern): Likewise. * tree-vect-stmts.c (vectorizable_simd_clone_call): Likewise. * valtrack.c (dead_debug_insert_temp): Likewise. * varasm.c (mergeable_constant_section): Likewise. * config/sh/sh.h (LOCAL_ALIGNMENT): Use as_a <fixed_size_mode>. gcc/ada/ * gcc-interface/misc.c (enumerate_modes): Treat GET_MODE_BITSIZE as polynomial. gcc/c-family/ * c-ubsan.c (ubsan_instrument_shift): Treat GET_MODE_BITSIZE as polynomial. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256200
2018-01-03poly_int: GET_MODE_PRECISIONRichard Sandiford1-35/+42
This patch changes GET_MODE_PRECISION from an unsigned short to a poly_uint16. 2018-01-03 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * machmode.h (mode_precision): Change from unsigned short to poly_uint16_pod. (mode_to_precision): Return a poly_uint16 rather than an unsigned short. (GET_MODE_PRECISION): Return a constant if ONLY_FIXED_SIZE_MODES, or if measurement_type is not polynomial. (HWI_COMPUTABLE_MODE_P): Turn into a function. Optimize the case in which the mode is already known to be a scalar_int_mode. * genmodes.c (emit_mode_precision): Change the type of mode_precision from unsigned short to poly_uint16_pod. Use ZERO_COEFFS for the initializer. * lto-streamer-in.c (lto_input_mode_table): Use bp_unpack_poly_value for GET_MODE_PRECISION. * lto-streamer-out.c (lto_write_mode_table): Use bp_pack_poly_value for GET_MODE_PRECISION. * combine.c (update_rsp_from_reg_equal): Treat GET_MODE_PRECISION as polynomial. (try_combine, find_split_point, combine_simplify_rtx): Likewise. (expand_field_assignment, make_extraction): Likewise. (make_compound_operation_int, record_dead_and_set_regs_1): Likewise. (get_last_value): Likewise. * convert.c (convert_to_integer_1): Likewise. * cse.c (cse_insn): Likewise. * expr.c (expand_expr_real_1): Likewise. * lra-constraints.c (simplify_operand_subreg): Likewise. * optabs-query.c (can_atomic_load_p): Likewise. * optabs.c (expand_atomic_load): Likewise. (expand_atomic_store): Likewise. * ree.c (combine_reaching_defs): Likewise. * rtl.h (partial_subreg_p, paradoxical_subreg_p): Likewise. * rtlanal.c (nonzero_bits1, lsb_bitfield_op_p): Likewise. * tree.h (type_has_mode_precision_p): Likewise. * ubsan.c (instrument_si_overflow): Likewise. gcc/ada/ * gcc-interface/misc.c (enumerate_modes): Treat GET_MODE_PRECISION as polynomial. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256198
2018-01-03Update copyright years.Jakub Jelinek1-1/+1
From-SVN: r256169
2018-01-03poly_int: REGMODE_NATURAL_SIZERichard Sandiford1-2/+2
This patch makes target-independent code that uses REGMODE_NATURAL_SIZE treat it as a poly_int rather than a constant. 2018-01-03 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * combine.c (can_change_dest_mode): Handle polynomial REGMODE_NATURAL_SIZE. * expmed.c (store_bit_field_1): Likewise. * expr.c (store_constructor): Likewise. * emit-rtl.c (validate_subreg): Operate on polynomial mode sizes and polynomial REGMODE_NATURAL_SIZE. (gen_lowpart_common): Likewise. * reginfo.c (record_subregs_of_mode): Likewise. * rtlanal.c (read_modify_subreg_p): Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256149
2017-12-21[Patch combine] Don't create ZERO_EXTEND from subregs unless we have a ↵James Greenhalgh1-2/+3
scalar int mode gcc/ * combine.c (simplify_set): Do not transform subregs to zero_extends if the destination is not a scalar int mode. From-SVN: r255945
2017-12-21poly_int: REG_ARGS_SIZERichard Sandiford1-2/+2
This patch adds new utility functions for manipulating REG_ARGS_SIZE notes and allows the notes to carry polynomial as well as constant sizes. The code was inconsistent about whether INT_MIN or HOST_WIDE_INT_MIN should be used to represent an unknown size. The patch uses HOST_WIDE_INT_MIN throughout. 2017-12-21 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * rtl.h (get_args_size, add_args_size_note): New functions. (find_args_size_adjust): Return a poly_int64 rather than a HOST_WIDE_INT. (fixup_args_size_notes): Likewise. Make the same change to the end_args_size parameter. * rtlanal.c (get_args_size, add_args_size_note): New functions. * builtins.c (expand_builtin_trap): Use add_args_size_note. * calls.c (emit_call_1): Likewise. * explow.c (adjust_stack_1): Likewise. * cfgcleanup.c (old_insns_match_p): Update use of find_args_size_adjust. * combine.c (distribute_notes): Track polynomial arg sizes. * dwarf2cfi.c (dw_trace_info): Change beg_true_args_size, end_true_args_size, beg_delay_args_size and end_delay_args_size from HOST_WIDE_INT to poly_int64. (add_cfi_args_size): Take the args_size as a poly_int64 rather than a HOST_WIDE_INT. (notice_args_size, notice_eh_throw, maybe_record_trace_start) (maybe_record_trace_start_abnormal, scan_trace, connect_traces): Track polynomial arg sizes. * emit-rtl.c (try_split): Use get_args_size. * recog.c (peep2_attempt): Likewise. * reload1.c (reload_as_needed): Likewise. * expr.c (find_args_size_adjust): Return the adjustment as a poly_int64 rather than a HOST_WIDE_INT. (fixup_args_size_notes): Change end_args_size from a HOST_WIDE_INT to a poly_int64 and change the return type in the same way. (emit_single_push_insn): Track polynomial arg sizes. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r255919
2017-12-20poly_int: SUBREG_BYTERichard Sandiford1-6/+7
This patch changes SUBREG_BYTE from an int to a poly_int. Since valid SUBREG_BYTEs must be contained within the mode of the SUBREG_REG, the required range is the same as for GET_MODE_SIZE, i.e. unsigned short. The patch therefore uses poly_uint16(_pod) for the SUBREG_BYTE. Using poly_uint16_pod rtx fields requires a new field code ('p'). Since there are no other uses of 'p' besides SUBREG_BYTE, the patch doesn't add an XPOLY or whatever; all uses should go via SUBREG_BYTE instead. The patch doesn't bother implementing 'p' support for legacy define_peepholes, since none of the remaining ones have subregs in their patterns. As it happened, the rtl documentation used SUBREG as an example of a code with mixed field types, accessed via XEXP (x, 0) and XINT (x, 1). Since there's no direct replacement for XINT, and since people should never use it even if there were, the patch changes the example to use INT_LIST instead. The patch also changes subreg-related helper functions so that they too take and return polynomial offsets. This makes the patch quite big, but it's mostly mechanical. The patch generally sticks to existing choices wrt signedness. 2017-12-20 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * doc/rtl.texi: Update documentation of SUBREG_BYTE. Document the 'p' format code. Use INT_LIST rather than SUBREG as the example of a code with an XINT and an XEXP. Remove the implication that accessing an rtx field using XINT is expected to work. * rtl.def (SUBREG): Change format from "ei" to "ep". * rtl.h (rtunion::rt_subreg): New field. (XCSUBREG): New macro. (SUBREG_BYTE): Use it. (subreg_shape): Change offset from an unsigned int to a poly_uint16. Update constructor accordingly. (subreg_shape::operator ==): Update accordingly. (subreg_shape::unique_id): Return an unsigned HOST_WIDE_INT rather than an unsigned int. (subreg_lsb, subreg_lowpart_offset, subreg_highpart_offset): Return a poly_uint64 rather than an unsigned int. (subreg_lsb_1): Likewise. Take the offset as a poly_uint64 rather than an unsigned int. (subreg_size_offset_from_lsb, subreg_size_lowpart_offset) (subreg_size_highpart_offset): Return a poly_uint64 rather than an unsigned int. Take the sizes as poly_uint64s. (subreg_offset_from_lsb): Return a poly_uint64 rather than an unsigned int. Take the shift as a poly_uint64 rather than an unsigned int. (subreg_regno_offset, subreg_offset_representable_p): Take the offset as a poly_uint64 rather than an unsigned int. (simplify_subreg_regno): Likewise. (byte_lowpart_offset): Return the memory offset as a poly_int64 rather than an int. (subreg_memory_offset): Likewise. Take the subreg offset as a poly_uint64 rather than an unsigned int. (simplify_subreg, simplify_gen_subreg, subreg_get_info) (gen_rtx_SUBREG, validate_subreg): Take the subreg offset as a poly_uint64 rather than an unsigned int. * rtl.c (rtx_format): Describe 'p' in comment. (copy_rtx, rtx_equal_p_cb, rtx_equal_p): Handle 'p'. * emit-rtl.c (validate_subreg, gen_rtx_SUBREG): Take the subreg offset as a poly_uint64 rather than an unsigned int. (byte_lowpart_offset): Return the memory offset as a poly_int64 rather than an int. (subreg_memory_offset): Likewise. Take the subreg offset as a poly_uint64 rather than an unsigned int. (subreg_size_lowpart_offset, subreg_size_highpart_offset): Take the mode sizes as poly_uint64s rather than unsigned ints. Return a poly_uint64 rather than an unsigned int. (subreg_lowpart_p): Treat subreg offsets as poly_ints. (copy_insn_1): Handle 'p'. * rtlanal.c (set_noop_p): Treat subregs offsets as poly_uint64s. (subreg_lsb_1): Take the subreg offset as a poly_uint64 rather than an unsigned int. Return the shift in the same way. (subreg_lsb): Return the shift as a poly_uint64 rather than an unsigned int. (subreg_size_offset_from_lsb): Take the sizes and shift as poly_uint64s rather than unsigned ints. Return the offset as a poly_uint64. (subreg_get_info, subreg_regno_offset, subreg_offset_representable_p) (simplify_subreg_regno): Take the offset as a poly_uint64 rather than an unsigned int. * rtlhash.c (add_rtx): Handle 'p'. * genemit.c (gen_exp): Likewise. * gengenrtl.c (type_from_format, gendef): Likewise. * gensupport.c (subst_pattern_match, get_alternatives_number) (collect_insn_data, alter_predicate_for_insn, alter_constraints) (subst_dup): Likewise. * gengtype.c (adjust_field_rtx_def): Likewise. * genrecog.c (find_operand, find_matching_operand, validate_pattern) (match_pattern_2): Likewise. (rtx_test::SUBREG_FIELD): New rtx_test::kind_enum. (rtx_test::subreg_field): New function. (operator ==, safe_to_hoist_p, transition_parameter_type) (print_nonbool_test, print_test): Handle SUBREG_FIELD. * genattrtab.c (attr_rtx_1): Say that 'p' is deliberately not handled. * genpeep.c (match_rtx): Likewise. * print-rtl.c (print_poly_int): Include if GENERATOR_FILE too. (rtx_writer::print_rtx_operand): Handle 'p'. (print_value): Handle SUBREG. * read-rtl.c (apply_int_iterator): Likewise. (rtx_reader::read_rtx_operand): Handle 'p'. * alias.c (rtx_equal_for_memref_p): Likewise. * cselib.c (rtx_equal_for_cselib_1, cselib_hash_rtx): Likewise. * caller-save.c (replace_reg_with_saved_mem): Treat subreg offsets as poly_ints. * calls.c (expand_call): Likewise. * combine.c (combine_simplify_rtx, expand_field_assignment): Likewise. (make_extraction, gen_lowpart_for_combine): Likewise. * loop-invariant.c (hash_invariant_expr_1, invariant_expr_equal_p): Likewise. * cse.c (remove_invalid_subreg_refs): Take the offset as a poly_uint64 rather than an unsigned int. Treat subreg offsets as poly_ints. (exp_equiv_p): Handle 'p'. (hash_rtx_cb): Likewise. Treat subreg offsets as poly_ints. (equiv_constant, cse_insn): Treat subreg offsets as poly_ints. * dse.c (find_shift_sequence): Likewise. * dwarf2out.c (rtl_for_decl_location): Likewise. * expmed.c (extract_low_bits): Likewise. * expr.c (emit_group_store, undefined_operand_subword_p): Likewise. (expand_expr_real_2): Likewise. * final.c (alter_subreg): Likewise. (leaf_renumber_regs_insn): Handle 'p'. * function.c (assign_parm_find_stack_rtl, assign_parm_setup_stack): Treat subreg offsets as poly_ints. * fwprop.c (forward_propagate_and_simplify): Likewise. * ifcvt.c (noce_emit_move_insn, noce_emit_cmove): Likewise. * ira.c (get_subreg_tracking_sizes): Likewise. * ira-conflicts.c (go_through_subreg): Likewise. * ira-lives.c (process_single_reg_class_operands): Likewise. * jump.c (rtx_renumbered_equal_p): Likewise. Handle 'p'. * lower-subreg.c (simplify_subreg_concatn): Take the subreg offset as a poly_uint64 rather than an unsigned int. (simplify_gen_subreg_concatn, resolve_simple_move): Treat subreg offsets as poly_ints. * lra-constraints.c (operands_match_p): Handle 'p'. (match_reload, curr_insn_transform): Treat subreg offsets as poly_ints. * lra-spills.c (assign_mem_slot): Likewise. * postreload.c (move2add_valid_value_p): Likewise. * recog.c (general_operand, indirect_operand): Likewise. * regcprop.c (copy_value, maybe_mode_change): Likewise. (copyprop_hardreg_forward_1): Likewise. * reginfo.c (simplifiable_subregs_hasher::hash, simplifiable_subregs) (record_subregs_of_mode): Likewise. * rtlhooks.c (gen_lowpart_general, gen_lowpart_if_possible): Likewise. * reload.c (operands_match_p): Handle 'p'. (find_reloads_subreg_address): Treat subreg offsets as poly_ints. * reload1.c (alter_reg, choose_reload_regs): Likewise. (compute_reload_subreg_offset): Likewise, and return an poly_int64. * simplify-rtx.c (simplify_truncation, simplify_binary_operation_1): (test_vector_ops_duplicate): Treat subreg offsets as poly_ints. (simplify_const_poly_int_tests<N>::run): Likewise. (simplify_subreg, simplify_gen_subreg): Take the subreg offset as a poly_uint64 rather than an unsigned int. * valtrack.c (debug_lowpart_subreg): Likewise. * var-tracking.c (var_lowpart): Likewise. (loc_cmp): Handle 'p'. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r255882
2017-12-20Add a gen_int_shift_amount helper functionRichard Sandiford1-42/+47
This patch adds a helper routine that constructs rtxes for constant shift amounts, given the mode of the value being shifted. As well as helping with the SVE patches, this is one step towards allowing CONST_INTs to have a real mode. One long-standing problem has been to decide what the mode of a shift count should be for arbitrary rtxes (as opposed to those directly tied to a target pattern). Realistic choices would be the mode of the shifted elements, word_mode, QImode, a 64-bit mode, or the same mode as the shift optabs (in which case what should the mode be when the target doesn't have a pattern?) For now the patch picks a 64-bit mode, but with a ??? comment. 2017-12-20 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * emit-rtl.h (gen_int_shift_amount): Declare. * emit-rtl.c (gen_int_shift_amount): New function. * asan.c (asan_emit_stack_protection): Use gen_int_shift_amount instead of GEN_INT. * calls.c (shift_return_value): Likewise. * cse.c (fold_rtx): Likewise. * dse.c (find_shift_sequence): Likewise. * expmed.c (init_expmed_one_mode, store_bit_field_1, expand_shift_1) (expand_shift, expand_smod_pow2): Likewise. * lower-subreg.c (shift_cost): Likewise. * optabs.c (expand_superword_shift, expand_doubleword_mult) (expand_unop, expand_binop, shift_amt_for_vec_perm_mask) (expand_vec_perm_var): Likewise. * simplify-rtx.c (simplify_unary_operation_1): Likewise. (simplify_binary_operation_1): Likewise. * combine.c (try_combine, find_split_point, force_int_to_mode) (simplify_shift_const_1, simplify_shift_const): Likewise. (change_zero_ext): Likewise. Use simplify_gen_binary. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r255861
2017-12-19read-rtl.c (parse_reg_note_name): Replace Yoda conditions with typical order ↵Jakub Jelinek1-33/+33
conditions. * read-rtl.c (parse_reg_note_name): Replace Yoda conditions with typical order conditions. * sel-sched.c (extract_new_fences_from): Likewise. * config/visium/constraints.md (J, K, L): Likewise. * config/visium/predicates.md (const_shift_operand): Likewise. * config/visium/visium.c (visium_legitimize_address, visium_legitimize_reload_address): Likewise. * config/m68k/m68k.c (output_reg_adjust, emit_reg_adjust): Likewise. * config/arm/arm.c (arm_block_move_unaligned_straight): Likewise. * config/avr/constraints.md (Y01, Ym1, Y02, Ym2): Likewise. * config/avr/avr-log.c (avr_vdump, avr_log_set_avr_log, SET_DUMP_DETAIL): Likewise. * config/avr/predicates.md (const_8_16_24_operand): Likewise. * config/avr/avr.c (STR_PREFIX_P, avr_popcount_each_byte, avr_is_casesi_sequence, avr_casei_sequence_check_operands, avr_set_core_architecture, avr_set_current_function, avr_legitimize_reload_address, avr_asm_len, avr_print_operand, output_movqi, output_movsisf, avr_out_plus, avr_out_bitop, avr_out_fract, avr_adjust_insn_length, avr_encode_section_info, avr_2word_insn_p, output_reload_in_const, avr_has_nibble_0xf, avr_map_decompose, avr_fold_builtin): Likewise. * config/avr/driver-avr.c (avr_devicespecs_file): Likewise. * config/avr/gen-avr-mmcu-specs.c (str_prefix_p, print_mcu): Likewise. * config/i386/i386.c (ix86_parse_stringop_strategy_string): Likewise. * config/m32c/m32c-pragma.c (m32c_pragma_memregs): Likewise. * config/m32c/m32c.c (m32c_conditional_register_usage, m32c_address_cost): Likewise. * config/m32c/predicates.md (shiftcount_operand, longshiftcount_operand): Likewise. * config/iq2000/iq2000.c (iq2000_expand_prologue): Likewise. * config/nios2/nios2.c (nios2_handle_custom_fpu_insn_option, can_use_cdx_ldstw): Likewise. * config/nios2/nios2.h (CDX_REG_P): Likewise. * config/cr16/cr16.h (RETURN_ADDR_RTX, REGNO_MODE_OK_FOR_BASE_P): Likewise. * config/cr16/cr16.md (*mov<mode>_double): Likewise. * config/cr16/cr16.c (cr16_create_dwarf_for_multi_push): Likewise. * config/h8300/h8300.c (h8300_rtx_costs, get_shift_alg): Likewise. * config/vax/constraints.md (U06, U08, U16, CN6, S08, S16): Likewise. * config/vax/vax.c (adjacent_operands_p): Likewise. * config/ft32/constraints.md (L, b, KA): Likewise. * config/ft32/ft32.c (ft32_load_immediate, ft32_expand_prologue): Likewise. * cfgexpand.c (expand_stack_alignment): Likewise. * gcse.c (insert_expr_in_table): Likewise. * print-rtl.c (rtx_writer::print_rtx_operand_codes_E_and_V): Likewise. * cgraphunit.c (cgraph_node::expand): Likewise. * ira-build.c (setup_min_max_allocno_live_range_point): Likewise. * emit-rtl.c (add_insn): Likewise. * input.c (dump_location_info): Likewise. * passes.c (NEXT_PASS): Likewise. * read-rtl-function.c (parse_note_insn_name, function_reader::read_rtx_operand_r, function_reader::parse_mem_expr): Likewise. * sched-rgn.c (sched_rgn_init): Likewise. * diagnostic-show-locus.c (layout::show_ruler): Likewise. * combine.c (find_split_point, simplify_if_then_else, force_to_mode, if_then_else_cond, simplify_shift_const_1, simplify_comparison): Likewise. * explow.c (eliminate_constant_term): Likewise. * final.c (leaf_renumber_regs_insn): Likewise. * cfgrtl.c (print_rtl_with_bb): Likewise. * genhooks.c (emit_init_macros): Likewise. * poly-int.h (maybe_ne, maybe_le, maybe_lt): Likewise. * tree-data-ref.c (conflict_fn): Likewise. * selftest.c (assert_streq): Likewise. * expr.c (store_constructor_field, expand_expr_real_1): Likewise. * fold-const.c (fold_range_test, extract_muldiv_1, fold_truth_andor, fold_binary_loc, multiple_of_p): Likewise. * reload.c (push_reload, find_equiv_reg): Likewise. * et-forest.c (et_nca, et_below): Likewise. * dbxout.c (dbxout_symbol_location): Likewise. * reorg.c (relax_delay_slots): Likewise. * dojump.c (do_compare_rtx_and_jump): Likewise. * gengtype-parse.c (type): Likewise. * simplify-rtx.c (simplify_gen_ternary, simplify_gen_relational, simplify_const_relational_operation): Likewise. * reload1.c (do_output_reload): Likewise. * dumpfile.c (get_dump_file_info_by_switch): Likewise. * gengtype.c (type_for_name): Likewise. * gimple-ssa-sprintf.c (format_directive): Likewise. ada/ * gcc-interface/trans.c (Loop_Statement_to_gnu): Replace Yoda conditions with typical order conditions. * gcc-interface/misc.c (gnat_get_array_descr_info, default_pass_by_ref): Likewise. * gcc-interface/decl.c (gnat_to_gnu_entity): Likewise. * adaint.c (__gnat_tmp_name): Likewise. c-family/ * known-headers.cc (get_stdlib_header_for_name): Replace Yoda conditions with typical order conditions. c/ * c-typeck.c (comptypes_internal, function_types_compatible_p, perform_integral_promotions, digest_init): Replace Yoda conditions with typical order conditions. * c-decl.c (check_bitfield_type_and_width): Likewise. cp/ * name-lookup.c (get_std_name_hint): Replace Yoda conditions with typical order conditions. * class.c (check_bitfield_decl): Likewise. * pt.c (convert_template_argument): Likewise. * decl.c (duplicate_decls): Likewise. * typeck.c (commonparms): Likewise. fortran/ * scanner.c (preprocessor_line): Replace Yoda conditions with typical order conditions. * dependency.c (check_section_vs_section): Likewise. * trans-array.c (gfc_conv_expr_descriptor): Likewise. jit/ * jit-playback.c (get_type, playback::compile_to_file::copy_file, playback::context::acquire_mutex): Replace Yoda conditions with typical order conditions. * libgccjit.c (gcc_jit_context_new_struct_type, gcc_jit_struct_set_fields, gcc_jit_context_new_union_type, gcc_jit_context_new_function, gcc_jit_timer_pop): Likewise. * jit-builtins.c (matches_builtin): Likewise. * jit-recording.c (recording::compound_type::set_fields, recording::fields::write_reproducer, recording::rvalue::set_scope, recording::function::validate): Likewise. * jit-logging.c (logger::decref): Likewise. From-SVN: r255831
2017-12-16Revert accidental commitRichard Sandiford1-47/+42
From-SVN: r255746
2017-12-16Add a gen_int_shift_amount helper functionRichard Sandiford1-42/+47
This patch adds a helper routine that constructs rtxes for constant shift amounts, given the mode of the value being shifted. As well as helping with the SVE patches, this is one step towards allowing CONST_INTs to have a real mode. One long-standing problem has been to decide what the mode of a shift count should be for arbitrary rtxes (as opposed to those directly tied to a target pattern). Realistic choices would be the mode of the shifted elements, word_mode, QImode, or the same mode as the shift optabs (in which case what should the mode be when the target doesn't have a pattern?) For now the patch picks the mode of the shifted elements, but with a ??? comment. 2017-11-06 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * emit-rtl.h (gen_int_shift_amount): Declare. * emit-rtl.c (gen_int_shift_amount): New function. * asan.c (asan_emit_stack_protection): Use gen_int_shift_amount instead of GEN_INT. * calls.c (shift_return_value): Likewise. * cse.c (fold_rtx): Likewise. * dse.c (find_shift_sequence): Likewise. * expmed.c (init_expmed_one_mode, store_bit_field_1, expand_shift_1) (expand_shift, expand_smod_pow2): Likewise. * lower-subreg.c (shift_cost): Likewise. * optabs.c (expand_superword_shift, expand_doubleword_mult) (expand_unop, expand_binop, shift_amt_for_vec_perm_mask) (expand_vec_perm_var): Likewise. * simplify-rtx.c (simplify_unary_operation_1): Likewise. (simplify_binary_operation_1): Likewise. * combine.c (try_combine, find_split_point, force_int_to_mode) (simplify_shift_const_1, simplify_shift_const): Likewise. (change_zero_ext): Likewise. Use simplify_gen_binary. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r255745
2017-12-13combine: Fix PR83393Segher Boessenkool1-1/+1
In move_deaths we move a REG_DEAD note if the instruction combination has extended the lifetime of a register so that the existing note is no longer valid. We find that note using reg_stat, but what that finds can refer to a later insn. If so, we cannot use the cached value. This patch implements that. PR rtl-optimization/83393 * combine.c (move_deaths): If reg_stat points to a too new insn in last_death, do not use it: find the proper insn instead. gcc/testsuite/ PR rtl-optimization/83393 * gcc.dg/pr83393.c: New testcase. From-SVN: r255606
2017-12-12[Patch combine] Don't create vector mode ZERO_EXTEND from subregsJames Greenhalgh1-1/+3
The code in simplify set to handle transforming the paradoxical subreg expression: (set FOO (subreg:M (mem:N BAR) 0)) in to: (set FOO (zero_extend:M (mem:N BAR))) Does not consider the case where M is a vector mode, allowing it to construct (for example): (zero_extend:V4SI (mem:SI)) For one, this has the wrong semantics - but fortunately we fail long before then in expand_compound_operation. We need to explicitly reject vector modes from this transformation. gcc/ * combine.c (simplify_set): Do not transform subregs to zero_extends if the destination mode is a vector mode. From-SVN: r255578
2017-12-12[SFN] boilerplate changes in preparation to introduce nonbind markersAlexandre Oliva1-6/+6
This patch introduces a number of new macros and functions that will be used to distinguish between different kinds of debug stmts, insns and notes, namely, preexisting debug bind ones and to-be-introduced nonbind markers. In a seemingly mechanical way, it adjusts several uses of the macros and functions, so that they refer to narrower categories when appropriate. These changes, by themselves, should not have any visible effect in the compiler behavior, since the upcoming debug markers are never created with this patch alone. for gcc/ChangeLog * gimple.h (enum gimple_debug_subcode): Add GIMPLE_DEBUG_BEGIN_STMT. (gimple_debug_begin_stmt_p): New. (gimple_debug_nonbind_marker_p): New. * tree.h (MAY_HAVE_DEBUG_MARKER_STMTS): New. (MAY_HAVE_DEBUG_BIND_STMTS): Renamed from.... (MAY_HAVE_DEBUG_STMTS): ... this. Check both. * insn-notes.def (BEGIN_STMT): New. * rtl.h (MAY_HAVE_DEBUG_MARKER_INSNS): New. (MAY_HAVE_DEBUG_BIND_INSNS): Renamed from.... (MAY_HAVE_DEBUG_INSNS): ... this. Check both. (NOTE_MARKER_LOCATION, NOTE_MARKER_P): New. (DEBUG_BIND_INSN_P, DEBUG_MARKER_INSN_P): New. (INSN_DEBUG_MARKER_KIND): New. (GEN_RTX_DEBUG_MARKER_BEGIN_STMT_PAT): New. (INSN_VAR_LOCATION): Check for VAR_LOCATION. (INSN_VAR_LOCATION_PTR): New. * cfgexpand.c (expand_debug_locations): Handle debug bind insns only. (expand_gimple_basic_block): Likewise. Emit debug temps for TER deps only if debug bind insns are enabled. (pass_expand::execute): Avoid deep TER and expand debug locations for debug bind insns only. * cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Narrow debug stmts special handling down to debug bind stmts. * combine.c (try_combine): Narrow debug insns special handling down to debug bind insns. * cse.c (delete_trivially_dead_insns): Handle debug bindings. Narrow debug insns preexisting special handling down to debug bind insns. * dce.c (rest_of_handle_ud_dce): Narrow debug insns special handling down to debug bind insns. * function.c (instantiate_virtual_regs): Skip debug markers, adjust handling of debug binds. * gimple-ssa-backprop.c (backprop::prepare_change): Try debug temp insertion iff MAY_HAVE_DEBUG_BIND_STMTS. * haifa-sched.c (schedule_insn): Narrow special handling of debug insns to debug bind insns. * ipa-param-manipulation.c (ipa_modify_call_arguments): Narrow special handling of debug stmts to debug bind stmts. * ipa-split.c (split_function): Likewise. * ira.c (combine_and_move_insns): Adjust debug bind insns only. * loop-unroll.c (apply_opt_in_copies): Adjust tests on bind debug insns. * reg-stack.c (convert_regs_1): Use DEBUG_BIND_INSN_P. * regrename.c (build_def_use): Likewise. * regcprop.c (copyprop_hardreg_forward_1): Likewise. (pass_cprop_hardreg): Narrow special casing of debug insns to debug bind insns. * regstat.c (regstat_init_n_sets_and_refs): Likewise. * reload1.c (reload): Likewise. * sese.c (sese_insert_phis_for_liveouts): Narrow special casing of debug stmts to debug bind stmts. * shrink-wrap.c (move_insn_for_shrink_wrap): Likewise. * ssa-iterators.h (num_imm_uses): Likewise. * tree-cfg.c (gimple_merge_blocks): Narrow special casing of debug stmts to debug bind stmts. * tree-inline.c (tree_function_versioning): Narrow special casing of debug stmts to debug bind stmts. * tree-loop-distribution.c (generate_loops_for_partition): Narrow special casing of debug stmts to debug bind stmts. * tree-sra.c (analyze_access_subtree): Narrow special casing of debug stmts to debug bind stmts. * tree-ssa-dce.c (remove_dead_stmt): Narrow special casing of debug stmts to debug bind stmts. * tree-ssa-loop-ivopt.c (remove_unused_ivs): Narrow special casing of debug stmts to debug bind stmts. * tree-ssa-reassoc.c (reassoc_remove_stmt): Likewise. * tree-ssa-tail-merge.c (tail_merge_optimize): Narrow special casing of debug stmts to debug bind stmts. * tree-ssa-threadedge.c (propagate_threaded_block_debug_info): Likewise. * tree-ssa.c (flush_pending_stmts): Narrow special casing of debug stmts to debug bind stmts. (gimple_replace_ssa_lhs): Likewise. (insert_debug_temp_for_var_def): Likewise. (insert_debug_temps_for_defs): Likewise. (reset_debug_uses): Likewise. * tree-ssanames.c (release_ssa_name_fn): Likewise. * tree-vect-loop-manip.c (adjust_debug_stmts_now): Likewise. (adjust_debug_stmts): Likewise. (adjust_phi_and_debug_stmts): Likewise. (vect_do_peeling): Likewise. * tree-vect-loop.c (vect_transform_loop): Likewise. * valtrack.c (propagate_for_debug): Use BIND_DEBUG_INSN_P. * var-tracking.c (adjust_mems): Narrow special casing of debug insns to debug bind insns. (dv_onepart_p, dataflow_set_clar_at_call, use_type): Likewise. (compute_bb_dataflow, vt_find_locations): Likewise. (vt_expand_loc, emit_notes_for_changes): Likewise. (vt_init_cfa_base): Likewise. (vt_emit_notes): Likewise. (vt_initialize): Likewise. (vt_finalize): Likewise. From-SVN: r255565
2017-12-11[PR80693] drop value of parallel SETs dropped by combineAlexandre Oliva1-0/+40
When combine drops a REG_UNUSED SET in a parallel, we have to clear cached values, so that, even if the REGs remain used (e.g. because they were referenced in the used SET_SRC), we will not use properties of the dropped modified value as if they applied to the preserved original one. We fail to adjust REG_N_SETS. for gcc/ChangeLog PR rtl-optimization/80693 PR rtl-optimization/81019 PR rtl-optimization/81020 * combine.c (distribute_notes): Reset any REG_UNUSED REGs that are not mentioned in i3. Place the REG_UNUSED note on i2, possibly modified to REG_DEAD, if it did not originate in i3. for gcc/testsuite/ChangeLog PR rtl-optimization/80693 PR rtl-optimization/81019 PR rtl-optimization/81020 * gcc.dg/pr80693.c: New. * gcc.dg/pr81019.c: New. From-SVN: r255554
2017-12-08combine: Fix PR83304Segher Boessenkool1-0/+20
In PR83304 two insns are combined, where the I2 uses a register that has a REG_DEAD note on an insn after I2 but before I3. In such a case move_deaths should move that death note. But move_deaths only looks at the reg_stat[regno].last_death insn, and that field can be zeroed out (previously, use_crosses_set_p would prevent the combination in this case). If the last_death field is zero it means "unknown", not "no death", so we have to find if there is a REG_DEAD note. PR rtl-optimization/83304 * combine.c (move_deaths): If we do not know where a register died, search for it. From-SVN: r255506
2017-12-04combine: Remove use_crosses_set_pSegher Boessenkool1-62/+7
This removes use_crosses_set_p, and uses modified_between_p instead everywhere it was used. This improves optimisation. * combine.c: Adjust comment. (use_crosses_set_p): Delete. (can_combine_p): Use modified_between_p instead of use_crosses_set_p. (try_combine): Ditto. From-SVN: r255384
2017-11-29combine: Print to dump if some insn cannot be combined into i3Segher Boessenkool1-6/+18
Eventually we should print the reason that any combination fails. This is a good start (these happen often). * combine.c (try_combine): Print a message to dump file whenever I0, I1, or I2 cannot be combined into I3. From-SVN: r255261
2017-11-29combine: Do not throw away unneeded arms of parallels (PR83156)Segher Boessenkool1-1/+2
The fix for PR82621 makes us not split an I2 if one of the results of those SETs is unused, since combine does not handle that properly. But this results in degradation for i386 (or more in general, for any target that does not have patterns for parallels with an unused result as a CLOBBER instead of a SET for that result). This patch instead makes us not split only if one of the results is set again before I3. That fixes PR83156 and also fixes PR82621. Unfortunately it undoes the nice optimisations that the previous patch did on powerpc. PR rtl-optimization/83156 PR rtl-optimization/82621 * combine.c (try_combine): Don't split an I2 if one of the dests is set again before I3. Allow unused dests. From-SVN: r255260
2017-11-25re PR rtl-optimization/81553 (ICE in immed_wide_int_const, at emit-rtl.c:607)Jakub Jelinek1-3/+7
PR rtl-optimization/81553 * combine.c (simplify_if_then_else): In (if_then_else COND (OP Z C1) Z) to (OP Z (mult COND (C1 * STORE_FLAG_VALUE))) optimization, if OP is a shift where C1 has different mode than the whole shift, use C1's mode for MULT rather than the shift's mode. * gcc.c-torture/compile/pr81553.c: New test. From-SVN: r255150
2017-11-17combine: Add added_notes_insnSegher Boessenkool1-7/+25
This patch makes combine reconsider insns it added notes to. This matters for example if the note is a REG_DEAD; without the note the setter of the register has to be kept around in the result of combinations, so it cannot be a 2->1 combination, and the cost of the result is higher than without that extra set, so try_combine may refuse the combination with the set, but allow it without the set. This fixes a regression for powerpc: pr69946.c has started to fail after the bitfield expansion changes. GCC used to generate lwz 3,0(9) rlwinm 3,3,12,20,23 ori 3,3,0x11 rotldi 3,3,52 bl bar but now it does lwz 3,0(9) rldicr 3,3,32,3 srdi 3,3,48 ori 3,3,0x110 sldi 3,3,48 bl bar (an instruction too many). After this patch it is lwz 3,0(9) rlwinm 3,3,16,16,19 ori 3,3,0x110 sldi 3,3,48 bl bar (the testcase still does not pass, it looks for very specific insns). * combine.c (added_notes_insn): New. (try_combine): Handle added_notes_insn like added_links_insn. Rewrite return value code. (distribute_notes): Set added_notes_insn to the earliest insn we added a note to. From-SVN: r254875
2017-11-17combine: Don't split insns if half is unused (PR82621)Segher Boessenkool1-1/+2
If we have a PARALLEL of two SETs, and one half is unused, we currently happily split that into two instructions (although the unused one is useless). Worse, as PR82621 shows, combine will happily merge this insn into I3 even if some intervening insn sets the same register again, which is wrong. This fixes it by not splitting PARALLELs with REG_UNUSED notes. It all is handled fine by combine in that case: just the "single set that is unused" case isn't handled properly. This also results in better code: combine will now actually throw away the unused SET. (It still won't do that in an I3). PR rtl-optimization/82621 * combine.c (try_combine): Do not split PARALLELs of two SETs if the dest of one of those SETs is unused. From-SVN: r254874
2017-11-03combine: Print insns we try to combineSegher Boessenkool1-0/+7
This adds some extra debug info to the dump file for combine: print the insns that are input to try_combine. I was worried printing more will make the dump file only harder to read, but especially the info from the REG_DEAD notes is invaluable. * combine (try_combine): Print the insns input to try_combine to the dump file. From-SVN: r254365
2017-11-01revert: combine.c (can_change_dest_mode): Reject changes in ↵Richard Sandiford1-6/+0
REGMODE_NATURAL_SIZE. 2017-11-01 Richard Sandiford <richard.sandiford@linaro.org> gcc/ Revert accidental duplicate: * combine.c (can_change_dest_mode): Reject changes in REGMODE_NATURAL_SIZE. From-SVN: r254316
2017-11-01combine: Fix bug in giving up placing REG_DEAD notes (PR82683)Segher Boessenkool1-5/+11
When we have a REG_DEAD note for a reg that is set in the new I2, we drop the note on the floor (we cannot find whether to place it on I2 or on I3). But the code I added to do this has a bug and does not always actually drop it. This patch fixes it. But that on its own is too pessimistic, it turns out, and we generate worse code. One case where we do know where to place the note is if it came from I3 (it should go to I3 again). Doing this fixes all of the regressions. PR rtl-optimization/64682 PR rtl-optimization/69567 PR rtl-optimization/69737 PR rtl-optimization/82683 * combine.c (distribute_notes) <REG_DEAD>: If the new I2 sets the same register mentioned in the note, drop the note, unless it came from I3, in which case it should go to I3 again. From-SVN: r254315
2017-11-01Prevent invalid register mode changes in combineRichard Sandiford1-0/+6
This patch stops combine from changing the mode of an existing register in-place if doing so would change the size of the underlying register allocation size, as given by REGMODE_NATURAL_SIZE. Without this, many tests fail in adjust_reg_mode after SVE is added. One example is gcc.c-torture/compile/20090401-1.c. 2017-11-01 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * combine.c (can_change_dest_mode): Reject changes in REGMODE_NATURAL_SIZE. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r254291
2017-10-26Make more use of df_read_modify_subreg_pRichard Sandiford1-14/+5
This patch uses df_read_modify_subreg_p to check whether writing to a subreg would preserve some of the existing contents. This has the effect of putting more emphasis on the REGMODE_NATURAL_SIZE-based definition of whether something can be partially modified, instead of using UNITS_PER_WORD unconditionally. This becomes important for SVE, where UNITS_PER_WORD has no significance for subregs of multi-register LD2/ST2, LD3/ST3 and LD4/ST4 tuples. 2017-10-26 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * caller-save.c (mark_referenced_regs): Use read_modify_subreg_p. * combine.c (find_single_use_1): Likewise. (expand_field_assignment): Likewise. (move_deaths): Likewise. * lra-constraints.c (simplify_operand_subreg): Likewise. (curr_insn_transform): Likewise. * lra.c (collect_non_operand_hard_regs): Likewise. (add_regs_to_insn_regno_info): Likewise. * rtlanal.c (reg_referenced_p): Likewise. (covers_regno_no_parallel_p): Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r254110
2017-10-23Fix HWI + -unsigned in combine.cRichard Sandiford1-7/+3
rtx_equal_for_field_assignment_p had: x = adjust_address_nv (x, GET_MODE (y), -subreg_lowpart_offset (GET_MODE (x), GET_MODE (y))); But subreg_lowpart_offset returns an unsigned int and adjust_address_nv takes a HWI, so a subreg offset of 4 would give a memory offset of 0x00000000fffffffffc. 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org> gcc/ * combine.c (rtx_equal_for_field_assignment_p): Use byte_lowpart_offset. From-SVN: r253997
2017-10-22Make more use of GET_MODE_UNIT_PRECISIONRichard Sandiford1-1/+1
This patch is like the earlier GET_MODE_UNIT_SIZE one, but for precisions rather than sizes. There is one behavioural change in expand_debug_expr: we shouldn't use lowpart subregs for non-scalar truncations, since that would just reinterpret some of the scalars and drop the rest. (This probably doesn't trigger in practice.) Using TRUNCATE is fine for scalars, since simplify_gen_unary knows when a subreg can be used. 2017-10-22 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * cfgexpand.c (expand_debug_expr): Use GET_MODE_UNIT_PRECISION. (expand_debug_source_expr): Likewise. * combine.c (combine_simplify_rtx): Likewise. * cse.c (fold_rtx): Likewise. * optabs.c (expand_float): Likewise. * simplify-rtx.c (simplify_unary_operation_1): Likewise. (simplify_binary_operation_1): Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r253991
2017-10-22Make more use of HWI_COMPUTABLE_MODE_PRichard Sandiford1-3/+2
This patch uses HWI_COMPUTABLE_MODE_P (X) instead of GET_MODE_PRECISION (X) <= HOST_BITS_PER_WIDE_INT in cases where X also needs to be a scalar integer. 2017-10-22 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * combine.c (simplify_comparison): Use HWI_COMPUTABLE_MODE_P. (record_promoted_value): Likewise. * expr.c (expand_expr_real_2): Likewise. * ree.c (update_reg_equal_equiv_notes): Likewise. (combine_set_extension): Likewise. * rtlanal.c (low_bitmask_len): Likewise. * simplify-rtx.c (neg_const_int): Likewise. (simplify_binary_operation_1): Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r253990
2017-10-20Add generic part for Intel CET enabling. The spec is available atIgor Tsimbalist1-0/+1
https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf A proposal is to introduce a target independent flag -fcf-protection=[none|branch|return|full] with a semantic to instrument a code to control validness or integrity of control-flow transfers using jump and call instructions. The main goal is to detect and block a possible malware execution through transfer the execution to unknown target address. Implementation could be either software or target based. Any target platforms can provide their implementation for instrumentation under this option. The compiler should instrument any control-flow transfer points in a program (ex. call/jmp/ret) as well as any landing pads, which are targets of control-flow transfers. A new 'nocf_check' attribute is introduced to provide hand tuning support. The attribute directs the compiler to skip a call to a function and a function's landing pad from instrumentation. The attribute can be used for function and pointer to function types, otherwise it will be ignored. Currently all platforms except i386 will report the error and do no instrumentation. i386 will provide the implementation based on a specification published by Intel for a new technology called Control-flow Enforcement Technology (CET). gcc/c-family/ * c-attribs.c (handle_nocf_check_attribute): New function. (c_common_attribute_table): Add 'nocf_check' handling. gcc/c/ * gimple-parser.c: Add second argument NULL to gimple_build_call_from_tree. gcc/ * attrib.c (comp_type_attributes): Check nocf_check attribute. * cfgexpand.c (expand_call_stmt): Set REG_CALL_NOCF_CHECK for call insn. * combine.c (distribute_notes): Add REG_CALL_NOCF_CHECK handling. * common.opt: Add fcf-protection flag. * emit-rtl.c (try_split): Add REG_CALL_NOCF_CHECK handling. * flag-types.h: Add enum cf_protection_level. * gimple.c (gimple_build_call_from_tree): Add second parameter. Add 'nocf_check' attribute propagation to gimple call. * gimple.h (gf_mask): Add GF_CALL_NOCF_CHECK. (gimple_build_call_from_tree): Update prototype. (gimple_call_nocf_check_p): New function. (gimple_call_set_nocf_check): Likewise. * gimplify.c: Add second argument to gimple_build_call_from_tree. * ipa-icf.c: Add nocf_check attribute in statement hash. * recog.c (peep2_attempt): Add REG_CALL_NOCF_CHECK handling. * reg-notes.def: Add REG_NOTE (CALL_NOCF_CHECK). * toplev.c (process_options): Add flag_cf_protection handling. From-SVN: r253936
2017-10-18Fix -Wimplicit-fallthrough in combine.cMartin Liska1-0/+2
2017-10-18 Martin Liska <mliska@suse.cz> * combine.c (simplify_compare_const): Add gcc_fallthrough. From-SVN: r253853
2017-10-13Prevent invalid register mode changes in combineRichard Sandiford1-0/+6
This patch stops combine from changing the mode of an existing register in-place if doing so would change the size of the underlying register allocation size, as given by REGMODE_NATURAL_SIZE. Without this, many tests fail in adjust_reg_mode after SVE is added. One example is gcc.c-torture/compile/20090401-1.c. 2017-10-13 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * combine.c (can_change_dest_mode): Reject changes in REGMODE_NATURAL_SIZE. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r253717
2017-10-13Make more use of GET_MODE_UNIT_BITSIZERichard Sandiford1-1/+2
This patch is like the previous GET_MODE_UNIT_SIZE one, but for bit rather than byte sizes. 2017-10-13 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * cfgexpand.c (expand_debug_expr): Use GET_MODE_UNIT_BITSIZE. (expand_debug_source_expr): Likewise. * combine.c (combine_simplify_rtx): Likewise. * cse.c (fold_rtx): Likewise. * fwprop.c (canonicalize_address): Likewise. * targhooks.c (default_shift_truncation_mask): Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r253716
2017-10-13Make more use of byte_lowpart_offsetRichard Sandiford1-10/+1
This patch uses byte_lowpart_offset in places that open-coded the calculation. 2017-10-13 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * caller-save.c (replace_reg_with_saved_mem): Use byte_lowpart_offset. * combine.c (gen_lowpart_for_combine): Likewise. * dwarf2out.c (rtl_for_decl_location): Likewise. * final.c (alter_subreg): Likewise. * rtlhooks.c (gen_lowpart_general): Likewise. (gen_lowpart_if_possible): Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r253714
2017-10-09combine: Use insn_cost instead of pattern_cost everywhereSegher Boessenkool1-6/+25
* combine.c (combine_validate_cost): Compute the new insn_cost, not just pattern_cost. (try_combine): Adjust comment. From-SVN: r253561
2017-10-09Replace insn_rtx_cost with insn_cost and pattern_costSegher Boessenkool1-9/+8
This replaces insn_rtx_cost with insn_cost if an insn is readily available, and with pattern_cost otherwise. * cfgrtl.c (rtl_account_profile_record): Replace insn_rtx_cost with insn_cost. * combine.c (uid_insn_cost): Adjust comment. (combine_validate_cost): Adjust comment. Use pattern_cost instead of insn_rtx_cost (combine_instructions): Use insn_cost instead of insn_rtx_cost. * dse.c (find_shift_sequence): Ditto. * ifcvt.c (cheap_bb_rtx_cost_p): Ditto. (bb_valid_for_noce_process_p): Use pattern_cost. * rtl.h (insn_rtx_cost): Delete. (pattern_cost): New prototype. (insn_cost): New prototype. * rtlanal.c (insn_rtx_cost): Rename to... (pattern_cost): ... this. (insn_cost): New. From-SVN: r253560