path: root/gcc/combine.c
Age | Commit message | Author | Files | Lines
2018-11-20 | re PR rtl-optimization/85925 (compilation of masking with 257 goes wrong in combine at -02) | Eric Botcazou | 1 | -0/+1
PR rtl-optimization/85925
* rtl.h (word_register_operation_p): New predicate.
* combine.c (record_dead_and_set_regs_1): Only apply specific handling for WORD_REGISTER_OPERATIONS targets to word_register_operation_p RTX.
* rtlanal.c (nonzero_bits1): Likewise. Adjust couple of comments. (num_sign_bit_copies1): Likewise.
From-SVN: r266302
2018-11-11 | combine: More make_more_copies | Segher Boessenkool | 1 | -5/+4
This makes make_more_copies do what its documentation says, that is, only make an intermediate pseudo if copying to a pseudo. This regressed generated code quality when we didn't keep the original notes that were on the copy, but since r265582 we do, and only allowing pseudos now is a win. It also simplifies the code.
* combine.c (make_more_copies): Only make an intermediate copy if the dest of a move is a pseudo.
From-SVN: r266004
2018-11-05 | combine: Don't make an intermediate reg for assigning to sfp (PR87871) | Segher Boessenkool | 1 | -0/+3
The code with an intermediate register is perfectly fine, but LRA apparently cannot handle the resulting code, or perhaps something else is wrong. In either case, making an extra temporary will not likely help here, so let's just skip it.
PR rtl-optimization/87871
* combine.c (make_more_copies): Skip if dest is frame_pointer_rtx.
From-SVN: r265821
2018-10-29 | combine: Fix various shortcomings in make_more_copies (PR87701, PR87780) | Segher Boessenkool | 1 | -10/+5
This rewrites most of make_more_copies, in the process fixing a few PRs and some other bugs, and working around a few target problems. Certain notes turn out to actually change the meaning of the RTL, so we cannot drop them; and i386 takes subregs of hard regs.
PR rtl-optimization/87701
PR rtl-optimization/87780
* combine.c (make_more_copies): Rewrite.
From-SVN: r265582
2018-10-25 | combine: Don't do make_more_copies for dest PC (PR87720) | Segher Boessenkool | 1 | -0/+2
Jumps are written in RTL as moves to PC. But the latter has no mode, so we shouldn't try to use it. Since the optimization this routine performs does not really help for jumps at all, let's just skip it.
PR rtl-optimization/87720
* combine.c (make_more_copies): Skip if the dest is pc_rtx.
From-SVN: r265474
2018-10-22 | combine: Do not combine moves from hard registers | Segher Boessenkool | 1 | -4/+46
On most targets every function starts with moves from the parameter passing (hard) registers into pseudos. Similarly, after every call there is a move from the return register into a pseudo. These moves usually combine with later instructions (leaving pretty much the same instruction, just with a hard reg instead of a pseudo). This isn't a good idea. Register allocation can get rid of unnecessary moves just fine, and moving the parameter passing registers into many later instructions tends to prevent good register allocation.
This patch disallows combining moves from a hard (non-fixed) register. This also avoids the problem mentioned in PR87600 #c3 (combining hard registers into inline assembler is problematic).
Because the register move can often be combined with other instructions *itself*, for example for setting some condition code, this patch adds extra copies via new pseudos after every copy-from-hard-reg.
On some targets this reduces average code size. On others it increases it a bit, 0.1% or 0.2% or so. (I tested this on all *-linux targets).
PR rtl-optimization/87600
* combine.c: Add include of expr.h. (cant_combine_insn_p): Do not combine moves from any hard non-fixed register to a pseudo. (make_more_copies): New function, add a copy to a new pseudo after the moves from hard registers into pseudos. (rest_of_handle_combine): Declare rebuild_jump_labels_after_combine later. Call make_more_copies.
From-SVN: r265398
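The idea in the entry above can be illustrated with a toy model. This is only a sketch and not the code in combine.c: the `Move` class, the `FIRST_PSEUDO` boundary, the `FIXED_REGS` set and both function bodies are assumptions made up for illustration. It shows a move from a hard register into a pseudo being routed through a fresh pseudo, so that the copy from the hard register is left alone while the new pseudo-to-pseudo copy remains available for combination.

```python
from dataclasses import dataclass

FIRST_PSEUDO = 64   # assumption: register numbers below this are hard registers
FIXED_REGS = {1}    # assumption: hard registers that are fixed (e.g. a stack pointer)


@dataclass
class Move:
    """A toy 'dest = src' register move; not GCC's RTL representation."""
    dest: int
    src: int


def is_hard(reg):
    return reg < FIRST_PSEUDO


def cant_combine(insn):
    """Toy rule: never combine a move from a hard non-fixed register to a pseudo."""
    return is_hard(insn.src) and insn.src not in FIXED_REGS and not is_hard(insn.dest)


def make_more_copies(insns):
    """Route every hard-reg -> pseudo move through a freshly numbered pseudo."""
    used = [reg for insn in insns for reg in (insn.dest, insn.src)]
    next_pseudo = max([FIRST_PSEUDO - 1] + used) + 1
    out = []
    for insn in insns:
        if cant_combine(insn):
            tmp = next_pseudo
            next_pseudo += 1
            out.append(Move(tmp, insn.src))    # copy from the hard register stays put
            out.append(Move(insn.dest, tmp))   # pseudo-to-pseudo copy may still combine
        else:
            out.append(insn)
    return out
```

In this model, `make_more_copies([Move(100, 3)])` yields a copy from hard register 3 into a new pseudo followed by a copy from that pseudo into register 100; only the first of the two is rejected by `cant_combine`.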
2018-09-19 | combine: Use correct mode in new comparison (PR86902) | Segher Boessenkool | 1 | -2/+2
This code in try_combine uses the wrong mode. This fails (with RTL checking) in trunk, but not in any released branches.
PR rtl-optimization/86902
* combine.c (try_combine): When changing the CC mode used, don't change an unrelated mode in other_insn to that new CC mode.
From-SVN: r264426
2018-08-27 | re PR rtl-optimization/87065 (combine causes ICE in trunc_int_for_mode) | Jakub Jelinek | 1 | -5/+6
PR rtl-optimization/87065
* combine.c (simplify_if_then_else): Formatting fix. (if_then_else_cond): Guard MULT optimization with SCALAR_INT_MODE_P check. (known_cond): Don't return const_true_rtx for vector modes. Use CONST0_RTX instead of const0_rtx. Formatting fixes.
* gcc.target/i386/pr87065.c: New test.
From-SVN: r263872
2018-08-22 | combine: Do another check before splitting a parallel (PR86771) | Segher Boessenkool | 1 | -2/+8
When combine splits a resulting parallel into its two SETs, it has to place one at i2, and the other stays at i3. This does not work if the destination of the SET that will be placed at i2 is modified between i2 and i3. This patch fixes it.
* combine.c (try_combine): Do not allow splitting a resulting PARALLEL of two SETs into those two SETs, one to be placed at i2, if that SET's destination is modified between i2 and i3.
From-SVN: r263776
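The condition described in this entry can be sketched as follows. This is a toy model under stated assumptions, not the check in try_combine: instructions are modelled as dictionaries with a hypothetical `"sets"` field, and both function names are invented for illustration.

```python
def modified_between(insns, start, end, reg):
    """True if any instruction strictly between positions start and end sets reg."""
    return any(reg in insn["sets"] for insn in insns[start + 1:end])


def can_split_parallel(insns, i2, i3, dest_for_i2):
    """Moving one SET up to i2 is only safe if nothing between i2 and i3
    overwrites the register that SET writes."""
    return not modified_between(insns, i2, i3, dest_for_i2)
```

For a three-instruction sequence where the middle instruction sets `r11`, `can_split_parallel(insns, 0, 2, "r11")` is false in this model, because the value written at i2 would be clobbered before i3 is reached.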
2018-08-06 | Remaining support for clobber high | Alan Hayward | 1 | -5/+33
gcc/ * alias.c (record_set): Check for clobber high. * cfgexpand.c (expand_gimple_stmt): Likewise. * combine-stack-adj.c (single_set_for_csa): Likewise. * combine.c (find_single_use_1): Likewise. (set_nonzero_bits_and_sign_copies): Likewise. (get_combine_src_dest): Likewise. (is_parallel_of_n_reg_sets): Likewise. (try_combine): Likewise. (record_dead_and_set_regs_1): Likewise. (reg_dead_at_p_1): Likewise. (reg_dead_at_p): Likewise. * dce.c (deletable_insn_p): Likewise. (mark_nonreg_stores_1): Likewise. (mark_nonreg_stores_2): Likewise. * df-scan.c (df_find_hard_reg_defs): Likewise. (df_uses_record): Likewise. (df_get_call_refs): Likewise. * dwarf2out.c (mem_loc_descriptor): Likewise. * haifa-sched.c (haifa_classify_rtx): Likewise. * ira-build.c (create_insn_allocnos): Likewise. * ira-costs.c (scan_one_insn): Likewise. * ira.c (equiv_init_movable_p): Likewise. (rtx_moveable_p): Likewise. (interesting_dest_for_shprep): Likewise. * jump.c (mark_jump_label_1): Likewise. * postreload-gcse.c (record_opr_changes): Likewise. * postreload.c (reload_cse_simplify): Likewise. (struct reg_use): Add source expr. (reload_combine): Check for clobber high. (reload_combine_note_use): Likewise. (reload_cse_move2add): Likewise. (move2add_note_store): Likewise. * print-rtl.c (print_pattern): Likewise. * recog.c (decode_asm_operands): Likewise. (store_data_bypass_p): Likewise. (if_test_bypass_p): Likewise. * regcprop.c (kill_clobbered_value): Likewise. (kill_set_value): Likewise. * reginfo.c (reg_scan_mark_refs): Likewise. * reload1.c (maybe_fix_stack_asms): Likewise. (eliminate_regs_1): Likewise. (elimination_effects): Likewise. (mark_not_eliminable): Likewise. (scan_paradoxical_subregs): Likewise. (forget_old_reloads_1): Likewise. * reorg.c (find_end_label): Likewise. (try_merge_delay_insns): Likewise. (redundant_insn): Likewise. (own_thread_p): Likewise. (fill_simple_delay_slots): Likewise. (fill_slots_from_thread): Likewise. (dbr_schedule): Likewise. 
* resource.c (update_live_status): Likewise. (mark_referenced_resources): Likewise. (mark_set_resources): Likewise. * rtl.c (copy_rtx): Likewise. * rtlanal.c (reg_referenced_p): Likewise. (single_set_2): Likewise. (noop_move_p): Likewise. (note_stores): Likewise. * sched-deps.c (sched_analyze_reg): Likewise. (sched_analyze_insn): Likewise. From-SVN: r263331
2018-07-30 | combine: Allow combining two insns to two insns | Segher Boessenkool | 1 | -2/+20
This patch allows combine to combine two insns into two. This helps in many cases, by reducing instruction path length, and also allowing further combinations to happen. PR85160 is a typical example of code that it can improve.
This patch does not allow such combinations if either of the original instructions was a simple move instruction. In those cases combining the two instructions increases register pressure without improving the code. With this move test, register pressure no longer increases noticeably as far as I can tell.
(At first I also didn't allow either of the resulting insns to be a move instruction. But that is actually a very good thing to have, as should have been obvious).
PR rtl-optimization/85160
* combine.c (is_just_move): New function. (try_combine): Allow combining two instructions into two if neither of the original instructions was a move.
From-SVN: r263067
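The move test from this entry can be sketched as below. This is a deliberately simplified toy: the real is_just_move in combine.c inspects RTL, whereas here an instruction is a dictionary with a hypothetical `"op"` field, and `allow_two_to_two` is an invented name for the decision the entry describes.

```python
def is_just_move(insn):
    """Toy stand-in: an instruction counts as a simple move if its op says so."""
    return insn["op"] == "move"


def allow_two_to_two(i2, i3):
    """Permit a 2->2 combination only if neither original instruction is a plain move."""
    return not is_just_move(i2) and not is_just_move(i3)
```

Note that the test applies only to the original instructions; as the entry says, a move among the resulting instructions is welcome.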
2018-07-26 | combine: Another hard register problem (PR85805) | Segher Boessenkool | 1 | -1/+2
The current code in reg_nonzero_bits_for_combine allows using the reg_stat info when last_set_mode is a different integer mode. This is completely wrong for non-pseudos. For example, as in the PR, a value in a DImode hard register is set by eight writes to its constituent QImode parts. The value written to the DImode is not the same as that written to the lowest-numbered QImode!
PR rtl-optimization/85805
* combine.c (reg_nonzero_bits_for_combine): Only use the last set value for hard registers if that was written in the same mode.
From-SVN: r262994
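A small numeric illustration of the problem, as a sketch only. It assumes a little-endian layout in which a 64-bit (DImode-like) value occupies eight consecutive 8-bit (QImode-like) registers; the function names and the string mode names are hypothetical and do not mirror the real reg_stat bookkeeping.

```python
def di_value(byte_regs):
    """Assemble a 64-bit value from eight byte-sized registers, lowest-numbered first."""
    return sum((b & 0xFF) << (8 * i) for i, b in enumerate(byte_regs))


def may_use_last_set_value(is_hard_reg, last_set_mode, wanted_mode):
    """Toy rule from the entry: for hard registers, trust the recorded value
    only if it was written in the same mode that is being asked about."""
    if last_set_mode == wanted_mode:
        return True
    return not is_hard_reg


# Eight separate byte writes fill one 64-bit hard register.
parts = [0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88]
full = di_value(parts)
```

Here the value recorded for the lowest-numbered part is `0x11`, while the full register holds `0x8877665544332211`, so treating the former as the nonzero bits of the latter would be wrong.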
2018-06-12 | Use poly_int rtx accessors instead of hwi accessors | Richard Sandiford | 1 | -4/+8
This patch generalises various places that used hwi rtx accessors so that they can handle poly_ints instead. In many cases these changes are by inspection rather than because something had shown them to be necessary.
2018-06-12 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* poly-int.h (can_div_trunc_p): Add new overload in which all values are poly_ints.
* alias.c (get_addr): Extend CONST_INT handling to poly_int_rtx_p. (memrefs_conflict_p): Likewise. (init_alias_analysis): Likewise.
* cfgexpand.c (expand_debug_expr): Likewise.
* combine.c (combine_simplify_rtx, force_int_to_mode): Likewise.
* cse.c (fold_rtx): Likewise.
* explow.c (adjust_stack, anti_adjust_stack): Likewise.
* expr.c (emit_block_move_hints): Likewise. (clear_storage_hints, push_block, emit_push_insn): Likewise. (store_expr_with_bounds, reduce_to_bit_field_precision): Likewise. (emit_group_load_1): Use rtx_to_poly_int64 for group offsets. (emit_group_store): Likewise. (find_args_size_adjust): Use strip_offset. Use rtx_to_poly_int64 to read the PRE/POST_MODIFY increment.
* calls.c (store_one_arg): Use strip_offset.
* rtlanal.c (rtx_addr_can_trap_p_1): Extend CONST_INT handling to poly_int_rtx_p. (set_noop_p): Use rtx_to_poly_int64 for the elements selected by a VEC_SELECT.
* simplify-rtx.c (avoid_constant_pool_reference): Use strip_offset. (simplify_binary_operation_1): Extend CONST_INT handling to poly_int_rtx_p.
* var-tracking.c (compute_cfa_pointer): Take a poly_int64 rather than a HOST_WIDE_INT. (hard_frame_pointer_adjustment): Change from HOST_WIDE_INT to poly_int64. (adjust_mems, add_stores): Update accordingly. (vt_canonicalize_addr): Track polynomial offsets. (emit_note_insn_var_location): Likewise. (vt_add_function_parameter): Likewise. (vt_initialize): Likewise.
From-SVN: r261530
2018-04-10 | re PR rtl-optimization/85300 (ICE in exact_int_to_float_conversion_p, at simplify-rtx.c:895) | Jakub Jelinek | 1 | -4/+8
PR rtl-optimization/85300
* combine.c (subst): Handle subst of CONST_SCALAR_INT_P new_rtx also into FLOAT and UNSIGNED_FLOAT like ZERO_EXTEND, return a CLOBBER if simplify_unary_operation fails.
* gcc.dg/pr85300.c: New test.
From-SVN: r259285
2018-03-14 | combine: Don't make log_links for pc_rtx (PR84780 #c10) | Segher Boessenkool | 1 | -0/+3
distribute_links tries to place a log_link for whatever the destination of the modified instruction is. It shouldn't do that when that dest is pc_rtx, which isn't actually a register.
* combine.c (distribute_links): Don't make a link based on pc_rtx.
From-SVN: r258523
2018-03-12 | combine: Fix PR84780 (more LOG_LINKS trouble) | Segher Boessenkool | 1 | -0/+1
There still are situations where we have stale LOG_LINKS. This causes combine to try two-insn combinations I2->I3 where the register set by I2 is used before I3 as well. Not good. This patch fixes it by checking for this situation in can_combine_p (similar to what we already do for three and four insn combinations).
From-SVN: r258452
2018-03-06 | re PR target/84710 (ICE: RTL check: expected code 'reg', have 'subreg' in rhs_regno, at rtl.h:1896 with -O -fno-forward-propagate) | Jakub Jelinek | 1 | -6/+2
PR target/84710
* combine.c (try_combine): Use reg_or_subregno instead of handling just paradoxical SUBREGs and REGs.
* gcc.dg/pr84710.c: New test.
From-SVN: r258301
2018-03-05 | re PR target/84700 (ICE on 32-bit BE powerpc targets w/ -misel -O1) | Jakub Jelinek | 1 | -1/+5
PR target/84700
* combine.c (combine_simplify_rtx): Don't try to simplify if if_then_else_cond returned non-NULL, but either true_rtx or false_rtx are equal to x.
* gcc.target/powerpc/pr84700.c: New test.
From-SVN: r258263
2018-03-02 | re PR rtl-optimization/84614 (wrong code with u16->u128 extension at aarch64 -fno-split-wide-types -g3 --param=max-combine-insns=3) | Jakub Jelinek | 1 | -2/+2
PR target/84614
* rtl.h (prev_real_nondebug_insn, next_real_nondebug_insn): New prototypes.
* emit-rtl.c (next_real_insn, prev_real_insn): Fix up function comments. (next_real_nondebug_insn, prev_real_nondebug_insn): New functions.
* cfgcleanup.c (try_head_merge_bb): Use prev_real_nondebug_insn instead of a loop around prev_real_insn.
* combine.c (move_deaths): Use prev_real_nondebug_insn instead of prev_real_insn.
* gcc.dg/pr84614.c: New test.
From-SVN: r258129
2018-02-16 | combine: Fix problem with RTL checking | Segher Boessenkool | 1 | -1/+6
As Jakub found, after my recent combine patch at least on x86 problems show up with RTL checking enabled. This is because the I2 generated by a successful instruction combination can write not only a register but it can also write a paradoxical subreg of one. This fixes it.
* combine.c (try_combine): When adjusting LOG_LINKS for the destination that moved to I2, also allow destinations that are a paradoxical subreg (instead of a normal reg).
From-SVN: r257736
2018-02-13 | combine: Update links correctly for new I2 (PR84169) | Segher Boessenkool | 1 | -26/+31
If there is a LOG_LINK between two insns, this means those two insns can be combined, as far as dataflow is concerned. There never should be a LOG_LINK between two unrelated insns. If there is one, combine will try to combine the insns without doing all the needed checks if the earlier destination is used before the later insn, etc.
Unfortunately we do not update the LOG_LINKs correctly in some cases. This patch fixes at least some of those cases.
PR rtl-optimization/84169
* combine.c (try_combine): New variable split_i2i3. Set it to true if we generated a parallel as new i3 and we split that to new i2 and i3 instructions. Handle split_i2i3 similar to swap_i2i3: scan the LOG_LINKs of i3 to see which of those need to link to i2 now. Link those to i2, not i1. Partially rewrite this scan code.
gcc/testsuite/
PR rtl-optimization/84169
* gcc.c-torture/execute/pr84169.c: New.
From-SVN: r257644
2018-02-01 | re PR rtl-optimization/84157 ([nvptx] ICE: RTL check: expected code 'reg', have 'lshiftrt') | Uros Bizjak | 1 | -2/+2
PR rtl-optimization/84157
* combine.c (change_zero_ext): Use REG_P predicate in front of HARD_REGISTER_P predicate.
From-SVN: r257302
2018-01-31 | re PR rtl-optimization/84123 (internal compiler error: in gen_rtx_SUBREG, at emit-rtl.c:908, alpha linux.) | Uros Bizjak | 1 | -2/+15
PR rtl-optimization/84123
* combine.c (change_zero_ext): Check if hard register satisfies can_change_dest_mode before calling gen_lowpart_SUBREG.
From-SVN: r257270
2018-01-31 | re PR rtl-optimization/84071 (wrong elimination of zero-extension after sign-extended load) | Eric Botcazou | 1 | -5/+12
PR rtl-optimization/84071
* combine.c (record_dead_and_set_regs_1): Record the source unmodified for a paradoxical SUBREG on a WORD_REGISTER_OPERATIONS target.
From-SVN: r257224
2018-01-09 | re PR rtl-optimization/83628 (performance regression when accessing arrays on alpha) | Uros Bizjak | 1 | -1/+1
PR target/83628
* combine.c (force_int_to_mode) <case ASHIFT>: Use mode instead of op_mode in the force_to_mode call.
From-SVN: r256387
2018-01-03 | poly_int: GET_MODE_SIZE | Richard Sandiford | 1 | -16/+10
This patch changes GET_MODE_SIZE from unsigned short to poly_uint16. The non-mechanical parts were handled by previous patches. 2018-01-03 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * machmode.h (mode_size): Change from unsigned short to poly_uint16_pod. (mode_to_bytes): Return a poly_uint16 rather than an unsigned short. (GET_MODE_SIZE): Return a constant if ONLY_FIXED_SIZE_MODES, or if measurement_type is not polynomial. (fixed_size_mode::includes_p): Check for constant-sized modes. * genmodes.c (emit_mode_size_inline): Make mode_size_inline return a poly_uint16 rather than an unsigned short. (emit_mode_size): Change the type of mode_size from unsigned short to poly_uint16_pod. Use ZERO_COEFFS for the initializer. (emit_mode_adjustments): Cope with polynomial vector sizes. * lto-streamer-in.c (lto_input_mode_table): Use bp_unpack_poly_value for GET_MODE_SIZE. * lto-streamer-out.c (lto_write_mode_table): Use bp_pack_poly_value for GET_MODE_SIZE. * auto-inc-dec.c (try_merge): Treat GET_MODE_SIZE as polynomial. * builtins.c (expand_ifn_atomic_compare_exchange_into_call): Likewise. * caller-save.c (setup_save_areas): Likewise. (replace_reg_with_saved_mem): Likewise. * calls.c (emit_library_call_value_1): Likewise. * combine-stack-adj.c (combine_stack_adjustments_for_block): Likewise. * combine.c (simplify_set, make_extraction, simplify_shift_const_1) (gen_lowpart_for_combine): Likewise. * convert.c (convert_to_integer_1): Likewise. * cse.c (equiv_constant, cse_insn): Likewise. * cselib.c (autoinc_split, cselib_hash_rtx): Likewise. (cselib_subst_to_values): Likewise. * dce.c (word_dce_process_block): Likewise. * df-problems.c (df_word_lr_mark_ref): Likewise. * dwarf2cfi.c (init_one_dwarf_reg_size): Likewise. * dwarf2out.c (multiple_reg_loc_descriptor, mem_loc_descriptor) (concat_loc_descriptor, concatn_loc_descriptor, loc_descriptor) (rtl_for_decl_location): Likewise. 
* emit-rtl.c (gen_highpart, widen_memory_access): Likewise. * expmed.c (extract_bit_field_1, extract_integral_bit_field): Likewise. * expr.c (emit_group_load_1, clear_storage_hints): Likewise. (emit_move_complex, emit_move_multi_word, emit_push_insn): Likewise. (expand_expr_real_1): Likewise. * function.c (assign_parm_setup_block_p, assign_parm_setup_block) (pad_below): Likewise. * gimple-fold.c (optimize_atomic_compare_exchange_p): Likewise. * gimple-ssa-store-merging.c (rhs_valid_for_store_merging_p): Likewise. * ira.c (get_subreg_tracking_sizes): Likewise. * ira-build.c (ira_create_allocno_objects): Likewise. * ira-color.c (coalesced_pseudo_reg_slot_compare): Likewise. (ira_sort_regnos_for_alter_reg): Likewise. * ira-costs.c (record_operand_costs): Likewise. * lower-subreg.c (interesting_mode_p, simplify_gen_subreg_concatn) (resolve_simple_move): Likewise. * lra-constraints.c (get_reload_reg, operands_match_p): Likewise. (process_addr_reg, simplify_operand_subreg, curr_insn_transform) (lra_constraints): Likewise. (CONST_POOL_OK_P): Reject variable-sized modes. * lra-spills.c (slot, assign_mem_slot, pseudo_reg_slot_compare) (add_pseudo_to_slot, lra_spill): Likewise. * omp-low.c (omp_clause_aligned_alignment): Likewise. * optabs-query.c (get_best_extraction_insn): Likewise. * optabs-tree.c (expand_vec_cond_expr_p): Likewise. * optabs.c (expand_vec_perm_var, expand_vec_cond_expr): Likewise. (expand_mult_highpart, valid_multiword_target_p): Likewise. * recog.c (offsettable_address_addr_space_p): Likewise. * regcprop.c (maybe_mode_change): Likewise. * reginfo.c (choose_hard_reg_mode, record_subregs_of_mode): Likewise. * regrename.c (build_def_use): Likewise. * regstat.c (dump_reg_info): Likewise. * reload.c (complex_word_subreg_p, push_reload, find_dummy_reload) (find_reloads, find_reloads_subreg_address): Likewise. * reload1.c (eliminate_regs_1): Likewise. * rtlanal.c (for_each_inc_dec_find_inc_dec, rtx_cost): Likewise. 
* simplify-rtx.c (avoid_constant_pool_reference): Likewise. (simplify_binary_operation_1, simplify_subreg): Likewise. * targhooks.c (default_function_arg_padding): Likewise. (default_hard_regno_nregs, default_class_max_nregs): Likewise. * tree-cfg.c (verify_gimple_assign_binary): Likewise. (verify_gimple_assign_ternary): Likewise. * tree-inline.c (estimate_move_cost): Likewise. * tree-ssa-forwprop.c (simplify_vector_constructor): Likewise. * tree-ssa-loop-ivopts.c (add_autoinc_candidates): Likewise. (get_address_cost_ainc): Likewise. * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Likewise. (vect_supportable_dr_alignment): Likewise. * tree-vect-loop.c (vect_determine_vectorization_factor): Likewise. (vectorizable_reduction): Likewise. * tree-vect-stmts.c (vectorizable_assignment, vectorizable_shift) (vectorizable_operation, vectorizable_load): Likewise. * tree.c (build_same_sized_truth_vector_type): Likewise. * valtrack.c (cleanup_auto_inc_dec): Likewise. * var-tracking.c (emit_note_insn_var_location): Likewise. * config/arc/arc.h (ASM_OUTPUT_CASE_END): Use as_a <scalar_int_mode>. (ADDR_VEC_ALIGN): Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256201
2018-01-03 | poly_int: GET_MODE_BITSIZE | Richard Sandiford | 1 | -5/+8
This patch changes GET_MODE_BITSIZE from an unsigned short to a poly_uint16. 2018-01-03 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * machmode.h (mode_to_bits): Return a poly_uint16 rather than an unsigned short. (GET_MODE_BITSIZE): Return a constant if ONLY_FIXED_SIZE_MODES, or if measurement_type is polynomial. * calls.c (shift_return_value): Treat GET_MODE_BITSIZE as polynomial. * combine.c (make_extraction): Likewise. * dse.c (find_shift_sequence): Likewise. * dwarf2out.c (mem_loc_descriptor): Likewise. * expmed.c (store_integral_bit_field, extract_bit_field_1): Likewise. (extract_bit_field, extract_low_bits): Likewise. * expr.c (convert_move, convert_modes, emit_move_insn_1): Likewise. (optimize_bitfield_assignment_op, expand_assignment): Likewise. (store_expr_with_bounds, store_field, expand_expr_real_1): Likewise. * fold-const.c (optimize_bit_field_compare, merge_ranges): Likewise. * gimple-fold.c (optimize_atomic_compare_exchange_p): Likewise. * reload.c (find_reloads): Likewise. * reload1.c (alter_reg): Likewise. * stor-layout.c (bitwise_mode_for_mode, compute_record_mode): Likewise. * targhooks.c (default_secondary_memory_needed_mode): Likewise. * tree-if-conv.c (predicate_mem_writes): Likewise. * tree-ssa-strlen.c (handle_builtin_memcmp): Likewise. * tree-vect-patterns.c (adjust_bool_pattern): Likewise. * tree-vect-stmts.c (vectorizable_simd_clone_call): Likewise. * valtrack.c (dead_debug_insert_temp): Likewise. * varasm.c (mergeable_constant_section): Likewise. * config/sh/sh.h (LOCAL_ALIGNMENT): Use as_a <fixed_size_mode>. gcc/ada/ * gcc-interface/misc.c (enumerate_modes): Treat GET_MODE_BITSIZE as polynomial. gcc/c-family/ * c-ubsan.c (ubsan_instrument_shift): Treat GET_MODE_BITSIZE as polynomial. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256200
2018-01-03 | poly_int: GET_MODE_PRECISION | Richard Sandiford | 1 | -35/+42
This patch changes GET_MODE_PRECISION from an unsigned short to a poly_uint16. 2018-01-03 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * machmode.h (mode_precision): Change from unsigned short to poly_uint16_pod. (mode_to_precision): Return a poly_uint16 rather than an unsigned short. (GET_MODE_PRECISION): Return a constant if ONLY_FIXED_SIZE_MODES, or if measurement_type is not polynomial. (HWI_COMPUTABLE_MODE_P): Turn into a function. Optimize the case in which the mode is already known to be a scalar_int_mode. * genmodes.c (emit_mode_precision): Change the type of mode_precision from unsigned short to poly_uint16_pod. Use ZERO_COEFFS for the initializer. * lto-streamer-in.c (lto_input_mode_table): Use bp_unpack_poly_value for GET_MODE_PRECISION. * lto-streamer-out.c (lto_write_mode_table): Use bp_pack_poly_value for GET_MODE_PRECISION. * combine.c (update_rsp_from_reg_equal): Treat GET_MODE_PRECISION as polynomial. (try_combine, find_split_point, combine_simplify_rtx): Likewise. (expand_field_assignment, make_extraction): Likewise. (make_compound_operation_int, record_dead_and_set_regs_1): Likewise. (get_last_value): Likewise. * convert.c (convert_to_integer_1): Likewise. * cse.c (cse_insn): Likewise. * expr.c (expand_expr_real_1): Likewise. * lra-constraints.c (simplify_operand_subreg): Likewise. * optabs-query.c (can_atomic_load_p): Likewise. * optabs.c (expand_atomic_load): Likewise. (expand_atomic_store): Likewise. * ree.c (combine_reaching_defs): Likewise. * rtl.h (partial_subreg_p, paradoxical_subreg_p): Likewise. * rtlanal.c (nonzero_bits1, lsb_bitfield_op_p): Likewise. * tree.h (type_has_mode_precision_p): Likewise. * ubsan.c (instrument_si_overflow): Likewise. gcc/ada/ * gcc-interface/misc.c (enumerate_modes): Treat GET_MODE_PRECISION as polynomial. 
Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256198
2018-01-03 | Update copyright years. | Jakub Jelinek | 1 | -1/+1
From-SVN: r256169
2018-01-03 | poly_int: REGMODE_NATURAL_SIZE | Richard Sandiford | 1 | -2/+2
This patch makes target-independent code that uses REGMODE_NATURAL_SIZE treat it as a poly_int rather than a constant. 2018-01-03 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * combine.c (can_change_dest_mode): Handle polynomial REGMODE_NATURAL_SIZE. * expmed.c (store_bit_field_1): Likewise. * expr.c (store_constructor): Likewise. * emit-rtl.c (validate_subreg): Operate on polynomial mode sizes and polynomial REGMODE_NATURAL_SIZE. (gen_lowpart_common): Likewise. * reginfo.c (record_subregs_of_mode): Likewise. * rtlanal.c (read_modify_subreg_p): Likewise. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r256149
2017-12-21 | [Patch combine] Don't create ZERO_EXTEND from subregs unless we have a scalar int mode | James Greenhalgh | 1 | -2/+3
gcc/
* combine.c (simplify_set): Do not transform subregs to zero_extends if the destination is not a scalar int mode.
From-SVN: r255945
2017-12-21 | poly_int: REG_ARGS_SIZE | Richard Sandiford | 1 | -2/+2
This patch adds new utility functions for manipulating REG_ARGS_SIZE notes and allows the notes to carry polynomial as well as constant sizes. The code was inconsistent about whether INT_MIN or HOST_WIDE_INT_MIN should be used to represent an unknown size. The patch uses HOST_WIDE_INT_MIN throughout. 2017-12-21 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * rtl.h (get_args_size, add_args_size_note): New functions. (find_args_size_adjust): Return a poly_int64 rather than a HOST_WIDE_INT. (fixup_args_size_notes): Likewise. Make the same change to the end_args_size parameter. * rtlanal.c (get_args_size, add_args_size_note): New functions. * builtins.c (expand_builtin_trap): Use add_args_size_note. * calls.c (emit_call_1): Likewise. * explow.c (adjust_stack_1): Likewise. * cfgcleanup.c (old_insns_match_p): Update use of find_args_size_adjust. * combine.c (distribute_notes): Track polynomial arg sizes. * dwarf2cfi.c (dw_trace_info): Change beg_true_args_size, end_true_args_size, beg_delay_args_size and end_delay_args_size from HOST_WIDE_INT to poly_int64. (add_cfi_args_size): Take the args_size as a poly_int64 rather than a HOST_WIDE_INT. (notice_args_size, notice_eh_throw, maybe_record_trace_start) (maybe_record_trace_start_abnormal, scan_trace, connect_traces): Track polynomial arg sizes. * emit-rtl.c (try_split): Use get_args_size. * recog.c (peep2_attempt): Likewise. * reload1.c (reload_as_needed): Likewise. * expr.c (find_args_size_adjust): Return the adjustment as a poly_int64 rather than a HOST_WIDE_INT. (fixup_args_size_notes): Change end_args_size from a HOST_WIDE_INT to a poly_int64 and change the return type in the same way. (emit_single_push_insn): Track polynomial arg sizes. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r255919
2017-12-20 | poly_int: SUBREG_BYTE | Richard Sandiford | 1 | -6/+7
This patch changes SUBREG_BYTE from an int to a poly_int. Since valid SUBREG_BYTEs must be contained within the mode of the SUBREG_REG, the required range is the same as for GET_MODE_SIZE, i.e. unsigned short. The patch therefore uses poly_uint16(_pod) for the SUBREG_BYTE. Using poly_uint16_pod rtx fields requires a new field code ('p'). Since there are no other uses of 'p' besides SUBREG_BYTE, the patch doesn't add an XPOLY or whatever; all uses should go via SUBREG_BYTE instead. The patch doesn't bother implementing 'p' support for legacy define_peepholes, since none of the remaining ones have subregs in their patterns. As it happened, the rtl documentation used SUBREG as an example of a code with mixed field types, accessed via XEXP (x, 0) and XINT (x, 1). Since there's no direct replacement for XINT, and since people should never use it even if there were, the patch changes the example to use INT_LIST instead. The patch also changes subreg-related helper functions so that they too take and return polynomial offsets. This makes the patch quite big, but it's mostly mechanical. The patch generally sticks to existing choices wrt signedness. 2017-12-20 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * doc/rtl.texi: Update documentation of SUBREG_BYTE. Document the 'p' format code. Use INT_LIST rather than SUBREG as the example of a code with an XINT and an XEXP. Remove the implication that accessing an rtx field using XINT is expected to work. * rtl.def (SUBREG): Change format from "ei" to "ep". * rtl.h (rtunion::rt_subreg): New field. (XCSUBREG): New macro. (SUBREG_BYTE): Use it. (subreg_shape): Change offset from an unsigned int to a poly_uint16. Update constructor accordingly. (subreg_shape::operator ==): Update accordingly. (subreg_shape::unique_id): Return an unsigned HOST_WIDE_INT rather than an unsigned int. 
(subreg_lsb, subreg_lowpart_offset, subreg_highpart_offset): Return a poly_uint64 rather than an unsigned int. (subreg_lsb_1): Likewise. Take the offset as a poly_uint64 rather than an unsigned int. (subreg_size_offset_from_lsb, subreg_size_lowpart_offset) (subreg_size_highpart_offset): Return a poly_uint64 rather than an unsigned int. Take the sizes as poly_uint64s. (subreg_offset_from_lsb): Return a poly_uint64 rather than an unsigned int. Take the shift as a poly_uint64 rather than an unsigned int. (subreg_regno_offset, subreg_offset_representable_p): Take the offset as a poly_uint64 rather than an unsigned int. (simplify_subreg_regno): Likewise. (byte_lowpart_offset): Return the memory offset as a poly_int64 rather than an int. (subreg_memory_offset): Likewise. Take the subreg offset as a poly_uint64 rather than an unsigned int. (simplify_subreg, simplify_gen_subreg, subreg_get_info) (gen_rtx_SUBREG, validate_subreg): Take the subreg offset as a poly_uint64 rather than an unsigned int. * rtl.c (rtx_format): Describe 'p' in comment. (copy_rtx, rtx_equal_p_cb, rtx_equal_p): Handle 'p'. * emit-rtl.c (validate_subreg, gen_rtx_SUBREG): Take the subreg offset as a poly_uint64 rather than an unsigned int. (byte_lowpart_offset): Return the memory offset as a poly_int64 rather than an int. (subreg_memory_offset): Likewise. Take the subreg offset as a poly_uint64 rather than an unsigned int. (subreg_size_lowpart_offset, subreg_size_highpart_offset): Take the mode sizes as poly_uint64s rather than unsigned ints. Return a poly_uint64 rather than an unsigned int. (subreg_lowpart_p): Treat subreg offsets as poly_ints. (copy_insn_1): Handle 'p'. * rtlanal.c (set_noop_p): Treat subregs offsets as poly_uint64s. (subreg_lsb_1): Take the subreg offset as a poly_uint64 rather than an unsigned int. Return the shift in the same way. (subreg_lsb): Return the shift as a poly_uint64 rather than an unsigned int. 
(subreg_size_offset_from_lsb): Take the sizes and shift as poly_uint64s rather than unsigned ints. Return the offset as a poly_uint64. (subreg_get_info, subreg_regno_offset, subreg_offset_representable_p) (simplify_subreg_regno): Take the offset as a poly_uint64 rather than an unsigned int. * rtlhash.c (add_rtx): Handle 'p'. * genemit.c (gen_exp): Likewise. * gengenrtl.c (type_from_format, gendef): Likewise. * gensupport.c (subst_pattern_match, get_alternatives_number) (collect_insn_data, alter_predicate_for_insn, alter_constraints) (subst_dup): Likewise. * gengtype.c (adjust_field_rtx_def): Likewise. * genrecog.c (find_operand, find_matching_operand, validate_pattern) (match_pattern_2): Likewise. (rtx_test::SUBREG_FIELD): New rtx_test::kind_enum. (rtx_test::subreg_field): New function. (operator ==, safe_to_hoist_p, transition_parameter_type) (print_nonbool_test, print_test): Handle SUBREG_FIELD. * genattrtab.c (attr_rtx_1): Say that 'p' is deliberately not handled. * genpeep.c (match_rtx): Likewise. * print-rtl.c (print_poly_int): Include if GENERATOR_FILE too. (rtx_writer::print_rtx_operand): Handle 'p'. (print_value): Handle SUBREG. * read-rtl.c (apply_int_iterator): Likewise. (rtx_reader::read_rtx_operand): Handle 'p'. * alias.c (rtx_equal_for_memref_p): Likewise. * cselib.c (rtx_equal_for_cselib_1, cselib_hash_rtx): Likewise. * caller-save.c (replace_reg_with_saved_mem): Treat subreg offsets as poly_ints. * calls.c (expand_call): Likewise. * combine.c (combine_simplify_rtx, expand_field_assignment): Likewise. (make_extraction, gen_lowpart_for_combine): Likewise. * loop-invariant.c (hash_invariant_expr_1, invariant_expr_equal_p): Likewise. * cse.c (remove_invalid_subreg_refs): Take the offset as a poly_uint64 rather than an unsigned int. Treat subreg offsets as poly_ints. (exp_equiv_p): Handle 'p'. (hash_rtx_cb): Likewise. Treat subreg offsets as poly_ints. (equiv_constant, cse_insn): Treat subreg offsets as poly_ints. * dse.c (find_shift_sequence): Likewise. 
* dwarf2out.c (rtl_for_decl_location): Likewise. * expmed.c (extract_low_bits): Likewise. * expr.c (emit_group_store, undefined_operand_subword_p): Likewise. (expand_expr_real_2): Likewise. * final.c (alter_subreg): Likewise. (leaf_renumber_regs_insn): Handle 'p'. * function.c (assign_parm_find_stack_rtl, assign_parm_setup_stack): Treat subreg offsets as poly_ints. * fwprop.c (forward_propagate_and_simplify): Likewise. * ifcvt.c (noce_emit_move_insn, noce_emit_cmove): Likewise. * ira.c (get_subreg_tracking_sizes): Likewise. * ira-conflicts.c (go_through_subreg): Likewise. * ira-lives.c (process_single_reg_class_operands): Likewise. * jump.c (rtx_renumbered_equal_p): Likewise. Handle 'p'. * lower-subreg.c (simplify_subreg_concatn): Take the subreg offset as a poly_uint64 rather than an unsigned int. (simplify_gen_subreg_concatn, resolve_simple_move): Treat subreg offsets as poly_ints. * lra-constraints.c (operands_match_p): Handle 'p'. (match_reload, curr_insn_transform): Treat subreg offsets as poly_ints. * lra-spills.c (assign_mem_slot): Likewise. * postreload.c (move2add_valid_value_p): Likewise. * recog.c (general_operand, indirect_operand): Likewise. * regcprop.c (copy_value, maybe_mode_change): Likewise. (copyprop_hardreg_forward_1): Likewise. * reginfo.c (simplifiable_subregs_hasher::hash, simplifiable_subregs) (record_subregs_of_mode): Likewise. * rtlhooks.c (gen_lowpart_general, gen_lowpart_if_possible): Likewise. * reload.c (operands_match_p): Handle 'p'. (find_reloads_subreg_address): Treat subreg offsets as poly_ints. * reload1.c (alter_reg, choose_reload_regs): Likewise. (compute_reload_subreg_offset): Likewise, and return a poly_int64. * simplify-rtx.c (simplify_truncation, simplify_binary_operation_1) (test_vector_ops_duplicate): Treat subreg offsets as poly_ints. (simplify_const_poly_int_tests<N>::run): Likewise. (simplify_subreg, simplify_gen_subreg): Take the subreg offset as a poly_uint64 rather than an unsigned int. 
* valtrack.c (debug_lowpart_subreg): Likewise. * var-tracking.c (var_lowpart): Likewise. (loc_cmp): Handle 'p'. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r255882
2017-12-20Add a gen_int_shift_amount helper functionRichard Sandiford1-42/+47
This patch adds a helper routine that constructs rtxes for constant shift amounts, given the mode of the value being shifted. As well as helping with the SVE patches, this is one step towards allowing CONST_INTs to have a real mode. One long-standing problem has been to decide what the mode of a shift count should be for arbitrary rtxes (as opposed to those directly tied to a target pattern). Realistic choices would be the mode of the shifted elements, word_mode, QImode, a 64-bit mode, or the same mode as the shift optabs (in which case what should the mode be when the target doesn't have a pattern?) For now the patch picks a 64-bit mode, but with a ??? comment. 2017-12-20 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * emit-rtl.h (gen_int_shift_amount): Declare. * emit-rtl.c (gen_int_shift_amount): New function. * asan.c (asan_emit_stack_protection): Use gen_int_shift_amount instead of GEN_INT. * calls.c (shift_return_value): Likewise. * cse.c (fold_rtx): Likewise. * dse.c (find_shift_sequence): Likewise. * expmed.c (init_expmed_one_mode, store_bit_field_1, expand_shift_1) (expand_shift, expand_smod_pow2): Likewise. * lower-subreg.c (shift_cost): Likewise. * optabs.c (expand_superword_shift, expand_doubleword_mult) (expand_unop, expand_binop, shift_amt_for_vec_perm_mask) (expand_vec_perm_var): Likewise. * simplify-rtx.c (simplify_unary_operation_1): Likewise. (simplify_binary_operation_1): Likewise. * combine.c (try_combine, find_split_point, force_int_to_mode) (simplify_shift_const_1, simplify_shift_const): Likewise. (change_zero_ext): Likewise. Use simplify_gen_binary. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r255861
2017-12-19read-rtl.c (parse_reg_note_name): Replace Yoda conditions with typical order ↵Jakub Jelinek1-33/+33
conditions. * read-rtl.c (parse_reg_note_name): Replace Yoda conditions with typical order conditions. * sel-sched.c (extract_new_fences_from): Likewise. * config/visium/constraints.md (J, K, L): Likewise. * config/visium/predicates.md (const_shift_operand): Likewise. * config/visium/visium.c (visium_legitimize_address, visium_legitimize_reload_address): Likewise. * config/m68k/m68k.c (output_reg_adjust, emit_reg_adjust): Likewise. * config/arm/arm.c (arm_block_move_unaligned_straight): Likewise. * config/avr/constraints.md (Y01, Ym1, Y02, Ym2): Likewise. * config/avr/avr-log.c (avr_vdump, avr_log_set_avr_log, SET_DUMP_DETAIL): Likewise. * config/avr/predicates.md (const_8_16_24_operand): Likewise. * config/avr/avr.c (STR_PREFIX_P, avr_popcount_each_byte, avr_is_casesi_sequence, avr_casei_sequence_check_operands, avr_set_core_architecture, avr_set_current_function, avr_legitimize_reload_address, avr_asm_len, avr_print_operand, output_movqi, output_movsisf, avr_out_plus, avr_out_bitop, avr_out_fract, avr_adjust_insn_length, avr_encode_section_info, avr_2word_insn_p, output_reload_in_const, avr_has_nibble_0xf, avr_map_decompose, avr_fold_builtin): Likewise. * config/avr/driver-avr.c (avr_devicespecs_file): Likewise. * config/avr/gen-avr-mmcu-specs.c (str_prefix_p, print_mcu): Likewise. * config/i386/i386.c (ix86_parse_stringop_strategy_string): Likewise. * config/m32c/m32c-pragma.c (m32c_pragma_memregs): Likewise. * config/m32c/m32c.c (m32c_conditional_register_usage, m32c_address_cost): Likewise. * config/m32c/predicates.md (shiftcount_operand, longshiftcount_operand): Likewise. * config/iq2000/iq2000.c (iq2000_expand_prologue): Likewise. * config/nios2/nios2.c (nios2_handle_custom_fpu_insn_option, can_use_cdx_ldstw): Likewise. * config/nios2/nios2.h (CDX_REG_P): Likewise. * config/cr16/cr16.h (RETURN_ADDR_RTX, REGNO_MODE_OK_FOR_BASE_P): Likewise. * config/cr16/cr16.md (*mov<mode>_double): Likewise. * config/cr16/cr16.c (cr16_create_dwarf_for_multi_push): Likewise. 
* config/h8300/h8300.c (h8300_rtx_costs, get_shift_alg): Likewise. * config/vax/constraints.md (U06, U08, U16, CN6, S08, S16): Likewise. * config/vax/vax.c (adjacent_operands_p): Likewise. * config/ft32/constraints.md (L, b, KA): Likewise. * config/ft32/ft32.c (ft32_load_immediate, ft32_expand_prologue): Likewise. * cfgexpand.c (expand_stack_alignment): Likewise. * gcse.c (insert_expr_in_table): Likewise. * print-rtl.c (rtx_writer::print_rtx_operand_codes_E_and_V): Likewise. * cgraphunit.c (cgraph_node::expand): Likewise. * ira-build.c (setup_min_max_allocno_live_range_point): Likewise. * emit-rtl.c (add_insn): Likewise. * input.c (dump_location_info): Likewise. * passes.c (NEXT_PASS): Likewise. * read-rtl-function.c (parse_note_insn_name, function_reader::read_rtx_operand_r, function_reader::parse_mem_expr): Likewise. * sched-rgn.c (sched_rgn_init): Likewise. * diagnostic-show-locus.c (layout::show_ruler): Likewise. * combine.c (find_split_point, simplify_if_then_else, force_to_mode, if_then_else_cond, simplify_shift_const_1, simplify_comparison): Likewise. * explow.c (eliminate_constant_term): Likewise. * final.c (leaf_renumber_regs_insn): Likewise. * cfgrtl.c (print_rtl_with_bb): Likewise. * genhooks.c (emit_init_macros): Likewise. * poly-int.h (maybe_ne, maybe_le, maybe_lt): Likewise. * tree-data-ref.c (conflict_fn): Likewise. * selftest.c (assert_streq): Likewise. * expr.c (store_constructor_field, expand_expr_real_1): Likewise. * fold-const.c (fold_range_test, extract_muldiv_1, fold_truth_andor, fold_binary_loc, multiple_of_p): Likewise. * reload.c (push_reload, find_equiv_reg): Likewise. * et-forest.c (et_nca, et_below): Likewise. * dbxout.c (dbxout_symbol_location): Likewise. * reorg.c (relax_delay_slots): Likewise. * dojump.c (do_compare_rtx_and_jump): Likewise. * gengtype-parse.c (type): Likewise. * simplify-rtx.c (simplify_gen_ternary, simplify_gen_relational, simplify_const_relational_operation): Likewise. * reload1.c (do_output_reload): Likewise. 
* dumpfile.c (get_dump_file_info_by_switch): Likewise. * gengtype.c (type_for_name): Likewise. * gimple-ssa-sprintf.c (format_directive): Likewise. ada/ * gcc-interface/trans.c (Loop_Statement_to_gnu): Replace Yoda conditions with typical order conditions. * gcc-interface/misc.c (gnat_get_array_descr_info, default_pass_by_ref): Likewise. * gcc-interface/decl.c (gnat_to_gnu_entity): Likewise. * adaint.c (__gnat_tmp_name): Likewise. c-family/ * known-headers.cc (get_stdlib_header_for_name): Replace Yoda conditions with typical order conditions. c/ * c-typeck.c (comptypes_internal, function_types_compatible_p, perform_integral_promotions, digest_init): Replace Yoda conditions with typical order conditions. * c-decl.c (check_bitfield_type_and_width): Likewise. cp/ * name-lookup.c (get_std_name_hint): Replace Yoda conditions with typical order conditions. * class.c (check_bitfield_decl): Likewise. * pt.c (convert_template_argument): Likewise. * decl.c (duplicate_decls): Likewise. * typeck.c (commonparms): Likewise. fortran/ * scanner.c (preprocessor_line): Replace Yoda conditions with typical order conditions. * dependency.c (check_section_vs_section): Likewise. * trans-array.c (gfc_conv_expr_descriptor): Likewise. jit/ * jit-playback.c (get_type, playback::compile_to_file::copy_file, playback::context::acquire_mutex): Replace Yoda conditions with typical order conditions. * libgccjit.c (gcc_jit_context_new_struct_type, gcc_jit_struct_set_fields, gcc_jit_context_new_union_type, gcc_jit_context_new_function, gcc_jit_timer_pop): Likewise. * jit-builtins.c (matches_builtin): Likewise. * jit-recording.c (recording::compound_type::set_fields, recording::fields::write_reproducer, recording::rvalue::set_scope, recording::function::validate): Likewise. * jit-logging.c (logger::decref): Likewise. From-SVN: r255831
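The change listed above is purely stylistic: each comparison keeps its meaning and only the operand order changes. A minimal sketch of the two spellings follows; both function names are hypothetical and are not taken from the GCC sources.

```c
#include <stdbool.h>

/* Hypothetical helper, not from the GCC sources: "Yoda" order puts the
   constant on the left-hand side of each comparison.  */
bool
toy_small_count_yoda (int count)
{
  return 0 <= count && 32 > count;
}

/* The same test in typical order, with the variable on the left.  */
bool
toy_small_count (int count)
{
  return count >= 0 && count < 32;
}
```

Both functions accept exactly the same values, which is why a rewrite of this kind should not change generated code.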
2017-12-16Revert accidental commitRichard Sandiford1-47/+42
From-SVN: r255746
2017-12-16Add a gen_int_shift_amount helper functionRichard Sandiford1-42/+47
This patch adds a helper routine that constructs rtxes for constant shift amounts, given the mode of the value being shifted. As well as helping with the SVE patches, this is one step towards allowing CONST_INTs to have a real mode. One long-standing problem has been to decide what the mode of a shift count should be for arbitrary rtxes (as opposed to those directly tied to a target pattern). Realistic choices would be the mode of the shifted elements, word_mode, QImode, or the same mode as the shift optabs (in which case what should the mode be when the target doesn't have a pattern?) For now the patch picks the mode of the shifted elements, but with a ??? comment. 2017-11-06 Richard Sandiford <richard.sandiford@linaro.org> Alan Hayward <alan.hayward@arm.com> David Sherwood <david.sherwood@arm.com> gcc/ * emit-rtl.h (gen_int_shift_amount): Declare. * emit-rtl.c (gen_int_shift_amount): New function. * asan.c (asan_emit_stack_protection): Use gen_int_shift_amount instead of GEN_INT. * calls.c (shift_return_value): Likewise. * cse.c (fold_rtx): Likewise. * dse.c (find_shift_sequence): Likewise. * expmed.c (init_expmed_one_mode, store_bit_field_1, expand_shift_1) (expand_shift, expand_smod_pow2): Likewise. * lower-subreg.c (shift_cost): Likewise. * optabs.c (expand_superword_shift, expand_doubleword_mult) (expand_unop, expand_binop, shift_amt_for_vec_perm_mask) (expand_vec_perm_var): Likewise. * simplify-rtx.c (simplify_unary_operation_1): Likewise. (simplify_binary_operation_1): Likewise. * combine.c (try_combine, find_split_point, force_int_to_mode) (simplify_shift_const_1, simplify_shift_const): Likewise. (change_zero_ext): Likewise. Use simplify_gen_binary. Co-Authored-By: Alan Hayward <alan.hayward@arm.com> Co-Authored-By: David Sherwood <david.sherwood@arm.com> From-SVN: r255745
2017-12-13combine: Fix PR83393Segher Boessenkool1-1/+1
In move_deaths we move a REG_DEAD note if the instruction combination has extended the lifetime of a register so that the existing note is no longer valid. We find that note using reg_stat, but what that finds can refer to a later insn. If so, we cannot use the cached value. This patch implements that. PR rtl-optimization/83393 * combine.c (move_deaths): If reg_stat points to a too new insn in last_death, do not use it: find the proper insn instead. gcc/testsuite/ PR rtl-optimization/83393 * gcc.dg/pr83393.c: New testcase. From-SVN: r255606
2017-12-12[Patch combine] Don't create vector mode ZERO_EXTEND from subregsJames Greenhalgh1-1/+3
The code in simplify_set that transforms the paradoxical subreg expression: (set FOO (subreg:M (mem:N BAR) 0)) into: (set FOO (zero_extend:M (mem:N BAR))) does not consider the case where M is a vector mode, allowing it to construct (for example): (zero_extend:V4SI (mem:SI)) For one, this has the wrong semantics, but fortunately we fail long before then in expand_compound_operation. We need to explicitly reject vector modes from this transformation. gcc/ * combine.c (simplify_set): Do not transform subregs to zero_extends if the destination mode is a vector mode. From-SVN: r255578
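The guard described in this commit can be illustrated with a toy model. This is only a sketch: struct toy_mode and toy_can_use_zero_extend are hypothetical stand-ins, not GCC's machine_mode or the actual simplify_set code, and the other conditions the real transformation checks are left out.

```c
#include <stdbool.h>

/* Toy stand-in for a machine mode; GCC's real machine_mode is far richer.  */
struct toy_mode
{
  unsigned size_bytes;  /* total size of the mode in bytes */
  bool is_vector;       /* true for modes such as V4SI */
};

/* Return true if (subreg:OUTER (mem:INNER ...) 0) may be rewritten as
   (zero_extend:OUTER (mem:INNER ...)) in this toy model.  The subreg must
   be paradoxical (outer wider than inner) and, following the fix described
   above, the outer mode must not be a vector mode.  */
bool
toy_can_use_zero_extend (struct toy_mode outer, struct toy_mode inner)
{
  if (outer.size_bytes <= inner.size_bytes)
    return false;   /* not a paradoxical subreg */
  if (outer.is_vector)
    return false;   /* zero_extend to a vector mode has the wrong semantics */
  return true;
}
```

With an SImode-like inner mode of 4 bytes, a DImode-like outer mode of 8 bytes is accepted, while a V4SI-like outer mode of 16 bytes is rejected.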
2017-12-12[SFN] boilerplate changes in preparation to introduce nonbind markersAlexandre Oliva1-6/+6
This patch introduces a number of new macros and functions that will be used to distinguish between different kinds of debug stmts, insns and notes, namely, preexisting debug bind ones and to-be-introduced nonbind markers. In a seemingly mechanical way, it adjusts several uses of the macros and functions, so that they refer to narrower categories when appropriate. These changes, by themselves, should not have any visible effect in the compiler behavior, since the upcoming debug markers are never created with this patch alone. for gcc/ChangeLog * gimple.h (enum gimple_debug_subcode): Add GIMPLE_DEBUG_BEGIN_STMT. (gimple_debug_begin_stmt_p): New. (gimple_debug_nonbind_marker_p): New. * tree.h (MAY_HAVE_DEBUG_MARKER_STMTS): New. (MAY_HAVE_DEBUG_BIND_STMTS): Renamed from.... (MAY_HAVE_DEBUG_STMTS): ... this. Check both. * insn-notes.def (BEGIN_STMT): New. * rtl.h (MAY_HAVE_DEBUG_MARKER_INSNS): New. (MAY_HAVE_DEBUG_BIND_INSNS): Renamed from.... (MAY_HAVE_DEBUG_INSNS): ... this. Check both. (NOTE_MARKER_LOCATION, NOTE_MARKER_P): New. (DEBUG_BIND_INSN_P, DEBUG_MARKER_INSN_P): New. (INSN_DEBUG_MARKER_KIND): New. (GEN_RTX_DEBUG_MARKER_BEGIN_STMT_PAT): New. (INSN_VAR_LOCATION): Check for VAR_LOCATION. (INSN_VAR_LOCATION_PTR): New. * cfgexpand.c (expand_debug_locations): Handle debug bind insns only. (expand_gimple_basic_block): Likewise. Emit debug temps for TER deps only if debug bind insns are enabled. (pass_expand::execute): Avoid deep TER and expand debug locations for debug bind insns only. * cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Narrow debug stmts special handling down to debug bind stmts. * combine.c (try_combine): Narrow debug insns special handling down to debug bind insns. * cse.c (delete_trivially_dead_insns): Handle debug bindings. Narrow debug insns preexisting special handling down to debug bind insns. * dce.c (rest_of_handle_ud_dce): Narrow debug insns special handling down to debug bind insns. 
* function.c (instantiate_virtual_regs): Skip debug markers, adjust handling of debug binds. * gimple-ssa-backprop.c (backprop::prepare_change): Try debug temp insertion iff MAY_HAVE_DEBUG_BIND_STMTS. * haifa-sched.c (schedule_insn): Narrow special handling of debug insns to debug bind insns. * ipa-param-manipulation.c (ipa_modify_call_arguments): Narrow special handling of debug stmts to debug bind stmts. * ipa-split.c (split_function): Likewise. * ira.c (combine_and_move_insns): Adjust debug bind insns only. * loop-unroll.c (apply_opt_in_copies): Adjust tests on bind debug insns. * reg-stack.c (convert_regs_1): Use DEBUG_BIND_INSN_P. * regrename.c (build_def_use): Likewise. * regcprop.c (copyprop_hardreg_forward_1): Likewise. (pass_cprop_hardreg): Narrow special casing of debug insns to debug bind insns. * regstat.c (regstat_init_n_sets_and_refs): Likewise. * reload1.c (reload): Likewise. * sese.c (sese_insert_phis_for_liveouts): Narrow special casing of debug stmts to debug bind stmts. * shrink-wrap.c (move_insn_for_shrink_wrap): Likewise. * ssa-iterators.h (num_imm_uses): Likewise. * tree-cfg.c (gimple_merge_blocks): Narrow special casing of debug stmts to debug bind stmts. * tree-inline.c (tree_function_versioning): Narrow special casing of debug stmts to debug bind stmts. * tree-loop-distribution.c (generate_loops_for_partition): Narrow special casing of debug stmts to debug bind stmts. * tree-sra.c (analyze_access_subtree): Narrow special casing of debug stmts to debug bind stmts. * tree-ssa-dce.c (remove_dead_stmt): Narrow special casing of debug stmts to debug bind stmts. * tree-ssa-loop-ivopt.c (remove_unused_ivs): Narrow special casing of debug stmts to debug bind stmts. * tree-ssa-reassoc.c (reassoc_remove_stmt): Likewise. * tree-ssa-tail-merge.c (tail_merge_optimize): Narrow special casing of debug stmts to debug bind stmts. * tree-ssa-threadedge.c (propagate_threaded_block_debug_info): Likewise. 
* tree-ssa.c (flush_pending_stmts): Narrow special casing of debug stmts to debug bind stmts. (gimple_replace_ssa_lhs): Likewise. (insert_debug_temp_for_var_def): Likewise. (insert_debug_temps_for_defs): Likewise. (reset_debug_uses): Likewise. * tree-ssanames.c (release_ssa_name_fn): Likewise. * tree-vect-loop-manip.c (adjust_debug_stmts_now): Likewise. (adjust_debug_stmts): Likewise. (adjust_phi_and_debug_stmts): Likewise. (vect_do_peeling): Likewise. * tree-vect-loop.c (vect_transform_loop): Likewise. * valtrack.c (propagate_for_debug): Use BIND_DEBUG_INSN_P. * var-tracking.c (adjust_mems): Narrow special casing of debug insns to debug bind insns. (dv_onepart_p, dataflow_set_clar_at_call, use_type): Likewise. (compute_bb_dataflow, vt_find_locations): Likewise. (vt_expand_loc, emit_notes_for_changes): Likewise. (vt_init_cfa_base): Likewise. (vt_emit_notes): Likewise. (vt_initialize): Likewise. (vt_finalize): Likewise. From-SVN: r255565
2017-12-11[PR80693] drop value of parallel SETs dropped by combineAlexandre Oliva1-0/+40
When combine drops a REG_UNUSED SET in a parallel, we have to clear cached values, so that, even if the REGs remain used (e.g. because they were referenced in the used SET_SRC), we will not use properties of the dropped modified value as if they applied to the preserved original one. We fail to adjust REG_N_SETS. for gcc/ChangeLog PR rtl-optimization/80693 PR rtl-optimization/81019 PR rtl-optimization/81020 * combine.c (distribute_notes): Reset any REG_UNUSED REGs that are not mentioned in i3. Place the REG_UNUSED note on i2, possibly modified to REG_DEAD, if it did not originate in i3. for gcc/testsuite/ChangeLog PR rtl-optimization/80693 PR rtl-optimization/81019 PR rtl-optimization/81020 * gcc.dg/pr80693.c: New. * gcc.dg/pr81019.c: New. From-SVN: r255554
2017-12-08combine: Fix PR83304Segher Boessenkool1-0/+20
In PR83304 two insns are combined, where the I2 uses a register that has a REG_DEAD note on an insn after I2 but before I3. In such a case move_deaths should move that death note. But move_deaths only looks at the reg_stat[regno].last_death insn, and that field can be zeroed out (previously, use_crosses_set_p would prevent the combination in this case). If the last_death field is zero it means "unknown", not "no death", so we have to find if there is a REG_DEAD note. PR rtl-optimization/83304 * combine.c (move_deaths): If we do not know where a register died, search for it. From-SVN: r255506
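The distinction between "unknown" and "no death" can be sketched with a toy model. Everything here is hypothetical (struct toy_insn, toy_find_death, and the use of -1 for an unknown cached position); it is not the move_deaths code, only an illustration of falling back to a search when the cached value cannot be used.

```c
/* Toy insn: records only whether it carries a REG_DEAD note for the one
   register we are interested in.  */
struct toy_insn
{
  int has_reg_dead_note;
};

/* Return the index of the insn in [from, to) where the register dies, or -1
   if there is none.  CACHED is a previously recorded index, or -1 when the
   death is unknown.  Unknown does not mean "no death", so in that case (and,
   as an assumption of this sketch, whenever CACHED lies outside the range)
   we scan for a note instead of trusting the cache.  */
int
toy_find_death (const struct toy_insn *insns, int from, int to, int cached)
{
  if (cached >= from && cached < to)
    return cached;
  for (int i = from; i < to; i++)
    if (insns[i].has_reg_dead_note)
      return i;
  return -1;
}
```

Only when the scan also finds nothing does the sketch report that the register does not die in the range.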
2017-12-04combine: Remove use_crosses_set_pSegher Boessenkool1-62/+7
This removes use_crosses_set_p, and uses modified_between_p instead everywhere it was used. This improves optimisation. * combine.c: Adjust comment. (use_crosses_set_p): Delete. (can_combine_p): Use modified_between_p instead of use_crosses_set_p. (try_combine): Ditto. From-SVN: r255384
2017-11-29combine: Print to dump if some insn cannot be combined into i3Segher Boessenkool1-6/+18
Eventually we should print the reason that any combination fails. This is a good start (these happen often). * combine.c (try_combine): Print a message to dump file whenever I0, I1, or I2 cannot be combined into I3. From-SVN: r255261
2017-11-29combine: Do not throw away unneeded arms of parallels (PR83156)Segher Boessenkool1-1/+2
The fix for PR82621 makes us not split an I2 if one of the results of those SETs is unused, since combine does not handle that properly. But this results in degradation for i386 (or, more generally, for any target that does not have patterns for parallels with an unused result as a CLOBBER instead of a SET for that result). This patch instead makes us not split only if one of the results is set again before I3. That fixes PR83156 and also fixes PR82621. Unfortunately it undoes the nice optimisations that the previous patch did on powerpc. PR rtl-optimization/83156 PR rtl-optimization/82621 * combine.c (try_combine): Don't split an I2 if one of the dests is set again before I3. Allow unused dests. From-SVN: r255260
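The condition "set again before I3" can be sketched on a toy insn stream. The types and functions below (struct toy_set_insn, toy_reg_set_between, toy_may_split_i2) are hypothetical and much simpler than what try_combine does; they only illustrate the shape of the check.

```c
#include <stdbool.h>

/* Toy insn: the single register number it sets, or -1 if it sets none.  */
struct toy_set_insn
{
  int sets_regno;
};

/* Return true if REGNO is set by any insn strictly between positions I2
   and I3 of INSNS.  */
bool
toy_reg_set_between (const struct toy_set_insn *insns, int i2, int i3,
                     int regno)
{
  for (int i = i2 + 1; i < i3; i++)
    if (insns[i].sets_regno == regno)
      return true;
  return false;
}

/* In this sketch, a two-SET I2 with destinations DEST0 and DEST1 may be
   split only if neither destination is set again before I3.  Whether a
   destination is unused plays no part in the decision.  */
bool
toy_may_split_i2 (const struct toy_set_insn *insns, int i2, int i3,
                  int dest0, int dest1)
{
  return !toy_reg_set_between (insns, i2, i3, dest0)
         && !toy_reg_set_between (insns, i2, i3, dest1);
}
```

For a stream where an insn between I2 and I3 sets register 5, splitting an I2 whose destinations are 4 and 5 is refused, while one whose destinations are 4 and 6 is allowed.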
2017-11-25re PR rtl-optimization/81553 (ICE in immed_wide_int_const, at emit-rtl.c:607)Jakub Jelinek1-3/+7
PR rtl-optimization/81553 * combine.c (simplify_if_then_else): In (if_then_else COND (OP Z C1) Z) to (OP Z (mult COND (C1 * STORE_FLAG_VALUE))) optimization, if OP is a shift where C1 has different mode than the whole shift, use C1's mode for MULT rather than the shift's mode. * gcc.c-torture/compile/pr81553.c: New test. From-SVN: r255150
2017-11-17combine: Add added_notes_insnSegher Boessenkool1-7/+25
This patch makes combine reconsider insns it added notes to. This matters for example if the note is a REG_DEAD; without the note the setter of the register has to be kept around in the result of combinations, so it cannot be a 2->1 combination, and the cost of the result is higher than without that extra set, so try_combine may refuse the combination with the set, but allow it without the set. This fixes a regression for powerpc: pr69946.c has started to fail after the bitfield expansion changes. GCC used to generate lwz 3,0(9) rlwinm 3,3,12,20,23 ori 3,3,0x11 rotldi 3,3,52 bl bar but now it does lwz 3,0(9) rldicr 3,3,32,3 srdi 3,3,48 ori 3,3,0x110 sldi 3,3,48 bl bar (an instruction too many). After this patch it is lwz 3,0(9) rlwinm 3,3,16,16,19 ori 3,3,0x110 sldi 3,3,48 bl bar (the testcase still does not pass, it looks for very specific insns). * combine.c (added_notes_insn): New. (try_combine): Handle added_notes_insn like added_links_insn. Rewrite return value code. (distribute_notes): Set added_notes_insn to the earliest insn we added a note to. From-SVN: r254875
2017-11-17combine: Don't split insns if half is unused (PR82621)Segher Boessenkool1-1/+2
If we have a PARALLEL of two SETs, and one half is unused, we currently happily split that into two instructions (although the unused one is useless). Worse, as PR82621 shows, combine will happily merge this insn into I3 even if some intervening insn sets the same register again, which is wrong. This fixes it by not splitting PARALLELs with REG_UNUSED notes. It all is handled fine by combine in that case: just the "single set that is unused" case isn't handled properly. This also results in better code: combine will now actually throw away the unused SET. (It still won't do that in an I3). PR rtl-optimization/82621 * combine.c (try_combine): Do not split PARALLELs of two SETs if the dest of one of those SETs is unused. From-SVN: r254874
2017-11-03combine: Print insns we try to combineSegher Boessenkool1-0/+7
This adds some extra debug info to the dump file for combine: print the insns that are input to try_combine. I was worried printing more will make the dump file only harder to read, but especially the info from the REG_DEAD notes is invaluable. * combine.c (try_combine): Print the insns input to try_combine to the dump file. From-SVN: r254365
2017-11-01revert: combine.c (can_change_dest_mode): Reject changes in ↵Richard Sandiford1-6/+0
REGMODE_NATURAL_SIZE. 2017-11-01 Richard Sandiford <richard.sandiford@linaro.org> gcc/ Revert accidental duplicate: * combine.c (can_change_dest_mode): Reject changes in REGMODE_NATURAL_SIZE. From-SVN: r254316