Age | Commit message (Collapse) | Author | Files | Lines |
|
Two fixes for UB when handling very large offsets. The calculation in
force_int_to_mode would have been correct if signed integers used modulo
arithmetic, so just switch to unsigned types. The calculation in
rtx_addr_can_trap_p_1 didn't handle overflow properly, so switch to
known_subrange_p instead (which is supposed to handle all cases).
2019-04-18 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR middle-end/85164
* combine.c (force_int_to_mode): Cast the argument rather than
the result of known_alignment.
* rtlanal.c (rtx_addr_can_trap_p_1): Use known_subrange_p.
gcc/testsuite/
PR middle-end/85164
* gcc.dg/pr85164-1.c, gcc.dg/pr85164-2.c: New tests.
From-SVN: r270442
|
|
The code that checks if an auto-increment from i0 or i1 is not lost is
a bit shaky. The code to check the same for i2 is non-existent, and
cannot be implemented in a similar way at all. So, this patch counts
all auto-increments, and makes sure we end up with the same number as
we started with. This works because we still have a check that we
will not duplicate any.
We should do this some better way, but not while we are in stage 4.
PR rtl-optimization/89794
* combine.c (count_auto_inc): New function.
(try_combine): Count how many auto_inc expressions there were in the
original instructions. Ensure we have the same number in the new
instructions. Remove the code that tried to ensure auto_inc side
effects on i1 and i0 are not lost.
gcc/testsuite/
PR rtl-optimization/89794
* gcc.dg/torture/pr89794.c: New testcase.
From-SVN: r270368
|
|
-msse2 for 32bit target)
PR rtl-optimization/89354
* combine.c (make_extraction): Punt if extraction_mode is narrower
than len bits.
* gcc.dg/pr89354.c: New test.
From-SVN: r268913
|
|
PR rtl-optimization/89195
* combine.c (make_extraction): For MEMs, don't extract bytes outside
of the original MEM.
* gcc.c-torture/execute/pr89195.c: New test.
From-SVN: r268542
|
|
Some people use the -fdump-rtl-combine dumps (instead of the -da or
-fdump-rtl-combine-all dump), but the "Can't combine iN into iM"
messages do not make any sense if the failed combine attempts are not
printed otherwise. So let's change that.
* combine.c (try_combine): Do not print "Can't combine" messages unless
printing failed combination attempts.
From-SVN: r268453
|
|
PR target/88861 reports an ICE in "ce2" due to an unreachable
basic block.
The block becomes unreachable in "combine" when delete_noop_moves
deletes an insn with a REG_EH_REGION, deleting the EH edge, the
only edge leading to the basic block.
Normally, rest_of_handle_combine would call cleanup_cfg, deleting
unreachable blocks, if combine_instructions returns true, and
combine_instructions does return true for some cases of edge-removal,
but it doesn't for this case, leading to the ICE.
This patch updates delete_noop_moves so that it returns true if
it deletes any edges, and passes that through to combine_instructions,
so that it too will return true if any edges were deleted, ensuring that
cleanup_cfg will be called by rest_of_handle_combine for this case,
deleting the now-unreachable block, and fixing the ICE.
gcc/ChangeLog:
PR target/88861
* combine.c (delete_noop_moves): Convert to "bool" return,
returning true if any edges are eliminated.
(combine_instructions): Also return true if delete_noop_moves
returns true.
gcc/testsuite/ChangeLog:
PR target/88861
* g++.dg/torture/pr88861.C: New test.
From-SVN: r267984
|
|
2019-01-09 Sandra Loosemore <sandra@codesourcery.com>
PR other/16615 [1/5]
contrib/
* mklog: Mechanically replace "can not" with "cannot".
gcc/
* Makefile.in: Mechanically replace "can not" with "cannot".
* alias.c: Likewise.
* builtins.c: Likewise.
* calls.c: Likewise.
* cgraph.c: Likewise.
* cgraph.h: Likewise.
* cgraphclones.c: Likewise.
* cgraphunit.c: Likewise.
* combine-stack-adj.c: Likewise.
* combine.c: Likewise.
* common/config/i386/i386-common.c: Likewise.
* config/aarch64/aarch64.c: Likewise.
* config/alpha/sync.md: Likewise.
* config/arc/arc.c: Likewise.
* config/arc/predicates.md: Likewise.
* config/arm/arm-c.c: Likewise.
* config/arm/arm.c: Likewise.
* config/arm/arm.h: Likewise.
* config/arm/arm.md: Likewise.
* config/arm/cortex-r4f.md: Likewise.
* config/csky/csky.c: Likewise.
* config/csky/csky.h: Likewise.
* config/darwin-f.c: Likewise.
* config/epiphany/epiphany.md: Likewise.
* config/i386/i386.c: Likewise.
* config/i386/sol2.h: Likewise.
* config/m68k/m68k.c: Likewise.
* config/mcore/mcore.h: Likewise.
* config/microblaze/microblaze.md: Likewise.
* config/mips/20kc.md: Likewise.
* config/mips/sb1.md: Likewise.
* config/nds32/nds32.c: Likewise.
* config/nds32/predicates.md: Likewise.
* config/pa/pa.c: Likewise.
* config/rs6000/e300c2c3.md: Likewise.
* config/rs6000/rs6000.c: Likewise.
* config/s390/s390.h: Likewise.
* config/sh/sh.c: Likewise.
* config/sh/sh.md: Likewise.
* config/spu/vmx2spu.h: Likewise.
* cprop.c: Likewise.
* dbxout.c: Likewise.
* df-scan.c: Likewise.
* doc/cfg.texi: Likewise.
* doc/extend.texi: Likewise.
* doc/fragments.texi: Likewise.
* doc/gty.texi: Likewise.
* doc/invoke.texi: Likewise.
* doc/lto.texi: Likewise.
* doc/md.texi: Likewise.
* doc/objc.texi: Likewise.
* doc/rtl.texi: Likewise.
* doc/tm.texi: Likewise.
* dse.c: Likewise.
* emit-rtl.c: Likewise.
* emit-rtl.h: Likewise.
* except.c: Likewise.
* expmed.c: Likewise.
* expr.c: Likewise.
* fold-const.c: Likewise.
* genautomata.c: Likewise.
* gimple-fold.c: Likewise.
* hard-reg-set.h: Likewise.
* ifcvt.c: Likewise.
* ipa-comdats.c: Likewise.
* ipa-cp.c: Likewise.
* ipa-devirt.c: Likewise.
* ipa-fnsummary.c: Likewise.
* ipa-icf.c: Likewise.
* ipa-inline-transform.c: Likewise.
* ipa-inline.c: Likewise.
* ipa-polymorphic-call.c: Likewise.
* ipa-profile.c: Likewise.
* ipa-prop.c: Likewise.
* ipa-pure-const.c: Likewise.
* ipa-reference.c: Likewise.
* ipa-split.c: Likewise.
* ipa-visibility.c: Likewise.
* ipa.c: Likewise.
* ira-build.c: Likewise.
* ira-color.c: Likewise.
* ira-conflicts.c: Likewise.
* ira-costs.c: Likewise.
* ira-int.h: Likewise.
* ira-lives.c: Likewise.
* ira.c: Likewise.
* ira.h: Likewise.
* loop-invariant.c: Likewise.
* loop-unroll.c: Likewise.
* lower-subreg.c: Likewise.
* lra-assigns.c: Likewise.
* lra-constraints.c: Likewise.
* lra-eliminations.c: Likewise.
* lra-lives.c: Likewise.
* lra-remat.c: Likewise.
* lra-spills.c: Likewise.
* lra.c: Likewise.
* lto-cgraph.c: Likewise.
* lto-streamer-out.c: Likewise.
* postreload-gcse.c: Likewise.
* predict.c: Likewise.
* profile-count.h: Likewise.
* profile.c: Likewise.
* recog.c: Likewise.
* ree.c: Likewise.
* reload.c: Likewise.
* reload1.c: Likewise.
* reorg.c: Likewise.
* resource.c: Likewise.
* rtl.def: Likewise.
* rtl.h: Likewise.
* rtlanal.c: Likewise.
* sched-deps.c: Likewise.
* sched-ebb.c: Likewise.
* sched-rgn.c: Likewise.
* sel-sched-ir.c: Likewise.
* sel-sched.c: Likewise.
* shrink-wrap.c: Likewise.
* simplify-rtx.c: Likewise.
* symtab.c: Likewise.
* target.def: Likewise.
* toplev.c: Likewise.
* tree-call-cdce.c: Likewise.
* tree-cfg.c: Likewise.
* tree-complex.c: Likewise.
* tree-core.h: Likewise.
* tree-eh.c: Likewise.
* tree-inline.c: Likewise.
* tree-loop-distribution.c: Likewise.
* tree-nrv.c: Likewise.
* tree-profile.c: Likewise.
* tree-sra.c: Likewise.
* tree-ssa-alias.c: Likewise.
* tree-ssa-dce.c: Likewise.
* tree-ssa-dom.c: Likewise.
* tree-ssa-forwprop.c: Likewise.
* tree-ssa-loop-im.c: Likewise.
* tree-ssa-loop-ivcanon.c: Likewise.
* tree-ssa-loop-ivopts.c: Likewise.
* tree-ssa-loop-niter.c: Likewise.
* tree-ssa-phionlycprop.c: Likewise.
* tree-ssa-phiopt.c: Likewise.
* tree-ssa-propagate.c: Likewise.
* tree-ssa-threadedge.c: Likewise.
* tree-ssa-threadupdate.c: Likewise.
* tree-ssa-uninit.c: Likewise.
* tree-ssanames.c: Likewise.
* tree-streamer-out.c: Likewise.
* tree.c: Likewise.
* tree.h: Likewise.
* vr-values.c: Likewise.
gcc/ada/
* exp_ch9.adb: Mechanically replace "can not" with "cannot".
* libgnat/s-regpat.ads: Likewise.
* par-ch4.adb: Likewise.
* set_targ.adb: Likewise.
* types.ads: Likewise.
gcc/cp/
* cp-tree.h: Mechanically replace "can not" with "cannot".
* parser.c: Likewise.
* pt.c: Likewise.
gcc/fortran/
* class.c: Mechanically replace "can not" with "cannot".
* decl.c: Likewise.
* expr.c: Likewise.
* gfc-internals.texi: Likewise.
* intrinsic.texi: Likewise.
* invoke.texi: Likewise.
* io.c: Likewise.
* match.c: Likewise.
* parse.c: Likewise.
* primary.c: Likewise.
* resolve.c: Likewise.
* symbol.c: Likewise.
* trans-array.c: Likewise.
* trans-decl.c: Likewise.
* trans-intrinsic.c: Likewise.
* trans-stmt.c: Likewise.
gcc/go/
* go-backend.c: Mechanically replace "can not" with "cannot".
* go-gcc.cc: Likewise.
gcc/lto/
* lto-partition.c: Mechanically replace "can not" with "cannot".
* lto-symtab.c: Likewise.
* lto.c: Likewise.
gcc/objc/
* objc-act.c: Mechanically replace "can not" with "cannot".
libbacktrace/
* backtrace.h: Mechanically replace "can not" with "cannot".
libgcc/
* config/c6x/libunwind.S: Mechanically replace "can not" with
"cannot".
* config/tilepro/atomic.h: Likewise.
* config/vxlib-tls.c: Likewise.
* generic-morestack-thread.c: Likewise.
* generic-morestack.c: Likewise.
* mkmap-symver.awk: Likewise.
libgfortran/
* caf/single.c: Mechanically replace "can not" with "cannot".
* io/unit.c: Likewise.
libobjc/
* class.c: Mechanically replace "can not" with "cannot".
* objc/runtime.h: Likewise.
* sendmsg.c: Likewise.
liboffloadmic/
* include/coi/common/COIResult_common.h: Mechanically replace
"can not" with "cannot".
* include/coi/source/COIBuffer_source.h: Likewise.
libstdc++-v3/
* include/ext/bitmap_allocator.h: Mechanically replace "can not"
with "cannot".
From-SVN: r267783
|
|
From-SVN: r267494
|
|
PR rtl-optimization/87727
* combine.c (cant_combine_insn_p): On a LEAF_REGISTERS target, combine
again moves from leaf hard registers.
* final.c (final_scan_insn_1) <NOTE_INSN_INLINE_ENTRY>: Minor tweak.
From-SVN: r267328
|
|
in nonzero_bits_mode if...
2018-12-18 Jozef Lawrynowicz <jozef.l@mittosystems.com>
* combine.c (update_rsp_from_reg_equal): Only look for the nonzero bits
of src in nonzero_bits_mode if the mode of src is MODE_INT and
HWI_COMPUTABLE.
(reg_nonzero_bits_for_combine): Add clarification to comment.
From-SVN: r267227
|
|
volatile register access when using XOR in avr-gcc)
Fix PR 88253
gcc/ChangeLog:
PR rtl-optimization/88253
* combine.c (combine_simplify_rtx): Test for side-effects before
substituting by zero.
gcc/testsuite/ChangeLog:
PR rtl-optimization/88253
* gcc.target/avr/pr88253.c: New test.
From-SVN: r267198
|
|
PR target/54589
* combine.c (find_split_point): For invalid memory address
nonobj + obj + const, if reg + obj + const is valid addressing
mode, split at nonobj. Use if rather than else if for the
fallback. Comment fixes.
* gcc.target/i386/pr54589.c: New test.
From-SVN: r266707
|
|
combine at -02)
PR rtl-optimization/85925
* rtl.h (word_register_operation_p): New predicate.
* combine.c (record_dead_and_set_regs_1): Only apply specific handling
for WORD_REGISTER_OPERATIONS targets to word_register_operation_p RTX.
* rtlanal.c (nonzero_bits1): Likewise. Adjust couple of comments.
(num_sign_bit_copies1): Likewise.
From-SVN: r266302
|
|
This makes make_more_copies do what its documentation says, that is,
only make an intermediate pseudo if copying to a pseudo.
This regressed generated code quality when we didn't keep the original
notes that were on the copy, but since r265582 we do, and only allowing
pseudos now is a win. It also simplifies the code.
* combine.c (make_more_copies): Only make an intermediate copy if the
dest of a move is a pseudo.
From-SVN: r266004
|
|
The code with an intermediate register is perfectly fine, but LRA
apparently cannot handle the resulting code, or perhaps something else
is wrong. In either case, making an extra temporary will not likely
help here, so let's just skip it.
PR rtl-optimization/87871
* combine.c (make_more_copies): Skip if dest is frame_pointer_rtx.
From-SVN: r265821
|
|
This rewrites most of make_more_copies, in the process fixing a few PRs
and some other bugs, and working around a few target problems. Certain
notes turn out to actually change the meaning of the RTL, so we cannot
drop them; and i386 takes subregs of hard regs.
PR rtl-optimization/87701
PR rtl-optimization/87780
* combine.c (make_more_copies): Rewrite.
From-SVN: r265582
|
|
Jumps are written in RTL as moves to PC. But the latter has no mode,
so we shouldn't try to use it. Since the optimization this routine
does does not really help for jumps at all, let's just skip it.
PR rtl-optimization/87720
* combine.c (make_more_copies): Skip if the dest is pc_rtx.
From-SVN: r265474
|
|
On most targets every function starts with moves from the parameter
passing (hard) registers into pseudos. Similarly, after every call
there is a move from the return register into a pseudo. These moves
usually combine with later instructions (leaving pretty much the same
instruction, just with a hard reg instead of a pseudo).
This isn't a good idea. Register allocation can get rid of unnecessary
moves just fine, and moving the parameter passing registers into many
later instructions tends to prevent good register allocation. This
patch disallows combining moves from a hard (non-fixed) register.
This also avoid the problem mentioned in PR87600 #c3 (combining hard
registers into inline assembler is problematic).
Because the register move can often be combined with other instructions
*itself*, for example for setting some condition code, this patch adds
extra copies via new pseudos after every copy-from-hard-reg.
On some targets this reduces average code size. On others it increases
it a bit, 0.1% or 0.2% or so. (I tested this on all *-linux targets).
PR rtl-optimization/87600
* combine.c: Add include of expr.h.
(cant_combine_insn_p): Do not combine moves from any hard non-fixed
register to a pseudo.
(make_more_copies): New function, add a copy to a new pseudo after
the moves from hard registers into pseudos.
(rest_of_handle_combine): Declare rebuild_jump_labels_after_combine
later. Call make_more_copies.
From-SVN: r265398
|
|
This code in try_combine uses the wrong mode. This fails (with RTL
checking) in trunk, but not in any released branches.
PR rtl-optimization/86902
* combine.c (try_combine): When changing the CC mode used, don't change
an unrelated mode in other_insn to that new CC mode.
From-SVN: r264426
|
|
PR rtl-optimization/87065
* combine.c (simplify_if_then_else): Formatting fix.
(if_then_else_cond): Guard MULT optimization with SCALAR_INT_MODE_P
check.
(known_cond): Don't return const_true_rtx for vector modes. Use
CONST0_RTX instead of const0_rtx. Formatting fixes.
* gcc.target/i386/pr87065.c: New test.
From-SVN: r263872
|
|
When combine splits a resulting parallel into its two SETs, it has to
place one at i2, and the other stays at i3. This does not work if the
destination of the SET that will be placed at i2 is modified between
i2 and i3. This patch fixes it.
* combine.c (try_combine): Do not allow splitting a resulting PARALLEL
of two SETs into those two SETs, one to be placed at i2, if that SETs
destination is modified between i2 and i3.
From-SVN: r263776
|
|
gcc/
* alias.c (record_set): Check for clobber high.
* cfgexpand.c (expand_gimple_stmt): Likewise.
* combine-stack-adj.c (single_set_for_csa): Likewise.
* combine.c (find_single_use_1): Likewise.
(set_nonzero_bits_and_sign_copies): Likewise.
(get_combine_src_dest): Likewise.
(is_parallel_of_n_reg_sets): Likewise.
(try_combine): Likewise.
(record_dead_and_set_regs_1): Likewise.
(reg_dead_at_p_1): Likewise.
(reg_dead_at_p): Likewise.
* dce.c (deletable_insn_p): Likewise.
(mark_nonreg_stores_1): Likewise.
(mark_nonreg_stores_2): Likewise.
* df-scan.c (df_find_hard_reg_defs): Likewise.
(df_uses_record): Likewise.
(df_get_call_refs): Likewise.
* dwarf2out.c (mem_loc_descriptor): Likewise.
* haifa-sched.c (haifa_classify_rtx): Likewise.
* ira-build.c (create_insn_allocnos): Likewise.
* ira-costs.c (scan_one_insn): Likewise.
* ira.c (equiv_init_movable_p): Likewise.
(rtx_moveable_p): Likewise.
(interesting_dest_for_shprep): Likewise.
* jump.c (mark_jump_label_1): Likewise.
* postreload-gcse.c (record_opr_changes): Likewise.
* postreload.c (reload_cse_simplify): Likewise.
(struct reg_use): Add source expr.
(reload_combine): Check for clobber high.
(reload_combine_note_use): Likewise.
(reload_cse_move2add): Likewise.
(move2add_note_store): Likewise.
* print-rtl.c (print_pattern): Likewise.
* recog.c (decode_asm_operands): Likewise.
(store_data_bypass_p): Likewise.
(if_test_bypass_p): Likewise.
* regcprop.c (kill_clobbered_value): Likewise.
(kill_set_value): Likewise.
* reginfo.c (reg_scan_mark_refs): Likewise.
* reload1.c (maybe_fix_stack_asms): Likewise.
(eliminate_regs_1): Likewise.
(elimination_effects): Likewise.
(mark_not_eliminable): Likewise.
(scan_paradoxical_subregs): Likewise.
(forget_old_reloads_1): Likewise.
* reorg.c (find_end_label): Likewise.
(try_merge_delay_insns): Likewise.
(redundant_insn): Likewise.
(own_thread_p): Likewise.
(fill_simple_delay_slots): Likewise.
(fill_slots_from_thread): Likewise.
(dbr_schedule): Likewise.
* resource.c (update_live_status): Likewise.
(mark_referenced_resources): Likewise.
(mark_set_resources): Likewise.
* rtl.c (copy_rtx): Likewise.
* rtlanal.c (reg_referenced_p): Likewise.
(single_set_2): Likewise.
(noop_move_p): Likewise.
(note_stores): Likewise.
* sched-deps.c (sched_analyze_reg): Likewise.
(sched_analyze_insn): Likewise.
From-SVN: r263331
|
|
This patch allows combine to combine two insns into two. This helps
in many cases, by reducing instruction path length, and also allowing
further combinations to happen. PR85160 is a typical example of code
that it can improve.
This patch does not allow such combinations if either of the original
instructions was a simple move instruction. In those cases combining
the two instructions increases register pressure without improving the
code. With this move test register pressure does no longer increase
noticably as far as I can tell.
(At first I also didn't allow either of the resulting insns to be a
move instruction. But that is actually a very good thing to have, as
should have been obvious).
PR rtl-optimization/85160
* combine.c (is_just_move): New function.
(try_combine): Allow combining two instructions into two if neither of
the original instructions was a move.
From-SVN: r263067
|
|
The current code in reg_nonzero_bits_for_combine allows using the
reg_stat info when last_set_mode is a different integer mode. This is
completely wrong for non-pseudos. For example, as in the PR, a value
in a DImode hard register is set by eight writes to its constituent
QImode parts. The value written to the DImode is not the same as that
written to the lowest-numbered QImode!
PR rtl-optimization/85805
* combine.c (reg_nonzero_bits_for_combine): Only use the last set
value for hard registers if that was written in the same mode.
From-SVN: r262994
|
|
This patch generalises various places that used hwi rtx accessors so
that they can handle poly_ints instead. In many cases these changes
are by inspection rather than because something had shown them to be
necessary.
2018-06-12 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* poly-int.h (can_div_trunc_p): Add new overload in which all values
are poly_ints.
* alias.c (get_addr): Extend CONST_INT handling to poly_int_rtx_p.
(memrefs_conflict_p): Likewise.
(init_alias_analysis): Likewise.
* cfgexpand.c (expand_debug_expr): Likewise.
* combine.c (combine_simplify_rtx, force_int_to_mode): Likewise.
* cse.c (fold_rtx): Likewise.
* explow.c (adjust_stack, anti_adjust_stack): Likewise.
* expr.c (emit_block_move_hints): Likewise.
(clear_storage_hints, push_block, emit_push_insn): Likewise.
(store_expr_with_bounds, reduce_to_bit_field_precision): Likewise.
(emit_group_load_1): Use rtx_to_poly_int64 for group offsets.
(emit_group_store): Likewise.
(find_args_size_adjust): Use strip_offset. Use rtx_to_poly_int64
to read the PRE/POST_MODIFY increment.
* calls.c (store_one_arg): Use strip_offset.
* rtlanal.c (rtx_addr_can_trap_p_1): Extend CONST_INT handling to
poly_int_rtx_p.
(set_noop_p): Use rtx_to_poly_int64 for the elements selected
by a VEC_SELECT.
* simplify-rtx.c (avoid_constant_pool_reference): Use strip_offset.
(simplify_binary_operation_1): Extend CONST_INT handling to
poly_int_rtx_p.
* var-tracking.c (compute_cfa_pointer): Take a poly_int64 rather
than a HOST_WIDE_INT.
(hard_frame_pointer_adjustment): Change from HOST_WIDE_INT to
poly_int64.
(adjust_mems, add_stores): Update accodingly.
(vt_canonicalize_addr): Track polynomial offsets.
(emit_note_insn_var_location): Likewise.
(vt_add_function_parameter): Likewise.
(vt_initialize): Likewise.
From-SVN: r261530
|
|
simplify-rtx.c:895)
PR rtl-optimization/85300
* combine.c (subst): Handle subst of CONST_SCALAR_INT_P new_rtx also
into FLOAT and UNSIGNED_FLOAT like ZERO_EXTEND, return a CLOBBER if
simplify_unary_operation fails.
* gcc.dg/pr85300.c: New test.
From-SVN: r259285
|
|
distribute_links tries to place a log_link for whatever the destination
of the modified instruction is. It shouldn't do that when that dest
is pc_rtx, which isn't actually a register.
* combine.c (distribute_links): Don't make a link based on pc_rtx.
From-SVN: r258523
|
|
There still are situations where we have stale LOG_LINKS. This causes
combine to try two-insn combinations I2->I3 where the register set by
I2 is used before I3 as well. Not good.
This patch fixes it by checking for this situation in can_combine_p
(similar to what we already do for three and four insn combinations).
From-SVN: r258452
|
|
rhs_regno, at rtl.h:1896 with -O -fno-forward-propagate)
PR target/84710
* combine.c (try_combine): Use reg_or_subregno instead of handling
just paradoxical SUBREGs and REGs.
* gcc.dg/pr84710.c: New test.
From-SVN: r258301
|
|
PR target/84700
* combine.c (combine_simplify_rtx): Don't try to simplify if
if_then_else_cond returned non-NULL, but either true_rtx or false_rtx
are equal to x.
* gcc.target/powerpc/pr84700.c: New test.
From-SVN: r258263
|
|
-fno-split-wide-types -g3 --param=max-combine-insns=3)
PR target/84614
* rtl.h (prev_real_nondebug_insn, next_real_nondebug_insn): New
prototypes.
* emit-rtl.c (next_real_insn, prev_real_insn): Fix up function
comments.
(next_real_nondebug_insn, prev_real_nondebug_insn): New functions.
* cfgcleanup.c (try_head_merge_bb): Use prev_real_nondebug_insn
instead of a loop around prev_real_insn.
* combine.c (move_deaths): Use prev_real_nondebug_insn instead of
prev_real_insn.
* gcc.dg/pr84614.c: New test.
From-SVN: r258129
|
|
As Jakub found, after my recent combine patch at least on x86 problems
show up with RTL checking enabled. This is because the I2 generated
by a successful instruction combination can write not only a register
but it can also write a paradoxical subreg of one.
This fixes it.
* combine.c (try_combine): When adjusting LOG_LINKS for the destination
that moved to I2, also allow destinations that are a paradoxical
subreg (instead of a normal reg).
From-SVN: r257736
|
|
If there is a LOG_LINK between two insns, this means those two insns
can be combined, as far as dataflow is concerned. There never should
be a LOG_LINK between two unrelated insns. If there is one, combine
will try to combine the insns without doing all the needed checks if
the earlier destination is used before the later insn, etc.
Unfortunately we do not update the LOG_LINKs correctly in some cases.
This patch fixes at least some of those cases.
PR rtl-optimization/84169
* combine.c (try_combine): New variable split_i2i3. Set it to true if
we generated a parallel as new i3 and we split that to new i2 and i3
instructions. Handle split_i2i3 similar to swap_i2i3: scan the
LOG_LINKs of i3 to see which of those need to link to i2 now. Link
those to i2, not i1. Partially rewrite this scan code.
gcc/testsuite/
PR rtl-optimization/84169
* gcc.c-torture/execute/pr84169.c: New.
From-SVN: r257644
|
|
have 'lshiftrt')
PR rtl-optimization/84157
* combine.c (change_zero_ext): Use REG_P predicate in
front of HARD_REGISTER_P predicate.
From-SVN: r257302
|
|
emit-rtl.c:908, alpha linux.)
PR rtl-optimization/84123
* combine.c (change_zero_ext): Check if hard register satisfies
can_change_dest_mode before calling gen_lowpart_SUBREG.
From-SVN: r257270
|
|
sign-extended load)
PR rtl-optimization/84071
* combine.c (record_dead_and_set_regs_1): Record the source unmodified
for a paradoxical SUBREG on a WORD_REGISTER_OPERATIONS target.
From-SVN: r257224
|
|
on alpha)
PR target/83628
* combine.c (force_int_to_mode) <case ASHIFT>: Use mode instead of
op_mode in the force_to_mode call.
From-SVN: r256387
|
|
This patch changes GET_MODE_SIZE from unsigned short to poly_uint16.
The non-mechanical parts were handled by previous patches.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* machmode.h (mode_size): Change from unsigned short to
poly_uint16_pod.
(mode_to_bytes): Return a poly_uint16 rather than an unsigned short.
(GET_MODE_SIZE): Return a constant if ONLY_FIXED_SIZE_MODES,
or if measurement_type is not polynomial.
(fixed_size_mode::includes_p): Check for constant-sized modes.
* genmodes.c (emit_mode_size_inline): Make mode_size_inline
return a poly_uint16 rather than an unsigned short.
(emit_mode_size): Change the type of mode_size from unsigned short
to poly_uint16_pod. Use ZERO_COEFFS for the initializer.
(emit_mode_adjustments): Cope with polynomial vector sizes.
* lto-streamer-in.c (lto_input_mode_table): Use bp_unpack_poly_value
for GET_MODE_SIZE.
* lto-streamer-out.c (lto_write_mode_table): Use bp_pack_poly_value
for GET_MODE_SIZE.
* auto-inc-dec.c (try_merge): Treat GET_MODE_SIZE as polynomial.
* builtins.c (expand_ifn_atomic_compare_exchange_into_call): Likewise.
* caller-save.c (setup_save_areas): Likewise.
(replace_reg_with_saved_mem): Likewise.
* calls.c (emit_library_call_value_1): Likewise.
* combine-stack-adj.c (combine_stack_adjustments_for_block): Likewise.
* combine.c (simplify_set, make_extraction, simplify_shift_const_1)
(gen_lowpart_for_combine): Likewise.
* convert.c (convert_to_integer_1): Likewise.
* cse.c (equiv_constant, cse_insn): Likewise.
* cselib.c (autoinc_split, cselib_hash_rtx): Likewise.
(cselib_subst_to_values): Likewise.
* dce.c (word_dce_process_block): Likewise.
* df-problems.c (df_word_lr_mark_ref): Likewise.
* dwarf2cfi.c (init_one_dwarf_reg_size): Likewise.
* dwarf2out.c (multiple_reg_loc_descriptor, mem_loc_descriptor)
(concat_loc_descriptor, concatn_loc_descriptor, loc_descriptor)
(rtl_for_decl_location): Likewise.
* emit-rtl.c (gen_highpart, widen_memory_access): Likewise.
* expmed.c (extract_bit_field_1, extract_integral_bit_field): Likewise.
* expr.c (emit_group_load_1, clear_storage_hints): Likewise.
(emit_move_complex, emit_move_multi_word, emit_push_insn): Likewise.
(expand_expr_real_1): Likewise.
* function.c (assign_parm_setup_block_p, assign_parm_setup_block)
(pad_below): Likewise.
* gimple-fold.c (optimize_atomic_compare_exchange_p): Likewise.
* gimple-ssa-store-merging.c (rhs_valid_for_store_merging_p): Likewise.
* ira.c (get_subreg_tracking_sizes): Likewise.
* ira-build.c (ira_create_allocno_objects): Likewise.
* ira-color.c (coalesced_pseudo_reg_slot_compare): Likewise.
(ira_sort_regnos_for_alter_reg): Likewise.
* ira-costs.c (record_operand_costs): Likewise.
* lower-subreg.c (interesting_mode_p, simplify_gen_subreg_concatn)
(resolve_simple_move): Likewise.
* lra-constraints.c (get_reload_reg, operands_match_p): Likewise.
(process_addr_reg, simplify_operand_subreg, curr_insn_transform)
(lra_constraints): Likewise.
(CONST_POOL_OK_P): Reject variable-sized modes.
* lra-spills.c (slot, assign_mem_slot, pseudo_reg_slot_compare)
(add_pseudo_to_slot, lra_spill): Likewise.
* omp-low.c (omp_clause_aligned_alignment): Likewise.
* optabs-query.c (get_best_extraction_insn): Likewise.
* optabs-tree.c (expand_vec_cond_expr_p): Likewise.
* optabs.c (expand_vec_perm_var, expand_vec_cond_expr): Likewise.
(expand_mult_highpart, valid_multiword_target_p): Likewise.
* recog.c (offsettable_address_addr_space_p): Likewise.
* regcprop.c (maybe_mode_change): Likewise.
* reginfo.c (choose_hard_reg_mode, record_subregs_of_mode): Likewise.
* regrename.c (build_def_use): Likewise.
* regstat.c (dump_reg_info): Likewise.
* reload.c (complex_word_subreg_p, push_reload, find_dummy_reload)
(find_reloads, find_reloads_subreg_address): Likewise.
* reload1.c (eliminate_regs_1): Likewise.
* rtlanal.c (for_each_inc_dec_find_inc_dec, rtx_cost): Likewise.
* simplify-rtx.c (avoid_constant_pool_reference): Likewise.
(simplify_binary_operation_1, simplify_subreg): Likewise.
* targhooks.c (default_function_arg_padding): Likewise.
(default_hard_regno_nregs, default_class_max_nregs): Likewise.
* tree-cfg.c (verify_gimple_assign_binary): Likewise.
(verify_gimple_assign_ternary): Likewise.
* tree-inline.c (estimate_move_cost): Likewise.
* tree-ssa-forwprop.c (simplify_vector_constructor): Likewise.
* tree-ssa-loop-ivopts.c (add_autoinc_candidates): Likewise.
(get_address_cost_ainc): Likewise.
* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Likewise.
(vect_supportable_dr_alignment): Likewise.
* tree-vect-loop.c (vect_determine_vectorization_factor): Likewise.
(vectorizable_reduction): Likewise.
* tree-vect-stmts.c (vectorizable_assignment, vectorizable_shift)
(vectorizable_operation, vectorizable_load): Likewise.
* tree.c (build_same_sized_truth_vector_type): Likewise.
* valtrack.c (cleanup_auto_inc_dec): Likewise.
* var-tracking.c (emit_note_insn_var_location): Likewise.
* config/arc/arc.h (ASM_OUTPUT_CASE_END): Use as_a <scalar_int_mode>.
(ADDR_VEC_ALIGN): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256201
|
|
This patch changes GET_MODE_BITSIZE from an unsigned short
to a poly_uint16.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* machmode.h (mode_to_bits): Return a poly_uint16 rather than an
unsigned short.
(GET_MODE_BITSIZE): Return a constant if ONLY_FIXED_SIZE_MODES,
or if measurement_type is polynomial.
* calls.c (shift_return_value): Treat GET_MODE_BITSIZE as polynomial.
* combine.c (make_extraction): Likewise.
* dse.c (find_shift_sequence): Likewise.
* dwarf2out.c (mem_loc_descriptor): Likewise.
* expmed.c (store_integral_bit_field, extract_bit_field_1): Likewise.
(extract_bit_field, extract_low_bits): Likewise.
* expr.c (convert_move, convert_modes, emit_move_insn_1): Likewise.
(optimize_bitfield_assignment_op, expand_assignment): Likewise.
(store_expr_with_bounds, store_field, expand_expr_real_1): Likewise.
* fold-const.c (optimize_bit_field_compare, merge_ranges): Likewise.
* gimple-fold.c (optimize_atomic_compare_exchange_p): Likewise.
* reload.c (find_reloads): Likewise.
* reload1.c (alter_reg): Likewise.
* stor-layout.c (bitwise_mode_for_mode, compute_record_mode): Likewise.
* targhooks.c (default_secondary_memory_needed_mode): Likewise.
* tree-if-conv.c (predicate_mem_writes): Likewise.
* tree-ssa-strlen.c (handle_builtin_memcmp): Likewise.
* tree-vect-patterns.c (adjust_bool_pattern): Likewise.
* tree-vect-stmts.c (vectorizable_simd_clone_call): Likewise.
* valtrack.c (dead_debug_insert_temp): Likewise.
* varasm.c (mergeable_constant_section): Likewise.
* config/sh/sh.h (LOCAL_ALIGNMENT): Use as_a <fixed_size_mode>.
gcc/ada/
* gcc-interface/misc.c (enumerate_modes): Treat GET_MODE_BITSIZE
as polynomial.
gcc/c-family/
* c-ubsan.c (ubsan_instrument_shift): Treat GET_MODE_BITSIZE
as polynomial.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256200
|
|
This patch changes GET_MODE_PRECISION from an unsigned short
to a poly_uint16.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* machmode.h (mode_precision): Change from unsigned short to
poly_uint16_pod.
(mode_to_precision): Return a poly_uint16 rather than an unsigned
short.
(GET_MODE_PRECISION): Return a constant if ONLY_FIXED_SIZE_MODES,
or if measurement_type is not polynomial.
(HWI_COMPUTABLE_MODE_P): Turn into a function. Optimize the case
in which the mode is already known to be a scalar_int_mode.
* genmodes.c (emit_mode_precision): Change the type of mode_precision
from unsigned short to poly_uint16_pod. Use ZERO_COEFFS for the
initializer.
* lto-streamer-in.c (lto_input_mode_table): Use bp_unpack_poly_value
for GET_MODE_PRECISION.
* lto-streamer-out.c (lto_write_mode_table): Use bp_pack_poly_value
for GET_MODE_PRECISION.
* combine.c (update_rsp_from_reg_equal): Treat GET_MODE_PRECISION
as polynomial.
(try_combine, find_split_point, combine_simplify_rtx): Likewise.
(expand_field_assignment, make_extraction): Likewise.
(make_compound_operation_int, record_dead_and_set_regs_1): Likewise.
(get_last_value): Likewise.
* convert.c (convert_to_integer_1): Likewise.
* cse.c (cse_insn): Likewise.
* expr.c (expand_expr_real_1): Likewise.
* lra-constraints.c (simplify_operand_subreg): Likewise.
* optabs-query.c (can_atomic_load_p): Likewise.
* optabs.c (expand_atomic_load): Likewise.
(expand_atomic_store): Likewise.
* ree.c (combine_reaching_defs): Likewise.
* rtl.h (partial_subreg_p, paradoxical_subreg_p): Likewise.
* rtlanal.c (nonzero_bits1, lsb_bitfield_op_p): Likewise.
* tree.h (type_has_mode_precision_p): Likewise.
* ubsan.c (instrument_si_overflow): Likewise.
gcc/ada/
* gcc-interface/misc.c (enumerate_modes): Treat GET_MODE_PRECISION
as polynomial.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256198
|
|
From-SVN: r256169
|
|
This patch makes target-independent code that uses REGMODE_NATURAL_SIZE
treat it as a poly_int rather than a constant.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* combine.c (can_change_dest_mode): Handle polynomial
REGMODE_NATURAL_SIZE.
* expmed.c (store_bit_field_1): Likewise.
* expr.c (store_constructor): Likewise.
* emit-rtl.c (validate_subreg): Operate on polynomial mode sizes
and polynomial REGMODE_NATURAL_SIZE.
(gen_lowpart_common): Likewise.
* reginfo.c (record_subregs_of_mode): Likewise.
* rtlanal.c (read_modify_subreg_p): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256149
|
|
scalar int mode
gcc/
* combine.c (simplify_set): Do not transform subregs to zero_extends
if the destination is not a scalar int mode.
From-SVN: r255945
|
|
This patch adds new utility functions for manipulating REG_ARGS_SIZE
notes and allows the notes to carry polynomial as well as constant sizes.
The code was inconsistent about whether INT_MIN or HOST_WIDE_INT_MIN
should be used to represent an unknown size. The patch uses
HOST_WIDE_INT_MIN throughout.
2017-12-21 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* rtl.h (get_args_size, add_args_size_note): New functions.
(find_args_size_adjust): Return a poly_int64 rather than a
HOST_WIDE_INT.
(fixup_args_size_notes): Likewise. Make the same change to the
end_args_size parameter.
* rtlanal.c (get_args_size, add_args_size_note): New functions.
* builtins.c (expand_builtin_trap): Use add_args_size_note.
* calls.c (emit_call_1): Likewise.
* explow.c (adjust_stack_1): Likewise.
* cfgcleanup.c (old_insns_match_p): Update use of
find_args_size_adjust.
* combine.c (distribute_notes): Track polynomial arg sizes.
* dwarf2cfi.c (dw_trace_info): Change beg_true_args_size,
end_true_args_size, beg_delay_args_size and end_delay_args_size
from HOST_WIDE_INT to poly_int64.
(add_cfi_args_size): Take the args_size as a poly_int64 rather
than a HOST_WIDE_INT.
(notice_args_size, notice_eh_throw, maybe_record_trace_start)
(maybe_record_trace_start_abnormal, scan_trace, connect_traces): Track
polynomial arg sizes.
* emit-rtl.c (try_split): Use get_args_size.
* recog.c (peep2_attempt): Likewise.
* reload1.c (reload_as_needed): Likewise.
* expr.c (find_args_size_adjust): Return the adjustment as a
poly_int64 rather than a HOST_WIDE_INT.
(fixup_args_size_notes): Change end_args_size from a HOST_WIDE_INT
to a poly_int64 and change the return type in the same way.
(emit_single_push_insn): Track polynomial arg sizes.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r255919
|
|
This patch changes SUBREG_BYTE from an int to a poly_int.
Since valid SUBREG_BYTEs must be contained within the mode of the
SUBREG_REG, the required range is the same as for GET_MODE_SIZE,
i.e. unsigned short. The patch therefore uses poly_uint16(_pod)
for the SUBREG_BYTE.
Using poly_uint16_pod rtx fields requires a new field code ('p').
Since there are no other uses of 'p' besides SUBREG_BYTE, the patch
doesn't add an XPOLY or whatever; all uses should go via SUBREG_BYTE
instead.
The patch doesn't bother implementing 'p' support for legacy
define_peepholes, since none of the remaining ones have subregs
in their patterns.
As it happened, the rtl documentation used SUBREG as an example of a
code with mixed field types, accessed via XEXP (x, 0) and XINT (x, 1).
Since there's no direct replacement for XINT, and since people should
never use it even if there were, the patch changes the example to use
INT_LIST instead.
The patch also changes subreg-related helper functions so that they too
take and return polynomial offsets. This makes the patch quite big, but
it's mostly mechanical. The patch generally sticks to existing choices
wrt signedness.
2017-12-20 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/rtl.texi: Update documentation of SUBREG_BYTE. Document the
'p' format code. Use INT_LIST rather than SUBREG as the example of
a code with an XINT and an XEXP. Remove the implication that
accessing an rtx field using XINT is expected to work.
* rtl.def (SUBREG): Change format from "ei" to "ep".
* rtl.h (rtunion::rt_subreg): New field.
(XCSUBREG): New macro.
(SUBREG_BYTE): Use it.
(subreg_shape): Change offset from an unsigned int to a poly_uint16.
Update constructor accordingly.
(subreg_shape::operator ==): Update accordingly.
(subreg_shape::unique_id): Return an unsigned HOST_WIDE_INT rather
than an unsigned int.
(subreg_lsb, subreg_lowpart_offset, subreg_highpart_offset): Return
a poly_uint64 rather than an unsigned int.
(subreg_lsb_1): Likewise. Take the offset as a poly_uint64 rather
than an unsigned int.
(subreg_size_offset_from_lsb, subreg_size_lowpart_offset)
(subreg_size_highpart_offset): Return a poly_uint64 rather than
an unsigned int. Take the sizes as poly_uint64s.
(subreg_offset_from_lsb): Return a poly_uint64 rather than
an unsigned int. Take the shift as a poly_uint64 rather than
an unsigned int.
(subreg_regno_offset, subreg_offset_representable_p): Take the offset
as a poly_uint64 rather than an unsigned int.
(simplify_subreg_regno): Likewise.
(byte_lowpart_offset): Return the memory offset as a poly_int64
rather than an int.
(subreg_memory_offset): Likewise. Take the subreg offset as a
poly_uint64 rather than an unsigned int.
(simplify_subreg, simplify_gen_subreg, subreg_get_info)
(gen_rtx_SUBREG, validate_subreg): Take the subreg offset as a
poly_uint64 rather than an unsigned int.
* rtl.c (rtx_format): Describe 'p' in comment.
(copy_rtx, rtx_equal_p_cb, rtx_equal_p): Handle 'p'.
* emit-rtl.c (validate_subreg, gen_rtx_SUBREG): Take the subreg
offset as a poly_uint64 rather than an unsigned int.
(byte_lowpart_offset): Return the memory offset as a poly_int64
rather than an int.
(subreg_memory_offset): Likewise. Take the subreg offset as a
poly_uint64 rather than an unsigned int.
(subreg_size_lowpart_offset, subreg_size_highpart_offset): Take the
mode sizes as poly_uint64s rather than unsigned ints. Return a
poly_uint64 rather than an unsigned int.
(subreg_lowpart_p): Treat subreg offsets as poly_ints.
(copy_insn_1): Handle 'p'.
* rtlanal.c (set_noop_p): Treat subregs offsets as poly_uint64s.
(subreg_lsb_1): Take the subreg offset as a poly_uint64 rather than
an unsigned int. Return the shift in the same way.
(subreg_lsb): Return the shift as a poly_uint64 rather than an
unsigned int.
(subreg_size_offset_from_lsb): Take the sizes and shift as
poly_uint64s rather than unsigned ints. Return the offset as
a poly_uint64.
(subreg_get_info, subreg_regno_offset, subreg_offset_representable_p)
(simplify_subreg_regno): Take the offset as a poly_uint64 rather than
an unsigned int.
* rtlhash.c (add_rtx): Handle 'p'.
* genemit.c (gen_exp): Likewise.
* gengenrtl.c (type_from_format, gendef): Likewise.
* gensupport.c (subst_pattern_match, get_alternatives_number)
(collect_insn_data, alter_predicate_for_insn, alter_constraints)
(subst_dup): Likewise.
* gengtype.c (adjust_field_rtx_def): Likewise.
* genrecog.c (find_operand, find_matching_operand, validate_pattern)
(match_pattern_2): Likewise.
(rtx_test::SUBREG_FIELD): New rtx_test::kind_enum.
(rtx_test::subreg_field): New function.
(operator ==, safe_to_hoist_p, transition_parameter_type)
(print_nonbool_test, print_test): Handle SUBREG_FIELD.
* genattrtab.c (attr_rtx_1): Say that 'p' is deliberately not handled.
* genpeep.c (match_rtx): Likewise.
* print-rtl.c (print_poly_int): Include if GENERATOR_FILE too.
(rtx_writer::print_rtx_operand): Handle 'p'.
(print_value): Handle SUBREG.
* read-rtl.c (apply_int_iterator): Likewise.
(rtx_reader::read_rtx_operand): Handle 'p'.
* alias.c (rtx_equal_for_memref_p): Likewise.
* cselib.c (rtx_equal_for_cselib_1, cselib_hash_rtx): Likewise.
* caller-save.c (replace_reg_with_saved_mem): Treat subreg offsets
as poly_ints.
* calls.c (expand_call): Likewise.
* combine.c (combine_simplify_rtx, expand_field_assignment): Likewise.
(make_extraction, gen_lowpart_for_combine): Likewise.
* loop-invariant.c (hash_invariant_expr_1, invariant_expr_equal_p):
Likewise.
* cse.c (remove_invalid_subreg_refs): Take the offset as a poly_uint64
rather than an unsigned int. Treat subreg offsets as poly_ints.
(exp_equiv_p): Handle 'p'.
(hash_rtx_cb): Likewise. Treat subreg offsets as poly_ints.
(equiv_constant, cse_insn): Treat subreg offsets as poly_ints.
* dse.c (find_shift_sequence): Likewise.
* dwarf2out.c (rtl_for_decl_location): Likewise.
* expmed.c (extract_low_bits): Likewise.
* expr.c (emit_group_store, undefined_operand_subword_p): Likewise.
(expand_expr_real_2): Likewise.
* final.c (alter_subreg): Likewise.
(leaf_renumber_regs_insn): Handle 'p'.
* function.c (assign_parm_find_stack_rtl, assign_parm_setup_stack):
Treat subreg offsets as poly_ints.
* fwprop.c (forward_propagate_and_simplify): Likewise.
* ifcvt.c (noce_emit_move_insn, noce_emit_cmove): Likewise.
* ira.c (get_subreg_tracking_sizes): Likewise.
* ira-conflicts.c (go_through_subreg): Likewise.
* ira-lives.c (process_single_reg_class_operands): Likewise.
* jump.c (rtx_renumbered_equal_p): Likewise. Handle 'p'.
* lower-subreg.c (simplify_subreg_concatn): Take the subreg offset
as a poly_uint64 rather than an unsigned int.
(simplify_gen_subreg_concatn, resolve_simple_move): Treat
subreg offsets as poly_ints.
* lra-constraints.c (operands_match_p): Handle 'p'.
(match_reload, curr_insn_transform): Treat subreg offsets as poly_ints.
* lra-spills.c (assign_mem_slot): Likewise.
* postreload.c (move2add_valid_value_p): Likewise.
* recog.c (general_operand, indirect_operand): Likewise.
* regcprop.c (copy_value, maybe_mode_change): Likewise.
(copyprop_hardreg_forward_1): Likewise.
* reginfo.c (simplifiable_subregs_hasher::hash, simplifiable_subregs)
(record_subregs_of_mode): Likewise.
* rtlhooks.c (gen_lowpart_general, gen_lowpart_if_possible): Likewise.
* reload.c (operands_match_p): Handle 'p'.
(find_reloads_subreg_address): Treat subreg offsets as poly_ints.
* reload1.c (alter_reg, choose_reload_regs): Likewise.
(compute_reload_subreg_offset): Likewise, and return an poly_int64.
* simplify-rtx.c (simplify_truncation, simplify_binary_operation_1):
(test_vector_ops_duplicate): Treat subreg offsets as poly_ints.
(simplify_const_poly_int_tests<N>::run): Likewise.
(simplify_subreg, simplify_gen_subreg): Take the subreg offset as
a poly_uint64 rather than an unsigned int.
* valtrack.c (debug_lowpart_subreg): Likewise.
* var-tracking.c (var_lowpart): Likewise.
(loc_cmp): Handle 'p'.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r255882
|
|
This patch adds a helper routine that constructs rtxes
for constant shift amounts, given the mode of the value
being shifted. As well as helping with the SVE patches, this
is one step towards allowing CONST_INTs to have a real mode.
One long-standing problem has been to decide what the mode
of a shift count should be for arbitrary rtxes (as opposed to those
directly tied to a target pattern). Realistic choices would be
the mode of the shifted elements, word_mode, QImode, a 64-bit mode,
or the same mode as the shift optabs (in which case what should the
mode be when the target doesn't have a pattern?)
For now the patch picks a 64-bit mode, but with a ??? comment.
2017-12-20 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* emit-rtl.h (gen_int_shift_amount): Declare.
* emit-rtl.c (gen_int_shift_amount): New function.
* asan.c (asan_emit_stack_protection): Use gen_int_shift_amount
instead of GEN_INT.
* calls.c (shift_return_value): Likewise.
* cse.c (fold_rtx): Likewise.
* dse.c (find_shift_sequence): Likewise.
* expmed.c (init_expmed_one_mode, store_bit_field_1, expand_shift_1)
(expand_shift, expand_smod_pow2): Likewise.
* lower-subreg.c (shift_cost): Likewise.
* optabs.c (expand_superword_shift, expand_doubleword_mult)
(expand_unop, expand_binop, shift_amt_for_vec_perm_mask)
(expand_vec_perm_var): Likewise.
* simplify-rtx.c (simplify_unary_operation_1): Likewise.
(simplify_binary_operation_1): Likewise.
* combine.c (try_combine, find_split_point, force_int_to_mode)
(simplify_shift_const_1, simplify_shift_const): Likewise.
(change_zero_ext): Likewise. Use simplify_gen_binary.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r255861
|
|
conditions.
* read-rtl.c (parse_reg_note_name): Replace Yoda conditions with
typical order conditions.
* sel-sched.c (extract_new_fences_from): Likewise.
* config/visium/constraints.md (J, K, L): Likewise.
* config/visium/predicates.md (const_shift_operand): Likewise.
* config/visium/visium.c (visium_legitimize_address,
visium_legitimize_reload_address): Likewise.
* config/m68k/m68k.c (output_reg_adjust, emit_reg_adjust): Likewise.
* config/arm/arm.c (arm_block_move_unaligned_straight): Likewise.
* config/avr/constraints.md (Y01, Ym1, Y02, Ym2): Likewise.
* config/avr/avr-log.c (avr_vdump, avr_log_set_avr_log,
SET_DUMP_DETAIL): Likewise.
* config/avr/predicates.md (const_8_16_24_operand): Likewise.
* config/avr/avr.c (STR_PREFIX_P, avr_popcount_each_byte,
avr_is_casesi_sequence, avr_casei_sequence_check_operands,
avr_set_core_architecture, avr_set_current_function,
avr_legitimize_reload_address, avr_asm_len, avr_print_operand,
output_movqi, output_movsisf, avr_out_plus, avr_out_bitop,
avr_out_fract, avr_adjust_insn_length, avr_encode_section_info,
avr_2word_insn_p, output_reload_in_const, avr_has_nibble_0xf,
avr_map_decompose, avr_fold_builtin): Likewise.
* config/avr/driver-avr.c (avr_devicespecs_file): Likewise.
* config/avr/gen-avr-mmcu-specs.c (str_prefix_p, print_mcu): Likewise.
* config/i386/i386.c (ix86_parse_stringop_strategy_string): Likewise.
* config/m32c/m32c-pragma.c (m32c_pragma_memregs): Likewise.
* config/m32c/m32c.c (m32c_conditional_register_usage,
m32c_address_cost): Likewise.
* config/m32c/predicates.md (shiftcount_operand,
longshiftcount_operand): Likewise.
* config/iq2000/iq2000.c (iq2000_expand_prologue): Likewise.
* config/nios2/nios2.c (nios2_handle_custom_fpu_insn_option,
can_use_cdx_ldstw): Likewise.
* config/nios2/nios2.h (CDX_REG_P): Likewise.
* config/cr16/cr16.h (RETURN_ADDR_RTX, REGNO_MODE_OK_FOR_BASE_P):
Likewise.
* config/cr16/cr16.md (*mov<mode>_double): Likewise.
* config/cr16/cr16.c (cr16_create_dwarf_for_multi_push): Likewise.
* config/h8300/h8300.c (h8300_rtx_costs, get_shift_alg): Likewise.
* config/vax/constraints.md (U06, U08, U16, CN6, S08, S16): Likewise.
* config/vax/vax.c (adjacent_operands_p): Likewise.
* config/ft32/constraints.md (L, b, KA): Likewise.
* config/ft32/ft32.c (ft32_load_immediate, ft32_expand_prologue):
Likewise.
* cfgexpand.c (expand_stack_alignment): Likewise.
* gcse.c (insert_expr_in_table): Likewise.
* print-rtl.c (rtx_writer::print_rtx_operand_codes_E_and_V): Likewise.
* cgraphunit.c (cgraph_node::expand): Likewise.
* ira-build.c (setup_min_max_allocno_live_range_point): Likewise.
* emit-rtl.c (add_insn): Likewise.
* input.c (dump_location_info): Likewise.
* passes.c (NEXT_PASS): Likewise.
* read-rtl-function.c (parse_note_insn_name,
function_reader::read_rtx_operand_r, function_reader::parse_mem_expr):
Likewise.
* sched-rgn.c (sched_rgn_init): Likewise.
* diagnostic-show-locus.c (layout::show_ruler): Likewise.
* combine.c (find_split_point, simplify_if_then_else, force_to_mode,
if_then_else_cond, simplify_shift_const_1, simplify_comparison): Likewise.
* explow.c (eliminate_constant_term): Likewise.
* final.c (leaf_renumber_regs_insn): Likewise.
* cfgrtl.c (print_rtl_with_bb): Likewise.
* genhooks.c (emit_init_macros): Likewise.
* poly-int.h (maybe_ne, maybe_le, maybe_lt): Likewise.
* tree-data-ref.c (conflict_fn): Likewise.
* selftest.c (assert_streq): Likewise.
* expr.c (store_constructor_field, expand_expr_real_1): Likewise.
* fold-const.c (fold_range_test, extract_muldiv_1, fold_truth_andor,
fold_binary_loc, multiple_of_p): Likewise.
* reload.c (push_reload, find_equiv_reg): Likewise.
* et-forest.c (et_nca, et_below): Likewise.
* dbxout.c (dbxout_symbol_location): Likewise.
* reorg.c (relax_delay_slots): Likewise.
* dojump.c (do_compare_rtx_and_jump): Likewise.
* gengtype-parse.c (type): Likewise.
* simplify-rtx.c (simplify_gen_ternary, simplify_gen_relational,
simplify_const_relational_operation): Likewise.
* reload1.c (do_output_reload): Likewise.
* dumpfile.c (get_dump_file_info_by_switch): Likewise.
* gengtype.c (type_for_name): Likewise.
* gimple-ssa-sprintf.c (format_directive): Likewise.
ada/
* gcc-interface/trans.c (Loop_Statement_to_gnu): Replace Yoda
conditions with typical order conditions.
* gcc-interface/misc.c (gnat_get_array_descr_info,
default_pass_by_ref): Likewise.
* gcc-interface/decl.c (gnat_to_gnu_entity): Likewise.
* adaint.c (__gnat_tmp_name): Likewise.
c-family/
* known-headers.cc (get_stdlib_header_for_name): Replace Yoda
conditions with typical order conditions.
c/
* c-typeck.c (comptypes_internal, function_types_compatible_p,
perform_integral_promotions, digest_init): Replace Yoda conditions
with typical order conditions.
* c-decl.c (check_bitfield_type_and_width): Likewise.
cp/
* name-lookup.c (get_std_name_hint): Replace Yoda conditions with
typical order conditions.
* class.c (check_bitfield_decl): Likewise.
* pt.c (convert_template_argument): Likewise.
* decl.c (duplicate_decls): Likewise.
* typeck.c (commonparms): Likewise.
fortran/
* scanner.c (preprocessor_line): Replace Yoda conditions with typical
order conditions.
* dependency.c (check_section_vs_section): Likewise.
* trans-array.c (gfc_conv_expr_descriptor): Likewise.
jit/
* jit-playback.c (get_type, playback::compile_to_file::copy_file,
playback::context::acquire_mutex): Replace Yoda conditions with
typical order conditions.
* libgccjit.c (gcc_jit_context_new_struct_type,
gcc_jit_struct_set_fields, gcc_jit_context_new_union_type,
gcc_jit_context_new_function, gcc_jit_timer_pop): Likewise.
* jit-builtins.c (matches_builtin): Likewise.
* jit-recording.c (recording::compound_type::set_fields,
recording::fields::write_reproducer, recording::rvalue::set_scope,
recording::function::validate): Likewise.
* jit-logging.c (logger::decref): Likewise.
From-SVN: r255831
|
|
From-SVN: r255746
|
|
This patch adds a helper routine that constructs rtxes
for constant shift amounts, given the mode of the value
being shifted. As well as helping with the SVE patches, this
is one step towards allowing CONST_INTs to have a real mode.
One long-standing problem has been to decide what the mode
of a shift count should be for arbitrary rtxes (as opposed to those
directly tied to a target pattern). Realistic choices would be
the mode of the shifted elements, word_mode, QImode, or the same
mode as the shift optabs (in which case what should the mode
be when the target doesn't have a pattern?)
For now the patch picks the mode of the shifted elements,
but with a ??? comment.
2017-11-06 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* emit-rtl.h (gen_int_shift_amount): Declare.
* emit-rtl.c (gen_int_shift_amount): New function.
* asan.c (asan_emit_stack_protection): Use gen_int_shift_amount
instead of GEN_INT.
* calls.c (shift_return_value): Likewise.
* cse.c (fold_rtx): Likewise.
* dse.c (find_shift_sequence): Likewise.
* expmed.c (init_expmed_one_mode, store_bit_field_1, expand_shift_1)
(expand_shift, expand_smod_pow2): Likewise.
* lower-subreg.c (shift_cost): Likewise.
* optabs.c (expand_superword_shift, expand_doubleword_mult)
(expand_unop, expand_binop, shift_amt_for_vec_perm_mask)
(expand_vec_perm_var): Likewise.
* simplify-rtx.c (simplify_unary_operation_1): Likewise.
(simplify_binary_operation_1): Likewise.
* combine.c (try_combine, find_split_point, force_int_to_mode)
(simplify_shift_const_1, simplify_shift_const): Likewise.
(change_zero_ext): Likewise. Use simplify_gen_binary.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r255745
|
|
In move_deaths we move a REG_DEAD note if the instruction combination
has extended the lifetime of a register so that the existing note is
no longer valid. We find that note using reg_stat, but what that finds
can refer to a later insn. If so, we cannot use the cached value. This
patch implements that.
PR rtl-optimization/83393
* combine.c (move_deaths): If reg_stat points to a too new insn in
last_death, do not use it: find the proper insn instead.
gcc/testsuite/
PR rtl-optimization/83393
* gcc.dg/pr83393.c: New testcase.
From-SVN: r255606
|