Age | Commit message (Collapse) | Author | Files | Lines |
|
The Neon multiply/multiply-accumulate/multiply-subtract instructions
can select the top or bottom half of the operand registers. This
selection does not change the cost of the underlying instruction and
this should be reflected by the RTL cost function.
This patch adds RTL tree traversal in the Neon multiply cost function
to match vec_select high-half of its operands. This traversal
prevents the cost of the vec_select from being added into the cost of
the multiply - meaning that these instructions can now be emitted in
the combine pass as they are no longer deemed prohibitively
expensive.
gcc/ChangeLog:
2021-07-19 Jonathan Wright <jonathan.wright@arm.com>
* config/aarch64/aarch64.c (aarch64_strip_extend_vec_half):
Define.
(aarch64_rtx_mult_cost): Traverse RTL tree to prevent cost of
vec_select high-half from being added into Neon multiply
cost.
* rtlanal.c (vec_series_highpart_p): Define.
* rtlanal.h (vec_series_highpart_p): Declare.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/vmul_high_cost.c: New test.
|
|
Just like the old bug PR9651, unsigned_fix rtl should
also be handled as a trapping instruction.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
PR rtl-optimization/101683
* rtlanal.c (may_trap_p_1): Handle UNSIGNED_FIX.
|
|
1. Replace scalar_int_mode with fixed_size_mode in the by-pieces
infrastructure to allow non-integer mode.
2. Rename widest_int_mode_for_size to widest_fixed_size_mode_for_size
to return QI vector mode for memset.
3. Add op_by_pieces_d::smallest_fixed_size_mode_for_size to return the
smallest integer or QI vector mode.
4. Remove clear_by_pieces_1 and use builtin_memset_read_str in
clear_by_pieces to support vector mode broadcast.
5. Add lowpart_subreg_regno, a wrapper around simplify_subreg_regno that
uses subreg_lowpart_offset (mode, prev_mode) as the offset.
6. Add TARGET_GEN_MEMSET_SCRATCH_RTX to allow the backend to use a hard
scratch register to avoid stack realignment when expanding memset.
gcc/
PR middle-end/90773
* builtins.c (builtin_memcpy_read_str): Change the mode argument
from scalar_int_mode to fixed_size_mode.
(builtin_strncpy_read_str): Likewise.
(gen_memset_value_from_prev): New function.
(builtin_memset_read_str): Change the mode argument from
scalar_int_mode to fixed_size_mode. Use gen_memset_value_from_prev
and support CONST_VECTOR.
(builtin_memset_gen_str): Likewise.
(try_store_by_multiple_pieces): Use by_pieces_constfn to declare
constfun.
* builtins.h (builtin_strncpy_read_str): Replace scalar_int_mode
with fixed_size_mode.
(builtin_memset_read_str): Likewise.
* expr.c (widest_int_mode_for_size): Renamed to ...
(widest_fixed_size_mode_for_size): Add a bool argument to
indicate if QI vector mode can be used.
(by_pieces_ninsns): Call widest_fixed_size_mode_for_size
instead of widest_int_mode_for_size.
(pieces_addr::adjust): Change the mode argument from
scalar_int_mode to fixed_size_mode.
(op_by_pieces_d): Make m_len read-only. Add a bool member,
m_qi_vector_mode, to indicate that QI vector mode can be used.
(op_by_pieces_d::op_by_pieces_d): Add a bool argument to
initialize m_qi_vector_mode. Call widest_fixed_size_mode_for_size
instead of widest_int_mode_for_size.
(op_by_pieces_d::get_usable_mode): Change the mode argument from
scalar_int_mode to fixed_size_mode. Call
widest_fixed_size_mode_for_size instead of
widest_int_mode_for_size.
(op_by_pieces_d::smallest_fixed_size_mode_for_size): New member
function to return the smallest integer or QI vector mode.
(op_by_pieces_d::run): Call widest_fixed_size_mode_for_size
instead of widest_int_mode_for_size. Call
smallest_fixed_size_mode_for_size instead of
smallest_int_mode_for_size.
(store_by_pieces_d::store_by_pieces_d): Add a bool argument to
indicate that QI vector mode can be used and pass it to
op_by_pieces_d::op_by_pieces_d.
(can_store_by_pieces): Call widest_fixed_size_mode_for_size
instead of widest_int_mode_for_size. Pass memsetp to
widest_fixed_size_mode_for_size to support QI vector mode.
Allow all CONST_VECTORs for memset if vec_duplicate is supported.
(store_by_pieces): Pass memsetp to
store_by_pieces_d::store_by_pieces_d.
(clear_by_pieces_1): Removed.
(clear_by_pieces): Replace clear_by_pieces_1 with
builtin_memset_read_str and pass true to store_by_pieces_d to
support vector mode broadcast.
(string_cst_read_str): Change the mode argument from
scalar_int_mode to fixed_size_mode.
* expr.h (by_pieces_constfn): Change scalar_int_mode to
fixed_size_mode.
(by_pieces_prev): Likewise.
* rtl.h (lowpart_subreg_regno): New.
* rtlanal.c (lowpart_subreg_regno): New. A wrapper around
simplify_subreg_regno.
* target.def (gen_memset_scratch_rtx): New hook.
* doc/tm.texi.in: Add TARGET_GEN_MEMSET_SCRATCH_RTX.
* doc/tm.texi: Regenerated.
gcc/testsuite/
* gcc.target/i386/pr100865-3.c: Expect vmovdqu8 instead of
vmovdqu.
* gcc.target/i386/pr100865-4b.c: Likewise.
|
|
Add a new RTL simplification for the case of a VEC_SELECT selecting
the low part of a vector. The simplification returns a SUBREG.
The primary goal of this patch is to enable better combinations of
Neon RTL patterns - specifically allowing generation of 'write-to-
high-half' narrowing intructions.
Adding this RTL simplification means that the expected results for a
number of tests need to be updated:
* aarch64 Neon: Update the scan-assembler regex for intrinsics tests
to expect a scalar register instead of lane 0 of a vector.
* aarch64 SVE: Likewise.
* arm MVE: Use lane 1 instead of lane 0 for lane-extraction
intrinsics tests (as the move instructions get optimized away for
lane 0.)
This patch also adds new code generation tests to
narrow_high_combine.c to verify the benefit of this RTL
simplification.
gcc/ChangeLog:
2021-06-08 Jonathan Wright <jonathan.wright@arm.com>
* combine.c (combine_simplify_rtx): Add vec_select -> subreg
simplification.
* config/aarch64/aarch64.md (*zero_extend<SHORT:mode><GPI:mode>2_aarch64):
Add Neon to general purpose register case for zero-extend
pattern.
* config/arm/vfp.md (*arm_movsi_vfp): Remove "*" from *t -> r
case to prevent some cases opting to go through memory.
* cse.c (fold_rtx): Add vec_select -> subreg simplification.
* rtl.c (rtvec_series_p): Define predicate to determine
whether a vector contains a linear series of integers.
* rtl.h (rtvec_series_p): Define.
* rtlanal.c (vec_series_lowpart_p): Define predicate to
determine if a vector selection is equivalent to the low part
of the vector.
* rtlanal.h (vec_series_lowpart_p): Define.
* simplify-rtx.c (simplify_context::simplify_binary_operation_1):
Add vec_select -> subreg simplification.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/extract_zero_extend.c: Remove dump scan
for RTL pattern match.
* gcc.target/aarch64/narrow_high_combine.c: Add new tests.
* gcc.target/aarch64/simd/vmulx_laneq_f64_1.c: Update
scan-assembler regex to look for a scalar register instead of
lane 0 of a vector.
* gcc.target/aarch64/simd/vmulxd_laneq_f64_1.c: Likewise.
* gcc.target/aarch64/simd/vmulxs_lane_f32_1.c: Likewise.
* gcc.target/aarch64/simd/vmulxs_laneq_f32_1.c: Likewise.
* gcc.target/aarch64/simd/vqdmlalh_lane_s16.c: Likewise.
* gcc.target/aarch64/simd/vqdmlals_lane_s32.c: Likewise.
* gcc.target/aarch64/simd/vqdmlslh_lane_s16.c: Likewise.
* gcc.target/aarch64/simd/vqdmlsls_lane_s32.c: Likewise.
* gcc.target/aarch64/simd/vqdmullh_lane_s16.c: Likewise.
* gcc.target/aarch64/simd/vqdmullh_laneq_s16.c: Likewise.
* gcc.target/aarch64/simd/vqdmulls_lane_s32.c: Likewise.
* gcc.target/aarch64/simd/vqdmulls_laneq_s32.c: Likewise.
* gcc.target/aarch64/sve/dup_lane_1.c: Likewise.
* gcc.target/aarch64/sve/extract_1.c: Likewise.
* gcc.target/aarch64/sve/extract_2.c: Likewise.
* gcc.target/aarch64/sve/extract_3.c: Likewise.
* gcc.target/aarch64/sve/extract_4.c: Likewise.
* gcc.target/aarch64/sve/live_1.c: Update scan-assembler regex
cases to look for 'b' and 'h' registers instead of 'w'.
* gcc.target/arm/crypto-vsha1cq_u32.c: Update scan-assembler
regex to reflect lane 0 vector extractions being simplified
to scalar register moves.
* gcc.target/arm/crypto-vsha1h_u32.c: Likewise.
* gcc.target/arm/crypto-vsha1mq_u32.c: Likewise.
* gcc.target/arm/crypto-vsha1pq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_f16.c: Extract
lane 1 as the moves for lane 0 now get optimized away.
* gcc.target/arm/mve/intrinsics/vgetq_lane_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_u8.c: Likewise.
|
|
1. Replace PUSH_ARGS with a target calls hook, TARGET_PUSH_ARGUMENT, which
takes an integer argument. When it returns true, push instructions will
be used to pass outgoing arguments. If the argument is nonzero, it is
the number of bytes to push and indicates the PUSH instruction usage is
optional so that the backend can decide if PUSH instructions should be
generated. Otherwise, the argument is zero.
2. Implement x86 target hook which returns false when the number of bytes
to push is no less than 16 (8 for 32-bit targets) if vector load and store
can be used.
3. Remove target PUSH_ARGS definitions which return 0 as it is the same
as the default.
4. Define TARGET_PUSH_ARGUMENT of cr16 and m32c to always return true.
gcc/
PR target/100704
* calls.c (expand_call): Replace PUSH_ARGS with
targetm.calls.push_argument (0).
(emit_library_call_value_1): Likewise.
* defaults.h (PUSH_ARGS): Removed.
(PUSH_ARGS_REVERSED): Replace PUSH_ARGS with
targetm.calls.push_argument (0).
* expr.c (block_move_libcall_safe_for_call_parm): Likewise.
(emit_push_insn): Pass the number bytes to push to
targetm.calls.push_argument and pass 0 if ARGS_ADDR is 0.
* hooks.c (hook_bool_uint_true): New.
* hooks.h (hook_bool_uint_true): Likewise.
* rtlanal.c (nonzero_bits1): Replace PUSH_ARGS with
targetm.calls.push_argument (0).
* target.def (push_argument): Add a targetm.calls hook.
* targhooks.c (default_push_argument): New.
* targhooks.h (default_push_argument): Likewise.
* config/bpf/bpf.h (PUSH_ARGS): Removed.
* config/cr16/cr16.c (TARGET_PUSH_ARGUMENT): New.
* config/cr16/cr16.h (PUSH_ARGS): Removed.
* config/i386/i386.c (ix86_push_argument): New.
(TARGET_PUSH_ARGUMENT): Likewise.
* config/i386/i386.h (PUSH_ARGS): Removed.
* config/m32c/m32c.c (TARGET_PUSH_ARGUMENT): New.
* config/m32c/m32c.h (PUSH_ARGS): Removed.
* config/nios2/nios2.h (PUSH_ARGS): Likewise.
* config/pru/pru.h (PUSH_ARGS): Likewise.
* doc/tm.texi.in: Remove PUSH_ARGS documentation. Add
TARGET_PUSH_ARGUMENT hook.
* doc/tm.texi: Regenerated.
gcc/testsuite/
PR target/100704
* gcc.target/i386/pr100704-1.c: New test.
* gcc.target/i386/pr100704-2.c: Likewise.
* gcc.target/i386/pr100704-3.c: Likewise.
|
|
This removes CC0 and all directly related infrastructure.
CC_STATUS, CC_STATUS_MDEP, CC_STATUS_MDEP_INIT, and NOTICE_UPDATE_CC
are deleted and poisoned. CC0 is only deleted (some targets use that
name for something else). HAVE_cc0 is automatically generated, and we
no longer will do that after this patch.
CC_STATUS_INIT is suggested in final.c to also be useful for ports that
are not CC0, and at least arm seems to use it for something. So I am
leaving that alone, but most targets that have it could remove it.
2021-05-04 Segher Boessenkool <segher@kernel.crashing.org>
* caller-save.c: Remove CC0.
* cfgcleanup.c: Remove CC0.
* cfgrtl.c: Remove CC0.
* combine.c: Remove CC0.
* compare-elim.c: Remove CC0.
* conditions.h: Remove CC0.
* config/h8300/h8300.h: Remove CC0.
* config/h8300/h8300-protos.h: Remove CC0.
* config/h8300/peepholes.md: Remove CC0.
* config/i386/x86-tune-sched.c: Remove CC0.
* config/m68k/m68k.c: Remove CC0.
* config/rl78/rl78.c: Remove CC0.
* config/sparc/sparc.c: Remove CC0.
* config/xtensa/xtensa.c: Remove CC0.
(gen_conditional_move): Use pc_rtx instead of cc0_rtx in a piece of
RTL where that is used as a placeholder only.
* cprop.c: Remove CC0.
* cse.c: Remove CC0.
* cselib.c: Remove CC0.
* df-problems.c: Remove CC0.
* df-scan.c: Remove CC0.
* doc/md.texi: Remove CC0. Adjust an example.
* doc/rtl.texi: Remove CC0. Adjust an example.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in: Remove CC0.
* emit-rtl.c: Remove CC0.
* final.c: Remove CC0.
* fwprop.c: Remove CC0.
* gcse-common.c: Remove CC0.
* gcse.c: Remove CC0.
* genattrtab.c: Remove CC0.
* genconfig.c: Remove CC0.
* genemit.c: Remove CC0.
* genextract.c: Remove CC0.
* gengenrtl.c: Remove CC0.
* genrecog.c: Remove CC0.
* haifa-sched.c: Remove CC0.
* ifcvt.c: Remove CC0.
* ira-costs.c: Remove CC0.
* ira.c: Remove CC0.
* jump.c: Remove CC0.
* loop-invariant.c: Remove CC0.
* lra-constraints.c: Remove CC0.
* lra-eliminations.c: Remove CC0.
* optabs.c: Remove CC0.
* postreload-gcse.c: Remove CC0.
* postreload.c: Remove CC0.
* print-rtl.c: Remove CC0.
* read-rtl-function.c: Remove CC0.
* reg-notes.def: Remove CC0.
* reg-stack.c: Remove CC0.
* reginfo.c: Remove CC0.
* regrename.c: Remove CC0.
* reload.c: Remove CC0.
* reload1.c: Remove CC0.
* reorg.c: Remove CC0.
* resource.c: Remove CC0.
* rtl.c: Remove CC0.
* rtl.def: Remove CC0.
* rtl.h: Remove CC0.
* rtlanal.c: Remove CC0.
* sched-deps.c: Remove CC0.
* sched-rgn.c: Remove CC0.
* shrink-wrap.c: Remove CC0.
* simplify-rtx.c: Remove CC0.
* system.h: Remove CC0. Poison NOTICE_UPDATE_CC, CC_STATUS_MDEP_INIT,
CC_STATUS_MDEP, and CC_STATUS.
* target.def: Remove CC0.
* valtrack.c: Remove CC0.
* var-tracking.c: Remove CC0.
|
|
This patch fixes a regression introduced by the rtl-ssa patches.
It was seen on HPPA but it might be latent elsewhere.
The problem is that the traditional way of expanding an untyped_call
is to emit sequences like:
(call (mem (symbol_ref "foo")))
(set (reg pseudo1) (reg result1))
...
(set (reg pseudon) (reg resultn))
The ABI specifies that result1..resultn are clobbered by the call but
nothing in the RTL indicates that result1..resultn are the results
of the call. Normally, using a clobbered value gives undefined results,
but in this case the results are well-defined and matter for correctness.
This seems like a niche case, so I think it would be better to mark
it explicitly rather than try to detect it heuristically.
Note that in expand_builtin_apply we already have an rtx_insn *,
so it doesn't matter whether we call emit_call_insn or emit_insn.
Calling emit_insn seems more natural now that the gen_* call
has been split out. It also matches later code in the function.
gcc/
PR rtl-optimization/98689
* reg-notes.def (UNTYPED_CALL): New note.
* combine.c (distribute_notes): Handle it.
* emit-rtl.c (try_split): Likewise.
* rtlanal.c (rtx_properties::try_to_add_insn): Likewise. Assume
that calls with the note implicitly set all return value registers.
* builtins.c (expand_builtin_apply): Add a REG_UNTYPED_CALL
to untyped_calls.
|
|
This patch is a GCC 11 regression caused by the rtl-ssa code.
Normally we treat calls as containing a potential set of a global
register, but DF makes a sensible exception for the stack pointer:
if (i == STACK_POINTER_REGNUM)
/* The stack ptr is used (honorarily) by a CALL insn. */
df_ref_record (DF_REF_BASE, collection_rec, regno_reg_rtx[i],
NULL, bb, insn_info, DF_REF_REG_USE,
DF_REF_CALL_STACK_USAGE | flags);
else if (global_regs[i])
{
/* Calls to const functions cannot access any global registers and
calls to pure functions cannot set them. All other calls may
reference any of the global registers, so they are recorded as
used. */
The only DF definition of SP was therefore the one in the entry block.
However, the rtlanal.c rtx_properties code (wrongly) assumed that calls
also clobbered the global SP. This led to multiple definitions of SP
when we only expected one.
This patch tightens the rtlanal.c handling of global registers
to match the DF approach.
gcc/
PR rtl-optimization/99596
* rtlanal.c (rtx_properties::try_to_add_insn): Don't add global
register accesses for const calls. Assume that pure functions
can only read from global registers. Ignore cases in which
the stack pointer has been marked global.
gcc/testsuite/
PR rtl-optimization/99596
* gcc.target/arm/pr99596.c: New test.
|
|
This is a sequel to the PR85022 changes, inline-asm can (unfortunately)
introduce VOIDmode MEMs and in PR85022 they have been changed so that
we don't pretend we know their size (as opposed to assuming they have
zero size).
This time we ICE in rtx_addr_can_trap_p_1 because it assumes that
all memory but BLKmode has known size. The patch just treats VOIDmode
MEMs like BLKmode in that regard. And, the STRICT_ALIGNMENT change
is needed because VOIDmode has GET_MODE_SIZE of 0 and we don't want to
check if something is a multiple of 0.
2021-04-10 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/98601
* rtlanal.c (rtx_addr_can_trap_p_1): Allow in assert unknown size
not just for BLKmode, but also for VOIDmode. For STRICT_ALIGNMENT
unaligned_mems handle VOIDmode like BLKmode.
* gcc.dg/torture/pr98601.c: New test.
|
|
gcc/
PR rtl-optimization/99376
* rtlanal.c (nonzero_bits1) <arithmetic operators>: If the number
of low-order zero bits is too large, set the result to 0 directly.
|
|
Tweak a couple of comments added in the RTL-SSA series in response
to reviewer feedback.
gcc/
* mux-utils.h (pointer_mux::m_ptr): Tweak description of contents.
* rtlanal.c (simple_regno_set): Tweak description to clarify the
RMW condition.
|
|
|
|
This patch adds a routine for finding a “simple” SET for a register
definition. See the comment in the patch for details.
gcc/
* rtl.h (simple_regno_set): Declare.
* rtlanal.c (simple_regno_set): New function.
|
|
This patch adds some classes for gathering the list of registers
and memory that are read and written by an instruction, along
with various properties about the accesses. In some ways it's
similar to the information that DF collects for registers,
but extended to memory. The main reason for using it instead
of DF is that it can analyse tentative changes to instructions
before they've been committed.
The classes also collect general information about the instruction,
since it's cheap to do and helps to avoid multiple walks of the same
RTL pattern.
I've tried to optimise the code quite a bit, since with later patches
it becomes relatively performance-sensitive. See the discussion in
the comments for the trade-offs involved.
I put the declarations in a new rtlanal.h header file since it
seemed a bit excessive to put so much new inline stuff in rtl.h.
gcc/
* rtlanal.h: New file.
(MEM_REGNO): New constant.
(rtx_obj_flags): New namespace.
(rtx_obj_reference, rtx_properties): New classes.
(growing_rtx_properties, vec_rtx_properties_base): Likewise.
(vec_rtx_properties): New alias.
* rtlanal.c: Include it.
(rtx_properties::try_to_add_reg): New function.
(rtx_properties::try_to_add_dest): Likewise.
(rtx_properties::try_to_add_src): Likewise.
(rtx_properties::try_to_add_pattern): Likewise.
(rtx_properties::try_to_add_insn): Likewise.
(vec_rtx_properties_base::grow): Likewise.
|
|
verify_changes has a test for whether a particular hard register
is a user-defined register asm. A later patch needs to test the
same thing, so this patch splits it out into a helper.
gcc/
* rtl.h (register_asm_p): Declare.
* recog.c (verify_changes): Split out the test for whether
a hard register is a register asm to...
* rtlanal.c (register_asm_p): ...this new function.
|
|
noop_move_p currently keeps any instruction that has a REG_EQUAL
note, on the basis that the equality might be useful in future.
But this creates a perverse incentive not to add potentially-useful
REG_EQUAL notes, in case they prevent an instruction from later being
removed as dead.
The condition originates from flow.c:life_analysis_1 and predates
the changes tracked by the current repository (1992). It probably
made sense when most optimisations were done on RTL rather than FE
trees, but it seems counterproductive now.
gcc/
* rtlanal.c (noop_move_p): Don't check for REG_EQUAL notes.
|
|
The following s390 rtx is errneously considered a no-op:
(set (subreg:DF (reg:TF %f0) 8) (subreg:DF (reg:V1TF %f0) 8))
Here, SET_DEST is a second register in a floating-point register pair,
and SET_SRC is the second half of a vector register, so they refer to
different bits.
Fix by treating subregs of registers in different modes conservatively.
gcc/ChangeLog:
2020-09-11 Ilya Leoshkevich <iii@linux.ibm.com>
* rtlanal.c (set_noop_p): Treat subregs of registers in
different modes conservatively.
|
|
gcc/ada/ChangeLog:
* gcc-interface/trans.c (gigi): Set exact argument of a vector
growth function to true.
(Attribute_to_gnu): Likewise.
gcc/ChangeLog:
* alias.c (init_alias_analysis): Set exact argument of a vector
growth function to true.
* calls.c (internal_arg_pointer_based_exp_scan): Likewise.
* cfgbuild.c (find_many_sub_basic_blocks): Likewise.
* cfgexpand.c (expand_asm_stmt): Likewise.
* cfgrtl.c (rtl_create_basic_block): Likewise.
* combine.c (combine_split_insns): Likewise.
(combine_instructions): Likewise.
* config/aarch64/aarch64-sve-builtins.cc (function_expander::add_output_operand): Likewise.
(function_expander::add_input_operand): Likewise.
(function_expander::add_integer_operand): Likewise.
(function_expander::add_address_operand): Likewise.
(function_expander::add_fixed_operand): Likewise.
* df-core.c (df_worklist_dataflow_doublequeue): Likewise.
* dwarf2cfi.c (update_row_reg_save): Likewise.
* early-remat.c (early_remat::init_block_info): Likewise.
(early_remat::finalize_candidate_indices): Likewise.
* except.c (sjlj_build_landing_pads): Likewise.
* final.c (compute_alignments): Likewise.
(grow_label_align): Likewise.
* function.c (temp_slots_at_level): Likewise.
* fwprop.c (build_single_def_use_links): Likewise.
(update_uses): Likewise.
* gcc.c (insert_wrapper): Likewise.
* genautomata.c (create_state_ainsn_table): Likewise.
(add_vect): Likewise.
(output_dead_lock_vect): Likewise.
* genmatch.c (capture_info::capture_info): Likewise.
(parser::finish_match_operand): Likewise.
* genrecog.c (optimize_subroutine_group): Likewise.
(merge_pattern_info::merge_pattern_info): Likewise.
(merge_into_decision): Likewise.
(print_subroutine_start): Likewise.
(main): Likewise.
* gimple-loop-versioning.cc (loop_versioning::loop_versioning): Likewise.
* gimple.c (gimple_set_bb): Likewise.
* graphite-isl-ast-to-gimple.c (translate_isl_ast_node_user): Likewise.
* haifa-sched.c (sched_extend_luids): Likewise.
(extend_h_i_d): Likewise.
* insn-addr.h (insn_addresses_new): Likewise.
* ipa-cp.c (gather_context_independent_values): Likewise.
(find_more_contexts_for_caller_subset): Likewise.
* ipa-devirt.c (final_warning_record::grow_type_warnings): Likewise.
(ipa_odr_read_section): Likewise.
* ipa-fnsummary.c (evaluate_properties_for_edge): Likewise.
(ipa_fn_summary_t::duplicate): Likewise.
(analyze_function_body): Likewise.
(ipa_merge_fn_summary_after_inlining): Likewise.
(read_ipa_call_summary): Likewise.
* ipa-icf.c (sem_function::bb_dict_test): Likewise.
* ipa-prop.c (ipa_alloc_node_params): Likewise.
(parm_bb_aa_status_for_bb): Likewise.
(ipa_compute_jump_functions_for_edge): Likewise.
(ipa_analyze_node): Likewise.
(update_jump_functions_after_inlining): Likewise.
(ipa_read_edge_info): Likewise.
(read_ipcp_transformation_info): Likewise.
(ipcp_transform_function): Likewise.
* ipa-reference.c (ipa_reference_write_optimization_summary): Likewise.
* ipa-split.c (execute_split_functions): Likewise.
* ira.c (find_moveable_pseudos): Likewise.
* lower-subreg.c (decompose_multiword_subregs): Likewise.
* lto-streamer-in.c (input_eh_regions): Likewise.
(input_cfg): Likewise.
(input_struct_function_base): Likewise.
(input_function): Likewise.
* modulo-sched.c (set_node_sched_params): Likewise.
(extend_node_sched_params): Likewise.
(schedule_reg_moves): Likewise.
* omp-general.c (omp_construct_simd_compare): Likewise.
* passes.c (pass_manager::create_pass_tab): Likewise.
(enable_disable_pass): Likewise.
* predict.c (determine_unlikely_bbs): Likewise.
* profile.c (compute_branch_probabilities): Likewise.
* read-rtl-function.c (function_reader::parse_block): Likewise.
* read-rtl.c (rtx_reader::read_rtx_code): Likewise.
* reg-stack.c (stack_regs_mentioned): Likewise.
* regrename.c (regrename_init): Likewise.
* rtlanal.c (T>::add_single_to_queue): Likewise.
* sched-deps.c (init_deps_data_vector): Likewise.
* sel-sched-ir.c (sel_extend_global_bb_info): Likewise.
(extend_region_bb_info): Likewise.
(extend_insn_data): Likewise.
* symtab.c (symtab_node::create_reference): Likewise.
* tracer.c (tail_duplicate): Likewise.
* trans-mem.c (tm_region_init): Likewise.
(get_bb_regions_instrumented): Likewise.
* tree-cfg.c (init_empty_tree_cfg_for_function): Likewise.
(build_gimple_cfg): Likewise.
(create_bb): Likewise.
(move_block_to_fn): Likewise.
* tree-complex.c (tree_lower_complex): Likewise.
* tree-if-conv.c (predicate_rhs_code): Likewise.
* tree-inline.c (copy_bb): Likewise.
* tree-into-ssa.c (get_ssa_name_ann): Likewise.
(mark_phi_for_rewrite): Likewise.
* tree-object-size.c (compute_builtin_object_size): Likewise.
(init_object_sizes): Likewise.
* tree-predcom.c (initialize_root_vars_store_elim_1): Likewise.
(initialize_root_vars_store_elim_2): Likewise.
(prepare_initializers_chain_store_elim): Likewise.
* tree-ssa-address.c (addr_for_mem_ref): Likewise.
(multiplier_allowed_in_address_p): Likewise.
* tree-ssa-coalesce.c (ssa_conflicts_new): Likewise.
* tree-ssa-forwprop.c (simplify_vector_constructor): Likewise.
* tree-ssa-loop-ivopts.c (addr_offset_valid_p): Likewise.
(get_address_cost_ainc): Likewise.
* tree-ssa-loop-niter.c (discover_iteration_bound_by_body_walk): Likewise.
* tree-ssa-pre.c (add_to_value): Likewise.
(phi_translate_1): Likewise.
(do_pre_regular_insertion): Likewise.
(do_pre_partial_partial_insertion): Likewise.
(init_pre): Likewise.
* tree-ssa-propagate.c (ssa_prop_init): Likewise.
(update_call_from_tree): Likewise.
* tree-ssa-reassoc.c (optimize_range_tests_cmp_bitwise): Likewise.
* tree-ssa-sccvn.c (vn_reference_lookup_3): Likewise.
(vn_reference_lookup_pieces): Likewise.
(eliminate_dom_walker::eliminate_push_avail): Likewise.
* tree-ssa-strlen.c (set_strinfo): Likewise.
(get_stridx_plus_constant): Likewise.
(zero_length_string): Likewise.
(find_equal_ptrs): Likewise.
(printf_strlen_execute): Likewise.
* tree-ssa-threadedge.c (set_ssa_name_value): Likewise.
* tree-ssanames.c (make_ssa_name_fn): Likewise.
* tree-streamer-in.c (streamer_read_tree_bitfields): Likewise.
* tree-vect-loop.c (vect_record_loop_mask): Likewise.
(vect_get_loop_mask): Likewise.
(vect_record_loop_len): Likewise.
(vect_get_loop_len): Likewise.
* tree-vect-patterns.c (vect_recog_mask_conversion_pattern): Likewise.
* tree-vect-slp.c (vect_slp_convert_to_external): Likewise.
(vect_bb_slp_scalar_cost): Likewise.
(vect_bb_vectorization_profitable_p): Likewise.
(vectorizable_slp_permutation): Likewise.
* tree-vect-stmts.c (vectorizable_call): Likewise.
(vectorizable_simd_clone_call): Likewise.
(scan_store_can_perm_p): Likewise.
(vectorizable_store): Likewise.
* expr.c: Likewise.
* vec.c (test_safe_grow_cleared): Likewise.
* vec.h (vec_safe_grow): Likewise.
(vec_safe_grow_cleared): Likewise.
(vl_ptr>::safe_grow): Likewise.
(vl_ptr>::safe_grow_cleared): Likewise.
* config/c6x/c6x.c (insn_set_clock): Likewise.
gcc/c/ChangeLog:
* gimple-parser.c (c_parser_gimple_compound_statement): Set exact argument of a vector
growth function to true.
gcc/cp/ChangeLog:
* class.c (build_vtbl_initializer): Set exact argument of a vector
growth function to true.
* constraint.cc (get_mapped_args): Likewise.
* decl.c (cp_maybe_mangle_decomp): Likewise.
(cp_finish_decomp): Likewise.
* parser.c (cp_parser_omp_for_loop): Likewise.
* pt.c (canonical_type_parameter): Likewise.
* rtti.c (get_pseudo_ti_init): Likewise.
gcc/fortran/ChangeLog:
* trans-openmp.c (gfc_trans_omp_do): Set exact argument of a vector
growth function to true.
gcc/lto/ChangeLog:
* lto-common.c (lto_file_finalize): Set exact argument of a vector
growth function to true.
|
|
peephole2.
gcc/
* lra.c (add_auto_inc_notes): Remove function.
* reload1.c (add_auto_inc_notes): Similarly. Move into...
* rtlanal.c (add_auto_inc_notes): New function.
* rtl.h (add_auto_inc_notes): Add prototype.
* recog.c (peep2_attempt): Scan and add REG_INC notes to new insns
as needed.
|
|
My recent combine-stack-adj.c change broke df checking bootstrap,
while most of the changes are done through validate_change/confirm_changes
which update df info, the removal of REG_EQUAL notes didn't update df info.
2020-05-08 Jakub Jelinek <jakub@redhat.com>
PR bootstrap/94961
PR rtl-optimization/94516
* rtl.h (remove_reg_equal_equiv_notes): Add a bool argument defaulted
to false.
* rtlanal.c (remove_reg_equal_equiv_notes): Add no_rescan argument.
Call df_notes_rescan if that argument is not true and returning true.
* combine.c (adjust_for_new_dest): Pass true as second argument to
remove_reg_equal_equiv_notes.
* postreload.c (reload_combine_recognize_pattern): Don't call
df_notes_rescan.
|
|
... given that the GCN target did away with the constant 'vec_select'
restriction.
gcc/
PR target/94279
* rtlanal.c (set_noop_p): Handle non-constant selectors.
|
|
There's a costly signed 64-bit division in rtx_cost on x86 as well as
any other target where UNITS_PER_WORD expands to TARGET_64BIT ? 8 : 4.
It's also evident that rtx_cost does redundant work for a SET.
Obviously the variable named 'factor' rarely exceeds 1, so in the
majority of cases it can be computed with a well-predictable branch
rather than a division.
This patch makes rtx_cost do the division only in case mode is wider
than UNITS_PER_WORD, and also moves a test for a SET up front to avoid
redundancy.
No functional change.
* rtlanal.c (rtx_cost): Handle a SET up front. Avoid division if
the mode is not wider than UNITS_PER_WORD.
|
|
From-SVN: r279813
|
|
2019-12-06 Andreas Krebbel <krebbel@linux.ibm.com>
Vladimir Makarov <vmakarov@redhat.com>
PR rtl-optimization/92176
* lra.c (simplify_subreg_regno): Don't permit unconditional
changing mode for LRA too.
2019-12-06 Andreas Krebbel <krebbel@linux.ibm.com>
Vladimir Makarov <vmakarov@redhat.com>
PR rtl-optimization/92176
* gcc.target/s390/pr92176.c: New test.
Co-Authored-By: Vladimir Makarov <vmakarov@redhat.com>
From-SVN: r279061
|
|
* config/m68k/m68k.c (output_move_himode, output_move_qimode):
Replace code for non-CONST_INT constants with gcc_unreachable.
* config/m68k/m68k.md (cbranchdi): Don't generate individual
compare and test.
(CMPMODE): New mode_iterator.
(cbranchsi4, cbranchqi4, cbranchhi4): Replace expanders with
cbranch<mode>4.
(cstoresi4, cstoreqi4, cstorehi4): Replace expanders with
cstore<mode>4.
(cmp<mode>_68881): Remove 'F' constraint from first comparison
operand.
(bit test insns patterns): Use nonimmediate_operand, not
register_operand, for source operands that allow memory in
their constraints.
(divmodsi4, udivmodsi4, divmodhi4 and related unnamed patterns):
Use register_operand, not nonimmediate_operand, for the
destinations.
(DBCC): New mode_iterator.
(dbcc peepholes): Use it to reduce duplication.
(trap): Use const_true_rtx, not const1_rtx.
* config/m68k/predicates.md (m68k_comparison_operand): Renamed
from m68k_subword_comparison_operand and changed to handle
SImode.
PR target/91851
* config/m68k/m68k-protos.h (output-dbcc_and_branch): Adjust
declaration.
(m68k_init_cc): New declaration.
(m68k_output_compare_di, m68k_output_compare_si)
(m68k_output_compare_hi, m68k_output_compare_qi)
(m68k_output_compare_fp, m68k_output_btst, m68k_output_bftst)
(m68k_find_flags_value, m68k_output_scc, m68k_output_scc_float)
(m68k_output_branch_integer, m68k_output_branch_integer_rev.
m68k_output_branch_float, m68k_output_branch_float_rev):
Likewise.
(valid_dbcc_comparison_p_2, flags_in_68881)
(output_btst): Remove declaration.
* config/m68k/m68k.c (INCLDUE_STRING): Define.
(TARGET_ASM_FINAL_POSTSCAN_INSN): Define.
(valid_dbcc_comparison_p_2, flags_in_68881): Delete functions.
(flags_compare_op0, flags_compare_op1, flags_operand1,
flags_operand2, flags_valid): New static variables.
(m68k_find_flags_value, m68k_init_cc): New functions.
(handle_flags_for_move, m68k_asm_final_postscan_insn,
remember_compare_flags): New static functions.
(output_dbcc_and_branch): New argument CODE. Use it, and add
PLUS and MINUS to the possible codes. All callers changed.
(m68k_output_btst): Renamed from output_btst. Remove OPERANDS
and INSN arguments, add CODE arg. Return the comparison code
to use. All callers changed. Use CODE instead of
next_insn_tests_no_inequality, and replace cc_status management
with changing the return code.
(m68k_rtx_costs): Instead of testing for COMPARE, test for
RTX_COMPARE or RTX_COMM_COMPARE.
(output_move_simode, output_move_qimode): Call
handle_flags_for_move.
(notice_update_cc): Delete function.
(m68k_output_bftst, m68k_output_compare_di, m68k_output_compare_si,
m68k_output_compare_hi, m68k_output_compare_qi,
m68k_output_compare_fp, m68k_output_branch_integer,
m68k_output_branch_integer_rev, m68k_output_scc,
m68k_output_branch_float, m68k_output_branch_float_rev,
m68k_output_scc_float): New functions.
(output_andsi3, output_iorsi3, output_xorsi3): Call CC_STATUS_INIT
once at the start, and set flags_valid and flags_operand1 if the
flags are usable.
* config/m68k/m68k.h (CC_IN_68881, NOTICE_UPDATE_CC,
CC_OVERFLOW_UNUSABLE, CC_NO_CARRY, OUTPUT_JUMP): Remove
definitions.
(CC_STATUS_INIT): Define.
* config/m68k/m68k.md (flags_valid): New define_attr.
(tstdi, tstsi_internal_68020_cf, tstsi_internal, tsthi_internal,
tstqi_internal, tst<mode>_68881, tst<mode>_cf, cmpdi_internal,
cmpdi, unnamed cmpsi/cmphi/cmpqi patterns, cmpsi_cf,
cmp<mode>_68881, cmp<mode>_cf, unnamed btst patterns,
tst_bftst_reg, tst_bftst_reg, unnamed scc patterns, scc,
sls, sordered_1, sunordered_1, suneq_1, sunge_1, sungt_1,
sunle_1, sunlt_1, sltgt_1, fsogt_1, fsoge_1, fsolt_1, fsole_1,
bge0_di, blt0_di, beq, bne, bgt, bgtu, blt, bltu, bge, bgeu,
ble, bleu, bordered, bunordered, buneq, bunge, bungt, bunle,
bunlt, bltgt, beq_rev, bne_rev, bgt_rev, bgtu_rev,
blt_rev, bltu_rev, bge_rev, bgeu_rev, ble_rev, bleu_rev,
bordered_rev, bunordered_rev, buneq_rev, bunge_rv, bungt_rev,
bunle_rev, bunlt_rev, bltgt_rev, ctrapdi4, ctrapsi4, ctraphi4,
ctrapqi4, conditional_trap): Delete patterns.
(cbranchdi4_insn): New pattern.
(cbranchdi4): Don't generate cc0 patterns. When testing LT or GE,
test high part only. When testing EQ or NE, generate beq0_di
and bne0_di patterns directly.
(cstoredi4): When testing LT or GE, test high part only.
(both sets of cbranch<mode>4, cstore<mode>4): Don't generate cc0
patterns.
(scc0_constraints, cmp1_constraints, cmp2_constraints,
scc0_cf_constraints, cmp1_cf_constraints, cmp2_cf_constraints,
cmp2_cf_predicate): New define_mode_attrs.
(cbranch<mode>4_insn, cbranch<mode>4_insn_rev,
cbranch<mode>4_insn_cf, cbranch<mode>4_insn_cf_rev,
cstore<mode>4_insn, cstore<mode>4_insn_cf for integer modes)
New patterns.
(cbranch<mode>4_insn_68881, cbranch<mode>4_insn_rev_68881):
(cbranch<mode>4_insn_cf, cbranch<mode>4_insn_rev_cf,
cstore<mode>4_insn_68881, cstore<mode>4_insn_cf for FP):
New patterns.
(cbranchsi4_btst_mem_insn, cbranchsi4_btst_reg_insn,
cbranchsi4_btst_mem_insn_1, cbranchsi4_btst_reg_insn_1):
Likewise.
(BTST): New define_mode_iterator.
(btst_predicate, btst_constraint, btst_range): New
define_mode_attrs.
(cbranch_bftst<mode>_insn, cstore_bftst<mode>_insn): New
patterns.
(movsi_m68k_movsi_m68k2, movsi_cf, unnamed movstrict patterns,
unnamed movhi and movqi patterns, unnamed movsf, movdf and movxf
patterns): Set attr "flags_valid".
(truncsiqi2, trunchiqi2, truncsihi2): Remove manual CC_STATUS
management. Set attr "flags_valid".
(extendsidi2, extendplussidi, unnamed float_extendsfdf pattern,
extendsfdf2_cf, fix_truncdfsi2, fix_truncdfhi2, fix_truncdfqi2,
addi_sexthishl32, adddi_dilshr32, adddi_dilshr32_cf,
addi_dishl32, subdi_sexthishl32, subdi_dishl32, subdi3): Remove
manual CC_STATUS management.
(addsi3_internal, addhi3, addqi3, subsi3, subhi3, subqi3,
unnamed strict_lowpart subhi and subqi patterns): Set attr
"flags_valid".
(unnamed strict_lowpart addhi3 and addqi3 patterns): Likewise.
Remove code to operate on address regs and assert the case
does not occur.
(unnamed mulsidi patterns, divmodhi4, udivmodhi4): Remove
manual CC_STATUS_INIT.
(andsi3_internal, andhi3, andqi3, iorsi3_internal, iorhi3, iorqi3,
xorsi3_internal, xorhi3, xorqi3, negsi2_internal,
negsi2_5200, neghi2, negqi2, one_cmplsi2_internal, one_cmplhi2,
one_cmplqi2, unnamed strict_lowpart patterns
for andhi, andqi, iorhi, iorqi, xorhi, xorqi, neghi, negqi,
one_cmplhi and one_cmplqi): Set attr "flags_valid".
(iorsi_zext_ashl16, iorsi_zext): Remove manual CC_STATUS_INIT.
(ashldi_sexthi, ashlsi_16, ashlsi_17_24): Remove manual
CC_STATUS_INIT.
(ashlsi3, ashlhi3, ashlqi3, ashrsi3, ashrhi3, ashrqi3, lshrsi3,
lshrhi3, shrqi3, rotlsi3, rotlhi3, rotlhi3_lowpart, rotlqi3,
rotlqi3_lowpart, rotrsi3, rotrhi3, rotrhi_lowpart, rotrqi3,
unnamed strict_low_part patterns for HI and
QI versions): Set attr "flags_valid".
(bsetmemqi, bsetmemqi_ext, bsetdreg, bchgdreg, bclrdreg,
bclrmemqi, extzv_8_16_reg, extzv_bfextu_mem, insv_bfchg_mem,
insv_bfclr_mem, insv_bfset_mem, extv_bfextu_reg,
insv_bfclr_reg, insv_bfset_reg, dbne_hi, dbne_si, dbge_hi,
dbge_si, extendsfxf2, extenddfxf2, ): Remove manual cc_status management.
(various unnamed peepholes): Adjust compare/branch sequences
for new cbranch patterns.
(dbcc peepholes): Likewise, and output the comparison here
as well.
* config/m68k/predicates.md (valid_dbcc_comparison_p): Delete.
(fp_src_operand): Allow constant zero.
(address_reg_operand): New predicate.
* rtl.h (inequality_comparisons_p): Remove declaration.
* recog.h (next_insn_tests_no_inequality): Likewise.
* rtlanal.c (inequality_comparisons_p): Delete function.
* recog.c (next_insn_tests_no_inequality): Likewise.
From-SVN: r278681
|
|
The AArch64 SVE tlsdesc patterns were the main motivating reason
for clobber_high. It's no longer needed now that the patterns use
calls instead.
At the time, one of the possible future uses for clobber_high was for
asm statements. However, the current code wouldn't handle that case
without modification, so I think we might as well remove it for now.
We can always reapply it in future if it turns out to be useful again.
2019-10-01 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* rtl.def (CLOBBER_HIGH): Delete.
* doc/rtl.texi (clobber_high): Remove documentation.
* rtl.h (SET_DEST): Remove CLOBBER_HIGH from the list of codes.
(reg_is_clobbered_by_clobber_high): Delete.
(gen_hard_reg_clobber_high): Likewise.
* alias.c (record_set): Remove CLOBBER_HIGH handling.
* cfgexpand.c (expand_gimple_stmt): Likewise.
* combine-stack-adj.c (single_set_for_csa): Likewise.
* combine.c (find_single_use_1, set_nonzero_bits_and_sign_copies)
(can_combine_p, is_parallel_of_n_reg_sets, try_combine)
(record_dead_and_set_regs_1, reg_dead_at_p_1): Likewise.
* cse.c (invalidate_reg): Remove clobber_high parameter.
(invalidate): Update call accordingly.
(canonicalize_insn): Remove CLOBBER_HIGH handling.
(invalidate_from_clobbers, invalidate_from_sets_and_clobbers)
(count_reg_usage, insn_live_p): Likewise.
* cselib.h (cselib_invalidate_rtx): Remove sett argument.
* cselib.c (cselib_invalidate_regno, cselib_invalidate_rtx): Likewise.
(cselib_invalidate_rtx_note_stores): Update call accordingly.
(cselib_expand_value_rtx_1): Remove CLOBBER_HIGH handling.
(cselib_invalidate_regno, cselib_process_insn): Likewise.
* dce.c (deletable_insn_p, mark_nonreg_stores_1): Likewise.
(mark_nonreg_stores_2): Likewise.
* df-scan.c (df_find_hard_reg_defs, df_uses_record): Likewise.
(df_get_call_refs): Likewise.
* dwarf2out.c (mem_loc_descriptor): Likewise.
* emit-rtl.c (verify_rtx_sharing): Likewise.
(copy_insn_1, copy_rtx_if_shared_1): Likewise.
(hard_reg_clobbers_high, gen_hard_reg_clobber_high): Delete.
* genconfig.c (walk_insn_part): Remove CLOBBER_HIGH handling.
* genemit.c (gen_exp, gen_insn): Likewise.
* genrecog.c (validate_pattern, remove_clobbers): Likewise.
* haifa-sched.c (haifa_classify_rtx): Likewise.
* ira-build.c (create_insn_allocnos): Likewise.
* ira-costs.c (scan_one_insn): Likewise.
* ira.c (equiv_init_movable_p, memref_referenced_p): Likewise.
(rtx_moveable_p, interesting_dest_for_shprep): Likewise.
* jump.c (mark_jump_label_1): Likewise.
* lra-int.h (lra_insn_reg::clobber_high): Delete.
* lra-eliminations.c (lra_eliminate_regs_1): Remove CLOBBER_HIGH
handling.
(mark_not_eliminable): Likewise.
* lra-lives.c (process_bb_lives): Likewise.
* lra.c (new_insn_reg): Remove clobber_high parameter.
(collect_non_operand_hard_regs): Likewise. Update call to new
insn_reg. Remove CLOBBER_HIGH handling.
(lra_set_insn_recog_data): Remove CLOBBER_HIGH handling. Update call
to collect_non_operand_hard_regs.
(add_regs_to_insn_regno_info): Remove CLOBBER_HIGH handling.
Update call to new_insn_reg.
(lra_update_insn_regno_info): Remove CLOBBER_HIGH handling.
* postreload.c (reload_cse_simplify, reload_combine_note_use)
(move2add_note_store): Likewise.
* print-rtl.c (print_pattern): Likewise.
* recog.c (store_data_bypass_p_1, store_data_bypass_p): Likewise.
(if_test_bypass_p): Likewise.
* regcprop.c (kill_clobbered_value, kill_set_value): Likewise.
* reginfo.c (reg_scan_mark_refs): Likewise.
* reload1.c (maybe_fix_stack_asms, eliminate_regs_1): Likewise.
(elimination_effects, mark_not_eliminable, scan_paradoxical_subregs)
(forget_old_reloads_1): Likewise.
* reorg.c (find_end_label, try_merge_delay_insns, redundant_insn)
(own_thread_p, fill_simple_delay_slots, fill_slots_from_thread)
(dbr_schedule): Likewise.
* resource.c (update_live_status, mark_referenced_resources)
(mark_set_resources): Likewise.
* rtl.c (copy_rtx): Likewise.
* rtlanal.c (reg_referenced_p, set_of_1, single_set_2, noop_move_p)
(note_pattern_stores): Likewise.
(reg_is_clobbered_by_clobber_high): Delete.
* sched-deps.c (sched_analyze_reg, sched_analyze_insn): Remove
CLOBBER_HIGH handling.
From-SVN: r276393
|
|
The reg_set_p part is simple, since the caller is asking about
a specific REG rtx, with a known register number and mode.
The find_all_hard_reg_sets part emphasises that the "implicit"
behaviour was always a bit suspect, since it includes fully-clobbered
registers but not partially-clobbered registers. The only current
user of this path is the c6x-specific scheduler predication code,
and c6x doesn't have partly call-clobbered registers, so in practice
it's fine. I've added a comment to try to disuade future users.
(The !implicit path is OK and useful though.)
2019-09-30 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* rtlanal.c: Include function-abi.h.
(reg_set_p): Use insn_callee_abi to get the ABI of the called
function and clobbers_reg_p to test whether the register
is call-clobbered.
(find_all_hard_reg_sets): When implicit is true, use insn_callee_abi
to get the ABI of the called function and full_reg_clobbers to
get the set of fully call-clobbered registers. Warn about the
pitfalls of using this mode.
From-SVN: r276334
|
|
This patch replaces get_call_reg_set_usage with insn_callee_abi,
which returns the ABI of the target of a call insn. The ABI's
full_reg_clobbers corresponds to regs_invalidated_by_call,
whereas many callers instead passed call_used_or_fixed_regs, i.e.:
(regs_invalidated_by_call | fixed_reg_set)
The patch slavishly preserves the "| fixed_reg_set" for these callers;
later patches will clean this up.
2019-09-30 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* target.def (insn_callee_abi): New hook.
(remove_extra_call_preserved_regs): Delete.
* doc/tm.texi.in (TARGET_INSN_CALLEE_ABI): New macro.
(TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS): Delete.
* doc/tm.texi: Regenerate.
* targhooks.h (default_remove_extra_call_preserved_regs): Delete.
* targhooks.c (default_remove_extra_call_preserved_regs): Delete.
* config/aarch64/aarch64.c (aarch64_simd_call_p): Constify the
insn argument.
(aarch64_remove_extra_call_preserved_regs): Delete.
(aarch64_insn_callee_abi): New function.
(TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS): Delete.
(TARGET_INSN_CALLEE_ABI): New macro.
* rtl.h (get_call_fndecl): Declare.
(cgraph_rtl_info): Fix formatting. Tweak comment for
function_used_regs. Remove function_used_regs_valid.
* rtlanal.c (get_call_fndecl): Moved from final.c
* function-abi.h (insn_callee_abi): Declare.
(target_function_abi_info): Mention insn_callee_abi.
* function-abi.cc (fndecl_abi): Handle flag_ipa_ra in a similar
way to get_call_reg_set_usage did.
(insn_callee_abi): New function.
* regs.h (get_call_reg_set_usage): Delete.
* final.c: Include function-abi.h.
(collect_fn_hard_reg_usage): Add fixed and stack registers to
function_used_regs before the main loop rather than afterwards.
Use insn_callee_abi instead of get_call_reg_set_usage. Exit early
if function_used_regs ends up not being useful.
(get_call_fndecl): Move to rtlanal.c
(get_call_cgraph_rtl_info, get_call_reg_set_usage): Delete.
* caller-save.c: Include function-abi.h.
(setup_save_areas, save_call_clobbered_regs): Use insn_callee_abi
instead of get_call_reg_set_usage.
* cfgcleanup.c: Include function-abi.h.
(old_insns_match_p): Use insn_callee_abi instead of
get_call_reg_set_usage.
* cgraph.h (cgraph_node::rtl_info): Take a const_tree instead of
a tree.
* cgraph.c (cgraph_node::rtl_info): Likewise. Initialize
function_used_regs.
* df-scan.c: Include function-abi.h.
(df_get_call_refs): Use insn_callee_abi instead of
get_call_reg_set_usage.
* ira-lives.c: Include function-abi.h.
(process_bb_node_lives): Use insn_callee_abi instead of
get_call_reg_set_usage.
* lra-lives.c: Include function-abi.h.
(process_bb_lives): Use insn_callee_abi instead of
get_call_reg_set_usage.
* postreload.c: Include function-abi.h.
(reload_combine): Use insn_callee_abi instead of
get_call_reg_set_usage.
* regcprop.c: Include function-abi.h.
(copyprop_hardreg_forward_1): Use insn_callee_abi instead of
get_call_reg_set_usage.
* resource.c: Include function-abi.h.
(mark_set_resources, mark_target_live_regs): Use insn_callee_abi
instead of get_call_reg_set_usage.
* var-tracking.c: Include function-abi.h.
(dataflow_set_clear_at_call): Use insn_callee_abi instead of
get_call_reg_set_usage.
From-SVN: r276309
|
|
This patch rewrites the way simplify_subreg handles constants.
It uses similar native_encode/native_decode routines to the
tree-level handling of VIEW_CONVERT_EXPR, meaning that we can
move between rtx constants and the target memory image of them.
The main point of this patch is to support subregs of constant-length
vectors for VLA vectors, beyond the very simple cases that were already
handled. Many of the new tests failed before the patch for variable-
length vectors.
The boolean side is tested more by the upcoming SVE ACLE work.
2019-09-19 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* defaults.h (TARGET_UNIT): New macro.
(target_unit): New type.
* rtl.h (native_encode_rtx, native_decode_rtx)
(native_decode_vector_rtx, subreg_size_lsb): Declare.
(subreg_lsb_1): Turn into an inline wrapper around subreg_size_lsb.
* rtlanal.c (subreg_lsb_1): Delete.
(subreg_size_lsb): New function.
* simplify-rtx.c: Include rtx-vector-builder.h
(simplify_immed_subreg): Delete.
(native_encode_rtx, native_decode_vector_rtx, native_decode_rtx)
(simplify_const_vector_byte_offset, simplify_const_vector_subreg): New
functions.
(simplify_subreg): Use them.
(test_vector_subregs_modes, test_vector_subregs_repeating)
(test_vector_subregs_fore_back, test_vector_subregs_stepped)
(test_vector_subregs): New functions.
(test_vector_ops): Call test_vector_subregs for integer vector
modes with at least 2 elements.
From-SVN: r275959
|
|
-fno-forward-propagate -fno-sched-pressure)
PR rtl-optimization/89795
* rtlanal.c (nonzero_bits1) <SUBREG>: Do not propagate results from
inner REGs to paradoxical SUBREGs if WORD_REGISTER_OPERATIONS is set.
From-SVN: r275635
|
|
CALL_USED_REGISTERS and call_used_regs infamously contain all fixed
registers (hence the need for CALL_REALLY_USED_REGISTERS etc.).
We try to recover from this to some extent with:
/* Contains 1 for registers that are set or clobbered by calls. */
/* ??? Ideally, this would be just call_used_regs plus global_regs, but
for someone's bright idea to have call_used_regs strictly include
fixed_regs. Which leaves us guessing as to the set of fixed_regs
that are actually preserved. We know for sure that those associated
with the local stack frame are safe, but scant others. */
HARD_REG_SET x_regs_invalidated_by_call;
Since global registers are added to fixed_reg_set and call_used_reg_set
too, it's always the case that:
call_used_reg_set == regs_invalidated_by_call | fixed_reg_set
This patch replaces all uses of call_used_reg_set with a new macro
call_used_or_fixed_regs to make this clearer.
This is part of a series that allows call_used_regs to be what is
now call_really_used_regs. It's a purely mechanical replacement;
later patches clean up obvious oddities like
"call_used_or_fixed_regs & ~fixed_regs".
2019-09-10 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* hard-reg-set.h (target_hard_regs::x_call_used_reg_set): Delete.
(call_used_reg_set): Delete.
(call_used_or_fixed_regs): New macro.
* reginfo.c (init_reg_sets_1, globalize_reg): Remove initialization
of call_used_reg_set.
* caller-save.c (setup_save_areas): Use call_used_or_fixed_regs
instead of call_used_regs.
(save_call_clobbered_regs): Likewise.
* cfgcleanup.c (old_insns_match_p): Likewise.
* config/c6x/c6x.c (c6x_call_saved_register_used): Likewise.
* config/epiphany/epiphany.c (epiphany_conditional_register_usage):
Likewise.
* config/frv/frv.c (frv_ifcvt_modify_tests): Likewise.
* config/sh/sh.c (output_stack_adjust): Likewise.
* final.c (collect_fn_hard_reg_usage): Likewise.
* ira-build.c (ira_build): Likewise.
* ira-color.c (calculate_saved_nregs): Likewise.
(allocno_reload_assign, calculate_spill_cost): Likewise.
* ira-conflicts.c (ira_build_conflicts): Likewise.
* ira-costs.c (ira_tune_allocno_costs): Likewise.
* ira-lives.c (process_bb_node_lives): Likewise.
* ira.c (setup_reg_renumber): Likewise.
* lra-assigns.c (find_hard_regno_for_1, lra_assign): Likewise.
* lra-constraints.c (need_for_call_save_p): Likewise.
(need_for_split_p, inherit_in_ebb): Likewise.
* lra-lives.c (process_bb_lives): Likewise.
* lra-remat.c (call_used_input_regno_present_p): Likewise.
* postreload.c (reload_combine): Likewise.
* regrename.c (find_rename_reg): Likewise.
* reload1.c (reload_as_needed): Likewise.
* rtlanal.c (find_all_hard_reg_sets): Likewise.
* sel-sched.c (mark_unavailable_hard_regs): Likewise.
* shrink-wrap.c (requires_stack_frame_p): Likewise.
From-SVN: r275600
|
|
Only one caller (in dwarf2out.c) was preventing get_call_rtx_from
from taking an rtx_insn *. Since that caller just passes a PATTERN,
it's a trivial change to make.
2019-09-10 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* rtl.h (get_call_rtx_from): Take a const rtx_insn * instead of an rtx.
* rtlanal.c (get_call_rtx_from): Likewise.
* dwarf2out.c (dwarf2out_var_location): Pass the insn rather
than the pattern to get_call_rtx_from.
* config/i386/i386-expand.h (ix86_notrack_prefixed_insn_p): Take
an rtx_insn * instead of an rtx.
* config/i386/i386-expand.c (ix86_notrack_prefixed_insn_p): Likewise.
From-SVN: r275593
|
|
Use "x |= y" instead of "IOR_HARD_REG_SET (x, y)" (or just "x | y"
if the result is a temporary).
2019-09-09 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* hard-reg-set.h (HARD_REG_SET::operator|): New function.
(HARD_REG_SET::operator|=): Likewise.
(IOR_HARD_REG_SET): Delete.
* config/gcn/gcn.c (gcn_md_reorg): Use "|" instead of
IOR_HARD_REG_SET.
* config/m32c/m32c.c (m32c_register_move_cost): Likewise.
* config/s390/s390.c (s390_adjust_loop_scan_osc): Likewise.
* final.c (collect_fn_hard_reg_usage): Likewise.
* hw-doloop.c (scan_loop, optimize_loop): Likewise.
* ira-build.c (merge_hard_reg_conflicts): Likewise.
(ior_hard_reg_conflicts, create_cap_allocno, propagate_allocno_info)
(propagate_some_info_from_allocno): Likewise.
(copy_info_to_removed_store_destinations): Likewise.
* ira-color.c (add_allocno_hard_regs_to_forest, assign_hard_reg)
(allocno_reload_assign, ira_reassign_pseudos): Likewise.
(fast_allocation): Likewise.
* ira-conflicts.c (ira_build_conflicts): Likewise.
* ira-lives.c (make_object_dead, process_single_reg_class_operands)
(process_bb_node_lives): Likewise.
* ira.c (setup_pressure_classes, setup_reg_class_relations): Likewise.
* lra-assigns.c (find_hard_regno_for_1): Likewise.
(setup_live_pseudos_and_spill_after_risky_transforms): Likewise.
* lra-constraints.c (process_alt_operands, inherit_in_ebb): Likewise.
* lra-eliminations.c (spill_pseudos, update_reg_eliminate): Likewise.
* lra-lives.c (mark_pseudo_dead, check_pseudos_live_through_calls)
(process_bb_lives): Likewise.
* lra-spills.c (assign_spill_hard_regs): Likewise.
* postreload.c (reload_combine): Likewise.
* reginfo.c (init_reg_sets_1): Likewise.
* regrename.c (merge_overlapping_regs, find_rename_reg)
(merge_chains): Likewise.
* reload1.c (maybe_fix_stack_asms, order_regs_for_reload, find_reg)
(find_reload_regs, finish_spills, choose_reload_regs_init)
(emit_reload_insns): Likewise.
* reorg.c (redundant_insn): Likewise.
* resource.c (find_dead_or_set_registers, mark_set_resources)
(mark_target_live_regs): Likewise.
* rtlanal.c (find_all_hard_reg_sets): Likewise.
* sched-deps.c (sched_analyze_insn): Likewise.
* sel-sched.c (mark_unavailable_hard_regs): Likewise.
(find_best_reg_for_expr): Likewise.
* shrink-wrap.c (try_shrink_wrapping): Likewise.
From-SVN: r275531
|
|
I have a series of patches that (as a side effect) makes all rtl
passes use the information collected by -fipa-ra. This showed up a
latent bug in the liveness tracking in regrename.c, which doesn't take
CALL_INSN_FUNCTION_USAGE into account when processing clobbers.
This actually seems to be quite a common problem with passes that use
note_stores; only a handful remember to walk CALL_INSN_FUNCTION_USAGE
too. I think it was just luck that I saw it with regrename first.
This patch tries to make things more robust by passing an insn rather
than a pattern to note_stores. The old function is still available
as note_pattern_stores for the few places that need it.
When updating callers, I've erred on the side of using note_stores
rather than note_pattern_stores, because IMO note_stores should be
the default choice and we should only use note_pattern_stores if
there's a specific reason. Specifically:
* For cselib.c, "body" may be a COND_EXEC_CODE instead of the main
insn pattern.
* For ira.c, I wasn't sure whether extending no_equiv to
CALL_INSN_FUNCTION_USAGE really made sense, since we don't do that
for normal call-clobbered registers. Same for mark_not_eliminable
in reload1.c
* Some other places only have a pattern available, and since those
places wouldn't benefit from walking CALL_INSN_FUNCTION_USAGE,
it seemed better to alter the code as little as possible.
* In the config/ changes, quite a few callers have already weeded
out CALL insns. It still seemed better to use note_stores rather
than prematurely optimise. (note_stores should tail call to
note_pattern_stores once it sees that the insn isn't a call.)
The patch also documents what SETs mean in CALL_INSN_FUNCTION_USAGE.
2019-09-09 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* rtl.h (CALL_INSN_FUNCTION_USAGE): Document what SETs mean.
(note_pattern_stores): Declare.
(note_stores): Take an rtx_insn *.
* rtlanal.c (set_of): Use note_pattern_stores instead of note_stores.
(find_all_hard_reg_sets): Pass the insn rather than its pattern to
note_stores. Remove explicit handling of CALL_INSN_FUNCTION_USAGE.
(note_stores): Take an rtx_insn * as argument and process
CALL_INSN_FUNCTION_USAGE. Rename old function to...
(note_pattern_stores): ...this.
(find_first_parameter_load): Pass the insn rather than
its pattern to note_stores.
* alias.c (memory_modified_in_insn_p, init_alias_analysis): Likewise.
* caller-save.c (setup_save_areas, save_call_clobbered_regs)
(insert_one_insn): Likewise.
* combine.c (combine_instructions): Likewise.
(likely_spilled_retval_p): Likewise.
(try_combine): Use note_pattern_stores instead of note_stores.
(record_dead_and_set_regs): Pass the insn rather than its pattern
to note_stores.
(reg_dead_at_p): Likewise.
* config/bfin/bfin.c (workaround_speculation): Likewise.
* config/c6x/c6x.c (maybe_clobber_cond): Likewise. Take an rtx_insn *
rather than an rtx.
* config/frv/frv.c (frv_registers_update): Use note_pattern_stores
instead of note_stores.
(frv_optimize_membar_local): Pass the insn rather than its pattern
to note_stores.
* config/gcn/gcn.c (gcn_md_reorg): Likewise.
* config/i386/i386.c (ix86_avx_u128_mode_after): Likewise.
* config/mips/mips.c (vr4130_true_reg_dependence_p): Likewise.
(r10k_needs_protection_p, mips_sim_issue_insn): Likewise.
(mips_reorg_process_insns): Likewise.
* config/s390/s390.c (s390_regs_ever_clobbered): Likewise.
* config/sh/sh.c (flow_dependent_p): Likewise. Take rtx_insn *s
rather than rtxes.
* cse.c (delete_trivially_dead_insns): Pass the insn rather than
its pattern to note_stores.
* cselib.c (cselib_record_sets): Use note_pattern_stores instead
of note_stores.
* dce.c (mark_nonreg_stores): Remove the "body" parameter and pass
the insn to note_stores.
(prescan_insns_for_dce): Update call accordingly.
* ddg.c (mem_write_insn_p): Pass the insn rather than its pattern
to note_stores.
* df-problems.c (can_move_insns_across): Likewise.
* dse.c (emit_inc_dec_insn_before, replace_read): Likewise.
* function.c (assign_parm_setup_reg): Likewise.
* gcse-common.c (record_last_mem_set_info_common): Likewise.
* gcse.c (load_killed_in_block_p, compute_hash_table_work): Likewise.
(single_set_gcse): Likewise.
* ira.c (validate_equiv_mem): Likewise.
(update_equiv_regs): Use note_pattern_stores rather than note_stores
for no_equiv.
* loop-doloop.c (doloop_optimize): Pass the insn rather than its
pattern to note_stores.
* loop-invariant.c (calculate_loop_reg_pressure): Likewise.
* loop-iv.c (simplify_using_initial_values): Likewise.
* mode-switching.c (optimize_mode_switching): Likewise.
* optabs.c (emit_libcall_block_1): Likewise.
(expand_atomic_compare_and_swap): Likewise.
* postreload-gcse.c (load_killed_in_block_p): Likewise.
(record_opr_changes): Likewise. Remove explicit handling of
CALL_INSN_FUNCTION_USAGE.
* postreload.c (reload_combine, reload_cse_move2add): Likewise.
* regcprop.c (kill_clobbered_values): Likewise.
(copyprop_hardreg_forward_1): Pass the insn rather than its pattern
to note_stores.
* regrename.c (build_def_use): Likewise.
* reload1.c (reload): Use note_pattern_stores instead of note_stores
for mark_not_eliminable.
(reload_as_needed): Pass the insn rather than its pattern
to note_stores.
(emit_output_reload_insns): Likewise.
* resource.c (mark_target_live_regs): Likewise.
* sched-deps.c (init_insn_reg_pressure_info): Likewise.
* sched-rgn.c (sets_likely_spilled): Use note_pattern_stores
instead of note_stores.
* shrink-wrap.c (try_shrink_wrapping): Pass the insn rather than
its pattern to note_stores.
* stack-ptr-mod.c (pass_stack_ptr_mod::execute): Likewise.
* var-tracking.c (adjust_insn, add_with_sets): Likewise.
From-SVN: r275527
|
|
* rtlanal.c (tablejump_casesi_pattern): New function, to
determine if a tablejump insn is a casesi dispatcher. Extracted
from patch_jump_insn.
* rtl.h (tablejump_casesi_pattern): Declare.
* cfgrtl.c (patch_jump_insn): Use it.
* dwarf2cfi.c (create_trace_edges): Use it.
testsuite/
* gnat.dg/casesi.ad[bs], test_casesi.adb: New test.
From-SVN: r274377
|
|
Two fixes for UB when handling very large offsets. The calculation in
force_int_to_mode would have been correct if signed integers used modulo
arithmetic, so just switch to unsigned types. The calculation in
rtx_addr_can_trap_p_1 didn't handle overflow properly, so switch to
known_subrange_p instead (which is supposed to handle all cases).
2019-04-18 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR middle-end/85164
* combine.c (force_int_to_mode): Cast the argument rather than
the result of known_alignment.
* rtlanal.c (rtx_addr_can_trap_p_1): Use known_subrange_p.
gcc/testsuite/
PR middle-end/85164
* gcc.dg/pr85164-1.c, gcc.dg/pr85164-2.c: New tests.
From-SVN: r270442
|
|
PR rtl-optimization/89445
* simplify-rtx.c (simplify_ternary_operation): Don't use
simplify_merge_mask on operands that may trap.
* rtlanal.c (may_trap_p_1): Use FLOAT_MODE_P instead of
SCALAR_FLOAT_MODE_P checks. For integral division by zero, if
second operand is CONST_VECTOR, check if any element could be zero.
Don't expect traps for VEC_{MERGE,SELECT,CONCAT,DUPLICATE} unless
their operands can trap.
* gcc.target/i386/avx512f-pr89445.c: New test.
From-SVN: r269176
|
|
as the epilogue isn't completed.
* rtlanal.c (get_initial_register_offset): Fall back to the estimate
as long as the epilogue isn't completed.
From-SVN: r269013
|
|
2019-01-09 Sandra Loosemore <sandra@codesourcery.com>
PR other/16615 [1/5]
contrib/
* mklog: Mechanically replace "can not" with "cannot".
gcc/
* Makefile.in: Mechanically replace "can not" with "cannot".
* alias.c: Likewise.
* builtins.c: Likewise.
* calls.c: Likewise.
* cgraph.c: Likewise.
* cgraph.h: Likewise.
* cgraphclones.c: Likewise.
* cgraphunit.c: Likewise.
* combine-stack-adj.c: Likewise.
* combine.c: Likewise.
* common/config/i386/i386-common.c: Likewise.
* config/aarch64/aarch64.c: Likewise.
* config/alpha/sync.md: Likewise.
* config/arc/arc.c: Likewise.
* config/arc/predicates.md: Likewise.
* config/arm/arm-c.c: Likewise.
* config/arm/arm.c: Likewise.
* config/arm/arm.h: Likewise.
* config/arm/arm.md: Likewise.
* config/arm/cortex-r4f.md: Likewise.
* config/csky/csky.c: Likewise.
* config/csky/csky.h: Likewise.
* config/darwin-f.c: Likewise.
* config/epiphany/epiphany.md: Likewise.
* config/i386/i386.c: Likewise.
* config/i386/sol2.h: Likewise.
* config/m68k/m68k.c: Likewise.
* config/mcore/mcore.h: Likewise.
* config/microblaze/microblaze.md: Likewise.
* config/mips/20kc.md: Likewise.
* config/mips/sb1.md: Likewise.
* config/nds32/nds32.c: Likewise.
* config/nds32/predicates.md: Likewise.
* config/pa/pa.c: Likewise.
* config/rs6000/e300c2c3.md: Likewise.
* config/rs6000/rs6000.c: Likewise.
* config/s390/s390.h: Likewise.
* config/sh/sh.c: Likewise.
* config/sh/sh.md: Likewise.
* config/spu/vmx2spu.h: Likewise.
* cprop.c: Likewise.
* dbxout.c: Likewise.
* df-scan.c: Likewise.
* doc/cfg.texi: Likewise.
* doc/extend.texi: Likewise.
* doc/fragments.texi: Likewise.
* doc/gty.texi: Likewise.
* doc/invoke.texi: Likewise.
* doc/lto.texi: Likewise.
* doc/md.texi: Likewise.
* doc/objc.texi: Likewise.
* doc/rtl.texi: Likewise.
* doc/tm.texi: Likewise.
* dse.c: Likewise.
* emit-rtl.c: Likewise.
* emit-rtl.h: Likewise.
* except.c: Likewise.
* expmed.c: Likewise.
* expr.c: Likewise.
* fold-const.c: Likewise.
* genautomata.c: Likewise.
* gimple-fold.c: Likewise.
* hard-reg-set.h: Likewise.
* ifcvt.c: Likewise.
* ipa-comdats.c: Likewise.
* ipa-cp.c: Likewise.
* ipa-devirt.c: Likewise.
* ipa-fnsummary.c: Likewise.
* ipa-icf.c: Likewise.
* ipa-inline-transform.c: Likewise.
* ipa-inline.c: Likewise.
* ipa-polymorphic-call.c: Likewise.
* ipa-profile.c: Likewise.
* ipa-prop.c: Likewise.
* ipa-pure-const.c: Likewise.
* ipa-reference.c: Likewise.
* ipa-split.c: Likewise.
* ipa-visibility.c: Likewise.
* ipa.c: Likewise.
* ira-build.c: Likewise.
* ira-color.c: Likewise.
* ira-conflicts.c: Likewise.
* ira-costs.c: Likewise.
* ira-int.h: Likewise.
* ira-lives.c: Likewise.
* ira.c: Likewise.
* ira.h: Likewise.
* loop-invariant.c: Likewise.
* loop-unroll.c: Likewise.
* lower-subreg.c: Likewise.
* lra-assigns.c: Likewise.
* lra-constraints.c: Likewise.
* lra-eliminations.c: Likewise.
* lra-lives.c: Likewise.
* lra-remat.c: Likewise.
* lra-spills.c: Likewise.
* lra.c: Likewise.
* lto-cgraph.c: Likewise.
* lto-streamer-out.c: Likewise.
* postreload-gcse.c: Likewise.
* predict.c: Likewise.
* profile-count.h: Likewise.
* profile.c: Likewise.
* recog.c: Likewise.
* ree.c: Likewise.
* reload.c: Likewise.
* reload1.c: Likewise.
* reorg.c: Likewise.
* resource.c: Likewise.
* rtl.def: Likewise.
* rtl.h: Likewise.
* rtlanal.c: Likewise.
* sched-deps.c: Likewise.
* sched-ebb.c: Likewise.
* sched-rgn.c: Likewise.
* sel-sched-ir.c: Likewise.
* sel-sched.c: Likewise.
* shrink-wrap.c: Likewise.
* simplify-rtx.c: Likewise.
* symtab.c: Likewise.
* target.def: Likewise.
* toplev.c: Likewise.
* tree-call-cdce.c: Likewise.
* tree-cfg.c: Likewise.
* tree-complex.c: Likewise.
* tree-core.h: Likewise.
* tree-eh.c: Likewise.
* tree-inline.c: Likewise.
* tree-loop-distribution.c: Likewise.
* tree-nrv.c: Likewise.
* tree-profile.c: Likewise.
* tree-sra.c: Likewise.
* tree-ssa-alias.c: Likewise.
* tree-ssa-dce.c: Likewise.
* tree-ssa-dom.c: Likewise.
* tree-ssa-forwprop.c: Likewise.
* tree-ssa-loop-im.c: Likewise.
* tree-ssa-loop-ivcanon.c: Likewise.
* tree-ssa-loop-ivopts.c: Likewise.
* tree-ssa-loop-niter.c: Likewise.
* tree-ssa-phionlycprop.c: Likewise.
* tree-ssa-phiopt.c: Likewise.
* tree-ssa-propagate.c: Likewise.
* tree-ssa-threadedge.c: Likewise.
* tree-ssa-threadupdate.c: Likewise.
* tree-ssa-uninit.c: Likewise.
* tree-ssanames.c: Likewise.
* tree-streamer-out.c: Likewise.
* tree.c: Likewise.
* tree.h: Likewise.
* vr-values.c: Likewise.
gcc/ada/
* exp_ch9.adb: Mechanically replace "can not" with "cannot".
* libgnat/s-regpat.ads: Likewise.
* par-ch4.adb: Likewise.
* set_targ.adb: Likewise.
* types.ads: Likewise.
gcc/cp/
* cp-tree.h: Mechanically replace "can not" with "cannot".
* parser.c: Likewise.
* pt.c: Likewise.
gcc/fortran/
* class.c: Mechanically replace "can not" with "cannot".
* decl.c: Likewise.
* expr.c: Likewise.
* gfc-internals.texi: Likewise.
* intrinsic.texi: Likewise.
* invoke.texi: Likewise.
* io.c: Likewise.
* match.c: Likewise.
* parse.c: Likewise.
* primary.c: Likewise.
* resolve.c: Likewise.
* symbol.c: Likewise.
* trans-array.c: Likewise.
* trans-decl.c: Likewise.
* trans-intrinsic.c: Likewise.
* trans-stmt.c: Likewise.
gcc/go/
* go-backend.c: Mechanically replace "can not" with "cannot".
* go-gcc.cc: Likewise.
gcc/lto/
* lto-partition.c: Mechanically replace "can not" with "cannot".
* lto-symtab.c: Likewise.
* lto.c: Likewise.
gcc/objc/
* objc-act.c: Mechanically replace "can not" with "cannot".
libbacktrace/
* backtrace.h: Mechanically replace "can not" with "cannot".
libgcc/
* config/c6x/libunwind.S: Mechanically replace "can not" with
"cannot".
* config/tilepro/atomic.h: Likewise.
* config/vxlib-tls.c: Likewise.
* generic-morestack-thread.c: Likewise.
* generic-morestack.c: Likewise.
* mkmap-symver.awk: Likewise.
libgfortran/
* caf/single.c: Mechanically replace "can not" with "cannot".
* io/unit.c: Likewise.
libobjc/
* class.c: Mechanically replace "can not" with "cannot".
* objc/runtime.h: Likewise.
* sendmsg.c: Likewise.
liboffloadmic/
* include/coi/common/COIResult_common.h: Mechanically replace
"can not" with "cannot".
* include/coi/source/COIBuffer_source.h: Likewise.
libstdc++-v3/
* include/ext/bitmap_allocator.h: Mechanically replace "can not"
with "cannot".
From-SVN: r267783
|
|
From-SVN: r267494
|
|
By the time peephole optimizations run, we've already made up our mind
whether to use base-register or relative addressing for literal pool
entries. LT(G) supports only base-register addressing, and so it is
too late to convert L(G)RL + compare to LT(G). This change should not
make the code worse unless building with e.g. -fno-dce, since comparing
literal pool entries to zero should be optimized away during earlier
passes.
gcc/ChangeLog:
2018-11-20 Ilya Leoshkevich <iii@linux.ibm.com>
PR target/88083
* config/s390/s390.md: Skip LT(G) peephole when literal pool is
involved.
* rtl.h (contains_constant_pool_address_p): New function.
* rtlanal.c (contains_constant_pool_address_p): Likewise.
gcc/testsuite/ChangeLog:
2018-11-20 Ilya Leoshkevich <iii@linux.ibm.com>
PR target/88083
* gcc.target/s390/pr88083.c: New test.
From-SVN: r266306
|
|
combine at -02)
PR rtl-optimization/85925
* rtl.h (word_register_operation_p): New predicate.
* combine.c (record_dead_and_set_regs_1): Only apply specific handling
for WORD_REGISTER_OPERATIONS targets to word_register_operation_p RTX.
* rtlanal.c (nonzero_bits1): Likewise. Adjust couple of comments.
(num_sign_bit_copies1): Likewise.
From-SVN: r266302
|
|
PR rtl-optimization/87361
* rtlanal.c (nonzero_bits1): Revert accidental change.
From-SVN: r264420
|
|
Combine will put CLOBBER (with a non-void mode) anywhere in a pattern
to poison it. reg_overlap_mentioned_p did not handle this. This patch
fixes that.
PR rtl-optimization/86882
* rtlanal.c (reg_overlap_mentioned_p): Handle CLOBBER.
From-SVN: r264400
|
|
gcc/
* alias.c (record_set): Check for clobber high.
* cfgexpand.c (expand_gimple_stmt): Likewise.
* combine-stack-adj.c (single_set_for_csa): Likewise.
* combine.c (find_single_use_1): Likewise.
(set_nonzero_bits_and_sign_copies): Likewise.
(get_combine_src_dest): Likewise.
(is_parallel_of_n_reg_sets): Likewise.
(try_combine): Likewise.
(record_dead_and_set_regs_1): Likewise.
(reg_dead_at_p_1): Likewise.
(reg_dead_at_p): Likewise.
* dce.c (deletable_insn_p): Likewise.
(mark_nonreg_stores_1): Likewise.
(mark_nonreg_stores_2): Likewise.
* df-scan.c (df_find_hard_reg_defs): Likewise.
(df_uses_record): Likewise.
(df_get_call_refs): Likewise.
* dwarf2out.c (mem_loc_descriptor): Likewise.
* haifa-sched.c (haifa_classify_rtx): Likewise.
* ira-build.c (create_insn_allocnos): Likewise.
* ira-costs.c (scan_one_insn): Likewise.
* ira.c (equiv_init_movable_p): Likewise.
(rtx_moveable_p): Likewise.
(interesting_dest_for_shprep): Likewise.
* jump.c (mark_jump_label_1): Likewise.
* postreload-gcse.c (record_opr_changes): Likewise.
* postreload.c (reload_cse_simplify): Likewise.
(struct reg_use): Add source expr.
(reload_combine): Check for clobber high.
(reload_combine_note_use): Likewise.
(reload_cse_move2add): Likewise.
(move2add_note_store): Likewise.
* print-rtl.c (print_pattern): Likewise.
* recog.c (decode_asm_operands): Likewise.
(store_data_bypass_p): Likewise.
(if_test_bypass_p): Likewise.
* regcprop.c (kill_clobbered_value): Likewise.
(kill_set_value): Likewise.
* reginfo.c (reg_scan_mark_refs): Likewise.
* reload1.c (maybe_fix_stack_asms): Likewise.
(eliminate_regs_1): Likewise.
(elimination_effects): Likewise.
(mark_not_eliminable): Likewise.
(scan_paradoxical_subregs): Likewise.
(forget_old_reloads_1): Likewise.
* reorg.c (find_end_label): Likewise.
(try_merge_delay_insns): Likewise.
(redundant_insn): Likewise.
(own_thread_p): Likewise.
(fill_simple_delay_slots): Likewise.
(fill_slots_from_thread): Likewise.
(dbr_schedule): Likewise.
* resource.c (update_live_status): Likewise.
(mark_referenced_resources): Likewise.
(mark_set_resources): Likewise.
* rtl.c (copy_rtx): Likewise.
* rtlanal.c (reg_referenced_p): Likewise.
(single_set_2): Likewise.
(noop_move_p): Likewise.
(note_stores): Likewise.
* sched-deps.c (sched_analyze_reg): Likewise.
(sched_analyze_insn): Likewise.
From-SVN: r263331
|
|
gcc/
* rtl.h (reg_is_clobbered_by_clobber_high): Add declarations.
* rtlanal.c (reg_is_clobbered_by_clobber_high): Add function.
From-SVN: r263328
|
|
gcc/
PR target/84711
* rtlanal.c (set_noop_p): Constrain on mode change,
include hard-reg-set.h
gcc/testuite/
PR target/84711
* gcc.dg/vect/pr84711.c: New.
From-SVN: r262435
|
|
This patch generalises various places that used hwi rtx accessors so
that they can handle poly_ints instead. In many cases these changes
are by inspection rather than because something had shown them to be
necessary.
2018-06-12 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* poly-int.h (can_div_trunc_p): Add new overload in which all values
are poly_ints.
* alias.c (get_addr): Extend CONST_INT handling to poly_int_rtx_p.
(memrefs_conflict_p): Likewise.
(init_alias_analysis): Likewise.
* cfgexpand.c (expand_debug_expr): Likewise.
* combine.c (combine_simplify_rtx, force_int_to_mode): Likewise.
* cse.c (fold_rtx): Likewise.
* explow.c (adjust_stack, anti_adjust_stack): Likewise.
* expr.c (emit_block_move_hints): Likewise.
(clear_storage_hints, push_block, emit_push_insn): Likewise.
(store_expr_with_bounds, reduce_to_bit_field_precision): Likewise.
(emit_group_load_1): Use rtx_to_poly_int64 for group offsets.
(emit_group_store): Likewise.
(find_args_size_adjust): Use strip_offset. Use rtx_to_poly_int64
to read the PRE/POST_MODIFY increment.
* calls.c (store_one_arg): Use strip_offset.
* rtlanal.c (rtx_addr_can_trap_p_1): Extend CONST_INT handling to
poly_int_rtx_p.
(set_noop_p): Use rtx_to_poly_int64 for the elements selected
by a VEC_SELECT.
* simplify-rtx.c (avoid_constant_pool_reference): Use strip_offset.
(simplify_binary_operation_1): Extend CONST_INT handling to
poly_int_rtx_p.
* var-tracking.c (compute_cfa_pointer): Take a poly_int64 rather
than a HOST_WIDE_INT.
(hard_frame_pointer_adjustment): Change from HOST_WIDE_INT to
poly_int64.
(adjust_mems, add_stores): Update accodingly.
(vt_canonicalize_addr): Track polynomial offsets.
(emit_note_insn_var_location): Likewise.
(vt_add_function_parameter): Likewise.
(vt_initialize): Likewise.
From-SVN: r261530
|
|
Cannot allocate memory)
PR rtl-optimization/81443
* rtlanal.c (num_sign_bit_copies1) <SUBREG>: Do not propagate results
from inner REGs to paradoxical SUBREGs.
From-SVN: r257724
|
|
PR rtl-optimization/83565
* rtlanal.c (nonzero_bits1): On WORD_REGISTER_OPERATIONS machines, do
not extend the result to a larger mode for rotate operations.
(num_sign_bit_copies1): Likewise.
From-SVN: r256572
|