|
g:72fbd3b2b2a497dbbe6599239bd61c5624203ed0 added a use of std::array
without explicitly forcing <array> to be included. That didn't cause
problems in my local builds but understandably did for some people.
gcc/
* doc/rtl.texi: Document the need to define INCLUDE_ARRAY before
including rtl-ssa.h.
* rtl-ssa.h: Likewise (in comment).
* config/aarch64/aarch64-cc-fusion.cc: Add INCLUDE_ARRAY.
* config/aarch64/aarch64-early-ra.cc: Likewise.
* config/riscv/riscv-avlprop.cc: Likewise.
* config/riscv/riscv-vsetvl.cc: Likewise.
* fwprop.cc: Likewise.
* late-combine.cc: Likewise.
* pair-fusion.cc: Likewise.
* rtl-ssa/accesses.cc: Likewise.
* rtl-ssa/blocks.cc: Likewise.
* rtl-ssa/changes.cc: Likewise.
* rtl-ssa/functions.cc: Likewise.
* rtl-ssa/insns.cc: Likewise.
* rtl-ssa/movement.cc: Likewise.
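As a rough illustration of the opt-in include pattern described above: a source file defines INCLUDE_ARRAY before the system header section, and only then is <array> pulled in. In GCC the #ifdef block lives in system.h; it is inlined here as a stand-in so that the sketch is self-contained, and make_pair_of_ints is a hypothetical helper.

```cpp
// Sketch of the opt-in include pattern.  The #ifdef block below is a
// stand-in (assumption) for the relevant part of GCC's system.h.
#define INCLUDE_ARRAY   // must be defined before the header section

// --- stand-in for system.h ---
#ifdef INCLUDE_ARRAY
# include <array>
#endif
// --- end of stand-in ---

#include <cassert>

// Hypothetical helper that needs std::array, much as rtl-ssa.h now does.
inline std::array<int, 2>
make_pair_of_ints (int a, int b)
{
  return {{a, b}};
}
```

Without the #define, the stand-in block would skip <array> and the helper would fail to compile unless some other header happened to include it, which is the kind of build-dependent breakage the commit describes.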
|
|
PR116044 is a regression in the testsuite on AMD GCN caused (again)
by the split_clobber_group code. The first patch in this area
(g:71b31690a7c52413496e91bcc5ee4c68af2f366f) fixed a bug caused
by carrying the old group over as one of the split ones. That
patch instead:
- created two new groups
- inserted them in the splay tree as neighbours of the old group
- removed the old group, and
- invalidated the old group (to force lazy recomputation when
a clobber's parent group is queried)
However, this left add_def trying to insert the new definition
relative to a stale splay tree root. The second patch
(g:34f33ea801563e2eabb348e8d3e9344a91abfd48) attempted to fix
that by inserting it relative to the new root. But that's not
always correct either. We specifically want to insert it after
the first of the two new groups, whether that group is the root
or not.
This patch does that, and tries to refactor the code to make
it a bit less brittle.
gcc/
PR rtl-optimization/116044
* rtl-ssa/functions.h (function_info::split_clobber_group): Return
an array of two clobber_groups.
* rtl-ssa/accesses.cc (function_info::split_clobber_group): Return
the new clobber groups. Don't modify the splay tree here.
(function_info::add_def): Update call accordingly. Generalize
the splay tree insertion code so that the new definition can be
inserted as a child of any existing node, not just the root.
Fix the insertion used after calling split_clobber_group.
|
|
In the fix for PR115928, I'd failed to notice that "root" was used
later in the function, so needed to be updated.
gcc/
PR rtl-optimization/116009
* rtl-ssa/accesses.cc (function_info::add_def): Set the root
local variable after removing the old clobber group.
gcc/testsuite/
PR rtl-optimization/116009
* gcc.c-torture/compile/pr116009.c: New test.
|
|
This patch adds debug routines for def_splay_tree, which I found
useful while debugging PR116009.
gcc/
* rtl-ssa/accesses.h (rtl_ssa::pp_def_splay_tree): Declare.
(dump, debug): Add overloads for def_splay_tree.
* rtl-ssa/accesses.cc (rtl_ssa::pp_def_splay_tree): New function.
(dump, debug): Add overloads for def_splay_tree.
|
|
In this PR, canonicalize_move_range walked off the end of a list
and triggered a null dereference. There are multiple ways of fixing
that, but I think the approach taken in the patch should be
relatively efficient.
gcc/
PR rtl-optimization/115929
* rtl-ssa/movement.h (canonicalize_move_range): Check for null prev
and next insns and create an invalid move range for them.
gcc/testsuite/
PR rtl-optimization/115929
* gcc.dg/torture/pr115929-2.c: New test.
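A minimal sketch of the null check described above, using simplified stand-in types rather than the real insn_info and insn_range_info. The names insn, move_range and canonicalize are hypothetical; the real function also takes the insn being moved and an ignore predicate.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for an instruction in a linked chain.
struct insn
{
  insn *prev = nullptr;
  insn *next = nullptr;
  bool ignored = false;   // true if the scan should skip this insn
};

// Hypothetical stand-in for a move range; null endpoints mean "invalid".
struct move_range
{
  insn *first;
  insn *last;
  bool is_valid () const { return first && last; }
  static move_range invalid () { return { nullptr, nullptr }; }
};

// Shrink the range so that neither endpoint is an ignored insn.
// If either walk runs off the end of the chain, return an invalid
// range instead of dereferencing a null pointer.
inline move_range
canonicalize (move_range r)
{
  while (r.first && r.first->ignored)
    r.first = r.first->next;
  while (r.last && r.last->ignored)
    r.last = r.last->prev;
  if (!r.first || !r.last)
    return move_range::invalid ();
  return r;
}
```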
|
|
One of the goals of the rtl-ssa representation was to allow a
group of consecutive clobbers to be skipped in constant time,
with amortised sublinear insertion and deletion. This involves
putting consecutive clobbers in groups. Splitting or joining
groups would be linear if we had to update every clobber on
each update, so the operation to query a clobber's group is
lazy and (again) amortised sublinear.
This means that, when splitting a group into two, we cannot
reuse the old group for one side. We have to invalidate it,
so that the lazy clobber_info::group query can tell that something
has changed. The ICE in the PR came from failing to do that.
gcc/
PR rtl-optimization/115928
* rtl-ssa/accesses.h (clobber_group): Add a new constructor that
takes the first, last and root clobbers.
* rtl-ssa/internals.inl (clobber_group::clobber_group): Define it.
* rtl-ssa/accesses.cc (function_info::split_clobber_group): Use it.
Allocate a new group for both sides and invalidate the previous group.
(function_info::add_def): After calling split_clobber_group,
remove the old group from the splay tree.
gcc/testsuite/
PR rtl-optimization/115928
* gcc.dg/torture/pr115928.c: New test.
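The lazy group query and the need to invalidate the old group can be sketched as follows. This is a simplified model under stated assumptions: groups are identified by a range of positions, the lookup is a linear search rather than a splay tree, and all names (group, clobber, function_model, split, group_of) are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <memory>
#include <vector>

struct group
{
  int first, last;     // inclusive range of clobber positions
  bool valid = true;   // cleared when the group has been superseded
  group (int f, int l) : first (f), last (l) {}
};

struct clobber
{
  int pos;
  group *cached;       // may be stale after a split
};

struct function_model
{
  std::vector<std::unique_ptr<group>> groups;   // owns every group ever made

  group *make_group (int first, int last)
  {
    groups.push_back (std::make_unique<group> (first, last));
    return groups.back ().get ();
  }

  // Split OLD into [first, at - 1] and [at, last].  Both halves are new
  // groups; OLD is invalidated rather than reused for one side.
  std::array<group *, 2> split (group *old, int at)
  {
    assert (old->valid && old->first < at && at <= old->last);
    group *g1 = make_group (old->first, at - 1);
    group *g2 = make_group (at, old->last);
    old->valid = false;
    return {{g1, g2}};
  }

  // Lazy query: a clobber only pays for the update when it is asked about.
  group *group_of (clobber &c)
  {
    if (!c.cached->valid)
      for (auto &g : groups)
        if (g->valid && g->first <= c.pos && c.pos <= g->last)
          {
            c.cached = g.get ();
            break;
          }
    return c.cached;
  }
};
```

If split reused the old group for one side, its valid flag would stay set and clobbers on the other side would keep returning the stale cached pointer.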
|
|
order_nodes are used to implement ordered comparisons between
two insns with the same program point number. remove_insn would
remove an order_node from its splay tree, but didn't remove it
from the insn. This caused confusion if the insn was later
reinserted somewhere else that also needed an order_node.
gcc/
PR rtl-optimization/115929
* rtl-ssa/insns.cc (function_info::remove_insn): Remove an
order_node from the instruction as well as from the splay tree.
gcc/testsuite/
PR rtl-optimization/115929
* gcc.dg/torture/pr115929-1.c: New test.
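A small sketch of the invariant that the fix restores, assuming a std::set as a stand-in for the splay tree. The types and member names here are hypothetical simplifications.

```cpp
#include <cassert>
#include <set>

struct order_node { int uid; };

struct insn
{
  int point = 0;                 // program point number
  order_node *order = nullptr;   // only needed when two insns share a point
};

struct insn_chain_model
{
  std::set<order_node *> order_tree;   // stand-in for the splay tree

  void attach_order_node (insn &i, order_node &n)
  {
    assert (!i.order);   // a stale node left on the insn would trip this
    i.order = &n;
    order_tree.insert (&n);
  }

  void remove_insn (insn &i)
  {
    if (i.order)
      {
        order_tree.erase (i.order);
        i.order = nullptr;   // the step that was previously missing
      }
  }
};
```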
|
|
The asm in the testcase has a memory operand and also clobbers ax.
The clobber means that ax cannot be used to hold inputs, which
extends to the address of the memory.
I think I had an implicit assumption that constrain_operands
would enforce this, but in hindsight, that clearly wasn't going
to be true. constrain_operands only looks at constraints, and
these clobbers are by definition outside the constraint system.
(And that's why they have to be handled conservatively, since there's
no way to distinguish the earlyclobber and non-earlyclobber cases.)
The semantics of hard-coded clobbers are generic enough that I think
they should be handled directly by rtl-ssa, rather than by consumers.
And in the context of rtl-ssa, the easiest way to check for a clash is
to walk the list of input registers, which we already have to hand.
It therefore seemed better not to push this down to a more generic
rtl helper.
The patch detects hard-coded clobbers in the same way as regrename:
by temporarily stubbing out the operands with pc_rtx.
gcc/
PR rtl-optimization/115891
* rtl-ssa/changes.cc (find_clobbered_access): New function.
(recog_level2): Use it to check for overlap between input
registers and hard-coded clobbers. Conditionally reset
recog_data.insn after changing the insn code.
gcc/testsuite/
PR rtl-optimization/115891
* gcc.target/i386/pr115891.c: New test.
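The clash check described above amounts to walking the input registers and testing each against the hard-coded clobbers. A minimal sketch, assuming registers are plain numbers and ignoring multi-register modes; the function name is hypothetical and is not the find_clobbered_access from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Return true if any input register is also a hard-coded clobber.
// Clobbers are handled conservatively: any overlap is a clash, since
// there is no earlyclobber/non-earlyclobber distinction to consult.
inline bool
inputs_clash_with_clobbers (const std::vector<unsigned> &input_regs,
                            const std::vector<unsigned> &clobbered_regs)
{
  return std::any_of (input_regs.begin (), input_regs.end (),
                      [&] (unsigned regno)
                      {
                        return std::find (clobbered_regs.begin (),
                                          clobbered_regs.end (), regno)
                               != clobbered_regs.end ();
                      });
}
```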
|
|
Bit of a brown paper bag issue, but: due to the representation
of the insn chain, insn_info::prev_any_insn would sometimes skip
over instructions. This led to an invalid update in the PR when
adding and removing instructions.
I think one of the reasons I failed to spot this when checking
the code is that m_prev_insn_or_last_debug_insn is misnamed:
it's the previous instruction *of the same type* or the last
debug instruction in a group. The patch therefore renames it to
m_prev_sametype_or_last_debug_insn (with the term prev_sametype
already being used in some accessors).
The reason this didn't show up earlier is that (a) prev_any_insn
is rarely used directly, (b) no instructions were lost from the
def-use chains, and (c) only consecutive debug instructions were
skipped when walking the insn chain.
The chaining scheme makes prev_any_insn more complicated than
next_any_insn, prev_nondebug_insn and next_nondebug_insn, but the
object code produced is still relatively simple.
gcc/
PR rtl-optimization/115785
* rtl-ssa/insns.h (insn_info::prev_insn_or_last_debug_insn)
(insn_info::next_nondebug_or_debug_insn): Remove typedefs.
(insn_info::m_prev_insn_or_last_debug_insn): Rename to...
(insn_info::m_prev_sametype_or_last_debug_insn): ...this.
* rtl-ssa/internals.inl (insn_info::insn_info): Update after
above renaming.
(insn_info::copy_prev_from): Likewise.
(insn_info::set_prev_sametype_insn): Likewise.
(insn_info::set_last_debug_insn): Likewise.
(insn_info::clear_insn_links): Likewise.
(insn_info::has_insn_links): Likewise.
* rtl-ssa/member-fns.inl (insn_info::prev_nondebug_insn): Likewise.
(insn_info::prev_any_insn): Fix moves from non-debug to debug insns.
gcc/testsuite/
PR rtl-optimization/115785
* g++.dg/torture/pr115785.C: New test.
|
|
change_insns is used to change multiple instructions at once, so that
the IR on return is valid & self-consistent. These changes can involve
moving instructions, and the new position for one instruction might
be expressed in terms of the old position of another instruction
that is changing at the same time.
change_insns therefore adds placeholder instructions to mark each
new instruction position, then replaces each placeholder with the
corresponding real instruction. This replacement was done in two
steps: removing the old placeholder instruction and inserting the new
real instruction. But it's more convenient for the upcoming fix for
PR115785 if we do the operation as a single step. That should also
be slightly more efficient, since e.g. no splay tree operations are
needed.
This operation happens purely on the rtl-ssa instruction chain.
The placeholders are never represented in rtl.
gcc/
PR rtl-optimization/115785
* rtl-ssa/functions.h (function_info::replace_nondebug_insn): Declare.
* rtl-ssa/insns.h (insn_info::order_node::set_uid): New function.
(insn_info::remove_note): Declare.
* rtl-ssa/insns.cc (insn_info::remove_note): New function.
(function_info::replace_nondebug_insn): Likewise.
* rtl-ssa/changes.cc (function_info::change_insns): Use
replace_nondebug_insn instead of remove_insn + add_insn.
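Replacing a placeholder in one step can be pictured as splicing the real instruction into the placeholder's slot in a doubly-linked chain. A minimal sketch with hypothetical types; the real replace_nondebug_insn also has to deal with order nodes and debug instruction links.

```cpp
#include <cassert>

struct node
{
  node *prev = nullptr;
  node *next = nullptr;
  int point = 0;   // position number, inherited by the replacement
};

// Put NEW_NODE in the slot occupied by OLD_NODE.  The neighbours and the
// position number are taken over directly, so no separate removal and
// reinsertion (and no position lookup) is needed.
inline void
replace_node (node *old_node, node *new_node)
{
  assert (!new_node->prev && !new_node->next);
  new_node->prev = old_node->prev;
  new_node->next = old_node->next;
  new_node->point = old_node->point;
  if (old_node->prev)
    old_node->prev->next = new_node;
  if (old_node->next)
    old_node->next->prev = new_node;
  old_node->prev = nullptr;
  old_node->next = nullptr;
}
```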
|
|
rtl-ssa has routines for scanning forwards or backwards for something
under the control of an exclusion set. These searches are currently
used for two main things:
- to work out where an instruction can be moved within its EBB
- to work out whether recog can add a new hard register clobber
The exclusion set was originally a callback function that returned
true for insns that should be ignored. However, for the late-combine
work, I'd also like to be able to skip an entire definition, along
with all its uses.
This patch prepares for that by turning the exclusion set into an
object that provides predicate member functions. Currently the
only two member functions are:
- should_ignore_insn: what the old callback did
- should_ignore_def: the new functionality
but more could be added later.
Doing this also makes it easy to remove some asymmetry that I think
in hindsight was a mistake: in forward scans, ignoring an insn meant
ignoring all definitions in that insn (ok) and all uses of those
definitions (non-obvious). The new interface makes it possible
to select the required behaviour, with that behaviour being applied
consistently in both directions.
Now that the exclusion set is a dedicated object, rather than
just a "random" function, I think it makes sense to remove the
_ignoring suffix from the function names. The suffix was originally
there to describe the callback, and in particular to emphasise that
a true return meant "ignore" rather than "heed".
gcc/
* rtl-ssa.h: Include predicates.h.
* rtl-ssa/predicates.h: New file.
* rtl-ssa/access-utils.h (prev_call_clobbers_ignoring): Rename to...
(prev_call_clobbers): ...this and treat the ignore parameter as an
object with the same interface as ignore_nothing.
(next_call_clobbers_ignoring): Rename to...
(next_call_clobbers): ...this and treat the ignore parameter as an
object with the same interface as ignore_nothing.
(first_nondebug_insn_use_ignoring): Rename to...
(first_nondebug_insn_use): ...this and treat the ignore parameter as
an object with the same interface as ignore_nothing.
(last_nondebug_insn_use_ignoring): Rename to...
(last_nondebug_insn_use): ...this and treat the ignore parameter as
an object with the same interface as ignore_nothing.
(last_access_ignoring): Rename to...
(last_access): ...this and treat the ignore parameter as an object
with the same interface as ignore_nothing. Conditionally skip
definitions.
(prev_access_ignoring): Rename to...
(prev_access): ...this and treat the ignore parameter as an object
with the same interface as ignore_nothing.
(first_def_ignoring): Replace with...
(first_access): ...this new function.
(next_access_ignoring): Rename to...
(next_access): ...this and treat the ignore parameter as an object
with the same interface as ignore_nothing. Conditionally skip
definitions.
* rtl-ssa/change-utils.h (insn_is_changing): Delete.
(restrict_movement_ignoring): Rename to...
(restrict_movement): ...this and treat the ignore parameter as an
object with the same interface as ignore_nothing.
(recog_ignoring): Rename to...
(recog): ...this and treat the ignore parameter as an object with
the same interface as ignore_nothing.
* rtl-ssa/changes.h (insn_is_changing_closure): Delete.
* rtl-ssa/functions.h (function_info::add_regno_clobber): Treat
the ignore parameter as an object with the same interface as
ignore_nothing.
* rtl-ssa/insn-utils.h (insn_is): Delete.
* rtl-ssa/insns.h (insn_is_closure): Delete.
* rtl-ssa/member-fns.inl
(insn_is_changing_closure::insn_is_changing_closure): Delete.
(insn_is_changing_closure::operator()): Likewise.
(function_info::add_regno_clobber): Treat the ignore parameter
as an object with the same interface as ignore_nothing.
(ignore_changing_insns::ignore_changing_insns): New function.
(ignore_changing_insns::should_ignore_insn): Likewise.
* rtl-ssa/movement.h (restrict_movement_for_dead_range): Treat
the ignore parameter as an object with the same interface as
ignore_nothing.
(restrict_movement_for_defs_ignoring): Rename to...
(restrict_movement_for_defs): ...this and treat the ignore parameter
as an object with the same interface as ignore_nothing.
(restrict_movement_for_uses_ignoring): Rename to...
(restrict_movement_for_uses): ...this and treat the ignore parameter
as an object with the same interface as ignore_nothing. Conditionally
skip definitions.
* doc/rtl.texi: Update for above name changes. Use
ignore_changing_insns instead of insn_is_changing.
* config/aarch64/aarch64-cc-fusion.cc (cc_fusion::parallelize_insns):
Likewise.
* pair-fusion.cc (no_ignore): Delete.
(latest_hazard_before, first_hazard_after): Update for above name
changes. Use ignore_nothing instead of no_ignore.
(pair_fusion_bb_info::fuse_pair): Update for above name changes.
Use ignore_changing_insns instead of insn_is_changing.
(pair_fusion::try_promote_writeback): Likewise.
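The shape of the new exclusion-set interface can be sketched as below. The member function names should_ignore_insn and should_ignore_def come from the commit message; the simplified insn and def types, the exact signatures, ignore_one_insn and first_relevant_insn are assumptions made for illustration.

```cpp
#include <cassert>
#include <vector>

struct insn { int uid; };
struct def { int regno; };

// Predicate object that ignores nothing.
struct ignore_nothing
{
  bool should_ignore_insn (const insn *) const { return false; }
  bool should_ignore_def (const def *) const { return false; }
};

// Hypothetical predicate object that ignores one specific insn.
// should_ignore_def is inherited, so definitions are never ignored.
struct ignore_one_insn : ignore_nothing
{
  const insn *m_insn;
  explicit ignore_one_insn (const insn *i) : m_insn (i) {}
  bool should_ignore_insn (const insn *i) const { return i == m_insn; }
};

// Hypothetical forward scan, templated on the predicate object in the
// same way that the renamed rtl-ssa routines are.
template<typename IgnorePredicates>
const insn *
first_relevant_insn (const std::vector<insn> &insns, IgnorePredicates ignore)
{
  for (const insn &i : insns)
    if (!ignore.should_ignore_insn (&i))
      return &i;
  return nullptr;
}
```

Because the predicate is a template parameter rather than a virtual interface, further member functions can be added later without changing how existing callers pass the object.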
|
|
gcc/
* fwprop.cc (try_fwprop_subst_pattern): Invoke change_is_worthwhile
to judge if a replacement is worthwhile. Remove single_set check
and add is_debug_insn check.
* recog.cc (swap_change): Invalidate recog_data when the cached INSN
is swapped out.
* rtl-ssa/changes.cc (rtl_ssa::changes_are_worthwhile): Check if the
insn cost of new rtl is unknown and fail the replacement.
|
|
No-op moves are given the code NOOP_MOVE_INSN_CODE if we plan
to delete them later. Such insns shouldn't be costed, partly
because they're going to disappear, and partly because targets
won't recognise the insn code.
gcc/
* rtl-ssa/changes.cc (rtl_ssa::changes_are_worthwhile): Don't
cost no-op moves.
* rtl-ssa/insns.cc (insn_info::calculate_cost): Likewise.
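As a sketch of the costing rule, assuming a simplified instruction record: instructions that carry the no-op move code contribute nothing to the total. kNoopMoveCode is a hypothetical stand-in for NOOP_MOVE_INSN_CODE, and its value here is arbitrary.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for NOOP_MOVE_INSN_CODE; the value is arbitrary.
constexpr int kNoopMoveCode = -2;

struct insn_cost
{
  int code;        // recognised instruction code
  unsigned cost;   // cost the target would report for this instruction
};

// Sum the costs of a sequence, skipping no-op moves: they are going to be
// deleted, and a target cost hook would not recognise their code anyway.
inline unsigned
total_cost (const std::vector<insn_cost> &insns)
{
  unsigned total = 0;
  for (const insn_cost &i : insns)
    if (i.code != kNoopMoveCode)
      total += i.cost;
  return total;
}
```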
|
|
Another patch from eyeballing
git grep -v 'long long\|optab optab\|template template\|double double' | grep ' \([a-zA-Z]\+\) \1 '
output, this time in gcc/ subdirectory.
2024-04-09 Jakub Jelinek <jakub@redhat.com>
gcc/
* expr.cc (convert_mode_scalar): Fix duplicated words in comment;
into into -> it into.
* function.h (function::cond_uids): Fix duplicated words in comment;
same same -> same.
* config/riscv/riscv-vector-costs.cc
(costs::adjust_vect_cost_per_loop): Fix duplicated words in comment;
model model -> model.
* config/riscv/riscv-vector-builtins-shapes.cc (build_base): Fix
duplicated words in comment; for for -> for.
* config/riscv/riscv-avlprop.cc (pass_avlprop::execute): Fix
duplicated words in comment; more more -> more.
* config/aarch64/driver-aarch64.cc (host_detect_local_cpu): Fix
duplicated words in comment; be be -> be.
* tree-profile.cc (masking_vectors): Fix duplicated words in comment;
has has -> has, the the -> the.
* value-range.cc (irange::set_range_from_bitmask): Fix duplicated
words in comment; the the -> the.
* gcov.cc (add_condition_counts): Fix duplicated words in comment;
to to -> to.
* vr-values.cc (get_scev_info): Fix duplicated words in comment;
the the -> to the.
* tree-vrp.cc (fully_replaceable): Fix duplicated words in comment;
by by -> by.
* mode-switching.cc (single_succ_confluence_n): Fix duplicated words
in comment; the the -> the.
* tree-ssa-phiopt.cc (value_replacement): Fix duplicated words in
comment; can can -> we can.
* gimple-range-phi.cc (phi_analyzer::process_phi): Fix duplicated words
in comment; it it -> it is.
* tree-ssa-sccvn.cc (visit_phi): Fix duplicated words in comment;
to to -> to.
* rtl-ssa/accesses.h (use_info::next_debug_insn_use): Fix duplicated
words in comment; if if -> if.
* doc/options.texi (InverseMask): Fix duplicated words; and and -> and.
Change take to takes.
* doc/invoke.texi (fanalyzer-undo-inlining): Fix duplicated words;
be be -> be.
(-minline-memops-threshold): Likewise.
gcc/analyzer/
* analyzer.opt (Wanalyzer-undefined-behavior-strtok): Fix duplicated
words; in in -> in.
* program-state.cc (sm_state_map::replay_call_summary): Fix duplicated
words in comment; to to -> to.
(program_state::replay_call_summary): Likewise.
* region-model.cc (region_model::replay_call_summary): Likewise.
gcc/c/
* c-decl.cc (previous_tag): Fix duplicated words in comment; the the
-> the.
(diagnose_mismatched_decls): Fix duplicated words in comment;
about about -> about.
gcc/cp/
* constexpr.cc (build_new_constexpr_heap_type): Fix duplicated words
in comment; is is -> is.
* cp-tree.def (CO_RETURN_EXPR): Fix duplicated words in comment;
for for -> for.
* parser.cc (fixup_blocks_walker): Fix duplicated words in comment;
is is -> is.
* semantics.cc (fixup_template_type): Fix duplicated words in comment;
for for -> for.
(finish_omp_for): Fix duplicated words in comment; the the -> the.
* pt.cc (more_specialized_fn): Fix duplicated words in comment;
think think -> think.
(type_targs_deducible_from): Fix duplicated words in comment; the the
-> the.
gcc/jit/
* docs/topics/expressions.rst (Constructor expressions): Fix
duplicated words; have have -> have.
|
|
The following tries to address the PHI insertion compile-time hog in
RTL fwprop observed with the PR54052 testcase where the loop computing
the "unfiltered" set of variables possibly needing PHI nodes for each
block exhibits quadratic compile-time and memory-use.
It does so by pruning the local DEFs with LR_OUT of the block, removing
regs that can never be LR_IN (defined by this block) in the dominance
frontier.
PR rtl-optimization/54052
* rtl-ssa/blocks.cc (function_info::place_phis): Filter
local defs by LR_OUT.
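The pruning step is a set intersection: a register defined in a block can only need a PHI in the dominance frontier if it is still live on exit from that block. A minimal sketch using std::bitset in place of GCC's bitmaps; kNumRegs, regset and phi_candidates are hypothetical names.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

constexpr std::size_t kNumRegs = 8;   // arbitrary size for the sketch
using regset = std::bitset<kNumRegs>;

// Registers defined in the block that are also live on exit from it.
// Only these can be live on entry to a block in the dominance frontier.
inline regset
phi_candidates (const regset &local_defs, const regset &lr_out)
{
  return local_defs & lr_out;
}
```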
|
|
This patch adds some accessors to set_info and use_info to make it
easier to get at and iterate through uses in debug insns.
It is used by the aarch64 load/store pair fusion pass in a subsequent
patch to fix PR113089, i.e. to update debug uses in the pass.
gcc/ChangeLog:
PR target/113089
* rtl-ssa/accesses.h (use_info::next_debug_insn_use): New.
(debug_insn_use_iterator): New.
(set_info::first_debug_insn_use): New.
(set_info::debug_insn_uses): New.
* rtl-ssa/member-fns.inl (use_info::next_debug_insn_use): New.
(set_info::first_debug_insn_use): New.
(set_info::debug_insn_uses): New.
|
|
In r14-5820-ga49befbd2c783e751dc2110b544fe540eb7e33eb I added support to
RTL-SSA for inserting new insns, which included support for users
creating new defs.
However, I missed that apply_changes_to_insn needed updating to ensure
that the new defs actually got inserted into the main def chain. This
meant that when the aarch64 ldp/stp pass inserted a new stp insn, the
stp would just get skipped over during subsequent alias analysis, as its
def never got inserted into the memory def chain. This (unsurprisingly)
led to wrong code.
This patch fixes the issue by ensuring new user-created defs get
inserted. I would have preferred to have used a flag internal to the
defs instead of a separate data structure to keep track of them, but since
machine_mode increased to 16 bits we're already at 64 bits in access_info,
and we can't really reuse m_is_temp as the logic in finalize_new_accesses
requires it to get cleared.
gcc/ChangeLog:
PR target/113070
* rtl-ssa.h: Include hash-set.h.
* rtl-ssa/changes.cc (function_info::finalize_new_accesses): Add
new_sets parameter and use it to keep track of new user-created sets.
(function_info::apply_changes_to_insn): Also call add_def on new sets.
(function_info::change_insns): Add hash_set to keep track of new
user-created defs. Plumb it through.
* rtl-ssa/functions.h: Add hash_set parameter to finalize_new_accesses and
apply_changes_to_insn.
|
|
This exposes an interface for users to create new uses in RTL-SSA.
This is needed for updating uses after inserting a new store pair insn
in the aarch64 load/store pair fusion pass.
gcc/ChangeLog:
PR target/113070
* rtl-ssa/accesses.cc (function_info::create_use): New.
* rtl-ssa/changes.cc (function_info::finalize_new_accesses):
Ensure new uses end up referring to permanent defs.
* rtl-ssa/functions.h (function_info::create_use): Declare.
|
|
The next patch in this series exposes an interface for creating new uses
in RTL-SSA. The intent is that new user-created uses can consume new
user-created defs in the same change group. This is so that we can
correctly update uses of memory when inserting a new store pair insn in
the aarch64 load/store pair fusion pass (the affected uses need to
consume the new store pair insn).
As it stands, finalize_new_accesses is called as part of the backwards
insn placement loop within change_insns, but if we want new uses to be
able to depend on new defs in the same change group, we need
finalize_new_accesses to be called on earlier insns first. This is so
that when we process temporary uses and turn them into permanent uses,
we can follow the last_def link on the temporary def to ensure we end up
with a permanent use consuming a permanent def.
gcc/ChangeLog:
PR target/113070
* rtl-ssa/changes.cc (function_info::change_insns): Split out the call
to finalize_new_accesses from the backwards placement loop, run it
forwards in a separate loop.
|
|
|
|
check_asm_operands was inconsistent about how it handled "p"
after RA compared to before RA. Before RA it tested the address
with a void (unknown) memory mode:
    case CT_ADDRESS:
      /* Every address operand can be reloaded to fit.  */
      result = result || address_operand (op, VOIDmode);
      break;
After RA it deferred to constrain_operands, which used the mode
of the operand:
    if ((GET_MODE (op) == VOIDmode
         || SCALAR_INT_MODE_P (GET_MODE (op)))
        && (strict <= 0
            || (strict_memory_address_p
                 (recog_data.operand_mode[opno], op))))
      win = true;
Using the mode of the operand is necessary for special predicates,
where it is used to give the memory mode. But for asms, the operand
mode is simply the mode of the address itself (so DImode on 64-bit
targets), which doesn't say anything about the addressed memory.
This patch uses VOIDmode for asms but continues to use the operand
mode for .md insns. It's needed to avoid a regression in the
testcase with the late-combine pass.
Fixing this made me realise that recog_level2 was doing duplicate
work for asms after RA.
gcc/
* recog.cc (constrain_operands): Pass VOIDmode to
strict_memory_address_p for 'p' constraints in asms.
* rtl-ssa/changes.cc (recog_level2): Skip redundant constrain_operands
for asms.
gcc/testsuite/
* gcc.target/aarch64/prfm_imm_offset_2.c: New test.
|
|
This patch fixes an ICE in record_use during RTL-SSA initialization in the RISC-V backend's VSETVL pass.
This is the ICE:
0x11a8603 partial_subreg_p(machine_mode, machine_mode)
../../../../gcc/gcc/rtl.h:3187
0x3b695eb rtl_ssa::function_info::record_use(rtl_ssa::function_info::build_info&, rtl_ssa::insn_info*, rtx_obj_reference)
../../../../gcc/gcc/rtl-ssa/insns.cc:524
In record_use:
  if (HARD_REGISTER_NUM_P (regno)
      && partial_subreg_p (use->mode (), mode))
The assertion fails in partial_subreg_p, which is:
  inline bool
  partial_subreg_p (machine_mode outermode, machine_mode innermode)
  {
    /* Modes involved in a subreg must be ordered.  In particular, we must
       always know at compile time whether the subreg is paradoxical.  */
    poly_int64 outer_prec = GET_MODE_PRECISION (outermode);
    poly_int64 inner_prec = GET_MODE_PRECISION (innermode);
    gcc_checking_assert (ordered_p (outer_prec, inner_prec)); -----> causes the ICE.
    return maybe_lt (outer_prec, inner_prec);
  }
The RISC-V VSETVL pass is an advanced lazy vsetvl insertion pass that runs after RA (register allocation).
The root cause is that we have a pattern (a reduction instruction) that includes both VLA (length-agnostic) and VLS (fixed-length) modes:
(insn 168 173 170 31 (set (reg:RVVM1SI 101 v5 [311])
(unspec:RVVM1SI [
(unspec:V32BI [
(const_vector:V32BI [
(const_int 1 [0x1]) repeated x32
])
(reg:DI 30 t5 [312])
(const_int 2 [0x2]) repeated x2
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE)
(unspec:RVVM1SI [
(reg:V32SI 96 v0 [orig:185 vect__96.40 ] [185]) -----> VLS mode NUNITS = 32 elements.
(reg:RVVM1SI 113 v17 [439]) -----> VLA mode NUNITS = [8, 8] elements.
] UNSPEC_REDUC_XOR)
(unspec:RVVM1SI [
(reg:SI 0 zero)
] UNSPEC_VUNDEF)
] UNSPEC_REDUC)) 15948 {pred_redxorv32si}
In this case, record_use is trying to check partial_subreg_p (use->mode (), mode) for RTX = (reg:V32SI 96 v0 [orig:185 vect__96.40 ] [185]).
use->mode () == V32SImode, whereas mode == RVVM1SImode. It then ICEs, since the two modes are !ordered_p.
Set the use mode to the biggest mode, which is the natural fallback mode.
gcc/ChangeLog:
* rtl-ssa/insns.cc (function_info::record_use): Add !ordered_p case.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/vsetvl_bug-2.c: New test.
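To see why the two modes are not ordered, here is a simplified model of a two-coefficient poly_int: a value c0 + c1 * x, where x >= 0 is only known at run time. The poly2 type and these definitions are a sketch of the documented semantics, not GCC's poly-int.h implementation.

```cpp
#include <cassert>
#include <cstdint>

// Value is c0 + c1 * x, where x >= 0 is a run-time indeterminate.
struct poly2
{
  std::int64_t c0;
  std::int64_t c1;
};

// A <= B for every possible x >= 0.
inline bool
known_le (poly2 a, poly2 b)
{
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

// The values are ordered if one is known to be <= the other for all x.
inline bool
ordered_p (poly2 a, poly2 b)
{
  return known_le (a, b) || known_le (b, a);
}
```

Taking the element counts from the dump and assuming 32-bit elements, V32SI has a fixed precision of 1024 bits, i.e. {1024, 0}, while RVVM1SI has 256 + 256x bits, i.e. {256, 256}. The second is smaller when x is below 3 and larger when x is above 3, so neither known_le holds.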
|
|
This adds some helpers to access-utils.h for removing accesses from an
access_array. This is needed by the upcoming aarch64 load/store pair
fusion pass.
gcc/ChangeLog:
* rtl-ssa/access-utils.h (filter_accesses): New.
(remove_regno_access): New.
(check_remove_regno_access): New.
* rtl-ssa/accesses.cc (rtl_ssa::remove_note_accesses_base): Use
new filter_accesses helper.
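A rough sketch of what such filtering helpers look like, using std::vector in place of access_array and omitting the obstack_watermark that the real routines allocate from. The access type and the _sketch function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

struct access { unsigned regno; };

// Return the accesses for which KEEP returns true, preserving order.
template<typename Pred>
std::vector<access>
filter_accesses_sketch (const std::vector<access> &accesses, Pred keep)
{
  std::vector<access> result;
  std::copy_if (accesses.begin (), accesses.end (),
                std::back_inserter (result), keep);
  return result;
}

// Remove every access to REGNO by filtering on the register number.
inline std::vector<access>
remove_regno_access_sketch (const std::vector<access> &accesses,
                            unsigned regno)
{
  return filter_accesses_sketch (accesses,
                                 [regno] (const access &a)
                                 { return a.regno != regno; });
}
```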
|
|
The upcoming aarch64 load pair pass needs to form store pairs, and can
re-order stores over loads when alias analysis determines this is safe.
In the case that both mem defs have uses in the RTL-SSA IR, and both
stores require re-ordering over their uses, we represent that as
(tentative) deletion of the original store insns and creation of a new
insn, to prevent requiring repeated re-parenting of uses during the
pass. We then update all mem uses that require re-parenting in one go
at the end of the pass.
To support this, RTL-SSA needs to handle inserting new insns (rather
than just changing existing ones), so this patch adds support for that.
New insns (and new accesses) are temporaries, allocated above a temporary
obstack_watermark, such that the user can easily back out of a change without
awkward bookkeeping.
gcc/ChangeLog:
* rtl-ssa/accesses.cc (function_info::create_set): New.
* rtl-ssa/accesses.h (access_info::is_temporary): New.
* rtl-ssa/changes.cc (move_insn): Handle new (temporary) insns.
(function_info::finalize_new_accesses): Handle new/temporary
user-created accesses.
(function_info::apply_changes_to_insn): Ensure m_is_temp flag
on new insns gets cleared.
(function_info::change_insns): Handle new/temporary insns.
(function_info::create_insn): New.
* rtl-ssa/changes.h (class insn_change): Make function_info a
friend class.
* rtl-ssa/functions.h (function_info): Declare new entry points:
create_set, create_insn. Declare new change_alloc helper.
* rtl-ssa/insns.cc (insn_info::print_full): Identify temporary insns in
dump.
* rtl-ssa/insns.h (insn_info): Add new m_is_temp flag and accompanying
is_temporary accessor.
* rtl-ssa/internals.inl (insn_info::insn_info): Initialize m_is_temp to
false.
* rtl-ssa/member-fns.inl (function_info::change_alloc): New.
* rtl-ssa/movement.h (restrict_movement_for_defs_ignoring): Add
handling for temporary defs.
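The idea of allocating temporaries above a watermark can be sketched with a simple bump allocator. This is a model of the technique only: arena and watermark are hypothetical classes, alignment is ignored, and GCC's obstack_watermark works on obstacks rather than on a vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal bump allocator over a fixed buffer (alignment ignored).
class arena
{
public:
  explicit arena (std::size_t capacity) : m_storage (capacity), m_used (0) {}

  void *alloc (std::size_t n)
  {
    assert (m_used + n <= m_storage.size ());
    void *p = m_storage.data () + m_used;
    m_used += n;
    return p;
  }

  std::size_t used () const { return m_used; }

  void rewind (std::size_t mark)
  {
    assert (mark <= m_used);
    m_used = mark;
  }

private:
  std::vector<unsigned char> m_storage;
  std::size_t m_used;
};

// RAII watermark: everything allocated after construction is released
// when the watermark goes out of scope.
class watermark
{
public:
  explicit watermark (arena &a) : m_arena (a), m_mark (a.used ()) {}
  ~watermark () { m_arena.rewind (m_mark); }
  watermark (const watermark &) = delete;
  watermark &operator= (const watermark &) = delete;

private:
  arena &m_arena;
  std::size_t m_mark;
};
```

Backing out of a tentative change then needs no per-object bookkeeping: leaving the scope of the watermark discards every temporary insn and access created under it.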
|
|
This patch adds some RTL-SSA helper functions. They will be
used by the upcoming late-combine pass.
The patch contains the first non-template out-of-line function declared
in movement.h, so it adds a movement.cc. I realise it seems a bit
over-the-top to have a file with just one function, but it might grow
in future. :)
gcc/
* Makefile.in (OBJS): Add rtl-ssa/movement.o.
* rtl-ssa/access-utils.h (accesses_include_nonfixed_hard_registers)
(single_set_info): New functions.
(remove_uses_of_def, accesses_reference_same_resource): Declare.
(insn_clobbers_resources): Likewise.
* rtl-ssa/accesses.cc (rtl_ssa::remove_uses_of_def): New function.
(rtl_ssa::accesses_reference_same_resource): Likewise.
(rtl_ssa::insn_clobbers_resources): Likewise.
* rtl-ssa/movement.h (can_move_insn_p): Declare.
* rtl-ssa/movement.cc: New file.
|
|
The first in-tree use of RTL-SSA was fwprop, and one of the goals
was to make the fwprop rewrite preserve the old behaviour as far
as possible. The switch to RTL-SSA was supposed to be a pure
infrastructure change. So RTL-SSA has various FIXMEs for things
that were artificially limited to facilitate the old-fwprop vs.
new-fwprop comparison.
One of the things that fwprop wants to do is extend live ranges, and
function_info::make_use_available tried to keep within the cases that
old fwprop could handle.
Since the information is built in extended basic blocks, it's easy
to handle intra-EBB queries directly. This patch does that, and
removes the associated FIXME.
To get a flavour for how much difference this makes, I tried compiling
the testsuite at -Os for at least one target per supported CPU and OS.
For most targets, only a handful of tests changed, but the vast majority
of changes were positive. The only target that seemed to benefit
significantly was i686-apple-darwin.
The main point of the patch is to remove the FIXME and to enable
the upcoming post-RA late-combine pass to handle more cases.
gcc/
* rtl-ssa/functions.h (function_info::remains_available_at_insn):
New member function.
* rtl-ssa/accesses.cc (function_info::remains_available_at_insn):
Likewise.
(function_info::make_use_available): Avoid false negatives for
queries within an EBB.
|
|
rtl_ssa::changes_are_worthwhile used the standard approach
of summing up the individual costs of the old and new sequences
to see which one is better overall. But when optimising for
speed and changing instructions in multiple blocks, it seems
better to weight the cost of each instruction by its execution
frequency. (We already do something similar for SLP layouts.)
gcc/
* rtl-ssa/changes.cc: Include sreal.h.
(rtl_ssa::changes_are_worthwhile): When optimizing for speed,
scale the cost of each instruction by its execution frequency.
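A minimal sketch of frequency-weighted costing, assuming each instruction carries the execution frequency of its block. The patch uses sreal for the arithmetic; double is used here for brevity, and treating ties as not worthwhile is an assumption of this sketch. All names are hypothetical.

```cpp
#include <cassert>
#include <vector>

struct costed_insn
{
  unsigned cost;   // per-instruction cost
  double freq;     // execution frequency of the containing block
};

// Sum of cost * frequency over a sequence of instructions.
inline double
weighted_cost (const std::vector<costed_insn> &insns)
{
  double total = 0;
  for (const costed_insn &i : insns)
    total += i.cost * i.freq;
  return total;
}

// When optimising for speed, compare the weighted totals rather than the
// plain sums of the individual costs.
inline bool
worthwhile_for_speed (const std::vector<costed_insn> &old_insns,
                      const std::vector<costed_insn> &new_insns)
{
  return weighted_cost (new_insns) < weighted_cost (old_insns);
}
```

For example, replacing one instruction of cost 6 in a hot block with a cost-2 instruction there plus a cost-8 instruction in a block executed a quarter as often gives a weighted cost of 4 against 6, even though the unweighted sum rises from 6 to 10.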
|
|
In order to save (a lot of) memory, RTL-SSA avoids creating
individual clobber records for every call-clobbered register.
It instead maintains a list & splay tree of calls in an EBB,
grouped by ABI.
This patch takes these call clobbers into account in a couple
more routines. I don't think this will have any effect on
existing users, since it's only necessary for hard registers.
gcc/
* rtl-ssa/access-utils.h (next_call_clobbers): New function.
(is_single_dominating_def, remains_available_on_exit): Replace with...
* rtl-ssa/functions.h (function_info::is_single_dominating_def)
(function_info::remains_available_on_exit): ...these new member
functions.
(function_info::m_clobbered_by_calls): New member variable.
* rtl-ssa/functions.cc (function_info::function_info): Explicitly
initialize m_clobbered_by_calls.
* rtl-ssa/insns.cc (function_info::record_call_clobbers): Update
m_clobbered_by_calls for each call-clobber note.
* rtl-ssa/member-fns.inl (function_info::is_single_dominating_def):
New function. Check for call clobbers.
* rtl-ssa/accesses.cc (function_info::remains_available_on_exit):
Likewise.
|
|
The exit block can have multiple predecessors, for example if the
function calls __builtin_eh_return. We might then need PHI nodes
for values that are live on exit.
RTL-SSA uses the normal dominance frontiers approach for calculating
where PHI nodes are needed. However, dominance.cc only calculates
dominators for normal blocks, not the exit block.
calculate_dominance_frontiers likewise only calculates dominance
frontiers for normal blocks.
This patch fills in the “missing” frontiers manually.
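One way to picture filling in the frontiers by hand is the standard walk up the dominator tree from each predecessor of the join block. The sketch below assumes blocks are small integers and that `idom` and `depth` arrays are already available for the normal blocks; the function names are hypothetical and this is not the code in rtl-ssa/blocks.cc, which works with GCC's own dominance and bitmap data structures.

```cpp
#include <cassert>
#include <set>
#include <vector>

// idom[b] is the immediate dominator of normal block b (the entry block
// is its own immediate dominator) and depth[b] is b's depth in the
// dominator tree. Walk both blocks upwards until they meet.
int
nearest_common_dominator (const std::vector<int> &idom,
                          const std::vector<int> &depth, int a, int b)
{
  while (a != b)
    {
      if (depth[a] < depth[b])
        b = idom[b];
      else
        a = idom[a];
    }
  return a;
}

// Add EXIT to the dominance frontier of every block that dominates one
// of EXIT's predecessors without strictly dominating EXIT itself.
// Return the immediate dominator computed for EXIT.
int
add_exit_block_frontiers (const std::vector<int> &idom,
                          const std::vector<int> &depth,
                          const std::vector<int> &exit_preds, int exit,
                          std::vector<std::set<int>> &frontiers)
{
  assert (!exit_preds.empty ());
  int exit_idom = exit_preds[0];
  for (int pred : exit_preds)
    exit_idom = nearest_common_dominator (idom, depth, exit_idom, pred);

  // Every block on the path from a predecessor up to (but excluding)
  // EXIT's immediate dominator has EXIT in its frontier.
  for (int pred : exit_preds)
    for (int b = pred; b != exit_idom; b = idom[b])
      frontiers[b].insert (exit);
  return exit_idom;
}
```

For a diamond in which blocks 2 and 3 both branch to the exit block, the exit block's immediate dominator is the block at the top of the diamond, and the exit block lands in the frontiers of 2 and 3 only. With a single predecessor the loops add nothing, which matches the usual case where no PHI nodes are needed on exit.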
gcc/
* rtl-ssa/internals.h (build_info::exit_block_dominator): New
member variable.
* rtl-ssa/blocks.cc (build_info::build_info): Initialize it.
(bb_walker::bb_walker): Use it, moving the computation of the
dominator to...
(function_info::process_all_blocks): ...here.
(function_info::place_phis): Add dominance frontiers for the
exit block.
|
|
If an optimisation removes the last real use of a definition,
there can still be artificial uses left. This patch removes
those uses too.
These artificial uses exist because RTL-SSA is only an SSA-like
view of the existing RTL IL, rather than a native SSA representation.
It effectively treats RTL registers like gimple vops, but with the
addition of an RPO view of the register's lifetime(s). Things are
structured to allow most operations to update this RPO view in
amortised sublinear time.
gcc/
* rtl-ssa/functions.h (function_info::process_uses_of_deleted_def):
New member function.
* rtl-ssa/changes.cc (function_info::process_uses_of_deleted_def):
Likewise.
(function_info::change_insns): Use it.
|
|
Sometimes an optimisation can remove a clobber of scratch registers
or scratch memory. We then need to update the DU chains to reflect
the removed clobber.
For registers this isn't a problem. Clobbers of registers are just
momentary blips in the register's lifetime. They act as a barrier for
moving uses later or defs earlier, but otherwise they have no effect on
the semantics of other instructions. Removing a clobber is therefore a
cheap, local operation.
In contrast, clobbers of memory are modelled as full sets.
This is because (a) a clobber of memory does not invalidate
*all* memory and (b) it's a common idiom to use (clobber (mem ...))
in stack barriers. But removing a set and redirecting all uses
to a different set is a linear operation. Doing it for potentially
every optimisation could lead to quadratic behaviour.
This patch therefore refrains from removing sets of memory that appear
to be redundant. There's an opportunity to clean this up in linear time
at the end of the pass, but as things stand, nothing would benefit from
that.
This is also a very rare event. Usually we should try to optimise the
insn before the scratch memory has been allocated.
gcc/
* rtl-ssa/changes.cc (function_info::finalize_new_accesses):
If a change describes a set of memory, ensure that that set
is kept, regardless of the insn pattern.
|
|
Unlike REG_DEAD notes, REG_UNUSED notes need to be kept free of
false positives by all passes. function_info::change_insns
does this by removing all REG_UNUSED notes, and then using
add_reg_unused_notes to add notes back (or create new ones)
where appropriate.
The problem was that it called add_reg_unused_notes on the fly
while updating each instruction, which meant that the information
for later instructions in the change set wasn't up to date.
This patch does it in a separate loop instead.
gcc/
* rtl-ssa/changes.cc (function_info::apply_changes_to_insn): Remove
call to add_reg_unused_notes and instead...
(function_info::change_insns): ...use a separate loop here.
|
|
RTL-SSA mostly relies on DF for block-level register liveness
information, including artificial uses and defs at the beginning
and end of blocks. But one case was missing. DF does not add
artificial uses of global registers to the beginning or end
of a block. Instead it marks them as used within every block
when computing LR and LIVE problems.
For RTL-SSA, global registers behave like memory, which in
turn behaves like gimple vops. We need to ensure that they
are live on exit so that final definitions do not appear
to be unused.
Also, the previous live-on-exit handling only considered the exit
block itself. It needs to consider non-local gotos as well, since
they jump directly to some code in a parent function and so do
not have a path to the exit block.
gcc/
* rtl-ssa/blocks.cc (function_info::add_artificial_accesses): Force
global registers to be live on exit. Handle any block with zero
successors like an exit block.
|
|
If make_uses_available was called twice for the same use,
we could end up trying to create duplicate definitions for
the same extended live range.
gcc/
* rtl-ssa/blocks.cc (function_info::create_degenerate_phi): Check
whether the requested phi already exists.
|
|
rtl_ssa::can_insert_after didn't handle insns that can throw.
Fixing that avoids a regression with a later patch.
gcc/
* rtl-ssa.h: Include cfgbuild.h.
* rtl-ssa/movement.h (can_insert_after): Replace is_jump with the
more comprehensive control_flow_insn_p.
|
|
RTL-SSA queues up some invasive changes for later. But sometimes
the insns involved in those changes can be deleted by later
optimisations, making the queued change unnecessary. This patch
checks for that case.
gcc/
* rtl-ssa/changes.cc (function_info::perform_pending_updates): Check
whether an insn has been replaced by a note.
|
|
first_any_insn_use implicitly (but contrary to its documentation)
assumed that there was at least one use.
gcc/
* rtl-ssa/member-fns.inl (first_any_insn_use): Handle null
m_first_use.
|
|
This patch tweaks change_insns to also call ::remove_insn to ensure the
underlying RTL insn gets removed from the insn chain in the case of a
deletion.
This avoids leaving NOTE_INSN_DELETED around after deleting insns.
For movement, the RTL insn chain is updated earlier in change_insns with
the call to move_insn. For deletion, it seems reasonable to do it here.
gcc/ChangeLog:
* rtl-ssa/changes.cc (function_info::change_insns): Ensure we call
::remove_insn on deleted insns.
|
|
Currently, rtl_ssa::change_insns requires all new uses and defs to be
specified explicitly. This turns out to be rather inconvenient for
forming load pairs in the new aarch64 load pair pass, as the pass has to
determine which mem def the final load pair consumes, and then obtain or
create a suitable use (i.e. significant bookkeeping, just to keep the
RTL-SSA IR consistent). It turns out to be much more convenient to
allow change_insns to infer which def is consumed and create a suitable
use of mem itself. This patch does that.
gcc/ChangeLog:
* rtl-ssa/changes.cc (function_info::finalize_new_accesses): Add new
parameter to give final insn position, infer use of mem if it isn't
specified explicitly.
(function_info::change_insns): Pass down final insn position to
finalize_new_accesses.
* rtl-ssa/functions.h: Add parameter to finalize_new_accesses.
|
|
This is needed by the upcoming aarch64 load pair pass, as it can
re-order stores (when alias analysis determines this is safe) and thus
change which mem def a given use consumes (in the RTL-SSA view, there is
no alias disambiguation of memory).
gcc/ChangeLog:
* rtl-ssa/accesses.cc (function_info::reparent_use): New.
* rtl-ssa/functions.h (function_info): Declare new member
function reparent_use.
|
|
Add a helper routine to access-utils.h which removes the memory access
from an access_array, if it has one.
gcc/ChangeLog:
* rtl-ssa/access-utils.h (drop_memory_access): New.
|
|
In the case that !insn->is_debug_insn () && next->is_debug_insn (), this
function was missing an update of the prev pointer on the first nondebug
insn following the sequence of debug insns starting at next.
This can lead to corruption of the insn chain, in that we end up with:
insn->next_any_insn ()->prev_any_insn () != insn
in this case. This patch fixes that.
gcc/ChangeLog:
* rtl-ssa/insns.cc (function_info::add_insn_after): Ensure we
update the prev pointer on the following nondebug insn in the
case that !insn->is_debug_insn () && next->is_debug_insn ().
|
|
The assertion that is commented out in vec.h's grow method requires the
method to be used only with trivially default constructible types.
However, since the PR88317 (r9-4642) workaround, bitmap_head has had a
non-trivial default constructor to catch bugs, so we pay the minimum
price of initializing everything in bitmap_head twice in the common
  bitmap_head var;
  bitmap_initialize (&var, obstack);
sequence. This patch makes us pay the same price, multiplied by the
number of elements, for
  vec<bitmap_head> v;
  v.create (n);
  v.safe_grow_cleared (n); // previous v.safe_grow (n);
  for (int i = 0; i < n; ++i)
    bitmap_initialize (&v[i], obstack);
2023-09-29 Jakub Jelinek <jakub@redhat.com>
* tree-ssa-loop-im.cc (tree_ssa_lim_initialize): Use quick_grow_cleared
instead of quick_grow on vec<bitmap_head> members.
* cfganal.cc (control_dependences::control_dependences): Likewise.
* rtl-ssa/blocks.cc (function_info::build_info::build_info): Likewise.
(function_info::place_phis): Use safe_grow_cleared instead of safe_grow
on auto_vec<bitmap_head> vars.
* tree-ssa-live.cc (compute_live_vars): Use quick_grow_cleared instead
of quick_grow on vec<bitmap_head> var.
|
|
Hi, Richard.
The RISC-V port needs to add a number of VLS modes (V16QI, V32QI, V64QI, etc.).
These share the same REG_CLASS as the VLA modes (VNx16QI, VNx32QI, etc.).
When I add those VLS modes, the RTL-SSA initialization in the VSETVL pass (inserted after RA) triggers an ICE:
rvv.c:13:1: internal compiler error: in partial_subreg_p, at rtl.h:3186
13 | }
| ^
0xf7a5b1 partial_subreg_p(machine_mode, machine_mode)
../../../riscv-gcc/gcc/rtl.h:3186
0x1407616 wider_subreg_mode(machine_mode, machine_mode)
../../../riscv-gcc/gcc/rtl.h:3252
0x2a2c6ff rtl_ssa::combine_modes(machine_mode, machine_mode)
../../../riscv-gcc/gcc/rtl-ssa/internals.inl:677
0x2a2b9a4 rtl_ssa::function_info::simplify_phi_setup(rtl_ssa::phi_info*, rtl_ssa::set_info**, bitmap_head*)
../../../riscv-gcc/gcc/rtl-ssa/functions.cc:146
0x2a2c142 rtl_ssa::function_info::simplify_phis()
../../../riscv-gcc/gcc/rtl-ssa/functions.cc:258
0x2a2b3f0 rtl_ssa::function_info::function_info(function*)
../../../riscv-gcc/gcc/rtl-ssa/functions.cc:51
0x1cebab9 pass_vsetvl::init()
../../../riscv-gcc/gcc/config/riscv/riscv-vsetvl.cc:4578
0x1cec150 pass_vsetvl::execute(function*)
../../../riscv-gcc/gcc/config/riscv/riscv-vsetvl.cc:4716
The reason is that V32QImode (size = [32,0]) is the mode set in regno_reg_rtx[97].
When the PHI input def comes from the ENTRY block (index = 0), def->mode () is V32QImode,
but phi_mode is, for example, VNx2QI (I wrote the code using VLA mode intrinsics).
combine_modes then triggers the ICE.
gcc/ChangeLog:
* rtl-ssa/internals.inl: Fix when mode1 and mode2 are not ordered.
|
|
We are running out of machine_mode values (8 bits) in the RISC-V backend,
so we would like to extend machine_mode from 8 to 16 bits. However,
increasing the size of common structures such as tree or rtx is a
sensitive change. This patch therefore extends machine_mode to 16 bits
by shrinking other fields:
* Swap the bit sizes of the code and the machine mode in rtx_def.
* Adjust the machine_mode location and spare bits in tree.
The memory impact of this patch on the related structures is shown below:
+-------------------+----------+---------+------+
| struct/bytes | upstream | patched | diff |
+-------------------+----------+---------+------+
| rtx_obj_reference | 8 | 12 | +4 |
| ext_modified | 2 | 4 | +2 |
| ira_allocno | 192 | 184 | -8 |
| qty_table_elem | 40 | 40 | 0 |
| reg_stat_type | 64 | 64 | 0 |
| rtx_def | 40 | 40 | 0 |
| table_elt | 80 | 80 | 0 |
| tree_decl_common | 112 | 112 | 0 |
| tree_type_common | 128 | 128 | 0 |
| access_info | 8 | 8 | 0 |
+-------------------+----------+---------+------+
The tree and rtx related structs do not change in size with this patch,
and machine_mode is now 16 bits wide.
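The following sketch shows how swapping the widths of two bitfields can widen one of them without growing the structure. `header_before` and `header_after` are hypothetical layouts that only loosely follow the description above; they are not the real rtx_def, and the size check assumes an ABI that packs adjacent `unsigned int` bitfields into one 32-bit unit.

```cpp
#include <cassert>

// Hypothetical layout with a 16-bit code and an 8-bit mode. The
// remaining 8 bits stand in for flag bits.
struct header_before
{
  unsigned int code : 16;
  unsigned int mode : 8;
  unsigned int flags : 8;
};

// The same 32 bits with the widths of code and mode swapped.
struct header_after
{
  unsigned int code : 8;
  unsigned int mode : 16;
  unsigned int flags : 8;
};

// Both layouts use 32 bits in total, so the size is unchanged even
// though the mode field has doubled in width.
static_assert (sizeof (header_before) == sizeof (header_after),
               "widening the mode should not grow the header");

// Return true if VALUE survives a round trip through the mode field.
template<typename Header>
bool
mode_fits (unsigned int value)
{
  Header h = {};
  h.mode = value;
  return h.mode == value;
}

// Example checks: 300 needs more than 8 bits but fits in 16.
inline void
check_mode_widths ()
{
  assert (!mode_fits<header_before> (300));
  assert (mode_fits<header_after> (300));
}
```

The trade-off is that the field that gives up bits (here the code) must still be wide enough for every value it needs to hold.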
Signed-off-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
Co-authored-by: Kito Cheng <kito.cheng@sifive.com>
Co-Authored-By: Richard Biener <rguenther@suse.de>
Co-Authored-By: Richard Sandiford <richard.sandiford@arm.com>
gcc/ChangeLog:
* combine.cc (struct reg_stat_type): Extend machine_mode to 16 bits.
* cse.cc (struct qty_table_elem): Extend machine_mode to 16 bits.
(struct table_elt): Extend machine_mode to 16 bits.
(struct set): Ditto.
* genmodes.cc (emit_mode_wider): Extend type from char to short.
(emit_mode_complex): Ditto.
(emit_mode_inner): Ditto.
(emit_class_narrowest_mode): Ditto.
* genopinit.cc (main): Extend the machine_mode limit.
* ira-int.h (struct ira_allocno): Extend machine_mode to 16 bits and
re-ordered the struct fields for padding.
* machmode.h (MACHINE_MODE_BITSIZE): New macro.
(GET_MODE_2XWIDER_MODE): Extend type from char to short.
(get_mode_alignment): Extend type from char to short.
* ree.cc (struct ext_modified): Extend machine_mode to 16 bits and
removed the ATTRIBUTE_PACKED.
* rtl-ssa/accesses.h: Extend machine_mode to 16 bits, narrow
m_kind to 2 bits and remove m_spare.
* rtl-ssa/internals.inl (rtl_ssa::access_info): Adjust the assignment.
* rtl.h (RTX_CODE_BITSIZE): New macro.
(struct rtx_def): Swap both the bit size and location between the
rtx_code and the machine_mode.
(subreg_shape::unique_id): Extend the machine_mode limit.
* rtlanal.h: Extend machine_mode to 16 bits.
* tree-core.h (struct tree_type_common): Extend machine_mode to 16
bits and re-ordered the struct fields for padding.
(struct tree_decl_common): Extend machine_mode to 16 bits.
|
|
insn_info tried to save space by storing the number of
definitions in a 16-bit bitfield. The justification was:
// ... FIRST_PSEUDO_REGISTER + 1
// is the maximum number of accesses to hard registers and memory, and
// MAX_RECOG_OPERANDS is the maximum number of pseudos that can be
// defined by an instruction, so the number of definitions should fit
// easily in 16 bits.
But while that reasoning holds (I think) for real instructions,
it doesn't hold for artificial instructions. I don't think there's
any sensible higher limit we can use, so this patch goes for a full
unsigned int.
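To make the failure mode concrete, the sketch below stores a count in a 16-bit bitfield and in a full unsigned int. `packed_counts`, `wide_counts` and `stored_num_defs` are hypothetical names used only for illustration; they are not the real insn_info layout.

```cpp
#include <cassert>

// Hypothetical reduction of the problem: a counter stored in a 16-bit
// bitfield next to other packed bits.
struct packed_counts
{
  unsigned int num_defs : 16;
  unsigned int other_bits : 16;
};

// The same counter stored in a full unsigned int.
struct wide_counts
{
  unsigned int num_defs;
  unsigned int other_bits : 16;
};

// Store COUNT in T's num_defs field and return what can be read back.
template<typename T>
unsigned int
stored_num_defs (unsigned int count)
{
  T counts = {};
  counts.num_defs = count;
  return counts.num_defs;
}

// Example checks: the first count that no longer fits in 16 bits
// silently wraps to zero in the packed form.
inline void
check_num_defs_widths ()
{
  assert (stored_num_defs<packed_counts> (65536) == 0);
  assert (stored_num_defs<wide_counts> (65536) == 65536);
}
```

The packed form gives no diagnostic when the count wraps, which is why a limit that cannot be justified for every kind of instruction is risky.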
gcc/
PR rtl-optimization/108086
* rtl-ssa/insns.h (insn_info): Make m_num_defs a full unsigned int.
Adjust size-related commentary accordingly.
|
|
Since rtl-ssa isn't a real/native SSA representation, it has
to honour the constraints of the underlying rtl representation.
Part of this involves maintaining an rpo list of definitions
for each rtl register, backed by a splay tree where necessary
for quick lookup/insertion.
However, clobbers of a register don't act as barriers to
other clobbers of a register. E.g. it's possible to move one
flag-clobbering instruction across an arbitrary number of other
flag-clobbering instructions. In order to allow passes to do
that without quadratic complexity, the splay tree groups all
consecutive clobbers into groups, with only the group being
entered into the splay tree. These groups in turn have an
internal splay tree of clobbers where necessary.
This means that, if we insert a new definition and use into
the middle of a sea of clobbers, we need to split the clobber
group into two groups. This was quite a difficult condition
to trigger during development, and the PR shows that the code
to handle it had (at least) two bugs.
First, the process involves searching the clobber tree for
the split point. This search can give either the previous
clobber (which will belong to the first of the split groups)
or the next clobber (which will belong to the second of the
split groups). The code for the former case handled the
split correctly but the code for the latter case didn't.
Second, I'd forgotten to add the second clobber group to the
main splay tree. :-(
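The split described above can be modelled with a sorted vector in place of the clobber splay tree. This is only a sketch under that assumption: `clobber_group` and `split_group_model` are hypothetical names, program points are plain integers, and the real code in rtl-ssa/accesses.cc works on splay trees rather than vectors.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

// Hypothetical model: a clobber group is a sorted list of program
// points at which the register is clobbered.
using clobber_group = std::vector<int>;

// Split GROUP around INSERT_POINT, which must lie strictly between the
// first and last clobber and must not coincide with any clobber.
// Element 0 holds the clobbers before the point and element 1 holds
// the clobbers after it.
std::array<clobber_group, 2>
split_group_model (const clobber_group &group, int insert_point)
{
  assert (!group.empty ()
          && group.front () < insert_point
          && insert_point < group.back ());
  // lower_bound gives the *next* clobber, i.e. the first member of the
  // second group. The previous clobber is the element just before it.
  auto next = std::lower_bound (group.begin (), group.end (), insert_point);
  assert (*next != insert_point);
  return {{ clobber_group (group.begin (), next),
            clobber_group (next, group.end ()) }};
}
```

Splitting clobbers at points 10, 20, 30 and 40 around point 25 gives one group holding 10 and 20 and another holding 30 and 40. In this model the caller would then have to register both halves in whatever outer structure tracks definitions, which corresponds to the second bug described above.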
gcc/
PR rtl-optimization/108508
* rtl-ssa/accesses.cc (function_info::split_clobber_group): When
the splay tree search gives the first clobber in the second group,
make sure that the root of the first clobber group is updated
correctly. Enter the new clobber group into the definition splay
tree.
gcc/testsuite/
PR rtl-optimization/108508
* gcc.target/aarch64/pr108508.c: New test.
|
|
|
|
gcc/ChangeLog:
* compare-elim.cc: Add "final" and "override" to dom_walker vfunc
implementations, removing redundant "virtual" as appropriate.
* gimple-ssa-strength-reduction.cc: Likewise.
* ipa-prop.cc: Likewise.
* rtl-ssa/blocks.cc: Likewise.
* tree-into-ssa.cc: Likewise.
* tree-ssa-dom.cc: Likewise.
* tree-ssa-math-opts.cc: Likewise.
* tree-ssa-phiopt.cc: Likewise.
* tree-ssa-propagate.cc: Likewise.
* tree-ssa-sccvn.cc: Likewise.
* tree-ssa-strlen.cc: Likewise.
* tree-ssa-uncprop.cc: Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
gcc/c/ChangeLog:
* c-parser.cc (c_parser_conditional_expression): Use {,UN}LIKELY
macros.
(c_parser_binary_expression): Likewise.
gcc/cp/ChangeLog:
* cp-gimplify.cc (cp_genericize_r): Use {,UN}LIKELY
macros.
* parser.cc (cp_finalize_omp_declare_simd): Likewise.
(cp_finalize_oacc_routine): Likewise.
gcc/ChangeLog:
* system.h (LIKELY): Define.
(UNLIKELY): Likewise.
* domwalk.cc (sort_bbs_postorder): Use {,UN}LIKELY
macros.
* dse.cc (set_position_unneeded): Likewise.
(set_all_positions_unneeded): Likewise.
(any_positions_needed_p): Likewise.
(all_positions_needed_p): Likewise.
* expmed.cc (flip_storage_order): Likewise.
* genmatch.cc (dt_simplify::gen_1): Likewise.
* ggc-common.cc (gt_pch_save): Likewise.
* print-rtl.cc: Likewise.
* rtl-iter.h (T>::array_type::~array_type): Likewise.
(T>::next): Likewise.
* rtl-ssa/internals.inl: Likewise.
* rtl-ssa/member-fns.inl: Likewise.
* rtlanal.cc (T>::add_subrtxes_to_queue): Likewise.
(rtx_properties::try_to_add_dest): Likewise.
* rtlanal.h (growing_rtx_properties::repeat): Likewise.
(vec_rtx_properties_base::~vec_rtx_properties_base): Likewise.
* simplify-rtx.cc (simplify_replace_fn_rtx): Likewise.
* sort.cc (likely): Likewise.
(mergesort): Likewise.
* wide-int.h (wi::eq_p): Likewise.
(wi::ltu_p): Likewise.
(wi::cmpu): Likewise.
(wi::bit_and): Likewise.
(wi::bit_and_not): Likewise.
(wi::bit_or): Likewise.
(wi::bit_or_not): Likewise.
(wi::bit_xor): Likewise.
(wi::add): Likewise.
(wi::sub): Likewise.
|