|
The dom_ranger used for fast VRP no longer needs a local
gimple_outgoing_range object, as it is now always available from the
range_query parent class.
The builtin_unreachable code for adjusting globals and removing the
builtin calls during the final VRP pass can now function with just
a range_query object rather than a specific ranger. This adjusts it to
use the extra methods in the range_query API.
This will now allow removal of builtin_unreachable calls even if there is no
active ranger with dependency info available.
* gimple-range.cc (dom_ranger::dom_ranger): Do not initialize m_out.
(dom_ranger::maybe_push_edge): Use gori () rather than m_out.
* gimple-range.h (dom_ranger::m_out): Remove.
* tree-vrp.cc (remove_unreachable::remove_unreachable): Use a
range_query rather than a gimple_ranger.
(remove_unreachable::remove): New.
(remove_unreachable::m_ranger): Change to a range_query.
(remove_unreachable::handle_early): If there is no dependency
information, do nothing.
(remove_unreachable::remove_and_update_globals): Do not update
globals if there is no dependency info to use.
|
|
No functional change.
- We always have a target_hash_table and bb_ticks because
init_resource_info is always called. These conditionals are
an ancient artifact: it's been quite a while since
resource.cc was used anywhere other than from reorg.cc.
- In mark_target_live_regs, get rid of a now-redundant "if
(tinfo != NULL)" conditional and replace an "if (bb)" with a
gcc_assert.
A "git diff -wb" (ignore whitespace diff) is better at
showing the actual changes.
* resource.cc (free_resource_info, clear_hashed_info_for_insn): Don't
check for non-null target_hash_table and bb_ticks.
(mark_target_live_regs): Ditto. Replace check for non-NULL result from
BLOCK_FOR_INSN with a call to gcc_assert. Fold code conditioned on
tinfo != NULL.
|
|
No functional change.
A "git diff -wb" (ignore whitespace diff) shows that this
commit just removes a "if (b != -1)" after a "gcc_assert (b
!= -1)" and also removes the subsequent "else" clause.
* resource.cc (mark_target_live_regs): Remove redundant check for b
being -1, after gcc_assert.
|
|
...and call compute_bb_for_insn in init_resource_info and
free_bb_for_insn in free_resource_info.
I put a gcc_unreachable in that else-clause for a failing
find_basic_block in mark_target_live_regs after the comment that says:
/* We didn't find the start of a basic block. Assume everything
in use. This should happen only extremely rarely. */
SET_HARD_REG_SET (res->regs);
and found that it fails not extremely rarely but extremely early in
the build (compiling libgcc).
That kind of pessimization leads to suboptimal delay-slot-filling.
Instead, do like many machine_dependent_reorg passes and call
compute_bb_for_insn as part of resource.cc initialization.
After this patch, there's a whole "if (b != -1)" conditional that's
dominated by a gcc_assert (b != -1). I separated that out, as it's an
NFC whitespace change that would hamper this patch's readability.
Altogether this improved coremark performance for CRIS at -O2
-march=v10 by 0.36%.
* resource.cc: Include cfgrtl.h. Use BLOCK_FOR_INSN (insn)->index
instead of calling find_basic_block (insn). Assert for not -1.
(find_basic_block): Remove function.
(init_resource_info): Call compute_bb_for_insn.
(free_resource_info): Call free_bb_for_insn.
|
|
The PR115182 regression is that a delay-slot for a conditional branch,
is no longer filled with an insn that has been "sunk" because of
r15-518-g99b1daae18c095, for cris-elf w. -O2 -march=v10.
There are still sufficient "nearby" dependency-less insns that the
delay-slot shouldn't be empty. In particular there's one candidate in
the loop, right after an off-ramp branch, off the loop: a move from
$r9 to $r3.
beq .L2
nop
move.d $r9,$r3
But, the resource.cc data-flow-analysis incorrectly says it collides
with registers "live" at that .L2 off-ramp. The off-ramp insns
(inlined from simple_rand) look like this (left-to-right direction):
.L2:
move.d $r12,[_seed.0]
move.d $r13,[_seed.0+4]
ret
movem [$sp+],$r8
So, a store of a long long to _seed, a return instruction and a
restoring multi-register-load of r0..r8 (all callee-saved registers)
in the delay-slot of the return insn. The return-value is kept in
$r10,$r11 so in total $r10..$r13 live plus the stack-pointer and
return-address registers. But, mark_target_live_regs says that
$r0..$r8 are also live because it *includes the registers live for the
return instruction*! While they "come alive" after the movem, they
certainly aren't live at the "off-ramp" .L2 label.
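The liveness claim can be double-checked with a tiny backward dataflow over the off-ramp insns. This is only a sketch: the use/def sets and the live-out set (restored callee-saved registers plus the $r10/$r11 return value, $sp, and the $srp return-address register) are simplifying assumptions, not a CRIS-accurate model:

```python
# Backward liveness: live_in = (live_out - defs) | uses.
# The .L2 off-ramp insns, listed in reverse execution order.
insns_reversed = [
    # (uses, defs)
    ({"sp"}, {f"r{i}" for i in range(9)} | {"sp"}),  # movem [$sp+],$r8
    ({"srp"}, set()),                                # ret (uses return addr)
    ({"r13"}, set()),                                # move.d $r13,[_seed.0+4]
    ({"r12"}, set()),                                # move.d $r12,[_seed.0]
]

# Live after the movem: the restored callee-saved registers plus the
# return value in r10/r11 and the stack pointer (an assumption).
live = {f"r{i}" for i in range(9)} | {"r10", "r11", "sp"}

for uses, defs in insns_reversed:
    live = (live - defs) | uses

print(sorted(live))
```

Walking the four insns backwards removes $r0..$r8 at the movem, so the live-in set at .L2 is just $r10..$r13, $sp and $srp, matching the analysis above.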
The problem is in mark_target_live_regs: it consults a hash-table
indexed by insn uid, where it tracks the currently live registers with
a "generation" count to handle insns moving around when filling
delay-slots. As a fall-back, it starts with registers live at the
start of each basic block, calculated by the comparatively modern df
machinery (except that it can fail to find out which basic block an
insn belongs to, at which point it includes all registers, film at 11),
and tracks the semantics of insns up to each insn.
You'd think that's all that should be done, but then for some reason
it *also* looks at insns *after the target insn* up to a few branches,
and includes that in the set of live registers! This is the code in
mark_target_live_regs that starts with the call to
find_dead_or_set_registers. I couldn't make sense of it, so I looked
at its history, and I think I found the cause; it's a thinko or
possibly two thinkos. The original implementation, gcc-git-described
as r0-97-g9c7e297806a27f, later moved from reorg.c to resource.c in
r0-20470-gca545bb569b756.
I believe the "extra" lookup was intended to counter flaws in the
reorg.c/resource.c register liveness analysis; to inspect insns along
the execution paths to exclude registers that, when looking at
subsequent insns, weren't live. That guess is backed by a sentence in
the updated (i.e. deleted) part of the function head comment for
mark_target_live_regs: "Next, scan forward from TARGET looking for
things set or clobbered before they are used. These are not live."
To me that sounds like flawed register-liveness data.
An epilogue expanded as RTX (i.e. not just assembly code emitted as
text) is introduced in basepoints/gcc-0-1334-gbdac5f5848fb, so before
that time, nobody would notice that saved registers were included as
live registers in delay-slots in "next-to-last" basic blocks.
Then in r0-24783-g96e9c98d59cc40, the intersection ("and") was changed
to a union ("or"), i.e. it added to the set of live registers instead
of thinning it out. In the gcc-patches archives, I see the patch
submission doesn't offer a C test-case and only has RTX snippets
(apparently for SPARC). The message does admit that the change goes
"against what the comments in the code say":
https://gcc.gnu.org/pipermail/gcc-patches/1999-November/021836.html
It looks like this was related to a bug with register liveness info
messed up when moving a "delay-slotted" insn from one slot to another.
But, I can't help but thinking it's just papering over a register
liveness bug elsewhere.
I think, with a reliable "DF_LR_IN", the whole thing *after* tracking
from start-of-bb up to the target insn should be removed; thus this patch.
This patch also removes the now-unused find_dead_or_set_registers
function.
At r15-518, it fixes the issue for CRIS and improves coremark scores
at -O2 -march=v10 a tiny bit (about 0.05%).
PR rtl-optimization/115182
* resource.cc (mark_target_live_regs): Don't look for
unconditional branches after the target to improve on the
register liveness.
(find_dead_or_set_registers): Remove unused function.
|
|
x86_32 targets
Use MOVD/PEXTRD and MOVD/PINSRD insn sequences to move DImode value
between XMM and GPR register sets for SSE4.1 x86_32 targets in order
to avoid spilling the value to stack.
The load from _Atomic location a improves from:
movq a, %xmm0
movq %xmm0, (%esp)
movl (%esp), %eax
movl 4(%esp), %edx
to:
movq a, %xmm0
movd %xmm0, %eax
pextrd $1, %xmm0, %edx
The store to _Atomic location b improves from:
movl %eax, (%esp)
movl %edx, 4(%esp)
movq (%esp), %xmm0
movq %xmm0, b
to:
movd %eax, %xmm0
pinsrd $1, %edx, %xmm0
movq %xmm0, b
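The new sequences work because movd moves lane 0 (the low 32 bits) between XMM and GPR, while pextrd $1/pinsrd $1 access lane 1 (the high 32 bits). A sketch of the arithmetic, with an arbitrary example value:

```python
v = 0x1122334455667788  # DImode value sitting in %xmm0

# Load side: movd %xmm0,%eax / pextrd $1,%xmm0,%edx
eax = v & 0xFFFFFFFF          # lane 0 -> low half
edx = (v >> 32) & 0xFFFFFFFF  # lane 1 -> high half

# Store side: movd %eax,%xmm0 / pinsrd $1,%edx,%xmm0
rebuilt = (edx << 32) | eax
assert rebuilt == v
```

Either direction stays entirely in registers, which is exactly what avoids the spill to the stack.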
gcc/ChangeLog:
* config/i386/sync.md (atomic_loaddi_fpu): Use movd/pextrd
to move DImode value from XMM to GPR for TARGET_SSE4_1.
(atomic_storedi_fpu): Use movd/pinsrd to move DImode value
from GPR to XMM for TARGET_SSE4_1.
|
|
Simplify the table of default colors, avoiding the need to manually
add the strlen of each entry.
Consolidate the global state in diagnostic-color.cc into a
g_color_dict, adding selftests for the new class diagnostic_color_dict.
No functional change intended.
gcc/ChangeLog:
* diagnostic-color.cc: Define INCLUDE_VECTOR.
Include "label-text.h" and "selftest.h".
(struct color_cap): Replace with...
(struct color_default): ...this, adding "m_" prefixes to fields
and dropping "name_len" and "free_val" field.
(color_dict): Convert to...
(gcc_color_defaults): ...this, making const, dropping the trailing
strlen and "false" from each entry.
(class diagnostic_color_dict): New.
(g_color_dict): New.
(colorize_start): Reimplement in terms of g_color_dict.
(diagnostic_color_dict::get_entry_by_name): New, based on
colorize_start.
(diagnostic_color_dict::get_start_by_name): Likewise.
(diagnostic_color_dict::diagnostic_color_dict): New.
(parse_gcc_colors): Reimplement, moving body...
(diagnostic_color_dict::parse_envvar_value): ...here.
(colorize_init): Lazily create g_color_dict.
(selftest::test_empty_color_dict): New.
(selftest::test_default_color_dict): New.
(selftest::test_color_dict_envvar_parsing): New.
(selftest::diagnostic_color_cc_tests): New.
* selftest-run-tests.cc (selftest::run_tests): Call
selftest::diagnostic_color_cc_tests.
* selftest.h (selftest::diagnostic_color_cc_tests): New decl.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
No functional change intended.
libcpp/ChangeLog:
* Makefile.in (TAGS_SOURCES): Add include/label-text.h.
* include/label-text.h: New file.
* include/rich-location.h: Include "label-text.h".
(class label_text): Move to label-text.h.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
Avoid selftest.h requiring the "tree" type.
No functional change intended.
gcc/analyzer/ChangeLog:
* region-model.cc: Include "selftest-tree.h".
gcc/ChangeLog:
* function-tests.cc: Include "selftest-tree.h".
* selftest-tree.h: New file.
* selftest.h (make_fndecl): Move to selftest-tree.h.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
gcc/ChangeLog:
* config/v850/v850.opt.urls: Regenerate, with fix.
* config/vax/vax.opt.urls: Likewise.
* regenerate-opt-urls.py (TARGET_SPECIFIC_PAGES): Fix transposed
values for "vax" and "v850".
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
We already warn for:
x = std::move (x);
which triggers:
warning: moving 'x' of type 'int' to itself [-Wself-move]
but bug 109396 reports that this doesn't work for a member-initializer-list:
X() : x(std::move (x))
so this patch amends that.
PR c++/109396
gcc/cp/ChangeLog:
* cp-tree.h (maybe_warn_self_move): Declare.
* init.cc (perform_member_init): Call maybe_warn_self_move.
* typeck.cc (maybe_warn_self_move): No longer static. Change the
return type to bool. Also warn when called from
a member-initializer-list. Drop the inform call.
gcc/testsuite/ChangeLog:
* g++.dg/warn/Wself-move2.C: New test.
|
|
SCEV always uses the current range_query object.
Ranger's cache uses a global value_query when propagating cache values to
avoid re-invoking ranger during simple cache propagations.
When folding a PHI value, SCEV can be invoked, and since it always uses
the current range_query object, when ranger is active this causes the
undesired re-invoking of ranger during cache propagation.
This patch checks to see if the fold_using_range specified range_query
object is the same as the one SCEV uses, and does not invoke SCEV if
they do not match.
PR tree-optimization/115221
gcc/
* gimple-range-fold.cc (range_of_ssa_name_with_loop_info): Do
not invoke SCEV if the range_query objects do not match.
gcc/testsuite/
* gcc.dg/pr115221.c: New.
|
|
The strlen pass currently has a local ranger instance, but when it
invokes SCEV, SCEV will not be able to access this ranger.
Enable/disable ranger should be used instead, allowing other
components to use the current range_query.
gcc/
* tree-ssa-strlen.cc (strlen_pass::strlen_pass): Add function
pointer and initialize ptr_qry with current range_query.
(strlen_pass::m_ranger): Remove.
(printf_strlen_execute): Enable and disable ranger.
gcc/testsuite/
* gcc.dg/Wstringop-overflow-10.c: Add truncating warning.
|
|
Coming back to our discussion in
<https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649426.html>:
TARGET_EXPRs that initialize a function argument are not marked
TARGET_EXPR_ELIDING_P even though gimplify_arg drops such TARGET_EXPRs
on the floor. To work around it, I added a pset to
replace_placeholders_for_class_temp_r, but it would be best to just rely
on TARGET_EXPR_ELIDING_P.
PR c++/114707
gcc/cp/ChangeLog:
* call.cc (convert_for_arg_passing): Call set_target_expr_eliding.
* typeck2.cc (replace_placeholders_for_class_temp_r): Don't use pset.
(digest_nsdmi_init): Call cp_walk_tree_without_duplicates instead of
cp_walk_tree.
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/lastprivate-conditional-1.c: Remove
'{ dg-prune-output "not supported yet" }'.
* c-c++-common/gomp/requires-1.c: Likewise.
* c-c++-common/gomp/requires-2.c: Likewise.
* c-c++-common/gomp/reverse-offload-1.c: Likewise.
* g++.dg/gomp/requires-1.C: Likewise.
* gfortran.dg/gomp/requires-1.f90: Likewise.
* gfortran.dg/gomp/requires-2.f90: Likewise.
* gfortran.dg/gomp/requires-4.f90: Likewise.
* gfortran.dg/gomp/requires-5.f90: Likewise.
* gfortran.dg/gomp/requires-6.f90: Likewise.
* gfortran.dg/gomp/requires-7.f90: Likewise.
|
|
gcc/ChangeLog:
PR analyzer/115203
* diagnostic-path.h
(simple_diagnostic_path::disable_event_localization): New.
(simple_diagnostic_path::m_localize_events): New field.
* diagnostic.cc
(simple_diagnostic_path::simple_diagnostic_path): Initialize
m_localize_events.
(simple_diagnostic_path::add_event): Only localize fmt if
m_localize_events is true.
* tree-diagnostic-path.cc
(test_diagnostic_path::test_diagnostic_path): Call
disable_event_localization.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
PR bootstrap/115167 reports a bootstrap failure on AIX triggered by
r15-636-g770657d02c986c whilst building f951 in stage 2, due to
the linker not being able to find symbols for:
vtable for range_label_for_type_mismatch
range_label_for_type_mismatch::get_text(unsigned int) const
The only users of the class range_label_for_type_mismatch are in the
C/C++ frontends, each of which supply their own implementation of:
range_label_for_type_mismatch::get_text(unsigned int) const
i.e. we had a cluster of symbols that was disconnected from any
users in f951.
The above patch added a new range_label::get_effects vfunc to the
base class. My hunch is that we were getting away with not defining
the symbol for Fortran with AIX's linker before (since none of the
users are used), but adding the get_effects vfunc has somehow broken
things (possibly because there's an empty implementation in the base
class in the *header*).
The following patch moves all of the code in
gcc/gcc-rich-location.{cc,h,o} defining and using
range_label_for_type_mismatch to a new
gcc/c-family/c-type-mismatch.{cc,h,o}, to help the linker ignore this
cluster of symbols when it's disconnected from users.
I was able to reproduce the failure without the patch, and then
successfully bootstrap with this patch on powerpc-ibm-aix7.3.1.0
(cfarm119).
gcc/ChangeLog:
PR bootstrap/115167
* Makefile.in (C_COMMON_OBJS): Add c-family/c-type-mismatch.o.
* gcc-rich-location.cc
(maybe_range_label_for_tree_type_mismatch::get_text): Move to
c-family/c-type-mismatch.cc.
(binary_op_rich_location::binary_op_rich_location): Likewise.
(binary_op_rich_location::use_operator_loc_p): Likewise.
* gcc-rich-location.h (class range_label_for_type_mismatch):
Likewise.
(class maybe_range_label_for_tree_type_mismatch): Likewise.
(class op_location_t): Likewise for forward decl.
(class binary_op_rich_location): Likewise.
gcc/c-family/ChangeLog:
PR bootstrap/115167
* c-format.cc: Replace include of "gcc-rich-location.h" with
"c-family/c-type-mismatch.h".
* c-type-mismatch.cc: New file, taking material from
gcc-rich-location.cc.
* c-type-mismatch.h: New file, taking material from
gcc-rich-location.h.
* c-warn.cc: Replace include of "gcc-rich-location.h" with
"c-family/c-type-mismatch.h".
gcc/c/ChangeLog:
PR bootstrap/115167
* c-objc-common.cc: Replace include of "gcc-rich-location.h" with
"c-family/c-type-mismatch.h".
* c-typeck.cc: Likewise.
gcc/cp/ChangeLog:
PR bootstrap/115167
* call.cc: Replace include of "gcc-rich-location.h" with
"c-family/c-type-mismatch.h".
* error.cc: Likewise.
* typeck.cc: Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
And here's Lyut's basic Zbkb support. Essentially it's four new patterns for
packh, packw, pack plus a bridge pattern needed for packh.
packw is a bit ugly as we need to match a sign extension in an inconvenient
location. We pull it out so that the extension is exposed in a convenient
place for subsequent sign extension elimination.
We need a bridge pattern to get packh. Thankfully the bridge pattern is a
degenerate packh where one operand is x0, so it works as-is without splitting
and provides the bridge to the more general form of packh.
This patch also refines the condition for the constant reassociation patch to
avoid a few more cases that can be handled efficiently with other preexisting
patterns, and includes one bugfix to avoid losing bits, particularly in the
xor/ior case.
Lyut did the core work here. I think I did some minor cleanups and the bridge
pattern to work with gcc-15 and beyond.
This is a prerequisite for using zbkb in constant synthesis. It also stands on
its own. I know we've seen it trigger in spec without the constant synthesis
bits.
It's been through our internal CI and my tester. I'll obviously wait for the
upstream CI to finish before taking further action.
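For reference, the rv64 semantics of the instructions being matched, per the Zbkb specification (a Python sketch; XLEN=64 assumed):

```python
XLEN = 64
MASK = (1 << XLEN) - 1

def sext(v, bits):
    """Sign-extend the low `bits` bits of v to XLEN."""
    v &= (1 << bits) - 1
    return (v - (1 << bits)) & MASK if v >> (bits - 1) else v

def pack(rs1, rs2):
    """Concatenate the low halves of rs1 (bottom) and rs2 (top)."""
    half = XLEN // 2
    m = (1 << half) - 1
    return (((rs2 & m) << half) | (rs1 & m)) & MASK

def packh(rs1, rs2):
    """Concatenate the low bytes, zero-extended to XLEN."""
    return ((rs2 & 0xFF) << 8) | (rs1 & 0xFF)

def packw(rs1, rs2):
    """RV64 only: concatenate the low 16-bit halves, sign-extend to 64."""
    return sext(((rs2 & 0xFFFF) << 16) | (rs1 & 0xFFFF), 32)

# The bridge pattern is the degenerate packh with rs2 == x0:
assert packh(0xAB, 0) == 0xAB  # just a zero-extended low byte
```

Note packw's result is sign-extended, which is the extension the combiner pattern has to match "in an inconvenient location".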
gcc/
* config/riscv/crypto.md: Add new combiner patterns to generate
pack, packh, packw instructions.
* config/riscv/iterators.md (HX): New iterator for half X mode.
* config/riscv/riscv.md (<optab>_shift_reverse<X:mode>): Tighten
cases to avoid. Do not lose bits for XOR/IOR.
gcc/testsuite
* gcc.target/riscv/pack32.c: New test.
* gcc.target/riscv/pack64.c: New test.
* gcc.target/riscv/packh32.c: New test.
* gcc.target/riscv/packh64.c: New test.
* gcc.target/riscv/packw.c: New test.
Co-authored-by: Jeffrey A Law <jlaw@ventanamicro.com>
|
|
[PR115060]
Some utility functions (such as vect_look_through_possible_promotion) that
look for a certain kind of direct or indirect SSA definition of a value may
return the original SSA name rather than its pattern-representative SSA
name, even when a pattern is involved. For example,
a = (T1) patt_b;
patt_b = (T2) c; // b = ...
patt_c = not-a-cast; // c = ...
Given 'a', the mentioned function will return 'c', instead of 'patt_c'. This
subtlety would make some pattern recog code that is unaware of it mis-use the
original instead of the new pattern statement, which is inconsistent with the
processing logic of the pattern formation pass. This patch corrects the issue
by making another utility function (vect_get_internal_def) return the pattern
statement information to the caller by default.
2024-05-23 Feng Xue <fxue@os.amperecomputing.com>
gcc/
PR tree-optimization/115060
* tree-vect-patterns.cc (vect_get_internal_def): Return statement for
vectorization.
(vect_widened_op_tree): Call vect_get_internal_def instead of look_def
to get statement information.
(vect_recog_widen_abd_pattern): No need to call vect_stmt_to_vectorize.
|
|
The dump scanning is supposed to check that we do not merge two
slightly different gathers into one SLP node, but since we now
SLP the store, scanning for "vectorizing stmts using SLP" is no
longer good. Instead the following makes us look for
"stmt 1 .* = .MASK" which is how the second lane of an SLP
node looks. We have to handle both .MASK_GATHER_LOAD (for
targets with ifun mask gathers) and .MASK_LOAD (for ones without).
Tested on x86_64-linux with and without native gather and on GCN
where this now avoids a FAIL.
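A sketch of what the new scan matches, against mocked-up dump lines (the exact dump text here is an illustrative assumption, not verbatim vectorizer output):

```python
import re

pattern = r"stmt 1 .* = .MASK"  # the new scan, as in the dg-final directive

# Lane 1 of a two-lane SLP node, in the two forms mentioned above:
gather = "stmt 1 patt_23 = .MASK_GATHER_LOAD (_5, _6, 4, _7, _8);"
plain  = "stmt 1 _23 = .MASK_LOAD (_5, 32B, _7);"

assert re.search(pattern, gather)  # targets with ifun mask gathers
assert re.search(pattern, plain)   # targets without
```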
PR target/115254
* gcc.dg/vect/vect-gather-4.c: Adjust dump scan.
|
|
The stored-to ANYTHING handling has more holes, uncovered by treating
volatile accesses as ANYTHING. We fail to properly build the
pred and succ graphs, in particular we may not elide direct nodes
from receiving from STOREDANYTHING.
PR tree-optimization/115236
* tree-ssa-structalias.cc (build_pred_graph): Properly
handle *ANYTHING = X.
(build_succ_graph): Likewise. Do not elide direct nodes
from receiving from STOREDANYTHING.
* gcc.dg/pr115236.c: New testcase.
|
|
We process asm memory input/outputs with constraints to ESCAPED
but for this temporarily build an ADDR_EXPR. The issue is that
the used build_fold_addr_expr ends up wrapping the ADDR_EXPR in
a conversion which ends up producing &ANYTHING constraints which
is quite bad. The following uses get_constraint_for_address_of
instead, avoiding the temporary tree and the unhandled conversion.
This avoids a gcc.dg/tree-ssa/restrict-9.c FAIL with the fix
for PR115236.
* tree-ssa-structalias.cc (find_func_aliases): Use
get_constraint_for_address_of to build escape constraints
for asm inputs and outputs.
|
|
The following avoids accounting single-lane SLP to the discovery
limit. As the two testcases show this makes discovery fail,
unfortunately even not the same across targets. The following
should fix two FAILs for GCN as a side-effect.
PR tree-optimization/115254
* tree-vect-slp.cc (vect_build_slp_tree): Only account
multi-lane SLP to limit.
* gcc.dg/vect/slp-cond-2-big-array.c: Expect 4 times SLP.
* gcc.dg/vect/slp-cond-2.c: Likewise.
|
|
When the neutral op is the initial value we might need to convert
it from pointer to integer.
* tree-vect-loop.cc (get_initial_defs_for_reduction): Convert
neutral op to the vector component type.
|
|
When I applied Roger's patch [1], there was an ICE due to it.
This patch fixes the latent bug.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651365.html
gcc/ChangeLog:
* config/i386/sse.md
(<avx512>_<complexopname>_<mode>_mask<round_name>): Align
operands' predicate with corresponding expander.
(<avx512>_<complexopname>_<mode><maskc_name><round_name>):
Ditto.
|
|
[PR115169]
gcc/ChangeLog:
PR target/115169
* config/loongarch/loongarch.cc
(loongarch_expand_conditional_move): Guard REGNO with REG_P.
|
|
This just moves the tree scan earlier so we can detect the optimization and not
need to detect the vector splitting too.
Committed as obvious after a quick test.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/bitops-9.c: Look at cddce1 rather than optimized.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
|
|
I noticed while working on the `a ^ CST` patch, that bitwise_inverted_equal_p
would check INTEGER_CST directly and not handle vector csts that are uniform.
This moves over to using uniform_integer_cst_p instead of checking INTEGER_CST
directly.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
PR tree-optimization/115238
gcc/ChangeLog:
* generic-match-head.cc (bitwise_inverted_equal_p): Use
uniform_integer_cst_p instead of checking INTEGER_CST.
* gimple-match-head.cc (gimple_bitwise_inverted_equal_p): Likewise.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/bitops-9.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
This patch simplifies all the xref usage for gm2 nodes in the
modula-2 documentation.
gcc/ChangeLog:
* doc/gm2.texi: Replace all occurrences of xref
{foo, , , gm2} with xref {foo}.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
When points-to analysis finds SCCs it marks the wrong node as being
part of a found cycle. It only wants to mark the node it collapses
to, but instead marked the entry node found. This causes fallout in
the patch for PR115236 but generally
weakens the points-to solution by collapsing too many nodes. Note
that this fix might slow down points-to solving.
* tree-ssa-structalias.cc (scc_visit): Mark the node we
collapse to as being in a component.
|
|
The following makes sure the virtual operand updating when sinking
stores works for the case we ignore paths to kills. The final
sink location might not post-dominate the original stmt location
which would require inserting of a virtual PHI which we do not support.
PR tree-optimization/115220
PR tree-optimization/115226
* tree-ssa-sink.cc (statement_sink_location): When ignoring
paths to kills when sinking stores make sure the final
sink location is still post-dominated by the original one.
Otherwise we'd need to insert a PHI node to merge virtual operands.
* gcc.dg/torture/pr115220.c: New testcase.
* gcc.dg/torture/pr115226.c: New testcase.
|
|
gcc:
* config/mingw/mingw32.h: Add new define for POSIX
threads.
Signed-off-by: TheShermanTanker <tanksherman27@gmail.com>
|
|
For the following testcase we fail to demangle
_ZZN5OuterIvE6methodIvEEvvQ3cstITL0__EEN5InnernwEm and
_ZZN5OuterIvE6methodIvEEvvQ3cstITL0__EEN5InnerdlEPv and in turn end
up building NULL references. The following puts in a safeguard for
failed demangling into -Waccess.
PR tree-optimization/115232
* gimple-ssa-warn-access.cc (new_delete_mismatch_p): Handle
failure to demangle gracefully.
* g++.dg/pr115232.C: New testcase.
|
|
The test case in PR c++/105229 has been fixed since 11.4 (via
PR c++/106024) - the attached patch simply adds the case to
the test suite.
Successfully tested on x86_64-pc-linux-gnu.
PR c++/105229
gcc/testsuite/ChangeLog:
* g++.dg/parse/crash72.C: New test.
|
|
gcc:
* doc/gm2.texi (What is GNU Modula-2): Move gcc.gnu.org links to
https.
(Other languages): Ditto. And fix casing of GCC.
|
|
Update v1->v2
Add testcase for this patch.
The missing boolean expression TARGET_ZMMUL in riscv_rtx_costs() causes different instructions when
multiplying an integer with a constant. ( https://github.com/riscv-collab/riscv-gnu-toolchain/issues/1482 )
int foo(int *ib) {
*ib = *ib * 33938;
return 0;
}
rv64im:
lw a4,0(a1)
li a5,32768
addiw a5,a5,1170
mulw a5,a5,a4
sw a5,0(a1)
ret
rv64i_zmmul:
lw a4,0(a1)
slliw a5,a4,5
addw a5,a5,a4
slliw a5,a5,3
addw a5,a5,a4
slliw a5,a5,3
addw a5,a5,a4
slliw a5,a5,3
addw a5,a5,a4
slliw a5,a5,1
sw a5,0(a1)
ret
Fixed.
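The long rv64i_zmmul sequence above is not wrong, just slow: it is a shift-add decomposition of the multiply, which the cost model falls back to because riscv_rtx_costs does not account for Zmmul's multiplier. A sketch mirroring the emitted chain:

```python
def mul33938(x):
    """Mirror the slliw/addw chain emitted for rv64i_zmmul before the fix."""
    a5 = (x << 5) + x   # 33*x
    a5 = (a5 << 3) + x  # 265*x
    a5 = (a5 << 3) + x  # 2121*x
    a5 = (a5 << 3) + x  # 16969*x
    return a5 << 1      # 33938*x

assert mul33938(7) == 7 * 33938
```

Nine dependent ALU instructions versus li+addiw+mulw; with TARGET_ZMMUL added to riscv_rtx_costs the multiply is costed correctly and the mulw form wins.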
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_rtx_costs): Add TARGET_ZMMUL.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zmmul-3.c: New test.
|
|
Use the correct names of the D_floating and G_floating data formats as
per the VAX ISA nomenclature[1]. Document the `-md', `-md-float', and
`-mg-float' options.
References:
[1] DEC STD 032-0 "VAX Architecture Standard", Digital Equipment
Corporation, A-DS-EL-00032-00-0 Rev J, December 15, 1989, Section
1.2 "Data Types", pp. 1-7, 1-9
gcc/
* doc/invoke.texi (Option Summary): Add `-md', `-md-float', and
`-mg-float' options. Reorder, matching VAX Options.
(VAX Options): Reword the description of `-mg' option. Add
`-md', `-md-float', and `-mg-float' options.
|
|
Replace "Target" with "Generate" consistently and place a hyphen in
"double-precision" as this is used as an adjective here.
gcc/ChangeLog:
PR target/79646
* config/vax/vax.opt (md, md-float, mg, mg-float): Correct
descriptions.
|
|
This patch from Lyut will reassociate operands when we have shifted logical
operations. This can simplify a constant that does not fit in a simm12 into
a form that does fit into a simm12.
The basic work was done by Lyut. I generalized it to handle XOR/OR.
It stands on its own, but also helps the upcoming Zbkb work from Lyut.
This has survived Ventana's CI system as well as my tester. Obviously I'll
wait for a verdict from the Rivos CI system before moving forward.
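The transform, in miniature: a logical operation under a shift can be reassociated so the constant is pre-shifted, and the pre-shifted constant may fit in a simm12 even when the original does not. A sketch with illustrative constants (the mask models 64-bit register wraparound):

```python
M64 = (1 << 64) - 1

def fits_simm12(c):
    return -2048 <= c <= 2047

C, s = 0x7FF000, 12
assert not fits_simm12(C)   # needs lui or worse to materialize
assert fits_simm12(C >> s)  # fits an andi immediate

# (x << s) & C  ==  (x & (C >> s)) << s  when C's low s bits are clear
for x in (0, 1, 0xDEADBEEF, M64):
    assert ((x << s) & C) & M64 == ((x & (C >> s)) << s) & M64
```

The identity only holds when no set bits of C are lost by the pre-shift, which is the "do not lose bits" condition the follow-up Zbkb patch tightens for the XOR/IOR case.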
gcc/
* config/riscv/riscv.md (<optab>_shift_reverse<X:mode>): New pattern.
gcc/testsuite
* gcc.target/riscv/and-shift32.c: New test.
* gcc.target/riscv/and-shift64.c: New test.
Co-authored-by: Jeffrey A Law <jlaw@ventanamicro.com>
|
|
Replaced arithmetic shifts with logical shifts in
expand_vec_perm_psrlw_psllw_por to avoid sign bit extension issues.
Also corrected gen_vlshrv8hi3 to gen_lshrv8hi3 and gen_vashlv8hi3 to
gen_ashlv8hi3.
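The sign-extension issue, in miniature: with the sign bit of a 16-bit lane set, an arithmetic right shift drags copies of the sign bit into the vacated positions, and those stray ones then pollute the subsequent por. A sketch of single-lane arithmetic (not the actual intrinsics):

```python
def asr16(v, n):
    """Arithmetic shift right of a 16-bit lane."""
    if v & 0x8000:    # interpret the lane as negative
        v -= 0x10000
    return (v >> n) & 0xFFFF

def lsr16(v, n):
    """Logical shift right of a 16-bit lane."""
    return (v & 0xFFFF) >> n

assert asr16(0x8042, 8) == 0xFF80  # sign bit smeared into the high byte
assert lsr16(0x8042, 8) == 0x0080  # high byte correctly cleared
assert asr16(0x7042, 8) == lsr16(0x7042, 8) == 0x70  # agree when sign clear
```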
Co-authored-by: H.J. Lu <hjl.tools@gmail.com>
gcc/ChangeLog:
PR target/115146
* config/i386/i386-expand.cc (expand_vec_perm_psrlw_psllw_por):
Replace arithmetic shift gen_ashrv4hi3 with logical shift
gen_lshrv4hi3. Replace gen_vlshrv8hi3 with gen_lshrv8hi3 and
gen_vashlv8hi3 with gen_ashlv8hi3.
gcc/testsuite/ChangeLog:
PR target/115146
* g++.target/i386/pr107563-a.C: Append '-mno-sse3' to compile option
to avoid test failure on hosts with SSE3 support.
* g++.target/i386/pr107563-b.C: Append '-mno-sse3' to compile option
to avoid test failure on hosts with SSE3 support.
* gcc.target/i386/pr115146.c: New test.
|
|
Notice some mis-aligned right hand braces from gen_kids_1, as below:
if ((_q50 == _q20 && ! TREE_SIDE_EFFECTS (...
{
if ((_q51 == _q21 && ! TREE_SIDE_EFFECTS (...
{
{
tree captures[2] ATTRIBUTE_UNUSED = {...
{
res_ops[0] = captures[0];
res_ops[1] = captures[1];
if (UNLIKELY (debug_dump)) ...
return true;
}
}
}
}
} // mis-aligned here.
}
The below tests passed for this patch:
* The x86 bootstrap test.
* The x86 fully regression test.
gcc/ChangeLog:
* genmatch.cc (dt_node::gen_kids_1): Fix mis-aligned indent.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
|
|
So there's another class of constants we're failing to synthesize well.
Specifically those where we can invert our original constant C into C' and C'
takes at least 2 fewer instructions to synthesize than C. In that case we can
initially generate C', then use xori with the constant -1 to flip all the bits
resulting in our target constant.
I've only seen this trigger when the final synthesis is li+srli+xori. The
original synthesis took on various 4 or 5 instruction forms.
Most of the methods we use to improve constant synthesis are in
riscv_build_integer_1. I originally tried to put this code in there. But
that'll end up with infinite recursion due to some other ADDI related code
which wants to flip bits and try synthesis.
So this was put into riscv_build_integer and recurses into riscv_build_integer.
This isn't unprecedented, just a bit different than most of the other synthesis
implementation bits.
This doesn't depend on any extensions. So it should help any rv64 system.
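A sketch of the trick with illustrative constants (whether GCC picks exactly this sequence for this value is an assumption; the point is the bit-flip identity):

```python
M64 = (1 << 64) - 1

C = 0xF000000000000055  # assume C itself needs 4+ instructions
Cprime = (~C) & M64     # its complement: 0x0FFFFFFFFFFFFFAA

# C' happens to be cheap: li a0,-1365 ; srli a0,a0,4
a0 = (-1365) & M64
a0 >>= 4
assert a0 == Cprime

# One trailing xori with -1 (all-ones immediate) flips every bit back:
a0 = (a0 ^ -1) & M64
assert a0 == C
```

Both -1365 and -1 fit the simm12 immediate field, so the whole constant costs three instructions here versus four or more directly.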
gcc/
* config/riscv/riscv.cc (riscv_build_integer_1): Verify there
are no bits left to set in the constant when generating bseti.
(riscv_build_integer): Synthesize ~value and if it's cheap use it
with a trailing xori with -1.
gcc/testsuite
* gcc.target/riscv/synthesis-8.c: New test.
|
|
gcc/go:
* gccgo.texi (Top): Move a web reference from golang.org to go.dev.
(C Interoperability): Move a web reference from golang.org to
pkg.go.dev.
|
|
gcc:
* doc/extend.texi (Attribute Syntax): Use @samp{=} instead of @code{=}.
(Extended Asm): Ditto.
|
|
desired constant
Next step in constant synthesis work.
For some cases it can be advantageous to generate a constant near our target,
then do a final addi to fully synthesize C.
The idea is that while our target C may require N instructions to synthesize,
C' may only require N-2 (or fewer) instructions. Thus there's budget to adjust
C' into C ending up with a better sequence than if we tried to generate C
directly.
So as an example:
> unsigned long foo_0xfffff7fe7ffff7ff(void) { return 0xfffff7fe7ffff7ffUL; }
This is currently 5 instructions on the trunk:
> li a0,-4096
> addi a0,a0,2047
> bclri a0,a0,31
> bclri a0,a0,32
> bclri a0,a0,43
But we can do better by first synthesizing 0xfffff7fe7ffff800 which is just 3
instructions. Then we can subtract 1 from the result. That gives us this
sequence:
> li a0,-16789504
> slli a0,a0,19
> addi a0,a0,-2048
> addi a0,a0,-1
These cases are relatively easy to find once you know what you're looking for.
I kept the full set found by the testing code yesterday, mostly because some of
them show different patterns for generating C', thus showing generality in the
overall synthesis implementation. While all these tests have 0x7ff in their
low bits. That's just an artifact to the test script. The methodology will
work for a variety of other cases.
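Both sequences check out under 64-bit wraparound arithmetic; a quick sketch:

```python
M64 = (1 << 64) - 1
target = 0xFFFFF7FE7FFFF7FF

# Trunk, 5 insns: li / addi / bclri x3
a0 = (-4096 + 2047) & M64
for bit in (31, 32, 43):
    a0 &= ~(1 << bit)
assert a0 == target

# New, 4 insns: build C' = target + 1 in 3 insns, then addi -1
a0 = (-16789504) & M64  # li   a0,-16789504
a0 = (a0 << 19) & M64   # slli a0,a0,19
a0 = (a0 - 2048) & M64  # addi a0,a0,-2048  -> 0xfffff7fe7ffff800
a0 = (a0 - 1) & M64     # addi a0,a0,-1
assert a0 == target
```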
gcc/
* config/riscv/riscv.cc (riscv_build_integer_1): Try generating
a nearby simpler constant, then using a final addi to set low
bits properly.
gcc/testsuite
* gcc.target/riscv/synthesis-7.c: New test.
|
|
libcpp/ChangeLog:
* lex.cc (do_peek_prev): Correct typo in argument to __builtin_expect()
Signed-off-by: Peter Damianov <peter0x44@disroot.org>
|
|
Forgot to free the gori_map object when a gori object is freed.
PR tree-optimization/115208
* value-query.cc (range_query::create_gori): Confirm gori_map is NULL.
(range_query::destroy_gori): Free gori_map if one was allocated.
|
|
|