Age | Commit message | Author | Files | Lines |
|
When ifcombine_mark_ssa_name is called directly, rather than by
ifcombine_mark_ssa_name_walk, we need to check that name is an
SSA_NAME at the caller or in the function itself. For convenience and
safety, I'm moving the checks from _walk to the implementation proper.
for gcc/ChangeLog
PR tree-optimization/117915
* tree-ssa-ifcombine.cc (ifcombine_mark_ssa_name): Move
preconditions from...
(ifcombine_mark_ssa_name_walk): ... here.
for gcc/testsuite/ChangeLog
PR tree-optimization/117915
* gcc.dg/pr117915.c: New.
|
|
There was a thinko in the testcase field-merge-9.c: I overcorrected it
for big-endian.
As a bonus, I'm including stdbool.h in field-merge-12.c, because I
used bool without the header there.
for gcc/testsuite/ChangeLog
PR testsuite/118025
* gcc.dg/field-merge-9.c (q): Drop overcorrection for
big-endian.
* gcc.dg/field-merge-12.c: Include stdbool.h.
|
|
The testcase shows that conversions that would negatively impact the
ifcombine field merging implementation won't always have been
optimized out by the time we reach ifcombine.
There's probably room to support multiple conversions with extra
logic, but this workaround should avoid codegen errors until that
logic is figured out.
for gcc/ChangeLog
PR tree-optimization/118046
* gimple-fold.cc (decode_field_reference): Don't follow more
than one conversion.
for gcc/testsuite/ChangeLog
PR tree-optimization/118046
* gcc.dg/field-merge-14.c: New.
|
|
ACATS-4 ca11d02 exposed an error in the logic for recognizing and
identifying the inner object in decode_field_reference: a view-converting
load, inserted in a previous successful field merging operation, was
recognized by gimple_convert_def_p within decode_field_reference, and
as a result we took its operand as the expression, and failed to take
note of the load location.
Without that load, we couldn't compare vuses, and then we ended up
inserting a wider load before relevant parts of the object were
initialized.
This patch makes gimple_convert_def_p recognize loads only when
requested, and requires that either both or neither parts of a
potentially merged operand have associated loads.
As a bonus, it enables additional optimizations by swapping the
operands of the second compare when that makes left-hand operands
of both compares match.
for gcc/ChangeLog
* gimple-fold.cc (gimple_convert_def_p): Reject load stmts
unless requested.
(decode_field_reference): Accept a converting load at the last
conversion matcher, subsuming the load identification.
(fold_truth_andor_for_ifcombine): Refuse to merge operands
when only one of them has an associated load stmt. Swap
operands of one of the compares if that helps them match.
for gcc/testsuite/ChangeLog
* gcc.dg/field-merge-13.c: New.
|
|
The function comment says:
*XOR_P is to be FALSE if EXP might be a XOR used in a compare, in which
case, if XOR_CMP_OP is a zero constant, it will be overridden with *PEXP,
*XOR_P will be set to TRUE, and the left-hand operand of the XOR will be
decoded. If *XOR_P is TRUE, XOR_CMP_OP is supposed to be NULL, and then the
right-hand operand of the XOR will be decoded.
and the comment right above the xor_p handling says
/* Turn (a ^ b) [!]= 0 into a [!]= b. */
but I don't see anything that would actually check that the other operand is
0; in the new testcase it happily optimizes (a ^ 1) == 8 into a == 1.
The following patch adds that check.
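To see why the zero check matters, here is a minimal sketch in C. The function names are made up for illustration and are not part of the patch. It contrasts the compare from the report with the rewrite that the unchecked code produced, next to the one case where the rewrite is valid.

```c
#include <stdbool.h>

/* The compare from the report: the XOR result is tested against a
   non-zero constant.  */
static bool xor_cmp_nonzero (int a) { return (a ^ 1) == 8; }

/* What the unchecked transformation turned it into.  */
static bool bad_rewrite (int a) { return a == 1; }

/* The only case the "(a ^ b) [!]= 0 into a [!]= b" comment covers.  */
static bool xor_cmp_zero (int a, int b) { return (a ^ b) == 0; }

/* The valid rewrite of the zero case.  */
static bool good_rewrite (int a, int b) { return a == b; }
```

With a == 9 the original compare is true (9 ^ 1 is 8) while the bad rewrite is false, and with a == 1 it is the other way around, so the two are not equivalent unless the compared constant is zero.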
Note, there are various other parts of the function I'm worried about, but
haven't had time to construct counterexamples yet.
One worrying thing is the
/* Drop casts, only save the outermost type. We need not worry about
narrowing then widening casts, or vice-versa, for those that are not
essential for the compare have already been optimized out at this
point. */
comment. While there are obviously various optimizations which do optimize
nested casts and the like, I'm not really sure it is safe to rely on them
always happening before this optimization: there are various options to
disable certain optimizations, and some IL could appear right before
ifcombine without having been optimized the way this routine expects.
Plus, the 3 casts are looked through in between various optimizations which
might make those narrowing/widening or vice versa cases necessary.
Also, e.g. for the xor optimization, I think there is a difference between
int a and
(a ^ 0x23) == 0
and
((int) (((unsigned char) a) ^ (unsigned char) 0x23)) == 0
etc.
Another thing I'm worrying about is mixing up the different patterns
together: there is the BIT_AND_EXPR handling, BIT_XOR_EXPR handling,
RSHIFT_EXPR handling and then load handling.
What if all 4 appear together, or 3 of them, 2 of them?
Is the xor optimization still valid if there is BIT_AND_EXPR in between?
I.e. instead of
(a ^ 123) == 0
there is
((a ^ 123) & 234) == 0
?
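The two concerns above can be made concrete with a small C sketch. The function names are hypothetical and the code assumes 8-bit char. It only shows that the expression pairs are not equivalent at the source level; it makes no claim about what decode_field_reference does with them.

```c
#include <stdbool.h>

/* XOR done at full int width.  */
static bool full_width (int a) { return (a ^ 0x23) == 0; }

/* Same XOR, but on operands narrowed to unsigned char first.  */
static bool narrowed (int a)
{
  return ((int) (((unsigned char) a) ^ (unsigned char) 0x23)) == 0;
}

/* XOR with a BIT_AND_EXPR-style mask in between.  */
static bool masked_xor (int a) { return ((a ^ 123) & 234) == 0; }

/* What a naive "(a ^ b) == 0 into a == b" rewrite would yield.  */
static bool plain_eq (int a) { return a == 123; }
```

For a == 0x123 the narrowed form is true but the full-width form is false, because the cast drops the bit that differs. For a == 122 the masked form is true (122 ^ 123 is 1, and bit 0 is not in 234) while a == 123 is false.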
2024-12-18 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/118081
* gimple-fold.cc (decode_field_reference): Only set *xor_p to true
if *xor_cmp_op is integer_zerop.
* gcc.dg/pr118081.c: New test.
|
|
Originally, we did not stream any formal parameter types into WPA and
were generally very conservative when it came to type mismatches in
IPA-CP. Over time, mismatches that happened in code and blew up in
WPA made us much more resilient and also made us stream the types of
the parameters, which we now commonly use.
With that information, we can safely skip conversions when looking at
the IL from which we build jump functions and then simply fold convert
the constants and ranges to the resulting type, as long as we are
careful that performing the corresponding folding of constants gives
the corresponding results. In order to do that, we must ensure that
the old value can be represented in the new one without any loss.
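As an illustration of the "without any loss" condition, the following C sketch states one common rule for when every value of one integer type is representable in another. The struct and function are hypothetical and are not a description of what skip_a_safe_conversion_op actually checks.

```c
#include <stdbool.h>

/* Hypothetical summary of an integer type: width in bits and
   signedness.  Not a GCC data structure.  */
struct int_kind { unsigned precision; bool is_unsigned; };

/* Return true if every value of FROM is representable in TO.  */
static bool conversion_is_lossless (struct int_kind from, struct int_kind to)
{
  /* Same signedness: the target only needs to be at least as wide.  */
  if (from.is_unsigned == to.is_unsigned)
    return to.precision >= from.precision;
  /* Unsigned to signed: one extra bit is needed for the sign.  */
  if (from.is_unsigned)
    return to.precision > from.precision;
  /* Signed to unsigned: negative values cannot be represented.  */
  return false;
}
```

Under this rule a conversion from 32-bit signed to 64-bit signed can be skipped safely, whereas 32-bit signed to 64-bit unsigned cannot, since a negative constant would change value.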
With this change, we can nicely propagate non-NULLness in IPA-VR as
demonstrated with the new test case.
I have gone through all other uses of (all components of) jump
functions which could be affected by this and verified they do indeed
check types and can handle mismatches.
gcc/ChangeLog:
2024-12-11 Martin Jambor <mjambor@suse.cz>
* ipa-prop.cc: Include vr-values.h.
(skip_a_safe_conversion_op): New function.
(ipa_compute_jump_functions_for_edge): Use it.
gcc/testsuite/ChangeLog:
2024-11-01 Martin Jambor <mjambor@suse.cz>
* gcc.dg/ipa/vrp9.c: New test.
|
|
This handles fallout from r15-6097-gee2f19b0937b5e. A brief
analysis shows that the metric used in that code is computed
by estimate_move_cost, differentiating on the target macro
MOVE_MAX_PIECES (which defaults to MOVE_MAX) which for most
"32-bit targets" is 4 and for "64-bit targets" is 8. There
are some outliers, like pru, with MOVE_MAX set to 8 but
counting as a 32-bit target.
So, the main difference for this test-case, which is heavy
on 64-bit moves (most targets have "double" mapped to IEEE
64-bit), is between "32-bit" and "64-bit", with the cost up
to twice for the former compared to the latter. I see no
effective_target_move_max_is_4 or equivalent, and this
instance falls below the threshold of adding one, so I'm
sticking to a list of targets. For CRIS, it would suffice
with 210, but there's no need to be this specific, and it
would make the test even more brittle.
PR tree-optimization/118055
* gcc.dg/tree-ssa/pr83403-1.c, gcc.dg/tree-ssa/pr83403-2.c: Add
cris-*-* to targets passing --param=max-completely-peeled-insns=300.
|
|
libgdiagnostics was written before the fixes for PR other/116613 allowed
a diagnostic_context to have multiple output sinks.
Hence each libgdiagnostics sink had its own diagnostic_context with just
one diagnostic_output_format.
This wart is no longer necessary and makes it harder to move state
into the manager/context; in particular for quoting source code
from the .sarif file (PR sarif-replay/117943).
Simplify, by making libgdiagnostics' implementation more similar to
GCC's implementation, by moving the diagnostic_context from sink into
diagnostic_manager.
Doing so requires generalizing where the
diagnostic_source_printing_options comes from in class
diagnostic_text_output_format: for GCC we use
the instance within the diagnostic_context, whereas for
libgdiagnostics each diagnostic_text_sink has its own instance.
No functional change intended.
gcc/c-family/ChangeLog:
PR sarif-replay/117943
* c-format.cc (selftest::test_type_mismatch_range_labels): Use
dc.m_source_printing.
* c-opts.cc (c_diagnostic_text_finalizer): Use source-printing
options from text_output.
gcc/cp/ChangeLog:
PR sarif-replay/117943
* error.cc (auto_context_line::~auto_context_line): Use
source-printing options from text_output.
gcc/ChangeLog:
PR sarif-replay/117943
* diagnostic-format-text.cc
(diagnostic_text_output_format::append_note): Use source-printing
options from text_output.
(diagnostic_text_output_format::update_printer): Copy
source-printing options from dc.
(default_diagnostic_text_finalizer): Use source-printing
options from text_output.
* diagnostic-format-text.h
(diagnostic_text_output_format::diagnostic_text_output_format):
Add optional diagnostic_source_printing_options param, using
the context's if null.
(diagnostic_text_output_format::get_source_printing_options): New
accessor.
(diagnostic_text_output_format::m_source_printing): New field.
* diagnostic-path.cc (event_range::print): Use source-printing
options from text_output.
(selftest::test_interprocedural_path_1): Use source-printing
options from dc.
* diagnostic-show-locus.cc
(gcc_rich_location::add_location_if_nearby): Likewise.
(diagnostic_context::maybe_show_locus): Add "opts" param
and use in place of m_source_printing. Pass it to source_policy
ctor.
(diagnostic_source_print_policy::diagnostic_source_print_policy):
Add overload taking a const diagnostic_source_printing_options &.
* diagnostic.cc (diagnostic_context::initialize): Pass nullptr
for source options when creating text sink, so that it uses
the dc's options.
(diagnostic_context::dump): Add an "output sinks:" heading and
print "(none)" if there aren't any.
(diagnostic_context::set_output_format): Split out code into...
(diagnostic_context::remove_all_output_sinks): ...this new
function.
* diagnostic.h
(diagnostic_source_print_policy::diagnostic_source_print_policy):
Add overload taking a const diagnostic_source_printing_options &.
(diagnostic_context::maybe_show_locus): Add "opts" param.
(diagnostic_context::remove_all_output_sinks): New decl.
(diagnostic_context::m_source_printing): New field.
(diagnostic_show_locus): Add "opts" param and pass to
maybe_show_locus.
* libgdiagnostics.cc (sink::~sink): Delete.
(sink::begin_group): Delete.
(sink::end_group): Delete.
(sink::emit): Delete.
(sink::m_dc): Drop field.
(diagnostic_text_sink::on_begin_text_diagnostic): Delete.
(diagnostic_text_sink::get_source_printing_options): Use
m_source_printing.
(diagnostic_text_sink::m_current_logical_loc): Drop field.
(diagnostic_text_sink::m_inner_sink): New field.
(diagnostic_text_sink::m_source_printing): New field.
(diagnostic_manager::diagnostic_manager): Update for changes
to fields. Initialize m_dc.
(diagnostic_manager::~diagnostic_manager): Call diagnostic_finish.
(diagnostic_manager::get_file_cache): Drop.
(diagnostic_manager::get_dc): New accessor.
(diagnostic_manager::begin_group): Reimplement.
(diagnostic_manager::end_group): Reimplement.
(diagnostic_manager::get_prev_diag_logical_loc): New accessor.
(diagnostic_manager::m_dc): New field.
(diagnostic_manager::m_file_cache): Drop field.
(diagnostic_manager::m_edit_context): Convert to a std::unique_ptr
so that object can be constructed after m_dc is initialized.
(diagnostic_manager::m_prev_diag_logical_loc): New field.
(diagnostic_text_sink::diagnostic_text_sink): Reimplement.
(get_color_rule): Delete.
(diagnostic_text_sink::set_colorize): Reimplement.
(diagnostic_text_sink::text_starter): New.
(sarif_sink::sarif_sink): Reimplement.
(diagnostic_manager::write_patch): Update for change to
m_edit_context.
(diagnostic_manager::emit): Update now that each sink has a
corresponding diagnostic_output_format object within m_dc.
gcc/fortran/ChangeLog:
PR sarif-replay/117943
* error.cc (gfc_diagnostic_text_starter): Use source-printing
options from text_output.
gcc/testsuite/ChangeLog:
PR sarif-replay/117943
* gcc.dg/plugin/diagnostic_plugin_test_show_locus.cc
(custom_diagnostic_text_finalizer): Use source-printing options
from text_output.
* gcc.dg/plugin/diagnostic_plugin_xhtml_format.cc
(xhtml_builder::make_element_for_diagnostic): Use source-printing
options from diagnostic_context.
* gcc.dg/plugin/expensive_selftests_plugin.cc (test_richloc):
Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
Memmove destination overflows if size of int is less than 3, resulting in
spurious test failures. Fix by adding a requirement for effective
target int32plus.
gcc/testsuite/ChangeLog:
* gcc.dg/pr117816.c: Require effective target int32plus.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
|
|
2024-12-15 John David Anglin <danglin@gcc.gnu.org>
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/ivopts-1.c: Enable TImode tests on hppa64.
|
|
Thank you for the feedback. I have made the minor changes that were requested.
Additionally, I extracted the repetitive code into a reusable helper function,
match_plus_neg_pattern, making the code much more readable. Furthermore, the
logic, code, and tests remain the same as in version 2 of the patch.
gcc/ChangeLog:
* match.pd: New pattern.
* simplify-rtx.cc (match_plus_neg_pattern): New helper function.
(simplify_context::simplify_binary_operation_1): New
code to handle (a - 1) & -a, (a - 1) | -a and (a - 1) ^ -a.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/bitops-11.c: New test.
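As a sanity check on why these three forms are foldable: in two's complement arithmetic -a equals ~(a - 1), so the operands are bitwise complements of each other. The sketch below (function names are made up, and it assumes the patterns fold to the constants shown, which the message above does not spell out) uses unsigned arithmetic so that wrap-around is well defined.

```c
/* (a - 1) & -a: complements share no bits, so the result is 0.  */
static unsigned int and_form (unsigned int a) { return (a - 1) & -a; }

/* (a - 1) | -a: complements cover all bits, so the result is all ones.  */
static unsigned int or_form (unsigned int a) { return (a - 1) | -a; }

/* (a - 1) ^ -a: same reasoning as for OR, the result is all ones.  */
static unsigned int xor_form (unsigned int a) { return (a - 1) ^ -a; }
```

For signed operands the all-ones result reads as -1.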
|
|
The BIT_FIELD_REF verifier has:
if (INTEGRAL_TYPE_P (TREE_TYPE (op))
&& !type_has_mode_precision_p (TREE_TYPE (op)))
{
error ("%qs of non-mode-precision operand", code_name);
return true;
}
check among other things, so one can't extract something out of say
_BitInt(63) or _BitInt(4096).
The new ifcombine optimization happily creates such BIT_FIELD_REFs
and ICEs during their verification.
The following patch fixes that by rejecting those in decode_field_reference.
2024-12-14 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/118023
* gimple-fold.cc (decode_field_reference): Return NULL_TREE if
inner has non-type_has_mode_precision_p integral type.
* gcc.dg/bitint-119.c: New test.
|
|
The following testcase ICEs because of a bug in matching_alloc_calls_p.
The loop was apparently meant to be walking the two attribute chains
in lock-step, but doesn't really do that. If the first lookup_attribute
returns non-NULL, the second one is not done, so rmats in that case can
be some random unrelated attribute rather than a "malloc" attribute; the
body assumes that rmats, if non-NULL, is a "malloc" attribute and relies
on its argument being a "malloc" argument, and if it is some other
attribute with an incompatible argument, it just crashes.
Now, fixing that in the obvious way, instead of doing
(amats = lookup_attribute ("malloc", amats))
|| (rmats = lookup_attribute ("malloc", rmats))
in the condition do
((amats = lookup_attribute ("malloc", amats)),
(rmats = lookup_attribute ("malloc", rmats)),
(amats || rmats))
fixes the testcase but regresses Wmismatched-dealloc-{2,3}.c tests.
The problem is that walking the attribute lists in lock-step is obviously
a very bad idea: there is no requirement that the same deallocators are
present in the same order on both decls, e.g. there could be an extra malloc
attribute without argument in just one of the lists, or the order of say
free/realloc could be swapped, etc. We don't generally document nor enforce
any particular ordering of attributes (even when for some attributes we just
handle the first one rather than all).
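The short-circuit problem in the original condition can be reproduced outside GCC. In this sketch struct attr and lookup are simplified, hypothetical stand-ins for GCC's attribute chains and lookup_attribute; only the shape of the two conditions is taken from the message.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an attribute chain; not GCC's tree type.  */
struct attr { const char *name; struct attr *next; };

/* Return the first node at or after LIST named NAME, or NULL.  */
static struct attr *lookup (const char *name, struct attr *list)
{
  for (; list; list = list->next)
    if (strcmp (list->name, name) == 0)
      return list;
  return NULL;
}

/* Original shape: || skips the second lookup when the first succeeds,
   so *R may keep pointing at an unrelated attribute.  */
static int step_short_circuit (struct attr **a, struct attr **r)
{
  return (*a = lookup ("malloc", *a)) || (*r = lookup ("malloc", *r));
}

/* Comma form: both lookups always run before the test.  */
static int step_both (struct attr **a, struct attr **r)
{
  return ((*a = lookup ("malloc", *a)),
          (*r = lookup ("malloc", *r)),
          (*a || *r));
}
```

If the second chain starts with, say, a "nonnull" node followed by a "malloc" node, step_short_circuit leaves *r at "nonnull" whenever the first lookup succeeds, which is the situation in which the loop body then misinterprets the attribute.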
So, this patch instead simply splits it into two loops, the first one walks
alloc_decl attributes, the second one walks dealloc_decl attributes.
If the malloc attribute argument is a built-in, that doesn't change
anything, and otherwise we have the chance to populate the whole
common_deallocs hash_set in the first loop and then can check it in the
second one (and don't need to use more expensive add method on it, can just
check contains there). Not to mention that it also fixes the case when
the function would incorrectly return true if there wasn't a common
deallocator between the two, but dealloc_decl had 2 malloc attributes with
the same deallocator.
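The two-loop structure can be modelled roughly as follows. This is a simplified sketch, not the code in gimple-ssa-warn-access.cc: deallocators are plain strings, the hash_set is a small fixed array, MAX_DEALLOCS and all function names are invented, and the built-in handling is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEALLOCS 16  /* Arbitrary limit for this sketch.  */

/* Linear membership test standing in for hash_set::contains.  */
static bool contains (const char *const *set, size_t n, const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (set[i], name) == 0)
      return true;
  return false;
}

/* ALLOC lists the deallocators named on the allocator; DEALLOC lists
   those named on the deallocation function.  Order is irrelevant.  */
static bool share_deallocator (const char *const *alloc, size_t n_alloc,
                               const char *const *dealloc, size_t n_dealloc)
{
  const char *common[MAX_DEALLOCS];
  size_t n_common = 0;

  /* First loop: record every deallocator of the allocator.  */
  for (size_t i = 0; i < n_alloc && n_common < MAX_DEALLOCS; i++)
    common[n_common++] = alloc[i];

  /* Second loop: only look things up, never add, so a duplicate in
     DEALLOC cannot produce a false match.  */
  for (size_t i = 0; i < n_dealloc; i++)
    if (contains (common, n_common, dealloc[i]))
      return true;
  return false;
}
```

Because the second loop never inserts, a deallocation function that lists the same deallocator twice no longer matches by accident, and a differing attribute order between the two declarations does not matter.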
2024-12-14 Jakub Jelinek <jakub@redhat.com>
PR middle-end/118024
* gimple-ssa-warn-access.cc (matching_alloc_calls_p): Walk malloc
attributes of alloc_decl and dealloc_decl in separate loops rather
than in lock-step. Use common_deallocs.contains rather than
common_deallocs.add in the second loop.
* gcc.dg/pr118024.c: New test.
|
|
The 'has_device_addr' of 'dispatch' has to be seen in conjunction with the
'need_device_addr' modifier to the 'adjust_args' clause of 'declare variant'.
As the latter has not yet been implemented, 'has_device_addr' has no real
effect. However, to prepare for 'need_device_addr' and as a service to the user:
For C, where 'need_device_addr' is not permitted (contrary to C++ and Fortran),
a note is output when the user tries to use it (alongside the existing
error that either 'nothing' or 'need_device_ptr' was expected).
And, on the ME side, it is lightly handled by diagnosing when - for the
same argument - there is a mismatch between the variant's adjust_args
'need_device_ptr' modifier and dispatch having a 'has_device_addr' clause
(or likewise for need_device_addr with is_device_ptr) as, according to the
spec, those are completely separate.
Thus, 'dispatch' will still do the host to device pointer conversion for
a 'need_device_ptr' argument, even if it appeared in a 'has_device_addr'
clause.
gcc/c/ChangeLog:
* c-parser.cc (OMP_DISPATCH_CLAUSE_MASK): Add has_device_addr clause.
(c_finish_omp_declare_variant): Add an 'inform' telling the user that
'need_device_addr' is invalid for C.
gcc/cp/ChangeLog:
* parser.cc (OMP_DISPATCH_CLAUSE_MASK): Add has_device_addr clause.
gcc/ChangeLog:
* gimplify.cc (gimplify_call_expr): When handling OpenMP's dispatch,
add diagnostic when there is a ptr vs. addr mismatch between
need_device_{addr,ptr} and {is,has}_device_{ptr,addr}, respectively.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/adjust-args-3.c: New test.
* gcc.dg/gomp/adjust-args-2.c: New test.
|
|
This patch introduces various improvements to the logic that merges
field compares, while moving it into ifcombine.
Before the patch, we could merge:
(a.x1 EQNE b.x1) ANDOR (a.y1 EQNE b.y1)
into something like:
(((type *)&a)[Na] & MASK) EQNE (((type *)&b)[Nb] & MASK)
if both of A's fields live within the same alignment boundaries, and
so do B's, at the same relative positions. Constants may be used
instead of the object B.
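At the source level, the merge described above corresponds roughly to the following C sketch. The record layout and all names are hypothetical; the real transformation works on GIMPLE and computes the mask from field positions, whereas here the mask is derived from the layout at run time so that it is correct for either endianness.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: x1 and y1 share one aligned 32-bit word with
   unrelated data.  */
struct rec { unsigned char x1; unsigned char y1; unsigned char other[2]; };

_Static_assert (sizeof (struct rec) == sizeof (uint32_t),
                "sketch assumes a 4-byte record");

/* Two separate field compares joined by a logical AND.  */
static bool eq_fieldwise (const struct rec *a, const struct rec *b)
{
  return a->x1 == b->x1 && a->y1 == b->y1;
}

/* Load the whole record as one 32-bit word.  */
static uint32_t word_of (const struct rec *r)
{
  uint32_t w;
  memcpy (&w, r, sizeof w);
  return w;
}

/* One masked compare of the containing words.  */
static bool eq_merged (const struct rec *a, const struct rec *b)
{
  /* MASK has all bits set exactly where x1 and y1 live.  */
  struct rec ones = { 0xff, 0xff, { 0, 0 } };
  uint32_t mask = word_of (&ones);
  return (word_of (a) & mask) == (word_of (b) & mask);
}
```

Both functions agree for all inputs; the merged form trades two byte compares and a branch for two word loads, two ANDs and one compare.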
The initial goal of this patch was to enable such combinations when a
field crossed alignment boundaries, e.g. for packed types. We can't
generally access such fields with a single memory access, so when we
come across such a compare, we will attempt to combine each access
separately.
Some merging opportunities were missed because of right-shifts,
compares expressed as e.g. ((a.x1 ^ b.x1) & MASK) EQNE 0, and
narrowing conversions, especially after earlier merges. This patch
introduces handlers for several cases involving these.
The merging of multiple field accesses into wider bitfield-like
accesses is undesirable to do too early in compilation, so we move it
from folding to ifcombine, and guard its warnings with
-Wtautological-compare, turned into a common flag.
When the second of a noncontiguous pair of compares is the first that
accesses a word, we may merge the first compare with part of the
second compare that refers to the same word, keeping the compare of
the remaining bits at the spot where the second compare used to be.
Handling compares with non-constant fields was somewhat generalized
from what fold used to do, now handling non-adjacent fields, even if a
field of one object crosses an alignment boundary but the other
doesn't.
for gcc/ChangeLog
* fold-const.cc (make_bit_field): Export.
(unextend, all_ones_mask_p): Drop.
(decode_field_reference, fold_truth_andor_1): Move
field compare merging logic...
* gimple-fold.cc: (fold_truth_andor_for_ifcombine) ... here,
with -Wtautological-compare warning guards, and...
(decode_field_reference): ... here. Rework for gimple.
(gimple_convert_def_p, gimple_binop_def_p): New.
(compute_split_boundary_from_align): New.
(make_bit_field_load, build_split_load): New.
(reuse_split_load): New.
* fold-const.h: (make_bit_field_ref): Declare
(fold_truth_andor_for_ifcombine): Declare.
* tree-ssa-ifcombine.cc (ifcombine_ifandif): Try
fold_truth_andor_for_ifcombine.
* common.opt (Wtautological-compare): Move here.
for gcc/c-family/ChangeLog
* c.opt (Wtautological-compare): Move to ../common.opt.
for gcc/testsuite/ChangeLog
* gcc.dg/field-merge-1.c: New.
* gcc.dg/field-merge-2.c: New.
* gcc.dg/field-merge-3.c: New.
* gcc.dg/field-merge-4.c: New.
* gcc.dg/field-merge-5.c: New.
* gcc.dg/field-merge-6.c: New.
* gcc.dg/field-merge-7.c: New.
* gcc.dg/field-merge-8.c: New.
* gcc.dg/field-merge-9.c: New.
* gcc.dg/field-merge-10.c: New.
* gcc.dg/field-merge-11.c: New.
* gcc.dg/field-merge-12.c: New.
* gcc.target/aarch64/long_branch_1.c: Disable ifcombine.
|
|
[PR113688,PR114713,PR117724]
For checking or computing TYPE_CANONICAL, ignore the array size when it is
the last element of a structure or union. To not get errors because of
an inconsistent number of members, zero-sized arrays which are the last
element are not ignored anymore when checking the fields of a struct.
PR c/113688
PR c/114014
PR c/114713
PR c/117724
gcc/ChangeLog:
* tree.cc (gimple_canonical_types_compatible_p): Add exception.
gcc/lto/ChangeLog:
* lto-common.cc (hash_canonical_type): Add exception.
gcc/testsuite/ChangeLog:
* gcc.dg/pr113688.c: New test.
* gcc.dg/pr114014.c: New test.
* gcc.dg/pr114713.c: New test.
* gcc.dg/pr117724.c: New test.
|
|
Update test cases to use -mcpu=unset/-march=unset feature introduced in
r15-3606-g7d6c6a0d15c.
gcc/testsuite/ChangeLog:
* gcc.dg/pr41574.c: Added option "-mcpu=unset".
* gcc.dg/pr59418.c: Likewise.
* lib/target-supports.exp (add_options_for_vect_early_break):
Likewise.
(add_options_for_arm_v8_neon): Likewise.
(check_effective_target_arm_neon_ok_nocache): Likewise.
(check_effective_target_arm_simd32_ok_nocache): Likewise.
(check_effective_target_arm_sat_ok_nocache): Likewise.
(check_effective_target_arm_dsp_ok_nocache): Likewise.
(check_effective_target_arm_crc_ok_nocache): Likewise.
(check_effective_target_arm_v8_neon_ok_nocache): Likewise.
(check_effective_target_arm_v8_1m_mve_fp_ok_nocache): Likewise.
(check_effective_target_arm_v8_1a_neon_ok_nocache): Likewise.
(check_effective_target_arm_v8_2a_fp16_scalar_ok_nocache):
Likewise.
(check_effective_target_arm_v8_2a_fp16_neon_ok_nocache):
Likewise.
(check_effective_target_arm_v8_2a_dotprod_neon_ok_nocache):
Likewise.
(check_effective_target_arm_v8_1m_mve_ok_nocache): Likewise.
(check_effective_target_arm_v8_2a_i8mm_ok_nocache): Likewise.
(check_effective_target_arm_fp16fml_neon_ok_nocache): Likewise.
(check_effective_target_arm_v8_2a_bf16_neon_ok_nocache):
Likewise.
(check_effective_target_arm_v8m_main_cde_ok_nocache): Likewise.
(check_effective_target_arm_v8m_main_cde_fp_ok_nocache):
Likewise.
(check_effective_target_arm_v8_1m_main_cde_mve_ok_nocache):
Likewise.
(check_effective_target_arm_v8_1m_main_cde_mve_fp_ok_nocache):
Likewise.
(check_effective_target_arm_v8_3a_complex_neon_ok_nocache):
Likewise.
(check_effective_target_arm_v8_3a_fp16_complex_neon_ok_nocache):
Likewise.
(check_effective_target_arm_v8_1_lob_ok): Likewise.
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
|
|
This patch is a followup to:
"c++: use diagnostic nesting [PR116253]"
This patch tweaks how text output with experimental-nesting=yes
prints nested diagnostics, by omitting the leading "note: " from
nested notes.
This reduces the amount of visual cruft the user has to ignore when
reading C++ template errors; see the examples in the testsuite.
This doesn't affect the output for users who have not opted-in
to nested diagnostic-printing.
gcc/ChangeLog:
PR other/116253
* diagnostic-format-text.cc (build_prefix): Don't add the
"note: " prefix when showing nested diagnostics.
gcc/testsuite/ChangeLog:
PR other/116253
* g++.dg/concepts/nested-diagnostics-1-truncated.C: Update
expected output.
* g++.dg/concepts/nested-diagnostics-1.C: Likewise.
* g++.dg/concepts/nested-diagnostics-2.C: Likewise.
* gcc.dg/plugin/diagnostic-test-nesting-text-indented-show-levels.c:
Likewise.
* gcc.dg/plugin/diagnostic-test-nesting-text-indented-unicode.c:
Likewise.
* gcc.dg/plugin/diagnostic-test-nesting-text-indented.c: Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
r15-919-gef27b91b62c3aa removed 1 / 3 size reduction for innermost
loop, but it doesn't accurately remember what's "innermost" for 2
testcases in PR117888.
1) For pass_cunroll, the "innermost" loop could be an originally outer
loop with inner loop completely unrolled by cunrolli. The patch moves
local variable cunrolli to parameter of tree_unroll_loops_completely
and passes it directly from execute of the pass.
2) For pass_cunrolli, cunrolli is set to false when the sibling loop
of an innermost loop is completely unrolled, and it inaccurately
takes the innermost loop as an "outer" loop. The patch adds another
parameter, innermost, to help recognize the "original" innermost loop.
gcc/ChangeLog:
PR tree-optimization/117888
* tree-ssa-loop-ivcanon.cc (try_unroll_loop_completely): Use
cunrolli instead of cunrolli && !loop->inner to check if it's
innermost loop.
(canonicalize_loop_induction_variables): Add new parameter
const_sbitmap innermost, and pass
cunrolli
&& (unsigned) loop->num < SBITMAP_SIZE (innermost)
&& bitmap_bit_p (innermost, loop->num) as "cunrolli" to
try_unroll_loop_completely
(canonicalize_induction_variables): Pass innermost to
canonicalize_loop_induction_variables.
(tree_unroll_loops_completely_1): Add new parameter
const_sbitmap innermost.
(tree_unroll_loops_completely): Move local variable cunrolli
to parameter to indicate it's from pass cunrolli, also track
all "original" innermost loop at the beginning.
gcc/testsuite/ChangeLog:
* gcc.dg/pr117888-2.c: New test.
* gcc.dg/vect/pr117888-1.c: Ditto.
* gcc.dg/tree-ssa/pr83403-1.c: Add
--param max-completely-peeled-insns=300 for arm*-*-*.
* gcc.dg/tree-ssa/pr83403-2.c: Ditto.
|
|
PR117973 covers the aspect of
non-LOGICAL_OP_NON_SHORT_CIRCUIT targets for PR111456, for
which the test-case gcc.dg/tree-ssa/pr111456-1.c started
failing as described in PR117954.
* gcc.dg/tree-ssa/pr117973-1.c: New test.
|
|
This is expected fallout from r15-5646-gd1cf0d7a0f27fd as
described by that commit. The =0 case is covered by
PR117973.
PR tree-optimization/117954
* gcc.dg/tree-ssa/pr111456-1.c: Pass
--param=logical-op-non-short-circuit=1.
|
|
Change location_t to be a 64-bit integer instead of a 32-bit integer in
libcpp.
Also included in this change are the two other patches in the original
series which depended on this one; I am committing them all at once in case
it needs to be reverted later:
-Support for 64-bit location_t: gimple parts
The size of struct gimple increased by 8 bytes with the change in size of
location_t from 32- to 64-bit; adjust the WORD markings in the comments
accordingly. It seems that most of the WORD markings were off by one already,
probably not having been updated after a previous reduction in the size of a
gimple, so they have become retroactively correct again, and only a couple
needed adjustment actually.
Also add a comment that there is now 32 bits of unused padding available in
struct gimple for 64-bit hosts.
-Support for 64-bit location_t: Remove -flarge-source-files
The option -flarge-source-files became unnecessary with 64-bit location_t
and harms performance compared to the new default setting, so silently
ignore it.
libcpp/ChangeLog:
* include/cpplib.h (struct cpp_token): Adjust comment about the
struct size.
* include/line-map.h (location_t): Change typedef from 32-bit to 64-bit
integer.
(LINE_MAP_MAX_COLUMN_NUMBER): Increase size to be appropriate for
64-bit location_t.
(LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES): Likewise.
(LINE_MAP_MAX_LOCATION_WITH_COLS): Likewise.
(LINE_MAP_MAX_LOCATION): Likewise.
(MAX_LOCATION_T): Likewise.
(line_map_suggested_range_bits): Likewise.
(struct line_map): Adjust comment about the struct size.
(struct line_map_macro): Likewise.
(struct line_map_ordinary): Likewise. Rearrange fields to optimize
padding.
gcc/testsuite/ChangeLog:
* g++.dg/diagnostic/pr77949.C: Adapt the test for 64-bit location_t,
when the previously expected failure doesn't actually happen.
* g++.dg/modules/loc-prune-4.C: Adjust the expected output for the
64-bit location_t case.
* gcc.dg/plugin/expensive_selftests_plugin.cc: Don't try to test
the maximum supported column number in 64-bit location_t mode.
* gcc.dg/plugin/location_overflow_plugin.cc: Adjust the base_location
so it can effectively test 64-bit location_t.
gcc/ChangeLog:
* gimple.h (struct gphi): Update word marking comments to reflect
the new size of location_t.
(struct gimple): Likewise. Add a comment about padding.
* common.opt: Mark -flarge-source-files as Ignored.
* common.opt.urls: Regenerate.
* doc/invoke.texi: Remove -flarge-source-files.
* toplev.cc (process_options): Remove support for
-flarge-source-files.
|
|
Avoid-store-forwarding doesn't handle the case where an instruction in
the store-load sequence contains a REG_EH_REGION note, leading to the
insertion of instructions after it, while it should be the last
instruction in the basic block. This causes an ICE when compiling
using `-O -fnon-call-exceptions -favoid-store-forwarding
-fno-forward-propagate -finstrument-functions`.
This patch rejects the transformation when there are instructions in
the sequence that may throw an exception.
PR rtl-optimization/117816
gcc/ChangeLog:
* avoid-store-forwarding.cc (store_forwarding_analyzer::avoid_store_forwarding):
Reject the transformation when having instructions that may
throw exceptions in the sequence.
gcc/testsuite/ChangeLog:
* gcc.dg/pr117816.c: New test.
|
|
The testcase tries to ensure we can elide all permutations when
vectorizing a MAX reduction. For SPARC the issue is that the
MAX reduction isn't supported and since we're trying to fall back
to single-lane SLP the dumps contain VEC_PERM_EXPR for the
interleaving permute lowering. Before all-SLP that wouldn't
be in the dumps when doing non-SLP, but eventually we'd fail to
vectorize so no VEC_PERM_EXPRs would be in the dumps either.
The following adds vect_no_int_min_max to the set of xfails for
this particular scan as well, like the existing check for vectorizing.
PR testsuite/117714
* gcc.dg/vect/slp-reduc-4.c: Add vect_no_int_min_max to the
XFAIL for the VEC_PERM_EXPR scan.
|
|
va_start macro was changed in C23 from the C17 va_start (va_list ap, parmN)
where parmN is the identifier of the last parameter into
va_start (va_list ap, ...) where arguments after ap aren't evaluated.
Late in the C23 development
"If any additional arguments expand to include unbalanced parentheses, or
a preprocessing token that does not convert to a token, the behavior is
undefined."
has been added, plus there is
"NOTE The macro allows additional arguments to be passed for va_start for
compatibility with older versions of the library only."
and
"Additional arguments beyond the first given to the va_start macro may be
expanded and used in unspecified contexts where they are unevaluated. For
example, an implementation diagnoses potentially erroneous input for an
invocation of va_start such as:"
...
va_start(vl, 1, 3.0, "12", xd); // diagnostic encouraged
...
"Simultaneously, va_start usage consistent with older revisions of this
document should not produce a diagnostic:"
...
void neigh (int last_arg, ...) {
va_list vl;
va_start(vl, last_arg); // no diagnostic
The following patch implements the recommended diagnostics.
Until now in C23 mode va_start(v, ...) was defined to
__builtin_va_start(v, 0)
and the extra arguments were silently ignored.
The following patch adds a new builtin in a form of a keyword which
parses the first argument, is silent about the __builtin_c23_va_start (ap)
form, for __builtin_c23_va_start (ap, identifier) looks the identifier up
and is silent if it is the last named parameter (except that it diagnoses
if it has register keyword), otherwise diagnoses it isn't the last one
but something else, and if there is just __builtin_c23_va_start (ap, )
or if __builtin_c23_va_start (ap, is followed by tokens other than
identifier followed by ), it skips over the tokens (with handling of
balanced ()s) until ) and diagnoses the extra tokens.
In all cases in a form of warnings.
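As a rough illustration of the two va_start forms discussed above, the sketch below picks the C23 single-argument form when the compiler reports C23 and falls back to the older two-argument form otherwise. The helper name sum_ints is hypothetical and is not part of the patch or its testcases.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: sum COUNT int arguments that follow COUNT.  */
int
sum_ints (int count, ...)
{
  assert (count >= 0);
  va_list ap;
#if defined __STDC_VERSION__ && __STDC_VERSION__ >= 202311L
  /* C23 form: only the va_list is required.  */
  va_start (ap);
#else
  /* Pre-C23 form: name the last fixed parameter.  */
  va_start (ap, count);
#endif
  int total = 0;
  for (int i = 0; i < count; i++)
    total += va_arg (ap, int);
  va_end (ap);
  return total;
}
```

Both forms are among the uses the quoted text says should not be diagnosed; the new warnings target calls whose extra arguments are anything other than the last named parameter.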
2024-12-05 Jakub Jelinek <jakub@redhat.com>
PR c/107980
gcc/
* ginclude/stdarg.h (va_start): For C23+ change parameters from
v, ... to just ... and define to __builtin_c23_va_start(__VA_ARGS__)
rather than __builtin_va_start(v, 0).
gcc/c-family/
* c-common.h (enum rid): Add RID_C23_VA_START.
* c-common.cc (c_common_reswords): Add __builtin_c23_va_start.
gcc/c/
* c-parser.cc (c_parser_postfix_expression): Handle RID_C23_VA_START.
gcc/testsuite/
* gcc.dg/c23-stdarg-4.c: Expect extra warning.
* gcc.dg/c23-stdarg-6.c: Likewise.
* gcc.dg/c23-stdarg-7.c: Likewise.
* gcc.dg/c23-stdarg-8.c: Likewise.
* gcc.dg/c23-stdarg-10.c: New test.
* gcc.dg/c23-stdarg-11.c: New test.
* gcc.dg/torture/c23-stdarg-split-1a.c: Expect extra warning.
* gcc.dg/torture/c23-stdarg-split-1b.c: Likewise.
|
|
an exit from the loop [PR117243]
After r12-5300-gf98f373dd822b3, phiopt could get the following bb structure:
|
middle-bb -----|
| |
| |----| |
phi<1, 2> | |
cond | |
| | |
|--------+---|
Which was considered 2 loops. The inner loop had an estimate of upper_bound of 8,
due to the original `for (b = 0; b <= 7; b++)`. The outer loop was already an
infinite one.
So phiopt would come along and change the condition to be unconditionally true,
we change the inner loop to being an infinite one but don't reset the estimate
on the loop and cleanup cfg comes along and changes it into one loop but also
does not reset the estimate of the loop. Then the loop unrolling uses the old estimate
and decides to add an unreachable there.
So the fix is when phiopt changes an exit to a loop, reset the estimates, similar to
how cleanupcfg does it when merging some basic blocks.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/117243
PR tree-optimization/116749
gcc/ChangeLog:
* tree-ssa-phiopt.cc (replace_phi_edge_with_variable): Reset loop
estimates if the cond_block was an exit to a loop.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr117243-1.c: New test.
* gcc.dg/torture/pr117243-2.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
PR testsuite/52641
PR testsuite/109123
PR testsuite/114661
PR testsuite/117828
PR testsuite/116481
PR testsuite/91069
gcc/testsuite/
* gcc.dg/Wuse-after-free-pr109123.c: Use size_t
instead of long unsigned int.
* gcc.dg/c23-tag-bitfields-1.c: Requires int32plus.
* gcc.dg/pr114661.c: Same.
* gcc.dg/pr117828.c: Same.
* gcc.dg/flex-array-counted-by-2.c: Use uintptr_t
instead of unsigned long.
* gcc.dg/pr116481.c: Same.
* gcc.dg/lto/tag-1_0.c: Use int32_t instead of int.
* gcc.dg/lto/tag-1_1.c: Use int16_t instead of short.
* gcc.dg/pr91069.c: Require double64.
* gcc.dg/type-convert-var.c: Require double64plus.
|
|
gcc/testsuite/
* gcc.c-torture/execute/ieee/cdivchkd.x: New file.
* gcc.c-torture/execute/ieee/cdivchkf.x: New file.
* gcc.dg/flex-array-counted-by.c: Require wchar.
* gcc.dg/fold-copysign-1.c [avr]: Add -mdouble=64.
|
|
Some diagnostics are issued late, e.g. in avr_print_operand().
This patch uses the insn's location as a proxy for the operand
location. Without the patch, the location is usually input_location,
which points to the closing } of the function body.
gcc/
* config/avr/avr.cc (avr_insn_location): New variable.
(avr_final_prescan_insn): Set avr_insn_location.
(avr_asm_final_postscan_insn): Unset avr_insn_location after last insn.
(avr_print_operand): Pass avr_insn_location to warning_at.
gcc/testsuite/
* gcc.dg/Warray-bounds-33.c: Adjust for avr diagnostics.
* gcc.dg/pr56228.c: Same.
* gcc.dg/pr86124.c: Same.
* gcc.dg/pr94291.c: Same.
* gcc.dg/tree-ssa/pr82059.c: Same.
|
|
Jakub noted that these tests were using dg-skip-if directives that implied the
tests were expected to run under multiple optimization options, which means
they probably should be in gcc.dg/torture rather than in the gcc.dg directory.
This moves the relevant tests from gcc.dg to gcc.dg/torture.
gcc/testsuite
* gcc.dg/crc-linux-1.c: Moved to gcc.dg/torture.
* gcc.dg/crc-linux-2.c: Likewise.
* gcc.dg/crc-linux-4.c: Likewise.
* gcc.dg/crc-linux-5.c: Likewise.
* gcc.dg/crc-not-crc-15.c: Likewise.
* gcc.dg/crc-side-instr-1.c: Likewise.
* gcc.dg/crc-side-instr-2.c: Likewise.
* gcc.dg/crc-side-instr-3.c: Likewise.
* gcc.dg/crc-side-instr-4.c: Likewise.
* gcc.dg/crc-side-instr-5.c: Likewise.
* gcc.dg/crc-side-instr-6.c: Likewise.
* gcc.dg/crc-side-instr-7.c: Likewise.
* gcc.dg/crc-side-instr-8.c: Likewise.
* gcc.dg/crc-side-instr-9.c: Likewise.
* gcc.dg/crc-side-instr-10.c: Likewise.
* gcc.dg/crc-side-instr-11.c: Likewise.
* gcc.dg/crc-side-instr-12.c: Likewise.
* gcc.dg/crc-side-instr-13.c: Likewise.
* gcc.dg/crc-side-instr-14.c: Likewise.
* gcc.dg/crc-side-instr-15.c: Likewise.
* gcc.dg/crc-side-instr-16.c: Likewise.
* gcc.dg/crc-side-instr-17.c: Likewise.
|
|
As noted in bug 117162, C23 changed some rules on UCNs to match C++
(this was a late change agreed in the resolution to CD2 comment
US-032, implementing changes from N3124), which we need to implement.
Allow UCNs below 0xa0 outside identifiers for C, with a
pedwarn-if-pedantic before C23 (and a warning with -Wc11-c23-compat)
except for the always-allowed cases of UCNs for $ @ `. Also as part
of that change, do not allow \u0024 in identifiers as equivalent to $
for C23.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
PR c/117162
libcpp/
* include/cpplib.h (struct cpp_options): Add low_ucns.
* init.cc (struct lang_flags, lang_defaults): Add low_ucns.
(cpp_set_lang): Set low_ucns.
* charset.cc (_cpp_valid_ucn): For C, allow UCNs below 0xa0
outside identifiers, with a pedwarn if pedantic before C23 or a
warning with -Wc11-c23-compat. Do not allow \u0024 in identifiers
for C23.
gcc/testsuite/
* gcc.dg/cpp/c17-ucn-1.c, gcc.dg/cpp/c17-ucn-2.c,
gcc.dg/cpp/c17-ucn-3.c, gcc.dg/cpp/c17-ucn-4.c,
gcc.dg/cpp/c23-ucn-2.c, gcc.dg/cpp/c23-ucnid-2.c: New tests.
* c-c++-common/cpp/delimited-escape-seq-3.c,
c-c++-common/cpp/named-universal-char-escape-3.c,
gcc.dg/cpp/c23-ucn-1.c, gcc.dg/cpp/c2y-delimited-escape-seq-3.c:
Update expected messages
* gcc.dg/cpp/ucs.c: Use -pedantic-errors. Update expected
messages.
|
|
improvements [PR117420]
The following patch implements the with_*_nonzero_bits* cleanups and
improvements I was talking about.
get_nonzero_bits is extended to also handle BIT_AND_EXPR (as a tree or
as SSA_NAME with BIT_AND_EXPR def_stmt), new function is added for the
bits known to be set (get_known_nonzero_bits) and the match.pd predicates
are renamed and adjusted, so that there is no confusion on which one to
use (one is named and documented to be internal), changed so that it can be
used only as a simple predicate, not match some operands, and that it doesn't
try to match twice for the GIMPLE case (where SSA_NAME with integral or pointer
type matches, but SSA_NAME with BIT_AND_EXPR def_stmt matched differently).
Furthermore, get_nonzero_bits just returns the all bits set (or
get_known_nonzero_bits no bits set) fallback if the argument isn't a
SSA_NAME (nor INTEGER_CST or whatever the functions handle explicitly).
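To make the distinction between "possibly nonzero" and "known nonzero" bits concrete, here is a small standalone model. It is only a sketch: struct bit_info and the helper names are hypothetical and do not correspond to GCC's actual data structures. It applies the rule quoted in the ChangeLog below, i.e. an equality is impossible when the right-hand side certainly has a bit set that the left-hand side cannot have.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of what is known about a value's bits:
   POSSIBLE has a 1 wherever the bit might be set,
   KNOWN has a 1 wherever the bit is certainly set.  */
struct bit_info
{
  uint64_t possible;
  uint64_t known;
};

/* Fallback when nothing is known: any bit may be set, none must be.  */
struct bit_info
bits_unknown (void)
{
  return (struct bit_info) { UINT64_MAX, 0 };
}

/* Bits of V & MASK: bits outside MASK can no longer be set.  */
struct bit_info
bits_and_const (struct bit_info v, uint64_t mask)
{
  return (struct bit_info) { v.possible & mask, v.known & mask };
}

/* Bits of V | C: the bits of C are now certainly set.  */
struct bit_info
bits_or_const (struct bit_info v, uint64_t c)
{
  return (struct bit_info) { v.possible | c, v.known | c };
}

/* LHS == RHS can never hold if RHS certainly has a bit set
   that LHS cannot have set.  */
bool
eq_impossible (struct bit_info lhs, struct bit_info rhs)
{
  /* Known-set bits must be a subset of the possibly-set bits.  */
  assert ((lhs.known & ~lhs.possible) == 0);
  assert ((rhs.known & ~rhs.possible) == 0);
  return (~lhs.possible & rhs.known) != 0;
}
```

For example, with x = y & 0x0f on the left and z | 0x10 on the right, bit 4 is known set on the right but cannot be set on the left, so the comparison can be folded to false.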
2024-12-03 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/117420
* tree-ssanames.h (get_known_nonzero_bits): Declare.
* tree-ssanames.cc (get_nonzero_bits): New wrapper function. Move old
definition to ...
(get_nonzero_bits_1): ... here, add static. Change widest_int in
function comment to wide_int.
(get_known_nonzero_bits_1, get_known_nonzero_bits): New functions.
* match.pd (with_possible_nonzero_bits2): Rename to ...
(with_possible_nonzero_bits): ... this. Guard the bit_and case with
#if GENERIC. Change to a normal match predicate without parameters.
Rename the old with_possible_nonzero_bits match to ...
(with_possible_nonzero_bits_1): ... this.
(with_certain_nonzero_bits2): Remove.
(with_known_nonzero_bits_1, with_known_nonzero_bits): New match
predicates.
(X == C (or X & Z == Y | C) is impossible if ~nonzero(X) & C != 0):
Use with_known_nonzero_bits@0 instead of
(with_certain_nonzero_bits2 @1), use with_possible_nonzero_bits@0
instead of (with_possible_nonzero_bits2 @0) and
get_known_nonzero_bits (@1) instead of wi::to_wide (@1).
* gcc.dg/tree-ssa/pr117420.c: New test.
|
|
In the ?ROTATE_EXPR lowering I forgot to handle rotation by 0 correctly.
INTEGER_CST 0 is very unlikely, it would be probably folded away, but
a non-constant count can't use just p - n because then the shift count
is out of bounds for zero.
In the FE I use n == 0 ? x : (x << n) | (x >> (p - n)) but bitintlower
here isn't prepared at this point to have bb split and am not sure if
using COND_EXPR is a good idea either, so the patch uses (p - n) % p.
Perhaps I should just disable lowering the rotate in the FE for the
non-mode precision BITINT_TYPEs too.
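The effect of the (p - n) % p adjustment can be seen on an ordinary 32-bit rotate. This is a sketch under the assumption p = 32 with a fixed-width type rather than a _BitInt, and rotl32 is a hypothetical name; the point is only that the second shift count becomes 0 instead of p when n is 0.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate X left by N bits without branching on N == 0.  */
uint32_t
rotl32 (uint32_t x, unsigned n)
{
  const unsigned p = 32;	/* Precision of the rotated type.  */
  assert (n < p);
  /* For n == 0 this yields 0; a plain p - n would be an
     out-of-bounds shift count of 32.  */
  unsigned m = (p - n) % p;
  return (x << n) | (x >> m);
}
```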
2024-12-03 Jakub Jelinek <jakub@redhat.com>
PR middle-end/117847
* gimple-lower-bitint.cc (gimple_lower_bitint) <case LROTATE_EXPR>:
Use m = (p - n) % p instead of m = p - n for the other shift count.
* gcc.dg/torture/bitint-75.c: New test.
|
|
The function uses atoi, which can silently return valid numbers even for
some too large numbers in the string.
Furthermore, the verification that all the characters in asmspec are
decimal digits can be simplified when using strtoul, we can check just
the first digit and whether the end pointer points to '\0'.
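A minimal sketch of that validation scheme follows. The function names are hypothetical and this is not the code in varasm.cc; the explicit ERANGE check is an assumption of the sketch about how too-large numbers would be rejected.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Parse S as a decimal number made only of digits.
   Return true and store the value in *OUT on success.  */
bool
parse_decimal (const char *s, unsigned long *out)
{
  assert (s != NULL && out != NULL);
  /* strtoul skips leading whitespace and accepts a sign,
     so insist that the first character is a digit.  */
  if (!isdigit ((unsigned char) s[0]))
    return false;
  char *end;
  errno = 0;
  unsigned long value = strtoul (s, &end, 10);
  /* Trailing characters or overflow make the string invalid.  */
  if (*end != '\0' || errno == ERANGE)
    return false;
  *out = value;
  return true;
}

/* Convenience wrapper: return FALLBACK when S is not valid.  */
unsigned long
parse_decimal_or (const char *s, unsigned long fallback)
{
  unsigned long value;
  return parse_decimal (s, &value) ? value : fallback;
}
```

Unlike atoi, this rejects strings with trailing non-digits and does not silently accept a value that does not fit in unsigned long.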
2024-12-03 Heiko Eißfeldt <heiko@hexco.de>
PR middle-end/114540
* varasm.cc (decode_reg_name_and_count): Use strtoul instead of atoi
and simplify verification that the whole asmspec contains just decimal
digits.
* gcc.dg/pr114540.c: New test.
Signed-off-by: Heiko Eißfeldt <heiko@hexco.de>
Co-authored-by: Jakub Jelinek <jakub@redhat.com>
|
|
With SLP forced we fail to consider using single-lane SLP for a case
that we still end up discovering as hybrid (in the PR in question
this is because we run into the SLP discovery limit due to excessive
association).
PR tree-optimization/117874
* tree-vect-loop.cc (vect_analyze_loop_2): When non-SLP
analysis fails, try single-lane SLP.
* gcc.dg/vect/pr117874.c: New testcase.
|
|
Especially in the recent CRC commits, I see
\ No newline at end of file
in almost every second file. So, I went through
the diff between r15-1 and current trunk in gcc/, looking for
additions of such problems which aren't intentional (e.g.
Wtrailing-whitespace* tests had it there intentionally) and
just added the missing newline elsewhere.
2024-12-02 Jakub Jelinek <jakub@redhat.com>
gcc/
* config/mingw/mingw-stdint.h: Add newline at the end of the file.
* config/mingw/winnt-dll.cc: Likewise.
* sym-exec/sym-exec-expression.h: Likewise.
* sym-exec/sym-exec-expression.cc: Likewise.
* sym-exec/sym-exec-condition.cc: Likewise.
* sym-exec/sym-exec-expr-is-a-helper.h: Likewise.
* sym-exec/sym-exec-condition.h: Likewise.
* hwint.cc: Likewise.
* crc-verification.cc: Likewise.
* sarif-spec-urls.def: Likewise.
gcc/testsuite/
* g++.target/aarch64/pr94515-2.C: Add newline at the end of the file.
* g++.target/aarch64/return_address_sign_ab_exception.C: Likewise.
* gcc.target/arm/thumb2-switchstatement.c: Likewise.
* gcc.target/riscv/rvv/base/vssubu-2.c: Likewise.
* gcc.target/riscv/rvv/base/vssubu-1.c: Likewise.
* gcc.target/riscv/and-shift32.c: Likewise.
* gcc.target/riscv/crc-builtin-zbc32.c: Likewise.
* gcc.target/riscv/and-shift64.c: Likewise.
* gcc.target/riscv/xtheadbb-extu-4.c: Likewise.
* gcc.target/i386/avx2-bf16-vec-absneg.c: Likewise.
* gcc.target/i386/avx512f-bf16-vec-absneg.c: Likewise.
* gcc.target/aarch64/cpunative/native_cpu_26.c: Likewise.
* gcc.target/aarch64/cpunative/info_26: Likewise.
* gcc.target/aarch64/cpunative/info_25: Likewise.
* g++.dg/contracts/pr116607.C: Likewise.
* gfortran.dg/pr108889.f90: Likewise.
* gcc.dg/crc-not-crc-14.c: Likewise.
* gcc.dg/crc-from-fedora-packages-13.c: Likewise.
* gcc.dg/crc-not-crc-25.c: Likewise.
* gcc.dg/crc-from-fedora-packages-29.c: Likewise.
* gcc.dg/crc-from-fedora-packages-10.c: Likewise.
* gcc.dg/crc-side-instr-10.c: Likewise.
* gcc.dg/crc-side-instr-1.c: Likewise.
* gcc.dg/crc-side-instr-3.c: Likewise.
* gcc.dg/crc-side-instr-2.c: Likewise.
* gcc.dg/crc-not-crc-17.c: Likewise.
* gcc.dg/crc-from-fedora-packages-7.c: Likewise.
* gcc.dg/crc-side-instr-12.c: Likewise.
* gcc.dg/crc-side-instr-16.c: Likewise.
* gcc.dg/crc-not-crc-16.c: Likewise.
* gcc.dg/crc-from-fedora-packages-4.c: Likewise.
* gcc.dg/crc-not-crc-20.c: Likewise.
* gcc.dg/crc-linux-3.c: Likewise.
* gcc.dg/crc-from-fedora-packages-27.c: Likewise.
* gcc.dg/pr109393.c: Likewise.
* gcc.dg/crc-side-instr-7.c: Likewise.
* gcc.dg/crc-side-instr-4.c: Likewise.
* gcc.dg/tree-ssa/ldexp.c: Likewise.
* gcc.dg/tree-ssa/pr114760-2.c: Likewise.
* gcc.dg/tree-ssa/pr114760-1.c: Likewise.
* gcc.dg/crc-side-instr-15.c: Likewise.
* gcc.dg/crc-side-instr-9.c: Likewise.
* gcc.dg/crc-not-crc-26.c: Likewise.
* gcc.dg/crc-side-instr-8.c: Likewise.
* gcc.dg/crc-not-crc-23.c: Likewise.
* gcc.dg/crc-not-crc-19.c: Likewise.
* gcc.dg/crc-from-fedora-packages-22.c: Likewise.
* gcc.dg/crc-from-fedora-packages-16.c: Likewise.
* gcc.dg/crc-side-instr-11.c: Likewise.
* gcc.dg/crc-from-fedora-packages-5.c: Likewise.
* gcc.dg/crc-not-crc-22.c: Likewise.
* gcc.dg/crc-side-instr-17.c: Likewise.
* gcc.dg/crc-linux-4.c: Likewise.
* gcc.dg/crc-side-instr-14.c: Likewise.
* gcc.dg/crc-not-crc-18.c: Likewise.
* gcc.dg/crc-from-fedora-packages-23.c: Likewise.
* gcc.dg/crc-not-crc-21.c: Likewise.
* gcc.dg/crc-linux-2.c: Likewise.
* gcc.dg/crc-from-fedora-packages-1.c: Likewise.
* gcc.dg/crc-from-fedora-packages-30.c: Likewise.
* gcc.dg/torture/crc-11.c: Likewise.
* gcc.dg/torture/crc-27.c: Likewise.
* gcc.dg/torture/crc-2.c: Likewise.
* gcc.dg/torture/crc-24.c: Likewise.
* gcc.dg/torture/crc-crc8.c: Likewise.
* gcc.dg/torture/crc-crc8-data8-xorOustideFor.c: Likewise.
* gcc.dg/torture/crc-16.c: Likewise.
* gcc.dg/torture/crc-crc64-data64.c: Likewise.
* gcc.dg/crc-from-fedora-packages-32.c: Likewise.
* gcc.dg/crc-side-instr-6.c: Likewise.
* gcc.dg/crc-side-instr-5.c: Likewise.
* gcc.dg/crc-side-instr-13.c: Likewise.
* gcc.dg/crc-not-crc-15.c: Likewise.
* gcc.dg/crc-not-crc-13.c: Likewise.
* gcc.dg/crc-from-fedora-packages-6.c: Likewise.
* gcc.dg/crc-not-crc-24.c: Likewise.
|
|
The PR uncovers unchecked constraints on the ability to code-generate
with SLP but also latent issues with regard to stmt order checking
since loop (early-break) and BB (for quite some time) vectorization
are no longer constrained to single BBs. In particular get_later_stmt
simply compares UIDs of stmts, but that's only reliable when they
are in the same BB.
For the PR in question the problematical case is demoting a SLP node
to external which fails to check we can actually code generate this
in the way we do (using get_later_stmt). The following thus adds
checking that we demote to external only when all defs are from
the same BB.
We no longer vectorize gcc.dg/vect/bb-slp-49.c but the testcase was
for a wrong-code issue and the vectorization done is a no-op.
PR tree-optimization/116352
PR tree-optimization/117876
* tree-vect-slp.cc (vect_slp_can_convert_to_external): New.
(vect_slp_convert_to_external): Call it.
(vect_build_slp_tree_2): Likewise.
* gcc.dg/vect/pr116352.c: New testcase.
* gcc.dg/vect/bb-slp-49.c: Remove vectorization check.
|
|
I have corrected the code formatting as requested. I added new tests to
the existing file phi-opt-11.c, instead of creating a new one.
I performed testing before and after applying the patch on the x86
architecture, and I confirm that there are no new regressions.
The logic and general code of the patch itself have not been changed.
> So the A EQ/NE B expression, we can reverse A and B in the expression
> and still get the same result. But don't we have to be more careful for
> the TRUE/FALSE arms of the ternary? For BIT_AND we need ? a : b for
> BIT_IOR we need ? b : a.
>
> I don't see that gets verified in the existing code or after your
> change. I suspect I'm just missing something here. Can you clarify how
> we verify that BIT_AND gets ? a : b for the true/false arms and that
> BIT_IOR gets ? b : a for the true/false arms?
I did not communicate this clearly last time, but the existing optimization
simplifies the expression "(cond & (a == b)) ? a : b" to the simpler "b".
Similarly, the expression "(cond & (a == b)) ? b : a" simplifies to "a".
Thus, the existing and my optimization perform the following
simplifications:
(cond & (a == b)) ? a : b -> b
(cond & (a == b)) ? b : a -> a
(cond | (a != b)) ? a : b -> a
(cond | (a != b)) ? b : a -> b
For this reason, for BIT_AND_EXPR when we have A EQ B, it is sufficient to
confirm that one operand matches the true/false arm and the other matches
the false/true arm. In both cases, we simplify the expression to the third
operand of the ternary operation (i.e., OP0 ? OP1 : OP2 simplifies to OP2).
This is achieved in the value_replacement function after successfully
setting the value of *code within the rhs_is_fed_for_value_replacement
function to EQ_EXPR.
For BIT_IOR_EXPR, the same check is performed for A NE B, except now
*code remains NE_EXPR, and then value_replacement returns the second
operand (i.e., OP0 ? OP1 : OP2 simplifies to OP1).
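The four simplifications listed above can be checked directly at the source level. The sketch below is only a sanity check of the identities, not of the phiopt implementation; the function name is hypothetical and cond is assumed to be a boolean value (0 or 1).

```c
#include <assert.h>

/* Return 1 if all four value-replacement identities hold for the
   given inputs.  COND must be 0 or 1.  */
int
value_replacement_identities_hold (int cond, int a, int b)
{
  assert (cond == 0 || cond == 1);
  return (((cond & (a == b)) ? a : b) == b
	  && ((cond & (a == b)) ? b : a) == a
	  && ((cond | (a != b)) ? a : b) == a
	  && ((cond | (a != b)) ? b : a) == b);
}
```

In the BIT_AND case the true arm is only taken when a == b, so both arms agree with the false arm; in the BIT_IOR case the false arm is only taken when a == b, so both arms agree with the true arm.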
2024-10-30 Jovan Vukic <Jovan.Vukic@rt-rk.com>
gcc/ChangeLog:
* tree-ssa-phiopt.cc (rhs_is_fed_for_value_replacement): Add a new
optimization opportunity for BIT_IOR_EXPR and a != b.
(operand_equal_for_value_replacement): Ditto.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/phi-opt-11.c: Add more tests.
|
|
gcc/testsuite
* gcc.dg/crc-from-fedora-packages-1.c: New test.
* gcc.dg/crc-from-fedora-packages-2.c: Likewise.
* gcc.dg/crc-from-fedora-packages-3.c: Likewise.
* gcc.dg/crc-from-fedora-packages-4.c: Likewise.
* gcc.dg/crc-from-fedora-packages-5.c: Likewise.
* gcc.dg/crc-from-fedora-packages-6.c: Likewise.
* gcc.dg/crc-from-fedora-packages-7.c: Likewise.
* gcc.dg/crc-from-fedora-packages-8.c: Likewise.
* gcc.dg/crc-from-fedora-packages-9.c: Likewise.
* gcc.dg/crc-from-fedora-packages-10.c: Likewise.
* gcc.dg/crc-from-fedora-packages-11.c: Likewise.
* gcc.dg/crc-from-fedora-packages-12.c: Likewise.
* gcc.dg/crc-from-fedora-packages-13.c: Likewise.
* gcc.dg/crc-from-fedora-packages-14.c: Likewise.
* gcc.dg/crc-from-fedora-packages-15.c: Likewise.
* gcc.dg/crc-from-fedora-packages-16.c: Likewise.
* gcc.dg/crc-from-fedora-packages-17.c: Likewise.
* gcc.dg/crc-from-fedora-packages-18.c: Likewise.
* gcc.dg/crc-from-fedora-packages-19.c: Likewise.
* gcc.dg/crc-from-fedora-packages-20.c: Likewise.
* gcc.dg/crc-from-fedora-packages-21.c: Likewise.
* gcc.dg/crc-from-fedora-packages-22.c: Likewise.
* gcc.dg/crc-from-fedora-packages-23.c: Likewise.
* gcc.dg/crc-from-fedora-packages-24.c: Likewise.
* gcc.dg/crc-from-fedora-packages-25.c: Likewise.
* gcc.dg/crc-from-fedora-packages-26.c: Likewise.
* gcc.dg/crc-from-fedora-packages-27.c: Likewise.
* gcc.dg/crc-from-fedora-packages-28.c: Likewise.
* gcc.dg/crc-from-fedora-packages-29.c: Likewise.
* gcc.dg/crc-from-fedora-packages-30.c: Likewise.
* gcc.dg/crc-from-fedora-packages-31.c: Likewise.
* gcc.dg/crc-from-fedora-packages-32.c: Likewise.
* gcc.dg/crc-linux-1.c: Likewise.
* gcc.dg/crc-linux-2.c: Likewise.
* gcc.dg/crc-linux-3.c: Likewise.
* gcc.dg/crc-linux-4.c: Likewise.
* gcc.dg/crc-linux-5.c: Likewise.
* gcc.dg/crc-not-crc-1.c: Likewise.
* gcc.dg/crc-not-crc-2.c: Likewise.
* gcc.dg/crc-not-crc-3.c: Likewise.
* gcc.dg/crc-not-crc-4.c: Likewise.
* gcc.dg/crc-not-crc-5.c: Likewise.
* gcc.dg/crc-not-crc-6.c: Likewise.
* gcc.dg/crc-not-crc-7.c: Likewise.
* gcc.dg/crc-not-crc-8.c: Likewise.
* gcc.dg/crc-not-crc-9.c: Likewise.
* gcc.dg/crc-not-crc-10.c: Likewise.
* gcc.dg/crc-not-crc-11.c: Likewise.
* gcc.dg/crc-not-crc-12.c: Likewise.
* gcc.dg/crc-not-crc-13.c: Likewise.
* gcc.dg/crc-not-crc-14.c: Likewise.
* gcc.dg/crc-not-crc-15.c: Likewise.
* gcc.dg/crc-not-crc-16.c: Likewise.
* gcc.dg/crc-not-crc-17.c: Likewise.
* gcc.dg/crc-not-crc-18.c: Likewise.
* gcc.dg/crc-not-crc-19.c: Likewise.
* gcc.dg/crc-not-crc-20.c: Likewise.
* gcc.dg/crc-not-crc-21.c: Likewise.
* gcc.dg/crc-not-crc-22.c: Likewise.
* gcc.dg/crc-not-crc-23.c: Likewise.
* gcc.dg/crc-not-crc-24.c: Likewise.
* gcc.dg/crc-not-crc-25.c: Likewise.
* gcc.dg/crc-not-crc-26.c: Likewise.
* gcc.dg/crc-side-instr-1.c: Likewise.
* gcc.dg/crc-side-instr-2.c: Likewise.
* gcc.dg/crc-side-instr-3.c: Likewise.
* gcc.dg/crc-side-instr-4.c: Likewise.
* gcc.dg/crc-side-instr-5.c: Likewise.
* gcc.dg/crc-side-instr-6.c: Likewise.
* gcc.dg/crc-side-instr-7.c: Likewise.
* gcc.dg/crc-side-instr-8.c: Likewise.
* gcc.dg/crc-side-instr-9.c: Likewise.
* gcc.dg/crc-side-instr-10.c: Likewise.
* gcc.dg/crc-side-instr-11.c: Likewise.
* gcc.dg/crc-side-instr-12.c: Likewise.
* gcc.dg/crc-side-instr-13.c: Likewise.
* gcc.dg/crc-side-instr-14.c: Likewise.
* gcc.dg/crc-side-instr-15.c: Likewise.
* gcc.dg/crc-side-instr-16.c: Likewise.
* gcc.dg/crc-side-instr-17.c: Likewise.
* gcc.dg/torture/crc-1.c: Likewise.
* gcc.dg/torture/crc-2.c: Likewise.
* gcc.dg/torture/crc-3.c: Likewise.
* gcc.dg/torture/crc-4.c: Likewise.
* gcc.dg/torture/crc-5.c: Likewise.
* gcc.dg/torture/crc-6.c: Likewise.
* gcc.dg/torture/crc-7.c: Likewise.
* gcc.dg/torture/crc-8.c: Likewise.
* gcc.dg/torture/crc-9.c: Likewise.
* gcc.dg/torture/crc-10.c: Likewise.
* gcc.dg/torture/crc-11.c: Likewise.
* gcc.dg/torture/crc-12.c: Likewise.
* gcc.dg/torture/crc-13.c: Likewise.
* gcc.dg/torture/crc-14.c: Likewise.
* gcc.dg/torture/crc-15.c: Likewise.
* gcc.dg/torture/crc-16.c: Likewise.
* gcc.dg/torture/crc-17.c: Likewise.
* gcc.dg/torture/crc-18.c: Likewise.
* gcc.dg/torture/crc-19.c: Likewise.
* gcc.dg/torture/crc-20.c: Likewise.
* gcc.dg/torture/crc-21.c: Likewise.
* gcc.dg/torture/crc-22.c: Likewise.
* gcc.dg/torture/crc-23.c: Likewise.
* gcc.dg/torture/crc-24.c: Likewise.
* gcc.dg/torture/crc-25.c: Likewise.
* gcc.dg/torture/crc-26.c: Likewise.
* gcc.dg/torture/crc-27.c: Likewise.
* gcc.dg/torture/crc-28.c: Likewise.
* gcc.dg/torture/crc-29.c: Likewise.
* gcc.dg/torture/crc-CCIT-data16-xorOutside_InsideFor.c: Likewise.
* gcc.dg/torture/crc-coremark16-data16.c: Likewise.
* gcc.dg/torture/crc-coremark32-data16.c: Likewise.
* gcc.dg/torture/crc-coremark32-data32.c: Likewise.
* gcc.dg/torture/crc-coremark32-data8.c: Likewise.
* gcc.dg/torture/crc-coremark64-data64.c: Likewise.
* gcc.dg/torture/crc-coremark8-data8.c: Likewise.
* gcc.dg/torture/crc-CCIT-data16.c: Likewise.
* gcc.dg/torture/crc-CCIT-data8.c: Likewise.
* gcc.dg/torture/crc-crc32-data16.c: Likewise.
* gcc.dg/torture/crc-crc32-data24.c: Likewise.
* gcc.dg/torture/crc-crc32-data8.c: Likewise.
* gcc.dg/torture/crc-crc32.c: Likewise.
* gcc.dg/torture/crc-crc64-data32.c: Likewise.
* gcc.dg/torture/crc-crc64-data64.c: Likewise.
* gcc.dg/torture/crc-crc8-data8-loop-xorInFor.c: Likewise.
* gcc.dg/torture/crc-crc8-data8-xorOustideFor.c: Likewise.
* gcc.dg/torture/crc-crc8.c: Likewise.
Co-Authored: Jeff Law <jlaw@ventanamicro.com>
|
|
On default_packed targets like PRU, spurious warnings are emitted:
...workspace/gcc/gcc/testsuite/gcc.dg/pr117806.c:5:3: warning: 'packed' attribute ignored for field of type 'double' [-Wattributes]
Fix by annotating the excess warnings for default_packed targets.
gcc/testsuite/ChangeLog:
* gcc.dg/pr117806.c: Test can spill excess
errors for default_packed targets.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
|
|
Like r15-5063-g6e84a41622f56c, but this is for the `a != 0` case.
After adding vn_valueize to handle the `a ==/!= 0` case
of insert_predicates_for_cond, it would go into an infinite loop
as the Value number for a could be the same as what it
is for the whole expression. This avoids that recursion so there is
no infinite loop here.
Note lim was introducing `bool_var2 = bool_var1 != 0` originally but
with the gimple testcase in -2, there is no dependency on what passes
beforehand will do.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/117859
gcc/ChangeLog:
* tree-ssa-sccvn.cc (insert_predicates_for_cond): If the
valueization for the new lhs for `lhs != 0`
is the same as the old ones, don't recurse.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr117859-1.c: New test.
* gcc.dg/torture/pr117859-2.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
This patch adds optimization of the following patterns:
(zero_extend:M (subreg:N (not:O==M (X:Q==M)))) ->
(xor:M (zero_extend:M (subreg:N (X:M)), mask))
... where the mask is GET_MODE_MASK (N).
For the cases when X:M doesn't have any non-zero bits outside of mode N,
(zero_extend:M (subreg:N (X:M))) could be simplified to just (X:M)
and whole optimization will be:
(zero_extend:M (subreg:N (not:M (X:M)))) ->
(xor:M (X:M, mask))
Patch targets to handle code patterns like:
not a0,a0
andi a0,a0,0xff
to be optimized to:
xori a0,a0,255
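The same equivalence can be written in C for an 8-bit inner mode, as a sketch with hypothetical function names: complementing and then keeping the low byte gives the same result as keeping the low byte and XORing with 0xff, and the truncation can be dropped when the input already fits in 8 bits.

```c
#include <assert.h>
#include <stdint.h>

/* zero_extend (subreg:QI (not x)): complement, keep the low 8 bits.  */
uint32_t
not_then_truncate (uint32_t x)
{
  return (uint8_t) ~x;
}

/* General rewrite: XOR the zero-extended low byte with the mode mask.  */
uint32_t
truncate_then_xor (uint32_t x)
{
  return (uint32_t) (uint8_t) x ^ 0xffu;
}

/* When X has no bits set outside the low 8, the truncation is a
   no-op and a single XOR is enough.  */
uint32_t
xor_only (uint32_t x)
{
  assert (x <= 0xffu);
  return x ^ 0xffu;
}
```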
PR rtl-optimization/112398
PR rtl-optimization/117476
gcc/ChangeLog:
* simplify-rtx.cc (simplify_context::simplify_unary_operation_1):
Simplify ZERO_EXTEND (SUBREG (NOT X)) to XOR (X, GET_MODE_MASK(SUBREG))
when X doesn't have any non-zero bits outside of SUBREG mode.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr112398.c: New test.
* gcc.dg/torture/pr117476-1.c: New test. From Zhendong Su.
* gcc.dg/torture/pr117476-2.c: New test. From Zdenek Sojka.
|
|
As reported in bug 100501 (plus duplicates), the gimplifier ICEs for C
tests involving a statement expression not returning a value as an asm
input; this includes the variant bug 100792 where the statement
expression ends with another asm statement.
The expected diagnostic for this case (as seen for C++ input) is one
coming from the gimplifier and so it seems reasonable to fix the
gimplifier to handle the GENERIC generated for this case by the C
front end, rather than trying to make the C front end detect it
earlier. Thus, adjust the gimplifier to handle a void
expression like other non-lvalues for such a memory input.
Bootstrapped with no regressions for x86_64-pc-linux-gnu. OK to commit?
PR c/100501
PR c/100792
gcc/
* gimplify.cc (gimplify_asm_expr): Handle void expressions for
memory inputs like other non-lvalues.
gcc/testsuite/
* gcc.dg/pr100501-1.c, gcc.dg/pr100792-1.c: New tests.
* gcc.dg/pr48552-1.c, gcc.dg/pr48552-2.c,
gcc.dg/torture/pr98601.c: Update expected errors.
Co-authored-by: Richard Biener <rguenther@suse.de>
|
|
The following patch handles VECTOR_TYPE_P CONSTRUCTORs in
count_nonzero_bytes, including handling them if they have some elements
non-constant.
If there are still some constant elements before it (in the range queried),
we derive info at least from those bytes and consider the rest as unknown.
The first 3 hunks just punt in IMHO problematic cases, the spaghetti code
considers byte_size 0 as unknown size, determine yourself, so if offset
is equal to exp size, there are 0 bytes to consider (so nothing useful
to determine), but using byte_size 0 would mean use any size.
Similarly, native_encode_expr uses int type for offset (and size), so
passing it an offset larger than INT_MAX could be silent miscompilation.
I've guarded the test to just a couple of targets known to handle it,
because e.g. on ia32 without -msse forwprop1 seems to lower the CONSTRUCTOR
into 4 BIT_FIELD_REF stores and I haven't figured out on what exactly
that depends on (e.g. powerpc* is fine on any CPUs, even with -mno-altivec
-mno-vsx, even -m32).
2024-11-30 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/117057
* tree-ssa-strlen.cc (strlen_pass::count_nonzero_bytes): Punt also
when byte_size is equal to offset or nchars. Punt if offset is bigger
than INT_MAX. Handle vector CONSTRUCTOR with some elements constant,
possibly followed by non-constant.
* gcc.dg/strlenopt-32.c: Remove xfail and vect_slp_v2qi_store_unalign
specific scan-tree-dump-times directive.
* gcc.dg/strlenopt-96.c: New test.
|
|
We need to call decl_attributes when creating the fields for a composite
type.
PR c/117806
gcc/c/ChangeLog:
* c-typeck.cc (composite_type_internal): Call decl_attributes.
gcc/testsuite/ChangeLog:
* gcc.dg/pr117806.c: New test.
|
|
c_parser_declarator can return null if there was an error,
but c_parser_gimple_declaration was not ready for that.
This fixes that oversight so we don't get an ICE after the error.
Bootstrapped and tested on x86_64-linux-gnu.
PR c/117749
gcc/c/ChangeLog:
* gimple-parser.cc (c_parser_gimple_declaration): Check
declarator to be non-null.
gcc/testsuite/ChangeLog:
* gcc.dg/gimplefe-55.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
Add missing test for consistency of bit-fields when comparing tagged
types for compatibility.
PR c/117828
gcc/c/ChangeLog:
* c-typeck.cc (tagged_types_tu_compatible_p): Add check.
gcc/testsuite/ChangeLog:
* gcc.dg/c23-tag-bitfields-1.c: New test.
* gcc.dg/pr117828.c: New test.
|
|
The following testcase used to ICE on the trunk since the clear small
object if it has padding optimization before my r15-5746 change,
now it doesn't just because type_has_padding_at_level_p isn't called
on the testcase.
Though, as the testcase shows, structures/unions which contain erroneous
types of one or more of its members can have TREE_TYPE of the FIELD_DECL
error_mark_node, on which we can crash.
E.g. the __builtin_clear_padding lowering just ignores those:
if (TREE_TYPE (field) == error_mark_node)
continue;
and
if (ftype == error_mark_node)
continue;
It doesn't matter much what exactly we do for those cases, as we are going
to fail the compilation anyway, but we shouldn't crash.
So, the following patch ignores those in type_has_padding_at_level_p.
For RECORD_TYPE, we already return if !DECL_SIZE (f) which I think should
cover already the erroneous fields (and we don't use TYPE_SIZE on those).
2024-11-29 Jakub Jelinek <jakub@redhat.com>
PR middle-end/117065
* gimple-fold.cc (type_has_padding_at_level_p) <case UNION_TYPE>:
Also continue if f has error_mark_node type.
* gcc.dg/pr117065.c: New test.
|
|
The r15-4833-ge9ab41b79933 patch had among tons of config/i386
specific changes also important change to the generic code, allowing
also 2 as valid value of the second argument of __builtin_prefetch:
- /* Argument 1 must be either zero or one. */
- if (INTVAL (op1) != 0 && INTVAL (op1) != 1)
+ /* Argument 1 must be 0, 1 or 2. */
+ if (INTVAL (op1) < 0 || INTVAL (op1) > 2)
But the patch failed to document that change in __builtin_prefetch
documentation, and more importantly didn't adjust any of the other
backends to deal with it (my understanding is the expected behavior
is that 2 will be silently handled as 0 unless backends have some
more specific way). Some of the backends would ICE on it, in some
cases gcc_assert failures/gcc_unreachable, in other cases crash later
(e.g. accessing arrays with that value as index and due to accessing
garbage after the array crashing at final.cc time), others treated 2
silently as 0, others treated 2 silently as 1.
And even in the i386 backend there were bugs which caused ICEs.
The patch added some if (write == 0) and write 2 handling into
a (badly indented, maybe that is the reason, if (write == 1) body),
rather than into the else side, so it would be always false.
The new *prefetch_rst2 define_insn only accepts parameters 2 1
(i.e. read-shared with moderate degree of locality), so in order
not to ICE the patch uses it only for __builtin_prefetch (ptr, 2, 1);
or __builtin_ia32_prefetch (ptr, 2, 1, 0); and not for other values
of the parameter. If that isn't what we want and we want it to be used
also for all or some of __builtin_prefetch (ptr, 2, {0,2,3}); and
corresponding __builtin_ia32_prefetch, maybe the define_insn could match
other values.
And there was another problem that -mno-mmx -mno-sse -mmovrs compilation
would ICE on most of the prefetches, so I had to add the FAIL; cases.
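The range check quoted in the diff above can be modelled in plain C. This is a sketch only: rw_valid_old, rw_valid_new and backend_rw_model are hypothetical names, and the last function merely models the behavior the message expects from a backend with no special handling for 2 (treat it as 0). It does not call __builtin_prefetch itself, since compilers without this change may reject or warn about a second argument of 2.

```c
#include <assert.h>
#include <stdbool.h>

/* Old generic check: only 0 (read) and 1 (write) were accepted.  */
static bool
rw_valid_old (long rw)
{
  return !(rw != 0 && rw != 1);
}

/* New generic check: 2 is accepted as well.  */
static bool
rw_valid_new (long rw)
{
  return !(rw < 0 || rw > 2);
}

/* Hypothetical model of a backend without a more specific way to
   handle 2: treat it silently as 0.  */
static long
backend_rw_model (long rw)
{
  return rw == 2 ? 0 : rw;
}
```

With these helpers, rw_valid_old (2) is false while rw_valid_new (2) is true, and backend_rw_model maps 2 to 0 while leaving 0 and 1 unchanged.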
2024-11-29 Jakub Jelinek <jakub@redhat.com>
PR target/117608
* doc/extend.texi (__builtin_prefetch): Document that second
argument may be also 2 and its meaning.
* config/i386/i386.md (prefetch): Remove unreachable code.
Clear write (set operands[1] to const0_rtx) if !TARGET_MOVRS or
if locality is not 1. Formatting fixes.
* config/i386/i386-expand.cc (ix86_expand_builtin): Use IN_RANGE.
Call gen_prefetch even for TARGET_MOVRS.
* config/alpha/alpha.md (prefetch): Treat read_or_write 2 like 0.
* config/mips/mips.md (prefetch): Likewise.
* config/arc/arc.md (prefetch_1, prefetch_2, prefetch_3): Likewise.
* config/riscv/riscv.md (prefetch): Likewise.
* config/loongarch/loongarch.md (prefetch): Likewise.
* config/sparc/sparc.md (prefetch): Likewise. Use IN_RANGE.
* config/ia64/ia64.md (prefetch): Likewise.
* config/pa/pa.md (prefetch): Likewise.
* config/aarch64/aarch64.md (prefetch): Likewise.
* config/rs6000/rs6000.md (prefetch): Likewise.
* gcc.dg/builtin-prefetch-1.c (good): Add tests with second argument
2.
* gcc.target/i386/pr117608-1.c: New test.
* gcc.target/i386/pr117608-2.c: New test.
|
|
When ifcombining contiguous blocks, we can follow forwarder blocks and
reverse conditions to enable combinations, but when there are
intervening blocks, we have to constrain ourselves to paths to the
exit that share the PHI args with all intervening blocks.
Avoiding considering forwarders when intervening blocks were present
would match the preexisting test, but we can do better: record when
a forwarded path corresponds to the outer block's exit path, and
insist on not combining through any path other than the one that was
verified as corresponding. The latter is what this patch implements.
While at it, I've fixed some typos and introduced early testing before
computing the exit path, to avoid computing it when that would be
wasteful, or when avoiding it can enable other sound combinations.
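For readers unfamiliar with ifcombine, here is a source-level illustration of the basic kind of combination the pass performs on nested conditions. It is a sketch, not the testcase from the PR, and it does not reproduce the forwarder or intervening blocks discussed above; the function names nested and combined are hypothetical.

```c
#include <assert.h>

/* Nested form: two bit tests on the same value, with the inner test
   reached only when the outer one succeeds.  */
static int
nested (unsigned a)
{
  if (a & 1)
    if (a & 2)
      return 1;
  return 0;
}

/* Combined form: a single test that is equivalent to the two above.  */
static int
combined (unsigned a)
{
  return (a & 3) == 3;
}
```

Both functions return the same value for every input, which is the property any such combination has to preserve on every path to the exit.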
for gcc/ChangeLog
PR tree-optimization/117723
* tree-ssa-ifcombine.cc (tree_ssa_ifcombine_bb): Record
forwarder blocks in path to exit, and stick to them. Avoid
computing the exit if obviously not needed, and if that
enables additional optimizations.
(tree_ssa_ifcombine_bb_1): Fix typos.
for gcc/testsuite/ChangeLog
PR tree-optimization/117723
* gcc.dg/torture/ifcmb-1.c: New.
|