Consider this case of a bad call to a callback function (perhaps
due to C23 changing the meaning of () in function decls):
struct p {
int (*bar)();
};
void baz() {
struct p q;
q.bar(1);
}
Before this patch the C frontend emits:
t.c: In function 'baz':
t.c:7:5: error: too many arguments to function 'q.bar'
7 | q.bar(1);
| ^
which doesn't give the user much help in terms of knowing what
was expected, and where the relevant declaration is.
With this patch the C frontend emits:
t.c: In function 'baz':
t.c:7:5: error: too many arguments to function 'q.bar'; expected 0, have 1
7 | q.bar(1);
| ^ ~
t.c:2:15: note: declared here
2 | int (*bar)();
| ^~~
(showing the expected vs actual counts, the pertinent field decl, and
underlining the first extraneous argument at the callsite)
Similarly, the patch updates the "too few arguments" case to show
expected vs actual counts. Doing so requires a tweak to the
wording to say "at least" for the case of variadic fns where
previously the C FE emitted e.g.:
s.c: In function 'test':
s.c:5:3: error: too few arguments to function 'callee'
5 | callee ();
| ^~~~~~
s.c:1:6: note: declared here
1 | void callee (const char *, ...);
| ^~~~~~
with this patch it emits:
s.c: In function 'test':
s.c:5:3: error: too few arguments to function 'callee'; expected at least 1, have 0
5 | callee ();
| ^~~~~~
s.c:1:6: note: declared here
1 | void callee (const char *, ...);
| ^~~~~~
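For the first example above, one possible source-level fix is sketched
below. It assumes the callback is really meant to take a single int
argument, which the commit message does not say; the function "twice"
and the return type of baz are made up for illustration.

```c
/* Hypothetical fix for the struct p example: give the callback a
   prototype that matches the call, so the call is valid both before
   and after C23.  */
struct p {
  int (*bar) (int);
};

/* Illustrative callback; not part of the patch.  */
static int
twice (int x)
{
  return 2 * x;
}

int
baz (void)
{
  struct p q = { .bar = twice };
  return q.bar (1);
}
```

With the prototype spelled out, the argument count is checked against
the declared parameter list rather than against an empty one.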
gcc/c/ChangeLog:
PR c/118112
* c-typeck.cc (inform_declaration): Add "function_expr" param and
use it for cases where we couldn't show the function decl to show
field decls for callbacks.
(build_function_call_vec): Add missing auto_diagnostic_group.
Update for new param of inform_declaration.
(convert_arguments): Likewise. For the "too many arguments" case
add the expected vs actual counts to the message, and if we have
it, add the location_t of the first surplus param as a secondary
location within the diagnostic. For the "too few arguments" case,
determine the minimum number of arguments required and add the
expected vs actual counts to the message, tweaking it to "at least"
for variadic functions.
gcc/testsuite/ChangeLog:
PR c/118112
* gcc.dg/too-few-arguments.c: New test.
* gcc.dg/too-many-arguments.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
Query for known relations between the operands, and pass that to
fold_range to help simplify MIN and MAX relations.
Make it type agnostic as well.
Adapt testcases from DOM to EVRP (e suffix) and test floats (f suffix).
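As a rough illustration of the kind of source this helps with (a
hypothetical example, not one of the testcases): once a guard has
established a < b, MIN (a, b) can only be a and MAX (a, b) can only be
b. The macro and function names below are assumptions.

```c
/* Hypothetical illustration; names are not taken from the patch.  */
#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Inside the guarded region the relation a < b is known, so
   MIN (a, b) is just a.  */
int
min_when_less (int a, int b)
{
  if (a < b)
    return MIN (a, b);
  return 0;
}

/* Likewise MAX (a, b) is just b when a < b is known.  */
int
max_when_less (int a, int b)
{
  if (a < b)
    return MAX (a, b);
  return 0;
}
```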
PR tree-optimization/88575
gcc/
* vr-values.cc (simplify_using_ranges::fold_cond_with_ops): Query
relation between op0 and op1 and utilize it.
(simplify_using_ranges::simplify): Do not eliminate float checks.
gcc/testsuite/
* gcc.dg/tree-ssa/minmax-27.c: Disable VRP.
* gcc.dg/tree-ssa/minmax-27e.c: New.
* gcc.dg/tree-ssa/minmax-27f.c: New.
* gcc.dg/tree-ssa/minmax-28.c: Disable VRP.
* gcc.dg/tree-ssa/minmax-28e.c: New.
* gcc.dg/tree-ssa/minmax-28f.c: New.
|
|
[PR118211]
This fixes a latent wrong code issue whereby vect_do_peeling determined
the wrong condition for inserting the vector skip guard. Specifically
in the case where the loop niters are unknown at compile time we used to
check:
!LOOP_REQUIRES_VERSIONING (loop_vinfo)
but LOOP_REQUIRES_VERSIONING is true for loops which we have versioned
for aliasing, and that has nothing to do with prolog peeling. I think
this condition should instead be checking specifically if we aren't
versioning for alignment.
As it stands, when we version for alignment, we don't peel, so the
vector skip guard is indeed redundant in that case.
With the testcase added (reduced from the Fortran frontend) we would
version for aliasing, omit the vector skip guard, and then at runtime we
would peel sufficient iterations for alignment that there wasn't a full
vector iteration left when we entered the vector body, thus overflowing
the output buffer.
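The role of the skip guard can be sketched with a scalar model. This is
not GCC code: VF, prolog_iters and copy_peeled are hypothetical names,
and the inner loop over j merely stands in for one vector operation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: elements handled per vector iteration.  */
static const size_t VF = 4;

/* Scalar iterations to peel so that DST + peel is aligned to a
   VF-element boundary, capped at NITERS.  */
size_t
prolog_iters (const int32_t *dst, size_t niters)
{
  size_t misalign = ((uintptr_t) dst / sizeof (int32_t)) % VF;
  size_t peel = (VF - misalign) % VF;
  return peel < niters ? peel : niters;
}

/* Copy NITERS elements using prolog, vector body and epilogue.  */
void
copy_peeled (int32_t *dst, const int32_t *src, size_t niters)
{
  size_t i = 0;
  size_t peel = prolog_iters (dst, niters);

  /* Prolog: peel scalar iterations for alignment.  */
  for (; i < peel; i++)
    dst[i] = src[i];

  /* Vector skip guard: the body below always runs at least once, so
     it may only be entered if a full vector iteration remains.  */
  if (niters - i >= VF)
    do
      {
        for (size_t j = 0; j < VF; j++)
          dst[i + j] = src[i + j];
        i += VF;
      }
    while (niters - i >= VF);

  /* Epilogue: leftover scalar iterations.  */
  for (; i < niters; i++)
    dst[i] = src[i];
}
```

Dropping the guard in this model would let the do-while body write VF
elements even when fewer remain after the prolog, which is the kind of
overflow described above.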
gcc/ChangeLog:
PR tree-optimization/118211
PR tree-optimization/116126
* tree-vect-loop-manip.cc (vect_do_peeling): Adjust skip_vector
condition to only omit the edge if we're versioning for
alignment.
gcc/testsuite/ChangeLog:
PR tree-optimization/118211
PR tree-optimization/116126
* gcc.dg/vect/vect-early-break_130.c: New test.
|
|
This allows us to vectorize more loops with early exits by forcing
peeling for alignment to make sure that we're guaranteed to be able to
safely read an entire vector iteration without crossing a page boundary.
To make this work for VLA architectures we have to allow compile-time
non-constant target alignments. We also have to override the result of
the target's preferred_vector_alignment hook if it isn't a power-of-two
multiple of the TYPE_SIZE of the chosen vector type.
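The page-boundary argument can be checked with a small model. The page
size and the helper name are assumptions for illustration; the point is
only that an access aligned to its own power-of-two size, where that
size divides the page size, stays within one page.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size, for illustration only.  */
enum { PAGE_BYTES = 4096 };

/* True if the VEC_BYTES-byte access starting at ADDR touches two
   different pages.  */
bool
crosses_page (uintptr_t addr, unsigned vec_bytes)
{
  return addr / PAGE_BYTES != (addr + vec_bytes - 1) / PAGE_BYTES;
}
```

For example, a 16-byte access at offset 4080 stays within the page,
whereas the same access at offset 4088 spills into the next one.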
gcc/ChangeLog:
PR tree-optimization/118211
PR tree-optimization/116126
* tree-vect-data-refs.cc (vect_analyze_early_break_dependences):
Set need_peeling_for_alignment flag on read DRs instead of
failing vectorization. Punt on gathers.
(dr_misalignment): Handle non-constant target alignments.
(vect_compute_data_ref_alignment): If need_peeling_for_alignment
flag is set on the DR, then override the target alignment chosen
by the preferred_vector_alignment hook to choose a safe
alignment.
(vect_supportable_dr_alignment): Override
support_vector_misalignment hook if need_peeling_for_alignment
is set on the DR: in this case we must return
dr_unaligned_unsupported in order to force peeling.
* tree-vect-loop-manip.cc (vect_do_peeling): Allow prolog
peeling by a compile-time non-constant amount.
* tree-vectorizer.h (dr_vec_info): Add new flag
need_peeling_for_alignment.
gcc/testsuite/ChangeLog:
PR tree-optimization/118211
PR tree-optimization/116126
* gcc.dg/tree-ssa/cunroll-13.c: Don't vectorize.
* gcc.dg/tree-ssa/cunroll-14.c: Likewise.
* gcc.dg/unroll-6.c: Likewise.
* gcc.dg/tree-ssa/gen-vect-28.c: Likewise.
* gcc.dg/vect/vect-104.c: Expect to vectorize.
* gcc.dg/vect/vect-early-break_108-pr113588.c: Likewise.
* gcc.dg/vect/vect-early-break_109-pr113588.c: Likewise.
* gcc.dg/vect/vect-early-break_110-pr113467.c: Likewise.
* gcc.dg/vect/vect-early-break_3.c: Likewise.
* gcc.dg/vect/vect-early-break_65.c: Likewise.
* gcc.dg/vect/vect-early-break_8.c: Likewise.
* gfortran.dg/vect/vect-5.f90: Likewise.
* gfortran.dg/vect/vect-8.f90: Likewise.
* gcc.dg/vect/vect-switch-search-line-fast.c:
Co-Authored-By: Tamar Christina <tamar.christina@arm.com>
|
|
Seems I forgot to set_c_expr_source_range for the __builtin_stdc_rotate_*
case (the other __builtin_stdc_* cases already have it), which means
the locations in expr are uninitialized, sometimes causing ICEs in linemap
code, at other times just valgrind errors about uninitialized var uses.
2025-01-10 Jakub Jelinek <jakub@redhat.com>
PR c/118376
* c-parser.cc (c_parser_postfix_expression): Call
set_c_expr_source_range before break in the __builtin_stdc_rotate_*
case.
* gcc.dg/pr118376.c: New test.
|
|
g:d882fe5150fbbeb4e44d007bb4964e5b22373021, posted at
https://gcc.gnu.org/pipermail/gcc-patches/2000-July/033786.html ,
added code to treat:
(set (reg:CC cc) (compare:CC (gt:M (reg:CC cc) 0) (lt:M (reg:CC cc) 0)))
as a nop. This PR shows that that isn't always correct.
The compare in the set above is between two 0/1 booleans (at least
on STORE_FLAG_VALUE==1 targets), whereas the unknown comparison that
produced the incoming (reg:CC cc) is unconstrained; it could be between
arbitrary integers, or even floats. The fold is therefore replacing a
cc that is valid for both signed and unsigned comparisons with one that
is only known to be valid for signed comparisons.
(gt (compare (gt cc 0) (lt cc 0)) 0)
does simplify to:
(gt cc 0)
but:
(gtu (compare (gt cc 0) (lt cc 0)) 0)
does not simplify to:
(gtu cc 0)
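The difference can be modelled in C. This sketch assumes that cc came
from comparing two 32-bit integers x and y, that (gt cc 0) and
(lt cc 0) read it as a signed comparison, and that STORE_FLAG_VALUE is
1; the function names are made up.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of (gt (compare (gt cc 0) (lt cc 0)) 0): a signed comparison
   of two 0/1 booleans.  */
bool
gt_of_booleans (int32_t x, int32_t y)
{
  int32_t gt = x > y;
  int32_t lt = x < y;
  return gt > lt;
}

/* Model of (gt cc 0) on the original operands.  */
bool
gt_of_originals (int32_t x, int32_t y)
{
  return x > y;
}

/* Model of (gtu (compare (gt cc 0) (lt cc 0)) 0): an unsigned
   comparison of the same two 0/1 booleans.  */
bool
gtu_of_booleans (int32_t x, int32_t y)
{
  uint32_t gt = x > y;
  uint32_t lt = x < y;
  return gt > lt;
}

/* Model of (gtu cc 0) on the original operands.  */
bool
gtu_of_originals (int32_t x, int32_t y)
{
  return (uint32_t) x > (uint32_t) y;
}
```

With x = -1 and y = 0 the two signed forms agree (both false), but the
unsigned forms disagree: the booleans give 0 >u 1, which is false,
while the original operands give 0xffffffff >u 0, which is true.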
The optimisation didn't come with a testcase, but it was added for
i386's cmpstrsi, now cmpstrnsi. That probably doesn't matter as much
as it once did, since it's now conditional on -minline-all-stringops.
But the patch is almost 25 years old, so whatever the original
motivation was, it seems likely that other things now rely on it.
It therefore seems better to try to preserve the optimisation on rtl
rather than get rid of it. To do that, we need to look at how the
result of the outer compare is used. We'd therefore be looking at four
instructions (the gt, the lt, the compare, and the use of the compare),
but combine already allows that for 3-instruction combinations thanks
to:
/* If the source is a COMPARE, look for the use of the comparison result
and try to simplify it unless we already have used undobuf.other_insn. */
When applied to boolean inputs, a comparison operator is
effectively a boolean logical operator (AND, ANDNOT, XOR, etc.).
simplify_logical_relational_operation already had code to simplify
logical operators between two comparison results, but:
* It only handled IOR, which doesn't cover all the cases needed here.
The others are easily added.
* It treated comparisons of integers as having an ORDERED/UNORDERED result.
Therefore:
* it would not treat "true for LT + EQ + GT" as "always true" for
comparisons between integers, because the mask excluded the UNORDERED
condition.
* it would try to convert "true for LT + GT" into LTGT even for comparisons
between integers. To prevent an ICE later, the code used:
/* Many comparison codes are only valid for certain mode classes. */
if (!comparison_code_valid_for_mode (code, mode))
return 0;
However, this used the wrong mode, since "mode" is here the integer
result of the comparisons (and the mode of the IOR), not the mode of
the things being compared. Thus the effect was to reject all
floating-point-only codes, even when comparing floats.
I think instead the code should detect whether the comparison is between
integer values and remove UNORDERED from consideration if so. It then
always produces a valid comparison (or an always true/false result),
and so comparison_code_valid_for_mode is not needed. In particular,
"true for LT + GT" becomes NE for comparisons between integers but
remains LTGT for comparisons between floats.
* There was a missing check for whether the comparison inputs had
side effects.
While there, it also seemed worth extending
simplify_logical_relational_operation to unsigned comparisons, since
that makes the testing easier.
As far as that testing goes: the patch exhaustively tests all
combinations of integer comparisons in:
(cmp1 (cmp2 X Y) (cmp3 X Y))
for the 10 integer comparisons, giving 1000 fold attempts in total.
It then tries all combinations of (X in {-1,0,1} x Y in {-1,0,1})
on the result of the fold, giving 9 checks per fold, or 9000 in total.
That's probably more than is typical for self-tests, but it seems to
complete in negligible time, even for -O0 builds.
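The mask idea and the exhaustive style of check can be sketched in a
simplified form. This is not the self-test from the patch: it uses a
hypothetical 3-bit mask over LT/EQ/GT for signed integers only (no
UNORDERED bit and no unsigned codes), and all names are made up.

```c
#include <stdbool.h>

/* Hypothetical mask bits: which orderings make a comparison true.
   For example LT|GT is NE and LT|EQ is LE.  */
enum { M_LT = 1, M_EQ = 2, M_GT = 4 };

/* Exactly one of LT, EQ, GT holds for a pair of integers.  */
unsigned
outcome (int x, int y)
{
  return x < y ? M_LT : x == y ? M_EQ : M_GT;
}

/* The comparison described by MASK is true iff the actual outcome
   is one of the orderings in MASK.  */
bool
eval_mask (unsigned mask, int x, int y)
{
  return (mask & outcome (x, y)) != 0;
}

/* For every pair of masks and every X, Y in {-1, 0, 1}, check that
   IOR, AND and XOR of two comparison results equal the single
   comparison whose mask is combined the same way.  Returns the
   number of mismatches found.  */
unsigned
count_mismatches (void)
{
  unsigned bad = 0;
  for (unsigned m1 = 0; m1 < 8; m1++)
    for (unsigned m2 = 0; m2 < 8; m2++)
      for (int x = -1; x <= 1; x++)
        for (int y = -1; y <= 1; y++)
          {
            bool a = eval_mask (m1, x, y);
            bool b = eval_mask (m2, x, y);
            bad += (a | b) != eval_mask (m1 | m2, x, y);
            bad += (a & b) != eval_mask (m1 & m2, x, y);
            bad += (a ^ b) != eval_mask (m1 ^ m2, x, y);
          }
  return bad;
}
```

In this model "true for LT + GT" is the mask M_LT | M_GT, which behaves
as NE, and "true for LT + EQ + GT" is the all-ones mask, which is
always true for integers.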
gcc/
PR rtl-optimization/117186
* rtl.h (simplify_context::simplify_logical_relational_operation): Add
an invert0_p parameter.
* simplify-rtx.cc (unsigned_comparison_to_mask): New function.
(mask_to_unsigned_comparison): Likewise.
(comparison_code_valid_for_mode): Delete.
(simplify_context::simplify_logical_relational_operation): Add
an invert0_p parameter. Handle AND and XOR. Handle unsigned
comparisons. Handle always-false results. Ignore the low bit
of the mask if the operands are always ordered and remove the
then-redundant check of comparison_code_valid_for_mode. Check
for side-effects in the operands before simplifying them away.
(simplify_context::simplify_binary_operation_1): Remove
simplification of (compare (gt ...) (lt ...)) and instead...
(simplify_context::simplify_relational_operation_1): ...handle
comparisons of comparisons here.
(test_comparisons): New function.
(test_scalar_ops): Call it.
gcc/testsuite/
PR rtl-optimization/117186
* gcc.dg/torture/pr117186.c: New test.
* gcc.target/aarch64/pr117186.c: Likewise.
|
|
There was a cut&pasto in the rr_and_mask's adjustment to match the
combined type: the test on whether there was a mask already was
testing the wrong variable, and then it might crash or otherwise fail
accessing an undefined mask. This only hit with checking enabled,
and rarely at that.
for gcc/ChangeLog
PR tree-optimization/118344
* gimple-fold.cc (fold_truth_andor_for_ifcombine): Fix typo in
rr_and_mask's type adjustment test.
for gcc/testsuite/ChangeLog
PR tree-optimization/118344
* gcc.dg/field-merge-19.c: New.
|
|
A narrowing conversion and a shift both drop bits from the loaded
value, but we need to take into account which one comes first to get
the right number of bits and mask.
Fold when applying masks to parts, comparing the parts, and combining
the results, in the odd chance either mask happens to be zero.
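A small illustration of why the order matters, using made-up function
names and an 8-bit narrowing of a 32-bit value:

```c
#include <stdint.h>

/* Shift first, then narrow: keeps bits 4..11 of X.  */
uint8_t
shift_then_narrow (uint32_t x)
{
  return (uint8_t) (x >> 4);
}

/* Narrow first, then shift: keeps only bits 4..7 of X.  */
uint8_t
narrow_then_shift (uint32_t x)
{
  return (uint8_t) ((uint8_t) x >> 4);
}
```

For x = 0xABCD the first form yields 0xBC and the second 0x0C, so the
bits of the loaded value that survive are 0xFF0 in one case and 0x0F0
in the other.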
for gcc/ChangeLog
PR tree-optimization/118206
* gimple-fold.cc (decode_field_reference): Account for upper
bits dropped by narrowing conversions whether before or after
a right shift.
(fold_truth_andor_for_ifcombine): Fold masks, compares, and
combined results.
for gcc/testsuite/ChangeLog
PR tree-optimization/118206
* gcc.dg/field-merge-18.c: New.
|
|
Explicitly convert constants to the desired types, so as to not elicit
warnings about implicit truncations, nor execution errors, on targets
whose ints are narrower than 32 bits.
for gcc/testsuite/ChangeLog
PR testsuite/118025
* gcc.dg/field-merge-1.c: Convert constants to desired types.
* gcc.dg/field-merge-3.c: Likewise.
* gcc.dg/field-merge-4.c: Likewise.
* gcc.dg/field-merge-5.c: Likewise.
* gcc.dg/field-merge-11.c: Likewise.
* gcc.dg/field-merge-17.c: Don't mess with padding bits.
|
|
A number of tests that check for specific ifcombine transformations
fail on AVR and PRU targets, whose type sizes and alignments aren't
conducive to the expected transformations. Adjust the expectations.
Most execution tests should run successfully regardless of the
transformations, but a few that could conceivably fail if short and
char have the same bit width now check for that and bypass the tests
that would fail.
Conversely, one test that had such a runtime test, but that would work
regardless, no longer has that runtime test, and its types are
narrowed so that the transformations on 32-bit targets are more likely
to be the same as those that used to take place on 64-bit targets.
This latter change is somewhat obviated by a separate patch, but I've
left it in place anyway.
for gcc/testsuite/ChangeLog
PR testsuite/118025
* gcc.dg/field-merge-1.c: Skip BIT_FIELD_REF counting on AVR and PRU.
* gcc.dg/field-merge-3.c: Bypass the test if short doesn't have the
expected size.
* gcc.dg/field-merge-8.c: Likewise.
* gcc.dg/field-merge-9.c: Likewise. Skip optimization counting on
AVR and PRU.
* gcc.dg/field-merge-13.c: Skip optimization counting on AVR and PRU.
* gcc.dg/field-merge-15.c: Likewise.
* gcc.dg/field-merge-17.c: Likewise.
* gcc.dg/field-merge-16.c: Likewise. Drop runtime bypass. Use
smaller types.
* gcc.dg/field-merge-14.c: Add comments.
|
|
On 32-bit hosts, data types with 64-bit alignment aren't getting
treated as desired by ifcombine field-merging: we limit the choice of
modes at BITS_PER_WORD sizes, but when deciding the boundary for a
split, we'd limit the choice only by the alignment, so we wouldn't
even consider a split at an odd 32-bit boundary. Fix that by limiting
the boundary choice by word choice as well.
Now, this would still leave misaligned 64-bit fields in 64-bit-aligned
data structures unhandled by ifcombine on 32-bit hosts. We already
need to load them as double words, and if they're not byte-aligned,
the code gets really ugly, but ifcombine could improve it if it allows
double-word loads as a last resort. I've added that.
for gcc/ChangeLog
* gimple-fold.cc (fold_truth_andor_for_ifcombine): Limit
boundary choice by word size as well. Try aligned double-word
loads as a last resort.
for gcc/testsuite/ChangeLog
* gcc.dg/field-merge-17.c: New.
|
|
PR 118138 and quite a few duplicates that it has acquired in a short
time show that even though we are careful to make sure we do not lose
any bits when newly allowing type conversions in jump-functions, we
still need to perform the fold conversions during IPA constant
propagation and not just at the end in order to properly perform
sign-extensions or zero-extensions as appropriate.
This patch does just that, changing a safety predicate we already use
at the appropriate places to return the necessary type.
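The effect of folding a conversion at each step, rather than only at
the end, can be seen in plain C. This is an illustration with made-up
names, not code from ipa-cp.cc; the out-of-range conversion to int8_t
relies on GCC's documented modulo behaviour.

```c
#include <stdint.h>

/* A value that passes through a narrower signed type is sign-extended
   when widened again.  */
int32_t
via_signed_char (int32_t value)
{
  return (int8_t) value;
}

/* A value that passes through a narrower unsigned type is
   zero-extended when widened again.  */
int32_t
via_unsigned_char (int32_t value)
{
  return (uint8_t) value;
}
```

Starting from the constant 200, the first path yields -56 and the
second 200; starting from -1, they yield -1 and 255. Propagating the
original constant unchanged would miss these differences.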
gcc/ChangeLog:
2025-01-03 Martin Jambor <mjambor@suse.cz>
PR ipa/118138
* ipa-cp.cc (ipacp_value_safe_for_type): Return the appropriate
type instead of a bool, accept NULL_TREE VALUEs.
(propagate_vals_across_arith_jfunc): Use the new returned value of
ipacp_value_safe_for_type.
(propagate_vals_across_ancestor): Likewise.
(propagate_scalar_across_jump_function): Likewise.
gcc/testsuite/ChangeLog:
2025-01-03 Martin Jambor <mjambor@suse.cz>
PR ipa/118138
* gcc.dg/ipa/pr118138.c: New test.
|
|
[PR117866]
In C23 mode the warning about declaring structures and unions in
parameter lists was removed, because it is possible to redeclare
a compatible type elsewhere. This is not the case for incomplete types,
so restore the warning for those types.
PR c/117866
gcc/c/ChangeLog:
* c-decl.cc (get_parm_info): Change condition for warning.
gcc/testsuite/ChangeLog:
* gcc.dg/pr117866.c: New test.
* gcc.dg/strub-pr118007.c: Adapt.
|
|
When the program requests a conversion to a typedef, let's try harder to
remember the new name.
Torbjörn's original patch changed the type of the original expression, but
that seems not generally desirable; we might want either or both of the
original type and the converted-to type to be represented. So this
expresses the name change as a NOP_EXPR.
Compiling stdc++.h, this adds 519 allocations out of 1870k, or 0.28%.
The -Wsuggest-attribute=format change was necessary to do the check before
converting to the target type, which seems like an improvement.
PR c/116060
gcc/c/ChangeLog:
* c-typeck.cc (convert_for_assignment): Make sure left hand side and
right hand side have identical named types to aid diagnostic output.
gcc/cp/ChangeLog:
* call.cc (standard_conversion): Preserve type name in ck_identity.
(maybe_adjust_type_name): New.
(convert_like_internal): Use it.
Handle -Wsuggest-attribute=format here.
(convert_for_arg_passing): Not here.
gcc/testsuite/ChangeLog:
* c-c++-common/analyzer/out-of-bounds-diagram-8.c: Update to
correct type.
* c-c++-common/analyzer/out-of-bounds-diagram-11.c: Likewise.
* gcc.dg/analyzer/out-of-bounds-diagram-10.c: Likewise.
Co-authored-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
|
|
The test case uses a nested function, which is not supported by some
targets.
gcc/testsuite/ChangeLog:
* gcc.dg/pr118325.c: Require effective target trampolines.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
|
|
Just as recent commit 96f5fd3089075b56ea9ea85060213cc4edd7251a
"Move some CRC tests into the gcc.dg/torture directory" moved a few files,
this one also needs to go into torture testing: otherwise, it's compiled
just at '-O0', where the CRC optimization pass isn't active.
gcc/testsuite/
* gcc.dg/crc-linux-3.c: Move...
* gcc.dg/torture/crc-linux-3.c: ... here.
|
|
[PR117927]
As mentioned in the PR, the a r<< (bitsize-b) to a r>> b and similar
match.pd optimization which has been introduced in GCC 15 can introduce
UB which wasn't there before, in particular if b is equal at runtime
to bitsize, then a r<< 0 is turned into a r>> bitsize.
The following patch fixes it by optimizing it early only if VRP
tells us the count isn't equal to the bitsize, and late into
a r>> (b & (bitsize - 1)) if bitsize is a power of two and the
subtraction has a single use; on various targets the masking then goes
away because their rotate instructions do the masking already. The
latter can't be done too early though, because then the
expr_not_equal_to case is basically useless: we would always introduce
the masking and could no longer find out that there was originally no
masking. Even a cfun->after_inlining check would be too early, since
there is a forwprop before vrp, so the patch introduces a new PROP for
the start of the last forwprop pass.
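The equivalence behind the late form can be checked with a C model for
bitsize 32. The helpers are hypothetical and written with explicit
masking so that no shift count reaches 32; they are not the match.pd
pattern itself.

```c
#include <stdint.h>

/* Rotate right by COUNT, with all shift amounts kept below 32.  */
uint32_t
rotr32 (uint32_t a, unsigned count)
{
  count &= 31;
  return (a >> count) | (a << ((32 - count) & 31));
}

/* Rotate left by COUNT, with all shift amounts kept below 32.  */
uint32_t
rotl32 (uint32_t a, unsigned count)
{
  count &= 31;
  return (a << count) | (a >> ((32 - count) & 31));
}

/* Source form: a r<< (32 - b).  */
uint32_t
rotl_by_complement (uint32_t a, unsigned b)
{
  return rotl32 (a, 32 - b);
}

/* Late rewritten form: a r>> (b & 31).  */
uint32_t
rotr_by_masked (uint32_t a, unsigned b)
{
  return rotr32 (a, b & 31);
}
```

For every b from 0 to 32 inclusive the two forms agree, including
b == 32, where the source form is a rotate by 0 and the masked form is
also a rotate by 0 rather than by the full bitsize.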
2025-01-09 Jakub Jelinek <jakub@redhat.com>
Andrew Pinski <quic_apinski@quicinc.com>
PR tree-optimization/117927
* tree-pass.h (PROP_last_full_fold): Define.
* passes.def: Add last= parameters to pass_forwprop.
* tree-ssa-forwprop.cc (pass_forwprop): Add last_p non-static
data member and initialize it in the ctor.
(pass_forwprop::set_pass_param): New method.
(pass_forwprop::execute): Set PROP_last_full_fold in curr_properties
at the start if last_p.
* match.pd (a rrotate (32-b) -> a lrotate b): Only optimize either
if @2 is known not to be equal to prec or if during/after last
forwprop the subtraction has single use and prec is power of two; in
that case transform it into orotate by masked count.
* gcc.dg/tree-ssa/pr117927.c: New test.
|
|
These generally PASS nowadays, without requiring 'alloca'.
There were two exceptions: 'gcc.dg/torture/stackalign/pr16660-2.c',
'gcc.dg/torture/stackalign/pr16660-3.c', where variants specifying
'-O0' or '-fpic' FAILed with 'ptxas' of, for example, CUDA 10.0 due to:
nvptx-as: ptxas terminated with signal 11 [Segmentation fault], core dumped
That however is gone with 'ptxas' of, for example, CUDA 11.5 and later.
gcc/testsuite/
* gcc.dg/torture/stackalign/global-1.c: Re-enable for nvptx.
* gcc.dg/torture/stackalign/inline-1.c: Likewise.
* gcc.dg/torture/stackalign/nested-1.c: Likewise.
* gcc.dg/torture/stackalign/nested-2.c: Likewise.
* gcc.dg/torture/stackalign/nested-4.c: Likewise.
* gcc.dg/torture/stackalign/pr16660-1.c: Likewise.
* gcc.dg/torture/stackalign/pr16660-2.c: Likewise.
* gcc.dg/torture/stackalign/pr16660-3.c: Likewise.
* gcc.dg/torture/stackalign/ret-struct-1.c: Likewise.
* gcc.dg/torture/stackalign/struct-1.c: Likewise.
|
|
When CD-DCE creates forwarders to reduce false control dependences
it fails to update the irreducible state of edge and the forwarder
block in case the forwarder groups both normal (entry) and edges
from an irreducible region (necessarily backedges). This is because
when we split the first edge, if that's a normal edge, the forwarder
and its edge to the original block will not be marked as part
of the irreducible region but when we then redirect an edge from
within the region it becomes so.
The following fixes this up.
Note I think creating a forwarder that includes backedges is
likely not going to help, but at this stage I don't want to change
the CFG going into DCE. For regular loops we'll have a single
entry and a single backedge by means of loop init and will never
create a forwarder - so this is solely happening for irreducible
regions where it's harder to prove that such forwarder doesn't help.
PR tree-optimization/117979
* tree-ssa-dce.cc (make_forwarders_with_degenerate_phis):
Properly update the irreducible region state.
* gcc.dg/torture/pr117979.c: New testcase.
|
|
When nonlocal goto lowering creates an artificial label it fails
to adjust its context.
PR middle-end/118325
* tree-nested.cc (convert_nl_goto_reference): Assign proper
context to generated artificial label.
* gcc.dg/pr118325.c: New testcase.
|
|
When we create the SLP reduction chain epilogue for the PHIs for
the early exit we fail to properly classify the reduction as SLP
reduction chain. The following fixes the corresponding checks.
PR tree-optimization/118269
* tree-vect-loop.cc (vect_create_epilog_for_reduction):
Use the correct stmt for the REDUC_GROUP_FIRST_ELEMENT lookup.
* gcc.dg/vect/vect-early-break_131-pr118269.c: New testcase.
|
|
gcc:
PR target/118137
* config/riscv/sync.md ("lrsc_atomic_exchange<mode>"): Apply mask
to shifted value.
gcc/testsuite:
PR target/118137
* gcc.dg/atomic/pr118137.c: New.
|
|
Many test cases explicitly set -march with extensions which are not
compatible with the E ABI variants. This leads to spurious errors
when the toolchain has been configured for the RV32E base ISA and ILP32E ABI:
spawn ... -march=rv32gc_zbb ...
cc1: error: ILP32E ABI does not support the 'D' extension
Fix by skipping those tests if the toolchain's default ABI is E.
gcc/testsuite/ChangeLog:
* gcc.dg/pr90838-2.c: Skip if default ABI is E.
* gcc.dg/pr90838.c: Ditto.
* gcc.target/riscv/adddibeq.c: Ditto.
* gcc.target/riscv/adddibfeq.c: Ditto.
* gcc.target/riscv/adddibfge.c: Ditto.
* gcc.target/riscv/adddibfgt.c: Ditto.
* gcc.target/riscv/adddibfle.c: Ditto.
* gcc.target/riscv/adddibflt.c: Ditto.
* gcc.target/riscv/adddibfne.c: Ditto.
* gcc.target/riscv/adddibge.c: Ditto.
* gcc.target/riscv/adddibgeu.c: Ditto.
* gcc.target/riscv/adddibgt.c: Ditto.
* gcc.target/riscv/adddibgtu.c: Ditto.
* gcc.target/riscv/adddible.c: Ditto.
* gcc.target/riscv/adddibleu.c: Ditto.
* gcc.target/riscv/adddiblt.c: Ditto.
* gcc.target/riscv/adddibltu.c: Ditto.
* gcc.target/riscv/adddibne.c: Ditto.
* gcc.target/riscv/adddieq.c: Ditto.
* gcc.target/riscv/adddifeq.c: Ditto.
* gcc.target/riscv/adddifge.c: Ditto.
* gcc.target/riscv/adddifgt.c: Ditto.
* gcc.target/riscv/adddifle.c: Ditto.
* gcc.target/riscv/adddiflt.c: Ditto.
* gcc.target/riscv/adddifne.c: Ditto.
* gcc.target/riscv/adddige.c: Ditto.
* gcc.target/riscv/adddigeu.c: Ditto.
* gcc.target/riscv/adddigt.c: Ditto.
* gcc.target/riscv/adddigtu.c: Ditto.
* gcc.target/riscv/adddile.c: Ditto.
* gcc.target/riscv/adddileu.c: Ditto.
* gcc.target/riscv/adddilt.c: Ditto.
* gcc.target/riscv/adddiltu.c: Ditto.
* gcc.target/riscv/adddine.c: Ditto.
* gcc.target/riscv/addsibeq.c: Ditto.
* gcc.target/riscv/addsibfeq.c: Ditto.
* gcc.target/riscv/addsibfge.c: Ditto.
* gcc.target/riscv/addsibfgt.c: Ditto.
* gcc.target/riscv/addsibfle.c: Ditto.
* gcc.target/riscv/addsibflt.c: Ditto.
* gcc.target/riscv/addsibfne.c: Ditto.
* gcc.target/riscv/addsibge.c: Ditto.
* gcc.target/riscv/addsibgeu.c: Ditto.
* gcc.target/riscv/addsibgt.c: Ditto.
* gcc.target/riscv/addsibgtu.c: Ditto.
* gcc.target/riscv/addsible.c: Ditto.
* gcc.target/riscv/addsibleu.c: Ditto.
* gcc.target/riscv/addsiblt.c: Ditto.
* gcc.target/riscv/addsibltu.c: Ditto.
* gcc.target/riscv/addsibne.c: Ditto.
* gcc.target/riscv/addsieq.c: Ditto.
* gcc.target/riscv/addsifeq.c: Ditto.
* gcc.target/riscv/addsifge.c: Ditto.
* gcc.target/riscv/addsifgt.c: Ditto.
* gcc.target/riscv/addsifle.c: Ditto.
* gcc.target/riscv/addsiflt.c: Ditto.
* gcc.target/riscv/addsifne.c: Ditto.
* gcc.target/riscv/addsige.c: Ditto.
* gcc.target/riscv/addsigeu.c: Ditto.
* gcc.target/riscv/addsigt.c: Ditto.
* gcc.target/riscv/addsigtu.c: Ditto.
* gcc.target/riscv/addsile.c: Ditto.
* gcc.target/riscv/addsileu.c: Ditto.
* gcc.target/riscv/addsilt.c: Ditto.
* gcc.target/riscv/addsiltu.c: Ditto.
* gcc.target/riscv/addsine.c: Ditto.
* gcc.target/riscv/cmo-zicboz-zic64-1.c: Ditto.
* gcc.target/riscv/cmpmemsi-2.c: Ditto.
* gcc.target/riscv/cmpmemsi-3.c: Ditto.
* gcc.target/riscv/cmpmemsi.c: Ditto.
* gcc.target/riscv/cpymemsi-2.c: Ditto.
* gcc.target/riscv/cpymemsi-3.c: Ditto.
* gcc.target/riscv/cpymemsi.c: Ditto.
* gcc.target/riscv/crc-builtin-zbc32.c: Ditto.
* gcc.target/riscv/crc-builtin-zbc64.c: Ditto.
* gcc.target/riscv/cset-sext-rtl.c: Ditto.
* gcc.target/riscv/cset-sext-rtl32.c: Ditto.
* gcc.target/riscv/cset-sext-sfb-rtl.c: Ditto.
* gcc.target/riscv/cset-sext-sfb-rtl32.c: Ditto.
* gcc.target/riscv/cset-sext-sfb.c: Ditto.
* gcc.target/riscv/cset-sext-thead-rtl.c: Ditto.
* gcc.target/riscv/cset-sext-thead.c: Ditto.
* gcc.target/riscv/cset-sext-ventana-rtl.c: Ditto.
* gcc.target/riscv/cset-sext-ventana.c: Ditto.
* gcc.target/riscv/cset-sext-zicond-rtl.c: Ditto.
* gcc.target/riscv/cset-sext-zicond-rtl32.c: Ditto.
* gcc.target/riscv/cset-sext-zicond.c: Ditto.
* gcc.target/riscv/cset-sext.c: Ditto.
* gcc.target/riscv/matrix_add_const.c: Ditto.
* gcc.target/riscv/movdibeq-thead.c: Ditto.
* gcc.target/riscv/movdibeq-ventana.c: Ditto.
* gcc.target/riscv/movdibeq-zicond.c: Ditto.
* gcc.target/riscv/movdibeq.c: Ditto.
* gcc.target/riscv/movdibfeq-ventana.c: Ditto.
* gcc.target/riscv/movdibfeq-zicond.c: Ditto.
* gcc.target/riscv/movdibfeq.c: Ditto.
* gcc.target/riscv/movdibfge-ventana.c: Ditto.
* gcc.target/riscv/movdibfge-zicond.c: Ditto.
* gcc.target/riscv/movdibfge.c: Ditto.
* gcc.target/riscv/movdibfgt-ventana.c: Ditto.
* gcc.target/riscv/movdibfgt-zicond.c: Ditto.
* gcc.target/riscv/movdibfgt.c: Ditto.
* gcc.target/riscv/movdibfle-ventana.c: Ditto.
* gcc.target/riscv/movdibfle-zicond.c: Ditto.
* gcc.target/riscv/movdibfle.c: Ditto.
* gcc.target/riscv/movdibflt-ventana.c: Ditto.
* gcc.target/riscv/movdibflt-zicond.c: Ditto.
* gcc.target/riscv/movdibflt.c: Ditto.
* gcc.target/riscv/movdibfne-ventana.c: Ditto.
* gcc.target/riscv/movdibfne-zicond.c: Ditto.
* gcc.target/riscv/movdibfne.c: Ditto.
* gcc.target/riscv/movdibge-thead.c: Ditto.
* gcc.target/riscv/movdibge-ventana.c: Ditto.
* gcc.target/riscv/movdibge-zicond.c: Ditto.
* gcc.target/riscv/movdibge.c: Ditto.
* gcc.target/riscv/movdibgeu-thead.c: Ditto.
* gcc.target/riscv/movdibgeu-ventana.c: Ditto.
* gcc.target/riscv/movdibgeu-zicond.c: Ditto.
* gcc.target/riscv/movdibgeu.c: Ditto.
* gcc.target/riscv/movdibgt-thead.c: Ditto.
* gcc.target/riscv/movdibgt-ventana.c: Ditto.
* gcc.target/riscv/movdibgt-zicond.c: Ditto.
* gcc.target/riscv/movdibgt.c: Ditto.
* gcc.target/riscv/movdibgtu-thead.c: Ditto.
* gcc.target/riscv/movdibgtu-ventana.c: Ditto.
* gcc.target/riscv/movdibgtu-zicond.c: Ditto.
* gcc.target/riscv/movdibgtu.c: Ditto.
* gcc.target/riscv/movdible-thead.c: Ditto.
* gcc.target/riscv/movdible-ventana.c: Ditto.
* gcc.target/riscv/movdible-zicond.c: Ditto.
* gcc.target/riscv/movdible.c: Ditto.
* gcc.target/riscv/movdibleu-thead.c: Ditto.
* gcc.target/riscv/movdibleu-ventana.c: Ditto.
* gcc.target/riscv/movdibleu-zicond.c: Ditto.
* gcc.target/riscv/movdibleu.c: Ditto.
* gcc.target/riscv/movdiblt-thead.c: Ditto.
* gcc.target/riscv/movdiblt-ventana.c: Ditto.
* gcc.target/riscv/movdiblt-zicond.c: Ditto.
* gcc.target/riscv/movdiblt.c: Ditto.
* gcc.target/riscv/movdibltu-thead.c: Ditto.
* gcc.target/riscv/movdibltu-ventana.c: Ditto.
* gcc.target/riscv/movdibltu-zicond.c: Ditto.
* gcc.target/riscv/movdibltu.c: Ditto.
* gcc.target/riscv/movdibne-thead.c: Ditto.
* gcc.target/riscv/movdibne-ventana.c: Ditto.
* gcc.target/riscv/movdibne-zicond.c: Ditto.
* gcc.target/riscv/movdibne.c: Ditto.
* gcc.target/riscv/movdieq-sfb.c: Ditto.
* gcc.target/riscv/movdieq-thead.c: Ditto.
* gcc.target/riscv/movdieq-ventana.c: Ditto.
* gcc.target/riscv/movdieq-zicond.c: Ditto.
* gcc.target/riscv/movdieq.c: Ditto.
* gcc.target/riscv/movdifeq-sfb.c: Ditto.
* gcc.target/riscv/movdifeq-thead.c: Ditto.
* gcc.target/riscv/movdifeq-ventana.c: Ditto.
* gcc.target/riscv/movdifeq-zicond.c: Ditto.
* gcc.target/riscv/movdifeq.c: Ditto.
* gcc.target/riscv/movdifge-sfb.c: Ditto.
* gcc.target/riscv/movdifge-thead.c: Ditto.
* gcc.target/riscv/movdifge-ventana.c: Ditto.
* gcc.target/riscv/movdifge-zicond.c: Ditto.
* gcc.target/riscv/movdifge.c: Ditto.
* gcc.target/riscv/movdifgt-sfb.c: Ditto.
* gcc.target/riscv/movdifgt-thead.c: Ditto.
* gcc.target/riscv/movdifgt-ventana.c: Ditto.
* gcc.target/riscv/movdifgt-zicond.c: Ditto.
* gcc.target/riscv/movdifgt.c: Ditto.
* gcc.target/riscv/movdifle-sfb.c: Ditto.
* gcc.target/riscv/movdifle-thead.c: Ditto.
* gcc.target/riscv/movdifle-ventana.c: Ditto.
* gcc.target/riscv/movdifle-zicond.c: Ditto.
* gcc.target/riscv/movdifle.c: Ditto.
* gcc.target/riscv/movdiflt-sfb.c: Ditto.
* gcc.target/riscv/movdiflt-thead.c: Ditto.
* gcc.target/riscv/movdiflt-ventana.c: Ditto.
* gcc.target/riscv/movdiflt-zicond.c: Ditto.
* gcc.target/riscv/movdiflt.c: Ditto.
* gcc.target/riscv/movdifne-sfb.c: Ditto.
* gcc.target/riscv/movdifne-thead.c: Ditto.
* gcc.target/riscv/movdifne-ventana.c: Ditto.
* gcc.target/riscv/movdifne-zicond.c: Ditto.
* gcc.target/riscv/movdifne.c: Ditto.
* gcc.target/riscv/movdige-sfb.c: Ditto.
* gcc.target/riscv/movdige-thead.c: Ditto.
* gcc.target/riscv/movdige-ventana.c: Ditto.
* gcc.target/riscv/movdige-zicond.c: Ditto.
* gcc.target/riscv/movdige.c: Ditto.
* gcc.target/riscv/movdigeu-sfb.c: Ditto.
* gcc.target/riscv/movdigeu-thead.c: Ditto.
* gcc.target/riscv/movdigeu-ventana.c: Ditto.
* gcc.target/riscv/movdigeu-zicond.c: Ditto.
* gcc.target/riscv/movdigeu.c: Ditto.
* gcc.target/riscv/movdigt-sfb.c: Ditto.
* gcc.target/riscv/movdigt-thead.c: Ditto.
* gcc.target/riscv/movdigt-ventana.c: Ditto.
* gcc.target/riscv/movdigt-zicond.c: Ditto.
* gcc.target/riscv/movdigt.c: Ditto.
* gcc.target/riscv/movdigtu-sfb.c: Ditto.
* gcc.target/riscv/movdigtu-thead.c: Ditto.
* gcc.target/riscv/movdigtu-ventana.c: Ditto.
* gcc.target/riscv/movdigtu-zicond.c: Ditto.
* gcc.target/riscv/movdigtu.c: Ditto.
* gcc.target/riscv/movdile-sfb.c: Ditto.
* gcc.target/riscv/movdile-thead.c: Ditto.
* gcc.target/riscv/movdile-ventana.c: Ditto.
* gcc.target/riscv/movdile-zicond.c: Ditto.
* gcc.target/riscv/movdile.c: Ditto.
* gcc.target/riscv/movdileu-sfb.c: Ditto.
* gcc.target/riscv/movdileu-thead.c: Ditto.
* gcc.target/riscv/movdileu-ventana.c: Ditto.
* gcc.target/riscv/movdileu-zicond.c: Ditto.
* gcc.target/riscv/movdileu.c: Ditto.
* gcc.target/riscv/movdilt-sfb.c: Ditto.
* gcc.target/riscv/movdilt-thead.c: Ditto.
* gcc.target/riscv/movdilt-ventana.c: Ditto.
* gcc.target/riscv/movdilt-zicond.c: Ditto.
* gcc.target/riscv/movdilt.c: Ditto.
* gcc.target/riscv/movdiltu-sfb.c: Ditto.
* gcc.target/riscv/movdiltu-thead.c: Ditto.
* gcc.target/riscv/movdiltu-ventana.c: Ditto.
* gcc.target/riscv/movdiltu-zicond.c: Ditto.
* gcc.target/riscv/movdiltu.c: Ditto.
* gcc.target/riscv/movdine-sfb.c: Ditto.
* gcc.target/riscv/movdine-thead.c: Ditto.
* gcc.target/riscv/movdine-ventana.c: Ditto.
* gcc.target/riscv/movdine-zicond.c: Ditto.
* gcc.target/riscv/movdine.c: Ditto.
* gcc.target/riscv/movsibeq-thead.c: Ditto.
* gcc.target/riscv/movsibeq-ventana.c: Ditto.
* gcc.target/riscv/movsibeq-zicond.c: Ditto.
* gcc.target/riscv/movsibeq.c: Ditto.
* gcc.target/riscv/movsibfeq-ventana.c: Ditto.
* gcc.target/riscv/movsibfeq-zicond.c: Ditto.
* gcc.target/riscv/movsibfeq.c: Ditto.
* gcc.target/riscv/movsibfge-ventana.c: Ditto.
* gcc.target/riscv/movsibfge-zicond.c: Ditto.
* gcc.target/riscv/movsibfge.c: Ditto.
* gcc.target/riscv/movsibfgt-ventana.c: Ditto.
* gcc.target/riscv/movsibfgt-zicond.c: Ditto.
* gcc.target/riscv/movsibfgt.c: Ditto.
* gcc.target/riscv/movsibfle-ventana.c: Ditto.
* gcc.target/riscv/movsibfle-zicond.c: Ditto.
* gcc.target/riscv/movsibfle.c: Ditto.
* gcc.target/riscv/movsibflt-ventana.c: Ditto.
* gcc.target/riscv/movsibflt-zicond.c: Ditto.
* gcc.target/riscv/movsibflt.c: Ditto.
* gcc.target/riscv/movsibfne-ventana.c: Ditto.
* gcc.target/riscv/movsibfne-zicond.c: Ditto.
* gcc.target/riscv/movsibfne.c: Ditto.
* gcc.target/riscv/movsibge-thead.c: Ditto.
* gcc.target/riscv/movsibge-ventana.c: Ditto.
* gcc.target/riscv/movsibge-zicond.c: Ditto.
* gcc.target/riscv/movsibge.c: Ditto.
* gcc.target/riscv/movsibgeu-thead.c: Ditto.
* gcc.target/riscv/movsibgeu-ventana.c: Ditto.
* gcc.target/riscv/movsibgeu-zicond.c: Ditto.
* gcc.target/riscv/movsibgeu.c: Ditto.
* gcc.target/riscv/movsibgt-thead.c: Ditto.
* gcc.target/riscv/movsibgt-ventana.c: Ditto.
* gcc.target/riscv/movsibgt-zicond.c: Ditto.
* gcc.target/riscv/movsibgt.c: Ditto.
* gcc.target/riscv/movsibgtu-thead.c: Ditto.
* gcc.target/riscv/movsibgtu-ventana.c: Ditto.
* gcc.target/riscv/movsibgtu-zicond.c: Ditto.
* gcc.target/riscv/movsibgtu.c: Ditto.
* gcc.target/riscv/movsible-thead.c: Ditto.
* gcc.target/riscv/movsible-ventana.c: Ditto.
* gcc.target/riscv/movsible-zicond.c: Ditto.
* gcc.target/riscv/movsible.c: Ditto.
* gcc.target/riscv/movsibleu-thead.c: Ditto.
* gcc.target/riscv/movsibleu-ventana.c: Ditto.
* gcc.target/riscv/movsibleu-zicond.c: Ditto.
* gcc.target/riscv/movsibleu.c: Ditto.
* gcc.target/riscv/movsiblt-thead.c: Ditto.
* gcc.target/riscv/movsiblt-ventana.c: Ditto.
* gcc.target/riscv/movsiblt-zicond.c: Ditto.
* gcc.target/riscv/movsiblt.c: Ditto.
* gcc.target/riscv/movsibltu-thead.c: Ditto.
* gcc.target/riscv/movsibltu-ventana.c: Ditto.
* gcc.target/riscv/movsibltu-zicond.c: Ditto.
* gcc.target/riscv/movsibltu.c: Ditto.
* gcc.target/riscv/movsibne-thead.c: Ditto.
* gcc.target/riscv/movsibne-ventana.c: Ditto.
* gcc.target/riscv/movsibne-zicond.c: Ditto.
* gcc.target/riscv/movsibne.c: Ditto.
* gcc.target/riscv/movsieq-sfb.c: Ditto.
* gcc.target/riscv/movsieq-thead.c: Ditto.
* gcc.target/riscv/movsieq-ventana.c: Ditto.
* gcc.target/riscv/movsieq-zicond.c: Ditto.
* gcc.target/riscv/movsieq.c: Ditto.
* gcc.target/riscv/movsifeq-sfb.c: Ditto.
* gcc.target/riscv/movsifeq-thead.c: Ditto.
* gcc.target/riscv/movsifeq-ventana.c: Ditto.
* gcc.target/riscv/movsifeq-zicond.c: Ditto.
* gcc.target/riscv/movsifeq.c: Ditto.
* gcc.target/riscv/movsifge-sfb.c: Ditto.
* gcc.target/riscv/movsifge-thead.c: Ditto.
* gcc.target/riscv/movsifge-ventana.c: Ditto.
* gcc.target/riscv/movsifge-zicond.c: Ditto.
* gcc.target/riscv/movsifge.c: Ditto.
* gcc.target/riscv/movsifgt-sfb.c: Ditto.
* gcc.target/riscv/movsifgt-thead.c: Ditto.
* gcc.target/riscv/movsifgt-ventana.c: Ditto.
* gcc.target/riscv/movsifgt-zicond.c: Ditto.
* gcc.target/riscv/movsifgt.c: Ditto.
* gcc.target/riscv/movsifle-sfb.c: Ditto.
* gcc.target/riscv/movsifle-thead.c: Ditto.
* gcc.target/riscv/movsifle-ventana.c: Ditto.
* gcc.target/riscv/movsifle-zicond.c: Ditto.
* gcc.target/riscv/movsifle.c: Ditto.
* gcc.target/riscv/movsiflt-sfb.c: Ditto.
* gcc.target/riscv/movsiflt-thead.c: Ditto.
* gcc.target/riscv/movsiflt-ventana.c: Ditto.
* gcc.target/riscv/movsiflt-zicond.c: Ditto.
* gcc.target/riscv/movsiflt.c: Ditto.
* gcc.target/riscv/movsifne-sfb.c: Ditto.
* gcc.target/riscv/movsifne-thead.c: Ditto.
* gcc.target/riscv/movsifne-ventana.c: Ditto.
* gcc.target/riscv/movsifne-zicond.c: Ditto.
* gcc.target/riscv/movsifne.c: Ditto.
* gcc.target/riscv/movsige-sfb.c: Ditto.
* gcc.target/riscv/movsige-thead.c: Ditto.
* gcc.target/riscv/movsige-ventana.c: Ditto.
* gcc.target/riscv/movsige-zicond.c: Ditto.
* gcc.target/riscv/movsige.c: Ditto.
* gcc.target/riscv/movsigeu-sfb.c: Ditto.
* gcc.target/riscv/movsigeu-thead.c: Ditto.
* gcc.target/riscv/movsigeu-ventana.c: Ditto.
* gcc.target/riscv/movsigeu-zicond.c: Ditto.
* gcc.target/riscv/movsigeu.c: Ditto.
* gcc.target/riscv/movsigt-sfb.c: Ditto.
* gcc.target/riscv/movsigt-thead.c: Ditto.
* gcc.target/riscv/movsigt-ventana.c: Ditto.
* gcc.target/riscv/movsigt-zicond.c: Ditto.
* gcc.target/riscv/movsigt.c: Ditto.
* gcc.target/riscv/movsigtu-sfb.c: Ditto.
* gcc.target/riscv/movsigtu-thead.c: Ditto.
* gcc.target/riscv/movsigtu-ventana.c: Ditto.
* gcc.target/riscv/movsigtu-zicond.c: Ditto.
* gcc.target/riscv/movsigtu.c: Ditto.
* gcc.target/riscv/movsile-sfb.c: Ditto.
* gcc.target/riscv/movsile-thead.c: Ditto.
* gcc.target/riscv/movsile-ventana.c: Ditto.
* gcc.target/riscv/movsile-zicond.c: Ditto.
* gcc.target/riscv/movsile.c: Ditto.
* gcc.target/riscv/movsileu-sfb.c: Ditto.
* gcc.target/riscv/movsileu-thead.c: Ditto.
* gcc.target/riscv/movsileu-ventana.c: Ditto.
* gcc.target/riscv/movsileu-zicond.c: Ditto.
* gcc.target/riscv/movsileu.c: Ditto.
* gcc.target/riscv/movsilt-sfb.c: Ditto.
* gcc.target/riscv/movsilt-thead.c: Ditto.
* gcc.target/riscv/movsilt-ventana.c: Ditto.
* gcc.target/riscv/movsilt-zicond.c: Ditto.
* gcc.target/riscv/movsilt.c: Ditto.
* gcc.target/riscv/movsiltu-sfb.c: Ditto.
* gcc.target/riscv/movsiltu-thead.c: Ditto.
* gcc.target/riscv/movsiltu-ventana.c: Ditto.
* gcc.target/riscv/movsiltu-zicond.c: Ditto.
* gcc.target/riscv/movsiltu.c: Ditto.
* gcc.target/riscv/movsine-sfb.c: Ditto.
* gcc.target/riscv/movsine-thead.c: Ditto.
* gcc.target/riscv/movsine-ventana.c: Ditto.
* gcc.target/riscv/movsine-zicond.c: Ditto.
* gcc.target/riscv/movsine.c: Ditto.
* gcc.target/riscv/pr111501.c: Ditto.
* gcc.target/riscv/pr115921.c: Ditto.
* gcc.target/riscv/pr116033.c: Ditto.
* gcc.target/riscv/pr116035-1.c: Ditto.
* gcc.target/riscv/pr116035-2.c: Ditto.
* gcc.target/riscv/pr116131.c: Ditto.
* gcc.target/riscv/reg_subreg_costs.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-slide.c: Ditto.
* gcc.target/riscv/rvv/xtheadvector.c: Ditto.
* gcc.target/riscv/rvv/xtheadvector/pr114194.c: Ditto.
* gcc.target/riscv/sign-extend-rshift-32.c: Ditto.
* gcc.target/riscv/sign-extend-rshift-64.c: Ditto.
* gcc.target/riscv/sign-extend-rshift.c: Ditto.
* gcc.target/riscv/synthesis-1.c: Ditto.
* gcc.target/riscv/synthesis-10.c: Ditto.
* gcc.target/riscv/synthesis-11.c: Ditto.
* gcc.target/riscv/synthesis-12.c: Ditto.
* gcc.target/riscv/synthesis-13.c: Ditto.
* gcc.target/riscv/synthesis-14.c: Ditto.
* gcc.target/riscv/synthesis-15.c: Ditto.
* gcc.target/riscv/synthesis-16.c: Ditto.
* gcc.target/riscv/synthesis-2.c: Ditto.
* gcc.target/riscv/synthesis-3.c: Ditto.
* gcc.target/riscv/synthesis-4.c: Ditto.
* gcc.target/riscv/synthesis-5.c: Ditto.
* gcc.target/riscv/synthesis-6.c: Ditto.
* gcc.target/riscv/synthesis-7.c: Ditto.
* gcc.target/riscv/synthesis-8.c: Ditto.
* gcc.target/riscv/synthesis-9.c: Ditto.
* gcc.target/riscv/target-attr-16.c: Ditto.
* gcc.target/riscv/target-attr-norelax.c: Ditto.
* gcc.target/riscv/xtheadba-addsl.c: Ditto.
* gcc.target/riscv/xtheadba.c: Ditto.
* gcc.target/riscv/xtheadbb-ext-1.c: Ditto.
* gcc.target/riscv/xtheadbb-ext-2.c: Ditto.
* gcc.target/riscv/xtheadbb-ext-3.c: Ditto.
* gcc.target/riscv/xtheadbb-ext.c: Ditto.
* gcc.target/riscv/xtheadbb-extu-1.c: Ditto.
* gcc.target/riscv/xtheadbb-extu-2.c: Ditto.
* gcc.target/riscv/xtheadbb-extu-4.c: Ditto.
* gcc.target/riscv/xtheadbb-extu.c: Ditto.
* gcc.target/riscv/xtheadbb-ff1.c: Ditto.
* gcc.target/riscv/xtheadbb-rev.c: Ditto.
* gcc.target/riscv/xtheadbb-srri.c: Ditto.
* gcc.target/riscv/xtheadbb-strcmp.c: Ditto.
* gcc.target/riscv/xtheadbb-strlen-unaligned.c: Ditto.
* gcc.target/riscv/xtheadbb-strlen.c: Ditto.
* gcc.target/riscv/xtheadbb.c: Ditto.
* gcc.target/riscv/xtheadbs-tst.c: Ditto.
* gcc.target/riscv/xtheadbs.c: Ditto.
* gcc.target/riscv/xtheadcmo.c: Ditto.
* gcc.target/riscv/xtheadcondmov-indirect.c: Ditto.
* gcc.target/riscv/xtheadcondmov-mveqz-imm-eqz.c: Ditto.
* gcc.target/riscv/xtheadcondmov-mveqz-imm-not.c: Ditto.
* gcc.target/riscv/xtheadcondmov-mveqz-reg-eqz.c: Ditto.
* gcc.target/riscv/xtheadcondmov-mveqz-reg-not.c: Ditto.
* gcc.target/riscv/xtheadcondmov-mvnez-imm-cond.c: Ditto.
* gcc.target/riscv/xtheadcondmov-mvnez-imm-nez.c: Ditto.
* gcc.target/riscv/xtheadcondmov-mvnez-reg-cond.c: Ditto.
* gcc.target/riscv/xtheadcondmov-mvnez-reg-nez.c: Ditto.
* gcc.target/riscv/xtheadcondmov.c: Ditto.
* gcc.target/riscv/xtheadfmemidx-without-xtheadmemidx.c: Ditto.
* gcc.target/riscv/xtheadfmemidx.c: Ditto.
* gcc.target/riscv/xtheadfmv.c: Ditto.
* gcc.target/riscv/xtheadint.c: Ditto.
* gcc.target/riscv/xtheadmac-mula-muls.c: Ditto.
* gcc.target/riscv/xtheadmac.c: Ditto.
* gcc.target/riscv/xtheadmemidx-index-update.c: Ditto.
* gcc.target/riscv/xtheadmemidx-index-xtheadbb-update.c: Ditto.
* gcc.target/riscv/xtheadmemidx-index-xtheadbb.c: Ditto.
* gcc.target/riscv/xtheadmemidx-index.c: Ditto.
* gcc.target/riscv/xtheadmemidx-modify-xtheadbb.c: Ditto.
* gcc.target/riscv/xtheadmemidx-modify.c: Ditto.
* gcc.target/riscv/xtheadmemidx-uindex-update.c: Ditto.
* gcc.target/riscv/xtheadmemidx-uindex-xtheadbb-update.c: Ditto.
* gcc.target/riscv/xtheadmemidx-uindex-xtheadbb.c: Ditto.
* gcc.target/riscv/xtheadmemidx-uindex.c: Ditto.
* gcc.target/riscv/xtheadmemidx.c: Ditto.
* gcc.target/riscv/xtheadmempair-1.c: Ditto.
* gcc.target/riscv/xtheadmempair-2.c: Ditto.
* gcc.target/riscv/xtheadmempair-3.c: Ditto.
* gcc.target/riscv/xtheadmempair-4.c: Ditto.
* gcc.target/riscv/xtheadmempair-interrupt-fcsr.c: Ditto.
* gcc.target/riscv/xtheadmempair.c: Ditto.
* gcc.target/riscv/xtheadsync.c: Ditto.
* gcc.target/riscv/za-ext.c: Ditto.
* gcc.target/riscv/zawrs.c: Ditto.
* gcc.target/riscv/zbb-strcmp-disabled-2.c: Ditto.
* gcc.target/riscv/zbb-strcmp-disabled.c: Ditto.
* gcc.target/riscv/zbb-strcmp-limit.c: Ditto.
* gcc.target/riscv/zbb-strcmp-unaligned.c: Ditto.
* gcc.target/riscv/zbb-strcmp.c: Ditto.
* gcc.target/riscv/zbb-strlen-disabled-2.c: Ditto.
* gcc.target/riscv/zbb-strlen-disabled.c: Ditto.
* gcc.target/riscv/zbb-strlen-unaligned.c: Ditto.
* gcc.target/riscv/zbb-strlen.c: Ditto.
* gcc.target/riscv/zero-extend-rshift-32.c: Ditto.
* gcc.target/riscv/zero-extend-rshift-64.c: Ditto.
* gcc.target/riscv/zero-extend-rshift.c: Ditto.
* gcc.target/riscv/zi-ext.c: Ditto.
* gcc.target/riscv/zvbb.c: Ditto.
* gcc.target/riscv/zvbc.c: Ditto.
* gcc.target/riscv/zvkb.c: Ditto.
* gcc.target/riscv/zvkg.c: Ditto.
* gcc.target/riscv/zvkn-1.c: Ditto.
* gcc.target/riscv/zvkn.c: Ditto.
* gcc.target/riscv/zvknc-1.c: Ditto.
* gcc.target/riscv/zvknc-2.c: Ditto.
* gcc.target/riscv/zvknc.c: Ditto.
* gcc.target/riscv/zvkned.c: Ditto.
* gcc.target/riscv/zvkng-1.c: Ditto.
* gcc.target/riscv/zvkng-2.c: Ditto.
* gcc.target/riscv/zvkng.c: Ditto.
* gcc.target/riscv/zvknha.c: Ditto.
* gcc.target/riscv/zvknhb.c: Ditto.
* gcc.target/riscv/zvks-1.c: Ditto.
* gcc.target/riscv/zvks.c: Ditto.
* gcc.target/riscv/zvksc-1.c: Ditto.
* gcc.target/riscv/zvksc-2.c: Ditto.
* gcc.target/riscv/zvksc.c: Ditto.
* gcc.target/riscv/zvksed.c: Ditto.
* gcc.target/riscv/zvksg-1.c: Ditto.
* gcc.target/riscv/zvksg-2.c: Ditto.
* gcc.target/riscv/zvksg.c: Ditto.
* gcc.target/riscv/zvksh.c: Ditto.
* gcc.target/riscv/zvkt.c: Ditto.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
|
|
The early scheduler takes up ~33% of the total build time; however, it doesn't
provide a meaningful performance gain. This is partly because modern OoO cores
need far less scheduling, partly because the scheduler tends to create many
unnecessary spills by increasing register pressure. Building applications
56% faster is far more useful than ~0.1% improvement on SPEC, so switch off
early scheduling on AArch64. Codesize reduces by ~0.2%.
Fix various tests that depend on scheduling by explicitly adding -fschedule-insns.
gcc:
* common/config/aarch64/aarch64-common.cc: Switch off fschedule_insns.
gcc/testsuite:
* gcc.dg/guality/pr36728-3.c: Remove XFAIL.
* gcc.dg/guality/pr68860-1.c: Likewise.
* gcc.dg/guality/pr68860-2.c: Likewise.
* gcc.target/aarch64/ldp_aligned.c: Fix test.
* gcc.target/aarch64/ldp_always.c: Likewise.
* gcc.target/aarch64/ldp_stp_10.c: Add -fschedule-insns.
* gcc.target/aarch64/ldp_stp_12.c: Likewise.
* gcc.target/aarch64/ldp_stp_13.c: Remove test.
* gcc.target/aarch64/ldp_stp_21.c: Add -fschedule-insns.
* gcc.target/aarch64/ldp_stp_8.c: Likewise.
* gcc.target/aarch64/ldp_vec_v2sf.c: Likewise.
* gcc.target/aarch64/ldp_vec_v2si.c: Likewise.
* gcc.target/aarch64/test_frame_16.c: Fix test.
* gcc.target/aarch64/sve/vcond_12.c: Add -fschedule-insns.
* gcc.target/aarch64/sve/acle/general/ldff1_3.c: Likewise.
|
|
When the patch for PR114074 was applied we saw a good boost in exchange2.
This boost was partially caused by a simplification of the addressing modes.
With the patch applied, IV opts saw the following form for the base address:
Base: (integer(kind=4) *) &block + ((sizetype) ((unsigned long) l0_19(D) *
324) + 36)
vs what we normally get:
Base: (integer(kind=4) *) &block + ((sizetype) ((integer(kind=8)) l0_19(D)
* 81) + 9) * 4
This is because the patch promoted multiplies where one operand is a constant
from a signed multiply to an unsigned one, to attempt to fold away the constant.
This patch attempts the same but due to the various problems with SCEV and
niters not being able to analyze the resulting forms (i.e. PR114322) we can't
do it during SCEV or in the general form like in fold-const like extract_muldiv
attempts.
Instead this applies the simplification during IVopts initialization when we
create the IV. This allows IV opts to see the simplified form without
influencing the rest of the compiler.
As mentioned in PR114074, it would be good to fix the missed optimization in the
other passes so we can perform this in general.
The reason this has a big impact on Fortran code is that Fortran doesn't seem to
have unsigned integer types. As such, all its addresses are created with
signed types, and folding does not happen on them due to the possible overflow.
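The arithmetic behind the fold can be checked in plain C. This is only a sketch for illustration: the function names are hypothetical and nothing here is taken from the patch. It compares the two base-offset forms quoted above, (l0 * 81 + 9) * 4 and l0 * 324 + 36, using unsigned 64-bit arithmetic, where wrap-around is well defined and distributing the multiplication is therefore always valid. With signed types the same rewrite could change whether an overflow occurs, which is why the fold is not done there.

```c
#include <assert.h>
#include <stdint.h>

/* Offset in the unfolded form: element index first, then scaled by the
   4-byte element size.  All arithmetic is modulo 2^64.  */
static uint64_t offset_scaled(uint64_t l0)
{
  return (l0 * 81u + 9u) * 4u;
}

/* Offset after distributing the multiplication by 4 into the sum.  */
static uint64_t offset_folded(uint64_t l0)
{
  return l0 * 324u + 36u;
}

/* Nonzero when both forms yield the same offset for this index.  */
static int offsets_agree(uint64_t l0)
{
  return offset_scaled(l0) == offset_folded(l0);
}
```

Because both functions work modulo 2^64, offsets_agree holds for every input, including values where the intermediate products wrap.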
Concretely, on AArch64 this changes the generated code from:
mov x27, -108
mov x24, -72
mov x23, -36
add x21, x1, x0, lsl 2
add x19, x20, x22
.L5:
add x0, x22, x19
add x19, x19, 324
ldr d1, [x0, x27]
add v1.2s, v1.2s, v15.2s
str d1, [x20, 216]
ldr d0, [x0, x24]
add v0.2s, v0.2s, v15.2s
str d0, [x20, 252]
ldr d31, [x0, x23]
add v31.2s, v31.2s, v15.2s
str d31, [x20, 288]
bl digits_20_
cmp x21, x19
bne .L5
into:
.L5:
ldr d1, [x19, -108]
add v1.2s, v1.2s, v15.2s
str d1, [x20, 216]
ldr d0, [x19, -72]
add v0.2s, v0.2s, v15.2s
str d0, [x20, 252]
ldr d31, [x19, -36]
add x19, x19, 324
add v31.2s, v31.2s, v15.2s
str d31, [x20, 288]
bl digits_20_
cmp x21, x19
bne .L5
The two patches together result in a 10% performance increase in exchange2 in
SPECCPU 2017 and a 4% reduction in binary size and a 5% improvement in compile
time. There's also a 5% performance improvement in fotonik3d and similar
reduction in binary size.
The patch folds every IV to unsigned to canonicalize them. At the end of the
pass match.pd will then remove unneeded conversions.
Note that we cannot force everything to unsigned: IVopts requires that array
address expressions remain as such. Folding them results in them becoming
pointer expressions for which some optimizations in IVopts do not run.
gcc/ChangeLog:
PR tree-optimization/114932
* tree-ssa-loop-ivopts.cc (alloc_iv): Perform affine unsigned fold.
gcc/testsuite/ChangeLog:
PR tree-optimization/114932
* gcc.dg/tree-ssa/pr64705.c: Update dump file scan.
* gcc.target/i386/pr115462.c: The testcase shares 3 IVs which calculate
the same thing but with a slightly different increment offset. The test
checks for 3 complex addressing loads, one for each IV. But with this
change they now all share one IV. That is the loop now only has one
complex addressing. This is ultimately driven by the backend costing
and the current costing says this is preferred so updating the testcase.
* gfortran.dg/addressing-modes_1.f90: New test.
|
|
[PR111422]
After fixing loop-im to do the correct overflow rewriting
for pointer types too, we end up with code like:
```
_9 = (unsigned long) &g;
_84 = _9 + 18446744073709551615;
_11 = _42 + _84;
_44 = (signed char *) _11;
...
*_44 = 10;
g ={v} {CLOBBER(eos)};
...
n[0] = &f;
*_44 = 8;
g ={v} {CLOBBER(eos)};
```
Which was not being recognized by the scope conflicts code.
This was because it only handled one-level walk-backs rather than multiple ones.
This fixes the issue by having a cache which records all references to addresses
of stack variables.
Unlike the previous patch, this only records and looks at addresses of stack variables.
The cache uses a bitmap and uses the index as the bit to look at.
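As a rough illustration of the multi-level walk described above, here is a minimal sketch over a toy IR. Everything in it is hypothetical (struct toy_def, collect_stack_addrs, the fixed sizes) and it is not the cfgexpand.cc code; in particular it omits the per-name caching and only shows the worklist walk that gathers, into a bitmap, every stack variable whose address feeds an SSA name through any number of intermediate definitions.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NAMES 16      /* toy limit on the number of SSA names */
#define NO_OPERAND (-1)

/* Toy definition of one SSA name: it may use up to two other SSA names
   and may take the address of one stack variable.  */
struct toy_def {
  int name_ops[2];   /* indices of SSA names used, or NO_OPERAND */
  int addr_of_var;   /* index of the stack variable addressed, or NO_OPERAND */
};

/* Return a bitmap in which bit I is set when the value of SSA name START
   may be derived from the address of stack variable I.  */
static uint32_t collect_stack_addrs(const struct toy_def *defs, int start)
{
  uint32_t result = 0;
  uint32_t visited = 1u << start;   /* one bit per SSA name */
  int worklist[MAX_NAMES];
  int top = 0;

  worklist[top++] = start;
  while (top > 0) {
    const struct toy_def *d = &defs[worklist[--top]];
    if (d->addr_of_var != NO_OPERAND)
      result |= 1u << d->addr_of_var;
    for (int i = 0; i < 2; i++) {
      int op = d->name_ops[i];
      /* Each name is pushed at most once, so the worklist cannot overflow.  */
      if (op != NO_OPERAND && !(visited & (1u << op))) {
        visited |= 1u << op;
        worklist[top++] = op;
      }
    }
  }
  return result;
}

/* Model of the chain quoted above: name 0 is _9 = &g, name 1 is _84,
   name 2 is _42, name 3 is _11 = _42 + _84, name 4 is _44.  Stack
   variable 0 stands for g.  */
static uint32_t example_chain_bitmap(void)
{
  const struct toy_def defs[5] = {
    { { NO_OPERAND, NO_OPERAND }, 0 },
    { { 0, NO_OPERAND }, NO_OPERAND },
    { { NO_OPERAND, NO_OPERAND }, NO_OPERAND },
    { { 2, 1 }, NO_OPERAND },
    { { 3, NO_OPERAND }, NO_OPERAND },
  };
  return collect_stack_addrs(defs, 4);
}
```

A one-level walk starting at _44 would stop at _11 and never reach &g; the full walk sets bit 0, so a store through _44 can be treated as a reference to g.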
PR middle-end/117426
PR middle-end/111422
gcc/ChangeLog:
* cfgexpand.cc (struct vars_ssa_cache): New class.
(vars_ssa_cache::vars_ssa_cache): New constructor.
(vars_ssa_cache::~vars_ssa_cache): New destructor.
(vars_ssa_cache::create): New method.
(vars_ssa_cache::exists): New method.
(vars_ssa_cache::add_one): New method.
(vars_ssa_cache::update): New method.
(vars_ssa_cache::dump): New method.
(add_scope_conflicts_2): Factor mostly out to
vars_ssa_cache::operator(). New cache argument.
Walk the bitmap cache for the stack variables addresses.
(vars_ssa_cache::operator()): New method factored out from
add_scope_conflicts_2. Rewrite to be a full walk of all operands
and use a worklist.
(add_scope_conflicts_1): Add cache new argument for the addr cache.
Just call add_scope_conflicts_2 for the phi result instead of calling
for the uses and don't call walk_stmt_load_store_addr_ops for phis.
Update call to add_scope_conflicts_2 to add cache argument.
(add_scope_conflicts): Add cache argument and update calls to
add_scope_conflicts_1.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr117426-1.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
After a bit of a prod from Hans...
Make the obvious change to these tests to get them passing again on m68k.
PR testsuite/118055
gcc/testsuite
* gcc.dg/tree-ssa/pr83403-1.c: Add m68k*-*-* to targets needing
additional arguments for peeling.
* gcc.dg/tree-ssa/pr83403-2.c: Similarly.
|
|
The testcases use -save-temps which doesn't play nice with -flto
and multilib testing resulting in spurious UNRESOLVED like
/usr/lib64/gcc/x86_64-suse-linux/14/../../../../x86_64-suse-linux/bin/ld: i386:x86-64 architecture of input file `./convert-dfp-2.ltrans0.ltrans.o' is incompatible with i386 output
The following skips the testcases when using -flto.
* gcc.dg/torture/convert-dfp-2.c: Skip with -flto.
* gcc.dg/torture/convert-dfp.c: Likewise.
|
|
PR117546 was fixed by Eric's r14-10693-gadab597af288d6 change, but
the testcase here is sufficiently different to be worth including
in torture/.
gcc/testsuite/ChangeLog:
PR ipa/117546
* gcc.dg/torture/pr117546.c: New test.
|
|
As suggested by Richi in the PR, the following patch will fail to DCE
allocation calls if they have constant size which is too large (over
PTRDIFF_MAX), or for the case of calloc, if either of the arguments
is too large (in that case in theory the call could succeed if the other
argument is variable zero but who cares) or if both are constant and
their product overflows or is above PTRDIFF_MAX.
This will make some pedantic conformance tests happy, though if one
hides the size one will still need to use -fno-malloc-dce or obfuscate even
the malloc etc. uses. If the size is constant and too large, it isn't worth
trying to optimize it.
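The size conditions described above can be sketched as two small predicates. This is an illustration under assumptions, not the code in is_removable_allocation_p: the function names are hypothetical, and the calloc predicate models only the case where both arguments are known constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when a constant malloc-style size exceeds PTRDIFF_MAX.  */
static bool malloc_size_too_large(size_t size)
{
  return size > (size_t) PTRDIFF_MAX;
}

/* True when either calloc argument exceeds PTRDIFF_MAX, or when the
   product of the two overflows or exceeds PTRDIFF_MAX.  */
static bool calloc_size_too_large(size_t nmemb, size_t size)
{
  const size_t limit = (size_t) PTRDIFF_MAX;

  if (nmemb > limit || size > limit)
    return true;
  /* Division avoids computing a product that might wrap around.  */
  if (nmemb != 0 && size > limit / nmemb)
    return true;
  return false;
}
```

When either predicate is true the allocation would be kept rather than removed, so a conformance test that expects such a call to fail still observes the failure.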
2025-01-06 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/118224
* tree-ssa-dce.cc (is_removable_allocation_p): Don't return true
for allocations with constant size argument larger than PTRDIFF_MAX
or for calloc with one of the arguments constant larger than
PTRDIFF_MAX or their product known constant above PTRDIFF_MAX.
Fix comment typos, furhter -> further and then -> than.
* lto-section-in.cc (lto_free_function_in_decl_state_for_node):
Fix comment typo, furhter -> further.
* gcc.dg/pr118224.c: New test.
* c-c++-common/ubsan/vla-1.c (bar): Use noipa attribute instead
of noinline, noclone.
|
|
TARGET_CALLEE_COPIES-adjustments
With the dump now emitting "privatized symbols" in the default
"%s.%lu" format also for MMIX, there's still a difference for MMIX.
This time it's because numbers have changed (copies introduced before
this point) because it has TARGET_CALLEE_COPIES yielding true.
Redundant copies may have been elided at this point, but the change
in name remains.
Since that's true for other targets too, an obvious change is to
generalize the tested patterns to include TARGET_CALLEE_COPIES-true
targets, as a brief inspection of the history of these tests shows
that the point of these tests lies not in whether copies have been done
but in the part of the pattern that matches a constant.
Also fixed up a "." where there should have been a "\\.".
* gcc.dg/tree-ssa/vector-4.c: Replace MMIX adjustments with
TARGET_CALLEE_COPIES-agnostic adjustments.
* gcc.dg/tree-ssa/forwprop-36.c: Ditto. Correct pattern to match a
literal ".".
|
|
This PR was about a case in which late-combine moved a stack
deallocation across an earlier stack access. This was possible
because the deallocation was missing the RTL-SSA equivalent of
a vop, which in turn was because rtl_properties didn't treat
the deallocation as writing to memory. I think the bug was
ultimately there.
gcc/
PR rtl-optimization/117938
* rtlanal.cc (rtx_properties::try_to_add_dest): Treat writes
to the stack pointer as also writing to memory.
gcc/testsuite/
PR rtl-optimization/117938
* gcc.dg/torture/pr117938.c: New test.
|
|
This testcase came up in a recent LLVM bug report [0] for DSE vs
-ftrivial-auto-var-init=. Add it to our testsuite given that area
could do with better coverage.
[0] https://github.com/llvm/llvm-project/issues/119646
gcc/testsuite/ChangeLog:
* gcc.dg/torture/dse-trivial-auto-var-init.c: New test.
Co-authored-by: Andrew Pinski <pinskia@gmail.com>
|
|
Changed in v2:
- distinguish between "bool" and "_Bool" when determining
standard version
This patch attempts to provide better error messages for
code compiled with C23 that hasn't been updated for
"bool", "true", and "false" becoming keywords.
Specifically:
(1) with "typedef int bool;" previously we emitted:
t1.c:7:13: error: two or more data types in declaration specifiers
7 | typedef int bool;
| ^~~~
t1.c:7:1: warning: useless type name in empty declaration
7 | typedef int bool;
| ^~~~~~~
whereas with this patch we emit:
t1.c:7:13: error: 'bool' cannot be defined via 'typedef'
7 | typedef int bool;
| ^~~~
t1.c:7:13: note: 'bool' is a keyword with '-std=c23' onwards
t1.c:7:1: warning: useless type name in empty declaration
7 | typedef int bool;
| ^~~~~~~
(2) with "int bool;" previously we emitted:
t2.c:7:5: error: two or more data types in declaration specifiers
7 | int bool;
| ^~~~
t2.c:7:1: warning: useless type name in empty declaration
7 | int bool;
| ^~~
whereas with this patch we emit:
t2.c:7:5: error: 'bool' cannot be used here
7 | int bool;
| ^~~~
t2.c:7:5: note: 'bool' is a keyword with '-std=c23' onwards
t2.c:7:1: warning: useless type name in empty declaration
7 | int bool;
| ^~~
(3) with "typedef enum { false = 0, true = 1 } _Bool;" previously we
emitted:
t3.c:7:16: error: expected identifier before 'false'
7 | typedef enum { false = 0, true = 1 } _Bool;
| ^~~~~
t3.c:7:38: error: expected ';', identifier or '(' before '_Bool'
7 | typedef enum { false = 0, true = 1 } _Bool;
| ^~~~~
t3.c:7:38: warning: useless type name in empty declaration
whereas with this patch we emit:
t3.c:7:16: error: cannot use keyword 'false' as enumeration constant
7 | typedef enum { false = 0, true = 1 } _Bool;
| ^~~~~
t3.c:7:16: note: 'false' is a keyword with '-std=c23' onwards
t3.c:7:38: error: expected ';', identifier or '(' before '_Bool'
7 | typedef enum { false = 0, true = 1 } _Bool;
| ^~~~~
t3.c:7:38: warning: useless type name in empty declaration
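For code that triggers these diagnostics, one possible adaptation (a sketch, not something the patch itself prescribes) is to drop the home-grown definitions and include <stdbool.h>. The header supplies bool, true and false before C23 and remains valid once they are keywords. The function is_even is a hypothetical example.

```c
#include <assert.h>
#include <stdbool.h>  /* bool/true/false before C23; still valid with C23 */

/* Replaces declarations such as "typedef int bool;" or
   "typedef enum { false = 0, true = 1 } _Bool;" in older code.  */
static bool is_even(int value)
{
  return (value % 2) == 0;
}
```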
gcc/c/ChangeLog:
PR c/117629
* c-decl.cc (declspecs_add_type): Special-case attempts to use
bool as a typedef name or declaration name.
* c-errors.cc (get_std_for_keyword): New.
(add_note_about_new_keyword): New.
* c-parser.cc (report_bad_enum_name): New, split out from...
(c_parser_enum_specifier): ...here, adding handling for RID_FALSE
and RID_TRUE.
* c-tree.h (add_note_about_new_keyword): New decl.
gcc/testsuite/ChangeLog:
PR c/117629
* gcc.dg/auto-type-2.c: Update expected output with _Bool.
* gcc.dg/c23-bool-errors-1.c: New test.
* gcc.dg/c23-bool-errors-2.c: New test.
* gcc.dg/c23-bool-errors-3.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
The test was failing on x86 because longdouble128 only checks sizeof,
rather than a full 128-bit payload. Using _Float128 is more portable
and still exposes the original bug.
gcc/testsuite/
PR target/118184
* gcc.dg/torture/pr118184.c: Use _Float128 instead of long double.
|
|
PRE applies GENERIC folding to some component ref components which
might result in invalid GIMPLE, like a VIEW_CONVERT_EXPR wrapping
a REALPART_EXPR as in the PR. The following removes all GENERIC
folding in the code re-constructing a GENERIC component-ref from
the PRE VN IL.
PR tree-optimization/118171
* tree-ssa-pre.cc (create_component_ref_by_pieces_1): Do not
fold any component ref parts.
* gcc.dg/torture/pr118171.c: New testcase.
|
|
REGMODE_NATURAL_SIZE is set to 64 bits for everything except
VLA SVE modes. This means that it's possible to modify (say)
the highpart of a TI pseudo or a V2DI pseudo independently
of the lowpart. Modifying such highparts requires a reload
if the highpart ends up in the upper 64 bits of an FPR,
since RTL semantics do not allow the highpart of a single
hard register to be modified independently of the lowpart.
early-ra missed a check for this case, which meant that it
effectively treated an assignment to (subreg:DI (reg:TI R) 0)
as an assignment to the whole of R.
gcc/
PR target/118184
* config/aarch64/aarch64-early-ra.cc (allocno_assignment_is_rmw):
New function.
(early_ra::record_insn_defs): Mark the live range information as
untrustworthy if an assignment would change part of an allocno
but preserve the rest.
gcc/testsuite/
* gcc.dg/torture/pr118184.c: New test.
|
|
In order to stress test RAW_DATA_CST handling, I've tested trunk gcc with
r15-6339 reapplied and a hack where I've changed
const unsigned int raw_data_min_len = 128;
to
const unsigned int raw_data_min_len = 2;
in cp_lexer_new_main and 64 to 4 several times in c_parser_initval
and c_maybe_optimize_large_byte_initializer, so that RAW_DATA_CST doesn't
trigger just on very large initializers, but even quite small ones.
One of the regressions (will work on the others next) was that the pr90838.c
testcase regressed: check_ctz_array needs to handle RAW_DATA_CST, otherwise
on larger initializers, or if those come from #embed, it just won't trigger.
The new testcase shows when it doesn't trigger anymore (regression from 14).
The patch just handles RAW_DATA_CST in the CONSTRUCTOR_ELTS the same as if
it was a series of INTEGER_CSTs.
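For context, the kind of source idiom involved is a table-driven count-trailing-zeros. The sketch below uses the widely known De Bruijn multiply-and-lookup form; it is an assumption that this matches the shape exercised by pr90838.c, and the names are hypothetical. The point is that the lookup table is a constant array initializer, which is exactly what may be represented as RAW_DATA_CST when it is large enough or comes from #embed.

```c
#include <assert.h>
#include <stdint.h>

/* De Bruijn lookup table for the multiplier 0x077CB531.  */
static const unsigned char ctz_table[32] = {
  0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
  31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};

/* Count trailing zero bits of X.  Isolate the lowest set bit, multiply
   by the De Bruijn constant, and use the top five bits as the index.
   For X == 0 this returns 0, which is not a meaningful count.  */
static unsigned ctz32(uint32_t x)
{
  uint32_t lowest_bit = x & (0u - x);
  return ctz_table[(uint32_t) (lowest_bit * 0x077CB531u) >> 27];
}
```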
2025-01-02 Jakub Jelinek <jakub@redhat.com>
* tree-ssa-forwprop.cc (check_ctz_array): Handle also RAW_DATA_CST
in the CONSTRUCTOR_ELTS.
* gcc.dg/pr90838-2.c: New test.
|
|
|
|
The following avoids applying TER to direct internal functions that
are tailcall since the involved expansion code path doesn't honor
TER constraints.
PR middle-end/118174
* tree-outof-ssa.cc (ssa_is_replaceable_p): Exclude tailcalls.
* gcc.dg/torture/pr118174.c: New testcase.
|
|
Several months ago changes were made to the vectorizer which mucked up several
of the scan tests. All but one of the cases in pr115375 have since been fixed.
The remaining failure seems to be primarily a debugging dump issue -- we're
still selecting the same lmul values. This patch adjusts the dump scan
appropriately.
PR target/115375
gcc/testsuite
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-9.c: Adjust expected
output.
|
|
The following testcases ICE because fold_array_ctor_reference in the
RAW_DATA_CST handling just returns build_int_cst without actually checking
that if type is non-NULL, TREE_TYPE (val) is uselessly convertible to it.
By falling through the code after it without *suboff += we get everything
we need, the two if conditionals will never be true (we've already
checked that size == BITS_PER_UNIT and so can't be 0, and val will be
INTEGER_CST), but it will do the important fold_ctor_reference call
which will deal with type incompatibilities.
2024-12-28 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/118207
* gimple-fold.cc (fold_array_ctor_reference): For RAW_DATA_CST,
just set val to build_int_cst and fall through to the normal
element handling code instead of returning build_int_cst right away.
* gcc.dg/pr118207.c: New test.
|
|
Running tests in parallel on my 4.5y+ old laptop made this
test time out: the test itself runs in 9m20s, the timeout
being 10 minutes with the 2x factor. That's a bit too close.
This commit makes a change to the base test similar to the one
made for gcc.dg/torture/inline-mem-cpy-1.c in commit
r14-8188-g6eca0d23b7ea84; IOW, it cuts the test down by a factor of 7
(r14-8188 was by a factor of 11).
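The general idea of running only a fraction of the iterations can be sketched as follows. This is a hypothetical illustration, not the code in memcmp-1.c: the helper names and the default value are assumptions, and only the RUN_FRACTION macro name comes from the ChangeLog entry below.

```c
#include <assert.h>

#ifndef RUN_FRACTION
#define RUN_FRACTION 7   /* assumed default, normally passed as -DRUN_FRACTION=7 */
#endif

static unsigned iteration_counter;

/* Nonzero for one iteration out of every RUN_FRACTION.  */
static int should_run(void)
{
  return iteration_counter++ % RUN_FRACTION == 0;
}

/* Count how many of TOTAL iterations would actually execute.  */
static unsigned count_executed(unsigned total)
{
  unsigned executed = 0;

  iteration_counter = 0;
  for (unsigned i = 0; i < total; i++)
    if (should_run())
      executed++;
  return executed;
}
```

Skipping by counter keeps the sampled iterations spread evenly over the whole range, so the reduced run still touches each region of the parameter space.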
* gcc.dg/memcmp-1.c: Pass -DRUN_FRACTION=7 when testing in a simulator.
|
|
Recently two test cases for PR118149 have been added.
While pr118149-2.c works well for AArch64, pr118149.c fails
because the expected optimization in forwprop4 cannot be applied
as SLP vectorization does not happen.
This patch fixes this issue by disabling the check on AArch64.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/pr118149.c: Disable for AArch64.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
|
|
We don't want to indirect pointers in strub wrappers, because it
generally isn't profitable, but if the argument is volatile, then we
must use indirection to preserve access patterns, so amend the
assertion check.
for gcc/ChangeLog
PR middle-end/118007
* ipa-strub.cc (pass_ipa_strub::execute): Accept indirecting
volatile args of pointer types.
for gcc/testsuite/ChangeLog
PR middle-end/118007
* gcc.dg/strub-pr118007.c: New.
|
|
FAILs have been reported for several tree-ssa vector-*.c tests
on i686-linux or on x86_64-linux with -m32.
This patch addresses these failures by setting the necessary -msse2 flags.
This patch also streamlines all tests to use dg-options instead
of dg-additional-options. This is in line with most other tests
in gcc.dg/tree-ssa.
Tested with the following board config in RUNTESTFLAGS:
--target_board=unix\{-m64,-m32,-m32/-mno-mmx/-mno-sse}
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/satd-hadamard.c: Rename dg-additional-options
to dg-options.
* gcc.dg/tree-ssa/vector-10.c: Rename dg-additional-options
to dg-options and add -msse2 to it.
* gcc.dg/tree-ssa/vector-11.c: Likewise.
* gcc.dg/tree-ssa/vector-8.c: Rename dg-additional-options
to dg-options.
* gcc.dg/tree-ssa/vector-9.c: Likewise.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
|
|
A recent bugfix (eee2891312) for PR117830 also addressed PR118149.
This patch adds two test cases for PR118149.
These tests differ from other tests in that one of the
vec-perm selectors contains indices in descending order (1, 1, 0, 0),
which is the root cause of the ICE observed in PR118149.
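To make the selector shape concrete, here is a minimal scalar model of a two-input vector permute, applied with the descending selector (1, 1, 0, 0). It is only an illustration: vec_perm, NELTS and the sample values are hypothetical names and data, and this is not the code from the new test cases.

```c
#include <assert.h>
#include <stddef.h>

#define NELTS 4u

/* Scalar model of a two-input permute: selector values 0..NELTS-1 pick
   from v0, values NELTS..2*NELTS-1 pick from v1.  Hypothetical helper.  */
static void vec_perm(const int *v0, const int *v1,
                     const unsigned *sel, int *out)
{
  for (size_t i = 0; i < NELTS; i++) {
    unsigned idx = sel[i] % (2u * NELTS);
    out[i] = idx < NELTS ? v0[idx] : v1[idx - NELTS];
  }
}

/* The selector indices go down (1, 1, 0, 0) rather than up.  */
static void demo_descending_selector(void)
{
  const int v0[NELTS] = {10, 20, 30, 40};
  const int v1[NELTS] = {50, 60, 70, 80};
  const unsigned sel[NELTS] = {1, 1, 0, 0};
  int out[NELTS];

  vec_perm(v0, v1, sel, out);
  assert(out[0] == 20 && out[1] == 20);
  assert(out[2] == 10 && out[3] == 10);
}
```

Calling demo_descending_selector() from main exercises the model; code that assumes selector indices only increase would not anticipate this input.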
PR118149
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/pr118149-2.c: New test.
* gcc.dg/tree-ssa/pr118149.c: New test.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
|
|
In PR117830 a miscompilation of 464.h264ref was reported.
An analysis showed that wrong code was generated because of
unsatisfied assumptions. This patch addresses these issues.
The first assumption was that we could independently analyze the two
vec-perms at the start of a vec-perm-simplify sequence and use the
information later for calculating a final vec-perm selector that
utilizes fewer lanes. However, this information does not help much,
because for changing the selector entry, we need to ensure that both
elements of the operand vectors v_1 and v_2 remain equal.
This is addressed by removing the function get_vect_selector_index_map
and checking for this equality in the loop where we create the new
selector.
The calculation of the selector vector for the blended sequence
assumed that the indices of the selector vector of the narrowed
sequences are increasing. This assumption does not hold in general.
This was fixed by allowing a wrap-around when searching for an empty
lane.
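As a rough sketch of what a wrap-around search for an empty lane can look like, the helper below scans from a starting lane and continues from lane 0 once it reaches the end. The name find_free_lane, the used[] bookkeeping and the -1 return value are assumptions made for illustration; the actual logic lives in tree-ssa-forwprop.cc and may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Return the first unused lane at or after START, wrapping around to
   lane 0 when the end is reached; return -1 if every lane is in use.
   Hypothetical helper, not the GCC implementation.  */
static int find_free_lane(const bool *used, unsigned nelts, unsigned start)
{
  for (unsigned k = 0; k < nelts; k++) {
    unsigned lane = (start + k) % nelts;
    if (!used[lane])
      return (int) lane;
  }
  return -1;
}

/* Only lane 0 is free, and the search starts at lane 2: a search that
   only moves upwards would miss it, the wrap-around search finds it.  */
static void demo_wraparound(void)
{
  const bool used[4] = {false, true, true, true};
  assert(find_free_lane(used, 4, 2) == 0);
}
```

A search restricted to increasing indices implicitly relies on the selector indices being ordered, which is the assumption the message says does not hold in general.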
Further, there was an issue in the calculation of the selector vector
entries for the second sequence. The code did not consider that the
lanes of the second sequence could have been moved.
A relevant property of this patch is that it introduces a
couple of nested loops, where the outer loop iterates from
i=0..nelts and the inner loop iterates from j=0..i.
To avoid performance concerns, a check is introduced that
ensures nelts won't exceed 4 lanes.
The added test case is derived from h264ref (the other cases from the
benchmark have the same structure and don't provide additional coverage).
Bootstrapped and regression-tested on x86-64 and aarch64.
Further, tested on CPU 2006 h264ref and CPU 2017 x264.
PR117830
gcc/ChangeLog:
* tree-ssa-forwprop.cc (get_vect_selector_index_map): Removed.
(recognise_vec_perm_simplify_seq): Fix calculation of vec-perm
selectors of narrowed sequence.
(calc_perm_vec_perm_simplify_seqs): Fix calculation of
vec-perm selectors of the blended sequence.
(process_vec_perm_simplify_seq_list): Add whitespace to dump
string to avoid badly formatted dump output.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/vector-11.c: New test.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
|
|
When a loaded field is sign extended, masked and compared, we used to
drop from the mask the bits past the original field width, which is
not correct.
Take note of the fact that the mask covered copies of the sign bit,
before clipping it, and arrange to test the sign bit if we're
comparing with zero. Punt in other cases.
If bits_test fails recoverably, try other ifcombine strategies.
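The problem can be reproduced in plain C. The sketch below assumes an 8-bit signed field and the mask 0x101, both chosen only for illustration, and two's complement representation. It compares the masked test on the sign-extended value with a version whose mask is clipped to the field width, and with a version that also tests the sign bit. The function names are hypothetical.

```c
#include <assert.h>

/* Reference: sign-extend the 8-bit field to int, then mask and compare.
   Bit 8 of the extended value is a copy of the sign bit.  */
static int reference_test(signed char f)
{
  return ((int) f & 0x101) != 0;
}

/* Incorrect narrowing: mask bits past the field width are dropped.  */
static int clipped_test(signed char f)
{
  return ((unsigned char) f & 0x01) != 0;
}

/* Correct narrowing for a compare with zero: the dropped bits were
   copies of the sign bit, so test the sign bit (0x80) as well.  */
static int sign_aware_test(signed char f)
{
  return ((unsigned char) f & 0x81) != 0;
}

/* Check all 256 field values; return how often the clipped form
   disagrees with the reference.  */
static int count_clipped_mismatches(void)
{
  int mismatches = 0;
  for (int i = -128; i <= 127; i++) {
    signed char f = (signed char) i;
    assert(sign_aware_test(f) == reference_test(f));
    if (clipped_test(f) != reference_test(f))
      mismatches++;
  }
  return mismatches;
}
```

The clipped form goes wrong for every negative even value (for example -2), where bit 0 is clear but the sign-bit copy in bit 8 is set.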
for gcc/ChangeLog
* gimple-fold.cc (decode_field_reference): Add psignbit
parameter. Set it if the mask references sign-extending
bits.
(fold_truth_andor_for_ifcombine): Adjust calls with new
variables. Swap them along with other r?_* variables. Handle
extended sign bit compares with zero.
* tree-ssa-ifcombine.cc (ifcombine_ifandif): If bits_test
fails in a way that doesn't prevent other ifcombine strategies
from passing, give them a try.
for gcc/testsuite/ChangeLog
* gcc.dg/field-merge-16.c: New.
|
|
Some bitfield compares with zero are optimized to range tests, so
instead of X & ~(Bit - 1) != 0 what reaches ifcombine is X > (Bit - 1),
where Bit is a power of two and X is unsigned.
This patch recognizes this optimized form of masked compares, and
attempts to merge them like masked compares, which enables some more
field merging that a folder version of fold_truth_andor used to handle
without additional effort.
I haven't seen X & ~(Bit - 1) == 0 become X <= (Bit - 1), or X < Bit
for that matter, but it was easy enough to handle the former
symmetrically to the above.
The latter was also easy enough, and so was its symmetric counterpart,
X >= Bit, which is handled like X & ~(Bit - 1) != 0.
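The equivalences mentioned above can be checked exhaustively for a small unsigned type. The sketch below uses uint16_t as an assumed stand-in for X and tries every power of two for Bit; the function names are hypothetical and the code is a standalone check, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Masked form: is any bit at or above the position of BIT set?  */
static int masked_nonzero(uint16_t x, uint16_t bit)
{
  return (x & (uint16_t) ~(bit - 1)) != 0;
}

/* For every 16-bit X and every power of two BIT, check that the range
   forms agree with the masked forms.  Returns the number of (X, BIT)
   pairs that were checked.  */
static unsigned long check_range_forms(void)
{
  unsigned long checked = 0;
  for (unsigned bit = 1; bit <= 0x8000u; bit <<= 1)
    for (unsigned x = 0; x <= 0xFFFFu; x++) {
      int ne = masked_nonzero((uint16_t) x, (uint16_t) bit);
      assert(ne == (x > bit - 1));     /* X & ~(Bit - 1) != 0  */
      assert(ne == (x >= bit));        /* symmetric form       */
      assert(!ne == (x <= bit - 1));   /* X & ~(Bit - 1) == 0  */
      assert(!ne == (x < bit));        /* symmetric form       */
      checked++;
    }
  return checked;
}
```

The check relies on X being unsigned and Bit being a power of two, as the message states; with 16 powers of two and 65536 values of X it covers 1048576 pairs.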
for gcc/ChangeLog
* gimple-fold.cc (decode_field_reference): Accept incoming
mask.
(fold_truth_andor_for_ifcombine): Handle some compares with
powers of two, minus 1 or 0, like masked compares with zero.
for gcc/testsuite/ChangeLog
* gcc.dg/field-merge-15.c: New.
|