|
From: Andrew Bennett <andrew.bennett@imgtec.com>
Firstly, remove the MIPS-specific bit of the test.
Secondly, create a MIPS-specific version in gcc.target/mips.
This will only execute for a MIPS ISA less than R6.
Cherry-picked c8b051cdbb1d5b166293513b0360d3d67cf31eb9
from https://github.com/MIPS/gcc
gcc/testsuite
* gcc.dg/memcpy-4.c: Remove mips specific code.
* gcc.target/mips/memcpy-2.c: New test.
|
|
The optimisation to reduce the result to constant 28 still happens,
but only much later, in combine.
gcc/testsuite/
* gcc.dg/tree-ssa/ssa-dom-cse-2.c: Do not check output for
MIPS lp64 abi.
|
|
commit 546f28f83ceba74dc8bf84b0435c0159ffca971a
Author: Richard Sandiford <richard.sandiford@arm.com>
Date: Mon Apr 7 08:03:46 2025 +0100
simplify-rtx: Fix shortcut for vector eq/ne
This fixed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117863.
PR target/117863
* gcc.dg/rtl/i386/vector_eq-2.c: New test.
* gcc.dg/rtl/i386/vector_eq-3.c: Likewise.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
|
|
The problem here is that on targets where a 32-byte memcpy will use an integral (vector) type
to do the copy, the code will be optimized in a different way than expected. This changes
the testcase to use a size of 1025 instead, to make sure there is no target that will use an
integral (vector) type for the memcpy and have it optimized via the method that was just added.
Pushed as obvious after a test run.
gcc/testsuite/ChangeLog:
* gcc.dg/pr118947-1.c: Use 1025 as the size of buf.
* gcc.dg/pr78408-3.c: Likewise.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
This is the second part of the PR, which comes from the transformation
of memset into either stores of 0 (via an integral type) or stores
of {}. We already handle stores of `{}`; this just extends that to
handle stores of the constant 0 and treat them similarly.
PR tree-optimization/87901
gcc/ChangeLog:
* tree-ssa-dse.cc (maybe_trim_constructor_store): Add was_integer_cst argument.
Check for was_integer_cst instead of `{}` when was_integer_cst is true.
(maybe_trim_partially_dead_store): Handle INTEGER_CST stores of 0 as stores of `{}`.
Update call to maybe_trim_constructor_store for CONSTRUCTOR.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/ssa-dse-53.c: New test.
* gcc.dg/tree-ssa/ssa-dse-54.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
DSE has support for trimming memset (and memset-like) statements.
In this case we have `MEM <unsigned char[17]> [(char * {ref-all})&z] = {};` in
the IR, and when we go to trim it we call build_fold_addr_expr, which leaves around
a cast from one pointer type to another. This is because build_fold_addr_expr
is generic, but in gimple these casts aren't needed.
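For illustration, a hypothetical sketch of the kind of input (not the actual ssa-dse-52.c):
```c
/* The memset is represented as
   MEM <unsigned char[17]> [(char * {ref-all})&z] = {};
   and the later store makes part of it dead; taking the address of the
   MEM_REF while trimming used to leave a useless pointer conversion.  */
void use (char *);

void f (void)
{
  char z[17];
  __builtin_memset (z, 0, 17);
  z[16] = 1;
  use (z);
}
```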
PR tree-optimization/87901
gcc/ChangeLog:
* tree-ssa-dse.cc (maybe_trim_constructor_store): Strip useless type
conversions after taking the address of the MEM_REF.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/ssa-dse-52.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
Unlike constants, address invariants are currently put first when
used with an SSA_NAME.
It would be better if address invariants were consistent with constants,
and this patch changes that.
gcc.dg/tree-ssa/pr118902-1.c is an example where this canonicalization
can help. In it, if the variable `p` were a global variable, FRE (VN) would have
figured out that `a` could never be equal to `&p` inside the loop. But without the
canonicalization we end up with `&p == a.0_1`, which VN doesn't try to handle for
conditional VN.
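A hypothetical sketch of the shape that benefits (not the actual pr118902-1.c):
```c
/* With the SSA name canonicalized into the first operand
   (a.0_1 == &p rather than &p == a.0_1), conditional VN can record on
   each edge of the comparison what is known about `a` and simplify the
   loop body accordingly.  */
void keep (int);

void f (int *a)
{
  int p = 0;
  while (a != &p)
    keep (p);   /* conditional VN may now use a != &p in the body */
}
```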
Bootstrapped and tested on x86_64.
PR tree-optimization/118902
gcc/ChangeLog:
* fold-const.cc (tree_swap_operands_p): Place invariants in the first operand
if not used with constants.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/pr118902-1.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
aliasing says the ref is a may clobber. [PR118947]
The case here is we have:
```
char buf[32] = {};
void* ret = aaa();
__builtin_memcpy(ret, buf, 32);
```
And buf does not escape. But we don't propagate the zeroing from buf to the memcpy statement,
because optimize_memcpy_to_memset only looks back one statement. This can be fixed to look back
until we get a statement that may clobber the reference. If we get a PHI node, then we don't do
anything.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/118947
gcc/ChangeLog:
* gimple-fold.cc (optimize_memcpy_to_memset): Walk back until we get a
statement that may clobber the read.
gcc/testsuite/ChangeLog:
* gcc.dg/pr118947-1.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
While looking into PR 118947, I noticed that optimize_memcpy_to_memset didn't
handle STRING_CSTs, which are also used for a memset of 0, but for char arrays.
This fixes that and improves optimize_memcpy_to_memset to handle that case.
This fixes part of PR 118947 but not the whole thing; we still need to skip over
vdefs in some cases.
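For illustration, a hypothetical sketch (not the actual pr78408-3.c):
```c
/* The zero initializer of buf is represented as a STRING_CST store,
   which optimize_memcpy_to_memset now recognizes, so the memcpy can be
   folded into a memset of 0.  */
void use (char *);

void f (void)
{
  char buf[16] = "";                 /* zeroing via a STRING_CST */
  char out[16];
  __builtin_memcpy (out, buf, 16);   /* can now become memset (out, 0, 16) */
  use (out);
}
```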
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/78408
PR tree-optimization/118947
gcc/ChangeLog:
* gimple-fold.cc (optimize_memcpy_to_memset): Handle STRING_CST case too.
gcc/testsuite/ChangeLog:
* gcc.dg/pr78408-3.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
The test case assumes that alignof(int) == sizeof(int). But for some
targets this does not hold. For example, for the PRU target,
alignof(int) == 1 but sizeof(int) == 4.
Fix the test case to align to twice the size of int, as the expected
dg-error messages suggest.
This patch fixes the test failures for PRU target.
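As a hedged illustration of the substitution (not the actual pr116357.c):
```c
/* Alignment expressed via sizeof holds even on targets such as PRU
   where alignof(int) == 1 but sizeof(int) == 4.  */
_Alignas (2 * sizeof (int)) int x;   /* rather than 2 * _Alignof (int) */
```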
gcc/testsuite/ChangeLog:
* gcc.dg/pr116357.c: Use sizeof(int) instead of alignof(int).
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
|
|
In the case that we are eliminating the load instruction, we use zero_extend
for the initialization of the base register for the zero-offset store.
This causes issues when the store and the load use the same mode,
as we are trying to generate a zero_extend with the same inner and
outer modes.
This patch fixes the issue by zero-extending the value stored in the
base register only when the load's mode is wider than the store's mode.
PR rtl-optimization/119160
gcc/ChangeLog:
* avoid-store-forwarding.cc (process_store_forwarding):
Zero-extend the value stored in the base register, in case
of load-elimination, only when the mode of the destination
is wider.
gcc/testsuite/ChangeLog:
* gcc.dg/pr119160.c: New test.
|
|
Like other ppc targets, powerpc-*-elf needs -Wno-psabi to compile
gcc.dg/ipa/ipa-sra-19.c without an undesired warning about vector
argument passing.
for gcc/testsuite/ChangeLog
* gcc.dg/ipa/ipa-sra-19.c: Add -Wno-psabi on ppc-elf too.
|
|
architecture [PR119286]
The given test is intended to test vectorization of a strided access done by
having a step of > 1.
The GCN target doesn't support load lanes, so the testcase is expected to fail;
other targets create a permuted load here, which we then reject.
However some GCN architectures don't seem to support the permuted loads either, so the
vectorizer tries a gather/scatter. But the indices aren't supported by some
architectures, so instead the vectorizer scalarizes the loads.
I can't really test for which architecture is being used by the compiler, so
instead this updates the testcase to use one single architecture so we get a
consistent result.
gcc/testsuite/ChangeLog:
PR target/119286
* gcc.dg/vect/vect-early-break_18.c: Force -march=gfx908 for amdgcn.
|
|
has_single_use [PR119808]
The following testcase is miscompiled, because we emit a CLOBBER in a place
where it shouldn't be emitted.
Before lowering we have:
b_5 = 0;
b.0_6 = b_5;
b.1_1 = (unsigned _BitInt(129)) b.0_6;
...
<retval> = b_5;
The bitint coalescing assigns the same partition/underlying variable
for both b_5 and b.0_6 (possible because there is a copy assignment)
and of course a different one for b.1_1 (and other SSA_NAMEs in between).
This is -O0, so stmts aren't DCEd and aren't propagated that much etc., and
we also don't try to optimize and omit some names from m_names
or handle multiple stmts at once, so the expansion emits essentially
bitint.4 = {};
bitint.4 = bitint.4;
bitint.2 = cast of bitint.4;
bitint.4 = CLOBBER;
...
<retval> = bitint.4;
and the CLOBBER is the problem because bitint.4 is still live afterwards.
We emit the clobbers to improve code generation, but do it only for
(initially) has_single_use SSA_NAMEs (remembered in m_single_use_names)
being used, if they don't have the same partition on the lhs and a few
other conditions.
The problem above is that b.0_6 which is used in the cast has_single_use
and so was in m_single_use_names bitmask and the lhs in that case is
bitint.2, so a different partition. But there is gimple_assign_copy_p
with SSA_NAME rhs1 and the partitioning special cases those and while
b.0_6 is single use, b_5 has multiple uses. I believe this ought to be
a problem solely in the case of such copy stmts and their special-casing
by the partitioning: if instead of b.0_6 = b_5; there were
b.0_6 = b_5 + 1; or whatever other stmt that performs or may perform
changes on the value, partitioning couldn't assign the same partition
to b.0_6 and b_5 if b_5 is used later; it couldn't have two different
(or potentially different) values in the same bitint.N var. With a
copy that is possible though.
So the following patch fixes it by being more careful when we set
m_single_use_names: don't set it if it is a has_single_use SSA_NAME
but SSA_NAME_DEF_STMT of it is a copy stmt with SSA_NAME rhs1 and that
rhs1 doesn't have single use, or has_single_use but SSA_NAME_DEF_STMT of it
is a copy stmt etc.
Just to make sure it doesn't change code generation too much, I've gathered
statistics on how many times
if (m_first
&& m_single_use_names
&& m_vars[p] != m_lhs
&& m_after_stmt
&& bitmap_bit_p (m_single_use_names, SSA_NAME_VERSION (op)))
{
tree clobber = build_clobber (TREE_TYPE (m_vars[p]),
CLOBBER_STORAGE_END);
g = gimple_build_assign (m_vars[p], clobber);
gimple_stmt_iterator gsi = gsi_for_stmt (m_after_stmt);
gsi_insert_after (&gsi, g, GSI_SAME_STMT);
}
emits a clobber on
make check-gcc GCC_TEST_RUN_EXPENSIVE=1 RUNTESTFLAGS="--target_board=unix\{-m64,-m32\} GCC_TEST_RUN_EXPENSIVE=1 dg.exp='*bitint* pr112673.c builtin-stdc-bit-*.c pr112566-2.c pr112511.c pr116588.c pr116003.c pr113693.c pr113602.c flex-array-counted-by-7.c' dg-torture.exp='*bitint* pr116480-2.c pr114312.c pr114121.c' dfp.exp=*bitint* i386.exp='pr118017.c pr117946.c apx-ndd-x32-2a.c' vect.exp='vect-early-break_99-pr113287.c' tree-ssa.exp=pr113735.c"
and before this patch it was 41010 clobbers and after it is 40968,
so the difference is 42 clobbers, about 0.1% fewer.
2025-04-16 Jakub Jelinek <jakub@redhat.com>
PR middle-end/119808
* gimple-lower-bitint.cc (gimple_lower_bitint): Don't set
m_single_use_names bits for SSA_NAMEs which have single use but
their SSA_NAME_DEF_STMT is a copy from another SSA_NAME which doesn't
have a single use, or single use which is such a copy etc.
* gcc.dg/bitint-121.c: New test.
|
|
This testcase got fixed with the r15-9397 fix for PR119722.
2025-04-16 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/116093
* gcc.dg/bitint-122.c: New test.
|
|
The r15-9487 change has added -flto-partition=default, which broke
the completion-2.c testcase because that value is now also printed
during completion.
2025-04-16 Jakub Jelinek <jakub@redhat.com>
* gcc.dg/completion-2.c: Expect also -flto-partition=default line.
|
|
C_MAYBE_CONST_EXPR is a C FE operator that will be removed by c_fully_fold.
c_fully_fold assumes that operands of function calls have already
been folded. However, when we build the call to .ACCESS_WITH_SIZE, its
operands are not fully folded, therefore the C FE specific operator is
passed to the middle end.
In order to fix this issue, fully fold the parameters before building the
call to .ACCESS_WITH_SIZE.
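For illustration, a hypothetical sketch of code that builds such a call (not the actual pr119717.c):
```c
/* Accessing a flexible array member annotated with counted_by makes the
   C FE wrap the access in a call to .ACCESS_WITH_SIZE; its operands
   must be fully folded so that no C_MAYBE_CONST_EXPR leaks into the
   middle end.  */
struct s {
  int n;
  int a[] __attribute__ ((counted_by (n)));
};

int f (struct s *p, int i)
{
  return p->a[i];
}
```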
PR c/119717
gcc/c/ChangeLog:
* c-typeck.cc (build_access_with_size_for_counted_by): Fully fold the
parameters for call to .ACCESS_WITH_SIZE.
gcc/testsuite/ChangeLog:
* gcc.dg/pr119717.c: New test.
|
|
In my fix for PR 119318 I put the mask calculation in
ipcp_bits_lattice::meet_with_1 above the final fix-up of the value that makes all
the bits in the value which are meaningless according to the mask have
value zero, which has tripped a validator in PR 119803. This patch
fixes that by moving the adjustment down.
Even though the fix for PR 119318 did a similar thing in
ipcp_bits_lattice::meet_with, the same is not necessary because that
code path then feeds the new value and mask to
ipcp_bits_lattice::set_to_constant which does the final adjustment
correctly.
In both places, however, Jakub proposed a better way of calculating
cap_mask and so I have changed it accordingly.
gcc/ChangeLog:
2025-04-15 Martin Jambor <mjambor@suse.cz>
PR ipa/119803
* ipa-cp.cc (ipcp_bits_lattice::meet_with_1): Move the m_value adjustment
according to m_mask below the adjustment of the latter according to
cap_mask. Optimize the calculation of cap_mask a bit.
(ipcp_bits_lattice::meet_with): Optimize the calculation of cap_mask a
bit.
gcc/testsuite/ChangeLog:
2025-04-15 Martin Jambor <mjambor@suse.cz>
PR ipa/119803
* gcc.dg/ipa/pr119803.c: New test.
Co-authored-by: Jakub Jelinek <jakub@redhat.com>
|
|
llp64 targets like mingw-w64 will print:
gcc/testsuite/gcc.dg/Wbuiltin-declaration-mismatch-4.c:80:17: warning: ‘memset’ argument 3 promotes to ‘ptrdiff_t’ {aka ‘long long int’} where ‘long long unsigned int’ is expected in a call to built-in function declared without prototype [-Wbuiltin-declaration-mismatch]
Change the regex pattern to accept it.
Signed-off-by: Jonathan Yong <10walls@gmail.com>
gcc/testsuite/ChangeLog:
* gcc.dg/Wbuiltin-declaration-mismatch-4.c: Make diagnostic
accept long long.
|
|
A dg-additional-options directive followed by dg-options is ignored. I've added the
-w from there to dg-options and removed the dg-additional-options.
2025-04-15 Jakub Jelinek <jakub@redhat.com>
PR ipa/119318
* gcc.dg/ipa/pr119318.c: Remove dg-additional-options, add -w to
dg-options.
|
|
I'm seeing
+FAIL: gcc.dg/ipa/pr119530.c execution test
on i686-linux. The problem is that when long is just 32-bit and
so is unsigned, the testcase then behaves differently and would abort.
Fixed by making the argument long long instead.
While at it, I've also changed the type of the d variable to signed char,
just in case there is a -funsigned-char or 8-bit int target or something
similar.
2025-04-14 Jakub Jelinek <jakub@redhat.com>
PR ipa/119318
* gcc.dg/ipa/pr119530.c (d): Change type from char to signed char.
(e): Change argument type from long to long long.
|
|
This testcase was fixed by r15-3052-gc7b76a076cb2c6ded, but it
failed in a different fashion and is a much older
failure than the one added with r15-3052.
Pushed as obvious after a quick test.
PR tree-optimization/118476
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr118476-1.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
After the propagation of constants and value ranges, it turns out
that the propagation of known bits also needs to be made aware of any
intermediate types in which any arithmetic operations are done and
must limit its precision there. This implements just that, using the
newly collected and streamed types of the operations involved.
This version removed the extra check that the type of a formal
parameter is known, pointed out by Honza in his review, because I agree
it is currently always known. I have also added the testcase of PR
119530, which is a duplicate of this bug.
gcc/ChangeLog:
2025-04-11 Martin Jambor <mjambor@suse.cz>
PR ipa/119318
* ipa-cp.cc (ipcp_bits_lattice::meet_with_1): Set all mask bits
not covered by precision to one.
(ipcp_bits_lattice::meet_with): Likewise.
(propagate_bits_across_jump_function): Use the stored operation
type to perform meet with other lattices.
gcc/testsuite/ChangeLog:
2025-04-11 Martin Jambor <mjambor@suse.cz>
PR ipa/119318
* gcc.dg/ipa/pr119318.c: New test.
* gcc.dg/ipa/pr119530.c: Likewise.
|
|
The following makes sure to not mix masked/non-masked stmts when
forming a SLP node.
PR tree-optimization/119757
* tree-vect-slp.cc (vect_build_slp_tree_1): Record and compare
whether a stmt uses a mask.
* gcc.dg/vect/pr119757.c: New testcase.
|
|
When late combine was enabled for x86_64 (r15-1735-ge62ea4fb8ffcab),
these 2 testcases started to XPASS in a similar fashion as when late
combine was added and the testcases were updated for aarch64 not to
xfail them there.
Pushed as obvious after a test to make sure the testcases no longer XPASS.
PR testsuite/117706
gcc/testsuite/ChangeLog:
* gcc.dg/ira-shrinkwrap-prep-1.c: Unxfail for i?86-*-* and x86_64-*-*.
* gcc.dg/ira-shrinkwrap-prep-2.c: Likewise.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
The following testcase is miscompiled because, during the limited
SSA name coalescing that the bitint lowering pass does, we incorrectly don't
register a conflict.
This is on
<bb 4> [local count: 1073741824]:
# b_17 = PHI <b_19(3), 8(2)>
g.4_13 = g;
_14 = g.4_13 >> 50;
_15 = (unsigned int) _14;
_21 = b_17;
_16 = (unsigned int) _21;
s_22 = _15 + _16;
return s_22;
basic block where in the map->bitint bitmap we track 14, 17 and 19.
The build_bitint_stmt_ssa_conflicts "hook" has special code where
it tracks uses at the final statements of mergeable operations, so
e.g. the
_16 = (unsigned int) _21;
statement is considered to be a use of b_17 because _21 is not in
map->bitint (or large_huge.m_names), i.e. is mergeable.
The problem is that build_ssa_conflict_graph has special code to handle
SSA_NAME copies and _21 = b_17; is gimple_assign_copy_p. In such cases
it calls live_track_clear_var on the rhs1. The problem is that
on the above bb, after we note in the _16 = (unsigned int) _21;
stmt that we need b_17, the generic code makes us forget that because
of the copy statement, and then build_bitint_stmt_ssa_conflicts
ignores it completely (because _21 is a large/huge bitint and is
not in map->bitint, so it is assumed to be handled by a later stmt in the
bb, which a backwards walk like this processes before this one).
As the b_17 use is ignored, the coalescing thinks it can put
all of b_17, b_19 and _14 into the same partition, which is wrong;
while we can and should coalesce b_17 and b_19, _14 needs to be a different
temporary because b_17 is set before and used after _14 has been written.
The following patch fixes it by handling gimple_assign_copy_p in two
separate spots: move the generic coalesce handling of it to after the
build_bitint_stmt_ssa_conflicts call (whose handling
doesn't fall through to that, it does continue after the call), and
inside of build_bitint_stmt_ssa_conflicts perform it too, but only if
the lhs is not a mergeable large/huge bitint.
2025-04-12 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/119722
* gimple-lower-bitint.h (build_bitint_stmt_ssa_conflicts): Add
CLEAR argument.
* gimple-lower-bitint.cc (build_bitint_stmt_ssa_conflicts): Add
CLEAR argument. Call clear on gimple_assign_copy_p rhs1 if lhs
is large/huge bitint unless lhs is not in names.
* tree-ssa-coalesce.cc (build_ssa_conflict_graph): Adjust
build_bitint_stmt_ssa_conflicts caller. Move gimple_assign_copy_p
handling to after the build_bitint_stmt_ssa_conflicts call.
* gcc.dg/torture/bitint-77.c: New test.
|
|
The following testcase is miscompiled, I believe starting with the
PR112941 fix r14-6742. That commit fixed the bitint-55.c testcase.
The m_first initialization for such conversion initializes 2 SSA_NAMEs,
one is PHI result on the loop (m_data[save_data_cnt]) and the other
(m_data[save_data_cnt+1]) is the argument of that PHI from the latch
edge initialized somewhere in the loop. Both of these are used to
propagate sign extension (i.e. either 0 or all ones limb) from the
iteration with the sign bit of a narrower type to following iterations.
The bitint-55.c testcase was ICEing with invalid SSA forms as it was
using unconditionally the PHI argument SSA_NAME even in places which
weren't dominated by that. And the code which was touched is about
handling constant idx, so if e.g. there are nested casts and the
outer one does conditional code based on index comparison with
a particular constant index.
In the following testcase there are 2 nested casts, one from signed
_BitInt(129) to unsigned _BitInt(255) and the outer from unsigned
_BitInt(255) to unsigned _BitInt(256). The m_upward_2limbs case which
is used for handling mergeable arithmetics (like +-|&^ and casts etc.)
one loop iteration handles 2 limbs, the first half the even ones, the
second half the odd ones.
And for these 2 conversions, the special one for the inner conversion
on x86_64 is with index 2 where the sign bit of _BitInt(129) is present,
while for the outer one index 3 where we need to mask off the most
significant bit.
The r14-6742 change started using m_data[save_data_cnt] for all constant
indexes if it is still inside of the loop (and it is sign extension).
But that doesn't work correctly for the case where the inner conversion
produces the sign extension limb in the loop for an even index and
the outer conversion needs to special case the immediately next conversion,
because in that case using the PHI result will see still 0 there rather
than the updated value from the handling of previous limb.
So the following patch special cases this and uses the other SSA_NAME.
Commented IL, trying to lower
_1 = (unsigned _BitInt(255)) y_4(D);
_2 = (unsigned _BitInt(256)) _1;
_3 = _2 + x_5(D);
<retval> = _3;
we were emitting
<bb 3> [local count: 1073741824]:
# _8 = PHI <0(2), _9(12)> // This is the limb index
# _10 = PHI <0(2), _11(12)> // Sign extension limb from inner cast (0 or ~0UL)
# _22 = PHI <0(2), _23(12)> // Overflow bit from addition of previous limb
if (_8 <= 2)
goto <bb 4>; [80.00%]
else
goto <bb 7>; [20.00%]
<bb 4> [local count: 1073741824]:
if (_8 == 2)
goto <bb 6>; [20.00%]
else
goto <bb 5>; [80.00%]
<bb 5> [local count: 1073741824]:
_12 = VIEW_CONVERT_EXPR<unsigned long[3]>(y)[_8]; // Full limbs in y
goto <bb 7>; [100.00%]
<bb 6> [local count: 214748360]:
_13 = MEM <unsigned long> [(_BitInt(129) *)&y + 16B]; // y[2] which
_14 = (<unnamed-signed:1>) _13; // needs to be
_15 = (unsigned long) _14; // sign extended
_16 = (signed long) _15; // to full
_17 = _16 >> 63; // limb
_18 = (unsigned long) _17;
<bb 7> [local count: 1073741824]:
# _19 = PHI <_12(5), _10(3), _15(6)> // Limb to add for result of casts
# _20 = PHI <0(5), _10(3), _18(6)> // Sign extension limb from previous limb
_11 = _20; // PHI _10 argument above
_21 = VIEW_CONVERT_EXPR<unsigned long[4]>(x)[_8];
_24 = .UADDC (_19, _21, _22);
_25 = IMAGPART_EXPR <_24>;
_26 = REALPART_EXPR <_24>;
VIEW_CONVERT_EXPR<unsigned long[4]>(<retval>)[_8] = _26;
_27 = _8 + 1;
if (_27 == 3) // For the outer cast limb 3 is special
goto <bb 11>; [20.00%]
else
goto <bb 8>; [80.00%]
<bb 8> [local count: 1073741824]:
if (_27 < 2)
goto <bb 9>; [80.00%]
else
goto <bb 10>; [20.00%]
<bb 9> [local count: 1073741824]:
_28 = VIEW_CONVERT_EXPR<unsigned long[3]>(y)[_27]; // These are used in full
<bb 10> [local count: 1073741824]:
# _29 = PHI <_28(9), _11(8)>
goto <bb 12>; [100.00%]
<bb 11> [local count: 214748360]:
// And HERE is the actual bug. Using _10 for idx 3 will mean it is always
// zero there and doesn't contain the _18 value propagated to it.
// It should be
// _30 = (<unnamed-unsigned:63>) _11;
// Now if the outer conversion had special iteration say 5, we could
// have used _10 fine here, by that time it already propagates through
// the PHI.
_30 = (<unnamed-unsigned:63>) _10;
_31 = (unsigned long) _30;
<bb 12> [local count: 1073741824]:
# _32 = PHI <_29(10), _31(11)>
_33 = VIEW_CONVERT_EXPR<unsigned long[4]>(x)[_27];
_34 = .UADDC (_32, _33, _25);
_23 = IMAGPART_EXPR <_34>;
_35 = REALPART_EXPR <_34>;
VIEW_CONVERT_EXPR<unsigned long[4]>(<retval>)[_27] = _35;
_9 = _8 + 2;
if (_9 != 4)
goto <bb 3>; [0.05%]
else
goto <bb 13>; [99.95%]
2025-04-11 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/119707
* gimple-lower-bitint.cc (bitint_large_huge::handle_cast): Only use
m_data[save_data_cnt] instead of m_data[save_data_cnt + 1] if
idx is odd and equal to low + 1. Remember tree_to_uhwi (idx) in
a temporary instead of calling the function multiple times.
* gcc.dg/torture/bitint-76.c: New test.
|
|
Both gcc and msvc agree that the struct size should
be 12; gcc is already correct.
Signed-off-by: Jonathan Yong <10walls@gmail.com>
gcc/testsuite/ChangeLog:
PR target/113633
* gcc.dg/bf-ms-attrib.c: Fix expected __ms_struct__ layout
size.
|
|
In r10-4803-g8489e1f45b50600c I'd used POINTER_DIFF_EXPR to subtract
the two pointers involved in an overlap test. I'm not sure whether
I'd specifically chosen that over MINUS_EXPR or not; if so, the only
reason I can think of is that it is probably faster on targets with
PSImode pointers. Regardless, as the PR points out, subtracting
unrelated pointers using POINTER_DIFF_EXPR is undefined behaviour.
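A hedged sketch of the distinction in source terms (not GCC internals):
```c
#include <stdint.h>

/* Subtracting unrelated pointers directly (the POINTER_DIFF_EXPR form)
   is undefined behaviour; subtracting them after converting both to an
   unsigned integer type (the MINUS_EXPR form now used) is well-defined
   modular arithmetic.  */
uintptr_t
distance (char *a, char *b)
{
  /* return a - b;   -- undefined when a and b point into different objects */
  return (uintptr_t) a - (uintptr_t) b;
}
```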
gcc/
PR tree-optimization/119399
* tree-data-ref.cc (create_waw_or_war_checks): Use a MINUS_EXPR
on two converted pointers, rather than converting a POINTER_DIFF_EXPR
on the pointers.
gcc/testsuite/
PR tree-optimization/119399
* gcc.dg/vect/pr119399.c: New test.
|
|
r12-2601 has added this define_insn_and_split and corresponding
(define_insn ""
[(set (reg:CCZ CC_REG)
(eq (zero_extract:HSI (match_operand:HSI 0 "register_operand" "r")
(const_int 1)
(match_operand 1 "const_int_operand" "n"))
(const_int 0)))]
"INTVAL (operands[1]) < 16"
"btst %Z1,%Y0"
[(set_attr "length" "2")])
pattern into which the define_insn_and_split wants to split, in addition
to a conditional jump.
But as can be seen, the btst define_insn uses the HSI mode iterator while
the define_insn_and_split uses QHSI, so for QImode it splits into something that
can't be recognized.
This was probably latent since r12-2601 and on the attached testcase
is reproducible starting with r15-1945 - a late combiner change.
2025-04-09 Jakub Jelinek <jakub@redhat.com>
PR target/119664
* config/h8300/jumpcall.md (bit test and jump define_insn_and_split):
Use HSI iterator rather than QHSI.
* gcc.dg/pr119664.c: New test.
|
|
Warnings about pointer sizes cause the test to fail
incorrectly. A dummy return value is also added to
set_marker_internal for completeness to suppress a
-Wreturn-type warning even though gcc does not issue
it by default.
Signed-off-by: Jonathan Yong <10walls@gmail.com>
gcc/testsuite/ChangeLog:
PR analyzer/113253
* gcc.dg/analyzer/deref-before-check-pr113253.c:
(ptrdiff_t): Use stddef.h type.
(uintptr_t): Ditto.
(EMACS_INT): Ditto.
(set_marker_internal): Add dummy 0 to suppress -Wreturn-type.
|
|
The following testcase ICEs after emitting one pedwarn (about using
__VA_ARGS__ in a place where it shouldn't be used) and one error.
The error is emitted by _cpp_save_parameter where it sees the node
has been used already earlier. But unlike the other _cpp_save_parameter
caller, which does goto out; if it returns false, this call with explicit
__VA_ARGS__ doesn't, and as it increments the number of parameters etc. after
the error, we then try to unsave it twice.
The following patch fixes it by doing the goto out in that case too,
the macro will then not be considered a variable-arguments macro,
but for error recovery I think that is fine.
The other option would be before the other _cpp_save_parameter caller
check if the node is pfile->spec_nodes.n__VA_ARGS__ and in that case
also error and goto out, but that seems more expensive than this for
the common case that the macro definition is correct.
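For illustration, a hypothetical reproducer of this shape (not the actual pr118674.c):
```c
/* Naming __VA_ARGS__ as an explicit macro parameter draws the pedwarn,
   and the subsequent ... makes _cpp_save_parameter see __VA_ARGS__ a
   second time and report the error that used to be followed by the
   double unsave and ICE.  */
#define M(__VA_ARGS__, ...) x
```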
2025-04-09 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/118674
* macro.cc (parse_params) <case CPP_ELLIPSIS>: If _cpp_save_parameter
failed for __VA_ARGS__, goto out.
* gcc.dg/cpp/pr118674.c: New test.
|
|
In previous years, I've tried to update the guality tests
so that they give clean results on aarch64-linux-gnu with
a recent version of GDB. This patch does the same thing for
GCC 15. The version of GDB I used was 16.2.
As before, there are no PRs for the XFAILs. The idea is that
anyone who is interested in working in this area can see the
current XFAILs by grepping the tests.
gcc/testsuite/
* gcc.dg/guality/pr36728-3.c: Update XFAILs for aarch64.
* gcc.dg/guality/pr41353-1.c: Likewise.
* gcc.dg/guality/pr54693-2.c: Likewise.
* gcc.dg/guality/pr68860-1.c: Likewise.
* gcc.dg/guality/pr68860-2.c: Likewise.
* gcc.dg/guality/sra-1.c: Likewise.
* gcc.dg/guality/vla-1.c: Likewise.
|
|
The aarch64_sve256_hw line forced the vector length, but didn't force
SVE itself. This meant that the associated:
/* { dg-final { scan-tree-dump "MASK_SCATTER_STORE" "vect" { target aarch64_sve256_hw } } } */
wouldn't always fire. I imagine this was tested with SVE enabled by
default, which would have masked the problem.
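A hedged illustration of the kind of change (the exact options here are an assumption, not quoted from the patch):
```c
/* Force SVE itself alongside the 256-bit vector length: */
/* { dg-additional-options "-march=armv8.2-a+sve -msve-vector-bits=256" { target aarch64_sve256_hw } } */
```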
gcc/testsuite/
* gcc.dg/vect/pr99102.c: Force SVE when forcing the vector length.
|
|
The checking assertion added for PR118765 did not take into account
that add_decl_expr can change TYPE_NAME to a TYPE_DECL with no name
for certain cases of variably modified types. This also implies that we
might sometimes not reliably detect the absence of a tag when only
considering TYPE_NAME. This patch introduces a new helper function
c_type_tag to reliable compute the tag for a tagged types and uses it
for code where the switch to C23 may cause regressions.
PR c/119612
gcc/c/ChangeLog:
* c-tree.h (c_type_tag): Add prototype.
* c-typeck.cc (c_type_tag): New function.
(tagged_types_tu_compatible_p, composite_type_internal): Use
c_type_tag.
* c-decl.cc (c_struct_hasher::hash, previous_tag): Use c_type_tag.
gcc/testsuite/ChangeLog:
* gcc.dg/gnu23-tag-6.c: New test.
* gcc.dg/pr119612.c: New test.
|
|
The following testcase is miscompiled by delete_trivially_dead_insns,
latently since r0-6313, actually since r15-1575.
The problem is in that r0-6313 change, which made count_reg_usage not
count uses of the pseudo which the containing SET sets. That is needed
so we can delete those instructions as trivially dead if they are really
dead, but has the following problem. After fwprop proper we have:
(insn 7 2 8 2 (set (reg/v:DI 101 [ g ])
(const_int -1 [0xffffffffffffffff])) "pr119594.c":8:10 95 {*movdi_internal}
(nil))
...
(insn 26 24 27 7 (set (reg:DI 104 [ g ])
(zero_extend:DI (subreg:SI (reg/v:DI 101 [ g ]) 0))) "pr119594.c":11:8 175 {*zero_extendsidi2}
(expr_list:REG_EQUAL (const_int 4294967295 [0xffffffff])
(expr_list:REG_DEAD (reg/v:DI 101 [ g ])
(nil))))
(insn 27 26 28 7 (set (reg/v:DI 101 [ g ])
(zero_extend:DI (subreg:SI (reg/v:DI 101 [ g ]) 0))) "pr119594.c":11:8 175 {*zero_extendsidi2}
(expr_list:REG_EQUAL (const_int 4294967295 [0xffffffff])
(expr_list:REG_UNUSED (reg/v:DI 101 [ g ])
(nil))))
and nothing else uses or sets the 101 and 104 pseudos. The subpass doesn't
look at REG_UNUSED or REG_DEAD notes (correctly, as they aren't guaranteed
to be accurate). The last change in the IL was forward propagation of
(reg:DI 104 [ g ]) value into the following insn.
Now, count_reg_usage doesn't count anything on insn 7, the SET_DEST is a
reg, so we don't count that and SET_SRC doesn't contain any regs.
On insn 26 it counts one usage of pseudo 101 (so counts[101] = 1) and
on insn 27 since r0-6313 doesn't count anything as that insn sets
pseudo 101 to something that uses it, it isn't a side-effect instruction
and can't throw.
Now, after counting reg usages the subpass walks the IL from end to start,
sees insn 27, counts[101] is non-zero, so insn_live_p is true, nothing is
deleted. Then sees insn 26, counts[104] is zero, insn_live_p is false,
we delete the insn and decrease associated counts, in this case counts[101]
becomes zero. And finally later we process insn 7, counts[101] is now zero,
insn_live_p is false, we delete the insn (and decrease associated counts,
which aren't any).
Except that this resulted in insn 27 staying in the IL but using a REG
which is no longer set (and worse, having a REG_EQUAL note of something we
need later in the same bb, so we then assume pseudo 101 contains 0xffffffff,
which it no longer does).
Now, if insn 26 was after insn 27, this would work just fine, we'd first
delete that and then insn 27 and then insn 7, which is why most of the time
it happens to work fine.
The following patch fixes it by detecting the cases where there are
self-references after a pseudo has been used at least once outside of the
self-references or just as REG_P SET_DEST and in that case only increases
the count for the pseudo, making it not trivially deletable.
2025-04-08 Eric Botcazou <botcazou@adacore.com>
Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/119594
* cse.cc (count_reg_usage): Count even x == dest regs if they have
non-zero counts already and incr is positive.
* gcc.dg/pr119594.c: New test.
|
|
When the whole shift is invariant but the shift amount needs
to be converted and a vector shift is used, we can mess up the placement
of vector stmts because we do not make SLP scheduling aware of
the need to insert code for it. The following mitigates this
by more conservative placement of such code in vectorizable_shift.
PR tree-optimization/119640
* tree-vect-stmts.cc (vectorizable_shift): Always insert code
for one of our SLP operands before the code for the vector
shift itself.
* gcc.dg/vect/pr119640.c: New testcase.
|
|
r8-3988-g356fcc67fba52b added code to turn return statements into __builtin_unreachable
calls inside noreturn functions but only while optimizing. Since -funreachable-traps
was added (r13-1204-gd68d3664253696), it is a good idea to move over to using
__builtin_unreachable (and the trap version with this option, which defaults to on at -O0 and -Og)
instead of just falling through, even at -O0.
This also fixes a regression when inlining a noreturn function that returns at -O0 (due to always_inline)
as we would get an empty bb which has no successor edge instead of one with a call to __builtin_unreachable.
I also noticed there was no testcase testing the warning about __builtin_return inside a noreturn function
so I added a testcase there.
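For illustration, a hypothetical sketch of the -O0 regression (not the actual builtin-apply5.c):
```c
/* Inlining this noreturn always_inline function at -O0 used to leave an
   empty basic block with no successor edge; the return is now replaced
   with a call to __builtin_unreachable (or a trap under
   -funreachable-traps).  */
__attribute__ ((noreturn, always_inline))
static inline void stop (int c)
{
  if (c)
    __builtin_exit (0);
  /* control may fall off the end of this noreturn function */
}
```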
Bootstrapped and tested on x86_64-linux-gnu.
PR ipa/119599
gcc/ChangeLog:
* tree-cfg.cc (pass_warn_function_return::execute): Turn return statements always
into __builtin_unreachable calls.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr119599-1.c: New test.
* gcc.dg/builtin-apply5.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
Using specific SSA names in pattern matching in `dg-final' makes tests
"unstable", in that changes in passes prior to the pass whose dump is
analyzed in the particular test may change the numbering of the SSA
variables, causing the test to start failing spuriously.
We thus switch from specific SSA names to the use of a multi-line
regular expression making use of capture groups for matching particular
variables across different statements, ensuring the test will pass
more consistently across different versions of GCC.
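For example, a schematic directive using this technique (illustrative only, not the directive from the patch):
```c
/* The capture group binds whichever SSA name holds the masked call
   result, and the back-reference \1 requires the same name in the
   following statement, so SSA renumbering no longer matters.  */
/* { dg-final { scan-tree-dump {(_\d+) = \.MASK_CALL \(foo[^\n]*\n[^\n]*\1} "ifcvt" } } */
```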
PR testsuite/118597
gcc/testsuite/ChangeLog:
* gcc.dg/vect/vect-fncall-mask.c: Update test directives.
|
|
Since r15-7878-ge1c49f413c8, these tests appear as XPASS on aarch64,
so we can remove the xfails introduced by r12-102-gf31ddad8ac8f11.
gcc/testsuite/ChangeLog:
* gcc.dg/guality/pr90074.c: Remove xfail for aarch64.
* gcc.dg/guality/pr90716.c: Likewise.
|
|
Some of the tests regressed with a fix for the vectorization of
shifts. The riscv cost models need to be adjusted to avoid the
unprofitable optimization. The failure of these tests has been known
since 2024-03-13, without a forthcoming fix, so I suggest we consider
it expected by now. Adjust the tests to reflect that expectation.
for gcc/testsuite/ChangeLog
PR tree-optimization/113281
* gcc.dg/vect/costmodel/riscv/rvv/pr113281-1.c: XFAIL.
* gcc.dg/vect/costmodel/riscv/rvv/pr113281-2.c: Likewise.
* gcc.dg/vect/costmodel/riscv/rvv/pr113281-5.c: Likewise.
|
|
For the same reasons that affect alpha and other targets,
gcc.dg/tree-ssa/ssa-dom-cse-2.c fails to be optimized to the expected
return statement: the array initializer is vectorized into pairs, and
DOM cannot see through that.
Add riscv*-*-* to the list of affected lp64 platforms. riscv32 is
not affected.
for gcc/testsuite/ChangeLog
* gcc.dg/tree-ssa/ssa-dom-cse-2.c: XFAIL on riscv lp64.
|
|
The following testcase ICEs because c_fully_fold isn't performed on the
arguments of __sanitizer_ptr_{sub,cmp} builtins and so e.g.
C_MAYBE_CONST_EXPR can leak into the gimplifier where it ICEs.
2025-04-02 Jakub Jelinek <jakub@redhat.com>
PR c/119582
* c-typeck.cc (pointer_diff, build_binary_op): Call c_fully_fold on
__sanitizer_ptr_sub or __sanitizer_ptr_cmp arguments.
* gcc.dg/asan/pr119582.c: New test.
|
|
The following reverts parts of r15-8047, which assumed the alignment
analysis for VMAT_STRIDED_SLP is correct and used aligned accesses
where it allows them. As the PR shows, this analysis is still incorrect,
so revert back to assuming we got it wrong.
PR tree-optimization/119586
* tree-vect-stmts.cc (vectorizable_load): Assume we got
alignment analysis for VMAT_STRIDED_SLP wrong.
(vectorizable_store): Likewise.
* gcc.dg/vect/pr119586.c: New testcase.
|
|
mtrr_ioctl() uses long and casts it to a pointer. Fix warnings
for llp64 platforms.
Signed-off-by: Jonathan Yong <10walls@gmail.com>
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/torture/switch-3.c: Fix llp64 warnings.
|
|
This is a partial step towards fixing that PR.
For musttail recursive calls which have non-is_gimple_reg_type typed
parameters, the only case we've handled was if the exact parameter
was passed through (perhaps modified, but still the same PARM_DECL).
That isn't necessary, we can copy the argument to the parameter as well
(just need to watch for the use of the parameter in later arguments,
say musttail recursive call which swaps 2 structure arguments).
The patch attempts to play safe and punts if any of the parameters are
addressable (like we do for all normal tail calls and tail recursions,
except for musttail in the posted unreviewed patch).
With this patch (at least when early inlining isn't done on a not yet
optimized body) inlining should see an already tail-recursion-optimized
body and will not have problems with SRA breaking musttail.
This version of the patch limits this to musttail tail recursions,
with the intent to enable it for all tail recursions in GCC 16.
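For illustration, a hypothetical sketch of the newly handled case (not the actual pr119493-1.c):
```c
/* A musttail recursive call that swaps two structure
   (non-is_gimple_reg_type) arguments; the arguments are now copied to
   the parameters, through temporaries where a parameter is used in a
   later argument, instead of the call being punted on.  */
struct S { int a[8]; };

int f (struct S x, struct S y, int n)
{
  if (n == 0)
    return x.a[0];
  __attribute__ ((musttail)) return f (y, x, n - 1);
}
```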
2025-04-01 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/119493
* tree-tailcall.cc (find_tail_calls): Don't punt on tail recursion
if some arguments don't have is_gimple_reg_type, only punt if they
have non-POD types, or volatile, or addressable or (for now) it is
not a musttail call. Set tailr_arg_needs_copy in those cases too.
(eliminate_tail_call): Copy call arguments to params if they don't
have is_gimple_reg_type, use temporaries if the argument is used
later.
(tree_optimize_tail_calls_1): Skip !is_gimple_reg_type
tailr_arg_needs_copy parameters. Formatting fix.
* gcc.dg/pr119493-1.c: New test.
|
|
The following makes sure to reject the attempts to emulate a vector
gather when the discovered index vector type is a vector mask.
PR tree-optimization/119534
* tree-vect-stmts.cc (get_load_store_type): Reject
VECTOR_BOOLEAN_TYPE_P offset vector type for emulated gathers.
* gcc.dg/vect/pr119534.c: New testcase.
|
|
While working on the previous tailc patch, I've noticed the following
problem.
The testcase below fails, because we decide to tail recursion optimize
the call, but tail recursion (as documented in tree-tailcall.cc) needs to
add some result multiplication and/or addition if any tail recursion uses an
accumulator, which is added right before the return.
So, if there are musttail non-recursive calls in the function, successful
tail recursion optimization will mean we'll later error on the musttail
calls. musttail recursive calls are ok, those would be tail recursion
optimized.
So, the following patch punts on all tail recursion optimizations that
need accumulators (add and/or mult) if there is at least one non-recursive
musttail call.
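For illustration, a hypothetical sketch (not the actual pr119493-2.c):
```c
/* The multiplication makes the recursive call a tail recursion that
   needs a multiply accumulator, while the call to g is a non-recursive
   musttail call; optimizing the recursion would insert accumulator code
   before every return and make the musttail call non-tail, so such
   candidates are now ignored.  */
int g (int);

int f (int n)
{
  if (n == 0)
    __attribute__ ((musttail)) return g (n);
  return n * f (n - 1);
}
```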
2025-04-01 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/119493
* tree-tailcall.cc (tree_optimize_tail_calls_1): Ignore tail recursion
candidates which need accumulators if there is at least one musttail
non-recursive call.
* gcc.dg/pr119493-2.c: New test.
|
|
This resolves all instances of PR119369
"GCN: weak undefined symbols -> execution test FAIL, 'HSA_STATUS_ERROR_VARIABLE_UNDEFINED'";
for all affected test cases, the execution test status progresses FAIL -> PASS.
This however also causes a small number of (expected) regressions, very similar
to GCC/nvptx:
[-PASS:-]{+FAIL:+} g++.dg/abi/pure-virtual1.C -std=c++17 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.dg/abi/pure-virtual1.C -std=c++26 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.dg/abi/pure-virtual1.C -std=c++98 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C -std=c++11 scan-assembler .weak[ \t]*_?_ZTH11derived_obj
[-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C -std=c++11 scan-assembler .weak[ \t]*_?_ZTH13container_obj
[-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C -std=c++11 scan-assembler .weak[ \t]*_?_ZTH8base_obj
PASS: g++.dg/cpp0x/pr84497.C -std=c++11 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C -std=c++17 scan-assembler .weak[ \t]*_?_ZTH11derived_obj
[-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C -std=c++17 scan-assembler .weak[ \t]*_?_ZTH13container_obj
[-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C -std=c++17 scan-assembler .weak[ \t]*_?_ZTH8base_obj
PASS: g++.dg/cpp0x/pr84497.C -std=c++17 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C -std=c++26 scan-assembler .weak[ \t]*_?_ZTH11derived_obj
[-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C -std=c++26 scan-assembler .weak[ \t]*_?_ZTH13container_obj
[-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C -std=c++26 scan-assembler .weak[ \t]*_?_ZTH8base_obj
PASS: g++.dg/cpp0x/pr84497.C -std=c++26 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.dg/ext/weak2.C -std=gnu++17 scan-assembler weak[^ \t]*[ \t]_?_Z3foov
PASS: g++.dg/ext/weak2.C -std=gnu++17 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.dg/ext/weak2.C -std=gnu++26 scan-assembler weak[^ \t]*[ \t]_?_Z3foov
PASS: g++.dg/ext/weak2.C -std=gnu++26 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.dg/ext/weak2.C -std=gnu++98 scan-assembler weak[^ \t]*[ \t]_?_Z3foov
PASS: g++.dg/ext/weak2.C -std=gnu++98 (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/attr-weakref-1.c (test for excess errors)
[-FAIL:-]{+UNRESOLVED:+} gcc.dg/attr-weakref-1.c [-execution test-]{+compilation failed to produce executable+}
@@ -131211,25 +131211,25 @@ PASS: gcc.dg/weak/weak-1.c scan-assembler weak[^ \t]*[ \t]_?c
PASS: gcc.dg/weak/weak-1.c scan-assembler weak[^ \t]*[ \t]_?d
PASS: gcc.dg/weak/weak-1.c scan-assembler weak[^ \t]*[ \t]_?e
PASS: gcc.dg/weak/weak-1.c scan-assembler weak[^ \t]*[ \t]_?g
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-1.c scan-assembler weak[^ \t]*[ \t]_?j
PASS: gcc.dg/weak/weak-1.c scan-assembler-not weak[^ \t]*[ \t]_?i
PASS: gcc.dg/weak/weak-12.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-12.c scan-assembler weak[^ \t]*[ \t]_?foo
PASS: gcc.dg/weak/weak-15.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-15.c scan-assembler weak[^ \t]*[ \t]_?a
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-15.c scan-assembler weak[^ \t]*[ \t]_?c
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-15.c scan-assembler weak[^ \t]*[ \t]_?d
PASS: gcc.dg/weak/weak-15.c scan-assembler-not weak[^ \t]*[ \t]_?b
PASS: gcc.dg/weak/weak-16.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-16.c scan-assembler weak[^ \t]*[ \t]_?kallsyms_token_index
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-16.c scan-assembler weak[^ \t]*[ \t]_?kallsyms_token_table
PASS: gcc.dg/weak/weak-2.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-2.c scan-assembler weak[^ \t]*[ \t]_?ffoo1a
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-2.c scan-assembler weak[^ \t]*[ \t]_?ffoo1b
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-2.c scan-assembler weak[^ \t]*[ \t]_?ffoo1c
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-2.c scan-assembler weak[^ \t]*[ \t]_?ffoo1e
PASS: gcc.dg/weak/weak-2.c scan-assembler-not weak[^ \t]*[ \t]_?ffoo1d
PASS: gcc.dg/weak/weak-3.c (test for warnings, line 58)
PASS: gcc.dg/weak/weak-3.c (test for warnings, line 73)
PASS: gcc.dg/weak/weak-3.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-3.c scan-assembler weak[^ \t]*[ \t]_?ffoo1a
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-3.c scan-assembler weak[^ \t]*[ \t]_?ffoo1b
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-3.c scan-assembler weak[^ \t]*[ \t]_?ffoo1c
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-3.c scan-assembler weak[^ \t]*[ \t]_?ffoo1e
PASS: gcc.dg/weak/weak-3.c scan-assembler weak[^ \t]*[ \t]_?ffoo1f
PASS: gcc.dg/weak/weak-3.c scan-assembler weak[^ \t]*[ \t]_?ffoo1g
PASS: gcc.dg/weak/weak-3.c scan-assembler-not weak[^ \t]*[ \t]_?ffoo1d
PASS: gcc.dg/weak/weak-4.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1a
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1b
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1c
PASS: gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1d
PASS: gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1e
PASS: gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1f
@@ -131267,16 +131267,16 @@ PASS: gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1i
PASS: gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1j
PASS: gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1k
PASS: gcc.dg/weak/weak-5.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1a
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1b
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1c
PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1d
PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1e
PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1f
PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1g
PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1h
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1i
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1j
PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1k
PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1l
These get 'dg-xfail-if'ed or 'dg-skip-if'ed, (mostly) similar to GCC/nvptx.
PR target/119369
gcc/
* config/gcn/gcn-protos.h (gcn_asm_weaken_decl): Declare.
* config/gcn/gcn.cc (gcn_asm_weaken_decl): New.
* config/gcn/gcn-hsa.h (ASM_WEAKEN_DECL): '#define' to this.
gcc/testsuite/
* g++.dg/abi/pure-virtual1.C: 'dg-xfail-if' GCN.
* g++.dg/cpp0x/pr84497.C: 'dg-skip-if' GCN.
* g++.dg/ext/weak2.C: Likewise.
* gcc.dg/attr-weakref-1.c: Likewise.
* gcc.dg/weak/weak-1.c: Likewise.
* gcc.dg/weak/weak-12.c: Likewise.
* gcc.dg/weak/weak-15.c: Likewise.
* gcc.dg/weak/weak-16.c: Likewise.
* gcc.dg/weak/weak-2.c: Likewise.
* gcc.dg/weak/weak-3.c: Likewise.
* gcc.dg/weak/weak-4.c: Likewise.
* gcc.dg/weak/weak-5.c: Likewise.
|
|
The following disables tail recursion optimization when fixed-point
types are involved as we cannot generate -1 for all fixed-point
types.
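For illustration, a hypothetical sketch for a target with fixed-point types (not the actual pr119532.c):
```c
/* Turning the negated recursion into a loop would need a multiplicative
   accumulator holding -1, which not all fixed-point types can
   represent, so process_assignment now FAILs for fixed-point typed
   functions.  */
long _Fract f (long _Fract x, int n)
{
  if (n == 0)
    return x;
  return -f (x, n - 1);
}
```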
PR tree-optimization/119532
* tree-tailcall.cc (process_assignment): FAIL for fixed-point
typed functions.
* gcc.dg/torture/pr119532.c: New testcase.
|