[PR119291]
The following testcase is miscompiled on x86_64-linux at -O2 by the combiner.
We have from earlier combinations
(insn 22 21 23 4 (set (reg:SI 104 [ _7 ])
(const_int 0 [0])) "pr119291.c":25:15 96 {*movsi_internal}
(nil))
(insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
(reg/v:SI 116 [ e ])) 96 {*movsi_internal}
(expr_list:REG_DEAD (reg/v:SI 116 [ e ])
(nil)))
(note 24 23 25 4 NOTE_INSN_DELETED)
(insn 25 24 26 4 (parallel [
(set (reg:CCZ 17 flags)
(compare:CCZ (neg:SI (reg:SI 104 [ _7 ]))
(const_int 0 [0])))
(set (reg/v:SI 116 [ e ])
(neg:SI (reg:SI 104 [ _7 ])))
]) "pr119291.c":26:13 977 {*negsi_2}
(expr_list:REG_DEAD (reg:SI 104 [ _7 ])
(nil)))
(note 26 25 27 4 NOTE_INSN_DELETED)
(insn 27 26 28 4 (set (reg:DI 128 [ _9 ])
(ne:DI (reg:CCZ 17 flags)
(const_int 0 [0]))) "pr119291.c":26:13 1447 {*setcc_di_1}
(expr_list:REG_DEAD (reg:CCZ 17 flags)
(nil)))
and try_combine is called on i3 25 and i2 22 (for the second time)
and reaches the hunk being patched with simplified i3
(insn 25 24 26 4 (parallel [
(set (pc)
(pc))
(set (reg/v:SI 116 [ e ])
(const_int 0 [0]))
]) "pr119291.c":28:13 977 {*negsi_2}
(expr_list:REG_DEAD (reg:SI 104 [ _7 ])
(nil)))
and
(insn 22 21 23 4 (set (reg:SI 104 [ _7 ])
(const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal}
(nil))
Now, the try_combine code there attempts to split two independent
sets in newpat by moving one of them to i2.
And among other tests it checks
!modified_between_p (SET_DEST (set1), i2, i3)
which is certainly needed, if there would be say
(set (reg/v:SI 116 [ e ]) (const_int 42 [0x2a]))
in between i2 and i3, we couldn't do that, as that set would overwrite
the value set by set1 we want to move to the i2 position.
But in this case pseudo 116 isn't set in between i2 and i3, but used
(and additionally there is a REG_DEAD note for it).
This is equally bad for the move, because while the i3 insn
and later will see the pseudo value that we set, the insn in between
which uses the value will see a different value from the one that
it should see.
As we don't check for that, in the end try_combine succeeds and
changes the IL to:
(insn 22 21 23 4 (set (reg/v:SI 116 [ e ])
(const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal}
(nil))
(insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
(reg/v:SI 116 [ e ])) 96 {*movsi_internal}
(expr_list:REG_DEAD (reg/v:SI 116 [ e ])
(nil)))
(note 24 23 25 4 NOTE_INSN_DELETED)
(insn 25 24 26 4 (set (pc)
(pc)) "pr119291.c":28:13 2147483647 {NOOP_MOVE}
(nil))
(note 26 25 27 4 NOTE_INSN_DELETED)
(insn 27 26 28 4 (set (reg:DI 128 [ _9 ])
(const_int 0 [0])) "pr119291.c":28:13 95 {*movdi_internal}
(nil))
(note, the i3 got turned into a nop and try_combine also modified insn 27).
The following patch replaces the modified_between_p
tests with reg_used_between_p, my understanding is that
modified_between_p is a subset of reg_used_between_p, so one
doesn't need both.
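To make the set-versus-use distinction concrete, here is a toy C model. It is
not GCC code: struct toy_insn and the toy_* helpers are made up for this
sketch, and each toy insn sets and uses at most one pseudo. It only shows why
a check for intervening sets misses an intervening use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an insn: it sets at most one pseudo and uses at most
   one pseudo; 0 means "none".  Not GCC's rtx representation.  */
struct toy_insn
{
  int uid;
  int sets;
  int uses;
};

/* True if REG is set by an insn strictly between positions I2 and I3.  */
static bool
toy_set_between (const struct toy_insn *seq, size_t i2, size_t i3, int reg)
{
  assert (i2 < i3);
  for (size_t i = i2 + 1; i < i3; i++)
    if (seq[i].sets == reg)
      return true;
  return false;
}

/* True if REG is used by an insn strictly between positions I2 and I3.  */
static bool
toy_used_between (const struct toy_insn *seq, size_t i2, size_t i3, int reg)
{
  assert (i2 < i3);
  for (size_t i = i2 + 1; i < i3; i++)
    if (seq[i].uses == reg)
      return true;
  return false;
}

/* In this model, hoisting a set of REG from position I3 up to position I2
   is only safe if nothing in between sets REG or uses REG.  */
static bool
toy_can_hoist_set (const struct toy_insn *seq, size_t i2, size_t i3, int reg)
{
  return !toy_set_between (seq, i2, i3, reg)
	 && !toy_used_between (seq, i2, i3, reg);
}
```

For the sequence { {22, 104, 0}, {23, 117, 116}, {25, 116, 104} }, which mirrors
insns 22, 23 and 25 above, toy_set_between for pseudo 116 is false while
toy_used_between is true, so toy_can_hoist_set rejects the move.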
Looking at this some more today, I think we should special case
set_noop_p because that can be put into i2 (except for the JUMP_P
violations), currently both modified_between_p (pc_rtx, i2, i3)
and reg_used_between_p (pc_rtx, i2, i3) return false.
I'll post a patch incrementally for that (but that feels like
new optimization, so probably not something that should be backported).
On Tue, Apr 01, 2025 at 11:27:25AM +0200, Richard Biener wrote:
> Can we constrain SET_DEST (set1/set0) to a REG_P in combine? Why
> does the comment talk about memory?
I was worried about making too risky changes this late in stage4
(and especially also for backports). Most of this code is 1992-ish.
I think many of the functions are just misnamed, the reg_ in there doesn't
match what those functions do (bet they initially supported just REGs
and later on support for other kinds of expressions was added, but haven't
done git archeology to prove that).
What we know for sure is:
&& GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != ZERO_EXTRACT
&& GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != STRICT_LOW_PART
&& GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != ZERO_EXTRACT
&& GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != STRICT_LOW_PART
that is checked earlier in the condition.
Then it calls
&& ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)),
XVECEXP (newpat, 0, 0))
&& ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)),
XVECEXP (newpat, 0, 1))
While it has reg_* in it, that function mostly calls reg_overlap_mentioned_p
which is also misnamed, that function handles just fine all of
REG, MEM, SUBREG of REG, (SUBREG of MEM not, see below), ZERO_EXTRACT,
STRICT_LOW_PART, PC and even some further cases.
So, IMHO SET_DEST (set0) or SET_DEST (set1) can certainly be a REG, SUBREG
of REG, PC (at least the REG and PC cases are triggered on the testcase)
and quite possibly also MEM (SUBREG of MEM not, see below).
Now, the code uses !modified_between_p (SET_SRC (set{1,0}), i2, i3) where that
function for constants just returns false, for PC returns true, for REG
returns reg_set_between_p, for MEM recurses on the address, for
MEM_READONLY_P otherwise returns false, otherwise checks using alias.cc code
whether the memory could have been modified in between, for all other
rtxes recurses on the subrtxes. This part didn't change in my patch.
I've only changed those
- && !modified_between_p (SET_DEST (set{1,0}), i2, i3)
+ && !reg_used_between_p (SET_DEST (set{1,0}), i2, i3)
where the former has been described above and clearly handles all of
REG, SUBREG of REG, PC, MEM and SUBREG of MEM among other things.
The replacement reg_used_between_p calls reg_overlap_mentioned_p on each
instruction in between i2 and i3. So, there is clearly a difference
in behavior if SET_DEST (set{1,0}) is pc_rtx, in that case modified_between_p
returns unconditionally true even if there are no instructions in between,
but reg_used_between_p if there are no non-debug insns in between returns
false. Sorry for missing that, guess I should check for that (with the
exception of the noop moves which are often (set (pc) (pc)) and handled
by the incremental patch). In fact not just that, reg_used_between_p
will only return true for PC if it is mentioned anywhere in the insns
in between.
Anyway, except for that, for REG it calls refers_to_regno_p
and so should find any occurrences of any of the REG or parts of it for hard
registers, for MEM returns true if it sees any MEMs in insns in between
(conservatively), for SUBREGs apparently it relies on it being SUBREG of REG
(so doesn't handle SUBREG of MEM) and handles SUBREG of REG like the
SUBREG_REG, PC I've already described.
Now, because reg_overlap_mentioned_p doesn't handle SUBREG of MEM, I think
already the initial
&& ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)),
XVECEXP (newpat, 0, 0))
&& ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)),
XVECEXP (newpat, 0, 1))
calls would have failed --enable-checking=rtl or would have misbehaved, so
I think there is no need to check for it further.
To your question why I don't use reg_referenced_p, that is because
reg_referenced_p is something to call on one insn pattern, while
reg_used_between_p is pretty much that on all insns in between two
instructions (excluding the boundaries).
So, I think it would be safer to add && SET_DEST (set{1,0}) != pc_rtx
checks to preserve former behavior, like in the following version.
2025-04-01 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/119291
* combine.cc (try_combine): For splitting of PARALLEL with
2 independent SETs into i2 and i3 sets check reg_used_between_p
of the SET_DESTs rather than just modified_between_p.
* gcc.c-torture/execute/pr119291.c: New test.
|
|
The following testcase is miscompiled since r15-8478 but latently already
since my r11-5756 and r11-6631 changes.
The r11-5756 change was
https://gcc.gnu.org/pipermail/gcc-patches/2020-December/561164.html
which changed the splitters to immediately throw away the masking.
And the r11-6631 change was an optimization to recognize
(set (zero_extract:HI (...) (const_int 1) (...)) (const_int 1))
as btr.
The problem is their interaction. x86 is not a SHIFT_COUNT_TRUNCATED
target, so the masking needs to be explicit in the IL.
And combine.cc (make_field_assignment) has since 1992 optimizations
which try to optimize x &= (-2 r<< y) into zero_extract (x) = 0.
Now, such an optimization is fine if y has not been masked or if the
chosen zero_extract has the same mode as the rotate (or it recognizes
something with a left shift too). IMHO such optimization is invalid
for SHIFT_COUNT_TRUNCATED targets because we explicitly say that
the masking of the shift/rotate counts is redundant there and doesn't
need to be part of the IL (I have a patch for that, but because it
is just latent, I'm not sure it needs to be posted for gcc 15 (and
also am not sure if it should punt or add operand masking just in case)).
x86 is not SHIFT_COUNT_TRUNCATED though and so even fixing combine
not to do that for SHIFT_COUNT_TRUNCATED targets doesn't help, and we don't
have QImode insv, so it is optimized into HImode insertions. Now,
if the y in x &= (-2 r<< y) wasn't masked in any way, turning it into
HImode btr is just fine, but if it was x &= (-2 r<< (y & 7)) and we just
decided to throw away the masking, using btr changes the behavior on it
and causes e2fsprogs and sqlite miscompilations.
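The difference can be reproduced in plain C. The sketch below is only a model
of the two behaviours, not the generated code: rotl8, clear_bit_masked and
clear_bit_himode_unmasked are names invented here, and the 16-bit bit reset is
assumed to take its count modulo 16.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate an 8-bit value left by N, with 0 <= N < 8.  */
static uint8_t
rotl8 (uint8_t v, unsigned n)
{
  assert (n < 8);
  return (uint8_t) ((v << n) | (v >> ((8 - n) & 7)));
}

/* Source semantics: x &= (-2 r<< (y & 7)) in 8 bits, which clears
   bit (y & 7) of x.  */
static uint8_t
clear_bit_masked (uint8_t x, unsigned y)
{
  return x & rotl8 (0xFE, y & 7);
}

/* Model of a 16-bit bit reset after the "& 7" has been thrown away:
   the count is assumed to be taken modulo 16 instead of modulo 8.  */
static uint8_t
clear_bit_himode_unmasked (uint8_t x, unsigned y)
{
  uint16_t wide = x;
  wide &= (uint16_t) ~(1u << (y & 15));
  return (uint8_t) wide;
}
```

For y == 3 both functions clear bit 3. For y == 8 the masked form clears bit 0,
while the 16-bit model clears bit 8, which lies outside the 8-bit value and so
leaves x unchanged.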
So IMHO on !SHIFT_COUNT_TRUNCATED targets, we need to keep the maskings
explicit in the IL, either at least for the duration of the combine pass
as does the following patch (where combine is the only known pass to have
such transformation), or even keep it until final pass in case there are
some later optimizations that would also need to know whether there was
explicit masking or not and with what mask. The latter change would be
much larger.
The following patch just reverts the r11-5756 change and adds a testcase.
2025-03-25 Jakub Jelinek <jakub@redhat.com>
PR target/96226
PR target/119428
* config/i386/i386.md (splitter after *<rotate_insn><mode>3_mask,
splitter after *<rotate_insn><mode>3_mask_1): Revert 2020-12-05
changes.
* gcc.c-torture/execute/pr119428.c: New test.
|
|
The following testcase is miscompiled on powerpc64le-linux starting with
r15-6777. During combine we see:
(set (reg:SI 134)
(ior:SI (ge:SI (reg:CCFP 128)
(const_int 0 [0]))
(lt:SI (reg:CCFP 128)
(const_int 0 [0]))))
The simplify_logical_relational_operation code (in its current form)
was written with arithmetic rather than CC modes in mind. Since CCFP
is a CC mode, it fails the HONOR_NANS check, and so the function assumes
that ge | lt => true.
If one comparison is unsigned then it should be safe to assume that
the other comparison is also unsigned, even for CC modes, since the
optimisation checks that the comparisons are between the same operands.
For the other cases, we can only safely fold comparisons of CC mode
values if the result is always-true (15) or always-false (0).
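A small C sketch of why ge | lt is not always true once NaNs are possible.
The outcome bits below are an assumption made for this illustration, picked
only so that all four together give the 15 mentioned above; they are not
claimed to be the encoding simplify-rtx.cc uses.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical outcome bits; together they give 15 ("always true").  */
enum { M_UNORDERED = 1, M_EQ = 2, M_GT = 4, M_LT = 8, M_ALL = 15 };

/* GE accepts the "greater" and "equal" outcomes only.  */
enum { M_GE = M_GT | M_EQ };

/* Which single outcome does comparing X with Y produce?  */
static int
outcome (double x, double y)
{
  int m = x < y ? M_LT : x > y ? M_GT : x == y ? M_EQ : M_UNORDERED;
  /* Exactly one outcome bit must be set.  */
  assert (m != 0 && (m & (m - 1)) == 0);
  return m;
}

/* "x >= y || x < y" expressed as a test against the combined mask.  */
static bool
ge_or_lt (double x, double y)
{
  return (outcome (x, y) & (M_GE | M_LT)) != 0;
}
```

Here M_GE | M_LT is 14 rather than 15: the unordered outcome is missing, so
ge_or_lt (NAN, 0.0) is false even though it is true for every ordered pair.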
It turns out that the original testcase for PR117186, which ran at -O,
was relying on the old behaviour for some of the functions. It needs
4-instruction combinations, and so -fexpensive-optimizations, to pass
in its intended form.
gcc/
PR rtl-optimization/119002
* simplify-rtx.cc
(simplify_context::simplify_logical_relational_operation): Handle
comparisons between CC values. If there is no evidence that the
CC values are unsigned, restrict the fold to always-true or
always-false results.
gcc/testsuite/
* gcc.c-torture/execute/ieee/pr119002.c: New test.
* gcc.target/aarch64/pr117186.c: Run at -O2 rather than -O.
Co-authored-by: Jakub Jelinek <jakub@redhat.com>
|
|
Uros' r15-7793 fixed this PR as well, I'm just committing tests
from the PR so that it can be closed.
2025-03-04 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/119071
* gcc.dg/pr119071.c: New test.
* gcc.c-torture/execute/pr119071.c: New test.
|
|
The following testcase is miscompiled since r15-7597.
The left comparison is unsigned ((x & 0x8000U) != 0) while the
right one is signed ((x >> 16) >= 0) and is actually a signbit test,
so rsignbit is 64.
After debugging this and reading the r15-7597 change, I believe there
is just a pasto, the if (lsignbit) and if (rsignbit) blocks are pretty
much identical with just the first l on all variables starting with l
replaced with r (the only difference is that if (lsignbit) has a comment
explaining the sign <<= 1; stuff, while it isn't repeated in the second one).
Except the second one was using ll_unsignedp instead of rl_unsignedp
in one spot. I think it should use the latter, the signedness of the left
comparison doesn't affect the other one, they are basically independent
with the exception that we check that after transformations they are both
EQ or both NE and later on we try to merge them together.
2025-02-27 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/119030
* gimple-fold.cc (fold_truth_andor_for_ifcombine): Fix a pasto,
ll_unsignedp -> rl_unsignedp.
* gcc.c-torture/execute/pr119030.c: New test.
|
|
The following testcase is miscompiled due to a bug in
optimize_range_tests_to_bit_test. It is trying to optimize
check for a in [-34,-34] or [-26,-26] or [-6,-6] or [-4,inf] ranges.
Another reassoc optimization folds the test for the first
two ranges into (a + 34U) & ~8U in [0U,0U] range, and extract_bit_test_mask
actually has code to virtually undo it and treat that again as test
for a being -34 or -26. The problem is that optimize_range_tests_to_bit_test
remembers in the type variable TREE_TYPE (ranges[i].exp); from the first
range. If extract_bit_test_mask doesn't do that virtual undoing of the
BIT_AND_EXPR handling, that is just fine, the returned exp is ranges[i].exp.
But if the first range is BIT_AND_EXPR, the type could be different, the
BIT_AND_EXPR form has the optional cast to corresponding unsigned type
in order to avoid introducing UB. Now, type was used to fill in the
max value if ranges[j].high was missing in subsequently tested range,
and so in this particular testcase the [-4,inf] range which was
signed int and so [-4,INT_MAX] was treated as [-4,UINT_MAX] instead.
And we were subtracting values of 2 different types and trying to make
sense out of that.
The following patch fixes this by using the type of the low bound
(which is always non-NULL) for the max value of the high bound instead.
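The ranges from the description can be written out in C. This is a sketch
with invented function names; the last two functions only illustrate how the
type of the upper bound changes what an open-ended range means, and are not a
reconstruction of the arithmetic tree-ssa-reassoc.cc performed.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* The four ranges from the text, written directly.  */
static bool
in_ranges (int a)
{
  return a == -34 || a == -26 || a == -6 || a >= -4;
}

/* Folded form of the first two ranges: (a + 34U) & ~8U in [0U,0U].  */
static bool
first_two_folded (int a)
{
  bool folded = (((unsigned) a + 34U) & ~8U) == 0U;
  /* The fold must agree with the two equality tests it replaces.  */
  assert (folded == (a == -34 || a == -26));
  return folded;
}

/* [-4,inf] with the upper bound taken from the low bound's type (int).  */
static bool
open_range_int_max (int a)
{
  return a >= -4 && a <= INT_MAX;
}

/* The same bounds read in the unsigned type: [-4,UINT_MAX] then only
   covers the values -4 .. -1.  */
static bool
open_range_uint_max (int a)
{
  return (unsigned) a >= (unsigned) -4 && (unsigned) a <= UINT_MAX;
}
```

For a == 0, open_range_int_max is true but open_range_uint_max is false, which
is the kind of disagreement that using the type of the low bound avoids.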
2025-02-24 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/118915
* tree-ssa-reassoc.cc (optimize_range_tests_to_bit_test): For
highj == NULL_TREE use TYPE_MAX_VALUE (TREE_TYPE (lowj)) rather
than TYPE_MAX_VALUE (type).
* gcc.c-torture/execute/pr118915.c: New test.
|
|
dynamic stack allocation not supported'
In Subversion r217296 (Git commit e2acc079ff125a869159be45371dc0a29b230e92)
"Testsuite alloca fixes for ptx", effective-target 'alloca' was added to mark
up test cases that run into the nvptx back end's non-support of dynamic stack
allocation. (Later, nvptx gained conditional support for that in
commit 3861d362ec7e3c50742fc43833fe9d8674f4070e
"nvptx: PTX 'alloca' for '-mptx=7.3'+, '-march=sm_52'+ [PR65181]", but on the
other hand, in commit f93a612fc4567652b75ffc916d31a446378e6613
"bpf: liberate R9 for general register allocation", the BPF back end joined
"the list of targets that do not support alloca in target-support.exp".)
Manually maintaining the list of test cases requiring effective-target 'alloca'
is notoriously hard and gets out of date quickly: new test cases added to the test
suite may need to be analyzed and annotated, and over time annotations also may
need to be removed, in cases where the compiler learns to optimize out
'alloca'/VLA usage, for example. This commit replaces (99 % of) the manual
annotations with an automatic scheme: turn test cases into UNSUPPORTED if
running into 'sorry, unimplemented: dynamic stack allocation not supported'.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_alloca):
Gracefully handle the case that we've not been called (indirectly)
from 'dg-test'.
* lib/gcc-dg.exp (proc gcc-dg-prune): Turn
'sorry, unimplemented: dynamic stack allocation not supported' into
UNSUPPORTED.
* c-c++-common/Walloca-larger-than.c: Don't
'dg-require-effective-target alloca'.
* c-c++-common/Warray-bounds-9.c: Likewise.
* c-c++-common/Warray-bounds.c: Likewise.
* c-c++-common/Wdangling-pointer-2.c: Likewise.
* c-c++-common/Wdangling-pointer-4.c: Likewise.
* c-c++-common/Wdangling-pointer-5.c: Likewise.
* c-c++-common/Wdangling-pointer.c: Likewise.
* c-c++-common/Wimplicit-fallthrough-7.c: Likewise.
* c-c++-common/Wsizeof-pointer-memaccess1.c: Likewise.
* c-c++-common/Wsizeof-pointer-memaccess2.c: Likewise.
* c-c++-common/Wstringop-truncation.c: Likewise.
* c-c++-common/Wunused-var-6.c: Likewise.
* c-c++-common/Wunused-var-8.c: Likewise.
* c-c++-common/analyzer/alloca-leak.c: Likewise.
* c-c++-common/analyzer/allocation-size-multiline-2.c: Likewise.
* c-c++-common/analyzer/allocation-size-multiline-3.c: Likewise.
* c-c++-common/analyzer/capacity-1.c: Likewise.
* c-c++-common/analyzer/capacity-3.c: Likewise.
* c-c++-common/analyzer/imprecise-floating-point-1.c: Likewise.
* c-c++-common/analyzer/infinite-recursion-alloca.c: Likewise.
* c-c++-common/analyzer/malloc-callbacks.c: Likewise.
* c-c++-common/analyzer/malloc-paths-8.c: Likewise.
* c-c++-common/analyzer/out-of-bounds-5.c: Likewise.
* c-c++-common/analyzer/out-of-bounds-diagram-11.c: Likewise.
* c-c++-common/analyzer/uninit-alloca.c: Likewise.
* c-c++-common/analyzer/write-to-string-literal-5.c: Likewise.
* c-c++-common/asan/alloca_loop_unpoisoning.c: Likewise.
* c-c++-common/auto-init-11.c: Likewise.
* c-c++-common/auto-init-12.c: Likewise.
* c-c++-common/auto-init-15.c: Likewise.
* c-c++-common/auto-init-16.c: Likewise.
* c-c++-common/builtins.c: Likewise.
* c-c++-common/dwarf2/vla1.c: Likewise.
* c-c++-common/gomp/pr61486-2.c: Likewise.
* c-c++-common/torture/builtin-clear-padding-4.c: Likewise.
* c-c++-common/torture/strub-run3.c: Likewise.
* c-c++-common/torture/strub-run4.c: Likewise.
* c-c++-common/torture/strub-run4c.c: Likewise.
* c-c++-common/torture/strub-run4d.c: Likewise.
* c-c++-common/torture/strub-run4i.c: Likewise.
* g++.dg/Walloca1.C: Likewise.
* g++.dg/Walloca2.C: Likewise.
* g++.dg/cpp0x/pr70338.C: Likewise.
* g++.dg/cpp1y/lambda-generic-vla1.C: Likewise.
* g++.dg/cpp1y/vla10.C: Likewise.
* g++.dg/cpp1y/vla2.C: Likewise.
* g++.dg/cpp1y/vla6.C: Likewise.
* g++.dg/cpp1y/vla8.C: Likewise.
* g++.dg/debug/debug5.C: Likewise.
* g++.dg/debug/debug6.C: Likewise.
* g++.dg/debug/pr54828.C: Likewise.
* g++.dg/diagnostic/pr70105.C: Likewise.
* g++.dg/eh/cleanup5.C: Likewise.
* g++.dg/eh/spbp.C: Likewise.
* g++.dg/ext/builtin_alloca.C: Likewise.
* g++.dg/ext/tmplattr9.C: Likewise.
* g++.dg/ext/vla10.C: Likewise.
* g++.dg/ext/vla11.C: Likewise.
* g++.dg/ext/vla12.C: Likewise.
* g++.dg/ext/vla15.C: Likewise.
* g++.dg/ext/vla16.C: Likewise.
* g++.dg/ext/vla17.C: Likewise.
* g++.dg/ext/vla23.C: Likewise.
* g++.dg/ext/vla3.C: Likewise.
* g++.dg/ext/vla6.C: Likewise.
* g++.dg/ext/vla7.C: Likewise.
* g++.dg/init/array24.C: Likewise.
* g++.dg/init/new47.C: Likewise.
* g++.dg/init/pr55497.C: Likewise.
* g++.dg/opt/pr78201.C: Likewise.
* g++.dg/template/vla2.C: Likewise.
* g++.dg/torture/Wsizeof-pointer-memaccess1.C: Likewise.
* g++.dg/torture/Wsizeof-pointer-memaccess2.C: Likewise.
* g++.dg/torture/pr62127.C: Likewise.
* g++.dg/torture/pr67055.C: Likewise.
* g++.dg/torture/stackalign/eh-alloca-1.C: Likewise.
* g++.dg/torture/stackalign/eh-inline-2.C: Likewise.
* g++.dg/torture/stackalign/eh-vararg-1.C: Likewise.
* g++.dg/torture/stackalign/eh-vararg-2.C: Likewise.
* g++.dg/warn/Wplacement-new-size-5.C: Likewise.
* g++.dg/warn/Wsizeof-pointer-memaccess-1.C: Likewise.
* g++.dg/warn/Wvla-1.C: Likewise.
* g++.dg/warn/Wvla-3.C: Likewise.
* g++.old-deja/g++.ext/array2.C: Likewise.
* g++.old-deja/g++.ext/constructor.C: Likewise.
* g++.old-deja/g++.law/builtin1.C: Likewise.
* g++.old-deja/g++.other/crash12.C: Likewise.
* g++.old-deja/g++.other/eh3.C: Likewise.
* g++.old-deja/g++.pt/array6.C: Likewise.
* g++.old-deja/g++.pt/dynarray.C: Likewise.
* gcc.c-torture/compile/20000923-1.c: Likewise.
* gcc.c-torture/compile/20030224-1.c: Likewise.
* gcc.c-torture/compile/20071108-1.c: Likewise.
* gcc.c-torture/compile/20071117-1.c: Likewise.
* gcc.c-torture/compile/900313-1.c: Likewise.
* gcc.c-torture/compile/parms.c: Likewise.
* gcc.c-torture/compile/pr17397.c: Likewise.
* gcc.c-torture/compile/pr35006.c: Likewise.
* gcc.c-torture/compile/pr42956.c: Likewise.
* gcc.c-torture/compile/pr51354.c: Likewise.
* gcc.c-torture/compile/pr52714.c: Likewise.
* gcc.c-torture/compile/pr55851.c: Likewise.
* gcc.c-torture/compile/pr77754-1.c: Likewise.
* gcc.c-torture/compile/pr77754-2.c: Likewise.
* gcc.c-torture/compile/pr77754-3.c: Likewise.
* gcc.c-torture/compile/pr77754-4.c: Likewise.
* gcc.c-torture/compile/pr77754-5.c: Likewise.
* gcc.c-torture/compile/pr77754-6.c: Likewise.
* gcc.c-torture/compile/pr78439.c: Likewise.
* gcc.c-torture/compile/pr79413.c: Likewise.
* gcc.c-torture/compile/pr82564.c: Likewise.
* gcc.c-torture/compile/pr87110.c: Likewise.
* gcc.c-torture/compile/pr99787-1.c: Likewise.
* gcc.c-torture/compile/vla-const-1.c: Likewise.
* gcc.c-torture/compile/vla-const-2.c: Likewise.
* gcc.c-torture/execute/20010209-1.c: Likewise.
* gcc.c-torture/execute/20020314-1.c: Likewise.
* gcc.c-torture/execute/20020412-1.c: Likewise.
* gcc.c-torture/execute/20021113-1.c: Likewise.
* gcc.c-torture/execute/20040223-1.c: Likewise.
* gcc.c-torture/execute/20040308-1.c: Likewise.
* gcc.c-torture/execute/20040811-1.c: Likewise.
* gcc.c-torture/execute/20070824-1.c: Likewise.
* gcc.c-torture/execute/20070919-1.c: Likewise.
* gcc.c-torture/execute/built-in-setjmp.c: Likewise.
* gcc.c-torture/execute/pr22061-1.c: Likewise.
* gcc.c-torture/execute/pr43220.c: Likewise.
* gcc.c-torture/execute/pr82210.c: Likewise.
* gcc.c-torture/execute/pr86528.c: Likewise.
* gcc.c-torture/execute/vla-dealloc-1.c: Likewise.
* gcc.dg/20001012-2.c: Likewise.
* gcc.dg/20020415-1.c: Likewise.
* gcc.dg/20030331-2.c: Likewise.
* gcc.dg/20101010-1.c: Likewise.
* gcc.dg/Walloca-1.c: Likewise.
* gcc.dg/Walloca-10.c: Likewise.
* gcc.dg/Walloca-11.c: Likewise.
* gcc.dg/Walloca-12.c: Likewise.
* gcc.dg/Walloca-13.c: Likewise.
* gcc.dg/Walloca-14.c: Likewise.
* gcc.dg/Walloca-15.c: Likewise.
* gcc.dg/Walloca-2.c: Likewise.
* gcc.dg/Walloca-3.c: Likewise.
* gcc.dg/Walloca-4.c: Likewise.
* gcc.dg/Walloca-5.c: Likewise.
* gcc.dg/Walloca-6.c: Likewise.
* gcc.dg/Walloca-7.c: Likewise.
* gcc.dg/Walloca-8.c: Likewise.
* gcc.dg/Walloca-9.c: Likewise.
* gcc.dg/Walloca-larger-than-2.c: Likewise.
* gcc.dg/Walloca-larger-than-3.c: Likewise.
* gcc.dg/Walloca-larger-than-4.c: Likewise.
* gcc.dg/Walloca-larger-than.c: Likewise.
* gcc.dg/Warray-bounds-22.c: Likewise.
* gcc.dg/Warray-bounds-41.c: Likewise.
* gcc.dg/Warray-bounds-46.c: Likewise.
* gcc.dg/Warray-bounds-48-novec.c: Likewise.
* gcc.dg/Warray-bounds-48.c: Likewise.
* gcc.dg/Warray-bounds-50.c: Likewise.
* gcc.dg/Warray-bounds-63.c: Likewise.
* gcc.dg/Warray-bounds-66.c: Likewise.
* gcc.dg/Wdangling-pointer.c: Likewise.
* gcc.dg/Wfree-nonheap-object-2.c: Likewise.
* gcc.dg/Wfree-nonheap-object.c: Likewise.
* gcc.dg/Wrestrict-17.c: Likewise.
* gcc.dg/Wrestrict.c: Likewise.
* gcc.dg/Wreturn-local-addr-2.c: Likewise.
* gcc.dg/Wreturn-local-addr-3.c: Likewise.
* gcc.dg/Wreturn-local-addr-4.c: Likewise.
* gcc.dg/Wreturn-local-addr-6.c: Likewise.
* gcc.dg/Wsizeof-pointer-memaccess1.c: Likewise.
* gcc.dg/Wstack-usage.c: Likewise.
* gcc.dg/Wstrict-aliasing-bogus-vla-1.c: Likewise.
* gcc.dg/Wstrict-overflow-27.c: Likewise.
* gcc.dg/Wstringop-overflow-15.c: Likewise.
* gcc.dg/Wstringop-overflow-23.c: Likewise.
* gcc.dg/Wstringop-overflow-25.c: Likewise.
* gcc.dg/Wstringop-overflow-27.c: Likewise.
* gcc.dg/Wstringop-overflow-3.c: Likewise.
* gcc.dg/Wstringop-overflow-39.c: Likewise.
* gcc.dg/Wstringop-overflow-56.c: Likewise.
* gcc.dg/Wstringop-overflow-57.c: Likewise.
* gcc.dg/Wstringop-overflow-67.c: Likewise.
* gcc.dg/Wstringop-overflow-71.c: Likewise.
* gcc.dg/Wstringop-truncation-3.c: Likewise.
* gcc.dg/Wvla-larger-than-1.c: Likewise.
* gcc.dg/Wvla-larger-than-2.c: Likewise.
* gcc.dg/Wvla-larger-than-3.c: Likewise.
* gcc.dg/Wvla-larger-than-4.c: Likewise.
* gcc.dg/Wvla-larger-than-5.c: Likewise.
* gcc.dg/analyzer/boxed-malloc-1.c: Likewise.
* gcc.dg/analyzer/call-summaries-2.c: Likewise.
* gcc.dg/analyzer/malloc-1.c: Likewise.
* gcc.dg/analyzer/malloc-reuse.c: Likewise.
* gcc.dg/analyzer/out-of-bounds-diagram-12.c: Likewise.
* gcc.dg/analyzer/pr93355-localealias.c: Likewise.
* gcc.dg/analyzer/putenv-1.c: Likewise.
* gcc.dg/analyzer/taint-alloc-1.c: Likewise.
* gcc.dg/analyzer/torture/pr93373.c: Likewise.
* gcc.dg/analyzer/torture/ubsan-1.c: Likewise.
* gcc.dg/analyzer/vla-1.c: Likewise.
* gcc.dg/atomic/stdatomic-vm.c: Likewise.
* gcc.dg/attr-alloc_size-6.c: Likewise.
* gcc.dg/attr-alloc_size-7.c: Likewise.
* gcc.dg/attr-alloc_size-8.c: Likewise.
* gcc.dg/attr-alloc_size-9.c: Likewise.
* gcc.dg/attr-noipa.c: Likewise.
* gcc.dg/auto-init-uninit-36.c: Likewise.
* gcc.dg/auto-init-uninit-9.c: Likewise.
* gcc.dg/auto-type-1.c: Likewise.
* gcc.dg/builtin-alloc-size.c: Likewise.
* gcc.dg/builtin-dynamic-alloc-size.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-1.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-2.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-3.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-4.c: Likewise.
* gcc.dg/builtin-object-size-1.c: Likewise.
* gcc.dg/builtin-object-size-2.c: Likewise.
* gcc.dg/builtin-object-size-3.c: Likewise.
* gcc.dg/builtin-object-size-4.c: Likewise.
* gcc.dg/builtins-64.c: Likewise.
* gcc.dg/builtins-68.c: Likewise.
* gcc.dg/c23-auto-2.c: Likewise.
* gcc.dg/c99-const-expr-13.c: Likewise.
* gcc.dg/c99-vla-1.c: Likewise.
* gcc.dg/fold-alloca-1.c: Likewise.
* gcc.dg/gomp/pr30494.c: Likewise.
* gcc.dg/gomp/vla-2.c: Likewise.
* gcc.dg/gomp/vla-3.c: Likewise.
* gcc.dg/gomp/vla-4.c: Likewise.
* gcc.dg/gomp/vla-5.c: Likewise.
* gcc.dg/graphite/pr99085.c: Likewise.
* gcc.dg/guality/guality.c: Likewise.
* gcc.dg/lto/pr80778_0.c: Likewise.
* gcc.dg/nested-func-10.c: Likewise.
* gcc.dg/nested-func-12.c: Likewise.
* gcc.dg/nested-func-13.c: Likewise.
* gcc.dg/nested-func-14.c: Likewise.
* gcc.dg/nested-func-15.c: Likewise.
* gcc.dg/nested-func-16.c: Likewise.
* gcc.dg/nested-func-17.c: Likewise.
* gcc.dg/nested-func-9.c: Likewise.
* gcc.dg/packed-vla.c: Likewise.
* gcc.dg/pr100225.c: Likewise.
* gcc.dg/pr25682.c: Likewise.
* gcc.dg/pr27301.c: Likewise.
* gcc.dg/pr31507-1.c: Likewise.
* gcc.dg/pr33238.c: Likewise.
* gcc.dg/pr41470.c: Likewise.
* gcc.dg/pr49120.c: Likewise.
* gcc.dg/pr50764.c: Likewise.
* gcc.dg/pr51491-2.c: Likewise.
* gcc.dg/pr51990-2.c: Likewise.
* gcc.dg/pr51990.c: Likewise.
* gcc.dg/pr59011.c: Likewise.
* gcc.dg/pr59523.c: Likewise.
* gcc.dg/pr61561.c: Likewise.
* gcc.dg/pr78468.c: Likewise.
* gcc.dg/pr78902.c: Likewise.
* gcc.dg/pr79972.c: Likewise.
* gcc.dg/pr82875.c: Likewise.
* gcc.dg/pr83844.c: Likewise.
* gcc.dg/pr84131.c: Likewise.
* gcc.dg/pr87099.c: Likewise.
* gcc.dg/pr87320.c: Likewise.
* gcc.dg/pr89045.c: Likewise.
* gcc.dg/pr91014.c: Likewise.
* gcc.dg/pr93986.c: Likewise.
* gcc.dg/pr98721-1.c: Likewise.
* gcc.dg/pr99122-2.c: Likewise.
* gcc.dg/shrink-wrap-alloca.c: Likewise.
* gcc.dg/sso-14.c: Likewise.
* gcc.dg/strlenopt-62.c: Likewise.
* gcc.dg/strlenopt-83.c: Likewise.
* gcc.dg/strlenopt-84.c: Likewise.
* gcc.dg/strlenopt-91.c: Likewise.
* gcc.dg/torture/Wsizeof-pointer-memaccess1.c: Likewise.
* gcc.dg/torture/calleesave-sse.c: Likewise.
* gcc.dg/torture/pr48953.c: Likewise.
* gcc.dg/torture/pr71881.c: Likewise.
* gcc.dg/torture/pr71901.c: Likewise.
* gcc.dg/torture/pr78742.c: Likewise.
* gcc.dg/torture/pr92088-1.c: Likewise.
* gcc.dg/torture/pr92088-2.c: Likewise.
* gcc.dg/torture/pr93124.c: Likewise.
* gcc.dg/torture/pr94479.c: Likewise.
* gcc.dg/torture/stackalign/alloca-1.c: Likewise.
* gcc.dg/torture/stackalign/inline-2.c: Likewise.
* gcc.dg/torture/stackalign/nested-3.c: Likewise.
* gcc.dg/torture/stackalign/vararg-1.c: Likewise.
* gcc.dg/torture/stackalign/vararg-2.c: Likewise.
* gcc.dg/tree-ssa/20030807-2.c: Likewise.
* gcc.dg/tree-ssa/20080530.c: Likewise.
* gcc.dg/tree-ssa/alias-37.c: Likewise.
* gcc.dg/tree-ssa/builtin-sprintf-warn-22.c: Likewise.
* gcc.dg/tree-ssa/builtin-sprintf-warn-25.c: Likewise.
* gcc.dg/tree-ssa/builtin-sprintf-warn-3.c: Likewise.
* gcc.dg/tree-ssa/loop-interchange-15.c: Likewise.
* gcc.dg/tree-ssa/pr23848-1.c: Likewise.
* gcc.dg/tree-ssa/pr23848-2.c: Likewise.
* gcc.dg/tree-ssa/pr23848-3.c: Likewise.
* gcc.dg/tree-ssa/pr23848-4.c: Likewise.
* gcc.dg/uninit-32.c: Likewise.
* gcc.dg/uninit-36.c: Likewise.
* gcc.dg/uninit-39.c: Likewise.
* gcc.dg/uninit-41.c: Likewise.
* gcc.dg/uninit-9-O0.c: Likewise.
* gcc.dg/uninit-9.c: Likewise.
* gcc.dg/uninit-pr100250.c: Likewise.
* gcc.dg/uninit-pr101300.c: Likewise.
* gcc.dg/uninit-pr101494.c: Likewise.
* gcc.dg/uninit-pr98583.c: Likewise.
* gcc.dg/vla-2.c: Likewise.
* gcc.dg/vla-22.c: Likewise.
* gcc.dg/vla-24.c: Likewise.
* gcc.dg/vla-3.c: Likewise.
* gcc.dg/vla-4.c: Likewise.
* gcc.dg/vla-stexp-1.c: Likewise.
* gcc.dg/vla-stexp-2.c: Likewise.
* gcc.dg/vla-stexp-4.c: Likewise.
* gcc.dg/vla-stexp-5.c: Likewise.
* gcc.dg/winline-7.c: Likewise.
* gcc.target/aarch64/stack-check-alloca-1.c: Likewise.
* gcc.target/aarch64/stack-check-alloca-10.c: Likewise.
* gcc.target/aarch64/stack-check-alloca-2.c: Likewise.
* gcc.target/aarch64/stack-check-alloca-3.c: Likewise.
* gcc.target/aarch64/stack-check-alloca-4.c: Likewise.
* gcc.target/aarch64/stack-check-alloca-5.c: Likewise.
* gcc.target/aarch64/stack-check-alloca-6.c: Likewise.
* gcc.target/aarch64/stack-check-alloca-7.c: Likewise.
* gcc.target/aarch64/stack-check-alloca-8.c: Likewise.
* gcc.target/aarch64/stack-check-alloca-9.c: Likewise.
* gcc.target/arc/interrupt-6.c: Likewise.
* gcc.target/i386/pr80969-3.c: Likewise.
* gcc.target/loongarch/stack-check-alloca-1.c: Likewise.
* gcc.target/loongarch/stack-check-alloca-2.c: Likewise.
* gcc.target/loongarch/stack-check-alloca-3.c: Likewise.
* gcc.target/loongarch/stack-check-alloca-4.c: Likewise.
* gcc.target/loongarch/stack-check-alloca-5.c: Likewise.
* gcc.target/loongarch/stack-check-alloca-6.c: Likewise.
* gcc.target/riscv/stack-check-alloca-1.c: Likewise.
* gcc.target/riscv/stack-check-alloca-10.c: Likewise.
* gcc.target/riscv/stack-check-alloca-2.c: Likewise.
* gcc.target/riscv/stack-check-alloca-3.c: Likewise.
* gcc.target/riscv/stack-check-alloca-4.c: Likewise.
* gcc.target/riscv/stack-check-alloca-5.c: Likewise.
* gcc.target/riscv/stack-check-alloca-6.c: Likewise.
* gcc.target/riscv/stack-check-alloca-7.c: Likewise.
* gcc.target/riscv/stack-check-alloca-8.c: Likewise.
* gcc.target/riscv/stack-check-alloca-9.c: Likewise.
* gcc.target/sparc/setjmp-1.c: Likewise.
* gcc.target/x86_64/abi/ms-sysv/ms-sysv.c: Likewise.
* gcc.c-torture/compile/20001221-1.c: Don't 'dg-skip-if'
for '! alloca'.
* gcc.c-torture/compile/20020807-1.c: Likewise.
* gcc.c-torture/compile/20050801-2.c: Likewise.
* gcc.c-torture/compile/920428-4.c: Likewise.
* gcc.c-torture/compile/debugvlafunction-1.c: Likewise.
* gcc.c-torture/compile/pr41469.c: Likewise.
* gcc.c-torture/execute/920721-2.c: Likewise.
* gcc.c-torture/execute/920929-1.c: Likewise.
* gcc.c-torture/execute/921017-1.c: Likewise.
* gcc.c-torture/execute/941202-1.c: Likewise.
* gcc.c-torture/execute/align-nest.c: Likewise.
* gcc.c-torture/execute/alloca-1.c: Likewise.
* gcc.c-torture/execute/pr22061-4.c: Likewise.
* gcc.c-torture/execute/pr36321.c: Likewise.
* gcc.dg/torture/pr8081.c: Likewise.
* gcc.dg/analyzer/data-model-1.c: Don't
'dg-require-effective-target alloca'. XFAIL relevant
'dg-warning's for '! alloca'.
* gcc.dg/uninit-38.c: Likewise.
* gcc.dg/uninit-pr98578.c: Likewise.
* gcc.dg/compat/struct-by-value-22_main.c: Comment on
'dg-require-effective-target alloca'.
libstdc++-v3/
* testsuite/lib/prune.exp (proc libstdc++-dg-prune): Turn
'sorry, unimplemented: dynamic stack allocation not supported' into
UNSUPPORTED.
|
|
The following testcase is miscompiled because the RTL representation
of a bt{l,q} insn followed by e.g. j{c,nc} is misleading about what it
actually does.
Let's look e.g. at
(define_insn_and_split "*jcc_bt<mode>"
[(set (pc)
(if_then_else (match_operator 0 "bt_comparison_operator"
[(zero_extract:SWI48
(match_operand:SWI48 1 "nonimmediate_operand")
(const_int 1)
(match_operand:QI 2 "nonmemory_operand"))
(const_int 0)])
(label_ref (match_operand 3))
(pc)))
(clobber (reg:CC FLAGS_REG))]
"(TARGET_USE_BT || optimize_function_for_size_p (cfun))
&& (CONST_INT_P (operands[2])
? (INTVAL (operands[2]) < GET_MODE_BITSIZE (<MODE>mode)
&& INTVAL (operands[2])
>= (optimize_function_for_size_p (cfun) ? 8 : 32))
: !memory_operand (operands[1], <MODE>mode))
&& ix86_pre_reload_split ()"
"#"
"&& 1"
[(set (reg:CCC FLAGS_REG)
(compare:CCC
(zero_extract:SWI48
(match_dup 1)
(const_int 1)
(match_dup 2))
(const_int 0)))
(set (pc)
(if_then_else (match_op_dup 0 [(reg:CCC FLAGS_REG) (const_int 0)])
(label_ref (match_dup 3))
(pc)))]
{
operands[0] = shallow_copy_rtx (operands[0]);
PUT_CODE (operands[0], reverse_condition (GET_CODE (operands[0])));
})
The define_insn part in RTL describes exactly what it does,
jumps to op3 if bit op2 in op1 is set (for op0 NE) or not set (for op0 EQ).
The problem is with what it splits into.
put_condition_code %C1 for CCCmode comparisons emits c for EQ and LTU,
nc for NE and GEU and ICEs otherwise.
CCCmode is used mainly for carry out of add/adc, borrow out of sub/sbb,
in those cases e.g. for add we have
(set (reg:CCC flags) (compare:CCC (plus:M x y) x))
and use (ltu (reg:CCC flags) (const_int 0)) for carry set and
(geu (reg:CCC flags) (const_int 0)) for carry not set. These cases
model in RTL what is actually happening, compare in infinite precision
x from the result of finite precision addition in M mode and if it is
less than unsigned (i.e. overflow happened), carry is set.
Another use of CCCmode is in UNSPEC_* patterns, those are used with
(eq (reg:CCC flags) (const_int 0)) for carry set and ne for unset,
given the UNSPEC that is no big deal, the middle-end doesn't know what
set or unset means.
But for the bt{l,q}; j{c,nc} case the above splits it into
(set (reg:CCC flags) (compare:CCC (zero_extract) (const_int 0)))
for bt and
(set (pc) (if_then_else (eq (reg:CCC flags) (const_int 0)) (label_ref) (pc)))
for the bit set case (so that the jump expands to jc) and ne for
the bit not set case (so that the jump expands to jnc).
Similarly for the different splitters for cmov and set{c,nc} etc.
The problem is that when the middle-end reads this RTL, it sees the
exact opposite. If zero_extract is 1, flags is set
to comparison of 1 and 0 and that would mean using ne in the
if_then_else, and vice versa.
So, in order to better describe in RTL what is actually happening,
one possibility would be to swap the behavior of put_condition_code
and use NE + LTU -> c and EQ + GEU -> nc rather than the current
EQ + LTU -> c and NE + GEU -> nc; and adjust everything. The
following patch uses a more limited approach, instead of representing
bt{l,q}; j{c,nc} case as written above it uses
(set (reg:CCC flags) (compare:CCC (const_int 0) (zero_extract)))
and
(set (pc) (if_then_else (ltu (reg:CCC flags) (const_int 0)) (label_ref) (pc)))
which uses the existing put_condition_code but describes what the
insns actually do in RTL clearly. If zero_extract is 1,
then flags are LTU, 0U < 1U, if zero_extract is 0, then flags are GEU,
0U >= 0U. The patch adjusts the *bt<mode> define_insn and all the
splitters to it and its comparisons/conditional moves/setXX.
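A minimal C sketch of the two readings may help here. It is only a model of
the RTL semantics described above, not code from the patch, and the helper
names (bit_at, old_rtl_literal_jc, new_rtl_literal_jc, hardware_jc) are
made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value of (zero_extract x 1 n): bit n of x, as 0 or 1 (n < 64).  */
static unsigned bit_at(uint64_t x, unsigned n)
{
  return (unsigned) ((x >> n) & 1u);
}

/* Old RTL: flags = compare (zero_extract, 0), tested with EQ for "jc".
   Read literally, EQ holds when the bit is clear, the opposite of what
   the jc instruction does after bt.  */
static bool old_rtl_literal_jc(uint64_t x, unsigned n)
{
  return bit_at(x, n) == 0u;
}

/* New RTL: flags = compare (0, zero_extract), tested with LTU for "jc".
   0U < bit holds exactly when the bit is set.  */
static bool new_rtl_literal_jc(uint64_t x, unsigned n)
{
  return 0u < bit_at(x, n);
}

/* What the hardware does: bt copies the selected bit into CF and jc
   jumps if CF is set.  */
static bool hardware_jc(uint64_t x, unsigned n)
{
  return bit_at(x, n) != 0u;
}
```

With x = 0x8, the new reading agrees with the hardware for both bit 3 (set)
and bit 2 (clear), while the old literal reading gives the inverted answer.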
2025-02-10 Jakub Jelinek <jakub@redhat.com>
PR target/118623
* config/i386/i386.md (*bt<mode>): Represent bt as
compare:CCC of const0_rtx and zero_extract rather than
zero_extract and const0_rtx.
(*bt<SWI48:mode>_mask): Likewise.
(*jcc_bt<mode>): Likewise. Use LTU and GEU as flags test
instead of EQ and NE.
(*jcc_bt<mode>_mask): Likewise.
(*jcc_bt<SWI48:mode>_mask_1): Likewise.
(Help combine recognize bt followed by cmov splitter): Likewise.
(*bt<mode>_setcqi): Likewise.
(*bt<mode>_setncqi): Likewise.
(*bt<mode>_setnc<mode>): Likewise.
(*bt<mode>_setncqi_2): Likewise.
(*bt<mode>_setc<mode>_mask): Likewise.
* gcc.c-torture/execute/pr118623.c: New test.
|
|
compare_operand uses operand_equal_p under the hood, which e.g. for
INTEGER_CSTs will just match the values regardless of their types.
Now, in many cases comparing the type is redundant, if we have
x_2 = y_3 + 1;
we've already compared the type for the lhs and also for rhs1, there won't
be any surprises on rhs2.
As noted in the PR, there are cases where the type of the operand is the
sole place of information and we don't want to ICF merge functions if the
types differ.
One case is stdarg functions, arguments passed to ..., it is different
if we pass 1, 1L, 1LL.
Another case are the K&R unprototyped functions (sure, gone in C23).
And yet another case are inline asm operands, "r" (1) is different from "r"
(1L) from "r" (1LL).
So, the following patch determines based on lack of fntype (e.g. for
internal functions), or on !prototype_p, or on stdarg_p (in that case
using number of named arguments) which arguments need to have type checked
and does that, plus compares types on inline asm operands (maybe it would be
enough to do that just for input operands but we have just a routine to
handle both and I didn't feel we need to differentiate).
Furthermore, I've noticed fntype{1,2} isn't actually compared if it is a
direct call (gimple_call_fndecl is non-NULL). That is wrong too, we could
have
void (*fn) (int, long long) = (void (*) (int, long long)) foo;
fn (1, 1LL);
in one case and
void (*fn) (long long, int) = (void (*) (long long, int)) foo;
fn (1LL, 1);
in another, both folded into a direct call of foo with different
gimple_call_fntype. Sure, one of them would be UB at runtime (or both), but
what if we ICF merge them into the one that is UB at runtime
and the program actually calls only the correct one?
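To illustrate why the value alone is not enough for arguments passed
through ..., here is a small C11 sketch. The TYPE_TAG macro and the tag_*
helpers are hypothetical names used only for this example; they are not
part of GCC or of the testcase.

```c
/* Hypothetical helper: map the type of an integer expression to a tag.  */
#define TYPE_TAG(e) \
  _Generic((e), int: 'i', long: 'l', long long: 'q', default: '?')

/* All three constants have the value 1, but each has a different type,
   which is what a variadic callee reading them with va_arg depends on.  */
static char tag_int(void)   { return TYPE_TAG(1); }
static char tag_long(void)  { return TYPE_TAG(1L); }
static char tag_llong(void) { return TYPE_TAG(1LL); }

/* A value-only comparison treats the three constants as identical.  */
static int same_value(void)
{
  return 1 == 1L && 1L == 1LL;
}
```

Here same_value () returns 1 while the three tags differ, which mirrors the
situation where a value-only operand comparison would let ICF merge calls
that are not interchangeable.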
2025-02-01 Jakub Jelinek <jakub@redhat.com>
PR ipa/117432
* ipa-icf-gimple.cc (func_checker::compare_asm_inputs_outputs):
Also return_false if operands have incompatible types.
(func_checker::compare_gimple_call): Check fntype1 vs. fntype2
compatibility for all non-internal calls and assume fntype1 and
fntype2 are non-NULL for those. For calls to non-prototyped
calls or for stdarg_p functions after the last named argument (if any)
check type compatibility of call arguments.
* gcc.c-torture/execute/pr117432.c: New test.
* gcc.target/i386/pr117432.c: New test.
|
|
This wrong-code issue has been fixed with r15-7249.
We still emit warnings which are questionable and perhaps we'd
get better generated code if niters determined the loop has only a single
iteration without UB and we'd punt on vectorizing it (or unrolling).
2025-01-31 Jakub Jelinek <jakub@redhat.com>
PR middle-end/117498
* gcc.c-torture/execute/pr117498.c: New test.
|
|
The following testcase is miscompiled at -Os on x86_64-linux.
The problem is during make_compound_operation of
(ashiftrt:SI (ashift:SI (mult:SI (reg:SI 107 [ a_5 ])
(const_int 3 [0x3]))
(const_int 31 [0x1f]))
(const_int 31 [0x1f]))
where it incorrectly returns
(mult:SI (sign_extract:SI (reg:SI 107 [ a_5 ])
(const_int 2 [0x2])
(const_int 0 [0]))
(const_int 3 [0x3]))
which obviously isn't correct, the former returns either 0 or -1 depending
on the least significant bit of the multiplication,
the latter returns either 0 or -3 depending on the second least significant
bit of the multiplication argument.
The bug has been introduced in PR96998 r11-4563, which added handling of x
* (2^N) similar to x << N. In the above case, pos is 0 and len is 1,
sign extracting a single least significant bit of the multiplication.
As 3 is not a power of 2, shift_amt is -1.
But IN_RANGE (-1, 1, 1 - 1) is still true, because the basic requirement of
IN_RANGE that LOWER is not greater than UPPER is violated.
The intention of using 1 as LOWER is to avoid matching multiplication by 1,
that really shouldn't appear in the IL. But to avoid violating IN_RANGE
requirement, we need to verify that len is at least 2.
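The IN_RANGE surprise can be reproduced with a small C sketch. The macro
below is modelled on the single unsigned comparison described above and is
an assumption for illustration (IN_RANGE_SKETCH, buggy_check and
fixed_check are made-up names), not a copy of GCC's definition.

```c
#include <stdint.h>

/* Range check done as one unsigned comparison.  It is only meaningful
   when LOWER <= UPPER; otherwise UPPER - LOWER wraps around.  */
#define IN_RANGE_SKETCH(value, lower, upper) \
  ((uint64_t) (value) - (uint64_t) (lower) \
   <= (uint64_t) (upper) - (uint64_t) (lower))

/* Check without the len guard: with len == 1 the range is [1, 0].  */
static int buggy_check(int shift_amt, int len)
{
  return IN_RANGE_SKETCH(shift_amt, 1, len - 1);
}

/* Check with the len > 1 guard, so LOWER <= UPPER always holds.  */
static int fixed_check(int shift_amt, int len)
{
  return len > 1 && IN_RANGE_SKETCH(shift_amt, 1, len - 1);
}
```

With shift_amt == -1 and len == 1, buggy_check accepts the value because
both sides of the comparison wrap around, while fixed_check rejects it.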
I've added this len > 1 check to the inner if rather than outer because I
think for GCC 16 we should add a further optimization.
In the particular case of 1 least significant bit sign extraction from
multiplication by 3, we could actually say it is equivalent to
(sign_extract:SI (reg:SI 107 [ a_5 ])
(const_int 1 [0x1])
(const_int 0 [0]))
That is because 3 is an odd number and multiplication by 2 will yield the
least significant bit 0 (we are sign extracting just one) and so the
multiplication doesn't change anything on the outcome.
More generally, even for larger len, multiplication by C which is
(1 << X) + 1 where X is >= len should be optimizable just to extraction
of the multiplicand's least significant len bits.
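The single-bit observation can be checked with a short C sketch. It models
only the arithmetic, with hypothetical helper names, and says nothing about
how combine.cc would implement such an optimization.

```c
#include <stdint.h>

/* Sign-extract the least significant bit of v: 0 stays 0, 1 becomes -1.  */
static int32_t sign_extract_bit0(uint32_t v)
{
  return -(int32_t) (v & 1u);
}

/* (ashiftrt (ashift (mult a 3) 31) 31): lowest bit of a * 3.  */
static int32_t via_mult(uint32_t a)
{
  return sign_extract_bit0(a * 3u);
}

/* (sign_extract a 1 0): lowest bit of a itself.  */
static int32_t direct(uint32_t a)
{
  return sign_extract_bit0(a);
}
```

Since a * 3 is a + a * 2 and a * 2 always has a zero low bit, via_mult and
direct agree for every a.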
2025-01-28 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/118638
* combine.cc (make_extraction): Only optimize (mult x 2^n) if len is
larger than 1.
* gcc.c-torture/execute/pr118638.c: New test.
|
|
..., and use it for '-mno-soft-stack': PTX "native" stacks.
PR target/65181
gcc/
* config/nvptx/nvptx.cc (nvptx_get_drap_rtx): Handle
'!TARGET_SOFT_STACK'.
* config/nvptx/nvptx.md (define_c_enum "unspec"): Add
'UNSPEC_STACKSAVE', 'UNSPEC_STACKRESTORE'.
(define_expand "allocate_stack", define_expand "save_stack_block")
(define_expand "save_stack_block"): Handle '!TARGET_SOFT_STACK',
PTX 'alloca'.
(define_insn "@nvptx_alloca_<mode>")
(define_insn "@nvptx_stacksave_<mode>")
(define_insn "@nvptx_stackrestore_<mode>"): New.
* doc/invoke.texi (Nvidia PTX Options): Update '-msoft-stack',
'-mno-soft-stack'.
* doc/sourcebuild.texi (nvptx-specific attributes): Document
'nvptx_runtime_alloca_ptx'.
(Add Options): Document 'nvptx_alloca_ptx'.
gcc/testsuite/
* gcc.target/nvptx/alloca-1.c: Evolve into...
* gcc.target/nvptx/alloca-1-O0.c: ... this, ...
* gcc.target/nvptx/alloca-1-O1.c: ... this, and...
* gcc.target/nvptx/alloca-1-sm_30.c: ... this.
* gcc.target/nvptx/vla-1.c: Evolve into...
* gcc.target/nvptx/vla-1-O0.c: ... this, ...
* gcc.target/nvptx/vla-1-O1.c: ... this, and...
* gcc.target/nvptx/vla-1-sm_30.c: ... this.
* gcc.c-torture/execute/pr36321.c: Adjust.
* gcc.target/nvptx/__builtin_alloca_0-1-O0.c: Likewise.
* gcc.target/nvptx/__builtin_alloca_0-1-O1.c: Likewise.
* gcc.target/nvptx/__builtin_stack_save___builtin_stack_restore-1.c:
Likewise.
* gcc.target/nvptx/softstack.c: Likewise.
* gcc.target/nvptx/__builtin_stack_save___builtin_stack_restore-1-sm_30.c:
New.
* gcc.target/nvptx/alloca-2-O0.c: Likewise.
* gcc.target/nvptx/alloca-3-O1.c: Likewise.
* gcc.target/nvptx/alloca-4-O3.c: Likewise.
* gcc.target/nvptx/alloca-5.c: Likewise.
* lib/target-supports.exp (check_effective_target_alloca): Adjust.
(check_nvptx_default_ptx_isa_target_architecture_at_least)
(check_nvptx_runtime_ptx_isa_target_architecture_at_least)
(check_effective_target_nvptx_runtime_alloca_ptx)
(add_options_for_nvptx_alloca_ptx): New.
libgomp/
* fortran.c (omp_get_device_from_uid_): Adjust.
* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise.
|
|
|
|
Expand coverage for unaligned memory stores, for the "insvmisalignM"
patterns, for 2-byte, 4-byte, and 8-byte scalars, across byte alignments
of 1, 2, 4 and byte misalignments within from 0 up to 7 (there's some
redundancy there for the sake of simplicity of the test case), making
sure all data is written and no data is changed outside the area meant
to be written.
The test case has proven invaluable in verifying changes to the Alpha
backend, but functionality covered is generic, so I have concluded this
test qualifies for generic verification and does not have to be limited
to the Alpha-specific subset of the testsuite.
gcc/testsuite/
* gcc.c-torture/execute/misalign.c: New file.
|
|
Expand coverage for `__builtin_memset' for the special case of clearing
a block, primarily for "setmemM" block set pattern, though with smaller
sizes open-coded sequences may be produced instead.
This verifies block sizes in bytes from 1 to 64 across byte alignments
of 1, 2, 4, 8 and byte misalignments within from 0 up to 7 (there's some
redundancy there for the sake of simplicity of the test case), making
sure all the intended area is cleared and no data is changed outside it.
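The general shape of such a check can be sketched in C as follows. This is
a simplified illustration under stated assumptions, not the code of
memclr.c: the real test also varies the declared base alignment and uses
`__builtin_memset', and the names check_clear and check_all are made up.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if clearing SIZE bytes at OFFSET inside a guarded buffer
   zeroes exactly the intended bytes and leaves the rest untouched.  */
static int check_clear(size_t offset, size_t size)
{
  unsigned char buf[128];
  size_t i;

  if (offset + size > sizeof buf)
    return 0;
  memset(buf, 0xa5, sizeof buf);   /* Fill everything with a guard pattern.  */
  memset(buf + offset, 0, size);   /* The operation under test.  */
  for (i = 0; i < sizeof buf; i++)
    {
      unsigned char expect = (i >= offset && i < offset + size) ? 0 : 0xa5;
      if (buf[i] != expect)
        return 0;
    }
  return 1;
}

/* Sweep misalignments 0..7 and block sizes 1..64, as in the ranges
   quoted above.  */
static int check_all(void)
{
  size_t misalign, size;

  for (misalign = 0; misalign < 8; misalign++)
    for (size = 1; size <= 64; size++)
      if (!check_clear(8 + misalign, size))
        return 0;
  return 1;
}
```

The guard pattern on both sides of the cleared area is what catches a
"setmemM" expansion that writes too few or too many bytes.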
The choice of the ranges for the parameters has come from the Alpha
backend, whose "setmemM" pattern has various corner cases related to
base alignment and the misalignment within.
The test case has proven invaluable in verifying changes to the Alpha
backend, but functionality covered is generic, so I have concluded this
test qualifies for generic verification and does not have to be limited
to the Alpha-specific subset of the testsuite.
Just as with `__builtin_memcpy' tests this code turned out to require
quite a lot of time to compile, although a bit less than the former.
Example compilation times with reasonably fast POWER9@2.166GHz at `-O2'
optimization and GCC built at `-O2' for various targets:
mips-linux-gnu: 19s
vax-netbsdelf: 27s
alphaev56-linux-gnu: 30s
alpha-linux-gnu: 31s
powerpc64le-linux-gnu: 47s
With GCC built at `-O0':
alphaev56-linux-gnu: 2m59s
alpha-linux-gnu: 3m06s
I have therefore set the timeout factor accordingly so as to take slower
test hosts into account.
gcc/testsuite/
* gcc.c-torture/execute/memclr.c: New file.
|
|
The following testcase is miscompiled on s390x-linux with -O2 -march=z15.
The problem happens during cse2, which sees in an extended basic block
(jump_insn 217 78 216 10 (parallel [
(set (pc)
(if_then_else (ne (reg:SI 165)
(const_int 1 [0x1]))
(label_ref 216)
(pc)))
(set (reg:SI 165)
(plus:SI (reg:SI 165)
(const_int -1 [0xffffffffffffffff])))
(clobber (scratch:SI))
(clobber (reg:CC 33 %cc))
]) "t.c":14:17 discrim 1 2192 {doloop_si64}
(int_list:REG_BR_PROB 955630228 (nil))
-> 216)
...
(insn 99 98 100 12 (set (reg:SI 138)
(const_int 1 [0x1])) "t.c":9:31 1507 {*movsi_zarch}
(nil))
(insn 100 99 103 12 (parallel [
(set (reg:SI 137)
(minus:SI (reg:SI 138)
(subreg:SI (reg:HI 135 [ a ]) 0)))
(clobber (reg:CC 33 %cc))
]) "t.c":9:31 1904 {*subsi3}
(expr_list:REG_DEAD (reg:SI 138)
(expr_list:REG_DEAD (reg:HI 135 [ a ])
(expr_list:REG_UNUSED (reg:CC 33 %cc)
(nil)))))
Note, cse2 has df_note_add_problem () before df_analyze, which adds
(expr_list:REG_UNUSED (reg:SI 165)
(expr_list:REG_UNUSED (reg:CC 33 %cc)
notes to the first insn (correctly so, %cc is clobbered there and pseudo
165 isn't used after the insn).
Now, cse_extended_basic_block has an extra optimization on conditional
jumps, where it records equivalence on the edge which continues in the ebb.
Here it sees (ne reg:SI 165) (const_int 1) is false on the edge and
remembers that pseudo 165 is comparison equivalent to (const_int 1),
so on insn 100 it decides to replace (reg:SI 138) with (reg:SI 165).
This optimization isn't correct here though, because the JUMP_INSN has
multiple sets. Before r0-77890 record_jump_equiv has been called from
cse_insn guarded on n_sets == 1 && any_condjump_p (insn), so it wouldn't
be done on the above JUMP_INSN where n_sets == 2. But since that change
it is guarded with single_set (insn) && any_condjump_p (insn) and that
is true because of the REG_UNUSED note. Looking at that note is
inappropriate in CSE though, because the whole intent of the pass is to
extend the lifetimes of the pseudos if equivalence is found, so the fact
that there is REG_UNUSED note for (reg:SI 165) and that the reg isn't used
later doesn't imply that it won't be used after the optimization.
So, unless we manage to process the other sets on the JUMP_INSN (it wouldn't
be terribly hard in this exact case, the doloop insn decreases the register
by 1 and so we could just record equivalence to (const_int 0) instead, but
generally it might be hard), we should IMHO just punt if there are multiple
sets.
The patch below adds a !multiple_sets (insn) check rather than replacing
the single_set (insn) check with it, because apparently any_condjump_p uses
pc_set which supports the case where PATTERN is a SET to PC (that is
single_set (insn) && !multiple_sets (insn)), PATTERN is a PARALLEL with a
single SET to PC (likewise) and some CLOBBERs, PARALLEL with two or more
SETs where the first one is SET to PC (that could be single_set (insn)
with REG_UNUSED notes but is not !multiple_sets (insn)) or PATTERN
is UNSPEC/UNSPEC_VOLATILE with SET inside of it. For the last case
!multiple_sets (insn) will be true, but IMHO we shouldn't try to derive
anything from those because we haven't checked the rest of the UNSPEC*
and we don't really know what it does.
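A tiny C model of the doloop jump shows why the recorded equivalence is
wrong. The struct and function names are hypothetical; the model only
mirrors the two parallel SETs of the JUMP_INSN quoted above.

```c
/* Result of one doloop jump: whether the branch is taken and the value
   of the counter register after the insn.  */
struct doloop_result
{
  int taken;
  int reg_after;
};

/* Both SETs happen in parallel: the branch condition reads the old
   value of the register while the register is decremented.  */
static struct doloop_result doloop_step(int reg)
{
  struct doloop_result r;

  r.taken = (reg != 1);
  r.reg_after = reg - 1;
  return r;
}
```

On the fall-through edge the old value was 1, so after the insn the
register holds 0; substituting it for (const_int 1) is therefore wrong.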
2024-12-13 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/117095
* cse.cc (cse_extended_basic_block): Don't call record_jump_equiv
if multiple_sets (insn).
* gcc.c-torture/execute/pr117095.c: New test.
|
|
These tests can take several seconds per compilation to complete, taking
total elapsed time measured in minutes. Mark them as expensive so as to
let people skip them where they want to save on testing time.
gcc/testsuite/
* gcc.c-torture/execute/memcpy-a1.c: Mark as expensive.
* gcc.c-torture/execute/memcpy-a2.c: Likewise.
* gcc.c-torture/execute/memcpy-a4.c: Likewise.
* gcc.c-torture/execute/memcpy-a8.c: Likewise.
|
|
The following testcases are miscompiled on s390x-linux, because the
doloop_optimize
/* Ensure that the new sequence doesn't clobber a register that
is live at the end of the block. */
{
bitmap modified = BITMAP_ALLOC (NULL);
for (rtx_insn *i = doloop_seq; i != NULL; i = NEXT_INSN (i))
note_stores (i, record_reg_sets, modified);
basic_block loop_end = desc->out_edge->src;
bool fail = bitmap_intersect_p (df_get_live_out (loop_end), modified);
check doesn't work as intended.
The problem is that it uses df, but the df analysis was only done using
iv_analysis_loop_init (loop);
->
df_analyze_loop (loop);
which computes df only on the bbs inside of the loop.
While loop_end bb is inside of the loop, df_get_live_out computed that
way includes registers set in the loop and used at the start of the next
iteration, but doesn't include registers set in the loop (or before the
loop) and used after the loop.
The following patch fixes that by doing whole function df_analyze first,
changes the loop iteration mode from 0 to LI_ONLY_INNERMOST (on many
targets which use can_use_doloop_if_innermost target hook and so are known
to only handle innermost loops) or LI_FROM_INNERMOST (I think only bfin
actually allows non-innermost loops) and checking not just
df_get_live_out (loop_end) (that is needed for something used by the
next iteration), but also df_get_live_in (desc->out_edge->dest),
i.e. what will be used after the loop. df of such a bb shouldn't
be affected by the df_analyze_loop and so should be from df_analyze
of the whole function.
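As a toy illustration of the extended check, register sets can be modelled
as bit masks in C. This is a sketch under that assumption only; the real
code uses bitmaps and the df accessors named above, and the function names
here are made up.

```c
#include <stdint.h>

/* Toy register set: bit N stands for pseudo N.  */
typedef uint64_t regset;

/* Old check: only looks at what is live out of the loop end block,
   which after df_analyze_loop reflects uses inside the loop only.  */
static int old_check_fails(regset modified, regset live_out_loop_end)
{
  return (modified & live_out_loop_end) != 0;
}

/* New check: also looks at what is live into the loop exit destination,
   i.e. what is used after the loop.  */
static int new_check_fails(regset modified, regset live_out_loop_end,
                           regset live_in_exit_dest)
{
  return (modified & (live_out_loop_end | live_in_exit_dest)) != 0;
}
```

If the doloop sequence modifies pseudo 5, which is used only after the
loop, the old check lets the transformation through while the new one
punts.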
2024-12-05 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/113994
PR rtl-optimization/116799
* loop-doloop.cc: Include targhooks.h.
(doloop_optimize): Also punt on intersection of modified
with df_get_live_in (desc->out_edge->dest).
(doloop_optimize_loops): Call df_analyze. Use
LI_ONLY_INNERMOST or LI_FROM_INNERMOST instead of 0 as
second loops_list argument.
* gcc.c-torture/execute/pr116799.c: New test.
* g++.dg/torture/pr113994.C: New test.
|
|
gcc/testsuite/
* gcc.c-torture/execute/ieee/cdivchkd.x: New file.
* gcc.c-torture/execute/ieee/cdivchkf.x: New file.
* gcc.dg/flex-array-counted-by.c: Require wchar.
* gcc.dg/fold-copysign-1.c [avr]: Add -mdouble=64.
|
|
Skipping these tests on avr since they come up with "memory full",
plus they consume a multiple of the time the rest of the testsuite
takes.
gcc/testsuite/
* gcc.c-torture/execute/memcpy-a1.c
* gcc.c-torture/execute/memcpy-a2.c
* gcc.c-torture/execute/memcpy-a4.c
* gcc.c-torture/execute/memcpy-a8.c
|
|
nios2 target support in GCC was deprecated in GCC 14 as the
architecture has been EOL'ed by the vendor. This patch removes the
entire port for GCC 15.
There are still references to "nios2" in libffi and libgo. Since those
libraries are imported into the gcc sources from master copies maintained
by other projects, those will need to be addressed elsewhere.
ChangeLog:
* MAINTAINERS: Remove references to nios2.
* configure.ac: Likewise.
* configure: Regenerated.
config/ChangeLog:
* mt-nios2-elf: Deleted.
contrib/ChangeLog:
* config-list.mk: Remove references to Nios II.
gcc/ChangeLog:
* common/config/nios2/*: Delete entire directory.
* config/nios2/*: Delete entire directory.
* config.gcc: Remove references to nios2.
* configure.ac: Likewise.
* doc/extend.texi: Likewise.
* doc/install.texi: Likewise.
* doc/invoke.texi: Likewise.
* doc/md.texi: Likewise.
* regenerate-opt-urls.py: Likewise.
* config.in: Regenerated.
* configure: Regenerated.
gcc/testsuite/ChangeLog:
* g++.target/nios2/*: Delete entire directory.
* gcc.target/nios2/*: Delete entire directory.
* g++.dg/cpp0x/constexpr-rom.C: Remove references to nios2.
* g++.old-deja/g++.jason/thunk3.C: Likewise.
* gcc.c-torture/execute/20101011-1.c: Likewise.
* gcc.c-torture/execute/pr47237.c: Likewise.
* gcc.dg/20020312-2.c: Likewise.
* gcc.dg/20021029-1.c: Likewise.
* gcc.dg/debug/btf/btf-datasec-1.c: Likewise.
* gcc.dg/ifcvt-4.c: Likewise.
* gcc.dg/stack-usage-1.c: Likewise.
* gcc.dg/struct-by-value-1.c: Likewise.
* gcc.dg/tree-ssa/reassoc-33.c: Likewise.
* gcc.dg/tree-ssa/reassoc-34.c: Likewise.
* gcc.dg/tree-ssa/reassoc-35.c: Likewise.
* gcc.dg/tree-ssa/reassoc-36.c: Likewise.
* lib/target-supports.exp: Likewise.
libgcc/ChangeLog:
* config/nios2/*: Delete entire directory.
* config.host: Remove references to nios2.
* unwind-dw2-fde-dip.c: Likewise.
|
|
Expand coverage for `__builtin_memcpy', primarily for "cpymemM" block
copy pattern, although with smaller sizes open-coded sequences may be
produced instead.
This verifies block sizes in bytes from 1 to 64, across byte alignments
of 1, 2, 4, 8 and byte misalignments within from 0 up to 7 (there's some
redundancy there for the sake of simplicity of the test cases) both for
the source and the destination, making sure all data is copied and no
data is changed outside the area meant to be written.
The choice of the ranges for the parameters has come from the Alpha
backend, whose "cpymemM" pattern covers copies being made of up to 64
bytes and has various corner cases related to base alignment and the
misalignment within.
The test cases have proven invaluable in verifying changes to the Alpha
backend, but functionality covered is generic, so I have concluded these
tests qualify for generic verification and do not have to be limited to
the Alpha-specific subset of the testsuite.
On the implementation side the tests turned out to be quite stressful to
GCC and the original simpler version that just expanded all code inline
took a lot of time to complete compilation. Depending on the target and
compilation options elapsed times up to 40 minutes (!) have been seen,
especially with GCC built at `-O0' for debugging purposes.
At the cost of increased complexity where a pair of macros is required
per variant rather than just one I have split the code into individual
functions forced not to be inlined and it improved compilation times
considerably without losing coverage.
Example compilation times with reasonably fast POWER9@2.166GHz at `-O2'
optimization and GCC built at `-O2' for various targets:
mips-linux-gnu: 23s
vax-netbsdelf: 29s
alphaev56-linux-gnu: 39s
alpha-linux-gnu: 43s
powerpc64le-linux-gnu: 48s
With GCC built at `-O0':
alphaev56-linux-gnu: 3m37s
alpha-linux-gnu: 3m54s
I have therefore set the timeout factor accordingly so as to take slower
test hosts into account.
gcc/testsuite/
* gcc.c-torture/execute/memcpy-a1.c: New file.
* gcc.c-torture/execute/memcpy-a2.c: New file.
* gcc.c-torture/execute/memcpy-a4.c: New file.
* gcc.c-torture/execute/memcpy-a8.c: New file.
* gcc.c-torture/execute/memcpy-ax.h: New file.
|
|
The following patch adds u{,l,ll,imax}abs builtins, which just fold
to ABSU_EXPR, similarly to how {,l,ll,imax}abs builtins fold to
ABS_EXPR.
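The semantics of an unsigned absolute value can be sketched in portable C
as below. This is an illustration of what ABSU_EXPR computes for int, under
the assumption of a two's complement int; uabs_sketch is a made-up name and
the code deliberately does not call the new builtins.

```c
#include <limits.h>

/* Absolute value of X returned as unsigned int.  The negation is done
   in unsigned arithmetic, so INT_MIN is handled without overflow.  */
static unsigned int uabs_sketch(int x)
{
  return x < 0 ? 0u - (unsigned int) x : (unsigned int) x;
}
```

Unlike abs (INT_MIN), which is undefined, uabs_sketch (INT_MIN) yields
(unsigned int) INT_MAX + 1.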
2024-11-21 Jakub Jelinek <jakub@redhat.com>
PR c/117024
gcc/
* coretypes.h (enum function_class): Add function_c2y_misc
enumerator.
* builtin-types.def (BT_FN_UINTMAX_INTMAX, BT_FN_ULONG_LONG,
BT_FN_ULONGLONG_LONGLONG): New DEF_FUNCTION_TYPE_1s.
* builtins.def (DEF_C2Y_BUILTIN): Define.
(BUILT_IN_UABS, BUILT_IN_UIMAXABS, BUILT_IN_ULABS,
BUILT_IN_ULLABS): New builtins.
* builtins.cc (fold_builtin_abs): Handle also folding of u*abs
to ABSU_EXPR.
(fold_builtin_1): Handle BUILT_IN_U{,L,LL,IMAX}ABS.
gcc/lto/ChangeLog:
* lto-lang.cc (flag_isoc2y): New variable.
gcc/ada/ChangeLog:
* gcc-interface/utils.cc (flag_isoc2y): New variable.
gcc/testsuite/
* gcc.c-torture/execute/builtins/lib/abs.c (uintmax_t): New typedef.
(uabs, ulabs, ullabs, uimaxabs): New functions.
* gcc.c-torture/execute/builtins/uabs-1.c: New test.
* gcc.c-torture/execute/builtins/uabs-1.x: New file.
* gcc.c-torture/execute/builtins/uabs-1-lib.c: New file.
* gcc.c-torture/execute/builtins/uabs-2.c: New test.
* gcc.c-torture/execute/builtins/uabs-2.x: New file.
* gcc.c-torture/execute/builtins/uabs-2-lib.c: New file.
* gcc.c-torture/execute/builtins/uabs-3.c: New test.
* gcc.c-torture/execute/builtins/uabs-3.x: New test.
* gcc.c-torture/execute/builtins/uabs-3-lib.c: New test.
|
|
How can you use "read-shared" as an identifier? It is not allowed by any
C standard version.
gcc/testsuite/ChangeLog:
* gcc.c-torture/execute/builtin-prefetch-1.c (rws): Use
"read_shared" instead of "read-shared" as the identifier for
enum value.
* gcc.dg/builtin-prefetch-1.c (rws): Likewise.
|
|
gcc/ChangeLog:
* builtins.cc (expand_builtin_prefetch): Expand for
prefetchrst2.
* common/config/i386/cpuinfo.h (get_available_features): Detect movrs.
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_MOVRS_SET): New.
(OPTION_MASK_ISA2_MOVRS_UNSET): Ditto.
(ix86_handle_option): Handle -mmovrs.
* common/config/i386/i386-cpuinfo.h
(enum processor_features): Add FEATURE_MOVRS.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for movrs.
* config.gcc: Add movrsintrin.h
* config/i386/cpuid.h (bit_MOVRS): New.
* config/i386/i386-builtin-types.def:
Add DEF_FUNCTION_TYPE (CHAR, PCCHAR), (SHORT, PCSHORT), (INT, PCINT),
(INT64, PCINT64).
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-c.cc (ix86_target_macros_internal): Add
__MOVRS__.
* config/i386/i386-expand.cc (ix86_expand_special_args_builtin): Define
__MOVRS__.
* config/i386/i386-isa.def (MOVRS): Add DEF_PTA(MOVRS)
* config/i386/i386-options.cc (ix86_valid_target_attribute_inner_p):
Handle movrs.
* config/i386/i386.md (movrs<mode>): New.
* config/i386/i386.opt: Add option -mmovrs.
* config/i386/i386.opt.urls: Regenerated.
* config/i386/immintrin.h: Include movrsintrin.h
* config/i386/sse.md (unspecv): Add UNSPEC_VMOVRS.
(VI1248_AVX10_2): New.
(avx10_2_movrs_vmovrs<ssemodesuffix><mode><mask_name>): New define_insn.
* config/i386/xmmintrin.h: Add prefetchrst2.
* doc/extend.texi: Document movrs.
* doc/invoke.texi: Document -mmovrs.
* doc/rtl.texi: Document extension of prefetchrst2.
* doc/sourcebuild.texi: Document target movrs.
* config/i386/movrsintrin.h: New.
gcc/testsuite/ChangeLog:
* g++.dg/other/i386-2.C: Add -mmovrs.
* g++.dg/other/i386-3.C: Ditto.
* gcc.c-torture/execute/builtin-prefetch-1.c: Expand rws.
* gcc.dg/builtin-prefetch-1.c: Ditto.
* gcc.target/i386/avx-1.c: Ditto.
* gcc.target/i386/avx-2.c: Ditto.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* gcc.target/i386/sse-12.c: Add -mmovrs.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Add movrs.
* gcc.target/i386/sse-23.c: Ditto
* gcc.target/i386/avx10_2-512-movrs-1.c: New test.
* gcc.target/i386/avx10_2-movrs-1.c: Ditto.
* gcc.target/i386/movrs-1.c: Ditto.
Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
|
|
This is a wrong-code generation on the SPARC for a function containing
a call to __builtin_unreachable caused by the delay slot scheduling pass,
and more specifically the find_end_label function which has these lines:
/* Otherwise, see if there is a label at the end of the function. If there
is, it must be that RETURN insns aren't needed, so that is our return
label and we don't have to do anything else. */
The comment was correct 20 years ago but no longer is nowadays in the
presence of RTL epilogues and calls to __builtin_unreachable, so the
patch just removes the associated two lines of code:
else if (LABEL_P (insn))
*plabel = as_a <rtx_code_label *> (insn);
and otherwise contains just adjustments to the commentary.
gcc/
PR rtl-optimization/117327
* reorg.cc (find_end_label): Do not return a dangling label at the
end of the function and adjust commentary.
gcc/testsuite/
* gcc.c-torture/execute/20241029-1.c: New test.
|
|
Now that C23 support is essentially feature-complete, I'd like to
switch the default language version for C compilation to -std=gnu23.
This requires updating a large number of testcases that fail with the
new language version if left unchanged. In this patch, update most of
the tests for which there is a safe change that works both before and
after the update to default language version - typically adding the
option -std=gnu17 or -Wno-old-style-definition to the tests. (There
are also a few tests where I'd like to investigate further why they
fail with -std=gnu23, or where I think such failures show an actual
bug to fix before changing the default language version, or where it
seems more appropriate to make a testcase change that would result in
failures in the absence of the language version change rather than
just adding an option that does nothing with the gnu17 default.)
The libffi test fixes have also been submitted upstream:
<https://github.com/libffi/libffi/pull/861>.
Most of the failures requiring such changes are for one of two
reasons:
* Unprototyped function declarations with () (meaning the same as
(void) in C23 mode) for a function then called with arguments.
* Old-style function definitions, which warn by default in C23 mode,
so resulting in test failures for the unexpected warnings.
Other reasons for failures include:
* Tests with their own definitions of bool, true and false.
* Tests of diagnostics (often with -pedantic) in cases where C23 has
changed semantics, such as:
- tag compatibility for structs;
- enum values out of range of int;
- handling of qualified array types;
- decimal floating types formerly needing -pedantic diagnostics, but
being standard in C23.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
gcc/testsuite/
* c-c++-common/Wcast-function-type.c: Add -std=gnu17 for C.
* c-c++-common/Wformat-pr84258.c: Add -std=gnu17 for C.
* c-c++-common/Wvarargs.c: Add -std=gnu17 for C.
* c-c++-common/analyzer/data-model-12.c: Add -std=gnu17 for C.
* c-c++-common/builtins.c: Add -std=gnu17 for C.
* c-c++-common/pointer-to-fn1.c: Add -std=gnu17 for C.
* c-c++-common/pragma-diag-17.c: Add -std=gnu17 for C.
* c-c++-common/sizeof-array-argument.c: Add
-Wno-old-style-definition for C.
* g++.dg/lto/pr54625-1_0.c: Add -std=gnu17.
* g++.dg/lto/pr54625-2_0.c: Add -std=gnu17.
* gcc.c-torture/compile/20040214-2.c: Add -std=gnu17.
* gcc.c-torture/compile/921011-2.c: Add -std=gnu17.
* gcc.c-torture/compile/931102-1.c: Add -std=gnu17.
* gcc.c-torture/compile/990801-1.c: Add -std=gnu17.
* gcc.c-torture/compile/nested-1.c: Add -std=gnu17.
* gcc.c-torture/compile/pr100241-1.c: Add -std=gnu17.
* gcc.c-torture/compile/pr106101.c: Add -std=gnu17.
* gcc.c-torture/compile/pr113616.c: Add -std=gnu17.
* gcc.c-torture/compile/pr47967.c: Add -std=gnu17.
* gcc.c-torture/compile/pr51694.c: Add -std=gnu17.
* gcc.c-torture/compile/pr71109.c: Add -std=gnu17.
* gcc.c-torture/compile/pr83051-2.c: Add -std=gnu17.
* gcc.c-torture/compile/pr89663-1.c: Add -std=gnu17.
* gcc.c-torture/compile/pr94238.c: Add -std=gnu17.
* gcc.c-torture/compile/pr96796.c: Add -std=gnu17.
* gcc.c-torture/compile/pr97576.c: Add -std=gnu17.
* gcc.c-torture/compile/udivmod4.c: Add -std=gnu17.
* gcc.c-torture/execute/20010605-2.c: Add -std=gnu17.
* gcc.c-torture/execute/20020404-1.c: Add -std=gnu17.
* gcc.c-torture/execute/20030714-1.c: Add -std=gnu17.
* gcc.c-torture/execute/20051012-1.c: Add -std=gnu17.
* gcc.c-torture/execute/20190820-1.c: Add -std=gnu17.
* gcc.c-torture/execute/920612-1.c: Add -Wno-old-style-definition.
* gcc.c-torture/execute/930608-1.c: Add -std=gnu17.
* gcc.c-torture/execute/comp-goto-1.c: Add -std=gnu17.
* gcc.c-torture/execute/ieee/fp-cmp-1.x: Add -std=gnu17.
* gcc.c-torture/execute/ieee/fp-cmp-2.x: Add -std=gnu17.
* gcc.c-torture/execute/ieee/fp-cmp-3.x: Add -std=gnu17.
* gcc.c-torture/execute/ieee/fp-cmp-4.x: New file.
* gcc.c-torture/execute/ieee/fp-cmp-4f.x: New file.
* gcc.c-torture/execute/ieee/fp-cmp-4l.x: New file.
* gcc.c-torture/execute/loop-9.c: Add -std=gnu17.
* gcc.c-torture/execute/pr103209.c: Add -std=gnu17.
* gcc.c-torture/execute/pr28289.c: Add -std=gnu17.
* gcc.c-torture/execute/pr34982.c: Add -std=gnu17.
* gcc.c-torture/execute/pr67037.c: Add -std=gnu17.
* gcc.c-torture/execute/va-arg-2.c: Add -std=gnu17.
* gcc.dg/20010202-1.c: Add -std=gnu17.
* gcc.dg/20020430-1.c: Add -std=gnu17.
* gcc.dg/20031218-3.c: Add -std=gnu17.
* gcc.dg/20040127-1.c: Add -std=gnu17.
* gcc.dg/20041014-1.c: Add -Wno-old-style-definition.
* gcc.dg/20041122-1.c: Add -std=gnu17.
* gcc.dg/20050309-1.c: Add -std=gnu17.
* gcc.dg/20061026.c: Add -std=gnu17.
* gcc.dg/20101010-1.c: Add -std=gnu17.
* gcc.dg/Warray-parameter-10.c: Add -std=gnu17.
* gcc.dg/Wbuiltin-declaration-mismatch-2.c: Add -std=gnu17.
* gcc.dg/Wbuiltin-declaration-mismatch-3.c: Add -std=gnu17.
* gcc.dg/Wbuiltin-declaration-mismatch-4.c: Add -std=gnu17.
* gcc.dg/Wbuiltin-declaration-mismatch-5.c: Add -std=gnu17.
* gcc.dg/Wbuiltin-declaration-mismatch.c: Add -std=gnu17.
* gcc.dg/Wcxx-compat-2.c: Add -std=gnu17.
* gcc.dg/Wdouble-promotion.c: Add -std=gnu17.
* gcc.dg/Wfree-nonheap-object-7.c: Add -std=gnu17.
* gcc.dg/Wimplicit-int-1.c: Add -std=gnu17.
* gcc.dg/Wimplicit-int-1a.c: Add -std=gnu17.
* gcc.dg/Wimplicit-int-2.c: Add -std=gnu17.
* gcc.dg/Wimplicit-int-3.c: Add -std=gnu17.
* gcc.dg/Wimplicit-int-4.c: Add -std=gnu17.
* gcc.dg/Wimplicit-int-4a.c: Add -std=gnu17.
* gcc.dg/Wincompatible-pointer-types-1.c: Add -std=gnu17.
* gcc.dg/Wrestrict-19.c: Add -std=gnu17.
* gcc.dg/Wrestrict-4.c: Add -std=gnu17.
* gcc.dg/Wrestrict-5.c: Add -std=gnu17.
* gcc.dg/Wstrict-overflow-20.c: Add -std=gnu17.
* gcc.dg/Wstringop-overflow-13.c: Add -std=gnu17.
* gcc.dg/analyzer/doom-d_main-IdentifyVersion.c: Add -std=gnu17.
* gcc.dg/analyzer/doom-s_sound-pr108867.c: Add -std=gnu17.
* gcc.dg/analyzer/pr93032-mztools-signed-char.c: Add
-Wno-old-style-definition.
* gcc.dg/analyzer/pr93032-mztools-unsigned-char.c: Add
-Wno-old-style-definition.
* gcc.dg/analyzer/pr93355-localealias.c: Add
-Wno-old-style-definition.
* gcc.dg/analyzer/pr93375.c: Add -std=gnu17.
* gcc.dg/analyzer/pr94688.c: Add -std=gnu17.
* gcc.dg/analyzer/sensitive-1.c: Add -std=gnu17.
* gcc.dg/analyzer/torture/asm-x86-linux-wfx_get_ps_timeout-full.c:
Add -std=gnu17.
* gcc.dg/analyzer/torture/pr104863.c: Add -std=gnu17.
* gcc.dg/analyzer/torture/pr93379.c: Add -std=gnu17.
* gcc.dg/array-quals-2.c: Add -std=gnu17.
* gcc.dg/attr-invalid.c: Add -Wno-old-style-definition.
* gcc.dg/auto-init-uninit-A.c: Add -Wno-old-style-definition.
* gcc.dg/builtin-choose-expr.c: Declare exit with (int) prototype.
* gcc.dg/builtin-tgmath-err-1.c: Add -std=gnu17.
* gcc.dg/builtins-30.c: Add -std=gnu17.
* gcc.dg/cast-function-1.c: Add -std=gnu17.
* gcc.dg/cleanup-1.c: Add -std=gnu17.
* gcc.dg/compat/struct-complex-1_x.c: Add -std=gnu17.
* gcc.dg/compat/struct-complex-2_x.c: Add -std=gnu17.
* gcc.dg/compat/union-m128-1_x.c: Add -std=gnu17.
* gcc.dg/debug/dwarf2/pr66482.c: Add -std=gnu17.
* gcc.dg/dfp/composite-type-2.c: Add -std=gnu17.
* gcc.dg/dfp/composite-type.c: Add -std=gnu17.
* gcc.dg/dfp/keywords-pedantic.c: Add -std=gnu17.
* gcc.dg/dremf-type-compat-1.c: Add -std=gnu17.
* gcc.dg/dremf-type-compat-2.c: Add -std=gnu17.
* gcc.dg/dremf-type-compat-3.c: Add -std=gnu17.
* gcc.dg/dremf-type-compat-4.c: Add -std=gnu17.
* gcc.dg/enum-compat-1.c: Add -std=gnu17.
* gcc.dg/enum-compat-2.c: Add -std=gnu17.
* gcc.dg/floatn-errs.c: Add -std=gnu17.
* gcc.dg/fltconst-pedantic-dfp.c: Add -std=gnu17.
* gcc.dg/format/proto.c: Add -std=gnu17.
* gcc.dg/format/sentinel-1.c: Add -std=gnu17.
* gcc.dg/gomp/declare-simd-1.c: Add -Wno-old-style-definition.
* gcc.dg/ifelse-1.c: Add -Wno-old-style-definition.
* gcc.dg/inline-33.c: Add -std=gnu17.
* gcc.dg/ipa/inline-5.c: Add -std=gnu17.
* gcc.dg/ipa/ipa-sra-21.c: Add -std=gnu17.
* gcc.dg/ipa/pr102714.c: Add -std=gnu17.
* gcc.dg/ipa/pr104813.c: Add -std=gnu17.
* gcc.dg/ipa/pr108679.c: Add -std=gnu17.
* gcc.dg/ipa/pr42706.c: Add -std=gnu17.
* gcc.dg/ipa/pr88214.c: Add -Wno-old-style-definition.
* gcc.dg/ipa/pr91853.c: Add -Wno-old-style-definition.
* gcc.dg/ipa/pr93763.c: Add -std=gnu17.
* gcc.dg/ipa/pr96482-2.c: Add -std=gnu17.
* gcc.dg/lto/20091013-1_2.c: Add -std=gnu17.
* gcc.dg/lto/20091015-1_2.c: Add -std=gnu17.
* gcc.dg/lto/pr113197_1.c: Add -std=gnu17.
* gcc.dg/lto/pr54702_1.c: Add -std=gnu17.
* gcc.dg/lto/pr99849_0.c: Add -std=gnu17.
* gcc.dg/noncompile/920923-1.c: Add -std=gnu17.
* gcc.dg/noncompile/old-style-parm-1.c: Add
-Wno-old-style-definition.
* gcc.dg/noncompile/old-style-parm-3.c: Add
-Wno-old-style-definition.
* gcc.dg/noncompile/pr30552-2.c: Add -Wno-old-style-definition.
* gcc.dg/noncompile/pr30552-3.c: Add -std=gnu17.
* gcc.dg/noncompile/pr71265.c: Add -Wno-old-style-definition.
* gcc.dg/noncompile/pr79758-2.c: Add -Wno-old-style-definition.
* gcc.dg/noncompile/pr79758.c: Add -Wno-old-style-definition.
* gcc.dg/noncompile/va-arg-1.c: Add -std=gnu17.
* gcc.dg/old-style-prom-1.c: Add -std=gnu17.
* gcc.dg/old-style-prom-2.c: Add -std=gnu17.
* gcc.dg/old-style-prom-3.c: Add -std=gnu17.
* gcc.dg/old-style-then-proto-1.c: Add -std=gnu17.
* gcc.dg/parm-incomplete-1.c: Add -std=gnu17.
* gcc.dg/parm-mismatch-1.c: Add -std=gnu17.
* gcc.dg/permerror-default.c: Add -std=gnu17.
* gcc.dg/permerror-fpermissive-nowarning.c: Add -std=gnu17.
* gcc.dg/permerror-fpermissive.c: Add -std=gnu17.
* gcc.dg/permerror-noerror.c: Add -std=gnu17.
* gcc.dg/permerror-nowarning.c: Add -std=gnu17.
* gcc.dg/permerror-pedantic.c: Add -std=gnu17.
* gcc.dg/plugin/infoleak-net-ethtool-ioctl.c: Add -std=gnu17.
* gcc.dg/pointer-array-quals-1.c: Add -std=gnu17.
* gcc.dg/pointer-array-quals-2.c: Add -std=gnu17.
* gcc.dg/pr100791.c: Add -std=gnu17.
* gcc.dg/pr100843.c: Add -std=gnu17.
* gcc.dg/pr102273.c: Add -std=gnu17.
* gcc.dg/pr102385.c: Add -std=gnu17.
* gcc.dg/pr103222.c: Add -std=gnu17.
* gcc.dg/pr105140.c: Add -std=gnu17.
* gcc.dg/pr105150.c: Add -std=gnu17.
* gcc.dg/pr105250.c: Add -std=gnu17.
* gcc.dg/pr105972.c: Add -Wno-old-style-definition.
* gcc.dg/pr111039.c: Add -std=gnu17.
* gcc.dg/pr111407.c: Add -std=gnu17.
* gcc.dg/pr111922.c: Add -Wno-old-style-definition.
* gcc.dg/pr15236.c: Add -std=gnu17.
* gcc.dg/pr17188-1.c: Add -std=gnu17.
* gcc.dg/pr20368-1.c: Add -std=gnu17.
* gcc.dg/pr20368-2.c: Add -std=gnu17.
* gcc.dg/pr20368-3.c: Add -std=gnu17.
* gcc.dg/pr27331.c: Add -Wno-old-style-definition.
* gcc.dg/pr27861-1.c: Add -std=gnu17.
* gcc.dg/pr28121.c: Add -std=gnu17.
* gcc.dg/pr28243.c: Add -std=gnu17.
* gcc.dg/pr28888.c: Add -std=gnu17.
* gcc.dg/pr29254.c: Add -std=gnu17.
* gcc.dg/pr34457-1.c: Add -std=gnu17.
* gcc.dg/pr36015.c: Add -std=gnu17.
* gcc.dg/pr38245-3.c: Add -std=gnu17.
* gcc.dg/pr38245-4.c: Add -std=gnu17.
* gcc.dg/pr41241.c: Add -std=gnu17.
* gcc.dg/pr43058.c: Add -std=gnu17.
* gcc.dg/pr44539.c: Add -std=gnu17.
* gcc.dg/pr45055.c: Add -std=gnu17.
* gcc.dg/pr50908.c: Add -Wno-old-style-definition.
* gcc.dg/pr60647-1.c: Add -Wno-old-style-definition.
* gcc.dg/pr63762.c: Add -std=gnu17.
* gcc.dg/pr63804.c: Add -std=gnu17.
* gcc.dg/pr68306-3.c: Add -std=gnu17.
* gcc.dg/pr68533.c: Add -std=gnu17.
* gcc.dg/pr69156.c: Add -std=gnu17.
* gcc.dg/pr7356-2.c: Add -Wno-old-style-definition.
* gcc.dg/pr79983.c: Add -std=gnu17.
* gcc.dg/pr83463.c: Add -std=gnu17.
* gcc.dg/pr87347.c: Add -std=gnu17.
* gcc.dg/pr89521-1.c: Add -std=gnu17.
* gcc.dg/pr89521-2.c: Add -std=gnu17.
* gcc.dg/pr90648.c: Add -std=gnu17.
* gcc.dg/pr93573-1.c: Add -std=gnu17.
* gcc.dg/pr94167.c: Add -std=gnu17.
* gcc.dg/pr94705.c: Add -std=gnu17.
* gcc.dg/pr95118.c: Add -std=gnu17.
* gcc.dg/pr96335.c: Add -std=gnu17.
* gcc.dg/pr97830.c: Add -std=gnu17.
* gcc.dg/pr97882.c: Add -std=gnu17.
* gcc.dg/pr99122-2.c: Add -std=gnu17.
* gcc.dg/pr99122-3.c: Add -std=gnu17.
* gcc.dg/qual-component-1.c: Add -std=gnu17.
* gcc.dg/sibcall-6.c: Add -Wno-old-style-definition.
* gcc.dg/sms-2.c: Add -Wno-old-style-definition.
* gcc.dg/tm/20091221.c: Add -std=gnu17.
* gcc.dg/torture/bfloat16-basic.c: Add -Wno-old-style-definition.
* gcc.dg/torture/float128-basic.c: Add -Wno-old-style-definition.
* gcc.dg/torture/float128x-basic.c: Add -Wno-old-style-definition.
* gcc.dg/torture/float16-basic.c: Add -Wno-old-style-definition.
* gcc.dg/torture/float32-basic.c: Add -Wno-old-style-definition.
* gcc.dg/torture/float32x-basic.c: Add -Wno-old-style-definition.
* gcc.dg/torture/float64-basic.c: Add -Wno-old-style-definition.
* gcc.dg/torture/float64x-basic.c: Add -Wno-old-style-definition.
* gcc.dg/torture/pr102762.c: Add -std=gnu17.
* gcc.dg/torture/pr103987.c: Add -std=gnu17.
* gcc.dg/torture/pr104825.c: Add -Wno-old-style-definition.
* gcc.dg/torture/pr105166.c: Add -std=gnu17.
* gcc.dg/torture/pr105185.c: Add -Wno-old-style-definition.
* gcc.dg/torture/pr109652.c: Add -std=gnu17.
* gcc.dg/torture/pr112444.c: Add -std=gnu17.
* gcc.dg/torture/pr113895-3.c: Add -std=gnu17.
* gcc.dg/torture/pr24626-2.c: Add -std=gnu17.
* gcc.dg/torture/pr25183.c: Add -std=gnu17.
* gcc.dg/torture/pr38948.c: Add -std=gnu17.
* gcc.dg/torture/pr44807.c: Add -std=gnu17.
* gcc.dg/torture/pr47281.c: Add -std=gnu17.
* gcc.dg/torture/pr47958-1.c: Add -Wno-old-style-definition.
* gcc.dg/torture/pr48063.c: Add -std=gnu17.
* gcc.dg/torture/pr57036-1.c: Add -std=gnu17.
* gcc.dg/torture/pr57330.c: Add -std=gnu17.
* gcc.dg/torture/pr57584.c: Add -std=gnu17.
* gcc.dg/torture/pr67741.c: Add -std=gnu17.
* gcc.dg/torture/pr68104.c: Add -std=gnu17.
* gcc.dg/torture/pr69242.c: Add -std=gnu17.
* gcc.dg/torture/pr70457.c: Add -std=gnu17.
* gcc.dg/torture/pr70985.c: Add -std=gnu17.
* gcc.dg/torture/pr71606.c: Add -std=gnu17.
* gcc.dg/torture/pr71816.c: Add -std=gnu17.
* gcc.dg/torture/pr77286.c: Add -std=gnu17.
* gcc.dg/torture/pr77646.c: Add -std=gnu17.
* gcc.dg/torture/pr77677-2.c: Add -std=gnu17.
* gcc.dg/torture/pr78365.c: Add -Wno-old-style-definition.
* gcc.dg/torture/pr79732.c: Add -std=gnu17.
* gcc.dg/torture/pr80612.c: Add -std=gnu17.
* gcc.dg/torture/pr80764.c: Add -std=gnu17.
* gcc.dg/torture/pr80842.c: Add -std=gnu17.
* gcc.dg/torture/pr81900.c: Add -std=gnu17.
* gcc.dg/torture/pr82276.c: Add -std=gnu17.
* gcc.dg/torture/pr84803.c: Add -std=gnu17.
* gcc.dg/torture/pr93124.c: Add -std=gnu17.
* gcc.dg/torture/pr97330-1.c: Add -Wno-old-style-definition.
* gcc.dg/tree-prof/comp-goto-1.c: Add -std=gnu17.
* gcc.dg/tree-ssa/20030703-2.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/20030708-1.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/20030709-2.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/20030709-3.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/20030710-1.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/20030711-1.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/20030711-2.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/20030711-3.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/20030714-1.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/20030714-2.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/20030728-1.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/20030807-10.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/20030807-11.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/20030807-3.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/20030807-6.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/20030807-7.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/20030814-4.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/20030814-5.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/20030814-6.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/20030918-1.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/20040514-2.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/loadpre7.c: Add -Wno-old-style-definition.
* gcc.dg/tree-ssa/pr111003.c: Add -std=gnu17.
* gcc.dg/tree-ssa/pr115128.c: Add -std=gnu17.
* gcc.dg/tree-ssa/pr115191.c: Add -std=gnu17.
* gcc.dg/tree-ssa/pr24840.c: Add -std=gnu17.
* gcc.dg/tree-ssa/pr69666.c: Add -std=gnu17.
* gcc.dg/tree-ssa/pr70232.c: Add -std=gnu17.
* gcc.dg/ubsan/pr79757-1.c: Add -Wno-old-style-definition.
* gcc.dg/ubsan/pr79757-2.c: Add -Wno-old-style-definition.
* gcc.dg/ubsan/pr79757-3.c: Add -Wno-old-style-definition.
* gcc.dg/ubsan/pr81223.c: Add -std=gnu17.
* gcc.dg/uninit-10-O0.c: Add -Wno-old-style-definition.
* gcc.dg/uninit-10.c: Add -Wno-old-style-definition.
* gcc.dg/uninit-32.c: Add -std=gnu17.
* gcc.dg/uninit-41.c: Add -std=gnu17.
* gcc.dg/uninit-A-O0.c: Add -Wno-old-style-definition.
* gcc.dg/uninit-A.c: Add -Wno-old-style-definition.
* gcc.dg/unused-1.c: Add -Wno-old-style-definition.
* gcc.dg/vect/bb-slp-pr114249.c: Add -std=gnu17.
* gcc.dg/vect/bb-slp-pr97486.c: Add -std=gnu17.
* gcc.dg/vect/bb-slp-subgroups-1.c: Add -std=gnu17.
* gcc.dg/vect/bb-slp-subgroups-2.c: Add -std=gnu17.
* gcc.dg/vect/bb-slp-subgroups-3.c: Add -std=gnu17.
* gcc.dg/vect/vect-early-break_111-pr113731.c: Add -std=gnu17.
* gcc.dg/vect/vect-early-break_122-pr114239.c: Add -std=gnu17.
* gcc.dg/vect/vect-multi-peel-gaps.c: Add -std=gnu17.
* gcc.dg/vla-stexp-2.c: Add -std=gnu17.
* gcc.dg/warn-1.c: Add -Wno-old-style-definition.
* gcc.dg/winline-10.c: Add -Wno-old-style-definition.
* gcc.dg/wtr-label-1.c: Add -Wno-old-style-definition.
* gcc.dg/wtr-switch-1.c: Add -Wno-old-style-definition.
* gcc.target/i386/excess-precision-3.c: Add
-Wno-old-style-definition.
* gcc.target/i386/fma4-256-nmsubXX.c: Add -std=gnu17.
* gcc.target/i386/fma4-nmsubXX.c: Add -std=gnu17.
* gcc.target/i386/nop-mcount.c: Add -Wno-old-style-definition.
* gcc.target/i386/pr102627.c: Add -std=gnu17.
* gcc.target/i386/pr106994.c: Add -std=gnu17.
* gcc.target/i386/pr68349.c: Add -std=gnu17.
* gcc.target/i386/pr97313.c: Add -std=gnu17.
* gcc.target/i386/pr99454.c: Add -std=gnu17.
* gcc.target/i386/record-mcount.c: Add -Wno-old-style-definition.
libffi/
* testsuite/libffi.call/va_struct2.c (test_fn): Cast n to void.
* testsuite/libffi.call/va_struct3.c (test_fn): Likewise.
Backported from <https://github.com/libffi/libffi/pull/861>.
|
|
Generally PASSes with:
$ ptxas --version
ptxas: NVIDIA (R) Ptx optimizing assembler
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sun_Sep__9_21:06:46_CDT_2018
Cuda compilation tools, release 10.0, V10.0.145
..., and execution with 'Driver Version: 361.93.02'.
Only the '-O1' execution test FAILs (pre-existing; to be analyzed later):
nvptx-run: error getting kernel result: an illegal memory access was encountered (CUDA_ERROR_ILLEGAL_ADDRESS, 700)
gcc/testsuite/
* gcc.c-torture/execute/20020529-1.c: Re-enable all variants for
nvptx.
|
|
After 2014's commit 157e859ffe3b5d43db1e19475711c1a3d21ab57a "remove picochip",
the effective-target 'freestanding' (later) was only ever used for nvptx.
However, the relevant I/O library functions have long been implemented in nvptx
newlib.
These test cases generally PASS, just a few need to get XFAILed; see
<https://docs.nvidia.com/cuda/ptx-writers-guide-to-interoperability/#system-calls>,
and then supposedly
<https://docs.nvidia.com/cuda/cuda-c-programming-guide/#formatted-output> for
description of the non-standard PTX 'vprintf' return value:
> Unlike the C-standard 'printf()', which returns the number of characters
> printed, CUDA's 'printf()' returns the number of arguments parsed. If no
> arguments follow the format string, 0 is returned. If the format string is
> NULL, -1 is returned. If an internal error occurs, -2 is returned.
(I've tried a few variants to confirm that PTX 'vprintf' -- which supposedly is
underlying the CUDA 'printf' -- is what's implementing this behavior.)
Probably, we ought to fix that up in nvptx newlib.
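For reference, a minimal sketch of the C-standard return value that these tests rely on. The helper name `standard_printf_count` is hypothetical, and `snprintf` is used in place of `printf` only so the count can be checked without producing output. Under the CUDA rule quoted above, the equivalent call would presumably report 1 (one argument parsed) rather than the character count.

```c
#include <stdio.h>

/* Hypothetical helper: format VALUE with "%d\n" and return what a
   C-standard printf-family function reports, i.e. the number of
   characters produced (not the number of arguments parsed).  */
int
standard_printf_count (int value)
{
  char buf[32];
  return snprintf (buf, sizeof buf, "%d\n", value);
}
```

For example, `standard_printf_count (42)` yields 3 for the characters '4', '2' and the newline.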
gcc/testsuite/
* gcc.c-torture/execute/printf-1.c: XFAIL for nvptx.
* gcc.c-torture/execute/printf-chk-1.c: Likewise.
* gcc.c-torture/execute/vprintf-1.c: Likewise.
* gcc.c-torture/execute/vprintf-chk-1.c: Likewise.
* lib/target-supports.exp (check_effective_target_freestanding):
Disable for nvptx.
|
|
PR testsuite/108540
gcc/testsuite/
* gcc.c-torture/execute/ieee/pr108540-1.c: Un-preprocess
__SIZE_TYPE__ and __INT64_TYPE__.
* gcc.c-torture/execute/ieee/pr108540-1.x: New file, requires double64.
|
|
The code path which was added for 16bit had a broken inline-asm which would
only assign maybe half of the registers for the `long` type to 0.
Adding L to the input operand of the inline-asm fixes the issue by now assigning
the full 32bit value of the input register that would match up with the output register.
Fixes r0-115223-gb0408f13d4b317 which added the 16bit code path to fix the testcase for 16bit.
Pushed as obvious.
PR testsuite/116716
gcc/testsuite/ChangeLog:
* gcc.c-torture/execute/pr52286.c: Fix inline-asm for 16bit case.
|
|
gcc/testsuite/
* gcc.c-torture/execute/20021120-1.c: Skip if not size20plus or -Os.
* gcc.dg/fixed-point/convert-float-4.c: Require size20plus.
* gcc.dg/torture/pr112282.c: Skip if -O0 unless size20plus.
* g++.dg/lookup/pr21802.C: Require size20plus.
|
|
PR ipa/111613
* gcc.c-torture/pr111613.c: Rename to..
* gcc.c-torture/execute/pr111613.c: ...this.
|
|
PR middle-end/115277
* gcc.c-torture/compile/pr115277.c: Rename to...
* gcc.c-torture/execute/pr115277.c: ...this.
|
|
function call parameters
modref_eaf_analysis::analyze_ssa_name misinterprets EAF flags. If a dereferenced
parameter is passed (to map_iterator in the testcase), it can be returned
indirectly, which in turn lets it escape into the next function call.
PR ipa/115033
gcc/ChangeLog:
* ipa-modref.cc (modref_eaf_analysis::analyze_ssa_name): Fix checking of
EAF flags when analysing values dereferenced as function parameters.
gcc/testsuite/ChangeLog:
* gcc.c-torture/execute/pr115033.c: New test.
|
|
unadjusted_ptr_and_unit_offset accidentally throws away the offset computed by
get_addr_base_and_unit_offset. Instead of passing extra_offset it passes offset.
PR ipa/114207
gcc/ChangeLog:
* ipa-prop.cc (unadjusted_ptr_and_unit_offset): Fix accounting of offsets in ADDR_EXPR.
gcc/testsuite/ChangeLog:
* gcc.c-torture/execute/pr114207.c: New test.
|
|
According to the IEEE standard, for conversions from floating point to
integer: when a NaN or infinite operand cannot be represented in the
destination format and this cannot otherwise be indicated, the invalid
operation exception shall be signaled. When a numeric operand would
convert to an integer outside the range of the destination format, the
invalid operation exception shall be signaled if this situation cannot
otherwise be indicated.
The patch prevents simplification of the conversion from floating point
to integer for NAN/INF/out-of-range constants when flag_trapping_math.
gcc/ChangeLog:
PR rtl-optimization/100927
PR rtl-optimization/115161
PR rtl-optimization/115115
* simplify-rtx.cc (simplify_const_unary_operation): Prevent
simplification of FIX/UNSIGNED_FIX for NAN/INF/out-of-range
constant when flag_trapping_math.
* fold-const.cc (fold_convert_const_int_from_real): Don't fold
for overflow value when flag_trapping_math.
gcc/testsuite/ChangeLog:
* gcc.dg/pr100927.c: New test.
* c-c++-common/Wconversion-1.c: Add -fno-trapping-math.
* c-c++-common/dfp/convert-int-saturate.c: Ditto.
* g++.dg/ubsan/pr63956.C: Ditto.
* g++.dg/warn/Wconversion-real-integer.C: Ditto.
* gcc.c-torture/execute/20031003-1.c: Ditto.
* gcc.dg/Wconversion-complex-c99.c: Ditto.
* gcc.dg/Wconversion-real-integer.c: Ditto.
* gcc.dg/c90-const-expr-11.c: Ditto.
* gcc.dg/overflow-warn-8.c: Ditto.
|
|
__builtin{add,sub}c [PR108789]
The following testcase is miscompiled, because we use save_expr
on the .{ADD,SUB,MUL}_OVERFLOW call we are creating, but if the first
two operands are not INTEGER_CSTs (in that case we just fold it right away)
but are TREE_READONLY/!TREE_SIDE_EFFECTS, save_expr doesn't actually
create a SAVE_EXPR at all and so we lower it to
*arg2 = REALPART_EXPR (.ADD_OVERFLOW (arg0, arg1)), \
IMAGPART_EXPR (.ADD_OVERFLOW (arg0, arg1))
which evaluates the ifn twice and just hopes it will be CSEd back.
As *arg2 aliases *arg0, that is not the case.
The builtins are really never const/pure as they store into what
the third argument points to, so after handling the INTEGER_CST+INTEGER_CST
case, I think we should just always use SAVE_EXPR. Just building SAVE_EXPR
by hand and setting TREE_SIDE_EFFECTS on it doesn't work, because
c_fully_fold optimizes it away again, so the following patch marks the
ifn calls as TREE_SIDE_EFFECTS (but doesn't do it for the
__builtin_{add,sub,mul}_overflow_p cases, which were designed for use
especially in constant expressions and don't really evaluate the
realpart side, so we don't really need a SAVE_EXPR in that case).
2024-06-04 Jakub Jelinek <jakub@redhat.com>
PR middle-end/108789
* builtins.cc (fold_builtin_arith_overflow): For ovf_only,
don't call save_expr and don't build REALPART_EXPR, otherwise
set TREE_SIDE_EFFECTS on call before calling save_expr.
(fold_builtin_addc_subc): Set TREE_SIDE_EFFECTS on call before
calling save_expr.
* gcc.c-torture/execute/pr108789.c: New test.
|
|
The problem here is that the pattern added in r13-1162-g9991d84d2a8435
assumes that it is well defined to multiply zero_one_valued_p by the truncated
converted integer constant. It is well defined for all types except for signed 1-bit types,
where `a * -1` is produced, which is undefined.
So disable this pattern for 1-bit signed types.
Note the pattern added in r14-3432-gddd64a6ec3b38e is able to work around the undefinedness except when
`-fsanitize=undefined` is turned on; this is why I added a testcase for that.
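To see why signed 1-bit types are special, note that such a type can hold only 0 and -1, so the negation of -1 is not representable. A minimal sketch (the struct and function names are hypothetical, not taken from the new testcases):

```c
/* A signed bit-field of width 1 has exactly two values: 0 and -1.  */
struct flags
{
  signed int bit : 1;
};

/* Hypothetical helper: store V in the 1-bit field and read it back.
   Only 0 and -1 round-trip portably; storing 1 is an
   implementation-defined conversion.  */
int
read_back (int v)
{
  struct flags f;
  f.bit = v;
  return f.bit;
}
```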
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
PR tree-optimization/115154
gcc/ChangeLog:
* match.pd (convert (mult zero_one_valued_p@1 INTEGER_CST@2)): Disable
for 1bit signed types.
gcc/testsuite/ChangeLog:
* c-c++-common/ubsan/signed1bitfield-1.c: New test.
* gcc.c-torture/execute/signed1bitfield-1.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
TARGET_MEM_REF can be used to offset constant base into a memory object (to
produce lea instruction). This confuses points_to_local_or_readonly_memory_p
which treats the constant address as a base of the access.
Bootstrapped/regtested x86_64-linux, committed.
Honza
gcc/ChangeLog:
PR ipa/113787
* ipa-fnsummary.cc (points_to_local_or_readonly_memory_p): Do not
look into TARGET_MEM_REFS with constant operand 0.
gcc/testsuite/ChangeLog:
* gcc.c-torture/execute/pr113787.c: New test.
|
|
The optimize_range_tests_to_bit_test optimization normally emits a range
test first:
if (entry_test_needed)
{
tem = build_range_check (loc, optype, unshare_expr (exp),
false, lowi, high);
if (tem == NULL_TREE || is_gimple_val (tem))
continue;
}
so during the bit test we already know that exp is in the [lowi, high]
range, but skips it if we have range info which tells us this isn't
necessary.
Also, normally it emits shifts by exp - lowi counter, but has an
optimization to use just exp counter if the mask isn't a more expensive
constant in that case and lowi is > 0 and high is smaller than prec.
The following testcase is miscompiled because the two abnormal cases
are triggered. The range of exp is [43, 43][48, 48][95, 95], so we on
64-bit arch decide we don't need the entry test, because 95 - 43 < 64.
And we also decide to use just exp as counter, because the range test
tests just for exp == 43 || exp == 48, so high is smaller than 64 too.
Because 95 is in the exp range, we can't do that, we'd either need to
do a range test first, i.e.
if (exp - 43U <= 48U - 43U) if ((1UL << exp) & mask1)
or need to subtract lowi from the shift counter, i.e.
if ((1UL << (exp - 43)) & mask2)
but can't do both unless r.upper_bound () is < prec.
The following patch ensures that.
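The two safe forms above can be sketched in plain C as follows. This is an illustration only, not the code GCC emits: the function names are hypothetical, `unsigned long long` stands in for the 64-bit word, and the second variant carries an explicit guard that plays the role of the range-info precondition (exp - 43 smaller than prec).

```c
#include <stdbool.h>

/* Reference semantics of the original range test.  */
bool
is_43_or_48 (unsigned exp)
{
  return exp == 43 || exp == 48;
}

/* Variant 1: emit the entry range test, then shift by exp itself.  */
bool
bit_test_with_entry (unsigned exp)
{
  const unsigned long long mask1 = (1ULL << 43) | (1ULL << 48);
  if (exp - 43U <= 48U - 43U)
    return ((1ULL << exp) & mask1) != 0;
  return false;
}

/* Variant 2: no [43, 48] entry test, but subtract lowi from the shift
   count.  The guard keeps the shift defined for any input; in the
   commit's scenario range info (exp in [43, 95]) guarantees it.  */
bool
bit_test_with_sub (unsigned exp)
{
  const unsigned long long mask2 = (1ULL << 0) | (1ULL << (48 - 43));
  if (exp - 43U >= 64U)
    return false;
  return ((1ULL << (exp - 43U)) & mask2) != 0;
}
```

Both variants agree with the reference for exp == 95. The miscompiled combination (no entry test and no subtraction) would instead shift by 95, which is out of range for a 64-bit word.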
2024-05-08 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/114965
* tree-ssa-reassoc.cc (optimize_range_tests_to_bit_test): Don't try to
optimize away exp - lowi subtraction from shift count unless entry
test is emitted or unless r.upper_bound () is smaller than prec.
* gcc.c-torture/execute/pr114965.c: New test.
|
|
The problem is that the `!a?b:c` pattern will create a COND_EXPR with a 1-bit signed integer,
which breaks patterns like `a?~t:t`. This rejects signed operands for
both patterns.
Note for GCC 15, I am going to look at the canonicalization of `a?~t:t` where t
was a constant since I think keeping it a COND_EXPR might be more canonical and
is what VRP produces from the same IR; if anything expand should handle which one
is better.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
PR tree-optimization/114666
gcc/ChangeLog:
* match.pd (`!a?b:c`): Reject signed types for the condition.
(`a?~t:t`): Likewise.
gcc/testsuite/ChangeLog:
* gcc.c-torture/execute/bitfld-signed1-1.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
r13-990 added optimizations in multiple spots to optimize during
expansion storing of constant initializers into targets.
In the load_register_parameters and expand_expr_real_1 cases,
it checks it has a tree as the source and so knows we are reading
that whole decl's value, so the code is fine as is, but in the
emit_push_insn case it checks for a MEM from which something
is pushed and checks for SYMBOL_REF as the MEM's address, but
still assumes the whole object is copied, which as the following
testcase shows might not always be the case. In the testcase,
k is 6 bytes, then 2 bytes of padding, then another 4 bytes,
while the emit_push_insn wants to store just the 6 bytes.
The following patch simply verifies it is the whole initializer
that is being stored; I think that is the best thing to do so late
in the GCC 14 cycle, as well as for backporting.
For GCC 15, perhaps the code could stop requiring it must be at offset zero,
nor that the size is equal, but could use
get_symbol_constant_value/fold_ctor_reference gimple-fold APIs to actually
extract just part of the initializer if we e.g. push just some subset
(of course, still verify that it is a subset). For sizes which are power
of two bytes and we have some integer modes, we could use as type for
fold_ctor_reference corresponding integral types, otherwise dunno, punt
or use some structure (e.g. try to find one in the initializer?), whatever.
But even in the other spots it could perhaps handle loading of
COMPONENT_REFs or MEM_REFs from the .rodata vars.
2024-04-03 Jakub Jelinek <jakub@redhat.com>
PR middle-end/114552
* expr.cc (emit_push_insn): Only use store_constructor for
immediate_const_ctor_p if int_expr_size matches size.
* gcc.c-torture/execute/pr114552.c: New test.
|
|
This testcase was made latent by r14-4089 and got fixed both
on the trunk and 13 branch with PR113372 fix.
Adding testcase to the testsuite and will close the PR as a dup.
2024-03-28 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/109925
* gcc.c-torture/execute/pr109925.c: New test.
|
|
Apparently I've somehow screwed up the adjustments of the originally tested
testcase, tweaked it so that in the second/third cases it actually sees
a MAX_EXPR rather than the COND_EXPR the MAX_EXPR has been optimized into,
and didn't update the expected value.
2024-03-26 Jakub Jelinek <jakub@redhat.com>
PR middle-end/111151
PR testsuite/114486
* gcc.c-torture/execute/pr111151.c (main): Fix up expected value for
f.
|
|
As I've tried to explain in the comments, the extract_muldiv_1
MIN/MAX_EXPR optimization is wrong for code == MULT_EXPR.
If the multiplication is done in unsigned type or in signed
type with -fwrapv, it is fairly obvious that max (a, b) * c
in many cases isn't equivalent to max (a * c, b * c) (or min if c is
negative) due to overflows, but even for signed with undefined overflow,
the optimization could turn something without UB in it into code with UB (where
say a * c invokes UB, but max (or min) picks the other operand where
b * c doesn't).
As for division/modulo, I think it is in most cases safe, except if
the problematic INT_MIN / -1 case could be triggered, but we can
just punt for MAX_EXPR because for MIN_EXPR if one operand is INT_MIN,
we'd pick that operand already. It is just for completeness, match.pd
already has an optimization which turns x / -1 into -x, so the division
by zero is mostly theoretical. That is also why in the testcase the
i case isn't actually miscompiled without the patch, while the c and f
cases are.
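A small sketch of the unsigned (wrapping) case, showing that max (a, b) * c and max (a * c, b * c) can differ. The function names are hypothetical; `uint32_t` is used so the wrap-around point is fixed.

```c
#include <stdint.h>

static uint32_t
max_u32 (uint32_t a, uint32_t b)
{
  return a > b ? a : b;
}

/* The original expression: max (a, b) * c.  */
uint32_t
mul_of_max (uint32_t a, uint32_t b, uint32_t c)
{
  return max_u32 (a, b) * c;
}

/* What the extract_muldiv_1 transformation would compute instead:
   max (a * c, b * c).  */
uint32_t
max_of_mul (uint32_t a, uint32_t b, uint32_t c)
{
  return max_u32 (a * c, b * c);
}
```

With a = 0x80000000, b = 1 and c = 2, the first form wraps to 0, while the second yields 2 because a * c wraps to 0 and b * c is 2.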
2024-03-26 Jakub Jelinek <jakub@redhat.com>
PR middle-end/111151
* fold-const.cc (extract_muldiv_1) <case MAX_EXPR>: Punt for
MULT_EXPR altogether, or for MAX_EXPR if c is -1.
* gcc.c-torture/execute/pr111151.c: New test.
|
|
Also fixed a typo in the testcase.
gcc/testsuite/ChangeLog:
PR tree-optimization/114396
* gcc.target/i386/pr114396.c: Move to...
* gcc.c-torture/execute/pr114396.c: ...here.
|
|
Excerpt from gcc.sum:
[...]
PASS: gcc.c-torture/execute/20101011-1.c -O0 (test for excess errors)
FAIL: gcc.c-torture/execute/20101011-1.c -O0 execution test
PASS: gcc.c-torture/execute/20101011-1.c -O1 (test for excess errors)
FAIL: gcc.c-torture/execute/20101011-1.c -O1 execution test
[ ... ]
This is because H8 MCUs do not throw a "divide by zero" exception.
gcc/testsuite
* gcc.c-torture/execute/20101011-1.c: Do not test on H8 series.
|
|
WORD_REGISTER_OPERATIONS [PR113010]
The sign-bit-copies of a sign-extending load cannot be known until runtime on
WORD_REGISTER_OPERATIONS targets, except in the case of a zero-extending MEM
load. See the fix for PR112758.
gcc/
PR rtl-optimization/113010
* combine.cc (simplify_comparison): Simplify a SUBREG on
WORD_REGISTER_OPERATIONS targets only if it is a zero-extending
MEM load.
gcc/testsuite
* gcc.c-torture/execute/pr113010.c: New test.
|
|
2024-02-11 John David Anglin <danglin@gcc.gnu.org>
gcc/testsuite/ChangeLog:
* gcc.c-torture/execute/ieee/cdivchkf.c: Use ilogb and
__builtin_fmax instead of ilogbf and __builtin_fmaxf.
|