path: root/gcc/testsuite/gcc.c-torture/execute
Age | Commit message | Author | Files | Lines
3 days | combine: Use reg_used_between_p rather than modified_between_p in two spots ↵ | Jakub Jelinek | 1 | -0/+33
[PR119291]

The following testcase is miscompiled on x86_64-linux at -O2 by the combiner. We have from earlier combinations

(insn 22 21 23 4 (set (reg:SI 104 [ _7 ]) (const_int 0 [0])) "pr119291.c":25:15 96 {*movsi_internal} (nil))
(insn 23 22 24 4 (set (reg/v:SI 117 [ e ]) (reg/v:SI 116 [ e ])) 96 {*movsi_internal} (expr_list:REG_DEAD (reg/v:SI 116 [ e ]) (nil)))
(note 24 23 25 4 NOTE_INSN_DELETED)
(insn 25 24 26 4 (parallel [ (set (reg:CCZ 17 flags) (compare:CCZ (neg:SI (reg:SI 104 [ _7 ])) (const_int 0 [0]))) (set (reg/v:SI 116 [ e ]) (neg:SI (reg:SI 104 [ _7 ]))) ]) "pr119291.c":26:13 977 {*negsi_2} (expr_list:REG_DEAD (reg:SI 104 [ _7 ]) (nil)))
(note 26 25 27 4 NOTE_INSN_DELETED)
(insn 27 26 28 4 (set (reg:DI 128 [ _9 ]) (ne:DI (reg:CCZ 17 flags) (const_int 0 [0]))) "pr119291.c":26:13 1447 {*setcc_di_1} (expr_list:REG_DEAD (reg:CCZ 17 flags) (nil)))

and try_combine is called on i3 25 and i2 22 (second time) and reaches the hunk being patched with simplified i3

(insn 25 24 26 4 (parallel [ (set (pc) (pc)) (set (reg/v:SI 116 [ e ]) (const_int 0 [0])) ]) "pr119291.c":28:13 977 {*negsi_2} (expr_list:REG_DEAD (reg:SI 104 [ _7 ]) (nil)))

and

(insn 22 21 23 4 (set (reg:SI 104 [ _7 ]) (const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal} (nil))

Now, the try_combine code there attempts to split two independent sets in newpat by moving one of them to i2. Among other tests it checks !modified_between_p (SET_DEST (set1), i2, i3), which is certainly needed: if there were, say, (set (reg/v:SI 116 [ e ]) (const_int 42 [0x2a])) in between i2 and i3, we couldn't do that, as that set would overwrite the value set by set1 we want to move to the i2 position. But in this case pseudo 116 isn't set in between i2 and i3, but used (and additionally there is a REG_DEAD note for it).
This is equally bad for the move, because while the i3 insn and later will see the pseudo value that we set, the insn in between which uses the value will see a different value from the one that it should see. As we don't check for that, in the end try_combine succeeds and changes the IL to:

(insn 22 21 23 4 (set (reg/v:SI 116 [ e ]) (const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal} (nil))
(insn 23 22 24 4 (set (reg/v:SI 117 [ e ]) (reg/v:SI 116 [ e ])) 96 {*movsi_internal} (expr_list:REG_DEAD (reg/v:SI 116 [ e ]) (nil)))
(note 24 23 25 4 NOTE_INSN_DELETED)
(insn 25 24 26 4 (set (pc) (pc)) "pr119291.c":28:13 2147483647 {NOOP_MOVE} (nil))
(note 26 25 27 4 NOTE_INSN_DELETED)
(insn 27 26 28 4 (set (reg:DI 128 [ _9 ]) (const_int 0 [0])) "pr119291.c":28:13 95 {*movdi_internal} (nil))

(note, the i3 got turned into a nop and try_combine also modified insn 27).

The following patch replaces the modified_between_p tests with reg_used_between_p; my understanding is that modified_between_p is a subset of reg_used_between_p, so one doesn't need both.

Looking at this some more today, I think we should special case set_noop_p because that can be put into i2 (except for the JUMP_P violations); currently both modified_between_p (pc_rtx, i2, i3) and reg_used_between_p (pc_rtx, i2, i3) return false. I'll post a patch incrementally for that (but that feels like a new optimization, so probably not something that should be backported).

On Tue, Apr 01, 2025 at 11:27:25AM +0200, Richard Biener wrote:
> Can we constrain SET_DEST (set1/set0) to a REG_P in combine? Why
> does the comment talk about memory?

I was worried about making too risky changes this late in stage4 (and especially also for backports). Most of this code is 1992-ish.
I think many of the functions are just misnamed, the reg_ in there doesn't match what those functions do (I bet they initially supported just REGs and later on support for other kinds of expressions was added, but I haven't done git archeology to prove that).

What we know for sure is:

  && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != ZERO_EXTRACT
  && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != STRICT_LOW_PART
  && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != ZERO_EXTRACT
  && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != STRICT_LOW_PART

that is checked earlier in the condition. Then it calls

  && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)), XVECEXP (newpat, 0, 0))
  && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)), XVECEXP (newpat, 0, 1))

While it has reg_* in it, that function mostly calls reg_overlap_mentioned_p, which is also misnamed: that function handles just fine all of REG, MEM, SUBREG of REG (SUBREG of MEM not, see below), ZERO_EXTRACT, STRICT_LOW_PART, PC and even some further cases. So, IMHO SET_DEST (set0) or SET_DEST (set1) can certainly be a REG, SUBREG of REG, PC (at least the REG and PC cases are triggered on the testcase) and quite possibly also MEM (SUBREG of MEM not, see below).

Now, the code uses !modified_between_p (SET_SRC (set{1,0}), i2, i3) where that function for constants just returns false, for PC returns true, for REG returns reg_set_between_p, for MEM recurses on the address, for MEM_READONLY_P otherwise returns false, otherwise checks using alias.cc code whether the memory could have been modified in between, and for all other rtxes recurses on the subrtxes. This part didn't change in my patch. I've only changed those

-  && !modified_between_p (SET_DEST (set{1,0}), i2, i3)
+  && !reg_used_between_p (SET_DEST (set{1,0}), i2, i3)

where the former has been described above and clearly handles all of REG, SUBREG of REG, PC, MEM and SUBREG of MEM among other things.
The replacement reg_used_between_p calls reg_overlap_mentioned_p on each instruction in between i2 and i3. So, there is clearly a difference in behavior if SET_DEST (set{1,0}) is pc_rtx: in that case modified_between_p returns unconditionally true even if there are no instructions in between, but reg_used_between_p returns false if there are no non-debug insns in between. Sorry for missing that, guess I should check for that (with the exception of the noop moves which are often (set (pc) (pc)) and handled by the incremental patch). In fact not just that, reg_used_between_p will only return true for PC if it is mentioned anywhere in the insns in between.

Anyway, except for that, for REG it calls refers_to_regno_p and so should find any occurrences of any of the REG or parts of it for hard registers, for MEM it returns true if it sees any MEMs in insns in between (conservatively), for SUBREGs apparently it relies on it being SUBREG of REG (so doesn't handle SUBREG of MEM) and handles SUBREG of REG like the SUBREG_REG; PC I've already described.

Now, because reg_overlap_mentioned_p doesn't handle SUBREG of MEM, I think already the initial

  && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)), XVECEXP (newpat, 0, 0))
  && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)), XVECEXP (newpat, 0, 1))

calls would have failed --enable-checking=rtl or would have misbehaved, so I think there is no need to check for it further.

To your question why I don't use reg_referenced_p: that is because reg_referenced_p is something to call on one insn pattern, while reg_used_between_p is pretty much that on all insns in between two instructions (excluding the boundaries).

So, I think it would be safer to add && SET_DEST (set{1,0}) != pc_rtx checks to preserve former behavior, like in the following version.
2025-04-01  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/119291
	* combine.cc (try_combine): For splitting of PARALLEL with 2 independent SETs into i2 and i3 sets check reg_used_between_p of the SET_DESTs rather than just modified_between_p.

	* gcc.c-torture/execute/pr119291.c: New test.
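The difference between the two checks can be illustrated with a toy model. This is only a sketch: `struct toy_insn`, `toy_set_between` and `toy_mentioned_between` are hypothetical stand-ins invented for illustration and are not the GCC functions, which operate on full RTL. Each toy insn writes at most one register and reads at most one register.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy instruction: writes DEST and reads SRC; -1 means "none".  */
struct toy_insn {
    int dest;
    int src;
};

/* Old-style check: is REG written by any insn strictly between I2 and I3?  */
static bool toy_set_between(const struct toy_insn *insns, size_t i2, size_t i3,
                            int reg)
{
    for (size_t i = i2 + 1; i < i3; i++)
        if (insns[i].dest == reg)
            return true;
    return false;
}

/* New-style check: is REG mentioned at all (written or read) by any insn
   strictly between I2 and I3?  */
static bool toy_mentioned_between(const struct toy_insn *insns, size_t i2,
                                  size_t i3, int reg)
{
    for (size_t i = i2 + 1; i < i3; i++)
        if (insns[i].dest == reg || insns[i].src == reg)
            return true;
    return false;
}
```

With a sequence shaped like insns 22, 23 and 25 above, i.e. `{104, -1}, {117, 116}, {116, 104}`, the first check sees no write of register 116 between the ends and would allow moving the set, while the second check notices the read in the middle insn and refuses.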
10 days | i386: Fix up combination of -2 r<<= (x & 7) into btr [PR119428] | Jakub Jelinek | 1 | -0/+18
The following testcase is miscompiled from r15-8478 but latently already since my r11-5756 and r11-6631 changes. The r11-5756 change was https://gcc.gnu.org/pipermail/gcc-patches/2020-December/561164.html which changed the splitters to immediately throw away the masking. And the r11-6631 change was an optimization to recognize (set (zero_extract:HI (...) (const_int 1) (...)) (const_int 1)) as btr. The problem is their interaction.

x86 is not a SHIFT_COUNT_TRUNCATED target, so the masking needs to be explicit in the IL. And combine.cc (make_field_assignment) has had since 1992 optimizations which try to optimize x &= (-2 r<< y) into zero_extract (x) = 0. Now, such an optimization is fine if y has not been masked or if the chosen zero_extract has the same mode as the rotate (or it recognizes something with a left shift too).

IMHO such optimization is invalid for SHIFT_COUNT_TRUNCATED targets because we explicitly say that the masking of the shift/rotate counts is redundant there and doesn't need to be part of the IL (I have a patch for that, but because it is just latent, I'm not sure it needs to be posted for gcc 15 (and also am not sure if it should punt or add operand masking just in case)).

x86 is not SHIFT_COUNT_TRUNCATED though and so even fixing combine not to do that for SHIFT_COUNT_TRUNCATED targets doesn't help, and we don't have QImode insv, so it is optimized into HImode insertions. Now, if the y in x &= (-2 r<< y) wasn't masked in any way, turning it into HImode btr is just fine, but if it was x &= (-2 r<< (y & 7)) and we just decided to throw away the masking, using btr changes the behavior and causes e2fsprogs and sqlite miscompilations.
So IMHO on !SHIFT_COUNT_TRUNCATED targets, we need to keep the maskings explicit in the IL, either at least for the duration of the combine pass as does the following patch (where combine is the only known pass to have such transformation), or even keep it until final pass in case there are some later optimizations that would also need to know whether there was explicit masking or not and with what mask. The latter change would be much larger.

The following patch just reverts the r11-5756 change and adds a testcase.

2025-03-25  Jakub Jelinek  <jakub@redhat.com>

	PR target/96226
	PR target/119428
	* config/i386/i386.md (splitter after *<rotate_insn><mode>3_mask, splitter after *<rotate_insn><mode>3_mask_1): Revert 2020-12-05 changes.

	* gcc.c-torture/execute/pr119428.c: New test.
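To make the behavior change concrete, here is a minimal source-level sketch (not the committed testcase, whose contents are not shown here). `clear_bit_rotate8` models x &= (-2 r<< (y & 7)) on an 8-bit value, and `clear_bit_btr16` models a 16-bit btr with a register bit index, under the assumption that the hardware takes the index modulo 16. Both function names are made up for this illustration.

```c
#include <stdint.h>

/* x &= (-2 r<< (y & 7)) in 8 bits: rotate 0xFE left by the masked count and
   AND it in.  This always clears bit (y & 7).  */
static uint8_t clear_bit_rotate8(uint8_t x, unsigned y)
{
    unsigned n = y & 7;
    uint8_t mask = (uint8_t)((0xFEu << n) | (0xFEu >> (8 - n)));
    return (uint8_t)(x & mask);
}

/* Model of a 16-bit btr with a register bit index (assumed to be taken
   modulo 16): clears bit (y & 15) of the 16-bit operand.  */
static uint16_t clear_bit_btr16(uint16_t x, unsigned y)
{
    return (uint16_t)(x & (uint16_t)~(1u << (y & 15)));
}
```

For y in 0..7 the low byte of both results agrees. For y = 11 the rotate form clears bit 3, while the btr model clears bit 11 and leaves the low byte untouched, which is the kind of difference that shows up once the (y & 7) masking has been dropped.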
2025-03-04 | simplify-rtx: Fix up simplify_logical_relational_operation [PR119002] | Richard Sandiford | 1 | -0/+23
The following testcase is miscompiled on powerpc64le-linux starting with r15-6777. During combine we see:

(set (reg:SI 134) (ior:SI (ge:SI (reg:CCFP 128) (const_int 0 [0])) (lt:SI (reg:CCFP 128) (const_int 0 [0]))))

The simplify_logical_relational_operation code (in its current form) was written with arithmetic rather than CC modes in mind. Since CCFP is a CC mode, it fails the HONOR_NANS check, and so the function assumes that ge | lt => true.

If one comparison is unsigned then it should be safe to assume that the other comparison is also unsigned, even for CC modes, since the optimisation checks that the comparisons are between the same operands. For the other cases, we can only safely fold comparisons of CC mode values if the result is always-true (15) or always-false (0).

It turns out that the original testcase for PR117186, which ran at -O, was relying on the old behaviour for some of the functions. It needs 4-instruction combinations, and so -fexpensive-optimizations, to pass in its intended form.

gcc/
	PR rtl-optimization/119002
	* simplify-rtx.cc (simplify_context::simplify_logical_relational_operation): Handle comparisons between CC values. If there is no evidence that the CC values are unsigned, restrict the fold to always-true or always-false results.

gcc/testsuite/
	* gcc.c-torture/execute/ieee/pr119002.c: New test.
	* gcc.target/aarch64/pr117186.c: Run at -O2 rather than -O.

Co-authored-by: Jakub Jelinek <jakub@redhat.com>
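A source-level analogue of why ge | lt must not be folded to true for floating-point comparisons is sketched below. This is not the committed pr119002.c test; `ge_or_lt` is a name chosen for illustration, and the sketch assumes IEEE semantics (no -ffast-math).

```c
#include <math.h>
#include <stdbool.h>

/* (a >= b) | (a < b): true for every ordered pair, but false when either
   operand is a NaN, because both comparisons are then false.  */
static bool ge_or_lt(double a, double b)
{
    return (a >= b) | (a < b);
}
```

So a fold of the IOR to constant true is only valid when NaNs need not be honored, which the mode of a CC register alone does not tell us.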
2025-03-04 | testsuite: Add tests for already fixed PR [PR119071] | Jakub Jelinek | 1 | -0/+15
Uros' r15-7793 fixed this PR as well; I'm just committing tests from the PR so that it can be closed.

2025-03-04  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/119071
	* gcc.dg/pr119071.c: New test.
	* gcc.c-torture/execute/pr119071.c: New test.
2025-02-27 | gimple-fold: Fix a pasto in fold_truth_andor_for_ifcombine [PR119030] | Jakub Jelinek | 1 | -0/+26
The following testcase is miscompiled since r15-7597. The left comparison is unsigned, ((x & 0x8000U) != 0), while the right one is signed, ((x >> 16) >= 0), and is actually a signbit test, so rsignbit is 64.

After debugging this and reading the r15-7597 change, I believe there is just a pasto: the if (lsignbit) and if (rsignbit) blocks are pretty much identical, with just the first l on all variables starting with l replaced with r (the only difference is that if (lsignbit) has a comment explaining the sign <<= 1; stuff, while it isn't repeated in the second one). Except the second one was using ll_unsignedp instead of rl_unsignedp in one spot. I think it should use the latter; the signedness of the left comparison doesn't affect the other one, they are basically independent with the exception that we check that after transformations they are both EQ or both NE and later on we try to merge them together.

2025-02-27  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/119030
	* gimple-fold.cc (fold_truth_andor_for_ifcombine): Fix a pasto, ll_unsignedp -> rl_unsignedp.

	* gcc.c-torture/execute/pr119030.c: New test.
2025-02-24 | reassoc: Fix up optimize_range_tests_to_bit_test [PR118915] | Jakub Jelinek | 1 | -0/+22
The following testcase is miscompiled due to a bug in optimize_range_tests_to_bit_test. It is trying to optimize a check for a being in the [-34,-34] or [-26,-26] or [-6,-6] or [-4,inf] ranges. Another reassoc optimization folds the test for the first two ranges into (a + 34U) & ~8U in [0U,0U] range, and extract_bit_test_mask actually has code to virtually undo it and treat that again as a test for a being -34 or -26.

The problem is that optimize_range_tests_to_bit_test remembers in the type variable TREE_TYPE (ranges[i].exp); from the first range. If extract_bit_test_mask doesn't do that virtual undoing of the BIT_AND_EXPR handling, that is just fine, the returned exp is ranges[i].exp. But if the first range is BIT_AND_EXPR, the type could be different: the BIT_AND_EXPR form has the optional cast to the corresponding unsigned type in order to avoid introducing UB.

Now, type was used to fill in the max value if ranges[j].high was missing in a subsequently tested range, and so in this particular testcase the [-4,inf] range, which was signed int and so [-4,INT_MAX], was treated as [-4,UINT_MAX] instead. And we were subtracting values of 2 different types and trying to make sense out of that.

The following patch fixes this by using the type of the low bound (which is always non-NULL) for the max value of the high bound instead.

2025-02-24  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/118915
	* tree-ssa-reassoc.cc (optimize_range_tests_to_bit_test): For highj == NULL_TREE use TYPE_MAX_VALUE (TREE_TYPE (lowj)) rather than TYPE_MAX_VALUE (type).

	* gcc.c-torture/execute/pr118915.c: New test.
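The folded form and the effect of picking the wrong type for the missing upper bound can be checked with a few lines of C. This is a sketch, not the committed pr118915.c; the helper names are hypothetical and the width helpers assume a 32-bit int. They only model the arithmetic, not what the pass does internally.

```c
#include <limits.h>
#include <stdbool.h>

/* Plain form of the first two ranges: a is -34 or -26.  */
static bool first_two_plain(int a)
{
    return a == -34 || a == -26;
}

/* Folded form from the message: (a + 34U) & ~8U is in [0U, 0U].  */
static bool first_two_folded(int a)
{
    return (((unsigned)a + 34U) & ~8U) == 0U;
}

/* Width of [low, max] when the missing high bound is taken from the signed
   type, computed modulo 2^32.  */
static unsigned width_with_int_max(int low)
{
    return (unsigned)INT_MAX - (unsigned)low;
}

/* Width of [low, max] when the high bound is wrongly taken from the
   unsigned type.  */
static unsigned width_with_uint_max(int low)
{
    return UINT_MAX - (unsigned)low;
}
```

For low = -4 the first width helper gives 2147483651 while the second gives 3, so mixing the two types makes the open-ended range look like a tiny one.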
2025-02-22 | Turn test cases into UNSUPPORTED if running into 'sorry, unimplemented: ↵ | Thomas Schwinge | 23 | -25/+0
dynamic stack allocation not supported'

In Subversion r217296 (Git commit e2acc079ff125a869159be45371dc0a29b230e92) "Testsuite alloca fixes for ptx", effective-target 'alloca' was added to mark up test cases that run into the nvptx back end's non-support of dynamic stack allocation. (Later, nvptx gained conditional support for that in commit 3861d362ec7e3c50742fc43833fe9d8674f4070e "nvptx: PTX 'alloca' for '-mptx=7.3'+, '-march=sm_52'+ [PR65181]", but on the other hand, in commit f93a612fc4567652b75ffc916d31a446378e6613 "bpf: liberate R9 for general register allocation", the BPF back end joined "the list of targets that do not support alloca in target-support.exp".)

Manually maintaining the list of test cases requiring effective-target 'alloca' is notoriously hard and gets out of date quickly: new test cases added to the test suite may need to be analyzed and annotated, and over time annotations also may need to be removed, in cases where the compiler learns to optimize out 'alloca'/VLA usage, for example.

This commit replaces (99 % of) the manual annotations with an automatic scheme: turn test cases into UNSUPPORTED if running into 'sorry, unimplemented: dynamic stack allocation not supported'.

gcc/testsuite/ * lib/target-supports.exp (check_effective_target_alloca): Gracefully handle the case that we've not been called (indirectly) from 'dg-test'. * lib/gcc-dg.exp (proc gcc-dg-prune): Turn 'sorry, unimplemented: dynamic stack allocation not supported' into UNSUPPORTED. * c-c++-common/Walloca-larger-than.c: Don't 'dg-require-effective-target alloca'. * c-c++-common/Warray-bounds-9.c: Likewise. * c-c++-common/Warray-bounds.c: Likewise. * c-c++-common/Wdangling-pointer-2.c: Likewise. * c-c++-common/Wdangling-pointer-4.c: Likewise. * c-c++-common/Wdangling-pointer-5.c: Likewise. * c-c++-common/Wdangling-pointer.c: Likewise. * c-c++-common/Wimplicit-fallthrough-7.c: Likewise. * c-c++-common/Wsizeof-pointer-memaccess1.c: Likewise.
* c-c++-common/Wsizeof-pointer-memaccess2.c: Likewise. * c-c++-common/Wstringop-truncation.c: Likewise. * c-c++-common/Wunused-var-6.c: Likewise. * c-c++-common/Wunused-var-8.c: Likewise. * c-c++-common/analyzer/alloca-leak.c: Likewise. * c-c++-common/analyzer/allocation-size-multiline-2.c: Likewise. * c-c++-common/analyzer/allocation-size-multiline-3.c: Likewise. * c-c++-common/analyzer/capacity-1.c: Likewise. * c-c++-common/analyzer/capacity-3.c: Likewise. * c-c++-common/analyzer/imprecise-floating-point-1.c: Likewise. * c-c++-common/analyzer/infinite-recursion-alloca.c: Likewise. * c-c++-common/analyzer/malloc-callbacks.c: Likewise. * c-c++-common/analyzer/malloc-paths-8.c: Likewise. * c-c++-common/analyzer/out-of-bounds-5.c: Likewise. * c-c++-common/analyzer/out-of-bounds-diagram-11.c: Likewise. * c-c++-common/analyzer/uninit-alloca.c: Likewise. * c-c++-common/analyzer/write-to-string-literal-5.c: Likewise. * c-c++-common/asan/alloca_loop_unpoisoning.c: Likewise. * c-c++-common/auto-init-11.c: Likewise. * c-c++-common/auto-init-12.c: Likewise. * c-c++-common/auto-init-15.c: Likewise. * c-c++-common/auto-init-16.c: Likewise. * c-c++-common/builtins.c: Likewise. * c-c++-common/dwarf2/vla1.c: Likewise. * c-c++-common/gomp/pr61486-2.c: Likewise. * c-c++-common/torture/builtin-clear-padding-4.c: Likewise. * c-c++-common/torture/strub-run3.c: Likewise. * c-c++-common/torture/strub-run4.c: Likewise. * c-c++-common/torture/strub-run4c.c: Likewise. * c-c++-common/torture/strub-run4d.c: Likewise. * c-c++-common/torture/strub-run4i.c: Likewise. * g++.dg/Walloca1.C: Likewise. * g++.dg/Walloca2.C: Likewise. * g++.dg/cpp0x/pr70338.C: Likewise. * g++.dg/cpp1y/lambda-generic-vla1.C: Likewise. * g++.dg/cpp1y/vla10.C: Likewise. * g++.dg/cpp1y/vla2.C: Likewise. * g++.dg/cpp1y/vla6.C: Likewise. * g++.dg/cpp1y/vla8.C: Likewise. * g++.dg/debug/debug5.C: Likewise. * g++.dg/debug/debug6.C: Likewise. * g++.dg/debug/pr54828.C: Likewise. * g++.dg/diagnostic/pr70105.C: Likewise. 
* g++.dg/eh/cleanup5.C: Likewise. * g++.dg/eh/spbp.C: Likewise. * g++.dg/ext/builtin_alloca.C: Likewise. * g++.dg/ext/tmplattr9.C: Likewise. * g++.dg/ext/vla10.C: Likewise. * g++.dg/ext/vla11.C: Likewise. * g++.dg/ext/vla12.C: Likewise. * g++.dg/ext/vla15.C: Likewise. * g++.dg/ext/vla16.C: Likewise. * g++.dg/ext/vla17.C: Likewise. * g++.dg/ext/vla23.C: Likewise. * g++.dg/ext/vla3.C: Likewise. * g++.dg/ext/vla6.C: Likewise. * g++.dg/ext/vla7.C: Likewise. * g++.dg/init/array24.C: Likewise. * g++.dg/init/new47.C: Likewise. * g++.dg/init/pr55497.C: Likewise. * g++.dg/opt/pr78201.C: Likewise. * g++.dg/template/vla2.C: Likewise. * g++.dg/torture/Wsizeof-pointer-memaccess1.C: Likewise. * g++.dg/torture/Wsizeof-pointer-memaccess2.C: Likewise. * g++.dg/torture/pr62127.C: Likewise. * g++.dg/torture/pr67055.C: Likewise. * g++.dg/torture/stackalign/eh-alloca-1.C: Likewise. * g++.dg/torture/stackalign/eh-inline-2.C: Likewise. * g++.dg/torture/stackalign/eh-vararg-1.C: Likewise. * g++.dg/torture/stackalign/eh-vararg-2.C: Likewise. * g++.dg/warn/Wplacement-new-size-5.C: Likewise. * g++.dg/warn/Wsizeof-pointer-memaccess-1.C: Likewise. * g++.dg/warn/Wvla-1.C: Likewise. * g++.dg/warn/Wvla-3.C: Likewise. * g++.old-deja/g++.ext/array2.C: Likewise. * g++.old-deja/g++.ext/constructor.C: Likewise. * g++.old-deja/g++.law/builtin1.C: Likewise. * g++.old-deja/g++.other/crash12.C: Likewise. * g++.old-deja/g++.other/eh3.C: Likewise. * g++.old-deja/g++.pt/array6.C: Likewise. * g++.old-deja/g++.pt/dynarray.C: Likewise. * gcc.c-torture/compile/20000923-1.c: Likewise. * gcc.c-torture/compile/20030224-1.c: Likewise. * gcc.c-torture/compile/20071108-1.c: Likewise. * gcc.c-torture/compile/20071117-1.c: Likewise. * gcc.c-torture/compile/900313-1.c: Likewise. * gcc.c-torture/compile/parms.c: Likewise. * gcc.c-torture/compile/pr17397.c: Likewise. * gcc.c-torture/compile/pr35006.c: Likewise. * gcc.c-torture/compile/pr42956.c: Likewise. * gcc.c-torture/compile/pr51354.c: Likewise. 
* gcc.c-torture/compile/pr52714.c: Likewise. * gcc.c-torture/compile/pr55851.c: Likewise. * gcc.c-torture/compile/pr77754-1.c: Likewise. * gcc.c-torture/compile/pr77754-2.c: Likewise. * gcc.c-torture/compile/pr77754-3.c: Likewise. * gcc.c-torture/compile/pr77754-4.c: Likewise. * gcc.c-torture/compile/pr77754-5.c: Likewise. * gcc.c-torture/compile/pr77754-6.c: Likewise. * gcc.c-torture/compile/pr78439.c: Likewise. * gcc.c-torture/compile/pr79413.c: Likewise. * gcc.c-torture/compile/pr82564.c: Likewise. * gcc.c-torture/compile/pr87110.c: Likewise. * gcc.c-torture/compile/pr99787-1.c: Likewise. * gcc.c-torture/compile/vla-const-1.c: Likewise. * gcc.c-torture/compile/vla-const-2.c: Likewise. * gcc.c-torture/execute/20010209-1.c: Likewise. * gcc.c-torture/execute/20020314-1.c: Likewise. * gcc.c-torture/execute/20020412-1.c: Likewise. * gcc.c-torture/execute/20021113-1.c: Likewise. * gcc.c-torture/execute/20040223-1.c: Likewise. * gcc.c-torture/execute/20040308-1.c: Likewise. * gcc.c-torture/execute/20040811-1.c: Likewise. * gcc.c-torture/execute/20070824-1.c: Likewise. * gcc.c-torture/execute/20070919-1.c: Likewise. * gcc.c-torture/execute/built-in-setjmp.c: Likewise. * gcc.c-torture/execute/pr22061-1.c: Likewise. * gcc.c-torture/execute/pr43220.c: Likewise. * gcc.c-torture/execute/pr82210.c: Likewise. * gcc.c-torture/execute/pr86528.c: Likewise. * gcc.c-torture/execute/vla-dealloc-1.c: Likewise. * gcc.dg/20001012-2.c: Likewise. * gcc.dg/20020415-1.c: Likewise. * gcc.dg/20030331-2.c: Likewise. * gcc.dg/20101010-1.c: Likewise. * gcc.dg/Walloca-1.c: Likewise. * gcc.dg/Walloca-10.c: Likewise. * gcc.dg/Walloca-11.c: Likewise. * gcc.dg/Walloca-12.c: Likewise. * gcc.dg/Walloca-13.c: Likewise. * gcc.dg/Walloca-14.c: Likewise. * gcc.dg/Walloca-15.c: Likewise. * gcc.dg/Walloca-2.c: Likewise. * gcc.dg/Walloca-3.c: Likewise. * gcc.dg/Walloca-4.c: Likewise. * gcc.dg/Walloca-5.c: Likewise. * gcc.dg/Walloca-6.c: Likewise. * gcc.dg/Walloca-7.c: Likewise. 
* gcc.dg/Walloca-8.c: Likewise. * gcc.dg/Walloca-9.c: Likewise. * gcc.dg/Walloca-larger-than-2.c: Likewise. * gcc.dg/Walloca-larger-than-3.c: Likewise. * gcc.dg/Walloca-larger-than-4.c: Likewise. * gcc.dg/Walloca-larger-than.c: Likewise. * gcc.dg/Warray-bounds-22.c: Likewise. * gcc.dg/Warray-bounds-41.c: Likewise. * gcc.dg/Warray-bounds-46.c: Likewise. * gcc.dg/Warray-bounds-48-novec.c: Likewise. * gcc.dg/Warray-bounds-48.c: Likewise. * gcc.dg/Warray-bounds-50.c: Likewise. * gcc.dg/Warray-bounds-63.c: Likewise. * gcc.dg/Warray-bounds-66.c: Likewise. * gcc.dg/Wdangling-pointer.c: Likewise. * gcc.dg/Wfree-nonheap-object-2.c: Likewise. * gcc.dg/Wfree-nonheap-object.c: Likewise. * gcc.dg/Wrestrict-17.c: Likewise. * gcc.dg/Wrestrict.c: Likewise. * gcc.dg/Wreturn-local-addr-2.c: Likewise. * gcc.dg/Wreturn-local-addr-3.c: Likewise. * gcc.dg/Wreturn-local-addr-4.c: Likewise. * gcc.dg/Wreturn-local-addr-6.c: Likewise. * gcc.dg/Wsizeof-pointer-memaccess1.c: Likewise. * gcc.dg/Wstack-usage.c: Likewise. * gcc.dg/Wstrict-aliasing-bogus-vla-1.c: Likewise. * gcc.dg/Wstrict-overflow-27.c: Likewise. * gcc.dg/Wstringop-overflow-15.c: Likewise. * gcc.dg/Wstringop-overflow-23.c: Likewise. * gcc.dg/Wstringop-overflow-25.c: Likewise. * gcc.dg/Wstringop-overflow-27.c: Likewise. * gcc.dg/Wstringop-overflow-3.c: Likewise. * gcc.dg/Wstringop-overflow-39.c: Likewise. * gcc.dg/Wstringop-overflow-56.c: Likewise. * gcc.dg/Wstringop-overflow-57.c: Likewise. * gcc.dg/Wstringop-overflow-67.c: Likewise. * gcc.dg/Wstringop-overflow-71.c: Likewise. * gcc.dg/Wstringop-truncation-3.c: Likewise. * gcc.dg/Wvla-larger-than-1.c: Likewise. * gcc.dg/Wvla-larger-than-2.c: Likewise. * gcc.dg/Wvla-larger-than-3.c: Likewise. * gcc.dg/Wvla-larger-than-4.c: Likewise. * gcc.dg/Wvla-larger-than-5.c: Likewise. * gcc.dg/analyzer/boxed-malloc-1.c: Likewise. * gcc.dg/analyzer/call-summaries-2.c: Likewise. * gcc.dg/analyzer/malloc-1.c: Likewise. * gcc.dg/analyzer/malloc-reuse.c: Likewise. 
* gcc.dg/analyzer/out-of-bounds-diagram-12.c: Likewise. * gcc.dg/analyzer/pr93355-localealias.c: Likewise. * gcc.dg/analyzer/putenv-1.c: Likewise. * gcc.dg/analyzer/taint-alloc-1.c: Likewise. * gcc.dg/analyzer/torture/pr93373.c: Likewise. * gcc.dg/analyzer/torture/ubsan-1.c: Likewise. * gcc.dg/analyzer/vla-1.c: Likewise. * gcc.dg/atomic/stdatomic-vm.c: Likewise. * gcc.dg/attr-alloc_size-6.c: Likewise. * gcc.dg/attr-alloc_size-7.c: Likewise. * gcc.dg/attr-alloc_size-8.c: Likewise. * gcc.dg/attr-alloc_size-9.c: Likewise. * gcc.dg/attr-noipa.c: Likewise. * gcc.dg/auto-init-uninit-36.c: Likewise. * gcc.dg/auto-init-uninit-9.c: Likewise. * gcc.dg/auto-type-1.c: Likewise. * gcc.dg/builtin-alloc-size.c: Likewise. * gcc.dg/builtin-dynamic-alloc-size.c: Likewise. * gcc.dg/builtin-dynamic-object-size-1.c: Likewise. * gcc.dg/builtin-dynamic-object-size-2.c: Likewise. * gcc.dg/builtin-dynamic-object-size-3.c: Likewise. * gcc.dg/builtin-dynamic-object-size-4.c: Likewise. * gcc.dg/builtin-object-size-1.c: Likewise. * gcc.dg/builtin-object-size-2.c: Likewise. * gcc.dg/builtin-object-size-3.c: Likewise. * gcc.dg/builtin-object-size-4.c: Likewise. * gcc.dg/builtins-64.c: Likewise. * gcc.dg/builtins-68.c: Likewise. * gcc.dg/c23-auto-2.c: Likewise. * gcc.dg/c99-const-expr-13.c: Likewise. * gcc.dg/c99-vla-1.c: Likewise. * gcc.dg/fold-alloca-1.c: Likewise. * gcc.dg/gomp/pr30494.c: Likewise. * gcc.dg/gomp/vla-2.c: Likewise. * gcc.dg/gomp/vla-3.c: Likewise. * gcc.dg/gomp/vla-4.c: Likewise. * gcc.dg/gomp/vla-5.c: Likewise. * gcc.dg/graphite/pr99085.c: Likewise. * gcc.dg/guality/guality.c: Likewise. * gcc.dg/lto/pr80778_0.c: Likewise. * gcc.dg/nested-func-10.c: Likewise. * gcc.dg/nested-func-12.c: Likewise. * gcc.dg/nested-func-13.c: Likewise. * gcc.dg/nested-func-14.c: Likewise. * gcc.dg/nested-func-15.c: Likewise. * gcc.dg/nested-func-16.c: Likewise. * gcc.dg/nested-func-17.c: Likewise. * gcc.dg/nested-func-9.c: Likewise. * gcc.dg/packed-vla.c: Likewise. * gcc.dg/pr100225.c: Likewise. 
* gcc.dg/pr25682.c: Likewise. * gcc.dg/pr27301.c: Likewise. * gcc.dg/pr31507-1.c: Likewise. * gcc.dg/pr33238.c: Likewise. * gcc.dg/pr41470.c: Likewise. * gcc.dg/pr49120.c: Likewise. * gcc.dg/pr50764.c: Likewise. * gcc.dg/pr51491-2.c: Likewise. * gcc.dg/pr51990-2.c: Likewise. * gcc.dg/pr51990.c: Likewise. * gcc.dg/pr59011.c: Likewise. * gcc.dg/pr59523.c: Likewise. * gcc.dg/pr61561.c: Likewise. * gcc.dg/pr78468.c: Likewise. * gcc.dg/pr78902.c: Likewise. * gcc.dg/pr79972.c: Likewise. * gcc.dg/pr82875.c: Likewise. * gcc.dg/pr83844.c: Likewise. * gcc.dg/pr84131.c: Likewise. * gcc.dg/pr87099.c: Likewise. * gcc.dg/pr87320.c: Likewise. * gcc.dg/pr89045.c: Likewise. * gcc.dg/pr91014.c: Likewise. * gcc.dg/pr93986.c: Likewise. * gcc.dg/pr98721-1.c: Likewise. * gcc.dg/pr99122-2.c: Likewise. * gcc.dg/shrink-wrap-alloca.c: Likewise. * gcc.dg/sso-14.c: Likewise. * gcc.dg/strlenopt-62.c: Likewise. * gcc.dg/strlenopt-83.c: Likewise. * gcc.dg/strlenopt-84.c: Likewise. * gcc.dg/strlenopt-91.c: Likewise. * gcc.dg/torture/Wsizeof-pointer-memaccess1.c: Likewise. * gcc.dg/torture/calleesave-sse.c: Likewise. * gcc.dg/torture/pr48953.c: Likewise. * gcc.dg/torture/pr71881.c: Likewise. * gcc.dg/torture/pr71901.c: Likewise. * gcc.dg/torture/pr78742.c: Likewise. * gcc.dg/torture/pr92088-1.c: Likewise. * gcc.dg/torture/pr92088-2.c: Likewise. * gcc.dg/torture/pr93124.c: Likewise. * gcc.dg/torture/pr94479.c: Likewise. * gcc.dg/torture/stackalign/alloca-1.c: Likewise. * gcc.dg/torture/stackalign/inline-2.c: Likewise. * gcc.dg/torture/stackalign/nested-3.c: Likewise. * gcc.dg/torture/stackalign/vararg-1.c: Likewise. * gcc.dg/torture/stackalign/vararg-2.c: Likewise. * gcc.dg/tree-ssa/20030807-2.c: Likewise. * gcc.dg/tree-ssa/20080530.c: Likewise. * gcc.dg/tree-ssa/alias-37.c: Likewise. * gcc.dg/tree-ssa/builtin-sprintf-warn-22.c: Likewise. * gcc.dg/tree-ssa/builtin-sprintf-warn-25.c: Likewise. * gcc.dg/tree-ssa/builtin-sprintf-warn-3.c: Likewise. * gcc.dg/tree-ssa/loop-interchange-15.c: Likewise. 
* gcc.dg/tree-ssa/pr23848-1.c: Likewise. * gcc.dg/tree-ssa/pr23848-2.c: Likewise. * gcc.dg/tree-ssa/pr23848-3.c: Likewise. * gcc.dg/tree-ssa/pr23848-4.c: Likewise. * gcc.dg/uninit-32.c: Likewise. * gcc.dg/uninit-36.c: Likewise. * gcc.dg/uninit-39.c: Likewise. * gcc.dg/uninit-41.c: Likewise. * gcc.dg/uninit-9-O0.c: Likewise. * gcc.dg/uninit-9.c: Likewise. * gcc.dg/uninit-pr100250.c: Likewise. * gcc.dg/uninit-pr101300.c: Likewise. * gcc.dg/uninit-pr101494.c: Likewise. * gcc.dg/uninit-pr98583.c: Likewise. * gcc.dg/vla-2.c: Likewise. * gcc.dg/vla-22.c: Likewise. * gcc.dg/vla-24.c: Likewise. * gcc.dg/vla-3.c: Likewise. * gcc.dg/vla-4.c: Likewise. * gcc.dg/vla-stexp-1.c: Likewise. * gcc.dg/vla-stexp-2.c: Likewise. * gcc.dg/vla-stexp-4.c: Likewise. * gcc.dg/vla-stexp-5.c: Likewise. * gcc.dg/winline-7.c: Likewise. * gcc.target/aarch64/stack-check-alloca-1.c: Likewise. * gcc.target/aarch64/stack-check-alloca-10.c: Likewise. * gcc.target/aarch64/stack-check-alloca-2.c: Likewise. * gcc.target/aarch64/stack-check-alloca-3.c: Likewise. * gcc.target/aarch64/stack-check-alloca-4.c: Likewise. * gcc.target/aarch64/stack-check-alloca-5.c: Likewise. * gcc.target/aarch64/stack-check-alloca-6.c: Likewise. * gcc.target/aarch64/stack-check-alloca-7.c: Likewise. * gcc.target/aarch64/stack-check-alloca-8.c: Likewise. * gcc.target/aarch64/stack-check-alloca-9.c: Likewise. * gcc.target/arc/interrupt-6.c: Likewise. * gcc.target/i386/pr80969-3.c: Likewise. * gcc.target/loongarch/stack-check-alloca-1.c: Likewise. * gcc.target/loongarch/stack-check-alloca-2.c: Likewise. * gcc.target/loongarch/stack-check-alloca-3.c: Likewise. * gcc.target/loongarch/stack-check-alloca-4.c: Likewise. * gcc.target/loongarch/stack-check-alloca-5.c: Likewise. * gcc.target/loongarch/stack-check-alloca-6.c: Likewise. * gcc.target/riscv/stack-check-alloca-1.c: Likewise. * gcc.target/riscv/stack-check-alloca-10.c: Likewise. * gcc.target/riscv/stack-check-alloca-2.c: Likewise. 
* gcc.target/riscv/stack-check-alloca-3.c: Likewise. * gcc.target/riscv/stack-check-alloca-4.c: Likewise. * gcc.target/riscv/stack-check-alloca-5.c: Likewise. * gcc.target/riscv/stack-check-alloca-6.c: Likewise. * gcc.target/riscv/stack-check-alloca-7.c: Likewise. * gcc.target/riscv/stack-check-alloca-8.c: Likewise. * gcc.target/riscv/stack-check-alloca-9.c: Likewise. * gcc.target/sparc/setjmp-1.c: Likewise. * gcc.target/x86_64/abi/ms-sysv/ms-sysv.c: Likewise. * gcc.c-torture/compile/20001221-1.c: Don't 'dg-skip-if' for '! alloca'. * gcc.c-torture/compile/20020807-1.c: Likewise. * gcc.c-torture/compile/20050801-2.c: Likewise. * gcc.c-torture/compile/920428-4.c: Likewise. * gcc.c-torture/compile/debugvlafunction-1.c: Likewise. * gcc.c-torture/compile/pr41469.c: Likewise. * gcc.c-torture/execute/920721-2.c: Likewise. * gcc.c-torture/execute/920929-1.c: Likewise. * gcc.c-torture/execute/921017-1.c: Likewise. * gcc.c-torture/execute/941202-1.c: Likewise. * gcc.c-torture/execute/align-nest.c: Likewise. * gcc.c-torture/execute/alloca-1.c: Likewise. * gcc.c-torture/execute/pr22061-4.c: Likewise. * gcc.c-torture/execute/pr36321.c: Likewise. * gcc.dg/torture/pr8081.c: Likewise. * gcc.dg/analyzer/data-model-1.c: Don't 'dg-require-effective-target alloca'. XFAIL relevant 'dg-warning's for '! alloca'. * gcc.dg/uninit-38.c: Likewise. * gcc.dg/uninit-pr98578.c: Likewise. * gcc.dg/compat/struct-by-value-22_main.c: Comment on 'dg-require-effective-target alloca'. libstdc++-v3/ * testsuite/lib/prune.exp (proc libstdc++-dg-prune): Turn 'sorry, unimplemented: dynamic stack allocation not supported' into UNSUPPORTED.
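As an illustration of the kind of code affected, the following sketch uses a variable-length array, which needs dynamic stack allocation unless the compiler manages to optimize it away. It is a made-up example (`sum_squares_vla` is not from the test suite); on a target without such support, a test built around it would be expected to hit the 'sorry, unimplemented' diagnostic and, with this commit, be reported as UNSUPPORTED without any manual annotation.

```c
#include <stddef.h>

/* Sum of i*i for i in [0, n), staged through a VLA so that the function
   needs dynamic stack allocation for a run-time N.  */
static long sum_squares_vla(size_t n)
{
    if (n == 0)
        return 0;

    long buf[n];                /* variable-length array */
    for (size_t i = 0; i < n; i++)
        buf[i] = (long)(i * i);

    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += buf[i];
    return total;
}
```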
2025-02-10  i386: Change RTL representation of bt[lq] [PR118623]  Jakub Jelinek  1  -0/+23
The following testcase is miscompiled because the RTL representation of a bt{l,q} insn followed by e.g. j{c,nc} is misleading about what the insns actually do.

Let's look e.g. at

  (define_insn_and_split "*jcc_bt<mode>"
    [(set (pc)
          (if_then_else (match_operator 0 "bt_comparison_operator"
                          [(zero_extract:SWI48
                             (match_operand:SWI48 1 "nonimmediate_operand")
                             (const_int 1)
                             (match_operand:QI 2 "nonmemory_operand"))
                           (const_int 0)])
                        (label_ref (match_operand 3))
                        (pc)))
     (clobber (reg:CC FLAGS_REG))]
    "(TARGET_USE_BT || optimize_function_for_size_p (cfun))
     && (CONST_INT_P (operands[2])
         ? (INTVAL (operands[2]) < GET_MODE_BITSIZE (<MODE>mode)
            && INTVAL (operands[2])
                 >= (optimize_function_for_size_p (cfun) ? 8 : 32))
         : !memory_operand (operands[1], <MODE>mode))
     && ix86_pre_reload_split ()"
    "#"
    "&& 1"
    [(set (reg:CCC FLAGS_REG)
          (compare:CCC
            (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2))
            (const_int 0)))
     (set (pc)
          (if_then_else (match_op_dup 0 [(reg:CCC FLAGS_REG) (const_int 0)])
                        (label_ref (match_dup 3))
                        (pc)))]
  {
    operands[0] = shallow_copy_rtx (operands[0]);
    PUT_CODE (operands[0], reverse_condition (GET_CODE (operands[0])));
  })

The define_insn part in RTL describes exactly what it does: it jumps to op3 if bit op2 in op1 is set (for op0 NE) or not set (for op0 EQ). The problem is with what it splits into.

put_condition_code %C1 for CCCmode comparisons emits c for EQ and LTU, nc for NE and GEU and ICEs otherwise.

CCCmode is used mainly for carry out of add/adc and borrow out of sub/sbb. In those cases, e.g. for add, we have (set (reg:CCC flags) (compare:CCC (plus:M x y) x)) and use (ltu (reg:CCC flags) (const_int 0)) for carry set and (geu (reg:CCC flags) (const_int 0)) for carry not set. These cases model in RTL what is actually happening: compare, in infinite precision, the result of the finite precision addition in M mode with x, and if it is less than x as unsigned (i.e. overflow happened), carry is set.
Another use of CCCmode is in UNSPEC_* patterns; those are used with (eq (reg:CCC flags) (const_int 0)) for carry set and ne for carry unset. Given the UNSPEC that is no big deal, as the middle-end doesn't know what set or unset means there.

But for the bt{l,q}; j{c,nc} case the above splits it into

  (set (reg:CCC flags) (compare:CCC (zero_extract) (const_int 0)))

for bt and

  (set (pc) (if_then_else (eq (reg:CCC flags) (const_int 0)) (label_ref) (pc)))

for the bit set case (so that the jump expands to jc) and ne for the bit not set case (so that the jump expands to jnc). Similarly for the different splitters for cmov and set{c,nc} etc.

The problem is that when the middle-end reads this RTL, it sees the exact opposite. If zero_extract is 1, flags is set to the comparison of 1 and 0, and that would mean using ne in the if_then_else, and vice versa.

So, in order to better describe in RTL what is actually happening, one possibility would be to swap the behavior of put_condition_code and use NE + LTU -> c and EQ + GEU -> nc rather than the current EQ + LTU -> c and NE + GEU -> nc, and adjust everything.

The following patch uses a more limited approach. Instead of representing the bt{l,q}; j{c,nc} case as written above, it uses

  (set (reg:CCC flags) (compare:CCC (const_int 0) (zero_extract)))

and

  (set (pc) (if_then_else (ltu (reg:CCC flags) (const_int 0)) (label_ref) (pc)))

which uses the existing put_condition_code but describes clearly in RTL what the insns actually do. If zero_extract is 1, then flags are LTU, 0U < 1U; if zero_extract is 0, then flags are GEU, 0U >= 0U.

The patch adjusts the *bt<mode> define_insn and all the splitters to it and its comparisons/conditional moves/setXX.

2025-02-10 Jakub Jelinek <jakub@redhat.com> PR target/118623 * config/i386/i386.md (*bt<mode>): Represent bt as compare:CCC of const0_rtx and zero_extract rather than zero_extract and const0_rtx. (*bt<SWI48:mode>_mask): Likewise. (*jcc_bt<mode>): Likewise. Use LTU and GEU as flags test instead of EQ and NE.
(*jcc_bt<mode>_mask): Likewise. (*jcc_bt<SWI48:mode>_mask_1): Likewise. (Help combine recognize bt followed by cmov splitter): Likewise. (*bt<mode>_setcqi): Likewise. (*bt<mode>_setncqi): Likewise. (*bt<mode>_setnc<mode>): Likewise. (*bt<mode>_setncqi_2): Likewise. (*bt<mode>_setc<mode>_mask): Likewise. * gcc.c-torture/execute/pr118623.c: New test.
2025-02-01  icf: Compare call argument types in certain cases and asm operands [PR117432]  Jakub Jelinek  1  -0/+71
compare_operand uses operand_equal_p under the hood, which e.g. for INTEGER_CSTs will just match the values regardless of their types. Now, in many cases comparing the type is redundant: if we have x_2 = y_3 + 1; we've already compared the type for the lhs and also for rhs1, so there won't be any surprises on rhs2.

As noted in the PR, there are cases where the type of the operand is the sole place of information and we don't want to ICF merge functions if the types differ. One case is stdarg functions, arguments passed to ...; it is different if we pass 1, 1L or 1LL. Another case are the K&R unprototyped functions (sure, gone in C23). And yet another case are inline asm operands, "r" (1) is different from "r" (1L) and from "r" (1LL).

So, the following patch determines based on lack of fntype (e.g. for internal functions), or on !prototype_p, or on stdarg_p (in that case using number of named arguments) which arguments need to have their type checked and does that, plus compares types on inline asm operands (maybe it would be enough to do that just for input operands but we have just a routine to handle both and I didn't feel we need to differentiate).

Furthermore, I've noticed fntype{1,2} isn't actually compared if it is a direct call (gimple_call_fndecl is non-NULL). That is wrong too, we could have

  void (*fn) (int, long long) = (void (*) (int, long long)) foo; fn (1, 1LL);

in one case and

  void (*fn) (long long, int) = (void (*) (long long, int)) foo; fn (1LL, 1);

in another, both folded into a direct call of foo with different gimple_call_fntype. Sure, one of them would be UB at runtime (or both), but what if we ICF merge them into the one that is UB at runtime and the program actually calls only the correct one?

2025-02-01 Jakub Jelinek <jakub@redhat.com> PR ipa/117432 * ipa-icf-gimple.cc (func_checker::compare_asm_inputs_outputs): Also return_false if operands have incompatible types. (func_checker::compare_gimple_call): Check fntype1 vs.
fntype2 compatibility for all non-internal calls and assume fntype1 and fntype2 are non-NULL for those. For calls to non-prototyped functions or for stdarg_p functions after the last named argument (if any) check type compatibility of call arguments. * gcc.c-torture/execute/pr117432.c: New test. * gcc.target/i386/pr117432.c: New test.
2025-01-31  testsuite: Add testcase for already fixed PR [PR117498]  Jakub Jelinek  1  -0/+35
This wrong-code issue has been fixed with r15-7249. We still emit warnings which are questionable and perhaps we'd get better generated code if niters determined the loop has only a single iteration without UB and we'd punt on vectorizing it (or unrolling). 2025-01-31 Jakub Jelinek <jakub@redhat.com> PR middle-end/117498 * gcc.c-torture/execute/pr117498.c: New test.
2025-01-28  combine: Fix up make_extraction [PR118638]  Jakub Jelinek  1  -0/+20
The following testcase is miscompiled at -Os on x86_64-linux. The problem is during make_compound_operation of

  (ashiftrt:SI (ashift:SI (mult:SI (reg:SI 107 [ a_5 ]) (const_int 3 [0x3])) (const_int 31 [0x1f])) (const_int 31 [0x1f]))

where it incorrectly returns

  (mult:SI (sign_extract:SI (reg:SI 107 [ a_5 ]) (const_int 2 [0x2]) (const_int 0 [0])) (const_int 3 [0x3]))

which obviously isn't equivalent: the former returns either 0 or -1 depending on the least significant bit of the multiplication, the latter returns either 0 or -3 depending on the second least significant bit of the multiplication argument.

The bug has been introduced in PR96998 r11-4563, which added handling of x * (2^N) similar to x << N. In the above case, pos is 0 and len is 1, sign extracting a single least significant bit of the multiplication. As 3 is not a power of 2, shift_amt is -1. But IN_RANGE (-1, 1, 1 - 1) is still true, because the basic requirement of IN_RANGE that LOWER is not greater than UPPER is violated. The intention of using 1 as LOWER is to avoid matching multiplication by 1, which really shouldn't appear in the IL. But to avoid violating the IN_RANGE requirement, we need to verify that len is at least 2.

I've added this len > 1 check to the inner if rather than the outer one because I think for GCC 16 we should add a further optimization. In the particular case of 1 least significant bit sign extraction from multiplication by 3, we could actually say it is equivalent to

  (sign_extract:SI (reg:SI 107 [ a_5 ]) (const_int 1 [0x1]) (const_int 0 [0]))

That is because 3 is an odd number and multiplication by 2 will yield the least significant bit 0 (we are sign extracting just one) and so the multiplication doesn't change anything on the outcome. More generally, even for larger len, multiplication by C which is (1 << X) + 1 where X is >= len should be optimizable just to extraction of the multiplicand's least significant len bits.
2025-01-28 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/118638 * combine.cc (make_extraction): Only optimize (mult x 2^n) if len is larger than 1. * gcc.c-torture/execute/pr118638.c: New test.
2025-01-09  nvptx: PTX 'alloca' for '-mptx=7.3'+, '-march=sm_52'+ [PR65181]  Thomas Schwinge  1  -0/+3
..., and use it for '-mno-soft-stack': PTX "native" stacks. PR target/65181 gcc/ * config/nvptx/nvptx.cc (nvptx_get_drap_rtx): Handle '!TARGET_SOFT_STACK'. * config/nvptx/nvptx.md (define_c_enum "unspec"): Add 'UNSPEC_STACKSAVE', 'UNSPEC_STACKRESTORE'. (define_expand "allocate_stack", define_expand "save_stack_block") (define_expand "save_stack_block"): Handle '!TARGET_SOFT_STACK', PTX 'alloca'. (define_insn "@nvptx_alloca_<mode>") (define_insn "@nvptx_stacksave_<mode>") (define_insn "@nvptx_stackrestore_<mode>"): New. * doc/invoke.texi (Nvidia PTX Options): Update '-msoft-stack', '-mno-soft-stack'. * doc/sourcebuild.texi (nvptx-specific attributes): Document 'nvptx_runtime_alloca_ptx'. (Add Options): Document 'nvptx_alloca_ptx'. gcc/testsuite/ * gcc.target/nvptx/alloca-1.c: Evolve into... * gcc.target/nvptx/alloca-1-O0.c: ... this, ... * gcc.target/nvptx/alloca-1-O1.c: ... this, and... * gcc.target/nvptx/alloca-1-sm_30.c: ... this. * gcc.target/nvptx/vla-1.c: Evolve into... * gcc.target/nvptx/vla-1-O0.c: ... this, ... * gcc.target/nvptx/vla-1-O1.c: ... this, and... * gcc.target/nvptx/vla-1-sm_30.c: ... this. * gcc.c-torture/execute/pr36321.c: Adjust. * gcc.target/nvptx/__builtin_alloca_0-1-O0.c: Likewise. * gcc.target/nvptx/__builtin_alloca_0-1-O1.c: Likewise. * gcc.target/nvptx/__builtin_stack_save___builtin_stack_restore-1.c: Likewise. * gcc.target/nvptx/softstack.c: Likewise. * gcc.target/nvptx/__builtin_stack_save___builtin_stack_restore-1-sm_30.c: New. * gcc.target/nvptx/alloca-2-O0.c: Likewise. * gcc.target/nvptx/alloca-3-O1.c: Likewise. * gcc.target/nvptx/alloca-4-O3.c: Likewise. * gcc.target/nvptx/alloca-5.c: Likewise. * lib/target-supports.exp (check_effective_target_alloca): Adjust. (check_nvptx_default_ptx_isa_target_architecture_at_least) (check_nvptx_runtime_ptx_isa_target_architecture_at_least) (check_effective_target_nvptx_runtime_alloca_ptx) (add_options_for_nvptx_alloca_ptx): New. libgomp/ * fortran.c (omp_get_device_from_uid_): Adjust. 
* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise.
2025-01-02  Update copyright years.  Jakub Jelinek  3  -3/+3
2024-12-25  testsuite: Expand coverage for unaligned memory stores  Maciej W. Rozycki  1  -0/+84
Expand coverage for unaligned memory stores, for the "insvmisalignM" patterns, for 2-byte, 4-byte, and 8-byte scalars, across byte alignments of 1, 2 and 4 and byte misalignments within from 0 up to 7 (there's some redundancy there for the sake of simplicity of the test case), making sure all data is written and no data is changed outside the area meant to be written.

The test case has proved invaluable in verifying changes to the Alpha backend, but the functionality covered is generic, so I have concluded this test qualifies for generic verification and does not have to be limited to the Alpha-specific subset of the testsuite.

gcc/testsuite/ * gcc.c-torture/execute/misalign.c: New file.
2024-12-25  testsuite: Expand coverage for `__builtin_memset' with 0  Maciej W. Rozycki  1  -0/+233
Expand coverage for `__builtin_memset' for the special case of clearing a block, primarily for the "setmemM" block set pattern, though with smaller sizes open-coded sequences may be produced instead.

This verifies block sizes in bytes from 1 to 64 across byte alignments of 1, 2, 4, 8 and byte misalignments within from 0 up to 7 (there's some redundancy there for the sake of simplicity of the test case), making sure all the intended area is cleared and no data is changed outside it.

The choice of the ranges for the parameters has come from the Alpha backend, whose "setmemM" pattern has various corner cases related to base alignment and the misalignment within.

The test case has proved invaluable in verifying changes to the Alpha backend, but the functionality covered is generic, so I have concluded this test qualifies for generic verification and does not have to be limited to the Alpha-specific subset of the testsuite.

Just as with the `__builtin_memcpy' tests this code turned out to require quite a lot of time to compile, although a bit less than the former. Example compilation times with reasonably fast POWER9@2.166GHz at `-O2' optimization and GCC built at `-O2' for various targets:

mips-linux-gnu: 19s
vax-netbsdelf: 27s
alphaev56-linux-gnu: 30s
alpha-linux-gnu: 31s
powerpc64le-linux-gnu: 47s

With GCC built at `-O0':

alphaev56-linux-gnu: 2m59s
alpha-linux-gnu: 3m06s

I have therefore set the timeout factor accordingly so as to take slower test hosts into account.

gcc/testsuite/ * gcc.c-torture/execute/memclr.c: New file.
2024-12-14  cse: Fix up record_jump_equiv checks [PR117095]  Jakub Jelinek  1  -0/+47
The following testcase is miscompiled on s390x-linux with -O2 -march=z15. The problem happens during cse2, which sees in an extended basic block

  (jump_insn 217 78 216 10 (parallel [
              (set (pc)
                  (if_then_else (ne (reg:SI 165)
                          (const_int 1 [0x1]))
                      (label_ref 216)
                      (pc)))
              (set (reg:SI 165)
                  (plus:SI (reg:SI 165)
                      (const_int -1 [0xffffffffffffffff])))
              (clobber (scratch:SI))
              (clobber (reg:CC 33 %cc))
          ]) "t.c":14:17 discrim 1 2192 {doloop_si64}
       (int_list:REG_BR_PROB 955630228 (nil))
   -> 216)
  ...
  (insn 99 98 100 12 (set (reg:SI 138)
          (const_int 1 [0x1])) "t.c":9:31 1507 {*movsi_zarch}
       (nil))
  (insn 100 99 103 12 (parallel [
              (set (reg:SI 137)
                  (minus:SI (reg:SI 138)
                      (subreg:SI (reg:HI 135 [ a ]) 0)))
              (clobber (reg:CC 33 %cc))
          ]) "t.c":9:31 1904 {*subsi3}
       (expr_list:REG_DEAD (reg:SI 138)
          (expr_list:REG_DEAD (reg:HI 135 [ a ])
              (expr_list:REG_UNUSED (reg:CC 33 %cc)
                  (nil)))))

Note, cse2 has df_note_add_problem () before df_analyze, which adds

  (expr_list:REG_UNUSED (reg:SI 165)
     (expr_list:REG_UNUSED (reg:CC 33 %cc)

notes to the first insn (correctly so, %cc is clobbered there and pseudo 165 isn't used after the insn).

Now, cse_extended_basic_block has an extra optimization on conditional jumps, where it records equivalence on the edge which continues in the ebb. Here it sees (ne (reg:SI 165) (const_int 1)) is false on the edge and remembers that pseudo 165 is comparison equivalent to (const_int 1), so on insn 100 it decides to replace (reg:SI 138) with (reg:SI 165).

This optimization isn't correct here though, because the JUMP_INSN has multiple sets. Before r0-77890 record_jump_equiv has been called from cse_insn guarded on n_sets == 1 && any_condjump_p (insn), so it wouldn't be done on the above JUMP_INSN where n_sets == 2. But since that change it is guarded with single_set (insn) && any_condjump_p (insn) and that is true because of the REG_UNUSED note.
Looking at that note is inappropriate in CSE though, because the whole intent of the pass is to extend the lifetimes of the pseudos if equivalence is found, so the fact that there is a REG_UNUSED note for (reg:SI 165) and that the reg isn't used later doesn't imply that it won't be used after the optimization.

So, unless we manage to process the other sets on the JUMP_INSN (it wouldn't be terribly hard in this exact case, the doloop insn decreases the register by 1 and so we could just record equivalence to (const_int 0) instead, but generally it might be hard), we should IMHO just punt if there are multiple sets.

The patch below adds a !multiple_sets (insn) check rather than replacing the single_set (insn) check with it, because apparently any_condjump_p uses pc_set, which supports the case where PATTERN is a SET to PC (that is single_set (insn) && !multiple_sets (insn)), where PATTERN is a PARALLEL with a single SET to PC and some CLOBBERs (likewise), where PATTERN is a PARALLEL with two or more SETs of which the first one is a SET to PC (that could be single_set (insn) with REG_UNUSED notes but is not !multiple_sets (insn)), or where PATTERN is UNSPEC/UNSPEC_VOLATILE with a SET inside of it. For the last case !multiple_sets (insn) will be true, but IMHO we shouldn't try to derive anything from those because we haven't checked the rest of the UNSPEC* and we don't really know what it does.

2024-12-13 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/117095 * cse.cc (cse_extended_basic_block): Don't call record_jump_equiv if multiple_sets (insn). * gcc.c-torture/execute/pr117095.c: New test.
2024-12-10  testsuite: Mark gcc.c-torture/execute/memcpy-a?.c tests expensive  Maciej W. Rozycki  4  -0/+4
These tests can take several seconds per compilation to complete, taking total elapsed time measured in minutes. Mark them as expensive so as to let people skip them where they want to save on testing time. gcc/testsuite/ * gcc.c-torture/execute/memcpy-a1.c: Mark as expensive. * gcc.c-torture/execute/memcpy-a2.c: Likewise. * gcc.c-torture/execute/memcpy-a4.c: Likewise. * gcc.c-torture/execute/memcpy-a8.c: Likewise.
2024-12-05  doloop: Fix up doloop df use [PR116799]  Jakub Jelinek  1  -0/+41
The following testcases are miscompiled on s390x-linux, because the doloop_optimize

  /* Ensure that the new sequence doesn't clobber a register that
     is live at the end of the block.  */
  {
    bitmap modified = BITMAP_ALLOC (NULL);

    for (rtx_insn *i = doloop_seq; i != NULL; i = NEXT_INSN (i))
      note_stores (i, record_reg_sets, modified);

    basic_block loop_end = desc->out_edge->src;
    bool fail = bitmap_intersect_p (df_get_live_out (loop_end), modified);

check doesn't work as intended. The problem is that it uses df, but the df analysis was only done using iv_analysis_loop_init (loop); -> df_analyze_loop (loop); which computes df only on the bbs inside of the loop. While the loop_end bb is inside of the loop, df_get_live_out computed that way includes registers set in the loop and used at the start of the next iteration, but doesn't include registers set in the loop (or before the loop) and used after the loop.

The following patch fixes that by doing whole function df_analyze first, changes the loop iteration mode from 0 to LI_ONLY_INNERMOST (on many targets which use the can_use_doloop_if_innermost target hook and so are known to only handle innermost loops) or LI_FROM_INNERMOST (I think only bfin actually allows non-innermost loops) and checking not just df_get_live_out (loop_end) (that is needed for something used by the next iteration), but also df_get_live_in (desc->out_edge->dest), i.e. what will be used after the loop. df of such a bb shouldn't be affected by the df_analyze_loop and so should be from df_analyze of the whole function.

2024-12-05 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/113994 PR rtl-optimization/116799 * loop-doloop.cc: Include targhooks.h. (doloop_optimize): Also punt on intersection of modified with df_get_live_in (desc->out_edge->dest). (doloop_optimize_loops): Call df_analyze. Use LI_ONLY_INNERMOST or LI_FROM_INNERMOST instead of 0 as second loops_list argument. * gcc.c-torture/execute/pr116799.c: New test. * g++.dg/torture/pr113994.C: New test.
2024-12-03  AVR: Skip some test cases that don't work for it.  Georg-Johann Lay  2  -0/+10
gcc/testsuite/ * gcc.c-torture/execute/ieee/cdivchkd.x: New file. * gcc.c-torture/execute/ieee/cdivchkf.x: New file. * gcc.dg/flex-array-counted-by.c: Require wchar. * gcc.dg/fold-copysign-1.c [avr]: Add -mdouble=64.
2024-11-29  AVR: Skip the gcc.c-torture/execute/memcpy-a*.c tests.  Georg-Johann Lay  4  -0/+4
Skipping these tests on avr since they come up with "memory full", plus they consume a multiple of the time the rest of the testsuite takes. gcc/testsuite/ * gcc.c-torture/execute/memcpy-a1.c * gcc.c-torture/execute/memcpy-a2.c * gcc.c-torture/execute/memcpy-a4.c * gcc.c-torture/execute/memcpy-a8.c
2024-11-25  nios2: Remove all support for Nios II target.  Sandra Loosemore  2  -5/+1
nios2 target support in GCC was deprecated in GCC 14 as the architecture has been EOL'ed by the vendor. This patch removes the entire port for GCC 15.

There are still references to "nios2" in libffi and libgo. Since those libraries are imported into the gcc sources from master copies maintained by other projects, those will need to be addressed elsewhere.

ChangeLog: * MAINTAINERS: Remove references to nios2. * configure.ac: Likewise. * configure: Regenerated. config/ChangeLog: * mt-nios2-elf: Deleted. contrib/ChangeLog: * config-list.mk: Remove references to Nios II. gcc/ChangeLog: * common/config/nios2/*: Delete entire directory. * config/nios2/*: Delete entire directory. * config.gcc: Remove references to nios2. * configure.ac: Likewise. * doc/extend.texi: Likewise. * doc/install.texi: Likewise. * doc/invoke.texi: Likewise. * doc/md.texi: Likewise. * regenerate-opt-urls.py: Likewise. * config.in: Regenerated. * configure: Regenerated. gcc/testsuite/ChangeLog: * g++.target/nios2/*: Delete entire directory. * gcc.target/nios2/*: Delete entire directory. * g++.dg/cpp0x/constexpr-rom.C: Remove references to nios2. * g++.old-deja/g++.jason/thunk3.C: Likewise. * gcc.c-torture/execute/20101011-1.c: Likewise. * gcc.c-torture/execute/pr47237.c: Likewise. * gcc.dg/20020312-2.c: Likewise. * gcc.dg/20021029-1.c: Likewise. * gcc.dg/debug/btf/btf-datasec-1.c: Likewise. * gcc.dg/ifcvt-4.c: Likewise. * gcc.dg/stack-usage-1.c: Likewise. * gcc.dg/struct-by-value-1.c: Likewise. * gcc.dg/tree-ssa/reassoc-33.c: Likewise. * gcc.dg/tree-ssa/reassoc-34.c: Likewise. * gcc.dg/tree-ssa/reassoc-35.c: Likewise. * gcc.dg/tree-ssa/reassoc-36.c: Likewise. * lib/target-supports.exp: Likewise. libgcc/ChangeLog: * config/nios2/*: Delete entire directory. * config.host: Remove references to nios2. * unwind-dw2-fde-dip.c: Likewise.
2024-11-23  testsuite: Expand coverage for `__builtin_memcpy'  Maciej W. Rozycki  5  -0/+259
Expand coverage for `__builtin_memcpy', primarily for the "cpymemM" block copy pattern, although with smaller sizes open-coded sequences may be produced instead.

This verifies block sizes in bytes from 1 to 64, across byte alignments of 1, 2, 4, 8 and byte misalignments within from 0 up to 7 (there's some redundancy there for the sake of simplicity of the test cases) both for the source and the destination, making sure all data is copied and no data is changed outside the area meant to be written.

The choice of the ranges for the parameters has come from the Alpha backend, whose "cpymemM" pattern covers copies being made of up to 64 bytes and has various corner cases related to base alignment and the misalignment within.

The test cases have proved invaluable in verifying changes to the Alpha backend, but the functionality covered is generic, so I have concluded these tests qualify for generic verification and do not have to be limited to the Alpha-specific subset of the testsuite.

On the implementation side the tests turned out to be quite stressful to GCC and the original simpler version that just expanded all code inline took a lot of time to complete compilation. Depending on the target and compilation options elapsed times up to 40 minutes (!) have been seen, especially with GCC built at `-O0' for debugging purposes.

At the cost of increased complexity, where a pair of macros is required per variant rather than just one, I have split the code into individual functions forced not to be inlined, and it improved compilation times considerably without losing coverage.

Example compilation times with reasonably fast POWER9@2.166GHz at `-O2' optimization and GCC built at `-O2' for various targets:

mips-linux-gnu: 23s
vax-netbsdelf: 29s
alphaev56-linux-gnu: 39s
alpha-linux-gnu: 43s
powerpc64le-linux-gnu: 48s

With GCC built at `-O0':

alphaev56-linux-gnu: 3m37s
alpha-linux-gnu: 3m54s

I have therefore set the timeout factor accordingly so as to take slower test hosts into account.
gcc/testsuite/ * gcc.c-torture/execute/memcpy-a1.c: New file. * gcc.c-torture/execute/memcpy-a2.c: New file. * gcc.c-torture/execute/memcpy-a4.c: New file. * gcc.c-torture/execute/memcpy-a8.c: New file. * gcc.c-torture/execute/memcpy-ax.h: New file.
2024-11-21  c: Add u{,l,ll,imax}abs builtins [PR117024]  Jakub Jelinek  10  -0/+327
The following patch adds u{,l,ll,imax}abs builtins, which just fold to ABSU_EXPR, similarly to how {,l,ll,imax}abs builtins fold to ABS_EXPR. 2024-11-21 Jakub Jelinek <jakub@redhat.com> PR c/117024 gcc/ * coretypes.h (enum function_class): Add function_c2y_misc enumerator. * builtin-types.def (BT_FN_UINTMAX_INTMAX, BT_FN_ULONG_LONG, BT_FN_ULONGLONG_LONGLONG): New DEF_FUNCTION_TYPE_1s. * builtins.def (DEF_C2Y_BUILTIN): Define. (BUILT_IN_UABS, BUILT_IN_UIMAXABS, BUILT_IN_ULABS, BUILT_IN_ULLABS): New builtins. * builtins.cc (fold_builtin_abs): Handle also folding of u*abs to ABSU_EXPR. (fold_builtin_1): Handle BUILT_IN_U{,L,LL,IMAX}ABS. gcc/lto/ChangeLog: * lto-lang.cc (flag_isoc2y): New variable. gcc/ada/ChangeLog: * gcc-interface/utils.cc (flag_isoc2y): New variable. gcc/testsuite/ * gcc.c-torture/execute/builtins/lib/abs.c (uintmax_t): New typedef. (uabs, ulabs, ullabs, uimaxabs): New functions. * gcc.c-torture/execute/builtins/uabs-1.c: New test. * gcc.c-torture/execute/builtins/uabs-1.x: New file. * gcc.c-torture/execute/builtins/uabs-1-lib.c: New file. * gcc.c-torture/execute/builtins/uabs-2.c: New test. * gcc.c-torture/execute/builtins/uabs-2.x: New file. * gcc.c-torture/execute/builtins/uabs-2-lib.c: New file. * gcc.c-torture/execute/builtins/uabs-3.c: New test. * gcc.c-torture/execute/builtins/uabs-3.x: New test. * gcc.c-torture/execute/builtins/uabs-3-lib.c: New test.
2024-11-02  testsuite: Fix up builtin-prefetch-1.c tests  Xi Ruoyao  1  -1/+1
How can you use "read-shared" as an identifier? It's not allowed by any C standard version. gcc/testsuite/ChangeLog: * gcc.c-torture/execute/builtin-prefetch-1.c (rws): Use "read_shared" instead of "read-shared" as the identifier for enum value. * gcc.dg/builtin-prefetch-1.c (rws): Likewise.
2024-11-01  Support Intel MOVRS  Hu, Lin1  1  -1/+2
gcc/ChangeLog: * builtins.cc (expand_builtin_prefetch): Expand for prefetchrst2. * common/config/i386/cpuinfo.h (get_available_features): Detect movrs. * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_MOVRS_SET): New. (OPTION_MASK_ISA2_MOVRS_UNSET): Ditto. (ix86_handle_option): Handle -mmovrs. * common/config/i386/i386-cpuinfo.h (enum processor_features): Add FEATURE_MOVRS. * common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for movrs. * config.gcc: Add movrsintrin.h * config/i386/cpuid.h (bit_MOVRS): New. * config/i386/i386-builtin-types.def: Add DEF_FUNCTION_TYPE (CHAR, PCCHAR), (SHORT, PCSHORT), (INT, PCINT), (INT64, PCINT64). * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-c.cc (ix86_target_macros_internal): Add __MOVRS__. * config/i386/i386-expand.cc (ix86_expand_special_args_builtin): Define __MOVRS__. * config/i386/i386-isa.def (MOVRS): Add DEF_PTA(MOVRS) * config/i386/i386-options.cc (ix86_valid_target_attribute_inner_p): Handle movrs. * config/i386/i386.md (movrs<mode>): New. * config/i386/i386.opt: Add option -mmovrs. * config/i386/i386.opt.urls: Regenerated. * config/i386/immintrin.h: Include movrsintrin.h * config/i386/sse.md (unspecv): Add UNSPEC_VMOVRS. (VI1248_AVX10_2): New. (avx10_2_movrs_vmovrs<ssemodesuffix><mode><mask_name>): New define_insn. * config/i386/xmmintrin.h: Add prefetchrst2. * doc/extend.texi: Document movrs. * doc/invoke.texi: Document -mmovrs. * doc/rtl.texi: Document extension of prefetchrst2. * doc/sourcebuild.texi: Document target movrs. * config/i386/movrsintrin.h: New. gcc/testsuite/ChangeLog: * g++.dg/other/i386-2.C: Add -mmovrs. * g++.dg/other/i386-3.C: Ditto. * gcc.c-torture/execute/builtin-prefetch-1.c: Expand rws. * gcc.dg/builtin-prefetch-1.c: Ditto. * gcc.target/i386/avx-1.c: Ditto. * gcc.target/i386/avx-2.c: Ditto. * gcc.target/i386/funcspec-56.inc: Add new target attribute. * gcc.target/i386/sse-12.c: Add -mmovrs. * gcc.target/i386/sse-13.c: Ditto. 
* gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add movrs. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-512-movrs-1.c: New test. * gcc.target/i386/avx10_2-movrs-1.c: Ditto. * gcc.target/i386/movrs-1.c: Ditto. Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
2024-10-29  Fix miscompilation of function containing __builtin_unreachable  Eric Botcazou  1  -0/+23
This is a wrong-code generation on the SPARC for a function containing a call to __builtin_unreachable caused by the delay slot scheduling pass, and more specifically the find_end_label function which has these lines: /* Otherwise, see if there is a label at the end of the function. If there is, it must be that RETURN insns aren't needed, so that is our return label and we don't have to do anything else. */ The comment was correct 20 years ago but no longer is nowadays in the presence of RTL epilogues and calls to __builtin_unreachable, so the patch just removes the associated two lines of code: else if (LABEL_P (insn)) *plabel = as_a <rtx_code_label *> (insn); and otherwise contains just adjustments to the commentary. gcc/ PR rtl-optimization/117327 * reorg.cc (find_end_label): Do not return a dangling label at the end of the function and adjust commentary. gcc/testsuite/ * gcc.c-torture/execute/20241029-1.c: New test.
2024-10-16  testsuite: Prepare for -std=gnu23 default  Joseph Myers  20  -2/+31
Now that C23 support is essentially feature-complete, I'd like to switch the default language version for C compilation to -std=gnu23. This requires updating a large number of testcases that fail with the new language version if left unchanged.

In this patch, update most of the tests for which there is a safe change that works both before and after the update to default language version - typically adding the option -std=gnu17 or -Wno-old-style-definition to the tests. (There are also a few tests where I'd like to investigate further why they fail with -std=gnu23, or where I think such failures show an actual bug to fix before changing the default language version, or where it seems more appropriate to make a testcase change that would result in failures in the absence of the language version change rather than just adding an option that does nothing with the gnu17 default.) The libffi test fixes have also been submitted upstream: <https://github.com/libffi/libffi/pull/861>.

Most of the failures requiring such changes are for one of two reasons:

* Unprototyped function declarations with () (meaning the same as (void) in C23 mode) for a function then called with arguments.

* Old-style function definitions, which warn by default in C23 mode, so resulting in test failures for the unexpected warnings.

Other reasons for failures include:

* Tests with their own definitions of bool, true and false.

* Tests of diagnostics (often with -pedantic) in cases where C23 has changed semantics, such as:

  - tag compatibility for structs;

  - enum values out of range of int;

  - handling of qualified array types;

  - decimal floating types formerly needing -pedantic diagnostics, but being standard in C23.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/testsuite/ * c-c++-common/Wcast-function-type.c: Add -std=gnu17 for C. * c-c++-common/Wformat-pr84258.c: Add -std=gnu17 for C. * c-c++-common/Wvarargs.c: Add -std=gnu17 for C.
* c-c++-common/analyzer/data-model-12.c: Add -std=gnu17 for C. * c-c++-common/builtins.c: Add -std=gnu17 for C. * c-c++-common/pointer-to-fn1.c: Add -std=gnu17 for C. * c-c++-common/pragma-diag-17.c: Add -std=gnu17 for C. * c-c++-common/sizeof-array-argument.c: Add -Wno-old-style-definition for C. * g++.dg/lto/pr54625-1_0.c: Add -std=gnu17. * g++.dg/lto/pr54625-2_0.c: Add -std=gnu17. * gcc.c-torture/compile/20040214-2.c: Add -std=gnu17. * gcc.c-torture/compile/921011-2.c: Add -std=gnu17. * gcc.c-torture/compile/931102-1.c: Add -std=gnu17. * gcc.c-torture/compile/990801-1.c: Add -std=gnu17. * gcc.c-torture/compile/nested-1.c: Add -std=gnu17. * gcc.c-torture/compile/pr100241-1.c: Add -std=gnu17. * gcc.c-torture/compile/pr106101.c: Add -std=gnu17. * gcc.c-torture/compile/pr113616.c: Add -std=gnu17. * gcc.c-torture/compile/pr47967.c: Add -std=gnu17. * gcc.c-torture/compile/pr51694.c: Add -std=gnu17. * gcc.c-torture/compile/pr71109.c: Add -std=gnu17. * gcc.c-torture/compile/pr83051-2.c: Add -std=gnu17. * gcc.c-torture/compile/pr89663-1.c: Add -std=gnu17. * gcc.c-torture/compile/pr94238.c: Add -std=gnu17. * gcc.c-torture/compile/pr96796.c: Add -std=gnu17. * gcc.c-torture/compile/pr97576.c: Add -std=gnu17. * gcc.c-torture/compile/udivmod4.c: Add -std=gnu17. * gcc.c-torture/execute/20010605-2.c: Add -std=gnu17. * gcc.c-torture/execute/20020404-1.c: Add -std=gnu17. * gcc.c-torture/execute/20030714-1.c: Add -std=gnu17. * gcc.c-torture/execute/20051012-1.c: Add -std=gnu17. * gcc.c-torture/execute/20190820-1.c: Add -std=gnu17. * gcc.c-torture/execute/920612-1.c: Add -Wno-old-style-definition. * gcc.c-torture/execute/930608-1.c: Add -std=gnu17. * gcc.c-torture/execute/comp-goto-1.c: Add -std=gnu17. * gcc.c-torture/execute/ieee/fp-cmp-1.x: Add -std=gnu17. * gcc.c-torture/execute/ieee/fp-cmp-2.x: Add -std=gnu17. * gcc.c-torture/execute/ieee/fp-cmp-3.x: Add -std=gnu17. * gcc.c-torture/execute/ieee/fp-cmp-4.x: New file. * gcc.c-torture/execute/ieee/fp-cmp-4f.x: New file. 
* gcc.c-torture/execute/ieee/fp-cmp-4l.x: New file. * gcc.c-torture/execute/loop-9.c: Add -std=gnu17. * gcc.c-torture/execute/pr103209.c: Add -std=gnu17. * gcc.c-torture/execute/pr28289.c: Add -std=gnu17. * gcc.c-torture/execute/pr34982.c: Add -std=gnu17. * gcc.c-torture/execute/pr67037.c: Add -std=gnu17. * gcc.c-torture/execute/va-arg-2.c: Add -std=gnu17. * gcc.dg/20010202-1.c: Add -std=gnu17. * gcc.dg/20020430-1.c: Add -std=gnu17. * gcc.dg/20031218-3.c: Add -std=gnu17. * gcc.dg/20040127-1.c: Add -std=gnu17. * gcc.dg/20041014-1.c: Add -Wno-old-style-definition. * gcc.dg/20041122-1.c: Add -std=gnu17. * gcc.dg/20050309-1.c: Add -std=gnu17. * gcc.dg/20061026.c: Add -std=gnu17. * gcc.dg/20101010-1.c: Add -std=gnu17. * gcc.dg/Warray-parameter-10.c: Add -std=gnu17. * gcc.dg/Wbuiltin-declaration-mismatch-2.c: Add -std=gnu17. * gcc.dg/Wbuiltin-declaration-mismatch-3.c: Add -std=gnu17. * gcc.dg/Wbuiltin-declaration-mismatch-4.c: Add -std=gnu17. * gcc.dg/Wbuiltin-declaration-mismatch-5.c: Add -std=gnu17. * gcc.dg/Wbuiltin-declaration-mismatch.c: Add -std=gnu17. * gcc.dg/Wcxx-compat-2.c: Add -std=gnu17. * gcc.dg/Wdouble-promotion.c: Add -std=gnu17. * gcc.dg/Wfree-nonheap-object-7.c: Add -std=gnu17. * gcc.dg/Wimplicit-int-1.c: Add -std=gnu17. * gcc.dg/Wimplicit-int-1a.c: Add -std=gnu17. * gcc.dg/Wimplicit-int-2.c: Add -std=gnu17. * gcc.dg/Wimplicit-int-3.c: Add -std=gnu17. * gcc.dg/Wimplicit-int-4.c: Add -std=gnu17. * gcc.dg/Wimplicit-int-4a.c: Add -std=gnu17. * gcc.dg/Wincompatible-pointer-types-1.c: Add -std=gnu17. * gcc.dg/Wrestrict-19.c: Add -std=gnu17. * gcc.dg/Wrestrict-4.c: Add -std=gnu17. * gcc.dg/Wrestrict-5.c: Add -std=gnu17. * gcc.dg/Wstrict-overflow-20.c: Add -std=gnu17. * gcc.dg/Wstringop-overflow-13.c: Add -std=gnu17. * gcc.dg/analyzer/doom-d_main-IdentifyVersion.c: Add -std=gnu17. * gcc.dg/analyzer/doom-s_sound-pr108867.c: Add -std=gnu17. * gcc.dg/analyzer/pr93032-mztools-signed-char.c: Add -Wno-old-style-definition. 
* gcc.dg/analyzer/pr93032-mztools-unsigned-char.c: Add -Wno-old-style-definition. * gcc.dg/analyzer/pr93355-localealias.c: Add -Wno-old-style-definition. * gcc.dg/analyzer/pr93375.c: Add -std=gnu17. * gcc.dg/analyzer/pr94688.c: Add -std=gnu17. * gcc.dg/analyzer/sensitive-1.c: Add -std=gnu17. * gcc.dg/analyzer/torture/asm-x86-linux-wfx_get_ps_timeout-full.c: Add -std=gnu17. * gcc.dg/analyzer/torture/pr104863.c: Add -std=gnu17. * gcc.dg/analyzer/torture/pr93379.c: Add -std=gnu17. * gcc.dg/array-quals-2.c: Add -std=gnu17. * gcc.dg/attr-invalid.c: Add -Wno-old-style-definition. * gcc.dg/auto-init-uninit-A.c: Add -Wno-old-style-definition. * gcc.dg/builtin-choose-expr.c: Declare exit with (int) prototype. * gcc.dg/builtin-tgmath-err-1.c: Add -std=gnu17. * gcc.dg/builtins-30.c: Add -std=gnu17. * gcc.dg/cast-function-1.c: Add -std=gnu17. * gcc.dg/cleanup-1.c: Add -std=gnu17. * gcc.dg/compat/struct-complex-1_x.c: Add -std=gnu17. * gcc.dg/compat/struct-complex-2_x.c: Add -std=gnu17. * gcc.dg/compat/union-m128-1_x.c: Add -std=gnu17. * gcc.dg/debug/dwarf2/pr66482.c: Add -std=gnu17. * gcc.dg/dfp/composite-type-2.c: Add -std=gnu17. * gcc.dg/dfp/composite-type.c: Add -std=gnu17. * gcc.dg/dfp/keywords-pedantic.c: Add -std=gnu17. * gcc.dg/dremf-type-compat-1.c: Add -std=gnu17. * gcc.dg/dremf-type-compat-2.c: Add -std=gnu17. * gcc.dg/dremf-type-compat-3.c: Add -std=gnu17. * gcc.dg/dremf-type-compat-4.c: Add -std=gnu17. * gcc.dg/enum-compat-1.c: Add -std=gnu17. * gcc.dg/enum-compat-2.c: Add -std=gnu17. * gcc.dg/floatn-errs.c: Add -std=gnu17. * gcc.dg/fltconst-pedantic-dfp.c: Add -std=gnu17. * gcc.dg/format/proto.c: Add -std=gnu17. * gcc.dg/format/sentinel-1.c: Add -std=gnu17. * gcc.dg/gomp/declare-simd-1.c: Add -Wno-old-style-definition. * gcc.dg/ifelse-1.c: Add -Wno-old-style-definition. * gcc.dg/inline-33.c: Add -std=gnu17. * gcc.dg/ipa/inline-5.c: Add -std=gnu17. * gcc.dg/ipa/ipa-sra-21.c: Add -std=gnu17. * gcc.dg/ipa/pr102714.c: Add -std=gnu17. 
* gcc.dg/ipa/pr104813.c: Add -std=gnu17. * gcc.dg/ipa/pr108679.c: Add -std=gnu17. * gcc.dg/ipa/pr42706.c: Add -std=gnu17. * gcc.dg/ipa/pr88214.c: Add -Wno-old-style-definition. * gcc.dg/ipa/pr91853.c: Add -Wno-old-style-definition. * gcc.dg/ipa/pr93763.c: Add -std=gnu17. * gcc.dg/ipa/pr96482-2.c: Add -std=gnu17. * gcc.dg/lto/20091013-1_2.c: Add -std=gnu17. * gcc.dg/lto/20091015-1_2.c: Add -std=gnu17. * gcc.dg/lto/pr113197_1.c: Add -std=gnu17. * gcc.dg/lto/pr54702_1.c: Add -std=gnu17. * gcc.dg/lto/pr99849_0.c: Add -std=gnu17. * gcc.dg/noncompile/920923-1.c: Add -std=gnu17. * gcc.dg/noncompile/old-style-parm-1.c: Add -Wno-old-style-definition. * gcc.dg/noncompile/old-style-parm-3.c: Add -Wno-old-style-definition. * gcc.dg/noncompile/pr30552-2.c: Add -Wno-old-style-definition. * gcc.dg/noncompile/pr30552-3.c: Add -std=gnu17. * gcc.dg/noncompile/pr71265.c: Add -Wno-old-style-definition. * gcc.dg/noncompile/pr79758-2.c: Add -Wno-old-style-definition. * gcc.dg/noncompile/pr79758.c: Add -Wno-old-style-definition. * gcc.dg/noncompile/va-arg-1.c: Add -std=gnu17. * gcc.dg/old-style-prom-1.c: Add -std=gnu17. * gcc.dg/old-style-prom-2.c: Add -std=gnu17. * gcc.dg/old-style-prom-3.c: Add -std=gnu17. * gcc.dg/old-style-then-proto-1.c: Add -std=gnu17. * gcc.dg/parm-incomplete-1.c: Add -std=gnu17. * gcc.dg/parm-mismatch-1.c: Add -std=gnu17. * gcc.dg/permerror-default.c: Add -std=gnu17. * gcc.dg/permerror-fpermissive-nowarning.c: Add -std=gnu17. * gcc.dg/permerror-fpermissive.c: Add -std=gnu17. * gcc.dg/permerror-noerror.c: Add -std=gnu17. * gcc.dg/permerror-nowarning.c: Add -std=gnu17. * gcc.dg/permerror-pedantic.c: Add -std=gnu17. * gcc.dg/plugin/infoleak-net-ethtool-ioctl.c: Add -std=gnu17. * gcc.dg/pointer-array-quals-1.c: Add -std=gnu17. * gcc.dg/pointer-array-quals-2.c: Add -std=gnu17. * gcc.dg/pr100791.c: Add -std=gnu17. * gcc.dg/pr100843.c: Add -std=gnu17. * gcc.dg/pr102273.c: Add -std=gnu17. * gcc.dg/pr102385.c: Add -std=gnu17. * gcc.dg/pr103222.c: Add -std=gnu17. 
* gcc.dg/pr105140.c: Add -std=gnu17. * gcc.dg/pr105150.c: Add -std=gnu17. * gcc.dg/pr105250.c: Add -std=gnu17. * gcc.dg/pr105972.c: Add -Wno-old-style-definition. * gcc.dg/pr111039.c: Add -std=gnu17. * gcc.dg/pr111407.c: Add -std=gnu17. * gcc.dg/pr111922.c: Add -Wno-old-style-definition. * gcc.dg/pr15236.c: Add -std=gnu17. * gcc.dg/pr17188-1.c: Add -std=gnu17. * gcc.dg/pr20368-1.c: Add -std=gnu17. * gcc.dg/pr20368-2.c: Add -std=gnu17. * gcc.dg/pr20368-3.c: Add -std=gnu17. * gcc.dg/pr27331.c: Add -Wno-old-style-definition. * gcc.dg/pr27861-1.c: Add -std=gnu17. * gcc.dg/pr28121.c: Add -std=gnu17. * gcc.dg/pr28243.c: Add -std=gnu17. * gcc.dg/pr28888.c: Add -std=gnu17. * gcc.dg/pr29254.c: Add -std=gnu17. * gcc.dg/pr34457-1.c: Add -std=gnu17. * gcc.dg/pr36015.c: Add -std=gnu17. * gcc.dg/pr38245-3.c: Add -std=gnu17. * gcc.dg/pr38245-4.c: Add -std=gnu17. * gcc.dg/pr41241.c: Add -std=gnu17. * gcc.dg/pr43058.c: Add -std=gnu17. * gcc.dg/pr44539.c: Add -std=gnu17. * gcc.dg/pr45055.c: Add -std=gnu17. * gcc.dg/pr50908.c: Add -Wno-old-style-definition. * gcc.dg/pr60647-1.c: Add -Wno-old-style-definition. * gcc.dg/pr63762.c: Add -std=gnu17. * gcc.dg/pr63804.c: Add -std=gnu17. * gcc.dg/pr68306-3.c: Add -std=gnu17. * gcc.dg/pr68533.c: Add -std=gnu17. * gcc.dg/pr69156.c: Add -std=gnu17. * gcc.dg/pr7356-2.c: Add -Wno-old-style-definition. * gcc.dg/pr79983.c: Add -std=gnu17. * gcc.dg/pr83463.c: Add -std=gnu17. * gcc.dg/pr87347.c: Add -std=gnu17. * gcc.dg/pr89521-1.c: Add -std=gnu17. * gcc.dg/pr89521-2.c: Add -std=gnu17. * gcc.dg/pr90648.c: Add -std=gnu17. * gcc.dg/pr93573-1.c: Add -std=gnu17. * gcc.dg/pr94167.c: Add -std=gnu17. * gcc.dg/pr94705.c: Add -std=gnu17. * gcc.dg/pr95118.c: Add -std=gnu17. * gcc.dg/pr96335.c: Add -std=gnu17. * gcc.dg/pr97830.c: Add -std=gnu17. * gcc.dg/pr97882.c: Add -std=gnu17. * gcc.dg/pr99122-2.c: Add -std=gnu17. * gcc.dg/pr99122-3.c: Add -std=gnu17. * gcc.dg/qual-component-1.c: Add -std=gnu17. * gcc.dg/sibcall-6.c: Add -Wno-old-style-definition. 
* gcc.dg/sms-2.c: Add -Wno-old-style-definition. * gcc.dg/tm/20091221.c: Add -std=gnu17. * gcc.dg/torture/bfloat16-basic.c: Add -Wno-old-style-definition. * gcc.dg/torture/float128-basic.c: Add -Wno-old-style-definition. * gcc.dg/torture/float128x-basic.c: Add -Wno-old-style-definition. * gcc.dg/torture/float16-basic.c: Add -Wno-old-style-definition. * gcc.dg/torture/float32-basic.c: Add -Wno-old-style-definition. * gcc.dg/torture/float32x-basic.c: Add -Wno-old-style-definition. * gcc.dg/torture/float64-basic.c: Add -Wno-old-style-definition. * gcc.dg/torture/float64x-basic.c: Add -Wno-old-style-definition. * gcc.dg/torture/pr102762.c: Add -std=gnu17. * gcc.dg/torture/pr103987.c: Add -std=gnu17. * gcc.dg/torture/pr104825.c: Add -Wno-old-style-definition. * gcc.dg/torture/pr105166.c: Add -std=gnu17. * gcc.dg/torture/pr105185.c: Add -Wno-old-style-definition. * gcc.dg/torture/pr109652.c: Add -std=gnu17. * gcc.dg/torture/pr112444.c: Add -std=gnu17. * gcc.dg/torture/pr113895-3.c: Add -std=gnu17. * gcc.dg/torture/pr24626-2.c: Add -std=gnu17. * gcc.dg/torture/pr25183.c: Add -std=gnu17. * gcc.dg/torture/pr38948.c: Add -std=gnu17. * gcc.dg/torture/pr44807.c: Add -std=gnu17. * gcc.dg/torture/pr47281.c: Add -std=gnu17. * gcc.dg/torture/pr47958-1.c: Add -Wno-old-style-definition. * gcc.dg/torture/pr48063.c: Add -std=gnu17. * gcc.dg/torture/pr57036-1.c: Add -std=gnu17. * gcc.dg/torture/pr57330.c: Add -std=gnu17. * gcc.dg/torture/pr57584.c: Add -std=gnu17. * gcc.dg/torture/pr67741.c: Add -std=gnu17. * gcc.dg/torture/pr68104.c: Add -std=gnu17. * gcc.dg/torture/pr69242.c: Add -std=gnu17. * gcc.dg/torture/pr70457.c: Add -std=gnu17. * gcc.dg/torture/pr70985.c: Add -std=gnu17. * gcc.dg/torture/pr71606.c: Add -std=gnu17. * gcc.dg/torture/pr71816.c: Add -std=gnu17. * gcc.dg/torture/pr77286.c: Add -std=gnu17. * gcc.dg/torture/pr77646.c: Add -std=gnu17. * gcc.dg/torture/pr77677-2.c: Add -std=gnu17. * gcc.dg/torture/pr78365.c: Add -Wno-old-style-definition. 
* gcc.dg/torture/pr79732.c: Add -std=gnu17. * gcc.dg/torture/pr80612.c: Add -std=gnu17. * gcc.dg/torture/pr80764.c: Add -std=gnu17. * gcc.dg/torture/pr80842.c: Add -std=gnu17. * gcc.dg/torture/pr81900.c: Add -std=gnu17. * gcc.dg/torture/pr82276.c: Add -std=gnu17. * gcc.dg/torture/pr84803.c: Add -std=gnu17. * gcc.dg/torture/pr93124.c: Add -std=gnu17. * gcc.dg/torture/pr97330-1.c: Add -Wno-old-style-definition. * gcc.dg/tree-prof/comp-goto-1.c: Add -std=gnu17. * gcc.dg/tree-ssa/20030703-2.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030708-1.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030709-2.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030709-3.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030710-1.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030711-1.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030711-2.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030711-3.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030714-1.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030714-2.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030728-1.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030807-10.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030807-11.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030807-3.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030807-6.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030807-7.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030814-4.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030814-5.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030814-6.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030918-1.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20040514-2.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/loadpre7.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/pr111003.c: Add -std=gnu17. * gcc.dg/tree-ssa/pr115128.c: Add -std=gnu17. * gcc.dg/tree-ssa/pr115191.c: Add -std=gnu17. 
* gcc.dg/tree-ssa/pr24840.c: Add -std=gnu17. * gcc.dg/tree-ssa/pr69666.c: Add -std=gnu17. * gcc.dg/tree-ssa/pr70232.c: Add -std=gnu17. * gcc.dg/ubsan/pr79757-1.c: Add -Wno-old-style-definition. * gcc.dg/ubsan/pr79757-2.c: Add -Wno-old-style-definition. * gcc.dg/ubsan/pr79757-3.c: Add -Wno-old-style-definition. * gcc.dg/ubsan/pr81223.c: Add -std=gnu17. * gcc.dg/uninit-10-O0.c: Add -Wno-old-style-definition. * gcc.dg/uninit-10.c: Add -Wno-old-style-definition. * gcc.dg/uninit-32.c: Add -std=gnu17. * gcc.dg/uninit-41.c: Add -std=gnu17. * gcc.dg/uninit-A-O0.c: Add -Wno-old-style-definition. * gcc.dg/uninit-A.c: Add -Wno-old-style-definition. * gcc.dg/unused-1.c: Add -Wno-old-style-definition. * gcc.dg/vect/bb-slp-pr114249.c: Add -std=gnu17. * gcc.dg/vect/bb-slp-pr97486.c: Add -std=gnu17. * gcc.dg/vect/bb-slp-subgroups-1.c: Add -std=gnu17. * gcc.dg/vect/bb-slp-subgroups-2.c: Add -std=gnu17. * gcc.dg/vect/bb-slp-subgroups-3.c: Add -std=gnu17. * gcc.dg/vect/vect-early-break_111-pr113731.c: Add -std=gnu17. * gcc.dg/vect/vect-early-break_122-pr114239.c: Add -std=gnu17. * gcc.dg/vect/vect-multi-peel-gaps.c: Add -std=gnu17. * gcc.dg/vla-stexp-2.c: Add -std=gnu17. * gcc.dg/warn-1.c: Add -Wno-old-style-definition. * gcc.dg/winline-10.c: Add -Wno-old-style-definition. * gcc.dg/wtr-label-1.c: Add -Wno-old-style-definition. * gcc.dg/wtr-switch-1.c: Add -Wno-old-style-definition. * gcc.target/i386/excess-precision-3.c: Add -Wno-old-style-definition. * gcc.target/i386/fma4-256-nmsubXX.c: Add -std=gnu17. * gcc.target/i386/fma4-nmsubXX.c: Add -std=gnu17. * gcc.target/i386/nop-mcount.c: Add -Wno-old-style-definition. * gcc.target/i386/pr102627.c: Add -std=gnu17. * gcc.target/i386/pr106994.c: Add -std=gnu17. * gcc.target/i386/pr68349.c: Add -std=gnu17. * gcc.target/i386/pr97313.c: Add -std=gnu17. * gcc.target/i386/pr99454.c: Add -std=gnu17. * gcc.target/i386/record-mcount.c: Add -Wno-old-style-definition. libffi/ * testsuite/libffi.call/va_struct2.c (test_fn): Cast n to void. 
* testsuite/libffi.call/va_struct3.c (test_fn): Likewise. Backported from <https://github.com/libffi/libffi/pull/861>.
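The two main failure classes named in the message above (unprototyped `()` declarations and old-style definitions) plus the `bool` case can be avoided in new tests by writing code that means the same thing under both -std=gnu17 and -std=gnu23. The following is only a minimal sketch of that style; the function names are made up for illustration and are not taken from any of the listed tests.

```c
#include <stdbool.h>

/* An old-style definition such as
     int add (a, b) int a, b; { return a + b; }
   is what triggers the C23-mode diagnostics described above.
   The prototype form below is accepted by both language versions.  */
int add (int a, int b)
{
  return a + b;
}

/* In gnu17, "int get_answer ();" leaves the parameters unspecified;
   in C23 it means the same as (void).  Spelling out (void) gives
   the same meaning in both modes.  */
int get_answer (void)
{
  return 42;
}

/* Using <stdbool.h> rather than a private definition of bool, true
   and false keeps the test valid whether or not they are keywords.  */
bool is_positive (int x)
{
  return x > 0;
}
```

Tests that deliberately exercise the old forms still need -std=gnu17 or -Wno-old-style-definition, as in the ChangeLog entries above.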
2024-10-07nvptx: Re-enable all variants of 'gcc.c-torture/execute/20020529-1.c'Thomas Schwinge1-4/+0
Generally PASSes with: $ ptxas --version ptxas: NVIDIA (R) Ptx optimizing assembler Copyright (c) 2005-2018 NVIDIA Corporation Built on Sun_Sep__9_21:06:46_CDT_2018 Cuda compilation tools, release 10.0, V10.0.145 ..., and execution with 'Driver Version: 361.93.02'. Only the '-O1' execution test FAILs (pre-existing; to be analyzed later): nvptx-run: error getting kernel result: an illegal memory access was encountered (CUDA_ERROR_ILLEGAL_ADDRESS, 700) gcc/testsuite/ * gcc.c-torture/execute/20020529-1.c: Re-enable all variants for nvptx.
2024-10-07nvptx: Disable effective-target 'freestanding'Thomas Schwinge4-0/+4
After 2014's commit 157e859ffe3b5d43db1e19475711c1a3d21ab57a "remove picochip", the effective-target 'freestanding' (later) was only ever used for nvptx. However, the relevant I/O library functions have long been implemented in nvptx newlib. These test cases generally PASS, just a few need to get XFAILed; see <https://docs.nvidia.com/cuda/ptx-writers-guide-to-interoperability/#system-calls>, and then supposedly <https://docs.nvidia.com/cuda/cuda-c-programming-guide/#formatted-output> for description of the non-standard PTX 'vprintf' return value: > Unlike the C-standard 'printf()', which returns the number of characters > printed, CUDA's 'printf()' returns the number of arguments parsed. If no > arguments follow the format string, 0 is returned. If the format string is > NULL, -1 is returned. If an internal error occurs, -2 is returned. (I've tried a few variants to confirm that PTX 'vprintf' -- which supposedly is underlying the CUDA 'printf' -- is what's implementing this behavior.) Probably, we ought to fix that up in nvptx newlib. gcc/testsuite/ * gcc.c-torture/execute/printf-1.c: XFAIL for nvptx. * gcc.c-torture/execute/printf-chk-1.c: Likewise. * gcc.c-torture/execute/vprintf-1.c: Likewise. * gcc.c-torture/execute/vprintf-chk-1.c: Likewise. * lib/target-supports.exp (check_effective_target_freestanding): Disable for nvptx.
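To make the difference in return values concrete, here is a small hosted-C sketch (the helper names are illustrative only). ISO C 'printf' reports the number of characters written, so both helpers report 6; going by the CUDA documentation quoted above, the PTX 'vprintf' behavior would presumably report 0 and 1 for the same calls, which is the kind of mismatch that makes a test checking the return value fail.

```c
#include <stdio.h>

/* Format string without any conversions: ISO C reports the six
   characters of "hello\n".  */
int report_plain (void)
{
  return printf ("hello\n");
}

/* One conversion: ISO C still reports the characters written,
   here the six characters of "12345\n", not the argument count.  */
int report_one_arg (void)
{
  return printf ("%d\n", 12345);
}
```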
2024-10-04testsuite - Fix gcc.c-torture/execute/ieee/pr108540-1.cGeorg-Johann Lay2-3/+10
PR testsuite/108540 gcc/testsuite/ * gcc.c-torture/execute/ieee/pr108540-1.c: Un-preprocess __SIZE_TYPE__ and __INT64_TYPE__. * gcc.c-torture/execute/ieee/pr108540-1.x: New file, requires double64.
2024-09-14testsuite; Fix execute/pr52286.c for 16bitAndrew Pinski1-1/+1
The code path which was added for 16bit had a broken inline-asm which would only assign maybe half of the registers for the `long` type to 0. Adding L to the input operand of the inline-asm fixes the issue by now assigning the full 32bit value of the input register that would match up with the output register. Fixes r0-115223-gb0408f13d4b317 which added the 16bit code path to fix the testcase for 16bit. Pushed as obvious. PR testsuite/116716 gcc/testsuite/ChangeLog: * gcc.c-torture/execute/pr52286.c: Fix inline-asm for 16bit case.
2024-08-1216-bit testsuite fixes - excessive code sizeJoern Rennecke1-0/+2
gcc/testsuite/ * gcc.c-torture/execute/20021120-1.c: Skip if not size20plus or -Os. * gcc.dg/fixed-point/convert-float-4.c: Require size20plus. * gcc.dg/torture/pr112282.c: Skip if -O0 unless size20plus. * g++.dg/lookup/pr21802.C: Require size20plus.
2024-07-29testsuite: fix PR111613 testSam James1-0/+29
PR ipa/111613 * gcc.c-torture/pr111613.c: Rename to... * gcc.c-torture/execute/pr111613.c: ...this.
2024-07-29testsuite: make PR115277 test an execute oneSam James1-0/+28
PR middle-end/115277 * gcc.c-torture/compile/pr115277.c: Rename to... * gcc.c-torture/execute/pr115277.c: ...this.
2024-07-22Fix modref_eaf_analysis::analyze_ssa_name handling of values dereferenced to ↵Jan Hubicka1-0/+35
function call parameters modref_eaf_analysis::analyze_ssa_name misinterprets EAF flags. If a dereferenced parameter is passed (to map_iterator in the testcase) it can be returned indirectly, which in turn makes it escape into the next function call. PR ipa/115033 gcc/ChangeLog: * ipa-modref.cc (modref_eaf_analysis::analyze_ssa_name): Fix checking of EAF flags when analysing values dereferenced as function parameters. gcc/testsuite/ChangeLog: * gcc.c-torture/execute/pr115033.c: New test.
2024-07-22Fix accounting of offsets in unadjusted_ptr_and_unit_offsetJan Hubicka1-0/+23
unadjusted_ptr_and_unit_offset accidentally throws away the offset computed by get_addr_base_and_unit_offset. Instead of passing extra_offset it passes offset. PR ipa/114207 gcc/ChangeLog: * ipa-prop.cc (unadjusted_ptr_and_unit_offset): Fix accounting of offsets in ADDR_EXPR. gcc/testsuite/ChangeLog: * gcc.c-torture/execute/pr114207.c: New test.
2024-06-05Don't simplify NAN/INF or out-of-range constant for FIX/UNSIGNED_FIX.liuhongt1-0/+2
According to the IEEE standard, for conversions from floating point to integer: When a NaN or infinite operand cannot be represented in the destination format and this cannot otherwise be indicated, the invalid operation exception shall be signaled. When a numeric operand would convert to an integer outside the range of the destination format, the invalid operation exception shall be signaled if this situation cannot otherwise be indicated. The patch prevents simplification of the conversion from floating point to integer for NAN/INF/out-of-range constants when flag_trapping_math is set. gcc/ChangeLog: PR rtl-optimization/100927 PR rtl-optimization/115161 PR rtl-optimization/115115 * simplify-rtx.cc (simplify_const_unary_operation): Prevent simplification of FIX/UNSIGNED_FIX for NAN/INF/out-of-range constant when flag_trapping_math. * fold-const.cc (fold_convert_const_int_from_real): Don't fold for overflow value when flag_trapping_math. gcc/testsuite/ChangeLog: * gcc.dg/pr100927.c: New test. * c-c++-common/Wconversion-1.c: Add -fno-trapping-math. * c-c++-common/dfp/convert-int-saturate.c: Ditto. * g++.dg/ubsan/pr63956.C: Ditto. * g++.dg/warn/Wconversion-real-integer.C: Ditto. * gcc.c-torture/execute/20031003-1.c: Ditto. * gcc.dg/Wconversion-complex-c99.c: Ditto. * gcc.dg/Wconversion-real-integer.c: Ditto. * gcc.dg/c90-const-expr-11.c: Ditto. * gcc.dg/overflow-warn-8.c: Ditto.
2024-06-04builtins: Force SAVE_EXPR for __builtin_{add,sub,mul}_overflow and ↵Jakub Jelinek1-0/+39
__builtin{add,sub}c [PR108789] The following testcase is miscompiled, because we use save_expr on the .{ADD,SUB,MUL}_OVERFLOW call we are creating, but if the first two operands are not INTEGER_CSTs (in that case we just fold it right away) but are TREE_READONLY/!TREE_SIDE_EFFECTS, save_expr doesn't actually create a SAVE_EXPR at all and so we lower it to *arg2 = REALPART_EXPR (.ADD_OVERFLOW (arg0, arg1)), \ IMAGPART_EXPR (.ADD_OVERFLOW (arg0, arg1)) which evaluates the ifn twice and just hope it will be CSEd back. As *arg2 aliases *arg0, that is not the case. The builtins are really never const/pure as they store into what the third arguments points to, so after handling the INTEGER_CST+INTEGER_CST case, I think we should just always use SAVE_EXPR. Just building SAVE_EXPR by hand and setting TREE_SIDE_EFFECTS on it doesn't work, because c_fully_fold optimizes it away again, so the following patch marks the ifn calls as TREE_SIDE_EFFECTS (but doesn't do it for the __builtin_{add,sub,mul}_overflow_p case which were designed for use especially in constant expressions and don't really evaluate the realpart side, so we don't really need a SAVE_EXPR in that case). 2024-06-04 Jakub Jelinek <jakub@redhat.com> PR middle-end/108789 * builtins.cc (fold_builtin_arith_overflow): For ovf_only, don't call save_expr and don't build REALPART_EXPR, otherwise set TREE_SIDE_EFFECTS on call before calling save_expr. (fold_builtin_addc_subc): Set TREE_SIDE_EFFECTS on call before calling save_expr. * gcc.c-torture/execute/pr108789.c: New test.
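The aliasing shape described above, where the result pointer refers to the same object as the first operand, can be sketched as follows. This is not the committed pr108789.c testcase, only an illustration with a made-up helper name; it shows why evaluating the internal function twice is wrong once the first evaluation has already stored through the pointer.

```c
#include <stdbool.h>

/* Adds b into *acc and reports whether the addition wrapped.
   The third argument aliases the first operand, so the sum must be
   computed exactly once: a second evaluation would read the value
   that the first one already stored.  */
bool accumulate (unsigned int *acc, unsigned int b)
{
  return __builtin_add_overflow (*acc, b, acc);
}
```

With a correct compiler, accumulating 2 into 40 yields 42 without overflow, and accumulating 1 into the all-ones value wraps to 0 and reports overflow.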
2024-05-21match: Disable `(type)zero_one_valuep*CST` for 1bit signed types [PR115154]Andrew Pinski1-0/+23
The problem here is the pattern added in r13-1162-g9991d84d2a8435 assumes that it is well defined to multiply zero_one_valuep by the truncated converted integer constant. It is well defined for all types except for signed 1bit types, where `a * -1` is produced, which is undefined. So disable this pattern for 1bit signed types. Note the pattern added in r14-3432-gddd64a6ec3b38e is able to work around the undefinedness except when `-fsanitize=undefined` is turned on, this is why I added a testcase for that. Bootstrapped and tested on x86_64-linux-gnu with no regressions. PR tree-optimization/115154 gcc/ChangeLog: * match.pd (convert (mult zero_one_valued_p@1 INTEGER_CST@2)): Disable for 1bit signed types. gcc/testsuite/ChangeLog: * c-c++-common/ubsan/signed1bitfield-1.c: New test. * gcc.c-torture/execute/signed1bitfield-1.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
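For readers unfamiliar with signed 1bit types: such a bit-field can only hold 0 and -1, so the constant the pattern multiplies by is -1 and there is no room for the value 1. A minimal sketch (the struct and helper are illustrative, not taken from signed1bitfield-1.c):

```c
/* A signed bit-field of width 1 has exactly two values: 0 and -1.  */
struct onebit
{
  signed int s : 1;
};

/* Stores v into the 1-bit field and reads it back as an int.  */
int roundtrip (int v)
{
  struct onebit f;
  f.s = v;
  return f.s;
}
```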
2024-05-16Fix points_to_local_or_readonly_memory_p wrt TARGET_MEM_REFJan Hubicka1-0/+38
TARGET_MEM_REF can be used to offset constant base into a memory object (to produce lea instruction). This confuses points_to_local_or_readonly_memory_p which treats the constant address as a base of the access. Bootstrapped/regtested x86_64-linux, committed. Honza gcc/ChangeLog: PR ipa/113787 * ipa-fnsummary.cc (points_to_local_or_readonly_memory_p): Do not look into TARGET_MEM_REFS with constant operand 0. gcc/testsuite/ChangeLog: * gcc.c-torture/execute/pr113787.c: New test.
2024-05-08reassoc: Fix up optimize_range_tests_to_bit_test [PR114965]Jakub Jelinek1-0/+30
The optimize_range_tests_to_bit_test optimization normally emits a range test first: if (entry_test_needed) { tem = build_range_check (loc, optype, unshare_expr (exp), false, lowi, high); if (tem == NULL_TREE || is_gimple_val (tem)) continue; } so during the bit test we already know that exp is in the [lowi, high] range, but skips it if we have range info which tells us this isn't necessary. Also, normally it emits shifts by exp - lowi counter, but has an optimization to use just exp counter if the mask isn't a more expensive constant in that case and lowi is > 0 and high is smaller than prec. The following testcase is miscompiled because the two abnormal cases are triggered. The range of exp is [43, 43][48, 48][95, 95], so on a 64-bit arch we decide we don't need the entry test, because 95 - 43 < 64. And we also decide to use just exp as counter, because the range test tests just for exp == 43 || exp == 48, so high is smaller than 64 too. Because 95 is in the exp range, we can't do that, we'd either need to do a range test first, i.e. if (exp - 43U <= 48U - 43U) if ((1UL << exp) & mask1) or need to subtract lowi from the shift counter, i.e. if ((1UL << (exp - 43)) & mask2) but can't skip both unless r.upper_bound () is < prec. The following patch ensures that. 2024-05-08 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/114965 * tree-ssa-reassoc.cc (optimize_range_tests_to_bit_test): Don't try to optimize away exp - lowi subtraction from shift count unless entry test is emitted or unless r.upper_bound () is smaller than prec.
 * gcc.c-torture/execute/pr114965.c: New test.
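The two safe forms mentioned above can be written out in plain C as follows. This is only a sketch with made-up function names, using a 64-bit mask type to stand in for prec == 64; it is not the code reassoc emits. The miscompiled form dropped both the entry test and the subtraction, which would shift by 95 in a 64-bit type.

```c
#include <stdint.h>

/* Form 1: range test first, then shift by exp itself.  The entry
   test guarantees 43 <= exp <= 48, so the shift count is below 64.  */
int match_with_entry_test (unsigned int exp)
{
  const uint64_t mask1 = (UINT64_C (1) << 43) | (UINT64_C (1) << 48);
  if (exp - 43u <= 48u - 43u)
    return ((UINT64_C (1) << exp) & mask1) != 0;
  return 0;
}

/* Form 2: no entry test, but the shift count is rebased by lowi.
   Only valid when the caller guarantees 43 <= exp <= 43 + 63, as the
   range info [43, 43][48, 48][95, 95] does (95 - 43 = 52 < 64).  */
int match_with_rebased_shift (unsigned int exp)
{
  const uint64_t mask2 = (UINT64_C (1) << 0) | (UINT64_C (1) << (48 - 43));
  return ((UINT64_C (1) << (exp - 43u)) & mask2) != 0;
}
```

Both forms accept 43 and 48 and reject 95; the second one relies entirely on the range information to keep the shift count in bounds.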
2024-04-12match: Fix `!a?b:c` and `a?~t:t` patterns for signed 1 bit types [PR114666]Andrew Pinski1-0/+13
The problem is the `!a?b:c` pattern will create a COND_EXPR with a 1bit signed integer which breaks patterns like `a?~t:t`. This rejects when we have a signed operand for both patterns. Note for GCC 15, I am going to look at the canonicalization of `a?~t:t` where t was a constant since I think keeping it a COND_EXPR might be more canonical and is what VRP produces from the same IR; if anything expand should handle which one is better. Bootstrapped and tested on x86_64-linux-gnu with no regressions. PR tree-optimization/114666 gcc/ChangeLog: * match.pd (`!a?b:c`): Reject signed types for the condition. (`a?~t:t`): Likewise. gcc/testsuite/ChangeLog: * gcc.c-torture/execute/bitfld-signed1-1.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-04-03expr: Fix up emit_push_insn [PR114552]Jakub Jelinek1-0/+24
r13-990 added optimizations in multiple spots to optimize during expansion storing of constant initializers into targets. In the load_register_parameters and expand_expr_real_1 cases, it checks it has a tree as the source and so knows we are reading that whole decl's value, so the code is fine as is, but in the emit_push_insn case it checks for a MEM from which something is pushed and checks for SYMBOL_REF as the MEM's address, but still assumes the whole object is copied, which as the following testcase shows might not always be the case. In the testcase, k is 6 bytes, then 2 bytes of padding, then another 4 bytes, while the emit_push_insn wants to store just the 6 bytes. The following patch simply verifies it is the whole initializer that is being stored, I think that is the best thing to do this late in the GCC 14 cycle, as well as for backporting. For GCC 15, perhaps the code could stop requiring it must be at offset zero, nor that the size is equal, but could use get_symbol_constant_value/fold_ctor_reference gimple-fold APIs to actually extract just part of the initializer if we e.g. push just some subset (of course, still verify that it is a subset). For sizes which are power of two bytes and we have some integer modes, we could use as type for fold_ctor_reference corresponding integral types, otherwise dunno, punt or use some structure (e.g. try to find one in the initializer?), whatever. But even in the other spots it could perhaps handle loading of COMPONENT_REFs or MEM_REFs from the .rodata vars. 2024-04-03 Jakub Jelinek <jakub@redhat.com> PR middle-end/114552 * expr.cc (emit_push_insn): Only use store_constructor for immediate_const_ctor_p if int_expr_size matches size. * gcc.c-torture/execute/pr114552.c: New test.
2024-03-28testsuite: Add testcase for already fixed PR [PR109925]Jakub Jelinek1-0/+30
This testcase was made latent by r14-4089 and got fixed both on the trunk and 13 branch with PR113372 fix. Adding testcase to the testsuite and will close the PR as a dup. 2024-03-28 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/109925 * gcc.c-torture/execute/pr109925.c: New test.
2024-03-26testsuite: Fix up pr111151.c testcase [PR114486]Jakub Jelinek1-1/+1
Apparently I've somehow screwed up the adjustments of the originally tested testcase: I tweaked it so that in the second/third cases it actually sees a MAX_EXPR rather than the COND_EXPR the MAX_EXPR has been optimized into, and didn't update the expected value. 2024-03-26 Jakub Jelinek <jakub@redhat.com> PR middle-end/111151 PR testsuite/114486 * gcc.c-torture/execute/pr111151.c (main): Fix up expected value for f.
2024-03-26fold-const: Punt on MULT_EXPR in extract_muldiv MIN/MAX_EXPR case [PR111151]Jakub Jelinek1-0/+21
As I've tried to explain in the comments, the extract_muldiv_1 MIN/MAX_EXPR optimization is wrong for code == MULT_EXPR. If the multiplication is done in unsigned type or in signed type with -fwrapv, it is fairly obvious that max (a, b) * c in many cases isn't equivalent to max (a * c, b * c) (or min if c is negative) due to overflows, but even for signed with undefined overflow, the optimization could turn something without UB in it (where say a * c invokes UB, but max (or min) picks the other operand where b * c doesn't). As for division/modulo, I think it is in most cases safe, except if the problematic INT_MIN / -1 case could be triggered, but we can just punt for MAX_EXPR because for MIN_EXPR if one operand is INT_MIN, we'd pick that operand already. It is just for completeness, match.pd already has an optimization which turns x / -1 into -x, so the division by zero is mostly theoretical. That is also why in the testcase the i case isn't actually miscompiled without the patch, while the c and f cases are. 2024-03-26 Jakub Jelinek <jakub@redhat.com> PR middle-end/111151 * fold-const.cc (extract_muldiv_1) <case MAX_EXPR>: Punt for MULT_EXPR altogether, or for MAX_EXPR if c is -1. * gcc.c-torture/execute/pr111151.c: New test.
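The wrapping case described above is easy to reproduce by hand. The sketch below uses made-up helper names and a fixed-width unsigned type (it is not the pr111151.c testcase); it computes both forms so the difference can be observed directly.

```c
#include <stdint.h>

static uint32_t max_u32 (uint32_t a, uint32_t b)
{
  return a > b ? a : b;
}

/* max (a, b) * c, evaluated as written.  */
uint32_t mul_of_max (uint32_t a, uint32_t b, uint32_t c)
{
  return max_u32 (a, b) * c;
}

/* max (a * c, b * c), the form the old folding produced.  */
uint32_t max_of_mul (uint32_t a, uint32_t b, uint32_t c)
{
  return max_u32 (a * c, b * c);
}
```

With a = 0x80000000, b = 1 and c = 2, the first form picks a and wraps to 0, while the second compares the already-wrapped 0 against 2 and returns 2.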
2024-03-22Move pr114396.c from gcc.target/i386 to gcc.c-torture/execute.liuhongt1-0/+105
Also fixed a typo in the testcase. gcc/testsuite/ChangeLog: PR tree-optimization/114396 * gcc.target/i386/pr114396.c: Move to... * gcc.c-torture/execute/pr114396.c: ...here.
2024-03-04Fix 20101011-1.c on H8Jan Dubiec1-0/+3
Excerpt from gcc.sum: [...] PASS: gcc.c-torture/execute/20101011-1.c -O0 (test for excess errors) FAIL: gcc.c-torture/execute/20101011-1.c -O0 execution test PASS: gcc.c-torture/execute/20101011-1.c -O1 (test for excess errors) FAIL: gcc.c-torture/execute/20101011-1.c -O1 execution test [ ... ] This is because H8 MCUs do not throw a "divide by zero" exception. gcc/testsuite * gcc.c-torture/execute/20101011-1.c: Do not test on H8 series.
2024-03-03[PATCH] combine: Don't simplify paradoxical SUBREG on ↵Greg McGary1-0/+9
WORD_REGISTER_OPERATIONS [PR113010] The sign-bit-copies of a sign-extending load cannot be known until runtime on WORD_REGISTER_OPERATIONS targets, except in the case of a zero-extending MEM load. See the fix for PR112758. gcc/ PR rtl-optimization/113010 * combine.cc (simplify_comparison): Simplify a SUBREG on WORD_REGISTER_OPERATIONS targets only if it is a zero-extending MEM load. gcc/testsuite * gcc.c-torture/execute/pr113010.c: New test.
2024-02-11Fix gcc.c-torture/execute/ieee/cdivchkf.c on hpuxJohn David Anglin1-4/+5
2024-02-11 John David Anglin <danglin@gcc.gnu.org> gcc/testsuite/ChangeLog: * gcc.c-torture/execute/ieee/cdivchkf.c: Use ilogb and __builtin_fmax instead of ilogbf and __builtin_fmaxf.