aboutsummaryrefslogtreecommitdiff
path: root/gcc/testsuite/gcc.c-torture
AgeCommit message (Collapse)AuthorFilesLines
41 hoursnvptx: Don't use PTX '.const', constant state space [PR119573]Thomas Schwinge1-1/+0
This avoids cases where a "File uses too much global constant data" (final executable, or single object file), and avoids cases of wrong code generation: "error : State space incorrect for instruction 'st'" ('st.const'), or another case where an "illegal instruction was encountered", or a lot of cases where for two compilation units (such as a library linked with user code) we ran into "error : Memory space doesn't match" due to differences in '.const' usage between definition and use of a variable. We progress: ptxas error : File uses too much global constant data (0x1f01a bytes, 0x10000 max) nvptx-run: cuLinkAddData failed: a PTX JIT compilation failed (CUDA_ERROR_INVALID_PTX, 218) ... into: PASS: 20_util/to_chars/103955.cc -std=gnu++17 (test for excess errors) [-FAIL:-]{+PASS:+} 20_util/to_chars/103955.cc -std=gnu++17 execution test We progress: ptxas error : File uses too much global constant data (0x36c65 bytes, 0x10000 max) nvptx-as: ptxas returned 255 exit status ... into: [-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c -O0 {+(test for excess errors)+} [-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c -O1 {+(test for excess errors)+} [-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c -O2 {+(test for excess errors)+} [-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c -O3 -g {+(test for excess errors)+} [-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c -Os {+(test for excess errors)+} [-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C -O0 (test for excess errors) [-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C -O1 (test for excess errors) [-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C -O2 (test for excess errors) [-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C -O3 -g (test for excess errors) [-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C -Os (test for excess errors) [-FAIL:-]{+PASS:+} gfortran.dg/bind-c-contiguous-1.f90 -O0 (test for excess errors) [-UNRESOLVED:-]{+PASS:+} gfortran.dg/bind-c-contiguous-1.f90 -O0 [-compilation failed to produce executable-]{+execution test+} [-FAIL:-]{+PASS:+} gfortran.dg/bind-c-contiguous-4.f90 -O0 (test for excess errors) [-UNRESOLVED:-]{+PASS:+} gfortran.dg/bind-c-contiguous-4.f90 -O0 [-compilation failed to produce executable-]{+execution test+} [-FAIL:-]{+PASS:+} gfortran.dg/bind-c-contiguous-5.f90 -O0 (test for excess errors) [-UNRESOLVED:-]{+PASS:+} gfortran.dg/bind-c-contiguous-5.f90 -O0 [-compilation failed to produce executable-]{+execution test+} [-FAIL:-]{+PASS:+} 20_util/to_chars/double.cc -std=gnu++17 (test for excess errors) [-UNRESOLVED:-]{+PASS:+} 20_util/to_chars/double.cc -std=gnu++17 [-compilation failed to produce executable-]{+execution test+} [-FAIL:-]{+PASS:+} 20_util/to_chars/float.cc -std=gnu++17 (test for excess errors) [-UNRESOLVED:-]{+PASS:+} 20_util/to_chars/float.cc -std=gnu++17 [-compilation failed to produce executable-]{+execution test+} [-FAIL:-]{+PASS:+} special_functions/13_ellint_3/check_value.cc -std=gnu++17 (test for excess errors) [-UNRESOLVED:-]{+PASS:+} special_functions/13_ellint_3/check_value.cc -std=gnu++17 [-compilation failed to produce executable-]{+execution test+} [-FAIL:-]{+PASS:+} tr1/5_numerical_facilities/special_functions/14_ellint_3/check_value.cc -std=gnu++17 (test for excess errors) [-UNRESOLVED:-]{+PASS:+} tr1/5_numerical_facilities/special_functions/14_ellint_3/check_value.cc -std=gnu++17 [-compilation failed to produce executable-]{+execution test+} ..., and progress likewise, but fail later with an unrelated error: [-FAIL:-]{+PASS:+} ext/special_functions/hyperg/check_value.cc -std=gnu++17 (test for excess errors) [-UNRESOLVED:-]{+FAIL:+} ext/special_functions/hyperg/check_value.cc -std=gnu++17 [-compilation failed to produce executable-]{+execution test+} [...]/libstdc++-v3/testsuite/ext/special_functions/hyperg/check_value.cc:12317: void test(const testcase_hyperg<Ret> (&)[Num], Ret) [with Ret = double; unsigned int Num = 19]: Assertion 'max_abs_frac < toler' failed. ..., and: [-FAIL:-]{+PASS:+} tr1/5_numerical_facilities/special_functions/17_hyperg/check_value.cc -std=gnu++17 (test for excess errors) [-UNRESOLVED:-]{+FAIL:+} tr1/5_numerical_facilities/special_functions/17_hyperg/check_value.cc -std=gnu++17 [-compilation failed to produce executable-]{+execution test+} [...]/libstdc++-v3/testsuite/tr1/5_numerical_facilities/special_functions/17_hyperg/check_value.cc:12316: void test(const testcase_hyperg<Ret> (&)[Num], Ret) [with Ret = double; unsigned int Num = 19]: Assertion 'max_abs_frac < toler' failed. We progress: nvptx-run: error getting kernel result: an illegal instruction was encountered (CUDA_ERROR_ILLEGAL_INSTRUCTION, 715) ... into: PASS: g++.dg/cpp1z/inline-var1.C -std=gnu++17 (test for excess errors) [-FAIL:-]{+PASS:+} g++.dg/cpp1z/inline-var1.C -std=gnu++17 execution test PASS: g++.dg/cpp1z/inline-var1.C -std=gnu++20 (test for excess errors) [-FAIL:-]{+PASS:+} g++.dg/cpp1z/inline-var1.C -std=gnu++20 execution test PASS: g++.dg/cpp1z/inline-var1.C -std=gnu++26 (test for excess errors) [-FAIL:-]{+PASS:+} g++.dg/cpp1z/inline-var1.C -std=gnu++26 execution test (A lot of '.const' -> '.global' etc. Haven't researched what the actual problem was.) We progress: ptxas /tmp/cc5TSZZp.o, line 142; error : State space incorrect for instruction 'st' ptxas /tmp/cc5TSZZp.o, line 174; error : State space incorrect for instruction 'st' ptxas fatal : Ptx assembly aborted due to errors nvptx-as: ptxas returned 255 exit status ... into: [-FAIL:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C -O0 (test for excess errors) [-UNRESOLVED:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C -O0 [-compilation failed to produce executable-]{+execution test+} PASS: g++.dg/torture/builtin-clear-padding-1.C -O1 (test for excess errors) PASS: g++.dg/torture/builtin-clear-padding-1.C -O1 execution test [-FAIL:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C -O2 (test for excess errors) [-UNRESOLVED:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C -O2 [-compilation failed to produce executable-]{+execution test+} [-FAIL:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C -O3 -g (test for excess errors) [-UNRESOLVED:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C -O3 -g [-compilation failed to produce executable-]{+execution test+} [-FAIL:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C -Os (test for excess errors) [-UNRESOLVED:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C -Os [-compilation failed to produce executable-]{+execution test+} This indeed tried to write ('st.const') into 's2', which was '.const' (also: 's1' was '.const') -- even though, no explicit 'const' in 'g++.dg/torture/builtin-clear-padding-1.C'; "interesting". We progress: error : Memory space doesn't match for '_ZNSt3tr18__detail12__prime_listE' in 'input file 3 at offset 53085', first specified in 'input file 1 at offset 1924' nvptx-run: cuLinkAddData failed: device kernel image is invalid (CUDA_ERROR_INVALID_SOURCE, 300) ... into execution test PASS for a few dozens of libstdc++ test cases. We progress: error : Memory space doesn't match for '_ZNSt6locale17_S_twinned_facetsE' in 'input file 11 at offset 479903', first specified in 'input file 9 at offset 59300' nvptx-run: cuLinkAddData failed: device kernel image is invalid (CUDA_ERROR_INVALID_SOURCE, 300) ... into: PASS: g++.dg/tree-ssa/pr20458.C -std=gnu++17 (test for excess errors) [-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C -std=gnu++17 execution test PASS: g++.dg/tree-ssa/pr20458.C -std=gnu++26 (test for excess errors) [-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C -std=gnu++26 execution test ..., and likewise for a few hundreds of libstdc++ test cases. We progress: error : Memory space doesn't match for '_ZNSt6locale5_Impl19_S_facet_categoriesE' in 'input file 11 at offset 821962', first specified in 'input file 10 at offset 676317' nvptx-run: cuLinkAddData failed: device kernel image is invalid (CUDA_ERROR_INVALID_SOURCE, 300) ... into execution test PASS for a hundred of libstdc++ test cases. We progress: error : Memory space doesn't match for '_ctype_' in 'input file 22 at offset 1698331', first specified in 'input file 9 at offset 57095' nvptx-run: cuLinkAddData failed: device kernel image is invalid (CUDA_ERROR_INVALID_SOURCE, 300) ... into execution test PASS for another few libstdc++ test cases. PR target/119573 gcc/ * config/nvptx/nvptx.cc (nvptx_encode_section_info): Don't set 'DATA_AREA_CONST' for 'TREE_CONSTANT', or 'TREE_READONLY'. (nvptx_asm_declare_constant_name): Use '.global' instead of '.const'. gcc/testsuite/ * gcc.c-torture/compile/pr46534.c: Don't 'dg-skip-if' nvptx. * gcc.target/nvptx/decl.c: Adjust. libstdc++-v3/ * config/cpu/nvptx/t-nvptx (AM_MAKEFLAGS): Don't amend.
4 dayscombine: Use reg_used_between_p rather than modified_between_p in two spots ↵Jakub Jelinek1-0/+33
[PR119291] The following testcase is miscompiled on x86_64-linux at -O2 by the combiner. We have from earlier combinations (insn 22 21 23 4 (set (reg:SI 104 [ _7 ]) (const_int 0 [0])) "pr119291.c":25:15 96 {*movsi_internal} (nil)) (insn 23 22 24 4 (set (reg/v:SI 117 [ e ]) (reg/v:SI 116 [ e ])) 96 {*movsi_internal} (expr_list:REG_DEAD (reg/v:SI 116 [ e ]) (nil))) (note 24 23 25 4 NOTE_INSN_DELETED) (insn 25 24 26 4 (parallel [ (set (reg:CCZ 17 flags) (compare:CCZ (neg:SI (reg:SI 104 [ _7 ])) (const_int 0 [0]))) (set (reg/v:SI 116 [ e ]) (neg:SI (reg:SI 104 [ _7 ]))) ]) "pr119291.c":26:13 977 {*negsi_2} (expr_list:REG_DEAD (reg:SI 104 [ _7 ]) (nil))) (note 26 25 27 4 NOTE_INSN_DELETED) (insn 27 26 28 4 (set (reg:DI 128 [ _9 ]) (ne:DI (reg:CCZ 17 flags) (const_int 0 [0]))) "pr119291.c":26:13 1447 {*setcc_di_1} (expr_list:REG_DEAD (reg:CCZ 17 flags) (nil))) and try_combine is called on i3 25 and i2 22 (second time) and reach the hunk being patched with simplified i3 (insn 25 24 26 4 (parallel [ (set (pc) (pc)) (set (reg/v:SI 116 [ e ]) (const_int 0 [0])) ]) "pr119291.c":28:13 977 {*negsi_2} (expr_list:REG_DEAD (reg:SI 104 [ _7 ]) (nil))) and (insn 22 21 23 4 (set (reg:SI 104 [ _7 ]) (const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal} (nil)) Now, the try_combine code there attempts to split two independent sets in newpat by moving one of them to i2. And among other tests it checks !modified_between_p (SET_DEST (set1), i2, i3) which is certainly needed, if there would be say (set (reg/v:SI 116 [ e ]) (const_int 42 [0x2a])) in between i2 and i3, we couldn't do that, as that set would overwrite the value set by set1 we want to move to the i2 position. But in this case pseudo 116 isn't set in between i2 and i3, but used (and additionally there is a REG_DEAD note for it). This is equally bad for the move, because while the i3 insn and later will see the pseudo value that we set, the insn in between which uses the value will see a different value from the one that it should see. As we don't check for that, in the end try_combine succeeds and changes the IL to: (insn 22 21 23 4 (set (reg/v:SI 116 [ e ]) (const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal} (nil)) (insn 23 22 24 4 (set (reg/v:SI 117 [ e ]) (reg/v:SI 116 [ e ])) 96 {*movsi_internal} (expr_list:REG_DEAD (reg/v:SI 116 [ e ]) (nil))) (note 24 23 25 4 NOTE_INSN_DELETED) (insn 25 24 26 4 (set (pc) (pc)) "pr119291.c":28:13 2147483647 {NOOP_MOVE} (nil)) (note 26 25 27 4 NOTE_INSN_DELETED) (insn 27 26 28 4 (set (reg:DI 128 [ _9 ]) (const_int 0 [0])) "pr119291.c":28:13 95 {*movdi_internal} (nil)) (note, the i3 got turned into a nop and try_combine also modified insn 27). The following patch replaces the modified_between_p tests with reg_used_between_p, my understanding is that modified_between_p is a subset of reg_used_between_p, so one doesn't need both. Looking at this some more today, I think we should special case set_noop_p because that can be put into i2 (except for the JUMP_P violations), currently both modified_between_p (pc_rtx, i2, i3) and reg_used_between_p (pc_rtx, i2, i3) returns false. I'll post a patch incrementally for that (but that feels like new optimization, so probably not something that should be backported). On Tue, Apr 01, 2025 at 11:27:25AM +0200, Richard Biener wrote: > Can we constrain SET_DEST (set1/set0) to a REG_P in combine? Why > does the comment talk about memory? I was worried about making too risky changes this late in stage4 (and especially also for backports). Most of this code is 1992-ish. I think many of the functions are just misnamed, the reg_ in there doesn't match what those functions do (bet they initially supported just REGs and later on support for other kinds of expressions was added, but haven't done git archeology to prove that). What we know for sure is: && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != ZERO_EXTRACT && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != STRICT_LOW_PART && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != ZERO_EXTRACT && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != STRICT_LOW_PART that is checked earlier in the condition. Then it calls && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)), XVECEXP (newpat, 0, 0)) && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)), XVECEXP (newpat, 0, 1)) While it has reg_* in it, that function mostly calls reg_overlap_mentioned_p which is also misnamed, that function handles just fine all of REG, MEM, SUBREG of REG, (SUBREG of MEM not, see below), ZERO_EXTRACT, STRICT_LOW_PART, PC and even some further cases. So, IMHO SET_DEST (set0) or SET_DEST (set0) can be certainly a REG, SUBREG of REG, PC (at least the REG and PC cases are triggered on the testcase) and quite possibly also MEM (SUBREG of MEM not, see below). Now, the code uses !modified_between_p (SET_SRC (set{1,0}), i2, i3) where that function for constants just returns false, for PC returns true, for REG returns reg_set_between_p, for MEM recurses on the address, for MEM_READONLY_P otherwise returns false, otherwise checks using alias.cc code whether the memory could have been modified in between, for all other rtxes recurses on the subrtxes. This part didn't change in my patch. I've only changed those - && !modified_between_p (SET_DEST (set{1,0}), i2, i3) + && !reg_used_between_p (SET_DEST (set{1,0}), i2, i3) where the former has been described above and clearly handles all of REG, SUBREG of REG, PC, MEM and SUBREG of MEM among other things. The replacement reg_used_between_p calls reg_overlap_mentioned_p on each instruction in between i2 and i3. So, there is clearly a difference in behavior if SET_DEST (set{1,0}) is pc_rtx, in that case modified_between_p returns unconditionally true even if there are no instructions in between, but reg_used_between_p if there are no non-debug insns in between returns false. Sorry for missing that, guess I should check for that (with the exception of the noop moves which are often (set (pc) (pc)) and handled by the incremental patch). In fact not just that, reg_used_between_p will only return true for PC if it is mentioned anywhere in the insns in between. Anyway, except for that, for REG it calls refers_to_regno_p and so should find any occurrences of any of the REG or parts of it for hard registers, for MEM returns true if it sees any MEMs in insns in between (conservatively), for SUBREGs apparently it relies on it being SUBREG of REG (so doesn't handle SUBREG of MEM) and handles SUBREG of REG like the SUBREG_REG, PC I've already described. Now, because reg_overlap_mentioned_p doesn't handle SUBREG of MEM, I think already the initial && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)), XVECEXP (newpat, 0, 0)) && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)), XVECEXP (newpat, 0, 1)) calls would have failed --enable-checking=rtl or would have misbehaved, so I think there is no need to check for it further. To your question why I don't use reg_referenced_p, that is because reg_referenced_p is something to call on one insn pattern, while reg_used_between_p is pretty much that on all insns in between two instructions (excluding the boundaries). So, I think it would be safer to add && SET_DEST (set{1,0} != pc_rtx checks to preserve former behavior, like in the following version. 2025-04-01 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/119291 * combine.cc (try_combine): For splitting of PARALLEL with 2 independent SETs into i2 and i3 sets check reg_used_between_p of the SET_DESTs rather than just modified_between_p. * gcc.c-torture/execute/pr119291.c: New test.
11 daysi386: Fix up combination of -2 r<<= (x & 7) into btr [PR119428]Jakub Jelinek1-0/+18
The following patch is miscompiled from r15-8478 but latently already since my r11-5756 and r11-6631 changes. The r11-5756 change was https://gcc.gnu.org/pipermail/gcc-patches/2020-December/561164.html which changed the splitters to immediately throw away the masking. And the r11-6631 change was an optimization to recognize (set (zero_extract:HI (...) (const_int 1) (...)) (const_int 1) as btr. The problem is their interaction. x86 is not a SHIFT_COUNT_TRUNCATED target, so the masking needs to be explicit in the IL. And combine.cc (make_field_assignment) has since 1992 optimizations which try to optimize x &= (-2 r<< y) into zero_extract (x) = 0. Now, such an optimization is fine if y has not been masked or if the chosen zero_extract has the same mode as the rotate (or it recognizes something with a left shift too). IMHO such optimization is invalid for SHIFT_COUNT_TRUNCATED targets because we explicitly say that the masking of the shift/rotate counts are redundant there and don't need to be part of the IL (I have a patch for that, but because it is just latent, I'm not sure it needs to be posted for gcc 15 (and also am not sure if it should punt or add operand masking just in case)). x86 is not SHIFT_COUNT_TRUNCATED though and so even fixing combine not to do that for SHIFT_COUNT_TRUNCATED targets doesn't help, and we don't have QImode insv, so it is optimized into HImode insertions. Now, if the y in x &= (-2 r<< y) wasn't masked in any way, turning it into HImode btr is just fine, but if it was x &= (-2 r<< (y & 7)) and we just decided to throw away the masking, using btr changes the behavior on it and causes e2fsprogs and sqlite miscompilations. So IMHO on !SHIFT_COUNT_TRUNCATED targets, we need to keep the maskings explicit in the IL, either at least for the duration of the combine pass as does the following patch (where combine is the only known pass to have such transformation), or even keep it until final pass in case there are some later optimizations that would also need to know whether there was explicit masking or not and with what mask. The latter change would be much larger. The following patch just reverts the r11-5756 change and adds a testcase. 2025-03-25 Jakub Jelinek <jakub@redhat.com> PR target/96226 PR target/119428 * config/i386/i386.md (splitter after *<rotate_insn><mode>3_mask, splitter after *<rotate_insn><mode>3_mask_1): Revert 2020-12-05 changes. * gcc.c-torture/execute/pr119428.c: New test.
2025-03-12testsuite: Add testcase for already fixed PR [PR119226]Jakub Jelinek1-0/+12
This was fixed in PR119219 r15-7981 commit. 2025-03-12 Jakub Jelinek <jakub@redhat.com> PR middle-end/119226 * gcc.c-torture/compile/pr119226.c: New test.
2025-03-04simplify-rtx: Fix up simplify_logical_relational_operation [PR119002]Richard Sandiford1-0/+23
The following testcase is miscompiled on powerpc64le-linux starting with r15-6777. During combine we see: (set (reg:SI 134) (ior:SI (ge:SI (reg:CCFP 128) (const_int 0 [0])) (lt:SI (reg:CCFP 128) (const_int 0 [0])))) The simplify_logical_relational_operation code (in its current form) was written with arithmetic rather than CC modes in mind. Since CCFP is a CC mode, it fails the HONOR_NANS check, and so the function assumes that ge | lt => true. If one comparison is unsigned then it should be safe to assume that the other comparison is also unsigned, even for CC modes, since the optimisation checks that the comparisons are between the same operands. For the other cases, we can only safely fold comparisons of CC mode values if the result is always-true (15) or always-false (0). It turns out that the original testcase for PR117186, which ran at -O, was relying on the old behaviour for some of the functions. It needs 4-instruction combinations, and so -fexpensive-optimizations, to pass in its intended form. gcc/ PR rtl-optimization/119002 * simplify-rtx.cc (simplify_context::simplify_logical_relational_operation): Handle comparisons between CC values. If there is no evidence that the CC values are unsigned, restrict the fold to always-true or always-false results. gcc/testsuite/ * gcc.c-torture/execute/ieee/pr119002.c: New test. * gcc.target/aarch64/pr117186.c: Run at -O2 rather than -O. Co-authored-by: Jakub Jelinek <jakub@redhat.com>
2025-03-04testsuite: Add tests for already fixed PR [PR119071]Jakub Jelinek1-0/+15
Uros' r15-7793 fixed this PR as well, I'm just committing tests from the PR so that it can be closed. 2025-03-04 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/119071 * gcc.dg/pr119071.c: New test. * gcc.c-torture/execute/pr119071.c: New test.
2025-02-27gimple-fold: Fix a pasto in fold_truth_andor_for_ifcombine [PR119030]Jakub Jelinek1-0/+26
The following testcase is miscompiled since r15-7597. The left comparison is unsigned (x & 0x8000U) != 0) while the right one is signed (x >> 16) >= 0 and is actually a signbit test, so rsignbit is 64. After debugging this and reading the r15-7597 change, I believe there is just a pasto, the if (lsignbit) and if (rsignbit) blocks are pretty much identical with just the first l on all variables starting with l replaced with r (the only difference is that if (lsignbit) has a comment explaining the sign <<= 1; stuff, while it isn't repeated in the second one. Except the second one was using ll_unsignedp instead of rl_unsignedp in one spot. I think it should use the latter, the signedness of the left comparison doesn't affect the other one, they are basically independent with the exception that we check that after transformations they are both EQ or both NE and later on we try to merge them together. 2025-02-27 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/119030 * gimple-fold.cc (fold_truth_andor_for_ifcombine): Fix a pasto, ll_unsignedp -> rl_unsignedp. * gcc.c-torture/execute/pr119030.c: New test.
2025-02-24reassoc: Fix up optimize_range_tests_to_bit_test [PR118915]Jakub Jelinek1-0/+22
The following testcase is miscompiled due to a bug in optimize_range_tests_to_bit_test. It is trying to optimize check for a in [-34,-34] or [-26,-26] or [-6,-6] or [-4,inf] ranges. Another reassoc optimization folds the the test for the first two ranges into (a + 34U) & ~8U in [0U,0U] range, and extract_bit_test_mask actually has code to virtually undo it and treat that again as test for a being -34 or -26. The problem is that optimize_range_tests_to_bit_test remembers in the type variable TREE_TYPE (ranges[i].exp); from the first range. If extract_bit_test_mask doesn't do that virtual undoing of the BIT_AND_EXPR handling, that is just fine, the returned exp is ranges[i].exp. But if the first range is BIT_AND_EXPR, the type could be different, the BIT_AND_EXPR form has the optional cast to corresponding unsigned type in order to avoid introducing UB. Now, type was used to fill in the max value if ranges[j].high was missing in subsequently tested range, and so in this particular testcase the [-4,inf] range which was signed int and so [-4,INT_MAX] was treated as [-4,UINT_MAX] instead. And we were subtracting values of 2 different types and trying to make sense out of that. The following patch fixes this by using the type of the low bound (which is always non-NULL) for the max value of the high bound instead. 2025-02-24 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/118915 * tree-ssa-reassoc.cc (optimize_range_tests_to_bit_test): For highj == NULL_TREE use TYPE_MAX_VALUE (TREE_TYPE (lowj)) rather than TYPE_MAX_VALUE (type). * gcc.c-torture/execute/pr118915.c: New test.
2025-02-22Turn test cases into UNSUPPORTED if running into 'sorry, unimplemented: ↵Thomas Schwinge54-57/+2
dynamic stack allocation not supported' In Subversion r217296 (Git commit e2acc079ff125a869159be45371dc0a29b230e92) "Testsuite alloca fixes for ptx", effective-target 'alloca' was added to mark up test cases that run into the nvptx back end's non-support of dynamic stack allocation. (Later, nvptx gained conditional support for that in commit 3861d362ec7e3c50742fc43833fe9d8674f4070e "nvptx: PTX 'alloca' for '-mptx=7.3'+, '-march=sm_52'+ [PR65181]", but on the other hand, in commit f93a612fc4567652b75ffc916d31a446378e6613 "bpf: liberate R9 for general register allocation", the BPF back end joined "the list of targets that do not support alloca in target-support.exp". Manually maintaining the list of test cases requiring effective-target 'alloca' is notoriously hard, gets out of date quickly: new test cases added to the test suite may need to be analyzed and annotated, and over time annotations also may need to be removed, in cases where the compiler learns to optimize out 'alloca'/VLA usage, for example. This commit replaces (99 % of) the manual annotations with an automatic scheme: turn test cases into UNSUPPORTED if running into 'sorry, unimplemented: dynamic stack allocation not supported'. gcc/testsuite/ * lib/target-supports.exp (check_effective_target_alloca): Gracefully handle the case that we've not be called (indirectly) from 'dg-test'. * lib/gcc-dg.exp (proc gcc-dg-prune): Turn 'sorry, unimplemented: dynamic stack allocation not supported' into UNSUPPORTED. * c-c++-common/Walloca-larger-than.c: Don't 'dg-require-effective-target alloca'. * c-c++-common/Warray-bounds-9.c: Likewise. * c-c++-common/Warray-bounds.c: Likewise. * c-c++-common/Wdangling-pointer-2.c: Likewise. * c-c++-common/Wdangling-pointer-4.c: Likewise. * c-c++-common/Wdangling-pointer-5.c: Likewise. * c-c++-common/Wdangling-pointer.c: Likewise. * c-c++-common/Wimplicit-fallthrough-7.c: Likewise. * c-c++-common/Wsizeof-pointer-memaccess1.c: Likewise. * c-c++-common/Wsizeof-pointer-memaccess2.c: Likewise. * c-c++-common/Wstringop-truncation.c: Likewise. * c-c++-common/Wunused-var-6.c: Likewise. * c-c++-common/Wunused-var-8.c: Likewise. * c-c++-common/analyzer/alloca-leak.c: Likewise. * c-c++-common/analyzer/allocation-size-multiline-2.c: Likewise. * c-c++-common/analyzer/allocation-size-multiline-3.c: Likewise. * c-c++-common/analyzer/capacity-1.c: Likewise. * c-c++-common/analyzer/capacity-3.c: Likewise. * c-c++-common/analyzer/imprecise-floating-point-1.c: Likewise. * c-c++-common/analyzer/infinite-recursion-alloca.c: Likewise. * c-c++-common/analyzer/malloc-callbacks.c: Likewise. * c-c++-common/analyzer/malloc-paths-8.c: Likewise. * c-c++-common/analyzer/out-of-bounds-5.c: Likewise. * c-c++-common/analyzer/out-of-bounds-diagram-11.c: Likewise. * c-c++-common/analyzer/uninit-alloca.c: Likewise. * c-c++-common/analyzer/write-to-string-literal-5.c: Likewise. * c-c++-common/asan/alloca_loop_unpoisoning.c: Likewise. * c-c++-common/auto-init-11.c: Likewise. * c-c++-common/auto-init-12.c: Likewise. * c-c++-common/auto-init-15.c: Likewise. * c-c++-common/auto-init-16.c: Likewise. * c-c++-common/builtins.c: Likewise. * c-c++-common/dwarf2/vla1.c: Likewise. * c-c++-common/gomp/pr61486-2.c: Likewise. * c-c++-common/torture/builtin-clear-padding-4.c: Likewise. * c-c++-common/torture/strub-run3.c: Likewise. * c-c++-common/torture/strub-run4.c: Likewise. * c-c++-common/torture/strub-run4c.c: Likewise. * c-c++-common/torture/strub-run4d.c: Likewise. * c-c++-common/torture/strub-run4i.c: Likewise. * g++.dg/Walloca1.C: Likewise. * g++.dg/Walloca2.C: Likewise. * g++.dg/cpp0x/pr70338.C: Likewise. * g++.dg/cpp1y/lambda-generic-vla1.C: Likewise. * g++.dg/cpp1y/vla10.C: Likewise. * g++.dg/cpp1y/vla2.C: Likewise. * g++.dg/cpp1y/vla6.C: Likewise. * g++.dg/cpp1y/vla8.C: Likewise. * g++.dg/debug/debug5.C: Likewise. * g++.dg/debug/debug6.C: Likewise. * g++.dg/debug/pr54828.C: Likewise. * g++.dg/diagnostic/pr70105.C: Likewise. * g++.dg/eh/cleanup5.C: Likewise. * g++.dg/eh/spbp.C: Likewise. * g++.dg/ext/builtin_alloca.C: Likewise. * g++.dg/ext/tmplattr9.C: Likewise. * g++.dg/ext/vla10.C: Likewise. * g++.dg/ext/vla11.C: Likewise. * g++.dg/ext/vla12.C: Likewise. * g++.dg/ext/vla15.C: Likewise. * g++.dg/ext/vla16.C: Likewise. * g++.dg/ext/vla17.C: Likewise. * g++.dg/ext/vla23.C: Likewise. * g++.dg/ext/vla3.C: Likewise. * g++.dg/ext/vla6.C: Likewise. * g++.dg/ext/vla7.C: Likewise. * g++.dg/init/array24.C: Likewise. * g++.dg/init/new47.C: Likewise. * g++.dg/init/pr55497.C: Likewise. * g++.dg/opt/pr78201.C: Likewise. * g++.dg/template/vla2.C: Likewise. * g++.dg/torture/Wsizeof-pointer-memaccess1.C: Likewise. * g++.dg/torture/Wsizeof-pointer-memaccess2.C: Likewise. * g++.dg/torture/pr62127.C: Likewise. * g++.dg/torture/pr67055.C: Likewise. * g++.dg/torture/stackalign/eh-alloca-1.C: Likewise. * g++.dg/torture/stackalign/eh-inline-2.C: Likewise. * g++.dg/torture/stackalign/eh-vararg-1.C: Likewise. * g++.dg/torture/stackalign/eh-vararg-2.C: Likewise. * g++.dg/warn/Wplacement-new-size-5.C: Likewise. * g++.dg/warn/Wsizeof-pointer-memaccess-1.C: Likewise. * g++.dg/warn/Wvla-1.C: Likewise. * g++.dg/warn/Wvla-3.C: Likewise. * g++.old-deja/g++.ext/array2.C: Likewise. * g++.old-deja/g++.ext/constructor.C: Likewise. * g++.old-deja/g++.law/builtin1.C: Likewise. * g++.old-deja/g++.other/crash12.C: Likewise. * g++.old-deja/g++.other/eh3.C: Likewise. * g++.old-deja/g++.pt/array6.C: Likewise. * g++.old-deja/g++.pt/dynarray.C: Likewise. * gcc.c-torture/compile/20000923-1.c: Likewise. * gcc.c-torture/compile/20030224-1.c: Likewise. * gcc.c-torture/compile/20071108-1.c: Likewise. * gcc.c-torture/compile/20071117-1.c: Likewise. * gcc.c-torture/compile/900313-1.c: Likewise. * gcc.c-torture/compile/parms.c: Likewise. * gcc.c-torture/compile/pr17397.c: Likewise. * gcc.c-torture/compile/pr35006.c: Likewise. * gcc.c-torture/compile/pr42956.c: Likewise. * gcc.c-torture/compile/pr51354.c: Likewise. * gcc.c-torture/compile/pr52714.c: Likewise. * gcc.c-torture/compile/pr55851.c: Likewise. * gcc.c-torture/compile/pr77754-1.c: Likewise. * gcc.c-torture/compile/pr77754-2.c: Likewise. * gcc.c-torture/compile/pr77754-3.c: Likewise. * gcc.c-torture/compile/pr77754-4.c: Likewise. * gcc.c-torture/compile/pr77754-5.c: Likewise. * gcc.c-torture/compile/pr77754-6.c: Likewise. * gcc.c-torture/compile/pr78439.c: Likewise. * gcc.c-torture/compile/pr79413.c: Likewise. * gcc.c-torture/compile/pr82564.c: Likewise. * gcc.c-torture/compile/pr87110.c: Likewise. * gcc.c-torture/compile/pr99787-1.c: Likewise. * gcc.c-torture/compile/vla-const-1.c: Likewise. * gcc.c-torture/compile/vla-const-2.c: Likewise. * gcc.c-torture/execute/20010209-1.c: Likewise. * gcc.c-torture/execute/20020314-1.c: Likewise. * gcc.c-torture/execute/20020412-1.c: Likewise. * gcc.c-torture/execute/20021113-1.c: Likewise. * gcc.c-torture/execute/20040223-1.c: Likewise. * gcc.c-torture/execute/20040308-1.c: Likewise. * gcc.c-torture/execute/20040811-1.c: Likewise. * gcc.c-torture/execute/20070824-1.c: Likewise. * gcc.c-torture/execute/20070919-1.c: Likewise. * gcc.c-torture/execute/built-in-setjmp.c: Likewise. * gcc.c-torture/execute/pr22061-1.c: Likewise. * gcc.c-torture/execute/pr43220.c: Likewise. * gcc.c-torture/execute/pr82210.c: Likewise. * gcc.c-torture/execute/pr86528.c: Likewise. * gcc.c-torture/execute/vla-dealloc-1.c: Likewise. * gcc.dg/20001012-2.c: Likewise. * gcc.dg/20020415-1.c: Likewise. * gcc.dg/20030331-2.c: Likewise. * gcc.dg/20101010-1.c: Likewise. * gcc.dg/Walloca-1.c: Likewise. * gcc.dg/Walloca-10.c: Likewise. * gcc.dg/Walloca-11.c: Likewise. * gcc.dg/Walloca-12.c: Likewise. * gcc.dg/Walloca-13.c: Likewise. * gcc.dg/Walloca-14.c: Likewise. * gcc.dg/Walloca-15.c: Likewise. * gcc.dg/Walloca-2.c: Likewise. * gcc.dg/Walloca-3.c: Likewise. * gcc.dg/Walloca-4.c: Likewise. * gcc.dg/Walloca-5.c: Likewise. * gcc.dg/Walloca-6.c: Likewise. * gcc.dg/Walloca-7.c: Likewise. * gcc.dg/Walloca-8.c: Likewise. * gcc.dg/Walloca-9.c: Likewise. * gcc.dg/Walloca-larger-than-2.c: Likewise. * gcc.dg/Walloca-larger-than-3.c: Likewise. * gcc.dg/Walloca-larger-than-4.c: Likewise. * gcc.dg/Walloca-larger-than.c: Likewise. * gcc.dg/Warray-bounds-22.c: Likewise. * gcc.dg/Warray-bounds-41.c: Likewise. * gcc.dg/Warray-bounds-46.c: Likewise. * gcc.dg/Warray-bounds-48-novec.c: Likewise. * gcc.dg/Warray-bounds-48.c: Likewise. * gcc.dg/Warray-bounds-50.c: Likewise. * gcc.dg/Warray-bounds-63.c: Likewise. * gcc.dg/Warray-bounds-66.c: Likewise. * gcc.dg/Wdangling-pointer.c: Likewise. * gcc.dg/Wfree-nonheap-object-2.c: Likewise. * gcc.dg/Wfree-nonheap-object.c: Likewise. * gcc.dg/Wrestrict-17.c: Likewise. * gcc.dg/Wrestrict.c: Likewise. * gcc.dg/Wreturn-local-addr-2.c: Likewise. * gcc.dg/Wreturn-local-addr-3.c: Likewise. * gcc.dg/Wreturn-local-addr-4.c: Likewise. * gcc.dg/Wreturn-local-addr-6.c: Likewise. * gcc.dg/Wsizeof-pointer-memaccess1.c: Likewise. * gcc.dg/Wstack-usage.c: Likewise. * gcc.dg/Wstrict-aliasing-bogus-vla-1.c: Likewise. * gcc.dg/Wstrict-overflow-27.c: Likewise. * gcc.dg/Wstringop-overflow-15.c: Likewise. * gcc.dg/Wstringop-overflow-23.c: Likewise. * gcc.dg/Wstringop-overflow-25.c: Likewise. * gcc.dg/Wstringop-overflow-27.c: Likewise. * gcc.dg/Wstringop-overflow-3.c: Likewise. * gcc.dg/Wstringop-overflow-39.c: Likewise. * gcc.dg/Wstringop-overflow-56.c: Likewise. * gcc.dg/Wstringop-overflow-57.c: Likewise. * gcc.dg/Wstringop-overflow-67.c: Likewise. * gcc.dg/Wstringop-overflow-71.c: Likewise. * gcc.dg/Wstringop-truncation-3.c: Likewise. * gcc.dg/Wvla-larger-than-1.c: Likewise. * gcc.dg/Wvla-larger-than-2.c: Likewise. * gcc.dg/Wvla-larger-than-3.c: Likewise. * gcc.dg/Wvla-larger-than-4.c: Likewise. * gcc.dg/Wvla-larger-than-5.c: Likewise. * gcc.dg/analyzer/boxed-malloc-1.c: Likewise. * gcc.dg/analyzer/call-summaries-2.c: Likewise. * gcc.dg/analyzer/malloc-1.c: Likewise. * gcc.dg/analyzer/malloc-reuse.c: Likewise. * gcc.dg/analyzer/out-of-bounds-diagram-12.c: Likewise. * gcc.dg/analyzer/pr93355-localealias.c: Likewise. * gcc.dg/analyzer/putenv-1.c: Likewise. * gcc.dg/analyzer/taint-alloc-1.c: Likewise. * gcc.dg/analyzer/torture/pr93373.c: Likewise. * gcc.dg/analyzer/torture/ubsan-1.c: Likewise. * gcc.dg/analyzer/vla-1.c: Likewise. * gcc.dg/atomic/stdatomic-vm.c: Likewise. * gcc.dg/attr-alloc_size-6.c: Likewise. * gcc.dg/attr-alloc_size-7.c: Likewise. * gcc.dg/attr-alloc_size-8.c: Likewise. * gcc.dg/attr-alloc_size-9.c: Likewise. * gcc.dg/attr-noipa.c: Likewise. * gcc.dg/auto-init-uninit-36.c: Likewise. * gcc.dg/auto-init-uninit-9.c: Likewise. * gcc.dg/auto-type-1.c: Likewise. * gcc.dg/builtin-alloc-size.c: Likewise. * gcc.dg/builtin-dynamic-alloc-size.c: Likewise. * gcc.dg/builtin-dynamic-object-size-1.c: Likewise. * gcc.dg/builtin-dynamic-object-size-2.c: Likewise. * gcc.dg/builtin-dynamic-object-size-3.c: Likewise. * gcc.dg/builtin-dynamic-object-size-4.c: Likewise. * gcc.dg/builtin-object-size-1.c: Likewise. * gcc.dg/builtin-object-size-2.c: Likewise. * gcc.dg/builtin-object-size-3.c: Likewise. * gcc.dg/builtin-object-size-4.c: Likewise. * gcc.dg/builtins-64.c: Likewise. * gcc.dg/builtins-68.c: Likewise. * gcc.dg/c23-auto-2.c: Likewise. * gcc.dg/c99-const-expr-13.c: Likewise. * gcc.dg/c99-vla-1.c: Likewise. * gcc.dg/fold-alloca-1.c: Likewise. * gcc.dg/gomp/pr30494.c: Likewise. * gcc.dg/gomp/vla-2.c: Likewise. * gcc.dg/gomp/vla-3.c: Likewise. * gcc.dg/gomp/vla-4.c: Likewise. * gcc.dg/gomp/vla-5.c: Likewise. * gcc.dg/graphite/pr99085.c: Likewise. * gcc.dg/guality/guality.c: Likewise. * gcc.dg/lto/pr80778_0.c: Likewise. * gcc.dg/nested-func-10.c: Likewise. * gcc.dg/nested-func-12.c: Likewise. * gcc.dg/nested-func-13.c: Likewise. * gcc.dg/nested-func-14.c: Likewise. * gcc.dg/nested-func-15.c: Likewise. * gcc.dg/nested-func-16.c: Likewise. * gcc.dg/nested-func-17.c: Likewise. * gcc.dg/nested-func-9.c: Likewise. * gcc.dg/packed-vla.c: Likewise. * gcc.dg/pr100225.c: Likewise. * gcc.dg/pr25682.c: Likewise. * gcc.dg/pr27301.c: Likewise. * gcc.dg/pr31507-1.c: Likewise. * gcc.dg/pr33238.c: Likewise. * gcc.dg/pr41470.c: Likewise. * gcc.dg/pr49120.c: Likewise. * gcc.dg/pr50764.c: Likewise. * gcc.dg/pr51491-2.c: Likewise. * gcc.dg/pr51990-2.c: Likewise. * gcc.dg/pr51990.c: Likewise. * gcc.dg/pr59011.c: Likewise. * gcc.dg/pr59523.c: Likewise. * gcc.dg/pr61561.c: Likewise. * gcc.dg/pr78468.c: Likewise. * gcc.dg/pr78902.c: Likewise. * gcc.dg/pr79972.c: Likewise. * gcc.dg/pr82875.c: Likewise. * gcc.dg/pr83844.c: Likewise. * gcc.dg/pr84131.c: Likewise. * gcc.dg/pr87099.c: Likewise. * gcc.dg/pr87320.c: Likewise. * gcc.dg/pr89045.c: Likewise. * gcc.dg/pr91014.c: Likewise. * gcc.dg/pr93986.c: Likewise. * gcc.dg/pr98721-1.c: Likewise. * gcc.dg/pr99122-2.c: Likewise. * gcc.dg/shrink-wrap-alloca.c: Likewise. * gcc.dg/sso-14.c: Likewise. * gcc.dg/strlenopt-62.c: Likewise. * gcc.dg/strlenopt-83.c: Likewise. * gcc.dg/strlenopt-84.c: Likewise. * gcc.dg/strlenopt-91.c: Likewise. * gcc.dg/torture/Wsizeof-pointer-memaccess1.c: Likewise. * gcc.dg/torture/calleesave-sse.c: Likewise. * gcc.dg/torture/pr48953.c: Likewise. * gcc.dg/torture/pr71881.c: Likewise. * gcc.dg/torture/pr71901.c: Likewise. * gcc.dg/torture/pr78742.c: Likewise. * gcc.dg/torture/pr92088-1.c: Likewise. * gcc.dg/torture/pr92088-2.c: Likewise. * gcc.dg/torture/pr93124.c: Likewise. * gcc.dg/torture/pr94479.c: Likewise. * gcc.dg/torture/stackalign/alloca-1.c: Likewise. * gcc.dg/torture/stackalign/inline-2.c: Likewise. * gcc.dg/torture/stackalign/nested-3.c: Likewise. * gcc.dg/torture/stackalign/vararg-1.c: Likewise. * gcc.dg/torture/stackalign/vararg-2.c: Likewise. * gcc.dg/tree-ssa/20030807-2.c: Likewise. * gcc.dg/tree-ssa/20080530.c: Likewise. * gcc.dg/tree-ssa/alias-37.c: Likewise. * gcc.dg/tree-ssa/builtin-sprintf-warn-22.c: Likewise. * gcc.dg/tree-ssa/builtin-sprintf-warn-25.c: Likewise. * gcc.dg/tree-ssa/builtin-sprintf-warn-3.c: Likewise. * gcc.dg/tree-ssa/loop-interchange-15.c: Likewise. * gcc.dg/tree-ssa/pr23848-1.c: Likewise. * gcc.dg/tree-ssa/pr23848-2.c: Likewise. * gcc.dg/tree-ssa/pr23848-3.c: Likewise. * gcc.dg/tree-ssa/pr23848-4.c: Likewise. * gcc.dg/uninit-32.c: Likewise. * gcc.dg/uninit-36.c: Likewise. * gcc.dg/uninit-39.c: Likewise. * gcc.dg/uninit-41.c: Likewise. * gcc.dg/uninit-9-O0.c: Likewise. * gcc.dg/uninit-9.c: Likewise. * gcc.dg/uninit-pr100250.c: Likewise. * gcc.dg/uninit-pr101300.c: Likewise. * gcc.dg/uninit-pr101494.c: Likewise. * gcc.dg/uninit-pr98583.c: Likewise. * gcc.dg/vla-2.c: Likewise. * gcc.dg/vla-22.c: Likewise. * gcc.dg/vla-24.c: Likewise. * gcc.dg/vla-3.c: Likewise. * gcc.dg/vla-4.c: Likewise. * gcc.dg/vla-stexp-1.c: Likewise. * gcc.dg/vla-stexp-2.c: Likewise. * gcc.dg/vla-stexp-4.c: Likewise. * gcc.dg/vla-stexp-5.c: Likewise. * gcc.dg/winline-7.c: Likewise. * gcc.target/aarch64/stack-check-alloca-1.c: Likewise. * gcc.target/aarch64/stack-check-alloca-10.c: Likewise. * gcc.target/aarch64/stack-check-alloca-2.c: Likewise. * gcc.target/aarch64/stack-check-alloca-3.c: Likewise. * gcc.target/aarch64/stack-check-alloca-4.c: Likewise. * gcc.target/aarch64/stack-check-alloca-5.c: Likewise. * gcc.target/aarch64/stack-check-alloca-6.c: Likewise. * gcc.target/aarch64/stack-check-alloca-7.c: Likewise. * gcc.target/aarch64/stack-check-alloca-8.c: Likewise. * gcc.target/aarch64/stack-check-alloca-9.c: Likewise. * gcc.target/arc/interrupt-6.c: Likewise. * gcc.target/i386/pr80969-3.c: Likewise. * gcc.target/loongarch/stack-check-alloca-1.c: Likewise. * gcc.target/loongarch/stack-check-alloca-2.c: Likewise. * gcc.target/loongarch/stack-check-alloca-3.c: Likewise. * gcc.target/loongarch/stack-check-alloca-4.c: Likewise. * gcc.target/loongarch/stack-check-alloca-5.c: Likewise. * gcc.target/loongarch/stack-check-alloca-6.c: Likewise. * gcc.target/riscv/stack-check-alloca-1.c: Likewise. * gcc.target/riscv/stack-check-alloca-10.c: Likewise. * gcc.target/riscv/stack-check-alloca-2.c: Likewise. * gcc.target/riscv/stack-check-alloca-3.c: Likewise. * gcc.target/riscv/stack-check-alloca-4.c: Likewise. * gcc.target/riscv/stack-check-alloca-5.c: Likewise. * gcc.target/riscv/stack-check-alloca-6.c: Likewise. * gcc.target/riscv/stack-check-alloca-7.c: Likewise. * gcc.target/riscv/stack-check-alloca-8.c: Likewise. * gcc.target/riscv/stack-check-alloca-9.c: Likewise. * gcc.target/sparc/setjmp-1.c: Likewise. * gcc.target/x86_64/abi/ms-sysv/ms-sysv.c: Likewise. * gcc.c-torture/compile/20001221-1.c: Don't 'dg-skip-if' for '! alloca'. * gcc.c-torture/compile/20020807-1.c: Likewise. * gcc.c-torture/compile/20050801-2.c: Likewise. * gcc.c-torture/compile/920428-4.c: Likewise. * gcc.c-torture/compile/debugvlafunction-1.c: Likewise. * gcc.c-torture/compile/pr41469.c: Likewise. * gcc.c-torture/execute/920721-2.c: Likewise. * gcc.c-torture/execute/920929-1.c: Likewise. * gcc.c-torture/execute/921017-1.c: Likewise. * gcc.c-torture/execute/941202-1.c: Likewise. * gcc.c-torture/execute/align-nest.c: Likewise. * gcc.c-torture/execute/alloca-1.c: Likewise. * gcc.c-torture/execute/pr22061-4.c: Likewise. * gcc.c-torture/execute/pr36321.c: Likewise. * gcc.dg/torture/pr8081.c: Likewise. * gcc.dg/analyzer/data-model-1.c: Don't 'dg-require-effective-target alloca'. XFAIL relevant 'dg-warning's for '! alloca'. * gcc.dg/uninit-38.c: Likewise. * gcc.dg/uninit-pr98578.c: Likewise. * gcc.dg/compat/struct-by-value-22_main.c: Comment on 'dg-require-effective-target alloca'. libstdc++-v3/ * testsuite/lib/prune.exp (proc libstdc++-dg-prune): Turn 'sorry, unimplemented: dynamic stack allocation not supported' into UNSUPPORTED.
2025-02-10i386: Change RTL representation of bt[lq] [PR118623]Jakub Jelinek1-0/+23
The following testcase is miscompiled because of RTL represententation of bt{l,q} insn followed by e.g. j{c,nc} being misleading to what it actually does. Let's look e.g. at (define_insn_and_split "*jcc_bt<mode>" [(set (pc) (if_then_else (match_operator 0 "bt_comparison_operator" [(zero_extract:SWI48 (match_operand:SWI48 1 "nonimmediate_operand") (const_int 1) (match_operand:QI 2 "nonmemory_operand")) (const_int 0)]) (label_ref (match_operand 3)) (pc))) (clobber (reg:CC FLAGS_REG))] "(TARGET_USE_BT || optimize_function_for_size_p (cfun)) && (CONST_INT_P (operands[2]) ? (INTVAL (operands[2]) < GET_MODE_BITSIZE (<MODE>mode) && INTVAL (operands[2]) >= (optimize_function_for_size_p (cfun) ? 8 : 32)) : !memory_operand (operands[1], <MODE>mode)) && ix86_pre_reload_split ()" "#" "&& 1" [(set (reg:CCC FLAGS_REG) (compare:CCC (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2)) (const_int 0))) (set (pc) (if_then_else (match_op_dup 0 [(reg:CCC FLAGS_REG) (const_int 0)]) (label_ref (match_dup 3)) (pc)))] { operands[0] = shallow_copy_rtx (operands[0]); PUT_CODE (operands[0], reverse_condition (GET_CODE (operands[0]))); }) The define_insn part in RTL describes exactly what it does, jumps to op3 if bit op2 in op1 is set (for op0 NE) or not set (for op0 EQ). The problem is with what it splits into. put_condition_code %C1 for CCCmode comparisons emits c for EQ and LTU, nc for NE and GEU and ICEs otherwise. CCCmode is used mainly for carry out of add/adc, borrow out of sub/sbb, in those cases e.g. for add we have (set (reg:CCC flags) (compare:CCC (plus:M x y) x)) and use (ltu (reg:CCC flags) (const_int 0)) for carry set and (geu (reg:CCC flags) (const_int 0)) for carry not set. These cases model in RTL what is actually happening, compare in infinite precision x from the result of finite precision addition in M mode and if it is less than unsigned (i.e. overflow happened), carry is set. Another use of CCCmode is in UNSPEC_* patterns, those are used with (eq (reg:CCC flags) (const_int 0)) for carry set and ne for unset, given the UNSPEC no big deal, the middle-end doesn't know what means set or unset. But for the bt{l,q}; j{c,nc} case the above splits it into (set (reg:CCC flags) (compare:CCC (zero_extract) (const_int 0))) for bt and (set (pc) (if_then_else (eq (reg:CCC flags) (const_int 0)) (label_ref) (pc))) for the bit set case (so that the jump expands to jc) and ne for the bit not set case (so that the jump expands to jnc). Similarly for the different splitters for cmov and set{c,nc} etc. The problem is that when the middle-end reads this RTL, it feels the exact opposite to it. If zero_extract is 1, flags is set to comparison of 1 and 0 and that would mean using ne ne in the if_then_else, and vice versa. So, in order to better describe in RTL what is actually happening, one possibility would be to swap the behavior of put_condition_code and use NE + LTU -> c and EQ + GEU -> nc rather than the current EQ + LTU -> c and NE + GEU -> nc; and adjust everything. The following patch uses a more limited approach, instead of representing bt{l,q}; j{c,nc} case as written above it uses (set (reg:CCC flags) (compare:CCC (const_int 0) (zero_extract))) and (set (pc) (if_then_else (ltu (reg:CCC flags) (const_int 0)) (label_ref) (pc))) which uses the existing put_condition_code but describes what the insns actually do in RTL clearly. If zero_extract is 1, then flags are LTU, 0U < 1U, if zero_extract is 0, then flags are GEU, 0U >= 0U. The patch adjusts the *bt<mode> define_insn and all the splitters to it and its comparisons/conditional moves/setXX. 2025-02-10 Jakub Jelinek <jakub@redhat.com> PR target/118623 * config/i386/i386.md (*bt<mode>): Represent bt as compare:CCC of const0_rtx and zero_extract rather than zero_extract and const0_rtx. (*bt<SWI48:mode>_mask): Likewise. (*jcc_bt<mode>): Likewise. Use LTU and GEU as flags test instead of EQ and NE. (*jcc_bt<mode>_mask): Likewise. (*jcc_bt<SWI48:mode>_mask_1): Likewise. (Help combine recognize bt followed by cmov splitter): Likewise. (*bt<mode>_setcqi): Likewise. (*bt<mode>_setncqi): Likewise. (*bt<mode>_setnc<mode>): Likewise. (*bt<mode>_setncqi_2): Likewise. (*bt<mode>_setc<mode>_mask): Likewise. * gcc.c-torture/execute/pr118623.c: New test.
2025-02-01icf: Compare call argument types in certain cases and asm operands [PR117432]Jakub Jelinek1-0/+71
compare_operand uses operand_equal_p under the hood, which e.g. for INTEGER_CSTs will just match the values rather regardless of their types. Now, in many comparing the type is redundant, if we have x_2 = y_3 + 1; we've already compared the type for the lhs and also for rhs1, there won't be any surprises on rhs2. As noted in the PR, there are cases where the type of the operand is the sole place of information and we don't want to ICF merge functions if the types differ. One case is stdarg functions, arguments passed to ..., it is different if we pass 1, 1L, 1LL. Another case are the K&R unprototyped functions (sure, gone in C23). And yet another case are inline asm operands, "r" (1) is different from "r" (1L) from "r" (1LL). So, the following patch determines based on lack of fntype (e.g. for internal functions), or on !prototype_p, or on stdarg_p (in that case using number of named arguments) which arguments need to have type checked and does that, plus compares types on inline asm operands (maybe it would be enough to do that just for input operands but we have just a routine to handle both and I didn't feel we need to differentiate). Furthermore, I've noticed fntype{1,2} isn't actually compared if it is a direct call (gimple_call_fndecl is non-NULL). That is wrong too, we could have void (*fn) (int, long long) = (void (*) (int, long long)) foo; fn (1, 1LL); in one case and void (*fn) (long long, int) = (void (*) (long long, int)) foo; fn (1LL, 1); in another, both folded into a direct call of foo with different gimple_call_fntype. Sure, one of them would be UB at runtime (or both), but what if we ICF merge it into something that into the one UB at runtime and the program actually calls the correct one only? 2025-02-01 Jakub Jelinek <jakub@redhat.com> PR ipa/117432 * ipa-icf-gimple.cc (func_checker::compare_asm_inputs_outputs): Also return_false if operands have incompatible types. (func_checker::compare_gimple_call): Check fntype1 vs. fntype2 compatibility for all non-internal calls and assume fntype1 and fntype2 are non-NULL for those. For calls to non-prototyped calls or for stdarg_p functions after the last named argument (if any) check type compatibility of call arguments. * gcc.c-torture/execute/pr117432.c: New test. * gcc.target/i386/pr117432.c: New test.
2025-01-31testsuite: Add testcase for already fixed PR [PR117498]Jakub Jelinek1-0/+35
This wrong-code issue has been fixed with r15-7249. We still emit warnings which are questionable and perhaps we'd get better generated code if niters determined the loop has only a single iteration without UB and we'd punt on vectorizing it (or unrolling). 2025-01-31 Jakub Jelinek <jakub@redhat.com> PR middle-end/117498 * gcc.c-torture/execute/pr117498.c: New test.
2025-01-29pair-fusion: A couple of fixes for sp updates [PR118429]Richard Sandiford1-0/+7
The PR showed two issues with pair-fusion. The first is that the pass treated stack pointer deallocations as ordinary register updates, and so might move them earlier than another stack access (through a different base register) that doesn't alias the pair candidate. The simplest fix for that seems to be to prevent the stack deallocation from being moved. This part might (or might not) be a latent source of wrong code and so worth backporting in some form. (The patch as-is won't work for GCC 14.) The second issue only started with r15-6551, which added a memory write to stack allocations and deallocations. We should use the existing tombstone mechanism to preserve the associated memory definition. (Deleting definitions immediately would have quadratic complexity in the worst case.) gcc/ PR rtl-optimization/118429 * pair-fusion.cc (latest_hazard_before): Add an extra parameter to say whether the instruction is a load or a store. If the instruction is not a load or store and has memory side effects, prevent it from being moved earlier. (pair_fusion::find_trailing_add): Update call accordingly. (pair_fusion_bb_info::fuse_pair): If the trailng addition had a memory side-effect, use a tombstone to preserve it. gcc/testsuite/ PR rtl-optimization/118429 * gcc.c-torture/compile/pr118429.c: New test.
2025-01-28combine: Fix up make_extraction [PR118638]Jakub Jelinek1-0/+20
The following testcase is miscompiled at -Os on x86_64-linux. The problem is during make_compound_operation of (ashiftrt:SI (ashift:SI (mult:SI (reg:SI 107 [ a_5 ]) (const_int 3 [0x3])) (const_int 31 [0x1f])) (const_int 31 [0x1f])) where it incorrectly returns (mult:SI (sign_extract:SI (reg:SI 107 [ a_5 ]) (const_int 2 [0x2]) (const_int 0 [0])) (const_int 3 [0x3])) which isn't obviously true, the former returns either 0 or -1 depending on the least significant bit of the multiplication, the latter returns either 0 or -3 depending on the second least significant bit of the multiplication argument. The bug has been introduced in PR96998 r11-4563, which added handling of x * (2^N) similar to x << N. In the above case, pos is 0 and len is 1, sign extracting a single least significant bit of the multiplication. As 3 is not a power of 2, shift_amt is -1. But IN_RANGE (-1, 1, 1 - 1) is still true, because the basic requirement of IN_RANGE that LOWER is not greater than UPPER is violated. The intention of using 1 as LOWER is to avoid matching multiplication by 1, that really shouldn't appear in the IL. But to avoid violating IN_RANGE requirement, we need to verify that len is at least 2. I've added this len > 1 check to the inner if rather than outer because I think for GCC 16 we should add a further optimization. In the particular case of 1 least significant bit sign extraction from multiplication by 3, we could actually say it is equivalent to (sign_extract:SI (reg:SI 107 [ a_5 ]) (const_int 1 [0x2]) (const_int 0 [0])) That is because 3 is an odd number and multiplication by 2 will yield the least significant bit 0 (we are sign extracting just one) and so the multiplication doesn't change anything on the outcome. More generally, even for larger len, multiplication by C which is (1 << X) + 1 where X is >= len should be optimizable just to extraction of the multiplicand's least significant len bits. 2025-01-28 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/118638 * combine.cc (make_extraction): Only optimize (mult x 2^n) if len is larger than 1. * gcc.c-torture/execute/pr118638.c: New test.
2025-01-20aarch64: Fix invalid subregs in xorsign [PR118501]Richard Sandiford1-0/+6
In the testcase, we try to use xorsign on: (subreg:DF (reg:TI R) 8) i.e. the highpart of the TI. xorsign wants to take a V2DF paradoxical subreg of this, which is rightly rejected as a direct operation. In cases like this, we need to force the highpart into a fresh register first. gcc/ PR target/118501 * config/aarch64/aarch64.md (@xorsign<mode>3): Use force_lowpart_subreg. gcc/testsuite/ PR target/118501 * gcc.c-torture/compile/pr118501.c: New test.
2025-01-20testsuite: Fix name of PR116348 test caseXi Ruoyao1-0/+0
gcc/testsuite/ChangeLog: * gcc.c-torture/compile/pr116438.c: Rename to ... * gcc.c-torture/compile/pr116348.c: ... this.
2025-01-09s390: Add testcase for just fixed PR118362Jakub Jelinek1-0/+19
On Thu, Jan 09, 2025 at 01:29:27PM +0100, Stefan Schulze Frielinghaus wrote: > Optimization s390_constant_via_vgbm_p() should only apply to constant > vectors which can be expressed by the hardware, i.e., which have a size > of at most 16-bytes, similar as it is done for s390_constant_via_vgm_p() > and s390_constant_via_vrepi_p(). > > gcc/ChangeLog: > > PR target/118362 > * config/s390/s390.cc (s390_constant_via_vgbm_p): Allow at most > 16-byte vectors. > --- > Bootstrap and regtest are still running. If both are successful, I > will push this one promptly. This was committed without a testcase, which IMHO shouldn't hurt. 2025-01-09 Jakub Jelinek <jakub@redhat.com> PR target/118362 * gcc.c-torture/compile/pr118362.c: New test. * gcc.target/s390/pr118362.c: New test.
2025-01-09nvptx: PTX 'alloca' for '-mptx=7.3'+, '-march=sm_52'+ [PR65181]Thomas Schwinge1-0/+3
..., and use it for '-mno-soft-stack': PTX "native" stacks. PR target/65181 gcc/ * config/nvptx/nvptx.cc (nvptx_get_drap_rtx): Handle '!TARGET_SOFT_STACK'. * config/nvptx/nvptx.md (define_c_enum "unspec"): Add 'UNSPEC_STACKSAVE', 'UNSPEC_STACKRESTORE'. (define_expand "allocate_stack", define_expand "save_stack_block") (define_expand "save_stack_block"): Handle '!TARGET_SOFT_STACK', PTX 'alloca'. (define_insn "@nvptx_alloca_<mode>") (define_insn "@nvptx_stacksave_<mode>") (define_insn "@nvptx_stackrestore_<mode>"): New. * doc/invoke.texi (Nvidia PTX Options): Update '-msoft-stack', '-mno-soft-stack'. * doc/sourcebuild.texi (nvptx-specific attributes): Document 'nvptx_runtime_alloca_ptx'. (Add Options): Document 'nvptx_alloca_ptx'. gcc/testsuite/ * gcc.target/nvptx/alloca-1.c: Evolve into... * gcc.target/nvptx/alloca-1-O0.c: ... this, ... * gcc.target/nvptx/alloca-1-O1.c: ... this, and... * gcc.target/nvptx/alloca-1-sm_30.c: ... this. * gcc.target/nvptx/vla-1.c: Evolve into... * gcc.target/nvptx/vla-1-O0.c: ... this, ... * gcc.target/nvptx/vla-1-O1.c: ... this, and... * gcc.target/nvptx/vla-1-sm_30.c: ... this. * gcc.c-torture/execute/pr36321.c: Adjust. * gcc.target/nvptx/__builtin_alloca_0-1-O0.c: Likewise. * gcc.target/nvptx/__builtin_alloca_0-1-O1.c: Likewise. * gcc.target/nvptx/__builtin_stack_save___builtin_stack_restore-1.c: Likewise. * gcc.target/nvptx/softstack.c: Likewise. * gcc.target/nvptx/__builtin_stack_save___builtin_stack_restore-1-sm_30.c: New. * gcc.target/nvptx/alloca-2-O0.c: Likewise. * gcc.target/nvptx/alloca-3-O1.c: Likewise. * gcc.target/nvptx/alloca-4-O3.c: Likewise. * gcc.target/nvptx/alloca-5.c: Likewise. * lib/target-supports.exp (check_effective_target_alloca): Adjust. (check_nvptx_default_ptx_isa_target_architecture_at_least) (check_nvptx_runtime_ptx_isa_target_architecture_at_least) (check_effective_target_nvptx_runtime_alloca_ptx) (add_options_for_nvptx_alloca_ptx): New. libgomp/ * fortran.c (omp_get_device_from_uid_): Adjust. * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise.
2025-01-02Update copyright years.Jakub Jelinek5-5/+5
2024-12-30[PR testsuite/114182] Fix minor testsuite issue when double == floatJeff Law2-3/+5
This is a minor testsuite adjustment attr-complex-method-2.c selects between two scan-tree-dump clauses based on avr, !avr. But what they really should be checking is "large_double" that way it works for avr, h8, rl78 and any other target which makes doubles the same size as floats. attr-complex-method.c should be doing the same thing. After this change avr passes attr-complex-method.c and the rl78 and h8 ports will pass both tests. Other targets in my tester are unaffected. PR testsuite/114182 gcc/testsuite/ * gcc.c-torture/compile/attr-complex-method.c: Use "large_double" to select between scan outputs. * gcc.c-torture/compile/attr-complex-method-2.c: Similarly.
2024-12-25testsuite: Expand coverage for unaligned memory storesMaciej W. Rozycki1-0/+84
Expand coverage for unaligned memory stores, for the "insvmisalignM" patterns, for 2-byte, 4-byte, and 8-byte scalars, across byte alignments of 1, 2, 4 and byte misalignments within from 0 up to 7 (there's some redundancy there for the sake of simplicity of the test case), making sure all data is written and no data is changed outside the area meant to be written. The test case has turned invaluable in verifying changes to the Alpha backend, but functionality covered is generic, so I have concluded this test qualifies for generic verification and does not have to be limited to the Alpha-specific subset of the testsuite. gcc/testsuite/ * gcc.c-torture/execute/misalign.c: New file.
2024-12-25testsuite: Expand coverage for `__builtin_memset' with 0Maciej W. Rozycki1-0/+233
Expand coverage for `__builtin_memset' for the special case of clearing a block, primarily for "setmemM" block set pattern, though with smaller sizes open-coded sequences may be produced instead. This verifies block sizes in bytes from 1 to 64 across byte alignments of 1, 2, 4, 8 and byte misalignments within from 0 up to 7 (there's some redundancy there for the sake of simplicity of the test case), making sure all the intended area is cleared and no data is changed outside it. These choice of the ranges for the parameters has come from the Alpha backend, whose "setmemM" pattern has various corner cases related to base alignment and the misalignment within. The test case has turned invaluable in verifying changes to the Alpha backend, but functionality covered is generic, so I have concluded this test qualifies for generic verification and does not have to be limited to the Alpha-specific subset of the testsuite. Just as with `__builtin_memcpy' tests this code turned out to require quite a lot of time to compile, although a bit less than the former. Example compilation times with reasonably fast POWER9@2.166GHz at `-O2' optimization and GCC built at `-O2' for various targets: mips-linux-gnu: 19s vax-netbsdelf: 27s alphaev56-linux-gnu: 30s alpha-linux-gnu: 31s powerpc64le-linux-gnu: 47s With GCC built at `-O0': alphaev56-linux-gnu: 2m59s alpha-linux-gnu: 3m06s I have therefore set the timeout factor accordingly so as to take slower test hosts into account. gcc/testsuite/ * gcc.c-torture/execute/memclr.c: New file.
2024-12-14cse: Fix up record_jump_equiv checks [PR117095]Jakub Jelinek1-0/+47
The following testcase is miscompiled on s390x-linux with -O2 -march=z15. The problem happens during cse2, which sees in an extended basic block (jump_insn 217 78 216 10 (parallel [ (set (pc) (if_then_else (ne (reg:SI 165) (const_int 1 [0x1])) (label_ref 216) (pc))) (set (reg:SI 165) (plus:SI (reg:SI 165) (const_int -1 [0xffffffffffffffff]))) (clobber (scratch:SI)) (clobber (reg:CC 33 %cc)) ]) "t.c":14:17 discrim 1 2192 {doloop_si64} (int_list:REG_BR_PROB 955630228 (nil)) -> 216) ... (insn 99 98 100 12 (set (reg:SI 138) (const_int 1 [0x1])) "t.c":9:31 1507 {*movsi_zarch} (nil)) (insn 100 99 103 12 (parallel [ (set (reg:SI 137) (minus:SI (reg:SI 138) (subreg:SI (reg:HI 135 [ a ]) 0))) (clobber (reg:CC 33 %cc)) ]) "t.c":9:31 1904 {*subsi3} (expr_list:REG_DEAD (reg:SI 138) (expr_list:REG_DEAD (reg:HI 135 [ a ]) (expr_list:REG_UNUSED (reg:CC 33 %cc) (nil))))) Note, cse2 has df_note_add_problem () before df_analyze, which add (expr_list:REG_UNUSED (reg:SI 165) (expr_list:REG_UNUSED (reg:CC 33 %cc) notes to the first insn (correctly so, %cc is clobbered there and pseudo 165 isn't used after the insn). Now, cse_extended_basic_block has an extra optimization on conditional jumps, where it records equivalence on the edge which continues in the ebb. Here it sees (ne reg:SI 165) (const_int 1) is false on the edge and remembers that pseudo 165 is comparison equivalent to (const_int 1), so on insn 100 it decides to replace (reg:SI 138) with (reg:SI 165). This optimization isn't correct here though, because the JUMP_INSN has multiple sets. Before r0-77890 record_jump_equiv has been called from cse_insn guarded on n_sets == 1 && any_condjump_p (insn), so it wouldn't be done on the above JUMP_INSN where n_sets == 2. But since that change it is guarded with single_set (insn) && any_condjump_p (insn) and that is true because of the REG_UNUSED note. Looking at that note is inappropriate in CSE though, because the whole intent of the pass is to extend the lifetimes of the pseudos if equivalence is found, so the fact that there is REG_UNUSED note for (reg:SI 165) and that the reg isn't used later doesn't imply that it won't be used after the optimization. So, unless we manage to process the other sets on the JUMP_INSN (it wouldn't be terribly hard in this exact case, the doloop insn decreases the register by 1 and so we could just record equivalence to (const_int 0) instead, but generally it might be hard), we should IMHO just punt if there are multiple sets. The patch below adds !multiple_sets (insn) check instead of replacing with it the single_set (insn) check, because apparently any_condjump_p uses pc_set which supports the case where PATTERN is a SET to PC (that is a single_set (insn) && !multiple_sets (insn), PATTERN is a PARALLEL with a single SET to PC (likewise) and some CLOBBERs, PARALLEL with two or more SETs where the first one is SET to PC (that could be single_set (insn) with REG_UNUSED notes but is not !multiple_sets (insn)) or PATTERN is UNSPEC/UNSPEC_VOLATILE with SET inside of it. For the last case !multiple_sets (insn) will be true, but IMHO we shouldn't try to derive anything from those because we haven't checked the rest of the UNSPEC* and we don't really know what it does. 2024-12-13 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/117095 * cse.cc (cse_extended_basic_block): Don't call record_jump_equiv if multiple_sets (insn). * gcc.c-torture/execute/pr117095.c: New test.
2024-12-10testsuite: Mark gcc.c-torture/execute/memcpy-a?.c tests expensiveMaciej W. Rozycki4-0/+4
These tests can take several seconds per compilation to complete, taking total elapsed time measured in minutes. Mark them as expensive so as to let people skip them where they want to save on testing time. gcc/testsuite/ * gcc.c-torture/execute/memcpy-a1.c: Mark as expensive. * gcc.c-torture/execute/memcpy-a2.c: Likewise. * gcc.c-torture/execute/memcpy-a4.c: Likewise. * gcc.c-torture/execute/memcpy-a8.c: Likewise.
2024-12-05doloop: Fix up doloop df use [PR116799]Jakub Jelinek1-0/+41
The following testcases are miscompiled on s390x-linux, because the doloop_optimize /* Ensure that the new sequence doesn't clobber a register that is live at the end of the block. */ { bitmap modified = BITMAP_ALLOC (NULL); for (rtx_insn *i = doloop_seq; i != NULL; i = NEXT_INSN (i)) note_stores (i, record_reg_sets, modified); basic_block loop_end = desc->out_edge->src; bool fail = bitmap_intersect_p (df_get_live_out (loop_end), modified); check doesn't work as intended. The problem is that it uses df, but the df analysis was only done using iv_analysis_loop_init (loop); -> df_analyze_loop (loop); which computes df inside on the bbs of the loop. While loop_end bb is inside of the loop, df_get_live_out computed that way includes registers set in the loop and used at the start of the next iteration, but doesn't include registers set in the loop (or before the loop) and used after the loop. The following patch fixes that by doing whole function df_analyze first, changes the loop iteration mode from 0 to LI_ONLY_INNERMOST (on many targets which use can_use_doloop_if_innermost target hook a so are known to only handle innermost loops) or LI_FROM_INNERMOST (I think only bfin actually allows non-innermost loops) and checking not just df_get_live_out (loop_end) (that is needed for something used by the next iteration), but also df_get_live_in (desc->out_edge->dest), i.e. what will be used after the loop. df of such a bb shouldn't be affected by the df_analyze_loop and so should be from df_analyze of the whole function. 2024-12-05 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/113994 PR rtl-optimization/116799 * loop-doloop.cc: Include targhooks.h. (doloop_optimize): Also punt on intersection of modified with df_get_live_in (desc->out_edge->dest). (doloop_optimize_loops): Call df_analyze. Use LI_ONLY_INNERMOST or LI_FROM_INNERMOST instead of 0 as second loops_list argument. * gcc.c-torture/execute/pr116799.c: New test. * g++.dg/torture/pr113994.C: New test.
2024-12-03AVR: Skip some test cases that don't work for it.Georg-Johann Lay2-0/+10
gcc/testsuite/ * gcc.c-torture/execute/ieee/cdivchkd.x: New file. * gcc.c-torture/execute/ieee/cdivchkf.x: New file. * gcc.dg/flex-array-counted-by.c: Require wchar. * gcc.dg/fold-copysign-1.c [avr]: Add -mdouble=64.
2024-11-29AVR: Skip the gcc.c-torture/execute/memcpy-a*.c tests.Georg-Johann Lay4-0/+4
Skipping these tests on avr since they come up with "memory full", plus they consume a multiple of the time the rest of the testsuite takes. gcc/testsuite/ * gcc.c-torture/execute/memcpy-a1.c * gcc.c-torture/execute/memcpy-a2.c * gcc.c-torture/execute/memcpy-a4.c * gcc.c-torture/execute/memcpy-a8.c
2024-11-28gimple-fold: Avoid ICEs with bogus declarations like const attribute no ↵Jakub Jelinek1-0/+17
snprintf [PR117358] When one puts incorrect const or pure attributes on declarations of various C APIs which have corresponding builtins (vs. what they actually do), we can get tons of ICEs in gimple-fold.cc. The following patch fixes it by giving up gimple_fold_builtin_* folding if the functions don't have gimple_vdef (or for pure functions like bcmp/strchr/strstr gimple_vuse) when in SSA form (during gimplification they will surely have both of those NULL even when declared correctly, yet it is highly desirable to fold them). Or shall I replace !gimple_vdef (stmt) && gimple_in_ssa_p (cfun) tests with (gimple_call_flags (stmt) & (ECF_CONST | ECF_PURE | ECF_NOVOPS)) != 0 and !gimple_vuse (stmt) && gimple_in_ssa_p (cfun) with (gimple_call_flags (stmt) & (ECF_CONST | ECF_NOVOPS)) != 0 ? 2024-11-28 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/117358 * gimple-fold.cc (gimple_fold_builtin_memory_op): Punt if stmt has no vdef in ssa form. (gimple_fold_builtin_bcmp): Punt if stmt has no vuse in ssa form. (gimple_fold_builtin_bcopy): Punt if stmt has no vdef in ssa form. (gimple_fold_builtin_bzero): Likewise. (gimple_fold_builtin_memset): Likewise. Use return false instead of return NULL_TREE. (gimple_fold_builtin_strcpy): Punt if stmt has no vdef in ssa form. (gimple_fold_builtin_strncpy): Likewise. (gimple_fold_builtin_strchr): Punt if stmt has no vuse in ssa form. (gimple_fold_builtin_strstr): Likewise. (gimple_fold_builtin_strcat): Punt if stmt has no vdef in ssa form. (gimple_fold_builtin_strcat_chk): Likewise. (gimple_fold_builtin_strncat): Likewise. (gimple_fold_builtin_strncat_chk): Likewise. (gimple_fold_builtin_string_compare): Likewise. (gimple_fold_builtin_fputs): Likewise. (gimple_fold_builtin_memory_chk): Likewise. (gimple_fold_builtin_stxcpy_chk): Likewise. (gimple_fold_builtin_stxncpy_chk): Likewise. (gimple_fold_builtin_stpcpy): Likewise. (gimple_fold_builtin_snprintf_chk): Likewise. (gimple_fold_builtin_sprintf_chk): Likewise. (gimple_fold_builtin_sprintf): Likewise. (gimple_fold_builtin_snprintf): Likewise. (gimple_fold_builtin_fprintf): Likewise. (gimple_fold_builtin_printf): Likewise. (gimple_fold_builtin_realloc): Likewise. * gcc.c-torture/compile/pr117358.c: New test.
2024-11-25nios2: Remove all support for Nios II target.Sandra Loosemore2-5/+1
nios2 target support in GCC was deprecated in GCC 14 as the architecture has been EOL'ed by the vendor. This patch removes the entire port for GCC 15 There are still references to "nios2" in libffi and libgo. Since those libraries are imported into the gcc sources from master copies maintained by other projects, those will need to be addressed elsewhere. ChangeLog: * MAINTAINERS: Remove references to nios2. * configure.ac: Likewise. * configure: Regenerated. config/ChangeLog: * mt-nios2-elf: Deleted. contrib/ChangeLog: * config-list.mk: Remove references to Nios II. gcc/ChangeLog: * common/config/nios2/*: Delete entire directory. * config/nios2/*: Delete entire directory. * config.gcc: Remove references to nios2. * configure.ac: Likewise. * doc/extend.texi: Likewise. * doc/install.texi: Likewise. * doc/invoke.texi: Likewise. * doc/md.texi: Likewise. * regenerate-opt-urls.py: Likewise. * config.in: Regenerated. * configure: Regenerated. gcc/testsuite/ChangeLog: * g++.target/nios2/*: Delete entire directory. * gcc.target/nios2/*: Delete entire directory. * g++.dg/cpp0x/constexpr-rom.C: Remove refences to nios2. * g++.old-deja/g++.jason/thunk3.C: Likewise. * gcc.c-torture/execute/20101011-1.c: Likewise. * gcc.c-torture/execute/pr47237.c: Likewise. * gcc.dg/20020312-2.c: Likewise. * gcc.dg/20021029-1.c: Likewise. * gcc.dg/debug/btf/btf-datasec-1.c: Likewise. * gcc.dg/ifcvt-4.c: Likewise. * gcc.dg/stack-usage-1.c: Likewise. * gcc.dg/struct-by-value-1.c: Likewise. * gcc.dg/tree-ssa/reassoc-33.c: Likewise. * gcc.dg/tree-ssa/reassoc-34.c: Likewise. * gcc.dg/tree-ssa/reassoc-35.c: Likewise. * gcc.dg/tree-ssa/reassoc-36.c: Likewise. * lib/target-supports.exp: Likewise. libgcc/ChangeLog: * config/nios2/*: Delete entire directory. * config.host: Remove refences to nios2. * unwind-dw2-fde-dip.c: Likewise.
2024-11-23testsuite: Expand coverage for `__builtin_memcpy'Maciej W. Rozycki5-0/+259
Expand coverage for `__builtin_memcpy', primarily for "cpymemM" block copy pattern, although with smaller sizes open-coded sequences may be produced instead. This verifies block sizes in bytes from 1 to 64, across byte alignments of 1, 2, 4, 8 and byte misalignments within from 0 up to 7 (there's some redundancy there for the sake of simplicity of the test cases) both for the source and the destination, making sure all data is copied and no data is changed outside the area meant to be written. These choice of the ranges for the parameters has come from the Alpha backend, whose "cpymemM" pattern covers copies being made of up to 64 bytes and has various corner cases related to base alignment and the misalignment within. The test cases have turned invaluable in verifying changes to the Alpha backend, but functionality covered is generic, so I have concluded these tests qualify for generic verification and do not have to be limited to the Alpha-specific subset of the testsuite. On the implementation side the tests turned out being quite stressful to GCC and the original simpler version that just expanded all code inline took a lot of time to complete compilation. Depending on the target and compilation options elapsed times up to 40 minutes (!) have been seen, especially with GCC built at `-O0' for debugging purposes. At the cost of increased complexity where a pair of macros is required per variant rather than just one I have split the code into individual functions forced not to be inlined and it improved compilation times considerably without losing coverage. Example compilation times with reasonably fast POWER9@2.166GHz at `-O2' optimization and GCC built at `-O2' for various targets: mips-linux-gnu: 23s vax-netbsdelf: 29s alphaev56-linux-gnu: 39s alpha-linux-gnu: 43s powerpc64le-linux-gnu: 48s With GCC built at `-O0': alphaev56-linux-gnu: 3m37s alpha-linux-gnu: 3m54s I have therefore set the timeout factor accordingly so as to take slower test hosts into account. gcc/testsuite/ * gcc.c-torture/execute/memcpy-a1.c: New file. * gcc.c-torture/execute/memcpy-a2.c: New file. * gcc.c-torture/execute/memcpy-a4.c: New file. * gcc.c-torture/execute/memcpy-a8.c: New file. * gcc.c-torture/execute/memcpy-ax.h: New file.
2024-11-21c: Add u{,l,ll,imax}abs builtins [PR117024]Jakub Jelinek10-0/+327
The following patch adds u{,l,ll,imax}abs builtins, which just fold to ABSU_EXPR, similarly to how {,l,ll,imax}abs builtins fold to ABS_EXPR. 2024-11-21 Jakub Jelinek <jakub@redhat.com> PR c/117024 gcc/ * coretypes.h (enum function_class): Add function_c2y_misc enumerator. * builtin-types.def (BT_FN_UINTMAX_INTMAX, BT_FN_ULONG_LONG, BT_FN_ULONGLONG_LONGLONG): New DEF_FUNCTION_TYPE_1s. * builtins.def (DEF_C2Y_BUILTIN): Define. (BUILT_IN_UABS, BUILT_IN_UIMAXABS, BUILT_IN_ULABS, BUILT_IN_ULLABS): New builtins. * builtins.cc (fold_builtin_abs): Handle also folding of u*abs to ABSU_EXPR. (fold_builtin_1): Handle BUILT_IN_U{,L,LL,IMAX}ABS. gcc/lto/ChangeLog: * lto-lang.cc (flag_isoc2y): New variable. gcc/ada/ChangeLog: * gcc-interface/utils.cc (flag_isoc2y): New variable. gcc/testsuite/ * gcc.c-torture/execute/builtins/lib/abs.c (uintmax_t): New typedef. (uabs, ulabs, ullabs, uimaxabs): New functions. * gcc.c-torture/execute/builtins/uabs-1.c: New test. * gcc.c-torture/execute/builtins/uabs-1.x: New file. * gcc.c-torture/execute/builtins/uabs-1-lib.c: New file. * gcc.c-torture/execute/builtins/uabs-2.c: New test. * gcc.c-torture/execute/builtins/uabs-2.x: New file. * gcc.c-torture/execute/builtins/uabs-2-lib.c: New file. * gcc.c-torture/execute/builtins/uabs-3.c: New test. * gcc.c-torture/execute/builtins/uabs-3.x: New test. * gcc.c-torture/execute/builtins/uabs-3-lib.c: New test.
2024-11-05PR target/117449: Restrict vector rotate match and split to pre-reloadKyrylo Tkachov1-0/+8
The vector rotate splitter has some logic to deal with post-reload splitting but not all cases in aarch64_emit_opt_vec_rotate are post-reload-safe. In particular the ROTATE+XOR expansion for TARGET_SHA3 can create RTL that can later be simplified to a simple ROTATE post-reload, which would then match the insn again and try to split it. So do a clean split pre-reload and avoid going down this path post-reload by restricting the insn_and_split to can_create_pseudo_p (). Bootstrapped and tested on aarch64-none-linux. Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com> gcc/ PR target/117449 * config/aarch64/aarch64-simd.md (*aarch64_simd_rotate_imm<mode>): Match only when can_create_pseudo_p (). * config/aarch64/aarch64.cc (aarch64_emit_opt_vec_rotate): Assume can_create_pseudo_p (). gcc/testsuite/ PR target/117449 * gcc.c-torture/compile/pr117449.c: New test.
2024-11-02testsuite: Fix up builtin-prefetch-1.c testsXi Ruoyao1-1/+1
How can you use "read-shared" as an identifier? It's not allowed by all C standard versions. gcc/testsuite/ChangeLog: * gcc.c-torture/execute/builtin-prefetch-1.c (rws): Use "read_shared" instead of "read-shared" as the identifier for enum value. * gcc.dg/builtin-prefetch-1.c (rws): Likewise.
2024-11-01Support Intel MOVRSHu, Lin11-1/+2
gcc/ChangeLog: * builtins.cc (expand_builtin_prefetch): Expand for prefetchrst2. * common/config/i386/cpuinfo.h (get_available_features): Detect movrs. * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_MOVRS_SET): New. (OPTION_MASK_ISA2_MOVRS_UNSET): Ditto. (ix86_handle_option): Handle -mmovrs. * common/config/i386/i386-cpuinfo.h (enum processor_features): Add FEATURE_MOVRS. * common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for movrs. * config.gcc: Add movrsintrin.h * config/i386/cpuid.h (bit_MOVRS): New. * config/i386/i386-builtin-types.def: Add DEF_FUNCTION_TYPE (CHAR, PCCHAR), (SHORT, PCSHORT), (INT, PCINT), (INT64, PCINT64). * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-c.cc (ix86_target_macros_internal): Add __MOVRS__. * config/i386/i386-expand.cc (ix86_expand_special_args_builtin): Define __MOVRS__. * config/i386/i386-isa.def (MOVRS): Add DEF_PTA(MOVRS) * config/i386/i386-options.cc (ix86_valid_target_attribute_inner_p): Handle movrs. * config/i386/i386.md (movrs<mode>): New. * config/i386/i386.opt: Add option -mmovrs. * config/i386/i386.opt.urls: Regenerated. * config/i386/immintrin.h: Include movrsintrin.h * config/i386/sse.md (unspecv): Add UNSPEC_VMOVRS. (VI1248_AVX10_2): New. (avx10_2_movrs_vmovrs<ssemodesuffix><mode><mask_name>): New define_insn. * config/i386/xmmintrin.h: Add prefetchrst2. * doc/extend.texi: Document movrs. * doc/invoke.texi: Document -mmovrs. * doc/rtl.texi: Document extension of prefetchrst2. * doc/sourcebuild.texi: Document target movrs. * config/i386/movrsintrin.h: New. gcc/testsuite/ChangeLog: * g++.dg/other/i386-2.C: Add -mmovrs. * g++.dg/other/i386-3.C: Ditto. * gcc.c-torture/execute/builtin-prefetch-1.c: Expand rws. * gcc.dg/builtin-prefetch-1.c: Ditto. * gcc.target/i386/avx-1.c: Ditto. * gcc.target/i386/avx-2.c: Ditto. * gcc.target/i386/funcspec-56.inc: Add new target attribute. * gcc.target/i386/sse-12.c: Add -mmovrs. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add movrs. * gcc.target/i386/sse-23.c: Ditto * gcc.target/i386/avx10_2-512-movrs-1.c: New test. * gcc.target/i386/avx10_2-movrs-1.c: Ditto. * gcc.target/i386/movrs-1.c: Ditto. Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
2024-10-29Fix miscompilation of function containing __builtin_unreachableEric Botcazou1-0/+23
This is a wrong-code generation on the SPARC for a function containing a call to __builtin_unreachable caused by the delay slot scheduling pass, and more specifically the find_end_label function which has these lines: /* Otherwise, see if there is a label at the end of the function. If there is, it must be that RETURN insns aren't needed, so that is our return label and we don't have to do anything else. */ The comment was correct 20 years ago but no longer is nowadays in the presence of RTL epilogues and calls to __builtin_unreachable, so the patch just removes the associated two lines of code: else if (LABEL_P (insn)) *plabel = as_a <rtx_code_label *> (insn); and otherwise contains just adjustments to the commentary. gcc/ PR rtl-optimization/117327 * reorg.cc (find_end_label): Do not return a dangling label at the end of the function and adjust commentary. gcc/testsuite/ * gcc.c-torture/execute/20241029-1.c: New test.
2024-10-16testsuite: Prepare for -std=gnu23 defaultJoseph Myers37-7/+52
Now that C23 support is essentially feature-complete, I'd like to switch the default language version for C compilation to -std=gnu23. This requires updating a large number of testcases that fail with the new language version if left unchanged. In this patch, update most of the tests for which there is a safe change that works both before and after the update to default language version - typically adding the option -std=gnu17 or -Wno-old-style-definition to the tests. (There are also a few tests where I'd like to investigate further why they fail with -std=gnu23, or where I think such failures show an actual bug to fix before changing the default language version, or where it seems more appropriate to make a testcase change that would result in failures in the absence of the language version change rather than just adding an option that does nothing with the gnu17 default.) The libffi test fixes have also been submitted upstream: <https://github.com/libffi/libffi/pull/861>. Most of the failures requiring such changes are for one of two reasons: * Unprototyped function declarations with () (meaning the same as (void) in C23 mode) for a function then called with arguments. * Old-style function definitions, which warn by default in C23 mode, so resulting in test failures for the unexpected warnings. Other reasons for failures include: * Tests with their own definitions of bool, true and false. * Tests of diagnostics (often with -pedantic) in cases where C23 has changed semantics, such as: - tag compatibility for structs; - enum values out of range of int; - handing of qualified array types; - decimal floating types formerly needing -pedantic diagnostics, but being standard in C23. Bootstrapped with no regressions for x86_64-pc-linux-gnu. gcc/testsuite/ * c-c++-common/Wcast-function-type.c: Add -std=gnu17 for C. * c-c++-common/Wformat-pr84258.c: Add -std=gnu17 for C. * c-c++-common/Wvarargs.c: Add -std=gnu17 for C. * c-c++-common/analyzer/data-model-12.c: Add -std=gnu17 for C. * c-c++-common/builtins.c: Add -std=gnu17 for C. * c-c++-common/pointer-to-fn1.c: Add -std=gnu17 for C. * c-c++-common/pragma-diag-17.c: Add -std=gnu17 for C. * c-c++-common/sizeof-array-argument.c: Add -Wno-old-style-definition for C. * g++.dg/lto/pr54625-1_0.c: Add -std=gnu17. * g++.dg/lto/pr54625-2_0.c: Add -std=gnu17. * gcc.c-torture/compile/20040214-2.c: Add -std=gnu17. * gcc.c-torture/compile/921011-2.c: Add -std=gnu17. * gcc.c-torture/compile/931102-1.c: Add -std=gnu17. * gcc.c-torture/compile/990801-1.c: Add -std=gnu17. * gcc.c-torture/compile/nested-1.c: Add -std=gnu17. * gcc.c-torture/compile/pr100241-1.c: Add -std=gnu17. * gcc.c-torture/compile/pr106101.c: Add -std=gnu17. * gcc.c-torture/compile/pr113616.c: Add -std=gnu17. * gcc.c-torture/compile/pr47967.c: Add -std=gnu17. * gcc.c-torture/compile/pr51694.c: Add -std=gnu17. * gcc.c-torture/compile/pr71109.c: Add -std=gnu17. * gcc.c-torture/compile/pr83051-2.c: Add -std=gnu17. * gcc.c-torture/compile/pr89663-1.c: Add -std=gnu17. * gcc.c-torture/compile/pr94238.c: Add -std=gnu17. * gcc.c-torture/compile/pr96796.c: Add -std=gnu17. * gcc.c-torture/compile/pr97576.c: Add -std=gnu17. * gcc.c-torture/compile/udivmod4.c: Add -std=gnu17. * gcc.c-torture/execute/20010605-2.c: Add -std=gnu17. * gcc.c-torture/execute/20020404-1.c: Add -std=gnu17. * gcc.c-torture/execute/20030714-1.c: Add -std=gnu17. * gcc.c-torture/execute/20051012-1.c: Add -std=gnu17. * gcc.c-torture/execute/20190820-1.c: Add -std=gnu17. * gcc.c-torture/execute/920612-1.c: Add -Wno-old-style-definition. * gcc.c-torture/execute/930608-1.c: Add -std=gnu17. * gcc.c-torture/execute/comp-goto-1.c: Add -std=gnu17. * gcc.c-torture/execute/ieee/fp-cmp-1.x: Add -std=gnu17. * gcc.c-torture/execute/ieee/fp-cmp-2.x: Add -std=gnu17. * gcc.c-torture/execute/ieee/fp-cmp-3.x: Add -std=gnu17. * gcc.c-torture/execute/ieee/fp-cmp-4.x: New file. * gcc.c-torture/execute/ieee/fp-cmp-4f.x: New file. * gcc.c-torture/execute/ieee/fp-cmp-4l.x: New file. * gcc.c-torture/execute/loop-9.c: Add -std=gnu17. * gcc.c-torture/execute/pr103209.c: Add -std=gnu17. * gcc.c-torture/execute/pr28289.c: Add -std=gnu17. * gcc.c-torture/execute/pr34982.c: Add -std=gnu17. * gcc.c-torture/execute/pr67037.c: Add -std=gnu17. * gcc.c-torture/execute/va-arg-2.c: Add -std=gnu17. * gcc.dg/20010202-1.c: Add -std=gnu17. * gcc.dg/20020430-1.c: Add -std=gnu17. * gcc.dg/20031218-3.c: Add -std=gnu17. * gcc.dg/20040127-1.c: Add -std=gnu17. * gcc.dg/20041014-1.c: Add -Wno-old-style-definition. * gcc.dg/20041122-1.c: Add -std=gnu17. * gcc.dg/20050309-1.c: Add -std=gnu17. * gcc.dg/20061026.c: Add -std=gnu17. * gcc.dg/20101010-1.c: Add -std=gnu17. * gcc.dg/Warray-parameter-10.c: Add -std=gnu17. * gcc.dg/Wbuiltin-declaration-mismatch-2.c: Add -std=gnu17. * gcc.dg/Wbuiltin-declaration-mismatch-3.c: Add -std=gnu17. * gcc.dg/Wbuiltin-declaration-mismatch-4.c: Add -std=gnu17. * gcc.dg/Wbuiltin-declaration-mismatch-5.c: Add -std=gnu17. * gcc.dg/Wbuiltin-declaration-mismatch.c: Add -std=gnu17. * gcc.dg/Wcxx-compat-2.c: Add -std=gnu17. * gcc.dg/Wdouble-promotion.c: Add -std=gnu17. * gcc.dg/Wfree-nonheap-object-7.c: Add -std=gnu17. * gcc.dg/Wimplicit-int-1.c: Add -std=gnu17. * gcc.dg/Wimplicit-int-1a.c: Add -std=gnu17. * gcc.dg/Wimplicit-int-2.c: Add -std=gnu17. * gcc.dg/Wimplicit-int-3.c: Add -std=gnu17. * gcc.dg/Wimplicit-int-4.c: Add -std=gnu17. * gcc.dg/Wimplicit-int-4a.c: Add -std=gnu17. * gcc.dg/Wincompatible-pointer-types-1.c: Add -std=gnu17. * gcc.dg/Wrestrict-19.c: Add -std=gnu17. * gcc.dg/Wrestrict-4.c: Add -std=gnu17. * gcc.dg/Wrestrict-5.c: Add -std=gnu17. * gcc.dg/Wstrict-overflow-20.c: Add -std=gnu17. * gcc.dg/Wstringop-overflow-13.c: Add -std=gnu17. * gcc.dg/analyzer/doom-d_main-IdentifyVersion.c: Add -std=gnu17. * gcc.dg/analyzer/doom-s_sound-pr108867.c: Add -std=gnu17. * gcc.dg/analyzer/pr93032-mztools-signed-char.c: Add -Wno-old-style-definition. * gcc.dg/analyzer/pr93032-mztools-unsigned-char.c: Add -Wno-old-style-definition. * gcc.dg/analyzer/pr93355-localealias.c: Add -Wno-old-style-definition. * gcc.dg/analyzer/pr93375.c: Add -std=gnu17. * gcc.dg/analyzer/pr94688.c: Add -std=gnu17. * gcc.dg/analyzer/sensitive-1.c: Add -std=gnu17. * gcc.dg/analyzer/torture/asm-x86-linux-wfx_get_ps_timeout-full.c: Add -std=gnu17. * gcc.dg/analyzer/torture/pr104863.c: Add -std=gnu17. * gcc.dg/analyzer/torture/pr93379.c: Add -std=gnu17. * gcc.dg/array-quals-2.c: Add -std=gnu17. * gcc.dg/attr-invalid.c: Add -Wno-old-style-definition. * gcc.dg/auto-init-uninit-A.c: Add -Wno-old-style-definition. * gcc.dg/builtin-choose-expr.c: Declare exit with (int) prototype. * gcc.dg/builtin-tgmath-err-1.c: Add -std=gnu17. * gcc.dg/builtins-30.c: Add -std=gnu17. * gcc.dg/cast-function-1.c: Add -std=gnu17. * gcc.dg/cleanup-1.c: Add -std=gnu17. * gcc.dg/compat/struct-complex-1_x.c: Add -std=gnu17. * gcc.dg/compat/struct-complex-2_x.c: Add -std=gnu17. * gcc.dg/compat/union-m128-1_x.c: Add -std=gnu17. * gcc.dg/debug/dwarf2/pr66482.c: Add -std=gnu17. * gcc.dg/dfp/composite-type-2.c: Add -std=gnu17. * gcc.dg/dfp/composite-type.c: Add -std=gnu17. * gcc.dg/dfp/keywords-pedantic.c: Add -std=gnu17. * gcc.dg/dremf-type-compat-1.c: Add -std=gnu17. * gcc.dg/dremf-type-compat-2.c: Add -std=gnu17. * gcc.dg/dremf-type-compat-3.c: Add -std=gnu17. * gcc.dg/dremf-type-compat-4.c: Add -std=gnu17. * gcc.dg/enum-compat-1.c: Add -std=gnu17. * gcc.dg/enum-compat-2.c: Add -std=gnu17. * gcc.dg/floatn-errs.c: Add -std=gnu17. * gcc.dg/fltconst-pedantic-dfp.c: Add -std=gnu17. * gcc.dg/format/proto.c: Add -std=gnu17. * gcc.dg/format/sentinel-1.c: Add -std=gnu17. * gcc.dg/gomp/declare-simd-1.c: Add -Wno-old-style-definition. * gcc.dg/ifelse-1.c: Add -Wno-old-style-definition. * gcc.dg/inline-33.c: Add -std=gnu17. * gcc.dg/ipa/inline-5.c: Add -std=gnu17. * gcc.dg/ipa/ipa-sra-21.c: Add -std=gnu17. * gcc.dg/ipa/pr102714.c: Add -std=gnu17. * gcc.dg/ipa/pr104813.c: Add -std=gnu17. * gcc.dg/ipa/pr108679.c: Add -std=gnu17. * gcc.dg/ipa/pr42706.c: Add -std=gnu17. * gcc.dg/ipa/pr88214.c: Add -Wno-old-style-definition. * gcc.dg/ipa/pr91853.c: Add -Wno-old-style-definition. * gcc.dg/ipa/pr93763.c: Add -std=gnu17. * gcc.dg/ipa/pr96482-2.c: Add -std=gnu17. * gcc.dg/lto/20091013-1_2.c: Add -std=gnu17. * gcc.dg/lto/20091015-1_2.c: Add -std=gnu17. * gcc.dg/lto/pr113197_1.c: Add -std=gnu17. * gcc.dg/lto/pr54702_1.c: Add -std=gnu17. * gcc.dg/lto/pr99849_0.c: Add -std=gnu17. * gcc.dg/noncompile/920923-1.c: Add -std=gnu17. * gcc.dg/noncompile/old-style-parm-1.c: Add -Wno-old-style-definition. * gcc.dg/noncompile/old-style-parm-3.c: Add -Wno-old-style-definition. * gcc.dg/noncompile/pr30552-2.c: Add -Wno-old-style-definition. * gcc.dg/noncompile/pr30552-3.c: Add -std=gnu17. * gcc.dg/noncompile/pr71265.c: Add -Wno-old-style-definition. * gcc.dg/noncompile/pr79758-2.c: Add -Wno-old-style-definition. * gcc.dg/noncompile/pr79758.c: Add -Wno-old-style-definition. * gcc.dg/noncompile/va-arg-1.c: Add -std=gnu17. * gcc.dg/old-style-prom-1.c: Add -std=gnu17. * gcc.dg/old-style-prom-2.c: Add -std=gnu17. * gcc.dg/old-style-prom-3.c: Add -std=gnu17. * gcc.dg/old-style-then-proto-1.c: Add -std=gnu17. * gcc.dg/parm-incomplete-1.c: Add -std=gnu17. * gcc.dg/parm-mismatch-1.c: Add -std=gnu17. * gcc.dg/permerror-default.c: Add -std=gnu17. * gcc.dg/permerror-fpermissive-nowarning.c: Add -std=gnu17. * gcc.dg/permerror-fpermissive.c: Add -std=gnu17. * gcc.dg/permerror-noerror.c: Add -std=gnu17. * gcc.dg/permerror-nowarning.c: Add -std=gnu17. * gcc.dg/permerror-pedantic.c: Add -std=gnu17. * gcc.dg/plugin/infoleak-net-ethtool-ioctl.c: Add -std=gnu17. * gcc.dg/pointer-array-quals-1.c: Add -std=gnu17. * gcc.dg/pointer-array-quals-2.c: Add -std=gnu17. * gcc.dg/pr100791.c: Add -std=gnu17. * gcc.dg/pr100843.c: Add -std=gnu17. * gcc.dg/pr102273.c: Add -std=gnu17. * gcc.dg/pr102385.c: Add -std=gnu17. * gcc.dg/pr103222.c: Add -std=gnu17. * gcc.dg/pr105140.c: Add -std=gnu17. * gcc.dg/pr105150.c: Add -std=gnu17. * gcc.dg/pr105250.c: Add -std=gnu17. * gcc.dg/pr105972.c: Add -Wno-old-style-definition. * gcc.dg/pr111039.c: Add -std=gnu17. * gcc.dg/pr111407.c: Add -std=gnu17. * gcc.dg/pr111922.c: Add -Wno-old-style-definition. * gcc.dg/pr15236.c: Add -std=gnu17. * gcc.dg/pr17188-1.c: Add -std=gnu17. * gcc.dg/pr20368-1.c: Add -std=gnu17. * gcc.dg/pr20368-2.c: Add -std=gnu17. * gcc.dg/pr20368-3.c: Add -std=gnu17. * gcc.dg/pr27331.c: Add -Wno-old-style-definition. * gcc.dg/pr27861-1.c: Add -std=gnu17. * gcc.dg/pr28121.c: Add -std=gnu17. * gcc.dg/pr28243.c: Add -std=gnu17. * gcc.dg/pr28888.c: Add -std=gnu17. * gcc.dg/pr29254.c: Add -std=gnu17. * gcc.dg/pr34457-1.c: Add -std=gnu17. * gcc.dg/pr36015.c: Add -std=gnu17. * gcc.dg/pr38245-3.c: Add -std=gnu17. * gcc.dg/pr38245-4.c: Add -std=gnu17. * gcc.dg/pr41241.c: Add -std=gnu17. * gcc.dg/pr43058.c: Add -std=gnu17. * gcc.dg/pr44539.c: Add -std=gnu17. * gcc.dg/pr45055.c: Add -std=gnu17. * gcc.dg/pr50908.c: Add -Wno-old-style-definition. * gcc.dg/pr60647-1.c: Add -Wno-old-style-definition. * gcc.dg/pr63762.c: Add -std=gnu17. * gcc.dg/pr63804.c: Add -std=gnu17. * gcc.dg/pr68306-3.c: Add -std=gnu17. * gcc.dg/pr68533.c: Add -std=gnu17. * gcc.dg/pr69156.c: Add -std=gnu17. * gcc.dg/pr7356-2.c: Add -Wno-old-style-definition. * gcc.dg/pr79983.c: Add -std=gnu17. * gcc.dg/pr83463.c: Add -std=gnu17. * gcc.dg/pr87347.c: Add -std=gnu17. * gcc.dg/pr89521-1.c: Add -std=gnu17. * gcc.dg/pr89521-2.c: Add -std=gnu17. * gcc.dg/pr90648.c: Add -std=gnu17. * gcc.dg/pr93573-1.c: Add -std=gnu17. * gcc.dg/pr94167.c: Add -std=gnu17. * gcc.dg/pr94705.c: Add -std=gnu17. * gcc.dg/pr95118.c: Add -std=gnu17. * gcc.dg/pr96335.c: Add -std=gnu17. * gcc.dg/pr97830.c: Add -std=gnu17. * gcc.dg/pr97882.c: Add -std=gnu17. * gcc.dg/pr99122-2.c: Add -std=gnu17. * gcc.dg/pr99122-3.c: Add -std=gnu17. * gcc.dg/qual-component-1.c: Add -std=gnu17. * gcc.dg/sibcall-6.c: Add -Wno-old-style-definition. * gcc.dg/sms-2.c: Add -Wno-old-style-definition. * gcc.dg/tm/20091221.c: Add -std=gnu17. * gcc.dg/torture/bfloat16-basic.c: Add -Wno-old-style-definition. * gcc.dg/torture/float128-basic.c: Add -Wno-old-style-definition. * gcc.dg/torture/float128x-basic.c: Add -Wno-old-style-definition. * gcc.dg/torture/float16-basic.c: Add -Wno-old-style-definition. * gcc.dg/torture/float32-basic.c: Add -Wno-old-style-definition. * gcc.dg/torture/float32x-basic.c: Add -Wno-old-style-definition. * gcc.dg/torture/float64-basic.c: Add -Wno-old-style-definition. * gcc.dg/torture/float64x-basic.c: Add -Wno-old-style-definition. * gcc.dg/torture/pr102762.c: Add -std=gnu17. * gcc.dg/torture/pr103987.c: Add -std=gnu17. * gcc.dg/torture/pr104825.c: Add -Wno-old-style-definition. * gcc.dg/torture/pr105166.c: Add -std=gnu17. * gcc.dg/torture/pr105185.c: Add -Wno-old-style-definition. * gcc.dg/torture/pr109652.c: Add -std=gnu17. * gcc.dg/torture/pr112444.c: Add -std=gnu17. * gcc.dg/torture/pr113895-3.c: Add -std=gnu17. * gcc.dg/torture/pr24626-2.c: Add -std=gnu17. * gcc.dg/torture/pr25183.c: Add -std=gnu17. * gcc.dg/torture/pr38948.c: Add -std=gnu17. * gcc.dg/torture/pr44807.c: Add -std=gnu17. * gcc.dg/torture/pr47281.c: Add -std=gnu17. * gcc.dg/torture/pr47958-1.c: Add -Wno-old-style-definition. * gcc.dg/torture/pr48063.c: Add -std=gnu17. * gcc.dg/torture/pr57036-1.c: Add -std=gnu17. * gcc.dg/torture/pr57330.c: Add -std=gnu17. * gcc.dg/torture/pr57584.c: Add -std=gnu17. * gcc.dg/torture/pr67741.c: Add -std=gnu17. * gcc.dg/torture/pr68104.c: Add -std=gnu17. * gcc.dg/torture/pr69242.c: Add -std=gnu17. * gcc.dg/torture/pr70457.c: Add -std=gnu17. * gcc.dg/torture/pr70985.c: Add -std=gnu17. * gcc.dg/torture/pr71606.c: Add -std=gnu17. * gcc.dg/torture/pr71816.c: Add -std=gnu17. * gcc.dg/torture/pr77286.c: Add -std=gnu17. * gcc.dg/torture/pr77646.c: Add -std=gnu17. * gcc.dg/torture/pr77677-2.c: Add -std=gnu17. * gcc.dg/torture/pr78365.c: Add -Wno-old-style-definition. * gcc.dg/torture/pr79732.c: Add -std=gnu17. * gcc.dg/torture/pr80612.c: Add -std=gnu17. * gcc.dg/torture/pr80764.c: Add -std=gnu17. * gcc.dg/torture/pr80842.c: Add -std=gnu17. * gcc.dg/torture/pr81900.c: Add -std=gnu17. * gcc.dg/torture/pr82276.c: Add -std=gnu17. * gcc.dg/torture/pr84803.c: Add -std=gnu17. * gcc.dg/torture/pr93124.c: Add -std=gnu17. * gcc.dg/torture/pr97330-1.c: Add -Wno-old-style-definition. * gcc.dg/tree-prof/comp-goto-1.c: Add -std=gnu17. * gcc.dg/tree-ssa/20030703-2.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030708-1.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030709-2.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030709-3.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030710-1.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030711-1.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030711-2.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030711-3.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030714-1.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030714-2.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030728-1.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030807-10.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030807-11.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030807-3.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030807-6.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030807-7.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030814-4.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030814-5.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030814-6.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20030918-1.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/20040514-2.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/loadpre7.c: Add -Wno-old-style-definition. * gcc.dg/tree-ssa/pr111003.c: Add -std=gnu17. * gcc.dg/tree-ssa/pr115128.c: Add -std=gnu17. * gcc.dg/tree-ssa/pr115191.c: Add -std=gnu17. * gcc.dg/tree-ssa/pr24840.c: Add -std=gnu17. * gcc.dg/tree-ssa/pr69666.c: Add -std=gnu17. * gcc.dg/tree-ssa/pr70232.c: Add -std=gnu17. * gcc.dg/ubsan/pr79757-1.c: Add -Wno-old-style-definition. * gcc.dg/ubsan/pr79757-2.c: Add -Wno-old-style-definition. * gcc.dg/ubsan/pr79757-3.c: Add -Wno-old-style-definition. * gcc.dg/ubsan/pr81223.c: Add -std=gnu17. * gcc.dg/uninit-10-O0.c: Add -Wno-old-style-definition. * gcc.dg/uninit-10.c: Add -Wno-old-style-definition. * gcc.dg/uninit-32.c: Add -std=gnu17. * gcc.dg/uninit-41.c: Add -std=gnu17. * gcc.dg/uninit-A-O0.c: Add -Wno-old-style-definition. * gcc.dg/uninit-A.c: Add -Wno-old-style-definition. * gcc.dg/unused-1.c: Add -Wno-old-style-definition. * gcc.dg/vect/bb-slp-pr114249.c: Add -std=gnu17. * gcc.dg/vect/bb-slp-pr97486.c: Add -std=gnu17. * gcc.dg/vect/bb-slp-subgroups-1.c: Add -std=gnu17. * gcc.dg/vect/bb-slp-subgroups-2.c: Add -std=gnu17. * gcc.dg/vect/bb-slp-subgroups-3.c: Add -std=gnu17. * gcc.dg/vect/vect-early-break_111-pr113731.c: Add -std=gnu17. * gcc.dg/vect/vect-early-break_122-pr114239.c: Add -std=gnu17. * gcc.dg/vect/vect-multi-peel-gaps.c: Add -std=gnu17. * gcc.dg/vla-stexp-2.c: Add -std=gnu17. * gcc.dg/warn-1.c: Add -Wno-old-style-definition. * gcc.dg/winline-10.c: Add -Wno-old-style-definition. * gcc.dg/wtr-label-1.c: Add -Wno-old-style-definition. * gcc.dg/wtr-switch-1.c: Add -Wno-old-style-definition. * gcc.target/i386/excess-precision-3.c: Add -Wno-old-style-definition. * gcc.target/i386/fma4-256-nmsubXX.c: Add -std=gnu17. * gcc.target/i386/fma4-nmsubXX.c: Add -std=gnu17. * gcc.target/i386/nop-mcount.c: Add -Wno-old-style-definition. * gcc.target/i386/pr102627.c: Add -std=gnu17. * gcc.target/i386/pr106994.c: Add -std=gnu17. * gcc.target/i386/pr68349.c: Add -std=gnu17. * gcc.target/i386/pr97313.c: Add -std=gnu17. * gcc.target/i386/pr99454.c: Add -std=gnu17. * gcc.target/i386/record-mcount.c: Add -Wno-old-style-definition. libffi/ * testsuite/libffi.call/va_struct2.c (test_fn): Cast n to void. * testsuite/libffi.call/va_struct3.c (test_fn): Likewise. Backported from <https://github.com/libffi/libffi/pull/861>.
2024-10-07nvptx: Re-enable all variants of 'gcc.c-torture/execute/20020529-1.c'Thomas Schwinge1-4/+0
Generally PASSes with: $ ptxas --version ptxas: NVIDIA (R) Ptx optimizing assembler Copyright (c) 2005-2018 NVIDIA Corporation Built on Sun_Sep__9_21:06:46_CDT_2018 Cuda compilation tools, release 10.0, V10.0.145 ..., and execution with 'Driver Version: 361.93.02'. Only the '-O1' execution test FAILs (pre-existing; to be analyzed later): nvptx-run: error getting kernel result: an illegal memory access was encountered (CUDA_ERROR_ILLEGAL_ADDRESS, 700) gcc/testsuite/ * gcc.c-torture/execute/20020529-1.c: Re-enable all variants for nvptx.
2024-10-07nvptx: Disable effective-target 'freestanding'Thomas Schwinge4-0/+4
After 2014's commit 157e859ffe3b5d43db1e19475711c1a3d21ab57a "remove picochip", the effective-target 'freestanding' (later) was only ever used for nvptx. However, the relevant I/O library functions have long been implemented in nvptx newlib. These test cases generally PASS, just a few need to get XFAILed; see <https://docs.nvidia.com/cuda/ptx-writers-guide-to-interoperability/#system-calls>, and then supposedly <https://docs.nvidia.com/cuda/cuda-c-programming-guide/#formatted-output> for description of the non-standard PTX 'vprintf' return value: > Unlike the C-standard 'printf()', which returns the number of characters > printed, CUDA's 'printf()' returns the number of arguments parsed. If no > arguments follow the format string, 0 is returned. If the format string is > NULL, -1 is returned. If an internal error occurs, -2 is returned. (I've tried a few variants to confirm that PTX 'vprintf' -- which supposedly is underlying the CUDA 'printf' -- is what's implementing this behavior.) Probably, we ought to fix that up in nvptx newlib. gcc/testsuite/ * gcc.c-torture/execute/printf-1.c: XFAIL for nvptx. * gcc.c-torture/execute/printf-chk-1.c: Likewise. * gcc.c-torture/execute/vprintf-1.c: Likewise. * gcc.c-torture/execute/vprintf-chk-1.c: Likewise. * lib/target-supports.exp (check_effective_target_freestanding): Disable for nvptx.
2024-10-07nvptx: Re-enable "ptxas times out" test casesThomas Schwinge8-10/+5
These are all quick to compile and generally PASS with: $ ptxas --version ptxas: NVIDIA (R) Ptx optimizing assembler Copyright (c) 2005-2018 NVIDIA Corporation Built on Sun_Sep__9_21:06:46_CDT_2018 Cuda compilation tools, release 10.0, V10.0.145 Only 'gcc.c-torture/compile/limits-fndefn.c' at '-O0' still has an issue, as indicated. Working around that with '-Wa,--no-verify', for now. gcc/testsuite/ * gcc.c-torture/compile/920501-4.c: Re-enable nvptx "ptxas times out" variants. * gcc.c-torture/compile/921011-1.c: Likewise. * gcc.c-torture/compile/pr34334.c: Likewise. * gcc.c-torture/compile/pr37056.c: Likewise. * gcc.c-torture/compile/pr39423-1.c: Likewise. * gcc.c-torture/compile/pr49049.c: Likewise. * gcc.c-torture/compile/pr59417.c: Likewise. * gcc.c-torture/compile/limits-fndefn.c: Likewise. Specify '-Wa,--no-verify' for nvptx '-O0'.
2024-10-07nvptx: Re-enable 'gcc.c-torture/compile/20080721-1.c'Thomas Schwinge1-1/+0
PASSes with: $ ptxas --version ptxas: NVIDIA (R) Ptx optimizing assembler Copyright (c) 2005-2018 NVIDIA Corporation Built on Sun_Sep__9_21:06:46_CDT_2018 Cuda compilation tools, release 10.0, V10.0.145 gcc/testsuite/ * gcc.c-torture/compile/20080721-1.c: Re-enable for nvptx.
2024-10-04testsuite - Fix gcc.c-torture/execute/ieee/pr108540-1.cGeorg-Johann Lay2-3/+10
PR testsuite/108540 gcc/testsuite/ * gcc.c-torture/execute/ieee/pr108540-1.c: Un-preprocess __SIZE_TYPE__ and __INT64_TYPE__. * gcc.c-torture/execute/ieee/pr108540-1.x: New file, requires double64.
2024-09-14testsuite; Fix execute/pr52286.c for 16bitAndrew Pinski1-1/+1
The code path which was added for 16bit had a broken inline-asm which would only assign maybe half of the registers for the `long` type to 0. Adding L to the input operand of the inline-asm fixes the issue by now assigning the full 32bit value of the input register that would match up with the output register. Fixes r0-115223-gb0408f13d4b317 which added the 16bit code path to fix the testcase for 16bit. Pushed as obvious. PR testsuite/116716 gcc/testsuite/ChangeLog: * gcc.c-torture/execute/pr52286.c: Fix inline-asm for 16bit case.
2024-08-29Allow subregs around constant displacements [PR116516]Richard Sandiford1-0/+10
This patch fixes a regression introduced by g:708ee71808ea61758e73. x86_64 allows addresses of the form: (zero_extend:DI (subreg:SI (symbol_ref:DI "foo") 0)) Before the previous patch, a lax SUBREG check meant that we would treat the subreg as a base and reload it into a base register. But that wasn't what the target was expecting. Instead we should treat "foo" as a constant displacement, to match: leal foo, <dest> After the patch, we recognised that "foo" isn't a base register, but ICEd on it rather than handling it as a displacement. With or without the recent patches, if the address had instead been: (zero_extend:DI (subreg:SI (plus:DI (reg:DI R) (symbol_ref:DI "foo") 0))) then we would have treated "foo" as the displacement and R as the base or index, as expected. The problem was that the code that does this was rejecting all subregs of objects, rather than just subregs of variable objects. gcc/ PR middle-end/116516 * rtlanal.cc (strip_address_mutations): Allow subregs around constant displacements. gcc/testsuite/ PR middle-end/116516 * gcc.c-torture/compile/pr116516.c: New test.
2024-08-26vect: Fix STMT_VINFO_DEF_TYPE check for odd/even widen mult [PR116348]Xi Ruoyao1-0/+14
After fixing PR116142 some code started to trigger an ICE with -O3 -march=znver4. Per Richard Biener who actually made this fix: "supportable_widening_operation fails at transform time - that's likely because vectorizable_reduction "puns" defs to internal_def" so the check should use STMT_VINFO_REDUC_DEF instead of checking if STMT_VINFO_DEF_TYPE is vect_reduction_def. gcc/ChangeLog: PR tree-optimization/116348 * tree-vect-stmts.cc (supportable_widening_operation): Use STMT_VINFO_REDUC_DEF (x) instead of STMT_VINFO_DEF_TYPE (x) == vect_reduction_def. gcc/testsuite/ChangeLog: PR tree-optimization/116348 * gcc.c-torture/compile/pr116438.c: New test. Co-authored-by: Richard Biener <rguenther@suse.de>
2024-08-1216-bit testsuite fixes - excessive code sizeJoern Rennecke1-0/+2
gcc/testsuite/ * gcc.c-torture/execute/20021120-1.c: Skip if not size20plus or -Os. * gcc.dg/fixed-point/convert-float-4.c: Require size20plus. * gcc.dg/torture/pr112282.c: Skip if -O0 unless size20plus. * g++.dg/lookup/pr21802.C: Require size20plus.
2024-07-31recog: Disallow subregs in mode-punned value [PR115881]Richard Sandiford1-0/+16
In g:9d20529d94b23275885f380d155fe8671ab5353a, I'd extended insn_propagation to handle simple cases of hard-reg mode punning. The punned "to" value was created using simplify_subreg rather than simplify_gen_subreg, on the basis that hard-coded subregs aren't generally useful after RA (where hard-reg propagation is expected to happen). This PR is about a case where the subreg gets pushed into the operands of a plus, but the subreg on one of the operands cannot be simplified. Specifically, we have to generate (subreg:SI (reg:DI sp) 0) rather than (reg:SI sp), since all references to the stack pointer must be via stack_pointer_rtx. However, code in x86 (reasonably) expects no subregs of registers to appear after RA, except for special cases like strict_low_part. This leads to an awkward situation where we can't ban subregs of sp (because of the strict_low_part use), can't allow direct references to sp in other modes (because of the stack_pointer_rtx requirement), and can't allow rvalue uses of the subreg (because of the "no subregs after RA" assumption). It all seems a bit of a mess... I sat on this for a while in the hope that a clean solution might become apparent, but in the end, I think we'll just have to check manually for nested subregs and punt on them. gcc/ PR rtl-optimization/115881 * recog.cc: Include rtl-iter.h. (insn_propagation::apply_to_rvalue_1): Check that the result of simplify_subreg does not include nested subregs. gcc/testsuite/ PR rtl-optimization/115881 * gcc.c-torture/compile/pr115881.c: New test.
2024-07-29testsuite: fix PR111613 testSam James1-0/+0
PR ipa/111613 * gcc.c-torture/pr111613.c: Rename to.. * gcc.c-torture/execute/pr111613.c: ...this.
2024-07-29testsuite: make PR115277 test an execute oneSam James1-0/+0
PR middle-end/115277 * gcc.c-torture/compile/pr115277.c: Rename to... * gcc.c-torture/execute/pr115277.c: ...this.
2024-07-22Fix modref's iteraction with store mergingJan Hubicka1-0/+29
Hi, this patch fixes wrong code in case store-merging introduces load of function parameter that was previously write-only (which happens for bitfields). Without this, the whole store-merged area is consdered to be killed. PR ipa/111613 gcc/ChangeLog: * ipa-modref.cc (analyze_parms): Do not preserve EAF_NO_DIRECT_READ and EAF_NO_INDIRECT_READ from past flags. gcc/testsuite/ChangeLog: * gcc.c-torture/pr111613.c: New test.
2024-07-22Fix modref_eaf_analysis::analyze_ssa_name handling of values dereferenced to ↵Jan Hubicka1-0/+35
function call parameters modref_eaf_analysis::analyze_ssa_name misinterprets EAF flags. If dereferenced parameter is passed (to map_iterator in the testcase) it can be returned indirectly which in turn makes it to escape into the next function call. PR ipa/115033 gcc/ChangeLog: * ipa-modref.cc (modref_eaf_analysis::analyze_ssa_name): Fix checking of EAF flags when analysing values dereferenced as function parameters. gcc/testsuite/ChangeLog: * gcc.c-torture/execute/pr115033.c: New test.