path: root/gcc/testsuite/gcc.dg
Age | Commit message | Author | Files | Lines
2025-01-12 | c: UX improvements to 'too {few,many} arguments' errors (v5) [PR118112] | David Malcolm | 2 | -0/+111
Consider this case of a bad call to a callback function (perhaps due to C23 changing the meaning of () in function decls):

  struct p {
          int (*bar)();
  };

  void baz() {
      struct p q;
      q.bar(1);
  }

Before this patch the C frontend emits:

  t.c: In function 'baz':
  t.c:7:5: error: too many arguments to function 'q.bar'
      7 |     q.bar(1);
        |     ^

which doesn't give the user much help in terms of knowing what was expected, and where the relevant declaration is.

With this patch the C frontend emits:

  t.c: In function 'baz':
  t.c:7:5: error: too many arguments to function 'q.bar'; expected 0, have 1
      7 |     q.bar(1);
        |     ^     ~
  t.c:2:15: note: declared here
      2 |         int (*bar)();
        |               ^~~

(showing the expected vs actual counts, the pertinent field decl, and underlining the first extraneous argument at the callsite)

Similarly, the patch also updates the "too few arguments" case to also show expected vs actual counts. Doing so requires a tweak to the wording to say "at least" for the case of variadic fns where previously the C FE emitted e.g.:

  s.c: In function 'test':
  s.c:5:3: error: too few arguments to function 'callee'
      5 |   callee ();
        |   ^~~~~~
  s.c:1:6: note: declared here
      1 | void callee (const char *, ...);
        |      ^~~~~~

with this patch it emits:

  s.c: In function 'test':
  s.c:5:3: error: too few arguments to function 'callee'; expected at least 1, have 0
      5 |   callee ();
        |   ^~~~~~
  s.c:1:6: note: declared here
      1 | void callee (const char *, ...);
        |      ^~~~~~

gcc/c/ChangeLog:
  PR c/118112
  * c-typeck.cc (inform_declaration): Add "function_expr" param and use it for cases where we couldn't show the function decl to show field decls for callbacks.
  (build_function_call_vec): Add missing auto_diagnostic_group. Update for new param of inform_declaration.
  (convert_arguments): Likewise. For the "too many arguments" case add the expected vs actual counts to the message, and if we have it, add the location_t of the first surplus param as a secondary location within the diagnostic. For the "too few arguments" case, determine the minimum number of arguments required and add the expected vs actual counts to the message, tweaking it to "at least" for variadic functions.

gcc/testsuite/ChangeLog:
  PR c/118112
  * gcc.dg/too-few-arguments.c: New test.
  * gcc.dg/too-many-arguments.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2025-01-10 | Use relations when simplifying MIN and MAX. | Andrew MacLeod | 6 | -2/+472
Query for known relations between the operands, and pass that to fold_range to help simplify MIN and MAX relations. Make it type agnostic as well. Adapt testcases from DOM to EVRP (e suffix) and test floats (f suffix). PR tree-optimization/88575 gcc/ * vr-values.cc (simplify_using_ranges::fold_cond_with_ops): Query relation between op0 and op1 and utilize it. (simplify_using_ranges::simplify): Do not eliminate float checks. gcc/testsuite/ * gcc.dg/tree-ssa/minmax-27.c: Disable VRP. * gcc.dg/tree-ssa/minmax-27e.c: New. * gcc.dg/tree-ssa/minmax-27f.c: New. * gcc.dg/tree-ssa/minmax-28.c: Disable VRP. * gcc.dg/tree-ssa/minmax-28e.c: New. * gcc.dg/tree-ssa/minmax-28f.c: New.
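As a rough illustration of the source pattern this change is aimed at, the sketch below guards a MIN and a MAX with a relation between the same two operands. The macro and function names are hypothetical and are not taken from the minmax-27/28 tests; whether a given GCC version actually removes the selection depends on the pass pipeline and options.

```c
#include <assert.h>

/* Hypothetical helper macros, for illustration only.  */
#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* On the path where a < b is known, MIN (a, b) can only be a.  */
int min_under_relation (int a, int b)
{
  if (a < b)
    return MIN (a, b);	/* Same value as "return a" on this path.  */
  return 0;
}

/* Likewise, MAX (a, b) can only be b when a < b is known.  */
int max_under_relation (int a, int b)
{
  if (a < b)
    return MAX (a, b);	/* Same value as "return b" on this path.  */
  return 0;
}

/* Small self-check of the claimed equivalences.  */
void check_examples (void)
{
  assert (min_under_relation (1, 2) == 1);
  assert (max_under_relation (1, 2) == 2);
}
```

The float variants mentioned in the commit are not covered here; the ChangeLog entry for simplify_using_ranges::simplify says float checks are not eliminated.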
2025-01-10 | vect: Ensure we add vector skip guard even when versioning for aliasing [PR118211] | Alex Coplan | 1 | -0/+91
This fixes a latent wrong code issue whereby vect_do_peeling determined the wrong condition for inserting the vector skip guard. Specifically in the case where the loop niters are unknown at compile time we used to check: !LOOP_REQUIRES_VERSIONING (loop_vinfo) but LOOP_REQUIRES_VERSIONING is true for loops which we have versioned for aliasing, and that has nothing to do with prolog peeling. I think this condition should instead be checking specifically if we aren't versioning for alignment. As it stands, when we version for alignment, we don't peel, so the vector skip guard is indeed redundant in that case. With the testcase added (reduced from the Fortran frontend) we would version for aliasing, omit the vector skip guard, and then at runtime we would peel sufficient iterations for alignment that there wasn't a full vector iteration left when we entered the vector body, thus overflowing the output buffer. gcc/ChangeLog: PR tree-optimization/118211 PR tree-optimization/116126 * tree-vect-loop-manip.cc (vect_do_peeling): Adjust skip_vector condition to only omit the edge if we're versioning for alignment. gcc/testsuite/ChangeLog: PR tree-optimization/118211 PR tree-optimization/116126 * gcc.dg/vect/vect-early-break_130.c: New test.
2025-01-10 | vect: Force alignment peeling to vectorize more early break loops [PR118211] | Alex Coplan | 12 | -10/+13
This allows us to vectorize more loops with early exits by forcing peeling for alignment to make sure that we're guaranteed to be able to safely read an entire vector iteration without crossing a page boundary. To make this work for VLA architectures we have to allow compile-time non-constant target alignments. We also have to override the result of the target's preferred_vector_alignment hook if it isn't a power-of-two multiple of the TYPE_SIZE of the chosen vector type. gcc/ChangeLog: PR tree-optimization/118211 PR tree-optimization/116126 * tree-vect-data-refs.cc (vect_analyze_early_break_dependences): Set need_peeling_for_alignment flag on read DRs instead of failing vectorization. Punt on gathers. (dr_misalignment): Handle non-constant target alignments. (vect_compute_data_ref_alignment): If need_peeling_for_alignment flag is set on the DR, then override the target alignment chosen by the preferred_vector_alignment hook to choose a safe alignment. (vect_supportable_dr_alignment): Override support_vector_misalignment hook if need_peeling_for_alignment is set on the DR: in this case we must return dr_unaligned_unsupported in order to force peeling. * tree-vect-loop-manip.cc (vect_do_peeling): Allow prolog peeling by a compile-time non-constant amount. * tree-vectorizer.h (dr_vec_info): Add new flag need_peeling_for_alignment. gcc/testsuite/ChangeLog: PR tree-optimization/118211 PR tree-optimization/116126 * gcc.dg/tree-ssa/cunroll-13.c: Don't vectorize. * gcc.dg/tree-ssa/cunroll-14.c: Likewise. * gcc.dg/unroll-6.c: Likewise. * gcc.dg/tree-ssa/gen-vect-28.c: Likewise. * gcc.dg/vect/vect-104.c: Expect to vectorize. * gcc.dg/vect/vect-early-break_108-pr113588.c: Likewise. * gcc.dg/vect/vect-early-break_109-pr113588.c: Likewise. * gcc.dg/vect/vect-early-break_110-pr113467.c: Likewise. * gcc.dg/vect/vect-early-break_3.c: Likewise. * gcc.dg/vect/vect-early-break_65.c: Likewise. * gcc.dg/vect/vect-early-break_8.c: Likewise. * gfortran.dg/vect/vect-5.f90: Likewise. 
* gfortran.dg/vect/vect-8.f90: Likewise. * gcc.dg/vect/vect-switch-search-line-fast.c: Co-Authored-By: Tamar Christina <tamar.christina@arm.com>
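The arithmetic behind peeling for alignment can be sketched as follows. This is only a model of the idea described above, not the vectorizer's code: the function names are hypothetical, and it assumes a power-of-two alignment that is a multiple of the element size and an address that is already element-aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of scalar prolog iterations needed before ADDR reaches an
   ALIGN-byte boundary, for elements of ELEM_SIZE bytes.  Assumes ALIGN
   is a power of two and a multiple of ELEM_SIZE.  */
size_t prolog_iters (uintptr_t addr, size_t align, size_t elem_size)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  assert (elem_size != 0 && align % elem_size == 0);
  size_t misalign = (size_t) (addr & (uintptr_t) (align - 1));
  return misalign == 0 ? 0 : (align - misalign) / elem_size;
}

/* Nonzero if the LEN bytes starting at ADDR span two pages of PAGE bytes.  */
int crosses_page (uintptr_t addr, size_t len, size_t page)
{
  return addr / page != (addr + len - 1) / page;
}
```

With a 16-byte vector and 4-byte elements, an access starting at 0x1004 needs three scalar iterations before the first aligned vector load. An aligned 16-byte load at 0x1ff0 stays within its 4096-byte page, whereas an unaligned one at 0x1ff8 would read into the next page, which is the situation the forced peeling is meant to rule out.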
2025-01-10 | c: Fix up expr location for __builtin_stdc_rotate_* [PR118376] | Jakub Jelinek | 1 | -0/+11
Seems I forgot to set_c_expr_source_range for the __builtin_stdc_rotate_* case (the other __builtin_stdc_* cases already have it), which means the locations in expr are uninitialized, sometimes causing ICEs in linemap code, at other times just valgrind errors about uninitialized var uses. 2025-01-10 Jakub Jelinek <jakub@redhat.com> PR c/118376 * c-parser.cc (c_parser_postfix_expression): Call set_c_expr_source_range before break in the __builtin_stdc_rotate_* case. * gcc.dg/pr118376.c: New test.
2025-01-10 | rtl: Remove invalid compare simplification [PR117186] | Richard Sandiford | 1 | -0/+15
g:d882fe5150fbbeb4e44d007bb4964e5b22373021, posted at https://gcc.gnu.org/pipermail/gcc-patches/2000-July/033786.html , added code to treat:

  (set (reg:CC cc)
       (compare:CC (gt:M (reg:CC cc) 0)
                   (lt:M (reg:CC cc) 0)))

as a nop. This PR shows that that isn't always correct. The compare in the set above is between two 0/1 booleans (at least on STORE_FLAG_VALUE==1 targets), whereas the unknown comparison that produced the incoming (reg:CC cc) is unconstrained; it could be between arbitrary integers, or even floats. The fold is therefore replacing a cc that is valid for both signed and unsigned comparisons with one that is only known to be valid for signed comparisons.

  (gt (compare (gt cc 0) (lt cc 0)) 0)

does simplify to:

  (gt cc 0)

but:

  (gtu (compare (gt cc 0) (lt cc 0)) 0)

does not simplify to:

  (gtu cc 0)

The optimisation didn't come with a testcase, but it was added for i386's cmpstrsi, now cmpstrnsi. That probably doesn't matter as much as it once did, since it's now conditional on -minline-all-stringops. But the patch is almost 25 years old, so whatever the original motivation was, it seems likely that other things now rely on it.

It therefore seems better to try to preserve the optimisation on rtl rather than get rid of it. To do that, we need to look at how the result of the outer compare is used. We'd therefore be looking at four instructions (the gt, the lt, the compare, and the use of the compare), but combine already allows that for 3-instruction combinations thanks to:

  /* If the source is a COMPARE, look for the use of the comparison result
     and try to simplify it unless we already have used undobuf.other_insn.  */

When applied to boolean inputs, a comparison operator is effectively a boolean logical operator (AND, ANDNOT, XOR, etc.). simplify_logical_relational_operation already had code to simplify logical operators between two comparison results, but:

* It only handled IOR, which doesn't cover all the cases needed here. The others are easily added.

* It treated comparisons of integers as having an ORDERED/UNORDERED result. Therefore:

  * it would not treat "true for LT + EQ + GT" as "always true" for comparisons between integers, because the mask excluded the UNORDERED condition.

  * it would try to convert "true for LT + GT" into LTGT even for comparisons between integers. To prevent an ICE later, the code used:

      /* Many comparison codes are only valid for certain mode classes.  */
      if (!comparison_code_valid_for_mode (code, mode))
        return 0;

    However, this used the wrong mode, since "mode" is here the integer result of the comparisons (and the mode of the IOR), not the mode of the things being compared. Thus the effect was to reject all floating-point-only codes, even when comparing floats.

  I think instead the code should detect whether the comparison is between integer values and remove UNORDERED from consideration if so. It then always produces a valid comparison (or an always true/false result), and so comparison_code_valid_for_mode is not needed. In particular, "true for LT + GT" becomes NE for comparisons between integers but remains LTGT for comparisons between floats.

* There was a missing check for whether the comparison inputs had side effects.

While there, it also seemed worth extending simplify_logical_relational_operation to unsigned comparisons, since that makes the testing easier.

As far as that testing goes: the patch exhaustively tests all combinations of integer comparisons in:

  (cmp1 (cmp2 X Y) (cmp3 X Y))

for the 10 integer comparisons, giving 1000 fold attempts in total. It then tries all combinations of (X in {-1,0,1} x Y in {-1,0,1}) on the result of the fold, giving 9 checks per fold, or 9000 in total. That's probably more than is typical for self-tests, but it seems to complete in negligible time, even for -O0 builds.

gcc/
  PR rtl-optimization/117186
  * rtl.h (simplify_context::simplify_logical_relational_operation): Add an invert0_p parameter.
  * simplify-rtx.cc (unsigned_comparison_to_mask): New function.
  (mask_to_unsigned_comparison): Likewise.
  (comparison_code_valid_for_mode): Delete.
  (simplify_context::simplify_logical_relational_operation): Add an invert0_p parameter. Handle AND and XOR. Handle unsigned comparisons. Handle always-false results. Ignore the low bit of the mask if the operands are always ordered and remove the then-redundant check of comparison_code_valid_for_mode. Check for side-effects in the operands before simplifying them away.
  (simplify_context::simplify_binary_operation_1): Remove simplification of (compare (gt ...) (lt ...)) and instead...
  (simplify_context::simplify_relational_operation_1): ...handle comparisons of comparisons here.
  (test_comparisons): New function.
  (test_scalar_ops): Call it.

gcc/testsuite/
  PR rtl-optimization/117186
  * gcc.dg/torture/pr117186.c: New test.
  * gcc.target/aarch64/pr117186.c: Likewise.
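The mask idea described above (a comparison as the set of outcomes for which it is true, with logical operators acting on those sets) can be modelled in a few lines of C. This is a simplified sketch for signed integers only: the bit encoding and function names are hypothetical, it omits the UNORDERED bit and the unsigned codes, and it is not the simplify-rtx.cc implementation.

```c
#include <assert.h>

/* Hypothetical encoding: one bit per possible outcome of comparing two
   integers.  There is no UNORDERED outcome for integers.  */
enum { M_LT = 1, M_EQ = 2, M_GT = 4 };

/* Evaluate the comparison described by MASK on signed X and Y.  */
int eval_mask (unsigned mask, int x, int y)
{
  unsigned outcome = x < y ? M_LT : x == y ? M_EQ : M_GT;
  return (mask & outcome) != 0;
}

/* Name of the single comparison equivalent to MASK.  Note that
   LT + GT maps to "ne" and LT + EQ + GT maps to "true".  */
const char *mask_name (unsigned mask)
{
  static const char *const names[8]
    = { "false", "lt", "eq", "le", "gt", "ne", "ge", "true" };
  return names[mask & 7];
}

/* Check that IOR, AND and XOR of two comparisons of the same operands
   equal one comparison with the combined mask, for all X and Y in
   {-1, 0, 1}.  Returns the number of checks performed.  */
int check_all_folds (void)
{
  int checks = 0;
  for (unsigned m1 = 0; m1 < 8; m1++)
    for (unsigned m2 = 0; m2 < 8; m2++)
      for (int x = -1; x <= 1; x++)
	for (int y = -1; y <= 1; y++)
	  {
	    int a = eval_mask (m1, x, y);
	    int b = eval_mask (m2, x, y);
	    assert ((a | b) == eval_mask (m1 | m2, x, y));
	    assert ((a & b) == eval_mask (m1 & m2, x, y));
	    assert ((a ^ b) == eval_mask (m1 ^ m2, x, y));
	    checks += 3;
	  }
  return checks;
}
```

Calling check_all_folds () walks 8 x 8 mask pairs over the same 3 x 3 grid of inputs that the commit's self-test uses, in the same exhaustive spirit but on a much smaller model.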
2025-01-10 | [ifcombine] fix mask variable test to match use [PR118344] | Alexandre Oliva | 1 | -0/+41
There was a cut&pasto in the rr_and_mask's adjustment to match the combined type: the test on whether there was a mask already was testing the wrong variable, and then it might crash or otherwise fail accessing an undefined mask. This only hit with checking enabled, and rarely at that. for gcc/ChangeLog PR tree-optimization/118344 * gimple-fold.cc (fold_truth_andor_for_ifcombine): Fix typo in rr_and_mask's type adjustment test. for gcc/testsuite/ChangeLog PR tree-optimization/118344 * gcc.dg/field-merge-19.c: New.
2025-01-10 | [ifcombine] adjust for narrowing converts before shifts [PR118206] | Alexandre Oliva | 1 | -0/+46
A narrowing conversion and a shift both drop bits from the loaded value, but we need to take into account which one comes first to get the right number of bits and mask. Fold when applying masks to parts, comparing the parts, and combining the results, in the odd chance either mask happens to be zero. for gcc/ChangeLog PR tree-optimization/118206 * gimple-fold.cc (decode_field_reference): Account for upper bits dropped by narrowing conversions whether before or after a right shift. (fold_truth_andor_for_ifcombine): Fold masks, compares, and combined results. for gcc/testsuite/ChangeLog PR tree-optimization/118206 * gcc.dg/field-merge-18.c: New.
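Why the order of the narrowing conversion and the shift matters can be seen in a small standalone sketch. The function names are hypothetical and the example is unrelated to the actual decode_field_reference code; it only shows that the same shift count keeps a different set of bits depending on which operation comes first.

```c
#include <assert.h>
#include <stdint.h>

/* Shift first, then narrow: bits 4..11 of X survive (mask 0xff0).  */
uint8_t shift_then_narrow (uint32_t x)
{
  return (uint8_t) (x >> 4);
}

/* Narrow first, then shift: only bits 4..7 of X survive (mask 0xf0).  */
uint8_t narrow_then_shift (uint32_t x)
{
  return (uint8_t) ((uint8_t) x >> 4);
}

/* Check both forms against their explicit mask-and-shift equivalents.  */
void check_examples (void)
{
  uint32_t x = 0xABCD;
  assert (shift_then_narrow (x) == ((x & 0xff0) >> 4));
  assert (narrow_then_shift (x) == ((x & 0x0f0) >> 4));
}
```

For x = 0xABCD the first form yields 0xBC and the second 0x0C, so a transformation that merges field loads has to derive the bit count and mask from the actual order of the two operations.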
2025-01-10 | testsuite: generalized field-merge tests for <32-bit int [PR118025] | Alexandre Oliva | 6 | -14/+20
Explicitly convert constants to the desired types, so as to not elicit warnings about implicit truncations, nor execution errors, on targets whose ints are narrower than 32 bits. for gcc/testsuite/ChangeLog PR testsuite/118025 * gcc.dg/field-merge-1.c: Convert constants to desired types. * gcc.dg/field-merge-3.c: Likewise. * gcc.dg/field-merge-4.c: Likewise. * gcc.dg/field-merge-5.c: Likewise. * gcc.dg/field-merge-11.c: Likewise. * gcc.dg/field-merge-17.c: Don't mess with padding bits.
2025-01-10 | testsuite: generalize ifcombine field-merge tests [PR118025] | Alexandre Oliva | 9 | -16/+20
A number of tests that check for specific ifcombine transformations fail on AVR and PRU targets, whose type sizes and alignments aren't conducive to the expected transformations. Adjust the expectations. Most execution tests should run successfully regardless of the transformations, but a few that could conceivably fail if short and char have the same bit width now check for that and bypass the tests that would fail. Conversely, one test that had such a runtime test, but that would work regardless, no longer has that runtime test, and its types are narrowed so that the transformations on 32-bit targets are more likely to be the same as those that used to take place on 64-bit targets. This latter change is somewhat obviated by a separate patch, but I've left it in place anyway. for gcc/testsuite/ChangeLog PR testsuite/118025 * gcc.dg/field-merge-1.c: Skip BIT_FIELD_REF counting on AVR and PRU. * gcc.dg/field-merge-3.c: Bypass the test if short doesn't have the expected size. * gcc.dg/field-merge-8.c: Likewise. * gcc.dg/field-merge-9.c: Likewise. Skip optimization counting on AVR and PRU. * gcc.dg/field-merge-13.c: Skip optimization counting on AVR and PRU. * gcc.dg/field-merge-15.c: Likewise. * gcc.dg/field-merge-17.c: Likewise. * gcc.dg/field-merge-16.c: Likewise. Drop runtime bypass. Use smaller types. * gcc.dg/field-merge-14.c: Add comments.
2025-01-10 | ifcombine field-merge: improve handling of dwords | Alexandre Oliva | 1 | -0/+46
On 32-bit hosts, data types with 64-bit alignment aren't getting treated as desired by ifcombine field-merging: we limit the choice of modes at BITS_PER_WORD sizes, but when deciding the boundary for a split, we'd limit the choice only by the alignment, so we wouldn't even consider a split at an odd 32-bit boundary. Fix that by limiting the boundary choice by word choice as well. Now, this would still leave misaligned 64-bit fields in 64-bit-aligned data structures unhandled by ifcombine on 32-bit hosts. We already need to load them as double words, and if they're not byte-aligned, the code gets really ugly, but ifcombine could improve it if it allows double-word loads as a last resort. I've added that. for gcc/ChangeLog * gimple-fold.cc (fold_truth_andor_for_ifcombine): Limit boundary choice by word size as well. Try aligned double-word loads as a last resort. for gcc/testsuite/ChangeLog * gcc.dg/field-merge-17.c: New.
2025-01-10 | ipa-cp: Fold-convert values when necessary (PR 118138) | Martin Jambor | 1 | -0/+30
PR 118138 and quite a few duplicates that it has acquired in a short time show that even though we are careful to make sure we do not lose any bits when newly allowing type conversions in jump-functions, we still need to perform the fold conversions during IPA constant propagation and not just at the end in order to properly perform sign-extensions or zero-extensions as appropriate. This patch does just that, changing a safety predicate we already use at the appropriate places to return the necessary type. gcc/ChangeLog: 2025-01-03 Martin Jambor <mjambor@suse.cz> PR ipa/118138 * ipa-cp.cc (ipacp_value_safe_for_type): Return the appropriate type instead of a bool, accept NULL_TREE VALUEs. (propagate_vals_across_arith_jfunc): Use the new returned value of ipacp_value_safe_for_type. (propagate_vals_across_ancestor): Likewise. (propagate_scalar_across_jump_function): Likewise. gcc/testsuite/ChangeLog: 2025-01-03 Martin Jambor <mjambor@suse.cz> PR ipa/118138 * gcc.dg/ipa/pr118138.c: New test.
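The reason intermediate conversions cannot be skipped is visible even in plain C. The sketch below is an illustration under assumed fixed-width types, with hypothetical function names; it is not derived from gcc.dg/ipa/pr118138.c. The same constant produces different results depending on whether the intermediate type extends by sign or by zero.

```c
#include <assert.h>
#include <stdint.h>

/* Widening directly from a signed type sign-extends.  */
int64_t widen_signed (int32_t v)
{
  return (int64_t) v;
}

/* Widening through an unsigned type of the same width zero-extends.  */
int64_t widen_via_unsigned (int32_t v)
{
  return (int64_t) (uint32_t) v;
}

/* Passing through a narrower unsigned type drops the upper bits first.  */
int32_t narrow_then_widen (int32_t v)
{
  return (int32_t) (uint8_t) v;
}

/* The constant -1 yields three different values along the three paths.  */
void check_examples (void)
{
  assert (widen_signed (-1) == -1);
  assert (widen_via_unsigned (-1) == 4294967295LL);
  assert (narrow_then_widen (-1) == 255);
}
```

A propagated constant therefore has to be converted at each step where the type changes, otherwise a value such as -1 could be carried forward where 4294967295 or 255 is the correct result.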
2025-01-09 | c: Restore warning for incomplete structures declared in parameter list [PR117866] | Martin Uecker | 2 | -1/+6
In C23 mode the warning about declaring structures and unions in parameter lists was removed, because it is possible to redeclare a compatible type elsewhere. This is not the case for incomplete types, so restore the warning for those types. PR c/117866 gcc/c/ChangeLog: * c-decl.cc (get_parm_info): Change condition for warning. gcc/testsuite/ChangeLog: * gcc.dg/pr117866.c: New test. * gcc.dg/strub-pr118007.c: Adapt.
2025-01-09 | c, c++: preserve type name in conversion [PR116060] | Jason Merrill | 1 | -14/+14
When the program requests a conversion to a typedef, let's try harder to remember the new name. Torbjörn's original patch changed the type of the original expression, but that seems not generally desirable; we might want either or both of the original type and the converted-to type to be represented. So this expresses the name change as a NOP_EXPR. Compiling stdc++.h, this adds 519 allocations out of 1870k, or 0.028%. The -Wsuggest-attribute=format change was necessary to do the check before converting to the target type, which seems like an improvement. PR c/116060 gcc/c/ChangeLog: * c-typeck.cc (convert_for_assignment): Make sure left hand side and right hand side has identical named types to aid diagnostic output. gcc/cp/ChangeLog: * call.cc (standard_conversion): Preserve type name in ck_identity. (maybe_adjust_type_name): New. (convert_like_internal): Use it. Handle -Wsuggest-attribute=format here. (convert_for_arg_passing): Not here. gcc/testsuite/ChangeLog: * c-c++-common/analyzer/out-of-bounds-diagram-8.c: Update to correct type. * c-c++-common/analyzer/out-of-bounds-diagram-11.c: Likewise. * gcc.dg/analyzer/out-of-bounds-diagram-10.c: Likewise. Co-authored-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com> Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
2025-01-09 | testsuite: Require trampolines for gcc.dg/pr118325.c | Dimitar Dimitrov | 1 | -0/+1
The test case uses a nested function, which is not supported by some targets. gcc/testsuite/ChangeLog: * gcc.dg/pr118325.c: Require effective target trampolines. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2025-01-09 | 'git mv gcc/testsuite/gcc.dg/{,torture/}crc-linux-3.c' | Thomas Schwinge | 1 | -0/+0
Like recent commit 96f5fd3089075b56ea9ea85060213cc4edd7251a "Move some CRC tests into the gcc.dg/torture directory" moved a few files, this one also needs to go into torture testing: otherwise, it's compiled just at '-O0', where the CRC optimization pass isn't active. gcc/testsuite/ * gcc.dg/crc-linux-3.c: Move... * gcc.dg/torture/crc-linux-3.c: ... here.
2025-01-09 | match.pd: Avoid introducing UB in the a r<< (32-b) -> a r>> b optimization [PR117927] | Jakub Jelinek | 1 | -0/+97
As mentioned in the PR, the a r<< (bitsize-b) to a r>> b and similar match.pd optimization which has been introduced in GCC 15 can introduce UB which wasn't there before, in particular if b is equal at runtime to bitsize, then a r<< 0 is turned into a r>> bitsize. The following patch fixes it by optimizing it early only if VRP tells us the count isn't equal to the bitsize, and late into a r>> (b & (bitsize - 1)) if bitsize is power of two and the subtraction has single use, on various targets the masking then goes away because its rotate instructions do masking already. The latter can't be done too early though, because then the expr_not_equal_to case is basically useless and we introduce the masking always and can't find out anymore that there was originally no masking. Even cfun->after_inlining check would be too early, there is forwprop before vrp, so the patch introduces a new PROP for the start of the last forwprop pass. 2025-01-09 Jakub Jelinek <jakub@redhat.com> Andrew Pinski <quic_apinski@quicinc.com> PR tree-optimization/117927 * tree-pass.h (PROP_last_full_fold): Define. * passes.def: Add last= parameters to pass_forwprop. * tree-ssa-forwprop.cc (pass_forwprop): Add last_p non-static data member and initialize it in the ctor. (pass_forwprop::set_pass_param): New method. (pass_forwprop::execute): Set PROP_last_full_fold in curr_properties at the start if last_p. * match.pd (a rrotate (32-b) -> a lrotate b): Only optimize either if @2 is known not to be equal to prec or if during/after last forwprop the subtraction has single use and prec is power of two; in that case transform it into orotate by masked count. * gcc.dg/tree-ssa/pr117927.c: New test.
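The masked form a r>> (b & (bitsize - 1)) can be checked against the original rotate for every count, including the problematic b == bitsize case, with a short sketch. The helper names are hypothetical and the rotates are written with explicit masking so that every shift count stays in range; this models the arithmetic only and says nothing about how match.pd expresses the pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left by N modulo 32; all shift counts stay within 0..31.  */
uint32_t rotl32 (uint32_t a, unsigned n)
{
  n &= 31;
  return (a << n) | (a >> ((32 - n) & 31));
}

/* Rotate right by N modulo 32; all shift counts stay within 0..31.  */
uint32_t rotr32 (uint32_t a, unsigned n)
{
  n &= 31;
  return (a >> n) | (a << ((32 - n) & 31));
}

/* For every b in 0..32, rotating left by 32 - b matches rotating right
   by the masked count b & 31.  Returns the number of counts checked.  */
int check_rotate_identity (uint32_t a)
{
  int checked = 0;
  for (unsigned b = 0; b <= 32; b++)
    {
      assert (rotl32 (a, 32 - b) == rotr32 (a, b & 31));
      checked++;
    }
  return checked;
}
```

The b == 32 iteration is the interesting one: the left rotate count is 0, and the masked right rotate count is also 0, whereas an unmasked right rotate by 32 would be out of range for a 32-bit operand.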
2025-01-08 | nvptx: Re-enable "Stack alignment causes use of alloca" test cases | Thomas Schwinge | 10 | -10/+1
These generally PASS nowadays, without requiring 'alloca'. There were two exceptions: 'gcc.dg/torture/stackalign/pr16660-2.c', 'gcc.dg/torture/stackalign/pr16660-3.c', where variants specifying '-O0' or '-fpic' FAILed with 'ptxas' of, for example, CUDA 10.0 due to: nvptx-as: ptxas terminated with signal 11 [Segmentation fault], core dumped That however is gone with 'ptxas' of, for example, CUDA 11.5 and later. gcc/testsuite/ * gcc.dg/torture/stackalign/global-1.c: Re-enable for nvptx. * gcc.dg/torture/stackalign/inline-1.c: Likewise. * gcc.dg/torture/stackalign/nested-1.c: Likewise. * gcc.dg/torture/stackalign/nested-2.c: Likewise. * gcc.dg/torture/stackalign/nested-4.c: Likewise. * gcc.dg/torture/stackalign/pr16660-1.c: Likewise. * gcc.dg/torture/stackalign/pr16660-2.c: Likewise. * gcc.dg/torture/stackalign/pr16660-3.c: Likewise. * gcc.dg/torture/stackalign/ret-struct-1.c: Likewise. * gcc.dg/torture/stackalign/struct-1.c: Likewise.
2025-01-08 | tree-optimization/117979 - failed irreducible loop update from DCE | Richard Biener | 1 | -0/+21
When CD-DCE creates forwarders to reduce false control dependences it fails to update the irreducible state of edge and the forwarder block in case the forwarder groups both normal (entry) and edges from an irreducible region (necessarily backedges). This is because when we split the first edge, if that's a normal edge, the forwarder and its edge to the original block will not be marked as part of the irreducible region but when we then redirect an edge from within the region it becomes so. The following fixes this up. Note I think creating a forwarder that includes backedges is likely not going to help, but at this stage I don't want to change the CFG going into DCE. For regular loops we'll have a single entry and a single backedge by means of loop init and will never create a forwarder - so this is solely happening for irreducible regions where it's harder to prove that such forwarder doesn't help. PR tree-optimization/117979 * tree-ssa-dce.cc (make_forwarders_with_degenerate_phis): Properly update the irreducible region state. * gcc.dg/torture/pr117979.c: New testcase.
2025-01-08 | middle-end/118325 - nonlocal goto lowering | Richard Biener | 1 | -0/+16
When nonlocal goto lowering creates an artificial label it fails to adjust its context. PR middle-end/118325 * tree-nested.cc (convert_nl_goto_reference): Assign proper context to generated artificial label. * gcc.dg/pr118325.c: New testcase.
2025-01-08 | tree-optimization/118269 - SLP reduction chain and early breaks | Richard Biener | 1 | -0/+17
When we create the SLP reduction chain epilogue for the PHIs for the early exit we fail to properly classify the reduction as SLP reduction chain. The following fixes the corresponding checks. PR tree-optimization/118269 * tree-vect-loop.cc (vect_create_epilog_for_reduction): Use the correct stmt for the REDUC_GROUP_FIRST_ELEMENT lookup. * gcc.dg/vect/vect-early-break_131-pr118269.c: New testcase.
2025-01-07 | [PATCH] riscv: add mising masking in lrsc expander (PR118137) | Andreas Schwab | 1 | -0/+29
gcc: PR target/118137 * config/riscv/sync.md ("lrsc_atomic_exchange<mode>"): Apply mask to shifted value. gcc/testsuite: PR target/118137 * gcc.dg/atomic/pr118137.c: New.
2025-01-07 | testsuite: RISC-V: Skip tests providing -march for ILP32E/ILP64E ABIs | Dimitar Dimitrov | 2 | -2/+2
Many test cases explicitly set -march with extensions which are not compatible with the E ABI variants. This leads to spurious errors when toolchain has been configured for RV32E base ISA and ILP32E ABI: spawn ... -march=rv32gc_zbb ... cc1: error: ILP32E ABI does not support the 'D' extension Fix by skipping those tests if toolchain's default ABI is E. gcc/testsuite/ChangeLog: * gcc.dg/pr90838-2.c: Skip if default ABI is E. * gcc.dg/pr90838.c: Ditto. * gcc.target/riscv/adddibeq.c: Ditto. * gcc.target/riscv/adddibfeq.c: Ditto. * gcc.target/riscv/adddibfge.c: Ditto. * gcc.target/riscv/adddibfgt.c: Ditto. * gcc.target/riscv/adddibfle.c: Ditto. * gcc.target/riscv/adddibflt.c: Ditto. * gcc.target/riscv/adddibfne.c: Ditto. * gcc.target/riscv/adddibge.c: Ditto. * gcc.target/riscv/adddibgeu.c: Ditto. * gcc.target/riscv/adddibgt.c: Ditto. * gcc.target/riscv/adddibgtu.c: Ditto. * gcc.target/riscv/adddible.c: Ditto. * gcc.target/riscv/adddibleu.c: Ditto. * gcc.target/riscv/adddiblt.c: Ditto. * gcc.target/riscv/adddibltu.c: Ditto. * gcc.target/riscv/adddibne.c: Ditto. * gcc.target/riscv/adddieq.c: Ditto. * gcc.target/riscv/adddifeq.c: Ditto. * gcc.target/riscv/adddifge.c: Ditto. * gcc.target/riscv/adddifgt.c: Ditto. * gcc.target/riscv/adddifle.c: Ditto. * gcc.target/riscv/adddiflt.c: Ditto. * gcc.target/riscv/adddifne.c: Ditto. * gcc.target/riscv/adddige.c: Ditto. * gcc.target/riscv/adddigeu.c: Ditto. * gcc.target/riscv/adddigt.c: Ditto. * gcc.target/riscv/adddigtu.c: Ditto. * gcc.target/riscv/adddile.c: Ditto. * gcc.target/riscv/adddileu.c: Ditto. * gcc.target/riscv/adddilt.c: Ditto. * gcc.target/riscv/adddiltu.c: Ditto. * gcc.target/riscv/adddine.c: Ditto. * gcc.target/riscv/addsibeq.c: Ditto. * gcc.target/riscv/addsibfeq.c: Ditto. * gcc.target/riscv/addsibfge.c: Ditto. * gcc.target/riscv/addsibfgt.c: Ditto. * gcc.target/riscv/addsibfle.c: Ditto. * gcc.target/riscv/addsibflt.c: Ditto. * gcc.target/riscv/addsibfne.c: Ditto. * gcc.target/riscv/addsibge.c: Ditto. 
* gcc.target/riscv/addsibgeu.c: Ditto. * gcc.target/riscv/addsibgt.c: Ditto. * gcc.target/riscv/addsibgtu.c: Ditto. * gcc.target/riscv/addsible.c: Ditto. * gcc.target/riscv/addsibleu.c: Ditto. * gcc.target/riscv/addsiblt.c: Ditto. * gcc.target/riscv/addsibltu.c: Ditto. * gcc.target/riscv/addsibne.c: Ditto. * gcc.target/riscv/addsieq.c: Ditto. * gcc.target/riscv/addsifeq.c: Ditto. * gcc.target/riscv/addsifge.c: Ditto. * gcc.target/riscv/addsifgt.c: Ditto. * gcc.target/riscv/addsifle.c: Ditto. * gcc.target/riscv/addsiflt.c: Ditto. * gcc.target/riscv/addsifne.c: Ditto. * gcc.target/riscv/addsige.c: Ditto. * gcc.target/riscv/addsigeu.c: Ditto. * gcc.target/riscv/addsigt.c: Ditto. * gcc.target/riscv/addsigtu.c: Ditto. * gcc.target/riscv/addsile.c: Ditto. * gcc.target/riscv/addsileu.c: Ditto. * gcc.target/riscv/addsilt.c: Ditto. * gcc.target/riscv/addsiltu.c: Ditto. * gcc.target/riscv/addsine.c: Ditto. * gcc.target/riscv/cmo-zicboz-zic64-1.c: Ditto. * gcc.target/riscv/cmpmemsi-2.c: Ditto. * gcc.target/riscv/cmpmemsi-3.c: Ditto. * gcc.target/riscv/cmpmemsi.c: Ditto. * gcc.target/riscv/cpymemsi-2.c: Ditto. * gcc.target/riscv/cpymemsi-3.c: Ditto. * gcc.target/riscv/cpymemsi.c: Ditto. * gcc.target/riscv/crc-builtin-zbc32.c: Ditto. * gcc.target/riscv/crc-builtin-zbc64.c: Ditto. * gcc.target/riscv/cset-sext-rtl.c: Ditto. * gcc.target/riscv/cset-sext-rtl32.c: Ditto. * gcc.target/riscv/cset-sext-sfb-rtl.c: Ditto. * gcc.target/riscv/cset-sext-sfb-rtl32.c: Ditto. * gcc.target/riscv/cset-sext-sfb.c: Ditto. * gcc.target/riscv/cset-sext-thead-rtl.c: Ditto. * gcc.target/riscv/cset-sext-thead.c: Ditto. * gcc.target/riscv/cset-sext-ventana-rtl.c: Ditto. * gcc.target/riscv/cset-sext-ventana.c: Ditto. * gcc.target/riscv/cset-sext-zicond-rtl.c: Ditto. * gcc.target/riscv/cset-sext-zicond-rtl32.c: Ditto. * gcc.target/riscv/cset-sext-zicond.c: Ditto. * gcc.target/riscv/cset-sext.c: Ditto. * gcc.target/riscv/matrix_add_const.c: Ditto. * gcc.target/riscv/movdibeq-thead.c: Ditto. 
* gcc.target/riscv/movdibeq-ventana.c: Ditto. * gcc.target/riscv/movdibeq-zicond.c: Ditto. * gcc.target/riscv/movdibeq.c: Ditto. * gcc.target/riscv/movdibfeq-ventana.c: Ditto. * gcc.target/riscv/movdibfeq-zicond.c: Ditto. * gcc.target/riscv/movdibfeq.c: Ditto. * gcc.target/riscv/movdibfge-ventana.c: Ditto. * gcc.target/riscv/movdibfge-zicond.c: Ditto. * gcc.target/riscv/movdibfge.c: Ditto. * gcc.target/riscv/movdibfgt-ventana.c: Ditto. * gcc.target/riscv/movdibfgt-zicond.c: Ditto. * gcc.target/riscv/movdibfgt.c: Ditto. * gcc.target/riscv/movdibfle-ventana.c: Ditto. * gcc.target/riscv/movdibfle-zicond.c: Ditto. * gcc.target/riscv/movdibfle.c: Ditto. * gcc.target/riscv/movdibflt-ventana.c: Ditto. * gcc.target/riscv/movdibflt-zicond.c: Ditto. * gcc.target/riscv/movdibflt.c: Ditto. * gcc.target/riscv/movdibfne-ventana.c: Ditto. * gcc.target/riscv/movdibfne-zicond.c: Ditto. * gcc.target/riscv/movdibfne.c: Ditto. * gcc.target/riscv/movdibge-thead.c: Ditto. * gcc.target/riscv/movdibge-ventana.c: Ditto. * gcc.target/riscv/movdibge-zicond.c: Ditto. * gcc.target/riscv/movdibge.c: Ditto. * gcc.target/riscv/movdibgeu-thead.c: Ditto. * gcc.target/riscv/movdibgeu-ventana.c: Ditto. * gcc.target/riscv/movdibgeu-zicond.c: Ditto. * gcc.target/riscv/movdibgeu.c: Ditto. * gcc.target/riscv/movdibgt-thead.c: Ditto. * gcc.target/riscv/movdibgt-ventana.c: Ditto. * gcc.target/riscv/movdibgt-zicond.c: Ditto. * gcc.target/riscv/movdibgt.c: Ditto. * gcc.target/riscv/movdibgtu-thead.c: Ditto. * gcc.target/riscv/movdibgtu-ventana.c: Ditto. * gcc.target/riscv/movdibgtu-zicond.c: Ditto. * gcc.target/riscv/movdibgtu.c: Ditto. * gcc.target/riscv/movdible-thead.c: Ditto. * gcc.target/riscv/movdible-ventana.c: Ditto. * gcc.target/riscv/movdible-zicond.c: Ditto. * gcc.target/riscv/movdible.c: Ditto. * gcc.target/riscv/movdibleu-thead.c: Ditto. * gcc.target/riscv/movdibleu-ventana.c: Ditto. * gcc.target/riscv/movdibleu-zicond.c: Ditto. * gcc.target/riscv/movdibleu.c: Ditto. 
* gcc.target/riscv/movdiblt-thead.c: Ditto. * gcc.target/riscv/movdiblt-ventana.c: Ditto. * gcc.target/riscv/movdiblt-zicond.c: Ditto. * gcc.target/riscv/movdiblt.c: Ditto. * gcc.target/riscv/movdibltu-thead.c: Ditto. * gcc.target/riscv/movdibltu-ventana.c: Ditto. * gcc.target/riscv/movdibltu-zicond.c: Ditto. * gcc.target/riscv/movdibltu.c: Ditto. * gcc.target/riscv/movdibne-thead.c: Ditto. * gcc.target/riscv/movdibne-ventana.c: Ditto. * gcc.target/riscv/movdibne-zicond.c: Ditto. * gcc.target/riscv/movdibne.c: Ditto. * gcc.target/riscv/movdieq-sfb.c: Ditto. * gcc.target/riscv/movdieq-thead.c: Ditto. * gcc.target/riscv/movdieq-ventana.c: Ditto. * gcc.target/riscv/movdieq-zicond.c: Ditto. * gcc.target/riscv/movdieq.c: Ditto. * gcc.target/riscv/movdifeq-sfb.c: Ditto. * gcc.target/riscv/movdifeq-thead.c: Ditto. * gcc.target/riscv/movdifeq-ventana.c: Ditto. * gcc.target/riscv/movdifeq-zicond.c: Ditto. * gcc.target/riscv/movdifeq.c: Ditto. * gcc.target/riscv/movdifge-sfb.c: Ditto. * gcc.target/riscv/movdifge-thead.c: Ditto. * gcc.target/riscv/movdifge-ventana.c: Ditto. * gcc.target/riscv/movdifge-zicond.c: Ditto. * gcc.target/riscv/movdifge.c: Ditto. * gcc.target/riscv/movdifgt-sfb.c: Ditto. * gcc.target/riscv/movdifgt-thead.c: Ditto. * gcc.target/riscv/movdifgt-ventana.c: Ditto. * gcc.target/riscv/movdifgt-zicond.c: Ditto. * gcc.target/riscv/movdifgt.c: Ditto. * gcc.target/riscv/movdifle-sfb.c: Ditto. * gcc.target/riscv/movdifle-thead.c: Ditto. * gcc.target/riscv/movdifle-ventana.c: Ditto. * gcc.target/riscv/movdifle-zicond.c: Ditto. * gcc.target/riscv/movdifle.c: Ditto. * gcc.target/riscv/movdiflt-sfb.c: Ditto. * gcc.target/riscv/movdiflt-thead.c: Ditto. * gcc.target/riscv/movdiflt-ventana.c: Ditto. * gcc.target/riscv/movdiflt-zicond.c: Ditto. * gcc.target/riscv/movdiflt.c: Ditto. * gcc.target/riscv/movdifne-sfb.c: Ditto. * gcc.target/riscv/movdifne-thead.c: Ditto. * gcc.target/riscv/movdifne-ventana.c: Ditto. * gcc.target/riscv/movdifne-zicond.c: Ditto. 
* gcc.target/riscv/movdifne.c: Ditto. * gcc.target/riscv/movdige-sfb.c: Ditto. * gcc.target/riscv/movdige-thead.c: Ditto. * gcc.target/riscv/movdige-ventana.c: Ditto. * gcc.target/riscv/movdige-zicond.c: Ditto. * gcc.target/riscv/movdige.c: Ditto. * gcc.target/riscv/movdigeu-sfb.c: Ditto. * gcc.target/riscv/movdigeu-thead.c: Ditto. * gcc.target/riscv/movdigeu-ventana.c: Ditto. * gcc.target/riscv/movdigeu-zicond.c: Ditto. * gcc.target/riscv/movdigeu.c: Ditto. * gcc.target/riscv/movdigt-sfb.c: Ditto. * gcc.target/riscv/movdigt-thead.c: Ditto. * gcc.target/riscv/movdigt-ventana.c: Ditto. * gcc.target/riscv/movdigt-zicond.c: Ditto. * gcc.target/riscv/movdigt.c: Ditto. * gcc.target/riscv/movdigtu-sfb.c: Ditto. * gcc.target/riscv/movdigtu-thead.c: Ditto. * gcc.target/riscv/movdigtu-ventana.c: Ditto. * gcc.target/riscv/movdigtu-zicond.c: Ditto. * gcc.target/riscv/movdigtu.c: Ditto. * gcc.target/riscv/movdile-sfb.c: Ditto. * gcc.target/riscv/movdile-thead.c: Ditto. * gcc.target/riscv/movdile-ventana.c: Ditto. * gcc.target/riscv/movdile-zicond.c: Ditto. * gcc.target/riscv/movdile.c: Ditto. * gcc.target/riscv/movdileu-sfb.c: Ditto. * gcc.target/riscv/movdileu-thead.c: Ditto. * gcc.target/riscv/movdileu-ventana.c: Ditto. * gcc.target/riscv/movdileu-zicond.c: Ditto. * gcc.target/riscv/movdileu.c: Ditto. * gcc.target/riscv/movdilt-sfb.c: Ditto. * gcc.target/riscv/movdilt-thead.c: Ditto. * gcc.target/riscv/movdilt-ventana.c: Ditto. * gcc.target/riscv/movdilt-zicond.c: Ditto. * gcc.target/riscv/movdilt.c: Ditto. * gcc.target/riscv/movdiltu-sfb.c: Ditto. * gcc.target/riscv/movdiltu-thead.c: Ditto. * gcc.target/riscv/movdiltu-ventana.c: Ditto. * gcc.target/riscv/movdiltu-zicond.c: Ditto. * gcc.target/riscv/movdiltu.c: Ditto. * gcc.target/riscv/movdine-sfb.c: Ditto. * gcc.target/riscv/movdine-thead.c: Ditto. * gcc.target/riscv/movdine-ventana.c: Ditto. * gcc.target/riscv/movdine-zicond.c: Ditto. * gcc.target/riscv/movdine.c: Ditto. * gcc.target/riscv/movsibeq-thead.c: Ditto. 
* gcc.target/riscv/movsibeq-ventana.c: Ditto. * gcc.target/riscv/movsibeq-zicond.c: Ditto. * gcc.target/riscv/movsibeq.c: Ditto. * gcc.target/riscv/movsibfeq-ventana.c: Ditto. * gcc.target/riscv/movsibfeq-zicond.c: Ditto. * gcc.target/riscv/movsibfeq.c: Ditto. * gcc.target/riscv/movsibfge-ventana.c: Ditto. * gcc.target/riscv/movsibfge-zicond.c: Ditto. * gcc.target/riscv/movsibfge.c: Ditto. * gcc.target/riscv/movsibfgt-ventana.c: Ditto. * gcc.target/riscv/movsibfgt-zicond.c: Ditto. * gcc.target/riscv/movsibfgt.c: Ditto. * gcc.target/riscv/movsibfle-ventana.c: Ditto. * gcc.target/riscv/movsibfle-zicond.c: Ditto. * gcc.target/riscv/movsibfle.c: Ditto. * gcc.target/riscv/movsibflt-ventana.c: Ditto. * gcc.target/riscv/movsibflt-zicond.c: Ditto. * gcc.target/riscv/movsibflt.c: Ditto. * gcc.target/riscv/movsibfne-ventana.c: Ditto. * gcc.target/riscv/movsibfne-zicond.c: Ditto. * gcc.target/riscv/movsibfne.c: Ditto. * gcc.target/riscv/movsibge-thead.c: Ditto. * gcc.target/riscv/movsibge-ventana.c: Ditto. * gcc.target/riscv/movsibge-zicond.c: Ditto. * gcc.target/riscv/movsibge.c: Ditto. * gcc.target/riscv/movsibgeu-thead.c: Ditto. * gcc.target/riscv/movsibgeu-ventana.c: Ditto. * gcc.target/riscv/movsibgeu-zicond.c: Ditto. * gcc.target/riscv/movsibgeu.c: Ditto. * gcc.target/riscv/movsibgt-thead.c: Ditto. * gcc.target/riscv/movsibgt-ventana.c: Ditto. * gcc.target/riscv/movsibgt-zicond.c: Ditto. * gcc.target/riscv/movsibgt.c: Ditto. * gcc.target/riscv/movsibgtu-thead.c: Ditto. * gcc.target/riscv/movsibgtu-ventana.c: Ditto. * gcc.target/riscv/movsibgtu-zicond.c: Ditto. * gcc.target/riscv/movsibgtu.c: Ditto. * gcc.target/riscv/movsible-thead.c: Ditto. * gcc.target/riscv/movsible-ventana.c: Ditto. * gcc.target/riscv/movsible-zicond.c: Ditto. * gcc.target/riscv/movsible.c: Ditto. * gcc.target/riscv/movsibleu-thead.c: Ditto. * gcc.target/riscv/movsibleu-ventana.c: Ditto. * gcc.target/riscv/movsibleu-zicond.c: Ditto. * gcc.target/riscv/movsibleu.c: Ditto. 
* gcc.target/riscv/movsiblt-thead.c: Ditto. * gcc.target/riscv/movsiblt-ventana.c: Ditto. * gcc.target/riscv/movsiblt-zicond.c: Ditto. * gcc.target/riscv/movsiblt.c: Ditto. * gcc.target/riscv/movsibltu-thead.c: Ditto. * gcc.target/riscv/movsibltu-ventana.c: Ditto. * gcc.target/riscv/movsibltu-zicond.c: Ditto. * gcc.target/riscv/movsibltu.c: Ditto. * gcc.target/riscv/movsibne-thead.c: Ditto. * gcc.target/riscv/movsibne-ventana.c: Ditto. * gcc.target/riscv/movsibne-zicond.c: Ditto. * gcc.target/riscv/movsibne.c: Ditto. * gcc.target/riscv/movsieq-sfb.c: Ditto. * gcc.target/riscv/movsieq-thead.c: Ditto. * gcc.target/riscv/movsieq-ventana.c: Ditto. * gcc.target/riscv/movsieq-zicond.c: Ditto. * gcc.target/riscv/movsieq.c: Ditto. * gcc.target/riscv/movsifeq-sfb.c: Ditto. * gcc.target/riscv/movsifeq-thead.c: Ditto. * gcc.target/riscv/movsifeq-ventana.c: Ditto. * gcc.target/riscv/movsifeq-zicond.c: Ditto. * gcc.target/riscv/movsifeq.c: Ditto. * gcc.target/riscv/movsifge-sfb.c: Ditto. * gcc.target/riscv/movsifge-thead.c: Ditto. * gcc.target/riscv/movsifge-ventana.c: Ditto. * gcc.target/riscv/movsifge-zicond.c: Ditto. * gcc.target/riscv/movsifge.c: Ditto. * gcc.target/riscv/movsifgt-sfb.c: Ditto. * gcc.target/riscv/movsifgt-thead.c: Ditto. * gcc.target/riscv/movsifgt-ventana.c: Ditto. * gcc.target/riscv/movsifgt-zicond.c: Ditto. * gcc.target/riscv/movsifgt.c: Ditto. * gcc.target/riscv/movsifle-sfb.c: Ditto. * gcc.target/riscv/movsifle-thead.c: Ditto. * gcc.target/riscv/movsifle-ventana.c: Ditto. * gcc.target/riscv/movsifle-zicond.c: Ditto. * gcc.target/riscv/movsifle.c: Ditto. * gcc.target/riscv/movsiflt-sfb.c: Ditto. * gcc.target/riscv/movsiflt-thead.c: Ditto. * gcc.target/riscv/movsiflt-ventana.c: Ditto. * gcc.target/riscv/movsiflt-zicond.c: Ditto. * gcc.target/riscv/movsiflt.c: Ditto. * gcc.target/riscv/movsifne-sfb.c: Ditto. * gcc.target/riscv/movsifne-thead.c: Ditto. * gcc.target/riscv/movsifne-ventana.c: Ditto. * gcc.target/riscv/movsifne-zicond.c: Ditto. 
* gcc.target/riscv/movsifne.c: Ditto. * gcc.target/riscv/movsige-sfb.c: Ditto. * gcc.target/riscv/movsige-thead.c: Ditto. * gcc.target/riscv/movsige-ventana.c: Ditto. * gcc.target/riscv/movsige-zicond.c: Ditto. * gcc.target/riscv/movsige.c: Ditto. * gcc.target/riscv/movsigeu-sfb.c: Ditto. * gcc.target/riscv/movsigeu-thead.c: Ditto. * gcc.target/riscv/movsigeu-ventana.c: Ditto. * gcc.target/riscv/movsigeu-zicond.c: Ditto. * gcc.target/riscv/movsigeu.c: Ditto. * gcc.target/riscv/movsigt-sfb.c: Ditto. * gcc.target/riscv/movsigt-thead.c: Ditto. * gcc.target/riscv/movsigt-ventana.c: Ditto. * gcc.target/riscv/movsigt-zicond.c: Ditto. * gcc.target/riscv/movsigt.c: Ditto. * gcc.target/riscv/movsigtu-sfb.c: Ditto. * gcc.target/riscv/movsigtu-thead.c: Ditto. * gcc.target/riscv/movsigtu-ventana.c: Ditto. * gcc.target/riscv/movsigtu-zicond.c: Ditto. * gcc.target/riscv/movsigtu.c: Ditto. * gcc.target/riscv/movsile-sfb.c: Ditto. * gcc.target/riscv/movsile-thead.c: Ditto. * gcc.target/riscv/movsile-ventana.c: Ditto. * gcc.target/riscv/movsile-zicond.c: Ditto. * gcc.target/riscv/movsile.c: Ditto. * gcc.target/riscv/movsileu-sfb.c: Ditto. * gcc.target/riscv/movsileu-thead.c: Ditto. * gcc.target/riscv/movsileu-ventana.c: Ditto. * gcc.target/riscv/movsileu-zicond.c: Ditto. * gcc.target/riscv/movsileu.c: Ditto. * gcc.target/riscv/movsilt-sfb.c: Ditto. * gcc.target/riscv/movsilt-thead.c: Ditto. * gcc.target/riscv/movsilt-ventana.c: Ditto. * gcc.target/riscv/movsilt-zicond.c: Ditto. * gcc.target/riscv/movsilt.c: Ditto. * gcc.target/riscv/movsiltu-sfb.c: Ditto. * gcc.target/riscv/movsiltu-thead.c: Ditto. * gcc.target/riscv/movsiltu-ventana.c: Ditto. * gcc.target/riscv/movsiltu-zicond.c: Ditto. * gcc.target/riscv/movsiltu.c: Ditto. * gcc.target/riscv/movsine-sfb.c: Ditto. * gcc.target/riscv/movsine-thead.c: Ditto. * gcc.target/riscv/movsine-ventana.c: Ditto. * gcc.target/riscv/movsine-zicond.c: Ditto. * gcc.target/riscv/movsine.c: Ditto. * gcc.target/riscv/pr111501.c: Ditto. 
* gcc.target/riscv/pr115921.c: Ditto. * gcc.target/riscv/pr116033.c: Ditto. * gcc.target/riscv/pr116035-1.c: Ditto. * gcc.target/riscv/pr116035-2.c: Ditto. * gcc.target/riscv/pr116131.c: Ditto. * gcc.target/riscv/reg_subreg_costs.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/shuffle-slide.c: Ditto. * gcc.target/riscv/rvv/xtheadvector.c: Ditto. * gcc.target/riscv/rvv/xtheadvector/pr114194.c: Ditto. * gcc.target/riscv/sign-extend-rshift-32.c: Ditto. * gcc.target/riscv/sign-extend-rshift-64.c: Ditto. * gcc.target/riscv/sign-extend-rshift.c: Ditto. * gcc.target/riscv/synthesis-1.c: Ditto. * gcc.target/riscv/synthesis-10.c: Ditto. * gcc.target/riscv/synthesis-11.c: Ditto. * gcc.target/riscv/synthesis-12.c: Ditto. * gcc.target/riscv/synthesis-13.c: Ditto. * gcc.target/riscv/synthesis-14.c: Ditto. * gcc.target/riscv/synthesis-15.c: Ditto. * gcc.target/riscv/synthesis-16.c: Ditto. * gcc.target/riscv/synthesis-2.c: Ditto. * gcc.target/riscv/synthesis-3.c: Ditto. * gcc.target/riscv/synthesis-4.c: Ditto. * gcc.target/riscv/synthesis-5.c: Ditto. * gcc.target/riscv/synthesis-6.c: Ditto. * gcc.target/riscv/synthesis-7.c: Ditto. * gcc.target/riscv/synthesis-8.c: Ditto. * gcc.target/riscv/synthesis-9.c: Ditto. * gcc.target/riscv/target-attr-16.c: Ditto. * gcc.target/riscv/target-attr-norelax.c: Ditto. * gcc.target/riscv/xtheadba-addsl.c: Ditto. * gcc.target/riscv/xtheadba.c: Ditto. * gcc.target/riscv/xtheadbb-ext-1.c: Ditto. * gcc.target/riscv/xtheadbb-ext-2.c: Ditto. * gcc.target/riscv/xtheadbb-ext-3.c: Ditto. * gcc.target/riscv/xtheadbb-ext.c: Ditto. * gcc.target/riscv/xtheadbb-extu-1.c: Ditto. * gcc.target/riscv/xtheadbb-extu-2.c: Ditto. * gcc.target/riscv/xtheadbb-extu-4.c: Ditto. * gcc.target/riscv/xtheadbb-extu.c: Ditto. * gcc.target/riscv/xtheadbb-ff1.c: Ditto. * gcc.target/riscv/xtheadbb-rev.c: Ditto. * gcc.target/riscv/xtheadbb-srri.c: Ditto. * gcc.target/riscv/xtheadbb-strcmp.c: Ditto. * gcc.target/riscv/xtheadbb-strlen-unaligned.c: Ditto. 
* gcc.target/riscv/xtheadbb-strlen.c: Ditto. * gcc.target/riscv/xtheadbb.c: Ditto. * gcc.target/riscv/xtheadbs-tst.c: Ditto. * gcc.target/riscv/xtheadbs.c: Ditto. * gcc.target/riscv/xtheadcmo.c: Ditto. * gcc.target/riscv/xtheadcondmov-indirect.c: Ditto. * gcc.target/riscv/xtheadcondmov-mveqz-imm-eqz.c: Ditto. * gcc.target/riscv/xtheadcondmov-mveqz-imm-not.c: Ditto. * gcc.target/riscv/xtheadcondmov-mveqz-reg-eqz.c: Ditto. * gcc.target/riscv/xtheadcondmov-mveqz-reg-not.c: Ditto. * gcc.target/riscv/xtheadcondmov-mvnez-imm-cond.c: Ditto. * gcc.target/riscv/xtheadcondmov-mvnez-imm-nez.c: Ditto. * gcc.target/riscv/xtheadcondmov-mvnez-reg-cond.c: Ditto. * gcc.target/riscv/xtheadcondmov-mvnez-reg-nez.c: Ditto. * gcc.target/riscv/xtheadcondmov.c: Ditto. * gcc.target/riscv/xtheadfmemidx-without-xtheadmemidx.c: Ditto. * gcc.target/riscv/xtheadfmemidx.c: Ditto. * gcc.target/riscv/xtheadfmv.c: Ditto. * gcc.target/riscv/xtheadint.c: Ditto. * gcc.target/riscv/xtheadmac-mula-muls.c: Ditto. * gcc.target/riscv/xtheadmac.c: Ditto. * gcc.target/riscv/xtheadmemidx-index-update.c: Ditto. * gcc.target/riscv/xtheadmemidx-index-xtheadbb-update.c: Ditto. * gcc.target/riscv/xtheadmemidx-index-xtheadbb.c: Ditto. * gcc.target/riscv/xtheadmemidx-index.c: Ditto. * gcc.target/riscv/xtheadmemidx-modify-xtheadbb.c: Ditto. * gcc.target/riscv/xtheadmemidx-modify.c: Ditto. * gcc.target/riscv/xtheadmemidx-uindex-update.c: Ditto. * gcc.target/riscv/xtheadmemidx-uindex-xtheadbb-update.c: Ditto. * gcc.target/riscv/xtheadmemidx-uindex-xtheadbb.c: Ditto. * gcc.target/riscv/xtheadmemidx-uindex.c: Ditto. * gcc.target/riscv/xtheadmemidx.c: Ditto. * gcc.target/riscv/xtheadmempair-1.c: Ditto. * gcc.target/riscv/xtheadmempair-2.c: Ditto. * gcc.target/riscv/xtheadmempair-3.c: Ditto. * gcc.target/riscv/xtheadmempair-4.c: Ditto. * gcc.target/riscv/xtheadmempair-interrupt-fcsr.c: Ditto. * gcc.target/riscv/xtheadmempair.c: Ditto. * gcc.target/riscv/xtheadsync.c: Ditto. * gcc.target/riscv/za-ext.c: Ditto. 
* gcc.target/riscv/zawrs.c: Ditto. * gcc.target/riscv/zbb-strcmp-disabled-2.c: Ditto. * gcc.target/riscv/zbb-strcmp-disabled.c: Ditto. * gcc.target/riscv/zbb-strcmp-limit.c: Ditto. * gcc.target/riscv/zbb-strcmp-unaligned.c: Ditto. * gcc.target/riscv/zbb-strcmp.c: Ditto. * gcc.target/riscv/zbb-strlen-disabled-2.c: Ditto. * gcc.target/riscv/zbb-strlen-disabled.c: Ditto. * gcc.target/riscv/zbb-strlen-unaligned.c: Ditto. * gcc.target/riscv/zbb-strlen.c: Ditto. * gcc.target/riscv/zero-extend-rshift-32.c: Ditto. * gcc.target/riscv/zero-extend-rshift-64.c: Ditto. * gcc.target/riscv/zero-extend-rshift.c: Ditto. * gcc.target/riscv/zi-ext.c: Ditto. * gcc.target/riscv/zvbb.c: Ditto. * gcc.target/riscv/zvbc.c: Ditto. * gcc.target/riscv/zvkb.c: Ditto. * gcc.target/riscv/zvkg.c: Ditto. * gcc.target/riscv/zvkn-1.c: Ditto. * gcc.target/riscv/zvkn.c: Ditto. * gcc.target/riscv/zvknc-1.c: Ditto. * gcc.target/riscv/zvknc-2.c: Ditto. * gcc.target/riscv/zvknc.c: Ditto. * gcc.target/riscv/zvkned.c: Ditto. * gcc.target/riscv/zvkng-1.c: Ditto. * gcc.target/riscv/zvkng-2.c: Ditto. * gcc.target/riscv/zvkng.c: Ditto. * gcc.target/riscv/zvknha.c: Ditto. * gcc.target/riscv/zvknhb.c: Ditto. * gcc.target/riscv/zvks-1.c: Ditto. * gcc.target/riscv/zvks.c: Ditto. * gcc.target/riscv/zvksc-1.c: Ditto. * gcc.target/riscv/zvksc-2.c: Ditto. * gcc.target/riscv/zvksc.c: Ditto. * gcc.target/riscv/zvksed.c: Ditto. * gcc.target/riscv/zvksg-1.c: Ditto. * gcc.target/riscv/zvksg-2.c: Ditto. * gcc.target/riscv/zvksg.c: Ditto. * gcc.target/riscv/zvksh.c: Ditto. * gcc.target/riscv/zvkt.c: Ditto. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2025-01-07AArch64: Switch off early schedulingWilco Dijkstra3-3/+3
The early scheduler takes up ~33% of the total build time, however it doesn't provide a meaningful performance gain. This is partly because modern OoO cores need far less scheduling, partly because the scheduler tends to create many unnecessary spills by increasing register pressure. Building applications 56% faster is far more useful than ~0.1% improvement on SPEC, so switch off early scheduling on AArch64. Codesize reduces by ~0.2%. Fix various tests that depend on scheduling by explicitly adding -fschedule-insns. gcc: * common/config/aarch64/aarch64-common.cc: Switch off fschedule_insns. gcc/testsuite: * gcc.dg/guality/pr36728-3.c: Remove XFAIL. * gcc.dg/guality/pr68860-1.c: Likewise. * gcc.dg/guality/pr68860-2.c: Likewise. * gcc.target/aarch64/ldp_aligned.c: Fix test. * gcc.target/aarch64/ldp_always.c: Likewise. * gcc.target/aarch64/ldp_stp_10.c: Add -fschedule-insns. * gcc.target/aarch64/ldp_stp_12.c: Likewise. * gcc.target/aarch64/ldp_stp_13.c: Remove test. * gcc.target/aarch64/ldp_stp_21.c: Add -fschedule-insns. * gcc.target/aarch64/ldp_stp_8.c: Likewise. * gcc.target/aarch64/ldp_vec_v2sf.c: Likewise. * gcc.target/aarch64/ldp_vec_v2si.c: Likewise. * gcc.target/aarch64/test_frame_16.c: Fix test. * gcc.target/aarch64/sve/vcond_12.c: Add -fschedule-insns. * gcc.target/aarch64/sve/acle/general/ldff1_3.c: Likewise.
2025-01-07perform affine fold to unsigned on non address expressions. [PR114932]Tamar Christina1-1/+1
When the patch for PR114074 was applied we saw a good boost in exchange2. This boost was partially caused by a simplification of the addressing modes. With the patch applied IVopts saw the following form for the base addressing: Base: (integer(kind=4) *) &block + ((sizetype) ((unsigned long) l0_19(D) * 324) + 36) vs what we normally get: Base: (integer(kind=4) *) &block + ((sizetype) ((integer(kind=8)) l0_19(D) * 81) + 9) * 4 This is because the patch promoted multiplies where one operand is a constant from a signed multiply to an unsigned one, to attempt to fold away the constant. This patch attempts the same, but due to the various problems with SCEV and niters not being able to analyze the resulting forms (i.e. PR114322) we can't do it during SCEV or in the general form in fold-const, as extract_muldiv attempts. Instead this applies the simplification during IVopts initialization when we create the IV. This allows IVopts to see the simplified form without influencing the rest of the compiler. As mentioned in PR114074, it would be good to fix the missed optimization in the other passes so we can perform this in general. The reason this has a big impact on Fortran code is that Fortran doesn't seem to have unsigned integer types. As such, all its addressing expressions are created with signed types and folding does not happen on them due to the possible overflow. 
Concretely, on AArch64 this changes the generated code from: mov x27, -108 mov x24, -72 mov x23, -36 add x21, x1, x0, lsl 2 add x19, x20, x22 .L5: add x0, x22, x19 add x19, x19, 324 ldr d1, [x0, x27] add v1.2s, v1.2s, v15.2s str d1, [x20, 216] ldr d0, [x0, x24] add v0.2s, v0.2s, v15.2s str d0, [x20, 252] ldr d31, [x0, x23] add v31.2s, v31.2s, v15.2s str d31, [x20, 288] bl digits_20_ cmp x21, x19 bne .L5 into: .L5: ldr d1, [x19, -108] add v1.2s, v1.2s, v15.2s str d1, [x20, 216] ldr d0, [x19, -72] add v0.2s, v0.2s, v15.2s str d0, [x20, 252] ldr d31, [x19, -36] add x19, x19, 324 add v31.2s, v31.2s, v15.2s str d31, [x20, 288] bl digits_20_ cmp x21, x19 bne .L5 The two patches together result in a 10% performance increase in exchange2 in SPECCPU 2017, a 4% reduction in binary size and a 5% improvement in compile time. There's also a 5% performance improvement in fotonik3d and a similar reduction in binary size. The patch folds every IV to unsigned to canonicalize them. At the end of the pass, match.pd will then remove unneeded conversions. Note that we cannot force everything to unsigned: IVopts requires that array address expressions remain as such. Folding them results in them becoming pointer expressions for which some optimizations in IVopts do not run. gcc/ChangeLog: PR tree-optimization/114932 * tree-ssa-loop-ivopts.cc (alloc_iv): Perform affine unsigned fold. gcc/testsuite/ChangeLog: PR tree-optimization/114932 * gcc.dg/tree-ssa/pr64705.c: Update dump file scan. * gcc.target/i386/pr115462.c: The testcase shares 3 IVs which calculates the same thing but with a slightly different increment offset. The test checks for 3 complex addressing loads, one for each IV. But with this change they now all share one IV. That is the loop now only has one complex addressing. This is ultimately driven by the backend costing and the current costing says this is preferred so updating the testcase. * gfortran.dg/addressing-modes_1.f90: New test.
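The relationship between the two IV base forms quoted in this commit message can be illustrated with a small C sketch. This is not code from the patch: the function names are hypothetical, and fixed-width types stand in for Fortran's integer(kind=8) and for sizetype. It only shows that once the multiply is done in unsigned (wrapping) arithmetic, the scale factor 4 can be distributed into the constants, turning * 81 + 9 into * 324 + 36.

```c
#include <stdint.h>

/* Hypothetical illustration; the constants 81, 9, 4, 324 and 36 are the
   ones that appear in the IV bases quoted in the commit message.  */

/* Signed form: multiply and add in a signed type, then scale by the
   element size.  The signed multiply must not overflow, which is why
   the compiler cannot freely distribute the "* 4" here.  */
uint64_t offset_signed_form(int64_t l0)
{
    return (uint64_t)(l0 * 81 + 9) * 4;
}

/* Unsigned form: convert first, then do everything in wrapping
   arithmetic, where (x * 81 + 9) * 4 == x * 324 + 36 always holds.  */
uint64_t offset_unsigned_form(int64_t l0)
{
    return (uint64_t)l0 * 324 + 36;
}
```

For any index where the signed computation does not overflow, both functions return the same byte offset, including negative indices, because the final result is taken modulo 2^64 in both cases.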
2025-01-07cfgexpand: Rewrite add_scope_conflicts_2 to use cache and look back further ↵Andrew Pinski1-0/+53
[PR111422] After fixing loop-im to do the correct overflow rewriting for pointer types too, we end up with code like: ``` _9 = (unsigned long) &g; _84 = _9 + 18446744073709551615; _11 = _42 + _84; _44 = (signed char *) _11; ... *_44 = 10; g ={v} {CLOBBER(eos)}; ... n[0] = &f; *_44 = 8; g ={v} {CLOBBER(eos)}; ``` which was not being recognized by the scope conflicts code. This was because it only walked back one level rather than multiple levels. This fixes the issue by having a cache which records all references to addresses of stack variables. Unlike the previous patch, this only records and looks at addresses of stack variables. The cache uses a bitmap and uses the index as the bit to look at. PR middle-end/117426 PR middle-end/111422 gcc/ChangeLog: * cfgexpand.cc (struct vars_ssa_cache): New class. (vars_ssa_cache::vars_ssa_cache): New constructor. (vars_ssa_cache::~vars_ssa_cache): New deconstructor. (vars_ssa_cache::create): New method. (vars_ssa_cache::exists): New method. (vars_ssa_cache::add_one): New method. (vars_ssa_cache::update): New method. (vars_ssa_cache::dump): New method. (add_scope_conflicts_2): Factor mostly out to vars_ssa_cache::operator(). New cache argument. Walk the bitmap cache for the stack variables addresses. (vars_ssa_cache::operator()): New method factored out from add_scope_conflicts_2. Rewrite to be a full walk of all operands and use a worklist. (add_scope_conflicts_1): Add cache new argument for the addr cache. Just call add_scope_conflicts_2 for the phi result instead of calling for the uses and don't call walk_stmt_load_store_addr_ops for phis. Update call to add_scope_conflicts_2 to add cache argument. (add_scope_conflicts): Add cache argument and update calls to add_scope_conflicts_1. gcc/testsuite/ChangeLog: * gcc.dg/torture/pr117426-1.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2025-01-07[PR testsuite/118055] Trivial testsuite adjustment for m68k targetJeff Law2-2/+2
After a bit of a prod from Hans... Make the obvious change to these tests to get them passing again on m68k. PR testsuite/118055 gcc/testsuite * gcc.dg/tree-ssa/pr83403-1.c: Add m68k*-*-* to targets needing additional arguments for peeling. * gcc.dg/tree-ssa/pr83403-2.c: Similarly.
2025-01-07Fixup convert-dfp*.cRichard Biener2-0/+2
The testcases use -save-temps which doesn't play nice with -flto and multilib testing resulting in spurious UNRESOLVED like /usr/lib64/gcc/x86_64-suse-linux/14/../../../../x86_64-suse-linux/bin/ld: i386:x86-64 architecture of input file `./convert-dfp-2.ltrans0.ltrans.o' is incompatible with i386 output The following skips the testcases when using -flto. * gcc.dg/torture/convert-dfp-2.c: Skip with -flto. * gcc.dg/torture/convert-dfp.c: Likewise.
2025-01-07testsuite: add testcase for fixed PR117546Sam James1-0/+84
PR117546 was fixed by Eric's r14-10693-gadab597af288d6 change, but the testcase here is sufficiently different to be worth including in torture/. gcc/testsuite/ChangeLog: PR ipa/117546 * gcc.dg/torture/pr117546.c: New test.
2025-01-06tree-ssa-dce: Punt on allocations with too large constant sizes [PR118224]Jakub Jelinek1-0/+31
As suggested by Richi in the PR, the following patch will fail to DCE allocation calls if they have constant size which is too large (over PTRDIFF_MAX), or for the case of calloc, if either of the arguments is too large (in that case in theory the call could succeed if the other argument is variable zero but who cares) or if both are constant and their product overflows or is above PTRDIFF_MAX. This will make some pedantic conformance tests happy, though if one hides the size one will still need to use -fno-malloc-dce or obfuscate even the malloc etc. uses. If the size is constant and too large, it isn't worth trying to optimize it. 2025-01-06 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/118224 * tree-ssa-dce.cc (is_removable_allocation_p): Don't return true for allocations with constant size argument larger than PTRDIFF_MAX or for calloc with one of the arguments constant larger than PTRDIFF_MAX or their product known constant above PTRDIFF_MAX. Fix comment typos, furhter -> further and then -> than. * lto-section-in.cc (lto_free_function_in_decl_state_for_node): Fix comment typo, furhter -> further. * gcc.dg/pr118224.c: New test. * c-c++-common/ubsan/vla-1.c (bar): Use noipa attribute instead of noinline, noclone.
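The size rule described in this commit can be sketched as a standalone C predicate. This is only an illustration under stated assumptions, not the code in is_removable_allocation_p: the function names are hypothetical, and plain size_t values stand in for the constant tree operands GCC actually inspects.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: a single constant size above PTRDIFF_MAX can
   never be allocated successfully, so the call should not be removed.  */
bool size_too_large(size_t size)
{
    return size > (size_t)PTRDIFF_MAX;
}

/* Hypothetical helper for the calloc case with two known constants:
   punt if either argument is too large, if the product overflows,
   or if the product is above PTRDIFF_MAX.  */
bool calloc_too_large(size_t nmemb, size_t size)
{
    if (size_too_large(nmemb) || size_too_large(size))
        return true;
    if (nmemb != 0 && size > SIZE_MAX / nmemb)
        return true;                     /* nmemb * size would overflow */
    return size_too_large(nmemb * size);
}
```

A call such as malloc with a constant argument of PTRDIFF_MAX + 1 would make the first predicate true, which in the terms of the commit message means the allocation is kept rather than eliminated as dead.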
2025-01-04testsuite: Replace MMIX-specific adjustments with ↵Hans-Peter Nilsson2-4/+2
TARGET_CALLEE_COPIES-adjustments With the dump now emitting "privatized symbols" in the default "%s.%lu" format also for MMIX, there's still a difference for MMIX. This time it's because numbers have changed (copies introduced before this point) because it has TARGET_CALLEE_COPIES yielding true. Redundant copies may have been elided at this point, but the change in name remains. Since that's true for other targets too, an obvious change is to generalize the tested patterns to include TARGET_CALLEE_COPIES-true targets, as a brief inspection of the history of these tests shows that the point of these tests lies not in whether copies have been done but in the part of the pattern that matches a constant. Also fixed up a "." where there should have been a "\\.". * gcc.dg/tree-ssa/vector-4.c: Replace MMIX adjustments with TARGET_CALLEE_COPIES-agnostic adjustments. * gcc.dg/tree-ssa/forwprop-36.c: Ditto. Correct pattern to match a literal ".".
2025-01-03rtlanal: Treat writes to sp as also writing to memory [PR117938]Richard Sandiford1-0/+36
This PR was about a case in which late-combine moved a stack deallocation across an earlier stack access. This was possible because the deallocation was missing the RTL-SSA equivalent of a vop, which in turn was because rtx_properties didn't treat the deallocation as writing to memory. I think the bug was ultimately there. gcc/ PR rtl-optimization/117938 * rtlanal.cc (rtx_properties::try_to_add_dest): Treat writes to the stack pointer as also writing to memory. gcc/testsuite/ PR rtl-optimization/117938 * gcc.dg/torture/pr117938.c: New test.
2025-01-03testsuite: torture: add LLVM testcase for DSE vs. -ftrivial-auto-var-init=Sam James1-0/+17
This testcase came up in a recent LLVM bug report [0] for DSE vs -ftrivial-auto-var-init=. Add it to our testsuite given that area could do with better coverage. [0] https://github.com/llvm/llvm-project/issues/119646 gcc/testsuite/ChangeLog: * gcc.dg/torture/dse-trivial-auto-var-init.c: New test. Co-authored-by: Andrew Pinski <pinskia@gmail.com>
2025-01-02c: special-case some "bool" errors with C23 (v2) [PR117629]David Malcolm4-1/+43
Changed in v2: - distinguish between "bool" and "_Bool" when determining standard version This patch attempts to provide better error messages for code compiled with C23 that hasn't been updated for "bool", "true", and "false" becoming keywords. Specifically: (1) with "typedef int bool;" previously we emitted: t1.c:7:13: error: two or more data types in declaration specifiers 7 | typedef int bool; | ^~~~ t1.c:7:1: warning: useless type name in empty declaration 7 | typedef int bool; | ^~~~~~~ whereas with this patch we emit: t1.c:7:13: error: 'bool' cannot be defined via 'typedef' 7 | typedef int bool; | ^~~~ t1.c:7:13: note: 'bool' is a keyword with '-std=c23' onwards t1.c:7:1: warning: useless type name in empty declaration 7 | typedef int bool; | ^~~~~~~ (2) with "int bool;" previously we emitted: t2.c:7:5: error: two or more data types in declaration specifiers 7 | int bool; | ^~~~ t2.c:7:1: warning: useless type name in empty declaration 7 | int bool; | ^~~ whereas with this patch we emit: t2.c:7:5: error: 'bool' cannot be used here 7 | int bool; | ^~~~ t2.c:7:5: note: 'bool' is a keyword with '-std=c23' onwards t2.c:7:1: warning: useless type name in empty declaration 7 | int bool; | ^~~ (3) with "typedef enum { false = 0, true = 1 } _Bool;" previously we emitted: t3.c:7:16: error: expected identifier before 'false' 7 | typedef enum { false = 0, true = 1 } _Bool; | ^~~~~ t3.c:7:38: error: expected ';', identifier or '(' before '_Bool' 7 | typedef enum { false = 0, true = 1 } _Bool; | ^~~~~ t3.c:7:38: warning: useless type name in empty declaration whereas with this patch we emit: t3.c:7:16: error: cannot use keyword 'false' as enumeration constant 7 | typedef enum { false = 0, true = 1 } _Bool; | ^~~~~ t3.c:7:16: note: 'false' is a keyword with '-std=c23' onwards t3.c:7:38: error: expected ';', identifier or '(' before '_Bool' 7 | typedef enum { false = 0, true = 1 } _Bool; | ^~~~~ t3.c:7:38: warning: useless type name in empty declaration gcc/c/ChangeLog: PR 
c/117629 * c-decl.cc (declspecs_add_type): Special-case attempts to use bool as a typedef name or declaration name. * c-errors.cc (get_std_for_keyword): New. (add_note_about_new_keyword): New. * c-parser.cc (report_bad_enum_name): New, split out from... (c_parser_enum_specifier): ...here, adding handling for RID_FALSE and RID_TRUE. * c-tree.h (add_note_about_new_keyword): New decl. gcc/testsuite/ChangeLog: PR c/117629 * gcc.dg/auto-type-2.c: Update expected output with _Bool. * gcc.dg/c23-bool-errors-1.c: New test. * gcc.dg/c23-bool-errors-2.c: New test. * gcc.dg/c23-bool-errors-3.c: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
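As a rough illustration of the kind of source these diagnostics target, the sketch below shows legacy declarations in a comment and one way to write the same code so that it builds both before and with C23. It assumes the project can rely on <stdbool.h> (C99 or later); is_even is a hypothetical function used only for the example.

```c
/* Legacy code sometimes declared its own boolean type, for example:
 *
 *     typedef int bool;
 *     enum { false = 0, true = 1 };
 *
 * With -std=c23 onwards "bool", "true" and "false" are keywords, so
 * these declarations are rejected.  Including <stdbool.h> instead is
 * valid from C99 through C23.
 */
#include <stdbool.h>

/* Hypothetical example function using the standard spelling.  */
bool is_even(int n)
{
    return (n % 2) == 0;
}
```

The design choice here is simply to drop the private typedef rather than guard it with a version check, since <stdbool.h> is accepted in every standard mode from C99 onwards.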
2025-01-02Use _Float128 in test for PR118184Richard Sandiford1-3/+3
The test was failing on x86 because longdouble128 only checks sizeof, rather than a full 128-bit payload. Using _Float128 is more portable and still exposes the original bug. gcc/testsuite/ PR target/118184 * gcc.dg/torture/pr118184.c: Use _Float128 instead of long double.
2025-01-02tree-optimization/118171 - GENERIC folding in PRE results in invalid GIMPLERichard Biener1-0/+12
PRE applies GENERIC folding to some component ref components which might result in invalid GIMPLE, like a VIEW_CONVERT_EXPR wrapping a REALPART_EXPR as in the PR. The following removes all GENERIC folding in the code re-constructing a GENERIC component-ref from the PRE VN IL. PR tree-optimization/118171 * tree-ssa-pre.cc (create_component_ref_by_pieces_1): Do not fold any component ref parts. * gcc.dg/torture/pr118171.c: New testcase.
2025-01-02aarch64: Detect word-level modification in early-ra [PR118184]Richard Sandiford1-0/+36
REGMODE_NATURAL_SIZE is set to 64 bits for everything except VLA SVE modes. This means that it's possible to modify (say) the highpart of a TI pseudo or a V2DI pseudo independently of the lowpart. Modifying such highparts requires a reload if the highpart ends up in the upper 64 bits of an FPR, since RTL semantics do not allow the highpart of a single hard register to be modified independently of the lowpart. early-ra missed a check for this case, which meant that it effectively treated an assignment to (subreg:DI (reg:TI R) 0) as an assignment to the whole of R. gcc/ PR target/118184 * config/aarch64/aarch64-early-ra.cc (allocno_assignment_is_rmw): New function. (early_ra::record_insn_defs): Mark the live range information as untrustworthy if an assignment would change part of an allocno but preserve the rest. gcc/testsuite/ * gcc.dg/torture/pr118184.c: New test.
2025-01-02forwprop: Handle RAW_DATA_CST in check_ctz_arrayJakub Jelinek1-0/+39
In order to stress test RAW_DATA_CST handling, I've tested trunk gcc with r15-6339 reapplied and a hack where I've changed const unsigned int raw_data_min_len = 128; to const unsigned int raw_data_min_len = 2; in cp_lexer_new_main and 64 to 4 several times in c_parser_initval and c_maybe_optimize_large_byte_initializer, so that RAW_DATA_CST doesn't trigger just on very large initializers, but even quite small ones. One of the regressions (will work on the others next) was that the pr90838.c testcase regressed: check_ctz_array needs to handle RAW_DATA_CST, otherwise the optimization just won't trigger on larger initializers or if those come from #embed. The new testcase shows when it doesn't trigger anymore (regression from 14). The patch just handles RAW_DATA_CST in the CONSTRUCTOR_ELTS the same as if it were a series of INTEGER_CSTs. 2025-01-02 Jakub Jelinek <jakub@redhat.com> * tree-ssa-forwprop.cc (check_ctz_array): Handle also RAW_DATA_CST in the CONSTRUCTOR_ELTS. * gcc.dg/pr90838-2.c: New test.
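For readers unfamiliar with the idiom, the following is a sketch of a table-based count-trailing-zeros routine of the general kind that a lookup-table check like check_ctz_array is concerned with. It is the widely known de Bruijn multiply-and-lookup trick, not the contents of pr90838.c or pr90838-2.c; the names debruijn_ctz_table and ctz32_table are hypothetical.

```c
#include <stdint.h>

/* Lookup table for the 32-bit de Bruijn sequence 0x077CB531.  */
static const unsigned char debruijn_ctz_table[32] = {
    0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
    31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};

/* Hypothetical helper: number of trailing zero bits of a nonzero V.
   (v & -v) isolates the lowest set bit; multiplying by the de Bruijn
   constant and taking the top 5 bits gives a unique table index.  */
unsigned ctz32_table(uint32_t v)
{
    uint32_t lowest_bit = v & (0u - v);
    return debruijn_ctz_table[(uint32_t)(lowest_bit * 0x077CB531u) >> 27];
}
```

An array of small byte values like this table is the sort of initializer that may be represented as RAW_DATA_CST when it is large enough or comes from #embed, which is why the element walk has to cope with that representation.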
2025-01-02Update copyright years.Jakub Jelinek56-56/+56
2025-01-01middle-end/118174 - bogus TER of tailcallRichard Biener1-0/+24
The following avoids applying TER to direct internal functions that are tailcall since the involved expansion code path doesn't honor TER constraints. PR middle-end/118174 * tree-outof-ssa.cc (ssa_is_replaceable_p): Exclude tailcalls. * gcc.dg/torture/pr118174.c: New testcase.
2024-12-30[RISC-V][PR target/115375] Fix expected dump outputJeff Law1-1/+1
Several months ago changes were made to the vectorizer which mucked up several of the scan tests. All but one of the cases in pr115375 have since been fixed. The remaining failure seems to be primarily a debugging dump issue -- we're still selecting the same lmul values. This patch adjusts the dump scan appropriately. PR target/115375 gcc/testsuite * gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-9.c: Adjust expected output.
2024-12-28gimple-fold: Fix up fold_array_ctor_reference RAW_DATA_CST handling [PR118207]Jakub Jelinek1-0/+25
The following testcases ICE because fold_array_ctor_reference in the RAW_DATA_CST handling just returns build_int_cst without actually checking that, if type is non-NULL, TREE_TYPE (val) is uselessly convertible to it. By falling through to the code after it without *suboff += we get everything we need: the two if conditionals will never be true (we've already checked that size == BITS_PER_UNIT and so can't be 0, and val will be INTEGER_CST), but it will do the important fold_ctor_reference call which will deal with type incompatibilities. 2024-12-28 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/118207 * gimple-fold.cc (fold_array_ctor_reference): For RAW_DATA_CST, just set val to build_int_cst and fall through to the normal element handling code instead of returning build_int_cst right away. * gcc.dg/pr118207.c: New test.
2024-12-24testsuite/gcc.dg/memcmp-1.c: Cut down a factor of 7 for simulatorsHans-Peter Nilsson1-0/+1
Running tests in parallel on my 4.5y+ old laptop made this test time out: the test itself runs in 9m20s, the timeout being 10 minutes with the 2x factor. That's a bit too close. This commit does to the base test a similar change as was done for gcc.dg/torture/inline-mem-cpy-1.c in commit r14-8188-g6eca0d23b7ea84; or IOW cut it down a factor of 7 (r14-8188 was by a factor of 11). * gcc.dg/memcmp-1.c: Pass -DRUN_FRACTION=7 when testing in a simulator.
2024-12-23testsuite: Don't test pr118149.c on AArch64Christoph Müllner1-3/+3
Recently two test cases for PR118149 have been added. While pr118149-2.c works well for AArch64, pr118149.c fails because the expected optimization in forwprop4 cannot be applied as SLP vectorization does not happen. This patch fixes this issue by disabling the check on AArch64. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr118149.c: Disable for AArch64. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
2024-12-20strub: accept indirection of volatile pointer types [PR118007]Alexandre Oliva1-0/+5
We don't want to indirect pointers in strub wrappers, because it generally isn't profitable, but if the argument is volatile, then we must use indirection to preserve access patterns, so amend the assertion check. for gcc/ChangeLog PR middle-end/118007 * ipa-strub.cc (pass_ipa_strub::execute): Accept indirecting volatile args of pointer types. for gcc/testsuite/ChangeLog PR middle-end/118007 * gcc.dg/strub-pr118007.c: New.
2024-12-20testsuite: tree-ssa: Fix i686/-m32 fails for vector-*.c testsChristoph Müllner5-9/+8
FAILs have been reported for several tree-ssa vector-*.c tests on i686-linux or on x86_64-linux with -m32. This patch addresses these fails by setting the necessary -msse2 flags. This patch also streamlines all tests to use dg-options instead of dg-additional-options. This is in line with most other tests in gcc.dg/tree-ssa. Tested with the following board config in RUNTESTFLAGS: --target_board=unix\{-m64,-m32,-m32/-mno-mmx/-mno-sse} gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/satd-hadamard.c: Rename dg-additional-options to dg-options. * gcc.dg/tree-ssa/vector-10.c: Rename dg-additional-options to dg-options and add -msse2 to it. * gcc.dg/tree-ssa/vector-11.c: Likewise. * gcc.dg/tree-ssa/vector-8.c: Rename dg-additional-options to dg-options. * gcc.dg/tree-ssa/vector-9.c: Likewise. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
2024-12-20testsuite: Add tests for PR118149Christoph Müllner2-0/+57
A recent bugfix (eee2891312) for PR117830 also addressed PR118149. This patch adds two test cases for PR118149. These tests are different than other tests in that one of the vec-perm selectors contains indices in descending order (1, 1, 0, 0), which is the root cause for the ICE observed in PR118149. PR118149 gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr118149-2.c: New test. * gcc.dg/tree-ssa/pr118149.c: New test. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
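To make the "descending order" selector concrete, here is a scalar model of a two-input vector permute in plain C. It is only a sketch: the function name vec_perm_model and the fixed NELTS of 4 are assumptions for illustration, and the index handling (reduction modulo twice the lane count) follows the documented behaviour of GCC's two-operand __builtin_shuffle rather than anything taken from the new tests.

```c
#include <stddef.h>

#define NELTS 4

/* Lane I of OUT takes element SEL[I] of the concatenation V1 ++ V2,
   with the index reduced modulo 2 * NELTS.  */
static void
vec_perm_model (const int v1[NELTS], const int v2[NELTS],
                const unsigned sel[NELTS], int out[NELTS])
{
  for (size_t i = 0; i < NELTS; i++)
    {
      unsigned idx = sel[i] % (2 * NELTS);
      out[i] = idx < NELTS ? v1[idx] : v2[idx - NELTS];
    }
}
```

With v1 = {10, 20, 30, 40} and the selector {1, 1, 0, 0}, the result is {20, 20, 10, 10}: the selected indices go down from 1 to 0 as the lane number goes up, which is the shape the commit message identifies as the trigger.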
2024-12-20forwprop: Fix lane handling for VEC_PERM sequence blendingChristoph Müllner1-0/+38
In PR117830 a miscompilation of 464.h264ref was reported. An analysis showed that wrong code was generated because of unsatisfied assumptions. This patch addresses these issues. The first assumption was that we could independently analyze the two vec-perms at the start of a vec-perm-simplify sequence and use the information later for calculating a final vec-perm selector that utilizes fewer lanes. However, this information does not help much, because for changing the selector entry, we need to ensure that both elements of the operand vectors v_1 and v_2 remain equal. This is addressed by removing the function get_vect_selector_index_map and checking for this equality in the loop where we create the new selector. The calculation of the selector vector for the blended sequence assumed that the indices of the selector vector of the narrowed sequences are increasing. This assumption does not hold in general. This was fixed by allowing a wrap-around when searching for an empty lane. Further, there was an issue in the calculation of the selector vector entries for the second sequence. The code did not consider that the lanes of the second sequence could have been moved. A relevant property of this patch is that it introduces a couple of nested loops, where the outer loop iterates from i=0..nelts and the inner loop iterates from j=0..i. To avoid performance concerns, a check is introduced that ensures nelts won't exceed 4 lanes. The added test case is derived from h264ref (the other cases from the benchmark have the same structure and don't provide additional coverage). Bootstrapped and regression-tested on x86-64 and aarch64. Further, tested on CPU 2006 h264ref and CPU 2017 x264. PR117830 gcc/ChangeLog: * tree-ssa-forwprop.cc (get_vect_selector_index_map): Removed. (recognise_vec_perm_simplify_seq): Fix calculation of vec-perm selectors of narrowed sequence. (calc_perm_vec_perm_simplify_seqs): Fixing calculation of vec-perm selectors of the blended sequence. 
(process_vec_perm_simplify_seq_list): Add whitespace to dump string to avoid bad formatted dump output. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/vector-11.c: New test. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
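The "wrap-around when searching for an empty lane" mentioned in this commit can be pictured with a small helper. This is a hypothetical sketch, not the code in tree-ssa-forwprop.cc: find_free_lane, the used[] occupancy array, and MAX_LANES are names invented here, with MAX_LANES chosen to mirror the commit's stated limit of 4 lanes.

```c
#include <stdbool.h>

#define MAX_LANES 4  /* assumed bound, echoing the commit's nelts check */

/* Return the first unused lane at or after START, continuing from
   lane 0 once the end is reached; return -1 if every lane is taken
   or NELTS is outside the supported range.  */
static int
find_free_lane (const bool used[], unsigned nelts, unsigned start)
{
  if (nelts == 0 || nelts > MAX_LANES)
    return -1;
  for (unsigned k = 0; k < nelts; k++)
    {
      unsigned lane = (start + k) % nelts;
      if (!used[lane])
        return (int) lane;
    }
  return -1;
}
```

A search that only looked forward from START would miss a free lane 0 when START is 2; the modulo step is what lets selectors with non-increasing indices still find a slot.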
2024-12-18ifcombine field merge: handle masks with sign extensionsAlexandre Oliva1-0/+66
When a loaded field is sign extended, masked and compared, we used to drop from the mask the bits past the original field width, which is not correct. Take note of the fact that the mask covered copies of the sign bit, before clipping it, and arrange to test the sign bit if we're comparing with zero. Punt in other cases. If bits_test fails recoverably, try other ifcombine strategies. for gcc/ChangeLog * gimple-fold.cc (decode_field_reference): Add psignbit parameter. Set it if the mask references sign-extending bits. (fold_truth_andor_for_ifcombine): Adjust calls with new variables. Swap them along with other r?_* variables. Handle extended sign bit compares with zero. * tree-ssa-ifcombine.cc (ifcombine_ifandif): If bits_test fails in a way that doesn't prevent other ifcombine strategies from passing, give them a try. for gcc/testsuite/ChangeLog * gcc.dg/field-merge-16.c: New.
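The reasoning above can be checked on an 8-bit field with plain C. The sketch below is an illustration under assumptions: the three function names are invented here, the field is an int8_t widened to 32 bits, and only the compare-with-zero case is modelled. It is not taken from field-merge-16.c or from gimple-fold.cc.

```c
#include <stdbool.h>
#include <stdint.h>

/* The test as written in source: sign-extend, mask, compare with zero.  */
static bool
masked_nonzero_wide (int8_t field, uint32_t mask)
{
  return ((uint32_t) (int32_t) field & mask) != 0;
}

/* Narrowed form: keep the low 8 mask bits, and if the mask touched any
   bit past the field width (all copies of the sign bit), test the
   sign bit as well.  */
static bool
masked_nonzero_narrow (int8_t field, uint32_t mask)
{
  uint8_t narrow_mask = (uint8_t) (mask & 0xFFu);
  if (mask & ~UINT32_C (0xFF))
    narrow_mask |= 0x80u;
  return ((uint8_t) field & narrow_mask) != 0;
}

/* The incorrect form: simply drop the mask bits past the field width.  */
static bool
masked_nonzero_clipped (int8_t field, uint32_t mask)
{
  return ((uint8_t) field & (uint8_t) (mask & 0xFFu)) != 0;
}
```

For field = -1 and mask = 0xF00 the wide test is true, the clipped form is false, and the narrowed form with the sign bit folded in agrees with the wide test.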
2024-12-18ifcombine field merge: handle bitfield zero tests in range testsAlexandre Oliva1-0/+36
Some bitfield compares with zero are optimized to range tests, so instead of X & ~(Bit - 1) != 0 what reaches ifcombine is X > (Bit - 1), where Bit is a power of two and X is unsigned. This patch recognizes this optimized form of masked compares, and attempts to merge them like masked compares, which enables some more field merging that a folder version of fold_truth_andor used to handle without additional effort. I haven't seen X & ~(Bit - 1) == 0 become X <= (Bit - 1), or X < Bit for that matter, but it was easy enough to handle the former symmetrically to the above. The latter was also easy enough, and so was its symmetric, X >= Bit, that is handled like X & ~(Bit - 1) != 0. for gcc/ChangeLog * gimple-fold.cc (decode_field_reference): Accept incoming mask. (fold_truth_andor_for_ifcombine): Handle some compares with powers of two, minus 1 or 0, like masked compares with zero. for gcc/testsuite/ChangeLog * gcc.dg/field-merge-15.c: New.
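The equivalences this commit relies on are easy to confirm in C for unsigned X and a power-of-two Bit. The helpers below are a sketch with invented names; the masked forms are written with explicit parentheses, since in C `!=` and `==` bind tighter than `&`.

```c
#include <stdbool.h>

/* BIT is assumed to be a power of two in all helpers below.  */

/* Some bit at or above BIT is set: masked form and the two range forms.  */
static bool high_set_masked (unsigned x, unsigned bit) { return (x & ~(bit - 1u)) != 0; }
static bool high_set_gt (unsigned x, unsigned bit) { return x > bit - 1u; }
static bool high_set_ge (unsigned x, unsigned bit) { return x >= bit; }

/* No bit at or above BIT is set: masked form and the two range forms.  */
static bool high_clear_masked (unsigned x, unsigned bit) { return (x & ~(bit - 1u)) == 0; }
static bool high_clear_le (unsigned x, unsigned bit) { return x <= bit - 1u; }
static bool high_clear_lt (unsigned x, unsigned bit) { return x < bit; }
```

With Bit = 8, for instance, X = 7 gives false for all three "set" forms and X = 8 gives true for all three, because clearing the low three bits leaves a nonzero value exactly when X is at least 8.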