path: root/gcc
Age | Commit message | Author | Files | Lines
2025-07-11 | aarch64: Tweak handling of general SVE permutes [PR121027] | Richard Sandiford | 2 files | -5/+30
This PR is partly about a code quality regression that was triggered by g:caa7a99a052929d5970677c5b639e1fa5166e334. That patch taught the gimple optimisers to fold two VEC_PERM_EXPRs into one, conditional upon either (a) the original permutations not being "native" operations or (b) the combined permutation being a "native" operation. Whether something is a "native" operation is tested by calling can_vec_perm_const_p with allow_variable_p set to false. This requires the permutation to be supported directly by TARGET_VECTORIZE_VEC_PERM_CONST, rather than falling back to the general vec_perm optab. This exposed a problem with the way that we handled general 2-input permutations for SVE. Unlike Advanced SIMD, base SVE does not have an instruction to do general 2-input permutations. We do still implement the vec_perm optab for SVE, but only when the vector length is known at compile time. The general expansion is pretty expensive: an AND, a SUB, two TBLs, and an ORR. It certainly couldn't be considered a "native" operation. However, if a VEC_PERM_EXPR has a constant selector, the indices can be wider than the elements being permuted. This is not true for the vec_perm optab, where the indices and permuted elements must have the same precision. This leads to one case where we cannot leave a general 2-input permutation to be handled by the vec_perm optab: when permuting bytes on a target with 2048-bit vectors. In that case, the indices of the elements in the second vector are in the range [256, 511], which cannot be stored in a byte index. TARGET_VECTORIZE_VEC_PERM_CONST therefore has to handle 2-input SVE permutations for one specific case. Rather than check for that specific case, the code went ahead and used the vec_perm expansion whenever it worked. But that undermines the !allow_variable_p handling in can_vec_perm_const_p; it becomes impossible for target-independent code to distinguish "native" operations from the worst-case fallback. This patch instead limits TARGET_VECTORIZE_VEC_PERM_CONST to the cases that it has to handle. It fixes the PR for all vector lengths except 2048 bits. A better fix would be to introduce some sort of costing mechanism, which would allow us to reject the new VEC_PERM_EXPR even for 2048-bit targets. But that would be a significant amount of work and would not be backportable. gcc/ PR target/121027 * config/aarch64/aarch64.cc (aarch64_evpc_sve_tbl): Punt on 2-input operations that can be handled by vec_perm. gcc/testsuite/ PR target/121027 * gcc.target/aarch64/sve/acle/general/perm_1.c: New test.
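As a hedged illustration of the trigger (fixed-width vectors for readability; the PR itself concerns SVE): the gimple optimisers can now fold two constant-selector shuffles like these into one VEC_PERM_EXPR, at which point can_vec_perm_const_p decides whether the combined permutation is "native".

  typedef unsigned char v16qi __attribute__ ((vector_size (16)));

  /* Two chained two-input permutes with constant selectors; illustrative
     only, not the testcase from the PR.  */
  v16qi
  combine (v16qi a, v16qi b)
  {
    v16qi t = __builtin_shuffle (a, b, (v16qi) { 0, 17, 2, 19, 4, 21, 6, 23,
                                                 8, 25, 10, 27, 12, 29, 14, 31 });
    return __builtin_shuffle (t, a, (v16qi) { 1, 16, 3, 18, 5, 20, 7, 22,
                                              9, 24, 11, 26, 13, 28, 15, 30 });
  }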
2025-07-11 | aarch64: Use EOR3 for DImode values | Kyrylo Tkachov | 2 files | -1/+30
Similar to BCAX, we can use EOR3 for DImode, but we have to be careful not to force GP<->SIMD moves unnecessarily, so add a splitter for that case. So for input: uint64_t eor3_d_gp (uint64_t a, uint64_t b, uint64_t c) { return EOR3 (a, b, c); } uint64x1_t eor3_d (uint64x1_t a, uint64x1_t b, uint64x1_t c) { return EOR3 (a, b, c); } We generate the desired: eor3_d_gp: eor x1, x1, x2 eor x0, x1, x0 ret eor3_d: eor3 v0.16b, v0.16b, v1.16b, v2.16b ret Bootstrapped and tested on aarch64-none-linux-gnu. Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com> gcc/ * config/aarch64/aarch64-simd.md (*eor3qdi4): New define_insn_and_split. gcc/testsuite/ * gcc.target/aarch64/simd/eor3_d.c: Add tests for DImode operands.
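The EOR3 helper is not defined in the message; a definition consistent with the generated code would be:

  /* Assumed definition: three-way XOR, which the TARGET_SHA3 eor3
     instruction computes in one operation.  */
  #define EOR3(a, b, c) ((a) ^ (b) ^ (c))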
2025-07-11 | aarch64: Handle DImode BCAX operations | Kyrylo Tkachov | 2 files | -1/+34
To handle DImode BCAX operations we want to do them on the SIMD side only if the incoming arguments don't require a cross-bank move. This means we need to split the combination back into separate GP BIC+EOR instructions if the operands are expected to be in GP regs through reload. The split happens pre-reload if we already know that the destination will be a GP reg. Otherwise, if reload decides to use the "=r,r" alternative, we ensure operand 0 is early-clobber. This scheme is similar to how we handle the BSL operations elsewhere in aarch64-simd.md.

Thus, for the functions:

  uint64_t bcax_d_gp (uint64_t a, uint64_t b, uint64_t c) { return BCAX (a, b, c); }
  uint64x1_t bcax_d (uint64x1_t a, uint64x1_t b, uint64x1_t c) { return BCAX (a, b, c); }

we now generate the desired:

  bcax_d_gp:
          bic x1, x1, x2
          eor x0, x1, x0
          ret

  bcax_d:
          bcax v0.16b, v0.16b, v1.16b, v2.16b
          ret

When the inputs are in SIMD regs we use BCAX, and when they are in GP regs we don't force them to SIMD with extra moves.

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>

gcc/
        * config/aarch64/aarch64-simd.md (*bcaxqdi4): New
        define_insn_and_split.

gcc/testsuite/
        * gcc.target/aarch64/simd/bcax_d.c: Add tests for DImode arguments.
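Likewise, the BCAX helper is not shown; a definition matching the BIC+EOR fallback sequence would be:

  /* Assumed definition: "bit clear and XOR", i.e. a ^ (b & ~c), which is
     exactly what the bic+eor pair in bcax_d_gp computes.  */
  #define BCAX(a, b, c) ((a) ^ ((b) & ~(c)))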
2025-07-11 | aarch64: Use EOR3 for 64-bit vector modes | Kyrylo Tkachov | 2 files | -6/+21
Similar to the BCAX patch, we can also use EOR3 for 64-bit modes, just by adjusting the mode iterator used. Thus for input: uint32x2_t bcax_s (uint32x2_t a, uint32x2_t b, uint32x2_t c) { return EOR3 (a, b, c); } we now generate: bcax_s: eor3 v0.16b, v0.16b, v1.16b, v2.16b ret instead of: bcax_s: eor v1.8b, v1.8b, v2.8b eor v0.8b, v1.8b, v0.8b ret Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com> gcc/ * config/aarch64/aarch64-simd.md (eor3q<mode>4): Use VDQ_I mode iterator. gcc/testsuite/ * gcc.target/aarch64/simd/eor3_d.c: New test.
2025-07-11 | aarch64: Allow 64-bit vector modes in pattern for BCAX instruction | Kyrylo Tkachov | 2 files | -6/+21
The BCAX instruction from TARGET_SHA3 only operates on the full .16b form of the inputs but as it's a pure bitwise operation we can use it for the 64-bit modes as well as there we don't care about the upper 64 bits. This patch extends the relevant pattern in aarch64-simd.md to accept the 64-bit vector modes. Thus, for the input: uint32x2_t bcax_s (uint32x2_t a, uint32x2_t b, uint32x2_t c) { return BCAX (a, b, c); } we can now generate: bcax_s: bcax v0.16b, v0.16b, v1.16b, v2.16b ret instead of the current: bcax_s: bic v1.8b, v1.8b, v2.8b eor v0.8b, v1.8b, v0.8b ret This patch doesn't cover the DI/V1DI modes as that would require extending the bcaxqdi4 pattern with =r,r alternatives and adding splitting logic to handle the cases where the operands arrive in GP regs. It is doable, but can be a separate patch. This patch as is should be a straightforward improvement always. Bootstrapped and tested on aarch64-none-linux-gnu. Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com> gcc/ * config/aarch64/aarch64-simd.md (bcaxq<mode>4): Use VDQ_I mode iterator. gcc/testsuite/ * gcc.target/aarch64/simd/bcax_d.c: New test.
2025-07-11 | tree-optimization/121034 - fix reduction vectorization | Richard Biener | 2 files | -16/+32
The following fixes the loop following the reduction chain to properly visit all SLP nodes involved and makes the stmt info and the SLP node we track match. PR tree-optimization/121034 * tree-vect-loop.cc (vectorizable_reduction): Cleanup reduction chain following code. * gcc.dg/vect/pr121034.c: New testcase.
2025-07-11 | testsuite: Add testcase for already fixed PR [PR120954] | Jakub Jelinek | 1 file | -0/+21
This was a regression introduced by r16-1893 (and its backports) for C++, though for C it had been a false positive warning for years. Fixed by r16-2000 (and its backports).

2025-07-11  Jakub Jelinek  <jakub@redhat.com>

        PR c++/120954
        * c-c++-common/Warray-bounds-11.c: New test.
2025-07-11 | Rewrite assign_discriminators | Jan Hubicka | 2 files | -189/+67
To assign debug locations to corresponding statements, auto-fdo uses discriminators. The documentation says that if a given statement belongs to multiple basic blocks, the discriminator distinguishes them.

The current implementation, however, only works for statements that expand into a linear sequence of gimple statements, since it essentially tracks a current location and renews it each time a new BB is found. This is commonly not true for C++ code, as in:

  <bb 25> :
  [simulator/csimplemodule.cc:379:85] _40 = std::__cxx11::basic_string<char>::c_str ([simulator/csimplemodule.cc:379:85] &D.80680);
  [simulator/csimplemodule.cc:379:85 discrim 13] _41 = [simulator/csimplemodule.cc:379:85] &this->D.78503.D.78106.D.72008.D.68585.D.67935.D.67879.D.67782;
  [simulator/csimplemodule.cc:379:85 discrim 13] _42 = &this->D.78503.D.78106.D.72008.D.68585.D.67935.D.67879.D.67782;
  [simulator/csimplemodule.cc:377:45] _43 = this->D.78503.D.78106.D.72008.D.68585.D.67935.D.67879.D.67782._vptr.cObject;
  [simulator/csimplemodule.cc:377:45] _44 = _43 + 40;
  [simulator/csimplemodule.cc:377:45] _45 = [simulator/csimplemodule.cc:377:45] *_44;
  [simulator/csimplemodule.cc:379:85] D.89001 = OBJ_TYPE_REF(_45;(const struct cObject)_42->5B) (_41);

This is a fragment of code that is expanded from:

  371     if (this!=simulation.getContextModule())
  372         throw cRuntimeError("send()/sendDelayed() of module (%s)%s called in the context of "
  373                             "module (%s)%s: method called from the latter module "
  374                             "lacks Enter_Method() or Enter_Method_Silent()? "
  375                             "Also, if message to be sent is passed from that module, "
  376                             "you'll need to call take(msg) after Enter_Method() as well",
  377                             getClassName(), getFullPath().c_str(),
  378                             simulation.getContextModule()->getClassName(),
  379                             simulation.getContextModule()->getFullPath().c_str());

Notice that 379:85 is interleaved with 377:45, and the pass does not assign a new discriminator. With the patch we get:

  <bb 25> :
  [simulator/csimplemodule.cc:379:85 discrim 7] _40 = std::__cxx11::basic_string<char>::c_str ([simulator/csimplemodule.cc:379:85] &D.80680);
  [simulator/csimplemodule.cc:379:85 discrim 8] _41 = [simulator/csimplemodule.cc:379:85] &this->D.78503.D.78106.D.72008.D.68585.D.67935.D.67879.D.67782;
  [simulator/csimplemodule.cc:379:85 discrim 8] _42 = &this->D.78503.D.78106.D.72008.D.68585.D.67935.D.67879.D.67782;
  [simulator/csimplemodule.cc:377:45 discrim 1] _43 = this->D.78503.D.78106.D.72008.D.68585.D.67935.D.67879.D.67782._vptr.cObject;
  [simulator/csimplemodule.cc:377:45 discrim 1] _44 = _43 + 40;
  [simulator/csimplemodule.cc:377:45 discrim 1] _45 = [simulator/csimplemodule.cc:377:45] *_44;
  [simulator/csimplemodule.cc:379:85 discrim 8] D.89001 = OBJ_TYPE_REF(_45;(const struct cObject)_42->5B) (_41);

There are earlier statements with line number 379, which is why the call gets discriminator 7. After it, the discriminator is increased. There are two reasons for this: 1) AFDO requires every callsite to have a unique lineno:discriminator pair; 2) the call may not terminate, and thus the profile of the first statement may be higher than that of the rest.

The old pass also contained logic to skip debug statements. This is not a good idea since we output them to the debug output, and if the AFDO tool picks these locations up they will be misplaced in basic blocks. Debug statements are naturally quite useful for tracking the AFDO profiles back; in the meantime, LLVM folks implemented something similar called pseudoprobe.
I think it makes sense to enable debug statements with -fauto-profile even if debug info is off, and to make use of them as done in this patch. Sadly, the AFDO tool is quite broken and built around the assumption that every address has at most one debug location assigned to it (i.e. debug info before debug statements were introduced). I have a WIP patch fixing this.

Note that LLVM also has -fdebug-info-for-auto-profile (on by default, it seems) that controls discriminator production and some other little bits. I wonder if we want to have something similar. Should it be -gdebug-info-for-auto-profile instead?

gcc/ChangeLog:

        * opts.cc (finish_options): Enable debug_nonbind_markers_p for
        auto-profile.
        * tree-cfg.cc (struct locus_discrim_map): Remove.
        (struct locus_discrim_hasher): Remove.
        (locus_discrim_hasher::hash): Remove.
        (locus_discrim_hasher::equal): Remove.
        (first_non_label_nondebug_stmt): Remove.
        (build_gimple_cfg): Do not allocate discriminator tables.
        (next_discriminator_for_locus): Remove.
        (same_line_p): Remove.
        (struct discrim_entry): New structure.
        (assign_discriminator): Rewrite.
        (assign_discriminators): Rewrite.
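As a simpler illustration of the problem the rewrite addresses, consider a line whose expansion spans several basic blocks (a hedged sketch, not from the commit):

  /* Illustrative only: all three statements share one source line, but
     they land in different basic blocks, so AFDO needs distinct
     lineno:discriminator pairs to keep their profile counts apart.  */
  extern int cond (void);
  extern void f (void), g (void);

  void
  one_line (void)
  {
    if (cond ()) f (); else g ();   /* condition, then-BB and else-BB */
  }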
2025-07-11 | Fix ICE in speculative devirtualization | Jan Hubicka | 3 files | -0/+44
This patch fixes an ICE building lto1 with autoprofiledbootstrap and in PR114790.

What happens is that auto-fdo speculatively devirtualizes to a wrong target. This is due to a bug, which I need to fix as well, where it mixes up dwarf names and linkage names of inline functions. Later we clone at WPA time. At ltrans time the clone is materialized and the call is turned into a direct call (this optimization is missed by ipa-cp propagation). At this point we should resolve the speculation, but we don't. As a result we get an error from the verifier after inlining, complaining that there is a speculative call whose corresponding direct call lacks the speculative flag.

This seems to be a long-standing problem in cgraph_update_edges_for_call_stmt_node, but I suppose it does not trigger since we usually speculate correctly or notice the direct call at WPA time already.

Bootstrapped/regtested x86_64-linux.

gcc/ChangeLog:

        PR ipa/114790
        * cgraph.cc (cgraph_update_edges_for_call_stmt_node): Resolve
        devirtualization if call statement was optimized out or turned
        to a direct call.

gcc/testsuite/ChangeLog:

        * g++.dg/lto/pr114790_0.C: New test.
        * g++.dg/lto/pr114790_1.C: New test.
2025-07-11 | ipa: Disallow signature changes in fun->has_musttail functions [PR121023] | Jakub Jelinek | 2 files | -0/+38
As the following testcase shows e.g. on ia32, letting IPA opts change the signature of functions which have [[{gnu,clang}::musttail]] calls can turn programs that would be compiled normally into something that is rejected, because the caller has fewer argument stack slots than the function being tail called.

The following patch prevents signature changes for such functions. It is perhaps too big a hammer in some cases, but it might be hard to figure out at IPA time which signature changes are still acceptable and which are not.

2025-07-11  Jakub Jelinek  <jakub@redhat.com>
            Martin Jambor  <mjambor@suse.cz>

        PR ipa/121023
        * ipa-fnsummary.cc (compute_fn_summary): Disallow signature
        changes on cfun->has_musttail functions.

        * c-c++-common/musttail32.c: New test.
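A hedged sketch of the failure mode (not the actual musttail32.c testcase): if IPA dropped the callee's unused parameters, the caller would keep more argument stack slots than the tail-called function has, which is fatal for a mandatory tail call on stack-argument targets like ia32.

  /* Hypothetical example; names and structure are illustrative.  */
  __attribute__ ((noinline)) static int
  callee (int a, int b, int c, int d)
  {
    return a;   /* b, c and d look removable to IPA signature changes.  */
  }

  int
  caller (int a, int b, int c, int d)
  {
    /* Mandatory tail call: caller and callee must agree on stack slots.  */
    [[gnu::musttail]] return callee (a, b, c, d);
  }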
2025-07-11 | i386: Add a new peephole2 for PR91384 under APX_F | Hu, Lin1 | 2 files | -0/+31
gcc/ChangeLog:

        PR target/91384
        * config/i386/i386.md: Add a new peephole2 to optimize *negsi_1
        followed by *cmpsi_ccno_1 with APX_F.

gcc/testsuite/ChangeLog:

        PR target/91384
        * gcc.target/i386/pr91384-1.c: New test.
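A hedged guess at the kind of code PR91384 is about (the actual testcase may differ): neg already sets the flags, so a following compare against zero is redundant and the peephole2 can remove it.

  /* Illustrative only: the compiler-generated neg sets ZF, making the
     explicit compare against zero removable.  */
  extern void bar (void);

  int
  foo (int a)
  {
    int b = -a;
    if (b != 0)
      bar ();
    return b;
  }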
2025-07-11 | properly compute fp/mode for scalar ops for vectorizer costing | Richard Biener | 1 file | -0/+8
The x86 add_stmt_cost hook relies on the passed vectype to determine the mode and whether it is FP for a scalar operation. This is unreliable now for stmts involving patterns, and in the future when there is no vector type passed for scalar operations. To be least disruptive I've kept using the vector type if it is passed.

        * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Use
        the LHS of a scalar stmt to determine mode and whether it is FP.
2025-07-11 | cobol: Fix build on 32-bit Darwin [PR120621] | Rainer Orth | 4 files | -6/+6
Bootstrapping trunk with 32-bit-default on Mac OS X 10.11 (i386-apple-darwin15) fails: /vol/gcc/src/hg/master/local/gcc/cobol/lexio.cc: In static member function 'static void cdftext::process_file(filespan_t, int, bool)': /vol/gcc/src/hg/master/local/gcc/cobol/lexio.cc:1859:14: error: format '%u' expects argument of type 'unsigned int', but argument 4 has type 'size_t' {aka 'long unsigned int'} [-Werror=format=] 1859 | dbgmsg("%s:%d: line " HOST_SIZE_T_PRINT_UNSIGNED ", opening %s on fd %d", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1860 | __func__, __LINE__,mfile.lineno(), | ~~~~~~~~~~~~~~ | | | size_t {aka long unsigned int} In file included from /vol/gcc/src/hg/master/local/gcc/system.h:1244, from /vol/gcc/src/hg/master/local/gcc/cobol/cobol-system.h:61, from /vol/gcc/src/hg/master/local/gcc/cobol/lexio.cc:33: /vol/gcc/src/hg/master/local/gcc/hwint.h:135:51: note: format string is defined here 135 | #define HOST_SIZE_T_PRINT_UNSIGNED "%" GCC_PRISZ "u" | ~~~~~~~~~~~~~~^ | | | unsigned int | %" GCC_PRISZ "lu On Darwin, size_t is always long unsigned int. However, unsigned int and long unsigned int are both 32-bit, so hwint.h selects %u for the format. As documented there, the arg needs to be cast to fmt_size_t to avoid the error. This isn't an issue on other 32-bit platforms like Solaris/i386 or Linux/i686 since they use unsigned int for size_t. /vol/gcc/src/hg/master/local/gcc/cobol/parse.y: In function 'int yyparse()': /vol/gcc/src/hg/master/local/gcc/cobol/parse.y:10215:36: error: format '%zu' expects argument of type 'size_t', but argument 4 has type 'int' [-Werror=format=] 10215 | error_msg(loc, "FUNCTION %qs has " | ^~~~~~~~~~~~~~~~~~~ 10216 | "inconsistent parameter type %zu (%qs)", | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 10217 | keyword_str($1), p - args.data(), name_of(p->field) ); | ~~~~~~~~~~~~~~~ | | | int The arg (p - args.data())) is ptrdiff_t (int on 32-bit Darwin), while the %zu format expect size_t (long unsigned int). The patch therefore casts the ptrdiff_t arg to long and prints it as such. There are two more instances of the same problem: /vol/gcc/src/hg/master/local/gcc/cobol/util.cc: In member function 'void cbl_field_t::report_invalid_initial_value(const YYLTYPE&) const': /vol/gcc/src/hg/master/local/gcc/cobol/util.cc:905:80: error: format '%zu' expects argument of type 'size_t', but argument 6 has type 'int' [-Werror=format=] 905 | error_msg(loc, "%s cannot represent VALUE %qs exactly (max %c%zu)", | ~~^ | | | long unsigned int | %u 906 | name, data.initial, '.', pend - p); | ~~~~~~~~ | | | int In file included from /vol/gcc/src/hg/master/local/gcc/cobol/scan.l:48: /vol/gcc/src/hg/master/local/gcc/cobol/scan_ante.h: In function 'int numstr_of(const char*, radix_t)': /vol/gcc/src/hg/master/local/gcc/cobol/scan_ante.h:152:25: error: format '%zu' expects argument of type 'size_t', but argument 4 has type 'int' [-Werror=format=] 152 | error_msg(yylloc, "significand of %s has more than 36 digits (%zu)", input, nx); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~ | | | int Fixed in the same way. Bootstrapped without regressions on i386-apple-darwin15, x86_64-apple-darwin, i386-pc-solaris2.11, amd64-pc-solaris2.11, i686-pc-linux-gnu, and x86_64-pc-linux-gnu. 2025-06-23 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/cobol: PR cobol/120621 * lexio.cc (parse_replace_pairs): Cast mfile.lineno() to fmt_size_t. * parse.y (intrinsic): Print ptrdiff_t using %ld, cast arg to long. * scan_ante.h (numstr_of): Print nx using %ld, cast arg to long. 
* util.cc (cbl_field_t::report_invalid_initial_value): Print ptrdiff_t using %ld, cast arg to long.
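The fix follows the idiom documented in gcc/hwint.h; a minimal sketch of it (function name and context hypothetical, hwint.h normally reaches GCC sources via system.h):

  void
  report_line (size_t lineno)
  {
    /* Cast to fmt_size_t so the argument matches whichever of %u/%lu/%llu
       GCC_PRISZ selected for this host's size_t.  */
    printf ("line " HOST_SIZE_T_PRINT_UNSIGNED "\n", (fmt_size_t) lineno);
  }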
2025-07-11 | Fortran: Implement F2018 IMPORT statements [PR106135] | Paul Thomas | 6 files | -16/+673
2025-09-09  Paul Thomas  <pault@gcc.gnu.org>

gcc/fortran
        PR fortran/106135
        * decl.cc (build_sym): Emit an error if a symbol associated by an
        IMPORT, ONLY or IMPORT, ALL statement is being redeclared.
        (gfc_match_import): Parse and check the F2018 versions of the
        IMPORT statement. For scopes other than an interface body, if the
        symbol cannot be found in the host scope, generate it and set it
        up such that gfc_fixup_sibling_symbols can transfer its 'imported'
        attribute if it turns out to be a not yet parsed procedure. Test
        for violations of C897-8100.
        * gfortran.h : Add 'import_only' to the gfc_symtree structure.
        Add the enum, 'importstate', which is used for the values of the
        new field 'import_state' in gfc_namespace.
        * parse.cc (gfc_fixup_sibling_symbols): Transfer the attribute
        'imported' to the new symbol.
        * resolve.cc (check_sym_import_status, check_import_status): New
        functions to test symbols and expressions for violations of
        F2018:C8102.
        (resolve_call): Test the 'resolved_sym' against C8102 by a call
        to 'check_sym_import_status'.
        (gfc_resolve_expr): If the expression is OK and an IMPORT
        statement has been registered in the current scope, test C8102 by
        calling 'check_import_status'.
        (resolve_select_type): Test the declared derived type in TYPE IS
        and CLASS IS statements.

gcc/testsuite/
        PR fortran/106135
        * gfortran.dg/import3.f90: Use -std=f2008 and comment on change
        in error message texts with f2018.
        * gfortran.dg/import12.f90: New test.
2025-07-11 | Daily bump. | GCC Administrator | 8 files | -1/+890
2025-07-11 | c++: Save 8 further bytes from lang_type allocations | Jakub Jelinek | 5 files | -9/+26
The following patch implements the /* FIXME reuse another field? */ comment on the lambda_expr member. I think (and asserts in the patch seem to confirm) that CLASSTYPE_KEY_METHOD is only ever non-NULL for TYPE_POLYMORPHIC_P types, while on the other side CLASSTYPE_LAMBDA_EXPR is only used on closure types, which are never polymorphic. So, the patch just uses one member for both, with the accessor macros changed to no longer be lvalues, and SET_* variants of the macros added as setters.

2025-07-11  Jakub Jelinek  <jakub@redhat.com>

        * cp-tree.h (struct lang_type): Add comment before key_method.
        Remove lambda_expr.
        (CLASSTYPE_KEY_METHOD): Give NULL_TREE if not TYPE_POLYMORPHIC_P.
        (SET_CLASSTYPE_KEY_METHOD): Define.
        (CLASSTYPE_LAMBDA_EXPR): Give NULL_TREE if TYPE_POLYMORPHIC_P.
        Use key_method member instead of lambda_expr.
        (SET_CLASSTYPE_LAMBDA_EXPR): Define.
        * class.cc (determine_key_method): Use SET_CLASSTYPE_KEY_METHOD
        macro.
        * decl.cc (xref_tag): Use SET_CLASSTYPE_LAMBDA_EXPR macro.
        * lambda.cc (begin_lambda_type): Likewise.
        * module.cc (trees_in::read_class_def): Use
        SET_CLASSTYPE_LAMBDA_EXPR and SET_CLASSTYPE_KEY_METHOD macros,
        assert lambda is NULL if TYPE_POLYMORPHIC_P and otherwise assert
        key_method is NULL.
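A hedged sketch of the field-sharing pattern being described (the real macros in cp-tree.h will differ in detail; bodies are illustrative):

  /* One pointer serves both roles: the key method for polymorphic
     classes and the LAMBDA_EXPR for (never-polymorphic) closure types.  */
  #define CLASSTYPE_KEY_METHOD(NODE) \
    (TYPE_POLYMORPHIC_P (NODE) \
     ? LANG_TYPE_CLASS_CHECK (NODE)->key_method : NULL_TREE)
  #define SET_CLASSTYPE_KEY_METHOD(NODE, VALUE) \
    (LANG_TYPE_CLASS_CHECK (NODE)->key_method = (VALUE))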
2025-07-10 | c++: Fix up final handling in C++98 [PR120628] | Jakub Jelinek | 4 files | -7/+77
The following patch is on top of the https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686210.html patch which stopped treating override as conditional keyword in class properties. This PR mentions another problem; we emit a bogus warning on code like struct C {}; struct C final = {}; in C++98. In this case we parse final as conditional keyword in C++ (including pedwarn) but the caller then immediately aborts the tentative parse because it isn't followed by { nor (in some cases) : . I think we certainly shouldn't pedwarn on it, but I think we even shouldn't warn for it say for -Wc++11-compat, because we don't actually treat the identifier as conditional keyword even in C++11 and later. The patch only does this if final is the only class property conditional keyword, if one uses struct S __final final __final = {}; one gets the warning and duplicate diagnostics and later parsing errors. 2025-07-10 Jakub Jelinek <jakub@redhat.com> PR c++/120628 * parser.cc (cp_parser_elaborated_type_specifier): Use cp_parser_nth_token_starts_class_definition_p with extra argument 1 instead of cp_parser_next_token_starts_class_definition_p. (cp_parser_class_property_specifier_seq_opt): For final conditional keyword in C++98 check if the token after it isn't cp_parser_nth_token_starts_class_definition_p nor CPP_NAME and in that case break without consuming it nor warning. (cp_parser_class_head): Use cp_parser_nth_token_starts_class_definition_p with extra argument 1 instead of cp_parser_next_token_starts_class_definition_p. (cp_parser_next_token_starts_class_definition_p): Renamed to ... (cp_parser_nth_token_starts_class_definition_p): ... this. Add N argument. Use cp_lexer_peek_nth_token instead of cp_lexer_peek_token. * g++.dg/cpp0x/final1.C: New test. * g++.dg/cpp0x/final2.C: New test. * g++.dg/cpp0x/override6.C: New test.
2025-07-10 | c++: Don't incorrectly reject override after class head name [PR120569] | Jakub Jelinek | 4 files | -17/+85
While the https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p2786r13.html#c03-compatibility-changes-for-annex-c-diff.cpp03.dcl.dcl hunk dropped because struct C {}; struct C final {}; is actually not valid C++98 (which didn't have list initialization), we actually also reject struct D {}; struct D override {}; and that IMHO is valid all the way from C++11 onwards. Especially in the light of P2786R13 adding new contextual keywords, I think it is better to use a separate routine for parsing the class-virt-specifier-seq (in C++11, there was export next to final), class-virt-specifier (in C++14 to C++23) and class-property-specifier-seq (in C++26) instead of using the same function for virt-specifier-seq and class-property-specifier-seq. 2025-07-10 Jakub Jelinek <jakub@redhat.com> PR c++/120569 * parser.cc (cp_parser_class_property_specifier_seq_opt): New function. (cp_parser_class_head): Use it instead of cp_parser_property_specifier_seq_opt. Don't diagnose VIRT_SPEC_OVERRIDE here. Formatting fix. * g++.dg/cpp0x/override2.C: Expect different diagnostics with override or duplicate final. * g++.dg/cpp0x/override5.C: New test. * g++.dg/cpp0x/duplicate1.C: Expect different diagnostics with duplicate final.
2025-07-10 | c++, libstdc++: Implement C++26 P3068R5 - constexpr exceptions [PR117785] | Jakub Jelinek | 43 files | -402/+3635
The following patch implements the C++26 P3068R5 - constexpr exceptions paper. As the IL cxx_eval_constant* functions process already contains the low level calls like __cxa_{allocate,free}_exception, __cxa_{,re}throw etc., the patch just makes 10 extern "C" __cxa_* functions magic builtins which during constant evaluation pretend to be constexpr even when not declared so and handle them directly, plus does the same for 3 std namespace functions - std::uncaught_exceptions, std::current_exception and std::rethrow_exception and adds one new FE builtin - __builtin_eh_ptr_adjust_ref which the library can use instead of the _M_addref and _M_release out of line methods (this one instead of recognizing _M_* as magic too because those are clearly specific to libstdc++ and e.g. libc++ could use something else). The patch uses magic VAR_DECLs with heap_{uninit_,,deleted_}identifier DECL_NAME like for operator new/delete for objects allocated with __cxa_allocate_exception, just sets their DECL_LANG_SPECIFIC so that we can track their reference count as well (with std::exception_ptr the same exception object can be referenced multiple times and we want to destruct and free only when it reaches zero refcount). For uncaught exceptions being propagated, the patch uses new kind of *jump_target, which is that magic VAR_DECL described above. The largest change in the patch is making jump_target argument non-optional in cxa_eval_constant_exception and all functions it calls that need it. This is because exceptions can be thrown from pretty much everywhere, e.g. binary expression can throw in either operand. And the patch also adds if (*jump_target) return NULL_TREE; or similar in many spots, so that we don't crash because cxx_eval_constant_expression returned NULL_TREE somewhere before actually trying to use it and so that we don't uselessly dive into other operands etc. Note, with statement expressions actually this was something we just didn't handle correctly before, one can validly have: a = ({ if (x) return 42; 12; }) + b; or in the other operand, or break/continue instead of return if it is somewhere in a loop/switch; and it isn't ok to branch from one operand to another one through some kind of goto. On the potential_constant_expression_1 side, important change was to set *jump_target conservatively on calls that could throw for C++26 (the patch uses magic void_node for potential_constant_expression* instead of VAR_DECL, so that we don't have to create new VAR_DECLs there uselessly). Without that change, several methods in libstdc++ wouldn't work correctly. I'm not sure what exactly potential_constant_expression_1 maps to in the C++26 standard wording now and whether doing that is ok, because basically after the first call to non-noexcept function it stops checking stuff. And, in some spots where I know potential_constant_expression_1 didn't check some subexpressions (e.g. the EH only cleanups or TRY_BLOCK handlers) I've added *potential_constant_expression* calls during cxx_eval_constant*, not sure if I need to do that because potential_constant_expression_1 is very conservative and just doesn't recurse on subexpressions in many cases. 2025-07-10 Jakub Jelinek <jakub@redhat.com> PR c++/117785 gcc/c-family/ * c-cppbuiltin.cc (c_cpp_builtins): Predefine __cpp_constexpr_exceptions=202411L for C++26. gcc/cp/ * constexpr.cc: Implement C++26 P3068R5 - constexpr exceptions. (class constexpr_global_ctx): Add caught_exceptions and uncaught_exceptions members. 
(constexpr_global_ctx::constexpr_global_ctx): Initialize uncaught_exceptions. (returns, breaks, continues, switches): Move earlier. (throws): New function. (exception_what_str, diagnose_std_terminate, diagnose_uncaught_exception): New functions. (enum cxa_builtin): New type. (cxx_cxa_builtin_fn_p, cxx_eval_cxa_builtin_fn): New functions. (cxx_eval_builtin_function_call): Add jump_target argument. Call cxx_eval_cxa_builtin_fn for __builtin_eh_ptr_adjust_ref. Adjust cxx_eval_constant_expression calls, if it results in jmp_target, set *jump_target to it and return. (cxx_bind_parameters_in_call): Add jump_target argument. Pass it through to cxx_eval_constant_expression. If it sets *jump_target, break. (fold_operand): Adjust cxx_eval_constant_expression caller. (cxx_eval_assert): Likewise. If it set jmp_target, return true. (cxx_eval_internal_function): Add jump_target argument. Pass it through to cxx_eval_constant_expression. Return early if *jump_target after recursing on args. (cxx_eval_dynamic_cast_fn): Likewise. Don't set reference_p for C++26 with -fexceptions. (cxx_eval_thunk_call): Add jump_target argument. Pass it through to cxx_eval_constant_expression. (cxx_set_object_constness): Likewise. Don't set TREE_READONLY if throws (jump_target). (cxx_eval_call_expression): Add jump_target argument. Pass it through to cxx_eval_internal_function, cxx_eval_builtin_function_call, cxx_eval_thunk_call, cxx_eval_dynamic_cast_fn and cxx_set_object_constness. Pass it through also cxx_eval_constant_expression on arguments, cxx_bind_parameters_in_call and cxx_fold_indirect_ref and for those cases return early if *jump_target. Call cxx_eval_cxa_builtin_fn for cxx_cxa_builtin_fn_p functions. For cxx_eval_constant_expression on body, pass address of cleared jmp_target automatic variable, if it throws propagate to *jump_target and make it non-cacheable. For C++26 don't diagnose calls to non-constexpr functions before cxx_bind_parameters_in_call could report some argument throwing an exception. (cxx_eval_unary_expression): Add jump_target argument. Pass it through to cxx_eval_constant_expression and return early if *jump_target after the call. (cxx_fold_pointer_plus_expression): Likewise. (cxx_eval_binary_expression): Likewise and similarly for cxx_fold_pointer_plus_expression call. (cxx_eval_conditional_expression): Pass jump_target to cxx_eval_constant_expression on first operand and return early if *jump_target after the call. (cxx_eval_vector_conditional_expression): Add jump_target argument. Pass it through to cxx_eval_constant_expression for all 3 arguments and return early if *jump_target after any of those calls. (get_array_or_vector_nelts): Add jump_target argument. Pass it through to cxx_eval_constant_expression. (eval_and_check_array_index): Add jump_target argument. Pass it through to cxx_eval_constant_expression calls and return early after each of them if *jump_target. (cxx_eval_array_reference): Likewise. (cxx_eval_component_reference): Likewise. (cxx_eval_bit_field_ref): Likewise. (cxx_eval_bit_cast): Likewise. Assert CHECKING_P call doesn't throw or return. (cxx_eval_logical_expression): Add jump_target argument. Pass it through to cxx_eval_constant_expression calls and return early after each of them if *jump_target. (cxx_eval_bare_aggregate): Likewise. (cxx_eval_vec_init_1): Add jump_target argument. Pass it through to cxx_eval_bare_aggregate and recursive call. Pass it through to get_array_or_vector_nelts and cxx_eval_constant_expression and return early after it if *jump_target. 
(cxx_eval_vec_init): Add jump_target argument. Pass it through to cxx_eval_constant_expression and cxx_eval_vec_init_1. (cxx_union_active_member): Add jump_target argument. Pass it through to cxx_eval_constant_expression and return early after it if *jump_target. (cxx_fold_indirect_ref_1): Add jump_target argument. Pass it through to cxx_union_active_member and recursive calls. (cxx_eval_indirect_ref): Add jump_target argument. Pass it through to cxx_fold_indirect_ref_1 calls and to recursive call, in which case return early after it if *jump_target. (cxx_fold_indirect_ref): Add jump_target argument. Pass it through to cxx_fold_indirect_ref and cxx_eval_constant_expression calls and return early after those if *jump_target. (cxx_eval_trinary_expression): Add jump_target argument. Pass it through to cxx_eval_constant_expression calls and return early after those if *jump_target. (cxx_eval_store_expression): Add jump_target argument. Pass it through to cxx_eval_constant_expression and eval_and_check_array_index calls and return early after those if *jump_target. (cxx_eval_increment_expression): Add jump_target argument. Pass it through to cxx_eval_constant_expression calls and return early after those if *jump_target. (label_matches): Handle VAR_DECL case. (cxx_eval_statement_list): Remove local_target variable and !jump_target handling. Handle throws (jump_target) like returns or breaks. (cxx_eval_loop_expr): Remove local_target variable and !jump_target handling. Pass it through to cxx_eval_constant_expression. Handle throws (jump_target) like returns. (cxx_eval_switch_expr): Pass jump_target through to cxx_eval_constant_expression on cond, return early after it if *jump_target. (build_new_constexpr_heap_type): Add jump_target argument. Pass it through to cxx_eval_constant_expression calls, return early after those if *jump_target. (merge_jump_target): New function. (cxx_eval_constant_expression): Make jump_target argument no longer defaulted, don't test jump_target for NULL. Pass jump_target through to recursive calls, cxx_eval_call_expression, cxx_eval_store_expression, cxx_eval_indirect_ref, cxx_eval_unary_expression, cxx_eval_binary_expression, cxx_eval_logical_expression, cxx_eval_array_reference, cxx_eval_component_reference, cxx_eval_bit_field_ref, cxx_eval_vector_conditional_expression, cxx_eval_bare_aggregate, cxx_eval_vec_init, cxx_eval_trinary_expression, cxx_fold_indirect_ref, build_new_constexpr_heap_type, cxx_eval_increment_expression, cxx_eval_bit_cast and return earlyu after some of those if *jump_target as needed. (cxx_eval_constant_expression) <case TARGET_EXPR>: For C++26 push also CLEANUP_EH_ONLY cleanups, with NULL_TREE marker after them. (cxx_eval_constant_expression) <case RETURN_EXPR>: Don't override *jump_target if throws (jump_target). (cxx_eval_constant_expression) <case TRY_CATCH_EXPR, case TRY_BLOCK, case MUST_NOT_THROW_EXPR, case TRY_FINALLY_EXPR, case CLEANUP_STMT>: Handle C++26 constant expressions. (cxx_eval_constant_expression) <case CLEANUP_POINT_EXPR>: For C++26 with throws (jump_target) evaluate the CLEANUP_EH_ONLY cleanups as well, and if not throws (jump_target) skip those. Set *jump_target if some of the cleanups threw. (cxx_eval_constant_expression) <case THROW_EXPR>: Recurse on operand for C++26. (cxx_eval_outermost_constant_expr): Diagnose uncaught exceptions both from main expression and cleanups, diagnose also break/continue/returns from the main expression. Handle CLEANUP_EH_ONLY cleanup markers. 
Don't diagnose mutable poison stuff if non_constant_p. Use different diagnostics for non-deleted heap allocations if they were allocated by __cxa_allocate_exception. (callee_might_throw): New function. (struct check_for_return_continue_data): Add could_throw field. (check_for_return_continue): Handle AGGR_INIT_EXPR and CALL_EXPR and set d->could_throw if they could throw. (potential_constant_expression_1): For CALL_EXPR allow cxx_dynamic_cast_fn_p calls. For C++26 set *jump_target to void_node for calls that could throw. For C++26 if call to non-constexpr call is seen, try to evaluate arguments first and if they could throw, don't diagnose call to non-constexpr function nor return false. Adjust check_for_return_continue_data initializers and set *jump_target to void_node if data.could_throw_p. For C++26 recurse on THROW_EXPR argument. Add comment explaining TRY_BLOCK handling with C++26 exceptions. Handle throws like returns in some cases. * cp-tree.h (MUST_NOT_THROW_NOEXCEPT_P, MUST_NOT_THROW_THROW_P, MUST_NOT_THROW_CATCH_P, DECL_EXCEPTION_REFCOUNT): Define. (DECL_LOCAL_DECL_P): Fix comment typo, VARIABLE_DECL -> VAR_DECL. (enum cp_built_in_function): Add CP_BUILT_IN_EH_PTR_ADJUST_REF, (handler_match_for_exception_type): Declare. * call.cc (handler_match_for_exception_type): New function. * except.cc (initialize_handler_parm): Set MUST_NOT_THROW_CATCH_P on newly created MUST_NOT_THROW_EXPR. (begin_eh_spec_block): Set MUST_NOT_THROW_NOEXCEPT_P. (wrap_cleanups_r): Set MUST_NOT_THROW_THROW_P. (build_throw): Add another TARGET_EXPR whose scope spans until after the __cxa_throw call and copy pointer value from ptr to it and use it in __cxa_throw argument. * tree.cc (builtin_valid_in_constant_expr_p): Handle CP_BUILT_IN_EH_PTR_ADJUST_REF. * decl.cc (cxx_init_decl_processing): Initialize __builtin_eh_ptr_adjust_ref FE builtin. * pt.cc (tsubst_stmt) <case MUST_NOT_THROW_EXPR>: Copy the MUST_NOT_THROW_NOEXCEPT_P, MUST_NOT_THROW_THROW_P and MUST_NOT_THROW_CATCH_P flags. * cp-gimplify.cc (cp_gimplify_expr) <case CALL_EXPR>: Error on non-folded CP_BUILT_IN_EH_PTR_ADJUST_REF calls. gcc/testsuite/ * g++.dg/cpp0x/constexpr-ellipsis2.C: Expect different diagnostics for C++26. * g++.dg/cpp0x/constexpr-throw.C: Likewise. * g++.dg/cpp1y/constexpr-84192.C: Expect different diagnostics. * g++.dg/cpp1y/constexpr-throw.C: Expect different diagnostics for C++26. * g++.dg/cpp1z/constexpr-asm-5.C: Likewise. * g++.dg/cpp26/constexpr-eh1.C: New test. * g++.dg/cpp26/constexpr-eh2.C: New test. * g++.dg/cpp26/constexpr-eh3.C: New test. * g++.dg/cpp26/constexpr-eh4.C: New test. * g++.dg/cpp26/constexpr-eh5.C: New test. * g++.dg/cpp26/constexpr-eh6.C: New test. * g++.dg/cpp26/constexpr-eh7.C: New test. * g++.dg/cpp26/constexpr-eh8.C: New test. * g++.dg/cpp26/constexpr-eh9.C: New test. * g++.dg/cpp26/constexpr-eh10.C: New test. * g++.dg/cpp26/constexpr-eh11.C: New test. * g++.dg/cpp26/constexpr-eh12.C: New test. * g++.dg/cpp26/constexpr-eh13.C: New test. * g++.dg/cpp26/constexpr-eh14.C: New test. * g++.dg/cpp26/constexpr-eh15.C: New test. * g++.dg/cpp26/feat-cxx26.C: Change formatting in __cpp_pack_indexing and __cpp_pp_embed test. Add __cpp_constexpr_exceptions test. * g++.dg/cpp26/static_assert1.C: Expect different diagnostics for C++26. * g++.dg/cpp2a/consteval34.C: Likewise. * g++.dg/cpp2a/consteval-memfn1.C: Likewise. * g++.dg/cpp2a/constexpr-dynamic4.C: For C++26 add std::exception and std::bad_cast definitions and expect different diagnostics. * g++.dg/cpp2a/constexpr-dynamic6.C: Likewise. 
* g++.dg/cpp2a/constexpr-dynamic7.C: Likewise. * g++.dg/cpp2a/constexpr-dynamic8.C: Likewise. * g++.dg/cpp2a/constexpr-dynamic9.C: Likewise. * g++.dg/cpp2a/constexpr-dynamic11.C: Likewise. * g++.dg/cpp2a/constexpr-dynamic14.C: Likewise. * g++.dg/cpp2a/constexpr-dynamic18.C: Likewise. * g++.dg/cpp2a/constexpr-new27.C: New test. * g++.dg/cpp2a/constexpr-typeid5.C: New test. libstdc++-v3/ * include/bits/version.def (constexpr_exceptions): New. * include/bits/version.h: Regenerate. * libsupc++/exception (std::bad_exception::bad_exception): Add _GLIBCXX26_CONSTEXPR. (std::bad_exception::~bad_exception, std::bad_exception::what): For C++26 add constexpr and define inline. * libsupc++/exception.h (std::exception::exception, std::exception::operator=): Add _GLIBCXX26_CONSTEXPR. (std::exception::~exception, std::exception::what): For C++26 add constexpr and define inline. * libsupc++/exception_ptr.h (std::make_exception_ptr): Add _GLIBCXX26_CONSTEXPR. For if consteval use just throw with current_exception() in catch. (std::exception_ptr::exception_ptr(void*)): For C++26 add constexpr and define inline. (std::exception_ptr::exception_ptr()): Add _GLIBCXX26_CONSTEXPR. (std::exception_ptr::exception_ptr(const exception_ptr&)): Likewise. Use __builtin_eh_ptr_adjust_ref if consteval and compiler has it instead of _M_addref. (std::exception_ptr::exception_ptr(nullptr_t)): Add _GLIBCXX26_CONSTEXPR. (std::exception_ptr::exception_ptr(exception_ptr&&)): Likewise. (std::exception_ptr::operator=): Likewise. (std::exception_ptr::~exception_ptr): Likewise. Use __builtin_eh_ptr_adjust_ref if consteval and compiler has it instead of _M_release. (std::exception_ptr::swap): Add _GLIBCXX26_CONSTEXPR. (std::exception_ptr::operator bool): Likewise. (std::exception_ptr::operator==): Likewise. * libsupc++/nested_exception.h (std::nested_exception::nested_exception): Add _GLIBCXX26_CONSTEXPR. (std::nested_exception::operator=): Likewise. (std::nested_exception::~nested_exception): For C++26 add constexpr and define inline. (std::nested_exception::rethrow_if_nested): Add _GLIBCXX26_CONSTEXPR. (std::nested_exception::nested_ptr): Likewise. (std::_Nested_exception::_Nested_exception): Likewise. (std::throw_with_nested, std::rethrow_if_nested): Likewise. * libsupc++/new (std::bad_alloc::bad_alloc): Likewise. (std::bad_alloc::operator=): Likewise. (std::bad_alloc::~bad_alloc): For C++26 add constexpr and define inline. (std::bad_alloc::what): Likewise. (std::bad_array_new_length::bad_array_new_length): Add _GLIBCXX26_CONSTEXPR. (std::bad_array_new_length::~bad_array_new_length): For C++26 add constexpr and define inline. (std::bad_array_new_length::what): Likewise. * libsupc++/typeinfo (std::bad_cast::bad_cast): Add _GLIBCXX26_CONSTEXPR. (std::bad_cast::~bad_cast): For C++26 add constexpr and define inline. (std::bad_cast::what): Likewise. (std::bad_typeid::bad_typeid): Add _GLIBCXX26_CONSTEXPR. (std::bad_typeid::~bad_typeid): For C++26 add constexpr and define inline. (std::bad_typeid::what): Likewise.
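For readers unfamiliar with P3068R5, a minimal sketch of what becomes valid in C++26 (illustrative, not one of the new testcases):

  // -std=c++26: throwing and catching inside constant evaluation.
  constexpr int checked_div (int a, int b)
  {
    if (b == 0)
      throw 42;                    // handled by the constexpr EH machinery
    return a / b;
  }

  constexpr int safe_div (int a, int b)
  {
    try { return checked_div (a, b); }
    catch (int) { return 0; }
  }

  static_assert (safe_div (10, 2) == 5);
  static_assert (safe_div (10, 0) == 0);
  // An exception that escapes constant evaluation makes the evaluation
  // non-constant and is diagnosed:  constexpr int bad = checked_div (1, 0);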
2025-07-10 | aarch64: Guard VF-based costing with !m_costing_for_scalar | Richard Sandiford | 1 file | -1/+1
g:4b47acfe2b626d1276e229a0cf165e934813df6c caused a segfault in aarch64_vector_costs::analyze_loop_vinfo when costing scalar code, since we'd end up dividing by a zero VF. Much of the structure of the aarch64 costing code dates from a stage 4 patch, when we had to work within the bounds of what the target-independent code did. Some of it could do with a rework now that we're not so constrained. This patch is therefore an emergency fix rather than the best long-term solution. I'll revisit when I have more time to think about it. gcc/ * config/aarch64/aarch64.cc (aarch64_vector_costs::add_stmt_cost): Guard VF-based costing with !m_costing_for_scalar.
2025-07-10 | Reduce the # of arguments of .ACCESS_WITH_SIZE from 6 to 4. | Qing Zhao | 5 files | -54/+38
This is an improvement to the design of the internal function .ACCESS_WITH_SIZE.

Currently, .ACCESS_WITH_SIZE is designed as:

  ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, TYPE_OF_SIZE,
                    ACCESS_MODE, TYPE_SIZE_UNIT for element)

which returns the REF_TO_OBJ same as the 1st argument;

  1st argument REF_TO_OBJ: The reference to the object;
  2nd argument REF_TO_SIZE: The reference to the size of the object;
  3rd argument CLASS_OF_SIZE: What the size referenced by REF_TO_SIZE
    represents:
      0: the number of bytes;
      1: the number of elements of the object type;
  4th argument TYPE_OF_SIZE: A constant 0 with its TYPE being the same as
    the TYPE of the object referenced by REF_TO_SIZE;
  5th argument ACCESS_MODE:
     -1: Unknown access semantics
      0: none
      1: read_only
      2: write_only
      3: read_write
  6th argument: The TYPE_SIZE_UNIT of the element TYPE of the FAM when the
    3rd argument is 1; NULL when the 3rd argument is 0.

Among the 6 arguments:

  A. The 3rd argument CLASS_OF_SIZE is not needed. If the REF_TO_SIZE
     represents the number of bytes, simply pass 1 to the TYPE_SIZE_UNIT
     argument.
  B. The 4th and the 5th arguments can be combined into 1 argument, whose
     TYPE represents the TYPE_OF_SIZE, and whose constant value represents
     the ACCESS_MODE.

As a result, the new design of .ACCESS_WITH_SIZE is:

  ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, TYPE_OF_SIZE + ACCESS_MODE,
                    TYPE_SIZE_UNIT for element)

which returns the REF_TO_OBJ same as the 1st argument;

  1st argument REF_TO_OBJ: The reference to the object;
  2nd argument REF_TO_SIZE: The reference to the size of the object;
  3rd argument TYPE_OF_SIZE + ACCESS_MODE: An integer constant with a
    pointer TYPE. The pointee TYPE of the pointer TYPE is the TYPE of the
    object referenced by REF_TO_SIZE. The integer constant value represents
    the ACCESS_MODE:
      0: none
      1: read_only
      2: write_only
      3: read_write
  4th argument: The TYPE_SIZE_UNIT of the element TYPE of the array.

gcc/c-family/ChangeLog:

        * c-ubsan.cc (get_bound_from_access_with_size): Adjust the
        position of the arguments per the new design.

gcc/c/ChangeLog:

        * c-typeck.cc (build_access_with_size_for_counted_by): Update
        comments. Adjust the arguments per the new design.

gcc/ChangeLog:

        * internal-fn.cc (expand_ACCESS_WITH_SIZE): Update comments.
        * internal-fn.def (ACCESS_WITH_SIZE): Update comments.
        * tree-object-size.cc (access_with_size_object_size): Update
        comments. Adjust the arguments per the new design.
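To make the design concrete, a hedged example of the source construct this IFN implements, with the conceptual new-style call in a comment (struct and field names are illustrative):

  struct flex
  {
    int count;
    int data[] __attribute__ ((counted_by (count)));
  };

  int
  get (struct flex *p, int i)
  {
    // Conceptually instrumented as (new 4-argument design):
    //   .ACCESS_WITH_SIZE (&p->data, &p->count,
    //                      (int *) 1 /* TYPE int, ACCESS_MODE read_only */,
    //                      4 /* TYPE_SIZE_UNIT of int */)[i]
    return p->data[i];
  }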
2025-07-10 | Passing TYPE_SIZE_UNIT of the element as the 6th argument to .ACCESS_WITH_SIZE (PR121000) | Qing Zhao | 5 files | -24/+69
The size of the element of the FAM _cannot_ reliably depend on the original TYPE of the FAM that we passed as the 6th parameter to .ACCESS_WITH_SIZE:

  TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (gimple_call_arg (call, 5))))

when the element of the FAM has a variable length type. Since the variable that represents TYPE_SIZE_UNIT has no explicit usage in the original IL, compiler transformations (such as DSE) that are applied before the object_size phase might eliminate the whole definition of the variable that represents the TYPE_SIZE_UNIT of the element of the FAM.

In order to resolve this issue, instead of passing the original TYPE of the FAM as the 6th argument to .ACCESS_WITH_SIZE, we should explicitly pass the original TYPE_SIZE_UNIT of the element TYPE of the FAM as the 6th argument to the call to .ACCESS_WITH_SIZE.

        PR middle-end/121000

gcc/c/ChangeLog:

        * c-typeck.cc (build_access_with_size_for_counted_by): Update
        comments. Pass TYPE_SIZE_UNIT of the element as the 6th argument.

gcc/ChangeLog:

        * internal-fn.cc (expand_ACCESS_WITH_SIZE): Update comments.
        * internal-fn.def (ACCESS_WITH_SIZE): Update comments.
        * tree-object-size.cc (access_with_size_object_size): Update
        comments. Get the element_size from the 6th argument directly.

gcc/testsuite/ChangeLog:

        * gcc.dg/flex-array-counted-by-pr121000.c: New test.
2025-07-10 | testsuite: Fix unallocated array usage in test | Mikael Morin | 1 file | -0/+2
gcc/testsuite/ChangeLog: * gfortran.dg/asan/array_constructor_1.f90: Allocate array before using it.
2025-07-10 | aarch64: Fix LD1Q and ST1Q failures for big-endian | Richard Sandiford | 2 files | -8/+18
LD1Q gathers and ST1Q scatters are unusual in that they operate on 128-bit blocks (effectively VNx1TI). However, we don't have modes or ACLE types for 128-bit integers, and 128-bit integers are not the intended use case. Instead, the instructions are intended to be used in "hybrid VLA" operations, where each 128-bit block is an Advanced SIMD vector. The normal SVE modes therefore capture the intended use case better than VNx1TI would. For example, VNx2DI is effectively N copies of V2DI, VNx4SI N copies of V4SI, etc. Since there is only one LD1Q instruction and one ST1Q instruction, the ACLE support used a single pattern for each, with the loaded or stored data having mode VNx2DI. The ST1Q pattern was generated by: rtx data = e.args.last (); e.args.last () = force_lowpart_subreg (VNx2DImode, data, GET_MODE (data)); e.prepare_gather_address_operands (1, false); return e.use_exact_insn (CODE_FOR_aarch64_scatter_st1q); where the force_lowpart_subreg bitcast the stored data to VNx2DI. But such subregs require an element reverse on big-endian targets (see the comment at the head of aarch64-sve.md), which wasn't the intention. The code should have used aarch64_sve_reinterpret instead. The LD1Q pattern was used as follows: e.prepare_gather_address_operands (1, false); return e.use_exact_insn (CODE_FOR_aarch64_gather_ld1q); which always returns a VNx2DI value, leaving the caller to bitcast that to the correct mode. That bitcast again uses subregs and has the same problem as above. However, for the reasons explained in the comment, using aarch64_sve_reinterpret does not work well for LD1Q. The patch instead parameterises the LD1Q based on the required data mode. gcc/ * config/aarch64/aarch64-sve2.md (aarch64_gather_ld1q): Replace with... (@aarch64_gather_ld1q<mode>): ...this, parameterizing based on mode. * config/aarch64/aarch64-sve-builtins-sve2.cc (svld1q_gather_impl::expand): Update accordingly. (svst1q_scatter_impl::expand): Use aarch64_sve_reinterpret instead of force_lowpart_subreg.
2025-07-10 | cobol: Add PUSH and POP to CDF. | James K. Lowden | 17 files | -1275/+1540
Introduce cdf_directives_t class to centralize management of CDF state. Move existing CDF state variables and functions into the new class. gcc/cobol/ChangeLog: PR cobol/120765 * cdf.y: Extend grammar for new CDF syntax, relocate dictionary. * cdfval.h (cdf_dictionary): Use new CDF dictionary. * dts.h: Remove useless assignment, note incorrect behavior. * except.cc: Remove obsolete EC state. * gcobol.1: Document CDF in its own section. * genapi.cc (parser_statement_begin): Use new EC state function. (parser_file_merge): Same. (parser_check_fatal_exception): Same. * genutil.cc (get_and_check_refstart_and_reflen): Same. (get_depending_on_value_from_odo): Same. (get_data_offset): Same. (process_this_exception): Same. * lexio.cc (check_push_pop_directive): New function. (check_source_format_directive): Restrict regex search to 1 line. (cdftext::free_form_reference_format): Use new function. * parse.y: Define new CDF tokens, use new CDF state. * parse_ante.h (cdf_tokens): Use new CDF state. (redefined_token): Same. (class prog_descr_t): Remove obsolete CDF state. (class program_stack_t): Same. (current_call_convention): Same. * scan.l: Recognize new CDF tokens. * scan_post.h (is_cdf_token): Same. * symbols.h (cdf_current_tokens): Change current_call_convention to return void. * token_names.h: Regenerate. * udf/stored-char-length.cbl: Use new PUSH/POP CDF functionality. * util.cc (class cdf_directives_t): Define cdf_directives_t. (current_call_convention): Same. (cdf_current_tokens): Same. (cdf_dictionary): Same. (cdf_enabled_exceptions): Same. (cdf_push): Same. (cdf_push_call_convention): Same. (cdf_push_current_tokens): Same. (cdf_push_dictionary): Same. (cdf_push_enabled_exceptions): Same. (cdf_push_source_format): Same. (cdf_pop): Same. (cdf_pop_call_convention): Same. (cdf_pop_current_tokens): Same. (cdf_pop_dictionary): Same. (cdf_pop_enabled_exceptions): Same. (cdf_pop_source_format): Same. * util.h (cdf_push): Declare cdf_directives_t. (cdf_push_call_convention): Same. (cdf_push_current_tokens): Same. (cdf_push_dictionary): Same. (cdf_push_enabled_exceptions): Same. (cdf_push_source_format): Same. (cdf_pop): Same. (cdf_pop_call_convention): Same. (cdf_pop_current_tokens): Same. (cdf_pop_dictionary): Same. (cdf_pop_source_format): Same. (cdf_pop_enabled_exceptions): Same. libgcobol/ChangeLog: * common-defs.h (cdf_enabled_exceptions): Use new CDF state.
2025-07-10 | Fixes to auto-profile and Gimple matching. | Jan Hubicka | 2 files | -84/+156
This patch fixes several issues I noticed in gimple matching and in the -Wauto-profile warning.

One problem is that we mismatched symbols with user names, such as "*strlen" instead of "strlen". I added raw_symbol_name to strip the extra '*', which is OK on ELF targets (which are the only targets we support with auto-profile), but eventually we will want to add the user prefix; there is a sorry about this. Also, I think dwarf2out is wrong:

  static void
  add_linkage_attr (dw_die_ref die, tree decl)
  {
    const char *name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));

    /* Mimic what assemble_name_raw does with a leading '*'.  */
    if (name[0] == '*')
      name = &name[1];

The patch also fixes the locations of the warnings. I used the location of the problematic statement as the warning_at parameter but also included info about the containing function. This made warning_at ignore the first location; that is fixed now. I also fixed the ICE with -Wno-auto-profile discussed earlier.

Bootstrapped/regtested x86_64-linux. Autoprofiled bootstrap now fails for weird reasons for me (it does not build the training stage), so I will try to debug this before committing.

gcc/ChangeLog:

        * auto-profile.cc: Include output.h.
        (function_instance::set_call_location): Also sanity check that
        location is known.
        (raw_symbol_name): Two new static functions.
        (dump_inline_stack): Use it.
        (string_table::get_index_by_decl): Likewise.
        (function_instance::get_cgraph_node): Likewise.
        (function_instance::get_function_instance_by_decl): Fix typo in
        warning; use raw names; fix lineno decoding.
        (match_with_target): Add containing function parameter; correctly
        output function and call location in warning.
        (function_instance::lookup_count): Fix warning locations.
        (function_instance::match): Fix warning locations; avoid crash
        with mismatched callee; do not warn about broken callsites twice.
        (autofdo_source_profile::offline_external_functions): Use
        raw_assembler_name.
        (walk_block): Use raw_assembler_name.

gcc/testsuite/ChangeLog:

        * gcc.dg/tree-prof/afdo-inline.c: Add user symbol names.
2025-07-10 | expand: ICE if asked to expand RDIV with non-float type. | Robin Dapp | 3 files | -1/+5
This patch adds asserts that ensure we only expand an RDIV_EXPR with actual float mode. It also replaces the RDIV_EXPR in setting a vectorized loop's length by EXACT_DIV_EXPR. The code in question is only used with length-control targets (riscv, powerpc, s390). PR target/121014 gcc/ChangeLog: * cfgexpand.cc (expand_debug_expr): Assert FLOAT_MODE_P. * optabs-tree.cc (optab_for_tree_code): Assert FLOAT_TYPE_P. * tree-vect-loop.cc (vect_get_loop_len): Use EXACT_DIV_EXPR.
2025-07-10 | RISC-V: Make zero-stride load broadcast a tunable. | Robin Dapp | 7 files | -34/+133
This patch makes the zero-stride load broadcast idiom dependent on a uarch-tunable "use_zero_stride_load". Right now we have quite a few paths that reach a strided load, and some of them are not exactly straightforward.

While broadcast is relatively rare on rv64 targets, it is more common on rv32 targets that want to vectorize 64-bit elements.

While the patch is more involved than I would have liked, it could have touched even more places. The whole broadcast-like insn path feels a bit hackish due to the several optimizations we employ. Some of the complications stem from the fact that we lump together real broadcasts, vector single-element sets, and strided broadcasts. The strided-load alternatives currently require a memory_constraint to work properly, which causes more complications when trying to disable just these.

In short, the whole pred_broadcast handling in combination with the sew64_scalar_helper could use work in the future. I was about to start with it in this patch but soon realized that it would only distract from the original intent. What can help in the future is to split strided and non-strided broadcasts entirely, as well as the single-element sets.

Yet unclear is whether we need to pay special attention to misaligned strided loads (PR120782).

I regtested on rv32 and rv64 with strided_load_broadcast_p forced to true and false. With either I didn't observe any new execution failures, but obviously there are new scan failures with strided broadcast turned off.

        PR target/118734

gcc/ChangeLog:

        * config/riscv/constraints.md (Wdm): Use tunable for Wdm
        constraint.
        * config/riscv/riscv-protos.h (emit_avltype_insn): Declare.
        (can_be_broadcasted_p): Rename to...
        (can_be_broadcast_p): ...this.
        * config/riscv/predicates.md: Use renamed function.
        (strided_load_broadcast_p): Declare.
        * config/riscv/riscv-selftests.cc (run_broadcast_selftests):
        Only run broadcast selftest if strided broadcasts are OK.
        * config/riscv/riscv-v.cc (emit_avltype_insn): New function.
        (sew64_scalar_helper): Only emit a pred_broadcast if the new
        tunable says so.
        (can_be_broadcasted_p): Rename to...
        (can_be_broadcast_p): ...this and use new tunable.
        * config/riscv/riscv.cc (struct riscv_tune_param): Add strided
        broadcast tunable.
        (strided_load_broadcast_p): Implement.
        * config/riscv/vector.md: Use strided_load_broadcast_p () and
        work around 64-bit broadcast on rv32 targets.
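A hedged illustration of the idiom in question (loop and assembly are examples, not taken from the patch): on rv32, broadcasting a 64-bit element is where a zero-stride vector load is most attractive, since the scalar value would otherwise live in a GPR pair.

  #include <stdint.h>

  /* Vectorizes to a broadcast of *src; with the tunable enabled this can
     be a zero-stride load, roughly:
         vsetvli  t0, a2, e64, m1, ta, ma
         vlse64.v v8, (a1), zero      # stride x0: splat one 64-bit value
  */
  void
  bcast (int64_t *restrict dst, const int64_t *restrict src, int n)
  {
    for (int i = 0; i < n; i++)
      dst[i] = *src;
  }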
2025-07-10 | [RISC-V] Detect new fusions for RISC-V | Daniel Barboza | 1 file | -1/+382
This is primarily Daniel's work... He's chasing things in QEMU & LLVM right now, so I'm doing a bit of clean-up and shepherding this patch forward.

--

Instruction fusion is a reasonably common way to improve the performance of code on many architectures/designs. A few years ago we submitted (via VRULL, I suspect) fusion support for a number of cases in the RISC-V space. We made each type of fusion selectable independently in the tuning structure so that designs which implemented some particular set of fusions could select just the ones their design implemented. This patch adds to that generic infrastructure.

In particular we're introducing additional load fusions, store pair fusions, bitfield extractions and a few B extension related fusions.

Conceptually, for the new load fusions we're adding the ability to fuse most add/shNadd instructions with a subsequent load. There are a couple of exceptions, but in general the expectation is that if we have add/shNadd for address computation, then they can potentially fuse with the load where the address gets used.

We've had limited forms of store pair fusion for a while. Essentially we required both stores to be 64 bits wide and land on opposite sides of a 128 bit cache line. That was enough to help prologues and a few other things, but was fairly restrictive. The new cases capture store pairs where the two stores have the same size and hit consecutive memory locations. For example, storing consecutive bytes with sb+sb is fusible.

For bitfield extractions we can fuse together a shift left followed by a shift right for arbitrary shift counts, whereas previously we restricted the shift counts to those implementing sign/zero extensions of 8- and 16-bit objects.

Finally, some B extension fusions: orc.b+not, which shows up in string comparisons; ctz+andi (deepsjeng?); neg+max (synthesized abs).

I hope these prove to be useful to other RISC-V designs. I wouldn't be surprised if we have to break down the new load fusions further for some designs. If we need to do that, it wouldn't be hard. FWIW, our data indicates the generalized store fusions followed by the expanded load fusions are the most important cases for the new code.

These have been tested with crosses and bootstrapped on the BPI. Waiting on pre-commit CI before moving forward (though it has been failing to pick up some patches recently...)

gcc/
        * config/riscv/riscv.cc (riscv_fusion_pairs): Add new cases.
        (riscv_set_is_add): New function.
        (riscv_set_is_addi, riscv_set_is_adduw, riscv_set_is_shNadd):
        Likewise.
        (riscv_set_is_shNadduw): Likewise.
        (riscv_macro_fusion_pair_p): Add new fusion cases.

Co-authored-by: Jeff Law <jlaw@ventanamicro.com>
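Hedged examples of code that exercises two of the new fusion classes (the asm in the comments is what one would expect, not output from the patch; actual selection depends on the tuning):

  /* Address computation + dependent load: expected to become
         sh3add a5, a1, a0          # a5 = a0 + (a1 << 3)
         ld     a0, 0(a5)           # fusible pair on opted-in uarchs  */
  long
  load_elem (long *base, long idx)
  {
    return base[idx];
  }

  /* Same-size stores to consecutive locations: sb+sb is now fusible.  */
  void
  store_pair (char *p)
  {
    p[0] = 0;
    p[1] = 0;
  }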
2025-07-10testsuite: Add -funwind-tables to sve*/pfalse* testsRichard Sandiford62-62/+62
The SVE svpfalse folding tests use CFI directives to delimit the function bodies. That requires -funwind-tables to be enabled, which is true by default for *-linux-gnu targets, but not for *-elf. gcc/testsuite/ * gcc.target/aarch64/sve/pfalse-binary.c: Add -funwind-tables. * gcc.target/aarch64/sve/pfalse-binary_int_opt_n.c: Likewise. * gcc.target/aarch64/sve/pfalse-binary_opt_n.c: Likewise. * gcc.target/aarch64/sve/pfalse-binary_opt_single_n.c: Likewise. * gcc.target/aarch64/sve/pfalse-binary_rotate.c: Likewise. * gcc.target/aarch64/sve/pfalse-binary_uint64_opt_n.c: Likewise. * gcc.target/aarch64/sve/pfalse-binary_uint_opt_n.c: Likewise. * gcc.target/aarch64/sve/pfalse-binaryxn.c: Likewise. * gcc.target/aarch64/sve/pfalse-clast.c: Likewise. * gcc.target/aarch64/sve/pfalse-compare_opt_n.c: Likewise. * gcc.target/aarch64/sve/pfalse-compare_wide_opt_n.c: Likewise. * gcc.target/aarch64/sve/pfalse-count_pred.c: Likewise. * gcc.target/aarch64/sve/pfalse-fold_left.c: Likewise. * gcc.target/aarch64/sve/pfalse-load.c: Likewise. * gcc.target/aarch64/sve/pfalse-load_ext.c: Likewise. * gcc.target/aarch64/sve/pfalse-load_ext_gather_index.c: Likewise. * gcc.target/aarch64/sve/pfalse-load_ext_gather_offset.c: Likewise. * gcc.target/aarch64/sve/pfalse-load_gather_sv.c: Likewise. * gcc.target/aarch64/sve/pfalse-load_gather_vs.c: Likewise. * gcc.target/aarch64/sve/pfalse-load_replicate.c: Likewise. * gcc.target/aarch64/sve/pfalse-prefetch.c: Likewise. * gcc.target/aarch64/sve/pfalse-prefetch_gather_index.c: Likewise. * gcc.target/aarch64/sve/pfalse-prefetch_gather_offset.c: Likewise. * gcc.target/aarch64/sve/pfalse-ptest.c: Likewise. * gcc.target/aarch64/sve/pfalse-rdffr.c: Likewise. * gcc.target/aarch64/sve/pfalse-reduction.c: Likewise. * gcc.target/aarch64/sve/pfalse-reduction_wide.c: Likewise. * gcc.target/aarch64/sve/pfalse-shift_right_imm.c: Likewise. * gcc.target/aarch64/sve/pfalse-store.c: Likewise. * gcc.target/aarch64/sve/pfalse-store_scatter_index.c: Likewise. * gcc.target/aarch64/sve/pfalse-store_scatter_offset.c: Likewise. * gcc.target/aarch64/sve/pfalse-storexn.c: Likewise. * gcc.target/aarch64/sve/pfalse-ternary_opt_n.c: Likewise. * gcc.target/aarch64/sve/pfalse-ternary_rotate.c: Likewise. * gcc.target/aarch64/sve/pfalse-unary.c: Likewise. * gcc.target/aarch64/sve/pfalse-unary_convert_narrowt.c: Likewise. * gcc.target/aarch64/sve/pfalse-unary_convertxn.c: Likewise. * gcc.target/aarch64/sve/pfalse-unary_n.c: Likewise. * gcc.target/aarch64/sve/pfalse-unary_pred.c: Likewise. * gcc.target/aarch64/sve/pfalse-unary_to_uint.c: Likewise. * gcc.target/aarch64/sve/pfalse-unaryxn.c: Likewise. * gcc.target/aarch64/sve2/pfalse-binary.c: Likewise. * gcc.target/aarch64/sve2/pfalse-binary_int_opt_n.c: Likewise. * gcc.target/aarch64/sve2/pfalse-binary_int_opt_single_n.c: Likewise. * gcc.target/aarch64/sve2/pfalse-binary_opt_n.c: Likewise. * gcc.target/aarch64/sve2/pfalse-binary_opt_single_n.c: Likewise. * gcc.target/aarch64/sve2/pfalse-binary_to_uint.c: Likewise. * gcc.target/aarch64/sve2/pfalse-binary_uint_opt_n.c: Likewise. * gcc.target/aarch64/sve2/pfalse-binary_wide.c: Likewise. * gcc.target/aarch64/sve2/pfalse-compare.c: Likewise. * gcc.target/aarch64/sve2/pfalse-load_ext_gather_index_restricted.c, * gcc.target/aarch64/sve2/pfalse-load_ext_gather_offset_restricted.c, * gcc.target/aarch64/sve2/pfalse-load_gather_sv_restricted.c: Likewise. * gcc.target/aarch64/sve2/pfalse-load_gather_vs.c: Likewise. * gcc.target/aarch64/sve2/pfalse-shift_left_imm_to_uint.c: Likewise. 
* gcc.target/aarch64/sve2/pfalse-shift_right_imm.c: Likewise. * gcc.target/aarch64/sve2/pfalse-store_scatter_index_restricted.c, * gcc.target/aarch64/sve2/pfalse-store_scatter_offset_restricted.c, * gcc.target/aarch64/sve2/pfalse-unary.c: Likewise. * gcc.target/aarch64/sve2/pfalse-unary_convert.c: Likewise. * gcc.target/aarch64/sve2/pfalse-unary_convert_narrowt.c: Likewise. * gcc.target/aarch64/sve2/pfalse-unary_to_int.c: Likewise.
2025-07-10Handle failed gcond pattern gracefullyRichard Biener1-3/+9
SLP analysis of early break conditions asserts that pattern recognition canonicalized all of them. But the pattern can fail, for example when vector types cannot be computed. So be graceful here, avoiding an ICE when vector types have not yet been computed.

* tree-vect-slp.cc (vect_analyze_slp): Fail for non-canonical gconds.
2025-07-10Adjust reduction with conversion SLP buildRichard Biener1-1/+6
The following adjusts how we set SLP_TREE_VECTYPE for the conversion node we build when fixing up the reduction-with-conversion SLP instance. This should probably see more TLC, but for now it avoids relying on STMT_VINFO_VECTYPE for this.

* tree-vect-slp.cc (vect_build_slp_instance): Do not use SLP_TREE_VECTYPE to determine the conversion back to the reduction IV.
2025-07-10Avoid vect_is_simple_use call from vectorizable_reductionRichard Biener1-8/+5
When analyzing the reduction cycle we look to determine the reduction input vector type; for lane-reducing ops we look at the input, but instead of using vect_is_simple_use, which is problematic for SLP, we should simply get at the SLP operand's vector type. If that's not set and we make one up, we should also ensure it stays set.

* tree-vect-loop.cc (vectorizable_reduction): Avoid vect_is_simple_use and record a vector type if we come up with one.
2025-07-10Avoid vect_is_simple_use call from get_load_store_typeRichard Biener1-11/+4
This isn't the required refactoring of vect_check_gather_scatter, but it avoids a now-unnecessary call to vect_is_simple_use, which is problematic because it looks at STMT_VINFO_VECTYPE, which we want to get rid of. SLP build already ensures vect_is_simple_use on all lane defs, so all we need is to populate the offset_vectype and offset_dt, which are not always set by vect_check_gather_scatter. Both are easy to get from the SLP child directly.

* tree-vect-stmts.cc (get_load_store_type): Do not use vect_is_simple_use to fill gather/scatter offset operand vectype and dt.
2025-07-10Pass SLP node down to cost hook for reduction costRichard Biener1-18/+19
The following arranges for vector reduction costs to hand down the SLP node (of the reduction stmt) to the cost hooks, not only the stmt_info. This also avoids accessing STMT_VINFO_VECTYPE of a stmt unrelated to the node that is subject to code generation.

* tree-vect-loop.cc (vect_model_reduction_cost): Get SLP node instead of stmt_info and use that when recording costs.
2025-07-10aarch64: PR target/120999: Adjust operands for movprfx alternative of NBSL implementation of NORKyrylo Tkachov2-1/+18
While the SVE2 NBSL instruction accepts MOVPRFX to add more flexibility due to its tied operands, the destination of the movprfx cannot also be a source operand. But the offending pattern in aarch64-sve2.md tries to do exactly that for the "=?&w,w,w" alternative, and gas warns for the attached testcase.

This patch adjusts that alternative to avoid taking operand 0 as an input to the NBSL again. So for the testcase in the patch we now generate:

nor_z:
	movprfx	z0, z1
	nbsl	z0.d, z0.d, z2.d, z1.d
	ret

instead of the previous:

nor_z:
	movprfx	z0, z1
	nbsl	z0.d, z0.d, z2.d, z0.d
	ret

which generated a gas warning.

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>

gcc/
PR target/120999
* config/aarch64/aarch64-sve2.md (*aarch64_sve2_nor<mode>): Adjust movprfx alternative.

gcc/testsuite/
PR target/120999
* gcc.target/aarch64/sve2/pr120999.c: New test.
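For reference, a vector NOR can be written with ACLE intrinsics along these lines (a sketch of the kind of code involved; the actual pr120999.c testcase may differ):

    #include <arm_sve.h>

    /* ~(b | c): with SVE2 this can be selected as a single NBSL,
       with a movprfx when the chosen destination requires one.  */
    svuint64_t
    nor_z (svuint64_t b, svuint64_t c)
    {
      svbool_t pg = svptrue_b64 ();
      return svnot_u64_x (pg, svorr_u64_x (pg, b, c));
    }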
2025-07-10aarch64: Extend HVLA permutations to big-endianRichard Sandiford11-42/+380
TARGET_VECTORIZE_VEC_PERM_CONST has code to match the SVE2.1 "hybrid VLA" DUPQ, EXTQ, UZPQ{1,2}, and ZIPQ{1,2} instructions. This matching was conditional on !BYTES_BIG_ENDIAN. The ACLE code also lowered the associated SVE2.1 intrinsics into suitable VEC_PERM_EXPRs. This lowering was not conditional on !BYTES_BIG_ENDIAN. The mismatch led to lots of ICEs in the ACLE tests on big-endian targets: we lowered to VEC_PERM_EXPRs that are not supported.

I think the !BYTES_BIG_ENDIAN restriction was unnecessary. SVE maps the first memory element to the least significant end of the register for both endiannesses, so no endian correction or lane number adjustment is necessary.

This is in some ways a bit counterintuitive. ZIPQ1 is conceptually "apply Advanced SIMD ZIP1 to each 128-bit block", and endianness does matter when choosing between Advanced SIMD ZIP1 and ZIP2. For example, the V4SI permute selector { 0, 4, 1, 5 } corresponds to ZIP1 for little-endian and ZIP2 for big-endian. But the difference between the hybrid VLA and Advanced SIMD permute selectors is a consequence of the difference between the SVE and Advanced SIMD element orders.

The same thing applies to ACLE intrinsics. The current lowering of svzipq1 etc. is correct for both endiannesses. If ACLE code does:

2x svld1_s32 + svzipq1_s32 + svst1_s32

then the byte-for-byte result is the same for both endiannesses. On big-endian targets, this is different from using the Advanced SIMD sequence below for each 128-bit block:

2x LDR + ZIP1 + STR

In contrast, the byte-for-byte result of:

2x svld1q_gather_s32 + svzipq1_s32 + svst1q_scatter_s32

depends on endianness, since the quadword gathers and scatters use Advanced SIMD byte ordering for each 128-bit block. This gather/scatter sequence behaves in the same way as the Advanced SIMD LDR+ZIP1+STR sequence for both endiannesses. Programmers writing ACLE code have to be aware of this difference if they want to support both endiannesses.

The patch includes some new execution tests to verify the expansion of the VEC_PERM_EXPRs.

gcc/
* doc/sourcebuild.texi (aarch64_sve2_hw, aarch64_sve2p1_hw): Document.
* config/aarch64/aarch64.cc (aarch64_evpc_hvla): Extend to BYTES_BIG_ENDIAN.

gcc/testsuite/
* lib/target-supports.exp (check_effective_target_aarch64_sve2p1_hw): New proc.
* gcc.target/aarch64/sve2/dupq_1.c: Extend to big-endian. Add noipa attributes.
* gcc.target/aarch64/sve2/extq_1.c: Likewise.
* gcc.target/aarch64/sve2/uzpq_1.c: Likewise.
* gcc.target/aarch64/sve2/zipq_1.c: Likewise.
* gcc.target/aarch64/sve2/dupq_1_run.c: New test.
* gcc.target/aarch64/sve2/extq_1_run.c: Likewise.
* gcc.target/aarch64/sve2/uzpq_1_run.c: Likewise.
* gcc.target/aarch64/sve2/zipq_1_run.c: Likewise.
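A minimal sketch of the first ACLE sequence above (mine, not from the patch; assumes an SVE2.1 toolchain, e.g. -march=armv9.4-a+sve2p1):

    #include <arm_sve.h>

    /* 2x svld1_s32 + svzipq1_s32 + svst1_s32: the byte-for-byte
       result is the same for both endiannesses.  */
    void
    zipq1_s32 (int32_t *dst, const int32_t *a, const int32_t *b)
    {
      svbool_t pg = svptrue_b32 ();
      svint32_t x = svld1_s32 (pg, a);
      svint32_t y = svld1_s32 (pg, b);
      svst1_s32 (pg, dst, svzipq1_s32 (x, y));
    }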
2025-07-10Remove dead code dealing with non-SLPRichard Biener3-151/+38
After vect_analyze_loop_operations is gone we can clean up vect_analyze_stmt, as it is no longer called outside of an SLP context.

* tree-vectorizer.h (vect_analyze_stmt): Remove stmt_info and need_to_vectorize arguments.
* tree-vect-slp.cc (vect_slp_analyze_node_operations_1): Adjust.
* tree-vect-stmts.cc (can_vectorize_live_stmts): Remove stmt_info argument and remove non-SLP path.
(vect_analyze_stmt): Remove stmt_info and need_to_vectorize arguments and prune paths no longer reachable.
(vect_transform_stmt): Adjust.
2025-07-10Comment spelling fix: tunning -> tuningJakub Jelinek2-2/+2
Kyrylo noticed another spelling bug and, as usual, the same mistake happens in multiple places.

2025-07-10  Jakub Jelinek  <jakub@redhat.com>

* config/i386/x86-tune.def: Change "Tunning the" to "tuning" in comment and use semicolon instead of dot in comment.
* loop-unroll.cc (decide_unroll_stupid): Comment spelling fix, tunning -> tuning.
2025-07-10Change bellow in comments to belowJakub Jelinek18-18/+18
While I'm not a native English speaker, I believe all the uses of bellow (roar/bark/...) in comments in gcc are meant to be below (beneath/under/...). 2025-07-10 Jakub Jelinek <jakub@redhat.com> gcc/ * tree-vect-loop.cc (scale_profile_for_vect_loop): Comment spelling fix: bellow -> below. * ipa-polymorphic-call.cc (record_known_type): Likewise. * config/i386/x86-tune.def: Likewise. * config/riscv/vector.md (*vsetvldi_no_side_effects_si_extend): Likewise. * tree-scalar-evolution.cc (iv_can_overflow_p): Likewise. * ipa-devirt.cc (add_type_duplicate): Likewise. * tree-ssa-loop-niter.cc (maybe_lower_iteration_bound): Likewise. * gimple-ssa-sccopy.cc: Likewise. * cgraphunit.cc: Likewise. * graphite.h (struct poly_dr): Likewise. * ipa-reference.cc (ignore_edge_p): Likewise. * tree-ssa-alias.cc (ao_compare::compare_ao_refs): Likewise. * profile-count.h (profile_probability::probably_reliable_p): Likewise. * ipa-inline-transform.cc (inline_call): Likewise. gcc/ada/ * par-load.adb: Comment spelling fix: bellow -> below. * libgnarl/s-taskin.ads: Likewise. gcc/testsuite/ * gfortran.dg/g77/980310-3.f: Comment spelling fix: bellow -> below. * jit.dg/test-debuginfo.c: Likewise. libstdc++-v3/ * testsuite/22_locale/codecvt/codecvt_unicode.h (ucs2_to_utf8_out_error): Comment spelling fix: bellow -> below. (utf16_to_ucs2_in_error): Likewise.
2025-07-10Remove vect_dissolve_slp_only_groupsRichard Biener1-75/+0
This function dissolves DR groups that are not subject to SLP, which means it is no longer necessary.

* tree-vect-loop.cc (vect_dissolve_slp_only_groups): Remove.
(vect_analyze_loop_2): Do not call it.
2025-07-10Remove vect_analyze_loop_operationsRichard Biener1-137/+0
This removes the remains of vect_analyze_loop_operations. All the checks it still does on LC PHIs of inner loops in outer loop vectorization should be handled by vectorizable_lc_phi.

* tree-vect-loop.cc (vect_active_double_reduction_p): Remove.
(vect_analyze_loop_operations): Remove.
(vect_analyze_loop_2): Do not call it.
2025-07-10Remove non-SLP vectorization factor determiningRichard Biener4-182/+100
The following removes the VF determining step from non-SLP stmts. For now we keep setting STMT_VINFO_VECTYPE for all stmts; there are too many places to fix, including some more complicated ones, so this is deferred to a followup.

Along the way this removes vect_update_vf_for_slp, merging the check for present hybrid SLP stmts into vect_detect_hybrid_slp and failing analysis early. This also avoids essentially duplicating this check in the stmt walk of vect_analyze_loop_operations. Getting rid of that, and performing some other checks earlier, is also deferred to a followup.

* tree-vect-loop.cc (vect_determine_vf_for_stmt_1): Rename to ...
(vect_determine_vectype_for_stmt_1): ... this and only set STMT_VINFO_VECTYPE. Fail for single-element vector types.
(vect_determine_vf_for_stmt): Rename to ...
(vect_determine_vectype_for_stmt): ... this and only set STMT_VINFO_VECTYPE. Fail for single-element vector types.
(vect_determine_vectorization_factor): Rename to ...
(vect_set_stmts_vectype): ... this and only set STMT_VINFO_VECTYPE.
(vect_update_vf_for_slp): Remove.
(vect_analyze_loop_operations): Remove walk over stmts.
(vect_analyze_loop_2): Call vect_set_stmts_vectype instead of vect_determine_vectorization_factor. Set vectorization factor from LOOP_VINFO_SLP_UNROLLING_FACTOR. Fail if vect_detect_hybrid_slp detects hybrid stmts or when vect_make_slp_decision finds nothing to SLP.
* tree-vect-slp.cc (vect_detect_hybrid_slp): Move check whether we have any hybrid stmts here from vect_update_vf_for_slp.
* tree-vect-stmts.cc (vect_analyze_stmt): Remove loop over stmts.
* tree-vectorizer.h (vect_detect_hybrid_slp): Update.
2025-07-10RISCV: Remove the v extension requirement for sat scalar run testPan Li214-214/+214
The sat scalar run tests should not require the v extension, so take rv32 || rv64 instead of riscv_v for the requirement (see the sketch after the ChangeLog below). The below test suites passed for this patch series.

* The rv64gcv full regression test.
* The rv32gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_s_add-run-1-i16.c: Take rv32 || rv64 instead of riscv_v for scalar run test.
* gcc.target/riscv/sat/sat_s_add-run-1-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-1-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-1-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-2-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-2-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-2-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-2-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-3-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-3-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-3-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-3-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-4-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-4-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-4-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_add-run-4-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-1-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-1-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-1-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-1-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-2-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-2-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-2-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-2-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-3-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-3-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-3-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-3-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-4-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-4-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-4-i64.c: Ditto.
* gcc.target/riscv/sat/sat_s_sub-run-4-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-1-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-1-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-1-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-1-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-1-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-1-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-2-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-2-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-2-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-2-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-2-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-2-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-3-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-3-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-3-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-3-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-3-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-3-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-4-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-4-i32-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-4-i32-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-4-i64-to-i16.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-4-i64-to-i32.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-4-i64-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-5-i16-to-i8.c: Ditto.
* gcc.target/riscv/sat/sat_s_trunc-run-5-i32-to-i16.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-5-i32-to-i8.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-5-i64-to-i16.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-5-i64-to-i32.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-5-i64-to-i8.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-6-i16-to-i8.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-6-i32-to-i16.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-6-i32-to-i8.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-6-i64-to-i16.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-6-i64-to-i32.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-6-i64-to-i8.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-7-i16-to-i8.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-7-i32-to-i16.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-7-i32-to-i8.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-7-i64-to-i16.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-7-i64-to-i32.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-7-i64-to-i8.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-8-i16-to-i8.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-8-i32-to-i16.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-8-i32-to-i8.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-8-i64-to-i16.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-8-i64-to-i32.c: Ditto. * gcc.target/riscv/sat/sat_s_trunc-run-8-i64-to-i8.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-1-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-1-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-1-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-1-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-2-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-2-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-2-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-2-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-3-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-3-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-3-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-3-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-4-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-4-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-4-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-4-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-5-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-5-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-5-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-5-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-6-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-6-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-6-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-6-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-7-u16-from-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-7-u16-from-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-7-u32-from-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-7-u8-from-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-7-u8-from-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_add-run-7-u8-from-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_add_imm-run-1-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_add_imm-run-1-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_add_imm-run-1-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_add_imm-run-1-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_add_imm-run-2-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_add_imm-run-2-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_add_imm-run-2-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_add_imm-run-2-u8.c: Ditto. 
* gcc.target/riscv/sat/sat_u_add_imm-run-3-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_add_imm-run-3-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_add_imm-run-3-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_add_imm-run-3-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_add_imm-run-4-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_add_imm-run-4-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_add_imm-run-4-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_add_imm-run-4-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-1-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-1-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-1-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-1-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-10-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-10-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-10-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-10-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-11-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-11-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-11-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-11-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-12-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-12-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-12-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-12-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-2-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-2-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-2-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-2-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-3-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-3-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-3-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-3-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-4-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-4-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-4-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-4-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-5-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-5-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-5-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-5-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-6-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-6-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-6-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-6-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-7-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-7-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-7-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-7-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-8-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-8-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-8-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-8-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-9-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-9-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-9-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_sub-run-9-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_sub_imm-run-1-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_sub_imm-run-1-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_sub_imm-run-1-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_sub_imm-run-1-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_sub_imm-run-2-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_sub_imm-run-2-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_sub_imm-run-2-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_sub_imm-run-2-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_sub_imm-run-3-u16.c: Ditto. 
* gcc.target/riscv/sat/sat_u_sub_imm-run-3-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_sub_imm-run-3-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_sub_imm-run-3-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_sub_imm-run-4-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_sub_imm-run-4-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_sub_imm-run-4-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_sub_imm-run-4-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-1-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-1-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-1-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-1-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-2-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-2-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-2-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-2-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-3-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-3-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-3-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-3-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-4-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-4-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-4-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-4-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-5-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-5-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-5-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-5-u8.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-6-u16.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-6-u32.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-6-u64.c: Ditto. * gcc.target/riscv/sat/sat_u_trunc-run-6-u8.c: Ditto. Signed-off-by: Pan Li <pan2.li@intel.com>
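For illustration only (the exact directives in the test files may differ), the swap described above amounts to a one-line header change per test, roughly:

    /* Before: runnable only where the vector extension is available.  */
    /* { dg-require-effective-target riscv_v } */

    /* After: the code under test is scalar, so any rv32 or rv64
       target can run it.  */
    /* { dg-do run { target { rv32 || rv64 } } } */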
2025-07-10Daily bump.GCC Administrator5-1/+380
2025-07-09cobol: Development round-up. [PR120765, PR119337, PR120794]Robert Dubner18-1601/+2012
This collection of changes reflects development by both Jim Lowden and Bob Dubner. It includes fixes to the gcobc script; refinements to the multiple-period syntax; changes to the parser; implementation of DISPLAY/ACCEPT to and from ENVIRONMENT-NAME, ENVIRONMENT-VALUE, ARGUMENT-NUMBER, and ARGUMENT-VALUE; and minor changes to genapi.cc to cut down on the number of cppcheck warnings.

Co-authored-by: James K. Lowden <jklowden@cobolworx.com>
Co-authored-by: Robert Dubner <rdubner@symas.com>

gcc/cobol/ChangeLog:

PR cobol/120765
PR cobol/119337
PR cobol/120794

* Make-lang.in: Take control of the .cc.o rule.
* cbldiag.h (error_msg_direct): New declaration.
(gcc_location_dump): Forward declaration.
(location_dump): Use gcc_location_dump.
* cdf.y: Change some tokens.
* gcobc: Change dialect handling.
* genapi.cc (parser_call_targets_dump): Temporarily remove from service.
(parser_compile_dcls): Combine temporary arrays.
(get_binary_value_from_float): Apply const to one parameter.
(depending_on_value): Localize a boolean variable.
(normal_normal_compare): Likewise.
(cobol_compare): Eliminate cppcheck warning.
(combined_name): Apply const to an input parameter.
(parser_perform): Apply const to a variable.
(parser_accept): Improve handling of special_name_t parameter and the exception conditions.
(parser_display): Improve handling of special_name_t parameter; use the os_filename[] string when appropriate.
(program_end_stuff): Rename shadowing variable.
(parser_division): Consolidate temporary char[] arrays.
(parser_file_start): Apply const to a parameter.
(inspect_replacing): Likewise.
(parser_program_hierarchy): Rename shadowing variable.
(mh_identical): Apply const to parameters.
(float_type_of): Likewise.
(picky_memcpy): Likewise.
(mh_numeric_display): Likewise.
(mh_little_endian): Likewise.
(mh_source_is_group): Apply static to a variable.
(move_helper): Quiet a cppcheck warning.
* genapi.h (parser_accept): Add exceptions to declaration.
(parser_accept_under_discussion): Add declaration.
(parser_display): Change to std::vector; add exceptions to declaration.
* lexio.cc (cdf_source_format): Improve source code location handling.
(source_format_t::infer): Likewise.
(is_fixed_format): Likewise.
(is_reference_format): Likewise.
(left_margin): Likewise.
(right_margin): Likewise.
(cobol_set_indicator_column): Likewise.
(include_debug): Likewise.
(continues_at): Likewise.
(indicated): Likewise.
(check_source_format_directive): Likewise.
(cdftext::free_form_reference_format): Likewise.
* parse.y: Tokens; program and function names; DISPLAY and ACCEPT handling.
* parse_ante.h (class tokenset_t): Removed.
(class current_tokens_t): Removed.
(field_of): Removed.
* scan.l: Token handling.
* scan_ante.h (level_found): Comment.
* scan_post.h (start_condition_str): Remove case author_state:.
* symbols.cc (symbols_update): Change error message.
(symbol_table_init): Correct and reorder entries.
(symbol_unresolved_file_key): New function definition.
(cbl_file_key_t::deforward): Change error message.
* symbols.h (symbol_unresolved_file_key): New declaration.
(keyword_tok): New function.
(redefined_token): New function.
(class current_tokens_t): New class.
* symfind.cc (symbol_match): Revise error message.
* token_names.h: Reorder and change numbers in comments.
* util.cc (class cdf_directives_t): New class.
(cobol_set_indicator_column): New function.
(cdf_source_format): New function.
(gcc_location_set_impl): Improve column handling in token_location.
(gcc_location_dump): New function.
(class temp_loc_t): Modify constructor.
(error_msg_direct): New function.
* util.h (class source_format_t): New class.

libgcobol/ChangeLog:

* libgcobol.cc (__gg__accept_envar): ACCEPT/DISPLAY environment variables.
(accept_envar): Likewise.
(default_exception_handler): Refine system log entries.
(open_syslog): Likewise.
(__gg__set_env_name): ACCEPT/DISPLAY environment variables.
(__gg__get_env_name): ACCEPT/DISPLAY environment variables.
(__gg__get_env_value): ACCEPT/DISPLAY environment variables.
(__gg__set_env_value): ACCEPT/DISPLAY environment variables.
(__gg__fprintf_stderr): Adjust __attribute__ for printf.
(__gg__set_arg_num): ACCEPT/DISPLAY command-line arguments.
(__gg__accept_arg_value): ACCEPT/DISPLAY command-line arguments.
(__gg__get_file_descriptor): DISPLAY on os_filename[] /dev device.
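One small note on the __gg__fprintf_stderr entry above: the printf format attribute it refers to is conventionally attached like this (a sketch that assumes a printf-like signature; the real declaration in libgcobol may differ):

    /* Tells GCC to type-check the format string against the variadic
       arguments at every call site.  */
    void __gg__fprintf_stderr (const char *format, ...)
         __attribute__ ((format (printf, 1, 2)));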
2025-07-09aarch64: Fix endianness of DFmode vector constantsRichard Sandiford1-0/+2
aarch64_simd_valid_imm tries to decompose a constant into a repeating series of 64 bits, since most Advanced SIMD and SVE immediate forms require that. (The exceptions are handled first.) It does this by building up a byte-level register image, lsb first. If the image does turn out to repeat every 64 bits, it loads the first 64 bits into an integer.

At this point, endianness has mostly been dealt with. Endianness applies to transfers between registers and memory, whereas at this point we're dealing purely with register values.

However, one of the things we try is to bitcast the value to a float and use FMOV. This involves splitting the value into 32-bit chunks (stored as longs) and passing them to real_from_target. The problem being fixed by this patch is that, when a value spans multiple 32-bit chunks, real_from_target expects them to be in memory rather than register order. Thus index 0 is the most significant chunk if FLOAT_WORDS_BIG_ENDIAN and the least significant chunk otherwise.

This fixes aarch64/sve/cond_fadd_1.c and various other tests for aarch64_be-elf.

gcc/
* config/aarch64/aarch64.cc (aarch64_simd_valid_imm): Account for FLOAT_WORDS_BIG_ENDIAN when building a floating-point value.
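In code terms, the fix is along these lines (a paraphrase of the idea rather than the literal hunks; variable names are illustrative):

    /* ival holds the repeating 64-bit register image, lsb first.
       real_from_target expects the 32-bit chunks in memory order,
       so the most significant chunk must come first when
       FLOAT_WORDS_BIG_ENDIAN.  */
    long chunks[2] = { (long) (ival & 0xffffffff),
                       (long) ((ival >> 32) & 0xffffffff) };
    if (FLOAT_WORDS_BIG_ENDIAN)
      std::swap (chunks[0], chunks[1]);
    REAL_VALUE_TYPE r;
    real_from_target (&r, chunks, DFmode);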
2025-07-09Fix ICE in afdo_adjust_guessed_profileJan Hubicka1-3/+7
gcc/ChangeLog: * auto-profile.cc (afdo_adjust_guessed_profile): Add forgotten if (dump_file) guard.
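The missing guard is the standard GCC dumping idiom (the message below is made up; the guarded statements in the actual function differ):

    /* dump_file is null unless a -fdump-* option is active, so all
       dump output must be conditional on it.  */
    if (dump_file)
      fprintf (dump_file, "Adjusting guessed profile\n");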
2025-07-09c++: add passing testcases [PR120243]Jason Merrill2-0/+67
These pass now; the first was fixed by r16-1507. PR c++/120243 gcc/testsuite/ChangeLog: * g++.dg/coroutines/torture/pr120243-unhandled-1.C: New test. * g++.dg/coroutines/torture/pr120243-unhandled-2.C: New test.
2025-07-09c++: generic lambda in template arg [PR121012]Jason Merrill2-0/+11
My r16-2065, which added missing errors for auto in a template arg in a lambda parameter, also introduced a bogus error on this testcase, where the auto is both in a lambda parameter and in a template arg, but in the other order, which is OK. So we should clear in_template_argument_list_p for lambdas, as we do for so many other parser flags.

PR c++/121012
PR c++/120917

gcc/cp/ChangeLog:

* parser.cc (cp_parser_lambda_expression): Clear parser->in_template_argument_list_p.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-targ17.C: New test.
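For reference, the accepted shape is roughly the following (my reconstruction, not the actual lambda-targ17.C; compile with -std=c++20):

    // The 'auto' is in a lambda parameter, and the lambda is in a
    // template argument list -- that order is OK.
    template <auto F = [] (auto x) { return x; }>
    int call () { return F (42); }

    int main () { return call<> () != 42; }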