Age | Commit message | Author | Files | Lines
2024-12-18 | c++: Speed up compilation of large char array initializers when not using #embed | Jakub Jelinek | 1 | -0/+101
The following patch (again, on top of the #embed patchset) attempts to optimize compilation of large {{{,un}signed ,}char,std::byte} array initializers when not using #embed in the source. Unlike the C patch, which is done during the parsing of initializers, this is done when lexing tokens into an array, because C++ lexes all tokens upfront and so by the time we parse the initializers we already have 16 bytes per token allocated (i.e. 32 extra compile time memory bytes per one byte in the array).

The drawback is again that it can result in worse locations for diagnostics (-Wnarrowing, -Wconversion) when initializing signed char arrays with values 128..255. Not really sure what to do about this though; unlike the C case, the locations would need to be preserved through reshape_init* and perhaps till template instantiation. For #embed, there is just a single location_t (could be range of the directive), for diagnostics perhaps we could extend it to say byte xyz of the file embedded here or something like that, but for the optimization done by this patch, either we'd need to bump the minimum limit at which to try it, or say temporarily allocate a location_t array for each byte and then clear it when we no longer need it or something.

I've been using the same testcases as for C, with #embed of 100'000'000 bytes:
time ./cc1plus -quiet -O2 -o test4a.s2 test4a.c
real 0m0.972s
user 0m0.578s
sys 0m0.195s
with xxd -i alternative of the same data without this patch it consumed around 13.2GB of RAM and
time ./cc1plus -quiet -O2 -o test4b.s4 test4b.c
real 3m47.968s
user 3m41.907s
sys 0m5.015s
and the same with this patch it consumed around 3.7GB of RAM and
time ./cc1plus -quiet -O2 -o test4b.s3 test4b.c
real 0m24.772s
user 0m23.118s
sys 0m1.495s

2024-12-18 Jakub Jelinek <jakub@redhat.com>

* parser.cc (cp_lexer_new_main): Attempt to optimize large sequences of CPP_NUMBER with int type and values 0-255 separated by CPP_COMMA into CPP_EMBED with RAW_DATA_CST u.value.
2024-12-18 | gimple-fold: Fix up decode_field_reference xor handling [PR118081] | Jakub Jelinek | 2 | -1/+29
The function comment says:
*XOR_P is to be FALSE if EXP might be a XOR used in a compare, in which case, if XOR_CMP_OP is a zero constant, it will be overridden with *PEXP, *XOR_P will be set to TRUE, and the left-hand operand of the XOR will be decoded. If *XOR_P is TRUE, XOR_CMP_OP is supposed to be NULL, and then the right-hand operand of the XOR will be decoded.
and the comment right above the xor_p handling says
/* Turn (a ^ b) [!]= 0 into a [!]= b. */
but I don't see anything that would actually check that the other operand is 0; in the testcase below it happily optimizes (a ^ 1) == 8 into a == 1. The following patch adds that check.

Note, there are various other parts of the function I'm worried about, but haven't had time to construct counterexamples yet. One worrying thing is the
/* Drop casts, only save the outermost type. We need not worry about narrowing then widening casts, or vice-versa, for those that are not essential for the compare have already been optimized out at this point. */
comment. While obviously there are various optimizations which do optimize nested casts and the like, I'm not really sure it is safe to rely on them always happening before this optimization; there are various options to disable certain optimizations and some IL could appear right before ifcombine without being optimized yet the way this routine expects. Plus, the 3 casts are looked through in between various optimizations which might make those narrowing/widening or vice versa cases necessary. Also, e.g. for the xor optimization, I think there is a difference between int a and (a ^ 0x23) == 0 and ((int) (((unsigned char) a) ^ (unsigned char) 0x23)) == 0 etc.

Another thing I'm worrying about is mixing up the different patterns together: there is the BIT_AND_EXPR handling, BIT_XOR_EXPR handling, RSHIFT_EXPR handling and then load handling. What if all 4 appear together, or 3 of them, 2 of them? Is the xor optimization still valid if there is BIT_AND_EXPR in between? I.e. instead of (a ^ 123) == 0 there is ((a ^ 123) & 234) == 0?

2024-12-18 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/118081
* gimple-fold.cc (decode_field_reference): Only set *xor_p to true if *xor_cmp_op is integer_zerop.
* gcc.dg/pr118081.c: New test.
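The missing check can be illustrated in plain C++. This is a minimal sketch and not the gimple-fold code: the helper names `xor_cmp_zero` and `xor_cmp` are hypothetical, and the values only mirror the (a ^ 1) == 8 example from the commit message.

```cpp
#include <cassert>

// Valid rewrite: (a ^ b) == 0 holds exactly when a == b.
constexpr bool xor_cmp_zero(unsigned a, unsigned b) { return (a ^ b) == 0; }

// General form (a ^ b) == c. Rewriting it to a == b is only valid for c == 0;
// for an arbitrary c the equivalent comparison is a == (b ^ c).
constexpr bool xor_cmp(unsigned a, unsigned b, unsigned c) { return (a ^ b) == c; }

static_assert(xor_cmp_zero(5u, 5u) && !xor_cmp_zero(5u, 4u),
              "(a ^ b) == 0 is the same as a == b");
static_assert(xor_cmp(9u, 1u, 8u),
              "a == 9 satisfies (a ^ 1) == 8");
static_assert(!xor_cmp(1u, 1u, 8u),
              "a == 1 does not satisfy (a ^ 1) == 8");
```

The last assertion is the miscompiled case: folding (a ^ 1) == 8 into a == 1 accepts a value that the original comparison rejects.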
2024-12-18 | PR81358: Enable automatic linking of libatomic. | Prathamesh Kulkarni | 16 | -16/+231
ChangeLog: PR driver/81358 * Makefile.def: Add dependencies so libatomic is built before target libraries are configured. * Makefile.tpl: Export TARGET_CONFIGDIRS. * configure.ac: Add libatomic to bootstrap_target_libs. * Makefile.in: Regenerate. * configure: Regenerate. gcc/ChangeLog: PR driver/81358 * common.opt: New option -flink-libatomic. * gcc.cc (LINK_LIBATOMIC_SPEC): New macro. * config/gnu-user.h (GNU_USER_TARGET_LINK_GCC_C_SEQUENCE_SPEC): Use LINK_LIBATOMIC_SPEC. * doc/invoke.texi: Document -flink-libatomic. * configure.ac: Define TARGET_PROVIDES_LIBATOMIC. * configure: Regenerate. * config.in: Regenerate. libatomic/ChangeLog: PR driver/81358 * Makefile.am: Pass -fno-link-libatomic. New rule all. * configure.ac: Assert that CFLAGS is set and pass -fno-link-libatomic. * Makefile.in: Regenerate. * configure: Regenerate. Signed-off-by: Prathamesh Kulkarni <prathameshk@nvidia.com> Co-authored-by: Matthew Malcolmson <mmalcolmson@nvidia.com>
2024-12-18 | OpenMP: Add declare variant's 'append_args' clause in C/C++ | Tobias Burnus | 15 | -265/+1197
Add the append_args clause of 'declare variant' to C and C++, fix/improve diagnostic for 'interop' clause and 'declare_variant' clauses on the way. Cleanup dispatch handling in gimplify_call_expr a bit and partially handle 'append_args'. (Namely, those parts that do not require library calls, i.e. a dispatch construct where the 'device' and 'interop' clauses have been specified.) The sorry can be removed once an enum value like omp_ipr_(ompx_gnu_)omp_device_num (cf. OpenMP Spec Issue 4451) has been added to the runtime side such that omp_get_interop_int returns the device number of an interop object (as passed to dispatch via the interop clause), and a call to GOMP_interop has been added to create interop objects. Once available, only a very localized change in gimplify_call_expr is required to claim full support, along with Fortran parsing support. gcc/c-family/ChangeLog: * c-omp.cc (c_omp_interop_t_p): Handle error_mark_node. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_clause_init_modifiers): New; split off from ... (c_parser_omp_clause_init): ... here; call it. (c_finish_omp_declare_variant): Parse 'append_args' clause. (c_parser_omp_clause_interop): Set tree used/read. gcc/cp/ChangeLog: * decl.cc (omp_declare_variant_finalize_one): Handle append_args. * parser.cc (cp_parser_omp_clause_init_modifiers): New; split off from ... (cp_parser_omp_clause_init): ... here; call it. (cp_parser_omp_all_clauses): Replace interop parsing by a call to ... (cp_parser_omp_clause_interop): ... this new function; set tree used/read. (cp_finish_omp_declare_variant): Parse 'append_args' clause. (cp_parser_omp_declare): Update comment. * pt.cc (tsubst_attribute, tsubst_omp_clauses): Handle template substitution also for declare variant's append_args clause, using for 'init' the same code as for interop's init clause. gcc/ChangeLog: * gimplify.cc (gimplify_call_expr): Update for OpenMP's append_args; cleanup of OpenMP's dispatch clause handling.
gcc/testsuite/ChangeLog: * c-c++-common/gomp/declare-variant-2.c: Update dg-error msg. * c-c++-common/gomp/dispatch-12.c: Likewise. * c-c++-common/gomp/dispatch-11.c: Likewise and extend a bit. * c-c++-common/gomp/append-args-1.c: New test. * c-c++-common/gomp/append-args-2.c: New test. * c-c++-common/gomp/append-args-3.c: New test. * g++.dg/gomp/append-args-1.C: New test. * g++.dg/gomp/append-args-2.C: New test. * g++.dg/gomp/append-args-3.C: New test.
2024-12-18 | c++: Use type_id_in_expr_sentinel in 6 further spots in the parser | Jakub Jelinek | 1 | -25/+17
The following patch uses type_id_in_expr_sentinel in a few spots which did it all manually. 2024-12-18 Jakub Jelinek <jakub@redhat.com> * parser.cc (cp_parser_postfix_expression): Use type_id_in_expr_sentinel instead of manually saving+setting/restoring parser->in_type_id_in_expr_p around cp_parser_type_id calls. (cp_parser_has_attribute_expression): Likewise. (cp_parser_cast_expression): Likewise. (cp_parser_sizeof_operand): Likewise.
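The save/set/restore pattern that such a sentinel replaces can be sketched generically. This is an illustration under assumptions, not GCC's definition of type_id_in_expr_sentinel: `toy_parser`, `flag_sentinel` and `parse_type_id_like` are hypothetical names, and only the flag name is borrowed from the ChangeLog entry.

```cpp
#include <cassert>

// Hypothetical stand-in for a parser that carries a mode flag.
struct toy_parser {
  bool in_type_id_in_expr_p = false;
};

// Saves the current flag value, sets the new one, and restores the saved
// value when the sentinel goes out of scope (including on early return).
class flag_sentinel {
  bool &flag_;
  bool saved_;

public:
  explicit flag_sentinel(bool &flag, bool value = true)
      : flag_(flag), saved_(flag) {
    flag_ = value;
  }
  ~flag_sentinel() { flag_ = saved_; }
  flag_sentinel(const flag_sentinel &) = delete;
  flag_sentinel &operator=(const flag_sentinel &) = delete;
};

// Returns the flag value observed while the sentinel is active.
inline bool parse_type_id_like(toy_parser &p) {
  flag_sentinel s(p.in_type_id_in_expr_p);
  return p.in_type_id_in_expr_p;
}
```

Compared with manual saving and restoring, the destructor makes it impossible to forget the restore on one of several exit paths.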
2024-12-18 | c++: Fix up pedantic handling of alignas [PR110345] | Jakub Jelinek | 13 | -8/+223
The following patch on top of the PR110345 P2552R3 series emits pedantic pedwarns for alignas appertaining to incorrect entities. As the middle-end and attribute exclusions look for "aligned" attribute, the patch transforms alignas into "internal "::aligned attribute (didn't use [[aligned (x)]] so that people can't type it that way). 2024-12-18 Jakub Jelinek <jakub@redhat.com> PR c++/110345 gcc/c-family/ * c-common.h (attr_aligned_exclusions): Declare. (handle_aligned_attribute): Likewise. * c-attribs.cc (handle_aligned_attribute): No longer static. (attr_aligned_exclusions): Use extern instead of static. gcc/cp/ * cp-tree.h (enum cp_tree_index): Add CPTI_INTERNAL_IDENTIFIER. (internal_identifier): Define. (internal_attribute_table): Declare. * parser.cc (cp_parser_exception_declaration): Error on alignas on exception declaration. (cp_parser_std_attribute_spec): Turn alignas into internal ns aligned attribute rather than gnu. * decl.cc (initialize_predefined_identifiers): Initialize internal_identifier. * tree.cc (handle_alignas_attribute): New function. (internal_attributes): New variable. (internal_attribute_table): Likewise. * cp-objcp-common.h (cp_objcp_attribute_table): Add internal_attribute_table entry. gcc/testsuite/ * g++.dg/cpp0x/alignas1.C: Add dg-options "". * g++.dg/cpp0x/alignas2.C: Likewise. * g++.dg/cpp0x/alignas7.C: Likewise. * g++.dg/cpp0x/alignas21.C: New test. * g++.dg/ext/bitfield9.C: Expect a warning. * g++.dg/cpp2a/is-layout-compatible3.C: Add dg-options -pedantic. Expect a warning.
2024-12-18 | c++: Add {,un}likely attribute further test coverage [PR110345] | Jakub Jelinek | 2 | -0/+298
Similarly for likely/unlikely attributes. 2024-12-18 Jakub Jelinek <jakub@redhat.com> PR c++/110345 * g++.dg/cpp0x/attr-likely1.C: New test. * g++.dg/cpp0x/attr-unlikely1.C: New test.
2024-12-18 | c++: Add fallthrough attribute further test coverage [PR110345] | Jakub Jelinek | 2 | -0/+179
Similarly for fallthrough attribute. Had to add a second testcase because the diagnostics for fallthrough not used within switch at all is done during expansion and expansion won't happen if there are other errors in the testcase. 2024-12-18 Jakub Jelinek <jakub@redhat.com> PR c++/110345 * g++.dg/cpp0x/attr-fallthrough1.C: New test. * g++.dg/cpp0x/attr-fallthrough2.C: New test.
2024-12-18 | c++: Add carries_dependency further test coverage [PR110345] | Jakub Jelinek | 1 | -0/+152
This patch adds additional test coverage for the carries_dependency attribute (unlike other attributes, the attribute actually isn't implemented for real, so we warn even in the cases of valid uses because we ignore those as well). 2024-12-18 Jakub Jelinek <jakub@redhat.com> PR c++/110345 * g++.dg/cpp0x/attr-carries_dependency2.C: New test.
2024-12-18 | c++: Handle attributes on exception declarations [PR110345] | Jakub Jelinek | 2 | -4/+175
This is a continuation of the series for the ignorability of standard attributes. I've added a test for assume attribute diagnostics appertaining to various entities (mostly invalid) and while doing that, I've discovered that attributes on exception declarations were mostly ignored, this patch adds the missing cp_decl_attributes call and also in the cp_parser_type_specifier_seq case differentiates between attributes and std_attributes to be able to differentiate between attributes which apply to the declaration using type-specifier-seq and attributes after the type specifiers. 2024-12-18 Jakub Jelinek <jakub@redhat.com> PR c++/110345 * parser.cc (cp_parser_type_specifier_seq): Chain cxx11_attribute_p attributes after any type specifier in the is_declaration case to std_attributes rather than attributes. Set also ds_attribute or ds_std_attribute locations if not yet set. (cp_parser_exception_declaration): Pass &type_specifiers.attributes instead of NULL as last argument, call cp_decl_attributes. * g++.dg/cpp0x/attr-assume1.C: New test.
2024-12-18 | c++: Diagnose attributes on class/enum declarations [PR110345] | Jakub Jelinek | 2 | -0/+19
The following testcase shows another issue where we just ignored attributes without telling the user we did that. If there are any declarators, the ignored attributes are diagnosed in grokdeclarator etc., but if there are none (and we don't error such as on int; ), the following patch emits diagnostics. 2024-12-18 Jakub Jelinek <jakub@redhat.com> PR c++/110345 * decl.cc (check_tag_decl): Diagnose std_attributes. * g++.dg/cpp0x/gen-attrs-86.C: New test.
2024-12-18 | c++: Handle enum attributes like class attributes [PR110345] | Jakub Jelinek | 2 | -6/+13
As the following testcase shows, cp_parser_decl_specifier_seq was calling warn_misplaced_attr_for_class_type only for class types and not for enum types, while check_tag_decl calls them for both class and enum types. Enum types are really the same case here, the attribute needs to go before the type name to apply to all instances of the type. Additionally, when warn_misplaced_attr_for_class_type is called, it diagnoses something and so it is fine to drop the attributes then on the floor, but in case it wasn't a type decision, it silently discarded the attributes, which is invalid for the ignorability of standard attributes paper. This patch in that case adds them to decl_specs->std_attributes and let it be diagnosed later (e.g. in grokdeclarator). 2024-12-18 Jakub Jelinek <jakub@redhat.com> PR c++/110345 * parser.cc (cp_parser_decl_specifier_seq): Call warn_misplaced_attr_for_class_type for all OVERLOAD_TYPE_P types, not just CLASS_TYPE_P. When not calling warn_misplaced_attr_for_class_type, don't clear attrs and add it to decl_specs->std_attributes instead. * g++.dg/cpp0x/gen-attrs-85.C: New test.
2024-12-18 | inline-asm: Add - constraint modifier support for toplevel extended asm [PR41045] | Jakub Jelinek | 10 | -7/+100
The following patch adds - constraint modifier support (only in toplevel asms), which one can use to allow i, s and n constraint to accept SYMBOL_REFs even with -fpic. So, the recommended way to mark toplevel asm as defining some symbol would be ":" constraint (usually with cc modifier in the pattern), while to mark toplevel asm as using some symbol (again, either function or variable), one would use "-s" constraint again with address of that function or variable. 2024-12-18 Jakub Jelinek <jakub@redhat.com> PR c/41045 gcc/ * stmt.cc (parse_output_constraint, parse_input_constraint): Handle - modifier. * recog.h (raw_constraint_p): Declare. * recog.cc (raw_constraint_p): New variable. (asm_operand_ok, constrain_operands): Handle - modifier. * common.md (i, s, n): For raw_constraint_p don't require LEGITIMATE_PIC_OPERAND_P. * doc/md.texi: Document - constraint modifier. gcc/c/ * c-typeck.cc (build_asm_expr): Reject - constraint modifier inside of a function. gcc/cp/ * semantics.cc (finish_asm_stmt): Reject - constraint modifier inside of a function. gcc/testsuite/ * c-c++-common/toplevel-asm-4.c: Add missing %cc2 use in template, add bar, x, &y operands with "-i" and "-s" constraints. (x, y): New variables. (bar): Declare. * c-c++-common/toplevel-asm-7.c: New test. * c-c++-common/toplevel-asm-8.c: New test.
2024-12-18 | inline-asm: Add support for cc operand modifier | Jakub Jelinek | 5 | -4/+27
As mentioned in the "inline asm: Add new constraint for symbol definitions" patch description, while the c operand modifier is documented to: Require a constant operand and print the constant expression with no punctuation. it actually doesn't do that with -fpic at least on some targets and has been behaving that way for at least 3 decades. It prints the operand using output_addr_const if CONSTANT_ADDRESS_P is true, but CONSTANT_ADDRESS_P can do all sorts of target specific checks. And if it is false, it falls back to output_operand (operands[opnum], 'c'); which will on various targets just result in an error that it is invalid modifier letter (weird because it is documented), on others like x86 or alpha will handle the operand in some weird way if it is a comparison and otherwise complain the argument isn't a comparison, on others like arm perhaps do what the user wanted. As I wrote, we are pretty much out of modifier letters because some targets use a lot of them, and almost out of % punctuation chars (I think ` is left) but right now punctuation chars aren't normally followed by operand number anyway. So, the following patch takes one of the generic letters (c) and adds an extra modifier char after it, I chose cc, which behaves like c but just always uses output_addr_const instead of falling back to the machine dependent code. 2024-12-18 Jakub Jelinek <jakub@redhat.com> * final.cc (output_asm_insn): Add support for cc operand modifier. * doc/extend.texi (Generic Operand Modifiers): Document cc operand modifier. * doc/md.texi (@samp{:} in constraint): Mention the cc operand modifier and add small example. * c-c++-common/toplevel-asm-4.c: Don't use -fno-pie option. Use cc modifier instead of c. (v, w): Add extern keyword. * c-c++-common/toplevel-asm-6.c: New test.
2024-12-18 | inline asm: Add new constraint for symbol definitions | Jakub Jelinek | 8 | -2/+101
The following patch on top of the PR41045 toplevel extended asm patch allows marking inline asms (both toplevel and function-local, admittedly it is less useful for the latter, so if you want, I can add restrictions) as defining symbols, either functions or variables. As most remaining constraint letters are used at least on some targets, I'm using : as the new constraint. It is similar to "s" in that it wants CONSTANT_P && !CONST_SCALAR_INT_P, but 1) it specially requires an address of a function or variable declaration, so for functions the expected use is void foo (void); ... ":" (foo) or ":" (&foo) and for variables (unless they are arrays) extern int var; ... ":" (&var) 2) it makes no sense to say that either something is defined or it is used in a register or something similar, so the patch diagnoses if one attempts to mix it with other constraints; ":,:,:" is allowed just because one could be using 3 alternatives in some other operand 3) unlike "s", the constraint doesn't check LEGITIMATE_PIC_OPERAND_P for -fpic, even in -fpic one should be able to use it the same way 4) the cgraph portion needs to be really added later 5) and last but not least, I'm afraid %c0 print modifier isn't very good for printing it; it works fine without -fpic/-fpie, but 'c' modifier is handled as if (CONSTANT_ADDRESS_P (operands[opnum])) output_addr_const (asm_out_file, operands[opnum]); else output_operand (operands[opnum], 'c'); and because at least on some arches like x86 CONSTANT_ADDRESS_P is redefined to do backend specific PIC mess, it will just output_operand and likely just be rejected (on x86 with an error that the argument is not a comparison) Guess for x86 one can use %p0 instead. But I'm afraid we are mostly out of generic modifiers, and targetm.asm_out.print_operand_punct_valid_p seems to use most of the punctuation characters as well. I think ` is unused, but wonder if we want to use up the last remaining letter that way, perhaps make %`<letter>0? 
Or extend the existing generic modifiers, keep %c0 behave as it does right now and make %cc0 be a 2 letter modifier which is PIC friendly and prints using output_addr_const anything that can be printed that way? A follow-up patch implements the %cc0 version. 2024-12-18 Jakub Jelinek <jakub@redhat.com> gcc/ * genpreds.cc (mangle): Add ':' mangling. (add_constraint): Allow : constraint. * common.md (:): New define_constraint. * stmt.cc (parse_output_constraint): Diagnose "=:". (parse_input_constraint): Handle ":" and diagnose invalid uses. * doc/md.texi (Simple Constraints): Document ":" constraint. gcc/c/ * c-typeck.cc (build_asm_expr): Diagnose invalid ":" constraint uses. gcc/cp/ * semantics.cc (finish_asm_stmt): Diagnose invalid ":" constraint uses. gcc/testsuite/ * c-c++-common/toplevel-asm-4.c: New test. * c-c++-common/toplevel-asm-5.c: New test.
2024-12-18 | libstdc++: Add inline keyword to _M_locate | Tamar Christina | 1 | -1/+1
In GCC 12 there was a ~40% regression in the performance of hashmap->find. This regression came about accidentally: Before GCC 12 the find function was small enough that IPA would inline it even though it wasn't marked inline. In GCC 12 an optimization was added to perform a linear search when the entries in the hashmap are small. This increased the size of the function enough that IPA would no longer inline it. Inlining had two benefits:

1. The return value is a reference, so it has to be returned and dereferenced even though the search loop may have already dereferenced it.
2. The pattern is a hard pattern to track for branch predictors. This causes a large number of branch misses if the value is immediately checked and branched on, i.e. if (a != m.end()), which is a common pattern.

The patch fixes both these issues by adding the inline keyword to _M_locate to allow the inliner to consider inlining again. This and the other patches have been run through several benchmarks where the size, number of elements searched for and type (reference vs value) etc. were tested. The change shows no statistical regression, but an average find improvement of ~27% and a range between ~10-60% improvements.

A selection of the results:

+-----------+--------------------+-------+----------+
| Group     | Benchmark          | Size  | % Inline |
+-----------+--------------------+-------+----------+
| Find      | unord<uint64_t     | 11274 | 53.52%   |
| Find      | unord<uint64_t     | 11254 | 47.98%   |
| Find Mult | unord<uint64_t     | 12    | 47.62%   |
| Find Mult | unord<std::string  | 12    | 44.94%   |
| Find Mult | unord<std::string  | 10    | 44.89%   |
| Find Mult | unord<uint64_t     | 11    | 40.90%   |
| Find Mult | unord<uint64_t     | 352   | 30.57%   |
| Find      | unord<uint64_t     | 351   | 28.27%   |
| Find Mult | unord<uint64_t     | 342   | 26.80%   |
| Find      | unord<std::string  | 12    | 25.66%   |
| Find Mult | unord<std::string  | 352   | 23.12%   |
| Find      | unord<std::string  | 13    | 20.36%   |
| Find Mult | unord<std::string  | 355   | 19.23%   |
| Find      | unord<std::string  | 353   | 18.59%   |
| Find      | unord<uint64_t     | 350   | 15.43%   |
| Find      | unord<std::string  | 11260 | 11.80%   |
| Find      | unord<std::string  | 352   | 11.12%   |
| Find      | unord<std::string  | 11262 | 9.97%    |
+-----------+--------------------+-------+----------+

libstdc++-v3/ChangeLog:

* include/bits/hashtable.h: Inline _M_locate.
2024-12-18 | LoongArch: Add crc tests | Xi Ruoyao | 2 | -0/+133
gcc/testsuite/ChangeLog: * g++.target/loongarch/crc.C: New test. * g++.target/loongarch/crc-scan.C: New test.
2024-12-18 | LoongArch: Combine xor and crc instructions | Xi Ruoyao | 1 | -0/+25
For a textbook-style CRC implementation:

  uint32_t crc = 0xffffffffu;
  for (size_t k = 0; k < len; k++)
    {
      crc ^= data[k];
      for (int i = 0; i < 8 * sizeof (T); i++)
        if (crc & 1)
          crc = (crc >> 1) ^ poly;
        else
          crc >>= 1;
    }
  return crc;

The generic code reports:

  Data and CRC are xor-ed before for loop. Initializing data with 0.

resulting in:

  ld.bu $t1, $a0, 0
  xor $t0, $t0, $t1
  crc.w.b.w $t0, $zero, $t0

But it's just better to use

  ld.bu $t1, $a0, 0
  crc.w.b.w $t0, $t1, $t0

instead. Implement this optimization now.

gcc/ChangeLog:

* config/loongarch/loongarch.md (*crc_combine): New define_insn_and_split.
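The identity behind this combine can be checked in portable C++. The following is a minimal sketch assuming byte-sized data elements; the function names are hypothetical, the polynomial is left to the caller, and nothing here models the LoongArch instructions themselves.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One byte of a reflected (LSB-first) CRC: xor the data in, then shift 8 times.
inline std::uint32_t crc_update_byte(std::uint32_t crc, std::uint8_t data,
                                     std::uint32_t poly) {
  crc ^= data;
  for (int i = 0; i < 8; i++)
    crc = (crc & 1u) ? (crc >> 1) ^ poly : crc >> 1;
  return crc;
}

// Whole-buffer version of the loop from the commit message (no final xor).
inline std::uint32_t crc_bitwise(const std::uint8_t *data, std::size_t len,
                                 std::uint32_t poly) {
  std::uint32_t crc = 0xffffffffu;
  for (std::size_t k = 0; k < len; k++)
    crc = crc_update_byte(crc, data[k], poly);
  return crc;
}

// Xor-ing first and then feeding zero data gives the same result as feeding
// the data byte directly, which is why the separate xor can be dropped.
inline bool xor_then_zero_matches(std::uint32_t crc, std::uint8_t data,
                                  std::uint32_t poly) {
  return crc_update_byte(crc ^ data, 0, poly) ==
         crc_update_byte(crc, data, poly);
}
```

`xor_then_zero_matches` corresponds to the two instruction sequences above: the three-instruction form computes the left-hand side, the two-instruction form the right-hand side.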
2024-12-18 | LoongArch: Add CRC expander to generate faster CRC | Xi Ruoyao | 1 | -0/+57
64-bit LoongArch has native CRC instructions for two specific polynomials. For other polynomials or 32-bit, use the generic table-based approach but optimize bit reversing. gcc/ChangeLog: * config/loongarch/loongarch.md (crc_rev<mode:SUBDI>si4): New define_expand.
2024-12-18 | LoongArch: Add bit reverse operations | Xi Ruoyao | 1 | -0/+51
LoongArch supports native bit reverse operation for QI, SI, DI, and for HI we can expand it into a shift and a bit reverse in word_mode. I was reluctant to add them because until PR50481 is fixed these operations will be just useless. But now it turns out we can use them to optimize the bit reversing CRC calculation if recognized by the generic CRC pass. So add them in preparation for the next patch adding CRC expanders. gcc/ChangeLog: * config/loongarch/loongarch.md (@rbit<mode:GPR>): New define_insn template. (rbitsi_extended): New define_insn. (rbitqi): New define_insn. (rbithi): New define_expand.
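The half-word case can be sketched in portable C++. This assumes a 32-bit wide reverse for illustration (word_mode on 64-bit LoongArch is wider) and does not claim to match the order of operations in the rbithi expander; `rbit32` and `rbit16` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Reverse all 32 bits by swapping progressively larger groups.
inline std::uint32_t rbit32(std::uint32_t x) {
  x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);  // adjacent bits
  x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);  // bit pairs
  x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);  // nibbles
  x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);  // bytes
  return (x >> 16) | (x << 16);                             // half-words
}

// A 16-bit reverse built from the wider one: after reversing in 32 bits the
// result sits in the upper half, so one shift brings it back down.
inline std::uint16_t rbit16(std::uint16_t x) {
  return static_cast<std::uint16_t>(rbit32(x) >> 16);
}
```

The same idea scales to any narrower mode: reverse in the wide mode, then shift right by the difference in widths.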
2024-12-18 | LoongArch: Remove QHSD and use QHWD instead | Xi Ruoyao | 1 | -3/+2
QHSD and QHWD are basically the same thing, but QHSD will be incorrect when we start to add LA32 support. So it's just better to always use QHWD. gcc/ChangeLog: * config/loongarch/loongarch.md (QHSD): Remove. (loongarch_<crc>_w_<size>_w): Use QHWD instead of QHSD. (loongarch_<crc>_w_<size>_w_extended): Likewise.
2024-12-18 | libstdc++: Add missing character to __to_wstring_numeric map | Jonathan Wakely | 1 | -0/+2
The mapping from char to wchar_t needs to handle 'i' and 'I' but those were absent from the table that is used for some non-ASCII encodings. libstdc++-v3/ChangeLog: * include/bits/basic_string.h (__to_wstring_numeric): Add 'i' and 'I' to mapping.
2024-12-18 | libstdc++: Call regex_traits::transform_primary() only when necessary [PR98723] | Luca Bacci | 1 | -4/+7
This is both a performance optimization and a partial fix for PR 98723. This commit fixes the issue for bracket expressions that do not depend on the locale's collation facet. Examples: * Character ranges ([a-z]) when std::regex::collate is not set * Character classes ([:alnum:]) * Individual characters ([abc]) Signed-off-by: Luca Bacci <luca.bacci982@gmail.com> libstdc++-v3/ChangeLog: PR libstdc++/98723 * include/bits/regex_compiler.tcc (_BracketMatcher::_M_apply): Only use transform_primary when an equivalence set is used.
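The three kinds of bracket expression listed above look like this from the user's side. A minimal sketch using the default ECMAScript grammar, where a character class is spelled `[[:alnum:]]` inside the brackets; the helper name `matches` is hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// True if the whole of `text` matches `pattern` (default ECMAScript grammar,
// std::regex::collate not set).
inline bool matches(const std::string &text, const std::string &pattern) {
  return std::regex_match(text, std::regex(pattern));
}

// Character range, character class, and individual characters: none of these
// bracket expressions contains an equivalence class such as [[=a=]].
inline bool demo_bracket_expressions() {
  return matches("q", "[a-z]") && matches("7", "[[:alnum:]]") &&
         matches("b", "[abc]") && !matches("d", "[abc]");
}
```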
2024-12-18 | Documentation: Fix paste-o in recent OpenMP/OpenACC patch | Sandra Loosemore | 1 | -1/+1
gcc/ChangeLog * doc/extend.texi (OpenACC): Fix paste-o.
2024-12-17 | c++: modules: Fix 32-bit overflow with 64-bit location_t [PR117970] | Lewis Hyatt | 1 | -2/+3
With the move to 64-bit location_t in r15-6016, I missed a spot in module.cc where a location_t was still being stored in a 32-bit int. Fixed. The xtreme-header* tests in modules.exp were still passing fine on lots of architectures that were tested (x86-64, i686, aarch64, sparc, riscv64), but the PR shows that they were failing in some particular risc-v multilib configurations. They pass now. gcc/cp/ChangeLog: PR c++/117970 * module.cc (module_state::read_ordinary_maps): Change argument to line_map_uint_t instead of unsigned int.
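The class of bug described here, a 64-bit value passing through a 32-bit variable, can be shown in isolation. This is a generic sketch and not the module.cc code; `location64` and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

using location64 = std::uint64_t;  // hypothetical stand-in for a 64-bit location_t

// Buggy shape: the value passes through a 32-bit variable and loses its
// high bits.
inline location64 roundtrip_narrow(location64 loc) {
  std::uint32_t narrow = static_cast<std::uint32_t>(loc);
  return narrow;
}

// Fixed shape: the intermediate is as wide as the value it holds.
inline location64 roundtrip_wide(location64 loc) {
  std::uint64_t wide = loc;
  return wide;
}
```

Values below 2^32 survive both versions, which is consistent with the tests still passing in many configurations.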
2024-12-18 | Daily bump. | GCC Administrator | 7 | -1/+237
2024-12-17 | c++: print NONTYPE_ARGUMENT_PACK [PR118073] | Marek Polacek | 2 | -0/+25
This PR points out that we're not pretty-printing NONTYPE_ARGUMENT_PACK so the compiler emits the ugly: 'nontype_argument_pack' not supported by dump_expr<expression error>> Fixed thus. I've wrapped the elements of the pack in { } because that's what cxx_pretty_printer::expression does. PR c++/118073 gcc/cp/ChangeLog: * error.cc (dump_expr) <case NONTYPE_ARGUMENT_PACK>: New case. gcc/testsuite/ChangeLog: * g++.dg/diagnostic/arg-pack1.C: New test.
2024-12-17 | libstdc++: Fix -Wparentheses warning in Debug Mode macro | Jonathan Wakely | 1 | -1/+1
libstdc++-v3/ChangeLog: * include/debug/safe_local_iterator.h (_GLIBCXX_DEBUG_VERIFY_OPERANDS): Add parentheses to avoid -Wparentheses warning.
2024-12-17 | libstdc++: Fix std::deque::insert(pos, first, last) undefined behaviour [PR118035] | Jonathan Wakely | 2 | -0/+29
Inserting an empty range into a std::deque results in undefined calls to either std::copy, std::copy_backward, std::move, or std::move_backward. We call those algos with invalid arguments where the output range is the same as the input range, e.g. std::copy(first, last, first) which violates the preconditions for the algorithms. This fix simply returns early if there's nothing to insert. Most callers already ensure that we don't even call _M_range_insert_aux with an empty range, but some callers don't. Rather than checking for n == 0 in each of the callers, this just does the check once and uses __builtin_expect to treat empty insertions as unlikely. libstdc++-v3/ChangeLog: PR libstdc++/118035 * include/bits/deque.tcc (_M_range_insert_aux): Return immediately if inserting an empty range. * testsuite/23_containers/deque/modifiers/insert/118035.cc: New test.
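The shape of the fix, one early return guarded by __builtin_expect, can be mirrored at user level. This sketch is not the libstdc++ internals: `insert_range_guarded` is a hypothetical wrapper around the public `std::deque::insert`, and `__builtin_expect` is a GCC/Clang builtin.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Inserts [first, last) before index `pos`, returning early for an empty
// range so that no copy or move algorithm ever sees it.
template <typename InputIt>
void insert_range_guarded(std::deque<int> &d, std::size_t pos, InputIt first,
                          InputIt last) {
  assert(pos <= d.size());
  if (__builtin_expect(first == last, 0))  // empty insertions are unlikely
    return;
  d.insert(d.begin() + static_cast<std::ptrdiff_t>(pos), first, last);
}
```

Doing the check once in the common helper, instead of in every caller, keeps the fast path to a single predictable branch.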
2024-12-17 | Documentation: Make OpenMP/OpenACC docs easier to find [PR26154] | Sandra Loosemore | 6 | -150/+227
PR c/26154 is one of our oldest documentation issues. The only discussion of OpenMP support in the GCC manual is buried in the "C Dialect Options" section, with nothing at all under "Extensions". The Fortran manual does have separate sections for OpenMP and OpenACC extensions so I have copy-edited/adapted that text for similar sections in the GCC manual, as well as breaking out the OpenMP and OpenACC options into their own section (they apply to all of C, C++, and Fortran). I also updated the information about what versions of OpenMP and OpenACC are supported and removed some redundant text from the Fortran manual to prevent it from getting out of sync on future updates, and inserted some cross-references to the new sections elsewhere. gcc/c-family/ChangeLog PR c/26154 * c.opt.urls: Regenerated. gcc/ChangeLog PR c/26154 * common.opt.urls: Regenerated. * doc/extend.texi (C Extensions): Adjust menu for new sections. (Attribute Syntax): Mention OpenMP directives. (Pragmas): Mention OpenMP and OpenACC directives. (OpenMP): New section. (OpenACC): New section. * doc/invoke.texi (Invoking GCC): Adjust menu for new section. (Option Summary): Move OpenMP and OpenACC options to their own category. (C Dialect Options): Move documentation for -foffload, -fopenacc, -fopenacc-dim, -fopenmp, -fopenmp-simd, and -fopenmp-target-simd-clone to... (OpenMP and OpenACC Options): ...this new section. Light copy-editing of the option descriptions. gcc/fortran/ChangeLog: PR c/26154 * gfortran.texi (Standards): Remove redundant info about OpenMP/OpenACC standard support. (OpenMP): Copy-editing and update version info. (OpenACC): Likewise. * lang.opt.urls: Regenerated.
2024-12-17 | middle-end/118062 - bogus lowering of vector compares | Richard Biener | 1 | -1/+2
The generic expand_vector_piecewise routine supports lowering of a vector operation to vector operations of smaller size. When computing the extract position from the larger vector it uses the element size in bits of the original result vector to determine the number of elements in the smaller vector. That is wrong when lowering a compare as the vector element size of a bool vector does not have to agree with that of the compare operand. The following simplifies this, fixing the error. PR middle-end/118062 * tree-vect-generic.cc (expand_vector_piecewise): Properly compute delta.
2024-12-17 | c++: ICE initializing array of aggrs [PR117985] | Marek Polacek | 3 | -0/+64
This crash started with my r12-7803 but I believe the problem lies elsewhere. build_vec_init has cleanup_flags whose purpose is -- if I grok this correctly -- to avoid destructing an object multiple times. Let's say we are initializing an array of A. Then we might end up in a scenario similar to initlist-eh1.C:

  try
    {
      call A::A in a loop
      // #0
      try
        {
          call a fn using the array
        }
      finally
        {
          // #1
          call A::~A in a loop
        }
    }
  catch
    {
      // #2
      call A::~A in a loop
    }

cleanup_flags makes us emit a statement like

  D.3048 = 2;

at #0 to disable performing the cleanup at #2, since #1 will take care of the destruction of the array.

But if we are not emitting the loop because we can use a constant initializer (and use a single { a, b, ...}), we shouldn't generate the statement resetting the iterator to its initial value. Otherwise we crash in gimplify_var_or_parm_decl because it gets the stray decl D.3048.

PR c++/117985

gcc/cp/ChangeLog:

* init.cc (build_vec_init): Pop CLEANUP_FLAGS if we're not generating the loop.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist-array23.C: New test.
* g++.dg/cpp0x/initlist-array24.C: New test.
2024-12-17[PATCH] RISC-V: optimization on checking certain bits set ((x & mask) == val)Oliver Kozul2-0/+44
The patch optimizes code generation for comparisons of the form (X & C1) == C2 by converting them to (X | ~C1) == (C2 | ~C1). C1 is a constant that requires li and addi to be loaded, while ~C1 requires a single lui instruction. As the values of C1 and C2 are not visible within the equality expression, a plus pattern is matched instead. PR target/114087 gcc/ChangeLog: * config/riscv/riscv.md (*lui_constraint<ANYI:mode>_and_to_or): New pattern. gcc/testsuite/ChangeLog: * gcc.target/riscv/pr114087-1.c: New test.
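The rewrite can be checked at the source level with a small sketch. The function names and the constants used in the usage note are assumptions for illustration, not values from the patch or its test. The sketch also assumes C2 has no bits set outside C1; without that, the two forms can disagree.

```cpp
#include <cassert>
#include <cstdint>

// Original form of the test: (X & C1) == C2.
bool check_and(std::uint32_t x, std::uint32_t c1, std::uint32_t c2) {
    return (x & c1) == c2;
}

// Rewritten form: (X | ~C1) == (C2 | ~C1).
bool check_or(std::uint32_t x, std::uint32_t c1, std::uint32_t c2) {
    return (x | ~c1) == (c2 | ~c1);
}

// Compares both forms over a sample of inputs: every low 16-bit value
// combined with a few arbitrary high halves. Returns false on the first
// input where the two forms give different answers.
bool forms_agree(std::uint32_t c1, std::uint32_t c2) {
    const std::uint32_t high_patterns[] = {0x00000000u, 0xFFFF0000u, 0xEDCB0000u};
    for (std::uint32_t high : high_patterns) {
        for (std::uint32_t low = 0; low <= 0xFFFFu; ++low) {
            const std::uint32_t x = high | low;
            if (check_and(x, c1, c2) != check_or(x, c1, c2)) return false;
        }
    }
    return true;
}
```

For example, with the assumed mask C1 = 0xFFFFEFFF the complement is 0x00001000, which has its low 12 bits clear, the shape a single lui can produce. Calling forms_agree(0xFFFFEFFFu, 0x234u) exercises the case where C2 lies inside C1.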
2024-12-17RISC-V: Remove svvptc from riscv-ext-bitmask.defYangyu Chen1-1/+0
There should be no svvptc in the riscv-ext-bitmask.def file since it has not yet been added to the RISC-V C API Specification or the Linux hwprobe. And there is no need for userspace software to know that this extension exists. So remove it from the riscv-ext-bitmask.def file. Fixes: e4f4b2dc08 ("RISC-V: Minimal support for svvptc extension.") Signed-off-by: Yangyu Chen <cyy@cyyself.name> gcc/ChangeLog: * common/config/riscv/riscv-ext-bitmask.def (RISCV_EXT_BITMASK): Remove svvptc.
2024-12-17testsuite: arm: Mark pr81812.C as xfail for thumb1Torbjörn SVENSSON1-0/+2
Test fails for Cortex-M0 with: .../pr81812.C:6:8: error: generic thunk code fails for method 'virtual void ChildNode::_ZTv0_n12_NK9ChildNode5errorEz(...) const' which uses '...' According to PR108277, it's expected that thumb1 targets do not support empty virtual functions with ellipsis. gcc/testsuite/ChangeLog: * g++.dg/torture/pr81812.C: Add xfail for thumb1. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
2024-12-17[PATCH v2 2/2] RISC-V: Add Tenstorrent Ascalon 8 wide architectureAnton Blanchard4-1/+108
This adds the Tenstorrent Ascalon 8 wide architecture (tt-ascalon-d8) to the list of known cores. gcc/ChangeLog: * config/riscv/riscv-cores.def: Add tt-ascalon-d8. * config/riscv/riscv.cc (tt_ascalon_d8_tune_info): New. * doc/invoke.texi (RISC-V): Add tt-ascalon-d8 to -mcpu. gcc/testsuite/ChangeLog: * gcc.target/riscv/mcpu-tt-ascalon-d8.c: New test.
2024-12-17[PATCH v2 1/2] RISC-V: Document thead-c906, xiangshan-nanhu, and generic-oooAnton Blanchard1-5/+6
gcc/ChangeLog * doc/invoke.texi (RISC-V): Add thead-c906, xiangshan-nanhu to -mcpu, add generic-ooo and remove thead-c906 from -mtune.
2024-12-17testsuite: arm: Add -mtune to all arm_cpu_* effective targetsTorbjörn SVENSSON1-12/+15
Fixes Linaro CI reported regression on r15-6164-gbdf75257aad2 in https://linaro.atlassian.net/browse/GNU-1463. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Added corresponding -mtune= option for each of the arm_cpu_* effective targets. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
2024-12-17RISC-V: Add new constraint R for register even-odd pairsKito Cheng3-0/+30
Although this constraint is not currently used for any instructions, it is very useful for custom instructions. Additionally, some new standard extensions (not yet upstream), such as `Zilsd` and `Zclsd`, are potential users of this constraint. Therefore, I believe there is sufficient justification to add it now. gcc/ChangeLog: * config/riscv/constraints.md (R): New constraint. * doc/md.texi: Document new constraint `R`. gcc/testsuite/ChangeLog: * gcc.target/riscv/constraint-R.c: New.
2024-12-17RISC-V: Implement N modifier for printing the register number rather than the register nameKito Cheng5-0/+74
The modifier `N` prints the raw encoding of a register. This is used with `.insn <length>, <encoding>`, where the user wants to pass a value to the instruction in a known register, but where the instruction doesn't follow the existing instruction formats, so the assembly parser is not expecting a register name, just a raw integer. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_print_operand): Add N. * doc/extend.texi: Document N. gcc/testsuite/ChangeLog: * gcc.target/riscv/modifier-N-fpr.c: New. * gcc.target/riscv/modifier-N-vr.c: New. * gcc.target/riscv/modifier-N.c: New.
2024-12-17RISC-V: Rename internal operand modifier N to nKito Cheng3-5/+5
There is a proposal to use `N` for printing the register encoding number, so rename the existing internal operand modifier `N` to `n`. gcc/ChangeLog: * config/riscv/corev.md (*cv_branch<mode>): Update modifier. (*branch<mode>): Ditto. * config/riscv/riscv.cc (riscv_print_operand): Update modifier. * config/riscv/riscv.md (*branch<mode>): Update modifier.
2024-12-17RISC-V: Add cr and cf constraintKito Cheng7-11/+77
gcc/ChangeLog: * config/riscv/constraints.md (cr): New. (cf): New. * config/riscv/riscv.h (reg_class): Add RVC_GR_REGS and RVC_FP_REGS. (REG_CLASS_NAMES): Ditto. (REG_CLASS_CONTENTS): Ditto. * doc/md.texi: Document cr and cf constraint. * config/riscv/riscv.cc (riscv_regno_to_class): Update FP_REGS to RVC_FP_REGS since it is a smaller set. (riscv_secondary_memory_needed): Handle RVC_FP_REGS. (riscv_register_move_cost): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/constraint-cf-zfinx.c: New. * gcc.target/riscv/constraint-cf.c: New. * gcc.target/riscv/constraint-cr.c: New.
2024-12-17RISC-V: Rename constraint c0* to k0*Kito Cheng4-233/+233
Rename these constraints since we want to define other constraints starting with `c`. These constraints are internal and undocumented, so it is fine to rename them. gcc/ChangeLog: * config/riscv/constraints.md (c01): Rename to... (k01): ...this. (c02): Rename to... (k02): ...this. (c03): Rename to... (k03): ...this. (c04): Rename to... (k04): ...this. (c08): Rename to... (k08): ...this. * config/riscv/corev.md (riscv_cv_simd_add_h_si): Update constraints. (riscv_cv_simd_sub_h_si): Ditto. (riscv_cv_simd_cplxmul_i_si): Ditto. (riscv_cv_simd_subrotmj_si): Ditto. * config/riscv/riscv-v.cc (splat_to_scalar_move_p): Update constraints. * config/riscv/vector-iterators.md (stride_load_constraint): Update constraints. (stride_store_constraint): Ditto.
2024-12-17ipa: Improve how we derive value ranges from IPA invariantsMartin Jambor6-26/+67
I believe that the current function ipa_range_set_and_normalize lacks a check that the base of an ADDR_EXPR really cannot be NULL, so this patch adds it. Moreover, I never liked the name as I do not think it makes the value of ranges any more normal but rather just special-cases non-zero ip_invariant pointers. Therefore, I have given it a different name and moved it to a .cc file; our LTO bootstrap should inline (and/or split) it if necessary anyway. Because, as Honza correctly pointed out, deriving non-NULLness from a pointer depends on flag_delete_null_pointer_checks which is an optimization flag and thus depends on a given function, in this version of the patch ipa_get_range_from_ip_invariant gets a context_node parameter for that purpose. This then needs to be used within symtab_node::nonzero_address which gets a special overload in which the value of the flag can be provided as a parameter. gcc/ChangeLog: 2024-12-11 Martin Jambor <mjambor@suse.cz> * cgraph.h (symtab_node): Add a new overload of nonzero_address. * symtab.cc (symtab_node::nonzero_address): Add a new overload with a parameter for delete_null_pointer_checks. Make the original overload call the new one which retains the actual implementation. * ipa-prop.h (ipa_get_range_from_ip_invariant): Declare. (ipa_range_set_and_normalize): Remove. * ipa-prop.cc (ipa_get_range_from_ip_invariant): New function. (ipa_range_set_and_normalize): Remove. * ipa-cp.cc (ipa_vr_intersect_with_arith_jfunc): Add a new parameter context_node. Use ipa_get_range_from_ip_invariant instead of ipa_range_set_and_normalize and pass to it the new parameter. (ipa_value_range_from_jfunc): Pass cs->caller as the context_node to ipa_vr_intersect_with_arith_jfunc. (propagate_vr_across_jump_function): Likewise. (ipa_get_range_from_ip_invariant): New function. * ipa-fnsummary.cc (evaluate_conditions_for_known_args): Use ipa_get_range_from_ip_invariant instead of ipa_range_set_and_normalize.
2024-12-17ipa: Better value ranges for pointer integer constantsMartin Jambor1-19/+16
When looking into cases where we know an actual argument of a call is a constant but we don't generate a singleton value-range for the jump function, I found out that the special handling of pointer constants does not work well for constant zero pointer values. In fact the code only attempts to see if it can figure out that an argument is not zero and if it can figure out any alignment information. With this patch, we try to use the value_range that ranger can give us in the jump function if we can and we query ranger for all kinds of arguments, not just SSA_NAMES (and so also pointer integer constants). If we cannot figure out a useful range we fall back again on figuring out non-NULLness with tree_single_nonzero_warnv_p. With this patch, we generate [prange] struct S * [0, 0] MASK 0x0 VALUE 0x0 instead of for example: [prange] struct S * [0, +INF] MASK 0xfffffffffffffff0 VALUE 0x0 for a zero constant passed in a call. If you are wondering why we check whether the value range obtained from range_of_expr can be undefined, even when the function returns true, that is because that can apparently happen for default-definition SSA_NAMEs. gcc/ChangeLog: 2024-11-15 Martin Jambor <mjambor@suse.cz> * ipa-prop.cc (ipa_compute_jump_functions_for_edge): Try harder to use the value range obtained from ranger for pointer values.
2024-12-17ipa: Skip widening type conversions in jump function constructionsMartin Jambor2-0/+88
Originally, we did not stream any formal parameter types into WPA and were generally very conservative when it came to type mismatches in IPA-CP. Over time, mismatches that happened in code and blew up in WPA made us much more resilient and also made us stream the types of the parameters, which we now use commonly. With that information, we can safely skip conversions when looking at the IL from which we build jump functions and then simply fold convert the constants and ranges to the resulting type, as long as we are careful that performing the corresponding folding of constants gives the corresponding results. In order to do that, we must ensure that the old value can be represented in the new one without any loss. With this change, we can nicely propagate non-NULLness in IPA-VR as demonstrated with the new test case. I have gone through all other uses of (all components of) jump functions which could be affected by this and verified they do indeed check types and can handle mismatches. gcc/ChangeLog: 2024-12-11 Martin Jambor <mjambor@suse.cz> * ipa-prop.cc: Include vr-values.h. (skip_a_safe_conversion_op): New function. (ipa_compute_jump_functions_for_edge): Use it. gcc/testsuite/ChangeLog: 2024-11-01 Martin Jambor <mjambor@suse.cz> * gcc.dg/ipa/vrp9.c: New test.
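The "representable without any loss" condition can be illustrated for plain integer types. The helper below is a hypothetical source-level analogue written with std::numeric_limits; it is not the logic of skip_a_safe_conversion_op, which works on GCC's internal types.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper: true when every value of integer type From can be
// represented in integer type To, so converting cannot change the value.
template <class From, class To>
constexpr bool is_value_preserving() {
    static_assert(std::is_integral<From>::value && std::is_integral<To>::value,
                  "integer types only");
    using FromLimits = std::numeric_limits<From>;
    using ToLimits = std::numeric_limits<To>;
    // A signed source can hold negative values that no unsigned target
    // can represent. Otherwise the target needs at least as many value
    // bits (digits excludes the sign bit) as the source.
    return !(FromLimits::is_signed && !ToLimits::is_signed) &&
           FromLimits::digits <= ToLimits::digits;
}
```

Under this definition a widening such as std::int32_t to std::int64_t qualifies, while std::int32_t to std::uint32_t does not, because negative values would change.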
2024-12-17c++: Diagnose earlier non-static data members with cv containing class type [PR116108]Jakub Jelinek2-1/+9
In r10-6457 aka PR92593 fix a check has been added to reject earlier non-static data members with current_class_type in templates, as the deduction then can result in endless recursion in reshape_init. It fixed the template <class T> struct S { S s = 1; }; S t{2}; crashes, but as the following testcase shows, didn't catch the case where there are cv qualifiers on the non-static data member. Fixed by using TYPE_MAIN_VARIANT. 2024-12-17 Jakub Jelinek <jakub@redhat.com> PR c++/116108 gcc/cp/ * decl.cc (grokdeclarator): Pass TYPE_MAIN_VARIANT (type) rather than type to same_type_p when checking if the non-static data member doesn't have current class type. gcc/testsuite/ * g++.dg/cpp1z/class-deduction117.C: New test.
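Why the cv-qualified member slipped past a plain type comparison can be shown with a standard-library analogy. Here std::remove_cv plays a role loosely similar to TYPE_MAIN_VARIANT; the helper name same_ignoring_cv and the stand-in class S are assumptions for illustration, and a member such as `const S s = 1;` is only a guess at the kind of declaration the new test covers.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in for the class that is currently being defined. An incomplete
// type is enough because only type identity is compared.
struct S;

// Hypothetical helper: compares a member's type with the class type after
// stripping top-level const and volatile from the member's type.
template <class Member, class Class>
constexpr bool same_ignoring_cv() {
    return std::is_same<typename std::remove_cv<Member>::type, Class>::value;
}

// A direct comparison treats the cv-qualified type as a different type.
static_assert(!std::is_same<const S, S>::value,
              "const S and S are distinct types");
```

With the qualifiers stripped, `const S` and `volatile S` both compare equal to `S`, which is the comparison the check needs.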
2024-12-17Fortran: Fix associate with derived type array constructor [PR117347]Andre Vehreschild2-0/+40
gcc/fortran/ChangeLog: PR fortran/117347 * primary.cc (gfc_match_varspec): Add array constructors for guessing their type like with unresolved function calls. gcc/testsuite/ChangeLog: * gfortran.dg/associate_71.f90: New test.
2024-12-17Daily bump.GCC Administrator8-1/+340
2024-12-16Update cpplib sr.poJoseph Myers1-30/+19
* sr.po: Update.