Age | Commit message | Author | Files | Lines
2025-04-17[PATCH] RISC-V: Do not free a riscv_arch_string when handling target-arch attribute翁愷邑1-5/+1
The build_target_option_node() function may return a cached node when fndecls have the same effective global_options. Therefore, freeing memory used in target nodes can lead to a use-after-free issue, as a target node may be shared by multiple fndecls. This issue occurs in gcc.target/riscv/target-attr-16.c, where all functions have the same march, but the last function tries to free its old x_riscv_arch_string (which is shared) when processing the second target attribute. However, the behavior of this issue depends on how the OS handles malloc. It's very likely that xstrdup returns the old address just freed, coincidentally hiding the issue. We can verify the issue by forcing xstrdup to return a new address, e.g., - if (opts->x_riscv_arch_string != default_opts->x_riscv_arch_string) - free (CONST_CAST (void *, (const void *) opts->x_riscv_arch_string)); + // Force it to use a new address, NFCI + const char *tmp = opts->x_riscv_arch_string; opts->x_riscv_arch_string = xstrdup (local_arch_str); + if (tmp != default_opts->x_riscv_arch_string) + free (CONST_CAST (void *, (const void *) tmp)); This patch replaces xstrdup with ggc_strdup and lets the GC take care of unused strings. gcc/ChangeLog: * config/riscv/riscv-target-attr.cc (riscv_target_attr_parser::update_settings): Do not manually free any arch string.
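The sharing hazard described in this commit can be sketched in a few lines of C++. This is an illustrative model only, not GCC code: `TargetNode`, `NodeCache`, and `node_for` are hypothetical names standing in for cached target option nodes, and a `std::deque<std::string>` arena plays the role of the garbage collector that owns the strings.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <string>

// Hypothetical stand-in for a target option node holding an arch string.
struct TargetNode {
  const char *arch_string;
};

// Hypothetical cache: equal option strings map to one shared node, loosely
// mirroring how a cached node may be handed out to several functions.
class NodeCache {
public:
  // Returns the node for `arch`, creating it on first use.
  TargetNode *node_for(const std::string &arch) {
    auto it = nodes_.find(arch);
    if (it != nodes_.end())
      return &it->second;
    // The arena owns the string; no caller frees it by hand.
    arena_.push_back(arch);
    TargetNode node{arena_.back().c_str()};
    return &nodes_.emplace(arch, node).first->second;
  }

private:
  std::deque<std::string> arena_;            // elements never move on push_back
  std::map<std::string, TargetNode> nodes_;  // node addresses stay stable
};
```

Two lookups with the same string return the same node, so a caller that freed `arch_string` on behalf of one function would leave every other holder of that node with a dangling pointer. Leaving reclamation to a single owner avoids that.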
2025-04-17c++: constexpr virtual base diagnosticJason Merrill3-3/+10
I thought this diagnostic could be clearer that the problem is the combination of virtual bases and constexpr constructor, not just complain that the class has virtual bases without context. gcc/cp/ChangeLog: * constexpr.cc (is_valid_constexpr_fn): Improve diagnostic. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/constexpr-dtor16.C: Adjust diagnostic. * g++.dg/cpp2a/constexpr-dynamic10.C: Likewise.
2025-04-17Document peculiarities of BOOLEAN_TYPEEric Botcazou1-1/+5
gcc/ * tree.def (BOOLEAN_TYPE): Add more details.
2025-04-17c++: constexpr new diagnostic locationJason Merrill8-21/+31
Presenting the allocation location as the location of the outermost expression we're trying to evaluate is inaccurate; let's provide both locations. gcc/cp/ChangeLog: * constexpr.cc (cxx_eval_outermost_constant_expr): Give both expression and allocation location in allocated storage diagnostics. gcc/testsuite/ChangeLog: * g++.dg/cpp1y/constexpr-new.C: Adjust diagnostics. * g++.dg/cpp1z/constexpr-asm-5.C: Likewise. * g++.dg/cpp26/static_assert1.C: Likewise. * g++.dg/cpp2a/constexpr-dtor7.C: Likewise. * g++.dg/cpp2a/constexpr-new26.C: Likewise. * g++.dg/cpp2a/constexpr-new3.C: Likewise. * g++.dg/cpp2a/constinit14.C: Likewise.
2025-04-17c++: vec_safe_reserve usage tweaksJason Merrill2-13/+4
A couple of cleanups from noticing that the semantics of std::vector<T>::reserve() (request the new minimum allocation) differ from the GCC vec<...>::reserve() (request a minimum number of slots available). In preserve_state, we were tripling the size of the vec when doubling it is more than enough. In get_tinfo_desc we were using vec_safe_reserve properly, but it's simpler to use vec_safe_grow_cleared. gcc/cp/ChangeLog: * name-lookup.cc (name_lookup::preserve_state): Fix reserve call. * rtti.cc (get_tinfo_desc): Use vec_safe_grow_cleared.
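The difference between the two reserve() meanings can be sketched with std::vector alone. The helpers `reserve_additional` and `free_slots` are hypothetical names introduced here for illustration; they only mimic the "minimum number of slots available" reading and are not GCC's vec API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// std::vector<T>::reserve(n) asks for a total capacity of at least n.
// This hypothetical helper instead asks for room for `extra` more elements
// beyond the current size, i.e. the "slots available" reading.
template <typename T>
void reserve_additional(std::vector<T> &v, std::size_t extra) {
  v.reserve(v.size() + extra);
}

// Number of elements that can still be appended without reallocating.
template <typename T>
std::size_t free_slots(const std::vector<T> &v) {
  return v.capacity() - v.size();
}
```

Under the "slots available" reading, asking for twice the current length leaves room for three times the length in total, which would be consistent with the tripling mentioned in the message.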
2025-04-17c++: improve pack index diagnosticsJason Merrill5-10/+20
While looking at pack-indexing16.C I thought it would be helpful to print the problematic type/value. gcc/cp/ChangeLog: * semantics.cc (finish_type_pack_element): Add more info to diagnostics. libstdc++-v3/ChangeLog: * testsuite/20_util/tuple/element_access/get_neg.cc: Adjust diagnostic. gcc/testsuite/ChangeLog: * g++.dg/cpp26/pack-indexing2.C: Adjust diagnostics. * g++.dg/ext/type_pack_element2.C: Likewise. * g++.dg/ext/type_pack_element4.C: Likewise.
2025-04-17c++: add assert to cp_make_fname_declJason Merrill1-0/+2
In the PR118629 testcase, pushdecl_outermost_localscope was failing and returning error_mark_node without ever actually giving an error; in addition to my earlier fix for the failure, make sure failures aren't silent. gcc/cp/ChangeLog: * decl.cc (cp_make_fname_decl): Prevent silent failure.
2025-04-17c++: 'requires' diagnostic before C++20Jason Merrill1-0/+3
We were giving a generic "not declared" error for a requires-expression without concepts enabled; we can do better. gcc/cp/ChangeLog: * lex.cc (unqualified_name_lookup_error): Handle 'requires' better.
2025-04-17doc: say "compatible types" for -fstrict-aliasingSam James1-6/+8
Include the term used in the standard to ease further research for users, and while at it, rephrase the description of the rule entirely using Alexander Monakov's suggestion: it was previously wrong (and imprecise) as "the same address" may well be re-used later on, and the issue is the access via an expression of the wrong type. gcc/ChangeLog: * doc/invoke.texi: Use "compatible types" term. Rephrase to be more precise (and correct).
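As a rough illustration of "access via an expression of the wrong type", the sketch below reads the bytes of a float without such an access. It is written in C++ (which words the rule slightly differently from C), `float_bits` is a hypothetical helper name, and the sketch assumes a 32-bit float.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads the object representation of a float without an aliasing violation:
// std::memcpy copies bytes, so the float object is never accessed through an
// lvalue of type uint32_t.
std::uint32_t float_bits(float f) {
  static_assert(sizeof(std::uint32_t) == sizeof(float),
                "sketch assumes a 32-bit float");
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;
}

// By contrast, `*reinterpret_cast<std::uint32_t *>(&f)` would access a float
// object through an expression of an incompatible type, which is the kind of
// access -fstrict-aliasing lets the compiler assume never happens.
```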
2025-04-17ada: bump Library_Version to 16.Jakub Jelinek1-1/+1
gcc/ada/ChangeLog: * gnatvsn.ads: Bump Library_Version to 16.
2025-04-17Update crontab and git_update_version.pyJakub Jelinek2-6/+7
2025-04-17 Jakub Jelinek <jakub@redhat.com> maintainer-scripts/ * crontab: Snapshots from trunk are now GCC 16 related. Add GCC 15 snapshots from the respective branch. contrib/ * gcc-changelog/git_update_version.py (active_refs): Add releases/gcc-15.
2025-04-17Bump BASE-VER.basepoints/gcc-16Jakub Jelinek1-1/+1
2025-04-17 Jakub Jelinek <jakub@redhat.com> * BASE-VER: Set to 16.0.0.
2025-04-17libgomp: Don't test ompx::allocator::gnu_pinned_mem on non-linux targets.Jakub Jelinek2-0/+22
The libgomp.c/alloc-pinned*.c tests have /* { dg-skip-if "Pinning not implemented on this host" { ! *-*-linux-gnu* } } */ so they are only run on Linux targets right now. Duplicating the tests or reworking them into headers looked like too much work for me right now this late in stage4, so I've just #ifdefed the uses at least for now. 2025-04-17 Jakub Jelinek <jakub@redhat.com> PR libgomp/119849 * testsuite/libgomp.c++/allocator-1.C (test_inequality, main): Guard ompx::allocator::gnu_pinned_mem uses with #ifdef __gnu_linux__. * testsuite/libgomp.c++/allocator-2.C (main): Likewise.
2025-04-17libstdc++: Fixed signed comparison in _M_parse_fill_and_align [PR119840]Tomasz Kamiński1-2/+2
Explicitly cast elements of __not_fill to _CharT. Only '{' and ':' are used as `__not_fill`, so they are never negative. PR libstdc++/119840 libstdc++-v3/ChangeLog: * include/std/format (_M_parse_fill_and_align): Cast elements of __not_fill to _CharT.
2025-04-17middle-end: fix masking for partial vectors and early break [PR119351]Tamar Christina3-24/+99
The following testcase shows an incorrect masked codegen: #define N 512 #define START 1 #define END 505 int x[N] __attribute__((aligned(32))); int __attribute__((noipa)) foo (void) { int z = 0; for (unsigned int i = START; i < END; ++i) { z++; if (x[i] > 0) continue; return z; } return -1; } notice how there's a continue there instead of a break. This means we generate a control flow where success stays within the loop iteration: mask_patt_9.12_46 = vect__1.11_45 > { 0, 0, 0, 0 }; vec_mask_and_47 = mask_patt_9.12_46 & loop_mask_41; if (vec_mask_and_47 == { -1, -1, -1, -1 }) goto <bb 4>; [41.48%] else goto <bb 15>; [58.52%] However when loop_mask_41 is a partial mask this comparison can lead to an incorrect match. In this case the mask is: # loop_mask_41 = PHI <next_mask_63(6), { 0, -1, -1, -1 }(2)> due to peeling for alignment with masking and compiling with -msve-vector-bits=128. At codegen time we generate: ptrue p15.s, vl4 ptrue p7.b, vl1 not p7.b, p15/z, p7.b .L5: ld1w z29.s, p7/z, [x1, x0, lsl 2] cmpgt p7.s, p7/z, z29.s, #0 not p7.b, p15/z, p7.b ptest p15, p7.b b.none .L2 ...<early exit>... Here the basic blocks are rotated and a not is generated. But the generated not is unmasked (or predicated over an ALL true mask in this case). This has the unintended side-effect of flipping the results of the inactive lanes (which were zero'd by the cmpgt) into -1. Which then incorrectly causes us to not take the branch to .L2. This is happening because we're not comparing against the right value for the forall case. This patch gets rid of the forall case by rewriting the if(all(mask)) into if (!all(mask)) which is the same as if (any(~mask)) by negating the masks and flipping the branches. 1. For unmasked loops we simply reduce the ~mask. 2. For masked loops we reduce (~mask & loop_mask) which is the same as doing (mask & loop_mask) ^ loop_mask. 
For the above we now generate: .L5: ld1w z28.s, p7/z, [x1, x0, lsl 2] cmple p7.s, p7/z, z28.s, #0 ptest p15, p7.b b.none .L2 This fixes gromacs with > 1 OpenMP threads and improves performance. gcc/ChangeLog: PR tree-optimization/119351 * tree-vect-stmts.cc (vectorizable_early_exit): Mask both operands of the gcond for partial masking support. gcc/testsuite/ChangeLog: PR tree-optimization/119351 * gcc.target/aarch64/sve/pr119351.c: New test. * gcc.target/aarch64/sve/pr119351_run.c: New test.
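The rewrite from the forall form to the any form can be modelled on plain boolean lanes. The sketch below is illustrative only: `Mask`, `kLanes`, and the function names are hypothetical, and it models the logic of the transformation rather than the vectorizer's actual GIMPLE.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kLanes = 4;        // hypothetical vector width
using Mask = std::array<bool, kLanes>;   // one flag per lane

Mask mask_and(const Mask &a, const Mask &b) {
  Mask r{};
  for (std::size_t i = 0; i < kLanes; ++i) r[i] = a[i] && b[i];
  return r;
}

Mask mask_xor(const Mask &a, const Mask &b) {
  Mask r{};
  for (std::size_t i = 0; i < kLanes; ++i) r[i] = a[i] != b[i];
  return r;
}

Mask mask_not(const Mask &a) {
  Mask r{};
  for (std::size_t i = 0; i < kLanes; ++i) r[i] = !a[i];
  return r;
}

bool all_lanes(const Mask &m) {
  for (bool b : m) if (!b) return false;
  return true;
}

bool any_lane(const Mask &m) {
  for (bool b : m) if (b) return true;
  return false;
}

// Old shape: stay in the loop only if every lane of (cond & loop_mask) is set.
// Inactive lanes are cleared by the AND, so a partial loop mask never passes.
bool stays_in_loop_forall(const Mask &cond, const Mask &loop_mask) {
  return all_lanes(mask_and(cond, loop_mask));
}

// New shape: leave the loop if any active lane fails the condition.
bool takes_exit_any(const Mask &cond, const Mask &loop_mask) {
  return any_lane(mask_and(mask_not(cond), loop_mask));
}

// Equivalent formulation from the message: (cond & loop_mask) ^ loop_mask.
bool takes_exit_xor(const Mask &cond, const Mask &loop_mask) {
  return any_lane(mask_xor(mask_and(cond, loop_mask), loop_mask));
}
```

With a partial loop mask whose first lane is inactive, the forall form refuses to stay in the loop even when every active lane passes, while the any form only reacts to failures in active lanes.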
2025-04-17libstdc++: Do not use 'not' alternative token in <format>Jonathan Wakely2-1/+17
This fixes: FAIL: 17_intro/headers/c++1998/operator_names.cc -std=gnu++23 (test for excess errors) FAIL: 17_intro/headers/c++1998/operator_names.cc -std=gnu++26 (test for excess errors) The purpose of 'not defined<format_kind<R>>' is to be ill-formed (as required by [format.range.fmtkind]) and to give an error that includes the string "not defined<format_kind<R>>". That was intended to tell you that format_kind<R> is not defined, just like it says! But user code can use -fno-operator-names so we can't use 'not' here, and "! defined" in the diagnostic doesn't seem as user-friendly. It also raises questions about whether it was intended to be the preprocessor token 'defined' (it's not) or where 'defined' is defined (it's not). Replace it with __primary_template_not_defined<format_kind<R>> and a comment, which seems to give a fairly clear diagnostic with both GCC and Clang. The diagnostic now looks like: .../include/c++/15.0.1/format:5165:7: error: use of 'std::format_kind<int>' before deduction of 'auto' 5165 | format_kind<_Rg> // you can specialize this for non-const input ranges | ^~~~~~~~~~~~~~~~ .../include/c++/15.0.1/format:5164:35: error: '__primary_template_not_defined' was not declared in this scope 5164 | __primary_template_not_defined( | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^ 5165 | format_kind<_Rg> // you can specialize this for non-const input ranges | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 5166 | ); | ~ libstdc++-v3/ChangeLog: * include/std/format (format_kind): Do not use 'not' alternative token to make the primary template ill-formed. Use the undeclared identifier __primary_template_not_defined and a comment that will appear in diagnostics. * testsuite/std/format/ranges/format_kind_neg.cc: New test.
2025-04-17s390: Use match_scratch instead of scratch in define_split [PR119834]Jakub Jelinek2-11/+87
The following testcase ICEs since r15-1579 (addition of late combiner), because *clrmem_short can't be split. The problem is that the define_insn uses (use (match_operand 1 "nonmemory_operand" "n,a,a,a")) (use (match_operand 2 "immediate_operand" "X,R,X,X")) (clobber (match_scratch:P 3 "=X,X,X,&a")) and define_split assumed that if operands[1] is const_int_operand, match_scratch will be always scratch, and it will be reg only if it was the last alternative where operands[1] is a reg. The pattern doesn't guarantee it though, of course RA will not try to uselessly assign a reg there if it is not needed, but during RA on the testcase below we match the last alternative, but then comes late combiner and propagates const_int 3 into operands[1]. And that matches fine, match_scratch matches either scratch or reg and the constraint in that case is X for the first variant, so still just fine. But we won't split that because the splitters only expect scratch. The following patch fixes it by using match_scratch instead of scratch, so that it accepts either. 2025-04-17 Jakub Jelinek <jakub@redhat.com> PR target/119834 * config/s390/s390.md (define_split after *cpymem_short): Use (clobber (match_scratch N)) instead of (clobber (scratch)). Use (match_dup 4) and operands[4] instead of (match_dup 3) and operands[3] in the last of those. (define_split after *clrmem_short): Use (clobber (match_scratch N)) instead of (clobber (scratch)). (define_split after *cmpmem_short): Likewise. * g++.target/s390/pr119834.C: New test.
2025-04-17libstdc++: Remove dead code in range_formatter::format [PR109162]Tomasz Kamiński1-7/+12
Because the _M_format(__rg, __fc) call was placed outside of if constexpr, this method and its children were instantiated, even if _M_format<const _Range> could be used. To simplify the if constexpr chain, we introduce a __simply_formattable_range (name based on simple-view) exposition-only concept that checks whether the range is both const and mutable formattable and uses the same formatter specialization for references in each case. PR libstdc++/109162 libstdc++-v3/ChangeLog: * include/std/format (__format::__simply_formattable_range): Define. (range_formatter::format): Do not instantiate _M_format for mutable _Rg if const _Rg can be used. Reviewed-by: Jonathan Wakely <jwakely@redhat.com> Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
2025-04-17nvptx: Remove 'TARGET_ASM_NEED_VAR_DECL_BEFORE_USE'Thomas Schwinge1-2/+0
Unused; remnant of an (internal) experiment, before we had nvptx 'as'. gcc/ * config/nvptx/nvptx.cc (TARGET_ASM_NEED_VAR_DECL_BEFORE_USE): Don't '#define'.
2025-04-17libgomp.texi: For HIP interop, mention cpp defines to setTobias Burnus1-0/+6
The HIP header files recognize the used compiler, defaulting to either AMD or Nvidia/CUDA; thus, the alternative way of explicitly defining a macro is less prominently documented. With GCC, the user has to define the preprocessor macro manually. Hence, as a service to the user, mention __HIP_PLATFORM_AMD__ and __HIP_PLATFORM_NVIDIA__ in the interop documentation, even though it has only indirectly to do with GCC and its interop support. Note to commit-log readers, only: For Fortran, the hipfort modules can be used; when compiling the hipfort package (defaults to use gfortran), it generates the module (*.mod) files in include/hipfort/{amdgcn,nvidia}/ such that the choice is made by setting the respective include path. libgomp/ChangeLog: * libgomp.texi (gcn interop, nvptx interop): For HIP with C/C++, add a note about setting a preprocessor define.
2025-04-17d: Fix infinite loop regression in CTFEIain Buclaw4-4/+38
An infinite loop was introduced by a previous refactoring in the semantic pass for DeclarationExp nodes. Ensure the loop properly terminates and add test cases. gcc/d/ChangeLog: * dmd/MERGE: Merge upstream dmd 956e73d64e. gcc/testsuite/ChangeLog: * gdc.test/fail_compilation/test21247.d: New test. * gdc.test/fail_compilation/test21247b.d: New test. Reviewed-on: https://github.com/dlang/dmd/pull/21248
2025-04-17combine: Correct comments about combine_validate_costHans-Peter Nilsson1-3/+3
Fix misleading comments. That function only determines whether replacements cost more; it doesn't actually *validate* costs as being cheaper. For example, it returns true also if it for various reasons cannot determine the costs, or if the new cost is the same, like when doing an identity replacement. The code has been the same since r0-59417-g64b8935d4809f3. * combine.cc: Correct comments about combine_validate_cost.
2025-04-16c++: ill-formed constexpr function [PR113360]Jason Merrill9-12/+45
If we already gave an error while parsing a function, we don't also need to try to explain what's wrong with it when we later try to use it in a constant-expression. In the new testcase explain_invalid_constexpr_fn couldn't find anything still in the function to complain about, so it said because: followed by nothing. We still try to constant-evaluate it to reduce error cascades, but we shouldn't complain if it doesn't work very well. This flag is similar to CLASSTYPE_ERRONEOUS that I added a while back. PR c++/113360 gcc/cp/ChangeLog: * cp-tree.h (struct language_function): Add erroneous bit. * constexpr.cc (explain_invalid_constexpr_fn): Return if set. (cxx_eval_call_expression): Quiet if set. * parser.cc (cp_parser_function_definition_after_declarator) * pt.cc (instantiate_body): Set it. gcc/testsuite/ChangeLog: * g++.dg/cpp23/constexpr-nonlit18.C: Remove redundant message. * g++.dg/cpp1y/constexpr-diag2.C: New test. * g++.dg/cpp1y/pr63996.C: Adjust expected errors. * g++.dg/template/explicit-args6.C: Likewise. * g++.dg/cpp0x/constexpr-ice21.C: Likewise.
2025-04-17Daily bump.GCC Administrator10-1/+298
2025-04-16[testsuite] [ppc] ipa-sra-19.c: pass -Wno-psabi on powerpc-*-elf as wellAlexandre Oliva1-1/+1
Like other ppc targets, powerpc-*-elf needs -Wno-psabi to compile gcc.dg/ipa/ipa-sra-19.c without an undesired warning about vector argument passing. for gcc/testsuite/ChangeLog * gcc.dg/ipa/ipa-sra-19.c: Add -Wno-psabi on ppc-elf too.
2025-04-16Doc: Document raw string literals as GNU C extension [PR88382]Sandra Loosemore1-0/+20
gcc/ChangeLog PR c/88382 * doc/extend.texi (Syntax Extensions): Adjust menu. (Raw String Literals): New section.
2025-04-16testsuite: Replace altivec vector attribute with generic equivalent [PR112822]Peter Bergner1-1/+1
Usage of the altivec vector attribute requires use of the -maltivec option. Replace with a generic equivalent which allows building the test case on multiple other targets and non-altivec ppc cpus, but still diagnoses the ICE on unfixed compilers. 2025-04-16 Peter Bergner <bergner@linux.ibm.com> gcc/testsuite/ PR tree-optimization/112822 * g++.dg/pr112822.C: Replace altivec vector attribute with a generic vector attribute.
2025-04-16cobol: Eliminate gcc/cobol/LICENSE. [PR119759]Bob Dubner1-29/+0
gcc/cobol PR cobol/119759 * LICENSE: Deleted.
2025-04-16[PATCH] rx: avoid adding setpsw for rx_cmpstrn when len is constKeith Packard1-4/+16
pattern using rx_cmpstrn is cmpstrsi for which len is a constant -1, so we'll be moving the setpsw instructions from rx_cmpstrn to cmpstrnsi as follows: 1. Adjust the predicate on the length operand from "register_operand" to "nonmemory_operand". This will allow constants to appear here, instead of having them already transferred into a register. 2. Check to see if the len value is constant, and then check if it is actually zero. In that case, short-circuit the rest of the pattern and set the result register to 0. 3. Emit 'setpsw c' and 'setpsw z' instructions when the len is not a constant, in case it turns out to be zero at runtime. 4. Remove the two 'setpsw' instructions from rx_cmpstrn. gcc/ * config/rx/rx.md (cmpstrnsi): Allow constant length. For static length 0, just store 0 into the output register. For dynamic zero, set C/Z appropriately. (rx_cmpstrn): No longer set C/Z.
2025-04-16Fix wrong optimization of conditional expression with enumeration typeEric Botcazou4-3/+53
This is a regression introduced on the mainline and 14 branch by: https://gcc.gnu.org/pipermail/gcc-cvs/2023-October/391658.html The change bypasses int_fits_type_p (essentially) to work around the signedness constraints, but in doing so disregards the peculiarities of boolean types whose precision is not 1 dealt with by the predicate, leading to the creation of a problematic conversion here. Fixed by special-casing boolean types whose precision is not 1, as done in several other places. gcc/ * tree-ssa-phiopt.cc (factor_out_conditional_operation): Do not bypass the int_fits_type_p test for boolean types whose precision is not 1. gcc/testsuite/ * gnat.dg/opt105.adb: New test. * gnat.dg/opt105_pkg.ads, gnat.dg/opt105_pkg.adb: New helper.
2025-04-16Doc: make regenerate-opt-urlsSandra Loosemore1-2/+3
gcc/ChangeLog * common.opt.urls: Regenerated.
2025-04-16c++: templates, attributes, #pragma target [PR114772]Jason Merrill2-0/+20
Since r12-5426 apply_late_template_attributes suppresses various global state to avoid applying active pragmas to earlier declarations; we also need to override target_option_current_node. PR c++/114772 PR c++/101180 gcc/cp/ChangeLog: * pt.cc (apply_late_template_attributes): Also override target_option_current_node. gcc/testsuite/ChangeLog: * g++.dg/ext/pragma-target2.C: New test.
2025-04-16c++: format attribute redeclaration [PR116954]Jason Merrill2-1/+24
Here when merging the two decls, remove_contract_attributes loses ATTR_IS_DEPENDENT on the format attribute, so apply_late_template_attributes just returns, so the attribute doesn't get propagated to the type where the warning looks for it. Fixed by using copy_node instead of tree_cons to preserve flags. PR c++/116954 gcc/cp/ChangeLog: * contracts.cc (remove_contract_attributes): Preserve flags on the attribute list. gcc/testsuite/ChangeLog: * g++.dg/warn/Wformat-3.C: New test.
2025-04-16i386: Enable -mnop-mcount for -fpic with PLTs [PR119386]Ard Biesheuvel2-2/+12
-mnop-mcount can be trivially enabled for -fPIC codegen as long as PLTs are being used, given that the instruction encodings are identical, only the target may resolve differently depending on how the linker decides to incorporate the object file. So relax the option check, and add a test to ensure that 5-byte NOPs are emitted when -mnop-mcount is being used. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> gcc/ChangeLog: PR target/119386 * config/i386/i386-options.cc: Permit -mnop-mcount when using -fpic with PLTs. gcc/testsuite/ChangeLog: PR target/119386 * gcc.target/i386/pr119386-3.c: New test.
2025-04-16i386: Prefer PLT indirection for __fentry__ calls under -fPIC [PR119386]Ard Biesheuvel3-2/+32
Commit bde21de1205 ("i386: Honour -mdirect-extern-access when calling __fentry__") updated the logic that emits mcount() / __fentry__() calls into function prologues when profiling is enabled, to avoid GOT-based indirect calls when a direct call would suffice. There are two problems with that change: - it relies on -mdirect-extern-access rather than -fno-plt to decide whether or not a direct [PLT based] call is appropriate; - for the PLT case, it falls through to x86_print_call_or_nop(), which does not emit the @PLT suffix, resulting in the wrong relocation being used (R_X86_64_PC32 instead of R_X86_64_PLT32) Fix this by testing flag_plt instead of ix86_direct_extern_access, and updating x86_print_call_or_nop() to take flag_pic and flag_plt into account. This also ensures that -mnop-mcount works as expected when emitting the PLT based profiling calls. While at it, fix the 32-bit logic as well, and issue a PLT call unless PLTs are explicitly disabled. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 Signed-off-by: Ard Biesheuvel <ardb@kernel.org> gcc/ChangeLog: PR target/119386 * config/i386/i386.cc (x86_print_call_or_nop): Add @PLT suffix where appropriate. (x86_function_profiler): Fall through to x86_print_call_or_nop() for PIC codegen when flag_plt is set. gcc/testsuite/ChangeLog: PR target/119386 * gcc.target/i386/pr119386-1.c: New test. * gcc.target/i386/pr119386-2.c: New test.
2025-04-16Doc: Add pointer to --help use to main entry for -Q option [PR90465]Sandra Loosemore1-2/+8
-Q does something completely different in conjunction with --help than it does otherwise; its main entry in the manual didn't mention that, nor did -Q have an entry in the index for the --help usage. gcc/ChangeLog PR driver/90465 * doc/invoke.texi (Overall Options): Add a @cindex for -Q in connection with --help=. (Developer Options): Point at --help= documentation for the other use of -Q.
2025-04-16Fortran: pure subroutine with pure procedure as dummy [PR106948]Harald Anlauf2-0/+56
PR fortran/106948 gcc/fortran/ChangeLog: * resolve.cc (gfc_pure_function): If a function has been resolved, but esym is not yet set, look at its attributes to see whether it is pure or elemental. gcc/testsuite/ChangeLog: * gfortran.dg/pure_formal_proc_4.f90: New test.
2025-04-16Remove 'ALWAYS_INLINE' workaround in 'libgomp.c++/target-exceptions-pr118794-1.C'Thomas Schwinge1-6/+0
With commit ca9cffe737d20953082333dacebb65d4261e0d0c "For nvptx offloading, make sure to emit C++ constructor, destructor aliases [PR97106]", we're able to remove the 'ALWAYS_INLINE' workaround added in commit fe283dba774be57b705a7a871b000d2894d2e553 "GCN, nvptx: Support '-mfake-exceptions', and use it for offloading compilation [PR118794]". libgomp/ * testsuite/libgomp.c++/target-exceptions-pr118794-1.C: Remove 'ALWAYS_INLINE' workaround.
2025-04-16libatomic: Fix up libat_{,un}lock_n for mingw [PR119796]Jakub Jelinek1-18/+32
Here is just a port of the previously posted patch to mingw which clearly has the same problems. 2025-04-16 Jakub Jelinek <jakub@redhat.com> PR libgcc/101075 PR libgcc/119796 * config/mingw/lock.c (libat_lock_n, libat_unlock_n): Start with computing how many locks will be needed and take into account ((uintptr_t)ptr % WATCH_SIZE). If some locks from the end of the locks array and others from the start of it will be needed, first lock the ones from the start followed by ones from the end.
2025-04-16libatomic: Fix up libat_{,un}lock_n [PR119796]Jakub Jelinek1-16/+23
As mentioned in the PR (and I think in PR101075 too), we can run into deadlock with libat_lock_n calls with larger n. As mentioned in PR66842, we use multiple locks (normally 64 mutexes for each 64 byte cache line in 4KiB page) and currently can lock more than one lock, in particular for n [0, 64] a single lock, for n [65, 128] 2 locks, for n [129, 192] 3 locks etc. There are two problems with this: 1) we can deadlock if there is some wrap-around, because the locks are acquired always in the order from addr_hash (ptr) up to locks[NLOCKS-1].mutex and then if needed from locks[0].mutex onwards; so if e.g. 2 threads perform libat_lock_n with n = 2048+64, in one case at pointer starting at page boundary and in another case at page boundary + 2048 bytes, the first thread can lock the first 32 mutexes, the second thread can lock the last 32 mutexes and then first thread wait for the lock 32 held by second thread and second thread wait for the lock 0 held by the first thread; fixed below by always locking the locks in order of increasing index, if there is a wrap-around, by locking in 2 loops, first locking some locks at the start of the array and second at the end of it 2) the number of locks seems to be determined solely depending on the n value, I think that is wrong, we don't know the structure alignment on the libatomic side, it could very well be 1 byte aligned struct, and so how many cachelines are actually (partly or fully) occupied by the atomic access depends not just on the size, but also on ptr % WATCH_SIZE, e.g. 2 byte structure at address page_boundary+63 should IMHO lock 2 locks because it occupies the first and second cacheline Note, before this patch it locked exactly one lock for n = 0, while with this patch it could lock either no locks at all (if it is at cacheline boundary) or 1 (otherwise). 
Dunno if libatomic APIs can be called for zero sizes and whether we actually care that much how many mutexes are locked in that case, because one can't actually read/write anything into zero sized memory. If you think it is important, I could add else if (nlocks == 0) nlocks = 1; in both spots. 2025-04-16 Jakub Jelinek <jakub@redhat.com> PR libgcc/101075 PR libgcc/119796 * config/posix/lock.c (libat_lock_n, libat_unlock_n): Start with computing how many locks will be needed and take into account ((uintptr_t)ptr % WATCH_SIZE). If some locks from the end of the locks array and others from the start of it will be needed, first lock the ones from the start followed by ones from the end.
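The two points of this commit, counting locks from both the size and the offset into the first cache line, and always acquiring locks in increasing index order, can be sketched as follows. This is a model under stated assumptions rather than the libatomic source: `kWatchSize` and `kNumLocks` are assumed to be 64, `addr_hash` is an assumed hash (cache-line index within the page), and the functions return lock indices instead of taking real mutexes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kWatchSize = 64;  // assumed cache-line size in bytes
constexpr std::size_t kNumLocks = 64;   // assumed: one lock per line of a 4 KiB page

// Assumed hash: index of the cache line within its page.
std::size_t addr_hash(std::uintptr_t addr) {
  return (addr / kWatchSize) % kNumLocks;
}

// Locks needed for an n-byte access at addr, counting the offset into the
// first cache line and never exceeding the number of locks available.
std::size_t locks_needed(std::uintptr_t addr, std::size_t n) {
  std::size_t nlocks = (n + addr % kWatchSize + kWatchSize - 1) / kWatchSize;
  return nlocks > kNumLocks ? kNumLocks : nlocks;
}

// Lock indices in acquisition order: always increasing, even on wrap-around.
std::vector<std::size_t> lock_order(std::uintptr_t addr, std::size_t n) {
  std::size_t h = addr_hash(addr);
  std::size_t nlocks = locks_needed(addr, n);
  std::vector<std::size_t> order;
  std::size_t taken = 0;
  if (h + nlocks > kNumLocks) {
    // Wrapped part first: indices 0 .. (h + nlocks - kNumLocks - 1).
    std::size_t wrapped = h + nlocks - kNumLocks;
    for (; taken < wrapped; ++taken) order.push_back(taken);
  }
  // Then the part that starts at the hashed index.
  for (; taken < nlocks; ++taken) order.push_back(h++);
  return order;
}
```

For the deadlock scenario in the message (n = 2048+64 at a page boundary and at page boundary + 2048), both threads in this model start with index 0, so neither can hold a higher-numbered lock while waiting for a lower-numbered one.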
2025-04-16Add 'libgomp.c++/pr106445-1{,-O0}.C' [PR106445]Thomas Schwinge2-0/+21
PR target/106445 libgomp/ * testsuite/libgomp.c++/pr106445-1.C: New. * testsuite/libgomp.c++/pr106445-1-O0.C: Likewise.
2025-04-16For nvptx offloading, make sure to emit C++ constructor, destructor aliases [PR97106]Thomas Schwinge3-3/+13
PR target/97106 gcc/ * config/nvptx/nvptx.cc (nvptx_asm_output_def_from_decls) [ACCEL_COMPILER]: Make sure to emit C++ constructor, destructor aliases. libgomp/ * testsuite/libgomp.c++/pr96390.C: Un-XFAIL nvptx offloading. * testsuite/libgomp.c-c++-common/pr96390.c: Adjust.
2025-04-16Stream ipa_return_value_summaryJan Hubicka2-18/+131
Add streaming of return summaries from compile time to ltrans which are now needed for vrp to not output false errors on musttail. Co-authored-by: Jakub Jelinek <jakub@redhat.com> gcc/ChangeLog: PR tree-optimization/119614 * ipa-prop.cc (ipa_write_return_summaries): New function. (ipa_record_return_value_range_1): Break out from .... (ipa_record_return_value_range): ... here. (ipa_read_return_summaries): New function. (ipa_prop_read_section): Read return summaries. (read_ipcp_transformation_info): Read return summaries. (ipcp_write_transformation_summaries): Write return summaries; do not stream stray 0. gcc/testsuite/ChangeLog: * g++.dg/lto/pr119614_0.C: New test.
2025-04-16MAINTAINERS: Add myself to Write After ApprovalWaffl3x1-0/+1
ChangeLog: * MAINTAINERS: Add myself.
2025-04-16libstdc++: Fix constification in range_formatter::format [PR109162]Tomasz Kamiński2-3/+30
The _Rg is deduced to lvalue reference for the lvalue arguments, and in such case __format::__maybe_const_range<_Rg, _CharT> is always _Rg (adding const to reference does not change behavior). Now we correctly check if _Range = remove_reference_t<_Rg> is const formattable range, furthermore as range_formatter<T> can only format ranges of values of type (possibly const) _Tp, we additionally check if the remove_cvref_t<range_reference_t<const _Range>> is _Tp. The range_reference_t<R> and range_reference_t<const R> have different types (modulo remove_cvref_t) for std::vector<bool> (::reference and bool) or flat_map<T, U> (pair<const T&, U&> and pair<const T&, const U&>). PR libstdc++/109162 libstdc++-v3/ChangeLog: * include/std/format (range_formatter::format): Format const range, only if reference type is not changed. * testsuite/std/format/ranges/formatter.cc: New tests. Reviewed-by: Jonathan Wakely <jwakely@redhat.com> Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
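The std::vector<bool> case mentioned in this commit can be checked with a small type-level sketch. To stay compilable before C++20 it avoids <ranges>: `deref_t` is a hypothetical approximation of range_reference_t for containers with begin(), and `strip_t` is a hypothetical stand-in for remove_cvref_t.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Approximation of range_reference_t<R> for containers with begin():
// the type produced by dereferencing the iterator.
template <typename R>
using deref_t = decltype(*std::declval<R &>().begin());

// Stand-in for remove_cvref_t, spelled with C++11 traits.
template <typename T>
using strip_t =
    typename std::remove_cv<typename std::remove_reference<T>::type>::type;

// True when the const and non-const views yield the same element type once
// references and cv-qualifiers are removed.
template <typename R>
constexpr bool same_reference_modulo_cvref() {
  return std::is_same<strip_t<deref_t<R>>, strip_t<deref_t<const R>>>::value;
}
```

For std::vector<int> the two types agree (both reduce to int), whereas for std::vector<bool> the mutable side is a proxy reference class and the const side is not, so formatting through the const range would change the element type.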
2025-04-16middle-end: force AMDGCN test for vect-early-break_18.c to consistent architecture [PR119286]Tamar Christina1-1/+1
The given test is intended to test vectorization of a strided access done by having a step of > 1. The GCN target doesn't support load lanes, so the testcase is expected to fail; other targets create a permuted load here which we then reject. However some GCN architectures don't seem to support the permuted loads either, so the vectorizer tries a gather/scatter. But the indices aren't supported by some targets, so instead the vectorizer scalarizes the loads. I can't really test for which architecture is being used by the compiler, so instead this updates the testcase to use one single architecture so we get a consistent result. gcc/testsuite/ChangeLog: PR target/119286 * gcc.dg/vect/vect-early-break_18.c: Force -march=gfx908 for amdgcn.
2025-04-16middle-end: Fix incorrect codegen with PFA and VLS [PR119351]Tamar Christina14-3/+357
The following example: #define N 512 #define START 2 #define END 505 int x[N] __attribute__((aligned(32))); int __attribute__((noipa)) foo (void) { for (signed int i = START; i < END; ++i) { if (x[i] == 0) return i; } return -1; } generates incorrect code with fixed length SVE because for early break we need to know which value to start the scalar loop with if we take an early exit. Historically this means that we take the first element of every induction. this is because there's an assumption in place, that even with masked loops the masks come from a whilel* instruction. As such we reduce using a BIT_FIELD_REF <, 0>. When PFA was added this assumption was correct for non-masked loop, however we assumed that PFA for VLA wouldn't work for now, and disabled it using the alignment requirement checks. We also expected VLS to PFA using scalar loops. However as this PR shows, for VLS the vectorizer can, and does in some circumstances choose to peel using masks by masking the first iteration of the loop with an additional alignment mask. When this is done, the first elements of the predicate can be inactive. In this example element 1 is inactive based on the calculated misalignment. hence the -1 value in the first vector IV element. When we reduce using BIT_FIELD_REF we get the wrong value. This patch updates it by creating a new scalar PHI that keeps track of whether we are the first iteration of the loop (with the additional masking) or whether we have taken a loop iteration already. The generated sequence: pre-header: bb1: i_1 = <number of leading inactive elements> header: bb2: i_2 = PHI <i_1(bb1), 0(latch)> … early-exit: bb3: i_3 = iv_step * i_2 + PHI<vector-iv> Which eliminates the need to do an expensive mask based reduction. This fixes gromacs with one OpenMP thread. But with > 1 there is still an issue. gcc/ChangeLog: PR tree-optimization/119351 * tree-vectorizer.h (LOOP_VINFO_MASK_NITERS_PFA_OFFSET, LOOP_VINFO_NON_LINEAR_IV): New. 
(class _loop_vec_info): Add mask_skip_niters_pfa_offset and nonlinear_iv. * tree-vect-loop.cc (_loop_vec_info::_loop_vec_info): Initialize them. (vect_analyze_scalar_cycles_1): Record non-linear inductions. (vectorizable_induction): If early break and PFA using masking create a new phi which tracks where the scalar code needs to start... (vectorizable_live_operation): ...and generate the adjustments here. (vect_use_loop_mask_for_alignment_p): Reject non-linear inductions and early break needing peeling. gcc/testsuite/ChangeLog: PR tree-optimization/119351 * gcc.target/aarch64/sve/peel_ind_10.c: New test. * gcc.target/aarch64/sve/peel_ind_10_run.c: New test. * gcc.target/aarch64/sve/peel_ind_5.c: New test. * gcc.target/aarch64/sve/peel_ind_5_run.c: New test. * gcc.target/aarch64/sve/peel_ind_6.c: New test. * gcc.target/aarch64/sve/peel_ind_6_run.c: New test. * gcc.target/aarch64/sve/peel_ind_7.c: New test. * gcc.target/aarch64/sve/peel_ind_7_run.c: New test. * gcc.target/aarch64/sve/peel_ind_8.c: New test. * gcc.target/aarch64/sve/peel_ind_8_run.c: New test. * gcc.target/aarch64/sve/peel_ind_9.c: New test. * gcc.target/aarch64/sve/peel_ind_9_run.c: New test.
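The early-exit computation i_3 = iv_step * i_2 + PHI<vector-iv> described above can be illustrated with a small scalar model. This is only a sketch, not the vectorizer's code: the names VL, find_zero_masked_peel, lead_inactive and apply_fix are hypothetical, the vector loop is simulated lane by lane, and the model assumes lead_inactive <= start so that every index stays inside the array.

```c
#include <assert.h>

enum { VL = 8 };  /* hypothetical fixed vector length, in lanes */

/* Scalar reference loop, as in the commit message's example. */
static int find_zero_scalar(const int *x, int start, int end)
{
    for (int i = start; i < end; ++i)
        if (x[i] == 0)
            return i;
    return -1;
}

/* Simplified model of an early-break loop peeled for alignment by masking.
   lead_inactive is the number of leading inactive lanes in the first
   vector iteration.  With apply_fix == 0 the scalar code restarts from
   lane 0 of the vector IV (the old behaviour); with apply_fix != 0 it
   restarts from iv_step * i_2 + lane 0, where i_2 is lead_inactive in
   the first iteration and 0 afterwards.  */
static int find_zero_masked_peel(const int *x, int start, int end,
                                 int lead_inactive, int apply_fix)
{
    assert(lead_inactive >= 0 && lead_inactive < VL);
    assert(lead_inactive <= start);  /* assumption of this model only */

    const int iv_step = 1;
    int skip = lead_inactive;                  /* models the scalar PHI i_2 */
    int iv0 = start - lead_inactive * iv_step; /* lane 0 of the vector IV */

    while (iv0 < end) {
        int hit = 0;
        /* Only active lanes take part in the early-break test. */
        for (int lane = skip; lane < VL; ++lane) {
            int i = iv0 + lane * iv_step;
            if (i < end && x[i] == 0) {
                hit = 1;
                break;
            }
        }
        if (hit) {
            /* Early exit: pick the value the scalar loop starts with. */
            int scalar_start = apply_fix ? iv_step * skip + iv0 : iv0;
            for (int i = scalar_start; i < end; ++i)
                if (x[i] == 0)
                    return i;
            return -1;
        }
        iv0 += VL * iv_step;
        skip = 0;  /* after the first iteration no leading lane is inactive */
    }
    return -1;
}
```

In this model, an array with zeros at indices 0 and 5, searched from start 2 with two leading inactive lanes, yields 5 with the adjustment and 0 (an index before the loop's start) without it.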
2025-04-16libstdc++: Implement formatters for pair and tuple [PR109162]Tomasz Kamiński6-93/+813
This patch implements formatter specializations for pair and tuple from P2286R8. In addition, using `m` and range_format::map (from P2585R1) for ranges is now supported. The formatters for pairs and tuples whose corresponding elements are the same (after applying remove_cvref_t) derive from the same __tuple_formatter class. This reduces code duplication, as most of the parsing and formatting is the same in such cases. We use a custom reduced implementation of the tuple (__formatters_storage) to store the element formatters. Handling of the padding (width and fill) options is extracted to the __format::__format_padded function, which is used by both __tuple_formatter and range_formatter. To reduce the number of instantiations that range_formatter::format triggers, we cast the incoming range to __format::__maybe_const_range<_Rg, _CharT>& before formatting it. As in the case of previous commits, the signatures of the user-facing parse and format methods of the provided formatters deviate from the standard by constraining the types of parameters: * _CharT is constrained by __formatter::__char * basic_format_parse_context<_CharT> for parse argument * basic_format_context<_Out, _CharT> for format second argument The standard specifies the last three of the above as unconstrained types. Finally, add tests for tuple-like std::array and std::ranges::subrange that illustrate that they remain formatted as ranges. PR libstdc++/109162 libstdc++-v3/ChangeLog: * include/std/format (__formatter_int::_M_format_character_escaped) (__formatter_str::format): Use __sink.out() to produce _Sink_iter. (__format::__const_formattable_range): Moved closer to range_formatter. (__format::__maybe_const_range): Use `__conditional_t` and moved closer to range_formatter. (__format::__format_padded, __format::maybe_const) (__format::__indexed_formatter_storage, __format::__tuple_formatter) (std::formatter<pair<_Fp, _Sp>, _CharT>>) (std::formatter<tuple<_Tps...>, _CharT): Define. 
(std::formatter<_Rg, _CharT>::format): Cast incoming range to __format::__maybe_const_range<_Rg, _CharT>&. (std::formatter<_Rg, _CharT>::_M_format): Extracted from format, and use __format_padded. (std::formatter<_Rg, _CharT>::_M_format_no_padding): Rename... (std::formatter<_Rg, _CharT>::_M_format_elems): ...to this. (std::formatter<_Rg, _CharT>::_M_format_with_padding): Extracted as __format_padded. * testsuite/util/testsuite_iterators.h (test_input_range_nocopy): Define. * testsuite/std/format/ranges/formatter.cc: Tests for `m` specifier. * testsuite/std/format/ranges/sequence.cc: Tests for array and subrange. * testsuite/std/format/ranges/map.cc: New test. * testsuite/std/format/tuple.cc: New test. Reviewed-by: Jonathan Wakely <jwakely@redhat.com> Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
2025-04-16bitintlower: Fix interaction of gimple_assign_copy_p stmts vs. ↵Jakub Jelinek2-4/+46
has_single_use [PR119808] The following testcase is miscompiled, because we emit a CLOBBER in a place where it shouldn't be emitted. Before lowering we have: b_5 = 0; b.0_6 = b_5; b.1_1 = (unsigned _BitInt(129)) b.0_6; ... <retval> = b_5; The bitint coalescing assigns the same partition/underlying variable for both b_5 and b.0_6 (possible because there is a copy assignment) and of course a different one for b.1_1 (and other SSA_NAMEs in between). This is -O0 so stmts aren't DCEd and aren't propagated that much etc. It is -O0 so we also don't try to optimize and omit some names from m_names and handle multiple stmts at once, so the expansion emits essentially bitint.4 = {}; bitint.4 = bitint.4; bitint.2 = cast of bitint.4; bitint.4 = CLOBBER; ... <retval> = bitint.4; and the CLOBBER is the problem because bitint.4 is still live afterwards. We emit the clobbers to improve code generation, but do it only for (initially) has_single_use SSA_NAMEs (remembered in m_single_use_names) being used, if they don't have the same partition on the lhs and a few other conditions. The problem above is that b.0_6, which is used in the cast, has_single_use and so was in the m_single_use_names bitmask, and the lhs in that case is bitint.2, so a different partition. But there is a gimple_assign_copy_p stmt with SSA_NAME rhs1, the partitioning special-cases those, and while b.0_6 is single use, b_5 has multiple uses. I believe this ought to be a problem solely in the case of such copy stmts and their special-casing by the partitioning. If instead of b.0_6 = b_5; there were b.0_6 = b_5 + 1; or any other stmt that performs or may perform changes on the value, partitioning couldn't assign the same partition to b.0_6 and b_5 if b_5 is used later, since it couldn't have two different (or potentially different) values in the same bitint.N var. With a copy that is possible though. 
So the following patch fixes it by being more careful when we set m_single_use_names: don't set it if it is a has_single_use SSA_NAME but SSA_NAME_DEF_STMT of it is a copy stmt with SSA_NAME rhs1 and that rhs1 doesn't have single use, or has_single_use but SSA_NAME_DEF_STMT of it is a copy stmt etc. Just to make sure it doesn't change code generation too much, I've gathered statistics on how many times if (m_first && m_single_use_names && m_vars[p] != m_lhs && m_after_stmt && bitmap_bit_p (m_single_use_names, SSA_NAME_VERSION (op))) { tree clobber = build_clobber (TREE_TYPE (m_vars[p]), CLOBBER_STORAGE_END); g = gimple_build_assign (m_vars[p], clobber); gimple_stmt_iterator gsi = gsi_for_stmt (m_after_stmt); gsi_insert_after (&gsi, g, GSI_SAME_STMT); } emits a clobber on make check-gcc GCC_TEST_RUN_EXPENSIVE=1 RUNTESTFLAGS="--target_board=unix\{-m64,-m32\} GCC_TEST_RUN_EXPENSIVE=1 dg.exp='*bitint* pr112673.c builtin-stdc-bit-*.c pr112566-2.c pr112511.c pr116588.c pr116003.c pr113693.c pr113602.c flex-array-counted-by-7.c' dg-torture.exp='*bitint* pr116480-2.c pr114312.c pr114121.c' dfp.exp=*bitint* i386.exp='pr118017.c pr117946.c apx-ndd-x32-2a.c' vect.exp='vect-early-break_99-pr113287.c' tree-ssa.exp=pr113735.c" and before this patch it was 41010 clobbers and after it is 40968, so the difference is 42 clobbers, 0.1% fewer. 2025-04-16 Jakub Jelinek <jakub@redhat.com> PR middle-end/119808 * gimple-lower-bitint.cc (gimple_lower_bitint): Don't set m_single_use_names bits for SSA_NAMEs which have single use but their SSA_NAME_DEF_STMT is a copy from another SSA_NAME which doesn't have a single use, or single use which is such a copy etc. * gcc.dg/bitint-121.c: New test.
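The stricter m_single_use_names rule described above amounts to walking a chain of copy stmts and requiring a single use at every link. The following is a toy model under stated assumptions, not the code in gimple-lower-bitint.cc: struct ssa_name, num_uses, copy_of and clobber_candidate are hypothetical names, and SSA names are represented as array indices.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an SSA name; all fields are hypothetical. */
struct ssa_name {
    int num_uses; /* number of uses of this name */
    int copy_of;  /* index of rhs1 if the defining stmt is a plain copy
                     from another SSA name, otherwise -1 */
};

/* Return true if names[idx] may be recorded as a single-use name for
   clobber purposes: it must have a single use, and if it is defined
   as a copy of another SSA name, that name must qualify too, and so
   on along the copy chain.  */
static bool clobber_candidate(const struct ssa_name *names, int idx)
{
    assert(idx >= 0);
    for (int cur = idx; cur >= 0; cur = names[cur].copy_of)
        if (names[cur].num_uses != 1)
            return false;
    return true;
}
```

Modelling the testcase with b_5 as { 2, -1 } and b.0_6 as { 1, 0 }, the copy b.0_6 no longer qualifies because b_5 has two uses, whereas a single-use name defined by a non-copy stmt still does.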
2025-04-16riscv: Fix incorrect gnu property alignment on rv32Jesse Huang3-1/+15
Codegen is incorrectly emitting a ".p2align 3" that coerces the alignment of the .note.gnu.property section from 4 to 8 on rv32. 2025-04-11 Jesse Huang <jesse.huang@sifive.com> gcc/ChangeLog * config/riscv/riscv.cc (riscv_file_end): Fix .p2align value. gcc/testsuite/ChangeLog * gcc.target/riscv/gnu-property-align-rv32.c: New file. * gcc.target/riscv/gnu-property-align-rv64.c: New file.
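Since .p2align N requests an alignment of 2^N bytes, the effect of the wrong operand can be checked with simple arithmetic. The helper names below are hypothetical, and selecting the operand from XLEN is an assumption about the shape of the fix, not a quote of riscv_file_end.

```c
#include <assert.h>

/* Hypothetical helper: .p2align operand for .note.gnu.property, assuming
   4-byte alignment on rv32 and 8-byte alignment on rv64.  */
static unsigned gnu_property_p2align(unsigned xlen_bits)
{
    assert(xlen_bits == 32 || xlen_bits == 64);
    return xlen_bits == 64 ? 3u : 2u;
}

/* ".p2align n" aligns to 2^n bytes. */
static unsigned p2align_to_bytes(unsigned p2align)
{
    return 1u << p2align;
}
```

Under that assumption rv32 gets .p2align 2 (4 bytes), while the previously emitted .p2align 3 corresponds to 8 bytes, which is only appropriate for rv64.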