path: root/gcc
Age Commit message Author Files Lines
2025-03-10Add empty ChangeLog files for GCC COBOL.Jakub Jelinek1-0/+6
2025-03-11c++/modules: Handle exposures of TU-local types in uninstantiated member templatesNathaniel Shead2-10/+87
Previously, 'is_tu_local_entity' wouldn't detect the exposure of the (in practice) TU-local lambda in the following example, unless instantiated: struct S { template <typename> static inline decltype([]{}) x = {}; }; This is for two reasons. Firstly, when traversing the TYPE_FIELDS of S we only see the TEMPLATE_DECL, and never end up building a dependency on its DECL_TEMPLATE_RESULT (due to not being instantiated). This patch fixes this by stripping any templates before checking for unnamed types. The second reason is that we currently assume all class-scope entities are not TU-local. Despite this being unambiguous in the standard, this is not actually true in our implementation just yet, due to issues with mangling lambdas in some circumstances. Allowing these lambdas to be exported can cause issues in importers with apparently conflicting declarations, so this patch treats them as TU-local as well. After these changes, we now get double diagnostics from the two ways that we can see the above lambda being exposed, via 'S' (through TYPE_FIELDS) or via 'S::x'. To work around this we hide diagnostics from the first case, so we only get errors from 'S::x', which will be closer to the point where the offending lambda is declared. gcc/cp/ChangeLog: * module.cc (trees_out::has_tu_local_dep): Also look at the TI_TEMPLATE if we don't find a dep for a decl. (depset::hash::is_tu_local_entity): Handle unnamed template types, treat lambdas specially. (is_exposure_of_member_type): New function. (depset::hash::add_dependency): Use it. (depset::hash::finalize_dependencies): Likewise. gcc/testsuite/ChangeLog: * g++.dg/modules/internal-10.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Jason Merrill <jason@redhat.com>
2025-03-10arm: [MVE] Fix predicates for vec_cmp, vec_vcmpu and vcond_mask (PR 115439)Christophe Lyon1-3/+3
When compiling c-c++-common/vector-compare-3.c with -march=armv8.1-m.main+mve+fp.dp -mfloat-abi=hard -mfpu=auto (which enables MVE), we fail to match vcond_mask because operand 3 has s_register_operand as predicate for a MVE_VPRED mode, but we try to match: (insn 26 25 27 2 (set (reg:V4SI 137) (unspec:V4SI [ (reg:V4SI 144) (reg:V4SI 145) (subreg:V4BI (reg:HI 143) 0) ] VPSELQ_S)) "/src/gcc/testsuite/c-c++-common/vector-compare-3.c":23:6 -1 (nil)) The fix is to use the right predicate: vpr_register_operand. The patch also fixes vec_cmp and vec_cmpu in the same way. When testing with -mthumb/-march=armv8.1-m.main+mve.fp+fp.dp/-mtune=cortex-m55/-mfloat-abi=hard/-mfpu=auto it fixes the ICES in c-c++-common/vector-compare-3.c, g++.dg/opt/pr79734.C, g++.dg/tree-ssa/pr111150.C and gcc.dg/tree-ssa/pr111150.c gcc/ChangeLog PR target/115439 * config/arm/mve.md (vec_vcmp, vec_vcmpu, vcond_mask): Use vpr_register_operand predicate for MVE_VPRED operands.
2025-03-10Fortran: Fix gimplification error for pointer remapping in forall [PR107143]Andre Vehreschild2-1/+42
Enhance dependency checking for data pointers to check for the same derived type, not merely for a type being a derived type. This prevents generation of a descriptor for a function call that is unsuitable in forall's pointer assignment. PR fortran/107143 gcc/fortran/ChangeLog: * dependency.cc (check_data_pointer_types): Do not just compare for derived type, but for same derived type. gcc/testsuite/ChangeLog: * gfortran.dg/forall_20.f90: New test.
2025-03-10libgcc: Fix up unwind-dw2-btree.h [PR119151]Jakub Jelinek1-0/+151
The following testcase shows a bug in unwind-dw2-btree.h. In short, the header provides lock-free btree data structure (so no parent link on nodes, both insertion and deletion are done in top-down walks with some locking of just a few nodes at a time so that lookups can notice concurrent modifications and retry, non-leaf (inner) nodes contain keys which are initially the base address of the left-most leaf entry of the following child (or all ones if there is none) minus one, insertion ensures balancing of the tree to ensure [d/2, d] entries filled through aggressive splitting if it sees a full tree while walking, deletion performs various operations like merging neighbour trees, merging into parent or moving some nodes from neighbour to the current one). What differs from the textbook implementations is mostly that the leaf nodes don't include just address as a key, but address range, address + size (where we don't insert any ranges with zero size) and the lookups can be performed for any address in the [address, address + size) range. The keys on inner nodes are still just address-1, so the child covers all nodes where addr <= key unless it is covered already in children to the left. The user (static executables or JIT) should always ensure there is no overlap in between any of the ranges. In the testcase a bunch of insertions are done, always followed by one removal, followed by one insertion of a range slightly different from the removed one. E.g. in the first case [&code[0x50], &code[0x59]] range is removed and then we insert [&code[0x4c], &code[0x53]] range instead. This is valid, it doesn't overlap anything. But the problem is that some non-leaf (inner) one used the &code[0x4f] key (after the 11 insertions completely correctly). 
On removal, nothing adjusts the keys on the parent nodes (it really can't in the top-down only walk, the keys could be many nodes above it and unlike insertion, removal only knows the start address, doesn't know the removed size and so will discover it only when reaching the leaf node which contains it; plus even if it knew the address and size, it still doesn't know what the second left-most leaf node will be (i.e. the one after removal)). And on insertion, if nodes aren't split at a level, nothing adjusts the inner keys either. If a range is inserted and is either fully below key (keys are - 1, so having address + size - 1 being equal to key is fine) or fully after key (i.e. address > key), it works just fine, but if the key is in the middle of the range like in this case, &code[0x4f] is in the middle of the [&code[0x4c], &code[0x53]] range, then insertion works fine (we only use size on the leaf nodes), and lookup of the addresses below the key work fine too (i.e. [&code[0x4c], &code[0x4f]] will succeed). The problem is with lookups after the key (i.e. [&code[0x50], &code[0x53]]), the lookup looks for them in different children of the btree and doesn't find an entry and returns NULL. As users need to ensure non-overlapping entries at any time, the following patch fixes it by adjusting keys during insertion where we know not just the address but also size; if we find during the top-down walk a key which is in the middle of the range being inserted, we simply increase the key to be equal to address + size - 1 of the range being inserted. There can't be any existing leaf nodes overlapping the range in correct programs and the btree rebalancing done on deletion ensures we don't have any empty nodes which would also cause problems.
The patch adjusts the keys in two spots, once for the current node being walked (the last hunk in the header, with large comment trying to explain it) and once during inner node splitting in a parent node if we'd otherwise try to add that key in the middle of the range being inserted into the parent node (in that case it would be missed in the last hunk). The testcase covers both of those spots, so succeeds with GCC 12 (which didn't have btrees) and fails with vanilla GCC trunk and also fails if either the if (fence < base + size - 1) fence = iter->content.children[slot].separator = base + size - 1; or if (left_fence >= target && left_fence < target + size - 1) left_fence = target + size - 1; hunk is removed (of course, only with the current node sizes, i.e. up to 15 children of inner nodes and up to 10 entries in leaf nodes). 2025-03-10 Jakub Jelinek <jakub@redhat.com> Michael Leuchtenburg <michael@slashhome.org> PR libgcc/119151 * unwind-dw2-btree.h (btree_split_inner): Add size argument. If left_fence is in the middle of [target,target + size - 1] range, increase it to target + size - 1. (btree_insert): Adjust btree_split_inner caller. If fence is smaller than base + size - 1, increase it and separator of the slot to base + size - 1. * gcc.dg/pr119151.c: New test.
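The separator adjustment described in this commit message can be illustrated with a toy model. This is a minimal sketch, not the code in unwind-dw2-btree.h: it assumes a single inner node whose separators sit in a plain array, and the names `find_child` and `raise_separator_for_insert` are hypothetical. It only shows why a separator that falls inside an inserted range sends lookups to the wrong child, and how raising it to base + size - 1 avoids that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy inner node: child i covers every address <= sep[i] that is not
   already covered by a child to its left.  The last separator is all
   ones, which the commit message uses for "no following child".  */
static size_t
find_child (const uintptr_t *sep, size_t count, uintptr_t addr)
{
  assert (count > 0 && sep[count - 1] == UINTPTR_MAX);
  size_t i = 0;
  while (addr > sep[i])
    i++;
  return i;
}

/* Hypothetical helper: before inserting [base, base + size - 1], raise
   the separator of the child that will hold it if that separator lies
   inside the range.  Returns the slot that was chosen.  */
static size_t
raise_separator_for_insert (uintptr_t *sep, size_t count,
                            uintptr_t base, uintptr_t size)
{
  assert (size > 0);
  size_t slot = find_child (sep, count, base);
  if (sep[slot] < base + size - 1)
    sep[slot] = base + size - 1;
  return slot;
}
```

With separators { 0x4f, all ones }, inserting a range that starts at 0x4c with size 8 raises the first separator to 0x53, so a lookup of 0x50 reaches the same child as a lookup of 0x4c.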
2025-03-10LoongArch: Fix ICE when trying to recognize bitwise + alsl.w pair [PR119127]Xi Ruoyao3-11/+31
When we call loongarch_reassoc_shift_bitwise for <optab>_alsl_reversesi_extend, the mask is in DImode but we are trying to operate on it in SImode, causing an ICE. To fix the issue, sign-extend the mask into the mode we want. Also specially handle the case where the mask is extended into -1 to avoid a miss-optimization. gcc/ChangeLog: PR target/119127 * config/loongarch/loongarch.cc (loongarch_reassoc_shift_bitwise): Sign extend mask to mode, specially handle the case it's extended to -1. * config/loongarch/loongarch.md (loongarch_reassoc_shift_bitwise): Update the comment for the special case.
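The arithmetic behind "sign-extend the mask into the mode we want" can be sketched in plain C. This is an illustration only, not the code in loongarch.cc: `sign_extend_low32` and `and_with_mask32` are hypothetical names, and 32-bit and 64-bit integers stand in for SImode and DImode. It shows that a wide mask whose low 32 bits are all ones becomes -1 once narrowed, at which point an AND with it leaves the value unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 32 bits of MASK to 64 bits using only
   well-defined arithmetic (no out-of-range signed conversions).  */
static int64_t
sign_extend_low32 (uint64_t mask)
{
  int64_t low = (int64_t) (mask & UINT64_C (0xffffffff));
  return (low ^ INT64_C (0x80000000)) - INT64_C (0x80000000);
}

/* Hypothetical helper: AND a 32-bit value with a mask that arrived in
   a 64-bit container.  */
static int32_t
and_with_mask32 (int32_t value, uint64_t wide_mask)
{
  int64_t mask = sign_extend_low32 (wide_mask);
  assert (mask >= INT32_MIN && mask <= INT32_MAX);
  if (mask == -1)
    return value;               /* all-ones mask: the AND is a no-op */
  return (int32_t) (value & mask);
}
```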
2025-03-10gimple-ssa-warn-access: Adjust maybe_warn_nonstring_arg for nonstring multidimensional arrays [PR117178]Jakub Jelinek3-6/+20
The following patch fixes 4 xfails in attr-nonstring-11.c (and results in 2 false positive warnings in attr-nonstring-12.c not being produced either). The thing is that maybe_warn_nonstring_arg simply assumed that nonstring arrays must be single-dimensional, so when it saw a nonstring decl with ARRAY_TYPE, it just used its dimension. With multi-dimensional arrays that is not the right dimension to use though, it can be the size of some outer dimension, e.g. if we have char a[5][6][7] __attribute__((nonstring)) if decl is a[5] it would assume maximum non-NUL terminated string length of 5 rather than 7, if a[5][6] it would assume 6 and only for a[5][6][0] it would assume the correct 7. So, the following patch looks through all the outer dimensions to reach the innermost one (which for attribute nonstring is guaranteed to have char/unsigned char/signed char element type). 2025-03-10 Jakub Jelinek <jakub@redhat.com> PR c/117178 * gimple-ssa-warn-access.cc (maybe_warn_nonstring_arg): Look through multi-dimensional array types, stop at the innermost ARRAY_TYPE. * c-c++-common/attr-nonstring-11.c: Remove xfails. * c-c++-common/attr-nonstring-12.c (warn_strcmp_cst_1, warn_strcmp_cst_2): Don't expect any warnings here. (warn_strcmp_cst_3, warn_strcmp_cst_4): New functions with expected warnings.
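The difference between the outer and innermost dimensions of char a[5][6][7] can be checked with sizeof. The sketch below is illustrative only and leaves the attribute out so that it compiles with any C compiler; `bounded_len` and `fill_row` are hypothetical helpers. It shows that the bound for a string stored in one row is the innermost dimension, 7, not 5 or 6.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static char a[5][6][7];

/* Element counts at each nesting level of the array type.  */
enum
{
  OUTER_DIM = sizeof a / sizeof a[0],
  MIDDLE_DIM = sizeof a[0] / sizeof a[0][0],
  INNER_DIM = sizeof a[0][0] / sizeof a[0][0][0]
};

/* Length of a possibly unterminated string, never reading past BOUND.  */
static size_t
bounded_len (const char *row, size_t bound)
{
  assert (row != NULL);
  size_t n = 0;
  while (n < bound && row[n] != '\0')
    n++;
  return n;
}

/* Fill one innermost row completely, leaving no room for a NUL.  */
static void
fill_row (int i, int j, char c)
{
  assert (i >= 0 && i < OUTER_DIM && j >= 0 && j < MIDDLE_DIM);
  memset (a[i][j], c, INNER_DIM);
}
```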
2025-03-10LoongArch: testsuite: Fix gcc.dg/vect/slp-26.c.Lulu Cheng1-2/+2
After d34cda720988674bcf8a24267c9e1ec61335d6de, what was originally not vectorizable can now be vectorized. So adjust gcc.dg/vect/slp-26.c. gcc/testsuite/ChangeLog: * gcc.dg/vect/slp-26.c: Adjust.
2025-03-10LoongArch: testsuite: Fix gcc.dg/vect/bb-slp-77.c.Lulu Cheng1-1/+1
The issue is the same as 12383255fe4e82c31f5e42c72a8fbcb1b5dea35d. Neither is .REDUC_PLUS set for V2SImode on LoongArch, so add it to the list of targets not expecting BB vectorization. gcc/testsuite/ChangeLog: * gcc.dg/vect/bb-slp-77.c: Add loongarch*-*-* to the list of expected failing targets.
2025-03-10LoongArch: testsuite: Fix pr112325.c and pr117888-1.c.Lulu Cheng2-0/+2
By default, vectorization is not enabled on LoongArch, resulting in the failure of these two test cases. gcc/testsuite/ChangeLog: * gcc.dg/vect/pr112325.c: Add the vector compilation option '-mlsx' for LoongArch. * gcc.dg/vect/pr117888-1.c: Likewise.
2025-03-10Daily bump.GCC Administrator4-1/+64
2025-03-09[rtl-optimization/117467] Mark FP destinations as deadJeff Law1-4/+4
The next step in improving ext-dce is to clean up a minor wart in the set/clobber handling code. In that code the safe thing to do is to not process a destination at all. That will leave bits set in the live bitmaps for objects that may no longer be live. Of course with extraneous bits set we use more memory and do more work managing the bitmaps, but it's safe from a code correctness standpoint. One case that is slipping through that we need to fix is scalar fp destinations. Essentially the code never tried to handle those and as a result would leave those entities live and bubble them up through the CFG. In the testcase at hand this takes us from ~10k live objects at entry to ~4k live objects at entry. Time spent in ext-dce goes from 2.14s to .64s. Bootstrapped and regression tested on x86_64. PR rtl-optimization/117467 gcc/ * ext-dce.cc (ext_dce_process_sets): Handle FP destinations better.
2025-03-09[rtl-optimization/117467] Avoid unnecessarily marking things live in ext-dceJeff Law1-0/+12
This is the first of what I expect to be a few patches to improve memory consumption and performance of ext-dce. While I haven't been able to reproduce the insane memory usage that Richi saw, I can certainly see how we might get there. I instrumented ext-dce to dump the size of liveness sets, removed the memory allocation limiter, then compiled the appropriate file from specfp on rv64. In my test I saw the liveness sets growing to absurd sizes as we worked from the last block back to the first. Think 125k entries by the time we got back to the entry block which would mean ~30k live registers. Simply no way that's correct. The use handling is the primary source of problems and the code that I most want to rewrite for gcc-16. It's just a fugly mess. I'm not terribly inclined to do that rewrite for gcc-15 though. So these will be spot adjustments. The most important thing to know about use processing is it sets up an iterator and walks that. When a SET is encountered we actually manually dive into the SRC/DEST and ideally terminate the iterator. If during that SET processing we encounter something unexpected we let the iterator continue normally, which causes iteration down into the SET_DEST object. That's safe behavior, though it can lead to too many objects as being marked live. We can refine that behavior by trivially realizing that we need not process the SET_DEST if it is a naked REG (and probably for other cases too, but they're not expected to be terribly important). So once we see the SET with a simple REG destination, we can bump the iterator to avoid having it dive into the SET_DEST if something unexpected is seen on the SET_SRC side. Fixing this alone takes us from 125k live objects to 10k live objects at the entry block. Time in ext-dce for rv64 on the testcase goes from 10.81s to 2.14s. Given this reduces the things considered live, this could easily result in finding more cases for ext-dce to improve. 
In fact a missed optimization issue for rv64 I've been poking at needs this patch as a prerequisite. Bootstrapped and regression tested on x86_64. Pushing to the trunk. PR rtl-optimization/117467 gcc * ext-dce.cc (ext_dce_process_uses): When trivially possible advance the iterator over the destination of a SET.
2025-03-09Use gfc_commit_symbol() to remove UNDO status instead of new function.Thomas Koenig3-8/+2
This is a cleaner version, removing an unneeded function and making sure that no memory leaks can occur if callers change. gcc/fortran/ChangeLog: PR fortran/119157 * gfortran.h (gfc_pop_undo_symbol): Remove prototype. * interface.cc (gfc_get_formal_from_actual_arglist): Use gfc_commit_symbol() instead of gfc_pop_undo_symbol(). * symbol.cc (gfc_pop_undo_symbol): Remove.
2025-03-09phiopt: Fix value_replacement for middle bb having phi nodes [PR118922]Andrew Pinski2-0/+61
After r12-5300-gf98f373dd822b3, value_replacement would be able to look at the following cfg structure:
```
<bb 5> [local count: 1014686024]:
if (h_6 != 0)
  goto <bb 7>; [94.50%]
else
  goto <bb 6>; [5.50%]

<bb 6> [local count: 114863530]:
# h_6 = PHI <0(4), 1(5)>

<bb 7> [local count: 1073741824]:
# f_8 = PHI <0(5), h_6(6)>
_9 = f_8 ^ 1;
a.0_10 = a;
_11 = _9 + a.0_10;
if (_11 != -117)
  goto <bb 5>; [94.50%]
else
  goto <bb 8>; [5.50%]
```
value_replacement would incorrectly think the middle bb (6) was empty, so it decided to remove the condition in bb5 and replace it with 0, as the function thought it was `h_6 ? 0 : h_6`. But since there is an incoming phi node to bb6 defining h_6, that is incorrect. The fix is to check if there are phi nodes in the middle bb and set empty_or_with_defined_p to false. This was not needed before r12-5300-gf98f373dd822b3 because the phi would have been dead otherwise due to other checks. Bootstrapped and tested on x86_64-linux-gnu. PR tree-optimization/118922 gcc/ChangeLog: * tree-ssa-phiopt.cc (value_replacement): Set empty_or_with_defined_p to false when there is phi nodes for the middle bb. gcc/testsuite/ChangeLog: * gcc.dg/torture/pr118922-1.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
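The two readings of that CFG can be written out as ordinary C. This is a hand-made illustration, not the testcase from the commit, and the function names are hypothetical. It models only the single pass from bb 5 to bb 7 and ignores the surrounding loop.

```c
#include <assert.h>

/* What value_replacement took the code to be: h ? 0 : h, which is 0
   for every h.  */
static int
f_as_misread (int h)
{
  return h != 0 ? 0 : h;
}

/* What the CFG computes: on the path through bb 6 the PHI gives h_6 a
   new value (1 when arriving from bb 5) before f_8 reads it.  */
static int
f_as_written (int h)
{
  if (h != 0)
    return 0;                   /* edge 5 -> 7: f_8 = 0 */
  h = 1;                        /* bb 6: h_6 = PHI <..., 1(5)> */
  return h;                     /* edge 6 -> 7: f_8 = h_6 */
}

/* The two readings disagree exactly when h is 0.  */
static void
check_readings (void)
{
  assert (f_as_misread (0) == 0 && f_as_written (0) == 1);
  assert (f_as_misread (5) == f_as_written (5));
}
```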
2025-03-09testsuite: Require effective target float16 for test [PR119133]Dimitar Dimitrov1-0/+1
The test spuriously failed on pru-unknown-elf due to missing support for _Float16 type. PR target/119133 gcc/testsuite/ChangeLog: * gcc.dg/torture/pr119133.c: Require effective target float16. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2025-03-09OpenMP: Integrate dynamic selectors with dispatch argument handling [PR118457]Sandra Loosemore6-465/+499
Support for dynamic selectors in "declare variant" was developed in parallel with support for the adjust_args/append_args clauses and the dispatch construct; they collided in a bad way. This patch fixes the "sorry" for calls that need both by removing the adjust_args/append_args code from gimplify_call_expr and invoking it from the new variant substitution code instead. It's handled as a tree -> tree transformation rather than tree -> gimple because eventually this code may end up being invoked from the front ends instead of the gimplifier (see PR115076). gcc/ChangeLog PR middle-end/118457 * gimplify.cc (modify_call_for_omp_dispatch): New, containing code split from gimplify_call_expr and modified to emit tree instead of gimple. Remove the error for falling through to a call to the base function. (expand_variant_call_expr): New, split from gimplify_variant_call_expr. Call modify_call_for_omp_dispatch on calls to variants in a dispatch construct context. (gimplify_variant_call_expr): Make it call expand_variant_call_expr to do the actual work. (gimplify_call_expr): Remove sorry for calls involving both dynamic/late selectors and adjust_args/append_args, and adjust for new interface. Move adjust_args/append_args code to modify_call_for_omp_dispatch. (gimplify_omp_dispatch): Add some comments. gcc/testsuite/ChangeLog PR middle-end/118457 * c-c++-common/gomp/adjust-args-6.c: Remove xfails and adjust expected output. * c-c++-common/gomp/append-args-5.c: Adjust expected output. * c-c++-common/gomp/append-args-dynamic.c: New. * c-c++-common/gomp/dispatch-11.c: Adjust expected output. * gfortran.dg/gomp/dispatch-11.f90: Likewise.
2025-03-09Daily bump.GCC Administrator5-1/+76
2025-03-08Fix regression with -Wexternal-argument-mismatch.Thomas Koenig4-1/+23
The attached patch fixes an ICE regression where undo state was not handled properly when generating formal from actual arguments, which occurred under certain conditions with the newly introduced -Wexternal-argument-mismatch option. The fix is simple: When we are generating these symbols, we no longer need to undo anything, so we can just remove them. I had considered adding an extra optional argument, but decided against it on code clarity grounds. While looking at the code, I also saw that a member of gfc_symbol introduced with my patch should be a bitfield of width 1. gcc/fortran/ChangeLog: PR fortran/119157 * gfortran.h (gfc_symbol): Make ext_dummy_arglist_mismatch a one-bit bitfield. (gfc_pop_undo_symbol): Declare prototype. * symbol.cc (gfc_pop_undo_symbol): New function. * interface.cc (gfc_get_formal_from_actual_arglist): Call it for artificially introduced formal variables. gcc/testsuite/ChangeLog: PR fortran/119157 * gfortran.dg/interface_57.f90: New test.
2025-03-08inline-asm: Improve documentation of "asm constexpr".Sandra Loosemore2-27/+41
While working on an adjacent documentation fix, I noticed that the documentation for the gnu++11 "asm constexpr" feature was very confusing, in some cases being attached to parts of the asm syntax that are not otherwise required to be string literals, and missing from other parts of the syntax that are. I've checked what the C++ parser actually does and fixed the documentation to match, also improving it to use correct markup and to be more explicit and less implementor-speaky. gcc/cp/ChangeLog * parser.cc (cp_parser_asm_definition): Make comment more explicit. (cp_parser_asm_operand_list): Likewise. Also correct the comment block at the top of the function to reflect reality. gcc/ChangeLog * doc/extend.texi (Basic Asm): Document that AssemblerInstructions can be an asm constexpr. (Extended Asm): Move the notes about asm constexprs for AssemblerTemplate and Clobbers to the corresponding subsections. Remove the notes for OutputOperands and InputOperands and reword misleading descriptions of the list item syntax. Note that constraint strings can be asm constexprs. (Asm constexprs): Use "title case" for subsection name. Be explicit about what parts of the asm syntax this applies to and that the parentheses are required. Correct markup and terminology.
2025-03-08c++/modules: purview of explicit instantiations [PR114630]Jason Merrill5-5/+76
When calling instantiate_pending_templates at end of parsing, any new functions that are instantiated from this point have their module purview set based on the current value of module_kind. This is unideal, however, as the modules code will then treat these instantiations as reachable and cause large swathes of the GMF to be emitted into the module CMI, despite no code in the actual module purview referencing it. This patch fixes this by setting DECL_MODULE_PURVIEW_P as appropriate when we see an explicit instantiation, and adjusting module_kind accordingly during deferred instantiation, meaning that GMF entities won't be counted as reachable unless referenced by an actually reachable entity. Note that purviewness and attachment etc. is generally only determined by the base template: this is purely for determining whether an explicit instantiation is in the module purview and hence whether it should be streamed out. See the comment on 'set_instantiating_module'. Incidentally, since the "xtreme" testcases are deliberately large (and this commit adds another one), let's make sure we only run them once. PR c++/114630 PR c++/114795 gcc/cp/ChangeLog: * pt.cc (reopen_tinst_level): Set or clear MK_PURVIEW. (mark_decl_instantiated): Call set_instantiating_module. (instantiate_pending_templates): Save and restore module_kind so it isn't affected by reopen_tinst_level. gcc/testsuite/ChangeLog: * g++.dg/modules/modules.exp: Run xtreme tests once. * g++.dg/modules/gmf-3.C: New test. * g++.dg/modules/gmf-4.C: New test. * g++.dg/modules/gmf-xtreme.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Co-authored-by: Nathaniel Shead <nathanieloshead@gmail.com>
2025-03-08inline-asm: Clarify documentation of operand syntax [PR67301]Sandra Loosemore1-8/+10
gcc/ChangeLog PR c/67301 * doc/extend.texi (Extended Asm): Clarify that the square brackets around the asmSymbolicName of operands are a required part of the syntax.
2025-03-07Fortran: Fix ICE in resolve.cc with -pedanticJerry DeLisle2-1/+17
Fixes an ICE in gfc_resolve_code when passing an optional array to an elemental procedure with `-pedantic` enabled. PR95446 added the original check, this patch fixes the case where the other actual argument is an array literal (or something else other than a variable). PR fortran/119054 gcc/fortran/ChangeLog: * resolve.cc (resolve_elemental_actual): When checking other actual arguments to elemental procedures, don't check attributes of literals and function calls. gcc/testsuite/ChangeLog: * gfortran.dg/pr95446.f90: Expand test case to literals and function calls. Signed-off-by: Peter Hill <peter.hill@york.ac.uk>
2025-03-08Daily bump.GCC Administrator6-1/+354
2025-03-08c-family, tree: Allow nonstring attribute on multidimensional arrays [PR117178]Jakub Jelinek11-11/+1237
As requested in the PR117178 thread, the following patch allows nonstring attribute also on multi-dimensional arrays (with cv char/signed char/unsigned char as innermost element type) and pointers to such multi-dimensional arrays or pointers to single-dimensional cv char/signed char/unsigned char arrays. Given that (unfortunately) nonstring is a decl attribute rather than type attribute, I think restricting it to single-dimensional arrays makes no sense, even multi-dimensional ones can be used for storage of non-nul terminated strings. I really don't know what the kernel plans are, whether they'll go with -Wno-unterminated-string-initialization added in Makefiles, or whether the plan is to use nonstring attributes to quiet the warning. In the latter case, some of the nonstring attributes will need to be conditional on gcc version, because gcc before this patch will reject it on multidimensional arrays. 2025-03-08 Jakub Jelinek <jakub@redhat.com> PR c/117178 gcc/ * tree.cc (get_attr_nonstring_decl): Look through all ARRAY_REFs, not just one and handle COMPONENT_REF and MEM_REF after skipping those rather than only when there wasn't ARRAY_REF. Formatting fix. gcc/c-family/ * c-attribs.cc (handle_nonstring_attribute): Allow the attribute also on multi-dimensional arrays with char/signed char/unsigned char element type or pointers to such single and multi-dimensional arrays. gcc/testsuite/ * c-c++-common/attr-nonstring-7.c: Remove one xfail. * c-c++-common/attr-nonstring-9.c: New test. * c-c++-common/attr-nonstring-10.c: New test. * c-c++-common/attr-nonstring-11.c: New test. * c-c++-common/attr-nonstring-12.c: New test. * c-c++-common/attr-nonstring-13.c: New test. * c-c++-common/attr-nonstring-14.c: New test. * c-c++-common/attr-nonstring-15.c: New test. * c-c++-common/attr-nonstring-16.c: New test.
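A version-conditional use of the attribute on a multi-dimensional array, as the message anticipates, might look like the sketch below. The macro name `NONSTRING_MULTIDIM` and the helper `find_code` are hypothetical, and the `__GNUC__ >= 15` test is an assumption that this change first ships in GCC 15; adjust it to whichever release actually contains the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: compilers older than GCC 15 do not accept nonstring on
   multi-dimensional arrays, so the attribute is dropped there.  */
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 15
# define NONSTRING_MULTIDIM __attribute__((nonstring))
#else
# define NONSTRING_MULTIDIM
#endif

/* Three fixed-width 4-byte codes; every row is full, with no NUL.  */
static const char codes[3][4] NONSTRING_MULTIDIM =
  { "ABCD", "EFGH", "IJKL" };

/* Return the index of the 4-byte code, or -1 if it is not present.
   memcmp is used because the rows are not NUL-terminated.  */
static int
find_code (const char *code4)
{
  assert (code4 != NULL);
  for (size_t i = 0; i < sizeof codes / sizeof codes[0]; i++)
    if (memcmp (codes[i], code4, sizeof codes[i]) == 0)
      return (int) i;
  return -1;
}
```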
2025-03-07c: do not warn about truncating NUL char when initializing nonstring arrays [PR117178]Jakub Jelinek7-35/+143
When initializing a nonstring char array when compiled with -Wunterminated-string-initialization the warning trips even when truncating the trailing NUL character from the string constant. Only warn about this when running under -Wc++-compat since under C++ we should not initialize nonstrings from C strings. This patch separates the -Wunterminated-string-initialization and -Wc++-compat warnings, they are now independent options, the former implied by -Wextra, the latter not implied by anything. If -Wc++-compat is in effect, it takes precedence over -Wunterminated-string-initialization and warns regardless of nonstring attribute, otherwise if -Wunterminated-string-initialization is enabled, it warns only if there is no nonstring attribute. In all cases, the warnings and also pedwarn_init for even larger sizes now provide details on the lengths. 2025-03-07 Kees Cook <kees@kernel.org> Jakub Jelinek <jakub@redhat.com> PR c/117178 gcc/ * doc/invoke.texi (Wunterminated-string-initialization): Document the new interaction between this warning and -Wc++-compat and that initialization of decls with nonstring attribute aren't warned about. gcc/c-family/ * c.opt (Wunterminated-string-initialization): Don't depend on -Wc++-compat. gcc/c/ * c-typeck.cc (digest_init): Add DECL argument. Adjust wording of pedwarn_init for too long strings and provide details on the lengths, for string literals where just the trailing NULL doesn't fit warn for warn_cxx_compat with OPT_Wc___compat, wording which mentions "for C++" and provides details on lengths, otherwise for warn_unterminated_string_initialization adjust the warning, provide details on lengths and don't warn if get_attr_nonstring_decl (decl). (build_c_cast, store_init_value, output_init_element): Adjust digest_init callers. gcc/testsuite/ * gcc.dg/Wunterminated-string-initialization.c: Add additional test coverage. * gcc.dg/Wcxx-compat-14.c: Check in dg-warning for "for C++" part of the diagnostics. 
* gcc.dg/Wcxx-compat-23.c: New test. * gcc.dg/Wcxx-compat-24.c: New test. Signed-off-by: Kees Cook <kees@kernel.org>
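The case this commit targets is an array initialized from a string literal that fits exactly once the trailing NUL is dropped. A minimal sketch, with a hypothetical `NONSTRING` macro and `has_magic` helper: per the message above, -Wunterminated-string-initialization should stay quiet for `magic` because of the attribute, while -Wc++-compat would still warn because the initialization is not valid C++.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Use the attribute only where the compiler reports support for it.  */
#if defined(__has_attribute)
# if __has_attribute(nonstring)
#  define NONSTRING __attribute__((nonstring))
# endif
#endif
#ifndef NONSTRING
# define NONSTRING
#endif

/* Four bytes, no room for the trailing NUL: valid C, not valid C++.  */
static const char magic[4] NONSTRING = "RIFF";

/* Compare with memcmp and an explicit length, never with strcmp,
   because magic is not NUL-terminated.  */
static int
has_magic (const unsigned char *buf, size_t len)
{
  assert (buf != NULL);
  return len >= sizeof magic && memcmp (buf, magic, sizeof magic) == 0;
}
```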
2025-03-07Sanitizer: Mention -g option in documentation [PR56682]Sandra Loosemore1-1/+8
gcc/ChangeLog PR sanitizer/56682 * doc/invoke.texi (Instrumentation Options): Document that -g is useful with -fsanitize=thread and -fsanitize=address. Also mention -fno-omit-frame-pointer per the asan wiki.
2025-03-07Fix testcases up after recent -Wreturn-type changeAndrew Pinski2-2/+2
I missed these two testcases in the diff when looking for testcases that fail. The change is the same as what was done for gcc.dg/Wreturn-mismatch-2.c. Pushed as obvious after a quick test. gcc/testsuite/ChangeLog: * gcc.dg/Wreturn-mismatch-2a.c: Change dg-warning for the last -Wreturn-type to dg-bogus. * gcc.dg/Wreturn-mismatch-6.c: Likewise. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2025-03-08ira: Add new hooks for callee-save vs spills [PR117477]Richard Sandiford14-39/+458
Following on from the discussion in: https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675256.html this patch removes TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE and replaces it with two hooks: one that controls the cost of using an extra callee-saved register and one that controls the cost of allocating a frame for the first spill. (The patch does not attempt to address the shrink-wrapping part of the thread above.) On AArch64, this is enough to fix PR117477, as verified by the new tests. The patch does not change the SPEC2017 scores significantly. (I saw a slight improvement in fotonik3d and roms, but I'm not convinced that the improvements are real.) The patch makes IRA use caller saves for gcc.target/aarch64/pr103350-1.c, which is a scan-dump correctness test that relies on not using caller saves. The decision to use caller saves looks appropriate, and saves an instruction, so I've just added -fno-caller-saves to the test options. The x86 parts were written by Honza. ix86_callee_save_cost is updated by H.J. to replace gcc_checking_assert with returning 1 if mem_cost <= 2. gcc/ PR rtl-optimization/117477 * config/aarch64/aarch64.cc (aarch64_count_saves): New function. (aarch64_count_above_hard_fp_saves, aarch64_callee_save_cost) (aarch64_frame_allocation_cost): Likewise. (TARGET_CALLEE_SAVE_COST): Define. (TARGET_FRAME_ALLOCATION_COST): Likewise. * config/i386/i386.cc (ix86_ira_callee_saved_register_cost_scale): Replace with... (ix86_callee_save_cost): ...this new hook. (TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE): Delete. (TARGET_CALLEE_SAVE_COST): Define. * target.h (spill_cost_type, frame_cost_type): New enums. * target.def (callee_save_cost, frame_allocation_cost): New hooks. (ira_callee_saved_register_cost_scale): Delete. * doc/tm.texi.in (TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE): Delete. (TARGET_CALLEE_SAVE_COST, TARGET_FRAME_ALLOCATION_COST): New hooks. * doc/tm.texi: Regenerate. * hard-reg-set.h (hard_reg_set_popcount): New function. 
* ira-color.cc (allocated_memory_p): New variable. (allocated_callee_save_regs): Likewise. (record_allocation): New function. (assign_hard_reg): Use targetm.frame_allocation_cost to model the cost of the first spill or first caller save. Use targetm.callee_save_cost to model the cost of using new callee-saved registers. Apply the exit rather than entry frequency to the cost of restoring a register or deallocating the frame. Update the new variables above. (improve_allocation): Use record_allocation. (color): Initialize allocated_callee_save_regs. (ira_color): Initialize allocated_memory_p. * targhooks.h (default_callee_save_cost): Declare. (default_frame_allocation_cost): Likewise. * targhooks.cc (default_callee_save_cost): New function. (default_frame_allocation_cost): Likewise. gcc/testsuite/ PR rtl-optimization/117477 * gcc.target/aarch64/callee_save_1.c: New test. * gcc.target/aarch64/callee_save_2.c: Likewise. * gcc.target/aarch64/callee_save_3.c: Likewise. * gcc.target/aarch64/pr103350-1.c: Add -fno-caller-saves. Co-authored-by: Jan Hubicka <hubicka@ucw.cz> Co-authored-by: H.J. Lu <hjl.tools@gmail.com>
2025-03-07c: Fix warning after an error on a return statement [PR60440]Andrew Pinski3-2/+17
Like r5-6912-g3dbb84276aca10, but for the C front-end. When there is an error on a return statement, we just return error_mark_node, and the warning then fires because no return statement is recorded. Instead, mark the current function for suppression of the warning. PR c/60440 gcc/c/ChangeLog: * c-typeck.cc (c_finish_return): Mark the current function for suppression of -Wreturn-type if there was an error on the return statement. gcc/testsuite/ChangeLog: * gcc.dg/Wreturn-mismatch-2.c: Change dg-warning for the last -Wreturn-type to dg-bogus. * gcc.dg/pr60440-1.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2025-03-07c++: ICE with operator new[] in constexpr [PR118775]Marek Polacek3-0/+63
Here we ICE since r11-7740 because we no longer say that (long)&a (where a is a global var) is non_constant_p. So VERIFY_CONSTANT does not return and we crash on tree_to_uhwi. We should check tree_fits_uhwi_p before calling tree_to_uhwi. PR c++/118775 gcc/cp/ChangeLog: * constexpr.cc (cxx_eval_call_expression): Check tree_fits_uhwi_p. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/constexpr-new24.C: New test. * g++.dg/cpp2a/constexpr-new25.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>
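The "check that the value fits before converting" pattern that the fix applies can be sketched outside the compiler. ToyExpr, fits_unsigned and try_get_unsigned below are hypothetical stand-ins chosen for illustration; they are not GCC's tree API.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for an expression that may or may not have folded to an
// unsigned integer constant (e.g. an address cast to an integer has not).
struct ToyExpr {
  bool is_integer_constant;
  std::uint64_t value;  // only meaningful when is_integer_constant is true
};

// Plays the role of the "fits" predicate: safe to convert or not.
inline bool fits_unsigned(const ToyExpr &e) { return e.is_integer_constant; }

// Converts only after the predicate succeeds; reports failure instead of
// crashing when the expression is not a usable constant.
inline bool try_get_unsigned(const ToyExpr &e, std::uint64_t *out) {
  if (!fits_unsigned(e))
    return false;
  *out = e.value;
  return true;
}
```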
2025-03-07x86: Improve documentation for -msse4 [PR116708]Sandra Loosemore1-0/+5
gcc/ChangeLog PR target/116708 * doc/invoke.texi (x86 Options): Clarify how -msse4 and -mno-sse4 interact with other SSE options.
2025-03-07ipa-cp: Avoid ICE when redistributing nodes among edges to recursive clones ↵Martin Jambor1-1/+2
(PR 118318) PR 118318 reported an ICE during a PGO build of Firefox in IPA-CP, in the final stages of update_counts_for_self_gen_clones, where it attempts to guess how to distribute the profile count among clones created for recursive edges and the various edges that are created in the process. If one such edge has a profile count of kind GUESSED_GLOBAL0, the compatibility check in operator+ will lead to an ICE. After discussing the situation with Honza, we concluded that there is little more we can do other than check for this situation before touching the edge count, so this is what this patch does. gcc/ChangeLog: 2025-02-28 Martin Jambor <mjambor@suse.cz> PR ipa/118318 * ipa-cp.cc (adjust_clone_incoming_counts): Add a compatible_p check.
2025-03-07arm: testsuite: improve guard checks for arm_neon.hRichard Earnshaw5-13/+60
The header file arm_neon.h provides the Advanced SIMD intrinsics that are available on armv7 or later A & R profile cores. However, they are not compatible with M-profile and we also need to ensure that the FP instructions are enabled (with -mfloat-abi=softfp/hard). That leads to some complicated checking as arm_neon.h includes stdint.h and, at least on linux, that can require that the appropriate ABI bits/ headers are also installed. This patch adds a new check to target-supports.exp to establish the minimal set of option overrides needed to enable use of this header in a test. gcc/testsuite: * lib/target-supports.exp (check_effective_target_arm_neon_h_ok_nocache): New function. (check_effective_target_arm_neon_h_ok): Likewise. (add_options_for_arm_neon_h): Likewise. (check_effective_target_arm_libc_fp_abi_ok_nocache): Allow any Arm target, not just arm32. * gcc.target/arm/attr-neon-builtin-fail.c: Use it. * gcc.target/arm/attr-neon-builtin-fail2.c: Likewise. * gcc.target/arm/attr-neon-fp16.c: Likewise. * gcc.target/arm/attr-neon2.c: Likewise.
2025-03-07arm: make arm_neon.h compatible with '-march=<base> -mfloat-abi=softfp'Richard Earnshaw1-0/+17
With -mfpu set to auto, an architecture specification that lacks floating-point, but has -mfloat-abi=softfp will cause a misleading error. Specifically, if we have

    gcc -c test.c -mfloat-abi=softfp -march=armv7-a -mfpu=auto

where test.c contains #include <arm_neon.h> then we get a misleading error:

    test.c:11:2: error: #error "NEON intrinsics not available with the soft-float ABI. Please use -mfloat-abi=softfp or -mfloat-abi=hard"

... the error message is advising us to add -mfloat-abi=softfp when we already have it. The difficulty is that we can't directly detect the softfp abi from the available set of pre-defines. Consider the options in this table, assuming -mfpu=auto:

                        -mfloat-abi
                        hard             softfp         soft
                      +-----------------------------------------------
    -march=armv7-a    | *build-error*    __ARM_FP=0     __ARM_FP=0
    -march=armv7-a+fp | __ARM_FP=12      __ARM_FP=12    __ARM_FP=0

However, for the first line, if we subsequently add #pragma GCC target ("fpu=vfp") then the value of __ARM_FP will change as follows:

                        -mfloat-abi
                        hard             softfp         soft
                      +-----------------------------------------------
    -march=armv7-a    | *build-error*    __ARM_FP=12    __ARM_FP=0
    -march=armv7-a+fp | __ARM_FP=12      __ARM_FP=12    __ARM_FP=0

We can therefore distinguish between the soft and softfp ABIs by temporarily forcing VFP instructions into the ISA. If __ARM_FP is still zero after doing this then we must be using the soft ABI.

gcc: * config/arm/arm_neon.h: Try harder to detect if we have the softfp ABI enabled.
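The decision rule from the two tables can be written down as a small model. This is only an encoding of the table values quoted in the message, not the preprocessor logic in arm_neon.h; FloatAbi, modelled_arm_fp and is_soft_abi are hypothetical names, and -1 is an arbitrary marker for the *build-error* cells.

```cpp
#include <cassert>

enum class FloatAbi { Hard, SoftFp, Soft };

// Returns the __ARM_FP value listed in the tables (assuming -mfpu=auto),
// or -1 for the *build-error* cells. forced_vfp models the effect of
// temporarily adding the "fpu=vfp" target pragma.
inline int modelled_arm_fp(FloatAbi abi, bool arch_has_fp, bool forced_vfp) {
  if (abi == FloatAbi::Soft)
    return 0;                  // soft ABI: always 0 in both tables
  if (arch_has_fp)
    return 12;                 // -march=armv7-a+fp with hard or softfp
  if (abi == FloatAbi::Hard)
    return -1;                 // hard ABI without FP: build error
  return forced_vfp ? 12 : 0;  // softfp without FP in the architecture
}

// The detection idea: if the value is still zero after forcing VFP,
// the ABI must be soft.
inline bool is_soft_abi(FloatAbi abi, bool arch_has_fp) {
  return modelled_arm_fp(abi, arch_has_fp, /*forced_vfp=*/true) == 0;
}
```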
2025-03-07docs: Attempt to clarify complex literal suffixes [PR112960]Jakub Jelinek1-11/+16
This attempts to clarify Complex literal suffixes in the documentation. 2025-03-07 Jakub Jelinek <jakub@redhat.com> PR c/112960 PR c/117029 * doc/extend.texi (Complex): Add I and J suffixes to the list of complex suffixes, adjust for all of those being part of ISO C2Y, clarify that for -fno-ext-numeric-literals none of those are recognized as GNU extensions and for C++14 i is considered UDL even for -fext-numeric-literals when <complex> is included.
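For the C++14 case mentioned above, where i is treated as a user-defined literal once <complex> is included, a minimal example of the standard-library form looks like this. It shows only the std::complex_literals route and makes no claim about how the GNU extension suffixes behave under other option combinations; sample_value is a hypothetical name.

```cpp
#include <cassert>
#include <complex>

// Builds 1 + 2i using the C++14 standard-library literal operator""i,
// which yields a std::complex<double>.
inline std::complex<double> sample_value() {
  using namespace std::complex_literals;
  return 1.0 + 2.0i;
}
```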
2025-03-07vect: Fix build on MacOSSimon Martin1-0/+1
The build is broken on MacOS since r15-7881-ge8651b80aeb86d because tree-vect-data-refs.cc uses std::min but does not include <algorithm>. This patch fixes it by defining INCLUDE_ALGORITHM in that file. gcc/ChangeLog: * tree-vect-data-refs.cc: Define INCLUDE_ALGORITHM.
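The underlying portability point is that std::min is declared in <algorithm>, and some standard libraries do not pull it in transitively. A standalone illustration follows; inside GCC the header is requested through INCLUDE_ALGORITHM as the ChangeLog says, whereas this sketch includes it directly, and smaller_of is a hypothetical helper.

```cpp
#include <algorithm>  // declares std::min; do not rely on transitive includes
#include <cassert>

// Trivial wrapper so the dependency on <algorithm> is explicit.
inline unsigned smaller_of(unsigned a, unsigned b) { return std::min(a, b); }
```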
2025-03-07middle-end: delay checking for alignment to load [PR118464]Tamar Christina35-96/+509
This fixes two PRs on Early break vectorization by delaying the safety checks to vectorizable_load when the VF, VMAT and vectype are all known. This patch does add two new restrictions: 1. On LOAD_LANES targets, where the buffer size is known, we reject non-power-of-two group sizes, as they are unaligned every other iteration and so may cross a page unwittingly. For those cases we require partial masking support. 2. On LOAD_LANES targets when the buffer size is unknown, we reject vectorization if we cannot peel for alignment, as the alignment requirement is quite large at GROUP_SIZE * vectype_size. This is unlikely to ever be beneficial so we don't support it for now. There are other steps documented inside the code itself so that the reasoning is next to the code. As a fall-back, when the alignment fails we require partial vector support. For VLA targets like SVE we return element alignment as the desired vector alignment. This means that the loads are never misaligned and so, annoyingly, the vectorizer won't ever need to peel. So what I think needs to happen in GCC 16 is: 1. during vect_compute_data_ref_alignment we need to take the max of POLY_VALUE_MIN and vector_alignment. 2. in vect_do_peeling, define skip_vector when PFA for VLA, and in the guard add a check that ncopies * vectype does not exceed POLY_VALUE_MAX which we use as a proxy for pagesize. 3. Force LOOP_VINFO_USING_PARTIAL_VECTORS_P to be true in vect_determine_partial_vectors_and_peeling since the first iteration has to be partial. Require LOOP_VINFO_MUST_USE_PARTIAL_VECTORS_P otherwise we have to fail to vectorize. 4. Create a default mask to be used, so that vect_use_loop_mask_for_alignment_p becomes true and we generate the peeled check through loop control for partial loops. From what I can tell this won't work for LOOP_VINFO_FULLY_WITH_LENGTH_P since they don't have any peeling support at all in the compiler. That would need to be done independently from the above. 
In any case, not GCC 15 material so I've kept the WIP patches I have downstream. Bootstrapped Regtested on aarch64-none-linux-gnu, arm-none-linux-gnueabihf, x86_64-pc-linux-gnu -m32, -m64 and no issues. gcc/ChangeLog: PR tree-optimization/118464 PR tree-optimization/116855 * doc/invoke.texi (min-pagesize): Update docs with vectorizer use. * tree-vect-data-refs.cc (vect_analyze_early_break_dependences): Delay checks. (vect_compute_data_ref_alignment): Remove alignment checks and move to get_load_store_type, increase group access alignment. (vect_enhance_data_refs_alignment): Add note to comment needing investigating. (vect_analyze_data_refs_alignment): Likewise. (vect_supportable_dr_alignment): For group loads look at first DR. * tree-vect-stmts.cc (get_load_store_type): Perform safety checks for early break pfa. * tree-vectorizer.h (dr_set_safe_speculative_read_required, dr_safe_speculative_read_required, DR_SCALAR_KNOWN_BOUNDS): New. (need_peeling_for_alignment): Renamed to... (safe_speculative_read_required): .. This (class dr_vec_info): Add scalar_access_known_in_bounds. gcc/testsuite/ChangeLog: PR tree-optimization/118464 PR tree-optimization/116855 * gcc.dg/vect/bb-slp-pr65935.c: Update, it now vectorizes because the load type is relaxed later. * gcc.dg/vect/vect-early-break_121-pr114081.c: Update. * gcc.dg/vect/vect-early-break_22.c: Require partial vectors. * gcc.dg/vect/vect-early-break_128.c: Likewise. * gcc.dg/vect/vect-early-break_26.c: Likewise. * gcc.dg/vect/vect-early-break_43.c: Likewise. * gcc.dg/vect/vect-early-break_44.c: Likewise. * gcc.dg/vect/vect-early-break_2.c: Require load_lanes. * gcc.dg/vect/vect-early-break_7.c: Likewise. * gcc.dg/vect/vect-early-break_132-pr118464.c: New test. * gcc.dg/vect/vect-early-break_133_pfa1.c: New test. * gcc.dg/vect/vect-early-break_133_pfa11.c: New test. * gcc.dg/vect/vect-early-break_133_pfa10.c: New test. * gcc.dg/vect/vect-early-break_133_pfa2.c: New test. 
* gcc.dg/vect/vect-early-break_133_pfa3.c: New test. * gcc.dg/vect/vect-early-break_133_pfa4.c: New test. * gcc.dg/vect/vect-early-break_133_pfa5.c: New test. * gcc.dg/vect/vect-early-break_133_pfa6.c: New test. * gcc.dg/vect/vect-early-break_133_pfa7.c: New test. * gcc.dg/vect/vect-early-break_133_pfa8.c: New test. * gcc.dg/vect/vect-early-break_133_pfa9.c: New test. * gcc.dg/vect/vect-early-break_39.c: Update testcase for misalignment. * gcc.dg/vect/vect-early-break_18.c: Likewise. * gcc.dg/vect/vect-early-break_20.c: Likewise. * gcc.dg/vect/vect-early-break_21.c: Likewise. * gcc.dg/vect/vect-early-break_38.c: Likewise. * gcc.dg/vect/vect-early-break_6.c: Likewise. * gcc.dg/vect/vect-early-break_53.c: Likewise. * gcc.dg/vect/vect-early-break_56.c: Likewise. * gcc.dg/vect/vect-early-break_57.c: Likewise. * gcc.dg/vect/vect-early-break_81.c: Likewise.
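The page-crossing concern behind the first restriction in this commit can be illustrated with plain address arithmetic. This is a sketch under stated assumptions, not the vectorizer's actual check: crosses_page and is_power_of_two are hypothetical helpers, and the 4096-byte page size used in the usage note is assumed for illustration.

```cpp
#include <cassert>
#include <cstdint>

// True if the byte range [addr, addr + len) touches more than one page.
// Assumes len > 0 and page_size > 0.
inline bool crosses_page(std::uint64_t addr, std::uint64_t len,
                         std::uint64_t page_size) {
  return (addr / page_size) != ((addr + len - 1) / page_size);
}

// True for 1, 2, 4, 8, ...; false for 0 and for values such as 3 or 12.
inline bool is_power_of_two(std::uint64_t x) {
  return x != 0 && (x & (x - 1)) == 0;
}
```

With 16-byte accesses starting at a 16-byte-aligned address, every access stays inside one 4096-byte page. With 12-byte accesses (a group of three 4-byte elements) starting at address 0, the access beginning at byte 4092 spills into the next page, which is the kind of speculative read the restriction guards against.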
2025-03-07aarch64: add support for partial modes to last extractions [PR118464]Tamar Christina2-13/+17
The last extraction instructions work for both full and partial SVE vectors; however, we currently only define them for full vectors. Early break code for VLA now requires partial vector support, which relies on extract_last support. I have not added any new testcases as they overlap with the existing Early break tests, which now fail without this. gcc/ChangeLog: PR tree-optimization/118464 PR tree-optimization/116855 * config/aarch64/aarch64-sve.md (@extract_<last_op>_<mode>, @fold_extract_<last_op>_<mode>, @aarch64_fold_extract_vector_<last_op>_<mode>): Change SVE_FULL to SVE_ALL. * config/aarch64/iterators.md (vccore): Add more partial types.
2025-03-07tree-optimization/119145 - avoid stray .MASK_CALL after vectorizationRichard Biener2-1/+38
When we BB vectorize an if-converted loop body we make sure to not leave around .MASK_LOAD or .MASK_STORE created by if-conversion but we failed to check for .MASK_CALL. PR tree-optimization/119145 * tree-vectorizer.cc (try_vectorize_loop_1): Avoid BB vectorizing an if-converted loop body when there's a .MASK_CALL in the loop body. * gcc.dg/vect/pr119145.c: New testcase.
2025-03-07arm: Handle fixed PIC register in require_pic_register (PR target/115485)Christophe Lyon2-2/+19
Commit r9-4307-g89d7557202d25a forgot to accept a fixed PIC register when extending the assert in require_pic_register. arm_pic_register can be set explicitly by the user (e.g. -mpic-register=r9) or implicitly as the default value with -fpic/-fPIC/-fPIE and -mno-pic-data-is-text-relative -mlong-calls, and we want to use/accept it when recording cfun->machine->pic_reg as used to be the case. PR target/115485 gcc/ * config/arm/arm.cc (require_pic_register): Fix typos in comment. Handle fixed arm_pic_register. gcc/testsuite/ * g++.target/arm/pr115485.C: New test.
2025-03-07vect: Enforce dr_with_seg_len::align precondition [PR116125]Richard Sandiford3-3/+38
tree-data-ref.cc uses alignment information to try to optimise the code generated for alias checks. The assumption for "normal" non-grouped, full-width scalar accesses was that the access size would be a multiple of the alignment. As Richi notes in the PR, this is a documented precondition of dr_with_seg_len: /* The minimum common alignment of DR's start address, SEG_LEN and ACCESS_SIZE. */ unsigned int align; PR115192 was a case in which this assumption didn't hold. The access was part of an aligned 4-element group, but only the first 2 elements of the group were accessed. The alignment was therefore double the access size. In r15-820-ga0fe4fb1c8d78045 I'd "fixed" that by capping the alignment in one of the output routines. But I think that was misconceived. The precondition means that we should cap the alignment at source instead. Failure to do that caused a similar wrong-code bug in this PR, where the alignment comes from a short bitfield access rather than from a group access. gcc/ PR tree-optimization/116125 * tree-vect-data-refs.cc (vect_prune_runtime_alias_test_list): Make the dr_with_seg_len alignment fields describe the access sizes as well as the pointer alignment. * tree-data-ref.cc (create_intersect_range_checks): Don't compensate for invalid alignment fields here. gcc/testsuite/ PR tree-optimization/116125 * gcc.dg/vect/pr116125.c: New test.
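One way to picture "capping the alignment at source" is to compute the largest power of two that divides both the pointer alignment and the access size. The sketch below is an illustration of that arithmetic only; lowest_set_bit and common_alignment are hypothetical helpers, not the code in vect_prune_runtime_alias_test_list, and both inputs are assumed to be nonzero byte counts.

```cpp
#include <cassert>

// Isolates the lowest set bit of x, i.e. the largest power of two that
// divides x (for nonzero x).
inline unsigned lowest_set_bit(unsigned x) { return x & (~x + 1u); }

// Largest power of two dividing both values, so the result never exceeds
// what the access size itself guarantees.
inline unsigned common_alignment(unsigned ptr_align, unsigned access_size) {
  return lowest_set_bit(ptr_align | access_size);
}
```

For the PR115192 shape described above, a 16-byte-aligned group of which only 8 bytes are accessed gives a common alignment of 8 rather than 16.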
2025-03-07aarch64: Use force_lowpart_subreg in a BFI splitter [PR119133]Richard Sandiford2-2/+10
lowpart_subreg ICEs are the gift that keeps giving. This is another case where we need to use force_lowpart_subreg instead, to handle cases where the input is already a subreg and where the combined subreg is not allowed as a single operation. We don't need to check can_create_pseudo_p since the input should be a hard register rather than a subreg if !can_create_pseudo_p. gcc/ PR target/119133 * config/aarch64/aarch64.md (*aarch64_bfi<GPI:mode><ALLX:mode>_<SUBDI_BITS>): Use force_lowpart_subreg. gcc/testsuite/ PR target/119133 * gcc.dg/torture/pr119133.c: New test.
2025-03-07c++: Handle TU_LOCAL_ENTITY in tsubst_expr and potential_constant_expressionNathaniel Shead2-66/+19
This cleans up the TU_LOCAL_ENTITY handling to avoid unnecessary tree walks and make the logic more robust. gcc/cp/ChangeLog: * constexpr.cc (potential_constant_expression_1): Handle TU_LOCAL_ENTITY. * pt.cc (expr_contains_tu_local_entity): Remove. (function_contains_tu_local_entity): Remove. (dependent_operand_p): Remove special handling for TU_LOCAL_ENTITY. (tsubst_expr): Handle TU_LOCAL_ENTITY when tsubsting OVERLOADs; remove now-unnecessary extra handling. (type_dependent_expression_p): Handle TU_LOCAL_ENTITY. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Patrick Palka <ppalka@redhat.com> Reviewed-by: Jason Merrill <jason@redhat.com>
2025-03-07middle-end/118801 - excessive redundant DEBUG BEGIN_STMTRichard Biener1-0/+10
The following addresses the fact that we keep an excessive amount of redundant DEBUG BEGIN_STMTs - in the testcase it sums up to 99.999% of all stmts, sucking up compile-time in IL walks. The patch amends the GIMPLE DCE code that elides redundant DEBUG BIND stmts, also pruning uninterrupted sequences of DEBUG BEGIN_STMTs, keeping only the last of each set of DEBUG BEGIN_STMT with unique location. PR middle-end/118801 * tree-ssa-dce.cc (eliminate_unnecessary_stmts): Prune sequences of uninterrupted DEBUG BEGIN_STMTs, keeping only the last of a set with unique location.
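The pruning rule (within an uninterrupted run of markers, keep only the last one for each distinct location) can be modelled on a plain vector. ToyStmt and prune_begin_stmts are hypothetical; the real change lives in eliminate_unnecessary_stmts and operates on GIMPLE, which this sketch does not attempt to reproduce.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy statement: either a DEBUG BEGIN_STMT-like marker with a location id,
// or any other statement (which interrupts a run of markers).
struct ToyStmt {
  bool is_begin_stmt;
  int location;  // hypothetical source-location id
};

inline std::vector<ToyStmt> prune_begin_stmts(const std::vector<ToyStmt> &in) {
  std::vector<ToyStmt> kept_reversed;
  std::set<int> seen;  // locations already kept in the current run
  // Walk backwards so the first marker met for a location is the last one
  // in program order.
  for (auto it = in.rbegin(); it != in.rend(); ++it) {
    if (!it->is_begin_stmt) {
      seen.clear();  // a non-marker statement ends the current run
      kept_reversed.push_back(*it);
    } else if (seen.insert(it->location).second) {
      kept_reversed.push_back(*it);  // first time this location is seen
    }
    // Otherwise: an earlier duplicate marker in the same run, dropped.
  }
  return std::vector<ToyStmt>(kept_reversed.rbegin(), kept_reversed.rend());
}
```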
2025-03-07Documentation: Improve -Wstringop-overflow documentation [PR 113515]Sandra Loosemore1-6/+29
This option can warn about things other than string and memory functions. Say so explicitly, and give an example. I also did some copy-editing of the text and added some paragraph breaks. gcc/ChangeLog PR c/113515 * doc/invoke.texi (Warning Options): Improve -Wstringop-overflow documentation.
2025-03-07i386: Correct mask width for bf8->fp16 intrin on 256/512 bitHaochen Jiang4-8/+8
For the bf8 -> fp16 conversion, when dst is 256 bits wide the mask should be 16 bits, since 16*16=256, not the 8 bits used by the current intrinsic. The mask in the 512-bit intrinsic is likewise half the required width. This patch fixes both of them. gcc/ChangeLog: * config/i386/avx10_2-512convertintrin.h (_mm512_mask_cvtbf8_ph): Correct mask width. (_mm512_maskz_cvtbf8_ph): Ditto. * config/i386/avx10_2convertintrin.h (_mm256_mask_cvtbf8_ph): Ditto. (_mm256_maskz_cvtbf8_ph): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-512-convert-1.c: Change function call. * gcc.target/i386/avx10_2-convert-1.c: Ditto.
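The width arithmetic is simply one mask bit per destination element. The helper below is a hypothetical illustration of that calculation, not part of the intrinsic headers.

```cpp
#include <cassert>

// One mask bit per destination lane: lanes = vector width / element width.
constexpr unsigned mask_bits(unsigned vector_bits, unsigned element_bits) {
  return vector_bits / element_bits;
}

// fp16 destination elements are 16 bits wide.
static_assert(mask_bits(256, 16) == 16, "256-bit fp16 result needs 16 mask bits");
static_assert(mask_bits(512, 16) == 32, "512-bit fp16 result needs 32 mask bits");
```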
2025-03-07Daily bump.GCC Administrator6-1/+248
2025-03-06[PR rtl-optimization/119099] Avoid infinite loop in ext-dce.Alexey Merzlyakov2-10/+22
This fixes the ping-ponging of live sets in ext-dce which, if left unresolved, can lead to infinite loops in the ext-dce pass, as seen in the P1 regression 119099. At its core, instead of replacing the livein set with the just recomputed data, we IOR the just recomputed data into the existing livein set. That ensures the existing livein set never shrinks. Bootstrapped and regression tested on x86. I've also thrown this into my tester to verify it across multiple targets and that we aren't regressing the (limited) tests we have in place for ext-dce's optimization behavior. While it's a generic patch, I'll wait for the RISC-V tester to run its course before committing. PR rtl-optimization/119099 gcc/ * ext-dce.cc (ext_dce_rd_transfer_n): Do not allow the livein set to shrink. gcc/testsuite/ * gcc.dg/torture/pr119099.c: New test. Co-authored-by: Jeff Law <jlaw@ventanamicro.com>
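The monotonic update can be sketched with a bitmask standing in for the live set. LiveSet and merge_livein are hypothetical names; the real transfer function is ext_dce_rd_transfer_n and works on GCC bitmaps, which this toy does not model.

```cpp
#include <cassert>
#include <cstdint>

using LiveSet = std::uint64_t;  // one bit per tracked item (toy model)

// ORs the recomputed data into the existing set instead of replacing it,
// so the set can only grow. Returns true if the set changed, which is what
// a worklist solver would use to decide whether to keep iterating.
inline bool merge_livein(LiveSet &livein, LiveSet recomputed) {
  const LiveSet merged = livein | recomputed;
  const bool changed = merged != livein;
  livein = merged;
  return changed;
}
```

Because the set never shrinks and has finitely many bits, the "changed" result can be true only a bounded number of times, which rules out the ping-pong between two alternating sets.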
2025-03-06Fortran: improve checking of substring bounds [PR119118]Harald Anlauf5-3/+125
After the fix for pr98490 no substring bounds check was generated if the substring start was not a variable. While the purpose of that fix was to suppress a premature check before implied-do indices were substituted, this prevented a check if the substring start was an expression or a constant. A better solution is to defer the check until implied-do indices have been substituted in the start and end expressions. PR fortran/119118 gcc/fortran/ChangeLog: * dependency.cc (gfc_contains_implied_index_p): Helper function to determine if an expression has a dependence on an implied-do index. * dependency.h (gfc_contains_implied_index_p): Add prototype. * trans-expr.cc (gfc_conv_substring): Adjust logic to not generate substring bounds checks before implied-do indices have been substituted. gcc/testsuite/ChangeLog: * gfortran.dg/bounds_check_23.f90: Generalize test. * gfortran.dg/bounds_check_26.f90: New test.