path: root/gcc/testsuite/gcc.dg
Age  Commit message  Author  Files  Lines
2025-03-05  middle-end/97323 - TYPE_CANONICAL vs. ARRAY_TYPE modes  Richard Biener  1  -0/+5
For strict-alignment targets we can end up with BLKmode single-element array types when the element type is unaligned. This confuses type checking since the canonical type would have an aligned element type and a non-BLKmode mode. The following simply ignores the mode we assign to array types for this purpose, like we already do for record and union types.

PR middle-end/97323
* tree.cc (gimple_canonical_types_compatible_p): Ignore TYPE_MODE also for ARRAY_TYPE.
(verify_type): Likewise.
* gcc.dg/pr97323.c: New testcase.
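As a rough illustration of the kind of type described above, the following sketch declares a single-element array whose element type has reduced alignment. The typedef and function names are hypothetical and are not taken from gcc.dg/pr97323.c; whether such an array really gets BLKmode depends on the target.

```c
/* Hypothetical illustration; not the contents of gcc.dg/pr97323.c.  */

/* An int whose alignment is lowered to 1 byte (GNU extension).  */
typedef int unaligned_int __attribute__ ((aligned (1)));

/* A single-element array of the under-aligned type.  On a
   strict-alignment target such an array may be given BLKmode.  */
typedef unaligned_int one_elem[1];

/* Read the only element through a pointer to the array type.  */
int
first_elem (one_elem *p)
{
  return (*p)[0];
}
```

On targets that allow unaligned accesses the array would normally keep an integer mode, so the mismatch the commit addresses would not show up there.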
2025-03-04  testsuite: Add tests for already fixed PR [PR119071]  Jakub Jelinek  1  -0/+45
Uros' r15-7793 fixed this PR as well; I'm just committing tests from the PR so that it can be closed.

2025-03-04 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/119071
* gcc.dg/pr119071.c: New test.
* gcc.c-torture/execute/pr119071.c: New test.
2025-03-04  tree-optimization/119096 - bogus conditional reduction vectorization  Richard Biener  1  -0/+21
When we vectorize a .COND_ADD reduction and apply the single-use-def cycle optimization we can end up choosing the wrong else value for subsequent .COND_ADD. The following rectifies this.

PR tree-optimization/119096
* tree-vect-loop.cc (vect_transform_reduction): Use the correct else value for .COND_fn.
* gcc.dg/vect/pr119096.c: New testcase.
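For readers unfamiliar with conditional reductions, a source loop of roughly the following shape is what can be turned into a .COND_ADD reduction. This is only a sketch with hypothetical names, not the PR testcase: lanes where the condition is false must keep the running sum, which is the role of the else value.

```c
/* Hypothetical example of a conditional add reduction.  */

/* Sum a[i] only where c[i] is nonzero.  When the condition is false
   the accumulator must be left unchanged.  */
int
cond_sum (const int *a, const int *c, int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    if (c[i])
      sum += a[i];
  return sum;
}
```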
2025-03-03  tree-optimization/119057 - bogus double reduction detection  Richard Biener  1  -0/+19
We are detecting a cycle as double reduction where the inner loop cycle has extra out-of-loop uses. This clashes at least with assumptions from the SLP discovery code which says the cycle isn't reachable from another SLP instance. Supporting this case was also never intended; in fact, with GCC 14 we seem to generate wrong code here.

PR tree-optimization/119057
* tree-vect-loop.cc (check_reduction_path): Add argument specifying whether we're analyzing the inner loop of a double reduction. Do not allow extra uses outside of the double reduction cycle in this case.
(vect_is_simple_reduction): Adjust.
* gcc.dg/vect/pr119057.c: New testcase.
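A loop nest of the following shape may help picture the situation. It is a hypothetical sketch, not the contents of gcc.dg/vect/pr119057.c: the accumulator is carried through both loops, and its value after the inner loop is additionally stored, which is an extra use outside the inner loop.

```c
/* Hypothetical illustration of an inner-loop reduction value that
   has a use outside the inner loop.  */
#define N 4

int
sum_with_row_totals (int a[N][N], int row_total[N])
{
  int sum = 0;
  for (int i = 0; i < N; i++)
    {
      /* Inner loop of the would-be double reduction.  */
      for (int j = 0; j < N; j++)
        sum += a[i][j];
      /* Extra use of the inner-loop result in the outer loop.  */
      row_total[i] = sum;
    }
  return sum;
}
```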
2025-03-01  openmp: Fix up simd clone mask argument creation on x86 [PR115871]  Jakub Jelinek  1  -0/+10
The following testcase ICEs since r14-5057. The Intel vector ABI says that in the ZMM case the mask is passed in unsigned int or unsigned long long arguments; how many bits are used in each of them and how many such arguments there are is determined by the characteristic data type of the function. In the testcase simdlen is 32 and the characteristic data type is double, so the return value as well as the first argument are passed in 4 V8DFmode arguments and the mask is supposed to be passed in 4 unsigned int arguments (8 bits in each).

Before the r14-5057 change there was

      sc->args[i].orig_type = parm_type;
    ...
        case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_CONSTANT_STEP:
        case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_VARIABLE_STEP:
        case SIMD_CLONE_ARG_TYPE_VECTOR:
          if (INTEGRAL_TYPE_P (parm_type) || POINTER_TYPE_P (parm_type))
            veclen = sc->vecsize_int;
          else
            veclen = sc->vecsize_float;
          if (known_eq (veclen, 0U))
            veclen = sc->simdlen;
          else
            veclen
              = exact_div (veclen,
                           GET_MODE_BITSIZE (SCALAR_TYPE_MODE (parm_type)));

for the argument handling and

      if (sc->inbranch)
        {
          tree base_type = simd_clone_compute_base_data_type (sc->origin, sc);
    ...
          if (INTEGRAL_TYPE_P (base_type) || POINTER_TYPE_P (base_type))
            veclen = sc->vecsize_int;
          else
            veclen = sc->vecsize_float;
          if (known_eq (veclen, 0U))
            veclen = sc->simdlen;
          else
            veclen
              = exact_div (veclen,
                           GET_MODE_BITSIZE (SCALAR_TYPE_MODE (base_type)));

for the mask handling.

r14-5057 moved this argument creation later and unified that:

        case SIMD_CLONE_ARG_TYPE_MASK:
        case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_CONSTANT_STEP:
        case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_VARIABLE_STEP:
        case SIMD_CLONE_ARG_TYPE_VECTOR:
          if (sc->args[i].arg_type == SIMD_CLONE_ARG_TYPE_MASK
              && sc->mask_mode != VOIDmode)
            elem_type = boolean_type_node;
          else
            elem_type = TREE_TYPE (sc->args[i].vector_type);
          if (INTEGRAL_TYPE_P (elem_type) || POINTER_TYPE_P (elem_type))
            veclen = sc->vecsize_int;
          else
            veclen = sc->vecsize_float;
          if (known_eq (veclen, 0U))
            veclen = sc->simdlen;
          else
            veclen
              = exact_div (veclen,
                           GET_MODE_BITSIZE (SCALAR_TYPE_MODE (elem_type)));

This is correct for the argument cases (so linear or vector) (though POINTER_TYPE_P will never appear as TREE_TYPE of a vector), but the boolean_type_node in there is completely bogus: when using AVX512 integer masks, as described above, we need the characteristic data type, not bool. bool is also strange in that it has a bitsize of 8 (or 32 on darwin), while the masks are 1 bit per lane anyway. Fixed thusly.

2025-03-01 Jakub Jelinek <jakub@redhat.com>

PR middle-end/115871
* omp-simd-clone.cc (simd_clone_adjust): For SIMD_CLONE_ARG_TYPE_MASK and sc->mask_mode not VOIDmode, set elem_type to the characteristic type rather than boolean_type_node.
* gcc.dg/gomp/simd-clones-8.c: New test.
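The arithmetic in this message can be checked with a small sketch. The helper names below are hypothetical and only mirror the exact_div step quoted above, they are not GCC functions: with 512-bit vectors and a 64-bit characteristic type there are 8 lanes per argument, so simdlen 32 needs 4 arguments, whereas an 8-bit bool would give 64 lanes per argument, which is more than simdlen itself.

```c
/* Hypothetical helpers mirroring the veclen computation above.  */

/* Lanes per vector argument: vector size in bits divided by the
   element size in bits (the exact_div step).  */
unsigned
lanes_per_arg (unsigned vec_bits, unsigned elem_bits)
{
  return vec_bits / elem_bits;
}

/* Number of arguments needed to cover SIMDLEN lanes.  */
unsigned
num_args (unsigned simdlen, unsigned vec_bits, unsigned elem_bits)
{
  return simdlen / lanes_per_arg (vec_bits, elem_bits);
}
```

With the characteristic type double this yields the 4 mask arguments of 8 bits each that the message describes; plugging in the bool size instead yields zero arguments, which shows why the element type matters here.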
2025-02-28  c++: Adjust #embed support for P1967R14  Jakub Jelinek  1  -0/+14
Now that the #embed paper has been voted in, the following patch removes the pedwarn for C++26 on it (and adjusts pedwarn warning for older C++ versions) and predefines __cpp_pp_embed FTM. Also, the patch changes cpp_error to cpp_pedwarning with for C++ -Wc++26-extensions guarding, and for C add -Wc11-c23-compat warning about #embed. I believe we otherwise implement everything in the paper already, except I'm really confused by the [Example: #embed <data.dat> limit(__has_include("a.h")) #if __has_embed(<data.dat> limit(__has_include("a.h"))) // ill-formed: __has_include [cpp.cond] cannot appear here #endif — end example] part. My reading of both C23 and C++ with the P1967R14 paper in is that the first case (#embed with __has_include or __has_embed in its clauses) is what is clearly invalid and so the ill-formed note should be for #embed. And the __has_include/__has_embed in __has_embed is actually questionable. Both C and C++ have something like "The identifiers __has_include, __has_embed, and __has_c_attribute shall not appear in any context not mentioned in this subclause." or "The identifiers __has_include and __has_cpp_attribute shall not appear in any context not mentioned in this subclause." (into which P1967R14 adds __has_embed) in the conditional inclusion subclause. #embed is defined in a different one, so using those in there is invalid (unless "using the rules specified for conditional inclusion" wording e.g. in limit clause overrides that). The reason why I think it is fuzzy for __has_embed is that __has_embed is actually defined in the Conditional inclusion subclause (so that would mean one can use __has_include, __has_embed and __has_*attribute in there) but its clauses are described in a different one. 
GCC currently accepts

    #embed __FILE__ limit (__has_include (<stdarg.h>))
    #if __has_embed (__FILE__ limit (__has_include (<stdarg.h>)))
    #endif
    #embed __FILE__ limit (__has_embed (__FILE__))
    #if __has_embed (__FILE__ limit (__has_embed (__FILE__)))
    #endif

Note, it isn't just about limit clause, but also about prefix/suffix/if_empty, except that in those cases the "using the rules specified for conditional inclusion" doesn't apply. In any case, I'd hope that can be dealt with incrementally (and should be handled the same for both C and C++).

2025-02-28 Jakub Jelinek <jakub@redhat.com>

libcpp/
* include/cpplib.h (enum cpp_warning_reason): Add CPP_W_CXX26_EXTENSIONS enumerator.
* init.cc (lang_defaults): Set embed for GNUCXX26 and CXX26.
* directives.cc (do_embed): Adjust pedwarn wording for embed in C++, use cpp_pedwarning instead of cpp_error and add CPP_W_C11_C23_COMPAT warning if cpp_pedwarning hasn't diagnosed anything.
gcc/c-family/
* c.opt (Wc++26-extensions): Add CppReason(CPP_W_CXX26_EXTENSIONS).
* c-cppbuiltin.cc (c_cpp_builtins): Predefine __cpp_pp_embed=202502 for C++26.
gcc/testsuite/
* g++.dg/cpp/embed-1.C: Adjust for pedwarn wording change and don't expect any error for C++26.
* g++.dg/cpp/embed-2.C: Adjust for pedwarn wording change and don't expect any warning for C++26.
* g++.dg/cpp26/feat-cxx26.C: Test __cpp_pp_embed value.
* gcc.dg/cpp/embed-17.c: New test.
2025-02-28  lto/91299 - weak definition inlined with LTO  Richard Biener  2  -0/+22
The following fixes a thinko in the handling of interposed weak definitions which confused the interposition check in get_availability by setting DECL_EXTERNAL too early.

PR lto/91299
gcc/lto/
* lto-symtab.cc (lto_symtab_merge_symbols): Set DECL_EXTERNAL only after calling get_availability.
gcc/testsuite/
* gcc.dg/lto/pr91299_0.c: New testcase.
* gcc.dg/lto/pr91299_1.c: Likewise.
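As background, a weak definition is one that a strong definition in another translation unit may replace at link time, so its body must not be inlined into callers as if it were final. The sketch below uses hypothetical names and a single file, so it is not the two-file LTO testcase from the PR; it only shows what a weak definition and a call to it look like.

```c
/* Hypothetical single-file illustration of a weak definition.  */

/* Weak definition: a strong definition of answer in another
   translation unit would take precedence at link time.  */
__attribute__ ((weak)) int
answer (void)
{
  return 1;
}

/* Because answer may be interposed, this call has to go through the
   symbol rather than assume the body above.  */
int
call_answer (void)
{
  return answer ();
}
```

With no other definition linked in, call_answer simply returns the weak definition's value.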
2025-02-28  ifcvt: Fix ICE with (fix:SI (fix:DF (reg:DF))) [PR117712]  Jakub Jelinek  1  -0/+13
As documented in the manual, FIX/UNSIGNED_FIX from floating point mode to integral mode has unspecified rounding and FIX from floating point mode to the same floating point mode is expressing rounding toward zero. So, some targets (arc, arm, csky, m68k, mmix, nds32, pdp11, sparc and visium) use (fix:SI (fix:SF (match_operand:SF 1 "..._operand"))) etc. to express the rounding toward zero during conversion to integer. For some reason other targets don't use that.

Anyway, the 2 FIXes (or inner FIX with outer UNSIGNED_FIX) cause problems since r15-2890, which removed some strict checks in ifcvt.cc on what SET_SRC can be actually conditionalized (I must say I'm still worried about the change, don't know why one can't get e.g. inline asm or something with UNSPEC or some complex backend specific RTLs that force_operand can't handle). force_operand just ICEs on it, since it can only handle (through expand_fix) conversions from floating point to integral.

The following patch fixes this by detecting this case and pretending the inner FIX isn't there, i.e. calling expand_fix with the inner FIX's operand instead, which works and on targets like arm it will just create the nested FIXes again.

2025-02-28 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/117712
* expr.cc (force_operand): Handle {,UNSIGNED_}FIX with FIX operand using expand_fix on the inner FIX operand.
* gcc.dg/pr117712.c: New test.
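At the source level, the operation these RTL patterns implement is the ordinary C conversion from a floating point value to an integer, which discards the fractional part, i.e. rounds toward zero. A minimal sketch, with a hypothetical function name:

```c
/* Hypothetical helper: a C cast from double to int truncates toward
   zero, which is the rounding the nested FIX expresses.  */
int
to_int (double x)
{
  return (int) x;
}
```

For example, both 2.9 and -2.9 lose their fractional part rather than being rounded to the nearest integer.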
2025-02-27  input: Fix up ICEs with --param=file-cache-files=N for N > 16 [PR118860]  Jakub Jelinek  1  -0/+5
The following testcase ICEs, because we first construct file_cache object inside of *global_dc, then process options and then call file_cache::tune. The earlier construction allocates the m_file_slots array (using new) based on the static data member file_cache::num_file_slots, but then tune changes it, without actually reallocating all m_file_slots arrays in already constructed file_cache objects.

I think it is just weird to have the count be a static data member and the pointer be non-static data member, that is just asking for issues like this. So, this patch changes num_file_slots into m_num_file_slots, turns tune into a non-static member function and changes toplev.cc to call it on the global_dc->get_file_cache () object. It also lets tune just delete the array and allocate it freshly if there is a change in the number of slots or lines.

Note, file_cache_slot has a similar problem, but because there are many, I haven't moved the count into those objects; I just hope that when tune is called there is exactly one file_cache constructed and all the file_cache_slot objects constructed are pointed to by its m_file_slots member, so also on a lines change it just deletes the array and allocates it again. I think it should be unlikely that the cache actually has any used slots by the time it is called.

2025-02-27 Jakub Jelinek <jakub@redhat.com>

PR middle-end/118860
* input.h (file_cache::tune): No longer static. Rename argument from num_file_slots_ to num_file_slots. Formatting fix.
(file_cache::num_file_slots): Renamed to ...
(file_cache::m_num_file_slots): ... this. No longer static.
* input.cc (file_cache_slot::tune): Change return type from void to size_t, return previous file_cache_slot::line_record_size value. Formatting fixes.
(file_cache::tune): Rename argument from num_file_slots_ to num_file_slots. Set m_num_file_slots rather than num_file_slots. If m_num_file_slots or file_cache_slot::line_record_size changes, delete[] m_file_slots and new it again.
(file_cache::num_file_slots): Remove definition.
(file_cache::lookup_file): Use m_num_file_slots rather than num_file_slots.
(file_cache::evicted_cache_tab_entry): Likewise.
(file_cache::file_cache): Likewise. Initialize m_num_file_slots to 16.
(file_cache::dump): Use m_num_file_slots rather than num_file_slots.
(file_cache_slot::get_next_line): Formatting fixes.
(file_cache_slot::read_line_num): Likewise.
(get_source_text_between): Likewise.
* toplev.cc (toplev::main): Call global_dc->get_file_cache ().tune rather than file_cache::tune.
* gcc.dg/pr118860.c: New test.
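The design point, keeping the slot count next to the array it describes and reallocating when the count changes, can be sketched in plain C. The real code is C++ in input.cc; the struct and function names below are hypothetical and the slots are reduced to ints for brevity.

```c
#include <stdlib.h>

/* Hypothetical C rendition of the idea; not GCC's file_cache.  */
struct file_cache_sketch
{
  size_t num_slots;  /* Per-object count, stored next to the array.  */
  int *slots;        /* Array of num_slots entries.  */
};

/* Allocate the default 16 slots.  Returns 0 on success.  */
int
cache_init (struct file_cache_sketch *c)
{
  c->num_slots = 16;
  c->slots = calloc (c->num_slots, sizeof *c->slots);
  return c->slots ? 0 : -1;
}

/* Change the number of slots: if it differs, drop the old array and
   allocate a fresh one so count and array can never disagree.  */
int
cache_tune (struct file_cache_sketch *c, size_t num_slots)
{
  if (num_slots == c->num_slots)
    return 0;
  int *fresh = calloc (num_slots, sizeof *fresh);
  if (!fresh)
    return -1;
  free (c->slots);
  c->slots = fresh;
  c->num_slots = num_slots;
  return 0;
}

/* Release the array.  */
void
cache_free (struct file_cache_sketch *c)
{
  free (c->slots);
  c->slots = NULL;
  c->num_slots = 0;
}
```

With a shared static count, an already constructed object would keep its 16-entry array while lookups iterate up to the new, larger count, which is the out-of-bounds access behind the ICE described above.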
2025-02-27  [PR116336][LRA]: Add a test  Vladimir N. Makarov  1  -0/+16
The patch for PR116234 also fixes PR116336, so this commit adds only the test case, which is very different from the PR116234 one.

gcc/testsuite/ChangeLog:

PR rtl-optimization/116336
* gcc.dg/pr116336.c: New test.
2025-02-26  c: Assorted fixes for flexible array members in unions [PR119001]  Jakub Jelinek  2  -0/+55
r15-209 allowed flexible array members inside of unions, but as the following testcase shows, not everything has been adjusted for that. Unlike structures, in unions flexible array member (as an extension) can be any of the members, not just the last one, as in union all members are effectively last. The first hunk is about an ICE on the initialization of the FAM in union which is not the last FIELD_DECL with a string literal, the second hunk just formatting fix, third hunk fixes a bug in which we were just throwing away the initializers (except for with string literal) of FAMs in unions which aren't the last FIELD_DECL, and the last hunk is to diagnose FAM errors in unions the same as for structures, in particular trying to initialize a FAM with non-constant or initialization in nested context. 2025-02-26 Jakub Jelinek <jakub@redhat.com> PR c/119001 gcc/ * varasm.cc (output_constructor_regular_field): Don't fail assertion if next is non-NULL and FIELD_DECL if TREE_CODE (local->type) is UNION_TYPE. gcc/c/ * c-typeck.cc (pop_init_level): Don't clear constructor_type if DECL_CHAIN of constructor_fields is NULL but p->type is UNION_TYPE. Formatting fix. (process_init_element): Diagnose non-static initialization of flexible array member in union or FAM in union initialization in nested context. gcc/testsuite/ * gcc.dg/pr119001-1.c: New test. * gcc.dg/pr119001-2.c: New test.
2025-02-26  c: stddef.h C23 fixes [PR114870]  Jakub Jelinek  1  -0/+17
The stddef.h header for C23 defines __STDC_VERSION_STDDEF_H__ and unreachable macros multiple times in some cases. The header doesn't have normal multiple inclusion guard, because it supports for glibc inclusion with __need_{size_t,wchar_t,ptrdiff_t,wint_t,NULL}. While the definition of __STDC_VERSION_STDDEF_H__ and unreachable is done solely in the #ifdef _STDDEF_H part, so they are defined only if stddef.h is included without those __need_* macros defined. But actually once stddef.h is included without the __need_* macros, _STDDEF_H is then defined and while further stddef.h includes without __need_* macros don't do anything: #if (!defined(_STDDEF_H) && !defined(_STDDEF_H_) && !defined(_ANSI_STDDEF_H) \ && !defined(__STDDEF_H__)) \ || defined(__need_wchar_t) || defined(__need_size_t) \ || defined(__need_ptrdiff_t) || defined(__need_NULL) \ || defined(__need_wint_t) if one includes whole stddef.h first and then stddef.h with some of the __need_* macros defined, the #ifdef _STDDEF_H part is used again. It isn't that big deal for most cases, as it uses extra guarding macros like: #ifndef _GCC_MAX_ALIGN_T #define _GCC_MAX_ALIGN_T ... #endif etc., but for __STDC_VERSION_STDDEF_H__/unreachable nothing like that is used. So, either we do what the following patch does and just don't define __STDC_VERSION_STDDEF_H__/unreachable second time, or use #ifndef unreachable separately for the #define unreachable() case, or use new _GCC_STDC_VERSION_STDDEF_H macro to guard this (or two, one for __STDC_VERSION_STDDEF_H__ and one for unreachable), or rework the initial condition to be just #if !defined(_STDDEF_H) && !defined(_STDDEF_H_) && !defined(_ANSI_STDDEF_H) \ && !defined(__STDDEF_H__) - I really don't understand why the header should do anything at all after it has been included once without __need_* macros. But changing how this behaves after 35 years might be risky for various OS/libc combinations. 
2025-02-26 Jakub Jelinek <jakub@redhat.com> PR c/114870 * ginclude/stddef.h (__STDC_VERSION_STDDEF_H__, unreachable): Don't redefine multiple times if stddef.h is first included without __need_* defines and later with them. Move nullptr_t and unreachable and __STDC_VERSION_STDDEF_H__ definitions into the same defined (__STDC_VERSION__) && __STDC_VERSION__ > 201710L #if block. * gcc.dg/c23-stddef-2.c: New test.
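The inclusion sequence discussed in this message can be reproduced with a short sketch. Note that __need_size_t is an internal protocol between libc headers and the compiler's stddef.h, so ordinary user code would not normally define it; it is used here only to illustrate the order of includes, and the function name is hypothetical.

```c
/* Full include first: in GCC's header this defines _STDDEF_H.  */
#include <stddef.h>

/* Partial include afterwards, the way libc headers request only
   size_t.  This re-enters the header body described above.  */
#define __need_size_t
#include <stddef.h>

/* Hypothetical helper showing that size_t is usable afterwards.  */
size_t
size_of_size_t (void)
{
  return sizeof (size_t);
}
```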
2025-02-26  [testsuite] adjust expectations of x86 vect-simd-clone tests  Alexandre Oliva  4  -2/+10
Some vect-simd-clone tests fail when targeting ancient x86 variants, because the expected transformations only take place with -msse4 or higher. So arrange for these tests to take an -msse4 option on x86, so that the expected vectorization takes place, but decay to a compile test if vect.exp would enable execution but the target doesn't have an sse4 runtime. This requires the new dg-do-if to override the action on a target while retaining the default action on others, instead of disabling the test. We can count on avx512f compile-time support for these tests, because vect_simd_clones requires that on x86, and that implies sse4 support, so we need not complicate the scan conditionals with tests for sse4, except on the last test. for gcc/ChangeLog * doc/sourcebuild.texi (dg-do-if): Document. for gcc/testsuite/ChangeLog * lib/target-supports-dg.exp (dg-do-if): New. * gcc.dg/vect/vect-simd-clone-16f.c: Use -msse4 on x86, and skip in case execution is enabled but the runtime isn't. * gcc.dg/vect/vect-simd-clone-17f.c: Likewise. * gcc.dg/vect/vect-simd-clone-18f.c: Likewise. * gcc.dg/vect/vect-simd-clone-20.c: Likewise, but only skip the scan test.
2025-02-26  testsuite: Add pragma novector to more tests [PR118464]  Tamar Christina  21  -1/+22
These loops will now vectorize the entry finding loops. As such we get more failures because they were not expecting to be vectorized. Fixed by adding #pragma GCC novector. gcc/testsuite/ChangeLog: PR tree-optimization/118464 PR tree-optimization/116855 * g++.dg/ext/pragma-unroll-lambda-lto.C: Add pragma novector. * gcc.dg/tree-ssa/gen-vect-2.c: Likewise. * gcc.dg/tree-ssa/gen-vect-25.c: Likewise. * gcc.dg/tree-ssa/gen-vect-32.c: Likewise. * gcc.dg/tree-ssa/ivopt_mult_2g.c: Likewise. * gcc.dg/tree-ssa/ivopts-5.c: Likewise. * gcc.dg/tree-ssa/ivopts-6.c: Likewise. * gcc.dg/tree-ssa/ivopts-7.c: Likewise. * gcc.dg/tree-ssa/ivopts-8.c: Likewise. * gcc.dg/tree-ssa/ivopts-9.c: Likewise. * gcc.dg/tree-ssa/predcom-dse-1.c: Likewise. * gcc.dg/tree-ssa/predcom-dse-10.c: Likewise. * gcc.dg/tree-ssa/predcom-dse-11.c: Likewise. * gcc.dg/tree-ssa/predcom-dse-12.c: Likewise. * gcc.dg/tree-ssa/predcom-dse-2.c: Likewise. * gcc.dg/tree-ssa/predcom-dse-3.c: Likewise. * gcc.dg/tree-ssa/predcom-dse-4.c: Likewise. * gcc.dg/tree-ssa/predcom-dse-5.c: Likewise. * gcc.dg/tree-ssa/predcom-dse-6.c: Likewise. * gcc.dg/tree-ssa/predcom-dse-7.c: Likewise. * gcc.dg/tree-ssa/predcom-dse-8.c: Likewise. * gcc.dg/tree-ssa/predcom-dse-9.c: Likewise. * gcc.target/i386/pr90178.c: Likewise.
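For reference, this is roughly what such an annotation looks like on an early-exit search loop. The function is a hypothetical example rather than one of the listed tests; compilers that do not know the pragma are expected to ignore it, possibly with a warning.

```c
/* Hypothetical example of a search loop kept scalar on purpose.  */
int
find_first (const int *a, int n, int key)
{
  /* Ask GCC not to vectorize the loop that follows.  */
#pragma GCC novector
  for (int i = 0; i < n; i++)
    if (a[i] == key)
      return i;
  return -1;
}
```

The pragma changes only how the loop is compiled, not its result, which is why the tests can use it to keep their checking loops out of the vectorizer's statistics.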
2025-02-24  RISC-V: Include pattern stmts for dynamic LMUL computation [PR114516].  Robin Dapp  1  -0/+29
When scanning for program points, i.e. vector statements, we're missing pattern statements. In PR114516 this becomes obvious as we choose LMUL=8 assuming there are only three statements but the divmod pattern adds another three. Those push us beyond four registers so we need to switch to LMUL=4. This patch adds pattern statements to the program points which helps calculate a better register pressure estimate. PR target/114516 gcc/ChangeLog: * config/riscv/riscv-vector-costs.cc (compute_estimated_lmul): Add pattern statements to program points. gcc/testsuite/ChangeLog: * gcc.dg/vect/costmodel/riscv/rvv/pr114516.c: New test.
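The register arithmetic behind this can be approximated with a deliberately simplified model. It assumes that every live vector value occupies one group of LMUL registers out of the 32 RVV vector registers; GCC's real estimate in riscv-vector-costs.cc is based on live ranges and is more involved, and the function name below is hypothetical.

```c
/* Simplified, hypothetical model; not GCC's cost computation.  */
#define RVV_VECTOR_REGS 32

/* Pick the largest LMUL (8, 4, 2 or 1) such that LIVE_VALUES
   register groups of that size still fit into the register file.  */
int
pick_lmul (int live_values)
{
  for (int lmul = 8; lmul > 1; lmul /= 2)
    if (live_values * lmul <= RVV_VECTOR_REGS)
      return lmul;
  return 1;
}
```

Under this model three live values fit at LMUL=8, while the six values that result once the divmod pattern statements are counted only fit at LMUL=4, matching the outcome described in the message.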
2025-02-24  Use nonnull_if_nonzero attribute rather than nonnull on various builtins [PR117023]  Jakub Jelinek  8  -48/+317
On top of the
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668554.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668699.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668700.html
patches the following patch adds nonnull_if_nonzero attribute(s) to various builtins instead of or in addition to nonnull attribute.

The patch adjusts builtins (when we have them) corresponding to the APIs mentioned in the C2Y N3322 paper:
1) strndup and memset get one nonnull_if_nonzero attribute instead of nonnull
2) memcpy, memmove, strncpy, memcmp, strncmp get two nonnull_if_nonzero attributes instead of nonnull
3) strncat has nonnull without argument changed to nonnull (1) and gets one nonnull_if_nonzero for the src argument (maybe it needs to be clarified in C2Y, but I really think first argument to strncat and wcsncat shouldn't be NULL even for n == 0, because NULL doesn't point to a NUL-terminated string and one can't append anything to it; and various implementations in the wild including glibc will crash with NULL first argument (x86_64 avx+ doesn't though))

Such changes are done also to the _chk suffixed counterparts of the builtins.

Furthermore I've changed a couple of builtins for POSIX functions which aren't covered by ISO C, but I'd expect if/when POSIX incorporates C2Y it would do the same changes. In particular
4) strnlen gets one nonnull_if_nonzero instead of nonnull
5) mempcpy and stpncpy get two nonnull_if_nonzero instead of nonnull and lose returns_nonnull attribute; this is kind of unfortunate but I think in the spirit of N3322 mempcpy (NULL, src, 0) should return NULL (i.e. dest + n aka NULL + 0, now valid) and it is hard to express returns non-NULL if first argument is non-NULL or third argument is non-zero

I'm not really sure about fread/fwrite, N3322 doesn't mention those, can the first argument be NULL if third argument is 0? What about if second argument is 0? Can the fourth argument be NULL in such cases?
And of course, when not using builtins the glibc headers will affect stuff too, so we'll need to wait for N3322 implementation there too (possibly by dropping the nonnull attributes and perhaps conditionally replacing them with this new one if the compiler supports them). 2025-02-24 Jakub Jelinek <jakub@redhat.com> PR c/117023 gcc/ * builtin-attrs.def (ATTR_NONNULL_IF_NONZERO): New DEF_ATTR_IDENT. (ATTR_NOTHROW_NONNULL_IF12_LEAF, ATTR_NOTHROW_NONNULL_IF13_LEAF, ATTR_NOTHROW_NONNULL_IF123_LEAF, ATTR_NOTHROW_NONNULL_IF23_LEAF, ATTR_NOTHROW_NONNULL_1_IF23_LEAF, ATTR_PURE_NOTHROW_NONNULL_IF12_LEAF, ATTR_PURE_NOTHROW_NONNULL_IF13_LEAF, ATTR_PURE_NOTHROW_NONNULL_IF123_LEAF, ATTR_WARN_UNUSED_RESULT_NOTHROW_NONNULL_IF12_LEAF, ATTR_MALLOC_WARN_UNUSED_RESULT_NOTHROW_NONNULL_IF12_LEAF): New DEF_ATTR_TREE_LIST. * builtins.def (BUILT_IN_STRNDUP): Use ATTR_MALLOC_WARN_UNUSED_RESULT_NOTHROW_NONNULL_IF12_LEAF instead of ATTR_MALLOC_WARN_UNUSED_RESULT_NOTHROW_NONNULL_LEAF. (BUILT_IN_STRNCAT, BUILT_IN_STRNCAT_CHK): Use ATTR_NOTHROW_NONNULL_1_IF23_LEAF instead of ATTR_NOTHROW_NONNULL_LEAF. (BUILT_IN_BCOPY, BUILT_IN_MEMCPY, BUILT_IN_MEMCPY_CHK, BUILT_IN_MEMMOVE, BUILT_IN_MEMMOVE_CHK, BUILT_IN_STRNCPY, BUILT_IN_STRNCPY_CHK): Use ATTR_NOTHROW_NONNULL_IF123_LEAF instead of ATTR_NOTHROW_NONNULL_LEAF. (BUILT_IN_MEMPCPY, BUILT_IN_MEMPCPY_CHK, BUILT_IN_STPNCPY, BUILT_IN_STPNCPY_CHK): Use ATTR_NOTHROW_NONNULL_IF123_LEAF instead of ATTR_RETNONNULL_NOTHROW_LEAF. (BUILT_IN_BZERO, BUILT_IN_MEMSET, BUILT_IN_MEMSET_CHK): Use ATTR_NOTHROW_NONNULL_IF13_LEAF instead of ATTR_NOTHROW_NONNULL_LEAF. (BUILT_IN_BCMP, BUILT_IN_MEMCMP, BUILT_IN_STRNCASECMP, BUILT_IN_STRNCMP): Use ATTR_PURE_NOTHROW_NONNULL_IF123_LEAF instead of ATTR_PURE_NOTHROW_NONNULL_LEAF. (BUILT_IN_STRNLEN): Use ATTR_PURE_NOTHROW_NONNULL_IF12_LEAF instead of ATTR_PURE_NOTHROW_NONNULL_LEAF. (BUILT_IN_MEMCHR): Use ATTR_PURE_NOTHROW_NONNULL_IF13_LEAF instead of ATTR_PURE_NOTHROW_NONNULL_LEAF. 
gcc/testsuite/ * gcc.dg/builtins-nonnull.c (test_memfuncs, test_memfuncs_chk, test_strfuncs, test_strfuncs_chk): Add if (n == 0) return; at the start of the functions. * gcc.dg/Wnonnull-2.c: Copy __builtin_* call statements where appropriate 3 times, once with 0 length, once with n and once with non-zero constant and expect warning only in the third case. Formatting fixes. * gcc.dg/Wnonnull-3.c: Copy __builtin_* call statements where appropriate 3 times, once with 0 length, once with n and once with n guarded with n != 0 and expect warning only in the third case. Formatting fixes. * gcc.dg/nonnull-3.c (foo): Use 16 instead of 0 in the calls added for PR80936. * gcc.dg/nonnull-11.c: New test. * c-c++-common/ubsan/nonnull-1.c: Don't expect runtime diagnostics for the __builtin_memcpy call. * gcc.dg/tree-ssa/pr78154.c (f): Add dn argument and return early if it is NULL. Duplicate cases of builtins which have the first argument changed from nonnull to nonnull_if_nonzero except stpncpy, once with dn as first argument instead of d and once with constant non-zero count rather than n. Disable the stpncpy non-null check. * gcc.dg/Wbuiltin-declaration-mismatch-14.c (test_builtin_calls): Triplicate the strncmp calls, once with 1 last argument and expect warning, once with n last argument and don't expect warning and once with 0 last argument and don't expect warning. * gcc.dg/Wbuiltin-declaration-mismatch-15.c (test_builtin_calls_fe): Likewise.
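To show how the attribute reads on a user-declared function, here is a sketch of a memcpy-like helper. The macro and function names are hypothetical; the attribute is applied only when the compiler reports support for it, and otherwise the macro expands to nothing.

```c
#include <stddef.h>

/* Use nonnull_if_nonzero only where the compiler supports it.  */
#ifdef __has_attribute
# if __has_attribute (nonnull_if_nonzero)
#  define NONNULL_IF_NONZERO(ptr, cnt) \
     __attribute__ ((nonnull_if_nonzero (ptr, cnt)))
# endif
#endif
#ifndef NONNULL_IF_NONZERO
# define NONNULL_IF_NONZERO(ptr, cnt)
#endif

/* Hypothetical memcpy-like helper: arguments 1 and 2 must be
   non-null only when argument 3 (the length) is nonzero.  */
NONNULL_IF_NONZERO (1, 3) NONNULL_IF_NONZERO (2, 3)
void *copy_bytes (void *dst, const void *src, size_t n);

void *
copy_bytes (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  /* With n == 0 the pointers are never dereferenced.  */
  for (size_t i = 0; i < n; i++)
    d[i] = s[i];
  return dst;
}
```

A call such as copy_bytes (NULL, NULL, 0) is then acceptable, while a call with a null pointer and a nonzero constant length is the case the updated -Wnonnull tests expect to be diagnosed.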
2025-02-22  Turn test cases into UNSUPPORTED if running into 'sorry, unimplemented: dynamic stack allocation not supported'  Thomas Schwinge  197  -292/+94
In Subversion r217296 (Git commit e2acc079ff125a869159be45371dc0a29b230e92) "Testsuite alloca fixes for ptx", effective-target 'alloca' was added to mark up test cases that run into the nvptx back end's non-support of dynamic stack allocation. (Later, nvptx gained conditional support for that in commit 3861d362ec7e3c50742fc43833fe9d8674f4070e "nvptx: PTX 'alloca' for '-mptx=7.3'+, '-march=sm_52'+ [PR65181]", but on the other hand, in commit f93a612fc4567652b75ffc916d31a446378e6613 "bpf: liberate R9 for general register allocation", the BPF back end joined "the list of targets that do not support alloca in target-support.exp".)

Manually maintaining the list of test cases requiring effective-target 'alloca' is notoriously hard and gets out of date quickly: new test cases added to the test suite may need to be analyzed and annotated, and over time annotations also may need to be removed, in cases where the compiler learns to optimize out 'alloca'/VLA usage, for example. This commit replaces (99 % of) the manual annotations with an automatic scheme: turn test cases into UNSUPPORTED if running into 'sorry, unimplemented: dynamic stack allocation not supported'.

gcc/testsuite/ * lib/target-supports.exp (check_effective_target_alloca): Gracefully handle the case that we've not been called (indirectly) from 'dg-test'. * lib/gcc-dg.exp (proc gcc-dg-prune): Turn 'sorry, unimplemented: dynamic stack allocation not supported' into UNSUPPORTED. * c-c++-common/Walloca-larger-than.c: Don't 'dg-require-effective-target alloca'. * c-c++-common/Warray-bounds-9.c: Likewise. * c-c++-common/Warray-bounds.c: Likewise. * c-c++-common/Wdangling-pointer-2.c: Likewise. * c-c++-common/Wdangling-pointer-4.c: Likewise. * c-c++-common/Wdangling-pointer-5.c: Likewise. * c-c++-common/Wdangling-pointer.c: Likewise. * c-c++-common/Wimplicit-fallthrough-7.c: Likewise. * c-c++-common/Wsizeof-pointer-memaccess1.c: Likewise. 
* c-c++-common/Wsizeof-pointer-memaccess2.c: Likewise. * c-c++-common/Wstringop-truncation.c: Likewise. * c-c++-common/Wunused-var-6.c: Likewise. * c-c++-common/Wunused-var-8.c: Likewise. * c-c++-common/analyzer/alloca-leak.c: Likewise. * c-c++-common/analyzer/allocation-size-multiline-2.c: Likewise. * c-c++-common/analyzer/allocation-size-multiline-3.c: Likewise. * c-c++-common/analyzer/capacity-1.c: Likewise. * c-c++-common/analyzer/capacity-3.c: Likewise. * c-c++-common/analyzer/imprecise-floating-point-1.c: Likewise. * c-c++-common/analyzer/infinite-recursion-alloca.c: Likewise. * c-c++-common/analyzer/malloc-callbacks.c: Likewise. * c-c++-common/analyzer/malloc-paths-8.c: Likewise. * c-c++-common/analyzer/out-of-bounds-5.c: Likewise. * c-c++-common/analyzer/out-of-bounds-diagram-11.c: Likewise. * c-c++-common/analyzer/uninit-alloca.c: Likewise. * c-c++-common/analyzer/write-to-string-literal-5.c: Likewise. * c-c++-common/asan/alloca_loop_unpoisoning.c: Likewise. * c-c++-common/auto-init-11.c: Likewise. * c-c++-common/auto-init-12.c: Likewise. * c-c++-common/auto-init-15.c: Likewise. * c-c++-common/auto-init-16.c: Likewise. * c-c++-common/builtins.c: Likewise. * c-c++-common/dwarf2/vla1.c: Likewise. * c-c++-common/gomp/pr61486-2.c: Likewise. * c-c++-common/torture/builtin-clear-padding-4.c: Likewise. * c-c++-common/torture/strub-run3.c: Likewise. * c-c++-common/torture/strub-run4.c: Likewise. * c-c++-common/torture/strub-run4c.c: Likewise. * c-c++-common/torture/strub-run4d.c: Likewise. * c-c++-common/torture/strub-run4i.c: Likewise. * g++.dg/Walloca1.C: Likewise. * g++.dg/Walloca2.C: Likewise. * g++.dg/cpp0x/pr70338.C: Likewise. * g++.dg/cpp1y/lambda-generic-vla1.C: Likewise. * g++.dg/cpp1y/vla10.C: Likewise. * g++.dg/cpp1y/vla2.C: Likewise. * g++.dg/cpp1y/vla6.C: Likewise. * g++.dg/cpp1y/vla8.C: Likewise. * g++.dg/debug/debug5.C: Likewise. * g++.dg/debug/debug6.C: Likewise. * g++.dg/debug/pr54828.C: Likewise. * g++.dg/diagnostic/pr70105.C: Likewise. 
* g++.dg/eh/cleanup5.C: Likewise. * g++.dg/eh/spbp.C: Likewise. * g++.dg/ext/builtin_alloca.C: Likewise. * g++.dg/ext/tmplattr9.C: Likewise. * g++.dg/ext/vla10.C: Likewise. * g++.dg/ext/vla11.C: Likewise. * g++.dg/ext/vla12.C: Likewise. * g++.dg/ext/vla15.C: Likewise. * g++.dg/ext/vla16.C: Likewise. * g++.dg/ext/vla17.C: Likewise. * g++.dg/ext/vla23.C: Likewise. * g++.dg/ext/vla3.C: Likewise. * g++.dg/ext/vla6.C: Likewise. * g++.dg/ext/vla7.C: Likewise. * g++.dg/init/array24.C: Likewise. * g++.dg/init/new47.C: Likewise. * g++.dg/init/pr55497.C: Likewise. * g++.dg/opt/pr78201.C: Likewise. * g++.dg/template/vla2.C: Likewise. * g++.dg/torture/Wsizeof-pointer-memaccess1.C: Likewise. * g++.dg/torture/Wsizeof-pointer-memaccess2.C: Likewise. * g++.dg/torture/pr62127.C: Likewise. * g++.dg/torture/pr67055.C: Likewise. * g++.dg/torture/stackalign/eh-alloca-1.C: Likewise. * g++.dg/torture/stackalign/eh-inline-2.C: Likewise. * g++.dg/torture/stackalign/eh-vararg-1.C: Likewise. * g++.dg/torture/stackalign/eh-vararg-2.C: Likewise. * g++.dg/warn/Wplacement-new-size-5.C: Likewise. * g++.dg/warn/Wsizeof-pointer-memaccess-1.C: Likewise. * g++.dg/warn/Wvla-1.C: Likewise. * g++.dg/warn/Wvla-3.C: Likewise. * g++.old-deja/g++.ext/array2.C: Likewise. * g++.old-deja/g++.ext/constructor.C: Likewise. * g++.old-deja/g++.law/builtin1.C: Likewise. * g++.old-deja/g++.other/crash12.C: Likewise. * g++.old-deja/g++.other/eh3.C: Likewise. * g++.old-deja/g++.pt/array6.C: Likewise. * g++.old-deja/g++.pt/dynarray.C: Likewise. * gcc.c-torture/compile/20000923-1.c: Likewise. * gcc.c-torture/compile/20030224-1.c: Likewise. * gcc.c-torture/compile/20071108-1.c: Likewise. * gcc.c-torture/compile/20071117-1.c: Likewise. * gcc.c-torture/compile/900313-1.c: Likewise. * gcc.c-torture/compile/parms.c: Likewise. * gcc.c-torture/compile/pr17397.c: Likewise. * gcc.c-torture/compile/pr35006.c: Likewise. * gcc.c-torture/compile/pr42956.c: Likewise. * gcc.c-torture/compile/pr51354.c: Likewise. 
* gcc.c-torture/compile/pr52714.c: Likewise. * gcc.c-torture/compile/pr55851.c: Likewise. * gcc.c-torture/compile/pr77754-1.c: Likewise. * gcc.c-torture/compile/pr77754-2.c: Likewise. * gcc.c-torture/compile/pr77754-3.c: Likewise. * gcc.c-torture/compile/pr77754-4.c: Likewise. * gcc.c-torture/compile/pr77754-5.c: Likewise. * gcc.c-torture/compile/pr77754-6.c: Likewise. * gcc.c-torture/compile/pr78439.c: Likewise. * gcc.c-torture/compile/pr79413.c: Likewise. * gcc.c-torture/compile/pr82564.c: Likewise. * gcc.c-torture/compile/pr87110.c: Likewise. * gcc.c-torture/compile/pr99787-1.c: Likewise. * gcc.c-torture/compile/vla-const-1.c: Likewise. * gcc.c-torture/compile/vla-const-2.c: Likewise. * gcc.c-torture/execute/20010209-1.c: Likewise. * gcc.c-torture/execute/20020314-1.c: Likewise. * gcc.c-torture/execute/20020412-1.c: Likewise. * gcc.c-torture/execute/20021113-1.c: Likewise. * gcc.c-torture/execute/20040223-1.c: Likewise. * gcc.c-torture/execute/20040308-1.c: Likewise. * gcc.c-torture/execute/20040811-1.c: Likewise. * gcc.c-torture/execute/20070824-1.c: Likewise. * gcc.c-torture/execute/20070919-1.c: Likewise. * gcc.c-torture/execute/built-in-setjmp.c: Likewise. * gcc.c-torture/execute/pr22061-1.c: Likewise. * gcc.c-torture/execute/pr43220.c: Likewise. * gcc.c-torture/execute/pr82210.c: Likewise. * gcc.c-torture/execute/pr86528.c: Likewise. * gcc.c-torture/execute/vla-dealloc-1.c: Likewise. * gcc.dg/20001012-2.c: Likewise. * gcc.dg/20020415-1.c: Likewise. * gcc.dg/20030331-2.c: Likewise. * gcc.dg/20101010-1.c: Likewise. * gcc.dg/Walloca-1.c: Likewise. * gcc.dg/Walloca-10.c: Likewise. * gcc.dg/Walloca-11.c: Likewise. * gcc.dg/Walloca-12.c: Likewise. * gcc.dg/Walloca-13.c: Likewise. * gcc.dg/Walloca-14.c: Likewise. * gcc.dg/Walloca-15.c: Likewise. * gcc.dg/Walloca-2.c: Likewise. * gcc.dg/Walloca-3.c: Likewise. * gcc.dg/Walloca-4.c: Likewise. * gcc.dg/Walloca-5.c: Likewise. * gcc.dg/Walloca-6.c: Likewise. * gcc.dg/Walloca-7.c: Likewise. 
* gcc.dg/Walloca-8.c: Likewise. * gcc.dg/Walloca-9.c: Likewise. * gcc.dg/Walloca-larger-than-2.c: Likewise. * gcc.dg/Walloca-larger-than-3.c: Likewise. * gcc.dg/Walloca-larger-than-4.c: Likewise. * gcc.dg/Walloca-larger-than.c: Likewise. * gcc.dg/Warray-bounds-22.c: Likewise. * gcc.dg/Warray-bounds-41.c: Likewise. * gcc.dg/Warray-bounds-46.c: Likewise. * gcc.dg/Warray-bounds-48-novec.c: Likewise. * gcc.dg/Warray-bounds-48.c: Likewise. * gcc.dg/Warray-bounds-50.c: Likewise. * gcc.dg/Warray-bounds-63.c: Likewise. * gcc.dg/Warray-bounds-66.c: Likewise. * gcc.dg/Wdangling-pointer.c: Likewise. * gcc.dg/Wfree-nonheap-object-2.c: Likewise. * gcc.dg/Wfree-nonheap-object.c: Likewise. * gcc.dg/Wrestrict-17.c: Likewise. * gcc.dg/Wrestrict.c: Likewise. * gcc.dg/Wreturn-local-addr-2.c: Likewise. * gcc.dg/Wreturn-local-addr-3.c: Likewise. * gcc.dg/Wreturn-local-addr-4.c: Likewise. * gcc.dg/Wreturn-local-addr-6.c: Likewise. * gcc.dg/Wsizeof-pointer-memaccess1.c: Likewise. * gcc.dg/Wstack-usage.c: Likewise. * gcc.dg/Wstrict-aliasing-bogus-vla-1.c: Likewise. * gcc.dg/Wstrict-overflow-27.c: Likewise. * gcc.dg/Wstringop-overflow-15.c: Likewise. * gcc.dg/Wstringop-overflow-23.c: Likewise. * gcc.dg/Wstringop-overflow-25.c: Likewise. * gcc.dg/Wstringop-overflow-27.c: Likewise. * gcc.dg/Wstringop-overflow-3.c: Likewise. * gcc.dg/Wstringop-overflow-39.c: Likewise. * gcc.dg/Wstringop-overflow-56.c: Likewise. * gcc.dg/Wstringop-overflow-57.c: Likewise. * gcc.dg/Wstringop-overflow-67.c: Likewise. * gcc.dg/Wstringop-overflow-71.c: Likewise. * gcc.dg/Wstringop-truncation-3.c: Likewise. * gcc.dg/Wvla-larger-than-1.c: Likewise. * gcc.dg/Wvla-larger-than-2.c: Likewise. * gcc.dg/Wvla-larger-than-3.c: Likewise. * gcc.dg/Wvla-larger-than-4.c: Likewise. * gcc.dg/Wvla-larger-than-5.c: Likewise. * gcc.dg/analyzer/boxed-malloc-1.c: Likewise. * gcc.dg/analyzer/call-summaries-2.c: Likewise. * gcc.dg/analyzer/malloc-1.c: Likewise. * gcc.dg/analyzer/malloc-reuse.c: Likewise. 
* gcc.dg/analyzer/out-of-bounds-diagram-12.c: Likewise. * gcc.dg/analyzer/pr93355-localealias.c: Likewise. * gcc.dg/analyzer/putenv-1.c: Likewise. * gcc.dg/analyzer/taint-alloc-1.c: Likewise. * gcc.dg/analyzer/torture/pr93373.c: Likewise. * gcc.dg/analyzer/torture/ubsan-1.c: Likewise. * gcc.dg/analyzer/vla-1.c: Likewise. * gcc.dg/atomic/stdatomic-vm.c: Likewise. * gcc.dg/attr-alloc_size-6.c: Likewise. * gcc.dg/attr-alloc_size-7.c: Likewise. * gcc.dg/attr-alloc_size-8.c: Likewise. * gcc.dg/attr-alloc_size-9.c: Likewise. * gcc.dg/attr-noipa.c: Likewise. * gcc.dg/auto-init-uninit-36.c: Likewise. * gcc.dg/auto-init-uninit-9.c: Likewise. * gcc.dg/auto-type-1.c: Likewise. * gcc.dg/builtin-alloc-size.c: Likewise. * gcc.dg/builtin-dynamic-alloc-size.c: Likewise. * gcc.dg/builtin-dynamic-object-size-1.c: Likewise. * gcc.dg/builtin-dynamic-object-size-2.c: Likewise. * gcc.dg/builtin-dynamic-object-size-3.c: Likewise. * gcc.dg/builtin-dynamic-object-size-4.c: Likewise. * gcc.dg/builtin-object-size-1.c: Likewise. * gcc.dg/builtin-object-size-2.c: Likewise. * gcc.dg/builtin-object-size-3.c: Likewise. * gcc.dg/builtin-object-size-4.c: Likewise. * gcc.dg/builtins-64.c: Likewise. * gcc.dg/builtins-68.c: Likewise. * gcc.dg/c23-auto-2.c: Likewise. * gcc.dg/c99-const-expr-13.c: Likewise. * gcc.dg/c99-vla-1.c: Likewise. * gcc.dg/fold-alloca-1.c: Likewise. * gcc.dg/gomp/pr30494.c: Likewise. * gcc.dg/gomp/vla-2.c: Likewise. * gcc.dg/gomp/vla-3.c: Likewise. * gcc.dg/gomp/vla-4.c: Likewise. * gcc.dg/gomp/vla-5.c: Likewise. * gcc.dg/graphite/pr99085.c: Likewise. * gcc.dg/guality/guality.c: Likewise. * gcc.dg/lto/pr80778_0.c: Likewise. * gcc.dg/nested-func-10.c: Likewise. * gcc.dg/nested-func-12.c: Likewise. * gcc.dg/nested-func-13.c: Likewise. * gcc.dg/nested-func-14.c: Likewise. * gcc.dg/nested-func-15.c: Likewise. * gcc.dg/nested-func-16.c: Likewise. * gcc.dg/nested-func-17.c: Likewise. * gcc.dg/nested-func-9.c: Likewise. * gcc.dg/packed-vla.c: Likewise. * gcc.dg/pr100225.c: Likewise. 
* gcc.dg/pr25682.c: Likewise. * gcc.dg/pr27301.c: Likewise. * gcc.dg/pr31507-1.c: Likewise. * gcc.dg/pr33238.c: Likewise. * gcc.dg/pr41470.c: Likewise. * gcc.dg/pr49120.c: Likewise. * gcc.dg/pr50764.c: Likewise. * gcc.dg/pr51491-2.c: Likewise. * gcc.dg/pr51990-2.c: Likewise. * gcc.dg/pr51990.c: Likewise. * gcc.dg/pr59011.c: Likewise. * gcc.dg/pr59523.c: Likewise. * gcc.dg/pr61561.c: Likewise. * gcc.dg/pr78468.c: Likewise. * gcc.dg/pr78902.c: Likewise. * gcc.dg/pr79972.c: Likewise. * gcc.dg/pr82875.c: Likewise. * gcc.dg/pr83844.c: Likewise. * gcc.dg/pr84131.c: Likewise. * gcc.dg/pr87099.c: Likewise. * gcc.dg/pr87320.c: Likewise. * gcc.dg/pr89045.c: Likewise. * gcc.dg/pr91014.c: Likewise. * gcc.dg/pr93986.c: Likewise. * gcc.dg/pr98721-1.c: Likewise. * gcc.dg/pr99122-2.c: Likewise. * gcc.dg/shrink-wrap-alloca.c: Likewise. * gcc.dg/sso-14.c: Likewise. * gcc.dg/strlenopt-62.c: Likewise. * gcc.dg/strlenopt-83.c: Likewise. * gcc.dg/strlenopt-84.c: Likewise. * gcc.dg/strlenopt-91.c: Likewise. * gcc.dg/torture/Wsizeof-pointer-memaccess1.c: Likewise. * gcc.dg/torture/calleesave-sse.c: Likewise. * gcc.dg/torture/pr48953.c: Likewise. * gcc.dg/torture/pr71881.c: Likewise. * gcc.dg/torture/pr71901.c: Likewise. * gcc.dg/torture/pr78742.c: Likewise. * gcc.dg/torture/pr92088-1.c: Likewise. * gcc.dg/torture/pr92088-2.c: Likewise. * gcc.dg/torture/pr93124.c: Likewise. * gcc.dg/torture/pr94479.c: Likewise. * gcc.dg/torture/stackalign/alloca-1.c: Likewise. * gcc.dg/torture/stackalign/inline-2.c: Likewise. * gcc.dg/torture/stackalign/nested-3.c: Likewise. * gcc.dg/torture/stackalign/vararg-1.c: Likewise. * gcc.dg/torture/stackalign/vararg-2.c: Likewise. * gcc.dg/tree-ssa/20030807-2.c: Likewise. * gcc.dg/tree-ssa/20080530.c: Likewise. * gcc.dg/tree-ssa/alias-37.c: Likewise. * gcc.dg/tree-ssa/builtin-sprintf-warn-22.c: Likewise. * gcc.dg/tree-ssa/builtin-sprintf-warn-25.c: Likewise. * gcc.dg/tree-ssa/builtin-sprintf-warn-3.c: Likewise. * gcc.dg/tree-ssa/loop-interchange-15.c: Likewise. 
* gcc.dg/tree-ssa/pr23848-1.c: Likewise. * gcc.dg/tree-ssa/pr23848-2.c: Likewise. * gcc.dg/tree-ssa/pr23848-3.c: Likewise. * gcc.dg/tree-ssa/pr23848-4.c: Likewise. * gcc.dg/uninit-32.c: Likewise. * gcc.dg/uninit-36.c: Likewise. * gcc.dg/uninit-39.c: Likewise. * gcc.dg/uninit-41.c: Likewise. * gcc.dg/uninit-9-O0.c: Likewise. * gcc.dg/uninit-9.c: Likewise. * gcc.dg/uninit-pr100250.c: Likewise. * gcc.dg/uninit-pr101300.c: Likewise. * gcc.dg/uninit-pr101494.c: Likewise. * gcc.dg/uninit-pr98583.c: Likewise. * gcc.dg/vla-2.c: Likewise. * gcc.dg/vla-22.c: Likewise. * gcc.dg/vla-24.c: Likewise. * gcc.dg/vla-3.c: Likewise. * gcc.dg/vla-4.c: Likewise. * gcc.dg/vla-stexp-1.c: Likewise. * gcc.dg/vla-stexp-2.c: Likewise. * gcc.dg/vla-stexp-4.c: Likewise. * gcc.dg/vla-stexp-5.c: Likewise. * gcc.dg/winline-7.c: Likewise. * gcc.target/aarch64/stack-check-alloca-1.c: Likewise. * gcc.target/aarch64/stack-check-alloca-10.c: Likewise. * gcc.target/aarch64/stack-check-alloca-2.c: Likewise. * gcc.target/aarch64/stack-check-alloca-3.c: Likewise. * gcc.target/aarch64/stack-check-alloca-4.c: Likewise. * gcc.target/aarch64/stack-check-alloca-5.c: Likewise. * gcc.target/aarch64/stack-check-alloca-6.c: Likewise. * gcc.target/aarch64/stack-check-alloca-7.c: Likewise. * gcc.target/aarch64/stack-check-alloca-8.c: Likewise. * gcc.target/aarch64/stack-check-alloca-9.c: Likewise. * gcc.target/arc/interrupt-6.c: Likewise. * gcc.target/i386/pr80969-3.c: Likewise. * gcc.target/loongarch/stack-check-alloca-1.c: Likewise. * gcc.target/loongarch/stack-check-alloca-2.c: Likewise. * gcc.target/loongarch/stack-check-alloca-3.c: Likewise. * gcc.target/loongarch/stack-check-alloca-4.c: Likewise. * gcc.target/loongarch/stack-check-alloca-5.c: Likewise. * gcc.target/loongarch/stack-check-alloca-6.c: Likewise. * gcc.target/riscv/stack-check-alloca-1.c: Likewise. * gcc.target/riscv/stack-check-alloca-10.c: Likewise. * gcc.target/riscv/stack-check-alloca-2.c: Likewise. 
* gcc.target/riscv/stack-check-alloca-3.c: Likewise. * gcc.target/riscv/stack-check-alloca-4.c: Likewise. * gcc.target/riscv/stack-check-alloca-5.c: Likewise. * gcc.target/riscv/stack-check-alloca-6.c: Likewise. * gcc.target/riscv/stack-check-alloca-7.c: Likewise. * gcc.target/riscv/stack-check-alloca-8.c: Likewise. * gcc.target/riscv/stack-check-alloca-9.c: Likewise. * gcc.target/sparc/setjmp-1.c: Likewise. * gcc.target/x86_64/abi/ms-sysv/ms-sysv.c: Likewise. * gcc.c-torture/compile/20001221-1.c: Don't 'dg-skip-if' for '! alloca'. * gcc.c-torture/compile/20020807-1.c: Likewise. * gcc.c-torture/compile/20050801-2.c: Likewise. * gcc.c-torture/compile/920428-4.c: Likewise. * gcc.c-torture/compile/debugvlafunction-1.c: Likewise. * gcc.c-torture/compile/pr41469.c: Likewise. * gcc.c-torture/execute/920721-2.c: Likewise. * gcc.c-torture/execute/920929-1.c: Likewise. * gcc.c-torture/execute/921017-1.c: Likewise. * gcc.c-torture/execute/941202-1.c: Likewise. * gcc.c-torture/execute/align-nest.c: Likewise. * gcc.c-torture/execute/alloca-1.c: Likewise. * gcc.c-torture/execute/pr22061-4.c: Likewise. * gcc.c-torture/execute/pr36321.c: Likewise. * gcc.dg/torture/pr8081.c: Likewise. * gcc.dg/analyzer/data-model-1.c: Don't 'dg-require-effective-target alloca'. XFAIL relevant 'dg-warning's for '! alloca'. * gcc.dg/uninit-38.c: Likewise. * gcc.dg/uninit-pr98578.c: Likewise. * gcc.dg/compat/struct-by-value-22_main.c: Comment on 'dg-require-effective-target alloca'. libstdc++-v3/ * testsuite/lib/prune.exp (proc libstdc++-dg-prune): Turn 'sorry, unimplemented: dynamic stack allocation not supported' into UNSUPPORTED.
2025-02-21tree-optimization/118954 - avoid UB on ref created by predcomRichard Biener1-0/+22
When predictive commoning moves an invariant ref it makes sure to not build a MEM_REF with a base that is negatively offsetted from an object. But in trying to preserve some transforms it does not consider association of a constant offset with the address computation in DR_BASE_ADDRESS leading to exactly this problem again. This is arguably a problem in data-ref analysis producing such an out-of-bound DR_BASE_ADDRESS, but this looks quite involved to fix, so the following avoids the association in one more case. This fixes the testcase while preserving the desired transform in gcc.dg/tree-ssa/predcom-1.c. PR tree-optimization/118954 * tree-predcom.cc (ref_at_iteration): Make sure to not associate the constant offset with DR_BASE_ADDRESS when that is an offsetted pointer. * gcc.dg/torture/pr118954.c: New testcase.
2025-02-19analyzer: handle more IFN_UBSAN_* as no-ops [PR118300]David Malcolm1-0/+15
Previously the analyzer treated IFN_UBSAN_BOUNDS as a no-op, but the other IFN_UBSAN_* were unrecognized and conservatively treated as having arbitrary behavior. Treat IFN_UBSAN_NULL and IFN_UBSAN_PTR also as no-ops, which should make -fanalyzer behave better with -fsanitize=undefined. gcc/analyzer/ChangeLog: PR analyzer/118300 * kf.cc (class kf_ubsan_bounds): Replace this with... (class kf_ubsan_noop): ...this. (register_sanitizer_builtins): Use it to handle IFN_UBSAN_NULL, IFN_UBSAN_BOUNDS, and IFN_UBSAN_PTR as no-ops. (register_known_functions): Drop handling of IFN_UBSAN_BOUNDS here, as it's now handled by register_sanitizer_builtins above. gcc/testsuite/ChangeLog: PR analyzer/118300 * gcc.dg/analyzer/ubsan-pr118300.c: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2025-02-18testsuite: Include stdint.h instead of stdint-gcc.h in some testsJohn David Anglin5-5/+5
When use_gcc_stdint=provide, the stdint-gcc.h header is not provided. 2025-02-18 John David Anglin <danglin@gcc.gnu.org> gcc/testsuite/ChangeLog: PR testsuite/116986 * gcc.dg/crc-builtin-rev-target32.c: Include stdint.h instead of stdint-gcc.h. * gcc.dg/crc-builtin-rev-target64.c: Likewise. * gcc.dg/crc-builtin-target32.c: Likewise. * gcc.dg/crc-builtin-target64.c: Likewise. * gcc.dg/torture/pr115387-2.c: Likewise.
2025-02-18tree-optimization/98845 - ICE with tail-merging and DCE/DSE disabledRichard Biener2-1/+38
The following shows that tail-merging will make dead SSA defs live in paths where it wasn't before, possibly introducing UB or as in this case, uses of abnormals that eventually fail coalescing later. The fix is to register such defs for stmt comparison. PR tree-optimization/98845 * tree-ssa-tail-merge.cc (stmt_local_def): Consider a def with no uses not local. * gcc.dg/pr98845.c: New testcase. * gcc.dg/pr81192.c: Adjust.
2025-02-17[ifcombine] cope with signbit tests of extended valuesAlexandre Oliva1-0/+20
A compare with zero may be taken as a sign bit test by fold_truth_andor_for_ifcombine, but the operand may be extended from a narrower field. If the operand was narrower, the bitsize will reflect the narrowing conversion, but if it was wider, we'll only know whether the field is sign- or zero-extended from unsignedp, but we won't know whether it needed to be extended, because arg will have changed to the narrower variable when we get to the point in which we can compute the arg width. If it's sign-extended, we're testing the right bit, but if it's zero-extended, there isn't any bit we can test. Instead of punting and leaving the foldable compare to be figured out by another pass, arrange for the sign bit resulting from the widening zero-extension to be taken as zero, so that the modified compare will yield the desired result. While at that, avoid swapping the right-hand compare operands when we've already determined that it was a signbit test: it is no use to even try. for gcc/ChangeLog PR tree-optimization/118805 * gimple-fold.cc (fold_truth_andor_for_combine): Detect and cope with zero-extension in signbit tests. Reject swapping right-compare operands if rsignbit. for gcc/testsuite/ChangeLog PR tree-optimization/118805 * gcc.dg/field-merge-26.c: New.
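The difference between the two extensions described above can be sketched in plain C. This is only an illustration of the source-level semantics, assuming 8-bit chars; the helper names are hypothetical and are not GCC functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helpers for illustration; these are not GCC functions.  */

/* The field is zero-extended, so the widened sign bit is always 0 and
   the compare with zero can never be true.  */
bool zext_field_negative_p (unsigned char field)
{
  int widened = field;
  return widened < 0;
}

/* The field is sign-extended, so the widened sign bit mirrors bit 7 of
   the narrow field, and testing that bit is equivalent.  */
bool sext_field_negative_p (signed char field)
{
  int widened = field;
  bool negative = widened < 0;
  assert (negative == ((field & 0x80) != 0));
  return negative;
}
```

For a zero-extended field the test folds to false regardless of the field's top bit, which matches the commit's choice to take the widened sign bit as zero.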
2025-02-17middle-end: Fixup constant integers when expanding __builtin_crc [PR118288]Uros Bizjak1-0/+8
Constant integers with MSB set have to be represented as corresponding signed integers. Use gen_int_mode to emit them in the correct way. PR middle-end/118288 gcc/ChangeLog: * builtins.cc (expand_builtin_crc_table_based): Use gen_int_mode to emit constant integers with MSB set. gcc/testsuite/ChangeLog: * gcc.dg/pr118288.c: New test.
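As a rough sketch of what "represented as corresponding signed integers" means, the following C function sign-extends the low bits of a value into a 64-bit host integer. It is a simplified stand-in written for this note, not the implementation of gen_int_mode; the function name and the 64-bit host width are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in, not GCC code: keep the low BITS bits of VALUE
   and sign-extend them to 64 bits.  The final conversion to int64_t
   relies on the modulo behaviour GCC documents for out-of-range
   unsigned-to-signed conversions.  */
int64_t sign_extend_to_width (uint64_t value, unsigned bits)
{
  assert (bits >= 1 && bits <= 64);
  if (bits == 64)
    return (int64_t) value;
  uint64_t mask = (UINT64_C (1) << bits) - 1;
  uint64_t sign = UINT64_C (1) << (bits - 1);
  value &= mask;
  return (int64_t) ((value ^ sign) - sign);
}
```

A 32-bit constant such as 0x80000000 therefore becomes a negative host value, while 0x7fffffff is left unchanged.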
2025-02-17tree-optimization/118895 - ICE during PRERichard Biener1-0/+13
When we simplify a NARY during PHI translation we have to make sure to not inject not available operands into it given that might violate the valueization hook constraints and we'd pick up invalid context-sensitive data in further simplification or as in this case later ICE when we try to insert the expression. PR tree-optimization/118895 * tree-ssa-sccvn.cc (vn_nary_build_or_lookup_1): Only allow CSE if we can verify the result is available. * gcc.dg/pr118895.c: New testcase.
2025-02-15[PR tree-optimization/98028] Use relationship between operands to simplify ↵Jakub Jelinek1-0/+26
SUB_OVERFLOW So this is a fairly old regression, but with all the ranger work that's been done, it's become easy to resolve. The basic idea here is to use known relationships between two operands of a SUB_OVERFLOW IFN to statically compute the overflow state and ultimately allow turning the IFN into simple arithmetic (or for the tests in this BZ elide the arithmetic entirely). The regression example is when the two inputs are known equal. In that case the subtraction will never overflow. But there are a few other cases we can handle as well: a == b -> never overflows; a > b -> never overflows when A and B are unsigned; a >= b -> never overflows when A and B are unsigned; a < b -> always overflows when A and B are unsigned. Bootstrapped and regression tested on x86, and regression tested on the usual cross platforms. This is Jakub's version of the vr-values.cc fix rather than Jeff's. PR tree-optimization/98028 gcc/ * vr-values.cc (check_for_binary_op_overflow): Try to use a known relationship between op0/op1 to statically determine overflow state. gcc/testsuite * gcc.dg/tree-ssa/pr98028.c: New test.
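The relationships listed in this commit message can be observed at the source level with __builtin_sub_overflow. The sketch below only demonstrates the arithmetic facts; the function names are made up for this note, and it does not reproduce the added testcase or show what the optimizer emits.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helpers for illustration; not taken from pr98028.c.  */

/* a == b: the signed subtraction a - a is 0 and never overflows.  */
bool sub_equal_overflows (int a)
{
  int r;
  return __builtin_sub_overflow (a, a, &r);
}

/* a >= b with unsigned operands: a - b cannot wrap.  */
bool sub_ge_overflows (unsigned a, unsigned b)
{
  unsigned r;
  assert (a >= b);
  return __builtin_sub_overflow (a, b, &r);
}

/* a < b with unsigned operands: a - b always wraps.  */
bool sub_lt_overflows (unsigned a, unsigned b)
{
  unsigned r;
  assert (a < b);
  return __builtin_sub_overflow (a, b, &r);
}
```

When the compiler can prove one of these relationships, the overflow flag is a compile-time constant, which is what allows the IFN to become plain arithmetic.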
2025-02-14tree-optimization/118852 - wrong code with 502.gcc_rRichard Biener1-0/+105
502.gcc_r when built with -fprofile-generate exposes a SLP discovery issue where an IV forced live due to early break is not properly discovered if its latch def is part of a different IV's SSA cycle. To mitigate this we have to make sure to create an SLP instance for the original IV. Ideally we'd handle all vect_induction_def the same but this is left for next stage1. PR tree-optimization/118852 * tree-vect-slp.cc (vect_analyze_slp): For early-break forced-live IVs make sure we create an appropriate entry into the SLP graph. * gcc.dg/vect/pr118852.c: New testcase.
2025-02-13tree-optimization/118817 - fix ICE with VN CTOR simplificationRichard Biener1-0/+14
The representation of CONSTRUCTOR nodes in VN NARY and gimple_match_op do not agree so do not attempt to marshal between them. PR tree-optimization/118817 * tree-ssa-sccvn.cc (vn_nary_simplify): Do not process CONSTRUCTOR NARY or update from CONSTRUCTOR simplified gimple_match_op. * gcc.dg/pr118817.c: New testcase.
2025-02-10ipa-cp: Perform operations in the appropriate types (PR 118097)Martin Jambor3-0/+57
One of the testcases from PR 118097 and the one from PR 118535 show that the fix to PR 118138 was incomplete. We must not only make sure that (intermediate) results of operations performed by IPA-CP are fold_converted to the type of the destination formal parameter but we also must decouple these types from the ones in which operations are performed. This patch does that, even though we do not store or stream the operation types, instead we simply limit ourselves to tcc_comparisons and operations for which the first operand and the result are of the same type as determined by expr_type_first_operand_type_p. If we wanted to go beyond these, we would indeed need to store/stream the respective operation type. ipa_value_from_jfunc needs an additional check that res_type is not NULL because it is not called just from within IPA-CP (where we know we have a destination lattice slot belonging to a defined parameter) but also from inlining, ipa-fnsummary and ipa-modref where it is used to examine a call to a function with variadic arguments and we do not have types for the unknown parameters. But we cannot really work with those or estimate any benefits when it comes to them, so ignoring them should be OK. Even after this patch, ipa_get_jf_arith_result has a parameter called res_type in which it performs operations for aggregate jump functions, where we do not allow type conversions when constructing the jump functions and the type is the type of the stored data. In GCC 16, we could relax this and allow conversions like for scalars. gcc/ChangeLog: 2025-01-20 Martin Jambor <mjambor@suse.cz> PR ipa/118097 * ipa-cp.cc (ipa_get_jf_arith_result): Adjust comment. (ipa_get_jf_pass_through_result): Removed. (ipa_value_from_jfunc): Use directly ipa_get_jf_arith_result, do not specify operation type but make sure we check and possibly convert the result. (get_val_across_arith_op): Remove the last parameter, always pass NULL_TREE to ipa_get_jf_arith_result in its last argument. 
(propagate_vals_across_arith_jfunc): Do not pass res_type to get_val_across_arith_op. (propagate_vals_across_pass_through): Add checking assert that parm_type is not NULL. gcc/testsuite/ChangeLog: 2025-01-24 Martin Jambor <mjambor@suse.cz> PR ipa/118097 * gcc.dg/ipa/pr118097.c: New test. * gcc.dg/ipa/pr118535.c: Likewise. * gcc.dg/ipa/ipa-notypes-1.c: Likewise.
2025-02-10testsuite: Fix two testisms on x86 after PFA [PR118754]Tamar Christina1-0/+2
These two tests now vectorize the result finding loop with PFA and so the number of loops checked fails. This fixes them by adding #pragma GCC novector to the testcases. gcc/testsuite/ChangeLog: PR testsuite/118754 * gcc.dg/vect/vect-tail-nomask-1.c: Add novector. * gcc.target/i386/pr106010-8c.c: Likewise.
2025-02-08GCN, nvptx: 'sorry, unimplemented: exception handling not supported'Thomas Schwinge1-2/+0
For GCN, this avoids ICEs further down the compilation pipeline. For nvptx, there's effectively no change: in presence of exception handling constructs, instead of 'sorry, unimplemented: target cannot support nonlocal goto', we now emit 'sorry, unimplemented: exception handling not supported'. Additionally, turn test cases into UNSUPPORTED if running into 'sorry, unimplemented: exception handling not supported'. gcc/ * config/gcn/gcn.md (exception_receiver): 'define_expand'. * config/nvptx/nvptx.md (exception_receiver): Likewise. gcc/testsuite/ * lib/gcc-dg.exp (gcc-dg-prune): Turn 'sorry, unimplemented: exception handling not supported' into UNSUPPORTED. * gcc.dg/pr104464.c: Remove GCN XFAIL. libstdc++-v3/ * testsuite/lib/prune.exp (libstdc++-dg-prune): Turn 'sorry, unimplemented: exception handling not supported' into UNSUPPORTED.
2025-02-08For a few test cases, clarify dependence on effective-target 'nonlocal_goto' ↵Thomas Schwinge4-4/+0
into 'exceptions' For example, for nvptx, these test cases currently indeed fail with 'sorry, unimplemented: target cannot support nonlocal goto'. However, that's just an artefact of non-existing support for exception handling, and these test cases already require effective-target 'exceptions'. gcc/testsuite/ * gcc.dg/cleanup-12.c: Don't 'dg-skip-if "" { ! nonlocal_goto }'. * gcc.dg/cleanup-13.c: Likewise. * gcc.dg/cleanup-5.c: Likewise. * gcc.dg/gimplefe-44.c: Don't 'dg-require-effective-target nonlocal_goto'.
2025-02-08'gcc.dg/pr88870.c': don't 'dg-require-effective-target nonlocal_goto'Thomas Schwinge1-1/+0
I confirm that back then, 'gcc.dg/pr88870.c' for nvptx failed due to 'sorry, unimplemented: target cannot support nonlocal goto', however at some (indeterminate) point in time, that must've disappeared, and we now don't have to 'dg-require-effective-target nonlocal_goto' anymore, and therefore get: [-UNSUPPORTED:-]{+PASS:+} gcc.dg/pr88870.c {+(test for excess errors)+} (And, if ever necessary again, this nowadays probably should 'dg-require-effective-target exceptions' instead of 'nonlocal_goto'.) gcc/testsuite/ * gcc.dg/pr88870.c: Don't 'dg-require-effective-target nonlocal_goto'.
2025-02-07[testsuite] tolerate later success [PR108357]Alexandre Oliva1-2/+5
On leon3-elf and presumably on other targets, the test fails due to differences in calling conventions and other reasons, that add extra gimple stmts that prevent the expected optimization at the expected point. The optimization takes place anyway, just a little later, so tolerate that. for gcc/testsuite/ChangeLog PR tree-optimization/108357 * gcc.dg/tree-ssa/pr108357.c: Tolerate later optimization.
2025-02-07[ifcombine] avoid creating out-of-bounds BIT_FIELD_REFs [PR118514]Alexandre Oliva1-0/+15
If decode_field_reference finds a load that accesses past the inner object's size, bail out. Drop the too-strict assert. for gcc/ChangeLog PR tree-optimization/118514 PR tree-optimization/118706 * gimple-fold.cc (decode_field_reference): Refuse to consider merging out-of-bounds BIT_FIELD_REFs. (make_bit_field_load): Drop too-strict assert. * tree-eh.cc (bit_field_ref_in_bounds_p): Rename to... (access_in_bounds_of_type_p): ... this. Change interface, export. (tree_could_trap_p): Adjust. * tree-eh.h (access_in_bounds_of_type_p): Declare. for gcc/testsuite/ChangeLog PR tree-optimization/118514 PR tree-optimization/118706 * gcc.dg/field-merge-25.c: New.
2025-02-06loop-iv, riscv: Fix get_biv_step_1 for RISC-V [PR117506]Jakub Jelinek1-0/+18
The following test ICEs on RISC-V at least latently since r14-1622-g99bfdb072e67fa3fe294d86b4b2a9f686f8d9705 which added RISC-V specific case to get_biv_step_1 to recognize also ({zero,sign}_extend:DI (plus:SI op0 op1)) The reason for the ICE is that op1 in this case is CONST_POLY_INT which unlike the really expected VOIDmode CONST_INTs has its own mode and still satisfies CONSTANT_P. GET_MODE (rhs) (SImode) is different from outer_mode (DImode), so the function later does *inner_step = simplify_gen_binary (code, outer_mode, *inner_step, op1); but that obviously ICEs because while *inner_step is either VOIDmode or DImode, op1 has SImode. The following patch fixes it by extending op1 using code so that simplify_gen_binary can handle it. Another option would be to change the !CONSTANT_P (op1) 3 lines above this to !CONST_INT_P (op1), I think it isn't very likely that we get something useful from other constants there. 2025-02-06 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/117506 * loop-iv.cc (get_biv_step_1): For {ZERO,SIGN}_EXTEND of PLUS apply {ZERO,SIGN}_EXTEND to op1. * gcc.dg/pr117506.c: New test. * gcc.target/riscv/pr117506.c: New test.
2025-02-06tree-optimization/118749 - bogus alignment peeling causes misaligned accessRichard Biener1-0/+41
The vectorizer thinks it can align a vector access to 16 bytes when using a vectorization factor of 8 and 1 byte elements. That of course does not work for the 2nd vector iteration. Apparently we lack a guard against such nonsense. PR tree-optimization/118749 * tree-vect-data-refs.cc (vector_alignment_reachable_p): Pass in the vectorization factor, when that cannot maintain the DR's target alignment do not claim we can reach that by peeling. * gcc.dg/vect/pr118749.c: New testcase.
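The arithmetic behind this fix can be sketched as follows: each vector iteration advances the access by the vectorization factor times the element size, so peeling can only keep the access aligned if that step is a multiple of the target alignment. The function below is a simplified model written for this note; its name and parameters are assumptions and it is not the check added to vector_alignment_reachable_p.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not GCC code: after peeling makes the first
   vector access aligned, later iterations stay aligned only if the
   per-iteration step in bytes is a multiple of the target alignment.  */
bool alignment_maintained_p (size_t vf, size_t elem_size, size_t target_align)
{
  assert (target_align != 0);
  return (vf * elem_size) % target_align == 0;
}
```

With a vectorization factor of 8, 1-byte elements and a 16-byte target alignment the step is 8 bytes, so the second vector iteration lands 8 bytes past an aligned address.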
2025-02-05cselib: For CALL_INSNs to const/pure fns invalidate memory below sp [PR117239]Jakub Jelinek1-0/+42
The following testcase is miscompiled on x86_64 during postreload. After reload (with IPA-RA figuring out the calls don't modify any registers but %rax for return value) postreload sees (insn 14 12 15 2 (set (mem:DI (plus:DI (reg/f:DI 7 sp) (const_int 16 [0x10])) [0 S8 A64]) (reg:DI 1 dx [orig:105 q+16 ] [105])) "pr117239.c":18:7 95 {*movdi_internal} (nil)) (call_insn/i 15 14 16 2 (set (reg:SI 0 ax) (call (mem:QI (symbol_ref:DI ("baz") [flags 0x3] <function_decl 0x7ffb2e2bdf00 r>) [0 baz S1 A8]) (const_int 24 [0x18]))) "pr117239.c":18:7 1476 {*call_value} (expr_list:REG_CALL_DECL (symbol_ref:DI ("baz") [flags 0x3] <function_decl 0x7ffb2e2bdf00 baz>) (expr_list:REG_EH_REGION (const_int 0 [0]) (nil))) (nil)) (insn 16 15 18 2 (parallel [ (set (reg/f:DI 7 sp) (plus:DI (reg/f:DI 7 sp) (const_int 24 [0x18]))) (clobber (reg:CC 17 flags)) ]) "pr117239.c":18:7 285 {*adddi_1} (expr_list:REG_ARGS_SIZE (const_int 0 [0]) (nil))) ... (call_insn/i 19 18 21 2 (set (reg:SI 0 ax) (call (mem:QI (symbol_ref:DI ("foo") [flags 0x3] <function_decl 0x7ffb2e2bdb00 l>) [0 foo S1 A8]) (const_int 0 [0]))) "pr117239.c":19:3 1476 {*call_value} (expr_list:REG_CALL_DECL (symbol_ref:DI ("foo") [flags 0x3] <function_decl 0x7ffb2e2bdb00 foo>) (expr_list:REG_EH_REGION (const_int 0 [0]) (nil))) (nil)) (insn 21 19 26 2 (parallel [ (set (reg/f:DI 7 sp) (plus:DI (reg/f:DI 7 sp) (const_int -24 [0xffffffffffffffe8]))) (clobber (reg:CC 17 flags)) ]) "pr117239.c":19:3 discrim 1 285 {*adddi_1} (expr_list:REG_ARGS_SIZE (const_int 24 [0x18]) (nil))) (insn 26 21 24 2 (set (mem:DI (plus:DI (reg/f:DI 7 sp) (const_int 16 [0x10])) [0 S8 A64]) (reg:DI 1 dx [orig:105 q+16 ] [105])) "pr117239.c":19:3 discrim 1 95 {*movdi_internal} (nil)) i.e. movq %rdx, 16(%rsp) call baz addq $24, %rsp ... call foo subq $24, %rsp movq %rdx, 16(%rsp) Now, postreload uses cselib and cselib remembered that %rdx value has been stored into 16(%rsp). Both baz and foo are pure calls. 
If they weren't, when processing those CALL_INSNs cselib would invalidate all MEMs if (RTL_LOOPING_CONST_OR_PURE_CALL_P (insn) || !(RTL_CONST_OR_PURE_CALL_P (insn))) cselib_invalidate_mem (callmem); where callmem is (mem:BLK (scratch)). But they are pure, so instead the code just invalidates the argument slots from CALL_INSN_FUNCTION_USAGE. The calls actually clobber more than that, even const/pure calls clobber all memory below the stack pointer. And that is something that hasn't been invalidated. In this failing testcase, the call to baz is not a big deal, we don't have anything remembered in memory below %rsp at that call. But then we increment %rsp by 24, so the %rsp+16 is now 8 bytes below stack and do the call to foo. And that call now actually, not just in theory, clobbers the memory below the stack pointer (in particular overwrites it with the return value). But cselib does not invalidate. Then %rsp is decremented again (in preparation for another call, to bar) and cselib is processing store of %rdx (which IPA-RA says has not been modified by either baz or foo calls) to %rsp + 16, and it sees the memory already has that value, so the store is useless, let's remove it. But it is not, the call to foo has changed it, so it needs to be stored again. The following patch adds targeted invalidation of memory below stack pointer (or on SPARC memory below stack pointer + 2047 when stack bias is used, or on PA memory above stack pointer instead). It does so only in !ACCUMULATE_OUTGOING_ARGS or cfun->calls_alloca functions, because in other functions the stack pointer should be constant from the end of prologue till start of epilogue and so nothing should be stored within the function below the stack pointer. 
Now, memory below stack pointer is special, except for functions using alloca/VLAs I believe no addressable memory should be there, it should be purely outgoing function argument area, if we take address of some automatic variable, it should live all the time above the outgoing function argument area. So on top of just trying to flush memory below stack pointer (represented by %rsp - PTRDIFF_MAX with PTRDIFF_MAX size on most arches), the patch tries to optimize and only invalidate memory that has address clearly derived from stack pointer (memory with other bases is not invalidated) and if we can prove (we see same SP_DERIVED_VALUE_P bases in both VALUEs) it is above current stack, also don't call canon_anti_dependence which might just give up in certain cases. I've gathered statistics from x86_64-linux and i686-linux bootstraps/regtests. During -m64 compilations from those, there were 3718396 + 42634 + 27761 cases of processing MEMs in cselib_invalidate_mem (callmem[1]) calls, the first number is number of MEMs not invalidated because of the optimization, i.e. + if (sp_derived_base == NULL_RTX) + { + has_mem = true; + num_mems++; + p = &(*p)->next; + continue; + } in the patch, the second number is number of MEMs not invalidated because canon_anti_dependence returned false and finally the last number is number of MEMs actually invalidated (so that is what hasn't been invalidated before). During -m32 compilations the numbers were 1422412 + 39354 + 16509 with the same meaning. Note, when there is no red zone, in theory even the sp = sp + incr instruction invalidates memory below the new stack pointer, as signal can come and overwrite the memory. So maybe we should be invalidating something at those instructions as well. 
But in leaf functions we certainly can have even addressable automatic vars in the red zone (which would make it harder to distinguish), on the other hand we aren't normally storing anything below the red zone, and in non-leaf it should normally be just the outgoing arguments area. 2025-02-05 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/117239 * cselib.cc: Include predict.h. (callmem): Change type from rtx to rtx[2]. (cselib_preserve_only_values): Use callmem[0] rather than callmem. (cselib_invalidate_mem): Optimize and don't try to invalidate for the mem_rtx == callmem[1] case MEMs which clearly can't be below the stack pointer. (cselib_process_insn): Use callmem[0] rather than callmem. For const/pure calls also call cselib_invalidate_mem (callmem[1]) in !ACCUMULATE_OUTGOING_ARGS or cfun->calls_alloca functions. (cselib_init): Initialize callmem[0] rather than callmem and also initialize callmem[1]. * gcc.dg/pr117239.c: New test.
2025-02-05vect: Fix wrong code with pr108692.c on targets with only non-widening ABD ↵Xi Ruoyao2-0/+33
[PR118727] With things like // signed char a_14, a_16; a.0_4 = (unsigned char) a_14; _5 = (int) a.0_4; b.1_6 = (unsigned char) b_16; _7 = (int) b.1_6; c_17 = _5 - _7; _8 = ABS_EXPR <c_17>; r_18 = _8 + r_23; An ABD pattern will be recognized for _8: patt_31 = .ABD (a.0_4, b.1_6); It's still correct. But then when the SAD pattern is recognized: patt_29 = SAD_EXPR <a_14, b_16, r_23>; This is not correct. This only happens for targets with both uabd and sabd but not vec_widen_{s,u}abd, currently LoongArch is the only target affected. The problem is vect_look_through_possible_promotion will throw away a series of conversions if the effect is equivalent to a sign change and a promotion, but here the sign change is definitely relevant, and the promotion is also relevant for "mixed sign" cases like r += abs((unsigned int)(unsigned char) a - (signed int)(signed char) b) (we need to promote to HImode as the difference can exceed the range of QImode). If there were any redundant promotion, it should have been stripped in vect_recog_abd_pattern (i.e. when patt_31 = .ABD (a.0_4, b.1_6) is recognized) instead of in vect_recog_sad_pattern, or we'd have a missed-optimization if the ABD output is not summarized. So anyway vect_recog_sad_pattern is just not a proper location to call vect_look_through_possible_promotion for the ABD inputs, remove the calls to fix the issue. gcc/ChangeLog: PR tree-optimization/118727 * tree-vect-patterns.cc (vect_recog_sad_pattern): Don't call vect_look_through_possible_promotion on ABD inputs. gcc/testsuite/ChangeLog: PR tree-optimization/118727 * gcc.dg/pr108692.c: Mention PR 118727 in the comment. * gcc.dg/pr118727.c: New test case.
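Why the sign change matters can be seen with scalar C. The sketch below is an illustration written for this note, assuming 8-bit chars; the function names are hypothetical and it is not the added testcase.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helpers for illustration; not GCC code.  */

/* What the original statements compute: both inputs are first viewed
   as unsigned char, then widened and subtracted.  */
int abs_diff_unsigned_view (signed char a, signed char b)
{
  int ua = (unsigned char) a;
  int ub = (unsigned char) b;
  return abs (ua - ub);
}

/* What a SAD on the signed inputs would compute instead.  */
int abs_diff_signed_view (signed char a, signed char b)
{
  return abs ((int) a - (int) b);
}

/* Mixed-sign case: the result can reach 255 - (-128) = 383, which does
   not fit in 8 bits, so the promotion is needed as well.  */
int abs_diff_mixed (unsigned char a, signed char b)
{
  int d = abs ((int) a - (int) b);
  assert (d <= 383);
  return d;
}
```

For a = -1 and b = 1 the unsigned view yields 254 while the signed view yields 2, so replacing the converted operands with the original signed ones changes the result.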
2025-02-04testsuite: XFAIL test in pr109393.c for ilp32 targets [PR116845]kelefth1-1/+2
The match.pd canonicalization that this testcase checks for is not applied on ilp32 targets. This XFAILs the test on ilp32 targets. PR testsuite/116845 gcc/testsuite/ChangeLog: * gcc.dg/pr109393.c: XFAIL on ilp32 targets.
2025-02-04c/118742 - gimple FE parsing of unary operators of C promoted argsRichard Biener1-0/+24
The GIMPLE FE currently invokes parser_build_unary_op to build unary GENERIC, which subjects the operand to C promotion rules; this does not match GIMPLE. The following adds a wrapper around the build_unary_op worker, which conveniently has an argument to indicate whether to skip such promotion. PR c/118742 gcc/c/ * gimple-parser.cc (gimple_parser_build_unary_op): New wrapper around build_unary_op. (c_parser_gimple_unary_expression): Use it. gcc/testsuite/ * gcc.dg/gimplefe-56.c: New testcase.
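The C promotion rule in question is easy to observe in plain C. The sketch below is an illustration only and is unrelated to the gimplefe-56.c test itself; the function names are hypothetical. It shows that the operand of unary `~` is promoted to int before the operation, so the result differs from an operation carried out in the operand's own narrow type.

```c
#include <stddef.h>

/* In C the operand of unary ~ undergoes integer promotion, so the
   operation is done in int and the result has type int.  */
int bitnot_promoted(unsigned char x)
{
    return ~x;
}

/* Truncating back to unsigned char gives the value an operation
   performed directly in the 8-bit type would produce.  */
unsigned char bitnot_unpromoted(unsigned char x)
{
    return (unsigned char)~x;
}

/* The type of ~x is int, not unsigned char.  */
size_t promoted_result_size(void)
{
    unsigned char x = 0;
    return sizeof(~x);
}
```

With x = 0 the promoted form yields -1 while the narrow form yields 255, which is the kind of mismatch that arises when a unary operator meant to act on the operand's own type is built with C promotion applied.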
2025-02-04tree-optimization/117113 - ICE with unroll-and-jamRichard Biener1-0/+20
When there's an inner loop without a virtual header PHI but the outer loop has one, the fusion process cannot handle the need to create an inner loop virtual header PHI. Punt in this case. PR tree-optimization/117113 * gimple-loop-jam.cc (unroll_jam_possible_p): Detect when we cannot handle virtual SSA update. * gcc.dg/torture/pr117113.c: New testcase.
2025-02-04rtl-optimization/117611 - ICE in simplify_shift_const_1Richard Biener1-0/+7
The following checks we have a scalar int shift mode before enforcing it. As AVR shows the mode can be a signed _Accum mode as well. PR rtl-optimization/117611 * combine.cc (simplify_shift_const_1): Bail if not scalar int mode. * gcc.dg/fixed-point/pr117611.c: New testcase.
2025-02-04lto/113207 - fix free_lang_data_in_typeRichard Biener1-0/+10
When we process function types we strip volatile and const qualifiers after building a simplified type variant (which preserves those). The qualified type handling of both isn't really compatible, so avoid bad interaction by swapping this, first dropping const/volatile qualifiers and then building the simplified type thereof. PR lto/113207 * ipa-free-lang-data.cc (free_lang_data_in_type): First drop const/volatile qualifiers from function argument types, then build a simplified type. * gcc.dg/pr113207.c: New testcase.
2025-02-03tree-optimization/118717 - store commoning vs. abnormalsRichard Biener1-0/+41
When we sink common stores in cselim or the sink pass we have to make sure to not introduce overlapping lifetimes for abnormals used in the ref. The easiest is to avoid sinking stmts which reference abnormals at all which is what the following does. PR tree-optimization/118717 * tree-ssa-phiopt.cc (cond_if_else_store_replacement_1): Do not common stores referencing abnormal SSA names. * tree-ssa-sink.cc (sink_common_stores_to_bb): Likewise. * gcc.dg/torture/pr118717.c: New testcase.
2025-01-30[testsuite] require -Ofast for vect-ifcvt-18 even without avxAlexandre Oliva1-1/+2
The test expects transformations that depend on -Ofast on x86*, but that option is only passed when the avx_runtime is available. Split -Ofast out of the avx conditional, so that it is passed on the same targets that expect the transformation. for gcc/testsuite/ChangeLog * gcc.dg/vect/vect-ifcvt-18.c: Split -Ofast out of avx_runtime.
2025-01-30s390: Fix up *vec_cmpgt{,u}<mode><mode>_nocc_emu splitters [PR118696]Jakub Jelinek1-0/+131
The following testcase is miscompiled on s390x-linux with e.g. -march=z13 (both -O0 and -O2) starting with r15-7053. The problem is in the splitters which emulate TImode/V1TImode GT and GTU comparisons. For GT we want to do (ior (gt (hi op1) (hi op2)) (and (eq (hi op1) (hi op2)) (gtu (lo op1) (lo op2)))) and for GTU similarly except for gtu instead of gt in there. Now, the splitter emulation is using V2DImode comparisons where on s390x the hi part is in the first element of the vector, lo part in the second, and for the gtu case it swaps the elements of the vector. So, we get the right result in the first element of the result vector. But vrepg was then broadcasting the second element of the result vector rather than the first, and the value of the second element of the vector is instead (ior (gt (lo op1) (lo op2)) (and (eq (lo op1) (lo op2)) (gtu (hi op1) (hi op2)))) so something not really usable for the emulated comparison. The following patch fixes that. The testcase tries to test behavior of double-word smin/smax/umin/umax with various cases of the halves of both operands (one that is sometimes EQ, sometimes GT, sometimes LT, sometimes GTU, sometimes LTU). 2025-01-30 Jakub Jelinek <jakub@redhat.com> Stefan Schulze Frielinghaus <stefansf@gcc.gnu.org> PR target/118696 * config/s390/vector.md (*vec_cmpgt<mode><mode>_nocc_emu, *vec_cmpgtu<mode><mode>_nocc_emu): Duplicate the first rather than second V2DImode element. * gcc.dg/pr118696.c: New test. * gcc.target/s390/vector/pr118696.c: New test. * gcc.target/s390/vector/vec-abs-emu.c: Expect vrepg with 0 as last operand rather than 1. * gcc.target/s390/vector/vec-max-emu.c: Likewise. * gcc.target/s390/vector/vec-min-emu.c: Likewise.
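The comparison formulas above can be written out in scalar C. This is a minimal sketch under the assumption that a 128-bit value is given as a 64-bit high half and a 64-bit low half; the function names are hypothetical and the code is not the s390 splitter. The third function reproduces the value the message describes for the second vector element, to show that it is not the wanted comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed 128-bit a > b:
   (ior (gt hi1 hi2) (and (eq hi1 hi2) (gtu lo1 lo2))).  */
bool gt128(int64_t hi1, uint64_t lo1, int64_t hi2, uint64_t lo2)
{
    return hi1 > hi2 || (hi1 == hi2 && lo1 > lo2);
}

/* Unsigned 128-bit a > b: same shape, with gtu on the high halves.  */
bool gtu128(uint64_t hi1, uint64_t lo1, uint64_t hi2, uint64_t lo2)
{
    return hi1 > hi2 || (hi1 == hi2 && lo1 > lo2);
}

/* The value described for the second element in the GT case:
   (ior (gt lo1 lo2) (and (eq lo1 lo2) (gtu hi1 hi2))).
   The halves have swapped roles, so this is not a > b.  */
bool gt128_second_element(int64_t hi1, uint64_t lo1, int64_t hi2, uint64_t lo2)
{
    return (int64_t)lo1 > (int64_t)lo2
           || (lo1 == lo2 && (uint64_t)hi1 > (uint64_t)hi2);
}
```

For the operands (hi = 1, lo = 0) and (hi = 0, lo = 1), `gt128` is true while `gt128_second_element` is false, so broadcasting the second element gives the wrong answer.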
2025-01-30middle-end/118695 - missed misalign handling in MEM_REF expansionRichard Biener1-0/+9
When MEM_REF expansion of a non-MEM falls back to a stack temporary we fail to handle the case where the offset adjusted reference to the temporary is not aligned according to the requirement of the mode. We have to go through bitfield extraction or movmisalign in this case. Fortunately there's a helper for this. This fixes an ICE observed on arm which has sanity checks in its move patterns for this. PR middle-end/118695 * expr.cc (expand_expr_real_1): When expanding a MEM_REF to a non-MEM by committing it to a stack temporary make sure to handle misaligned accesses correctly. * gcc.dg/pr118695.c: New testcase.
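As a source-level analogy only (this is not what expand_expr_real_1 does internally, and it is not the pr118695.c test), an access at an offset that is not aligned for its type can be written portably with memcpy, leaving it to the compiler to pick a suitable misaligned-access sequence. The helper names below are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Load a 32-bit value from a possibly misaligned address.  */
uint32_t load_u32_unaligned(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Store a 32-bit value to a possibly misaligned address.  */
void store_u32_unaligned(unsigned char *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}

/* Write v at byte offset off (0..12) of a local buffer and read it back.
   Offsets that are not multiples of 4 exercise the misaligned case.  */
uint32_t roundtrip_at_offset(size_t off, uint32_t v)
{
    unsigned char buf[16] = {0};
    store_u32_unaligned(buf + off, v);
    return load_u32_unaligned(buf + off);
}
```

Dereferencing a cast `uint32_t *` at such an offset would instead be undefined behavior in C and can trap on strict-alignment targets.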
2025-01-30OpenMP: append_args clause fixes + Fortran supportTobias Burnus1-0/+70
This fixes a large number of smaller and larger issues with the append_args clause to 'declare variant' and adds Fortran support for it; it also contains a larger number of testcases. In particular, for Fortran, it also handles passing allocatable, pointer, optional arguments to an interop dummy argument with or without value attribute. And it changes the internal representation such that dumping the tree does not lead to an ICE. gcc/c/ChangeLog: * c-parser.cc (c_finish_omp_declare_variant): Modify how append_args is saved internally. gcc/cp/ChangeLog: * parser.cc (cp_finish_omp_declare_variant): Modify how append_args is saved internally. * pt.cc (tsubst_attribute): Likewise. (tsubst_omp_clauses): Remove C_ORT_OMP_DECLARE_SIMD from interop handling as no longer called for it. * decl.cc (omp_declare_variant_finalize_one): Update append_args changes; fixes for ADL input. gcc/fortran/ChangeLog: * gfortran.h (gfc_omp_declare_variant): Add append_args_list. * openmp.cc (gfc_parser_omp_clause_init_modifiers): New; split off from ... (gfc_match_omp_init): ... here; call it. (gfc_match_omp_declare_variant): Update to handle append_args clause; some syntax handling fixes. * trans-openmp.cc (gfc_trans_omp_declare_variant): Handle append_args clause; add some diagnostic. gcc/ChangeLog: * gimplify.cc (gimplify_call_expr): For OpenMP's append_args clause processed by 'omp dispatch', update for internal-representation changes; fix handling of hidden arguments, add some comments and handle Fortran's value dummy and optional/pointer/allocatable actual args. libgomp/ChangeLog: * libgomp.texi (Impl. Status): Update for accumulated changes related to 'dispatch' and interop. gcc/testsuite/ChangeLog: * c-c++-common/gomp/append-args-1.c: Update dg-*. * c-c++-common/gomp/append-args-3.c: Likewise. * g++.dg/gomp/append-args-1.C: Likewise. * gfortran.dg/gomp/adjust-args-1.f90: Likewise. * gfortran.dg/gomp/adjust-args-3.f90: Likewise. * gfortran.dg/gomp/declare-variant-2.f90: Likewise.
* c-c++-common/gomp/append-args-6.c: New test. * c-c++-common/gomp/append-args-7.c: New test. * c-c++-common/gomp/append-args-8.c: New test. * c-c++-common/gomp/append-args-9.c: New test. * g++.dg/gomp/append-args-4.C: New test. * g++.dg/gomp/append-args-5.C: New test. * g++.dg/gomp/append-args-6.C: New test. * g++.dg/gomp/append-args-7.C: New test. * gcc.dg/gomp/append-args-1.c: New test. * gfortran.dg/gomp/append_args-1.f90: New test. * gfortran.dg/gomp/append_args-2.f90: New test. * gfortran.dg/gomp/append_args-3.f90: New test. * gfortran.dg/gomp/append_args-4.f90: New test.
2025-01-30middle-end/118692 - ICE with out-of-bound ref expansionRichard Biener1-0/+10
The following further guards the BIT_FIELD_REF expansion fallback for MEM_REFs of entities expanded to a register (or constant): when the access does not overlap the base object, the offset is expanded as if it were zero, which avoids large out-of-bound offsets. PR middle-end/118692 * expr.cc (expand_expr_real_1): When expanding a MEM_REF as BIT_FIELD_REF avoid large offsets for accesses not overlapping the base object. * gcc.dg/pr118692.c: New testcase.
2025-01-30tree-optimization/114052 - consider infinite sub-loops when lowering iter boundRichard Biener1-0/+41
When we walk stmts to find always executed stmts with UB in the last iteration to be able to reduce the iteration count by one we fail to consider infinite subloops in the last iteration that would make such stmt not execute. The following adds this. PR tree-optimization/114052 * tree-ssa-loop-niter.cc (maybe_lower_iteration_bound): Check for infinite subloops we might not exit. * gcc.dg/pr114052-1.c: New testcase.