Age | Commit message (Collapse) | Author | Files | Lines |
|
For strict-alignment targets we can end up with BLKmode single-element
array types when the element type is unaligned. This confuses
type checking since the canonical type would have an aligned
element type and a non-BLKmode mode. The following simply ignores
the mode we assign to array types for this purpose, like we already
do for record and union types.
PR middle-end/97323
* tree.cc (gimple_canonical_types_compatible_p): Ignore
TYPE_MODE also for ARRAY_TYPE.
(verify_type): Likewise.
* gcc.dg/pr97323.c: New testcase.
|
|
Uros' r15-7793 fixed this PR as well, I'm just committing tests
from the PR so that it can be closed.
2025-03-04 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/119071
* gcc.dg/pr119071.c: New test.
* gcc.c-torture/execute/pr119071.c: New test.
|
|
When we vectorize a .COND_ADD reduction and apply the single-use-def
cycle optimization we can end up chosing the wrong else value for
subsequent .COND_ADD. The following rectifies this.
PR tree-optimization/119096
* tree-vect-loop.cc (vect_transform_reduction): Use the
correct else value for .COND_fn.
* gcc.dg/vect/pr119096.c: New testcase.
|
|
We are detecting a cycle as double reduction where the inner loop
cycle has extra out-of-loop uses. This clashes at least with
assumptions from the SLP discovery code which says the cycle
isn't reachable from another SLP instance. It also was not intended
to support this case, in fact with GCC 14 we seem to generate wrong
code here.
PR tree-optimization/119057
* tree-vect-loop.cc (check_reduction_path): Add argument
specifying whether we're analyzing the inner loop of a
double reduction. Do not allow extra uses outside of the
double reduction cycle in this case.
(vect_is_simple_reduction): Adjust.
* gcc.dg/vect/pr119057.c: New testcase.
|
|
The following testcase ICEs since r14-5057.
The Intel vector ABI says that in the ZMM case the masks is passed
in unsigned int or unsigned long long arguments and how many bits in
them and how many of those arguments are is determined by the characteristic
data type of the function. In the testcase simdlen is 32 and characteristic
data type is double, so return as well as first argument is passed in 4
V8DFmode arguments and the mask is supposed to be passed in 4 unsigned int
arguments (8 bits in each).
Before the r14-5057 change there was
sc->args[i].orig_type = parm_type;
...
case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_CONSTANT_STEP:
case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_VARIABLE_STEP:
case SIMD_CLONE_ARG_TYPE_VECTOR:
if (INTEGRAL_TYPE_P (parm_type) || POINTER_TYPE_P (parm_type))
veclen = sc->vecsize_int;
else
veclen = sc->vecsize_float;
if (known_eq (veclen, 0U))
veclen = sc->simdlen;
else
veclen
= exact_div (veclen,
GET_MODE_BITSIZE (SCALAR_TYPE_MODE (parm_type)));
for the argument handling and
if (sc->inbranch)
{
tree base_type = simd_clone_compute_base_data_type (sc->origin, sc);
...
if (INTEGRAL_TYPE_P (base_type) || POINTER_TYPE_P (base_type))
veclen = sc->vecsize_int;
else
veclen = sc->vecsize_float;
if (known_eq (veclen, 0U))
veclen = sc->simdlen;
else
veclen = exact_div (veclen,
GET_MODE_BITSIZE (SCALAR_TYPE_MODE (base_type)));
for the mask handling. r14-5057 moved this argument creation later and
unified that:
case SIMD_CLONE_ARG_TYPE_MASK:
case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_CONSTANT_STEP:
case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_VARIABLE_STEP:
case SIMD_CLONE_ARG_TYPE_VECTOR:
if (sc->args[i].arg_type == SIMD_CLONE_ARG_TYPE_MASK
&& sc->mask_mode != VOIDmode)
elem_type = boolean_type_node;
else
elem_type = TREE_TYPE (sc->args[i].vector_type);
if (INTEGRAL_TYPE_P (elem_type) || POINTER_TYPE_P (elem_type))
veclen = sc->vecsize_int;
else
veclen = sc->vecsize_float;
if (known_eq (veclen, 0U))
veclen = sc->simdlen;
else
veclen
= exact_div (veclen,
GET_MODE_BITSIZE (SCALAR_TYPE_MODE (elem_type)));
This is correct for the argument cases (so linear or vector) (though
POINTER_TYPE_P will never appear as TREE_TYPE of a vector), but the
boolean_type_node in there is completely bogus, when using AVX512 integer
masks as I wrote above we need the characteristic data type, not bool,
and bool is strange in that it has bitsize of 8 (or 32 on darwin), while
the masks are 1 bit per lane anyway.
Fixed thusly.
2025-03-01 Jakub Jelinek <jakub@redhat.com>
PR middle-end/115871
* omp-simd-clone.cc (simd_clone_adjust): For SIMD_CLONE_ARG_TYPE_MASK
and sc->mask_mode not VOIDmode, set elem_type to the characteristic
type rather than boolean_type_node.
* gcc.dg/gomp/simd-clones-8.c: New test.
|
|
Now that the #embed paper has been voted in, the following patch
removes the pedwarn for C++26 on it (and adjusts pedwarn warning for
older C++ versions) and predefines __cpp_pp_embed FTM.
Also, the patch changes cpp_error to cpp_pedwarning with for C++
-Wc++26-extensions guarding, and for C add -Wc11-c23-compat warning
about #embed.
I believe we otherwise implement everything in the paper already,
except I'm really confused by the
[Example:
#embed <data.dat> limit(__has_include("a.h"))
#if __has_embed(<data.dat> limit(__has_include("a.h")))
// ill-formed: __has_include [cpp.cond] cannot appear here
#endif
— end example]
part. My reading of both C23 and C++ with the P1967R14 paper in
is that the first case (#embed with __has_include or __has_embed in its
clauses) is what is clearly invalid and so the ill-formed note should be
for #embed. And the __has_include/__has_embed in __has_embed is actually
questionable.
Both C and C++ have something like
"The identifiers __has_include, __has_embed, and __has_c_attribute
shall not appear in any context not mentioned in this subclause."
or
"The identifiers __has_include and __has_cpp_attribute shall not appear
in any context not mentioned in this subclause."
(into which P1967R14 adds __has_embed) in the conditional inclusion
subclause. #embed is defined in a different one, so using those in there
is invalid (unless "using the rules specified for conditional inclusion"
wording e.g. in limit clause overrides that).
The reason why I think it is fuzzy for __has_embed is that __has_embed
is actually defined in the Conditional inclusion subclause (so that
would mean one can use __has_include, __has_embed and __has_*attribute
in there) but its clauses are described in a different one.
GCC currently accepts
#embed __FILE__ limit (__has_include (<stdarg.h>))
#if __has_embed (__FILE__ limit (__has_include (<stdarg.h>)))
#endif
#embed __FILE__ limit (__has_embed (__FILE__))
#if __has_embed (__FILE__ limit (__has_embed (__FILE__)))
#endif
Note, it isn't just about limit clause, but also about
prefix/suffix/if_empty, except that in those cases the "using the rules
specified for conditional inclusion" doesn't apply.
In any case, I'd hope that can be dealt with incrementally (and should
be handled the same for both C and C++).
2025-02-28 Jakub Jelinek <jakub@redhat.com>
libcpp/
* include/cpplib.h (enum cpp_warning_reason): Add
CPP_W_CXX26_EXTENSIONS enumerator.
* init.cc (lang_defaults): Set embed for GNUCXX26 and CXX26.
* directives.cc (do_embed): Adjust pedwarn wording for embed in C++,
use cpp_pedwarning instead of cpp_error and add CPP_W_C11_C23_COMPAT
warning of cpp_pedwarning hasn't diagnosed anything.
gcc/c-family/
* c.opt (Wc++26-extensions): Add CppReason(CPP_W_CXX26_EXTENSIONS).
* c-cppbuiltin.cc (c_cpp_builtins): Predefine __cpp_pp_embed=202502
for C++26.
gcc/testsuite/
* g++.dg/cpp/embed-1.C: Adjust for pedwarn wording change and don't
expect any error for C++26.
* g++.dg/cpp/embed-2.C: Adjust for pedwarn wording change and don't
expect any warning for C++26.
* g++.dg/cpp26/feat-cxx26.C: Test __cpp_pp_embed value.
* gcc.dg/cpp/embed-17.c: New test.
|
|
The following fixes a thinko in the handling of interposed weak
definitions which confused the interposition check in
get_availability by setting DECL_EXTERNAL too early.
PR lto/91299
gcc/lto/
* lto-symtab.cc (lto_symtab_merge_symbols): Set DECL_EXTERNAL
only after calling get_availability.
gcc/testsuite/
* gcc.dg/lto/pr91299_0.c: New testcase.
* gcc.dg/lto/pr91299_1.c: Likewise.
|
|
As documented in the manual, FIX/UNSIGNED_FIX from floating point
mode to integral mode has unspecified rounding and FIX from floating point
mode to the same floating point mode is expressing rounding toward zero.
So, some targets (arc, arm, csky, m68k, mmix, nds32, pdp11, sparc and
visium) use
(fix:SI (fix:SF (match_operand:SF 1 "..._operand")))
etc. to express the rounding toward zero during conversion to integer.
For some reason other targets don't use that.
Anyway, the 2 FIXes (or inner FIX with outer UNSIGNED_FIX) cause problems
since the r15-2890 which removed some strict checks in ifcvt.cc on what
SET_SRC can be actually conditionalized (I must say I'm still worried
about the change, don't know why one can't get e.g. inline asm or
something with UNSPEC or some complex backend specific RTLs that
force_operand can't handle), force_operand just ICEs on it, it can only
handle (through expand_fix) conversions from floating point to integral.
The following patch fixes this by detecting this case and just pretend
the inner FIX isn't there, i.e. call expand_fix with the inner FIX's
operand instead, which works and on targets like arm it will just
create the nested FIXes again.
2025-02-28 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/117712
* expr.cc (force_operand): Handle {,UNSIGNED_}FIX with
FIX operand using expand_fix on the inner FIX operand.
* gcc.dg/pr117712.c: New test.
|
|
The following testcase ICEs, because we first construct file_cache object
inside of *global_dc, then process options and then call file_cache::tune.
The earlier construction allocates the m_file_slots array (using new)
based on the static data member file_cache::num_file_slots, but then tune
changes it, without actually reallocating all m_file_slots arrays in already
constructed file_cache objects.
I think it is just weird to have the count be a static data member and
the pointer be non-static data member, that is just asking for issues like
this.
So, this patch changes num_file_slots into m_num_file_slots and turns tune
into a non-static member function and changes toplev.cc to call it on the
global_gc->get_file_cache () object. And let's the tune just delete the
array and allocate it freshly if there is a change in the number of slots
or lines.
Note, file_cache_slot has similar problem, but because there are many, I
haven't moved the count into those objects; I just hope that when tune
is called there is exactly one file_cache constructed and all the
file_cache_slot objects constructed are pointed by its m_file_slots member,
so also on lines change it just deletes it and allocates again. I think
it should be unlikely that the cache actually has any used slots by the time
it is called.
2025-02-27 Jakub Jelinek <jakub@redhat.com>
PR middle-end/118860
* input.h (file_cache::tune): No longer static. Rename argument
from num_file_slots_ to num_file_slots. Formatting fix.
(file_cache::num_file_slots): Renamed to ...
(file_cache::m_num_file_slots): ... this. No longer static.
* input.cc (file_cache_slot::tune): Change return type from void to
size_t, return previous file_cache_slot::line_record_size value.
Formatting fixes.
(file_cache::tune): Rename argument from num_file_slots_ to
num_file_slots. Set m_num_file_slots rather than num_file_slots.
If m_num_file_slots or file_cache_slot::line_record_size changes,
delete[] m_file_slots and new it again.
(file_cache::num_file_slots): Remove definition.
(file_cache::lookup_file): Use m_num_file_slots rather than
num_file_slots.
(file_cache::evicted_cache_tab_entry): Likewise.
(file_cache::file_cache): Likewise. Initialize m_num_file_slots
to 16.
(file_cache::dump): Use m_num_file_slots rather than num_file_slots.
(file_cache_slot::get_next_line): Formatting fixes.
(file_cache_slot::read_line_num): Likewise.
(get_source_text_between): Likewise.
* toplev.cc (toplev::main): Call global_dc->get_file_cache ().tune
rather than file_cache::tune.
* gcc.dg/pr118860.c: New test.
|
|
Patch for PR116234 solves given PR116366. So the patch adds only the test
case which is very different from PR116234 one.
gcc/testsuite/ChangeLog:
PR rtl-optimization/116336
* gcc.dg/pr116336.c: New test.
|
|
r15-209 allowed flexible array members inside of unions, but as the
following testcase shows, not everything has been adjusted for that.
Unlike structures, in unions flexible array member (as an extension)
can be any of the members, not just the last one, as in union all
members are effectively last.
The first hunk is about an ICE on the initialization of the FAM
in union which is not the last FIELD_DECL with a string literal,
the second hunk just formatting fix, third hunk fixes a bug in which
we were just throwing away the initializers (except for with string literal)
of FAMs in unions which aren't the last FIELD_DECL, and the last hunk
is to diagnose FAM errors in unions the same as for structures, in
particular trying to initialize a FAM with non-constant or initialization
in nested context.
2025-02-26 Jakub Jelinek <jakub@redhat.com>
PR c/119001
gcc/
* varasm.cc (output_constructor_regular_field): Don't fail
assertion if next is non-NULL and FIELD_DECL if
TREE_CODE (local->type) is UNION_TYPE.
gcc/c/
* c-typeck.cc (pop_init_level): Don't clear constructor_type
if DECL_CHAIN of constructor_fields is NULL but p->type is UNION_TYPE.
Formatting fix.
(process_init_element): Diagnose non-static initialization of flexible
array member in union or FAM in union initialization in nested context.
gcc/testsuite/
* gcc.dg/pr119001-1.c: New test.
* gcc.dg/pr119001-2.c: New test.
|
|
The stddef.h header for C23 defines __STDC_VERSION_STDDEF_H__ and
unreachable macros multiple times in some cases.
The header doesn't have normal multiple inclusion guard, because it supports
for glibc inclusion with __need_{size_t,wchar_t,ptrdiff_t,wint_t,NULL}.
While the definition of __STDC_VERSION_STDDEF_H__ and unreachable is done
solely in the #ifdef _STDDEF_H part, so they are defined only if stddef.h
is included without those __need_* macros defined. But actually once
stddef.h is included without the __need_* macros, _STDDEF_H is then defined
and while further stddef.h includes without __need_* macros don't do
anything:
#if (!defined(_STDDEF_H) && !defined(_STDDEF_H_) && !defined(_ANSI_STDDEF_H) \
&& !defined(__STDDEF_H__)) \
|| defined(__need_wchar_t) || defined(__need_size_t) \
|| defined(__need_ptrdiff_t) || defined(__need_NULL) \
|| defined(__need_wint_t)
if one includes whole stddef.h first and then stddef.h with some of the
__need_* macros defined, the #ifdef _STDDEF_H part is used again.
It isn't that big deal for most cases, as it uses extra guarding macros
like:
#ifndef _GCC_MAX_ALIGN_T
#define _GCC_MAX_ALIGN_T
...
#endif
etc., but for __STDC_VERSION_STDDEF_H__/unreachable nothing like that is
used.
So, either we do what the following patch does and just don't define
__STDC_VERSION_STDDEF_H__/unreachable second time, or use #ifndef
unreachable separately for the #define unreachable() case, or use
new _GCC_STDC_VERSION_STDDEF_H macro to guard this (or two, one for
__STDC_VERSION_STDDEF_H__ and one for unreachable), or rework the initial
condition to be just
#if !defined(_STDDEF_H) && !defined(_STDDEF_H_) && !defined(_ANSI_STDDEF_H) \
&& !defined(__STDDEF_H__)
- I really don't understand why the header should do anything at all after
it has been included once without __need_* macros. But changing how this
behaves after 35 years might be risky for various OS/libc combinations.
2025-02-26 Jakub Jelinek <jakub@redhat.com>
PR c/114870
* ginclude/stddef.h (__STDC_VERSION_STDDEF_H__, unreachable): Don't
redefine multiple times if stddef.h is first included without __need_*
defines and later with them. Move nullptr_t and unreachable and
__STDC_VERSION_STDDEF_H__ definitions into the same
defined (__STDC_VERSION__) && __STDC_VERSION__ > 201710L #if block.
* gcc.dg/c23-stddef-2.c: New test.
|
|
Some vect-simd-clone tests fail when targeting ancient x86 variants,
because the expected transformations only take place with -msse4 or
higher.
So arrange for these tests to take an -msse4 option on x86, so that
the expected vectorization takes place, but decay to a compile test if
vect.exp would enable execution but the target doesn't have an sse4
runtime. This requires the new dg-do-if to override the action on a
target while retaining the default action on others, instead of
disabling the test.
We can count on avx512f compile-time support for these tests, because
vect_simd_clones requires that on x86, and that implies sse4 support,
so we need not complicate the scan conditionals with tests for sse4,
except on the last test.
for gcc/ChangeLog
* doc/sourcebuild.texi (dg-do-if): Document.
for gcc/testsuite/ChangeLog
* lib/target-supports-dg.exp (dg-do-if): New.
* gcc.dg/vect/vect-simd-clone-16f.c: Use -msse4 on x86, and
skip in case execution is enabled but the runtime isn't.
* gcc.dg/vect/vect-simd-clone-17f.c: Likewise.
* gcc.dg/vect/vect-simd-clone-18f.c: Likewise.
* gcc.dg/vect/vect-simd-clone-20.c: Likewise, but only skip
the scan test.
|
|
These loops will now vectorize the entry finding
loops. As such we get more failures because they
were not expecting to be vectorized.
Fixed by adding #pragma GCC novector.
gcc/testsuite/ChangeLog:
PR tree-optimization/118464
PR tree-optimization/116855
* g++.dg/ext/pragma-unroll-lambda-lto.C: Add pragma novector.
* gcc.dg/tree-ssa/gen-vect-2.c: Likewise.
* gcc.dg/tree-ssa/gen-vect-25.c: Likewise.
* gcc.dg/tree-ssa/gen-vect-32.c: Likewise.
* gcc.dg/tree-ssa/ivopt_mult_2g.c: Likewise.
* gcc.dg/tree-ssa/ivopts-5.c: Likewise.
* gcc.dg/tree-ssa/ivopts-6.c: Likewise.
* gcc.dg/tree-ssa/ivopts-7.c: Likewise.
* gcc.dg/tree-ssa/ivopts-8.c: Likewise.
* gcc.dg/tree-ssa/ivopts-9.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-1.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-10.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-11.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-12.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-2.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-3.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-4.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-5.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-6.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-7.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-8.c: Likewise.
* gcc.dg/tree-ssa/predcom-dse-9.c: Likewise.
* gcc.target/i386/pr90178.c: Likewise.
|
|
When scanning for program points, i.e. vector statements, we're missing
pattern statements. In PR114516 this becomes obvious as we choose
LMUL=8 assuming there are only three statements but the divmod pattern
adds another three. Those push us beyond four registers so we need to
switch to LMUL=4.
This patch adds pattern statements to the program points which helps
calculate a better register pressure estimate.
PR target/114516
gcc/ChangeLog:
* config/riscv/riscv-vector-costs.cc (compute_estimated_lmul):
Add pattern statements to program points.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/costmodel/riscv/rvv/pr114516.c: New test.
|
|
[PR117023]
On top of the
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668554.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668699.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668700.html
patches the following patch adds nonnull_if_nonzero attribute(s) to
various builtins instead of or in addition to nonnull attribute.
The patch adjusts builtins (when we have them) corresponding to the APIs
mentioned in the C2Y N3322 paper:
1) strndup and memset get one nonnull_if_nonzero attribute instead of
nonnull
2) memcpy, memmove, strncpy, memcmp, strncmp get two nonnull_if_nonzero
attributes instead of nonnull
3) strncat has nonnull without argument changed to nonnull (1) and
gets one nonnull_if_nonzero for the src argument (maybe it needs
to be clarified in C2Y, but I really think first argument to strncat
and wcsncat shouldn't be NULL even for n == 0, because NULL doesn't
point to NULL terminated string and one can't append anything to it;
and various implementations in the wild including glibc will crash
with NULL first argument (x86_64 avx+ doesn't though)
Such changes are done also to the _chk suffixed counterparts of the
builtins.
Furthermore I've changed a couple of builtins for POSIX functions which
aren't covered by ISO C, but I'd expect if/when POSIX incorporates C2Y
it would do the same changes. In particular
4) strnlen gets one nonnull_if_nonzero instead of nonnull
5) mempcpy and stpncpy get two nonnull_if_nonzero instead of nonnull
and lose returns_nonnull attribute; this is kind of unfortunate
but I think in the spirit of N3322 mempcpy (NULL, src, 0) should
return NULL (i.e. dest + n aka NULL + 0, now valid) and it is hard to
express returns non-NULL if first argument is non-NULL or third argument
is non-zero
I'm not really sure about fread/fwrite, N3322 doesn't mention those,
can the first argument be NULL if third argument is 0? What about
if second argument is 0? Can the fourth argument be NULL in such cases?
And of course, when not using builtins the glibc headers will affect stuff
too, so we'll need to wait for N3322 implementation there too (possibly
by dropping the nonnull attributes and perhaps conditionally replacing them
with this new one if the compiler supports them).
2025-02-24 Jakub Jelinek <jakub@redhat.com>
PR c/117023
gcc/
* builtin-attrs.def (ATTR_NONNULL_IF_NONZERO): New DEF_ATTR_IDENT.
(ATTR_NOTHROW_NONNULL_IF12_LEAF, ATTR_NOTHROW_NONNULL_IF13_LEAF,
ATTR_NOTHROW_NONNULL_IF123_LEAF, ATTR_NOTHROW_NONNULL_IF23_LEAF,
ATTR_NOTHROW_NONNULL_1_IF23_LEAF, ATTR_PURE_NOTHROW_NONNULL_IF12_LEAF,
ATTR_PURE_NOTHROW_NONNULL_IF13_LEAF,
ATTR_PURE_NOTHROW_NONNULL_IF123_LEAF,
ATTR_WARN_UNUSED_RESULT_NOTHROW_NONNULL_IF12_LEAF,
ATTR_MALLOC_WARN_UNUSED_RESULT_NOTHROW_NONNULL_IF12_LEAF): New
DEF_ATTR_TREE_LIST.
* builtins.def (BUILT_IN_STRNDUP): Use
ATTR_MALLOC_WARN_UNUSED_RESULT_NOTHROW_NONNULL_IF12_LEAF instead of
ATTR_MALLOC_WARN_UNUSED_RESULT_NOTHROW_NONNULL_LEAF.
(BUILT_IN_STRNCAT, BUILT_IN_STRNCAT_CHK): Use
ATTR_NOTHROW_NONNULL_1_IF23_LEAF instead of ATTR_NOTHROW_NONNULL_LEAF.
(BUILT_IN_BCOPY, BUILT_IN_MEMCPY, BUILT_IN_MEMCPY_CHK,
BUILT_IN_MEMMOVE, BUILT_IN_MEMMOVE_CHK, BUILT_IN_STRNCPY,
BUILT_IN_STRNCPY_CHK): Use ATTR_NOTHROW_NONNULL_IF123_LEAF instead of
ATTR_NOTHROW_NONNULL_LEAF.
(BUILT_IN_MEMPCPY, BUILT_IN_MEMPCPY_CHK, BUILT_IN_STPNCPY,
BUILT_IN_STPNCPY_CHK): Use ATTR_NOTHROW_NONNULL_IF123_LEAF instead of
ATTR_RETNONNULL_NOTHROW_LEAF.
(BUILT_IN_BZERO, BUILT_IN_MEMSET, BUILT_IN_MEMSET_CHK): Use
ATTR_NOTHROW_NONNULL_IF13_LEAF instead of ATTR_NOTHROW_NONNULL_LEAF.
(BUILT_IN_BCMP, BUILT_IN_MEMCMP, BUILT_IN_STRNCASECMP,
BUILT_IN_STRNCMP): Use ATTR_PURE_NOTHROW_NONNULL_IF123_LEAF instead of
ATTR_PURE_NOTHROW_NONNULL_LEAF.
(BUILT_IN_STRNLEN): Use ATTR_PURE_NOTHROW_NONNULL_IF12_LEAF instead of
ATTR_PURE_NOTHROW_NONNULL_LEAF.
(BUILT_IN_MEMCHR): Use ATTR_PURE_NOTHROW_NONNULL_IF13_LEAF instead of
ATTR_PURE_NOTHROW_NONNULL_LEAF.
gcc/testsuite/
* gcc.dg/builtins-nonnull.c (test_memfuncs, test_memfuncs_chk,
test_strfuncs, test_strfuncs_chk): Add if (n == 0) return; at the
start of the functions.
* gcc.dg/Wnonnull-2.c: Copy __builtin_* call statements where
appropriate 3 times, once with 0 length, once with n and once with
non-zero constant and expect warning only in the third case.
Formatting fixes.
* gcc.dg/Wnonnull-3.c: Copy __builtin_* call statements where
appropriate 3 times, once with 0 length, once with n and once with
n guarded with n != 0 and expect warning only in the third case.
Formatting fixes.
* gcc.dg/nonnull-3.c (foo): Use 16 instead of 0 in the calls added
for PR80936.
* gcc.dg/nonnull-11.c: New test.
* c-c++-common/ubsan/nonnull-1.c: Don't expect runtime diagnostics
for the __builtin_memcpy call.
* gcc.dg/tree-ssa/pr78154.c (f): Add dn argument and return early
if it is NULL. Duplicate cases of builtins which have the first
argument changed from nonnull to nonnull_if_nonzero except stpncpy,
once with dn as first argument instead of d and once with constant
non-zero count rather than n. Disable the stpncpy non-null check.
* gcc.dg/Wbuiltin-declaration-mismatch-14.c (test_builtin_calls):
Triplicate the strncmp calls, once with 1 last argument and expect
warning, once with n last argument and don't expect warning and
once with 0 last argument and don't expect warning.
* gcc.dg/Wbuiltin-declaration-mismatch-15.c (test_builtin_calls_fe):
Likewise.
|
|
dynamic stack allocation not supported'
In Subversion r217296 (Git commit e2acc079ff125a869159be45371dc0a29b230e92)
"Testsuite alloca fixes for ptx", effective-target 'alloca' was added to mark
up test cases that run into the nvptx back end's non-support of dynamic stack
allocation. (Later, nvptx gained conditional support for that in
commit 3861d362ec7e3c50742fc43833fe9d8674f4070e
"nvptx: PTX 'alloca' for '-mptx=7.3'+, '-march=sm_52'+ [PR65181]", but on the
other hand, in commit f93a612fc4567652b75ffc916d31a446378e6613
"bpf: liberate R9 for general register allocation", the BPF back end joined
"the list of targets that do not support alloca in target-support.exp".
Manually maintaining the list of test cases requiring effective-target 'alloca'
is notoriously hard, gets out of date quickly: new test cases added to the test
suite may need to be analyzed and annotated, and over time annotations also may
need to be removed, in cases where the compiler learns to optimize out
'alloca'/VLA usage, for example. This commit replaces (99 % of) the manual
annotations with an automatic scheme: turn test cases into UNSUPPORTED if
running into 'sorry, unimplemented: dynamic stack allocation not supported'.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_alloca):
Gracefully handle the case that we've not be called (indirectly)
from 'dg-test'.
* lib/gcc-dg.exp (proc gcc-dg-prune): Turn
'sorry, unimplemented: dynamic stack allocation not supported' into
UNSUPPORTED.
* c-c++-common/Walloca-larger-than.c: Don't
'dg-require-effective-target alloca'.
* c-c++-common/Warray-bounds-9.c: Likewise.
* c-c++-common/Warray-bounds.c: Likewise.
* c-c++-common/Wdangling-pointer-2.c: Likewise.
* c-c++-common/Wdangling-pointer-4.c: Likewise.
* c-c++-common/Wdangling-pointer-5.c: Likewise.
* c-c++-common/Wdangling-pointer.c: Likewise.
* c-c++-common/Wimplicit-fallthrough-7.c: Likewise.
* c-c++-common/Wsizeof-pointer-memaccess1.c: Likewise.
* c-c++-common/Wsizeof-pointer-memaccess2.c: Likewise.
* c-c++-common/Wstringop-truncation.c: Likewise.
* c-c++-common/Wunused-var-6.c: Likewise.
* c-c++-common/Wunused-var-8.c: Likewise.
* c-c++-common/analyzer/alloca-leak.c: Likewise.
* c-c++-common/analyzer/allocation-size-multiline-2.c: Likewise.
* c-c++-common/analyzer/allocation-size-multiline-3.c: Likewise.
* c-c++-common/analyzer/capacity-1.c: Likewise.
* c-c++-common/analyzer/capacity-3.c: Likewise.
* c-c++-common/analyzer/imprecise-floating-point-1.c: Likewise.
* c-c++-common/analyzer/infinite-recursion-alloca.c: Likewise.
* c-c++-common/analyzer/malloc-callbacks.c: Likewise.
* c-c++-common/analyzer/malloc-paths-8.c: Likewise.
* c-c++-common/analyzer/out-of-bounds-5.c: Likewise.
* c-c++-common/analyzer/out-of-bounds-diagram-11.c: Likewise.
* c-c++-common/analyzer/uninit-alloca.c: Likewise.
* c-c++-common/analyzer/write-to-string-literal-5.c: Likewise.
* c-c++-common/asan/alloca_loop_unpoisoning.c: Likewise.
* c-c++-common/auto-init-11.c: Likewise.
* c-c++-common/auto-init-12.c: Likewise.
* c-c++-common/auto-init-15.c: Likewise.
* c-c++-common/auto-init-16.c: Likewise.
* c-c++-common/builtins.c: Likewise.
* c-c++-common/dwarf2/vla1.c: Likewise.
* c-c++-common/gomp/pr61486-2.c: Likewise.
* c-c++-common/torture/builtin-clear-padding-4.c: Likewise.
* c-c++-common/torture/strub-run3.c: Likewise.
* c-c++-common/torture/strub-run4.c: Likewise.
* c-c++-common/torture/strub-run4c.c: Likewise.
* c-c++-common/torture/strub-run4d.c: Likewise.
* c-c++-common/torture/strub-run4i.c: Likewise.
* g++.dg/Walloca1.C: Likewise.
* g++.dg/Walloca2.C: Likewise.
* g++.dg/cpp0x/pr70338.C: Likewise.
* g++.dg/cpp1y/lambda-generic-vla1.C: Likewise.
* g++.dg/cpp1y/vla10.C: Likewise.
* g++.dg/cpp1y/vla2.C: Likewise.
* g++.dg/cpp1y/vla6.C: Likewise.
* g++.dg/cpp1y/vla8.C: Likewise.
* g++.dg/debug/debug5.C: Likewise.
* g++.dg/debug/debug6.C: Likewise.
* g++.dg/debug/pr54828.C: Likewise.
* g++.dg/diagnostic/pr70105.C: Likewise.
* g++.dg/eh/cleanup5.C: Likewise.
* g++.dg/eh/spbp.C: Likewise.
* g++.dg/ext/builtin_alloca.C: Likewise.
* g++.dg/ext/tmplattr9.C: Likewise.
* g++.dg/ext/vla10.C: Likewise.
* g++.dg/ext/vla11.C: Likewise.
* g++.dg/ext/vla12.C: Likewise.
* g++.dg/ext/vla15.C: Likewise.
* g++.dg/ext/vla16.C: Likewise.
* g++.dg/ext/vla17.C: Likewise.
* g++.dg/ext/vla23.C: Likewise.
* g++.dg/ext/vla3.C: Likewise.
* g++.dg/ext/vla6.C: Likewise.
* g++.dg/ext/vla7.C: Likewise.
* g++.dg/init/array24.C: Likewise.
* g++.dg/init/new47.C: Likewise.
* g++.dg/init/pr55497.C: Likewise.
* g++.dg/opt/pr78201.C: Likewise.
* g++.dg/template/vla2.C: Likewise.
* g++.dg/torture/Wsizeof-pointer-memaccess1.C: Likewise.
* g++.dg/torture/Wsizeof-pointer-memaccess2.C: Likewise.
* g++.dg/torture/pr62127.C: Likewise.
* g++.dg/torture/pr67055.C: Likewise.
* g++.dg/torture/stackalign/eh-alloca-1.C: Likewise.
* g++.dg/torture/stackalign/eh-inline-2.C: Likewise.
* g++.dg/torture/stackalign/eh-vararg-1.C: Likewise.
* g++.dg/torture/stackalign/eh-vararg-2.C: Likewise.
* g++.dg/warn/Wplacement-new-size-5.C: Likewise.
* g++.dg/warn/Wsizeof-pointer-memaccess-1.C: Likewise.
* g++.dg/warn/Wvla-1.C: Likewise.
* g++.dg/warn/Wvla-3.C: Likewise.
* g++.old-deja/g++.ext/array2.C: Likewise.
* g++.old-deja/g++.ext/constructor.C: Likewise.
* g++.old-deja/g++.law/builtin1.C: Likewise.
* g++.old-deja/g++.other/crash12.C: Likewise.
* g++.old-deja/g++.other/eh3.C: Likewise.
* g++.old-deja/g++.pt/array6.C: Likewise.
* g++.old-deja/g++.pt/dynarray.C: Likewise.
* gcc.c-torture/compile/20000923-1.c: Likewise.
* gcc.c-torture/compile/20030224-1.c: Likewise.
* gcc.c-torture/compile/20071108-1.c: Likewise.
* gcc.c-torture/compile/20071117-1.c: Likewise.
* gcc.c-torture/compile/900313-1.c: Likewise.
* gcc.c-torture/compile/parms.c: Likewise.
* gcc.c-torture/compile/pr17397.c: Likewise.
* gcc.c-torture/compile/pr35006.c: Likewise.
* gcc.c-torture/compile/pr42956.c: Likewise.
* gcc.c-torture/compile/pr51354.c: Likewise.
* gcc.c-torture/compile/pr52714.c: Likewise.
* gcc.c-torture/compile/pr55851.c: Likewise.
* gcc.c-torture/compile/pr77754-1.c: Likewise.
* gcc.c-torture/compile/pr77754-2.c: Likewise.
* gcc.c-torture/compile/pr77754-3.c: Likewise.
* gcc.c-torture/compile/pr77754-4.c: Likewise.
* gcc.c-torture/compile/pr77754-5.c: Likewise.
* gcc.c-torture/compile/pr77754-6.c: Likewise.
* gcc.c-torture/compile/pr78439.c: Likewise.
* gcc.c-torture/compile/pr79413.c: Likewise.
* gcc.c-torture/compile/pr82564.c: Likewise.
* gcc.c-torture/compile/pr87110.c: Likewise.
* gcc.c-torture/compile/pr99787-1.c: Likewise.
* gcc.c-torture/compile/vla-const-1.c: Likewise.
* gcc.c-torture/compile/vla-const-2.c: Likewise.
* gcc.c-torture/execute/20010209-1.c: Likewise.
* gcc.c-torture/execute/20020314-1.c: Likewise.
* gcc.c-torture/execute/20020412-1.c: Likewise.
* gcc.c-torture/execute/20021113-1.c: Likewise.
* gcc.c-torture/execute/20040223-1.c: Likewise.
* gcc.c-torture/execute/20040308-1.c: Likewise.
* gcc.c-torture/execute/20040811-1.c: Likewise.
* gcc.c-torture/execute/20070824-1.c: Likewise.
* gcc.c-torture/execute/20070919-1.c: Likewise.
* gcc.c-torture/execute/built-in-setjmp.c: Likewise.
* gcc.c-torture/execute/pr22061-1.c: Likewise.
* gcc.c-torture/execute/pr43220.c: Likewise.
* gcc.c-torture/execute/pr82210.c: Likewise.
* gcc.c-torture/execute/pr86528.c: Likewise.
* gcc.c-torture/execute/vla-dealloc-1.c: Likewise.
* gcc.dg/20001012-2.c: Likewise.
* gcc.dg/20020415-1.c: Likewise.
* gcc.dg/20030331-2.c: Likewise.
* gcc.dg/20101010-1.c: Likewise.
* gcc.dg/Walloca-1.c: Likewise.
* gcc.dg/Walloca-10.c: Likewise.
* gcc.dg/Walloca-11.c: Likewise.
* gcc.dg/Walloca-12.c: Likewise.
* gcc.dg/Walloca-13.c: Likewise.
* gcc.dg/Walloca-14.c: Likewise.
* gcc.dg/Walloca-15.c: Likewise.
* gcc.dg/Walloca-2.c: Likewise.
* gcc.dg/Walloca-3.c: Likewise.
* gcc.dg/Walloca-4.c: Likewise.
* gcc.dg/Walloca-5.c: Likewise.
* gcc.dg/Walloca-6.c: Likewise.
* gcc.dg/Walloca-7.c: Likewise.
* gcc.dg/Walloca-8.c: Likewise.
* gcc.dg/Walloca-9.c: Likewise.
* gcc.dg/Walloca-larger-than-2.c: Likewise.
* gcc.dg/Walloca-larger-than-3.c: Likewise.
* gcc.dg/Walloca-larger-than-4.c: Likewise.
* gcc.dg/Walloca-larger-than.c: Likewise.
* gcc.dg/Warray-bounds-22.c: Likewise.
* gcc.dg/Warray-bounds-41.c: Likewise.
* gcc.dg/Warray-bounds-46.c: Likewise.
* gcc.dg/Warray-bounds-48-novec.c: Likewise.
* gcc.dg/Warray-bounds-48.c: Likewise.
* gcc.dg/Warray-bounds-50.c: Likewise.
* gcc.dg/Warray-bounds-63.c: Likewise.
* gcc.dg/Warray-bounds-66.c: Likewise.
* gcc.dg/Wdangling-pointer.c: Likewise.
* gcc.dg/Wfree-nonheap-object-2.c: Likewise.
* gcc.dg/Wfree-nonheap-object.c: Likewise.
* gcc.dg/Wrestrict-17.c: Likewise.
* gcc.dg/Wrestrict.c: Likewise.
* gcc.dg/Wreturn-local-addr-2.c: Likewise.
* gcc.dg/Wreturn-local-addr-3.c: Likewise.
* gcc.dg/Wreturn-local-addr-4.c: Likewise.
* gcc.dg/Wreturn-local-addr-6.c: Likewise.
* gcc.dg/Wsizeof-pointer-memaccess1.c: Likewise.
* gcc.dg/Wstack-usage.c: Likewise.
* gcc.dg/Wstrict-aliasing-bogus-vla-1.c: Likewise.
* gcc.dg/Wstrict-overflow-27.c: Likewise.
* gcc.dg/Wstringop-overflow-15.c: Likewise.
* gcc.dg/Wstringop-overflow-23.c: Likewise.
* gcc.dg/Wstringop-overflow-25.c: Likewise.
* gcc.dg/Wstringop-overflow-27.c: Likewise.
* gcc.dg/Wstringop-overflow-3.c: Likewise.
* gcc.dg/Wstringop-overflow-39.c: Likewise.
* gcc.dg/Wstringop-overflow-56.c: Likewise.
* gcc.dg/Wstringop-overflow-57.c: Likewise.
* gcc.dg/Wstringop-overflow-67.c: Likewise.
* gcc.dg/Wstringop-overflow-71.c: Likewise.
* gcc.dg/Wstringop-truncation-3.c: Likewise.
* gcc.dg/Wvla-larger-than-1.c: Likewise.
* gcc.dg/Wvla-larger-than-2.c: Likewise.
* gcc.dg/Wvla-larger-than-3.c: Likewise.
* gcc.dg/Wvla-larger-than-4.c: Likewise.
* gcc.dg/Wvla-larger-than-5.c: Likewise.
* gcc.dg/analyzer/boxed-malloc-1.c: Likewise.
* gcc.dg/analyzer/call-summaries-2.c: Likewise.
* gcc.dg/analyzer/malloc-1.c: Likewise.
* gcc.dg/analyzer/malloc-reuse.c: Likewise.
* gcc.dg/analyzer/out-of-bounds-diagram-12.c: Likewise.
* gcc.dg/analyzer/pr93355-localealias.c: Likewise.
* gcc.dg/analyzer/putenv-1.c: Likewise.
* gcc.dg/analyzer/taint-alloc-1.c: Likewise.
* gcc.dg/analyzer/torture/pr93373.c: Likewise.
* gcc.dg/analyzer/torture/ubsan-1.c: Likewise.
* gcc.dg/analyzer/vla-1.c: Likewise.
* gcc.dg/atomic/stdatomic-vm.c: Likewise.
* gcc.dg/attr-alloc_size-6.c: Likewise.
* gcc.dg/attr-alloc_size-7.c: Likewise.
* gcc.dg/attr-alloc_size-8.c: Likewise.
* gcc.dg/attr-alloc_size-9.c: Likewise.
* gcc.dg/attr-noipa.c: Likewise.
* gcc.dg/auto-init-uninit-36.c: Likewise.
* gcc.dg/auto-init-uninit-9.c: Likewise.
* gcc.dg/auto-type-1.c: Likewise.
* gcc.dg/builtin-alloc-size.c: Likewise.
* gcc.dg/builtin-dynamic-alloc-size.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-1.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-2.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-3.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-4.c: Likewise.
* gcc.dg/builtin-object-size-1.c: Likewise.
* gcc.dg/builtin-object-size-2.c: Likewise.
* gcc.dg/builtin-object-size-3.c: Likewise.
* gcc.dg/builtin-object-size-4.c: Likewise.
* gcc.dg/builtins-64.c: Likewise.
* gcc.dg/builtins-68.c: Likewise.
* gcc.dg/c23-auto-2.c: Likewise.
* gcc.dg/c99-const-expr-13.c: Likewise.
* gcc.dg/c99-vla-1.c: Likewise.
* gcc.dg/fold-alloca-1.c: Likewise.
* gcc.dg/gomp/pr30494.c: Likewise.
* gcc.dg/gomp/vla-2.c: Likewise.
* gcc.dg/gomp/vla-3.c: Likewise.
* gcc.dg/gomp/vla-4.c: Likewise.
* gcc.dg/gomp/vla-5.c: Likewise.
* gcc.dg/graphite/pr99085.c: Likewise.
* gcc.dg/guality/guality.c: Likewise.
* gcc.dg/lto/pr80778_0.c: Likewise.
* gcc.dg/nested-func-10.c: Likewise.
* gcc.dg/nested-func-12.c: Likewise.
* gcc.dg/nested-func-13.c: Likewise.
* gcc.dg/nested-func-14.c: Likewise.
* gcc.dg/nested-func-15.c: Likewise.
* gcc.dg/nested-func-16.c: Likewise.
* gcc.dg/nested-func-17.c: Likewise.
* gcc.dg/nested-func-9.c: Likewise.
* gcc.dg/packed-vla.c: Likewise.
* gcc.dg/pr100225.c: Likewise.
* gcc.dg/pr25682.c: Likewise.
* gcc.dg/pr27301.c: Likewise.
* gcc.dg/pr31507-1.c: Likewise.
* gcc.dg/pr33238.c: Likewise.
* gcc.dg/pr41470.c: Likewise.
* gcc.dg/pr49120.c: Likewise.
* gcc.dg/pr50764.c: Likewise.
* gcc.dg/pr51491-2.c: Likewise.
* gcc.dg/pr51990-2.c: Likewise.
* gcc.dg/pr51990.c: Likewise.
* gcc.dg/pr59011.c: Likewise.
* gcc.dg/pr59523.c: Likewise.
* gcc.dg/pr61561.c: Likewise.
* gcc.dg/pr78468.c: Likewise.
* gcc.dg/pr78902.c: Likewise.
* gcc.dg/pr79972.c: Likewise.
* gcc.dg/pr82875.c: Likewise.
* gcc.dg/pr83844.c: Likewise.
* gcc.dg/pr84131.c: Likewise.
* gcc.dg/pr87099.c: Likewise.
* gcc.dg/pr87320.c: Likewise.
* gcc.dg/pr89045.c: Likewise.
* gcc.dg/pr91014.c: Likewise.
* gcc.dg/pr93986.c: Likewise.
* gcc.dg/pr98721-1.c: Likewise.
* gcc.dg/pr99122-2.c: Likewise.
* gcc.dg/shrink-wrap-alloca.c: Likewise.
* gcc.dg/sso-14.c: Likewise.
* gcc.dg/strlenopt-62.c: Likewise.
* gcc.dg/strlenopt-83.c: Likewise.
* gcc.dg/strlenopt-84.c: Likewise.
* gcc.dg/strlenopt-91.c: Likewise.
* gcc.dg/torture/Wsizeof-pointer-memaccess1.c: Likewise.
* gcc.dg/torture/calleesave-sse.c: Likewise.
* gcc.dg/torture/pr48953.c: Likewise.
* gcc.dg/torture/pr71881.c: Likewise.
* gcc.dg/torture/pr71901.c: Likewise.
* gcc.dg/torture/pr78742.c: Likewise.
* gcc.dg/torture/pr92088-1.c: Likewise.
* gcc.dg/torture/pr92088-2.c: Likewise.
* gcc.dg/torture/pr93124.c: Likewise.
* gcc.dg/torture/pr94479.c: Likewise.
* gcc.dg/torture/stackalign/alloca-1.c: Likewise.
* gcc.dg/torture/stackalign/inline-2.c: Likewise.
* gcc.dg/torture/stackalign/nested-3.c: Likewise.
* gcc.dg/torture/stackalign/vararg-1.c: Likewise.
* gcc.dg/torture/stackalign/vararg-2.c: Likewise.
* gcc.dg/tree-ssa/20030807-2.c: Likewise.
* gcc.dg/tree-ssa/20080530.c: Likewise.
* gcc.dg/tree-ssa/alias-37.c: Likewise.
* gcc.dg/tree-ssa/builtin-sprintf-warn-22.c: Likewise.
* gcc.dg/tree-ssa/builtin-sprintf-warn-25.c: Likewise.
* gcc.dg/tree-ssa/builtin-sprintf-warn-3.c: Likewise.
* gcc.dg/tree-ssa/loop-interchange-15.c: Likewise.
* gcc.dg/tree-ssa/pr23848-1.c: Likewise.
* gcc.dg/tree-ssa/pr23848-2.c: Likewise.
* gcc.dg/tree-ssa/pr23848-3.c: Likewise.
* gcc.dg/tree-ssa/pr23848-4.c: Likewise.
* gcc.dg/uninit-32.c: Likewise.
* gcc.dg/uninit-36.c: Likewise.
* gcc.dg/uninit-39.c: Likewise.
* gcc.dg/uninit-41.c: Likewise.
* gcc.dg/uninit-9-O0.c: Likewise.
* gcc.dg/uninit-9.c: Likewise.
* gcc.dg/uninit-pr100250.c: Likewise.
* gcc.dg/uninit-pr101300.c: Likewise.
* gcc.dg/uninit-pr101494.c: Likewise.
* gcc.dg/uninit-pr98583.c: Likewise.
* gcc.dg/vla-2.c: Likewise.
* gcc.dg/vla-22.c: Likewise.
* gcc.dg/vla-24.c: Likewise.
* gcc.dg/vla-3.c: Likewise.
* gcc.dg/vla-4.c: Likewise.
* gcc.dg/vla-stexp-1.c: Likewise.
* gcc.dg/vla-stexp-2.c: Likewise.
* gcc.dg/vla-stexp-4.c: Likewise.
* gcc.dg/vla-stexp-5.c: Likewise.
* gcc.dg/winline-7.c: Likewise.
* gcc.target/aarch64/stack-check-alloca-1.c: Likewise.
* gcc.target/aarch64/stack-check-alloca-10.c: Likewise.
* gcc.target/aarch64/stack-check-alloca-2.c: Likewise.
* gcc.target/aarch64/stack-check-alloca-3.c: Likewise.
* gcc.target/aarch64/stack-check-alloca-4.c: Likewise.
* gcc.target/aarch64/stack-check-alloca-5.c: Likewise.
* gcc.target/aarch64/stack-check-alloca-6.c: Likewise.
* gcc.target/aarch64/stack-check-alloca-7.c: Likewise.
* gcc.target/aarch64/stack-check-alloca-8.c: Likewise.
* gcc.target/aarch64/stack-check-alloca-9.c: Likewise.
* gcc.target/arc/interrupt-6.c: Likewise.
* gcc.target/i386/pr80969-3.c: Likewise.
* gcc.target/loongarch/stack-check-alloca-1.c: Likewise.
* gcc.target/loongarch/stack-check-alloca-2.c: Likewise.
* gcc.target/loongarch/stack-check-alloca-3.c: Likewise.
* gcc.target/loongarch/stack-check-alloca-4.c: Likewise.
* gcc.target/loongarch/stack-check-alloca-5.c: Likewise.
* gcc.target/loongarch/stack-check-alloca-6.c: Likewise.
* gcc.target/riscv/stack-check-alloca-1.c: Likewise.
* gcc.target/riscv/stack-check-alloca-10.c: Likewise.
* gcc.target/riscv/stack-check-alloca-2.c: Likewise.
* gcc.target/riscv/stack-check-alloca-3.c: Likewise.
* gcc.target/riscv/stack-check-alloca-4.c: Likewise.
* gcc.target/riscv/stack-check-alloca-5.c: Likewise.
* gcc.target/riscv/stack-check-alloca-6.c: Likewise.
* gcc.target/riscv/stack-check-alloca-7.c: Likewise.
* gcc.target/riscv/stack-check-alloca-8.c: Likewise.
* gcc.target/riscv/stack-check-alloca-9.c: Likewise.
* gcc.target/sparc/setjmp-1.c: Likewise.
* gcc.target/x86_64/abi/ms-sysv/ms-sysv.c: Likewise.
* gcc.c-torture/compile/20001221-1.c: Don't 'dg-skip-if'
for '! alloca'.
* gcc.c-torture/compile/20020807-1.c: Likewise.
* gcc.c-torture/compile/20050801-2.c: Likewise.
* gcc.c-torture/compile/920428-4.c: Likewise.
* gcc.c-torture/compile/debugvlafunction-1.c: Likewise.
* gcc.c-torture/compile/pr41469.c: Likewise.
* gcc.c-torture/execute/920721-2.c: Likewise.
* gcc.c-torture/execute/920929-1.c: Likewise.
* gcc.c-torture/execute/921017-1.c: Likewise.
* gcc.c-torture/execute/941202-1.c: Likewise.
* gcc.c-torture/execute/align-nest.c: Likewise.
* gcc.c-torture/execute/alloca-1.c: Likewise.
* gcc.c-torture/execute/pr22061-4.c: Likewise.
* gcc.c-torture/execute/pr36321.c: Likewise.
* gcc.dg/torture/pr8081.c: Likewise.
* gcc.dg/analyzer/data-model-1.c: Don't
'dg-require-effective-target alloca'. XFAIL relevant
'dg-warning's for '! alloca'.
* gcc.dg/uninit-38.c: Likewise.
* gcc.dg/uninit-pr98578.c: Likewise.
* gcc.dg/compat/struct-by-value-22_main.c: Comment on
'dg-require-effective-target alloca'.
libstdc++-v3/
* testsuite/lib/prune.exp (proc libstdc++-dg-prune): Turn
'sorry, unimplemented: dynamic stack allocation not supported' into
UNSUPPORTED.
|
|
When predicitive commoning moves an invariant ref it makes sure to
not build a MEM_REF with a base that is negatively offsetted from
an object. But in trying to preserve some transforms it does not
consider association of a constant offset with the address computation
in DR_BASE_ADDRESS leading to exactly this problem again. This is
arguably a problem in data-ref analysis producing such an out-of-bound
DR_BASE_ADDRESS, but this looks quite involved to fix, so the
following avoids the association in one more case. This fixes the
testcase while preserving the desired transform in
gcc.dg/tree-ssa/predcom-1.c.
PR tree-optimization/118954
* tree-predcom.cc (ref_at_iteration): Make sure to not
associate the constant offset with DR_BASE_ADDRESS when
that is an offsetted pointer.
* gcc.dg/torture/pr118954.c: New testcase.
|
|
Previously the analyzer treated IFN_UBSAN_BOUNDS as a no-op, but
the other IFN_UBSAN_* were unrecognized and conservatively treated
as having arbitrary behavior.
Treat IFN_UBSAN_NULL and IFN_UBSAN_PTR also as no-ops, which should
make -fanalyzer behave better with -fsanitize=undefined.
gcc/analyzer/ChangeLog:
PR analyzer/118300
* kf.cc (class kf_ubsan_bounds): Replace this with...
(class kf_ubsan_noop): ...this.
(register_sanitizer_builtins): Use it to handle IFN_UBSAN_NULL,
IFN_UBSAN_BOUNDS, and IFN_UBSAN_PTR as nop-ops.
(register_known_functions): Drop handling of IFN_UBSAN_BOUNDS
here, as it's now handled by register_sanitizer_builtins above.
gcc/testsuite/ChangeLog:
PR analyzer/118300
* gcc.dg/analyzer/ubsan-pr118300.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
When use_gcc_stdint=provide, the stdint-gcc.h header is not provided.
2025-02-18 John David Anglin <danglin@gcc.gnu.org>
gcc/testsuite/ChangeLog:
PR testsuite/116986
* gcc.dg/crc-builtin-rev-target32.c: Include stdint.h
instead of stdint-gcc.h.
* gcc.dg/crc-builtin-rev-target64.c: Likewise.
* gcc.dg/crc-builtin-target32.c: Likewise.
* gcc.dg/crc-builtin-target64.c: Likewise.
* gcc.dg/torture/pr115387-2.c: Likewise.
|
|
The following shows that tail-merging will make dead SSA defs live
in paths where it wasn't before, possibly introducing UB or as
in this case, uses of abnormals that eventually fail coalescing
later. The fix is to register such defs for stmt comparison.
PR tree-optimization/98845
* tree-ssa-tail-merge.cc (stmt_local_def): Consider a
def with no uses not local.
* gcc.dg/pr98845.c: New testcase.
* gcc.dg/pr81192.c: Adjust.
|
|
A compare with zero may be taken as a sign bit test by
fold_truth_andor_for_ifcombine, but the operand may be extended from a
narrower field. If the operand was narrower, the bitsize will reflect
the narrowing conversion, but if it was wider, we'll only know whether
the field is sign- or zero-extended from unsignedp, but we won't know
whether it needed to be extended, because arg will have changed to the
narrower variable when we get to the point in which we can compute the
arg width. If it's sign-extended, we're testing the right bit, but if
it's zero-extended, there isn't any bit we can test.
Instead of punting and leaving the foldable compare to be figured out
by another pass, arrange for the sign bit resulting from the widening
zero-extension to be taken as zero, so that the modified compare will
yield the desired result.
While at that, avoid swapping the right-hand compare operands when
we've already determined that it was a signbit test: it no use to even
try.
for gcc/ChangeLog
PR tree-optimization/118805
* gimple-fold.cc (fold_truth_andor_for_combine): Detect and
cope with zero-extension in signbit tests. Reject swapping
right-compare operands if rsignbit.
for gcc/testsuite/ChangeLog
PR tree-optimization/118805
* gcc.dg/field-merge-26.c: New.
|
|
Constant integers with MSB set have to be represented as corresponding
signed integers. Use gen_int_mode to emit them in the correct way.
PR middle-end/118288
gcc/ChangeLog:
* builtins.cc (expand_builtin_crc_table_based):
Use gen_int_mode to emit constant integers with MSB set.
gcc/testsuite/ChangeLog:
* gcc.dg/pr118288.c: New test.
|
|
When we simplify a NARY during PHI translation we have to make sure
to not inject not available operands into it given that might violate
the valueization hook constraints and we'd pick up invalid
context-sensitive data in further simplification or as in this case
later ICE when we try to insert the expression.
PR tree-optimization/118895
* tree-ssa-sccvn.cc (vn_nary_build_or_lookup_1): Only allow
CSE if we can verify the result is available.
* gcc.dg/pr118895.c: New testcase.
|
|
SUB_OVERFLOW
So this is a fairly old regression, but with all the ranger work that's been
done, it's become easy to resolve.
The basic idea here is to use known relationships between two operands of a
SUB_OVERFLOW IFN to statically compute the overflow state and ultimately allow
turning the IFN into simple arithmetic (or for the tests in this BZ elide the
arithmetic entirely).
The regression example is when the two inputs are known equal. In that case
the subtraction will never overflow. But there's a few other cases we can
handle as well.
a == b -> never overflows
a > b -> never overflows when A and B are unsigned
a >= b -> never overflows when A and B are unsigned
a < b -> always overflows when A and B are unsigned
Bootstrapped and regression tested on x86, and regression tested on the usual
cross platforms.
This is Jakub's version of the vr-values.cc fix rather than Jeff's.
PR tree-optimization/98028
gcc/
* vr-values.cc (check_for_binary_op_overflow): Try to use a known
relationship betwen op0/op1 to statically determine overflow state.
gcc/testsuite
* gcc.dg/tree-ssa/pr98028.c: New test.
|
|
502.gcc_r when built with -fprofile-generate exposes a SLP discovery
issue where an IV forced live due to early break is not properly
discovered if its latch def is part of a different IVs SSA cycle.
To mitigate this we have to make sure to create an SLP instance
for the original IV. Ideally we'd handle all vect_induction_def
the same but this is left for next stage1.
PR tree-optimization/118852
* tree-vect-slp.cc (vect_analyze_slp): For early-break
forced-live IVs make sure we create an appropriate
entry into the SLP graph.
* gcc.dg/vect/pr118852.c: New testcase.
|
|
The representation of CONSTRUCTOR nodes in VN NARY and gimple_match_op
do not agree so do not attempt to marshal between them.
PR tree-optimization/118817
* tree-ssa-sccvn.cc (vn_nary_simplify): Do not process
CONSTRUCTOR NARY or update from CONSTRUCTOR simplified
gimple_match_op.
* gcc.dg/pr118817.c: New testcase.
|
|
One of the testcases from PR 118097 and the one from PR 118535 show
that the fix to PR 118138 was incomplete. We must not only make sure
that (intermediate) results of operations performed by IPA-CP are
fold_converted to the type of the destination formal parameter but we
also must decouple the these types from the ones in which operations
are performed.
This patch does that, even though we do not store or stream the
operation types, instead we simply limit ourselves to tcc_comparisons
and operations for which the first operand and the result are of the
same type as determined by expr_type_first_operand_type_p. If we
wanted to go beyond these, we would indeed need to store/stream the
respective operation type.
ipa_value_from_jfunc needs an additional check that res_type is not
NULL because it is not called just from within IPA-CP (where we know
we have a destination lattice slot belonging to a defined parameter)
but also from inlining, ipa-fnsummary and ipa-modref where it is used
to examine a call to a function with variadic arguments and we do not
have types for the unknown parameters. But we cannot really work with
those or estimate any benefits when it comes to them, so ignoring them
should be OK.
Even after this patch, ipa_get_jf_arith_result has a parameter called
res_type in which it performs operations for aggregate jump functions,
where we do not allow type conversions when constucting the jump
functions and the type is the type of the stored data. In GCC 16, we
could relax this and allow conversions like for scalars.
gcc/ChangeLog:
2025-01-20 Martin Jambor <mjambor@suse.cz>
PR ipa/118097
* ipa-cp.cc (ipa_get_jf_arith_result): Adjust comment.
(ipa_get_jf_pass_through_result): Removed.
(ipa_value_from_jfunc): Use directly ipa_get_jf_arith_result, do
not specify operation type but make sure we check and possibly
convert the result.
(get_val_across_arith_op): Remove the last parameter, always pass
NULL_TREE to ipa_get_jf_arith_result in its last argument.
(propagate_vals_across_arith_jfunc): Do not pass res_type to
get_val_across_arith_op.
(propagate_vals_across_pass_through): Add checking assert that
parm_type is not NULL.
gcc/testsuite/ChangeLog:
2025-01-24 Martin Jambor <mjambor@suse.cz>
PR ipa/118097
* gcc.dg/ipa/pr118097.c: New test.
* gcc.dg/ipa/pr118535.c: Likewise.
* gcc.dg/ipa/ipa-notypes-1.c: Likewise.
|
|
These two tests now vectorize the result finding
loop with PFA and so the number of loops checked
fails.
This fixes them by adding #pragma GCC novector to
the testcases.
gcc/testsuite/ChangeLog:
PR testsuite/118754
* gcc.dg/vect/vect-tail-nomask-1.c: Add novector.
* gcc.target/i386/pr106010-8c.c: Likewise.
|
|
For GCN, this avoids ICEs further down the compilation pipeline. For nvptx,
there's effectively no change: in presence of exception handling constructs,
instead of 'sorry, unimplemented: target cannot support nonlocal goto', we
now emit 'sorry, unimplemented: exception handling not supported'.
Additionally, turn test cases into UNSUPPORTED if running into
'sorry, unimplemented: exception handling not supported'.
gcc/
* config/gcn/gcn.md (exception_receiver): 'define_expand'.
* config/nvptx/nvptx.md (exception_receiver): Likewise.
gcc/testsuite/
* lib/gcc-dg.exp (gcc-dg-prune): Turn
'sorry, unimplemented: exception handling not supported' into
UNSUPPORTED.
* gcc.dg/pr104464.c: Remove GCN XFAIL.
libstdc++-v3/
* testsuite/lib/prune.exp (libstdc++-dg-prune): Turn
'sorry, unimplemented: exception handling not supported' into
UNSUPPORTED.
|
|
into 'exceptions'
For example, for nvptx, these test cases currently indeed fail with
'sorry, unimplemented: target cannot support nonlocal goto'. However,
that's just an artefact of non-existing support for exception handling,
and these test cases already require effective-target 'exceptions'.
gcc/testsuite/
* gcc.dg/cleanup-12.c: Don't 'dg-skip-if "" { ! nonlocal_goto }'.
* gcc.dg/cleanup-13.c: Likewise.
* gcc.dg/cleanup-5.c: Likewise.
* gcc.dg/gimplefe-44.c: Don't
'dg-require-effective-target nonlocal_goto'.
|
|
I confirm that back then, 'gcc.dg/pr88870.c' for nvptx failed due to
'sorry, unimplemented: target cannot support nonlocal goto', however at some
(indeterminate) point in time, that must've disappeared, and we now don't have
to 'dg-require-effective-target nonlocal_goto' anymore, and therefore get:
[-UNSUPPORTED:-]{+PASS:+} gcc.dg/pr88870.c {+(test for excess errors)+}
(And, if ever necessary again, this nowadays probably should
'dg-require-effective-target exceptions' instead of 'nonlocal_goto'.)
gcc/testsuite/
* gcc.dg/pr88870.c: Don't 'dg-require-effective-target nonlocal_goto'.
|
|
On leon3-elf and presumably on other targets, the test fails due to
differences in calling conventions and other reasons, that add extra
gimple stmts that prevent the expected optimization at the expected
point. The optimization takes place anyway, just a little later, so
tolerate that.
for gcc/testsuite/ChangeLog
PR tree-optimization/108357
* gcc.dg/tree-ssa/pr108357.c: Tolerate later optimization.
|
|
If decode_field_reference finds a load that accesses past the inner
object's size, bail out.
Drop the too-strict assert.
for gcc/ChangeLog
PR tree-optimization/118514
PR tree-optimization/118706
* gimple-fold.cc (decode_field_reference): Refuse to consider
merging out-of-bounds BIT_FIELD_REFs.
(make_bit_field_load): Drop too-strict assert.
* tree-eh.cc (bit_field_ref_in_bounds_p): Rename to...
(access_in_bounds_of_type_p): ... this. Change interface,
export.
(tree_could_trap_p): Adjust.
* tree-eh.h (access_in_bounds_of_type_p): Declare.
for gcc/testsuite/ChangeLog
PR tree-optimization/118514
PR tree-optimization/118706
* gcc.dg/field-merge-25.c: New.
|
|
The following test ICEs on RISC-V at least latently since
r14-1622-g99bfdb072e67fa3fe294d86b4b2a9f686f8d9705 which added
RISC-V specific case to get_biv_step_1 to recognize also
({zero,sign}_extend:DI (plus:SI op0 op1))
The reason for the ICE is that op1 in this case is CONST_POLY_INT
which unlike the really expected VOIDmode CONST_INTs has its own
mode and still satisfies CONSTANT_P.
GET_MODE (rhs) (SImode) is different from outer_mode (DImode), so
the function later does
*inner_step = simplify_gen_binary (code, outer_mode,
*inner_step, op1);
but that obviously ICEs because while *inner_step is either VOIDmode
or DImode, op1 has SImode.
The following patch fixes it by extending op1 using code so that
simplify_gen_binary can handle it. Another option would be
to change the !CONSTANT_P (op1) 3 lines above this to
!CONST_INT_P (op1), I think it isn't very likely that we get something
useful from other constants there.
2025-02-06 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/117506
* loop-iv.cc (get_biv_step_1): For {ZERO,SIGN}_EXTEND
of PLUS apply {ZERO,SIGN}_EXTEND to op1.
* gcc.dg/pr117506.c: New test.
* gcc.target/riscv/pr117506.c: New test.
|
|
The vectorizer thinks it can align a vector access to 16 bytes when
using a vectorization factor of 8 and 1 byte elements. That of
course does not work for the 2nd vector iteration. Apparently we
lack a guard against such nonsense.
PR tree-optimization/118749
* tree-vect-data-refs.cc (vector_alignment_reachable_p): Pass
in the vectorization factor, when that cannot maintain
the DRs target alignment do not claim we can reach that
by peeling.
* gcc.dg/vect/pr118749.c: New testcase.
|
|
The following testcase is miscompiled on x86_64 during postreload.
After reload (with IPA-RA figuring out the calls don't modify any
registers but %rax for return value) postreload sees
(insn 14 12 15 2 (set (mem:DI (plus:DI (reg/f:DI 7 sp)
(const_int 16 [0x10])) [0 S8 A64])
(reg:DI 1 dx [orig:105 q+16 ] [105])) "pr117239.c":18:7 95 {*movdi_internal}
(nil))
(call_insn/i 15 14 16 2 (set (reg:SI 0 ax)
(call (mem:QI (symbol_ref:DI ("baz") [flags 0x3] <function_decl 0x7ffb2e2bdf00 r>) [0 baz S1 A8])
(const_int 24 [0x18]))) "pr117239.c":18:7 1476 {*call_value}
(expr_list:REG_CALL_DECL (symbol_ref:DI ("baz") [flags 0x3] <function_decl 0x7ffb2e2bdf00 baz>)
(expr_list:REG_EH_REGION (const_int 0 [0])
(nil)))
(nil))
(insn 16 15 18 2 (parallel [
(set (reg/f:DI 7 sp)
(plus:DI (reg/f:DI 7 sp)
(const_int 24 [0x18])))
(clobber (reg:CC 17 flags))
]) "pr117239.c":18:7 285 {*adddi_1}
(expr_list:REG_ARGS_SIZE (const_int 0 [0])
(nil)))
...
(call_insn/i 19 18 21 2 (set (reg:SI 0 ax)
(call (mem:QI (symbol_ref:DI ("foo") [flags 0x3] <function_decl 0x7ffb2e2bdb00 l>) [0 foo S1 A8])
(const_int 0 [0]))) "pr117239.c":19:3 1476 {*call_value}
(expr_list:REG_CALL_DECL (symbol_ref:DI ("foo") [flags 0x3] <function_decl 0x7ffb2e2bdb00 foo>)
(expr_list:REG_EH_REGION (const_int 0 [0])
(nil)))
(nil))
(insn 21 19 26 2 (parallel [
(set (reg/f:DI 7 sp)
(plus:DI (reg/f:DI 7 sp)
(const_int -24 [0xffffffffffffffe8])))
(clobber (reg:CC 17 flags))
]) "pr117239.c":19:3 discrim 1 285 {*adddi_1}
(expr_list:REG_ARGS_SIZE (const_int 24 [0x18])
(nil)))
(insn 26 21 24 2 (set (mem:DI (plus:DI (reg/f:DI 7 sp)
(const_int 16 [0x10])) [0 S8 A64])
(reg:DI 1 dx [orig:105 q+16 ] [105])) "pr117239.c":19:3 discrim 1 95 {*movdi_internal}
(nil))
i.e.
movq %rdx, 16(%rsp)
call baz
addq $24, %rsp
...
call foo
subq $24, %rsp
movq %rdx, 16(%rsp)
Now, postreload uses cselib and cselib remembered that %rdx value has been
stored into 16(%rsp). Both baz and foo are pure calls. If they weren't,
when processing those CALL_INSNs cselib would invalidate all MEMs
if (RTL_LOOPING_CONST_OR_PURE_CALL_P (insn)
|| !(RTL_CONST_OR_PURE_CALL_P (insn)))
cselib_invalidate_mem (callmem);
where callmem is (mem:BLK (scratch)). But they are pure, so instead the
code just invalidates the argument slots from CALL_INSN_FUNCTION_USAGE.
The calls actually clobber more than that, even const/pure calls clobber
all memory below the stack pointer. And that is something that hasn't been
invalidated. In this failing testcase, the call to baz is not a big deal,
we don't have anything remembered in memory below %rsp at that call.
But then we increment %rsp by 24, so the %rsp+16 is now 8 bytes below stack
and do the call to foo. And that call now actually, not just in theory,
clobbers the memory below the stack pointer (in particular overwrites it
with the return value). But cselib does not invalidate. Then %rsp
is decremented again (in preparation for another call, to bar) and cselib
is processing store of %rdx (which IPA-RA says has not been modified by
either baz or foo calls) to %rsp + 16, and it sees the memory already has
that value, so the store is useless, let's remove it.
But it is not, the call to foo has changed it, so it needs to be stored
again.
The following patch adds targetted invalidation of memory below stack
pointer (or on SPARC memory below stack pointer + 2047 when stack bias is
used, or on PA memory above stack pointer instead).
It does so only in !ACCUMULATE_OUTGOING_ARGS or cfun->calls_alloca functions,
because in other functions the stack pointer should be constant from
the end of prologue till start of epilogue and so nothing should be stored
within the function below the stack pointer.
Now, memory below stack pointer is special, except for functions using
alloca/VLAs I believe no addressable memory should be there, it should be
purely outgoing function argument area, if we take address of some automatic
variable, it should live all the time above the outgoing function argument
area. So on top of just trying to flush memory below stack pointer
(represented by %rsp - PTRDIFF_MAX with PTRDIFF_MAX size on most arches),
the patch tries to optimize and only invalidate memory that has address
clearly derived from stack pointer (memory with other bases is not
invalidated) and if we can prove (we see same SP_DERIVED_VALUE_P bases in
both VALUEs) it is above current stack, also don't call
canon_anti_dependence which might just give up in certain cases.
I've gathered statistics from x86_64-linux and i686-linux
bootstraps/regtests. During -m64 compilations from those, there were
3718396 + 42634 + 27761 cases of processing MEMs in cselib_invalidate_mem
(callmem[1]) calls, the first number is number of MEMs not invalidated
because of the optimization, i.e.
+ if (sp_derived_base == NULL_RTX)
+ {
+ has_mem = true;
+ num_mems++;
+ p = &(*p)->next;
+ continue;
+ }
in the patch, the second number is number of MEMs not invalidated because
canon_anti_dependence returned false and finally the last number is number
of MEMs actually invalidated (so that is what hasn't been invalidated
before). During -m32 compilations the numbers were
1422412 + 39354 + 16509 with the same meaning.
Note, when there is no red zone, in theory even the sp = sp + incr
instruction invalidates memory below the new stack pointer, as signal
can come and overwrite the memory. So maybe we should be invalidating
something at those instructions as well. But in leaf functions we certainly
can have even addressable automatic vars in the red zone (which would make
it harder to distinguish), on the other side aren't normally storing
anything below the red zone, and in non-leaf it should normally be just the
outgoing arguments area.
2025-02-05 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/117239
* cselib.cc: Include predict.h.
(callmem): Change type from rtx to rtx[2].
(cselib_preserve_only_values): Use callmem[0] rather than callmem.
(cselib_invalidate_mem): Optimize and don't try to invalidate
for the mem_rtx == callmem[1] case MEMs which clearly can't be
below the stack pointer.
(cselib_process_insn): Use callmem[0] rather than callmem.
For const/pure calls also call cselib_invalidate_mem (callmem[1])
in !ACCUMULATE_OUTGOING_ARGS or cfun->calls_alloca functions.
(cselib_init): Initialize callmem[0] rather than callmem and also
initialize callmem[1].
* gcc.dg/pr117239.c: New test.
|
|
[PR118727]
With things like
// signed char a_14, a_16;
a.0_4 = (unsigned char) a_14;
_5 = (int) a.0_4;
b.1_6 = (unsigned char) b_16;
_7 = (int) b.1_6;
c_17 = _5 - _7;
_8 = ABS_EXPR <c_17>;
r_18 = _8 + r_23;
An ABD pattern will be recognized for _8:
patt_31 = .ABD (a.0_4, b.1_6);
It's still correct. But then when the SAD pattern is recognized:
patt_29 = SAD_EXPR <a_14, b_16, r_23>;
This is not correct. This only happens for targets with both uabd and
sabd but not vec_widen_{s,u}abd, currently LoongArch is the only target
affected.
The problem is vect_look_through_possible_promotion will throw away a
series of conversions if the effect is equivalent to a sign change and a
promotion, but here the sign change is definitely relevant, and the
promotion is also relevant for "mixed sign" cases like
r += abs((unsigned int)(unsigned char) a - (signed int)(signed char) b
(we need to promote to HImode as the difference can exceed the range of
QImode).
If there were any redundant promotion, it should have been stripped in
vect_recog_abd_pattern (i.e. when patt_31 = .ABD (a.0_4, b.1_6) is
recognized) instead of in vect_recog_sad_pattern, or we'd have a
missed-optimization if the ABD output is not summerized. So anyway
vect_recog_sad_pattern is just not a proper location to call
vect_look_through_possible_promotion for the ABD inputs, remove the
calls to fix the issue.
gcc/ChangeLog:
PR tree-optimization/118727
* tree-vect-patterns.cc (vect_recog_sad_pattern): Don't call
vect_look_through_possible_promotion on ABD inputs.
gcc/testsuite/ChangeLog:
PR tree-optimization/118727
* gcc.dg/pr108692.c: Mention PR 118727 in the comment.
* gcc.dg/pr118727.c: New test case.
|
|
The match.pd canonicalization that this testcase checks for,
is not applied on ilp32 targets.
This XFAILs the test on ilp32 targets.
PR testsuite/116845
gcc/testsuite/ChangeLog:
* gcc.dg/pr109393.c: XFAIL on ilp32 targets.
|
|
The GIMPLE FE currently invokes parser_build_unary_op to build
unary GENERIC which has the operand subject to C promotion rules
which does not match GIMPLE. The following adds a wrapper around
the build_unary_op worker which conveniently has an argument to
indicate whether to skip such promotion.
PR c/118742
gcc/c/
* gimple-parser.cc (gimple_parser_build_unary_op): New
wrapper around build_unary_op.
(c_parser_gimple_unary_expression): Use it.
gcc/testsuite/
* gcc.dg/gimplefe-56.c: New testcase.
|
|
When there's an inner loop without virtual header PHI but the outer
loop has one the fusion process cannot handle the need to create
an inner loop virtual header PHI. Punt in this case.
PR tree-optimization/117113
* gimple-loop-jam.cc (unroll_jam_possible_p): Detect when
we cannot handle virtual SSA update.
* gcc.dg/torture/pr117113.c: New testcase.
|
|
The following checks we have a scalar int shift mode before
enforcing it. As AVR shows the mode can be a signed _Accum mode
as well.
PR rtl-optimization/117611
* combine.cc (simplify_shift_const_1): Bail if not
scalar int mode.
* gcc.dg/fixed-point/pr117611.c: New testcase.
|
|
When we process function types we strip volatile and const qualifiers
after building a simplified type variant (which preserves those).
The qualified type handling of both isn't really compatible, so avoid
bad interaction by swapping this, first dropping const/volatile
qualifiers and then building the simplified type thereof.
PR lto/113207
* ipa-free-lang-data.cc (free_lang_data_in_type): First drop
const/volatile qualifiers from function argument types,
then build a simplified type.
* gcc.dg/pr113207.c: New testcase.
|
|
When we sink common stores in cselim or the sink pass we have to
make sure to not introduce overlapping lifetimes for abnormals
used in the ref. The easiest is to avoid sinking stmts which
reference abnormals at all which is what the following does.
PR tree-optimization/118717
* tree-ssa-phiopt.cc (cond_if_else_store_replacement_1):
Do not common stores referencing abnormal SSA names.
* tree-ssa-sink.cc (sink_common_stores_to_bb): Likewise.
* gcc.dg/torture/pr118717.c: New testcase.
|
|
The test expects transformations that depend on -Ofast on x86*, but
that option is only passed when the avx_runtime is available.
Split -Ofast out of the avx conditional, so that it is passed on the
same targets that expect the transformation.
for gcc/testsuite/ChangeLog
* gcc.dg/vect/vect-ifcvt-18.c: Split -Ofast out of
avx_runtime.
|
|
The following testcase is miscompiled on s390x-linux with e.g. -march=z13
(both -O0 and -O2) starting with r15-7053.
The problem is in the splitters which emulate TImode/V1TImode GT and GTU
comparisons.
For GT we want to do
(ior (gt (hi op1) (hi op2))
(and (eq (hi op1) (hi op2)) (gtu (lo op1) (lo op2))))
and for GTU similarly except for gtu instead of gt in there.
Now, the splitter emulation is using V2DImode comparisons where on s390x
the hi part is in the first element of the vector, lo part in the second,
and for the gtu case it swaps the elements of the vector.
So, we get the right result in the first element of the result vector.
But vrepg was then broadcasting the second element of the result vector
rather than the first, and the value of the second element of the vector
is instead
(ior (gt (lo op1) (lo op2))
(and (eq (lo op1) (lo op2)) (gtu (hi op1) (hi op2))))
so something not really usable for the emulated comparison.
The following patch fixes that. The testcase tries to test behavior of
double-word smin/smax/umin/umax with various cases of the halves of both
operands (one that is sometimes EQ, sometimes GT, sometimes LT, sometimes
GTU, sometimes LTU).
2025-01-30 Jakub Jelinek <jakub@redhat.com>
Stefan Schulze Frielinghaus <stefansf@gcc.gnu.org>
PR target/118696
* config/s390/vector.md (*vec_cmpgt<mode><mode>_nocc_emu,
*vec_cmpgtu<mode><mode>_nocc_emu): Duplicate the first rather than
second V2DImode element.
* gcc.dg/pr118696.c: New test.
* gcc.target/s390/vector/pr118696.c: New test.
* gcc.target/s390/vector/vec-abs-emu.c: Expect vrepg with 0 as last
operand rather than 1.
* gcc.target/s390/vector/vec-max-emu.c: Likewise.
* gcc.target/s390/vector/vec-min-emu.c: Likewise.
|
|
When MEM_REF expansion of a non-MEM falls back to a stack temporary
we fail to handle the case where the offset adjusted reference to
the temporary is not aligned according to the requirement of the
mode. We have to go through bitfield extraction or movmisalign
in this case. Fortunately there's a helper for this.
This fixes an ICE observed on arm which has sanity checks in its
move patterns for this.
PR middle-end/118695
* expr.cc (expand_expr_real_1): When expanding a MEM_REF
to a non-MEM by committing it to a stack temporary make
sure to handle misaligned accesses correctly.
* gcc.dg/pr118695.c: New testcase.
|
|
This fixes a large number of smaller and larger issues with the append_args
clause to 'declare variant' and adds Fortran support for it; it also contains
a larger number of testcases.
In particular, for Fortran, it also handles passing allocatable, pointer,
optional arguments to an interop dummy argument with or without value
attribute. And it changes the internal representation such that dumping the
tree does not lead to an ICE.
gcc/c/ChangeLog:
* c-parser.cc (c_finish_omp_declare_variant): Modify how
append_args is saved internally.
gcc/cp/ChangeLog:
* parser.cc (cp_finish_omp_declare_variant): Modify how append_args
is saved internally.
* pt.cc (tsubst_attribute): Likewise.
(tsubst_omp_clauses): Remove C_ORT_OMP_DECLARE_SIMD from interop
handling as no longer called for it.
* decl.cc (omp_declare_variant_finalize_one): Update append_args
changes; fixes for ADL input.
gcc/fortran/ChangeLog:
* gfortran.h (gfc_omp_declare_variant): Add append_args_list.
* openmp.cc (gfc_parser_omp_clause_init_modifiers): New;
splitt of from ...
(gfc_match_omp_init): ... here; call it.
(gfc_match_omp_declare_variant): Update to handle append_args
clause; some syntax handling fixes.
* trans-openmp.cc (gfc_trans_omp_declare_variant): Handle
append_args clause; add some diagnostic.
gcc/ChangeLog:
* gimplify.cc (gimplify_call_expr): For OpenMP's append_args clause
processed by 'omp dispatch', update for internal-representation
changes; fix handling of hidden arguments, add some comments and
handle Fortran's value dummy and optional/pointer/allocatable actual
args.
libgomp/ChangeLog:
* libgomp.texi (Impl. Status): Update for accumpulated changes
related to 'dispatch' and interop.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/append-args-1.c: Update dg-*.
* c-c++-common/gomp/append-args-3.c: Likewise.
* g++.dg/gomp/append-args-1.C: Likewise.
* gfortran.dg/gomp/adjust-args-1.f90: Likewise.
* gfortran.dg/gomp/adjust-args-3.f90: Likewise.
* gfortran.dg/gomp/declare-variant-2.f90: Likewise.
* c-c++-common/gomp/append-args-6.c: New test.
* c-c++-common/gomp/append-args-7.c: New test.
* c-c++-common/gomp/append-args-8.c: New test.
* c-c++-common/gomp/append-args-9.c: New test.
* g++.dg/gomp/append-args-4.C: New test.
* g++.dg/gomp/append-args-5.C: New test.
* g++.dg/gomp/append-args-6.C: New test.
* g++.dg/gomp/append-args-7.C: New test.
* gcc.dg/gomp/append-args-1.c: New test.
* gfortran.dg/gomp/append_args-1.f90: New test.
* gfortran.dg/gomp/append_args-2.f90: New test.
* gfortran.dg/gomp/append_args-3.f90: New test.
* gfortran.dg/gomp/append_args-4.f90: New test.
|
|
The following guards the BIT_FIELD_REF expansion fallback for
MEM_REFs of entities expanded to register (or constant) further,
avoiding large out-of-bound offsets by, when the access does not
overlap the base object, expanding the offset as if it were zero.
PR middle-end/118692
* expr.cc (expand_expr_real_1): When expanding a MEM_REF
as BIT_FIELD_REF avoid large offsets for accesses not
overlapping the base object.
* gcc.dg/pr118692.c: New testcase.
|
|
When we walk stmts to find always executed stmts with UB in the last
iteration to be able to reduce the iteration count by one we fail
to consider infinite subloops in the last iteration that would make
such stmt not execute. The following adds this.
PR tree-optimization/114052
* tree-ssa-loop-niter.cc (maybe_lower_iteration_bound): Check
for infinite subloops we might not exit.
* gcc.dg/pr114052-1.c: New testcase.
|