Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
|
|
As reported on gcc-regression, this test FAILs on aarch64, but my
r15-2090 change didn't change anything on the generated assembly,
just added the forgotten dg-do run directive to the test, so the
test has been failing forever, just we didn't know it.
I can actually reproduce it on x86_64 with -funsigned-char too,
s2.b.a has int type and -1 is stored to it, so we should compare
it against -1 rather than (char) -1; the latter is appropriate for
testing char fields into which we've stored -1.
2024-07-18 Jakub Jelinek <jakub@redhat.com>
* c-c++-common/torture/builtin-clear-padding-3.c (main): Compare
s2.b.a against -1 rather than (char) -1.
(cherry picked from commit 958ee138748fae4371e453eb9b357f576abbe83e)
|
|
|
|
The builtin-clear-padding-6.c testcase fails as clear_padding_type
doesn't correctly recompute the buf->size and buf->off members after
expanding clearing of an array using a runtime loop.
buf->size should be in that case the offset after which it should continue
with next members or padding before them modulo UNITS_PER_WORD and
buf->off that offset minus buf->size. That is what the code was doing,
but with off being the start of the loop cleared array, not its end.
So, the last hunk in gimple-fold.cc fixes that.
When adding the testcase, I've noticed that the
c-c++-common/torture/builtin-clear-padding-* tests, although clearly
written as runtime tests to test the builtins at runtime, didn't have
{ dg-do run } directive and were just compile tests because of that.
When adding that to the tests, builtin-clear-padding-1.c was already
failing without that clear_padding_type hunk too, but
builtin-clear-padding-5.c was still failing even after the change.
That is due to a bug in clear_padding_flush which the patch fixes as
well - when clear_padding_flush is called with full=true (that happens
at the end of the whole __builtin_clear_padding or on those array
padding clears done by a runtime loop), it wants to flush all the pending
padding clearings rather than just some. If it is at the end of the whole
object, it decreases wordsize when needed to make sure the code never writes
including RMW cycles to something outside of the object:
if ((unsigned HOST_WIDE_INT) (buf->off + i + wordsize)
> (unsigned HOST_WIDE_INT) buf->sz)
{
gcc_assert (wordsize > 1);
wordsize /= 2;
i -= wordsize;
continue;
}
but if it is full==true flush in the middle, this doesn't happen, but we
still process just the buffer bytes before the current end. If that end
is not on a wordsize boundary, e.g. on the builtin-clear-padding-5.c test
the last chunk is 2 bytes, '\0', '\xff', i is 16 and end is 18,
nonzero_last might be equal to the end - i, i.e. 2 here, but still all_ones
might be true, so in some spots we just didn't emit any clearing in that
last chunk.
2024-07-17 Jakub Jelinek <jakub@redhat.com>
PR middle-end/115527
* gimple-fold.c (clear_padding_flush): Introduce endsize
variable and use it instead of wordsize when comparing it against
nonzero_last.
(clear_padding_type): Increment off by sz.
* c-c++-common/torture/builtin-clear-padding-1.c: Add dg-do run
directive.
* c-c++-common/torture/builtin-clear-padding-2.c: Likewise.
* c-c++-common/torture/builtin-clear-padding-3.c: Likewise.
* c-c++-common/torture/builtin-clear-padding-4.c: Likewise.
* c-c++-common/torture/builtin-clear-padding-5.c: Likewise.
* c-c++-common/torture/builtin-clear-padding-6.c: New test.
(cherry picked from commit 8b5919bae11754f4b65a17e63663d3143f9615ac)
|
|
|
|
Currently unaligned YMM and ZMM load and store costs are cheaper than
aligned which causes the vectorizer to purposely mis-align accesses
by adding an alignment prologue. It looks like the unaligned costs
were simply left untouched from znver3 where they equate the aligned
costs when tweaking aligned costs for znver4. The following makes
the unaligned costs equal to the aligned costs.
This avoids the miscompile seen in PR115843 but it's of course not
a real fix for the issue uncovered there. But it makes it qualify
as a regression fix.
PR tree-optimization/115843
* config/i386/x86-tune-costs.h (znver4_cost): Update unaligned
load and store cost from the aligned costs.
(cherry picked from commit 1e3aa9c9278db69d4bdb661a750a7268789188d6)
|
|
|
|
|
|
|
|
|
|
|
|
This patch fixes the backend pattern that was printing the wrong input
scalar register pair when inserting into lane 1.
Added a new test to force float-abi=hard so we can use scan-assembler to check
correct codegen.
gcc/ChangeLog:
PR target/115611
* config/arm/mve.md (mve_vec_setv2di_internal): Fix printing of input
scalar register pair when lane = 1.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/intrinsics/vsetq_lane_su64.c: New test.
(cherry picked from commit 7c11fdd2cc11a7058e9643b6abf27831970ad2c9)
|
|
|
|
emit_store_flag_1 calculates scode (swapped condition code) at the
beginning of the function from the value of code variable. However,
code variable may change before scode usage site, resulting in
invalid stalled scode value.
Move calculation of scode value just before its only usage site to
avoid stalled scode value.
PR middle-end/115836
gcc/ChangeLog:
* expmed.c (emit_store_flag_1): Move calculation of
scode just before its only usage site.
(cherry picked from commit 44933fdeb338e00c972e42224b9a83d3f8f6a757)
|
|
|
|
The ACLE requires __ARM_FEATURE_SVE_BF16 to be enabled when SVE and BF16
and the associated intrinsics are available.
GCC does support the required intrinsics for TARGET_SVE_BF16 so define
this macro too.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/
PR target/115475
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins):
Define __ARM_FEATURE_SVE_BF16 for TARGET_SVE_BF16.
gcc/testsuite/
PR target/115475
* gcc.target/aarch64/acle/bf16_sve_feature.c: New test.
Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
(cherry picked from commit 6492c7130d6ae9992298fc3d072e2589d1131376)
|
|
The ACLE asks the user to test for __ARM_FEATURE_BF16 before using the
<arm_bf16.h> header but GCC doesn't set this up.
LLVM does, so this is an inconsistency between the compilers.
This patch enables that macro for TARGET_BF16_FP.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/
PR target/115457
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins):
Define __ARM_FEATURE_BF16 for TARGET_BF16_FP.
gcc/testsuite/
PR target/115457
* gcc.target/aarch64/acle/bf16_feature.c: New test.
Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
(cherry picked from commit c10942134fa759843ac1ed1424b86fcb8e6368ba)
|
|
|
|
This testcase was fixed by r14-5934-gf26d68d5d128c8 but we should add
one to make sure it does not regress again.
Committed as obvious after a quick test on the testcase.
PR c++/97990
gcc/testsuite/ChangeLog:
* g++.dg/torture/vector-struct-1.C: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
(cherry picked from commit 5f1438db419c9eb8901d1d1d7f98fb69082aec8e)
|
|
The following fixes a stray TYPE_ALIAS_SET in a type variant built
by build_opaque_vector_type which is diagnosed by type checking
enabled with -flto.
PR middle-end/112732
* tree.c (build_opaque_vector_type): Reset TYPE_ALIAS_SET
of the newly built type.
(cherry picked from commit f26d68d5d128c86faaceeb81b1e8f22254ad53df)
|
|
|
|
|
|
|
|
|
|
|
|
2024-06-30 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
PR target/115691
* config/pa/pa.md: Remove incorrect xmpyu patterns.
|
|
|
|
|
|
|
|
|
|
When building gcc's C++ sources against recent libc++, the poisoning of
the ctype macros due to including safe-ctype.h before including C++
standard headers such as <list>, <map>, etc, causes many compilation
errors, similar to:
In file included from /home/dim/src/gcc/master/gcc/gensupport.cc:23:
In file included from /home/dim/src/gcc/master/gcc/system.h:233:
In file included from /usr/include/c++/v1/vector:321:
In file included from
/usr/include/c++/v1/__format/formatter_bool.h:20:
In file included from
/usr/include/c++/v1/__format/formatter_integral.h:32:
In file included from /usr/include/c++/v1/locale:202:
/usr/include/c++/v1/__locale:546:5: error: '__abi_tag__' attribute
only applies to structs, variables, functions, and namespaces
546 | _LIBCPP_INLINE_VISIBILITY
| ^
/usr/include/c++/v1/__config:813:37: note: expanded from macro
'_LIBCPP_INLINE_VISIBILITY'
813 | # define _LIBCPP_INLINE_VISIBILITY _LIBCPP_HIDE_FROM_ABI
| ^
/usr/include/c++/v1/__config:792:26: note: expanded from macro
'_LIBCPP_HIDE_FROM_ABI'
792 |
__attribute__((__abi_tag__(_LIBCPP_TOSTRING(
_LIBCPP_VERSIONED_IDENTIFIER))))
| ^
In file included from /home/dim/src/gcc/master/gcc/gensupport.cc:23:
In file included from /home/dim/src/gcc/master/gcc/system.h:233:
In file included from /usr/include/c++/v1/vector:321:
In file included from
/usr/include/c++/v1/__format/formatter_bool.h:20:
In file included from
/usr/include/c++/v1/__format/formatter_integral.h:32:
In file included from /usr/include/c++/v1/locale:202:
/usr/include/c++/v1/__locale:547:37: error: expected ';' at end of
declaration list
547 | char_type toupper(char_type __c) const
| ^
/usr/include/c++/v1/__locale:553:48: error: too many arguments
provided to function-like macro invocation
553 | const char_type* toupper(char_type* __low, const
char_type* __high) const
| ^
/home/dim/src/gcc/master/gcc/../include/safe-ctype.h:146:9: note:
macro 'toupper' defined here
146 | #define toupper(c) do_not_use_toupper_with_safe_ctype
| ^
This is because libc++ uses different transitive includes than
libstdc++, and some of those transitive includes pull in various ctype
declarations (typically via <locale>).
There was already a special case for including <string> before
safe-ctype.h, so move the rest of the C++ standard header includes to
the same location, to fix the problem.
gcc/ChangeLog:
* system.h: Include safe-ctype.h after C++ standard headers.
Signed-off-by: Dimitry Andric <dimitry@andric.com>
(cherry picked from commit 9970b576b7e4ae337af1268395ff221348c4b34a)
|
|
Most of the time we get away with using the dsymutil that is
installed with the latest Xcode, however for some cross-compilation
cases that does not work.
We now have the ability to specify the correct dsymutil to use for
the toolchain (--with-dsymutil=) and we should use that specified
tool for debug link. Fixes cross-compilers from x86-64 to powerpc.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ada/ChangeLog:
* gcc-interface/Makefile.in: Use DSYMUTIL_FOR_TARGET in
libgnat/libgnarl recipies.
|
|
|
|
|
|
We check that the final_suspend () method returns a sane type (i.e. a class
or structure) but, unfortunately, that check has to be later than the one
for a throwing case. If the use returns some nonsensical type from the
method, we need to handle that in the checking for noexcept.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR c++/104051
gcc/cp/ChangeLog:
* coroutines.cc (coro_diagnose_throwing_final_aw_expr): Handle
non-target expression inputs.
gcc/testsuite/ChangeLog:
* g++.dg/coroutines/pr104051.C: New test.
(cherry picked from commit 7b96274a340bc0e9bcaef9baff3a44ec2f12c3df)
|
|
We do not support this yet.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR c++/101765
gcc/cp/ChangeLog:
* coroutines.cc (register_local_var_uses): Emit a sorry if
we encounter a VLA in the coroutine local variables.
gcc/testsuite/ChangeLog:
* g++.dg/coroutines/pr101765.C: New test.
(cherry picked from commit fdf0b6ce6c1cfa1c328c0c40473c71ca11fd8303)
|
|
The wording of the standard has been clarified to be explicit that
the the parameters to any user-defined operator-new in the promise
class should be lvalues.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR c++/100772
gcc/cp/ChangeLog:
* coroutines.cc (morph_fn_to_coro): Convert function parms
from reference before constructing any operator-new args
list.
gcc/testsuite/ChangeLog:
* g++.dg/coroutines/pr100772-a.C: New test.
* g++.dg/coroutines/pr100772-b.C: New test.
(cherry picked from commit 921942a8a106cb53994c21162922e4934eb3a3e0)
|
|
C++20 [expr.await] / 2
An await-expression shall appear only in a potentially-evaluated expression
within the compound-statement of a function-body outside of a handler.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR c++/99710
gcc/cp/ChangeLog:
* coroutines.cc (await_statement_walker): Report an error if
an await expression is found in a handler body.
gcc/testsuite/ChangeLog:
* g++.dg/coroutines/pr99710.C: New test.
(cherry picked from commit 650beb110538097b9c3e8600149b333a83e7e836)
|
|
This adds support for the NVIDIA Grace CPU to aarch64.
We reuse the tuning decisions for the Neoverse V2 core, but include a
number of architecture features that are not enabled by default in
-mcpu=neoverse-v2.
This allows Grace users to more simply target the CPU with -mcpu=grace
rather than remembering what extensions to tag on top of
-mcpu=neoverse-v2.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/
* config/aarch64/aarch64-cores.def (grace): New entry.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi (AArch64 Options): Document the above.
Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
|
|
|
|
|
|
|
|
As the associated test case in PR114846 shows, currently
with eh_return involved some register restoring for EH
RETURN DATA in epilogue can clobber the one which holding
the return value. Referring to the existing handlings in
some other targets, this patch makes eh_return expander
call one new define_insn_and_split eh_return_internal which
directly calls rs6000_emit_epilogue with epilogue_type
EPILOGUE_TYPE_EH_RETURN instead of the previous treating
normal return with crtl->calls_eh_return specially.
PR target/114846
gcc/ChangeLog:
* config/rs6000/rs6000-logue.c (rs6000_emit_epilogue): As
EPILOGUE_TYPE_EH_RETURN would be passed as epilogue_type directly
now, adjust the relevant handlings on it.
* config/rs6000/rs6000.md (eh_return expander): Append by calling
gen_eh_return_internal and emit_barrier.
(eh_return_internal): New define_insn_and_split, call function
rs6000_emit_epilogue with epilogue type EPILOGUE_TYPE_EH_RETURN.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr114846.c: New test.
(cherry picked from commit e5fc5d42d25c86ae48178db04ce64d340a834614)
|
|
|
|
|
|
|
|
The following fixes a wrong pattern that didn't match the behavior
of the original fold_widened_comparison in that get_unwidened
returned a constant always in the wider type. But here we're
using (int) 4294967295u without the conversion applied. Fixed
by doing as earlier in the pattern - matching constants only
if the conversion was actually applied.
PR middle-end/110176
* match.pd (zext (bool) <= (int) 4294967295u): Make sure
to match INTEGER_CST only without outstanding conversion.
* gcc.dg/torture/pr110176.c: New testcase.
(cherry picked from commit 22dbfbe8767ff4c1d93e39f68ec7c2d5b1358beb)
|
|
We now got test coverage for non-SSA name bits so the following amends
the SSA_NAME_OCCURS_IN_ABNORMAL_PHI checks.
PR tree-optimization/111070
* tree-ssa-ifcombine.c (ifcombine_ifandif): Check we have
an SSA name before checking SSA_NAME_OCCURS_IN_ABNORMAL_PHI.
* gcc.dg/pr111070.c: New testcase.
|
|
The following guards the bit test merging code in if-combine against
the appearance of SSA names used in abnormal PHIs.
PR tree-optimization/111039
* tree-ssa-ifcombine.c (ifcombine_ifandif): Check for
SSA_NAME_OCCURS_IN_ABNORMAL_PHI.
* gcc.dg/pr111039.c: New testcase.
|