Age | Commit message (Collapse) | Author | Files | Lines |
|
This resolves a regression on i686 that was introduced with
r15-429-gfb1649f8b4ad50.
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:
PR libstdc++/115247
* include/experimental/bits/simd.h (__as_vector): Don't use
vector_size(8) on __i386__.
(__vec_shuffle): Never return MMX vectors, widen to 16 bytes
instead.
(concat): Fix padding calculation to pick up widening logic from
__as_vector.
(cherry picked from commit 241a6cc88d866fb36bd35ddb3edb659453d6322e)
|
|
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:
PR libstdc++/114958
* include/experimental/bits/simd.h (__as_vector): Return scalar
simd as one-element vector. Return vector from single-vector
fixed_size simd.
(__vec_shuffle): New.
(__extract_part): Adjust return type signature.
(split): Use __extract_part for any split into non-fixed_size
simds.
(concat): If the return type stores a single vector, use
__vec_shuffle (which calls __builtin_shufflevector) to produce
the return value.
* include/experimental/bits/simd_builtin.h
(__shift_elements_right): Removed.
(__extract_part): Return single elements directly. Use
__vec_shuffle (which calls __builtin_shufflevector) to for all
non-trivial cases.
* include/experimental/bits/simd_fixed_size.h (__extract_part):
Return single elements directly.
* testsuite/experimental/simd/pr114958.cc: New test.
(cherry picked from commit fb1649f8b4ad5043dd0e65e4e3a643a0ced018a9)
|
|
The option_map array for most entries contains just non-NULL opt0
{ "-Wno-", NULL, "-W", false, true },
{ "-fno-", NULL, "-f", false, true },
{ "-gno-", NULL, "-g", false, true },
{ "-mno-", NULL, "-m", false, true },
{ "--debug=", NULL, "-g", false, false },
{ "--machine-", NULL, "-m", true, false },
{ "--machine-no-", NULL, "-m", false, true },
{ "--machine=", NULL, "-m", false, false },
{ "--machine=no-", NULL, "-m", false, true },
{ "--machine", "", "-m", false, false },
{ "--machine", "no-", "-m", false, true },
{ "--optimize=", NULL, "-O", false, false },
{ "--std=", NULL, "-std=", false, false },
{ "--std", "", "-std=", false, false },
{ "--warn-", NULL, "-W", true, false },
{ "--warn-no-", NULL, "-W", false, true },
{ "--", NULL, "-f", true, false },
{ "--no-", NULL, "-f", false, true }
and so add_misspelling_candidates works correctly for it, but 3 out of
these,
{ "--machine", "", "-m", false, false },
{ "--machine", "no-", "-m", false, true },
and
{ "--std", "", "-std=", false, false },
use non-NULL opt1. That says that
--machine foo
should map to
-mfoo
and
--machine no-foo
should map to
-mno-foo
and
--std c++17
should map to
-std=c++17
add_misspelling_canidates was not handling this, so it hapilly
registered say
--stdc++17
or
--machineavx512
(twice) as spelling alternatives, when those options aren't recognized.
Instead we support
--std c++17
or
--machine avx512
--machine no-avx512
The following patch fixes that. On this particular testcase, we no longer
suggest anything, even when among the suggestion is say that
--std c++17
or
-std=c++17
etc.
2024-06-17 Jakub Jelinek <jakub@redhat.com>
PR driver/115440
* opts-common.cc (add_misspelling_candidates): If opt1 is non-NULL,
add a space and opt1 to the alternative suggestion text.
* g++.dg/cpp1z/pr115440.C: New test.
(cherry picked from commit 96db57948b50f45235ae4af3b46db66cae7ea859)
|
|
The warning code uses %D to print the ARRAY_REF first operands.
That works in the most common case where those operands are decls, but
as can be seen on the following testcase, they can be other expressions
with array type.
Just changing %D to %E isn't enough, because then the diagnostics can
suggest something like
note: use '&(x) != 0 ? (int (*)[32])&a : (int (*)[32])&b[0] == &(y) != 0 ? (int (*)[32])&a : (int (*)[32])&b[0]' to compare the addresses
which is a bad suggestion, the %E printing doesn't know that the
warning code will want to add & before it and [0] after it.
So, the following patch adds ()s around the operand as well, but does
that only for non-decls, for decls keeps it as &arr[0] like before.
2024-06-17 Jakub Jelinek <jakub@redhat.com>
PR c/115290
* c-warn.cc (do_warn_array_compare): Use %E rather than %D for
printing op0 and op1; if those operands aren't decls, also print
parens around them.
* c-c++-common/Warray-compare-3.c: New test.
(cherry picked from commit b63c7d92012f92e0517190cf263d29bbef8a06bf)
|
|
2024-06-20 Jakub Jelinek <jakub@redhat.com>
* BASE-VER: Set to 14.4.1.
|
|
|
|
|
|
|
|
|
|
Thanks to Jérôme Duval for noticing this.
libstdc++-v3/ChangeLog:
* libsupc++/new_opa.cc [!_GLIBCXX_HOSTED]: Fix declaration of
posix_memalign.
(cherry picked from commit 161efd677458f20d13ee1018a4d5e3964febd508)
|
|
|
|
|
|
|
|
|
|
|
|
This patch adds missing assembly directives to the CMSE library wrapper to call
functions with attribute cmse_nonsecure_call. Without the .type directive the
linker will fail to produce the correct veneer if a call to this wrapper
function is to far from the wrapper itself. The .size was added for
completeness, though we don't necessarily have a usecase for it.
libgcc/ChangeLog:
PR target/115360
* config/arm/cmse_nonsecure_call.S: Add .type and .size directives.
(cherry picked from commit c559353af49fe5743d226ac3112a285b27a50f6a)
|
|
The PR shows that when cfgrtl.cc:duplicate_insn_chain attempts to
update the MR_DEPENDENCE_CLIQUE information for a MEM_EXPR we can end up
accidentally dropping (e.g.) an ARRAY_REF from the MEM_EXPR and end up
replacing it with the underlying MEM_REF. This leads to an
inconsistency in the MEM_EXPR information, and could lead to wrong code.
While the walk down to the MEM_REF is necessary to update
MR_DEPENDENCE_CLIQUE, we should use the outer tree expression for the
MEM_EXPR. This patch does that.
gcc/ChangeLog:
PR rtl-optimization/114924
* cfgrtl.cc (duplicate_insn_chain): When updating MEM_EXPRs,
don't strip (e.g.) ARRAY_REFs from the final MEM_EXPR.
(cherry picked from commit fe40d525619eee9c2821126390df75068df4773a)
|
|
When we substitute the equivalence and it becomes shared, we can fail
to correctly update reg info used by LRA. This can result in wrong
code generation, e.g. because of incorrect live analysis. It can also
result in compiler crash as the pseudo survives RA. This is what
exactly happened for the PR. This patch solves this problem by
unsharing substituted equivalences.
gcc/ChangeLog:
PR middle-end/111497
* lra-constraints.cc (lra_constraints): Copy substituted
equivalence.
* lra.cc (lra): Change comment for calling unshare_all_rtl_again.
gcc/testsuite/ChangeLog:
PR middle-end/111497
* g++.target/i386/pr111497.C: new test.
(cherry picked from commit 3c23defed384cf17518ad6c817d94463a445d21b)
|
|
The following fixes an issue where SSA update loses PHI argument
locations when updating PHI nodes it didn't create as part of the
SSA update. For the case where the reaching def is the same as
the current argument opt to do nothing and for the case where the
PHI argument already has a location keep that (that's an indication
the PHI node wasn't created as part of the update SSA process).
PR middle-end/40635
* tree-into-ssa.cc (rewrite_update_phi_arguments): Only
update the argument when the reaching definition is different
from the current argument. Keep an existing argument
location.
* gcc.dg/uninit-pr40635.c: New testcase.
(cherry picked from commit 0d14720f93a8139a7f234b2762c361e8e5da99cc)
|
|
The following tries to address the PHI insertion compile-time hog in
RTL fwprop observed with the PR54052 testcase where the loop computing
the "unfiltered" set of variables possibly needing PHI nodes for each
block exhibits quadratic compile-time and memory-use.
It does so by pruning the local DEFs with LR_OUT of the block, removing
regs that can never be LR_IN (defined by this block) in the dominance
frontier.
PR rtl-optimization/54052
* rtl-ssa/blocks.cc (function_info::place_phis): Filter
local defs by LR_OUT.
(cherry picked from commit c7151283dc747769d4ac4f216d8f519bda2569b5)
|
|
For Armv8.1-M, the clearing of the registers is handled differently than
for Armv8-M, so update the test case accordingly.
gcc/testsuite/ChangeLog:
PR target/115253
* gcc.target/arm/cmse/extend-return.c: Update test case
condition for Armv8.1-M.
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
Co-authored-by: Yvan ROUX <yvan.roux@foss.st.com>
(cherry picked from commit cf5f9171bae1f5f3034dc9a055b77446962f1a8c)
|
|
Properly handle zero and sign extension for Armv8-M.baseline as
Cortex-M23 can have the security extension active.
Currently, there is an internal compiler error on Cortex-M23 for the
epilog processing of sign extension.
This patch addresses the following CVE-2024-0151 for Armv8-M.baseline.
gcc/ChangeLog:
PR target/115253
* config/arm/arm.cc (cmse_nonsecure_call_inline_register_clear):
Sign extend for Thumb1.
(thumb1_expand_prologue): Add zero/sign extend.
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
Co-authored-by: Yvan ROUX <yvan.roux@foss.st.com>
(cherry picked from commit 65bd0655ece268895e5018e393bafb769e201c78)
|
|
|
|
When building gcc's C++ sources against recent libc++, the poisoning of
the ctype macros due to including safe-ctype.h before including C++
standard headers such as <list>, <map>, etc, causes many compilation
errors, similar to:
In file included from /home/dim/src/gcc/master/gcc/gensupport.cc:23:
In file included from /home/dim/src/gcc/master/gcc/system.h:233:
In file included from /usr/include/c++/v1/vector:321:
In file included from
/usr/include/c++/v1/__format/formatter_bool.h:20:
In file included from
/usr/include/c++/v1/__format/formatter_integral.h:32:
In file included from /usr/include/c++/v1/locale:202:
/usr/include/c++/v1/__locale:546:5: error: '__abi_tag__' attribute
only applies to structs, variables, functions, and namespaces
546 | _LIBCPP_INLINE_VISIBILITY
| ^
/usr/include/c++/v1/__config:813:37: note: expanded from macro
'_LIBCPP_INLINE_VISIBILITY'
813 | # define _LIBCPP_INLINE_VISIBILITY _LIBCPP_HIDE_FROM_ABI
| ^
/usr/include/c++/v1/__config:792:26: note: expanded from macro
'_LIBCPP_HIDE_FROM_ABI'
792 |
__attribute__((__abi_tag__(_LIBCPP_TOSTRING(
_LIBCPP_VERSIONED_IDENTIFIER))))
| ^
In file included from /home/dim/src/gcc/master/gcc/gensupport.cc:23:
In file included from /home/dim/src/gcc/master/gcc/system.h:233:
In file included from /usr/include/c++/v1/vector:321:
In file included from
/usr/include/c++/v1/__format/formatter_bool.h:20:
In file included from
/usr/include/c++/v1/__format/formatter_integral.h:32:
In file included from /usr/include/c++/v1/locale:202:
/usr/include/c++/v1/__locale:547:37: error: expected ';' at end of
declaration list
547 | char_type toupper(char_type __c) const
| ^
/usr/include/c++/v1/__locale:553:48: error: too many arguments
provided to function-like macro invocation
553 | const char_type* toupper(char_type* __low, const
char_type* __high) const
| ^
/home/dim/src/gcc/master/gcc/../include/safe-ctype.h:146:9: note:
macro 'toupper' defined here
146 | #define toupper(c) do_not_use_toupper_with_safe_ctype
| ^
This is because libc++ uses different transitive includes than
libstdc++, and some of those transitive includes pull in various ctype
declarations (typically via <locale>).
There was already a special case for including <string> before
safe-ctype.h, so move the rest of the C++ standard header includes to
the same location, to fix the problem.
PR middle-end/111632
gcc/ChangeLog:
* system.h: Include safe-ctype.h after C++ standard headers.
Signed-off-by: Dimitry Andric <dimitry@andric.com>
(cherry picked from commit 9970b576b7e4ae337af1268395ff221348c4b34a)
|
|
Use INCLUDE_VECTOR before including system.h, instead of directly
including <vector>, to avoid running into poisoned identifiers.
Signed-off-by: Dimitry Andric <dimitry@andric.com>
PR middle-end/111632
libcc1/ChangeLog:
* libcc1plugin.cc: Fix include.
* libcp1plugin.cc: Fix include.
(cherry picked from commit 5213047b1d50af63dfabb5e5649821a6cb157e33)
|
|
The problem here is even if last_and_only_stmt returns a statement,
the bb might still contain a phi node which defines a ssa name
which is used in that statement so we need to add a check to make sure
that the phi nodes are empty for the middle bbs in both the
`CMP?MINMAX:MINMAX` case and the `CMP?MINMAX:B` cases.
Bootstrapped and tested on x86_64_linux-gnu with no regressions.
PR tree-optimization/115143
gcc/ChangeLog:
* tree-ssa-phiopt.cc (minmax_replacement): Check for empty
phi nodes for middle bbs for the case where middle bb is not empty.
gcc/testsuite/ChangeLog:
* gcc.c-torture/compile/pr115143-1.c: New test.
* gcc.c-torture/compile/pr115143-2.c: New test.
* gcc.c-torture/compile/pr115143-3.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
(cherry picked from commit 9ff8f041331ef8b56007fb3c4d41d76f9850010d)
|
|
The first parameter of fwrite should be the const char* __s which want
write to FILE *__file, rather than the FILE *__file write to the FILE
*__file.
libstdc++-v3/ChangeLog:
* config/io/basic_file_stdio.cc (xwrite) [USE_STDIO_PURE]: Fix
first argument.
(cherry picked from commit bb4f8f14ed15310b5e01f1c6013585550debdab9)
|
|
We actually defined this macro in <utility> at one point, but I removed
it in r10-7901-g2025db692e9ed1.
libstdc++-v3/ChangeLog:
* include/std/utility (__cpp_lib_constexpr_algorithms): Define,
as per LWG 3792.
* testsuite/20_util/exchange/constexpr.cc: Check for it.
(cherry picked from commit 924d990425d29ef39f3faac49d4a3772e4302c96)
|
|
The non-reserved names 'val' and 'dest' were being used in our headers
but haven't been added to the 17_intro/names.cc test. That's because
they are used by <asm-generic/posix_types.h> and <netinet/tcp.h>
respecitvely on glibc-based systems.
libstdc++-v3/ChangeLog:
* include/bits/fs_ops.h (create_directory): Use reserved name
for parameter.
* include/bits/regex_automaton.h (_State_base::_M_print):
Likewise.
* include/bits/regex_automaton.tcc(_State_base::_M_print):
Likewise.
* include/bits/regex_scanner.tcc(_Scanner::_M_print): Likewise.
* include/experimental/bits/fs_ops.h (create_directory):
Likewise.
* include/std/mutex (timed_mutex::_M_clocklock): Likewise.
(recursive_timed_mutex:_M_clocklock): Likewise.
* libsupc++/cxxabi_init_exception.h
(__cxa_init_primary_exception): Likewise.
* testsuite/17_intro/names.cc: Add checks.
(cherry picked from commit dc79eba72b4d16180e500eba336f511a485a8496)
|
|
libstdc++-v3/ChangeLog:
* include/experimental/internet (network_v6::network): Define.
(network_v6::hosts): Finish implementing.
(network_v6::to_string): Do not concatenate std::string to
arbitrary std::basic_string specialization.
* testsuite/experimental/net/internet/network/v6/cons.cc: New
test.
(cherry picked from commit 5b069117e261ae0b438aee0cdff8707827810adf)
|
|
We can't define endpoints and resolvers without the relevant OS support.
If IPPROTO_TCP and IPPROTO_UDP are both undefined then we won't need
basic_endpoint and basic_resolver anyway, so make them depend on those
macros.
libstdc++-v3/ChangeLog:
PR libstdc++/100285
* include/experimental/internet [IPPROTO_TCP || IPPROTO_UDP]
(basic_endpoint, basic_resolver_entry, resolver_base)
(basic_resolver_results, basic_resolver): Only define if the tcp
or udp protocols will be defined.
(cherry picked from commit 793ed718b522b15e2d758eca953feeec1979fe2c)
|
|
libstdc++-v3/ChangeLog:
PR libstdc++/110542
* include/bits/stl_uninitialized.h (__uninitialized_default_n):
Do not use std::fill_n during constant evaluation.
(cherry picked from commit 83cae6c4b788544635a71748e1881c150f42efef)
|
|
The test 27_io/manipulators/extended/get_time/char/2.cc expects that
%a in the de_DE locale is "Di" but on FreeBSD it's "Di." with a trailing
period. Adjust the input string to be "1971 Di." instead of "Di 1971"
and that way if %a doesn't expect the trailing '.' it simply won't
extract it from the stream.
This fixes:
FAIL: 27_io/manipulators/extended/get_time/char/2.cc -std=gnu++17 execution test
libstdc++-v3/ChangeLog:
* testsuite/27_io/manipulators/extended/get_time/char/2.cc:
Adjust input string so that it matches %a with or without a
trailing period.
(cherry picked from commit 4decc1062f0f6eb44209d9d5a26a744ffa474648)
|
|
The multiplication (4 * _M_t * __1p) can wraparound to zero if _M_t is
unsigned and 4 * _M_t wraps to zero. The third operand has type double,
so do the second multiplication first, so that we aren't multiplying
integers.
libstdc++-v3/ChangeLog:
PR libstdc++/114359
* include/bits/random.tcc (binomial_distribution::param_type):
Ensure arithmetic is done as type double.
* testsuite/26_numerics/random/binomial_distribution/114359.cc: New test.
(cherry picked from commit 07e03761a7fc1626a6a74ed957e117f56981558c)
|
|
This doesn't cause a problem with GCC, but Clang correctly diagnoses a
bug in the code. The objects in the allocated storage need to begin
their lifetime before we start using them.
This change uses the allocator's construct function instead of using
std::construct_at directly, in order to support fancy pointers.
libstdc++-v3/ChangeLog:
PR libstdc++/114367
* include/bits/stl_bvector.h (_M_allocate): Use allocator's
construct function to begin lifetime of words.
(cherry picked from commit 16afbd9c9c4282d56062cef95e6eccfdcf3efe03)
|
|
This is a workaround for a possible compiler bug that causes constraint
recursion in the operator<=>(const optional<T>&, const U&) overload.
libstdc++-v3/ChangeLog:
PR libstdc++/104606
* include/std/optional (operator<=>(const optional<T>&, const U&)):
Reverse order of three_way_comparable_with template arguments.
* testsuite/20_util/optional/relops/104606.cc: New test.
(cherry picked from commit 7f65d8267fbfd19cf21a3dc71d27e989e75044a3)
|
|
The allocator objects in container node handles were not being destroyed
after the node was re-inserted into a container. They are stored in a
union and so need to be explicitly destroyed when the node becomes
empty. The containers were zeroing the node handle's pointer, which
makes it empty, causing the handle's destructor to think there's nothing
to clean up.
Add a new member function to the node handle which destroys the
allocator and zeros the pointer. Change the containers to call that
instead of just changing the pointer manually.
We can also remove the _M_empty member of the union which is not
necessary.
libstdc++-v3/ChangeLog:
PR libstdc++/114401
* include/bits/hashtable.h (_Hashtable::_M_reinsert_node): Call
release() on node handle instead of just zeroing its pointer.
(_Hashtable::_M_reinsert_node_multi): Likewise.
(_Hashtable::_M_merge_unique): Likewise.
(_Hashtable::_M_merge_multi): Likewise.
* include/bits/node_handle.h (_Node_handle_common::release()):
New member function.
(_Node_handle_common::_Optional_alloc::_M_empty): Remove
unnecessary union member.
(_Node_handle_common): Declare _Hashtable as a friend.
* include/bits/stl_tree.h (_Rb_tree::_M_reinsert_node_unique):
Call release() on node handle instead of just zeroing its
pointer.
(_Rb_tree::_M_reinsert_node_equal): Likewise.
(_Rb_tree::_M_reinsert_node_hint_unique): Likewise.
(_Rb_tree::_M_reinsert_node_hint_equal): Likewise.
* testsuite/23_containers/multiset/modifiers/114401.cc: New test.
* testsuite/23_containers/set/modifiers/114401.cc: New test.
* testsuite/23_containers/unordered_multiset/modifiers/114401.cc: New test.
* testsuite/23_containers/unordered_set/modifiers/114401.cc: New test.
(cherry picked from commit c2e28df90a1640cebadef6c6c8ab5ea964071bb1)
|
|
The following testcase ICEs in ipa-free-lang, because the
fld_incomplete_type_of
gcc_assert (TYPE_CANONICAL (t2) != t2
&& TYPE_CANONICAL (t2) == TYPE_CANONICAL (TREE_TYPE (t)));
assertion doesn't hold.
This is because t is a struct S * type which was created while struct S
was still incomplete and without the may_alias attribute (and TYPE_CANONICAL
of a pointer type is a type created with can_alias_all = false argument),
while later on on the struct definition may_alias attribute was used.
fld_incomplete_type_of then creates an incomplete distinct copy of the
structure (but with the original attributes) but pointers created for it
are because of the "may_alias" attribute TYPE_REF_CAN_ALIAS_ALL, including
their TYPE_CANONICAL, because while that is created with !can_alias_all
argument, we later set it because of the "may_alias" attribute on the
to_type.
This doesn't ICE with C++ since PR70512 fix because the C++ FE sets
TYPE_REF_CAN_ALIAS_ALL on all pointer types to the class type (and its
variants) when the may_alias is added.
The following patch does that in the C FE as well.
2024-06-06 Jakub Jelinek <jakub@redhat.com>
PR c/114493
* c-decl.cc (c_fixup_may_alias): New function.
(finish_struct): Call it if "may_alias" attribute is
specified.
* gcc.dg/pr114493-1.c: New test.
* gcc.dg/pr114493-2.c: New test.
(cherry picked from commit d5a3c6d43acb8b2211d9fb59d59482d74c010f01)
|
|
The function currently incorrectly assumes all the __builtin_clz* and .CLZ
calls have non-negative result. That is the case of the former which is UB
on zero and has [0, prec-1] return value otherwise, and is the case of the
single argument .CLZ as well (again, UB on zero), but for two argument
.CLZ is the case only if the second argument is also nonnegative (or if we
know the argument can't be zero, but let's do that just in the ranger IMHO).
The following patch does that.
2024-06-04 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/115337
* fold-const.cc (tree_call_nonnegative_warnv_p) <CASE_CFN_CLZ>:
If fn is CFN_CLZ, use CLZ_DEFINED_VALUE_AT.
(cherry picked from commit b82a816000791e7a286c7836b3a473ec0e2a577b)
|
|
The following testcase is miscompiled, because we use save_expr
on the .{ADD,SUB,MUL}_OVERFLOW call we are creating, but if the first
two operands are not INTEGER_CSTs (in that case we just fold it right away)
but are TREE_READONLY/!TREE_SIDE_EFFECTS, save_expr doesn't actually
create a SAVE_EXPR at all and so we lower it to
*arg2 = REALPART_EXPR (.ADD_OVERFLOW (arg0, arg1)), \
IMAGPART_EXPR (.ADD_OVERFLOW (arg0, arg1))
which evaluates the ifn twice and just hope it will be CSEd back.
As *arg2 aliases *arg0, that is not the case.
The builtins are really never const/pure as they store into what
the third arguments points to, so after handling the INTEGER_CST+INTEGER_CST
case, I think we should just always use SAVE_EXPR. Just building SAVE_EXPR
by hand and setting TREE_SIDE_EFFECTS on it doesn't work, because
c_fully_fold optimizes it away again, so the following patch marks the
ifn calls as TREE_SIDE_EFFECTS (but doesn't do it for the
__builtin_{add,sub,mul}_overflow_p case which were designed for use
especially in constant expressions and don't really evaluate the
realpart side, so we don't really need a SAVE_EXPR in that case).
2024-06-04 Jakub Jelinek <jakub@redhat.com>
PR middle-end/108789
* builtins.cc (fold_builtin_arith_overflow): For ovf_only,
don't call save_expr and don't build REALPART_EXPR, otherwise
set TREE_SIDE_EFFECTS on call before calling save_expr.
* gcc.c-torture/execute/pr108789.c: New test.
(cherry picked from commit b8e28381cb5c0cddfe5201faf799d8b27f5d7d6c)
|
|
PCH doesn't work properly in --enable-host-pie configurations on
powerpc*-linux*.
The problem is that the rs6000_builtin_info and rs6000_instance_info
arrays mix pointers to .rodata/.data (bifname and attr_string point
to string literals in .rodata section, and the next member is either NULL
or &rs6000_instance_info[XXX]) and GC member (tree fntype).
Now, for normal GC this works just fine, we emit
{
&rs6000_instance_info[0].fntype,
1 * (RS6000_INST_MAX),
sizeof (rs6000_instance_info[0]),
>_ggc_mx_tree_node,
>_pch_nx_tree_node
},
{
&rs6000_builtin_info[0].fntype,
1 * (RS6000_BIF_MAX),
sizeof (rs6000_builtin_info[0]),
>_ggc_mx_tree_node,
>_pch_nx_tree_node
},
GC roots which are strided and thus cover only the fntype members of all
the elements of the two arrays.
For PCH though it actually results in saving those huge arrays (one is
130832 bytes, another 81568 bytes) into the .gch files and loading them back
in full. While the bifname and attr_string and next pointers are marked as
GTY((skip)), they are actually saved to point to the .rodata and .data
sections of the process which writes the PCH, but because cc1/cc1plus etc.
are position independent executables with --enable-host-pie, when it is
loaded from the PCH file, it can point in a completely different addresses
where nothing is mapped at all or some random different thing appears at.
While gengtype supports the callback option, that one is meant for
relocatable function pointers and doesn't work in the case of GTY arrays
inside of .data section anyway.
So, either we'd need to add some further GTY extensions, or the following
patch instead reworks it such that the fntype members which were the only
reason for PCH in those arrays are moved to separate arrays.
Size-wise in .data sections it is (in bytes):
vanilla patched
rs6000_builtin_info 130832 110704
rs6000_instance_info 81568 40784
rs6000_overload_info 7392 7392
rs6000_builtin_info_fntype 0 10064
rs6000_instance_info_fntype 0 20392
sum 219792 189336
where previously we saved/restored for PCH those 130832+81568 bytes, now we
save/restore just 10064+20392 bytes, so this change is beneficial for the
data section size.
Unfortunately, it grows the size of the rs6000_init_generated_builtins
function, vanilla had 218328 bytes, patched has 228668.
When I applied
void
rs6000_init_generated_builtins ()
{
+ bifdata *rs6000_builtin_info_p;
+ tree *rs6000_builtin_info_fntype_p;
+ ovlddata *rs6000_instance_info_p;
+ tree *rs6000_instance_info_fntype_p;
+ ovldrecord *rs6000_overload_info_p;
+ __asm ("" : "=r" (rs6000_builtin_info_p) : "0" (rs6000_builtin_info));
+ __asm ("" : "=r" (rs6000_builtin_info_fntype_p) : "0" (rs6000_builtin_info_fntype));
+ __asm ("" : "=r" (rs6000_instance_info_p) : "0" (rs6000_instance_info));
+ __asm ("" : "=r" (rs6000_instance_info_fntype_p) : "0" (rs6000_instance_info_fntype));
+ __asm ("" : "=r" (rs6000_overload_info_p) : "0" (rs6000_overload_info));
+ #define rs6000_builtin_info rs6000_builtin_info_p
+ #define rs6000_builtin_info_fntype rs6000_builtin_info_fntype_p
+ #define rs6000_instance_info rs6000_instance_info_p
+ #define rs6000_instance_info_fntype rs6000_instance_info_fntype_p
+ #define rs6000_overload_info rs6000_overload_info_p
+
hack by hand, the size of the function is 209700 though, so if really
wanted, we could add __attribute__((__noipa__)) to the function when
building with recent enough GCC and pass pointers to the first elements
of the 5 arrays to the function as arguments. If you want such a change,
could that be done incrementally?
2024-06-03 Jakub Jelinek <jakub@redhat.com>
PR target/115324
* config/rs6000/rs6000-gen-builtins.cc (write_decls): Remove
GTY markup from struct bifdata and struct ovlddata and remove their
fntype members. Change next member in struct ovlddata and
first_instance member of struct ovldrecord to have int type rather
than struct ovlddata *. Remove GTY markup from rs6000_builtin_info
and rs6000_instance_info arrays, declare new
rs6000_builtin_info_fntype and rs6000_instance_info_fntype arrays,
which have GTY markup.
(write_bif_static_init): Adjust for the above changes.
(write_ovld_static_init): Likewise.
(write_init_bif_table): Likewise.
(write_init_ovld_table): Likewise.
* config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Likewise.
* config/rs6000/rs6000-c.cc (find_instance): Likewise. Make static.
(altivec_resolve_overloaded_builtin): Adjust for the above changes.
(cherry picked from commit 4cf2de9b5268224816a3d53fdd2c3d799ebfd9c8)
|
|
The following testcases are miscompiled (with tons of GIMPLE
optimization disabled) because combine sees GE comparison of
1-bit sign_extract (i.e. something with [-1, 0] value range)
with (const_int -1) (which is always true) and optimizes it into
NE comparison of 1-bit zero_extract ([0, 1] value range) against
(const_int 0).
The reason is that simplify_compare_const first (correctly)
simplifies the comparison to
GE (ashift:SI something (const_int 31)) (const_int -2147483648)
and then an optimization for when the second operand is power of 2
triggers. That optimization is fine for power of 2s which aren't
the signed minimum of the mode, or if it is NE, EQ, GEU or LTU
against the signed minimum of the mode, but for GE or LT optimizing
it into NE (or EQ) against const0_rtx is wrong, those cases
are always true or always false (but the function doesn't have
a standardized way to tell callers the comparison is now unconditional).
The following patch just disables the optimization in that case.
2024-05-15 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/114902
PR rtl-optimization/115092
* combine.cc (simplify_compare_const): Don't optimize
GE op0 SIGNED_MIN or LT op0 SIGNED_MIN into NE op0 const0_rtx or
EQ op0 const0_rtx.
* gcc.dg/pr114902.c: New test.
* gcc.dg/pr115092.c: New test.
(cherry picked from commit 0b93a0ae153ef70a82ff63e67926a01fdab9956b)
|
|
no_sanitize callers [PR114956]
In r9-5742 we've started allowing to inline always_inline functions into
functions which have disabled e.g. address sanitization even when the
always_inline function is implicitly from command line options sanitized.
This mostly works fine because most of the asan instrumentation is done only
late after ipa, but as the following testcase the .ASAN_MARK ifn calls
gimplifier adds can result in ICEs.
Fixed by dropping those during inlining, similarly to how we drop
.TSAN_FUNC_EXIT calls.
2024-05-07 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/114956
* tree-inline.cc: Include asan.h.
(copy_bb): Remove also .ASAN_MARK calls if id->dst_fn has asan/hwasan
sanitization disabled.
* gcc.dg/asan/pr114956.c: New test.
(cherry picked from commit d4e25cf4f7c1f51a8824cc62bbb85a81a41b829a)
|
|
Seems when Martin S. implemented this, he coded there strict reading
of the standard, which said that %lc with (wint_t) 0 argument is handled
as wchar_t[2] temp = { arg, 0 }; %ls with temp arg and so shouldn't print
any values. But, most of the libc implementations actually handled that
case like %c with '\0' argument, adding a single NUL character, the only
known exception is musl.
Recently, C23 changed this in response to GB-141 and POSIX in
https://austingroupbugs.net/view.php?id=1647
so that it should have the same behavior as %c with '\0'.
Because there is implementation divergence, the following patch uses
a range rather than hardcoding it to all 1s (i.e. the %c behavior),
though the likely case is still 1 (forward looking plus most of
implementations).
The res.knownrange = true; assignment removed is redundant due to
the same assignment done unconditionally before the if statement,
rest is formatting fixes.
I don't think the min >= 0 && min < 128 case is right either, I'd think
it should be min >= 0 && max < 128, otherwise it is just some possible
inputs are (maybe) ASCII and there can be others, but this code is a total
mess anyway, with the min, max, likely (somewhere in [min, max]?) and then
unlikely possibly larger than max, dunno, perhaps for at least some chars
in the ASCII range the likely case could be for the ascii case; so perhaps
just the one_2_one_ascii shouldn't set max to 1 and mayfail should be true
for max >= 128. Anyway, didn't feel I should touch that right now.
2024-04-30 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/114876
* gimple-ssa-sprintf.cc (format_character): For min == 0 && max == 0,
set max, likely and unlikely members to 1 rather than 0. Remove
useless res.knownrange = true;. Formatting fixes.
* gcc.dg/pr114876.c: New test.
* gcc.dg/tree-ssa/builtin-sprintf-warn-1.c: Adjust expected
diagnostics.
(cherry picked from commit 6c6b70f07208ca14ba783933988c04c6fc2fff42)
|
|
copy [PR114825]
tree-nested.cc creates in 2 spots artificial VAR_DECLs, one of them is used
both for debug info and OpenMP/OpenACC lowering purposes, the other solely for
OpenMP/OpenACC lowering purposes.
When the decls are used in OpenMP/OpenACC lowering, the OMP langhooks (mostly
Fortran, C just a little and C++ doesn't have nested functions) then inspect
the flags on the vars and based on that decide how to lower the corresponding
clauses.
Unfortunately we weren't copying DECL_LANG_SPECIFIC and DECL_LANG_FLAG_?, so
the langhooks made decisions on the default flags on those instead.
As the original decl isn't necessarily a VAR_DECL, could be e.g. PARM_DECL,
using copy_node wouldn't work properly, so this patch just copies those
flags in addition to other flags it was copying already. And I've removed
code duplication by introducing a helper function which does copying common
to both uses.
2024-04-25 Jakub Jelinek <jakub@redhat.com>
PR fortran/114825
* tree-nested.cc (get_debug_decl): New function.
(get_nonlocal_debug_decl): Use it.
(get_local_debug_decl): Likewise.
* gfortran.dg/gomp/pr114825.f90: New test.
(cherry picked from commit 14d48516e588ad2b35e2007b3970bdcb1b3f145c)
|
|
On the following testcase, combine propagates the mem/v load into mem store
with the same address and then removes it, because noop_move_p says it is a
no-op move. If it was the other way around, i.e. mem/v store and mem load,
or both would be mem/v, it would be kept.
The problem is that rtx_equal_p never checks any kind of flags on the rtxes
(and I think it would be quite dangerous to change it at this point), and
set_noop_p checks side_effects_p on just one of the operands, not both.
In the MEM <- MEM set, it only checks it on the destination, in
store to ZERO_EXTRACT only checks it on the source.
The following patch adds the missing side_effects_p checks.
2024-04-19 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/114768
* rtlanal.cc (set_noop_p): Don't return true for MEM <- MEM
sets if src has side-effects or for stores into ZERO_EXTRACT
if ZERO_EXTRACT operand has side-effects.
* gcc.dg/pr114768.c: New test.
(cherry picked from commit 9f295847a9c32081bdd0fe908ffba58e830a24fb)
|
|
etc. expansion [PR114753]
__builtin_{add,sub,mul}_overflow{,_p} builtins are well defined
for all inputs even for -ftrapv, and the -fsanitize=signed-integer-overflow
ifns shouldn't abort in libgcc but emit the desired ubsan diagnostics
or abort depending on -fsanitize* setting regardless of -ftrapv.
The expansion of these internal functions uses expand_expr* in various
places (e.g. MULT_EXPR at least in 2 spots), so temporarily disabling
flag_trapv in all those spots would be hard.
The following patch disables it around the bodies of 3 functions
which can do the expand_expr calls.
If it was in the C++ FE, I'd use some RAII sentinel, but I don't think
we have one in the middle-end.
2024-04-18 Jakub Jelinek <jakub@redhat.com>
PR middle-end/114753
* internal-fn.cc (expand_mul_overflow): Save flag_trapv and
temporarily clear it for the duration of the function, then
restore previous value.
(expand_vector_ubsan_overflow): Likewise.
(expand_arith_overflow): Likewise.
* gcc.dg/pr114753.c: New test.
(cherry picked from commit 6c152c9db3b5b9d43e12846fb7a44977c0b65fc2)
|
|
The enumerator still doesn't have TREE_TYPE set but diag_attr_exclusions
assumes that all decls must have types.
I think it is better in something as unimportant as diag_attr_exclusions
to be more robust, if there is no type, it can just diagnose exclusions
on the DECL_ATTRIBUTES, like for types it only diagnoses it on
TYPE_ATTRIBUTES.
2024-04-15 Jakub Jelinek <jakub@redhat.com>
PR c++/114634
* attribs.cc (diag_attr_exclusions): Set attrs[1] to NULL_TREE for
decls with NULL TREE_TYPE.
* g++.dg/ext/attrib68.C: New test.
(cherry picked from commit 7ec54f5fdfec298812a749699874db4d6a7246bb)
|
|
The middle-end warns about the ANNOTATE_EXPR added for while/for loops
if they declare a var inside of the loop condition.
This is because the assumption is that ANNOTATE_EXPR argument is used
immediately in a COND_EXPR (later GIMPLE_COND), but simplify_loop_decl_cond
wraps the ANNOTATE_EXPR inside of a TRUTH_NOT_EXPR, so it no longer
holds.
The following patch fixes that by adding the TRUTH_NOT_EXPR inside of the
ANNOTATE_EXPR argument if any.
2024-04-12 Jakub Jelinek <jakub@redhat.com>
PR c++/114691
* semantics.cc (simplify_loop_decl_cond): Use cp_build_unary_op with
TRUTH_NOT_EXPR on ANNOTATE_EXPR argument (if any) rather than
ANNOTATE_EXPR itself.
* g++.dg/ext/pr114691.C: New test.
(cherry picked from commit 91146346f57cc54dfeb2669347edd0eb3d13af7f)
|
|
-fsanitize=address -fstack-protector* [PR110027]
On Tue, Mar 26, 2024 at 02:08:02PM +0800, liuhongt wrote:
> > > So, try to add some other variable with larger size and smaller alignment
> > > to the frame (and make sure it isn't optimized away).
> > >
> > > alignb above is the alignment of the first partition's var, if
> > > align_frame_offset really needs to depend on the var alignment, it probably
> > > should be the maximum alignment of all the vars with alignment
> > > alignb * BITS_PER_UNIT <=3D MAX_SUPPORTED_STACK_ALIGNMENT
> > >
>
> In asan_emit_stack_protection, when it allocated fake stack, it assume
> bottom of stack is also aligned to alignb. And the place violated this
> is the first var partition. which is 32 bytes offsets, it should be
> BIGGEST_ALIGNMENT / BITS_PER_UNIT.
> So I think we need to use MAX (BIGGEST_ALIGNMENT /
> BITS_PER_UNIT, ASAN_RED_ZONE_SIZE) for the first var partition.
Your first patch aligned offsets[0] to maximum of alignb and
ASAN_RED_ZONE_SIZE. But as I wrote in the reply to that mail, alignb there
is the alignment of just a single variable which is the first one to appear
in the sorted list and is placed in the highest spot in the stack frame.
That is not necessarily the largest alignment, the sorting ensures that it
is a variable with the largest size in the frame (and only if several of
them have equal size, largest alignment from the same sized ones). Your
second patch used maximum of BIGGEST_ALIGNMENT / BITS_PER_UNIT and
ASAN_RED_ZONE_SIZE. That doesn't change anything at all when using
-mno-avx512f - offsets[0] is still just 32-byte aligned in that case
relative to top of frame, just changes the -mavx512f case to be 64-byte
aligned offsets[0] (aka offsets[0] is then either 0 or -64 instead of either
0 or -32). That will not help if any variable in the frame needs 128-byte,
256-byte, 512-byte ... 4096-byte alignment. If you want to fix the bug in
the spot you've touched, you'd need to walk all the
stack_vars[stack_vars_sorted[si2]] for si2 [si + 1, n - 1] and for those
where the loop would do anything (i.e.
stack_vars[i2].representative == i2
&& TREE_CODE (decl2) == SSA_NAME
? SA.partition_to_pseudo[var_to_partition (SA.map, decl2)] == NULL_RTX
: DECL_RTL (decl2) == pc_rtx
and the pred applies (but that means also walking the earlier ones!
because with -fstack-protector* the vars can be processed in several calls) and
alignb2 * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT
and compute maximum of those alignments.
That maximum is already computed,
data->asan_alignb = MAX (data->asan_alignb, alignb);
computes that, but you get the final result only after you do all the
expand_stack_vars calls. You'd need to compute it before.
Though, that change would be still in the wrong place.
The thing is, it would be a waste of the precious stack space when it isn't
needed at all (e.g. when asan will not at compile time do the use after
return checking, or if it won't do it at runtime, or even if it will do at
runtime it will waste the space on the stack).
The following patch fixes it solely for the __asan_stack_malloc_N
allocations, doesn't enlarge unnecessarily further the actual stack frame.
Because asan is only supported on FRAME_GROWS_DOWNWARD architectures
(mips, rs6000 and xtensa are conditional FRAME_GROWS_DOWNWARD arches, which
for -fsanitize=address or -fstack-protector* use FRAME_GROWS_DOWNWARD 1,
otherwise 0, others supporting asan always just use 1), the assumption for
the dynamic stack realignment is that the top of the stack frame (aka offset
0) is aligned to alignb passed to the function (which is the maximum of alignb
of all the vars in the frame). As checked by the assertion in the patch,
offsets[0] is 0 most of the time and so that assumption is correct, the only
case when it is not 0 is if -fstack-protector* is on together with
-fsanitize=address and cfgexpand.cc (create_stack_guard) created a stack
guard. That is the only variable which is allocated in the stack frame
right away, for all others with -fsanitize=address defer_stack_allocation
(or -fstack-protector*) returns true and so they aren't allocated
immediately but handled during the frame layout phases. So, the original
frame_offset of 0 is changed because of the stack guard to
-pointer_size_in_bytes and later at the
if (data->asan_vec.is_empty ())
{
align_frame_offset (ASAN_RED_ZONE_SIZE);
prev_offset = frame_offset.to_constant ();
}
to -ASAN_RED_ZONE_SIZE. The asan_emit_stack_protection code wasn't
taking this into account though, so essentially assumed in the
__asan_stack_malloc_N allocated memory it needs to align it such that
pointer corresponding to offsets[0] is alignb aligned. But that isn't
correct if alignb > ASAN_RED_ZONE_SIZE, in that case it needs to ensure that
pointer corresponding to frame offset 0 is alignb aligned.
The following patch fixes that. Unlike the previous case where
we knew that asan_frame_size + base_align_bias falls into the same bucket
as asan_frame_size, this isn't in some cases true anymore, so the patch
recomputes which bucket to use and if going to bucket 11 (because there is
no __asan_stack_malloc_11 function in the library) disables the after return
sanitization.
2024-04-11 Jakub Jelinek <jakub@redhat.com>
PR middle-end/110027
* asan.cc (asan_emit_stack_protection): Assert offsets[0] is
zero if there is no stack protect guard, otherwise
-ASAN_RED_ZONE_SIZE. If alignb > ASAN_RED_ZONE_SIZE and there is
stack pointer guard, take the ASAN_RED_ZONE_SIZE bytes allocated at
the top of the stack into account when computing base_align_bias.
Recompute use_after_return_class from asan_frame_size + base_align_bias
and set to -1 if that would overflow to 11.
* gcc.dg/asan/pr110027.c: New test.
(cherry picked from commit 467898d513e602f5b5fc4183052217d7e6d6e8ab)
|