|
The previous version of this test required arch v6+ (for sxth), and
the number of vmov instructions depended on the floating-point ABI
(softfp needed more of them to transfer floating-point values to and
from general registers).
With this patch we require arch v7-a, vfp FPU and -mfloat-abi=hard.
We also use -O2 to clean up the generated code and convert the
scan-assembler-times directives into check-function-bodies.
Tested on arm-none-linux-gnueabihf and several flavours of
arm-none-eabi.
gcc/testsuite/ChangeLog:
PR target/119556
* gcc.target/arm/short-vfp-1.c: Improve dg directives.
|
|
The IPA-VRP workaround in the tailc/musttail passes was just comparing
the singleton constant from a tail call candidate return with the ret_val.
This unfortunately doesn't work in the following testcase, where we have
  <bb 5> [local count: 152205050]:
  baz (); [must tail call]
  goto <bb 4>; [100.00%]
  <bb 6> [local count: 762356696]:
  _8 = foo ();
  <bb 7> [local count: 1073741824]:
  # _3 = PHI <0B(4), _8(6)>
  return _3;
and in the unreduced testcase even more PHIs before we reach the return
stmt.
Normally when the call has lhs, whenever we follow a (non-EH) successor
edge, it calls propagate_through_phis and that walks the PHIs in the
destination bb of the edge and when it sees a PHI whose argument matches
that of the currently tracked value (ass_var), it updates ass_var to
PHI result of that PHI. I think it is theoretically dangerous that it
picks the first one; there could be multiple matching PHIs, so it would
perhaps be safer to walk backwards from the return value up to the call.
Anyway, this PR is about the IPA-VRP workaround, there ass_var is NULL
because the potential tail call has no lhs, but ret_var is not TREE_CONSTANT
but SSA_NAME with PHI as SSA_NAME_DEF_STMT. The following patch handles
it by pushing the edges we've walked through when ass_var is NULL into a
vector and if ret_var is SSA_NAME set to PHI result, it attempts to walk
back from the ret_var through arguments of PHIs corresponding to the
edges we've walked back until we reach a constant and compare that constant
against the singleton value as well.
2025-04-07 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/119614
* tree-tailcall.cc (find_tail_calls): Remember edges which have been
walked through if !ass_var. Perform IPA-VRP workaround even when
ret_var is not TREE_CONSTANT, in that case check in a loop if it is
a PHI result and in that case look at the PHI argument from
corresponding edge in the edge vector.
* g++.dg/opt/pr119613.C: Change { c || c++11 } in obviously C++ only
test to just c++11.
* g++.dg/opt/pr119614.C: New test.
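The walk back through PHI arguments described in this commit can be
illustrated with a small toy model. This is only a sketch: the types
Phi, Opaque, Value and Edge and the function constant_reaching_return
are hypothetical stand-ins, not GCC's gimple, tree or edge structures,
and the real code in find_tail_calls may differ in detail.

```cpp
#include <optional>
#include <unordered_map>
#include <variant>
#include <vector>

// Toy model only: hypothetical stand-ins, not GCC data structures.
struct Phi;
struct Opaque { int id; };                      // some non-PHI SSA name
using Value = std::variant<long, const Phi *, Opaque>;

struct Phi {
  int block;                                    // block containing the PHI
  std::unordered_map<int, Value> arg_by_pred;   // predecessor block -> argument
};

struct Edge { int src, dest; };

// Walk back from the returned value through PHIs, following only the
// edges recorded while walking forward from the call (stored in walk
// order).  Returns the constant reached, if any.
std::optional<long>
constant_reaching_return (Value ret_val, const std::vector<Edge> &walked)
{
  for (auto it = walked.rbegin (); it != walked.rend (); ++it)
    {
      auto *slot = std::get_if<const Phi *> (&ret_val);
      if (!slot)
        break;                                  // constant or opaque: stop
      const Phi *phi = *slot;
      if (phi->block != it->dest)
        continue;                               // PHI is not at this edge's target
      auto arg = phi->arg_by_pred.find (it->src);
      if (arg == phi->arg_by_pred.end ())
        return std::nullopt;
      ret_val = arg->second;                    // step to the PHI argument
    }
  if (const long *c = std::get_if<long> (&ret_val))
    return *c;
  return std::nullopt;
}
```

In the dump quoted above, the recorded edges would be 5->4 and 4->7,
and the PHI in bb 7 selects 0B for the edge from bb 4, so the sketch
reaches the constant 0, which is the value that would then be compared
against the singleton value.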
|
|
For technical reasons, the recently reimplemented finalization machinery
for controlled types requires arrays of controlled types to be allocated
with their bounds, including in the case where their nominal subtype is
constrained. However, in this case, the type of 'Access for the arrays
is pointer-to-constrained-array and, therefore, its value must designate
the array itself and not the bounds.
gcc/ada/
* gcc-interface/utils.cc (convert) <POINTER_TYPE>: Use fold_convert
to convert between thin pointers. If the source is a thin pointer
with zero offset from the base and the target is a pointer to its
array, displace the pointer after converting it.
* gcc-interface/utils2.cc (build_unary_op) <ATTR_ADDR_EXPR>: Use
fold_convert to convert the address before displacing it.
|
|
As noted in the previous patch, combine still takes >30% of
compile time in the original testcase for PR101523. The problem
is that try_combine uses linear insn searches for some dataflow
queries, so in the worst case, an unlimited number of 2->2
combinations for the same i2 can lead to quadratic behaviour.
This patch limits distribute_links to a certain number
of instructions when i2 is unchanged. As Segher said in the PR trail,
it would make more conceptual sense to apply the limit unconditionally,
but I thought it would be better to change as little as possible at
this development stage. Logically, in stage 1, the --param should
be applied directly by distribute_links with no input from callers.
As I mentioned in:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116398#c28
I think it's safe to drop log links even if a use exists. All
processing of log links seems to handle the absence of a link
for a particular register in a conservative way.
The initial set-up errs on the side of dropping links, since for example
create_log_links has:
  /* flow.c claimed:
     We don't build a LOG_LINK for hard registers contained
     in ASM_OPERANDs. If these registers get replaced,
     we might wind up changing the semantics of the insn,
     even if reload can make what appear to be valid
     assignments later. */
  if (regno < FIRST_PSEUDO_REGISTER
      && asm_noperands (PATTERN (use_insn)) >= 0)
    continue;
which excludes combinations by dropping log links, rather than during
try_combine. And:
  /* If this register is being initialized using itself, and the
     register is uninitialized in this basic block, and there are
     no LOG_LINKS which set the register, then part of the
     register is uninitialized. In that case we can't assume
     anything about the number of nonzero bits.
     ??? We could do better if we checked this in
     reg_{nonzero_bits,num_sign_bit_copies}_for_combine. Then we
     could avoid making assumptions about the insn which initially
     sets the register, while still using the information in other
     insns. We would have to be careful to check every insn
     involved in the combination. */
  if (insn
      && reg_referenced_p (x, PATTERN (insn))
      && !REGNO_REG_SET_P (DF_LR_IN (BLOCK_FOR_INSN (insn)),
                           REGNO (x)))
    {
      struct insn_link *link;
      FOR_EACH_LOG_LINK (link, insn)
        if (dead_or_set_p (link->insn, x))
          break;
      if (!link)
        {
          rsp->nonzero_bits = GET_MODE_MASK (mode);
          rsp->sign_bit_copies = 1;
          return;
        }
    }
treats the lack of a log link as a possible sign of uninitialised data,
but that would be a missed optimisation rather than a correctness issue.
One question is what the default --param value should be. I went with
Jakub's suggestion of 3000 from:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116398#c25
Also, to answer Jakub's question in that comment, I tried bisecting:
int limit = atoi (getenv ("BISECT"));
(so applying the limit for all calls from try_combine) with an
abort in distribute_links if the limit caused a link to be skipped.
The minimum BISECT value that allowed an aarch64-linux-gnu bootstrap
to succeed with --enable-languages=all --enable-checking=yes,rtl,extra
was 142, so much lower than the parameter value. I realised too late
that --enable-checking=release would probably have been a more
interesting test.
The previous patch meant that distribute_links itself is now linear
for a given i2 definition, since each search starts at the previous
last use, rather than at i2 itself. This means that the limit has
to be applied cumulatively across all searches for the same link.
The patch does that by storing a counter in the insn_link structure.
There was a 32-bit hole there on LP64 hosts.
gcc/
PR testsuite/116398
* params.opt (-param=max-combine-search-insns=): New param.
* doc/invoke.texi: Document it.
* combine.cc (insn_link::insn_count): New field.
(alloc_insn_link): Initialize it.
(distribute_links): Add a limit parameter.
(try_combine): Use the new param to limit distribute_links
when only i3 has changed.
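The cumulative limit described in this commit can be sketched roughly
as follows. This is a toy model under stated assumptions: Link,
next_use and the vector of "uses the register" flags are hypothetical,
and only the idea of a counter stored in the link and shared across
searches is taken from the message above.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for a log link; not GCC's insn_link.
struct Link {
  std::size_t resume;   // index at which the next search starts
  int insn_count;       // instructions scanned so far, over all searches
};

// Look for the next instruction that uses the register, starting where
// the previous search stopped.  The scan budget is cumulative: once
// LIMIT instructions have been scanned for this link in total, the
// search gives up, which models dropping the link.
std::optional<std::size_t>
next_use (Link &link, const std::vector<bool> &uses_reg, int limit)
{
  for (std::size_t i = link.resume; i < uses_reg.size (); ++i)
    {
      if (++link.insn_count > limit)
        return std::nullopt;          // budget exhausted
      if (uses_reg[i])
        {
          link.resume = i + 1;        // simplification: resume just past the use
          return i;
        }
    }
  return std::nullopt;                // no further use
}
```

Because the counter lives in the link rather than in each call, two
short searches and one long search consume the same budget, which is
the property the message above asks for.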
|
|
Another problem in PR101523 was that, after each successful 2->2
combination attempt, distribute_links would search further and further
for the next combinable use of the i2 destination. Each search would
start at i2 itself, making the search quadratic in the worst case.
In a 2->2 combination, if i2 is unchanged, the search can start at i3
instead of i2. The same thing applies to i2 when distributing i2's
links, since the only changes to earlier instructions are the deletion
of i0 and i1.
This change, combined with the previous split_i2i3 patch, gives a
34.6% speedup in combine for the testcase in PR101523. Combine
goes from being 41% to 34% of compile time.
gcc/
PR testsuite/116398
* combine.cc (distribute_links): Take an optional start point.
(try_combine): If only i3 has changed, only distribute i3's links,
not i2's. Start the search for the new use from i3 rather than
from the definition instruction. Likewise start the search for
the new use from i2 when distributing i2's links.
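The difference between restarting each search at the definition and
resuming at the last use can be illustrated by counting visited
instructions in a toy model. The function scanned_insns and its
representation of instructions as flags are hypothetical and are not
taken from combine.cc.

```cpp
#include <cstddef>
#include <vector>

// Count the instructions visited while finding every use in turn.
// After a use is found it is cleared, modelling a combination that
// rewrote that instruction.  The next search starts either at the
// definition (index 0) or at the use just handled.
std::size_t
scanned_insns (std::vector<bool> uses_reg, bool resume_from_last_use)
{
  std::size_t scanned = 0, start = 0;
  for (;;)
    {
      std::size_t i = start;
      while (i < uses_reg.size () && !uses_reg[i])
        {
          ++scanned;
          ++i;
        }
      if (i == uses_reg.size ())
        return scanned;               // no use left
      ++scanned;                      // count the use itself
      uses_reg[i] = false;
      start = resume_from_last_use ? i : 0;
    }
}
```

With n consecutive uses, restarting at the definition visits
n(n+1)/2 + n instructions in this model, while resuming at the last
use visits 2n.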
|
|
When combining a single-set i2 into a multi-set i3, combine
first tries to match the new multi-set in-place. If that fails,
combine considers splitting the multi-set so that one set goes in
i2 and the other set stays in i3. That moves a destination from i3
to i2 and so combine needs to update any associated log link for that
destination to point to i2 rather than i3.
However, that kind of split can also occur for 2->2 combinations.
For a 2-instruction combination in which i2 doesn't die in i3, combine
tries a 2->1 combination by turning i3 into a parallel of the original
i2 and the combined i3. If that fails, combine will split the parallel
as above, so that the first set goes in i2 and the second set goes in i3.
But that can often leave i2 unchanged, meaning that no destinations have
moved and so no search is necessary.
gcc/
PR testsuite/116398
* combine.cc (try_combine): Shortcut the split_i2i3 handling if
i2 is unchanged.
|
|
One of the problems in PR101523 was that, after each successful
2->2 combination attempt, try_combine would restart combination
attempts at i2 even if i2 hadn't changed. This led to quadratic
behaviour as the same failed combinations between i2 and i3 were
tried repeatedly.
The original patch for the PR dealt with that by disallowing 2->2
combinations. However, that led to various optimisation regressions,
so there was interest in allowing the combinations again, at least
until an alternative way of getting the same results is in place.
This patch is a variant of Richi's in:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101523#c53
but limited to when we're combining 2 instructions.
This speeds up combine by 10x on the original PR101523 testcase
and reduces combine's memory footprint by 100x.
gcc/
PR testsuite/116398
* combine.cc (try_combine): Reallow 2->2 combinations. Detect when
only i3 has changed and restart from i3 in that case.
gcc/testsuite/
* gcc.target/aarch64/popcnt-le-1.c: Account for commutativity of TST.
* gcc.target/aarch64/popcnt-le-3.c: Likewise AND.
* gcc.target/aarch64/pr100056.c: Revert previous patch.
* gcc.target/aarch64/sve/pred-not-gen-1.c: Likewise.
* gcc.target/aarch64/sve/pred-not-gen-4.c: Likewise.
* gcc.target/aarch64/sve/var_stride_2.c: Likewise.
* gcc.target/aarch64/sve/var_stride_4.c: Likewise.
Co-authored-by: Richard Biener <rguenther@suse.de>
|
|
This patch forestalls a regression in gcc.dg/rtl/x86_64/vector_eq.c
with the patch for PR116398. The test wants:
(cinsn 3 (set (reg:V4SI <0>) (const_vector:V4SI [(const_int 0) (const_int 0) (const_int 0) (const_int 0)])))
(cinsn 5 (set (reg:V4SI <2>)
(eq:V4SI (reg:V4SI <0>) (reg:V4SI <1>))))
to be folded to a vector of -1s. One unusual thing about the fold
is that the <1> in the second insn is uninitialised; it looks like
it should be replaced by <0>, or that there should be an insn 4 that
copies <0> to <1>.
As it stands, the test relies on init-regs to insert a zero
initialisation of <1>. This happens after all the cse/pre/fwprop
stuff, with only dce passes between init-regs and combine.
Combine therefore sees:
(insn 3 2 8 2 (set (reg:V4SI 98)
(const_vector:V4SI [
(const_int 0 [0]) repeated x4
])) 2403 {movv4si_internal}
(nil))
(insn 8 3 9 2 (clobber (reg:V4SI 99)) -1
(nil))
(insn 9 8 5 2 (set (reg:V4SI 99)
(const_vector:V4SI [
(const_int 0 [0]) repeated x4
])) -1
(nil))
(insn 5 9 7 2 (set (reg:V4SI 100)
(eq:V4SI (reg:V4SI 98)
(reg:V4SI 99))) 7874 {*sse2_eqv4si3}
(expr_list:REG_DEAD (reg:V4SI 99)
(expr_list:REG_DEAD (reg:V4SI 98)
(expr_list:REG_EQUAL (eq:V4SI (const_vector:V4SI [
(const_int 0 [0]) repeated x4
])
(reg:V4SI 99))
(nil)))))
It looks like the test should then pass through a 3, 9 -> 5 combination,
so that we get an (eq ...) between two zeros and fold it to a vector
of -1s. But although the combination is attempted, the fold doesn't
happen. Instead, combine is left to match the unsimplified (eq ...)
between two zeros, which rightly fails. The test only passes because
late_combine2 happens to try simplifying an (eq ...) between reg X and
reg X, which does fold to a vector of -1s.
The different handling of registers and constants is due to this
code in simplify_const_relational_operation:
  if (INTEGRAL_MODE_P (mode) && trueop1 != const0_rtx
      && (code == EQ || code == NE)
      && ! ((REG_P (op0) || CONST_INT_P (trueop0))
            && (REG_P (op1) || CONST_INT_P (trueop1)))
      && (tem = simplify_binary_operation (MINUS, mode, op0, op1)) != 0
      /* We cannot do this if tem is a nonzero address. */
      && ! nonzero_address_p (tem))
    return simplify_const_relational_operation (signed_condition (code),
                                                mode, tem, const0_rtx);
INTEGRAL_MODE_P matches vector integer modes, but everything else
about the condition is written for scalar integers only. Thus if
trueop0 and trueop1 are equal vector constants, we'll bypass all
the exclusions and try simplifying a subtraction. This will succeed,
giving a vector of zeros. The recursive call will then try to simplify
a comparison between the vector of zeros and const0_rtx, which isn't
well-formed. Luckily or unluckily, the ill-formedness doesn't trigger
an ICE, but it does prevent any simplification from happening.
The least-effort fix would be to replace INTEGRAL_MODE_P with
SCALAR_INT_MODE_P. But the fold does make conceptual sense for
vectors too, so it seemed better to keep the INTEGRAL_MODE_P and
generalise the rest of the condition to match.
gcc/
* simplify-rtx.cc (simplify_const_relational_operation): Generalize
the constant checks in the fold-via-minus path to match the
INTEGRAL_MODE_P condition.
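As a rough illustration of why the zero operand has to match the shape
of the subtraction result, here is a toy model of folding an equality
between constants via a subtraction. Lanes and fold_eq_via_minus are
hypothetical names, a true lane is represented as all-ones (the "-1"
the test expects), and nothing here reflects the actual RTL
representation or the exact condition used in the patch.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// One element per lane; a scalar is modelled as a single lane.
using Lanes = std::vector<unsigned long>;

// Fold (eq a b) for constants by computing a - b and comparing the
// difference with a zero of the same shape, lane by lane.
std::optional<Lanes>
fold_eq_via_minus (const Lanes &a, const Lanes &b)
{
  if (a.empty () || a.size () != b.size ())
    return std::nullopt;              // shapes differ: no fold
  const Lanes zero (a.size (), 0UL);  // zero with the operands' shape
  Lanes result (a.size ());
  for (std::size_t i = 0; i < a.size (); ++i)
    {
      unsigned long diff = a[i] - b[i];   // wraps, no undefined behaviour
      result[i] = diff == zero[i] ? ~0UL : 0UL;
    }
  return result;
}
```

Comparing the four-lane difference with a single scalar zero would
correspond to the size mismatch rejected at the top of the function,
which is the toy analogue of the ill-formed comparison described above.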
|
|
gcc/ChangeLog
* doc/extend.texi (Boolean Type): Further clarify support for
_Bool in C23 and C++.
|
|
The discovered paths already include the multilib, so there is
no need to add an extra library to COBOL_UNDER_TEST. Doing so creates
a duplicate, which causes test failures on Darwin, where the linker warns
when duplicate libraries are provided on the link line.
gcc/testsuite/ChangeLog:
* lib/cobol.exp: Simplify the setting of COBOL_UNDER_TEST.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
|
|
As discussed in the PR, the options had been added during development
to handle specific cases; they are no longer needed (and if they should
become necessary again, we will need to guard them such that individual
platforms get the correct handling).
PR cobol/119414
gcc/cobol/ChangeLog:
* gcobolspec.cc (append_rdynamic,
append_allow_multiple_definition, append_fpic): Remove.
(lang_specific_driver): Remove platform-specific command
line option handling.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
|
|
gcc/ChangeLog
PR middle-end/78874
* doc/invoke.texi (Warning Options): Fix description of
-Wno-aggressive-loop-optimizations to reflect that this turns
off the warning, and the default is for it to be enabled.
|
|
Here during maybe_dependent_member_ref for accepted_type<_Up>, we
correctly don't strip the typedef because it's a complex one (its
defaulted template parameter isn't used in its definition) and so
we recurse to consider its corresponding TYPE_DECL.
We then incorrectly decide to not rewrite this use because of the
TYPENAME_TYPE shortcut. But I don't think this shortcut should apply to
a typedef TYPE_DECL.
PR c++/118626
gcc/cp/ChangeLog:
* pt.cc (maybe_dependent_member_ref): Restrict TYPENAME_TYPE
shortcut to non-typedef TYPE_DECL.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/class-deduction-alias25a.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
Here during maybe_dependent_member_ref (as part of CTAD rewriting
for the variant constructor) for __accepted_type<_Up> we strip this
alias all the way to type _Nth_type<__accepted_index<_Up>>, for which
we return NULL since _Nth_type is at namespace scope and so no
longer needs rewriting.
Note, however, that the template argument __accepted_index<_Up> of this
stripped type _does_ need rewriting (since it specializes a variable
template from the current instantiation). We end up not rewriting this
variable template reference at any point, however, because upon returning
NULL, the caller (tsubst) proceeds to substitute the original form of
the type __accepted_type<_Up>, which doesn't directly refer to
__accepted_index. This later leads to an ICE during subsequent alias
CTAD rewriting of this guide that contains a non-rewritten reference
to __accepted_index.
So when maybe_dependent_member_ref decides to not rewrite a class-scope
alias that's been stripped, the caller needs to commit to substituting
the stripped type rather than the original type. This patch essentially
implements that by making maybe_dependent_member_ref call tsubst itself
in that case.
PR c++/118626
gcc/cp/ChangeLog:
* pt.cc (maybe_dependent_member_ref): Substitute and return the
stripped type if we decided to not rewrite it directly.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/class-deduction-alias25.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
I keep forgetting to do this.... :-(
gcc/c-family/ChangeLog
* c.opt.urls: Regenerate.
gcc/d/ChangeLog
* lang.opt.urls: Regenerate.
|
|
Per the issue, there were a couple places in the manual where
-Wno-psabi was mentioned, but the option itself was not documented.
gcc/c-family/ChangeLog
PR c/81831
* c.opt (Wpsabi): Remove "Undocumented" modifier and add a
documentation string.
gcc/ChangeLog
PR c/81831
* doc/invoke.texi (Option Summary): Add -Wno-psabi.
(Warning Options): Document -Wpsabi separately from -Wabi.
Note it's enabled by default, not just implied by -Wabi.
Replace the detailed example for a GCC 4.4 change for x86
(which is unlikely to be very interesting nowadays) with
just a list of all targets that presently diagnose these
warnings.
(RS/6000 and PowerPC Options): Add cross-references for
-Wno-psabi.
|
|
Here the implicit use of 'this' in inner.size() template argument was
being rejected despite P2280R4 relaxations, due to the special *this
handling in the INDIRECT_REF case of potential_constant_expression_1.
This handling was originally added by r196737 as part of fixing PR56481,
and it seems obsolete especially in light of P2280R4. There doesn't
seem to be a good reason that we need to handle *this specially from
other dereferences.
This patch therefore removes this special handling. As a side benefit
we now correctly reject some *reinterpret_cast<...>(...) constructs
earlier, via p_c_e_1 rather than via constexpr evaluation (because the
removed STRIP_NOPS step meant we'd overlook such casts), which causes a
couple of diagnostic changes (for the better).
PR c++/118249
gcc/cp/ChangeLog:
* constexpr.cc (potential_constant_expression_1)
<case INDIRECT_REF>: Remove obsolete *this handling.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/constexpr-reinterpret2.C: Expect error at
call site of the non-constexpr functions.
* g++.dg/cpp23/constexpr-nonlit12.C: Likewise.
* g++.dg/cpp0x/constexpr-ref14.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
gcc/ChangeLog
PR middle-end/112589
* common.opt (-fcf-protection): Add documentation string.
* doc/invoke.texi (Option Summary): Add entry for -fcf-protection
without argument.
(Instrumentation Options): Tidy the -fcf-protection entry and
add documentation for the form without an argument.
|
|
This conditionally adds a path for libgcobol when that contains
libgcobol.spec.
gcc/testsuite/ChangeLog:
* lib/cobol.exp: Conditionally add a path for libgcobol.spec.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
|
|
Adds support for using a library spec file (e.g. to include the
target requirements for non-standard libraries - or even libm
which we can now configure at the target side).
gcc/cobol/ChangeLog:
* gcobolspec.cc (SPEC_FILE): New.
(lang_specific_driver): Make the 'need libgcobol' flag global
so that the prelink callback can use it. Libm use is now handled
via the library spec.
(lang_specific_pre_link): Include libgcobol.spec where needed.
libgcobol/ChangeLog:
* Makefile.am: Add libgcobol.spec and dependency.
* Makefile.in: Regenerate.
* configure: Regenerate.
* configure.ac: Add libgcobol.spec handling.
* libgcobol.spec.in: New file.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
|
|
In this testcase, the use of __FUNCTION__ is within a function parameter
scope, the lambda's. And P1787 changed __func__ to live in the parameter
scope. But [basic.scope.pdecl] says that the point of declaration of
__func__ is immediately before {, so in the trailing return type it isn't in
scope yet, so this __FUNCTION__ should refer to foo().
Looking first for a block scope, then a function parameter scope, gives us
the right result.
PR c++/118629
gcc/cp/ChangeLog:
* name-lookup.cc (pushdecl_outermost_localscope): Look for an
sk_block.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/lambda/lambda-__func__3.C: New test.
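The lookup order described in this commit (first a block scope, then a
function parameter scope) can be sketched on a toy scope chain. Scope
and outermost_local_scope are hypothetical names used for illustration
only; the real code works on cp_binding_level and may differ.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

enum class Scope { Block, FunctionParms, Namespace };

// CHAIN lists the enclosing scopes, innermost first.  Skip outwards to
// the first block scope, then keep going until a function parameter
// scope is reached, remembering the last scope seen before it.
// Returns the index of that scope, or nothing if there is none.
std::optional<std::size_t>
outermost_local_scope (const std::vector<Scope> &chain)
{
  std::size_t n = 0;
  while (n < chain.size () && chain[n] != Scope::Block)
    ++n;                              // e.g. skip a lambda's parameter scope
  std::optional<std::size_t> found;
  for (; n < chain.size () && chain[n] != Scope::FunctionParms; ++n)
    found = n;
  return found;
}
```

For a trailing return type of a lambda inside foo(), the chain in this
model is { FunctionParms, Block, FunctionParms, Namespace } and the
result is index 1, the body of foo(). Inside the lambda body the chain
starts with a Block and the result is index 0.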
|
|
When adding TU_LOCAL_ENTITY in r15-6379 I neglected to add it to
cp_tree_node_structure, so garbage collection was crashing on it.
PR c++/119564
gcc/cp/ChangeLog:
* decl.cc (cp_tree_node_structure): Add TU_LOCAL_ENTITY; fix
formatting.
gcc/testsuite/ChangeLog:
* g++.dg/modules/gc-3_a.C: New test.
* g++.dg/modules/gc-3_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
Modules streaming walks decls multiple times, first as a non-streaming
walk to find dependencies, and then later to actually emit the decls.
The first walk needs to be done to note locations that will be emitted.
In the PR we are getting a checking ICE because we are streaming a decl
that we didn't initially walk when collecting dependencies, so the
location isn't in the noted locations map. This is because in decl_node
we have a branch where a PARM_DECL that hasn't previously been
referenced gets walked by value only if 'streaming_p ()' is true.
The true root cause here is that the decltype(v) in the testcase refers
to a different PARM_DECL from the one in the declaration that we're
streaming; it's the PARM_DECL from the initial forward-declaration, that
we're not streaming. A proper fix would be to ensure that it gets
remapped to the decl in the definition we're actually emitting, but for
now this workaround fixes the bug (and any other bugs that might
manifest similarly).
PR c++/119608
gcc/cp/ChangeLog:
* module.cc (trees_out::decl_node): Maybe require by-value
walking not just when streaming.
gcc/testsuite/ChangeLog:
* g++.dg/modules/pr119608_a.C: New test.
* g++.dg/modules/pr119608_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
In the linked PR, we're importing over a DECL_MAYBE_DELETED decl with a
decl that has already been instantiated. This patch ensures that the
needed bits are propagated across and that DECL_MAYBE_DELETED is cleared
from the existing decl, so that later synthesize_method doesn't crash
due to a definition unexpectedly already existing.
PR c++/119462
gcc/cp/ChangeLog:
* module.cc (trees_in::is_matching_decl): Propagate exception
spec and constexpr to DECL_MAYBE_DELETED; clear if appropriate.
gcc/testsuite/ChangeLog:
* g++.dg/modules/noexcept-3_a.C: New test.
* g++.dg/modules/noexcept-3_b.C: New test.
* g++.dg/modules/noexcept-3_c.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
This fix reverts the recent cobol_langhook_post_options change setting
flag_strict_aliasing = 0. It isn't necessary.
gcc/cobol
* cobol1.cc: Eliminate cobol_langhook_post_options.
* symbols.cc: Definition of RETURN-CODE special register sets
::attr member to signable_e.
|
|
* gcc.pot: Regenerate.
|
|
Derived from cobolworx UAT run_functions.at.
gcc/testsuite
* cobol.dg/group2/call_subprogram_using_pointer__passing_pointer.cob: New testcase.
* cobol.dg/group2/FUNCTION_ABS.cob: Likewise.
* cobol.dg/group2/FUNCTION_ACOS.cob: Likewise.
* cobol.dg/group2/FUNCTION_ALL_INTRINSIC_simple_test.cob: Likewise.
* cobol.dg/group2/FUNCTION_ANNUITY.cob: Likewise.
* cobol.dg/group2/FUNCTION_as_CALL_parameter_BY_CONTENT.cob: Likewise.
* cobol.dg/group2/FUNCTION_ASIN.cob: Likewise.
* cobol.dg/group2/FUNCTION_ATAN.cob: Likewise.
* cobol.dg/group2/FUNCTION_BIGGER-POINTER__2_.cob: Likewise.
* cobol.dg/group2/FUNCTION_BIGGER-POINTER.cob: Likewise.
* cobol.dg/group2/FUNCTION_BYTE-LENGTH.cob: Likewise.
* cobol.dg/group2/FUNCTION_CHAR.cob: Likewise.
* cobol.dg/group2/FUNCTION_COMBINED-DATETIME.cob: Likewise.
* cobol.dg/group2/FUNCTION_CONCAT___CONCATENATE.cob: Likewise.
* cobol.dg/group2/FUNCTION_CONCAT_with_reference_modding.cob: Likewise.
* cobol.dg/group2/FUNCTION_COS.cob: Likewise.
* cobol.dg/group2/FUNCTION_CURRENT-DATE.cob: Likewise.
* cobol.dg/group2/FUNCTION_DATE-OF-INTEGER.cob: Likewise.
* cobol.dg/group2/FUNCTION_DATE___TIME_OMNIBUS.cob: Likewise.
* cobol.dg/group2/FUNCTION_DATE-TO-YYYYMMDD.cob: Likewise.
* cobol.dg/group2/FUNCTION_DAY-OF-INTEGER.cob: Likewise.
* cobol.dg/group2/FUNCTION_DAY-TO-YYYYDDD.cob: Likewise.
* cobol.dg/group2/FUNCTION_E.cob: Likewise.
* cobol.dg/group2/FUNCTION_EXCEPTION-FILE.cob: Likewise.
* cobol.dg/group2/FUNCTION_EXCEPTION-STATEMENT.cob: Likewise.
* cobol.dg/group2/FUNCTION_EXCEPTION-STATUS.cob: Likewise.
* cobol.dg/group2/FUNCTION_EXP10.cob: Likewise.
* cobol.dg/group2/FUNCTION_EXP.cob: Likewise.
* cobol.dg/group2/FUNCTION_FACTORIAL.cob: Likewise.
* cobol.dg/group2/FUNCTION_FORMATTED-DATE.cob: Likewise.
* cobol.dg/group2/FUNCTION_FORMATTED-DATETIME.cob: Likewise.
* cobol.dg/group2/FUNCTION_FORMATTED-DATE_TIME_DATETIME.cob: Likewise.
* cobol.dg/group2/FUNCTION_FORMATTED-DATETIME_with_ref_modding.cob: Likewise.
* cobol.dg/group2/FUNCTION_FORMATTED-DATE_with_ref_modding.cob: Likewise.
* cobol.dg/group2/FUNCTION_FORMATTED-TIME_DP.COMMA.cob: Likewise.
* cobol.dg/group2/FUNCTION_FORMATTED-TIME_with_ref_modding.cob: Likewise.
* cobol.dg/group2/FUNCTION_FRACTION-PART.cob: Likewise.
* cobol.dg/group2/FUNCTION_HEX-OF.cob: Likewise.
* cobol.dg/group2/FUNCTION_HIGHEST-ALGEBRAIC.cob: Likewise.
* cobol.dg/group2/FUNCTION_INTEGER.cob: Likewise.
* cobol.dg/group2/FUNCTION_INTEGER-OF-DATE.cob: Likewise.
* cobol.dg/group2/FUNCTION_INTEGER-OF-DAY.cob: Likewise.
* cobol.dg/group2/FUNCTION_INTEGER-OF-FORMATTED-DATE.cob: Likewise.
* cobol.dg/group2/FUNCTION_INTEGER-PART.cob: Likewise.
* cobol.dg/group2/FUNCTION_LENGTH__1_.cob: Likewise.
* cobol.dg/group2/FUNCTION_LENGTH__2_.cob: Likewise.
* cobol.dg/group2/FUNCTION_LOCALE-COMPARE.cob: Likewise.
* cobol.dg/group2/FUNCTION_LOCALE-DATE.cob: Likewise.
* cobol.dg/group2/FUNCTION_LOCALE-TIME.cob: Likewise.
* cobol.dg/group2/FUNCTION_LOCALE-TIME-FROM-SECONDS.cob: Likewise.
* cobol.dg/group2/FUNCTION_LOG10.cob: Likewise.
* cobol.dg/group2/FUNCTION_LOG.cob: Likewise.
* cobol.dg/group2/FUNCTION_LOWER-CASE.cob: Likewise.
* cobol.dg/group2/FUNCTION_LOWER-CASE_with_reference_modding.cob: Likewise.
* cobol.dg/group2/FUNCTION_LOWEST-ALGEBRAIC.cob: Likewise.
* cobol.dg/group2/FUNCTION_MAX.cob: Likewise.
* cobol.dg/group2/FUNCTION_MEAN.cob: Likewise.
* cobol.dg/group2/FUNCTION_MEDIAN.cob: Likewise.
* cobol.dg/group2/FUNCTION_MIDRANGE.cob: Likewise.
* cobol.dg/group2/FUNCTION_MIN.cob: Likewise.
* cobol.dg/group2/FUNCTION_MOD__invalid_.cob: Likewise.
* cobol.dg/group2/FUNCTION_MODULE-NAME.cob: Likewise.
* cobol.dg/group2/FUNCTION_MOD__valid_.cob: Likewise.
* cobol.dg/group2/FUNCTION_NUMVAL-C.cob: Likewise.
* cobol.dg/group2/FUNCTION_NUMVAL-C_DP.COMMA.cob: Likewise.
* cobol.dg/group2/FUNCTION_NUMVAL.cob: Likewise.
* cobol.dg/group2/FUNCTION_NUMVAL-F.cob: Likewise.
* cobol.dg/group2/FUNCTION_ORD.cob: Likewise.
* cobol.dg/group2/FUNCTION_ORD-MAX.cob: Likewise.
* cobol.dg/group2/FUNCTION_ORD-MIN.cob: Likewise.
* cobol.dg/group2/FUNCTION_PI.cob: Likewise.
* cobol.dg/group2/FUNCTION_PRESENT-VALUE.cob: Likewise.
* cobol.dg/group2/FUNCTION_RANDOM.cob: Likewise.
* cobol.dg/group2/FUNCTION_RANGE.cob: Likewise.
* cobol.dg/group2/FUNCTION_REM__invalid_.cob: Likewise.
* cobol.dg/group2/FUNCTION_REM__valid_.cob: Likewise.
* cobol.dg/group2/FUNCTION_REVERSE.cob: Likewise.
* cobol.dg/group2/FUNCTION_REVERSE_with_reference_modding.cob: Likewise.
* cobol.dg/group2/FUNCTION_SECONDS-FROM-FORMATTED-TIME.cob: Likewise.
* cobol.dg/group2/FUNCTION_SECONDS-PAST-MIDNIGHT.cob: Likewise.
* cobol.dg/group2/FUNCTION_SIGN.cob: Likewise.
* cobol.dg/group2/FUNCTION_SIN.cob: Likewise.
* cobol.dg/group2/FUNCTION_SQRT.cob: Likewise.
* cobol.dg/group2/FUNCTION_STANDARD-DEVIATION.cob: Likewise.
* cobol.dg/group2/FUNCTION_SUBSTITUTE-CASE.cob: Likewise.
* cobol.dg/group2/FUNCTION_SUBSTITUTE-CASE_with_reference_mod.cob: Likewise.
* cobol.dg/group2/FUNCTION_SUBSTITUTE.cob: Likewise.
* cobol.dg/group2/FUNCTION_SUBSTITUTE_with_reference_modding.cob: Likewise.
* cobol.dg/group2/FUNCTION_SUM.cob: Likewise.
* cobol.dg/group2/FUNCTION_TAN.cob: Likewise.
* cobol.dg/group2/FUNCTION_TEST-DATE-YYYYMMDD.cob: Likewise.
* cobol.dg/group2/FUNCTION_TEST-DAY-YYYYDDD__1_.cob: Likewise.
* cobol.dg/group2/FUNCTION_TEST-DAY-YYYYDDD__2_.cob: Likewise.
* cobol.dg/group2/FUNCTION_TEST-FORMATTED-DATETIME_additional.cob: Likewise.
* cobol.dg/group2/FUNCTION_TEST-FORMATTED-DATETIME_DP.COMMA.cob: Likewise.
* cobol.dg/group2/FUNCTION_TEST-FORMATTED-DATETIME_with_dates.cob: Likewise.
* cobol.dg/group2/FUNCTION_TEST-FORMATTED-DATETIME_with_datetimes.cob: Likewise.
* cobol.dg/group2/FUNCTION_TEST-FORMATTED-DATETIME_with_times.cob: Likewise.
* cobol.dg/group2/FUNCTION_TEST-NUMVAL-C.cob: Likewise.
* cobol.dg/group2/FUNCTION_TEST-NUMVAL.cob: Likewise.
* cobol.dg/group2/FUNCTION_TEST-NUMVAL-F.cob: Likewise.
* cobol.dg/group2/FUNCTION_TRIM.cob: Likewise.
* cobol.dg/group2/FUNCTION_TRIM_with_reference_modding.cob: Likewise.
* cobol.dg/group2/FUNCTION_TRIM_zero_length.cob: Likewise.
* cobol.dg/group2/FUNCTION_UPPER-CASE.cob: Likewise.
* cobol.dg/group2/FUNCTION_UPPER-CASE_with_reference_modding.cob: Likewise.
* cobol.dg/group2/FUNCTION_VARIANCE.cob: Likewise.
* cobol.dg/group2/FUNCTION_WHEN-COMPILED.cob: Likewise.
* cobol.dg/group2/FUNCTION_YEAR-TO-YYYY.cob: Likewise.
* cobol.dg/group2/Intrinsics_without_FUNCTION_keyword__2_.cob: Likewise.
* cobol.dg/group2/Program-to-program_parameters_and_retvals.cob: Likewise.
* cobol.dg/group2/Recursive_FUNCTION_with_local-storage.cob: Likewise.
* cobol.dg/group2/Repository_functions_clause.cob: Likewise.
* cobol.dg/group2/UDF_fibonacci_recursion.cob: Likewise.
* cobol.dg/group2/UDF_in_COMPUTE.cob: Likewise.
* cobol.dg/group2/UDF_RETURNING_group_and_PIC_9_5_.cob: Likewise.
* cobol.dg/group2/UDF_with_recursion.cob: Likewise.
* cobol.dg/group2/call_subprogram_using_pointer__passing_pointer.out: New known-good file.
* cobol.dg/group2/FUNCTION_ABS.out: Likewise.
* cobol.dg/group2/FUNCTION_ALL_INTRINSIC_simple_test.out: Likewise.
* cobol.dg/group2/FUNCTION_as_CALL_parameter_BY_CONTENT.out: Likewise.
* cobol.dg/group2/FUNCTION_BIGGER-POINTER__2_.out: Likewise.
* cobol.dg/group2/FUNCTION_BIGGER-POINTER.out: Likewise.
* cobol.dg/group2/FUNCTION_BYTE-LENGTH.out: Likewise.
* cobol.dg/group2/FUNCTION_EXCEPTION-FILE.out: Likewise.
* cobol.dg/group2/FUNCTION_EXCEPTION-STATEMENT.out: Likewise.
* cobol.dg/group2/FUNCTION_EXCEPTION-STATUS.out: Likewise.
* cobol.dg/group2/FUNCTION_FORMATTED-DATE_TIME_DATETIME.out: Likewise.
* cobol.dg/group2/FUNCTION_HEX-OF.out: Likewise.
* cobol.dg/group2/FUNCTION_LENGTH__2_.out: Likewise.
* cobol.dg/group2/FUNCTION_LOCALE-DATE.out: Likewise.
* cobol.dg/group2/FUNCTION_LOCALE-TIME-FROM-SECONDS.out: Likewise.
* cobol.dg/group2/FUNCTION_LOCALE-TIME.out: Likewise.
* cobol.dg/group2/FUNCTION_MAX.out: Likewise.
* cobol.dg/group2/FUNCTION_MEAN.out: Likewise.
* cobol.dg/group2/FUNCTION_MEDIAN.out: Likewise.
* cobol.dg/group2/FUNCTION_MIDRANGE.out: Likewise.
* cobol.dg/group2/FUNCTION_MIN.out: Likewise.
* cobol.dg/group2/FUNCTION_MODULE-NAME.out: Likewise.
* cobol.dg/group2/FUNCTION_NUMVAL-F.out: Likewise.
* cobol.dg/group2/FUNCTION_ORD-MAX.out: Likewise.
* cobol.dg/group2/FUNCTION_ORD-MIN.out: Likewise.
* cobol.dg/group2/FUNCTION_ORD.out: Likewise.
* cobol.dg/group2/FUNCTION_PRESENT-VALUE.out: Likewise.
* cobol.dg/group2/FUNCTION_SUBSTITUTE.out: Likewise.
* cobol.dg/group2/FUNCTION_TEST-DATE-YYYYMMDD.out: Likewise.
* cobol.dg/group2/FUNCTION_TEST-DAY-YYYYDDD__1_.out: Likewise.
* cobol.dg/group2/FUNCTION_TRIM.out: Likewise.
* cobol.dg/group2/FUNCTION_TRIM_with_reference_modding.out: Likewise.
* cobol.dg/group2/FUNCTION_TRIM_zero_length.out: Likewise.
* cobol.dg/group2/Program-to-program_parameters_and_retvals.out: Likewise.
* cobol.dg/group2/Recursive_FUNCTION_with_local-storage.out: Likewise.
* cobol.dg/group2/Repository_functions_clause.out: Likewise.
* cobol.dg/group2/UDF_fibonacci_recursion.out: Likewise.
* cobol.dg/group2/UDF_in_COMPUTE.out: Likewise.
* cobol.dg/group2/UDF_RETURNING_group_and_PIC_9_5_.out: Likewise.
* cobol.dg/group2/UDF_with_recursion.out: Likewise.
|
|
Since r10-7441 we set processing_template_decl in a requires-expression so
that we can use tsubst_expr to evaluate the requirements, but that confuses
lambdas terribly; begin_lambda_type silently returns error_mark_node and we
continue into other failures. This patch clears processing_template_decl
again while we're defining the closure and op() function, so it only remains
set while parsing the introducer (i.e. any init-captures) and building the
resulting object. This properly avoids trying to create another lambda in
tsubst_lambda_expr.
PR c++/99546
PR c++/113925
PR c++/106976
PR c++/109961
PR c++/117336
gcc/cp/ChangeLog:
* lambda.cc (build_lambda_object): Handle fake
requires-expr processing_template_decl.
* parser.cc (cp_parser_lambda_expression): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/lambda-requires2.C: New test.
* g++.dg/cpp2a/lambda-requires3.C: New test.
* g++.dg/cpp2a/lambda-requires4.C: New test.
* g++.dg/cpp2a/lambda-requires5.C: New test.
|
|
I can reproduce a really weird error in our distro i686 trunk gcc
(but haven't managed to reproduce it with vanilla trunk yet).
echo 'void foo (void) {}' > a.c; gcc -O2 -flto=auto -m32 -march=i686 -ffat-lto-objects -fhardened -o a.o -c a.c; gcc -O2 -flto=auto -m32 -march=i686 -r -o a.lo a.o
lto1: fatal error: open failed: No such file or directory
compilation terminated.
lto-wrapper: fatal error: gcc returned 1 exit status
The error is because
cat ./a.lo.lto.o-args.0
""
a.o
My suspicion is that this "" in there is caused by weird .gnu.lto_.opts
section content during
gcc -O2 -flto=auto -m32 -march=i686 -ffat-lto-objects -fhardened -S -o a.s -c a.c
compilation (and I can reproduce that one with vanilla trunk).
The above results in
.section .gnu.lto_.opts,"e",@progbits
.string "'-fno-openmp' '-fno-openacc' '-fPIC' '' '-m32' '-march=i686' '-O2' '-flto=auto' '-ffat-lto-objects'"
There are two weird things, one (IMHO the cause of the "" later on) is
the '' part, I think it comes from lto_write_options doing
append_to_collect_gcc_options (&temporary_obstack, &first_p, "");
IMHO it shouldn't call append_to_collect_gcc_options at all for that case.
The -fhardened option causes global_options.x_flag_cf_protection
to be set to CF_FULL and later on the backend option processing
sets it to CF_FULL | CF_SET (i.e. 7, a value not handled in
lto_write_options).
The following patch fixes it by not emitting anything there if
flag_cf_protection is one of the unhandled values.
Perhaps it could incrementally use
switch (global_options.x_flag_cf_protection & ~CF_SET)
instead, dunno.
And the other problem is that the -fPIC in there is really weird.
Our distro compiler or vanilla configured trunk certainly doesn't
default to -fPIC and -fhardened uses -fPIE when
-fPIC/-fpic/-fno-pie/-fno-pic is not specified, so I was expecting
-fPIE in there.
The thing is that the -fpie option causes setting of both
global_options.x_flag_pi{c,e} to 1, -fPIE both to 2:
/* If -fPIE or -fpie is used, turn on PIC. */
if (opts->x_flag_pie)
opts->x_flag_pic = opts->x_flag_pie;
else if (opts->x_flag_pic == -1)
opts->x_flag_pic = 0;
if (opts->x_flag_pic && !opts->x_flag_pie)
opts->x_flag_shlib = 1;
so checking first for flag_pic == 2 and then flag_pic == 1
and only afterwards for flag_pie means we never print
-fPIE/-fpie.
Or do you want something further (like
switch (global_options.x_flag_cf_protection & ~CF_SET)
)?
2025-04-04 Jakub Jelinek <jakub@redhat.com>
PR lto/119625
* lto-opts.cc (lto_write_options): If neither flag_pic nor
flag_pie are set, check first for flag_pie and only later
for flag_pic rather than the other way around, use a temporary
variable. If flag_cf_protection is not set, don't append anything
if flag_cf_protection is none of CF_{NONE,FULL,BRANCH,RETURN} and
use a temporary variable.
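
A minimal standalone sketch of the two selection rules described above, not the actual lto-opts.cc code. The helper names pic_option and cf_protection_option are hypothetical. The enum values are assumptions chosen to match the message (CF_FULL | CF_SET is 7). What the real code records when neither PIC nor PIE is enabled is outside this sketch, so the helper returns an empty string there.

```cpp
#include <cassert>
#include <string>

// Assumed bit layout, consistent with "CF_FULL | CF_SET (i.e. 7)".
enum cf_protection_level : unsigned {
  CF_NONE = 0,
  CF_BRANCH = 1u << 0,
  CF_RETURN = 1u << 1,
  CF_FULL = CF_BRANCH | CF_RETURN,
  CF_SET = 1u << 2
};

// Hypothetical helper: pick the PIC/PIE option to record.
// -fpie/-fPIE also set flag_pic, so flag_pie has to be tested first;
// otherwise the -fPIE/-fpie branches can never be reached.
std::string pic_option(int flag_pic, int flag_pie) {
  if (flag_pie == 2) return "-fPIE";
  if (flag_pie == 1) return "-fpie";
  if (flag_pic == 2) return "-fPIC";
  if (flag_pic == 1) return "-fpic";
  return "";  // neither enabled: not modelled in this sketch
}

// Hypothetical helper: pick the -fcf-protection option to record.
// Unhandled values (for example CF_FULL | CF_SET) yield an empty string,
// meaning "append nothing" rather than appending an empty argument.
std::string cf_protection_option(unsigned flag) {
  switch (flag) {
    case CF_NONE:   return "-fcf-protection=none";
    case CF_FULL:   return "-fcf-protection=full";
    case CF_BRANCH: return "-fcf-protection=branch";
    case CF_RETURN: return "-fcf-protection=return";
    default:        return "";
  }
}
```

With these helpers, pic_option (2, 2) models -fPIE and yields "-fPIE", while cf_protection_option (CF_FULL | CF_SET) yields nothing. Masking first, as in cf_protection_option ((CF_FULL | CF_SET) & ~CF_SET), corresponds to the incremental alternative mentioned in the message.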
|
|
As the following testcase shows, sometimes we can have debug stmts
after a musttail call and profile.cc in that case would incorrectly
allow the edge from that, causing musttail error and -fcompare-debug
failure (because if there are no debug stmts after it, then musttail
is found there and the edge is ignored).
The following patch uses gsi_last_nondebug_bb instead of gsi_last_bb
to find the musttail call. And so that we don't uselessly skip over
debug stmts at the end of many bbs, the patch limits it to
cfun->has_musttail functions.
2025-04-04 Jakub Jelinek <jakub@redhat.com>
PR gcov-profile/119618
* profile.cc (branch_prob): Only check for musttail calls if
cfun->has_musttail. Use gsi_last_nondebug_bb instead of gsi_last_bb.
* c-c++-common/pr119618.c: New test.
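
The difference between the two lookups can be illustrated with a toy model. This is only a sketch: Stmt, last_stmt, last_nondebug_stmt and ends_in_musttail are hypothetical stand-ins for GIMPLE statements, gsi_last_bb, gsi_last_nondebug_bb and the check in branch_prob; none of it is GCC code.

```cpp
#include <cassert>
#include <vector>

// Toy statement: only the two properties relevant here are modelled.
struct Stmt {
  bool is_debug;
  bool is_musttail_call;
};

// Stand-in for gsi_last_bb: the very last statement, debug or not.
const Stmt* last_stmt(const std::vector<Stmt>& bb) {
  return bb.empty() ? nullptr : &bb.back();
}

// Stand-in for gsi_last_nondebug_bb: skip trailing debug statements.
const Stmt* last_nondebug_stmt(const std::vector<Stmt>& bb) {
  for (auto it = bb.rbegin(); it != bb.rend(); ++it)
    if (!it->is_debug)
      return &*it;
  return nullptr;
}

// Only walk backwards over debug statements when the function is known
// to contain a musttail call at all (models cfun->has_musttail).
bool ends_in_musttail(const std::vector<Stmt>& bb, bool fn_has_musttail) {
  if (!fn_has_musttail)
    return false;
  const Stmt* s = last_nondebug_stmt(bb);
  return s != nullptr && s->is_musttail_call;
}
```

For a block consisting of an ordinary statement, a musttail call and a trailing debug statement, last_stmt returns the debug statement, so the call is missed; ends_in_musttail finds it, which keeps the result the same with and without debug statements.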
|
|
Before my PR119376 r15-9145 changes, suitable_for_tail_call_opt_p would
return the same value in the same caller, regardless of the calls in it.
If it fails, the caller clears opt_tailcalls which is a reference and
therefore shared by all calls in the caller and we only do tail recursion,
all non-recursive or tail recursion non-optimizable calls are not
tail call optimized.
For musttail calls we want to allow address taken parameters, but the
r15-9145 change effectively resulted in the behavior where if there
are just musttail calls considered, they will be tail call optimized,
and if there are also other tail call candidates (without musttail),
we clear opt_tailcalls and then error out on all the musttail calls.
The following patch fixes that by moving the address taken parameter
discovery from suitable_for_tail_call_opt_p to its single caller.
If there are addressable parameters, if !cfun->has_musttail it will
work as before, disable all tail calls in the caller but possibly
allow tail recursions. If cfun->has_musttail, it will set a new
bool automatic flag and reject non-tail recursions. This way musttail
calls can be still accepted and normal tail call candidates rejected
(and tail recursions accepted).
2025-04-04 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/119616
* tree-tailcall.cc (suitable_for_tail_call_opt_p): Move checking
for addressable parameters from here ...
(find_tail_calls): ... here. If cfun->has_musttail, don't clear
opt_tailcalls for it, instead set a local flag and punt if we can't
tail recurse optimize it.
* c-c++-common/pr119616.c: New test.
|
|
In PR119491 r15-9154 I've allowed some useless EH regions for musttail
calls (if there are no non-debug/clobber stmts before resx which resumes
external throwing).
Now, for -O1+ (but not -O0/-Og) there is a cleanup_eh pass after it
which should optimize that away.
The following testcase ICEs at -O0 though, the cleanup_eh in that case
is before the musttail pass and dunno why it didn't actually optimize
it away.
The following patch catches that during expansion and just removes the note,
which causes EH cleanups to do the rest. A tail call, even when it throws,
will not throw while the musttail caller's frame is still on the stack,
will throw after that and so REG_EH_REGION for it is irrelevant (like it
would be never set before the r15-9154 changes).
2025-04-04 Jakub Jelinek <jakub@redhat.com>
PR middle-end/119613
* cfgrtl.cc (purge_dead_edges): Remove REG_EH_REGION notes from
tail calls.
* g++.dg/opt/pr119613.C: New test.
|
|
Testcases compiled with -Os were failing because static functions and static
variables were being optimized away, because of improper data type casts, and
because strict aliasing (whatever that is) was resulting in some loss of data.
These changes eliminate those known problems.
gcc/cobol
* cobol1.cc: (cobol_langhook_post_options): Implemented in order to set
flag_strict_aliasing to zero.
* genapi.cc: (set_user_status): Add comment.
(parser_intrinsic_subst): Expand SHOW_PARSE information.
(psa_global): Change names of return-code and upsi globals.
(psa_FldLiteralA): Set DECL_PRESERVE_P for FldLiteralA.
* gengen.cc: (show_type): Add POINTER type.
(gg_define_function_with_no_parameters): Set DECL_PRESERVE_P for COBOL-
style nested programs. (gg_array_of_bytes): Fix bad cast.
libgcobol
* charmaps.h: Change __gg__data_return_code to 'short' type.
* constants.cc: Likewise.
|
|
Below is an attempt to fix up RTX costing P1 caused by r15-775
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/thread.html#652446
@@ -21562,7 +21562,8 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
if (x86_64_immediate_operand (x, VOIDmode))
*total = 0;
else
- *total = 1;
+ /* movabsq is slightly more expensive than a simple instruction. */
+ *total = COSTS_N_INSNS (1) + 1;
return true;
case CONST_DOUBLE:
change. In my understanding this was partially trying to workaround
weird code in pattern_cost, which uses
return cost > 0 ? cost : COSTS_N_INSNS (1);
That doesn't make sense to me. All costs smaller than COSTS_N_INSNS (1)
mean we need to have at least one instruction there which has the
COSTS_N_INSNS (1) minimal cost. So special casing just cost 0 for the
really cheap immediates which can be used pretty much everywhere but not
ones which have just tiny bit larger cost than that (1, 2 or 3) is just
weird.
So, the following patch changes that to MAX (COSTS_N_INSNS (1), cost)
which doesn't have this weird behavior where set_src_cost 0 is considered
more expensive than set_src_cost 1.
Note, pattern_cost isn't the only spot where costs are computed and normally
we often sum the subcosts of different parts of a pattern or just query
rtx costs of different parts of subexpressions, so the jump from
1 to 5 is quite significant.
Additionally, x86_64 doesn't have just 2 kinds of constants with different
costs, it has 3, signed 32-bit ones are the ones which can appear in
almost all instructions and so using cost of 0 for those looks best,
then unsigned 32-bit ones which can be done with still cheap movl
instruction (and I think some others too) and finally full 64-bit ones
which can be done only with a single movabsq instruction and are quite
costly both in instruction size and even more expensive to execute.
The following patch attempts to restore the behavior of GCC 14 with the
pattern_cost hunk fixed for the unsigned 32-bit ones and only keeps the
bigger cost for the 64-bit ones.
2025-04-04 Jakub Jelinek <jakub@redhat.com>
PR target/115910
* rtlanal.cc (pattern_cost): Return at least COSTS_N_INSNS (1)
rather than just COSTS_N_INSNS (1) for cost <= 0.
* config/i386/i386.cc (ix86_rtx_costs): Set *total to 1 for
TARGET_64BIT x86_64_zext_immediate_operand constants.
* gcc.target/i386/pr115910.c: New test.
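
The cost arithmetic above can be checked with a small standalone sketch. It assumes COSTS_N_INSNS (N) is N * 4, as in GCC's rtl.h, which matches the "jump from 1 to 5" mentioned in the message. The function names are hypothetical, and const_int_cost is a simplified model for plain integer constants only; the real x86_64_immediate_operand and x86_64_zext_immediate_operand predicates also accept other operands.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumption: mirrors GCC's COSTS_N_INSNS (N), i.e. N * 4.
constexpr int costs_n_insns(int n) { return n * 4; }

// Old pattern_cost tail: only cost <= 0 is bumped to one instruction,
// so a cost of 1 ends up looking cheaper than a cost of 0.
int pattern_cost_old(int cost) {
  return cost > 0 ? cost : costs_n_insns(1);
}

// New pattern_cost tail: every pattern costs at least one instruction.
int pattern_cost_new(int cost) {
  return std::max(costs_n_insns(1), cost);
}

// Simplified three-way classification of 64-bit integer constants.
int const_int_cost(std::int64_t v) {
  if (v >= INT32_MIN && v <= INT32_MAX)
    return 0;                    // signed 32-bit: usable almost anywhere
  if (v >= 0 && v <= INT64_C(0xFFFFFFFF))
    return 1;                    // unsigned 32-bit: a cheap movl
  return costs_n_insns(1) + 1;   // full 64-bit: movabsq
}
```

Under these assumptions pattern_cost_old maps 0 to 4 but 1 to 1, whereas pattern_cost_new maps both to 4, so the three constant classes are no longer reordered by the final clamp.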
|
|
Here we wrongly reject the type-requirement at parse time due to its use
of the constraint variable 't' within a template argument (an evaluated
context). Fix this simply by refining the "use of parameter outside
function body" error path to exclude constraint variables.
PR c++/104255 tracks the same issue for function parameters, but fixing
that would be more involved, requiring changes to the PARM_DECL case of
tsubst_expr.
PR c++/117849
gcc/cp/ChangeLog:
* semantics.cc (finish_id_expression_1): Allow use of constraint
variable outside an unevaluated context.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-requires41.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
r8-3988-g356fcc67fba52b added code to turn return statements into __builtin_unreachable
calls inside noreturn functions but only while optimizing. Since -funreachable-traps
was added (r13-1204-gd68d3664253696), it is a good idea to move over to using
__builtin_unreachable (and the trap version with this option which defaults at -O0 and -Og)
instead of just a fall through even at -O0.
This also fixes a regression when inlining a noreturn function that returns at -O0 (due to always_inline)
as we would get an empty bb which has no successor edge instead of one with a call to __builtin_unreachable.
I also noticed there was no testcase testing the warning about __builtin_return inside a noreturn function
so I added a testcase there.
Bootstrapped and tested on x86_64-linux-gnu.
PR ipa/119599
gcc/ChangeLog:
* tree-cfg.cc (pass_warn_function_return::execute): Turn return statements always
into __builtin_unreachable calls.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr119599-1.c: New test.
* gcc.dg/builtin-apply5.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
The libcpp left shift handling implements (partially) the C99-C23
wording where shifts are UB if shift count is negative, or too large,
or shifting left a negative value or shifting left non-negative value
results in something not representable in the result type (in the
preprocessor case that is intmax_t).
libcpp actually implements left shift by negative count as right shifts
by negation of the count and similarly right shifts by negative count
as left shifts by negation (not ok), sets overflow for too large shift
count (ok), doesn't check for negative values on left shift (not ok)
and checks correctly for the non-representable ones otherwise (ok).
Now, C++11 to C++17 has different behavior, whereas in C99-C23 1 << 63
in preprocessor is invalid, in C++11-17 it is valid, but 3 << 63 is
not. The wording is that left shift of negative value is UB (like in C)
and signed non-negative left shift is UB if the result isn't representable
in corresponding unsigned type (so uintmax_t for libcpp).
And then C++20 and newer says all left shifts are well defined with the
exception of bad shift counts.
In -fsanitize=undefined we handle these by
/* For signed x << y, in C99 and later, the following:
(unsigned) x >> (uprecm1 - y)
if non-zero, is undefined. */
and
/* For signed x << y, in C++11 to C++17, the following:
x < 0 || ((unsigned) x >> (uprecm1 - y))
if > 1, is undefined. */
Now, we are late in GCC 15 development, so I think making the preprocessor
more strict than it is now is undesirable, so will defer setting overflow
flag for shifts by a negative count, or for left shifts of a negative value.
The following patch just makes some previously incorrectly rejected or
warned cases valid for C++11-17 and even more for C++20 and later.
2025-04-04 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/119391
* expr.cc (num_lshift): Add pfile argument. Don't set num.overflow
for !num.unsignedp in C++20 or later unless n >= precision. For
C++11 to C++17 set it if orig >> (precision - 1 - n) as logical
shift results in value > 1.
(num_binary_op): Pass pfile to num_lshift.
(num_div_op): Likewise.
* g++.dg/cpp/pr119391.C: New test.
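
The three sets of language rules described above can be summarized in a short sketch. It models the wording of the standards for a 64-bit signed operand (intmax_t in the preprocessor case), not libcpp itself: as noted, libcpp deliberately stays less strict for negative operands and negative counts. The Dialect enum and the function name are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

enum class Dialect { C99, Cxx11, Cxx20 };  // C99-C23, C++11-17, C++20+

// Returns true if the signed shift x << n is not well defined under the
// given rules. Negative shift counts are not modelled (n is unsigned).
bool lshift_overflows(std::int64_t x, unsigned n, Dialect d) {
  const unsigned precision = 64;
  if (n >= precision)
    return true;                  // bad shift count in every dialect
  if (d == Dialect::Cxx20)
    return false;                 // all other left shifts are defined
  if (x < 0)
    return true;                  // negative left operand
  // Bits that would move into or beyond the sign bit, as a logical shift.
  const std::uint64_t high =
      static_cast<std::uint64_t>(x) >> (precision - 1 - n);
  // C: result must fit the signed type, so no such bits may be set.
  // C++11-17: result must fit the unsigned type, so only the bit that
  // lands in the sign position may be set.
  return d == Dialect::C99 ? high != 0 : high > 1;
}
```

This reproduces the examples from the message: 1 << 63 is rejected under the C rules but accepted under C++11 to C++17, while 3 << 63 is rejected under both and accepted only under C++20 and later.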
|
|
On Arm, running
make check-gcc RUNTESTFLAGS="dwarf2.exp=pr43190.c"
with a target list of "arm-qemu{,-mthumb}"
results in no errors. But running it with
make check-gcc RUNTESTFLAGS="{mve,dwarf2}.exp=pr43190.c"
results in unresolved tests while running the thumb variant. The problem
is that mve.exp is changing dg-do-what-default to "assemble", but failing
to restore the original value once its tests are complete. The result is
that all subsequent tests run with an incorrect underlying default value.
The fix is easy - save dg-do-what-default and restore it after the tests
are complete.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/mve.exp: Save dg-do-what-default before
changing it. Restore it once done.
|
|
The implementation solves the eigensystem for a NxN complex Hermitian matrix
by first solving it for a 2Nx2N real symmetric matrix and then interpreting
the 2Nx1 real vectors as Nx1 complex ones, but the last step does not work.
The patch fixes the last step and also performs a small cleanup throughout
the implementation, mostly in the commentary and without functional changes.
gcc/ada/
* libgnat/a-ngcoar.adb (Eigensystem): Adjust notation and fix the
layout of the real symmetric matrix in the main comment. Adjust
the layout of the associated code accordingly and correctly turn
the 2Nx1 real vectors into Nx1 complex ones.
(Eigenvalues): Minor similar tweaks.
* libgnat/a-ngrear.adb (Jacobi): Minor tweaks in the main comment.
Adjust notation and corresponding parameter names of functions.
Fix call to Unit_Matrix routine. Adjust the comment describing
the various kinds of iterations to match the implementation.
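
For reference, the reduction from an NxN complex Hermitian problem to a 2Nx2N real symmetric one can be written out as follows. This shows one standard layout of the real matrix; the notation (A, B, u, v) is chosen here for illustration, and the layout actually used in the comment of a-ngcoar.adb may differ.

```latex
% H is Hermitian: H = A + iB with A^T = A and B^T = -B (A, B real).
% Write a complex eigenvector as z = u + iv with u, v real.
\[
  Hz = \lambda z
  \iff
  (Au - Bv) + i\,(Bu + Av) = \lambda u + i\,\lambda v
  \iff
  \begin{pmatrix} A & -B \\ B & A \end{pmatrix}
  \begin{pmatrix} u \\ v \end{pmatrix}
  = \lambda
  \begin{pmatrix} u \\ v \end{pmatrix}
\]
% The 2N x 2N block matrix is symmetric because A^T = A and B^T = -B.
% Each eigenvalue of H appears twice in the real problem: (u, v) and
% (-v, u) correspond to z and iz. The last step reads the N x 1 complex
% vector back as z_k = u_k + i v_k, pairing entry k with entry N + k.
```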
|
|
As the first two testcases show, even with pointers IPA-VRP can optimize
return values from functions if they have singleton ranges into just the
exact value, so we need to virtually undo that for tail calls similarly
to integers and floats. The third test just adds check that it works
even with floats (which it does).
2025-04-04 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/119614
* tree-tailcall.cc (find_tail_calls): Handle also pointer types in the
IPA-VRP workaround.
* c-c++-common/pr119614-1.c: New test.
* c-c++-common/pr119614-2.c: New test.
* c-c++-common/pr119614-3.c: New test.
|
|
This avoids cases where "File uses too much global constant data" is reported
(for the final executable, or a single object file), and avoids cases of wrong code generation:
"error : State space incorrect for instruction 'st'" ('st.const'), or another
case where an "illegal instruction was encountered", or a lot of cases where
for two compilation units (such as a library linked with user code) we ran into
"error : Memory space doesn't match" due to differences in '.const' usage
between definition and use of a variable.
We progress:
ptxas error : File uses too much global constant data (0x1f01a bytes, 0x10000 max)
nvptx-run: cuLinkAddData failed: a PTX JIT compilation failed (CUDA_ERROR_INVALID_PTX, 218)
... into:
PASS: 20_util/to_chars/103955.cc -std=gnu++17 (test for excess errors)
[-FAIL:-]{+PASS:+} 20_util/to_chars/103955.cc -std=gnu++17 execution test
We progress:
ptxas error : File uses too much global constant data (0x36c65 bytes, 0x10000 max)
nvptx-as: ptxas returned 255 exit status
... into:
[-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c -O0 {+(test for excess errors)+}
[-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c -O1 {+(test for excess errors)+}
[-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c -O2 {+(test for excess errors)+}
[-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c -O3 -g {+(test for excess errors)+}
[-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c -Os {+(test for excess errors)+}
[-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C -O0 (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C -O1 (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C -O2 (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C -O3 -g (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C -Os (test for excess errors)
[-FAIL:-]{+PASS:+} gfortran.dg/bind-c-contiguous-1.f90 -O0 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} gfortran.dg/bind-c-contiguous-1.f90 -O0 [-compilation failed to produce executable-]{+execution test+}
[-FAIL:-]{+PASS:+} gfortran.dg/bind-c-contiguous-4.f90 -O0 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} gfortran.dg/bind-c-contiguous-4.f90 -O0 [-compilation failed to produce executable-]{+execution test+}
[-FAIL:-]{+PASS:+} gfortran.dg/bind-c-contiguous-5.f90 -O0 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} gfortran.dg/bind-c-contiguous-5.f90 -O0 [-compilation failed to produce executable-]{+execution test+}
[-FAIL:-]{+PASS:+} 20_util/to_chars/double.cc -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} 20_util/to_chars/double.cc -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}
[-FAIL:-]{+PASS:+} 20_util/to_chars/float.cc -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} 20_util/to_chars/float.cc -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}
[-FAIL:-]{+PASS:+} special_functions/13_ellint_3/check_value.cc -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} special_functions/13_ellint_3/check_value.cc -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}
[-FAIL:-]{+PASS:+} tr1/5_numerical_facilities/special_functions/14_ellint_3/check_value.cc -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} tr1/5_numerical_facilities/special_functions/14_ellint_3/check_value.cc -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}
..., and progress likewise, but fail later with an unrelated error:
[-FAIL:-]{+PASS:+} ext/special_functions/hyperg/check_value.cc -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+FAIL:+} ext/special_functions/hyperg/check_value.cc -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}
[...]/libstdc++-v3/testsuite/ext/special_functions/hyperg/check_value.cc:12317: void test(const testcase_hyperg<Ret> (&)[Num], Ret) [with Ret = double; unsigned int Num = 19]: Assertion 'max_abs_frac < toler' failed.
..., and:
[-FAIL:-]{+PASS:+} tr1/5_numerical_facilities/special_functions/17_hyperg/check_value.cc -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+FAIL:+} tr1/5_numerical_facilities/special_functions/17_hyperg/check_value.cc -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}
[...]/libstdc++-v3/testsuite/tr1/5_numerical_facilities/special_functions/17_hyperg/check_value.cc:12316: void test(const testcase_hyperg<Ret> (&)[Num], Ret) [with Ret = double; unsigned int Num = 19]: Assertion 'max_abs_frac < toler' failed.
We progress:
nvptx-run: error getting kernel result: an illegal instruction was encountered (CUDA_ERROR_ILLEGAL_INSTRUCTION, 715)
... into:
PASS: g++.dg/cpp1z/inline-var1.C -std=gnu++17 (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/cpp1z/inline-var1.C -std=gnu++17 execution test
PASS: g++.dg/cpp1z/inline-var1.C -std=gnu++20 (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/cpp1z/inline-var1.C -std=gnu++20 execution test
PASS: g++.dg/cpp1z/inline-var1.C -std=gnu++26 (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/cpp1z/inline-var1.C -std=gnu++26 execution test
(A lot of '.const' -> '.global' etc. Haven't researched what the actual
problem was.)
We progress:
ptxas /tmp/cc5TSZZp.o, line 142; error : State space incorrect for instruction 'st'
ptxas /tmp/cc5TSZZp.o, line 174; error : State space incorrect for instruction 'st'
ptxas fatal : Ptx assembly aborted due to errors
nvptx-as: ptxas returned 255 exit status
... into:
[-FAIL:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C -O0 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C -O0 [-compilation failed to produce executable-]{+execution test+}
PASS: g++.dg/torture/builtin-clear-padding-1.C -O1 (test for excess errors)
PASS: g++.dg/torture/builtin-clear-padding-1.C -O1 execution test
[-FAIL:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C -O2 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C -O2 [-compilation failed to produce executable-]{+execution test+}
[-FAIL:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C -O3 -g (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C -O3 -g [-compilation failed to produce executable-]{+execution test+}
[-FAIL:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C -Os (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} g++.dg/torture/builtin-clear-padding-1.C -Os [-compilation failed to produce executable-]{+execution test+}
This indeed tried to write ('st.const') into 's2', which was '.const'
(also: 's1' was '.const') -- even though, no explicit 'const' in
'g++.dg/torture/builtin-clear-padding-1.C'; "interesting".
We progress:
error : Memory space doesn't match for '_ZNSt3tr18__detail12__prime_listE' in 'input file 3 at offset 53085', first specified in 'input file 1 at offset 1924'
nvptx-run: cuLinkAddData failed: device kernel image is invalid (CUDA_ERROR_INVALID_SOURCE, 300)
... into execution test PASS for a few dozen libstdc++ test cases.
We progress:
error : Memory space doesn't match for '_ZNSt6locale17_S_twinned_facetsE' in 'input file 11 at offset 479903', first specified in 'input file 9 at offset 59300'
nvptx-run: cuLinkAddData failed: device kernel image is invalid (CUDA_ERROR_INVALID_SOURCE, 300)
... into:
PASS: g++.dg/tree-ssa/pr20458.C -std=gnu++17 (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C -std=gnu++17 execution test
PASS: g++.dg/tree-ssa/pr20458.C -std=gnu++26 (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C -std=gnu++26 execution test
..., and likewise for a few hundred libstdc++ test cases.
We progress:
error : Memory space doesn't match for '_ZNSt6locale5_Impl19_S_facet_categoriesE' in 'input file 11 at offset 821962', first specified in 'input file 10 at offset 676317'
nvptx-run: cuLinkAddData failed: device kernel image is invalid (CUDA_ERROR_INVALID_SOURCE, 300)
... into execution test PASS for about a hundred libstdc++ test cases.
We progress:
error : Memory space doesn't match for '_ctype_' in 'input file 22 at offset 1698331', first specified in 'input file 9 at offset 57095'
nvptx-run: cuLinkAddData failed: device kernel image is invalid (CUDA_ERROR_INVALID_SOURCE, 300)
... into execution test PASS for another few libstdc++ test cases.
PR target/119573
gcc/
* config/nvptx/nvptx.cc (nvptx_encode_section_info): Don't set
'DATA_AREA_CONST' for 'TREE_CONSTANT', or 'TREE_READONLY'.
(nvptx_asm_declare_constant_name): Use '.global' instead of
'.const'.
gcc/testsuite/
* gcc.c-torture/compile/pr46534.c: Don't 'dg-skip-if' nvptx.
* gcc.target/nvptx/decl.c: Adjust.
libstdc++-v3/
* config/cpu/nvptx/t-nvptx (AM_MAKEFLAGS): Don't amend.
|
|
Compiling the testcase in this PR uses 2.5x more memory and 6x more
time ever since r14-5979 which implements P2280R4. This is because
our speculative constexpr folding now does a lot more work trying to
fold ultimately non-constant calls to constexpr functions, and in turn
produces a lot of garbage. We do sometimes successfully fold more
thanks to P2280R4, but it seems to be trivial stuff like calls to
std::array::size or std::addressof. The benefit of P2280 therefore
doesn't seem worth the cost during speculative constexpr folding, so
this patch restricts the paper to only manifestly-constant evaluation.
PR c++/119387
gcc/cp/ChangeLog:
* constexpr.cc (p2280_active_p): New.
(cxx_eval_constant_expression) <case VAR_DECL>: Use it to
restrict P2280 relaxations.
<case PARM_DECL>: Likewise.
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
The AIX traceback table documentation states the tbtab "lang" field for
Cobol should be set to 7. Use it.
2025-04-03 Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/119308
* config/rs6000/rs6000-logue.cc (rs6000_output_function_epilogue):
Handle GCC COBOL for the tbtab lang field.
|
|
std/format/string.cc and a few other libstdc++ tests were failing with
module std with undefined references to __failed_to_parse_format_spec. This
turned out to be because since r15-8012 we don't end up calling
note_vague_linkage_fn for functions loaded after at_eof is set.
But once import_export_decl decides on COMDAT linkage, we should be able to
just clear DECL_EXTERNAL and let cgraph take it from there.
I initially made this change in import_export_decl, but decided that for GCC
15 it would be safer to limit the change to modules. For GCC 16 I'd like to
do away with DECL_NOT_REALLY_EXTERN entirely, it's been obsolete since
cgraphunit in 2003.
gcc/cp/ChangeLog:
* module.cc (module_state::read_cluster)
(post_load_processing): Clear DECL_EXTERNAL if DECL_COMDAT.
|
|
When considering an op== as a rewrite target, we need to disqualify it if
there is a matching op!= in the same scope. But add_candidates was assuming
that we could use the same set of op!= for all op==, which is wrong if
arg-dep lookup finds op== in multiple namespaces.
This broke 20_util/optional/relops/constrained.cc if the order of the ADL
set changed.
gcc/cp/ChangeLog:
* call.cc (add_candidates): Re-lookup ne_fns if we move into
another namespace.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/spaceship-rewrite6.C: New test.
|
|
Looking over the recently-committed change to the musttail attribute
documentation, it appears the comment in the last example was a paste-o,
as it does not agree with either what the similar example in the
-Wmaybe-musttail-local-addr documentation says, or the actual behavior
observed when running the code.
In addition, the entire section on musttail was in need of copy-editing
to put it in the present tense, avoid reference to "the user", etc. I've
attempted to clean it up here.
gcc/ChangeLog
* doc/extend.texi (Statement Attributes): Copy-edit the musttail
attribute documentation and correct the comment in the last
example.
|