Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
gcc/ChangeLog
PR driver/58973
* common.opt (Werror, Werror=): Use less awkward wording in
description.
(pedantic-errors): Likewise.
* doc/invoke.texi (Warning Options): Likewise for -Werror and
-Werror= here.
Co-Authored-By: GUO Yixuan <culu.gyx@gmail.com>
|
|
This reverts a change in the upstream D implementation of the compiler,
as it is no longer necessary since another fix for opDispatch got
applied in the same area (merged in r12-6003-gfd43568cc54e17).
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd ed17b3e95d.
Reviewed-on: https://github.com/dlang/dmd/pull/21132
|
|
Since r15-9062-g70391e3958db79 we perform vector bitmask initialization
via the vec_duplicate expander directly. This triggered a latent bug in
ours where we missed to mask out the single bit which resulted in an
execution FAIL of pr119114.c
The attached patch adds the 1-masking of the broadcast operand.
PR target/119572
gcc/ChangeLog:
* config/riscv/autovec.md: Mask broadcast value.
|
|
Assuming we have the following variables:
unsigned long long a0, a1;
unsigned int a2;
For the expression:
a0 = (a0 << 50) >> 49; // slli a0, a0, 50 + srli a0, a0, 49
a2 = a1 + a0; // addw a2, a1, a0 + slli a2, a2, 32 + srli a2, a2, 32
In the optimization process of ZBA (combine pass), it would be optimized to:
a2 = a0 << 1 + a1; // sh1add a2, a0, a1 + zext.w a2, a2
This is clearly incorrect, as it overlooks the fact that a0=a0&0x7ffe, meaning
that the bits a0[32:14] are set to zero.
gcc/ChangeLog:
* config/riscv/bitmanip.md: The optimization can only be applied if
the high bit of operands[3] is set to 1.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zba-shNadd-09.c: New test.
* gcc.target/riscv/zba-shNadd-10.c: New test.
|
|
2025-04-02 John David Anglin <danglin@gcc.gnu.org>
gcc/testsuite/ChangeLog:
* g++.dg/modules/pr98893_b.C: xfail __tcf_ZL1b
assembler check on hppa*-*-hpux*.
|
|
2025-04-02 John David Anglin <danglin@gcc.gnu.org>
gcc/testsuite/ChangeLog:
* g++.dg/abi/abi-tag18a.C: Skip on hppa*-*-hpux*.
|
|
2025-04-02 John David Anglin <danglin@gcc.gnu.org>
libstdc++-v3/ChangeLog:
* config/os/hpux/os_defines.h: Only use long long when
__cplusplus >= 201103L.
|
|
[PR119521]
COBOL variables with attribute intermediate_e are being allocated on
the stack frame, but their data was assigned using malloc(), without
a corresponding call to free(). For numerics, the problem is solved
with a fixed allocation of sixteen bytes for the cblc_field_t::data
member (sixteen is big enough for all data types) and with a fixed
allocation of 8,192 bytes for the alphanumeric type.
In use, the intermediate numeric data types are "shrunk" to the minimum
applicable size. The intermediate alphanumerics, generally used as
destination targets for functions, are trimmed as well.
gcc/cobol
PR cobol/119521
* genapi.cc: (parser_division): Change comment.
(parser_symbol_add): Change intermediate_t handling.
* parse.y: Multiple changes to new_alphanumeric() calls.
* parse_ante.h: Establish named constant for date function
calls. Change declaration of new_alphanumeric() function.
* symbols.cc: (new_temporary_impl): Use named constant
for default size of temporary alphanumerics.
* symbols.h: Establish MAXIMUM_ALPHA_LENGTH constant.
libgcobol
PR cobol/119521
* intrinsic.cc: (__gg__reverse): Trim final result for intermediate_e.
* libgcobol.cc: (__gg__adjust_dest_size): Abort on attempt to increase
the size of a result. (__gg__module_name): Formatting.
__gg__reverse(): Resize only intermediates
|
|
This patch addresses a number of issues with the documentation of
- None of the things in this section had @cindex entries [PR114957].
- The document formatting didn't match that of other #pragma
documentation sections.
- The effect of #pragma pack(0) wasn't documented [PR78008].
- There's a long-standing bug [PR60972] reporting that #pragma pack
and the __attribute__(packed) don't get along well. It seems worthwhile
to warn users about that since elsewhere pragmas are cross-referenced
with related or equivalent attributes.
gcc/ChangeLog
PR c/114957
PR c/78008
PR c++/60972
* doc/extend.texi (Structure-Layout Pragmas): Add @cindex
entries and reformat the pragma descriptions to match the markup
used for other pragmas. Document what #pragma pack(0) does.
Add cross-references to similar attributes.
|
|
The following testcases FAIL, because EH cleanup is performed only before
IPA and then right before musttail pass.
At -O2 etc. (except for -O0/-Og) we handle musttail calls in the tailc
pass though, and we can fail at that point because the calls might appear
to throw internal exceptions which just don't do anything interesting
(perhaps have debug statements or clobber statements in them) before they
continue with resume of the exception (i.e. throw it externally).
As Richi said in the PR (and I agree) that moving passes is risky at this
point, the following patch instead teaches the tail{r,c} and musttail
passes to deal with such extra EDGE_EH edges.
It is fairly simple thing, if we see an EDGE_EH edge from the call we
just look up where it lands and if there are no
non-debug/non-clobber/non-label statements before resx which throws
externally, such edge can be ignored for tail call optimization or
tail recursion. At other spots I just need to avoid using
single_succ/single_succ_edge because the bb might have another edge -
EDGE_EH.
To make this less risky, this is done solely for the musttail calls for now.
2025-04-02 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/119491
* tree-tailcall.cc (single_non_eh_succ_edge): New function.
(independent_of_stmt_p): Use single_non_eh_succ_edge (bb)->dest
instead of single_succ (bb).
(empty_eh_cleanup): New function.
(find_tail_calls): Diagnose throwing of exceptions which do not
propagate only if there are no EDGE_EH successor edges. If there are
and the call is musttail, use empty_eh_cleanup to find if the cleanup
is not empty. If not or the call is not musttail, use different
diagnostics. Set is_noreturn even if there are successor edges. Use
single_non_eh_succ_edge (abb) instead of single_succ_edge (abb). Punt
on internal noreturn calls.
(decrease_profile): Don't assert 0 or 1 successor edges.
(eliminate_tail_call): Use
single_non_eh_succ_edge (gsi_bb (t->call_gsi)) instead of
single_succ_edge (gsi_bb (t->call_gsi)).
(tree_optimize_tail_calls_1): Also look into basic blocks with
single succ edge which is EDGE_EH for noreturn musttail calls.
* g++.dg/opt/musttail3.C: New test.
* g++.dg/opt/musttail4.C: New test.
* g++.dg/opt/musttail5.C: New test.
|
|
The following testcase ICEs because c_fully_fold isn't performed on the
arguments of __sanitizer_ptr_{sub,cmp} builtins and so e.g.
C_MAYBE_CONST_EXPR can leak into the gimplifier where it ICEs.
2025-04-02 Jakub Jelinek <jakub@redhat.com>
PR c/119582
* c-typeck.cc (pointer_diff, build_binary_op): Call c_fully_fold on
__sanitizer_ptr_sub or __sanitizer_ptr_cmp arguments.
* gcc.dg/asan/pr119582.c: New test.
|
|
As noted in PR 118965, the initial interop implementation overlooked
the requirement in the OpenMP spec that at least one of the "target"
and "targetsync" modifiers is required in both the interop construct
init clause and the declare variant append_args clause.
Adding the check was fairly straightforward, but it broke about a
gazillion existing test cases. In particular, things like "init (x, y)"
which were previously accepted (and tested for being accepted) aren't
supposed to be allowed by the spec, much less things like "init (target)"
where target was previously interpreted as a variable name instead of a
modifier. Since one of the effects of the change is that at least one
modifier is always required, I found that deleting all the code that was
trying to detect and handle the no-modifier case allowed for better
diagnostics.
gcc/c/ChangeLog
PR middle-end/118965
* c-parser.cc (c_parser_omp_clause_init_modifiers): Adjust
error message.
(c_parser_omp_clause_init): Remove code for recognizing clauses
without modifiers. Diagnose missing target/targetsync modifier.
(c_finish_omp_declare_variant): Diagnose missing target/targetsync
modifier.
gcc/cp/ChangeLog
PR middle-end/118965
* parser.cc (c_parser_omp_clause_init_modifiers): Adjust
error message.
(cp_parser_omp_clause_init): Remove code for recognizing clauses
without modifiers. Diagnose missing target/targetsync modifier.
(cp_finish_omp_declare_variant): Diagnose missing target/targetsync
modifier.
gcc/fortran/ChangeLog
PR middle-end/118965
* openmp.cc (gfc_parser_omp_clause_init_modifiers): Fix some
inconsistent code indentation. Remove code for recognizing
clauses without modifiers. Diagnose prefer_type without a
following paren. Adjust error message for an unrecognized modifier.
Diagnose missing target/targetsync modifier.
(gfc_match_omp_init): Fix more inconsistent code indentation.
gcc/testsuite/ChangeLog
PR middle-end/118965
* c-c++-common/gomp/append-args-1.c: Add target/targetsync
modifiers so tests do what they were previously supposed to do.
Adjust expected output.
* c-c++-common/gomp/append-args-7.c: Likewise.
* c-c++-common/gomp/append-args-8.c: Likewise.
* c-c++-common/gomp/append-args-9.c: Likewise.
* c-c++-common/gomp/interop-1.c: Likewise.
* c-c++-common/gomp/interop-2.c: Likewise.
* c-c++-common/gomp/interop-3.c: Likewise.
* c-c++-common/gomp/interop-4.c: Likewise.
* c-c++-common/gomp/pr118965-1.c: New.
* c-c++-common/gomp/pr118965-2.c: New.
* g++.dg/gomp/append-args-1.C: Add target/targetsync modifiers
and adjust expected output.
* g++.dg/gomp/append-args-2.C: Likewise.
* g++.dg/gomp/append-args-6.C: Likewise.
* g++.dg/gomp/append-args-7.C: Likewise.
* g++.dg/gomp/append-args-8.C: Likewise.
* g++.dg/gomp/interop-5.C: Likewise.
* gfortran.dg/gomp/append_args-1.f90: Add target/targetsync
modifiers and adjust expected output.
* gfortran.dg/gomp/append_args-2.f90: Likewise.
* gfortran.dg/gomp/append_args-3.f90: Likewise.
* gfortran.dg/gomp/append_args-4.f90: Likewise.
* gfortran.dg/gomp/interop-1.f90: Likewise.
* gfortran.dg/gomp/interop-2.f90: Likewise.
* gfortran.dg/gomp/interop-3.f90: Likewise.
* gfortran.dg/gomp/interop-4.f90: Likewise.
* gfortran.dg/gomp/pr118965-1.f90: New.
* gfortran.dg/gomp/pr118965-2.f90: New.
|
|
Darwin from 10.11 needs embedded rpaths to find the correct libraries at
runtime. Recent increases in hardening have made it such that the dynamic
loader will no longer fall back to using an installed libstdc++ when the
(new) linked one is not found. This means we fail configure tests (that
should pass) for runtimes that use C++.
We can resolve this by passing '-B' to the C++ command lines instead of '-L'
(-B implies -L on Darwin, but also causes a corresponding embedded rpath).
ChangeLog:
* configure: Regenerate.
* configure.ac: Use -B instead of -L to specifiy the C++ runtime
paths on Darwin.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
|
|
Darwin's linker now warns when duplicate rpaths are presented - which
happens when we emit duplicate '-B' paths. In principle, we should avoid
this in the test-suite, however at present we tend to have duplicates
because different parts of the machinery add them. At some point, it might
be nice to have an "add_option_if_missing" and apply that across the whole
of the test infra. However this is not something for late in stage 4. So
the solution here is to prune the warning - the effect of the duplicate in
the libstdc++ testsuite is not important; it will make the exes very slightly
larger but it won't alter the paths that are presented for loading the
runtimes.
libstdc++-v3/ChangeLog:
* testsuite/lib/prune.exp: Prune ld warning about duplicatei
rpaths.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
|
|
The following reverts parts of r15-8047 which assesses alignment
analysis for VMAT_STRIDED_SLP is correct by using aligned accesses
where allowed by it. As the PR shows this analysis is still incorrect,
so revert back to assuming we got it wrong.
PR tree-optimization/119586
* tree-vect-stmts.cc (vectorizable_load): Assume we got
alignment analysis for VMAT_STRIDED_SLP wrong.
(vectorizable_store): Likewise.
* gcc.dg/vect/pr119586.c: New testcase.
|
|
mtrr_ioctl() uses long and casts it to a pointer. Fix warnings
for llp64 platforms.
Signed-off-by: Jonathan Yong <10walls@gmail.com>
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/torture/switch-3.c: Fix llp64 warnings.
|
|
On Wed, Apr 02, 2025 at 10:32:20AM +0200, Richard Biener wrote:
> I wonder if we can amend the documentation to suggest to end lifetime
> of variables explicitly by proper scoping?
In the -Wmaybe-musttail-local-addr attribute description I've already
tried to show that in the example, but if you think something like
the following would make it clearer.
2025-04-02 Jakub Jelinek <jakub@redhat.com>
* doc/extend.texi (musttail statement attribute): Hint how
to avoid -Wmaybe-musttail-local-addr warnings.
|
|
The call to std::remove_if used here doesn't remove any elements, it
just overwrites the "removed" elements with later elements, leaving the
total number of elements unchanged. Use std::list::remove_if to actually
remove those unwanted elements from the list.
gcc/cobol/ChangeLog:
* symfind.cc (finalize_symbol_map2): Use std::list::remove_if
instead of std::remove_if.
|
|
instead warn [PR119376]
As discussed here and in bugzilla, [[clang::musttail]] attribute in clang
not just strongly asks for tail call or error, but changes behavior.
To quote:
https://clang.llvm.org/docs/AttributeReference.html#musttail
"The lifetimes of all local variables and function parameters end immediately
before the call to the function. This means that it is undefined behaviour
to pass a pointer or reference to a local variable to the called function,
which is not the case without the attribute. Clang will emit a warning in
common cases where this happens."
The GCC behavior was just to error if we can't prove the musttail callee
could not have dereferenced escaped pointers to local vars or parameters
of the caller. That is still the case for variables with non-trivial
destruction (even in clang), like vars with C++ non-trivial destructors or
variables with cleanup attribute.
The following patch changes the behavior to match that of clang, for all of
[[clang::musttail]], [[gnu::musttail]] and __attribute__((musttail)).
clang 20 actually added warning for some cases of it in
https://github.com/llvm/llvm-project/pull/109255
but it is under -Wreturn-stack-address warning.
Now, gcc doesn't have that warning, but -Wreturn-local-addr instead, and
IMHO it is better to have this under new warnings, because this isn't about
returning local address, but about passing it to a musttail call, or maybe
escaping to a musttail call. And perhaps users will appreciate they can
control it separately as well.
The patch introduces 2 new warnings.
-Wmusttail-local-addr
which is turn on by default and warns for the always dumb cases of passing
an address of a local variable or parameter to musttail call's argument.
And then
-Wmaybe-musttail-local-addr
which is only diagnosed if -Wmusttail-local-addr was not diagnosed and
diagnoses at most one (so that we don't emit 100s of warnings for one call
if 100s of vars can escape) case where an address of a local var could have
escaped to the musttail call. This is less severe, the code doesn't have
to be obviously wrong, so the warning is only enabled in -Wextra.
And I've adjusted also the documentation for this change and addition of
new warnings.
2025-04-02 Jakub Jelinek <jakub@redhat.com>
PR ipa/119376
* common.opt (Wmusttail-local-addr, Wmaybe-musttail-local-addr): New.
* tree-tailcall.cc (suitable_for_tail_call_opt_p): Don't fail for
TREE_ADDRESSABLE PARM_DECLs for musttail calls if diag_musttail.
Emit -Wmusttail-local-addr warnings.
(maybe_error_musttail): Use gimple_location instead of directly
accessing location member.
(find_tail_calls): For musttail calls if diag_musttail, don't fail
if address of local could escape to the call, instead emit
-Wmaybe-musttail-local-addr warnings. Emit
-Wmaybe-musttail-local-addr warnings also for address taken
parameters.
* common.opt.urls: Regenerate.
* doc/extend.texi (musttail statement attribute): Clarify local
variables without non-trivial destruction are considered out of scope
before the tail call instruction.
* doc/invoke.texi (-Wno-musttail-local-addr,
-Wmaybe-musttail-local-addr): Document.
* c-c++-common/musttail8.c: Expect a warning rather than error in one
case.
(f4): Add int * argument.
* c-c++-common/musttail15.c: Don't disallow for C++98.
* c-c++-common/musttail16.c: Likewise.
* c-c++-common/musttail17.c: Likewise.
* c-c++-common/musttail18.c: Likewise.
* c-c++-common/musttail19.c: Likewise. Expect a warning rather than
error in one case.
(f4): Add int * argument.
* c-c++-common/musttail20.c: Don't disallow for C++98.
* c-c++-common/musttail21.c: Likewise.
* c-c++-common/musttail28.c: New test.
* c-c++-common/musttail29.c: New test.
* c-c++-common/musttail30.c: New test.
* c-c++-common/musttail31.c: New test.
* g++.dg/ext/musttail1.C: New test.
* g++.dg/ext/musttail2.C: New test.
* g++.dg/ext/musttail3.C: New test.
|
|
Recent syntactic fixes enabled the test, but the result was failing.
It turns out it was missing a space between the register arguments in
the scan-assembler-times directives.
gcc/testsuite/ChangeLog:
PR target/119556
* gcc.target/arm/short-vfp-1.c: Add missing spaces.
|
|
bitmap_set_bit checks the original value of the bit to return it to the
caller and then only writes the new value back if it changes.
Most callers of bitmap_set_bit don't need the return value, but with the conditional store
the CPU still has to predict it correctly since gcc doesn't know how to do
that without APX on x86 (even though CMOV could do it with a dummy target).
Really if-conversion should handle this case, but for now we can fix
it.
This simple patch improves runtime by 15% for the test case in the PR.
Which is more than I expected given it only has ~1.44% of the cycles, but I guess
the mispredicts caused some down stream effects.
cc1plus-bitmap -std=gnu++20 -O2 pr119482.cc -quiet
ran 1.15 ± 0.01 times faster than cc1plus -std=gnu++20 -O2 pr119482.cc -quiet
At least with this test case the total number of branches decreases
drastically. Even though the mispredict rate goes up slightly it is
still a big win.
$ perf stat -e branches,branch-misses,uncore_imc/cas_count_read/,uncore_imc/cas_count_write/ \
-a ../obj-fast/gcc/cc1plus -std=gnu++20 -O2 pr119482.cc -quiet -w
Performance counter stats for 'system wide':
41,932,957,091 branches
686,117,623 branch-misses # 1.64% of all branches
43,690.47 MiB uncore_imc/cas_count_read/
12,362.56 MiB uncore_imc/cas_count_write/
49.328633365 seconds time elapsed
$ perf stat -e branches,branch-misses,uncore_imc/cas_count_read/,uncore_imc/cas_count_write/ \
-a ../obj-fast/gcc/cc1plus-bitmap -std=gnu++20 -O2 pr119482.cc -quiet -w
Performance counter stats for 'system wide':
37,092,113,179 branches
663,641,708 branch-misses # 1.79% of all branches
43,196.52 MiB uncore_imc/cas_count_read/
12,369.33 MiB uncore_imc/cas_count_write/
42.632458350 seconds time elapsed
gcc/ChangeLog:
PR middle-end/119482
* bitmap.cc (bitmap_set_bit): Write back value unconditionally
|
|
Per the issue, the discussion of these two attributes needed to be
better integrated. I also did some editing for style and readability,
and clarified that almost all targets support this feature (it is
enabled by default unless the back end disables it), not just "some".
Co-Authored_by: Jonathan Wakely <jwakely@redhat.com>
gcc/ChangeLog
PR c++/118982
* doc/extend.texi (Common Function Attributes): For the
constructor/destructory attribute, be more explicit about the
relationship between the constructor attribute and
the C++ init_priority attribute, and add a cross-reference.
Also document that most targets support this.
(C++ Attributes): Similarly for the init_priority attribute.
|
|
|
|
gcc/ChangeLog
PR c/118118
* doc/extend.texi (Boolean Type): New section.
|
|
gnattools/
PR ada/119440
PR ada/119571
* Makefile.in (TOOLS_FLAGS_TO_PASS_CROSS): Pass $(PICFLAG) under
CFLAGS and $(LD_PICFLAG) under LDFLAGS.
|
|
These calls were into fixed-length arrays that might be too small.
gcc/cobol
* genapi.cc: (section_label): Use xasprintf() instead of sprintf().
(paragraph_label): Likewise. (leave_procedure): Likewise.
(find_procedure): Likewise. (parser_goto): Likewise.
(parser_enter_file): Likewise.
|
|
This is a C23/C++11 feature that is supported as an extension with
earlier -std= options too, but was never previously documented. It
interacts with the already-documented forward enum definition extension,
so I have merged discussion of the two extensions into the same section.
gcc/ChangeLog
PR c/117689
* doc/extend.texi (Incomplete Enums): Rename to....
(Enum Extensions): This. Document support for specifying the
underlying type of an enum as an extension in all earlier C
and C++ standards. Document that a forward declaration with
underlying type is not an incomplete type, and which dialects
GCC supports that in.
|
|
[PR119551]
An inline variable has vague linkage, and needs to be conditionally
emitted in TUs that reference it. Unfortunately this clashes with
[basic.link] p14.2, which says that we ignore the initialisers of all
variables (including inline ones), since importers will not have access
to the referenced TU-local entities to write the definition.
This patch makes such exposures be ill-formed. One case that continues
to work is if the exposure is part of the dynamic initialiser of an
inline variable; in such cases, the definition has been built as part of
the module interface unit anyway, and importers don't need to write it
out again, so such exposures are "harmless".
PR c++/119551
gcc/cp/ChangeLog:
* module.cc (trees_out::write_var_def): Only ignore non-inline
variable initializers.
gcc/testsuite/ChangeLog:
* g++.dg/modules/internal-5_a.C: Add cases that should be
ignored.
* g++.dg/modules/internal-5_b.C: Test these new cases, and make
the testcase more robust.
* g++.dg/modules/internal-11.C: New test.
* g++.dg/modules/internal-12_a.C: New test.
* g++.dg/modules/internal-12_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
|
|
This replaces some usages of the old -fmodules-ts flag with the new
-fmodules flag made in r15-5112-gd9c3c3c85665b2.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_diagnose_invalid_type_name): Replace
fmodules-ts with fmodules.
(cp_parser_template_declaration): Likewise.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
|
|
This is yet another false positive warning fix. This time the compiler
can't prove that when the vector has sufficient excess capacity to
append new elements, the pointer to the existing storage is not null.
libstdc++-v3/ChangeLog:
PR libstdc++/114945
* include/bits/vector.tcc (vector::_M_default_append): Add
unreachable condition so the compiler knows that _M_finish is
not null.
* testsuite/23_containers/vector/capacity/114945.cc: New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
|
|
I am working on some changes to GNAT to emit hierarchical DWARF --
i.e., where entities will have simple names nested in a DW_TAG_module.
While working on this I found a couple of paths in modified_type_die
where "mod_scope" should be used, but is not. I suspect these cases
are only reachable by Ada code, as in both spots (subrange types and
base types), I believe that other languages don't generally have named
types in a non-top-level scope, and in these other situations,
mod_scope will still be correct.
gcc
* dwarf2out.cc (modified_type_die): Use mod_scope for
ranged types, base types, and array types.
|
|
This is a partial step towards fixing that PR.
For musttail recursive calls which have non-is_gimple_reg_type typed
parameters, the only case we've handled was if the exact parameter
was passed through (perhaps modified, but still the same PARM_DECL).
That isn't necessary, we can copy the argument to the parameter as well
(just need to watch for the use of the parameter in later arguments,
say musttail recursive call which swaps 2 structure arguments).
The patch attempts to play safe and punts if any of the parameters are
addressable (like we do for all normal tail calls and tail recursions,
except for musttail in the posted unreviewed patch).
With this patch (at least when early inlining isn't done on not yet
optimized body) inlining should see already tail recursion optimized
body and will not have problems with SRA breaking musttail.
This version of the patch limits this for musttail tail recursions,
with intent to enable for all tail recursions in GCC 16.
2025-04-01 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/119493
* tree-tailcall.cc (find_tail_calls): Don't punt on tail recusion
if some arguments don't have is_gimple_reg_type, only punt if they
have non-POD types, or volatile, or addressable or (for now) it is
not a musttail call. Set tailr_arg_needs_copy in those cases too.
(eliminate_tail_call): Copy call arguments to params if they don't
have is_gimple_reg_type, use temporaries if the argument is used
later.
(tree_optimize_tail_calls_1): Skip !is_gimple_reg_type
tailr_arg_needs_copy parameters. Formatting fix.
* gcc.dg/pr119493-1.c: New test.
|
|
[PR119291]
The following testcase is miscompiled on x86_64-linux at -O2 by the combiner.
We have from earlier combinations
(insn 22 21 23 4 (set (reg:SI 104 [ _7 ])
(const_int 0 [0])) "pr119291.c":25:15 96 {*movsi_internal}
(nil))
(insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
(reg/v:SI 116 [ e ])) 96 {*movsi_internal}
(expr_list:REG_DEAD (reg/v:SI 116 [ e ])
(nil)))
(note 24 23 25 4 NOTE_INSN_DELETED)
(insn 25 24 26 4 (parallel [
(set (reg:CCZ 17 flags)
(compare:CCZ (neg:SI (reg:SI 104 [ _7 ]))
(const_int 0 [0])))
(set (reg/v:SI 116 [ e ])
(neg:SI (reg:SI 104 [ _7 ])))
]) "pr119291.c":26:13 977 {*negsi_2}
(expr_list:REG_DEAD (reg:SI 104 [ _7 ])
(nil)))
(note 26 25 27 4 NOTE_INSN_DELETED)
(insn 27 26 28 4 (set (reg:DI 128 [ _9 ])
(ne:DI (reg:CCZ 17 flags)
(const_int 0 [0]))) "pr119291.c":26:13 1447 {*setcc_di_1}
(expr_list:REG_DEAD (reg:CCZ 17 flags)
(nil)))
and try_combine is called on i3 25 and i2 22 (second time)
and reach the hunk being patched with simplified i3
(insn 25 24 26 4 (parallel [
(set (pc)
(pc))
(set (reg/v:SI 116 [ e ])
(const_int 0 [0]))
]) "pr119291.c":28:13 977 {*negsi_2}
(expr_list:REG_DEAD (reg:SI 104 [ _7 ])
(nil)))
and
(insn 22 21 23 4 (set (reg:SI 104 [ _7 ])
(const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal}
(nil))
Now, the try_combine code there attempts to split two independent
sets in newpat by moving one of them to i2.
And among other tests it checks
!modified_between_p (SET_DEST (set1), i2, i3)
which is certainly needed, if there would be say
(set (reg/v:SI 116 [ e ]) (const_int 42 [0x2a]))
in between i2 and i3, we couldn't do that, as that set would overwrite
the value set by set1 we want to move to the i2 position.
But in this case pseudo 116 isn't set in between i2 and i3, but used
(and additionally there is a REG_DEAD note for it).
This is equally bad for the move, because while the i3 insn
and later will see the pseudo value that we set, the insn in between
which uses the value will see a different value from the one that
it should see.
As we don't check for that, in the end try_combine succeeds and
changes the IL to:
(insn 22 21 23 4 (set (reg/v:SI 116 [ e ])
(const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal}
(nil))
(insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
(reg/v:SI 116 [ e ])) 96 {*movsi_internal}
(expr_list:REG_DEAD (reg/v:SI 116 [ e ])
(nil)))
(note 24 23 25 4 NOTE_INSN_DELETED)
(insn 25 24 26 4 (set (pc)
(pc)) "pr119291.c":28:13 2147483647 {NOOP_MOVE}
(nil))
(note 26 25 27 4 NOTE_INSN_DELETED)
(insn 27 26 28 4 (set (reg:DI 128 [ _9 ])
(const_int 0 [0])) "pr119291.c":28:13 95 {*movdi_internal}
(nil))
(note, the i3 got turned into a nop and try_combine also modified insn 27).
The following patch replaces the modified_between_p
tests with reg_used_between_p, my understanding is that
modified_between_p is a subset of reg_used_between_p, so one
doesn't need both.
Looking at this some more today, I think we should special case
set_noop_p because that can be put into i2 (except for the JUMP_P
violations), currently both modified_between_p (pc_rtx, i2, i3)
and reg_used_between_p (pc_rtx, i2, i3) returns false.
I'll post a patch incrementally for that (but that feels like
new optimization, so probably not something that should be backported).
On Tue, Apr 01, 2025 at 11:27:25AM +0200, Richard Biener wrote:
> Can we constrain SET_DEST (set1/set0) to a REG_P in combine? Why
> does the comment talk about memory?
I was worried about making too risky changes this late in stage4
(and especially also for backports). Most of this code is 1992-ish.
I think many of the functions are just misnamed, the reg_ in there doesn't
match what those functions do (bet they initially supported just REGs
and later on support for other kinds of expressions was added, but haven't
done git archeology to prove that).
What we know for sure is:
&& GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != ZERO_EXTRACT
&& GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != STRICT_LOW_PART
&& GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != ZERO_EXTRACT
&& GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != STRICT_LOW_PART
that is checked earlier in the condition.
Then it calls
&& ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)),
XVECEXP (newpat, 0, 0))
&& ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)),
XVECEXP (newpat, 0, 1))
While it has reg_* in it, that function mostly calls reg_overlap_mentioned_p
which is also misnamed, that function handles just fine all of
REG, MEM, SUBREG of REG, (SUBREG of MEM not, see below), ZERO_EXTRACT,
STRICT_LOW_PART, PC and even some further cases.
So, IMHO SET_DEST (set0) or SET_DEST (set0) can be certainly a REG, SUBREG
of REG, PC (at least the REG and PC cases are triggered on the testcase)
and quite possibly also MEM (SUBREG of MEM not, see below).
Now, the code uses !modified_between_p (SET_SRC (set{1,0}), i2, i3) where that
function for constants just returns false, for PC returns true, for REG
returns reg_set_between_p, for MEM recurses on the address, for
MEM_READONLY_P otherwise returns false, otherwise checks using alias.cc code
whether the memory could have been modified in between, for all other
rtxes recurses on the subrtxes. This part didn't change in my patch.
I've only changed those
- && !modified_between_p (SET_DEST (set{1,0}), i2, i3)
+ && !reg_used_between_p (SET_DEST (set{1,0}), i2, i3)
where the former has been described above and clearly handles all of
REG, SUBREG of REG, PC, MEM and SUBREG of MEM among other things.
The replacement reg_used_between_p calls reg_overlap_mentioned_p on each
instruction in between i2 and i3. So, there is clearly a difference
in behavior if SET_DEST (set{1,0}) is pc_rtx, in that case modified_between_p
returns unconditionally true even if there are no instructions in between,
but reg_used_between_p if there are no non-debug insns in between returns
false. Sorry for missing that, guess I should check for that (with the
exception of the noop moves which are often (set (pc) (pc)) and handled
by the incremental patch). In fact not just that, reg_used_between_p
will only return true for PC if it is mentioned anywhere in the insns
in between.
Anyway, except for that, for REG it calls refers_to_regno_p
and so should find any occurrences of any of the REG or parts of it for hard
registers, for MEM returns true if it sees any MEMs in insns in between
(conservatively), for SUBREGs apparently it relies on it being SUBREG of REG
(so doesn't handle SUBREG of MEM) and handles SUBREG of REG like the
SUBREG_REG, PC I've already described.
Now, because reg_overlap_mentioned_p doesn't handle SUBREG of MEM, I think
already the initial
&& ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)),
XVECEXP (newpat, 0, 0))
&& ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)),
XVECEXP (newpat, 0, 1))
calls would have failed --enable-checking=rtl or would have misbehaved, so
I think there is no need to check for it further.
To your question why I don't use reg_referenced_p, that is because
reg_referenced_p is something to call on one insn pattern, while
reg_used_between_p is pretty much that on all insns in between two
instructions (excluding the boundaries).
So, I think it would be safer to add && SET_DEST (set{1,0} != pc_rtx
checks to preserve former behavior, like in the following version.
2025-04-01 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/119291
* combine.cc (try_combine): For splitting of PARALLEL with
2 independent SETs into i2 and i3 sets check reg_used_between_p
of the SET_DESTs rather than just modified_between_p.
* gcc.c-torture/execute/pr119291.c: New test.
|
|
Linux toolchain may configured with --enable-default-pie, and that will
cause lots of regression test failures because the function name will
append with @plt suffix (e.g. `call foo` become `call foo@plt`), also
some code generation will different due to the code model like the address
generation for global variable, so we may add -fno-pie to those
testcases to prevent that.
We may consider just drop @plt suffix to prevent that at all, because
it's not difference between w/ and w/o @plt suffix, the linker will pick
the right one to do, however it's late stage of GCC development, so just
tweak the testcase should be the best way to do now.
Changes from v1:
- Add more testcase for PIE (from rvv.exp).
- Tweak the rule for match @plt.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rv32i_zcmp.c: Tweak testcase for PIE.
* gcc.target/riscv/rv32e_zcmp.c: Likewise.
* gcc.target/riscv/zcmp_stack_alignment.c: Likewise.
* gcc.target/riscv/cm_mv_rv32.c: Likewise.
* gcc.target/riscv/cpymem-64.c: Likewise.
* gcc.target/riscv/fmax-snan.c: Likewise.
* gcc.target/riscv/fmaxf-snan.c: Likewise.
* gcc.target/riscv/fmin-snan.c: Likewise.
* gcc.target/riscv/fminf-snan.c: Likewise.
* gcc.target/riscv/large-model.c: Likewise.
* gcc.target/riscv/predef-1.c: Likewise.
* gcc.target/riscv/predef-4.c: Likewise.
* gcc.target/riscv/predef-7.c: Likewise.
* gcc.target/riscv/predef-9.c: Likewise.
* gcc.target/riscv/rvv/base/abi-callee-saved-2-save-restore.c: Likewise.
* gcc.target/riscv/rvv/base/abi-callee-saved-2-zcmp.c: Likewise.
* gcc.target/riscv/rvv/base/abi-callee-saved-2.c: Likewise.
* gcc.target/riscv/rvv/base/cmpmem-1.c: Likewise.
* gcc.target/riscv/rvv/base/cmpmem-3.c: Likewise.
* gcc.target/riscv/rvv/base/cmpmem-4.c: Likewise.
* gcc.target/riscv/rvv/base/cpymem-1.c: Likewise.
* gcc.target/riscv/rvv/base/movmem-1.c: Likewise.
* gcc.target/riscv/rvv/base/pr114352-3.c: Likewise.
* gcc.target/riscv/rvv/base/setmem-1.c: Likewise.
* gcc.target/riscv/rvv/base/setmem-2.c: Likewise.
* gcc.target/riscv/rvv/base/setmem-3.c: Likewise.
* gcc.target/riscv/rvv/base/spill-9.c: Likewise.
* g++.target/riscv/mv-symbols1.C: Likewise.
* g++.target/riscv/mv-symbols3.C: Likewise.
* g++.target/riscv/mv-symbols4.C: Likewise.
* g++.target/riscv/mv-symbols5.C: Likewise.
* g++.target/riscv/mvc-symbols1.C: Likewise.
* g++.target/riscv/mvc-symbols3.C: Likewise.
|
|
The following makes sure to reject the attempts to emulate a vector
gather when the discovered index vector type is a vector mask.
PR tree-optimization/119534
* tree-vect-stmts.cc (get_load_store_type): Reject
VECTOR_BOOLEAN_TYPE_P offset vector type for emulated gathers.
* gcc.dg/vect/pr119534.c: New testcase.
|
|
Since r15-8011 cp_build_indirect_ref_1 won't do the *&TARGET_EXPR ->
TARGET_EXPR folding not to change its value category. That fix seems
correct but it made us stop extending the lifetime in this testcase,
causing a wrong-code issue -- extend_ref_init_temps_1 did not see
through the extra *& because it doesn't use a tree walk.
This patch reverts r15-8011 and instead handles the problem in
build_over_call by calling force_lvalue in the is_really_empty_class
case as well as in the general case.
PR c++/119383
gcc/cp/ChangeLog:
* call.cc (build_over_call): Use force_lvalue to ensure op= returns
an lvalue.
* cp-tree.h (force_lvalue): Declare.
* cvt.cc (force_lvalue): New.
* typeck.cc (cp_build_indirect_ref_1): Revert r15-8011.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/temp-extend3.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
The warning -Wzero-as-null-pointer-constant is now not only supported
in C++ but also in C. Change the documentation accordingly.
PR c/119173
gcc/ChangeLog:
* doc/invoke.texi (Warning Options): Move to general options.
|
|
As the following testcase shows, EDGE_FAKE edges from musttail calls to
EXIT aren't the only edges we should ignore, we need to ignore also
edges created by the splitting of blocks for the EDGE_FAKE creation that
point from the musttail calls to the fallthrough block, which typically does
the return or with PHIs for the return value.
2025-04-01 Jakub Jelinek <jakub@redhat.com>
PR gcov-profile/119535
* profile.cc (branch_prob): Ignore any edges from bbs ending with
musttail call, rather than only EDGE_FAKE edges from those to EXIT.
* c-c++-common/pr119535.c: New test.
|
|
While working on the previous tailc patch, I've noticed the following
problem.
The testcase below fails, because we decide to tail recursion optimize
the call, but tail recursion (as documented in tree-tailcall.cc) needs to
add some result multiplication and/or addition if any tail recursion uses
accumulator, which is added right before the return.
So, if there are musttail non-recurive calls in the function, successful
tail recursion optimization will mean we'll later error on the musttail
calls. musttail recursive calls are ok, those would be tail recursion
optimized.
So, the following patch punts on all tail recursion optimizations if it
needs accumulators (add and/or mult) if there is at least one non-recursive
musttail call.
2025-04-01 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/119493
* tree-tailcall.cc (tree_optimize_tail_calls_1): Ignore tail recursion
candidates which need accumulators if there is at least one musttail
non-recursive call.
* gcc.dg/pr119493-2.c: New test.
|
|
as unary && operands outside of those [PR119537]
The following testcases ICE on invalid code which defines
labels inside of statement expressions and then uses &&label
from code outside of the statement expressions.
The C++ FE diagnoses that with a warning (not specifically for
assume attribute, genericallly about taking address of a label
outside of a statement expression so computed goto could violate
the requirement that statement expression is not entered from
outside of it through a jump into it), the C FE doesn't diagnose
anything.
Normal direct gotos to such labels are diagnosed by both C and C++.
In the assume attribute case it is actually worse than for
addresses of labels in normal statement expressions, in that case
the labels are still in the current function, so invalid program
can still jump to those (and in case of OpenMP/OpenACC where it
is also invalid and stuff is moved to a separate function, such
movement is done post cfg discovery of FORCED_LABELs and worst
case one can run into cases which fail to assemble, but I haven't
succeeded in creating ICE for that).
For assume at -O0 we'd just throw away the assume expression if
it is not a simple condition and so the label is then not defined
anywhere and we ICE during cfg pass.
The gimplify.cc hunks fix that, as we don't have FORCED_LABELs
discovery done yet, it preserves all assume expressions which contain
used user labels.
With that we ICE during IRA, which is upset about an indirect jump
to a label which doesn't exist.
So, the gimple-low.cc hunks add diagnostics of the problem, it gathers
uids of all the user used labels inside of the assume expressions (usually
none) and if it finds any, walks the IL to find uses of those from outside
of those expressions now outlined into separate magic functions.
2025-04-01 Jakub Jelinek <jakub@redhat.com>
PR middle-end/119537
* gimplify.cc (find_used_user_labels): New function.
(gimplify_call_expr): Don't remove complex assume expression at -O0
if it defines any user labels.
* gimple-low.cc: Include diagnostic-core.h.
(assume_labels): New variable.
(diagnose_assume_labels): New function.
(lower_function_body): Call it via walk_gimple_seq if assume_labels
is non-NULL, then BITMAP_FREE assume_labels.
(find_assumption_locals_r): Record in assume_labels uids of user
labels defined in assume attribute expressions.
* c-c++-common/pr119537-1.c: New test.
* c-c++-common/pr119537-2.c: New test.
|
|
This resolves all instances of PR119369
"GCN: weak undefined symbols -> execution test FAIL, 'HSA_STATUS_ERROR_VARIABLE_UNDEFINED'";
for all affected test cases, the execution test status progresses FAIL -> PASS.
This however also causes a small number of (expected) regressions, very similar
to GCC/nvptx:
[-PASS:-]{+FAIL:+} g++.dg/abi/pure-virtual1.C -std=c++17 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.dg/abi/pure-virtual1.C -std=c++26 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.dg/abi/pure-virtual1.C -std=c++98 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C -std=c++11 scan-assembler .weak[ \t]*_?_ZTH11derived_obj
[-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C -std=c++11 scan-assembler .weak[ \t]*_?_ZTH13container_obj
[-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C -std=c++11 scan-assembler .weak[ \t]*_?_ZTH8base_obj
PASS: g++.dg/cpp0x/pr84497.C -std=c++11 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C -std=c++17 scan-assembler .weak[ \t]*_?_ZTH11derived_obj
[-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C -std=c++17 scan-assembler .weak[ \t]*_?_ZTH13container_obj
[-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C -std=c++17 scan-assembler .weak[ \t]*_?_ZTH8base_obj
PASS: g++.dg/cpp0x/pr84497.C -std=c++17 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C -std=c++26 scan-assembler .weak[ \t]*_?_ZTH11derived_obj
[-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C -std=c++26 scan-assembler .weak[ \t]*_?_ZTH13container_obj
[-PASS:-]{+FAIL:+} g++.dg/cpp0x/pr84497.C -std=c++26 scan-assembler .weak[ \t]*_?_ZTH8base_obj
PASS: g++.dg/cpp0x/pr84497.C -std=c++26 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.dg/ext/weak2.C -std=gnu++17 scan-assembler weak[^ \t]*[ \t]_?_Z3foov
PASS: g++.dg/ext/weak2.C -std=gnu++17 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.dg/ext/weak2.C -std=gnu++26 scan-assembler weak[^ \t]*[ \t]_?_Z3foov
PASS: g++.dg/ext/weak2.C -std=gnu++26 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.dg/ext/weak2.C -std=gnu++98 scan-assembler weak[^ \t]*[ \t]_?_Z3foov
PASS: g++.dg/ext/weak2.C -std=gnu++98 (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/attr-weakref-1.c (test for excess errors)
[-FAIL:-]{+UNRESOLVED:+} gcc.dg/attr-weakref-1.c [-execution test-]{+compilation failed to produce executable+}
@@ -131211,25 +131211,25 @@ PASS: gcc.dg/weak/weak-1.c scan-assembler weak[^ \t]*[ \t]_?c
PASS: gcc.dg/weak/weak-1.c scan-assembler weak[^ \t]*[ \t]_?d
PASS: gcc.dg/weak/weak-1.c scan-assembler weak[^ \t]*[ \t]_?e
PASS: gcc.dg/weak/weak-1.c scan-assembler weak[^ \t]*[ \t]_?g
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-1.c scan-assembler weak[^ \t]*[ \t]_?j
PASS: gcc.dg/weak/weak-1.c scan-assembler-not weak[^ \t]*[ \t]_?i
PASS: gcc.dg/weak/weak-12.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-12.c scan-assembler weak[^ \t]*[ \t]_?foo
PASS: gcc.dg/weak/weak-15.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-15.c scan-assembler weak[^ \t]*[ \t]_?a
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-15.c scan-assembler weak[^ \t]*[ \t]_?c
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-15.c scan-assembler weak[^ \t]*[ \t]_?d
PASS: gcc.dg/weak/weak-15.c scan-assembler-not weak[^ \t]*[ \t]_?b
PASS: gcc.dg/weak/weak-16.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-16.c scan-assembler weak[^ \t]*[ \t]_?kallsyms_token_index
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-16.c scan-assembler weak[^ \t]*[ \t]_?kallsyms_token_table
PASS: gcc.dg/weak/weak-2.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-2.c scan-assembler weak[^ \t]*[ \t]_?ffoo1a
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-2.c scan-assembler weak[^ \t]*[ \t]_?ffoo1b
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-2.c scan-assembler weak[^ \t]*[ \t]_?ffoo1c
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-2.c scan-assembler weak[^ \t]*[ \t]_?ffoo1e
PASS: gcc.dg/weak/weak-2.c scan-assembler-not weak[^ \t]*[ \t]_?ffoo1d
PASS: gcc.dg/weak/weak-3.c (test for warnings, line 58)
PASS: gcc.dg/weak/weak-3.c (test for warnings, line 73)
PASS: gcc.dg/weak/weak-3.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-3.c scan-assembler weak[^ \t]*[ \t]_?ffoo1a
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-3.c scan-assembler weak[^ \t]*[ \t]_?ffoo1b
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-3.c scan-assembler weak[^ \t]*[ \t]_?ffoo1c
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-3.c scan-assembler weak[^ \t]*[ \t]_?ffoo1e
PASS: gcc.dg/weak/weak-3.c scan-assembler weak[^ \t]*[ \t]_?ffoo1f
PASS: gcc.dg/weak/weak-3.c scan-assembler weak[^ \t]*[ \t]_?ffoo1g
PASS: gcc.dg/weak/weak-3.c scan-assembler-not weak[^ \t]*[ \t]_?ffoo1d
PASS: gcc.dg/weak/weak-4.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1a
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1b
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1c
PASS: gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1d
PASS: gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1e
PASS: gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1f
@@ -131267,16 +131267,16 @@ PASS: gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1i
PASS: gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1j
PASS: gcc.dg/weak/weak-4.c scan-assembler weak[^ \t]*[ \t]_?vfoo1k
PASS: gcc.dg/weak/weak-5.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1a
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1b
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1c
PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1d
PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1e
PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1f
PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1g
PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1h
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1i
[-PASS:-]{+FAIL:+} gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1j
PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1k
PASS: gcc.dg/weak/weak-5.c scan-assembler weak[^ \t]*[ \t]_?vfoo1l
These get 'dg-xfail-if'ed or 'dg-skip-if'ed, (mostly) similar to GCC/nvptx.
PR target/119369
gcc/
* config/gcn/gcn-protos.h (gcn_asm_weaken_decl): Declare.
* config/gcn/gcn.cc (gcn_asm_weaken_decl): New.
* config/gcn/gcn-hsa.h (ASM_WEAKEN_DECL): '#define' to this.
gcc/testsuite/
* g++.dg/abi/pure-virtual1.C: 'dg-xfail-if' GCN.
* g++.dg/cpp0x/pr84497.C: 'dg-skip-if' GCN.
* g++.dg/ext/weak2.C: Likewise.
* gcc.dg/attr-weakref-1.c: Likewise.
* gcc.dg/weak/weak-1.c: Likewise.
* gcc.dg/weak/weak-12.c: Likewise.
* gcc.dg/weak/weak-15.c: Likewise.
* gcc.dg/weak/weak-16.c: Likewise.
* gcc.dg/weak/weak-2.c: Likewise.
* gcc.dg/weak/weak-3.c: Likewise.
* gcc.dg/weak/weak-4.c: Likewise.
* gcc.dg/weak/weak-5.c: Likewise.
|
|
This fixes a few hundreds of compilation/linking FAILs (similar to PR69506),
where the GCN/LLVM 'ld' reported:
ld: error: relocation R_AMDGPU_REL32_LO cannot be used against symbol '_ZGTtnam'; recompile with -fPIC
>>> defined in [...]/amdgcn-amdhsa/./libstdc++-v3/src/.libs/libstdc++.a(cow-stdexcept.o)
>>> referenced by cow-stdexcept.cc:259 ([...]/libstdc++-v3/src/c++11/cow-stdexcept.cc:259)
>>> cow-stdexcept.o:(_txnal_cow_string_C1_for_exceptions(void*, char const*, void*)) in archive [...]/amdgcn-amdhsa/./libstdc++-v3/src/.libs/libstdc++.a
ld: error: relocation R_AMDGPU_REL32_HI cannot be used against symbol '_ZGTtnam'; recompile with -fPIC
>>> defined in [...]/amdgcn-amdhsa/./libstdc++-v3/src/.libs/libstdc++.a(cow-stdexcept.o)
>>> referenced by cow-stdexcept.cc:259 ([...]/source-gcc/libstdc++-v3/src/c++11/cow-stdexcept.cc:259)
>>> cow-stdexcept.o:(_txnal_cow_string_C1_for_exceptions(void*, char const*, void*)) in archive [...]/amdgcn-amdhsa/./libstdc++-v3/src/.libs/libstdc++.a
[...]
..., which is:
$ c++filt _ZGTtnam
transaction clone for operator new[](unsigned long)
..., and similarly for other libitm symbols.
However, the affected test cases, if applicable, then run into execution test
FAILs, due to PR119369
"GCN: weak undefined symbols -> execution test FAIL, 'HSA_STATUS_ERROR_VARIABLE_UNDEFINED'".
PR target/119369
libstdc++-v3/
* config/cpu/gcn/cpu_defines.h: New.
* configure.host [GCN] (cpu_defines_dir): Point to it.
|
|
The following fixes ix86_valid_target_attribute_inner_p to properly
handle target("no-sse4") via OPT_mno_sse4 rather than as unset OPT_msse4.
I've added asserts to ix86_handle_option that RejectNegative is honored
for both.
PR target/119549
* common/config/i386/i386-common.cc (ix86_handle_option):
Assert that both OPT_msse4 and OPT_mno_sse4 are never unset.
* config/i386/i386-options.cc (ix86_valid_target_attribute_inner_p):
Process negated OPT_msse4 as OPT_mno_sse4.
* gcc.target/i386/pr119549.c: New testcase.
|
|
gcc/ChangeLog:
PR middle-end/119559
* gimplify.cc (modify_call_for_omp_dispatch): Reorder checks to avoid
asserts and bogus diagnostic.
|
|
I've noticed
../../../libquadmath/printf/gmp-impl.h:104:18: warning: old-style function definition [-Wold-style-definition]
../../../libquadmath/printf/gmp-impl.h:104:18: warning: old-style function definition [-Wold-style-definition]
../../../libquadmath/printf/gmp-impl.h:104:18: warning: old-style function definition [-Wold-style-definition]
../../../libquadmath/strtod/strtod_l.c:456:22: warning: old-style function definition [-Wold-style-definition]
warnings during bootstrap (clearly since the switch to -std=gnu23 by default).
The following patch fixes those in libquadmath, the only other warnings are
in zlib.
2025-04-01 Jakub Jelinek <jakub@redhat.com>
* strtod/strtod_l.c (____STRTOF_INTERNAL): Avoid old-style function
definitions.
* printf/addmul_1.c (mpn_addmul_1): Likewise.
* printf/mul_1.c (mpn_mul_1): Likewise.
* printf/submul_1.c (mpn_submul_1): Likewise.
|
|
Fix broken testsuite like
"ERROR: gcc.target/riscv/cmo-zicbop-2.c -Os : 1: too many arguments for " dg-do 1 compile target { { rv32-*-*}} "
gcc/testsuite/ChangeLog:
* gcc.target/riscv/cmo-zicbop-1.c: Fix missing { before target .
* gcc.target/riscv/cmo-zicbop-2.c: Likewise.
* gcc.target/riscv/prefetch-zicbop.c:Likewise.
* gcc.target/riscv/prefetch-zihintntl.c:Likewise.
|
|
For vaes patterns with jm constraint and gpr16 attr, it requires "isa"
attr to distinct avx/avx512 alternatives in ix86_memory_address_reg_class.
Also adds missing type and mode attributes for those vaes patterns.
gcc/ChangeLog:
PR target/119473
* config/i386/sse.md
(vaesdec_<mode>): Set attr "isa" as "avx,vaes_avx512vl", "type" as
"sselog1", "mode" as "TI".
(vaesdeclast_<mode>): Ditto.
(vaesenc_<mode>): Ditto.
(vaesenclast_<mode>): Ditto.
gcc/testsuite/ChangeLog:
PR target/119473
* gcc.target/i386/pr119473.c: New test.
Co-authored-by: Hongyu Wang <hongyu.wang@intel.com>
|
|
According to Section 3.4.2, Vector Register Grouping, in the RISC-V
Vector Specification, the rule for LMUL is LMUL >= SEW/ELEN
Changes since V2:
- Add check on vector-iterators.md
- Add one more testcase to check the VLS use correct mode.
gcc/ChangeLog:
* config/riscv/riscv-v.cc: Add restrict for insert LMUL.
* config/riscv/riscv-vector-builtins-types.def:
Use RVV_REQUIRE_ELEN_64 to check LMUL number.
* config/riscv/riscv-vector-switch.def: Likewise.
* config/riscv/vector-iterators.md: Check TARGET_VECTOR_ELEN_64
rather than "TARGET_MIN_VLEN > 32" for all iterator.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr111391-2.c: Update test.
* gcc.target/riscv/rvv/base/abi-14.c: Update test.
* gcc.target/riscv/rvv/base/abi-16.c: Update test.
* gcc.target/riscv/rvv/base/abi-18.c: Update test.
* gcc.target/riscv/rvv/base/vsetvl_zve32-1.c: New test.
* gcc.target/riscv/rvv/base/vsetvl_zve32-2.c: New test.
Co-authored-by: Kito Cheng <kito.cheng@sifive.com>
|
|
correct position.
gcc/ChangeLog:
* doc/invoke.texi: Corrected the position of '-mtls-dialect=opt'
option.
|