Age | Commit message (Collapse) | Author | Files | Lines |
|
With -march=cascadelake we use vpermilps instead of shufps.
PR tree-optimization/116258
* gcc.target/i386/pr116258.c: Also allow vpermilps.
|
|
LRA emits insns to save caller-save registers in the
inheritance/splitting pass. In this pass, LRA builds EBBs (Extended
Basic Block) and traverses the insns in the EBBs in reverse order from
the last insn to the first insn. When LRA sees a write to a pseudo (that
has been assigned a caller-save register), and there is a read following
the write, with an intervening call insn between the write and read,
then LRA generates a spill immediately after the write and a restore
immediately before the read. The spill is needed because the call insn
will clobber the caller-save register.
If there is a write insn and a call insn in two separate BBs but
belonging to the same EBB, the spill insn gets generated in the BB
containing the write insn. If the write insn is in the entry BB, then
the spill insn that is generated in the entry BB prevents shrink wrap
from happening. This is because the spill insn references the stack
pointer and hence the prolog gets generated in the entry BB itself.
This patch ensures the the spill insn is generated before the call insn
instead of after the write. This also ensures that the spill occurs
only in the path containing the call.
2024-08-01 Surya Kumari Jangala <jskumari@linux.ibm.com>
gcc:
PR rtl-optimization/116028
* lra-constraints.cc (split_reg): Spill register before call
insn.
(latest_call_insn): New variable.
(inherit_in_ebb): Track the latest call insn.
gcc/testsuite:
PR rtl-optimization/116028
* gcc.dg/ira-shrinkwrap-prep-1.c: Remove xfail for powerpc.
* gcc.dg/pr10474.c: Remove xfail for powerpc.
|
|
This patch support Zimop and Zcmop extension[1].To enable GCC to recognize
and process Zimop and Zcmop extension correctly at compile time.
https://github.com/riscv/riscv-isa-manual/blob/main/src/zimop.adoc
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: New extension.
* config/riscv/riscv.opt: New mask.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/arch-42.c: New test.
* gcc.target/riscv/arch-43.c: New test.
|
|
[PR115801]
With modules it may be the case that a template friend class provided
with a qualified name is not found by name lookup at instantiation time,
due to the class not being exported from its module. This causes issues
in tsubst_friend_class which did not handle this case.
This is caused by the named friend class not actually requiring
tsubsting. This was already worked around for the "found by name
lookup" case (g++.dg/template/friend5.C), but it looks like there's no
need to do name lookup at all for this particular case to work.
We do need to be careful to continue to do name lookup to handle
templates from an outer current instantiation though; this patch adds a
new testcase for this as well. This should not impact modules (because
exportingness will only affect namespace lookup).
PR c++/115801
gcc/cp/ChangeLog:
* pt.cc (tsubst_friend_class): Return the type immediately when
no tsubsting or name lookup is required.
gcc/testsuite/ChangeLog:
* g++.dg/modules/tpl-friend-16_a.C: New test.
* g++.dg/modules/tpl-friend-16_b.C: New test.
* g++.dg/template/friend82.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
Currently name lookup generally seems to assume that all entities
declared within a named module (partition) are attached to said module,
which is not true for GM entities (e.g. via extern "C++"), and causes
issues with deduplication.
This patch fixes the issue by ensuring that module attachment of a
declaration is consistently used to handling merging. Handling this
exposes some issues with deduplicating temploid friends; to resolve this
we always create the BINDING_SLOT_PARTITION slot so that we have
somewhere to place attached names (from any module).
This doesn't yet completely handle issues with allowing otherwise
conflicting temploid friends from different modules to co-exist in the
same module if neither are reachable from the other via name lookup.
PR c++/114950
gcc/cp/ChangeLog:
* module.cc (trees_out::decl_value): Stream bit indicating
imported temploid friends early.
(trees_in::decl_value): Use this bit with key_mergeable.
(trees_in::key_mergeable): Allow merging attached declarations
if they're imported temploid friends (which must be namespace
scope).
(module_state::read_cluster): Check for GM entities that may
require merging even when importing from partitions.
* name-lookup.cc (enum binding_slots): Adjust comment.
(get_fixed_binding_slot): Always create partition slot.
(name_lookup::search_namespace_only): Support binding vectors
with both partition and GM entities to dedup.
(walk_module_binding): Likewise.
(name_lookup::adl_namespace_fns): Likewise.
(set_module_binding): Likewise.
(check_module_override): Use attachment of the decl when
checking overrides rather than named_module_p.
(lookup_imported_hidden_friend): Use partition slot for finding
mergeable template bindings.
* name-lookup.h (set_module_binding): Split mod_glob_flag
parameter into separate global_p and partition_p params.
gcc/testsuite/ChangeLog:
* g++.dg/modules/tpl-friend-13_e.C: Adjust error message.
* g++.dg/modules/ambig-2_a.C: New test.
* g++.dg/modules/ambig-2_b.C: New test.
* g++.dg/modules/part-9_a.C: New test.
* g++.dg/modules/part-9_b.C: New test.
* g++.dg/modules/part-9_c.C: New test.
* g++.dg/modules/tpl-friend-15.h: New test.
* g++.dg/modules/tpl-friend-15_a.C: New test.
* g++.dg/modules/tpl-friend-15_b.C: New test.
* g++.dg/modules/tpl-friend-15_c.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
This error message reads to me the wrong way around, particularly in the
context of other errors. Updated so that the ellipsis connect.
gcc/cp/ChangeLog:
* module.cc (trees_in::read_enum_def): Clarify error.
gcc/testsuite/ChangeLog:
* g++.dg/modules/enum-bad-1_b.C: Update error message.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
|
|
|
|
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/604075
|
|
XCode compilers recognise the weak_framework linker option in the driver
and forward it. This patch makes GCC adopt the same behaviour.
PR target/116237
gcc/ChangeLog:
* config/darwin.h (SUBTARGET_DRIVER_SELF_SPECS): Add a spec for
weak_framework.
* config/darwin.opt: Handle weak_framework driver option.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
|
|
When a partial specialization is deemed erroneous at parse time, we
currently flag the primary template as erroneous instead. Later
at instantiation time we check if the primary template is erroneous
rather than the selected partial specialization, so at least we're
consistent.
But it's better not to conflate a partial specialization with the
primary template since they're instantiated independenty. This avoids
rejecting the instantiation of A<int> in the below testcase.
PR c++/116064
gcc/cp/ChangeLog:
* error.cc (get_current_template): If the current scope is
a partial specialization, return it instead of the primary
template.
* pt.cc (instantiate_class_template): Pass the partial
specialization if any to maybe_diagnose_erroneous_template
instead of the primary template.
gcc/testsuite/ChangeLog:
* g++.dg/template/permissive-error2.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
When offloading is enabled, the patch streams out host
NUM_POLY_INT_COEFFS, and changes streaming in as follows:
if (host_num_poly_int_coeffs <= NUM_POLY_INT_COEFFS)
{
for (i = 0; i < host_num_poly_int_coeffs; i++)
poly_int.coeffs[i] = stream_in coeff;
for (; i < NUM_POLY_INT_COEFFS; i++)
poly_int.coeffs[i] = 0;
}
else
{
for (i = 0; i < NUM_POLY_INT_COEFFS; i++)
poly_int.coeffs[i] = stream_in coeff;
/* Ensure that degree of poly_int <= accel NUM_POLY_INT_COEFFS. */
for (; i < host_num_poly_int_coeffs; i++)
{
val = stream_in coeff;
if (val != 0)
error ();
}
}
gcc/ChangeLog:
PR ipa/96265
PR ipa/111937
* data-streamer-in.cc (streamer_read_poly_uint64): Remove code for
streaming, and call poly_int_read_common instead.
(streamer_read_poly_int64): Likewise.
* data-streamer.cc (host_num_poly_int_coeffs): Conditionally define
new variable if ACCEL_COMPILER is defined.
* data-streamer.h (host_num_poly_int_coeffs): Declare.
(poly_int_read_common): New function template.
(bp_unpack_poly_value): Remove code for streaming and call
poly_int_read_common instead.
* lto-streamer-in.cc (lto_input_mode_table): Stream-in host
NUM_POLY_INT_COEFFS into host_num_poly_int_coeffs if ACCEL_COMPILER
is defined.
* lto-streamer-out.cc (lto_write_mode_table): Stream out
NUM_POLY_INT_COEFFS if offloading is enabled.
* poly-int.h (MAX_NUM_POLY_INT_COEFFS_BITS): New macro.
* tree-streamer-in.cc (lto_input_ts_poly_tree_pointers): Adjust
streaming-in of poly_int.
Signed-off-by: Prathamesh Kulkarni <prathameshk@nvidia.com>
|
|
SRA adds fancy names like offset$D94316$_M_impl$D93629$_M_start
where the numbers in there are DECL_UIDs if there are unnamed
FIELD_DECLs etc.
Because -g0 vs. -g can cause differences between the exact DECL_UID
values (add bigger gaps in between them, corresponding decls should
still be ordered the same based on DECL_UID) we make sure such
decls have DECL_NAMELESS set and depending on exact options either don't
dump such names at all or dump_fancy_name sanitizes the D123456$ parts in
there to Dxxxx$.
Unfortunately in tons of places we then use get_name to grab either user
names or these SRA created names and use that as argument to
create_tmp_var{,_name,_raw} to base other artificial temporary names based
on that. Those are DECL_NAMELESS too, but unfortunately create_tmp_var_name
starting with
https://gcc.gnu.org/git/?p=gcc.git&a=commit;h=725494f6e4121eace43b7db1202f8ecbf52a8276
calls clean_symbol_name which replaces the $s in there with _s and thus
dump_fancy_name doesn't sanitize it anymore.
I don't see any discussion of that commit (originally to TM branch, later
merged) on the mailing list, but from
DECL_NAME (new_decl)
= create_tmp_var_name (IDENTIFIER_POINTER (DECL_NAME (old_decl)));
- SET_DECL_ASSEMBLER_NAME (new_decl, NULL_TREE);
+ SET_DECL_ASSEMBLER_NAME (new_decl, DECL_NAME (new_decl));
snippet elsewhere in that commit it seems create_tmp_var_name was used at
that point also to determine function names of clones, so presumably the
clean_symbol_name at that point was to ensure the symbol could be emitted
into assembly, maybe in case DECL_NAME is something like C++ operators or
whatever could have there undesirable characters.
Anyway, we don't do that for years anymore, already GCC 4.5 uses for such
purposes clone_function_name which starts of DECL_ASSEMBLER_NAME of the old
function and appends based on supportable symbol suffix separators the
separator and some suffix and/or number, so that part doesn't go through
create_tmp_var_name.
I don't see problems with having the $ and . etc. characters in the names
intended just to make dumps more readable, after all, we already are using
those in the SRA created names. Those names shouldn't make it into the
assembly in any way, neither debug info nor assembly labels.
There is one theoretical case, where the gimplifier promotes automatic
vars into TREE_STATIC ones and therefore those can then appear in assembly,
just in case it would be on e.g. SRA created names and regimplified later.
Because no cases of promotion of DECL_NAMELESS vars to static was observed in
{x86_64,i686,powerpc64le}-linux bootstraps/regtests, the code simply uses
C.NNN names for DECL_NAMELESS vars like it does for !DECL_NAME vars.
Richi mentioned on IRC that the non-cleaned up names might make things
harder to feed stuff back to the GIMPLE FE, but if so, I think it should be
the dumping for GIMPLE FE purposes that cleans those up (but at that point
it should also verify if some such cleaned up names don't collide with
others and somehow deal with those).
2024-08-07 Jakub Jelinek <jakub@redhat.com>
PR c++/116219
* gimple-expr.cc (remove_suffix): Formatting fixes.
(create_tmp_var_name): Don't call clean_symbol_name.
* gimplify.cc (gimplify_init_constructor): When promoting automatic
DECL_NAMELESS vars to static, don't preserve their DECL_NAME.
|
|
This commit also compile-time expands (__builtin_)omp_is_initial_device for
both Fortran and C/C++ (unless, -fno-builtin-omp_is_initial_device is used).
But the main change is:
This commit adds support for running constructors and destructors for
static (file-scope) aggregates for C++ objects which are marked with
"declare target" directives on OpenMP offload targets.
Before this commit, space is allocated on the target for such aggregates,
but nothing ever constructs them properly, so they end up zero-initialised.
(See the new test static-aggr-constructor-destructor-3.C for a reason
why running constructors on the target is preferable to e.g. constructing
on the host and then copying the resulting object to the target.)
2024-08-07 Julian Brown <julian@codesourcery.com>
Tobias Burnus <tobias@baylibre.com>
gcc/ChangeLog:
* builtins.def (DEF_GOMP_BUILTIN_COMPILER): Define
DEF_GOMP_BUILTIN_COMPILER to handle the non-prefix version.
* gimple-fold.cc (gimple_fold_builtin_omp_is_initial_device): New.
(gimple_fold_builtin): Call it.
* omp-builtins.def (BUILT_IN_OMP_IS_INITIAL_DEVICE): Define.
* tree.cc (get_file_function_name): Support names for on-target
constructor/destructor functions.
gcc/cp/
* decl2.cc (tree-inline.h): Include.
(static_init_fini_fns): Bump to four entries. Update comment.
(start_objects, start_partial_init_fini_fn): Add 'omp_target'
parameter. Support "declare target" decls. Update forward declaration.
(emit_partial_init_fini_fn): Add 'host_fn' parameter. Return tree for
the created function. Support "declare target".
(OMP_SSDF_IDENTIFIER): New macro.
(partition_vars_for_init_fini): Support partitioning "declare target"
variables also.
(generate_ctor_or_dtor_function): Add 'omp_target' parameter. Support
"declare target" decls.
(c_parse_final_cleanups): Support constructors/destructors on OpenMP
offload targets.
gcc/fortran/ChangeLog:
* gfortran.h (gfc_option_t): Add disable_omp_is_initial_device.
* lang.opt (fbuiltin-): Add.
* options.cc (gfc_handle_option): Handle
-fno-builtin-omp_is_initial_device.
* f95-lang.cc (gfc_init_builtin_functions): Handle
DEF_GOMP_BUILTIN_COMPILER.
* trans-decl.cc (gfc_get_extern_function_decl): Add code to use
DEF_GOMP_BUILTIN_COMPILER for 'omp_is_initial_device'.
libgomp/ChangeLog:
* testsuite/libgomp.c++/static-aggr-constructor-destructor-1.C: New test.
* testsuite/libgomp.c++/static-aggr-constructor-destructor-2.C: New test.
* testsuite/libgomp.c++/static-aggr-constructor-destructor-3.C: New test.
* testsuite/libgomp.c-c++-common/target-is-initial-host.c: New test.
* testsuite/libgomp.c-c++-common/target-is-initial-host-2.c: New test.
* testsuite/libgomp.fortran/target-is-initial-host.f: New test.
* testsuite/libgomp.fortran/target-is-initial-host.f90: New test.
* testsuite/libgomp.fortran/target-is-initial-host-2.f90: New test.
Co-authored-by: Tobias Burnus <tobias@baylibre.com>
|
|
The following patch attempts to implement DR2387 by making variable
templates including their specialization TREE_PUBLIC when at file
scope and they don't have static storage class.
2024-08-07 Jakub Jelinek <jakub@redhat.com>
PR c++/109126
* decl.cc (grokvardecl): Implement CWG 2387 - Linkage of
const-qualified variable template. Set TREE_PUBLIC on variable
templates with const qualified types unless static is present.
* g++.dg/DRs/dr2387.C: New test.
* g++.dg/DRs/dr2387-aux.cc: New file.
|
|
The commit for PR 116258, added a x86_64 specific testcase,
I thought it would be a good idea to add an aarch64 testcase too.
And since it also fixed VLA vectors too so add a SVE testcase.
Pushed as obvious after a test for aarch64-linux-gnu.
PR middle-end/116258
PR middle-end/116259
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/pr116258.c: New test.
* gcc.target/aarch64/sve/pr116259-1.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
Add the signed __int128 and unsigned __int128 argument types for the
overloaded built-ins vec_sld, vec_sldb, vec_sldw, vec_sll, vec_slo,
vec_srdb, vec_srl, vec_sro. For each of the new argument types add a
testcase and update the documentation for the built-in.
gcc/ChangeLog:
* config/rs6000/altivec.md (vs<SLDB_lr>db_<mode>): Change
define_insn iterator to VEC_IC.
* config/rs6000/rs6000-builtins.def (__builtin_altivec_vsldoi_v1ti,
__builtin_vsx_xxsldwi_v1ti, __builtin_altivec_vsldb_v1ti,
__builtin_altivec_vsrdb_v1ti): New builtin definitions.
* config/rs6000/rs6000-overload.def (vec_sld, vec_sldb, vec_sldw,
vec_sll, vec_slo, vec_srdb, vec_srl, vec_sro): New overloaded
definitions.
* doc/extend.texi (vec_sld, vec_sldb, vec_sldw, vec_sll, vec_slo,
vec_srdb, vec_srl, vec_sro): Add documentation for new overloaded
built-ins.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/vec-shift-double-runnable-int128.c: New test
file.
|
|
The following avoids lowering of PAREN_EXPR of vectors as unsupported
to scalars. Instead PAREN_EXPR is like a plain move or a VIEW_CONVERT.
PR tree-optimization/116258
* tree-vect-generic.cc (expand_vector_operations_1): Do not
lower PAREN_EXPR.
* gcc.target/i386/pr116258.c: New testcase.
|
|
My sincere apologies for not noticing that g++.dg/other/sse2-pr85572-1.C
was FAILing with my recent ashrv2di patch. I'm not sure how that happened.
Many thanks to Andrew Pinski for alerting me, and confirming that the
changes are harmless/beneficial. Sorry again for the inconvenience.
2024-08-07 Roger Sayle <roger@nextmovesoftware.com>
gcc/testsuite/ChangeLog
* g++.dg/other/sse2-pr85572-1.C: Update expected output after
my recent patch for ashrv2di3. Now with one less instruction.
|
|
We currently ICE upon the following valid code, due to the fix made through
commit 9efe5fbde1e8
=== cut here ===
struct ignore { ignore(...) {} };
template<class... Args>
void InternalCompilerError(Args... args)
{ ignore{ ignore(args) ... }; }
int main() { InternalCompilerError(0, 0); }
=== cut here ===
Change 9efe5fbde1e8 avoids infinite recursion in build_over_call by returning
error_mark_node if one invokes ignore::ignore(...) with an argument of type
ignore, because otherwise we end up calling convert_arg_to_ellipsis for that
argument, and recurse into build_over_call with the exact same parameters.
This patch tightens the condition to only return error_mark_node if there's one
and only one parameter to the call being processed - otherwise we won't
infinitely recurse.
Successfully tested on x86_64-pc-linux-gnu.
PR c++/111592
gcc/cp/ChangeLog:
* call.cc (build_over_call): Only error out if there's a single
parameter of type A in a call to A::A(...).
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/variadic186.C: New test.
|
|
The check was implemented incorrectly, so vec_widen_smult_{even,odd}_M
was never used. This is not good for targets with native even/odd
widening multiplication but not lo/hi multiplication.
The fix is actually developed by Richard Biener.
gcc/ChangeLog:
PR tree-optimization/116142
* tree-vect-stmts.cc (supportable_widening_operation): Remove an
redundant and incorrect vect_reduction_def check, and fix the
operand of another vect_reduction_def check.
gcc/testsuite/ChangeLog:
PR tree-optimization/116142
* gcc.target/i386/pr116142.c: New test.
Co-authored-by: Richard Biener <rguenther@suse.de>
|
|
[PR116175]
When working on unsequenced/reproducible attributes, I've noticed that on
templates for some attributes decl_attributes isn't called at all, so they
are kept in TYPE_ATTRIBUTES without any verification/transformations and
also without argument substitution.
The following patch fixes that for FUNCTION/METHOD_TYPE attributes.
The included testcase ICEs without the pt.cc changes.
2024-08-07 Jakub Jelinek <jakub@redhat.com>
PR c++/116175
* pt.cc (apply_late_template_attributes): For function/method types
call cp_build_type_attribute_variant on the non-dependent attributes.
(rebuild_function_or_method_type): Add ARGS argument. Use
apply_late_template_attributes rather than
cp_build_type_attribute_variant.
(maybe_rebuild_function_decl_type): Add ARGS argument, pass it to
rebuild_function_or_method_type.
(tsubst_function_decl): Adjust caller.
(tsubst_function_type): Adjust rebuild_function_or_method_type caller.
* g++.dg/ext/attr-format4.C: New test.
|
|
Currently the forward threader isn't limited as to the search space
it explores and with it now using path-ranger for simplifying
conditions it runs into it became pretty slow for degenerate cases
like compiling insn-emit.cc for RISC-V esp. when compiling for
a host with LOGICAL_OP_NON_SHORT_CIRCUIT disabled.
The following makes the forward threader honor the search space
limit I introduced for the backward threader. This reduces
compile-time from minutes to seconds for the testcase in PR116166.
Note this wasn't necessary before we had ranger but with ranger
the work we do is quadatic in the length of the threading path
we build up (the same is true for the backwards threader).
PR tree-optimization/116166
* tree-ssa-threadedge.h (jump_threader::thread_around_empty_blocks):
Add limit parameter.
(jump_threader::thread_through_normal_block): Likewise.
* tree-ssa-threadedge.cc (jump_threader::thread_around_empty_blocks):
Honor and decrement limit parameter.
(jump_threader::thread_through_normal_block): Likewise.
(jump_threader::thread_across_edge): Initialize limit from
param_max_jump_thread_paths and pass it down to workers.
|
|
When cleaning up the remaining powerpc_{vsx,altivec}_ok test
cases, I found some issues are related to pr78056-*.c.
Firstly, the test points of pr78056-[246].c are no longer
available since r9-3164 drops many HAVE_AS_* and the expected
warning are dropped together, so this patch is to remove them.
Secondly, pr78056-1.c and pr78056-3.c include altivec.h but
don't use any builtins, checking powerpc_altivec is enough
(don't need to check powerpc_vsx). And pr78056-5.c doesn't
require any altivec/vsx feature, so powerpc_vsx_ok can be
removed. Lastly, pr78056-7.c should just use powerpc_fprs
instead of dfp_hw as it only cares about insn fcpsgn.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr78056-1.c: Check for powerpc_altivec rather than
powerpc_vsx.
* gcc.target/powerpc/pr78056-3.c: Likewise.
* gcc.target/powerpc/pr78056-5.c: Drop powerpc_vsx_ok check.
* gcc.target/powerpc/pr78056-7.c: Check for powerpc_fprs rather than
dfp_hw.
* gcc.target/powerpc/pr78056-2.c: Remove.
* gcc.target/powerpc/pr78056-4.c: Remove.
* gcc.target/powerpc/pr78056-6.c: Remove.
|
|
When cleaning up the remaining powerpc_{vsx,altivec}_ok test
cases, I found some dg-do run test cases which should check
for the appropriate {p8vector,vmx}_hw check instead. This
patch is to adjust them accordingly.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/swaps-p8-46.c: Check for p8vector_hw rather than
powerpc_vsx_ok.
* gcc.target/powerpc/ppc64-abi-2.c: Check for vmx_hw rather than
powerpc_altivec_ok.
* gcc.target/powerpc/pr96139-c.c: Likewise.
|
|
Following up the previous r15-886, this patch to clean up
the remaining powerpc_vsx_ok which actually should use
powerpc_vsx instead.
PR testsuite/114842
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/error-1.c: Replace powerpc_vsx_ok check with
powerpc_vsx.
* gcc.target/powerpc/warn-2.c: Likewise.
* gcc.target/powerpc/fold-vec-logical-ors-longlong.c: Likewise.
* gcc.target/powerpc/ppc-fortran/pr80108-1.f90: Replace powerpc_vsx_ok
check with powerpc_vsx and remove useless -mfloat128.
* gcc.target/powerpc/pragma_power8.c: Replace powerpc_vsx_ok check with
powerpc_vsx.
|
|
This is a follow up patch for the previous patch adjusting
powerpc_vsx_ok with powerpc_vsx, focusing on those test cases
which don't really require VSX feature but used powerpc_vsx_ok
before, they actually require some other effective target check,
like some of them just require ALTIVEC feature, some of them
just require hard float support, and some of them just require
ISA 2.06 etc..
By the way, ppc-fpconv-4.c is the only one missing powerpc_fprs
among ppc-fpconv-*.c after this replacement, so I also fix it
here.
PR testsuite/114842
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/bswap64-2.c: Replace powerpc_vsx_ok check with
has_arch_pwr7.
* gcc.target/powerpc/ppc-fpconv-2.c: Replace powerpc_vsx_ok check with
powerpc_fprs.
* gcc.target/powerpc/ppc-fpconv-6.c: Likewise.
* gcc.target/powerpc/ppc-pow.c: Likewise.
* gcc.target/powerpc/ppc-target-1.c: Likewise.
* gcc.target/powerpc/ppc-target-2.c: Likewise.
* gcc.target/powerpc/ppc-target-3.c: Likewise.
* gcc.target/powerpc/ppc-target-4.c: Likewise.
* gcc.target/powerpc/ppc-fpconv-4.c: Check for powerpc_fprs.
* gcc.target/powerpc/fold-vec-select-char.c: Replace powerpc_vsx_ok
with powerpc_altivec check and move it after dg-options line.
* gcc.target/powerpc/fold-vec-select-float.c: Likewise.
* gcc.target/powerpc/fold-vec-select-int.c: Likewise.
* gcc.target/powerpc/fold-vec-select-short.c: Likewise.
* gcc.target/powerpc/p9-novsx.c: Likewise.
* gcc.target/powerpc/p9-options-1.c: Likewise.
|
|
Checking the existing powerpc_{altivec,vsx}_ok test cases,
I found there are some test cases which don't require the
checks powerpc_{altivec,vsx} even, some of them already
have other effective target check which can cover check
powerpc_{altivec,vsx}, or some of them don't actually
require VSX/AltiVec feature at all. So this patch is to
remove such useless checks.
PR testsuite/114842
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/amo2.c: Remove powerpc_vsx_ok effective target
check as p9vector_hw already covers it.
* gcc.target/powerpc/p9-sign_extend-runnable.c: Likewise.
* gcc.target/powerpc/clone2.c: Remove powerpc_vsx_ok effective target
check as ppc_cpu_supports_hw already covers it.
* gcc.target/powerpc/pr47251.c: Remove powerpc_vsx_ok effective target
check as it doesn't need VSX.
* gcc.target/powerpc/pr60137.c: Likewise.
* gcc.target/powerpc/pr80098-1.c: Likewise.
* gcc.target/powerpc/pr80098-2.c: Likewise.
* gcc.target/powerpc/pr80098-3.c: Likewise.
* gcc.target/powerpc/sd-pwr6.c: Likewise.
* gcc.target/powerpc/pr57744.c: Remove powerpc_vsx_ok effective target
check and option -mvsx as it doesn't need VSX.
* gcc.target/powerpc/pr69548.c: Remove powerpc_vsx_ok effective target
check as it doesn't need VSX, remove lp64 and use int128 instead.
* gcc.target/powerpc/vec-cmpne-long.c: Remove powerpc_vsx_ok effective
target check as p8vector_hw already covers it.
* gcc.target/powerpc/darwin-save-world-1.c: Remove powerpc_altivec_ok
effective target check as vmx_hw already covers it.
|
|
Different from p9vector_hw, vmx_hw/vsx_hw/p8vector_hw checks
can still succeed without Altivec/VSX feature support. We
have many runnable test cases only checking for these *_hw
without extra checking for if Altivec/VSX feature enabled or
not. It means they can fail if being tested by explicitly
disabling Altivec/VSX. So I think it's reasonable to check
if Altivec/VSX feature is enabled too while checking testing
environment is able to execute some instructions since these
instructions reply on these features. So similar to what we
test for p9vector_hw, this patch is to modify C functions
used for vmx_hw, vsx_hw and p8vector_hw with according vector
types and constraints. For p8vector_hw, excepting for VSX
feature, it also requires ISA 2.7 support. A good thing is
that now almost all of the test cases using p8vector_hw have
specified -mdejagnu-cpu=power8 always or if !has_arch_pwr8.
Considering checking _ARCH_PWR8 in p8vector_hw can stop test
cases being tested even if test case itself has specified
-mdejagnu-cpu=power8, this patch doesn't force p8vector_hw to
check _ARCH_PWR8, instead it updates all existing test cases
which adopt p8vector_hw but don't have -mdejagnu-cpu=power8.
By the way, all test cases adopting p9vector_hw are all fine.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp (check_vsx_hw_available): Modify C source
code used for testing with type vector long long and constraint wa
which require VSX feature.
(check_p8vector_hw_available): Likewise.
(check_vmx_hw_available): Modify C source code used for testing with
type vector int and constraint v which require Altivec feature.
* gcc.target/powerpc/divkc3-1.c: Specify -mdejagnu-cpu=power8 for
!has_arch_pwr8 to ensure power8 support.
* gcc.target/powerpc/mulkc3-1.c: Likewise.
* gcc.target/powerpc/pr96264.c: Likewise.
|
|
Update warning test info for RISC-V target, compared on godbolt:
https://godbolt.org/z/Mexd3dfcc
gcc/testsuite/ChangeLog:
* gcc.dg/Wstringop-overflow-47.c: Remove xfail target.
|
|
gcc/testsuite/
* g++.dg/vect/pr115278.cc: Make cast's type agree with
assignment destination WRITE.
|
|
Deduction guides are represented as 'normal' functions currently, and
have no special handling in modules. However, this causes some issues;
by [temp.deduct.guide] a deduction guide is not found by normal name
lookup and instead all reachable deduction guides for a class template
should be considered, but this does not happen currently.
To solve this, this patch ensures that all deduction guides are
considered exported to ensure that they are always visible to importers
if they are reachable. Another alternative here would be to add a new
kind of "all reachable" flag to name lookup, but that is complicated by
some difficulties in handling GM entities; this may be a better way to
go if more kinds of entities end up needing this handling, however.
Another issue here is that because deduction guides are "unrelated"
functions, they will usually get discarded from the GMF, so this patch
ensures that when finding dependencies, GMF deduction guides will also
have bindings created. We do this in find_dependencies so that we don't
unnecessarily create bindings for GMF deduction guides that are never
reached; for consistency we do this for *all* deduction guides, not just
GM ones. We also make sure that the opposite (a deduction guide being
the only purview reference to a GMF template) correctly marks it as
reachable.
Finally, when merging deduction guides from multiple modules, the name
lookup code may now return two-dimensional overload sets, so update
callers to match.
As a small drive-by improvement this patch also updates the error pretty
printing code to add a space before the '->' when printing a deduction
guide, so we get 'S(int) -> S<int>' instead of 'S(int)-> S<int>'.
PR c++/115231
gcc/cp/ChangeLog:
* error.cc (dump_function_decl): Add a space before '->' when
printing deduction guides.
* module.cc (depset::hash::add_binding_entity): Don't create
bindings for guides here, only mark dependencies.
(depset::hash::add_deduction_guides): New.
(depset::hash::find_dependencies): Add deduction guide
dependencies for a class template.
(module_state::write_cluster): Always consider deduction guides
as exported.
* pt.cc (deduction_guides_for): Use 'lkp_iterator' instead of
'ovl_iterator'.
gcc/testsuite/ChangeLog:
* g++.dg/modules/dguide-1_a.C: New test.
* g++.dg/modules/dguide-1_b.C: New test.
* g++.dg/modules/dguide-2_a.C: New test.
* g++.dg/modules/dguide-2_b.C: New test.
* g++.dg/modules/dguide-3_a.C: New test.
* g++.dg/modules/dguide-3_b.C: New test.
* g++.dg/modules/dguide-3_c.C: New test.
* g++.dg/modules/dguide-3_d.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
|
|
When forgetting the '<>' on an explicit specialisation, the suggested
fixit hint suggests to add 'template <>', but naively applying will
cause nonsense results like 'template template <> struct S<int> {};'.
Instead check if we're currently parsing an explicit instantiation, and
if so inform about the issue (an instantiation cannot have a class body)
and suggest a fixit of simply '<>' to create a specialisation instead.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_class_head): Clarify error message for
explicit instantiations.
gcc/testsuite/ChangeLog:
* g++.dg/template/explicit-instantiation9.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
|
|
testsuite/
* gcc.dg/vect/tsvc/tsvc.h (iterations): Allow to override,
default to 10.
(get_expected_result): Add values for iterations counts
10, 256 and 3200.
(run): Add code to output values for new iterations counts.
* gcc.dg/vect/tsvc/vect-tsvc-s1119.c (dg-additional-options):
Add -Diterations=LEN_2D .
* gcc.dg/vect/tsvc/vect-tsvc-s115.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s119.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s125.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s2102.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s2233.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s2275.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s231.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s235.c: Likewise.
* gcc.dg/vect/tsvc/vect-tsvc-s176.c: (dg-additional-options):
Add -Diterations=3200 .
[!run_expensive_tests]: dg-additional-options "-DTRUNCATE_TEST" .
[TRUNCATE_TEST]: Set m to 32.
|
|
The .SAT_TRUNC vect pattern recog is valid when the lhs type has
its mode precision. For example as below, QImode with 1 bit precision
like _Bool is invalid here.
g_12 = (long unsigned int) _2;
_13 = MIN_EXPR <g_12, 1>;
_3 = (_Bool) _13;
The above pattern cannot be recog as .SAT_TRUNC (g_12) because the dest
only has 1 bit precision with QImode mode. Aka the type doesn't have
the mode precision.
The below tests are passed for this patch.
1. The rv64gcv fully regression tests.
2. The x86 bootstrap tests.
3. The x86 fully regression tests.
PR target/116202
gcc/ChangeLog:
* tree-vect-patterns.cc (vect_recog_sat_trunc_pattern): Add the
type_has_mode_precision_p check for the lhs type.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pr116202-run-1.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
Due to recent middle-end change, update the .SAT_TRUNC expand dump
check from 2 to 4.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-1.c: Adjust
asm check times from 2 to 4.
Signed-off-by: Pan Li <pan2.li@intel.com>
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
In recent versions of GCC we've been diagnosing more and more kinds of
errors inside a template ahead of time. This is a largely good thing
because it catches bugs, typos, dead code etc sooner.
But if the template never gets instantiated then such errors are harmless
and can be inconvenient to work around if say the code in question is
third party and in maintenance mode. So it'd be handy to be able to
prevent these template errors from rendering the entire TU uncompilable.
(Note that such code is "ill-formed no diagnostic required" according
the standard.)
To that end this patch turns any errors issued within a template into
permerrors associated with a new -Wtemplate-body flag so that they can
be downgraded via e.g. -fpermissive or -Wno-error=template-body. If
the template containing a downgraded error later needs to be instantiated,
we'll issue an error then. But if the template never gets instantiated
then the downgraded error won't affect validity of the rest of the TU.
This is implemented via a diagnostic hook that gets called for each
diagnostic, and which adjusts an error diagnostic appropriately if we
detect it's occurring from a template context, and additionally flags
the template as erroneous.
For example the new testcase permissive-error1a.C gives:
gcc/testsuite/g++.dg/template/permissive-error1a.C: In function 'void f()':
gcc/testsuite/g++.dg/template/permissive-error1a.C:7:5: warning: increment of read-only variable 'n' [-Wtemplate-body]
7 | ++n;
| ^
...
gcc/testsuite/g++.dg/template/permissive-error1a.C: In instantiation of 'void f() [with T = int]':
gcc/testsuite/g++.dg/template/permissive-error1a.C:26:9: required from here
26 | f<int>();
| ~~~~~~^~
gcc/testsuite/g++.dg/template/permissive-error1a.C:5:6: error: instantiating erroneous template
5 | void f() {
| ^
gcc/testsuite/g++.dg/template/permissive-error1a.C:7:5: note: first error appeared here
7 | ++n; // {
| ^
...
PR c++/116064
gcc/c-family/ChangeLog:
* c.opt (Wtemplate-body): New warning.
gcc/cp/ChangeLog:
* cp-tree.h (erroneous_templates_t): Declare.
(erroneous_templates): Declare.
(cp_seen_error): Declare.
(seen_error): #define to cp_seen_error.
* error.cc (get_current_template): Define.
(relaxed_template_errors): Define.
(cp_adjust_diagnostic_info): Define.
(cp_seen_error): Define.
(cxx_initialize_diagnostics): Set
diagnostic_context::m_adjust_diagnostic_info.
* module.cc (finish_module_processing): Don't write the
module if it contains an erroneous template.
* pt.cc (maybe_diagnose_erroneous_template): Define.
(instantiate_class_template): Call it.
(instantiate_decl): Likewise.
gcc/ChangeLog:
* diagnostic.cc (diagnostic_context::initialize): Set
m_adjust_diagnostic_info.
(diagnostic_context::report_diagnostic): Call
m_adjust_diagnostic_info.
* diagnostic.h (diagnostic_context::m_adjust_diagnostic_info):
New data member.
* doc/invoke.texi (-Wno-template-body): Document.
(-fpermissive): Mention -Wtemplate-body.
gcc/testsuite/ChangeLog:
* g++.dg/ext/typedef-init.C: Downgrade error inside template
to warning due to -fpermissive.
* g++.dg/pr84492.C: Likewise.
* g++.old-deja/g++.pt/crash51.C: Remove unneeded dg-options.
* g++.dg/template/permissive-error1.C: New test.
* g++.dg/template/permissive-error1a.C: New test.
* g++.dg/template/permissive-error1b.C: New test.
* g++.dg/template/permissive-error1c.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
|
|
PR fortran/109105
gcc/fortran/ChangeLog:
* resolve.cc (CHECK_INTERFACES): New helper macro.
(resolve_operator): Replace use of snprintf with
gfc_error.
|
|
After r15-2414-g2d105efd6f60 which fixed the dg-do directive, the testcase
stopped working because there was a missing -save-temps. This adds that and
now the testcase passes again.
Pushed as obvious.
gcc/testsuite/ChangeLog:
PR testsuite/116207
* gcc.target/aarch64/simd/vmmla.c: Add -save-temps to the
options.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
Previously the invocation's "executionSuccessful" property (§3.20.14)
was only false if there was an ICE.
Update it so that it will also be false if we will exit with a non-zero
exit code (due to errors, Werror, and "sorry").
gcc/ChangeLog:
PR other/116177
* diagnostic-format-sarif.cc (sarif_invocation::prepare_to_flush):
If the diagnostics would lead to us exiting with a failure code,
then emit "executionSuccessful": False (SARIF v2.1.0 section
§3.20.14).
* diagnostic.cc (diagnostic_context::execution_failed_p): New.
* diagnostic.h (diagnostic_context::execution_failed_p): New decl.
* toplev.cc (toplev::main): Use it for determining returned value.
gcc/testsuite/ChangeLog:
PR other/116177
* gcc.dg/sarif-output/include-chain-2.c: Remove pruning of
"exit status is 1", as we expect this to exit with 0.
* gcc.dg/sarif-output/no-diagnostics.c: New test.
* gcc.dg/sarif-output/test-include-chain-1.py
(test_execution_unsuccessful): Add.
* gcc.dg/sarif-output/test-include-chain-2.py
(test_execution_successful): Add.
* gcc.dg/sarif-output/test-missing-semicolon.py
(test_execution_unsuccessful): Add.
* gcc.dg/sarif-output/test-no-diagnostics.py: New test.
* gcc.dg/sarif-output/test-werror.py: New test.
* gcc.dg/sarif-output/werror.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
Gather and scatters are not usually beneficial when the loop count is small.
This is because there's not only a cost to their execution within the loop but
there is also some cost to enter loops with them.
As such this patch models this overhead. For generic tuning we however still
prefer gathers/scatters when the loop costs work out.
gcc/ChangeLog:
* config/aarch64/aarch64-protos.h (struct sve_vec_cost): Add
gather_load_x32_init_cost and gather_load_x64_init_cost.
* config/aarch64/aarch64.cc (aarch64_vector_costs): Add
m_sve_gather_scatter_init_cost.
(aarch64_vector_costs::add_stmt_cost): Use them.
(aarch64_vector_costs::finish_cost): Likewise.
* config/aarch64/tuning_models/a64fx.h: Update.
* config/aarch64/tuning_models/cortexx925.h: Update.
* config/aarch64/tuning_models/generic.h: Update.
* config/aarch64/tuning_models/generic_armv8_a.h: Update.
* config/aarch64/tuning_models/generic_armv9_a.h: Update.
* config/aarch64/tuning_models/neoverse512tvb.h: Update.
* config/aarch64/tuning_models/neoversen2.h: Update.
* config/aarch64/tuning_models/neoversen3.h: Update.
* config/aarch64/tuning_models/neoversev1.h: Update.
* config/aarch64/tuning_models/neoversev2.h: Update.
* config/aarch64/tuning_models/neoversev3.h: Update.
* config/aarch64/tuning_models/neoversev3ae.h: Update.
|
|
gcc:
* doc/gm2.texi (Limitations): Rephrase. Remove invalid link.
|
|
The constant C must be an integral multiple of the shift value in
the above optimization. Non integral values can occur evaluating
IMAGPART_EXPR when the shadd constant is 8 and we have SFmode.
2024-08-06 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
PR target/113384
* config/pa/pa.cc (hppa_legitimize_address): Add check to
ensure constant is an integral multiple of shift the value.
|
|
This fixes typos in function names and executed code.
gcc/ChangeLog:
* config/riscv/riscv-target-attr.cc (num_occurences_in_str): Rename...
(num_occurrences_in_str): here.
(riscv_process_target_attr): Update num_occurences_in_str callsite.
* config/riscv/riscv-v.cc (emit_vec_widden_cvt_x_f): widden -> widen.
(emit_vec_widen_cvt_x_f): Ditto.
(emit_vec_widden_cvt_f_f): Ditto.
(emit_vec_widen_cvt_f_f): Ditto.
(emit_vec_rounding_to_integer): Update *widden* callsites.
* config/riscv/riscv-vector-builtins.cc (expand_builtin): Update
required_ext_to_isa_name callsite and fix xtheadvector typo.
* config/riscv/riscv-vector-builtins.h (reqired_ext_to_isa_name): Rename...
(required_ext_to_isa_name): here.
* config/riscv/riscv_th_vector.h: Fix endif label.
* config/riscv/vector-crypto.md: boardcast_scalar -> broadcast_scalar.
* config/riscv/vector.md: Ditto.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
This fixes most of the typos I found when reading various parts of the RISC-V
backend.
gcc/ChangeLog:
* config/riscv/arch-canonicalize: Fix typos in comments.
* config/riscv/autovec.md: Ditto.
* config/riscv/riscv-avlprop.cc (avl_can_be_propagated_p): Ditto.
(pass_avlprop::get_vlmax_ta_preferred_avl): Ditto.
* config/riscv/riscv-modes.def (ADJUST_FLOAT_FORMAT): Ditto.
(VLS_MODES): Ditto.
* config/riscv/riscv-opts.h (TARGET_ZICOND_LIKE): Ditto.
(enum rvv_vector_bits_enum): Ditto.
* config/riscv/riscv-protos.h (enum insn_flags): Ditto.
(enum insn_type): Ditto.
* config/riscv/riscv-sr.cc (riscv_sr_match_epilogue): Ditto.
* config/riscv/riscv-string.cc (expand_block_move): Ditto.
* config/riscv/riscv-v.cc (rvv_builder::is_repeating_sequence): Ditto.
(rvv_builder::single_step_npatterns_p): Ditto.
(calculate_ratio): Ditto.
(expand_const_vector): Ditto.
(shuffle_merge_patterns): Ditto.
(shuffle_compress_patterns): Ditto.
(expand_select_vl): Ditto.
* config/riscv/riscv-vector-builtins-functions.def (REQUIRED_EXTENSIONS): Ditto.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.
* config/riscv/riscv-vector-builtins.cc (function_builder::add_function): Ditto.
(resolve_overloaded_builtin): Ditto.
* config/riscv/riscv-vector-builtins.def (vbool1_t): Ditto.
(vuint8m8_t): Ditto.
(vuint16m8_t): Ditto.
(vfloat16m8_t): Ditto.
(unsigned_vector): Ditto.
* config/riscv/riscv-vector-builtins.h (enum required_ext): Ditto.
* config/riscv/riscv-vector-costs.cc (get_store_value): Ditto.
(costs::analyze_loop_vinfo): Ditto.
(costs::add_stmt_cost): Ditto.
* config/riscv/riscv.cc (riscv_build_integer): Ditto.
(riscv_vector_type_p): Ditto.
* config/riscv/thead.cc (th_mempair_output_move): Ditto.
* config/riscv/thead.md: Ditto.
* config/riscv/vector-iterators.md: Ditto.
* config/riscv/vector.md: Ditto.
* config/riscv/zc.md: Ditto.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
Patrick noticed a few more concept_check_p checks that can be removed
now.
gcc/cp/ChangeLog:
* constexpr.cc (cxx_eval_call_expression): Remove concept_check_p check.
(cxx_eval_outermost_constant_expr): Likewise.
* cp-gimplify.cc (cp_genericize_r) <case CALL_EXPR>: Likewise.
* except.cc (check_noexcept_r): Likewise.
|
|
Building on the last patch, deduction should probably look through all
IMPLICIT_CONV_EXPR like we do other conversions.
One resulting regression turned out to be due to PR94568, fixed separately.
The one other regression was for a seeming mismatch between a function and
its address, handled here. Before this change we treated the
IMPLICIT_CONV_EXPR as dependent because the template parameter has dependent
type.
PR c++/116223
gcc/cp/ChangeLog:
* pt.cc (deducible_expression): Strip all IMPLICIT_CONV_EXPR.
(unify): Likewise. Handle resulting function/addr mismatch.
|
|
My r14-8291 for PR112632 introduced IMPLICIT_CONV_EXPR_FORCED to express
conversions to the type of an alias template parameter. In this example,
that broke deduction of X in the call to foo, so let's teach deduction to
look through it.
PR c++/116223
PR c++/112632
gcc/cp/ChangeLog:
* pt.cc (deducible_expression): Also look through
IMPLICIT_CONV_EXPR_FORCED.
(unify): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/nontype-auto25.C: New test.
|
|
A zero-initializer should not reflect the constness of what it's
initializing, as it does not for initializers with different syntax.
This does have mangling implications for rare C++20 code, but it seems
infeasable to make the mangling depend on -fabi-version while fixing the
semantic bug, and C++20 is still experimental anyway.
PR c++/94568
gcc/cp/ChangeLog:
* init.cc (build_zero_init_1): Call cv_unqualified.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/nontype-class36.C: Remove xfail.
* g++.dg/cpp2a/nontype-class37.C: Remove xfail.
* g++.dg/cpp1z/nontype-auto26.C: New test.
|
|
This patch refactors ashrv2di RTL expansion into a function so that it may
be reused by a pre-reload splitter, such that DImode right shifts may be
considered candidates during the Scalar-To-Vector (STV) pass. Currently
DImode arithmetic right shifts are not considered potential candidates
during STV, so for the following testcase:
long long m;
typedef long long v2di __attribute__((vector_size (16)));
void foo(v2di x) { m = x[0]>>63; }
We currently see the following warning/error during STV2
> r101 use in insn 7 isn't convertible
And end up generating scalar code with an interunit move:
foo: movq %xmm0, %rax
sarq $63, %rax
movq %rax, m(%rip)
ret
With this patch, we can reuse the RTL expansion logic and produce:
foo: psrad $31, %xmm0
pshufd $245, %xmm0, %xmm0
movq %xmm0, m(%rip)
ret
Or with the addition of -mavx2, the equivalent:
foo: vpxor %xmm1, %xmm1, %xmm1
vpcmpgtq %xmm0, %xmm1, %xmm0
vmovq %xmm0, m(%rip)
ret
The only design decision of note is the choice to continue lowering V2DI
into vector sequences during RTL expansion, to enable combine to optimize
things if possible. Using just define_insn_and_split potentially misses
optimizations, such as reusing the zero vector produced by vpxor above.
It may be necessary to tweak STV's compute gain at some point, but this
patch controls what's possible (rather than what's beneficial).
2024-08-06 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386-expand.cc (ix86_expand_v2di_ashiftrt): New
function refactored from define_expand ashrv2di3.
* config/i386/i386-features.cc (general_scalar_to_vector_candidate_p)
<case ASHIFTRT>: Handle like other shifts and rotates.
* config/i386/i386-protos.h (ix86_expand_v2di_ashiftrt): Prototype.
* config/i386/sse.md (ashrv2di3): Call ix86_expand_v2di_ashiftrt.
(*ashrv2di3): New define_insn_and_split to enable creation by stv2
pass, and splitting during split1 reusing ix86_expand_v2di_ashiftrt.
gcc/testsuite/ChangeLog
* gcc.target/i386/sse2-stv-2.c: New test case.
|