Age  Commit message  Author  Files  Lines
2024-09-20c++: CWG 2789 and usings [PR116492]Patrick Palka3-45/+49
After CWG 2789, the "more constrained" tiebreaker for non-template functions should exclude member functions that are defined in different classes. This patch implements this missing refinement. In turn we can get rid of the four-parameter version of object_parms_correspond and call the main overload directly, since correspondence is now only checked for members from the same class. PR c++/116492 DR 2789 gcc/cp/ChangeLog: * call.cc (object_parms_correspond): Remove. (cand_parms_match): Return false for member functions that come from different classes. Adjust call to object_parms_correspond. (joust): Update comment for the non-template "more constrained" case. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-memfun4.C: Also compile in C++20 mode. Expect ambiguity when candidates come from different classes. * g++.dg/cpp2a/concepts-inherit-ctor12.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>
2024-09-20c++: CWG 2273 and non-constructorsPatrick Palka3-12/+9
Our implementation of the CWG 2273 inheritedness tiebreaker seems to be incorrectly considering all member functions introduced via using, not just constructors. This patch restricts the tiebreaker accordingly. DR 2273 gcc/cp/ChangeLog: * call.cc (joust): Restrict inheritedness tiebreaker to constructors. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/using1.C: Expect ambiguity for non-constructor call. * g++.dg/overload/using5.C: Likewise. Reviewed-by: Jason Merrill <jason@redhat.com>
2024-09-20AArch64: Define VECTOR_STORE_FLAG_VALUE.Tamar Christina2-0/+39
This defines VECTOR_STORE_FLAG_VALUE to CONST1_RTX for AArch64 so we simplify vector comparisons in AArch64.

With this enabled

res:
        movi    v0.4s, 0
        cmeq    v0.4s, v0.4s, v0.4s
        ret

is simplified to:

res:
        mvni    v0.4s, 0
        ret

gcc/ChangeLog:
        * config/aarch64/aarch64.h (VECTOR_STORE_FLAG_VALUE): New.

gcc/testsuite/ChangeLog:
        * gcc.dg/rtl/aarch64/vector-eq.c: New test.
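As a rough source-level illustration of the comparison being simplified, here is a minimal sketch using the GNU vector extension. The names `v4si` and `compare_zero_with_itself` are hypothetical and are not taken from the new test, which is written at the RTL level; whether a given compiler version emits exactly the `mvni` sequence for this function is not something the sketch asserts.

```cpp
#include <cstdint>

// GNU vector extension: four 32-bit lanes (assumed available, as in GCC and Clang).
typedef std::int32_t v4si __attribute__((vector_size(16)));

// Comparing a zero vector with itself is true in every lane. With the GNU
// vector extension a true lane is represented as -1 (all bits set), which is
// the value the simplified "mvni v0.4s, 0" materialises directly.
v4si compare_zero_with_itself() {
  v4si zero = {0, 0, 0, 0};
  return zero == zero;
}
```

Every lane of the result is all-ones, so the comparison can in principle be replaced by a constant.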
2024-09-20testsuite: Update commandline for PR116628.c to use neoverse-v2 [PR116628]Tamar Christina1-1/+1
The testcase for this PR needs Neoverse V2 to be used, since due to costing the other cost models don't pick this particular SVE mode. Committed as obvious. Thanks, Tamar gcc/testsuite/ChangeLog: PR tree-optimization/116628 * gcc.dg/vect/pr116628.c: Update cmdline.
2024-09-20Darwin: Allow for as versions that need '-' for std in.Iain Sandoe1-0/+2
Recent versions of the Xcode assembler ('as') require a dash to read from standard input. We can use this on all supported OS versions, so make it unconditional. Patch from Mark Mentovai. gcc/ChangeLog: * config/darwin.h (AS_NEEDS_DASH_FOR_PIPED_INPUT): New. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
2024-09-20c++, coroutines: Rework the ramp codegen.Iain Sandoe1-138/+150
Now that we have separated the codegen of the ramp, actor and destroy functions, we no longer need to manage the scopes for variables manually. This introduces a helper function that allows us to build a local var with a DECL_VALUE_EXPR that relates to the coroutine state frame entry. This fixes a latent issue where we would generate guard vars when exceptions were disabled. gcc/cp/ChangeLog: * coroutines.cc (coro_build_artificial_var_with_dve): New. (coro_build_and_push_artificial_var): New. (coro_build_and_push_artificial_var_with_dve): New. (analyze_fn_parms): Ensure that frame entries cannot clash with local variables. (build_coroutine_frame_delete_expr): Amend comment. (cp_coroutine_transform::build_ramp_function): Rework to avoid manual management of variables and scopes. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
2024-09-20Fall back to elementwise access for too spaced SLP single element interleavingRichard Biener2-16/+23
gcc.dg/vect/vect-pr111779.c is a case where non-SLP manages to vectorize using VMAT_ELEMENTWISE but SLP currently refuses because doing a regular access with permutes would cause excess vector loads with at most one element used. The following makes us fall back to elementwise accesses for that, too. * tree-vect-stmts.cc (get_group_load_store_type): Fall back to VMAT_ELEMENTWISE when single element interleaving of a too large group. (vectorizable_load): Do not try to verify load permutations when using VMAT_ELEMENTWISE for single-lane SLP and fix code generation for this case. * gfortran.dg/vect/vect-8.f90: Allow one more vectorized loop.
2024-09-20Handle patterns as SLP roots of only live stmtsRichard Biener1-0/+1
gcc.dg/vect/vect-live-2.c shows it's important to handle live but otherwise unused pattern stmts. * tree-vect-slp.cc (vect_analyze_slp): Lookup patterns when discovering from only-live roots.
2024-09-20s390: Remove -m{,no-}lra optionStefan Schulze Frielinghaus5-72/+0
Since the old reload pass is about to be removed and we defaulted to LRA for over a decade, remove option -m{,no-}lra. PR target/113953 gcc/ChangeLog: * config/s390/s390.cc (s390_lra_p): Remove. (TARGET_LRA_P): Remove. * config/s390/s390.opt (mlra): Remove. * config/s390/s390.opt.urls (mlra): Remove. gcc/testsuite/ChangeLog: * gcc.target/s390/TI-constants-nolra.c: Removed. * gcc.target/s390/pr79895.c: Removed.
2024-09-20testsuite/116397 - avoid looking for "VEC_PERM_EXPR"Richard Biener1-1/+1
With SLP this token appears a lot; when looking for what gets code generated, look for " = VEC_PERM_EXPR" instead. PR testsuite/116397 * gcc.dg/vect/slp-reduc-3.c: Look for " = VEC_PERM_EXPR" instead of "VEC_PERM_EXPR".
2024-09-20Fix small thinko in IPA mod/ref passEric Botcazou2-1/+36
When a memory copy operation is analyzed by analyze_ssa_name, if both the load and store are made through the same SSA name, the store is overlooked. gcc/ * ipa-modref.cc (modref_eaf_analysis::analyze_ssa_name): Always process both the load and the store of a memory copy operation. gcc/testsuite/ * gcc.dg/ipa/modref-4.c: New test.
2024-09-20OpenMP: Add get_device_from_uid/omp_get_uid_from_device routinesTobias Burnus18-7/+384
Those TR13/OpenMP 6.0 routines permit a reproducible offloading to a specific device by mapping an OpenMP device number to a unique ID (UID). The GPU device UIDs should be universally unique, the one for the host is not. gcc/ChangeLog: * omp-general.cc (omp_runtime_api_procname): Add get_device_from_uid and omp_get_uid_from_device routines. include/ChangeLog: * cuda/cuda.h (cuDeviceGetUuid): Declare. (cuDeviceGetUuid_v2): Add prototype. libgomp/ChangeLog: * config/gcn/target.c (omp_get_uid_from_device, omp_get_device_from_uid): Add stub implementation. * config/nvptx/target.c (omp_get_uid_from_device, omp_get_device_from_uid): Likewise. * fortran.c (omp_get_uid_from_device_, omp_get_uid_from_device_8_): New functions. * libgomp-plugin.h (GOMP_OFFLOAD_get_uid): Add prototype. * libgomp.h (struct gomp_device_descr): Add 'uid' and 'get_uid_func'. * libgomp.map (GOMP_6.0): New, including the new UID routines. * libgomp.texi (OpenMP Technical Report 13): Mark UID routines as 'Y'. (Device Information Routines): Document new UID routines. (Offload-Target Specifics): Document UID format. * omp.h.in (omp_get_device_from_uid, omp_get_uid_from_device): New prototype. * omp_lib.f90.in (omp_get_device_from_uid, omp_get_uid_from_device): New interface. * omp_lib.h.in: Likewise. * plugin/cuda-lib.def: Add cuDeviceGetUuid and cuDeviceGetUuid_v2 via CUDA_ONE_CALL_MAYBE_NULL. * plugin/plugin-gcn.c (GOMP_OFFLOAD_get_uid): New. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_uid): New. * target.c (str_omp_initial_device): New static var. (STR_OMP_DEV_PREFIX): Define. (gomp_get_uid_for_device, omp_get_uid_from_device, omp_get_device_from_uid): New. (gomp_load_plugin_for_device): DLSYM_OPT the function 'get_uid'. (gomp_target_init): Set the device's 'uid' field to NULL. * testsuite/libgomp.c/device_uid.c: New test. * testsuite/libgomp.fortran/device_uid.f90: New test.
2024-09-20testsuite: fix target-specific 'do-' typosSam James4-4/+4
Fix some target-specific 'do-' (rather than 'dg-') typos. gcc/testsuite/ChangeLog: * gcc.target/m68k/pr108640.c: Fix dg directive typo. * gcc.target/m68k/pr110934.c: Ditto. * gcc.target/m68k/pr82420.c: Ditto. * gcc.target/powerpc/pr99708.c: Ditto.
2024-09-20i386: Fix up _mm_min_ss etc. handling of zeros and NaNs [PR116738]Jakub Jelinek3-1/+71
The min/max patterns for intrinsics which on x86 return the second input operand if both operands are zeros, or if one or both of them are NaN, shouldn't use SMIN/SMAX RTL because, as with MIN_EXPR/MAX_EXPR, the result of SMIN/SMAX is undefined in those cases. The following patch adds an expander which uses either a new pattern with UNSPEC_IEEE_M{AX,IN} or the S{MIN,MAX} representation of the same. 2024-09-20 Uros Bizjak <ubizjak@gmail.com> Jakub Jelinek <jakub@redhat.com> PR target/116738 * config/i386/subst.md (mask_scalar_operand_arg34, mask_scalar_expand_op3, round_saeonly_scalar_mask_arg3): New subst attributes. * config/i386/sse.md (<sse>_vm<code><mode>3<mask_scalar_name><round_saeonly_scalar_name>): Change from define_insn to define_expand, rename the old define_insn to ... (*<sse>_vm<code><mode>3<mask_scalar_name><round_saeonly_scalar_name>): ... this. (<sse>_ieee_vm<ieee_maxmin><mode>3<mask_scalar_name><round_saeonly_scalar_name>): New define_insn. * gcc.target/i386/sse-pr116738.c: New test.
2024-09-20testsuite/116784 - match up SLP scan and vectorized scanRichard Biener1-2/+2
The test used vect_perm_short for the vectorized scanning but vect_perm3_short for whether that's done with SLP. We're now generally expecting SLP to be used - even as fallback, so the following adjusts both to match up, fixing the powerpc64 reported testsuite issue. PR testsuite/116784 * gcc.dg/vect/slp-perm-9.c: Use vect_perm_short also for the SLP check.
2024-09-20testsuite: debug: fix errant whitespaceSam James1-1/+0
I added some whitespace unintentionally in r15-3723-g284c03ec79ec20, fix that. gcc/testsuite/ChangeLog: * gcc.dg/debug/btf/btf-datasec-1.c: Fix whitespace.
2024-09-20testsuite: fix 'do-do' typosSam James5-8/+8
Fix 'do-do' typos (should be 'dg-do'). No change in logs. gcc/testsuite/ChangeLog: * g++.dg/other/operator2.C: Fix dg-do directive. * gcc.dg/Warray-bounds-67.c: Ditto. * gcc.dg/cpp/builtin-macro-1.c: Ditto. * gcc.dg/tree-ssa/builtin-snprintf-3.c: Ditto. * obj-c++.dg/empty-private-1.mm: Ditto.
2024-09-20Remove PHI_RESULT_PTR and change some PHI_RESULT to be gimple_phi_result [PR116643]Andrew Pinski2-6/+5
There were only a few uses of PHI_RESULT_PTR, so let's remove it and use gimple_phi_result_ptr or gimple_phi_result directly instead. Since I was modifying ssa-iterators.h for the use of PHI_RESULT_PTR, change the use of PHI_RESULT there to be gimple_phi_result instead. This also removes one extra indirection that was done for PHI_RESULT so stage2 building should be slightly faster. Bootstrapped and tested on x86_64-linux-gnu. PR middle-end/116643 gcc/ChangeLog: * ssa-iterators.h (single_phi_def): Use gimple_phi_result instead of PHI_RESULT. (op_iter_init_phidef): Use gimple_phi_result/gimple_phi_result_ptr instead of PHI_RESULT/PHI_RESULT_PTR. * tree-ssa-operands.h (PHI_RESULT_PTR): Remove. (PHI_RESULT): Use gimple_phi_result directly. (SET_PHI_RESULT): Use gimple_phi_result_ptr directly. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-09-20testsuite: debug: fix dejagnu directive syntaxSam James55-55/+56
In this case, they were all harmless in reality (no diff in test logs). gcc/testsuite/ChangeLog: * gcc.dg/debug/btf/btf-array-1.c: Fix dg-do directive syntax. * gcc.dg/debug/btf/btf-bitfields-1.c: Ditto. * gcc.dg/debug/btf/btf-bitfields-2.c: Ditto. * gcc.dg/debug/btf/btf-datasec-1.c: Ditto. * gcc.dg/debug/btf/btf-union-1.c: Ditto. * gcc.dg/debug/ctf/ctf-anonymous-struct-1.c: Ditto. * gcc.dg/debug/ctf/ctf-anonymous-union-1.c: Ditto. * gcc.dg/debug/ctf/ctf-array-1.c: Ditto. * gcc.dg/debug/ctf/ctf-array-2.c: Ditto. * gcc.dg/debug/ctf/ctf-array-4.c: Ditto. * gcc.dg/debug/ctf/ctf-array-5.c: Ditto. * gcc.dg/debug/ctf/ctf-array-6.c: Ditto. * gcc.dg/debug/ctf/ctf-attr-mode-1.c: Ditto. * gcc.dg/debug/ctf/ctf-attr-used-1.c: Ditto. * gcc.dg/debug/ctf/ctf-bitfields-1.c: Ditto. * gcc.dg/debug/ctf/ctf-bitfields-2.c: Ditto. * gcc.dg/debug/ctf/ctf-bitfields-3.c: Ditto. * gcc.dg/debug/ctf/ctf-bitfields-4.c: Ditto. * gcc.dg/debug/ctf/ctf-complex-1.c: Ditto. * gcc.dg/debug/ctf/ctf-cvr-quals-1.c: Ditto. * gcc.dg/debug/ctf/ctf-cvr-quals-2.c: Ditto. * gcc.dg/debug/ctf/ctf-cvr-quals-3.c: Ditto. * gcc.dg/debug/ctf/ctf-cvr-quals-4.c: Ditto. * gcc.dg/debug/ctf/ctf-enum-1.c: Ditto. * gcc.dg/debug/ctf/ctf-enum-2.c: Ditto. * gcc.dg/debug/ctf/ctf-file-scope-1.c: Ditto. * gcc.dg/debug/ctf/ctf-float-1.c: Ditto. * gcc.dg/debug/ctf/ctf-forward-1.c: Ditto. * gcc.dg/debug/ctf/ctf-forward-2.c: Ditto. * gcc.dg/debug/ctf/ctf-func-index-1.c: Ditto. * gcc.dg/debug/ctf/ctf-function-pointers-1.c: Ditto. * gcc.dg/debug/ctf/ctf-function-pointers-2.c: Ditto. * gcc.dg/debug/ctf/ctf-function-pointers-3.c: Ditto. * gcc.dg/debug/ctf/ctf-function-pointers-4.c: Ditto. * gcc.dg/debug/ctf/ctf-functions-1.c: Ditto. * gcc.dg/debug/ctf/ctf-int-1.c: Ditto. * gcc.dg/debug/ctf/ctf-objt-index-1.c: Ditto. * gcc.dg/debug/ctf/ctf-pointers-1.c: Ditto. * gcc.dg/debug/ctf/ctf-pointers-2.c: Ditto. * gcc.dg/debug/ctf/ctf-preamble-1.c: Ditto. * gcc.dg/debug/ctf/ctf-str-table-1.c: Ditto. * gcc.dg/debug/ctf/ctf-struct-1.c: Ditto. 
* gcc.dg/debug/ctf/ctf-struct-2.c: Ditto. * gcc.dg/debug/ctf/ctf-struct-array-1.c: Ditto. * gcc.dg/debug/ctf/ctf-struct-array-2.c: Ditto. * gcc.dg/debug/ctf/ctf-typedef-1.c: Ditto. * gcc.dg/debug/ctf/ctf-typedef-2.c: Ditto. * gcc.dg/debug/ctf/ctf-typedef-3.c: Ditto. * gcc.dg/debug/ctf/ctf-typedef-struct-1.c: Ditto. * gcc.dg/debug/ctf/ctf-typedef-struct-2.c: Ditto. * gcc.dg/debug/ctf/ctf-typedef-struct-3.c: Ditto. * gcc.dg/debug/ctf/ctf-union-1.c: Ditto. * gcc.dg/debug/ctf/ctf-variables-1.c: Ditto. * gcc.dg/debug/ctf/ctf-variables-2.c: Ditto. * gcc.dg/debug/ctf/ctf-variables-3.c: Ditto.
2024-09-19c-family: regenerate c.opt.urlsMarek Polacek1-0/+3
I forgot again. gcc/c-family/ChangeLog: * c.opt.urls: Regenerate.
2024-09-19c++: deleting explicitly-defaulted functions [PR116162]Marek Polacek22-40/+359
This PR points out that we're not implementing [dcl.fct.def.default] properly. Consider e.g. struct C { C(const C&&) = default; }; where we wrongly emit an error, but the move ctor should be just =deleted. According to [dcl.fct.def.default], if the type of the special member function differs from the type of the corresponding special member function that would have been implicitly declared in a way other than as allowed by 2.1-4, the function is defined as deleted. There's an exception for assignment operators in which case the program is ill-formed. clang++ has a warning for when we delete an explicitly-defaulted function so this patch adds it too. When the code is ill-formed, we emit an error in all modes. Otherwise, we emit a pedwarn in C++17 and a warning in C++20. PR c++/116162 gcc/c-family/ChangeLog: * c.opt (Wdefaulted-function-deleted): New. gcc/cp/ChangeLog: * class.cc (check_bases_and_members): Don't set DECL_DELETED_FN here, leave it to defaulted_late_check. * cp-tree.h (maybe_delete_defaulted_fn): Declare. (defaulted_late_check): Add a tristate parameter. * method.cc (maybe_delete_defaulted_fn): New. (defaulted_late_check): Add a tristate parameter. Call maybe_delete_defaulted_fn instead of giving an error. gcc/ChangeLog: * doc/invoke.texi: Document -Wdefaulted-function-deleted. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/defaulted15.C: Add dg-warning/dg-error. * g++.dg/cpp0x/defaulted51.C: Likewise. * g++.dg/cpp0x/defaulted52.C: Likewise. * g++.dg/cpp0x/defaulted53.C: Likewise. * g++.dg/cpp0x/defaulted54.C: Likewise. * g++.dg/cpp0x/defaulted56.C: Likewise. * g++.dg/cpp0x/defaulted57.C: Likewise. * g++.dg/cpp0x/defaulted58.C: Likewise. * g++.dg/cpp0x/defaulted59.C: Likewise. * g++.dg/cpp0x/defaulted63.C: New test. * g++.dg/cpp0x/defaulted64.C: New test. * g++.dg/cpp0x/defaulted65.C: New test. * g++.dg/cpp0x/defaulted66.C: New test. * g++.dg/cpp0x/defaulted67.C: New test. * g++.dg/cpp0x/defaulted68.C: New test. * g++.dg/cpp0x/defaulted69.C: New test. 
* g++.dg/cpp23/defaulted1.C: New test.
2024-09-19Update cpplib zh_CN.poJoseph Myers1-198/+121
* zh_CN.po: Update.
2024-09-19Update gcc zh_CN.poJoseph Myers1-280/+181
* zh_CN.po: Update.
2024-09-19dwarf2asm: Use constexpr for eh_data_format_name initialization for C++14Jakub Jelinek1-2/+17
Similarly to the previous patch, dwarf2asm.cc had HAVE_DESIGNATED_INITIALIZERS support, and as fallback a huge switch. The switch from what I can see is expanded as a jump table with 256 label pointers and code at those labels then loads addresses of string literals. The following patch instead uses a table with 256 const char * pointers, NULL for ICE, non-NULL for returning something, similarly to the HAVE_DESIGNATED_INITIALIZERS case. 2024-09-19 Jakub Jelinek <jakub@redhat.com> * dwarf2asm.cc (eh_data_format_name): Use constexpr initialization of format_names table for C++14 instead of a large switch.
2024-09-19RISC-V: Add testcases for form 2 of signed scalar SAT_ADDPan Li9-0/+199
This patch adds testcases of the signed scalar SAT_ADD for form 2. Aka:

Form 2:
  #define DEF_SAT_S_ADD_FMT_2(T, UT, MIN, MAX) \
  T __attribute__((noinline))                  \
  sat_s_add_##T##_fmt_2 (T x, T y)             \
  {                                            \
    T sum = (UT)x + (UT)y;                     \
    if ((x ^ y) < 0 || (sum ^ x) >= 0)         \
      return sum;                              \
    return x < 0 ? MIN : MAX;                  \
  }

  DEF_SAT_S_ADD_FMT_2 (int64_t, uint64_t, INT64_MIN, INT64_MAX)

The tests below passed for this patch.
* The rv64gcv full regression test.

gcc/testsuite/ChangeLog:
        * gcc.target/riscv/sat_arith.h: Add test helper macros.
        * gcc.target/riscv/sat_s_add-5.c: New test.
        * gcc.target/riscv/sat_s_add-6.c: New test.
        * gcc.target/riscv/sat_s_add-7.c: New test.
        * gcc.target/riscv/sat_s_add-8.c: New test.
        * gcc.target/riscv/sat_s_add-run-5.c: New test.
        * gcc.target/riscv/sat_s_add-run-6.c: New test.
        * gcc.target/riscv/sat_s_add-run-7.c: New test.
        * gcc.target/riscv/sat_s_add-run-8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
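To make the form 2 logic easier to follow, here is the same computation written as a C++ template instead of the macro. This is a sketch only: the template `sat_s_add_fmt_2` is not part of the testsuite, and it assumes the conversion of an out-of-range unsigned value back to the signed type wraps modulo 2^N (guaranteed since C++20, and GCC's documented behaviour before that).

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Form 2 of signed saturating add, expressed as a template (illustrative).
template <typename T>
T sat_s_add_fmt_2(T x, T y) {
  using UT = typename std::make_unsigned<T>::type;
  // Add in the unsigned type so that wraparound is well defined.
  T sum = static_cast<T>(
      static_cast<UT>(static_cast<UT>(x) + static_cast<UT>(y)));
  // There is no overflow if the operands have different signs, or if the
  // wrapped sum still has the same sign as x.
  if ((x ^ y) < 0 || (sum ^ x) >= 0)
    return sum;
  // Overflow: saturate towards the sign of x.
  return x < 0 ? std::numeric_limits<T>::min()
               : std::numeric_limits<T>::max();
}
```

For instance, `sat_s_add_fmt_2<std::int8_t>(100, 100)` saturates to 127 and `sat_s_add_fmt_2<std::int8_t>(-100, -100)` saturates to -128.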
2024-09-19tree-optimization/116768 - wrong dependence analysisRichard Biener3-4/+34
The following reverts a bogus fix done for PR101009 and instead makes sure we get into the same_access_functions () case when computing the distance vector for g[1] and g[1] where the constants ended up having different types. The generic code doesn't seem to handle loop invariant dependences. The special case gets us both ( 0 ) and ( 1 ) as distance vectors while formerly we got ( 1 ), which the PR101009 fix changed to ( 0 ) with bad effects on other cases as shown in this PR. PR tree-optimization/116768 * tree-data-ref.cc (build_classic_dist_vector_1): Revert PR101009 change. * tree-chrec.cc (eq_evolutions_p): Make sure (sizetype)1 and (int)1 compare equal. * gcc.dg/torture/pr116768.c: New testcase.
2024-09-19Fall back to single-lane SLP before falling back to no SLPRichard Biener4-42/+54
The following changes the fallback to disable SLP when any of the discovered SLP instances failed to pass vectorization checking into a fallback that emulates what no SLP would do with SLP - force single-lane discovery for all instances. The patch does not remove the final fallback to disable SLP but it reduces the fallout from failing vectorization when any non-SLP stmt survives analysis. * tree-vectorizer.h (vect_analyze_slp): Add force_single_lane parameter. * tree-vect-slp.cc (vect_analyze_slp_instance): Remove defaulting of force_single_lane. (vect_build_slp_instance): Likewise. Pass down appropriate force_single_lane. (vect_analyze_slp): Add force_single_lane parameter and pass it down appropriately. (vect_slp_analyze_bb_1): Always do multi-lane SLP. * tree-vect-loop.cc (vect_analyze_loop_2): Track two SLP modes and adjust accordingly. (vect_analyze_loop_1): Save the SLP mode when unrolling. * gcc.dg/vect/vect-outer-slp-1.c: Adjust.
2024-09-19libstdc++: add #pragma diagnosticJason Merrill72-7/+320
The use of #pragma GCC system_header in libstdc++ has led to bugs going undetected for a while due to the silencing of compiler warnings that would have revealed them promptly, and also interferes with warnings about problematic template instantiations induced by user code. But removing it, or even compiling with -Wsystem-header, is also problematic due to warnings about deliberate uses of extensions. So this patch adds #pragma GCC diagnostic as needed to suppress these warnings. The change to acinclude.m4 changes -Wabi to warn only in comparison to ABI 19, to avoid lots of warnings that we now mangle concept requirements, which are in any case still experimental. I checked for any other changes against ABI v15, and found only the <format> lambda mangling, which we can ignore. This also enables -Wsystem-headers while building the library, so we see any warnings not silenced by these #pragmas. libstdc++-v3/ChangeLog: * include/bits/algorithmfwd.h: * include/bits/allocator.h: * include/bits/codecvt.h: * include/bits/concept_check.h: * include/bits/cpp_type_traits.h: * include/bits/hashtable.h: * include/bits/iterator_concepts.h: * include/bits/ostream_insert.h: * include/bits/ranges_base.h: * include/bits/regex_automaton.h: * include/bits/std_abs.h: * include/bits/stl_algo.h: * include/c_compatibility/fenv.h: * include/c_compatibility/inttypes.h: * include/c_compatibility/stdint.h: * include/ext/concurrence.h: * include/ext/type_traits.h: * testsuite/ext/type_traits/add_unsigned_floating_neg.cc: * testsuite/ext/type_traits/add_unsigned_integer_neg.cc: * testsuite/ext/type_traits/remove_unsigned_floating_neg.cc: * testsuite/ext/type_traits/remove_unsigned_integer_neg.cc: * include/bits/basic_ios.tcc: * include/bits/basic_string.tcc: * include/bits/fstream.tcc: * include/bits/istream.tcc: * include/bits/locale_classes.tcc: * include/bits/locale_facets.tcc: * include/bits/ostream.tcc: * include/bits/regex_compiler.tcc: * include/bits/sstream.tcc: * 
include/bits/streambuf.tcc: * configure: Regenerate. * include/bits/c++config: * include/c/cassert: * include/c/cctype: * include/c/cerrno: * include/c/cfloat: * include/c/climits: * include/c/clocale: * include/c/cmath: * include/c/csetjmp: * include/c/csignal: * include/c/cstdarg: * include/c/cstddef: * include/c/cstdio: * include/c/cstdlib: * include/c/cstring: * include/c/ctime: * include/c/cwchar: * include/c/cwctype: * include/c_global/climits: * include/c_global/cmath: * include/c_global/cstddef: * include/c_global/cstdlib: * include/decimal/decimal: * include/ext/rope: * include/std/any: * include/std/charconv: * include/std/complex: * include/std/coroutine: * include/std/format: * include/std/iomanip: * include/std/limits: * include/std/numbers: * include/tr1/functional: * include/tr1/tuple: * include/tr1/type_traits: * libsupc++/compare: * libsupc++/new: Add #pragma GCC diagnostic to suppress undesired warnings. * acinclude.m4: Change -Wabi version from 2 to 19. gcc/ChangeLog: * ginclude/stdint-wrap.h: Add #pragma GCC diagnostic to suppress undesired warnings. * gsyslimits.h: Likewise.
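The general shape of such a suppression is sketched below. The example is an assumption chosen for illustration, not a copy of any libstdc++ header: the function `mul_high` is hypothetical, and `__int128` is assumed to be available (as it is on 64-bit GCC and Clang targets).

```cpp
#include <cstdint>

// Silence -Wpedantic only around a deliberate use of an extension; the
// push/pop pair keeps the suppression from leaking into later code.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wpedantic"
// Returns the upper 64 bits of the 128-bit product a * b.
inline std::uint64_t mul_high(std::uint64_t a, std::uint64_t b) {
  unsigned __int128 product = static_cast<unsigned __int128>(a) * b;
  return static_cast<std::uint64_t>(product >> 64);
}
#pragma GCC diagnostic pop
// Code after the pop is diagnosed normally again.
```

Scoping the suppression this tightly is what lets the rest of a header be compiled with system-header warnings enabled.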
2024-09-19Always dump generated distance vectorsRichard Biener1-16/+18
There's special-casing for equal access functions which bypasses printing the distance vectors. The following makes sure we print them always which helps debugging. * tree-data-ref.cc (build_classic_dist_vector): Move distance vector dumping to single caller ... (subscript_dependence_tester): ... here, dumping always when we succeed computing it.
2024-09-19tree-optimization/116573 - .SELECT_VL for SLPRichard Biener2-5/+20
The following restores the use of .SELECT_VL for testcases where it is safe to use even when using SLP. For now I've restricted it to single-lane SLP, optimistically allowing store-lane nodes as well, and I assume single-lane roots are not widened except at most to load-lanes, which should be fine. PR tree-optimization/116573 * tree-vect-loop.cc (vect_analyze_loop_2): Allow .SELECT_VL for SLP but disable it when there's multi-lane instances. * tree-vect-stmts.cc (vectorizable_store): Only compute the ptr increment when generating code. (vectorizable_load): Likewise.
2024-09-19Fortran: Break recursion building recursive types. [PR106606]Andre Vehreschild2-6/+51
Build a derived type component's type only, when it is not already being built and the component uses pointer semantics. gcc/fortran/ChangeLog: PR fortran/106606 * trans-types.cc (gfc_get_derived_type): Only build non-pointer derived types as component's types when they are not yet built. gcc/testsuite/ChangeLog: * gfortran.dg/recursive_alloc_comp_5.f90: New test.
2024-09-19RISC-V: Fix vector SAT_ADD dump check due to middle-end changePan Li16-16/+16
This patch fixes the dump check times of vector SAT_ADD. The middle-end change makes the match times go from 2 to 4. The test suites below passed for this patch. * The rv64gcv full regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-21.c: Adjust the dump check times from 2 to 4. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-22.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-23.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-24.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-25.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-26.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-27.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-28.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-29.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-30.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-31.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-32.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-5.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-6.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-7.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-8.c: Ditto. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-09-19Match: Support form 3 for scalar signed integer .SAT_ADDPan Li1-0/+10
This patch supports form 3 of the scalar signed integer .SAT_ADD. Aka the example below:

Form 3:
  #define DEF_SAT_S_ADD_FMT_3(T, UT, MIN, MAX)           \
  T __attribute__((noinline))                            \
  sat_s_add_##T##_fmt_3 (T x, T y)                       \
  {                                                      \
    T sum;                                               \
    bool overflow = __builtin_add_overflow (x, y, &sum); \
    return overflow ? x < 0 ? MIN : MAX : sum;           \
  }

  DEF_SAT_S_ADD_FMT_3(int8_t, uint8_t, INT8_MIN, INT8_MAX)

We can tell the difference before and after this patch if the backend implements the ssadd<m>3 pattern, as shown below.

Before this patch:

  __attribute__((noinline))
  int8_t sat_s_add_int8_t_fmt_3 (int8_t x, int8_t y)
  {
    signed char _1;
    signed char _2;
    int8_t _3;
    __complex__ signed char _6;
    _Bool _8;
    signed char _9;
    signed char _10;
    signed char _11;

  ;;   basic block 2, loop depth 0
  ;;    pred:       ENTRY
    _6 = .ADD_OVERFLOW (x_4(D), y_5(D));
    _2 = IMAGPART_EXPR <_6>;
    if (_2 != 0)
      goto <bb 4>; [50.00%]
    else
      goto <bb 3>; [50.00%]
  ;;    succ:       4
  ;;                3

  ;;   basic block 3, loop depth 0
  ;;    pred:       2
    _1 = REALPART_EXPR <_6>;
    goto <bb 5>; [100.00%]
  ;;    succ:       5

  ;;   basic block 4, loop depth 0
  ;;    pred:       2
    _8 = x_4(D) < 0;
    _9 = (signed char) _8;
    _10 = -_9;
    _11 = _10 ^ 127;
  ;;    succ:       5

  ;;   basic block 5, loop depth 0
  ;;    pred:       3
  ;;                4
    # _3 = PHI <_1(3), _11(4)>
    return _3;
  ;;    succ:       EXIT

  }

After this patch:

  __attribute__((noinline))
  int8_t sat_s_add_int8_t_fmt_3 (int8_t x, int8_t y)
  {
    int8_t _3;

  ;;   basic block 2, loop depth 0
  ;;    pred:       ENTRY
    _3 = .SAT_ADD (x_4(D), y_5(D)); [tail call]
    return _3;
  ;;    succ:       EXIT

  }

The test suites below passed for this patch.
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:
        * match.pd: Add the form 3 of signed .SAT_ADD matching.
Signed-off-by: Pan Li <pan2.li@intel.com>
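The form 3 source pattern can be checked against a straightforward widen-and-clamp reference. The sketch below is illustrative only: the template `sat_s_add_fmt_3` mirrors the macro above, while `sat_s_add_reference` is a hypothetical helper that is not part of the patch. `__builtin_add_overflow` is a GCC/Clang builtin.

```cpp
#include <cstdint>
#include <limits>

// Form 3 of signed saturating add, as a template (illustrative).
template <typename T>
T sat_s_add_fmt_3(T x, T y) {
  T sum;
  // The builtin stores the wrapped result in sum and returns true if the
  // mathematically exact result does not fit in T.
  bool overflow = __builtin_add_overflow(x, y, &sum);
  return overflow ? (x < 0 ? std::numeric_limits<T>::min()
                           : std::numeric_limits<T>::max())
                  : sum;
}

// Reference for int8_t: add in a wider type, then clamp to the int8_t range.
inline std::int8_t sat_s_add_reference(std::int8_t x, std::int8_t y) {
  int wide = int{x} + int{y};
  if (wide > INT8_MAX) return INT8_MAX;
  if (wide < INT8_MIN) return INT8_MIN;
  return static_cast<std::int8_t>(wide);
}
```

Both functions should agree for every pair of `int8_t` inputs, which is the behaviour the .SAT_ADD internal function is meant to capture.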
2024-09-19Genmatch: Refine the gen_phi_on_cond by match_cond_with_binary_phiPan Li1-52/+14
This patch leverages match_cond_with_binary_phi to match the phi on cond, and to get the true/false arg if matched. This simplifies the implementation of gen_phi_on_cond considerably.

Before this patch:

  basic_block _b1 = gimple_bb (_a1);
  if (gimple_phi_num_args (_a1) == 2)
    {
      basic_block _pb_0_1 = EDGE_PRED (_b1, 0)->src;
      basic_block _pb_1_1 = EDGE_PRED (_b1, 1)->src;
      basic_block _db_1 = safe_dyn_cast <gcond *> (*gsi_last_bb (_pb_0_1))
                          ? _pb_0_1 : _pb_1_1;
      basic_block _other_db_1 = safe_dyn_cast <gcond *> (*gsi_last_bb (_pb_0_1))
                                ? _pb_1_1 : _pb_0_1;
      gcond *_ct_1 = safe_dyn_cast <gcond *> (*gsi_last_bb (_db_1));
      if (_ct_1 && EDGE_COUNT (_other_db_1->preds) == 1
          && EDGE_COUNT (_other_db_1->succs) == 1
          && EDGE_PRED (_other_db_1, 0)->src == _db_1)
        {
          tree _cond_lhs_1 = gimple_cond_lhs (_ct_1);
          tree _cond_rhs_1 = gimple_cond_rhs (_ct_1);
          tree _p0 = build2 (gimple_cond_code (_ct_1), boolean_type_node,
                             _cond_lhs_1, _cond_rhs_1);
          bool _arg_0_is_true_1 = gimple_phi_arg_edge (_a1, 0)->flags & EDGE_TRUE_VALUE;
          tree _p1 = gimple_phi_arg_def (_a1, _arg_0_is_true_1 ? 0 : 1);
          tree _p2 = gimple_phi_arg_def (_a1, _arg_0_is_true_1 ? 1 : 0);
          ...

After this patch:

  basic_block _b1 = gimple_bb (_a1);
  tree _p1, _p2;
  gcond *_cond_1 = match_cond_with_binary_phi (_a1, &_p1, &_p2);
  if (_cond_1)
    {
      tree _cond_lhs_1 = gimple_cond_lhs (_cond_1);
      tree _cond_rhs_1 = gimple_cond_rhs (_cond_1);
      tree _p0 = build2 (gimple_cond_code (_cond_1), boolean_type_node,
                         _cond_lhs_1, _cond_rhs_1);
      ...

The test suites below passed for this patch.
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:
        * genmatch.cc (dt_operand::gen_phi_on_cond): Leverage the match_cond_with_binary_phi API to get cond gimple, true and false TREE arg.

Signed-off-by: Pan Li <pan2.li@intel.com>
2024-09-19Fix deep copy allocatable components in coarrays. [PR85002]Andre Vehreschild3-9/+32
Code for the deep copy of allocatable components in nested derived-type structures was generated but not inserted when the copy had to be done in a coarray; fix that. Additionally fix a comment. gcc/fortran/ChangeLog: PR fortran/85002 * trans-array.cc (duplicate_allocatable_coarray): Allow adding of deep copy code in the when-allocated case. Add bounds computation before condition, because coarrays need the bounds also when not allocated. (structure_alloc_comps): Duplication in the coarray case is done already, omit it. Add the deep-copy code when duplicating a coarray. * trans-expr.cc (gfc_trans_structure_assign): Fix comment. gcc/testsuite/ChangeLog: * gfortran.dg/coarray/alloc_comp_9.f90: New test.
2024-09-19SVE intrinsics: Fold svmul with all-zero operands to zero vectorJennifer Schmitz3-3/+383
As recently implemented for svdiv, this patch folds svmul to a zero vector if one of the operands is a zero vector. This transformation is applied if at least one of the following conditions is met: - the first operand is all zeros or - the second operand is all zeros, and the predicate is ptrue or the predication is _x or _z. In contrast to constant folding, which was implemented in a previous patch, this transformation is applied as soon as one of the operands is a zero vector, while the other operand can be a variable. The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression. OK for mainline? Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com> gcc/ * config/aarch64/aarch64-sve-builtins-base.cc (svmul_impl::fold): Add folding of all-zero operands to zero vector. gcc/testsuite/ * gcc.target/aarch64/sve/const_fold_mul_1.c: Adjust expected outcome. * gcc.target/aarch64/sve/fold_mul_zero.c: New test.
2024-09-19aarch64: Define l1_cache_line_size for -mcpu=neoverse-v2Kyrylo Tkachov1-1/+14
This is a small patch that sets the L1 cache line size for Neoverse V2. Unlike the other cache-related constants in there, this value is not used just for SW prefetch generation (which we want to avoid for Neoverse V2 presently). It's also used to set std::hardware_destructive_interference_size. See the links and recent discussions in PR116662 for reference.

Some CPU tunings in aarch64 set this value to something useful, but for generic tuning we use the conservative 256, which forces 256-byte alignment in such atomic structures. Using a smaller value can decrease the size of such structs during layout and should not present an ABI problem, as std::hardware_destructive_interference_size is not intended to be used for structs in an external interface, and GCC warns about such uses.

Another place where the L1 cache line size is used is in phiopt for -fhoist-adjacent-loads, where conditional accesses to adjacent struct members can be speculatively loaded as long as they are within the same L1 cache line. e.g.

  struct S { int i; int j; };

  int
  bar (struct S *x, int y)
  {
    int r;
    if (y)
      r = x->i;
    else
      r = x->j;
    return r;
  }

The Neoverse V2 L1 cache line is 64 bytes according to the TRM, so set it to that. The rest of the prefetch parameters inherit from the generic tuning so we don't do anything extra for software prefetches.

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>

        * config/aarch64/tuning_models/neoversev2.h (neoversev2_prefetch_tune): Define.
        (neoversev2_tunings): Use it.
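To show where the cache line size becomes visible to user code, here is a minimal sketch of a structure padded with std::hardware_destructive_interference_size. The names `kDestructive`, `padded_counter` and `counters` are hypothetical, and the 64-byte fallback for standard libraries that do not provide the constant is an assumption of this sketch, not something taken from the patch.

```cpp
#include <atomic>
#include <cstddef>
#include <new>

#ifdef __cpp_lib_hardware_interference_size
constexpr std::size_t kDestructive = std::hardware_destructive_interference_size;
#else
constexpr std::size_t kDestructive = 64;  // assumed fallback for this sketch
#endif

// Each counter is aligned (and therefore padded) to the destructive
// interference size so that two counters do not share a cache line.
struct alignas(kDestructive) padded_counter {
  std::atomic<long> value{0};
};

struct counters {
  padded_counter produced;
  padded_counter consumed;
};
```

With the conservative generic value of 256 each `padded_counter` occupies 256 bytes. A tuning that reports 64 shrinks it accordingly, which is why such types should stay out of external interfaces.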
2024-09-19i386: Add ssemov2, sseicvt2 for some load instructions that use memory on operand2Hu, Lin12-6/+11
The memory attr of some instructions should be 'load', but is currently 'none'. gcc/ChangeLog: * config/i386/i386.md: Add ssemov2, sseicvt2. * config/i386/sse.md (sse2_cvtsi2sd): Apply sseicvt2. (sse2_cvtsi2sdq<round_name>): Ditto. (vec_set<mode>_0): Apply ssemov2 for 4, 6.
2024-09-19Match: Add interface match_cond_with_binary_phi for true/false argPan Li1-0/+120
When matching a cond with a two-argument phi node, we need to figure out which argument of the phi node comes from the true edge of the cond block and which comes from the false edge. This patch adds an interface that performs this lookup and returns the true and false arguments as trees. The following test suites passed for this patch. * The rv64gcv full regression test. * The x86 bootstrap test. * The x86 full regression test. gcc/ChangeLog: * gimple-match-head.cc (match_cond_with_binary_phi): Add new func impl to match binary phi for true and false arg. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-09-19doc: Add more alias option and reorder Intel CPU -march documentationHaochen Jiang1-113/+121
Since r15-3539, there have been requests to add documentation for the other alias options. This patch adds all of them, including corei7, corei7-avx, core-avx-i, core-avx2, atom, slm, gracemont and emeraldrapids. Also in the patch, I reordered that part of the documentation; currently the CPUs/products are all over the place. I regrouped them by date-to-now products (from the very first CPU to the latest Panther Lake), P-core (since the clients became hybrid cores, starting from Sapphire Rapids) and E-core (from Bonnell to the latest Clearwater Forest). The patch also refines the product names in the documentation. gcc/ChangeLog: * doc/invoke.texi: Add corei7, corei7-avx, core-avx-i, core-avx2, atom, slm, gracemont and emeraldrapids. Reorder the -march documentation by splitting them into date-to-now products, P-core and E-core. Refine the product names in documentation.
2024-09-19i386: Enhance AVX10.2 convert testsHaochen Jiang15-82/+295
All of the AVX10.2 convert tests were previously missing mask tests; this patch adds them. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-512-vcvt2ps2phx-2.c: Enhance mask test. * gcc.target/i386/avx10_2-512-vcvtbiasph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtbiasph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtbiasph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtbiasph2hf8s-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvthf82ph-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtne2ph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtne2ph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtne2ph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtne2ph2hf8s-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtneph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtneph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtneph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtneph2hf8s-2.c: Ditto. * gcc.target/i386/avx512f-helper.h: Fix a typo in macro define.
2024-09-19i386: Add missing avx512f-mask-type.h includeHaochen Jiang4-0/+5
In commit r15-3594 we fixed the bugs in MASK_TYPE for the AVX10.2 testcases, but we missed the following four. The tests do not FAIL because the binutils part hasn't been merged yet, so they are reported as UNSUPPORTED. But avx512f-mask-type.h needs to be included; otherwise the tests will fail to compile. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-512-vpdpbssd-2.c: Include avx512f-mask-type.h. * gcc.target/i386/avx10_2-vminmaxsd-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxsh-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxss-2.c: Ditto.
2024-09-19testsuite/gcc.dg/pr84877.c: Add machinery to stabilize stack alignmentHans-Peter Nilsson1-0/+26
This test awkwardly "blinks"; xfails and xpasses apparently randomly for cris-elf using the "gdb simulator". On inspection, I see that the stack address depends on the number of environment variables, deliberately passed to the simulator, each adding the size of a pointer. This test is IMHO important enough not to be just skipped just because it blinks (fixing the actual problem is a different task). I guess a random non-16 stack-alignment could happen for other targets as well, so let's try and add a generic machinery to "stabilize" the test as failing, by allocating a dynamic amount to make sure it's misaligned. The most target-dependent item here is an offset between the incoming stack-pointer value (within main in the added framework) and outgoing (within "xmain" as called from main when setting up the p0 parameter). I know there are other wonderful stack shapes, but such targets would fall under the "complicated situations"-label and are no worse off than before. * gcc.dg/pr84877.c: Try to make the test result consistent by misaligning the stack.
2024-09-19Daily bump.GCC Administrator8-1/+211
2024-09-19RISC-V: Fix signed SAT_ADD test case for int64_tPan Li1-8/+7
The int8_t test for signed SAT_ADD is sat_s_add-1.c, the sat_s_add-4.c should be for int64_t. Thus, update sat_s_add-4.c for int64_t type. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_s_add-4.c: Update test for int64_t instead of int8_t. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-09-18libstdc++: add bracesJason Merrill1-1/+1
GCC compiles with -fno-exceptions, so __throw_exception_again is a no-op, and compilation gives a -Wempty-body warning here, so let's wrap it as is already done in a few other files. libstdc++-v3/ChangeLog: * include/bits/basic_ios.h: Add braces.
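The warning and the fix can be sketched with a stand-in macro. This is not the code in basic_ios.h: `RETHROW_AGAIN` and `after_guard` are hypothetical, and the macro is defined as empty to model what a rethrow macro expands to in a build without exceptions.

```cpp
// Hypothetical stand-in for a rethrow macro in a -fno-exceptions build,
// where it expands to nothing.
#define RETHROW_AGAIN

inline int after_guard(bool failed)
{
  int steps = 0;
  // Without braces this would read `if (failed) ;`, an empty body that
  // -Wempty-body reports. With braces the body is the block `{ ; }`.
  if (failed)
    { RETHROW_AGAIN; }
  ++steps;
  return steps;
}
```

The braces do not change behaviour in either configuration; they only give the `if` a non-empty compound statement as its body.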
2024-09-18[PATCH] configure: fix typosAndrew Kreimer2-2/+2
/ * configure.ac: Fix typos. * configure: Rebuilt.
2024-09-18c++: alias of decltype(lambda) is opaque [PR116714, PR107390]Patrick Palka2-2/+48
Here for using type = decltype([]{}); static_assert(is_same_v<type, type>); we strip the alias ahead of time during template argument coercion which effectively transforms the template-id into is_same_v<decltype([]{}), decltype([]{})> which is wrong because later substitution into the template-id will produce two new lambdas with distinct types and cause is_same_v to return false. This demonstrates that such aliases should be considered opaque (a notion that we recently introduced in r15-2331-g523836716137d0). (An alternative solution might be to consider memoizing lambda-expr substitution rather than always producing a new lambda, but this is much simpler.) PR c++/116714 PR c++/107390 gcc/cp/ChangeLog: * pt.cc (dependent_opaque_alias_p): Also return true for a decltype(lambda) alias. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/lambda-uneval18.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>
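The underlying language rule, that every lambda-expression has its own closure type, can be checked without the C++20 decltype(lambda) form. The sketch below uses named lambdas so that it also compiles before C++20; `first`, `second`, `FirstType` and `distinct_closure_types` are hypothetical names chosen for illustration.

```cpp
#include <type_traits>

// Two textually identical lambdas still have two different closure types.
auto first = [] {};
auto second = [] {};

// An alias names one specific closure type, so it always equals itself.
using FirstType = decltype(first);

static_assert(std::is_same<FirstType, FirstType>::value,
              "an alias of a closure type is the same type as itself");
static_assert(!std::is_same<decltype(first), decltype(second)>::value,
              "separate lambda-expressions have distinct types");

// Runtime-visible form of the second check.
constexpr bool distinct_closure_types()
{
  return !std::is_same<decltype(first), decltype(second)>::value;
}
```

Stripping the alias early turns the first situation into the second: each substitution of decltype([]{}) creates a new lambda, so the two template arguments stop being the same type.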
2024-09-18jit: Ensure ssize_t is definedFrancois-Xavier Coudert1-0/+5
On some targets it seems that ssize_t is not defined by any of the headers transitively included by <stdio.h>. This leads to a bootstrap fail when jit is enabled. gcc/jit/ChangeLog: * libgccjit.h: Include <sys/types.h>
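A header that uses ssize_t in its declarations can include <sys/types.h> itself instead of relying on another header to pull it in. The following sketch assumes a POSIX system; `find_byte` is a hypothetical function and is not part of libgccjit.

```cpp
#include <cstddef>
#include <sys/types.h>  // POSIX header that defines ssize_t

// Returns the index of the first occurrence of c in buf, or -1 if absent.
// The signed return type is what makes ssize_t necessary here.
ssize_t find_byte(const char *buf, std::size_t len, char c)
{
  for (std::size_t i = 0; i < len; ++i)
    if (buf[i] == c)
      return static_cast<ssize_t>(i);
  return -1;
}
```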
2024-09-18hppa: Add peephole2 optimizations for REG+D loads and storesJohn David Anglin2-0/+103
The PA 1.x architecture only supports long displacements in integer loads and stores. Floating-point loads and stores only support short displacements. As a result, we have to wait until reload is complete before generating insns with long displacements. The PA 2.0 architecture supports long displacements in both integer and floating-point loads and stores. The peephole2 optimizations added in this change are only enabled when 14-bit long displacements aren't supported for floating-point loads and stores. 2024-09-18 John David Anglin <danglin@gcc.gnu.org> gcc/ChangeLog: * config/pa/pa.h (GENERAL_REGNO_P): Define. * config/pa/pa.md: Add SImode and SFmode peephole2 patterns to generate loads and stores with long displacements.