path: root/gcc
Age | Commit message | Author | Files | Lines
2022-10-04 | openmp: Add begin declare target support | Jakub Jelinek | 17 | -74/+327
The following patch adds support for the begin declare target construct, which is another spelling for declare target construct without clauses (where it needs paired end declare target), but unlike that one accepts clauses. This is an OpenMP 5.1 feature, implemented with 5.2 clarification because in 5.1 we had a restriction in the declare target chapter shared by declare target and begin declare target that if there are any clauses specified at least one of them needs to be to or link. But that was of course meant just for declare target and not begin declare target, because begin declare target doesn't even allow to/link/enter clauses. In addition to that, the patch also makes device_type clause duplication an error (as stated in 5.1) and similarly makes declare target with just device_type clause an error rather than warning. What this patch doesn't do is: 1) OpenMP 5.1 also added an indirect clause, we don't support that neither on declare target nor begin declare target and I couldn't find it in our features pages (neither libgomp.texi nor web) 2) I think device_type(nohost)/device_type(host) support can't work for variables (in 5.0 it only talked about procedures so this could be also thought as 5.1 feature that we should just add to the list and implement) 3) I don't see any use of the "omp declare target nohost" attribute, so I'm not sure if device_type(nohost) works at all 2022-10-04 Jakub Jelinek <jakub@redhat.com> gcc/c-family/ * c-omp.cc (c_omp_directives): Uncomment begin declare target entry. gcc/c/ * c-lang.h (struct c_omp_declare_target_attr): New type. (current_omp_declare_target_attribute): Change type from int to vec<c_omp_declare_target_attr, va_gc> *. * c-parser.cc (c_parser_translation_unit): Adjust for that change. If last pushed directive was begin declare target, use different wording and simplify format strings for easier translations. (c_parser_omp_clause_device_type): Uncomment check_no_duplicate_clause call. 
(c_parser_omp_declare_target): Adjust for the current_omp_declare_target_attribute type change, push { -1 }. Use error_at rather than warning_at for declare target with only device_type clauses. (OMP_BEGIN_DECLARE_TARGET_CLAUSE_MASK): Define. (c_parser_omp_begin): Add begin declare target support. (c_parser_omp_end): Adjust for the current_omp_declare_target_attribute type change, adjust diagnostics wording and simplify format strings for easier translations. * c-decl.cc (current_omp_declare_target_attribute): Change type from int to vec<c_omp_declare_target_attr, va_gc> *. (c_decl_attributes): Adjust for the current_omp_declare_target_attribute type change. If device_type was present on begin declare target, add "omp declare target host" and/or "omp declare target nohost" attributes. gcc/cp/ * cp-tree.h (struct omp_declare_target_attr): Rename to ... (cp_omp_declare_target_attr): ... this. Add device_type member. (omp_begin_assumes_data): Rename to ... (cp_omp_begin_assumes_data): ... this. (struct saved_scope): Change types of omp_declare_target_attribute and omp_begin_assumes. * parser.cc (cp_parser_omp_clause_device_type): Uncomment check_no_duplicate_clause call. (cp_parser_omp_all_clauses): Fix up pasto, c_name for OMP_CLAUSE_LINK should be "link" rather than "to". (cp_parser_omp_declare_target): Adjust for omp_declare_target_attr to cp_omp_declare_target_attr changes, push -1 as device_type. Use error_at rather than warning_at for declare target with only device_type clauses. (OMP_BEGIN_DECLARE_TARGET_CLAUSE_MASK): Define. (cp_parser_omp_begin): Add begin declare target support. Adjust for omp_begin_assumes_data to cp_omp_begin_assumes_data change. (cp_parser_omp_end): Adjust for the omp_declare_target_attr to cp_omp_declare_target_attr and omp_begin_assumes_data to cp_omp_begin_assumes_data type changes, adjust diagnostics wording and simplify format strings for easier translations. * semantics.cc (finish_translation_unit): Likewise. 
* decl2.cc (cplus_decl_attributes): If device_type was present on begin declare target, add "omp declare target host" and/or "omp declare target nohost" attributes. gcc/testsuite/ * c-c++-common/gomp/declare-target-4.c: Move tests that are now rejected into declare-target-7.c. * c-c++-common/gomp/declare-target-6.c: Adjust expected diagnostics. * c-c++-common/gomp/declare-target-7.c: New test. * c-c++-common/gomp/begin-declare-target-1.c: New test. * c-c++-common/gomp/begin-declare-target-2.c: New test. * c-c++-common/gomp/begin-declare-target-3.c: New test. * c-c++-common/gomp/begin-declare-target-4.c: New test. * g++.dg/gomp/attrs-9.C: Add begin declare target tests. * g++.dg/gomp/attrs-18.C: New test. libgomp/ * libgomp.texi (Support begin/end declare target syntax in C/C++): Mark as implemented.
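As a rough illustration of the syntax this commit describes, here is a minimal sketch. The function name `scale` is hypothetical, and `device_type(host)` is simply one clause the message says the construct accepts. Built without `-fopenmp` the pragmas are ignored and this is ordinary C; with `-fopenmp` it presumably needs a GCC that contains this commit.

```c
/* Sketch only: `scale` is a hypothetical function, not taken from the patch.
   The begin/end pair brackets the declarations, like the clause-less
   "declare target" form, but the begin form may carry clauses.  */
#pragma omp begin declare target device_type(host)
int scale(int x)
{
    return 2 * x;
}
#pragma omp end declare target
```

Per the message, repeating the `device_type` clause on the same directive is now diagnosed as an error.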
2022-10-04 | Convert nonzero mask in irange to wide_int. | Aldy Hernandez | 3 | -171/+130
The reason the nonzero mask was kept in a tree was basically inertia, as everything in irange is a tree. However, there's no need to keep it in a tree, as the conversions to and from wide ints are very annoying. That, plus special casing NULL masks to be -1 is prone to error. I have not only rewritten all the uses to assume a wide int, but have corrected a few places where we weren't propagating the masks, or rather pessimizing them to -1. This will become more important in upcoming patches where we make better use of the masks. Performance testing shows a trivial improvement in VRP, as things like irange::contains_p() are tied to a tree. Ughh, can't wait for trees in iranges to go away. gcc/ChangeLog: * value-range-storage.cc (irange_storage_slot::set_irange): Remove special case. * value-range.cc (irange::irange_set): Adjust for nonzero mask being a wide int. (irange::irange_set_anti_range): Same. (irange::set): Same. (irange::verify_range): Same. (irange::legacy_equal_p): Same. (irange::operator==): Same. (irange::contains_p): Same. (irange::legacy_intersect): Same. (irange::legacy_union): Same. (irange::irange_single_pair_union): Call union_nonzero_bits. (irange::irange_union): Same. (irange::irange_intersect): Call intersect_nonzero_bits. (irange::intersect): Adjust for nonzero mask being a wide int. (irange::invert): Same. (irange::set_nonzero_bits): Same. (irange::get_nonzero_bits_from_range): New. (irange::set_range_from_nonzero_bits): New. (irange::get_nonzero_bits): Adjust for nonzero mask being a wide int. (irange::intersect_nonzero_bits): Same. (irange::union_nonzero_bits): Same. (range_tests_nonzero_bits): Remove test. * value-range.h (irange::varying_compatible_p): Adjust for nonzero mask being a wide int. (gt_ggc_mx): Same. (gt_pch_nx): Same. (irange::set_undefined): Same. (irange::set_varying): Same. (irange::normalize_kind): Same.
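To make the mask semantics concrete, here is a simplified model, not the irange code itself. It assumes a 64-bit mask in which a set bit means "this bit of the value may be nonzero", so an all-ones mask (-1) carries no information. The names `nz_mask`, `nz_union`, `nz_intersect` and `nz_may_contain` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of a nonzero-bits mask: bit i is 1 when bit i of the
   value may be nonzero.  All ones (-1) means nothing is known.  */
typedef uint64_t nz_mask;

#define NZ_UNKNOWN UINT64_MAX

/* A value from either range may have any bit that either mask allows.  */
nz_mask nz_union(nz_mask a, nz_mask b)
{
    return a | b;
}

/* A value in both ranges may only have bits that both masks allow.  */
nz_mask nz_intersect(nz_mask a, nz_mask b)
{
    return a & b;
}

/* A value is compatible with a mask if it sets no bit outside the mask.  */
int nz_may_contain(nz_mask m, uint64_t value)
{
    return (value & ~m) == 0;
}
```

In this model, falling back to `NZ_UNKNOWN` is always safe but throws information away, which is the kind of pessimization the message says was corrected in a few places.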
2022-10-04 | [PR107130] range-ops: Separate out ffs and popcount optimizations. | Aldy Hernandez | 2 | -10/+46
__builtin_popcount and __builtin_ffs were sharing the same range-ops entry, but the nonzero mask optimization is not valid for ffs. Separate them out into two entries. PR tree-optimization/107130 gcc/ChangeLog: * gimple-range-op.cc (class cfn_popcount): Call op_cfn_ffs. (class cfn_ffs): New. (gimple_range_op_handler::maybe_builtin_call): Separate out CASE_CFN_FFS into its own case. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr107130.c: New test.
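The message does not spell out why the mask-based reasoning breaks for ffs. One way to see it, assuming the optimization bounds the result by the number of possibly-set bits in the nonzero mask, is the sketch below. The helper names are hypothetical; `__builtin_popcount` and `__builtin_ffs` are the GCC builtins named in the commit.

```c
/* Hypothetical helpers.  `mask` has a 1 for every bit of x that may be set.  */

/* popcount(x) can never exceed the number of bits the mask allows.  */
int popcount_bound_holds(unsigned x, unsigned mask)
{
    return __builtin_popcount(x) <= __builtin_popcount(mask);
}

/* ffs(x) is a bit position (1-based), not a bit count, so the same bound
   does not apply to it.  */
int ffs_bound_holds(unsigned x, unsigned mask)
{
    return __builtin_ffs(x) <= __builtin_popcount(mask);
}
```

With `x == mask == 0x80` only one bit can be set, so the popcount is at most 1, while `__builtin_ffs(0x80)` is 8.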
2022-10-03 | diagnostics: Add test for fixed _Pragma location issue [PR91669] | Lewis Hyatt | 1 | -0/+28
This PR related to _Pragma locations and diagnostic pragmas was fixed by a combination of r10-325 and r13-1596. Add missing test coverage. gcc/testsuite/ChangeLog: PR c/91669 * c-c++-common/pr91669.c: New test.
2022-10-04 | Daily bump. | GCC Administrator | 6 | -1/+112
2022-10-03 | gcc/config/t-i386: add build dependencies on i386-builtin-types.inc | Sergei Trofimovich | 1 | -0/+5
i386-builtin-types.inc is included indirectly via i386-builtins.h into 4 files: i386.cc i386-builtins.cc i386-expand.cc i386-features.cc

Only the i386.cc dependency was present in the gcc/config/t-i386 makefile. As a result parallel builds occasionally fail as:

    g++ ... -o i386-builtins.o ... ../../gcc-13-20220911/gcc/config/i386/i386-builtins.cc
    In file included from ../../gcc-13-20220911/gcc/config/i386/i386-builtins.cc:92:
    ../../gcc-13-20220911/gcc/config/i386/i386-builtins.h:25:10: fatal error: i386-builtin-types.inc: No such file or directory
       25 | #include "i386-builtin-types.inc"
          |          ^~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[3]: *** [../../gcc-13-20220911/gcc/config/i386/t-i386:54: i386-builtins.o] Error 1 shuffle=1663349189

gcc/
        * config/i386/t-i386: Add build-time dependencies against i386-builtin-types.inc to i386-builtins.o, i386-expand.o, i386-features.o.
2022-10-03 | [testsuite][arm] Fix cmse-15.c expected output | Torbjörn SVENSSON | 1 | -0/+2
The cmse-15.c testcase fails at -Os because ICF means that we generate

    secure3:
            b       secure1

which is OK, but does not match the currently expected

    secure3:
    ...
            bx      r[0-3]

gcc/testsuite/ChangeLog:
        * gcc.target/arm/cmse/cmse-15.c: Align with -Os improvements.

Co-Authored-By: Yvan ROUX <yvan.roux@foss.st.com>
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
2022-10-03 | c++: Disallow jumps into statement expressions | Jakub Jelinek | 7 | -5/+50
On Fri, Sep 30, 2022 at 04:39:25PM -0400, Jason Merrill wrote:
> > --- gcc/cp/decl.cc.jj 2022-09-22 00:14:55.478599363 +0200
> > +++ gcc/cp/decl.cc 2022-09-22 00:24:01.121178256 +0200
> > @@ -223,6 +223,7 @@ struct GTY((for_user)) named_label_entry
> > bool in_transaction_scope;
> > bool in_constexpr_if;
> > bool in_consteval_if;
> > + bool in_assume;
>
> I think it would be better to reject jumps into statement-expressions like
> the C front-end.

Ok, here is a self-contained patch that does that.

2022-10-03 Jakub Jelinek <jakub@redhat.com>

        * cp-tree.h (BCS_STMT_EXPR): New enumerator.
        * name-lookup.h (enum scope_kind): Add sk_stmt_expr.
        * name-lookup.cc (begin_scope): Handle sk_stmt_expr like sk_block.
        * semantics.cc (begin_compound_stmt): For BCS_STMT_EXPR use sk_stmt_expr.
        * parser.cc (cp_parser_statement_expr): Use BCS_STMT_EXPR instead of BCS_NORMAL.
        * decl.cc (struct named_label_entry): Add in_stmt_expr.
        (poplevel_named_label_1): Handle sk_stmt_expr.
        (check_previous_goto_1): Diagnose entering of statement expression.
        (check_goto): Likewise.
        * g++.dg/ext/stmtexpr24.C: New test.
2022-10-03 | Update gcc sv.po | Joseph Myers | 1 | -17/+19
* sv.po: Update.
2022-10-03 | c++: rename IS_SAME_AS trait code to IS_SAME | Patrick Palka | 5 | -6/+6
... to match the trait's canonical spelling __is_same instead of its alternative spelling __is_same_as. gcc/c-family/ChangeLog: * c-common.cc (c_common_reswords): Use RID_IS_SAME instead of RID_IS_SAME_AS. gcc/cp/ChangeLog: * constraint.cc (diagnose_trait_expr): Use CPTK_IS_SAME instead of CPTK_IS_SAME_AS. * cp-trait.def (IS_SAME_AS): Rename to ... (IS_SAME): ... this. * pt.cc (alias_ctad_tweaks): Use CPTK_IS_SAME instead of CPTK_IS_SAME_AS. * semantics.cc (trait_expr_value): Likewise. (finish_trait_expr): Likewise.
2022-10-03 | vect: while_ult for integer masks | Andrew Stubbs | 3 | -6/+35
Add a vector length parameter needed by amdgcn without breaking aarch64. All amdgcn vector masks are DImode, regardless of vector length, so we can't tell what length is implied simply from the operator mode. (Even if we used different integer modes there's no mode small enough to differentiate a 2 or 4 lane mask). Without knowing the intended length we end up using a mask with too many lanes enabled, which leads to undefined behaviour. The extra operand is not added for vector mask types so AArch64 does not need to be adjusted. gcc/ChangeLog: * config/gcn/gcn-valu.md (while_ultsidi): Limit mask length using operand 3. * doc/md.texi (while_ult): Document new operand 3 usage. * internal-fn.cc (expand_while_optab_fn): Set operand 3 when lhs_type maps to a non-vector mode.
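The lane-count problem can be sketched with a scalar model of while_ult that produces a 64-bit integer mask. This is an illustration under assumptions, not the amdgcn pattern: the function name `while_ult_mask` and its `lanes` parameter (standing in for the new operand 3) are hypothetical.

```c
#include <stdint.h>

/* Hypothetical scalar model: lane i is enabled while start + i < limit
   (unsigned), and once a lane is disabled all later lanes stay disabled.
   `lanes` caps how many mask bits may be set at all.  */
uint64_t while_ult_mask(uint64_t start, uint64_t limit, unsigned lanes)
{
    uint64_t mask = 0;

    for (unsigned i = 0; i < lanes && i < 64; i++) {
        if (start + i >= limit)
            break;                      /* first false lane ends the run */
        mask |= (uint64_t)1 << i;
    }
    return mask;
}
```

With `start = 0` and `limit = 10`, a 4-lane request yields `0xF`, whereas treating the mask as 64 lanes wide yields `0x3FF`, which enables lanes a 4-lane vector does not have.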
2022-10-03 | Don't process undefined range. | Andrew MacLeod | 2 | -0/+23
No need to continue processing an undefined range. gcc/ PR tree-optimization/107109 * range-op.cc (adjust_op1_for_overflow): Don't process undefined. gcc/testsuite/ * gcc.dg/pr107109.c: New.
2022-10-03 | arm: Add missing early clobber to MVE vrev64q_m patterns | Christophe Lyon | 2 | -2/+19
Like the non-predicated vrev64q patterns, mve_vrev64q_m_<supf><mode> and mve_vrev64q_m_f<mode> need an early clobber constraint, otherwise we can generate an unpredictable instruction: Warning: 64-bit element size and same destination and source operands makes instruction UNPREDICTABLE when calling vrevq64_m* with the same first and second arguments. OK for trunk? Thanks, Christophe gcc/ChangeLog: * config/arm/mve.md (mve_vrev64q_m_<supf><mode>): Add early clobber. (mve_vrev64q_m_f<mode>): Likewise. gcc/testsuite/ChangeLog: * gcc.target/arm/mve/intrinsics/vrev64q_m_s16-clobber.c: New test.
2022-10-03 | c: Adjust LDBL_EPSILON for C2x for IBM long double | Joseph Myers | 3 | -1/+44
C2x changes the <float.h> definition of *_EPSILON to apply only to normalized numbers. The effect is that LDBL_EPSILON for IBM long double becomes 0x1p-105L instead of 0x1p-1074L. There is a reasonable case for considering this a defect fix - it originated from the issue reporting process (DR#467), though it ended up being resolved by a paper (N2326) for C2x rather than through the issue process, and code using *_EPSILON often needs to override the pre-C2x value of LDBL_EPSILON and use something on the order of magnitude of the C2x value instead. However, I've followed the conservative approach of only making the change for C2x and not for previous standard versions (and not for C++, which doesn't have the C2x changes in this area). The testcases added are intended to be valid for all long double formats. The C11 one is based on gcc.target/powerpc/rs6000-ldouble-2.c (and when we move to a C2x default, gcc.target/powerpc/rs6000-ldouble-2.c will need an appropriate option added to keep using an older language version). Tested with no regressions for cross to powerpc-linux-gnu. gcc/c-family/ * c-cppbuiltin.cc (builtin_define_float_constants): Do not special-case __*_EPSILON__ setting for IBM long double for C2x. gcc/testsuite/ * gcc.dg/c11-float-7.c, gcc.dg/c2x-float-12.c: New tests.
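The C2x value quoted above follows from the usual formula epsilon = b^(1-p): IBM long double has LDBL_MANT_DIG = 106, so 2^(1-106) = 2^-105. A small sketch that computes this from `<float.h>` is shown below; the function names are hypothetical, and the loop assumes a binary format (FLT_RADIX == 2).

```c
#include <float.h>

/* Hypothetical helper: computes 2^(1 - LDBL_MANT_DIG) by repeated halving,
   i.e. the epsilon of a normalized number with LDBL_MANT_DIG digits.
   Assumes FLT_RADIX == 2.  */
long double normalized_epsilon(void)
{
    long double eps = 1.0L;

    for (int i = 1; i < LDBL_MANT_DIG; i++)
        eps /= 2.0L;
    return eps;
}

/* Nonzero when <float.h> agrees with the normalized definition.  For IBM
   long double this would be expected to hold in C2x mode and not in
   earlier standard modes, per the message above.  */
int float_h_matches_normalized(void)
{
    return normalized_epsilon() == LDBL_EPSILON;
}
```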
2022-10-03 | Do not pessimize range in set_nonzero_bits. | Aldy Hernandez | 1 | -0/+13
Currently if we have a range of [0,0] and we set the nonzero bits to 1, the current code pessimizes the range to [0,1] because it assumes the range is [1,1] plus the possibility of 0. This fixes the oversight. gcc/ChangeLog: * value-range.cc (irange::set_nonzero_bits): Do not pessimize range. (range_tests_nonzero_bits): New test.
2022-10-03 | Avoid comparing ranges when sub-ranges is 0. | Aldy Hernandez | 1 | -0/+3
There is nothing else to compare when the number of sub-ranges is 0. gcc/ChangeLog: * value-range.cc (irange::operator==): Early bail on m_num_ranges equal to 0.
2022-10-03 | Do not compare nonzero masks for varying. | Aldy Hernandez | 1 | -4/+1
There is no need to compare nonzero masks when comparing two VARYING ranges, as they are always the same when range types are the same. gcc/ChangeLog: * value-range.cc (irange::legacy_equal_p): Remove nonzero mask check when comparing VR_VARYING ranges.
2022-10-03 | Do not compare incompatible ranges in ipa-prop. | Aldy Hernandez | 1 | -2/+2
gcc/ChangeLog: * ipa-prop.cc (struct ipa_vr_ggc_hash_traits): Do not compare incompatible ranges in ipa-prop.
2022-10-03 | Fortran: fix testcases | Francois-Xavier Coudert | 2 | -7/+3
Remove unreliable test for IEEE_FMA(), which fails on powerpc. Adjust stop codes for modes_1.f90. 2022-10-03 Francois-Xavier Coudert <fxcoudert@gcc.gnu.org> gcc/testsuite/ PR fortran/107062 * gfortran.dg/ieee/fma_1.f90: Fix test. * gfortran.dg/ieee/modes_1.f90: Fix test.
2022-10-03 | Daily bump. | GCC Administrator | 2 | -1/+20
2022-10-02 | tree-cfg: Fix a verification diagnostic typo [PR107121] | Jakub Jelinek | 1 | -1/+1
Obvious typo in diagnostics. 2022-10-02 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/107121 * tree-cfg.cc (verify_gimple_call): Fix a typo in diagnostics, DEFFERED_INIT -> DEFERRED_INIT.
2022-10-02 | Define GCC_DRIVER_HOST_INITIALIZATION for VxWorks targets | Marc Poulhiès | 4 | -0/+109
We need to perform static links by default on VxWorks, where the use of shared libraries involves unusual steps compared to standard native systems. This has to be conveyed before the lang_specific_driver code gets invoked (in particular for g++), so specs aren't available. This change defines the GCC_DRIVER_HOST_INITIALIZATION macro for VxWorks, to insert a -static option in case the user hasn't provided any explicit indication on the command line of the kind of link desired. While a HOST macro doesn't seem appropriate to control a target OS driven behavior, this matches other uses and won't conflict as VxWorks is not supported on any of the other configurations using this macro. gcc/ * config/vxworks-driver.cc: New. * config.gcc (*vxworks*): Add vxworks-driver.o in extra_gcc_objs. * config/t-vxworks: Add vxworks-driver.o. * config/vxworks.h (GCC_DRIVER_HOST_INITIALIZATION): New.
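The decision described above, adding -static only when the command line says nothing about the kind of link, might look roughly like the sketch below. This is not the contents of vxworks-driver.cc: the function name and the exact set of options treated as "explicit indication" are assumptions made for illustration.

```c
#include <string.h>

/* Hypothetical sketch: returns 1 when none of the (assumed) link-kind
   options appears on the command line, meaning a default -static would be
   inserted; returns 0 when the user already chose a kind of link.  */
int wants_default_static(int argc, const char **argv)
{
    /* Assumed list; the real driver code may check different options.  */
    static const char *const link_kind_opts[] = {
        "-static", "-shared", "-Bstatic", "-Bdynamic"
    };
    const size_t n_opts = sizeof link_kind_opts / sizeof link_kind_opts[0];

    for (int i = 1; i < argc; i++)
        for (size_t k = 0; k < n_opts; k++)
            if (strcmp(argv[i], link_kind_opts[k]) == 0)
                return 0;
    return 1;
}
```

For a command line such as `gcc -O2 foo.c` the sketch reports that -static should be added, while `gcc -shared foo.c` is left alone.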
2022-10-02 | Refine guard for vxworks crtstuff spec | Olivier Hainque | 1 | -5/+4
Working on the reintroduction of shared libraries support (and of modules depending on shared libraries) exposed a few test failures of simple c++ constructor tests on arm-vxworks7r2. Investigation revealed that we were not linking the crtstuff objects as needed from a compiler configured not to have shared libs support, because of the ENABLE_SHARED_LIBGCC guard in this piece of vxworks.h:

    /* Setup the crtstuff begin/end we might need for dwarf EH registration
       and/or INITFINI_ARRAY support for shared libs.  */
    #if (HAVE_INITFINI_ARRAY_SUPPORT && defined(ENABLE_SHARED_LIBGCC)) \
        || (DWARF2_UNWIND_INFO && !defined(CONFIG_SJLJ_EXCEPTIONS))
    #define VX_CRTBEGIN_SPEC "%{!shared:vx_crtbegin.o%s;:vx_crtbeginS.o%s}"

crtstuff initfini array support is meant to be leveraged for constructors regardless of whether the compiler also happens to be configured with shared library support, so the guard on ENABLE_SHARED_LIBGCC here is inappropriate. This change just removes it.

2022-09-30 Olivier Hainque <hainque@adacore.com>

gcc/
        * config/vxworks.h (VX_CRTBEGIN_SPEC, VX_CRTEND_SPEC): If HAVE_INITFINI_ARRAY_SUPPORT, pick crtstuff objects regardless of ENABLE_SHARED_LIBGCC.
2022-10-02 | Daily bump. | GCC Administrator | 5 | -1/+66
2022-10-01 | Fortran: Fix ICE and wrong code for assumed-rank arrays [PR100029, PR100040] | José Rui Faustino de Sousa | 3 | -21/+85
gcc/fortran/ChangeLog: PR fortran/100040 PR fortran/100029 * trans-expr.cc (gfc_conv_class_to_class): Add code to have assumed-rank arrays recognized as full arrays and fix the type of the array assignment. (gfc_conv_procedure_call): Change order of code blocks such that the free of ALLOCATABLE dummy arguments with INTENT(OUT) occurs first. gcc/testsuite/ChangeLog: PR fortran/100029 * gfortran.dg/PR100029.f90: New test. PR fortran/100040 * gfortran.dg/PR100040.f90: New test.
2022-10-01 | c++: make some cp_trait_kind switch statements exhaustive | Patrick Palka | 1 | -6/+25
This replaces the unreachable default case in some cp_trait_kind switches with an exhaustive listing of the trait codes that we don't expect to see, so that when adding a new trait we'll get a helpful -Wswitch warning if we forget to handle the new trait in a relevant switch. gcc/cp/ChangeLog: * semantics.cc (trait_expr_value): Make cp_trait_kind switch statement exhaustive. (finish_trait_expr): Likewise. (finish_trait_type): Likewise.
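The technique can be shown on a toy enum. The sketch below uses hypothetical names (`trait_kind`, `trait_yields_type`), not the real cp_trait_kind codes. Because the switch has no `default:` label and lists every enumerator, adding a new enumerator without a matching `case` makes GCC's -Wswitch (part of -Wall) report the omission.

```c
/* Hypothetical trait codes, standing in for a much longer real list.  */
enum trait_kind {
    TRAIT_IS_SAME,
    TRAIT_IS_CLASS,
    TRAIT_UNDERLYING_TYPE
};

/* Returns 1 for traits that yield a type, 0 for traits that yield a value.
   No default label: every enumerator is listed, so -Wswitch can flag a
   newly added enumerator that is not handled here.  */
int trait_yields_type(enum trait_kind kind)
{
    switch (kind) {
    case TRAIT_UNDERLYING_TYPE:
        return 1;

    case TRAIT_IS_SAME:
    case TRAIT_IS_CLASS:
        return 0;
    }
    return -1;  /* only reached for values outside the enum */
}
```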
2022-10-01 | or1k: Only define TARGET_HAVE_TLS when HAVE_AS_TLS | Stafford Horne | 1 | -0/+2
This was found when testing buildroot with linuxthreads enabled. In this case, the build passes --disable-tls to the toolchain during configuration. After building the OpenRISC toolchain it was still generating TLS code sequences and causing linker failures such as: ..../or1k-buildroot-linux-uclibc-gcc -o gpsd-3.24/gpsctl .... -lusb-1.0 -lm -lrt -lnsl ..../ld: ..../sysroot/usr/lib/libusb-1.0.so: undefined reference to `__tls_get_addr' This patch fixes this by disabling tls for the OpenRISC target when requested via --disable-tls. gcc/ChangeLog: * config/or1k/or1k.cc (TARGET_HAVE_TLS): Only define if HAVE_AS_TLS is defined. Tested-by: Yann E. MORIN <yann.morin@orange.com>
2022-10-01 | OpenACC: Fix struct-component-kind-1.c test | Julian Brown | 1 | -1/+1
This patch is a minimal fix for the recently-added struct-component-kind-1.c test (which is currently failing to emit one of the errors it expects in scan output). This fragment was erroneously omitted from the second version of the patch posted previously: https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602504.html 2022-10-01 Julian Brown <julian@codesourcery.com> gcc/ * gimplify.cc (omp_group_base): Fix IF_PRESENT (no_create) handling.
2022-10-01 | Improve Z flag handling on H8 | Jeff Law | 2 | -0/+269
This patch improves handling of the Z bit in the status register in a variety of ways to improve either the code size or code speed on various H8 subtargets. For example, we can test the zero/nonzero status of the upper byte of a 16 bit register using mov.b, we can move the Z or an inverted Z into a QImode register profitably on some subtargets. We can move Z or an inverted Z into the sign bit on the H8/SX profitably, etc. gcc/ * config/h8300/h8300.md (HSI2): New iterator. (eqne_invert): Similarly. * config/h8300/testcompare.md (testhi_upper_z): New pattern. (cmpqi_z, cmphi_z, cmpsi_z): Likewise. (store_z_qi, store_z_i_qi, store_z_hi, store_z_hi_sb): New define_insn_and_splits and/or define_insns. (store_z_hi_neg, store_z_hi_and, store_z_<mode>): Likewise. (store_z_<mode>_neg, store_z_<mode>_and, store_z): Likewise.
2022-09-30 | c++: loop through array CONSTRUCTOR | Jason Merrill | 1 | -1/+5
I noticed that we were ignoring all the special rules for when to use a simple INIT_EXPR for array initialization from a CONSTRUCTOR, because split_nonconstant_init_1 was also passing 1 to the from_array parameter. Arguably that's the real bug, but I think we can be flexible. The test that I noticed this with no longer fails without it. gcc/cp/ChangeLog: * init.cc (build_vec_init): Clear from_array for CONSTRUCTOR initializer.
2022-09-30 | c++: cast split_nonconstant_init return val to void | Jason Merrill | 1 | -7/+12
We were already converting the result of expand_vec_init_expr to void; we need to do the same for split_nonconstant_init. The test that I noticed this with no longer fails without it. gcc/cp/ChangeLog: * cp-gimplify.cc (cp_genericize_init): Also convert the result of split_nonconstant_init to void.
2022-09-30 | Install correct patch version. | Jeff Law | 1 | -5/+5
gcc/ * tree-ssa-dom.cc (record_edge_info): Install correct version of patch.
2022-09-30 | Emit discriminators for inlined call sites. | Eugene Rozenfeld | 1 | -1/+5
This change is based on commit 9fa26998a63d4b22b637ed8702520819e408a694 by Dehao Chen in vendors/google/heads/gcc-4_8. Tested on x86_64-pc-linux-gnu. gcc/ChangeLog: * dwarf2out.cc (add_call_src_coords_attributes): Emit discriminators for inlined call sites.
2022-10-01 | Daily bump. | GCC Administrator | 6 | -1/+213
2022-09-30 | More gimple const/copy propagation opportunities | Jeff Law | 2 | -2/+157
While investigating a benchmark for optimization opportunities I came across a single block loop which either iterates precisely once or forever. This is an interesting scenario as we can ignore the infinite looping path and treat any PHI nodes as degenerates. So more concretely let's consider this trivial testcase:

    volatile void abort (void);

    void
    foo(int a)
    {
      int b = 0;

      while (1)
        {
          if (!a)
            break;
          b = 1;
        }

      if (b != 0)
        abort ();
    }

Quick analysis shows that b's initial value is 0 and its value only changes if we enter an infinite loop. So if we get to the test b != 0, the only possible value b could have would be 0 and the test and its true arm can be eliminated.

The DOM3 dump looks something like this:

    ;; basic block 2, loop depth 0, count 118111600 (estimated locally), maybe hot
    ;; prev block 0, next block 3, flags: (NEW, VISITED)
    ;; pred: ENTRY [always] count:118111600 (estimated locally) (FALLTHRU,EXECUTABLE)
    ;; succ: 3 [always] count:118111600 (estimated locally) (FALLTHRU,EXECUTABLE)

    ;; basic block 3, loop depth 1, count 1073741824 (estimated locally), maybe hot
    ;; prev block 2, next block 4, flags: (NEW, VISITED)
    ;; pred: 2 [always] count:118111600 (estimated locally) (FALLTHRU,EXECUTABLE)
    ;;       3 [89.0% (guessed)] count:955630224 (estimated locally) (FALSE_VALUE,EXECUTABLE)
      # b_1 = PHI <0(2), 1(3)>
      if (a_3(D) == 0)
        goto <bb 4>; [11.00%]
      else
        goto <bb 3>; [89.00%]
    ;; succ: 4 [11.0% (guessed)] count:118111600 (estimated locally) (TRUE_VALUE,EXECUTABLE)
    ;;       3 [89.0% (guessed)] count:955630224 (estimated locally) (FALSE_VALUE,EXECUTABLE)

    ;; basic block 4, loop depth 0, count 118111600 (estimated locally), maybe hot
    ;; prev block 3, next block 5, flags: (NEW, VISITED)
    ;; pred: 3 [11.0% (guessed)] count:118111600 (estimated locally) (TRUE_VALUE,EXECUTABLE)
      if (b_1 != 0)
        goto <bb 5>; [0.00%]
      else
        goto <bb 6>; [100.00%]
    ;; succ: 5 [never] count:0 (precise) (TRUE_VALUE,EXECUTABLE)
    ;;       6 [always] count:118111600 (estimated locally) (FALSE_VALUE,EXECUTABLE)

This is a good representative of what the benchmark code looks like. The primary effect we want to capture is to realize that the test if (b_1 != 0) is always false and optimize it accordingly.

In the benchmark, this opportunity is well hidden until after the loop optimizers have completed, so the first chance to capture this case is in DOM3. Furthermore, DOM wants loops normalized with latch blocks/edges. So instead of bb3 looping back to itself, there's an intermediate empty block during DOM.

I originally thought this was likely to only affect the benchmark. But when I instrumented the optimization and bootstrapped GCC, much to my surprise there were several hundred similar cases identified in GCC itself. So it's not as benchmark specific as I'd initially feared.

Anyway, detecting this in DOM is pretty simple. We detect the infinite loop, including the latch block. Once we've done that, we walk the PHI nodes and attach equivalences to the appropriate outgoing edge. That's all we need to do as the rest of DOM is already prepared to handle equivalences on edges.

gcc/
        * tree-ssa-dom.cc (single_block_loop_p): New function.
        (record_edge_info): Also record equivalences for the outgoing edge of a single block loop where the condition is an invariant.

gcc/testsuite/
        * gcc.dg/infinite-loop.c: New test.
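The property the optimization relies on can be checked directly with a runnable variant of the testcase from this message. The function name `exit_value` is hypothetical; it returns b at the loop exit instead of calling abort, so the claim "b can only be 0 once the loop is left" becomes observable.

```c
/* Hypothetical variant of the testcase: returns the value b has when the
   loop is left.  The loop either exits on its first iteration (a == 0) or
   never exits, so the assignment b = 1 is only reached on the path that
   loops forever.  */
int exit_value(int a)
{
    int b = 0;

    while (1) {
        if (!a)
            break;
        b = 1;  /* only executed when the loop never terminates */
    }
    return b;
}
```

Only `exit_value(0)` returns; calling it with a nonzero argument spins forever, which is exactly the path the optimization is allowed to ignore.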
2022-09-30 | Minor cleanup/prep in DOM | Jeff Law | 1 | -5/+4
It's a bit weird that free_dom_edge_info leaves a dangling pointer in e->aux. Not sure what I was thinking. There's two callers. One wipes e->aux immediately after the call, the other attaches a newly created object immediately after the call. So we can wipe e->aux within the call and simplify one of the two call sites. This is preparatory work for a minor optimization where we want to detect another class of edge equivalences in DOM (until something better is available) and either attach them an existing edge_info structure or create a new one if one doesn't currently exist for a given edge. gcc/ * tree-ssa-dom.cc (free_dom_edge_info): Clear e->aux too. (free_all_edge_infos): Do not clear e->aux here.
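The cleanup follows a common pattern: the function that frees an object also clears the pointer that owned it, so callers cannot be left holding a dangling pointer. A minimal sketch with hypothetical stand-in types (`struct edge`, `struct edge_info`, `free_edge_info`), not GCC's own edge structures:

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the real structures.  */
struct edge_info {
    int placeholder;
};

struct edge {
    void *aux;  /* owns an edge_info, or NULL */
};

/* Frees the info attached to the edge and clears the owning pointer in the
   same place, so no caller has to remember to do it afterwards.  */
void free_edge_info(struct edge *e)
{
    free(e->aux);
    e->aux = NULL;
}
```

A caller that wants to attach a fresh object can simply assign to `e->aux` after the call, and a caller that only wants the old one gone needs no extra statement.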
2022-09-30 | Document -fexcess-precision=16 in target.def | H.J. Lu | 1 | -1/+1
* target.def (TARGET_C_EXCESS_PRECISION): Document -fexcess-precision=16.
2022-09-30 | Document -fexcess-precision=16 in tm.texi | Palmer Dabbelt | 1 | -1/+1
I just happened to stumble on this one while trying to sort out the RISC-V bits. gcc/ChangeLog * doc/tm.texi (TARGET_C_EXCESS_PRECISION): Add 16.
2022-09-30 | RISC-V: Support -fexcess-precision=16 | Palmer Dabbelt | 1 | -0/+1
This fixes f19a327077e ("Support -fexcess-precision=16 which will enable FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when backend supports _Float16.") on RISC-V targets. gcc/ChangeLog PR target/106815 * config/riscv/riscv.cc (riscv_excess_precision): Add support for EXCESS_PRECISION_TYPE_FLOAT16.
2022-09-30 | arm, csky: Fix C++ ICEs with _Float16 and __fp16 [PR107080] | Jakub Jelinek | 3 | -5/+26
On Fri, Sep 30, 2022 at 09:54:49AM -0400, Jason Merrill wrote: > > Note, there is one further problem on aarch64/arm, types with HFmode > > (_Float16 and __fp16) are there mangled as Dh (which is standard > > Itanium mangling: > > ::= Dh # IEEE 754r half-precision floating point (16 bits) > > ::= DF <number> _ # ISO/IEC TS 18661 binary floating point type _FloatN (N bits) > > so in theory is also ok, but DF16_ is more specific. Should we just > > change Dh to DF16_ in those backends, or should __fp16 there be distinct > > type from _Float16 where __fp16 would mangle Dh and _Float16 DF16_ ? > > You argued for keeping __float128 separate from _Float128, does the same > argument not apply to this case? Actually, they already were distinct types that just mangled the same. So the same issue that had to be solved on i?86, ia64 and rs6000 for _Float64x vs. long double is a problem on arm and aarch64 with _Float16 vs. __fp16. The following patch fixes it for arm after aarch64 has been changed already before. > > And there is csky, which mangles __fp16 (but only if type's name is __fp16, > > not _Float16) as __fp16, that looks clearly invalid to me as it isn't > > valid in the mangling grammar. So perhaps just nuke csky's mangle_type > > and have it mangled as DF16_ by the generic code? And seems even on csky __fp16 is distinct type from _Float16 (which is a good thing for consistency, these 3 targets are the only ones that have __fp16 type), so instead the patch handles it the same as on arm/aarch64, Dh mangling for __fp16 and DF16_ for _Float16. 2022-09-30 Jakub Jelinek <jakub@redhat.com> PR c++/107080 * config/arm/arm.cc (arm_mangle_type): Mangle just __fp16 as Dh and _Float16 as DF16_. * config/csky/csky.cc (csky_init_builtins): Fix a comment typo. (csky_mangle_type): Mangle __fp16 as Dh and _Float16 as DF16_ rather than mangling __fp16 as __fp16. * g++.target/arm/pr107080.C: New test.
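The resulting mangling rule for the two half-precision types can be summarised as a small lookup. This is only an illustration of the mapping stated in the ChangeLog: the real target hooks operate on tree type nodes rather than strings, and the function name `half_float_mangling` is hypothetical.

```c
#include <string.h>

/* Hypothetical illustration of the mapping described above:
   __fp16 keeps the Itanium "Dh" code, _Float16 uses the more specific
   "DF16_" code.  Returns NULL for any other name, standing in for
   "no target-specific mangling".  */
const char *half_float_mangling(const char *type_name)
{
    if (strcmp(type_name, "__fp16") == 0)
        return "Dh";
    if (strcmp(type_name, "_Float16") == 0)
        return "DF16_";
    return NULL;
}
```

Giving the two types different codes matters because, as the message notes, they were already distinct types that merely mangled the same.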
2022-09-30 | diagnostics: Fix virtual location for -Wuninitialized [PR69543]| Lewis Hyatt | 5 | -25/+73
Warnings issued for -Wuninitialized have been using the spelling location of the problematic usage, discarding any information on the location of the macro expansion point if such usage was in a macro. This makes the warnings impossible to control reliably with #pragma GCC diagnostic, and also discards useful context in the diagnostic output. There seems to be no need to discard the virtual location information, so this patch fixes that. PR69543 was mostly about _Pragma issues which have been fixed for many years now. The PR remains open because two of the testcases added in response to it still have xfails, but those xfails have nothing to do with _Pragma and rather just with the issue fixed by this patch, so the PR can be closed now as well. The other testcase modified here, pragma-diagnostic-2.c, was explicitly testing for the undesirable behavior that was xfailed in pr69543-3.c. I have adjusted that and also added a new testcase verifying all 3 types of warning that come from tree-ssa-uninit.cc get the proper location information now. gcc/ChangeLog: PR preprocessor/69543 * tree-ssa-uninit.cc (warn_uninit): Stop stripping macro tracking information away from the diagnostic location. (maybe_warn_read_write_only): Likewise. (maybe_warn_operand): Likewise. gcc/testsuite/ChangeLog: PR preprocessor/69543 * c-c++-common/pr69543-3.c: Remove xfail. * c-c++-common/pr69543-4.c: Likewise. * gcc.dg/cpp/pragma-diagnostic-2.c: Adjust test for new behavior. * c-c++-common/pragma-diag-16.c: New test.
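A sketch of the situation the message describes: the possibly uninitialized use is spelled inside a macro, while the diagnostic pragmas surround the expansion point. The macro and function names are hypothetical. The code is written so that the read only happens when the variable was set, so it stays well defined whether or not any warning is issued.

```c
/* The use of the variable is spelled here, inside the macro...  */
#define ADD_ONE(x) ((x) + 1)

int pick(int flag)
{
    int v;

    if (flag)
        v = 1;

    /* ...but the pragmas sit around the expansion point.  Controlling the
       warning from here needs the diagnostic to keep the expansion
       location rather than only the spelling location in the macro.  */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wuninitialized"
#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
    int r = flag ? ADD_ONE(v) : 0;
#pragma GCC diagnostic pop

    return r;
}
```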
2022-09-30 | aarch64: Fix C++ ICEs with _Float16 and __fp16 [PR107080] | Jakub Jelinek | 2 | -0/+21
On Fri, Sep 30, 2022 at 09:54:49AM -0400, Jason Merrill wrote: > > Note, there is one further problem on aarch64/arm, types with HFmode > > (_Float16 and __fp16) are there mangled as Dh (which is standard > > Itanium mangling: > > ::= Dh # IEEE 754r half-precision floating point (16 bits) > > ::= DF <number> _ # ISO/IEC TS 18661 binary floating point type _FloatN (N bits) > > so in theory is also ok, but DF16_ is more specific. Should we just > > change Dh to DF16_ in those backends, or should __fp16 there be distinct > > type from _Float16 where __fp16 would mangle Dh and _Float16 DF16_ ? > > You argued for keeping __float128 separate from _Float128, does the same > argument not apply to this case? Actually, they already were distinct types that just mangled the same. So the same issue that had to be solved on i?86, ia64 and rs6000 for _Float64x vs. long double is a problem on arm and aarch64 with _Float16 vs. __fp16. The following patch fixes it so far for aarch64. 2022-09-30 Jakub Jelinek <jakub@redhat.com> PR c++/107080 * config/aarch64/aarch64.cc (aarch64_mangle_type): Mangle just __fp16 as Dh and _Float16 as DF16_. * g++.target/aarch64/pr107080.C: New test.
2022-09-30i386, rs6000, ia64, s390: Fix C++ ICEs with _Float64x or _Float128 [PR107080]Jakub Jelinek5-11/+91
The following testcase ICEs on x86 as well as ppc64le (the latter with -mabi=ieeelongdouble), because _Float64x there isn't mangled as DF64x but e or u9__ieee128 instead. Those are the mangling that should be used for the non-standard types with the same mode or for long double, but not for _Float64x. All the 4 mangle_type targhook implementations start with type = TYPE_MAIN_VARIANT (type); so I think it is cleanest to handle it the same in all and return NULL before the switches on mode or whatever other tests. s390 doesn't actually have a bug, but while I was there, having type = TYPE_MAIN_VARIANT (type); if (TYPE_MAIN_VARIANT (type) == long_double_type_node) looked useless to me. Note, there is one further problem on aarch64/arm, types with HFmode (_Float16 and __fp16) are there mangled as Dh (which is standard Itanium mangling: ::= Dh # IEEE 754r half-precision floating point (16 bits) ::= DF <number> _ # ISO/IEC TS 18661 binary floating point type _FloatN (N bits) so in theory is also ok, but DF16_ is more specific. Should we just change Dh to DF16_ in those backends, or should __fp16 there be distinct type from _Float16 where __fp16 would mangle Dh and _Float16 DF16_ ? And there is csky, which mangles __fp16 (but only if type's name is __fp16, not _Float16) as __fp16, that looks clearly invalid to me as it isn't valid in the mangling grammar. So perhaps just nuke csky's mangle_type and have it mangled as DF16_ by the generic code? 2022-09-30 Jakub Jelinek <jakub@redhat.com> PR c++/107080 * config/i386/i386.cc (ix86_mangle_type): Always return NULL for float128_type_node or float64x_type_node, don't check float128t_type_node later on. * config/ia64/ia64.cc (ia64_mangle_type): Always return NULL for float128_type_node or float64x_type_node. * config/rs6000/rs6000.cc (rs6000_mangle_type): Likewise. Don't check float128_type_node later on. 
* config/s390/s390.cc (s390_mangle_type): Don't use TYPE_MAIN_VARIANT on type which was set to TYPE_MAIN_VARIANT a few lines earlier. * g++.dg/cpp23/ext-floating11.C: New test.
2022-09-30testsuite: Only run test on target if VMA == LMATorbjörn SVENSSON7-5/+62
Checking that the triplet matches arm*-*-eabi (or msp430-*-*) is not enough to know whether the execution will enter an endless loop or give a meaningful result. As the execution test only works when VMA and LMA are equal, make sure that this condition is met. gcc/ChangeLog: * doc/sourcebuild.texi: Document new vma_equals_lma effective target check. gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_vma_equals_lma): New. * c-c++-common/torture/attr-noinit-1.c: Require VMA == LMA to run. * c-c++-common/torture/attr-noinit-2.c: Likewise. * c-c++-common/torture/attr-noinit-3.c: Likewise. * c-c++-common/torture/attr-persistent-1.c: Likewise. * c-c++-common/torture/attr-persistent-3.c: Likewise. Co-Authored-By: Yvan ROUX <yvan.roux@foss.st.com> Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
2022-09-30testsuite: Do not prefix linker script with "-Wl,"Torbjörn SVENSSON1-1/+1
The linker script should not be prefixed with "-Wl," - it's not an input file and does not interfere with the new dump output filename strategy. gcc/testsuite/ChangeLog: * lib/gcc-defs.exp: Do not prefix linker script with "-Wl,". Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
2022-09-30RISC-V: Add '-m[no]-csr-check' option in gcc.Jiawei3-0/+17
Add the -m[no-]csr-check option to the GCC side. When -mcsr-check is enabled, GCC adds csr-check as a .option in the assembly output and so passes it on to the assembler. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_file_start): New .option. * config/riscv/riscv.opt: New options. * doc/invoke.texi: New definitions.
2022-09-30c++: streamline built-in trait addition processPatrick Palka8-477/+161
Adding a new built-in trait currently involves manual boilerplate consisting of defining an rid enumerator for the identifier as well as a corresponding cp_trait_kind enumerator and handling them in various switch statements, the exact set of which depends on whether the proposed trait yields (and thus is recognized as) a type or an expression. To streamline the process, this patch adds a central cp-trait.def file that tabulates the essential details about each built-in trait (whether it yields a type or an expression, its code, its spelling and its arity) and uses this file to automate away the manual boilerplate. It also migrates all the existing C++-specific built-in traits to use this approach. After this change, adding a new built-in trait just entails declaring it in cp-trait.def and defining its behavior in finish_trait_expr/type (and handling it in diagnose_trait_expr, if it's an expression-yielding trait). gcc/c-family/ChangeLog: * c-common.cc (c_common_reswords): Use cp/cp-trait.def to handle C++ traits. * c-common.h (enum rid): Likewise. gcc/cp/ChangeLog: * constraint.cc (diagnose_trait_expr): Likewise. * cp-objcp-common.cc (names_builtin_p): Likewise. * cp-tree.h (enum cp_trait_kind): Likewise. * cxx-pretty-print.cc (pp_cxx_trait): Likewise. * parser.cc (cp_keyword_starts_decl_specifier_p): Likewise. (cp_parser_primary_expression): Likewise. (cp_parser_trait): Likewise. (cp_parser_simple_type_specifier): Likewise. * cp-trait.def: New file.
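The tabulate-once, expand-many-times idea behind this change can be sketched with an X-macro list. This is only an illustration of the technique under stated assumptions: the real table lives in the separate cp-trait.def file, whereas this sketch keeps it in a single macro, and DEMO_TRAITS, DemoTraitKind, DemoTraitInfo, demo_trait_table and demo_lookup_trait are all hypothetical names. The three entries were picked for illustration and are not a statement about the file's contents.

```cpp
#include <cstring>

// Hypothetical trait table: X(code, spelling, arity, yields_type).
// Each trait is described exactly once, here.
#define DEMO_TRAITS(X)                                  \
  X(IS_EMPTY,        "__is_empty",        1, false)     \
  X(IS_SAME,         "__is_same",         2, false)     \
  X(UNDERLYING_TYPE, "__underlying_type", 1, true)

// First expansion: one enumerator per trait.
enum DemoTraitKind {
#define X(code, spelling, arity, yields_type) DEMO_##code,
  DEMO_TRAITS(X)
#undef X
  DEMO_TRAIT_COUNT
};

struct DemoTraitInfo {
  const char *spelling;   // identifier as written in source
  int arity;              // number of arguments
  bool yields_type;       // true: names a type, false: an expression
};

// Second expansion: a lookup table indexed by DemoTraitKind.
static const DemoTraitInfo demo_trait_table[] = {
#define X(code, spelling, arity, yields_type) {spelling, arity, yields_type},
  DEMO_TRAITS(X)
#undef X
};

// Return the DemoTraitKind for NAME, or -1 if it is not in the table.
int demo_lookup_trait(const char *name)
{
  for (int i = 0; i < DEMO_TRAIT_COUNT; ++i)
    if (std::strcmp(demo_trait_table[i].spelling, name) == 0)
      return i;
  return -1;
}
```

With this layout, adding a trait means adding one line to the table; the enumerator and the table entry follow automatically, which mirrors the boilerplate reduction the message describes.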
2022-09-30testsuite: Colon is reserved on WindowsTorbjörn SVENSSON2-2/+2
The ':' is reserved in filenames on Windows. Without this patch, the test case fails with: .../ben-1_a.C:4:8: error: failed to write compiled module: Invalid argument .../ben-1_a.C:4:8: note: compiled module file is 'partitions/module:import.mod' gcc/testsuite: * g++.dg/modules/ben-1.map: Replace the colon with a dash. * g++.dg/modules/ben-1_a.C: Likewise. Co-Authored-By: Yvan ROUX <yvan.roux@foss.st.com> Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
2022-09-30rs6000: Rework ELFv2 support for -fpatchable-function-entry* [PR99888]Kewen Lin12-4/+194
As PR99888 and its related PRs show, the current support for -fpatchable-function-entry on powerpc ELFv2 doesn't work well when a global entry exists. For example, with the command line option -fpatchable-function-entry=3,2, we got the following without this patch:

  .LPFE1:
          nop
          nop
          .type   foo, @function
  foo:
          nop
  .LFB0:
          .cfi_startproc
  .LCF0:
  0:      addis 2,12,.TOC.-.LCF0@ha
          addi 2,2,.TOC.-.LCF0@l
          .localentry     foo,.-foo

This assembly is unexpected, since the patched nops have no effect when the function is entered through its local entry. This patch emits the nops before and after the local entry instead, so the output looks like:

          .type   foo, @function
  foo:
  .LFB0:
          .cfi_startproc
  .LCF0:
  0:      addis 2,12,.TOC.-.LCF0@ha
          addi 2,2,.TOC.-.LCF0@l
          nop
          nop
          .localentry     foo,.-foo
          nop

PR target/99888
PR target/105649

gcc/ChangeLog:
* doc/invoke.texi (option -fpatchable-function-entry): Adjust the documentation for PowerPC ELFv2 ABI dual entry points.
* config/rs6000/rs6000-internal.h (rs6000_print_patchable_function_entry): New function declaration.
* config/rs6000/rs6000-logue.cc (rs6000_output_function_prologue): Support patchable-function-entry by emitting nops before and after local entry for the function that needs global entry.
* config/rs6000/rs6000.cc (rs6000_print_patchable_function_entry): Skip the function that needs global entry till global entry has been emitted.
* config/rs6000/rs6000.h (struct machine_function): New bool member global_entry_emitted.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr99888-1.c: New test.
* gcc.target/powerpc/pr99888-2.c: New test.
* gcc.target/powerpc/pr99888-3.c: New test.
* gcc.target/powerpc/pr99888-4.c: New test.
* gcc.target/powerpc/pr99888-5.c: New test.
* gcc.target/powerpc/pr99888-6.c: New test.
* c-c++-common/patchable_function_entry-default.c: Adjust for powerpc_elfv2 to avoid compilation error.
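For readers who want to reproduce the 3,2 layout on a single function rather than for a whole translation unit, GCC also provides a patchable_function_entry function attribute. The sketch below is not one of the new tests; demo_patchable is a hypothetical name, and it assumes a target and object format on which the attribute is supported. The two arguments have the same meaning as in -fpatchable-function-entry=3,2: three nops in total, two of them placed before the function entry point.

```cpp
// Hypothetical function for illustration only.  Requests 3 nops in total,
// 2 of them before the entry point, i.e. the per-function equivalent of
// -fpatchable-function-entry=3,2.  Assumes the target supports the attribute.
__attribute__((patchable_function_entry(3, 2)))
int demo_patchable(int x)
{
  return x + 1;
}
```

The nops do not change what the function computes; compiling with -S and inspecting the assembly is the way to see where they were placed on a given target.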
2022-09-30rs6000/test: Adjust pr104992.c with vect_int_mod [PR106516]Kewen Lin2-1/+10
As PR106516 shows, we can get unexpected gimple output for function thud on targets that support the modulus operation on vector int. This patch introduces a new effective target, vect_int_mod, for this and adjusts the test case to use it. PR testsuite/106516 gcc/testsuite/ChangeLog: * gcc.dg/pr104992.c: Adjust with vect_int_mod. * lib/target-supports.exp (check_effective_target_vect_int_mod): New effective target.
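To make "modulus operation for vector int" concrete, here is a minimal sketch using the GCC vector extension. It is not the thud function from pr104992.c; demo_v4si and demo_vec_mod are hypothetical names. On targets without a vector modulus instruction the compiler is expected to lower the operation to scalar code, so the result is the same either way and only the generated gimple and assembly differ.

```cpp
// Hypothetical 4 x int vector type, using the GCC vector extension.
typedef int demo_v4si __attribute__((vector_size(16)));

// Element-wise remainder: lane i of the result is a[i] % b[i].
demo_v4si demo_vec_mod(demo_v4si a, demo_v4si b)
{
  return a % b;
}
```

A call such as demo_vec_mod ({7, 8, 9, 10}, {2, 3, 4, 5}) computes each lane independently.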