aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-10-17Add relation_trio class for range-ops.Andrew MacLeod9-291/+405
There are 3 possible relations range-ops might care about, but only the one most likely to be needed is supplied. This patch provides a new class relation_trio which allows 3 relations to be passed in a single word. fold_range (), op1_range (), and op2_range () are adjusted to take a relation_trio class instead of a relation_kind, then the routine can extract which relation it wants to work with. * gimple-range-fold.cc (fold_using_range::range_of_range_op): Provide relation_trio class. * gimple-range-gori.cc (gori_compute::refine_using_relation): Provide relation_trio class. (gori_compute::refine_using_relation): Ditto. (gori_compute::compute_operand1_range): Provide lhs_op2 and op1_op2 relations via relation_trio class. (gori_compute::compute_operand2_range): Ditto. * gimple-range-op.cc (gimple_range_op_handler::calc_op1): Use relation_trio instead of relation_kind. (gimple_range_op_handler::calc_op2): Ditto. (*::fold_range): Ditto. * gimple-range-op.h (gimple_range_op::calc_op1): Adjust prototypes. (gimple_range_op::calc_op2): Adjust prototypes. * range-op-float.cc (*::fold_range): Use relation_trio instead of relation_kind. (*::op1_range): Ditto. (*::op2_range): Ditto. * range-op.cc (*::fold_range): Use relation_trio instead of relation_kind. (*::op1_range): Ditto. (*::op2_range): Ditto. * range-op.h (class range_operator): Adjust prototypes. (class range_operator_float): Ditto. (class range_op_handler): Adjust prototypes. (relop_early_resolve): Pickup op1_op2 relation from relation_trio. * value-relation.cc (VREL_LAST): Adjust use to be one past the end of the enum. (relation_oracle::validate_relation): Use relation_trio in call to fold_range. * value-relation.h (enum relation_kind_t): Add VREL_LAST as final element. (class relation_trio): New. (TRIO_VARYING, TRIO_SHIFT, TRIO_MASK): New.
2022-10-17Fix nan updating in range-ops.Andrew MacLeod1-13/+10
Calling clean_nan on an undefined type traps, set_varying first. Other tweaks for correctness. * range-op-float.cc (foperator_not_equal::op1_range): Check for VREL_EQ after singleton. (foperator_unordered::op1_range): Set VARYING before calling clear_nan(). (foperator_ordered::op1_range): Set rather than clear NAN if both operands are the same.
2022-10-17Don't set useless relations.Andrew MacLeod2-1/+8
The oracle will not register nonssense/useless relations, class value_relation shouldn't either. * value-relation.cc (value_relation::dump): Change message. * value-relation.h (value_relation::set_relation): If op1 is the same as op2 do not create a relation.
2022-10-17GCN: Restore build with GCC 4.8Thomas Schwinge1-7/+7
For example, for "g++-4.8 (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4", the recent commit r13-3220-g45381d6f9f4e7b5c7b062f5ad8cc9788091c2d07 "amdgcn: add multiple vector sizes" broke the build: In file included from [...]/source-gcc/gcc/coretypes.h:458:0, from [...]/source-gcc/gcc/config/gcn/gcn.cc:24: [...]/source-gcc/gcc/config/gcn/gcn.cc: In function ‘machine_mode VnMODE(int, machine_mode)’: ./insn-modes.h:42:71: error: temporary of non-literal type ‘scalar_int_mode’ in a constant expression #define QImode (scalar_int_mode ((scalar_int_mode::from_int) E_QImode)) ^ [...]/source-gcc/gcc/config/gcn/gcn.cc:405:10: note: in expansion of macro ‘QImode’ case QImode: ^ In file included from [...]/source-gcc/gcc/coretypes.h:478:0, from [...]/source-gcc/gcc/config/gcn/gcn.cc:24: [...]/source-gcc/gcc/machmode.h:410:7: note: ‘scalar_int_mode’ is not literal because: class scalar_int_mode ^ [...]/source-gcc/gcc/machmode.h:410:7: note: ‘scalar_int_mode’ is not an aggregate, does not have a trivial default constructor, and has no constexpr constructor that is not a copy or move constructor [...] Addressing this like simiar issues have been addressed in the past. gcc/ * config/gcn/gcn.cc (VnMODE): Use 'case E_QImode:' instead of 'case QImode:', etc.
2022-10-17Tag 'gcc/gimple-expr.cc:mark_addressable_2' as 'static'Thomas Schwinge1-1/+1
Added in 2015 r229696 (commit 1b223a9f3489296c625bdb7cc764196d04fd9231) "defer mark_addressable calls during expand till the end of expand", it has never been used 'extern'ally. gcc/ * gimple-expr.cc (mark_addressable_2): Tag as 'static'.
2022-10-17Fix nvptx-specific '-foffload-options' syntax in ↵Thomas Schwinge1-1/+1
'libgomp.c/reverse-offload-sm30.c' That is, '-mptx=_' is only valid in '-foffload-options=nvptx-none', too. Fix test case added in recent commit r13-2625-g6b43f556f392a7165582aca36a19fe7389d995b2 "nvptx/mkoffload.cc: Warn instead of error when reverse offload is not possible". libgomp/ * testsuite/libgomp.c/reverse-offload-sm30.c: Fix nvptx-specific '-foffload-options' syntax.
2022-10-17Vectorization of first-order recurrencesRichard Biener13-28/+558
The following picks up the prototype by Ju-Zhe Zhong for vectorizing first order recurrences. That solves two TSVC missed optimization PRs. There's a new scalar cycle def kind, vect_first_order_recurrence and it's handling of the backedge value vectorization is complicated by the fact that the vectorized value isn't the PHI but instead a (series of) permute(s) shifting in the recurring value from the previous iteration. I've implemented this by creating both the single vectorized PHI and the series of permutes when vectorizing the scalar PHI but leave the backedge values in both unassigned. The backedge values are (for the testcases) computed by a load which is also the place after which the permutes are inserted. That placement also restricts the cases we can handle (without resorting to code motion). I added both costing and SLP handling though SLP handling is restricted to the case where a single vectorized PHI is enough. Missing is epilogue handling - while prologue peeling would be handled transparently by adjusting iv_phi_p the epilogue case doesn't work with just inserting a scalar LC PHI since that a) keeps the scalar load live and b) that loads is the wrong one, it has to be the last, much like when we'd vectorize the LC PHI as live operation. Unfortunately LIVE compute/analysis happens too early before we decide on peeling. When using fully masked loop vectorization the vect-recurr-6.c works as expected though. I have tested this on x86_64 for now, but since epilogue handling is missing there's probably no practical cases. My prototype WHILE_ULT AVX512 patch can handle vect-recurr-6.c just fine but I didn't feel like running SPEC within SDE nor is the WHILE_ULT patch complete enough. PR tree-optimization/99409 PR tree-optimization/99394 * tree-vectorizer.h (vect_def_type::vect_first_order_recurrence): Add. (stmt_vec_info_type::recurr_info_type): Likewise. (vectorizable_recurr): New function. * tree-vect-loop.cc (vect_phi_first_order_recurrence_p): New function. (vect_analyze_scalar_cycles_1): Look for first order recurrences. (vect_analyze_loop_operations): Handle them. (vect_transform_loop): Likewise. (vectorizable_recurr): New function. (maybe_set_vectorized_backedge_value): Handle the backedge value setting in the first order recurrence PHI and the permutes. * tree-vect-stmts.cc (vect_analyze_stmt): Handle first order recurrences. (vect_transform_stmt): Likewise. (vect_is_simple_use): Likewise. (vect_is_simple_use): Likewise. * tree-vect-slp.cc (vect_get_and_check_slp_defs): Likewise. (vect_build_slp_tree_2): Likewise. (vect_schedule_scc): Handle the backedge value setting in the first order recurrence PHI and the permutes. * gcc.dg/vect/vect-recurr-1.c: New testcase. * gcc.dg/vect/vect-recurr-2.c: Likewise. * gcc.dg/vect/vect-recurr-3.c: Likewise. * gcc.dg/vect/vect-recurr-4.c: Likewise. * gcc.dg/vect/vect-recurr-5.c: Likewise. * gcc.dg/vect/vect-recurr-6.c: Likewise. * gcc.dg/vect/tsvc/vect-tsvc-s252.c: Un-XFAIL. * gcc.dg/vect/tsvc/vect-tsvc-s254.c: Likewise. * gcc.dg/vect/tsvc/vect-tsvc-s291.c: Likewise. Co-authored-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
2022-10-17libgcc: Move cfa_how into potential padding in struct frame_state_reg_infoFlorian Weimer1-5/+6
On many architectures, there is a padding gap after the how array member, and cfa_how can be moved there. This reduces the size of the struct and the amount of memory that uw_frame_state_for has to clear. There is no measurable performance benefit from this on x86-64 (even though the memset goes from 120 to 112 bytes), but it seems to be a good idea to do anyway. libgcc/ * unwind-dw2.h (struct frame_state_reg_info): Move cfa_how member and reduce its size.
2022-10-17libstdc++: Fix value of __cpp_lib_constexpr_charconvJonathan Wakely4-4/+4
libstdc++-v3/ChangeLog: * include/std/charconv (__cpp_lib_constexpr_charconv): Define to correct value. * include/std/version (__cpp_lib_constexpr_charconv): Likewise. * testsuite/20_util/to_chars/constexpr.cc: Check correct value. * testsuite/20_util/to_chars/version.cc: Likewise.
2022-10-17RISC-V: Fix format[NFC]Ju-Zhe Zhong1-1/+1
gcc/ChangeLog: * config/riscv/t-riscv: Change Tab into 2 space.
2022-10-17RISC-V: Reorganize mangle_builtin_type.[NFC]Ju-Zhe Zhong1-13/+13
Hi, this patch fixed my mistake in the previous commit patch. Since "mangle_builtin_type" is a global function will be called in riscv.cc. It's reasonable move it down and put them together stay with other global functions. gcc/ChangeLog: * config/riscv/riscv-vector-builtins.cc (mangle_builtin_type): Move down the function.
2022-10-17elf: ELF toolchain --without-{headers, newlib} should provide stdint.hArsen Arsenovic1-0/+5
stdint.h is considered a freestanding headers by C, and a valid stdint.h is required for certain parts of libstdc++' configuration, so we should simply provide one when we have no other way (i.e. newlib or user-specified sysroot) of getting one. * config.gcc: --target=*-elf --without-{newlib,headers} should provide stdint.h.
2022-10-17Initial Meteorlake SupportHu, Lin12-0/+6
gcc/ChangeLog: * common/config/i386/cpuinfo.h: (get_intel_cpu): Handle Meteorlake. * common/config/i386/i386-common.cc: (processor_alias_table): Add Meteorlake.
2022-10-17Initial Raptorlake SupportHaochen Jiang2-0/+4
gcc/ChangeLog: * common/config/i386/cpuinfo.h: (get_intel_cpu): Handle Raptorlake. * common/config/i386/i386-common.cc: (processor_alias_table): Add Raptorlake.
2022-10-17Daily bump.GCC Administrator2-1/+20
2022-10-16Add new constraints for upcoming autoinc fixesJeff Law2-0/+37
GCC does not allow a the operand of an autoinc addressing mode to overlap with another soure operand in the same insn. This is primarly enforced with insn conditions. However, cases can slip through LRA and reload. To address those scenarios we'll take an idea from the pdp11 port for describing the restriction in constraints as well. To implement that we need register classes and constraints which are "all general purpose hardware registers except r0". And similarly for r1..r7(sp). This patch adds those register classes and constraints, but does not yet use them. gcc/ * config/h8300/constraints.md (Z0..Z7): New register constraints. * config/h8300/h8300.h (reg_class): Add new classes. (REG_CLASS_NAMES): Similarly. (REG_CLASS_CONTENTS): Similarly.
2022-10-16Rename "z" constraint to "Zz" on the H8/300Jeff Law2-5/+5
I want to use Z as a multi-letter constraint. So first we have to adjust the existing use of Z. This does not affect code generation. gcc/ * config/h8300/constraints.md (Zz constraint): Renamed from "z". * config/h8300/movepush.md (movqi_h8sx, movhi_h8sx): Adjust constraint to use Zz instead of Z.
2022-10-15Fix bug in register move costing on H8/300Jeff Law1-1/+1
gcc/ * config/h8300/h8300.cc (h8300_register_move_cost): Fix typo.
2022-10-16Daily bump.GCC Administrator2-1/+32
2022-10-15libstdc++: Fix -Wunused-function warning in src/c++11/debug.ccJonathan Wakely1-8/+8
The only remaining use of print_raw is conditionally compiled, so when libstdc++ i built without debug backtrace support, there's an unused warning function for it. Move it inside the conditional block. libstdc++-v3/ChangeLog: * src/c++11/debug.cc (print_raw): Move inside #if block.
2022-10-15libstdc++: Implement constexpr std::to_chars for C++23 (P2291R3)Jonathan Wakely6-16/+275
Some of the helper functions use static constexpr local variables, which is not permitted in a core constant expression. Removing the 'static' seems to have negligible performance effect for __to_chars and __to_chars_16. For __from_chars_alnum_to_val removing the 'static' causes a significant performance impact for base 36 conversions. Use a consteval lambda instead. libstdc++-v3/ChangeLog: * include/bits/charconv.h (__to_chars_10_impl): Add constexpr for C++23. Remove 'static' from array. * include/std/charconv (__cpp_lib_constexpr_charconv): Define. (__to_chars, __to_chars_16): Remove 'static' from array, add constexpr. (__to_chars_10, __to_chars_8, __to_chars_2, __to_chars_i) (to_chars, __raise_and_add, __from_chars_pow2_base) (__from_chars_alnum, from_chars): Add constexpr. (__from_chars_alnum_to_val): Avoid local static during constant evaluation. Add constexpr. * include/std/version (__cpp_lib_constexpr_charconv): Define. * testsuite/20_util/from_chars/constexpr.cc: New test. * testsuite/20_util/to_chars/constexpr.cc: New test. * testsuite/20_util/to_chars/version.cc: New test.
2022-10-15libstdc++: Fix uses_allocator_construction args for cv pair (LWG 3677)Jonathan Wakely4-5/+54
The _Std_pair concept uses in <bits/uses_allocator_args.h> handles const qualified pairs, but not volatile qualified. That's because it just uses __is_pair which is specialized for const pairs. This removes the partial specialization __is_pair<const pair<T,U>>, so that __is_pair is now only true for cv-unqualified pairs. Then _Std_pair needs to explicitly use remove_cv_t for the argument to __is_pair. The other use of __is_pair is in map::insert(Pair&&) which doesn't want to handle volatile so should just use remove_const_t. libstdc++-v3/ChangeLog: * include/bits/stl_map.h (map::insert(Pair&&)): Use remove_const_t on argument to __is_pair. * include/bits/stl_pair.h (__is_pair<const pair<T,U>>): Remove partial specialization. * include/bits/uses_allocator_args.h (_Std_pair): Use remove_cv_t as per LWG 3677. * testsuite/20_util/uses_allocator/lwg3677.cc: New test.
2022-10-15Daily bump.GCC Administrator11-1/+402
2022-10-14preprocessor: C2x identifier rulesJoseph Myers6-33/+66
C2x has, like C++, adopted rules for identifiers based directly on an unversioned normative reference to Unicode. Make libcpp follow those rules for c2x / gnu2x standards (this involves bringing back a flag separate from the C++ one for whether to use these identifier rules, but this time enabled for all C++ language versions since that was the conclusion adopted for C++ identifier handling). There is one change here that affects C++. I believe the new normative requirement for NFC only applies to identifiers, not to the use of identifier-continue characters in pp-numbers, where there is no such requirement and so the diagnostic ought to be a warning not a pedwarn in pp-numbers, and that this is the case for both C and C++. Bootstrapped with no regressions for x86_64-pc-linux-gnu. libcpp/ * charset.cc (ucn_valid_in_identifier): Check xid_identifiers not cplusplus to determine whether to use CXX23 and NXX23 flags. * include/cpplib.h (struct cpp_options): Add xid_identifiers. * init.cc (struct lang_flags, lang_defaults): Add xid_identifiers. (cpp_set_lang): Set xid_identifiers. * lex.cc (warn_about_normalization): Add parameter identifier. Only pedwarn about non-NFC for identifiers, not pp-numbers. (_cpp_lex_direct): Update calls to warn_about_normalization. gcc/testsuite/ * gcc.dg/cpp/c2x-ucnid-1-utf8.c, gcc.dg/cpp/c2x-ucnid-1.c: New tests.
2022-10-14Fortran: fix check of polymorphic elements in data transfers [PR100971]Harald Anlauf2-0/+22
gcc/fortran/ChangeLog: PR fortran/100971 * resolve.cc (resolve_transfer): Extend check for permissibility of polymorphic elements in a data transfer to arrays. gcc/testsuite/ChangeLog: PR fortran/100971 * gfortran.dg/der_io_5.f90: New test.
2022-10-14Implement distinction between HONOR_SIGNED_ZEROS and MODE_HAS_SIGNED_ZEROS.Aldy Hernandez1-1/+8
gcc/ChangeLog: * value-range.cc (frange::set): Implement distinction between HONOR_SIGNED_ZEROS and MODE_HAS_SIGNED_ZEROS.
2022-10-14Implement range-op entry for __builtin_copysign.Aldy Hernandez1-0/+39
copysign(MAGNITUDE, SIGN) is implemented as the absolute of MAGNITUDE, with SIGN applied. If the sign of "SIGN" cannot be determined, we return a range of [-MAGNITUDE, +MAGNITUDE]. gcc/ChangeLog: * gimple-range-op.cc (class cfn_copysign): New. (gimple_range_op_handler::maybe_builtin_call): Add CFN_BUILT_IN_COPYSIGN*.
2022-10-14gfortran.dg/c-interop/deferred-character-2.f90: Fix dg-doTobias Burnus1-1/+1
gcc/testsuite/ * gfortran.dg/c-interop/deferred-character-2.f90: Use 'dg-do run'.
2022-10-14Check rvc_normal in real_isdenormal.Aldy Hernandez2-1/+6
[-Inf, -Inf] is being flushed to [-Inf, -0.0] because real_isdenormal is being overly pessimistic. It is missing a check for rvc_normal. This doesn't cause problems in real.cc because all uses of real_isdenormal are already on the rvc_normal path. The uses in value-range.cc however, are not. This patch adds a check for rvc_normal. gcc/ChangeLog: * real.h (real_isdenormal): Check rvc_normal. * value-range.cc (range_tests_floats): New test.
2022-10-14libstdc++: Disable all emergency EH pool code if obj-count == 0Jonathan Wakely1-1/+19
For a zero-sized static pool we can completely elide all code for the EH pool. We no longer need to adjust the static buffer size to ensure at least one free_entry can be created in it, because we no longer use a static buffer at all if obj_count == 0. If the buffer exists, obj_count >= 1 and the buffer will be much larger than sizeof(free_entry). libstdc++-v3/ChangeLog: * libsupc++/eh_alloc.cc [USE_POOL]: New macro. [!USE_POOL] (__gnu_cxx::__freeres, pool): Do not define. [_GLIBCXX_EH_POOL_STATIC] (pool::arena): Do not use std::max. (__cxxabiv1::__cxa_allocate_exception) [!USE_POOL]: Do not use pool. (__cxxabiv1::__cxa_free_exception) [!USE_POOL]: Likewise. (__cxxabiv1::__cxa_allocate_dependent_exception) [!USE_POOL]: Likewise. (__cxxabiv1::__cxa_free_dependent_exception) [!USE_POOL]: Likewise.
2022-10-14libstdc++: Simplify print_raw function for debug assertionsJonathan Wakely1-13/+8
Replace two uses of print_raw where it's clearer to just use fprintf directly. Then the only remaining use of print_raw is as the print_func argument of pretty_print. When called by pretty_print the count is either a positive integer or -1, so we can simplify print_raw itself. Remove the default argument, because it's never used. Remove the check for nbc == 0, which never happens (but would be harmless if it did). Replace the conditional expression with a single call to fprintf, using INT_MAX as the maximum length. libstdc++-v3/ChangeLog: * src/c++11/debug.cc (print_raw): Simplify. (print_word): Print indentation by calling fprintf directly. (_Error_formatter::_M_error): Print unindented string by calling fprintf directly.
2022-10-14Replace CFN_BUILTIN_SIGNBIT* cases with CASE_FLT_FN.Aldy Hernandez1-3/+1
gcc/ChangeLog: * gimple-range-op.cc (gimple_range_op_handler::maybe_builtin_call): Replace CFN_BUILTIN_SIGNBIT* cases with CASE_FLT_FN.
2022-10-14Normalize ranges over the range for both bounds when -ffinite-math-only.Aldy Hernandez1-0/+4
[-Inf, +Inf] was being chopped correctly for -ffinite-math-only, but [-Inf, -Inf] was not. This was latent because a bug in real_isdenormal is causing us to flush -Inf to zero. gcc/ChangeLog: * value-range.cc (frange::set): Normalize ranges for both bounds.
2022-10-14Drop -0.0 in frange::set() for !HONOR_SIGNED_ZEROS.Aldy Hernandez1-0/+8
Similar to what we do for NANs when !HONOR_NANS and Inf when flag_finite_math_only, we can remove -0.0 from the range at creation time. We were kinda sorta doing this because there is a bug in real_isdenormal that is causing flush_denormals_to_zero to saturate [x, -0.0] to [x, +0.0] when !HONOR_SIGNED_ZEROS. Fixing this bug (upcoming), causes us to leave -0.0 in places where we aren't expecting it (the intersection code). gcc/ChangeLog: * value-range.cc (frange::set): Drop -0.0 for !HONOR_SIGNED_ZEROS.
2022-10-14c++ modules: ICE with dynamic_cast [PR106304]Patrick Palka4-1/+25
The FUNCTION_DECL we build for __dynamic_cast has an empty DECL_CONTEXT but trees_out::tree_node expects FUNCTION_DECLs to have non-empty DECL_CONTEXT, thus we crash when streaming out the dynamic_cast in the below testcase. This patch naively fixes this by setting DECL_CONTEXT for __dynamic_cast appropriately. I suppose we should push it into the namespace too, like we do for __cxa_atexit which is similarly lazily declared. PR c++/106304 gcc/cp/ChangeLog: * constexpr.cc (cxx_dynamic_cast_fn_p): Check for abi_node instead of global_namespace. * rtti.cc (build_dynamic_cast_1): Set DECL_CONTEXT and DECL_SOURCE_LOCATION when building dynamic_cast_node. Push it into the namespace. gcc/testsuite/ChangeLog: * g++.dg/modules/pr106304_a.C: New test. * g++.dg/modules/pr106304_b.C: New test.
2022-10-14Add cases for CFN_BUILT_IN_SIGNBIT[FL].Aldy Hernandez1-0/+2
gcc/ChangeLog: * gimple-range-op.cc (gimple_range_op_handler::maybe_builtin_call): Add CFN_BUILT_IN_SIGNBIT[FL]* entries.
2022-10-14tree-optimization/107254 - check and support live lanes from permutesRichard Biener2-5/+77
The following fixes an omission from adding SLP permute nodes which is live lanes originating from those. We have to check that we can extract the lane and have to actually code generate them. PR tree-optimization/107254 * tree-vect-slp.cc (vect_slp_analyze_node_operations_1): For permutes also analyze live lanes. (vect_schedule_slp_node): For permutes also code generate live lane extracts. * gfortran.dg/vect/pr107254.f90: New testcase.
2022-10-14Fix PR target/107248Eric Botcazou1-12/+12
This is the infamous PR rtl-optimization/38644 rearing its ugly head for leaf functions on SPARC more than a decade later... Richard E.'s generic solution has never been implemented so let's do as other RISC back-ends did. gcc/ PR target/107248 * config/sparc/sparc.cc (sparc_expand_prologue): Emit a frame blockage for leaf functions. (sparc_flat_expand_prologue): Emit frame instead of full blockage. (sparc_expand_epilogue): Emit a frame blockage for leaf functions. (sparc_flat_expand_epilogue): Emit frame instead of full blockage.
2022-10-14libstdc++: Use markdown in Doxygen commentJonathan Wakely1-3/+3
This makes the comment easier to read in the source, without altering the Doxygen output. libstdc++-v3/ChangeLog: * include/std/iostream: Use markdown in Doxygen comment.
2022-10-14gcov: test line count for label in then/else blockJørgen Kvalsvik1-1/+25
Add a test to catch regression in line counts for labels on top of then/else blocks. Only the 'goto <label>' should contribute to the line counter for the label, not the if. gcc/testsuite/ChangeLog: * gcc.misc-tests/gcov-4.c: New testcase.
2022-10-14gcov: test switch/break line countsJørgen Kvalsvik2-6/+6
The coverage support will under some conditions decide to split edges to accurately report coverage. By running the test suite with/without this edge splitting a small diff shows up, addressed by this patch, which should catch future regressions. Removing the edge splitting: $ diff --git a/gcc/profile.cc b/gcc/profile.cc --- a/gcc/profile.cc +++ b/gcc/profile.cc @@ -1244,19 +1244,7 @@ branch_prob (bool thunk) Don't do that when the locuses match, so if (blah) goto something; is not computed twice. */ - if (last - && gimple_has_location (last) - && !RESERVED_LOCATION_P (e->goto_locus) - && !single_succ_p (bb) - && (LOCATION_FILE (e->goto_locus) - != LOCATION_FILE (gimple_location (last)) - || (LOCATION_LINE (e->goto_locus) - != LOCATION_LINE (gimple_location (last))))) - { - basic_block new_bb = split_edge (e); - edge ne = single_succ_edge (new_bb); - ne->goto_locus = e->goto_locus; - } + if ((e->flags & (EDGE_ABNORMAL | EDGE_ABNORMAL_CALL)) && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)) need_exit_edge = 1; Assuming the .gcov files from make chec-gcc RUNTESTFLAGS=gcov.exp are kept: $ diff -r no-split-edge with-split-edge | grep -C 2 -E "^[<>]\s\s" diff -r sans-split-edge/gcc/gcov-4.c.gcov with-split-edge/gcc/gcov-4.c.gcov 228c228 < -: 224: break; --- > 1: 224: break; 231c231 < -: 227: break; --- > #####: 227: break; 237c237 < -: 233: break; --- > 2: 233: break; gcc/testsuite/ChangeLog: * g++.dg/gcov/gcov-1.C: Add line count check. * gcc.misc-tests/gcov-4.c: Likewise.
2022-10-14middle-end, c++, i386, libgcc: std::bfloat16_t and __bf16 arithmetic supportJakub Jelinek47-309/+1339
Here is a complete patch to add std::bfloat16_t support on x86 (AArch64 and ARM left for later). Almost no BFmode optabs are added by the patch, so for binops/unops it extends to SFmode first and then truncates back to BFmode. For {HF,SF,DF,XF,TF}mode -> BFmode conversions libgcc has implementations of all those conversions so that we avoid double rounding, for BFmode -> {DF,XF,TF}mode conversions to avoid growing libgcc too much it emits BFmode -> SFmode conversion first and then converts to the even wider mode, neither step should be imprecise. For BFmode -> HFmode, it first emits a precise BFmode -> SFmode conversion and then SFmode -> HFmode, because neither format is subset or superset of the other, while SFmode is superset of both. expr.cc then contains a -ffast-math optimization of the BF -> SF and SF -> BF conversions if we don't optimize for space (and for the latter if -frounding-math isn't enabled either). For x86, perhaps truncsfbf2 optab could be defined for TARGET_AVX512BF16 but IMNSHO should FAIL if !flag_finite_math || flag_rounding_math || !flag_unsafe_math_optimizations, because I think the insn doesn't raise on sNaNs, hardcodes round to nearest and flushes denormals to zero. By default (unless x86 -fexcess-precision=16) we use float excess precision for BFmode, so truncate only on explicit casts and assignments. The patch introduces a single __bf16 builtin - __builtin_nansf16b, because (__bf16) __builtin_nansf ("") will drop the sNaN into qNaN, and uses f16b suffix instead of bf16 because there would be ambiguity on log vs. logb - __builtin_logbf16 could be either log with bf16 suffix or logb with f16 suffix. In other cases libstdc++ should mostly use __builtin_*f for std::bfloat16_t overloads (we have a problem with std::nextafter though but that one we have also for std::float16_t). 2022-10-14 Jakub Jelinek <jakub@redhat.com> gcc/ * tree-core.h (enum tree_index): Add TI_BFLOAT16_TYPE. * tree.h (bfloat16_type_node): Define. * tree.cc (excess_precision_type): Promote bfloat16_type_mode like float16_type_mode. (build_common_tree_nodes): Initialize bfloat16_type_node if BFmode is supported. * expmed.h (maybe_expand_shift): Declare. * expmed.cc (maybe_expand_shift): No longer static. * expr.cc (convert_mode_scalar): Don't ICE on BF -> HF or HF -> BF conversions. If there is no optab, handle BF -> {DF,XF,TF,HF} conversions as separate BF -> SF -> {DF,XF,TF,HF} conversions, add -ffast-math generic implementation for BF -> SF and SF -> BF conversions. * builtin-types.def (BT_BFLOAT16, BT_FN_BFLOAT16_CONST_STRING): New. * builtins.def (BUILT_IN_NANSF16B): New builtin. * fold-const-call.cc (fold_const_call): Handle CFN_BUILT_IN_NANSF16B. * config/i386/i386.cc (classify_argument): Handle E_BCmode. (ix86_libgcc_floating_mode_supported_p): Also return true for BFmode for -msse2. (ix86_mangle_type): Mangle BFmode as DF16b. (ix86_invalid_conversion, ix86_invalid_unary_op, ix86_invalid_binary_op): Remove. (TARGET_INVALID_CONVERSION, TARGET_INVALID_UNARY_OP, TARGET_INVALID_BINARY_OP): Don't redefine. * config/i386/i386-builtins.cc (ix86_bf16_type_node): Remove. (ix86_register_bf16_builtin_type): Use bfloat16_type_node rather than ix86_bf16_type_node, only create it if still NULL. * config/i386/i386-builtin-types.def (BFLOAT16): Likewise. * config/i386/i386.md (cbranchbf4, cstorebf4): New expanders. gcc/c-family/ * c-cppbuiltin.cc (c_cpp_builtins): If bfloat16_type_node, predefine __BFLT16_*__ macros and for C++23 also __STDCPP_BFLOAT16_T__. Predefine bfloat16_type_node related macros for -fbuilding-libgcc. * c-lex.cc (interpret_float): Handle CPP_N_BFLOAT16. gcc/c/ * c-typeck.cc (convert_arguments): Don't promote __bf16 to double. gcc/cp/ * cp-tree.h (extended_float_type_p): Return true for bfloat16_type_node. * typeck.cc (cp_compare_floating_point_conversion_ranks): Set extended{1,2} if mv{1,2} is bfloat16_type_node. Adjust comment. gcc/testsuite/ * lib/target-supports.exp (check_effective_target_bfloat16, check_effective_target_bfloat16_runtime, add_options_for_bfloat16): New. * gcc.dg/torture/bfloat16-basic.c: New test. * gcc.dg/torture/bfloat16-builtin.c: New test. * gcc.dg/torture/bfloat16-builtin-issignaling-1.c: New test. * gcc.dg/torture/bfloat16-complex.c: New test. * gcc.dg/torture/builtin-issignaling-1.c: Allow to be includable from bfloat16-builtin-issignaling-1.c. * gcc.dg/torture/floatn-basic.h: Allow to be includable from bfloat16-basic.c. * gcc.target/i386/vect-bfloat16-typecheck_2.c: Adjust expected diagnostics. * gcc.target/i386/sse2-bfloat16-scalar-typecheck.c: Likewise. * gcc.target/i386/vect-bfloat16-typecheck_1.c: Likewise. * g++.target/i386/bfloat_cpp_typecheck.C: Likewise. libcpp/ * include/cpplib.h (CPP_N_BFLOAT16): Define. * expr.cc (interpret_float_suffix): Handle bf16 and BF16 suffixes for C++. libgcc/ * config/i386/t-softfp (softfp_extensions): Add bfsf. (softfp_truncations): Add tfbf xfbf dfbf sfbf hfbf. (CFLAGS-extendbfsf2.c, CFLAGS-truncsfbf2.c, CFLAGS-truncdfbf2.c, CFLAGS-truncxfbf2.c, CFLAGS-trunctfbf2.c, CFLAGS-trunchfbf2.c): Add -msse2. * config/i386/libgcc-glibc.ver (GCC_13.0.0): Export __extendbfsf2 and __trunc{s,d,x,t,h}fbf2. * config/i386/sfp-machine.h (_FP_NANSIGN_B): Define. * config/i386/64/sfp-machine.h (_FP_NANFRAC_B): Define. * config/i386/32/sfp-machine.h (_FP_NANFRAC_B): Define. * soft-fp/brain.h: New file. * soft-fp/truncsfbf2.c: New file. * soft-fp/truncdfbf2.c: New file. * soft-fp/truncxfbf2.c: New file. * soft-fp/trunctfbf2.c: New file. * soft-fp/trunchfbf2.c: New file. * soft-fp/truncbfhf2.c: New file. * soft-fp/extendbfsf2.c: New file. libiberty/ * cp-demangle.h (D_BUILTIN_TYPE_COUNT): Increment. * cp-demangle.c (cplus_demangle_builtin_types): Add std::bfloat16_t entry. (cplus_demangle_type): Demangle DF16b. * testsuite/demangle-expected (_Z3xxxDF16b): New test.
2022-10-14c++: Excess precision for ? int : float or int == float [PR107097, PR82071, ↵Jakub Jelinek9-53/+99
PR87390] The following incremental patch implements the C11 behavior (for all C++ versions) for cond ? int : float cond ? float : int int cmp float float cmp int where int is any integral type, float any floating point type with excess precision and cmp ==, !=, >, <, >=, <= and <=>. 2022-10-14 Jakub Jelinek <jakub@redhat.com> PR c/82071 PR c/87390 PR c++/107097 gcc/cp/ * cp-tree.h (cp_ep_convert_and_check): Remove. * cvt.cc (cp_ep_convert_and_check): Remove. * call.cc (build_conditional_expr): Use excess precision for ?: with one arm floating and another integral. Don't convert first to semantic result type from integral types. (convert_like_internal): Don't call cp_ep_convert_and_check, instead just strip EXCESS_PRECISION_EXPR before calling cp_convert_and_check or cp_convert. * typeck.cc (cp_build_binary_op): Set may_need_excess_precision for comparisons or SPACESHIP_EXPR with at least one operand integral. Don't compute semantic_result_type if build_type is non-NULL. Call cp_convert_and_check instead of cp_ep_convert_and_check. gcc/testsuite/ * gcc.target/i386/excess-precision-8.c: For C++ wrap abort and exit declarations into extern "C" block. * gcc.target/i386/excess-precision-10.c: Likewise. * g++.target/i386/excess-precision-7.C: Remove. * g++.target/i386/excess-precision-8.C: New test. * g++.target/i386/excess-precision-9.C: Remove. * g++.target/i386/excess-precision-10.C: New test. * g++.target/i386/excess-precision-12.C: New test.
2022-10-14c++: Implement excess precision support for C++ [PR107097, PR323]Jakub Jelinek38-86/+613
The following patch implements excess precision support for C++. Like for C, it uses EXCESS_PRECISION_EXPR tree to say that its operand is evaluated in excess precision and what the semantic type of the expression is. In most places I've followed what the C FE does in similar spots, so e.g. for binary ops if one or both operands are already EXCESS_PRECISION_EXPR, strip those away or for operations that might need excess precision (+, -, *, /) check if the operands should use excess precision and convert to that type and at the end wrap into EXCESS_PRECISION_EXPR with the common semantic type. This patch follows the C99 handling where it differs from C11 handling. There are some cases which needed to be handled differently, the C FE can just strip EXCESS_PRECISION_EXPR (replace it with its operand) when handling explicit cast, but that IMHO isn't right for C++ - the discovery what exact conversion should be used (e.g. if user conversion or standard or their sequence) should be decided based on the semantic type (i.e. type of EXCESS_PRECISION_EXPR), and that decision continues in convert_like* where we pick the right user conversion, again, if say some class has ctor from double and long double and we are on ia32 with standard excess precision promoting float/double to long double, then we should pick the ctor from double. Or when some other class has ctor from just double, and EXCESS_PRECISION_EXPR semantic type is float, we should choose the user ctor from double, but actually just convert the long double excess precision to double and not to float first. We need to make sure even identity conversion converts from excess precision to the semantic one though, but if identity is chained with other conversions, we don't want the identity next_conversion to drop to semantic precision only to widen afterwards. The existing testcases tweaks were for cases on i686-linux where excess precision breaks those tests, e.g. if we have double d = 4.2; if (d == 4.2) then it does the expected thing only with -fexcess-precision=fast, because with -fexcess-precision=standard it is actually double d = 4.2; if ((long double) d == 4.2L) where 4.2L is different from 4.2. I've added -fexcess-precision=fast to some tests and changed other tests to use constants that are exactly representable and don't suffer from these excess precision issues. There is one exception, pr68180.C looks like a bug in the patch which is also present in the C FE (so I'd like to get it resolved incrementally in both). Reduced testcase: typedef float __attribute__((vector_size (16))) float32x4_t; float32x4_t foo(float32x4_t x, float y) { return x + y; } with -m32 -std=c11 -Wno-psabi or -m32 -std=c++17 -Wno-psabi it is rejected with: pr68180.c:2:52: error: conversion of scalar ‘long double’ to vector ‘float32x4_t’ {aka ‘__vector(4) float’} involves truncation but without excess precision (say just -std=c11 -Wno-psabi or -std=c++17 -Wno-psabi) it is accepted. Perhaps we should pass down the semantic type to scalar_to_vector and use the semantic type rather than excess precision type in the diagnostics. 2022-10-14 Jakub Jelinek <jakub@redhat.com> PR middle-end/323 PR c++/107097 gcc/ * doc/invoke.texi (-fexcess-precision=standard): Mention that the option now also works in C++. gcc/c-family/ * c-common.def (EXCESS_PRECISION_EXPR): Remove comment part about the tree being specific to C/ObjC. * c-opts.cc (c_common_post_options): Handle flag_excess_precision in C++ the same as in C. * c-lex.cc (interpret_float): Set const_type to excess_precision () even for C++. gcc/cp/ * parser.cc (cp_parser_primary_expression): Handle EXCESS_PRECISION_EXPR with REAL_CST operand the same as REAL_CST. * cvt.cc (cp_ep_convert_and_check): New function. * call.cc (build_conditional_expr): Add excess precision support. When type_after_usual_arithmetic_conversions returns error_mark_node, use gcc_checking_assert that it is because of uncomparable floating point ranks instead of checking all those conditions and make it work also with complex types. (convert_like_internal): Likewise. Add NESTED_P argument, pass true to recursive calls to convert_like. (convert_like): Add NESTED_P argument, pass it through to convert_like_internal. For other overload pass false to it. (convert_like_with_context): Pass false to NESTED_P. (convert_arg_to_ellipsis): Add excess precision support. (magic_varargs_p): For __builtin_is{finite,inf,inf_sign,nan,normal} and __builtin_fpclassify return 2 instead of 1, document what it means. (build_over_call): Don't handle former magic 2 which is no longer used, instead for magic 1 remove EXCESS_PRECISION_EXPR. (perform_direct_initialization_if_possible): Pass false to NESTED_P convert_like argument. * constexpr.cc (cxx_eval_constant_expression): Handle EXCESS_PRECISION_EXPR. (potential_constant_expression_1): Likewise. * pt.cc (tsubst_copy, tsubst_copy_and_build): Likewise. * cp-tree.h (cp_ep_convert_and_check): Declare. * cp-gimplify.cc (cp_fold): Handle EXCESS_PRECISION_EXPR. * typeck.cc (cp_common_type): For COMPLEX_TYPEs, return error_mark_node if recursive call returned it. (convert_arguments): For magic 1 remove EXCESS_PRECISION_EXPR. (cp_build_binary_op): Add excess precision support. When cp_common_type returns error_mark_node, use gcc_checking_assert that it is because of uncomparable floating point ranks instead of checking all those conditions and make it work also with complex types. (cp_build_unary_op): Likewise. (cp_build_compound_expr): Likewise. (build_static_cast_1): Remove EXCESS_PRECISION_EXPR. gcc/testsuite/ * gcc.target/i386/excess-precision-1.c: For C++ wrap abort and exit declarations into extern "C" block. * gcc.target/i386/excess-precision-2.c: Likewise. * gcc.target/i386/excess-precision-3.c: Likewise. Remove check_float_nonproto and check_double_nonproto tests for C++. * gcc.target/i386/excess-precision-7.c: For C++ wrap abort and exit declarations into extern "C" block. * gcc.target/i386/excess-precision-9.c: Likewise. * g++.target/i386/excess-precision-1.C: New test. * g++.target/i386/excess-precision-2.C: New test. * g++.target/i386/excess-precision-3.C: New test. * g++.target/i386/excess-precision-4.C: New test. * g++.target/i386/excess-precision-5.C: New test. * g++.target/i386/excess-precision-6.C: New test. * g++.target/i386/excess-precision-7.C: New test. * g++.target/i386/excess-precision-9.C: New test. * g++.target/i386/excess-precision-11.C: New test. * c-c++-common/dfp/convert-bfp-10.c: Add -fexcess-precision=fast as dg-additional-options. * c-c++-common/dfp/compare-eq-const.c: Likewise. * g++.dg/cpp1z/constexpr-96862.C: Likewise. * g++.dg/cpp1z/decomp12.C (main): Use 2.25 instead of 2.3 to avoid excess precision differences. * g++.dg/other/thunk1.C: Add -fexcess-precision=fast as dg-additional-options. * g++.dg/vect/pr64410.cc: Likewise. * g++.dg/cpp1y/pr68180.C: Likewise. * g++.dg/vect/pr89653.cc: Likewise. * g++.dg/cpp0x/variadic-tuple.C: Likewise. * g++.dg/cpp0x/nsdmi-union1.C: Use 4.25 instead of 4.2 to avoid excess precision differences. * g++.old-deja/g++.brendan/copy9.C: Add -fexcess-precision=fast as dg-additional-options. * g++.old-deja/g++.brendan/overload7.C: Likewise.
2022-10-14c: C2x storage class specifiers in compound literalsJoseph Myers16-14/+369
Implement the C2x feature of storage class specifiers in compound literals. Such storage class specifiers (static, register or thread_local; also constexpr, but we don't yet have C2x constexpr support implemented) can be used before the type name (not mixed with type specifiers, unlike in declarations) and have the same semantics and constraints as for declarations of named objects. Also allow GNU __thread to be used, given that thread_local can be. Bootstrapped with no regressions for x86_64-pc-linux-gnu. gcc/c/ * c-decl.cc (build_compound_literal): Add parameter scspecs. Handle storage class specifiers. * c-parser.cc (c_token_starts_compound_literal) (c_parser_compound_literal_scspecs): New. (c_parser_postfix_expression_after_paren_type): Add parameter scspecs. Call pedwarn_c11 for use of storage class specifiers. Update call to build_compound_literal. (c_parser_cast_expression, c_parser_sizeof_expression) (c_parser_alignof_expression): Handle storage class specifiers for compound literals. Update calls to c_parser_postfix_expression_after_paren_type. (c_parser_postfix_expression): Update syntax comment. * c-tree.h (build_compound_literal): Update prototype. * c-typeck.cc (c_mark_addressable): Diagnose taking address of register compound literal. gcc/testsuite/ * gcc.dg/c11-complit-1.c, gcc.dg/c11-complit-2.c, gcc.dg/c11-complit-3.c, gcc.dg/c2x-complit-2.c, gcc.dg/c2x-complit-3.c, gcc.dg/c2x-complit-4.c, gcc.dg/c2x-complit-5.c, gcc.dg/c2x-complit-6.c, gcc.dg/c2x-complit-7.c, gcc.dg/c90-complit-2.c, gcc.dg/gnu2x-complit-1.c, gcc.dg/gnu2x-complit-2.c: New tests.
2022-10-14Daily bump.GCC Administrator9-1/+241
2022-10-14Fix bogus -Wstringop-overflow warningEric Botcazou2-2/+22
If you compile the testcase with -O2 -fno-inline -Wall, you get: In function 'process_array3': cc1: warning: 'process_array4' accessing 4 bytes in a region of size 3 [- Wstringop-overflow=] cc1: note: referencing argument 1 of type 'char[4]' t.c:6:6: note: in a call to function 'process_array4' 6 | void process_array4 (char a[4], int n) | ^~~~~~~~~~~~~~ cc1: warning: 'process_array4' accessing 4 bytes in a region of size 3 [- Wstringop-overflow=] cc1: note: referencing argument 1 of type 'char[4]' t.c:6:6: note: in a call to function 'process_array4' That's because the ICF IPA pass has identified the two functions and turned process_array3 into a wrapper of process_array4. gcc/ * gimple-ssa-warn-access.cc (pass_waccess::check_call): Return early for calls made from thunks. gcc/testsuite/ * gcc.dg/Wstringop-overflow-89.c: New test.
2022-10-13c++: trivial formatting cleanupsJason Merrill5-15/+17
Split out from the C++ contracts patch. gcc/cp/ChangeLog: * cp-tree.h: Fix whitespace. * parser.h: Fix whitespace. * decl.cc: Fix whitespace. * parser.cc: Fix whitespace. * pt.cc: Fix whitespace.
2022-10-13analyzer: fix ICE introduced in r13-3168 [PR107210]David Malcolm2-1/+18
gcc/analyzer/ChangeLog: PR analyzer/107210 * svalue.cc (constant_svalue::maybe_fold_bits_within): Only attempt to extract individual bits when tree_fits_uhwi_p. gcc/testsuite/ChangeLog: PR analyzer/107210 * gfortran.dg/analyzer/pr107210.f90: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-10-13libgomp: Add Fortran testcases for omp_in_explicit_taskTobias Burnus7-0/+247
Fortranized testcases of commits r13-3257-ga58a965eb73 and r13-3258-g0ec4e93fb9f. libgomp/ChangeLog: * testsuite/libgomp.fortran/task-7.f90: New test. * testsuite/libgomp.fortran/task-8.f90: New test. * testsuite/libgomp.fortran/task-in-explicit-1.f90: New test. * testsuite/libgomp.fortran/task-in-explicit-2.f90: New test. * testsuite/libgomp.fortran/task-in-explicit-3.f90: New test. * testsuite/libgomp.fortran/task-reduction-17.f90: New test. * testsuite/libgomp.fortran/task-reduction-18.f90: New test.