Age  Commit message  Author  Files  Lines
2022-03-16Daily bump.GCC Administrator5-1/+98
2022-03-15analyzer: add test coverage for PR 95000David Malcolm1-0/+38
PR analyzer/95000 isn't fixed yet; add test coverage with XFAILs. gcc/testsuite/ChangeLog: PR analyzer/95000 * gcc.dg/analyzer/pr95000-1.c: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-03-15analyzer: presize m_cluster_map in store copy ctorDavid Malcolm1-1/+2
Testing cc1 on pr93032-mztools-unsigned-char.c
Benchmark #1: (without patch)
  Time (mean ± σ): 338.8 ms ± 13.6 ms [User: 323.2 ms, System: 14.2 ms]
  Range (min … max): 326.7 ms … 363.1 ms 10 runs
Benchmark #2: (with patch)
  Time (mean ± σ): 332.3 ms ± 12.8 ms [User: 316.6 ms, System: 14.3 ms]
  Range (min … max): 322.5 ms … 357.4 ms 10 runs
Summary
  ./cc1.new ran 1.02 ± 0.06 times faster than ./cc1.old
gcc/analyzer/ChangeLog: * store.cc (store::store): Presize m_cluster_map. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-03-15rs6000: Fix invalid address passed to __builtin_mma_disassemble_acc [PR104923]Peter Bergner2-2/+28
The mma_disassemble_output_operand predicate is too lenient on the types of addresses it will accept, leading to combine creating invalid addresses that eventually lead to ICEs in LRA. The solution is to restrict the addresses to indirect, indexed or those valid for quad memory accesses. 2022-03-15 Peter Bergner <bergner@linux.ibm.com> gcc/ PR target/104923 * config/rs6000/predicates.md (mma_disassemble_output_operand): Restrict acceptable MEM addresses. gcc/testsuite/ PR target/104923 * gcc.target/powerpc/pr104923.c: New test.
2022-03-15c++: extraneous access error with ambiguous lookup [PR103177]Patrick Palka2-21/+34
When a lookup is ambiguous, lookup_member still attempts to check access of the first member found before diagnosing the ambiguity and propagating the error, and this may cause us to issue an extraneous access error as in the testcase below (for B1::foo). This patch fixes this by swapping the order of the ambiguity and access checks within lookup_member. In passing, since the only thing that could go wrong during lookup_field_r is ambiguity, we might as well hardcode that in lookup_member and get rid of lookup_field_info::errstr. PR c++/103177 gcc/cp/ChangeLog: * search.cc (lookup_field_info::errstr): Remove this data member. (lookup_field_r): Don't set errstr. (lookup_member): Check ambiguity before checking access. Simplify accordingly after errstr removal. Exit early upon error or empty result. gcc/testsuite/ChangeLog: * g++.dg/lookup/ambig6.C: New test.
2022-03-15riscv: Allow -Wno-psabi to turn off ABI warnings [PR91229]Jakub Jelinek1-4/+4
While checking if all targets honor -Wno-psabi for ABI related warnings or messages, I found that almost all do, except for riscv. In the testsuite when we want to ignore ABI related messages we typically use -Wno-psabi -w, but it would be nice to get rid of those -w uses eventually. The following allows silencing those warnings with -Wno-psabi rather than just -w even on riscv. 2022-03-15 Jakub Jelinek <jakub@redhat.com> PR target/91229 * config/riscv/riscv.cc (riscv_pass_aggregate_in_fpr_pair_p, riscv_pass_aggregate_in_fpr_and_gpr_p): Pass OPT_Wpsabi instead of 0 to warning calls.
2022-03-15i386: Use no-mmx,no-sse for LIBGCC2_UNWIND_ATTRIBUTE [PR104890]Jakub Jelinek1-3/+3
Regardless of the outcome of the general-regs-only stuff in x86gprintrin.h, apparently general-regs-only is a much bigger hammer than no-sse, and e.g. using 387 instructions in the unwinder isn't a big deal, as it never needs to realign the stack because of it. So, the following patch uses no-sse (and adds no-mmx to it, even when not strictly needed). 2022-03-15 Jakub Jelinek <jakub@redhat.com> PR target/104890 * config/i386/i386.h (LIBGCC2_UNWIND_ATTRIBUTE): Use no-mmx,no-sse instead of general-regs-only.
2022-03-15PR tree-optimization/101895: Fold VEC_PERM to help recognize FMA.Roger Sayle2-2/+30
This patch resolves PR tree-optimization/101895, a missed optimization regression, by adding a constant folding simplification to match.pd to simplify the transform "mult; vec_perm; plus" into "vec_perm; mult; plus" with the aim that keeping the multiplication and addition next to each other allows them to be recognized as fused-multiply-add on suitable targets. This transformation requires a tweak to match.pd's vec_same_elem_p predicate to handle CONSTRUCTOR_EXPRs using the same SSA_NAME_DEF_STMT idiom used for constructors elsewhere in match.pd. The net effect is that the following code example:
    void foo(float * __restrict__ a, float b, float *c) {
      a[0] = c[0]*b + a[0];
      a[1] = c[2]*b + a[1];
      a[2] = c[1]*b + a[2];
      a[3] = c[3]*b + a[3];
    }
when compiled on x86_64-pc-linux-gnu with -O2 -march=cascadelake currently generates:
    vbroadcastss %xmm0, %xmm0
    vmulps (%rsi), %xmm0, %xmm0
    vpermilps $216, %xmm0, %xmm0
    vaddps (%rdi), %xmm0, %xmm0
    vmovups %xmm0, (%rdi)
    ret
but with this patch now generates the improved:
    vpermilps $216, (%rsi), %xmm1
    vbroadcastss %xmm0, %xmm0
    vfmadd213ps (%rdi), %xmm0, %xmm1
    vmovups %xmm1, (%rdi)
    ret
2022-03-15 Roger Sayle <roger@nextmovesoftware.com> Marc Glisse <marc.glisse@inria.fr> Richard Biener <rguenther@suse.de> gcc/ChangeLog PR tree-optimization/101895 * match.pd (vec_same_elem_p): Handle CONSTRUCTOR_EXPR def. (plus (vec_perm (mult ...) ...) ...): New reordering simplification. gcc/testsuite/ChangeLog PR tree-optimization/101895 * gcc.target/i386/pr101895.c: New test case.
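The reordering described above relies on the fact that multiplying every lane by the same scalar commutes with permuting the lanes. A minimal scalar sketch of that equivalence follows; the `Vec4` alias and the `permute`, `scale` and `add` helpers are hypothetical names for illustration, not GCC internals, and the lane order {0, 2, 1, 3} is taken from the `foo` example in the message.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 4-lane vector type for illustration; not a GCC internal.
using Vec4 = std::array<float, 4>;
using Sel4 = std::array<std::size_t, 4>;

// Lane selection: result[i] = v[sel[i]].
Vec4 permute(const Vec4 &v, const Sel4 &sel) {
  Vec4 r{};
  for (std::size_t i = 0; i < 4; ++i) r[i] = v[sel[i]];
  return r;
}

// Multiply every lane by the same scalar (the "same element" operand).
Vec4 scale(const Vec4 &v, float b) {
  Vec4 r{};
  for (std::size_t i = 0; i < 4; ++i) r[i] = v[i] * b;
  return r;
}

// Lane-wise addition.
Vec4 add(const Vec4 &x, const Vec4 &y) {
  Vec4 r{};
  for (std::size_t i = 0; i < 4; ++i) r[i] = x[i] + y[i];
  return r;
}

// Original order: mult; vec_perm; plus.
Vec4 mult_perm_plus(const Vec4 &a, float b, const Vec4 &c, const Sel4 &sel) {
  return add(permute(scale(c, b), sel), a);
}

// Reordered: vec_perm; mult; plus.  The multiply and add are now adjacent,
// which is the shape a fused-multiply-add pattern needs.
Vec4 perm_mult_plus(const Vec4 &a, float b, const Vec4 &c, const Sel4 &sel) {
  return add(scale(permute(c, sel), b), a);
}

// Small self-check with exactly representable values.
void check_reordering() {
  const Vec4 a{1, 2, 3, 4};
  const Vec4 c{10, 20, 30, 40};
  const Sel4 sel{0, 2, 1, 3};
  assert(mult_perm_plus(a, 2.0f, c, sel) == perm_mult_plus(a, 2.0f, c, sel));
}
```

Both orders compute `c[sel[i]] * b + a[i]` in each lane, so only the position of the permute changes.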
2022-03-15c++: Fix up cp_parser_skip_to_pragma_eol [PR104623]Jakub Jelinek2-2/+9
We ICE on the following testcase, because we tentatively parse it multiple times and the erroneous attribute syntax results in cp_parser_skip_to_end_of_statement, which when seeing CPP_PRAGMA (can be any deferred one, OpenMP/OpenACC/ivdep etc.) it calls cp_parser_skip_to_pragma_eol, which calls cp_lexer_purge_tokens_after. That call purges all the tokens from CPP_PRAGMA until CPP_PRAGMA_EOL, excluding the initial CPP_PRAGMA though (but including the final CPP_PRAGMA_EOL). This means the second time we parse this, we see CPP_PRAGMA with no tokens after it from the pragma, most importantly not the CPP_PRAGMA_EOL, so either if it is the last pragma in the TU, we ICE, or if there are other pragmas we treat everything in between as a pragma. I've tried various things, including making the CPP_PRAGMA token itself also purged, or changing the cp_parser_skip_to_end_of_statement (and cp_parser_skip_to_end_of_block_or_statement) to call it with NULL instead of token, so that this purging isn't done there, but each patch resulted in lots of regressions. But removing the purging altogether surprisingly doesn't regress anything, and I think it is the right thing, if we e.g. parse tentatively, why can't we parse the pragma multiple times or at least skip over it? 2022-03-15 Jakub Jelinek <jakub@redhat.com> PR c++/104623 * parser.cc (cp_parser_skip_to_pragma_eol): Don't purge any tokens. * g++.dg/gomp/pr104623.C: New test.
2022-03-15ifcvt: Punt if not onlyjump_p for find_if_case_{1,2} [PR104814]Jakub Jelinek2-4/+40
find_if_case_{1,2} implicitly assumes conditional jumps and rewrites them, so if they have extra side-effects or are say asm goto, things don't work well, either the side-effects are lost or we could ICE. In particular, the testcase below on s390x has there a doloop instruction that decrements a register in addition to testing it for non-zero and conditionally jumping based on that. The following patch fixes that by punting for !onlyjump_p case, i.e. if there are side-effects in the jump instruction or it isn't a plain PC setter. Also, it assumes BB_END (test_bb) will be always non-NULL, because basic blocks with 2 non-abnormal successor edges should always have some instruction at the end that determines which edge to take. 2022-03-15 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/104814 * ifcvt.cc (find_if_case_1, find_if_case_2): Punt if test_bb doesn't end with onlyjump_p. Assume BB_END (test_bb) is always non-NULL. * gcc.c-torture/execute/pr104814.c: New test.
2022-03-14Avoid -Wdangling-pointer for by-transparent-reference arguments [PR104436].Martin Sebor3-1/+66
This change avoids -Wdangling-pointer for by-value arguments transformed into by-transparent-reference. Resolves: PR middle-end/104436 - spurious -Wdangling-pointer assigning local address to a class passed by value gcc/ChangeLog: PR middle-end/104436 * gimple-ssa-warn-access.cc (pass_waccess::check_dangling_stores): Check for warning suppression. Avoid by-value arguments transformed into by-transparent-reference. gcc/testsuite/ChangeLog: PR middle-end/104436 * c-c++-common/Wdangling-pointer-8.c: New test. * g++.dg/warn/Wdangling-pointer-5.C: New test.
2022-03-15Daily bump.GCC Administrator8-1/+124
2022-03-14Update gcc de.po, fr.po, sv.poJoseph Myers3-2526/+1780
* de.po, fr.po, sv.po: Update.
2022-03-14Fix libitm.c/memset-1.c test fails with new peephole2s.Roger Sayle2-2/+13
My sincere apologies for the breakage, but alas handling SImode in the recently added "xorl;movb -> movzbl" peephole2 turns out to be slightly more complicated than just using SWI48 as a mode iterator. I'd failed to check the machine description carefully, but the *zero_extend<mode>si2 define_insn is conditionally defined, based on x86 target tuning using TARGET_ZERO_EXTEND_WITH_AND, and therefore unavailable on 486 and pentium unless optimizing the code for size. It turns out that the libitm testsuite specifies -m486 with make check RUNTESTFLAGS="--target_board='unix{-m32}'" and therefore encounters/catches this oversight. Fixed by adding the appropriate conditions to the new peephole2 patterns. 2022-03-14 Roger Sayle <roger@nextmovesoftware.com> Uroš Bizjak <ubizjak@gmail.com> gcc/ChangeLog * config/i386/i386.md (peephole2 xorl;movb -> movzbl): Disable transformation when *zero_extend<mode>si2 is not available. gcc/testsuite/ChangeLog * gcc.target/i386/pr98335.c: Skip this test if tuning for i486 or pentium, and not optimizing for size.
2022-03-15Enable libsanitizer build on mips64Xi Ruoyao5-5/+17
Bootstrapped and regtested on mips64-linux-gnuabi64. bootstrap-ubsan revealed 3 bugs (PR 104842, 104843, 104851). bootstrap-asan did not reveal any new bug. gcc/ * config/mips/mips.h (SUBTARGET_SHADOW_OFFSET): Define. * config/mips/mips.cc (mips_option_override): Make -fsanitize=address imply -fasynchronous-unwind-tables. This is needed by libasan for stack backtrace on MIPS. (mips_asan_shadow_offset): Return SUBTARGET_SHADOW_OFFSET. gcc/testsuite: * c-c++-common/asan/global-overflow-1.c: Skip for MIPS with some optimization levels because inaccurate debug info is causing dg-output mismatch on line numbers. * g++.dg/asan/large-func-test-1.C: Likewise. libsanitizer/ * configure.tgt: Enable build on mips*64*-*-linux*.
2022-03-15libsanitizer: cherry-pick db7bca28638e from upstreamXi Ruoyao1-2/+2
libsanitizer/ * sanitizer_common/sanitizer_atomic_clang.h: Ensure sanitizer_atomic_clang_mips.h is only included for O32.
2022-03-14lra: Fix up debug_p handling in lra_substitute_pseudo [PR104778]Jakub Jelinek2-2/+84
The following testcase ICEs on powerpc-linux, because lra_substitute_pseudo substitutes (const_int 1) into a subreg operand. First a subreg of subreg of a reg appears in a debug insn (which surely is invalid outside of debug insns, but in debug insns we allow even what is normally invalid in RTL like subregs which the target doesn't like, because either dwarf2out is able to handle it, or we just throw away the location expression, making some var <optimized out>). lra_substitute_pseudo already has some code to deal with specifically SUBREG of REG with the REG being substituted for VOIDmode constant, but that doesn't cover this case, so the following patch extends lra_substitute_pseudo for debug_p mode to treat stuff like e.g. combiner's subst function to ensure we don't lose mode which is essential for the IL. 2022-03-14 Jakub Jelinek <jakub@redhat.com> PR debug/104778 * lra.cc (lra_substitute_pseudo): For debug_p mode, simplify SUBREG, ZERO_EXTEND, SIGN_EXTEND, FLOAT or UNSIGNED_FLOAT if recursive call simplified the first operand into VOIDmode constant. * gcc.target/powerpc/pr104778.c: New test.
2022-03-14libstdc++: Fix reading UTF-8 characters for 16-bit targets [PR104875]Jonathan Wakely1-7/+7
The current code in read_utf8_code_point assumes that integer promotion will create a 32-bit int, but that's not true for 16-bit targets like msp430 and avr. This changes the intermediate variables used for each octet from unsigned char to char32_t, so that (c << N) works correctly when N > 8. libstdc++-v3/ChangeLog: PR libstdc++/104875 * src/c++11/codecvt.cc (read_utf8_code_point): Use char32_t to hold octets that will be left-shifted.
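The promotion issue above can be illustrated with a small decoder. This is a minimal sketch, not the libstdc++ implementation: `decode_utf8_4` is a hypothetical helper that performs no validation, and it only shows why widening each octet to `char32_t` before shifting matters when `int` is 16 bits wide.

```cpp
#include <cassert>

// Hypothetical helper, not the libstdc++ read_utf8_code_point function.
// Decodes one 4-byte UTF-8 sequence without validating it.
constexpr char32_t decode_utf8_4(unsigned char b0, unsigned char b1,
                                 unsigned char b2, unsigned char b3) {
  // Widen each octet to char32_t first.  If these stayed unsigned char,
  // integer promotion would turn them into int, and on a target with
  // 16-bit int a shift by 12 or 18 bits would overflow or be undefined.
  const char32_t c0 = b0 & 0x07;  // payload bits of the lead byte
  const char32_t c1 = b1 & 0x3F;  // payload bits of each continuation byte
  const char32_t c2 = b2 & 0x3F;
  const char32_t c3 = b3 & 0x3F;
  return (c0 << 18) | (c1 << 12) | (c2 << 6) | c3;
}

// U+1F600 is encoded as F0 9F 98 80.
static_assert(decode_utf8_4(0xF0, 0x9F, 0x98, 0x80) == U'\U0001F600',
              "4-byte sequence decodes to the expected code point");

void check_decode() {
  assert(decode_utf8_4(0xF0, 0x9F, 0x98, 0x80) == 0x1F600);
}
```

On targets with a 32-bit `int` the narrower type happens to work, which is why the problem only shows up on targets such as msp430 and avr.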
2022-03-14top-level: Fix comment about --enable-libstdcxx in configureJonathan Wakely2-2/+2
The custom option for enabling/disabling libstdc++ is not spelled the same as the directory name: AC_ARG_ENABLE(libstdcxx, AS_HELP_STRING([--disable-libstdcxx], [do not build libstdc++-v3 directory]) The comment referring to it later uses the wrong name. ChangeLog: * configure.ac: Fix incorrect option in comment. * configure: Regenerate.
2022-03-14c++: Reject __builtin_clear_padding on non-trivially-copyable types with one exception [PR102586]Jakub Jelinek3-0/+76
As mentioned by Jason in the PR, non-trivially-copyable types (or non-POD types for purposes of layout?) can be base classes of derived classes in which the padding in those non-trivially-copyable types can be reused for some real data members or even the layout can change and data members can be moved to other positions. __builtin_clear_padding is right now used for multiple purposes, in <atomic> where it isn't used yet but was planned as the main spot it can be used for trivially copyable types only, ditto for std::bit_cast where we also use it. It is used for OpenMP long double atomics too but long double is trivially copyable, and lastly for -ftrivial-auto-var-init=. The following patch restricts the builtin to pointers to trivially-copyable types, with the exception when it is called directly on an address of a variable, in that case already the FE can verify it is the complete object type and so it is safe to clear all the paddings in it. 2022-03-14 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/102586 gcc/ * doc/extend.texi (__builtin_clear_padding): Clarify that for C++ argument type should be pointer to trivially-copyable type unless it is address of a variable or parameter. gcc/cp/ * call.cc (build_cxx_call): Diagnose __builtin_clear_padding where first argument's type is pointer to non-trivially-copyable type unless it is address of a variable or parameter. gcc/testsuite/ * g++.dg/cpp2a/builtin-clear-padding1.C: New test.
2022-03-14i386: Fix up _mm_loadu_si{16,32} [PR99754]Jakub Jelinek3-3/+46
These intrinsics are supposed to do an unaligned may_alias load of a 16-bit or 32-bit value and store it as the first element of a 128-bit integer vector, with all other elements cleared. The current _mm_storeu_* implementation implements that correctly, uses __*_u types to do the store and extracts the first element of a vector into it. But _mm_loadu_si{16,32} gets it all wrong. It performs an aligned non-may_alias load and because _mm_set_epi{16,32} has the args reversed, it also inserts it into the last vector element instead of first. The following patch fixes that. Note, while the Intrinsics guide for _mm_loadu_si32 says SSE2, for _mm_loadu_si16 it says strangely SSE. But the intrinsics returns __m128i, which is only defined in emmintrin.h, and _mm_set_epi16 is also only SSE2 and later in emmintrin.h. Even clang defines it in emmintrin.h and ends up with inlining failure when calling _mm_loadu_si16 from sse,no-sse2 function. So, isn't that a bug in the intrinsic guide instead? 2022-03-14 Jakub Jelinek <jakub@redhat.com> PR target/99754 * config/i386/emmintrin.h (_mm_loadu_si32): Put loaded value into first rather than last element of the vector, use __m32_u to do a really unaligned load, use just 0 instead of (int)0. (_mm_loadu_si16): Put loaded value into first rather than last element of the vector, use __m16_u to do a really unaligned load, use just 0 instead of (short)0. * gcc.target/i386/pr99754-1.c: New test. * gcc.target/i386/pr99754-2.c: New test.
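The intended semantics of the load intrinsics can be modelled portably. The sketch below is an assumption-laden model, not the emmintrin.h code: `Vec128i32` stands in for `__m128i` viewed as four 32-bit lanes, and both function names are hypothetical. It contrasts the documented behaviour (value in the first element, unaligned and alias-safe load) with the element placement the message describes as wrong.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical model of __m128i as four 32-bit lanes; lane 0 is the
// "first element" in the sense used by the commit message.
using Vec128i32 = std::array<std::uint32_t, 4>;

// Model of the intended _mm_loadu_si32 behaviour: an unaligned,
// aliasing-safe 32-bit load placed in lane 0, all other lanes cleared.
Vec128i32 loadu_si32_model(const void *p) {
  std::uint32_t v;
  std::memcpy(&v, p, sizeof v);  // safe for any alignment and any pointee type
  return Vec128i32{v, 0u, 0u, 0u};
}

// Model of the element placement described as the bug: because the
// set_epi arguments are listed from the highest element down, the value
// ended up in the last lane.
Vec128i32 loadu_si32_reversed_model(const void *p) {
  std::uint32_t v;
  std::memcpy(&v, p, sizeof v);
  return Vec128i32{0u, 0u, 0u, v};
}

void check_load_models() {
  const unsigned char buf[8] = {0xAA, 1, 2, 3, 4, 0xBB, 0xCC, 0xDD};
  std::uint32_t expect;
  std::memcpy(&expect, buf + 1, sizeof expect);  // deliberately misaligned
  assert(loadu_si32_model(buf + 1)[0] == expect);
  assert(loadu_si32_reversed_model(buf + 1)[3] == expect);
}
```

Using `memcpy` here plays the role that the `__m32_u` and `__m16_u` types play in the real header: it avoids both the alignment and the aliasing assumptions of a plain pointer dereference.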
2022-03-14Spelling fix - cannott -> cannot [PR104899]Jakub Jelinek2-2/+3
This fixes typos and while changing that, also uses %< %> around attribute names and fixes up formatting. 2022-03-14 Jakub Jelinek <jakub@redhat.com> PR other/104899 * config/bfin/bfin.cc (bfin_handle_longcall_attribute): Fix a typo in diagnostic message - cannott -> cannot. Use %< and %> around names of attribute. Avoid too long line. * range-op.cc (operator_logical_and::op1_range): Fix up a typo in comment - cannott -> cannot. Use 2 spaces after . instead of one.
2022-03-14Don't fold builtin into gimple when isa mismatches.liuhongt4-40/+115
The patch fixes ICE in ix86_gimple_fold_builtin. gcc/ChangeLog: PR target/104666 * config/i386/i386-expand.cc (ix86_check_builtin_isa_match): New func. (ix86_expand_builtin): Move code to ix86_check_builtin_isa_match and call it. * config/i386/i386-protos.h (ix86_check_builtin_isa_match): Declare. * config/i386/i386.cc (ix86_gimple_fold_builtin): Don't fold builtin into gimple when isa mismatches. gcc/testsuite/ChangeLog: * gcc.target/i386/pr104666.c: New test.
2022-03-14Daily bump.GCC Administrator6-1/+29
2022-03-13d: Merge upstream dmd 02a3fafc6, druntime 26b58167, phobos 16cb085b5.Iain Buclaw75-975/+1792
D front-end changes: - Import dmd v2.099.0. - The deprecation period for D1-style operators has ended, any use of the D1 overload operators will now result in a compiler error. - `scope' as a type constraint on class, struct, union, and enum declarations has been deprecated. - Fix segmentation fault when emplacing a new front-end Expression node during CTFE (PR104835). D runtime changes: - Import druntime v2.099.0. - Fix C bindings for stdint types (PR104738). - Fix bus error when allocating new array on the GC (PR104742). - Fix bus error when allocating new pointer on the GC (PR104745). Phobos changes: - Import phobos v2.099.0. - New function `bind' in `std.functional'. gcc/d/ChangeLog: * dmd/MERGE: Merge upstream dmd 02a3fafc6. * dmd/VERSION: Update version to v2.099.0. * imports.cc (ImportVisitor::visit (EnumDeclaration *)): Don't cache decl in front-end AST node. (ImportVisitor::visit (AggregateDeclaration *)): Likewise. (ImportVisitor::visit (ClassDeclaration *)): Likewise. libphobos/ChangeLog: * libdruntime/MERGE: Merge upstream druntime 26b58167. * src/MERGE: Merge upstream phobos 16cb085b5.
2022-03-13texi + c-target.def: Fix typosTobias Burnus5-8/+8
gcc/c-family/ChangeLog: * c-target.def (check_string_object_format_arg): Fix description typo. gcc/ChangeLog: * doc/invoke.texi: Fix typos. * doc/tm.texi.in: Remove duplicated word. * doc/tm.texi: Regenerate. libgomp/ChangeLog: * libgomp.texi: Fix typo.
2022-03-13Daily bump.GCC Administrator7-1/+230
2022-03-12c++: naming a dependently-scoped template for CTAD [PR104641]Patrick Palka4-10/+63
In order to be able to perform CTAD for a dependently-scoped template (such as A<T>::B in the testcase below), we need to permit a typename-specifier to resolve to a template as per [dcl.type.simple]/3, at least when it appears in a CTAD-enabled context. This patch implements this using a new tsubst flag tf_tst_ok to control when a TYPENAME_TYPE is allowed to name a template, and sets this flag when substituting into the type of a CAST_EXPR, CONSTRUCTOR or VAR_DECL (each of which is a CTAD-enabled context). PR c++/104641 gcc/cp/ChangeLog: * cp-tree.h (tsubst_flags::tf_tst_ok): New flag. * decl.cc (make_typename_type): Allow a typename-specifier to resolve to a template when tf_tst_ok, in which case return a CTAD placeholder for the template. * pt.cc (tsubst_decl) <case VAR_DECL>: Set tf_tst_ok when substituting the type. (tsubst): Clear tf_tst_ok and remember if it was set. <case TYPENAME_TYPE>: Pass tf_tst_ok to make_typename_type appropriately. (tsubst_copy) <case CAST_EXPR>: Set tf_tst_ok when substituting the type. (tsubst_copy_and_build) <case CAST_EXPR>: Likewise. <case CONSTRUCTOR>: Likewise. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/class-deduction107.C: New test.
2022-03-12c++: ICE with bad conversion shortcutting [PR104622]Patrick Palka3-4/+29
When shortcutting bad argument conversions during overload resolution, we assume conversions get computed in sequential order and that therefore the conversion array is incomplete iff the last conversion is missing. But this assumption turns out to be wrong for templates, because during deduction check_non_deducible_conversion can compute an argument conversion out of order. So in the testcase below, at the end of add_template_candidate the conversion array looks like {bad_conv, NULL, good_conv} where the last conversion was computed during deduction and the first one later from add_function_candidate. We need to add this candidate to bad_fns since not all of its argument conversions were computed, but we don't do so because the last conversion isn't missing. This patch fixes this by checking for a missing conversion exhaustively instead. In passing, this cleans up check_non_deducible_conversion given that the only values of 'strict' we expect to see here are the enumerators of unification_kind_t. PR c++/104622 gcc/cp/ChangeLog: * call.cc (missing_conversion_p): Define. (add_candidates): Use it. * pt.cc (check_non_deducible_conversion): Change type of strict parameter to unification_kind_t and directly test for DEDUCE_CALL. gcc/testsuite/ChangeLog: * g++.dg/template/conv18.C: New test.
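The difference between the old last-slot check and the exhaustive check can be sketched on the {bad_conv, NULL, good_conv} array from the message. The `Conversion` struct and both function names below are hypothetical stand-ins for illustration; they are not the types or functions in call.cc.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a computed argument conversion.
struct Conversion {
  bool bad;
};

// Old assumption: conversions are filled in left to right, so the array
// is incomplete exactly when the last slot is still empty.
bool last_conversion_missing(const std::vector<const Conversion *> &convs) {
  return !convs.empty() && convs.back() == nullptr;
}

// Exhaustive check: any empty slot means the array is incomplete,
// regardless of the order in which slots were filled.
bool any_conversion_missing(const std::vector<const Conversion *> &convs) {
  for (const Conversion *c : convs)
    if (c == nullptr) return true;
  return false;
}

void check_missing_conversion() {
  const Conversion bad_conv{true};
  const Conversion good_conv{false};
  // The last slot was filled early (during deduction), the middle one never.
  const std::vector<const Conversion *> convs{&bad_conv, nullptr, &good_conv};
  assert(!last_conversion_missing(convs));  // old check misses the gap
  assert(any_conversion_missing(convs));    // exhaustive check finds it
}
```

The exhaustive scan costs one pass over the argument list, which is small compared with computing the conversions themselves.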
2022-03-12c++: return-type-req in constraint using only outer tparms [PR104527]Patrick Palka3-26/+75
Here the template context for the atomic constraint has two levels of template parameters, but since it depends only on the innermost parameter T we use a single-level argument vector (built by get_mapped_args) during substitution into the atom. We eventually pass this vector to do_auto_deduction as part of checking the return-type-requirement within the atom, but do_auto_deduction expects outer_targs to be a full set of arguments for sake of satisfaction. This patch fixes this by making get_mapped_args always return an argument vector whose depth corresponds to the template depth of the context in which the atomic constraint expression was written, instead of the highest parameter level that the expression happens to use. PR c++/104527 gcc/cp/ChangeLog: * constraint.cc (normalize_atom): Set ATOMIC_CONSTR_EXPR_FROM_CONCEPT_P appropriately. (get_mapped_args): Make static, adjust parameters. Always return a vector whose depth corresponds to the template depth of the context of the atomic constraint expression. Micro-optimize by passing false as exact to safe_grow_cleared and by collapsing a multi-level depth-one argument vector. (satisfy_atom): Adjust call to get_mapped_args and diagnose_atomic_constraint. (diagnose_atomic_constraint): Replace map parameter with an args parameter. * cp-tree.h (ATOMIC_CONSTR_EXPR_FROM_CONCEPT_P): Define. (get_mapped_args): Remove declaration. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-return-req4.C: New test.
2022-03-12c++: ICE with non-constant satisfaction value [PR98644]Patrick Palka3-21/+34
Here during satisfaction, the expression of the atomic constraint after substitution is (int *) NON_LVALUE_EXPR <1> != 0B, which is not a C++ constant expression due to the reinterpret_cast, but TREE_CONSTANT is set since its value is otherwise effectively constant. We then call maybe_constant_value on it, which proceeds via its fail-fast path to exit early without clearing TREE_CONSTANT. But satisfy_atom relies on checking TREE_CONSTANT of the result of maybe_constant_value in order to detect non-constant satisfaction. This patch fixes this by making the fail-fast path of maybe_constant_value clear TREE_CONSTANT in this case, like cxx_eval_outermost_constant_expr in the normal path would have done. PR c++/98644 gcc/cp/ChangeLog: * constexpr.cc (mark_non_constant): Define, split out from ... (cxx_eval_outermost_constant_expr): ... here. (maybe_constant_value): Use it. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-pr98644.C: New test. * g++.dg/parse/array-size2.C: Remove expected diagnostic about a narrowing conversion. Co-authored-by: Jason Merrill <jason@redhat.com>
2022-03-12c++: give fold expressions a locationPatrick Palka2-6/+6
This improves diagnostic quality for unsatisfied atomic constraints that consist of a fold expression, e.g. in concepts/diagnostic3.C the "evaluated to false" diagnostic now points to the expression: .../diagnostic3.C:10:22: note: the expression ‘(foo<Ts> && ...) [with Ts = {int, char}]’ evaluated to ‘false’ 10 | requires (foo<Ts> && ...) | ~~~~~~~~~~~~^~~~ gcc/cp/ChangeLog: * semantics.cc (finish_unary_fold_expr): Use input_location instead of UNKNOWN_LOCATION. (finish_binary_fold_expr): Likewise. gcc/testsuite/ChangeLog: * g++.dg/concepts/diagnostic3.C: Adjusted expected location of "evaluated to false" diagnostics.
2022-03-12rs6000: Do not use rs6000_cpu for .machine ppc and ppc64 (PR104829)Segher Boessenkool1-2/+10
Fixes: 77eccbf39ed5 rs6000.h has #define PROCESSOR_POWERPC PROCESSOR_PPC604 #define PROCESSOR_POWERPC64 PROCESSOR_RS64A which means that if you use things like -mcpu=powerpc -mvsx it will no longer work after my latest .machine patch. This causes GCC build errors in some cases, not a good idea (even if the errors are actually pre-existing: using -mvsx with a machine that does not have VSX cannot work properly). 2022-03-11 Segher Boessenkool <segher@kernel.crashing.org> PR target/104829 * config/rs6000/rs6000.cc (rs6000_machine_from_flags): Don't output "ppc" and "ppc64" based on rs6000_cpu.
2022-03-12OpenACC 'kernels' decomposition: resolve wrong-code cases unless manually making certain variables addressable [PR100280, PR104892]Thomas Schwinge22-45/+120
Currently in OpenACC 'kernels' decomposition, there is special handling of 'GOMP_MAP_FORCE_TOFROM', documented to be done to avoid "internal compiler errors in later passes". For performance reasons, the current repetitive to/from device copying for every region is not ideal, compared to using 'present' clauses, as done for almost all other 'GOMP_MAP_*'. Also, the current special handling (incomplete, evidently) is the reason for the PR104892 misbehavior. For PR100280 etc. we've resolved all such known ICEs -- removing the special handling for 'GOMP_MAP_FORCE_TOFROM' now resolves PR104892. PR middle-end/100280 PR middle-end/104892 gcc/ * omp-oacc-kernels-decompose.cc (omp_oacc_kernels_decompose_1): Remove special handling of 'GOMP_MAP_FORCE_TOFROM'. gcc/testsuite/ * c-c++-common/goacc/kernels-decompose-2.c: Adjust. * c-c++-common/goacc/kernels-decompose-pr100400-1-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr100400-1-2.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr100400-1-3.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr100400-1-4.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104061-1-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104061-1-2.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104061-1-3.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104061-1-4.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104132-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104133-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104774-1.c: Likewise. * gfortran.dg/goacc/classify-kernels.f95: Likewise. * gfortran.dg/goacc/kernels-decompose-2.f95: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/declare-vla.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/default-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-reduction-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise. * testsuite/libgomp.oacc-fortran/asyncwait-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-reduction-1.f90: Likewise.
2022-03-12OpenACC 'kernels' decomposition: wrong-code cases unless manually making certain variables addressable [PR104892]Thomas Schwinge5-16/+59
Document a few examples of the status quo. PR middle-end/104892 libgomp/ * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Point to PR104892. * testsuite/libgomp.oacc-c-c++-common/default-1.c: Likewise, enable '--param=openacc-kernels=decompose' and adjust. * testsuite/libgomp.oacc-c-c++-common/kernels-reduction-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise. * testsuite/libgomp.oacc-fortran/kernels-reduction-1.f90: Likewise.
2022-03-12Enhance further testcases to verify handling of OpenACC privatization level [PR90115]Thomas Schwinge4-55/+266
As originally introduced in commit 11b8286a83289f5b54e813f14ff56d730c3f3185 "[OpenACC privatization] Largely extend diagnostics and corresponding testsuite coverage [PR90115]". PR middle-end/90115 libgomp/ * testsuite/libgomp.oacc-c-c++-common/default-1.c: Enhance. * testsuite/libgomp.oacc-c-c++-common/kernels-reduction-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise. * testsuite/libgomp.oacc-fortran/kernels-reduction-1.f90: Likewise.
2022-03-12OpenACC 'kernels' decomposition: Mark variables used in 'present' clauses as addressable [PR100280, PR104086]Thomas Schwinge7-83/+168
... like in recent commit 9b32c1669aad5459dd053424f9967011348add83 "OpenACC 'kernels' decomposition: Mark variables used in synthesized data clauses as addressable [PR100280]". Otherwise, we may run into 'gcc/omp-low.cc:lower_omp_target': 13125 else if (is_gimple_reg (var)) 13126 { 13127 gcc_assert (offloaded); PR middle-end/100280 PR middle-end/104086 gcc/ * omp-oacc-kernels-decompose.cc (omp_oacc_kernels_decompose_1): Mark variables used in 'present' clauses as addressable. * omp-low.cc (scan_sharing_clauses) <OMP_CLAUSE_MAP>: Gracefully handle duplicate 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE'. gcc/testsuite/ * c-c++-common/goacc/kernels-decompose-pr104086-1.c: Adjust, extend. libgomp/ * testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose-ice-1.c: Merge this... * testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose.c: ..., and this... * testsuite/libgomp.oacc-c-c++-common/declare-vla.c: ... into this, and adjust. * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Extend.
2022-03-12Add 'c-c++-common/goacc/kernels-decompose-pr104086-1.c' [PR104086]Thomas Schwinge1-0/+25
..., currently XFAILed with 'dg-ice', as it runs into 'gcc/omp-low.cc:lower_omp_target': 13125 else if (is_gimple_reg (var)) 13126 { 13127 gcc_assert (offloaded); This means, the recent PR100280 etc. changes are still not sufficient. gcc/testsuite/ PR middle-end/104086 * c-c++-common/goacc/kernels-decompose-pr104086-1.c: New file.
2022-03-12Add 'gcc/tree.cc:user_omp_clause_code_name' [PR65095]Thomas Schwinge6-38/+41
Re PR65095 "Adapt OpenMP diagnostic messages for OpenACC", move C/C++ front end 'gcc/c-family/c-omp.cc:c_omp_map_clause_name' to generic 'gcc/tree.cc:user_omp_clause_code_name'. No functional change. PR other/65095 gcc/ * tree-core.h (user_omp_clause_code_name): Declare function. * tree.cc (user_omp_clause_code_name): New function. gcc/c/ * c-typeck.cc (handle_omp_array_sections_1) (c_oacc_check_attachments): Call 'user_omp_clause_code_name' instead of 'c_omp_map_clause_name'. gcc/cp/ * semantics.cc (handle_omp_array_sections_1) (cp_oacc_check_attachments): Call 'user_omp_clause_code_name' instead of 'c_omp_map_clause_name'. gcc/c-family/ * c-common.h (c_omp_map_clause_name): Remove. * c-omp.cc (c_omp_map_clause_name): Remove.
2022-03-12PR middle-end/98420: Don't fold x - x to 0.0 with -frounding-mathRoger Sayle2-1/+12
This patch addresses PR middle-end/98420, which is inappropriate constant folding of x - x to 0.0 (in match.pd) when -frounding-math is specified. Specifically, x - x may be -0.0 with FE_DOWNWARD as the rounding mode. To summarize, the desired IEEE behaviour, x - x for floating point x,
(1) can't be folded to 0.0 by default, due to the possibility of NaN or Inf
(2) can be folded to 0.0 with -ffinite-math-only
(3) can't be folded to 0.0 with -ffinite-math-only -frounding-math
(4) can be folded with -ffinite-math-only -frounding-math -fno-signed-zeros
2022-03-12 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog PR middle-end/98420 * match.pd (minus @0 @0): Additional checks for -fno-rounding-math (the default) or -fno-signed-zeros. gcc/testsuite/ChangeLog PR middle-end/98420 * gcc.dg/pr98420.c: New test case.
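The four cases listed above can be written as a single predicate, and the -0.0 result can be observed at run time. This is a minimal sketch under stated assumptions: `may_fold_x_minus_x` is a hypothetical model of the rule summarized in the message, not the match.pd condition itself, and the run-time check assumes a target whose floating-point environment supports FE_DOWNWARD. The `volatile` qualifiers are there so that the subtraction is evaluated at run time under the selected rounding mode.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Hypothetical model of the folding rule summarized above: folding
// x - x to 0.0 needs finite math (no NaN or Inf), and additionally either
// no rounding-math or no signed zeros.
constexpr bool may_fold_x_minus_x(bool finite_math_only, bool rounding_math,
                                  bool signed_zeros) {
  return finite_math_only && (!rounding_math || !signed_zeros);
}

static_assert(!may_fold_x_minus_x(false, false, true), "case (1)");
static_assert(may_fold_x_minus_x(true, false, true), "case (2)");
static_assert(!may_fold_x_minus_x(true, true, true), "case (3)");
static_assert(may_fold_x_minus_x(true, true, false), "case (4)");

// Returns true if x - x evaluates to negative zero when rounding toward
// negative infinity.  Returns false if the rounding mode cannot be set.
bool x_minus_x_is_negative_zero(double value) {
  const int saved = std::fegetround();
  if (std::fesetround(FE_DOWNWARD) != 0) return false;
  volatile double x = value;     // volatile: keep the subtraction at run time
  volatile double diff = x - x;  // exact zero, sign chosen by rounding mode
  std::fesetround(saved);
  const double d = diff;
  return d == 0.0 && std::signbit(d);
}

void check_rounding() {
  assert(x_minus_x_is_negative_zero(1.5));
}
```

Under the default round-to-nearest mode the same subtraction yields +0.0, which is why the fold is only unsafe once -frounding-math tells the compiler the mode may differ.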
2022-03-11Fix DImode to TImode sign extend issueMichael Meissner1-1/+1
PR target/104868 had an issue where my code that updated the DImode to TImode sign extension for power10 failed. In looking at the failure message, the reason is that when extendditi2 tries to split the insn, it generates an insn that does not satisfy its constraints: (set (reg:V2DI 65 1) (vec_duplicate:V2DI (reg:DI 0))) The reason is vsx_splat_v2di does not allow GPR register 0 when it will be generating a mtvsrdd instruction. In the definition of the mtvsrdd instruction, if the RA register is 0, it means clear the upper 64 bits of the vector instead of moving register GPR 0 to those bits. When I wrote the extendditi2 pattern, I forgot that mtvsrdd had that behavior so I used a 'r' constraint instead of 'b'. In the rare case where the value is in GPR register 0, this split will fail. This patch uses the right constraint for extendditi2. 2022-03-11 Michael Meissner <meissner@linux.ibm.com> gcc/ PR target/104868 * config/rs6000/vsx.md (extendditi2): Use a 'b' constraint when moving from a GPR register to an Altivec register.
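The RA = 0 special case described above can be modelled in a few lines of C. This is only an illustrative sketch: struct v2di and mtvsrdd_model are hypothetical names, and the assumption that RB supplies the lower 64 bits comes from the instruction's usual description rather than from this commit message.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit vector register as two 64-bit halves.  */
struct v2di
{
  uint64_t hi; /* upper 64 bits */
  uint64_t lo; /* lower 64 bits */
};

/* Hypothetical model of "mtvsrdd XT,RA,RB": register number 0 in the RA
   slot means "use zero", not "use the contents of GPR 0".  */
static struct v2di
mtvsrdd_model (const uint64_t gpr[32], unsigned ra, unsigned rb)
{
  struct v2di xt;
  xt.hi = (ra == 0) ? 0 : gpr[ra]; /* RA == 0 clears the upper half */
  xt.lo = gpr[rb];                 /* assumed: RB fills the lower half */
  return xt;
}
```

Under this model, splatting a value that happens to live in GPR 0 would silently produce a zero upper half, which is why a constraint that excludes register 0 is needed for the RA operand.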
2022-03-12Daily bump.GCC Administrator7-1/+107
2022-03-11d: Cache generated import declarations in a hash_mapIain Buclaw1-36/+41
Originally, these were cached in the front-end AST node field `isym'. However, this field is due to be removed in the future. gcc/d/ChangeLog: * imports.cc (imported_decls): Define. (class ImportVisitor): Add result_ field. (ImportVisitor::result): New method. (ImportVisitor::visit (Module *)): Store decl to result_. (ImportVisitor::visit (Import *)): Likewise. (ImportVisitor::visit (AliasDeclaration *)): Don't cache decl in front-end AST node. (ImportVisitor::visit (OverDeclaration *)): Likewise. (ImportVisitor::visit (FuncDeclaration *)): Likewise. (ImportVisitor::visit (Declaration *)): Likewise. (build_import_decl): Use imported_decls to cache and lookup built declarations.
2022-03-11d: Fix mistakes in strings to be translated [PR104552]Iain Buclaw1-2/+2
Address comments made in PR104552 about documented D language options. gcc/d/ChangeLog: PR translation/104552 * lang.opt (fdump-cxx-spec=): Fix typo in argument handle. (fpreview=fixaliasthis): Quote `alias this' as code.
2022-03-11PR tree-optimization/98335: New peephole2 xorl;movb -> movzblRoger Sayle3-0/+80
This patch is the backend piece of my proposed fix to PR tree-opt/98335, to allow C++ partial struct initialization to be as efficient/optimized as full struct initialization. With the middle-end patch just posted to gcc-patches, the test case in the PR compiles on x86_64-pc-linux-gnu with -O2 to: xorl %eax, %eax movb c(%rip), %al ret with this additional peephole2 (actually four peephole2s): movzbl c(%rip), %eax ret 2022-03-11 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog PR tree-optimization/98335 * config/i386/i386.md (peephole2): Eliminate redundant insv. Combine movl followed by movb. Transform xorl followed by a suitable movb or movw into the equivalent movz[bw]l. gcc/testsuite/ChangeLog PR tree-optimization/98335 * g++.target/i386/pr98335.C: New test case. * gcc.target/i386/pr98335.c: New test case.
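Why the transformation is valid can be sketched in plain C: clearing a 32-bit register and then overwriting its low byte gives the same value as zero-extending that byte. The two function names below are hypothetical and only model the instruction sequences; they are not taken from the patch or its test cases.

```c
#include <stdint.h>

/* Models "xorl %eax, %eax; movb c, %al": clear the register, then
   replace only its low 8 bits.  */
static uint32_t
xorl_then_movb (uint8_t c)
{
  uint32_t eax = 0;                   /* xorl %eax, %eax */
  eax = (eax & ~(uint32_t) 0xff) | c; /* movb c, %al */
  return eax;
}

/* Models "movzbl c, %eax": zero-extend the byte in one step.  */
static uint32_t
movzbl_model (uint8_t c)
{
  return (uint32_t) c;
}
```

Both functions return the same value for every possible byte, which is the property the peephole relies on.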
2022-03-11PR tree-optimization/98335: Improvements to DSE's compute_trims.Roger Sayle8-5/+139
This patch is the main middle-end piece of a fix for PR tree-opt/98335, which is a code-quality regression affecting mainline. The issue occurs in DSE's (dead store elimination's) compute_trims function that determines where a store to memory can be trimmed. In the testcase given in the PR, this function notices that the first byte of a DImode store is dead, and replaces the 8-byte store at (aligned) offset zero, with a 7-byte store at (unaligned) offset one. Most architectures can store a power-of-two bytes (up to a maximum) in a single instruction, so writing 7 bytes requires more instructions than writing 8 bytes. This patch follows Jakub Jelinek's suggestion in comment 5, that compute_trims needs improved heuristics. On x86_64-pc-linux-gnu with -O2 the new test case in the PR goes from: movl $0, -24(%rsp) movabsq $72057594037927935, %rdx movl $0, -21(%rsp) andq -24(%rsp), %rdx movq %rdx, %rax salq $8, %rax movb c(%rip), %al ret to xorl %eax, %eax movb c(%rip), %al ret 2022-03-11 Roger Sayle <roger@nextmovesoftware.com> Richard Biener <rguenther@suse.de> gcc/ChangeLog PR tree-optimization/98335 * builtins.cc (get_object_alignment_2): Export. * builtins.h (get_object_alignment_2): Likewise. * tree-ssa-alias.cc (ao_ref_alignment): New. * tree-ssa-alias.h (ao_ref_alignment): Declare. * tree-ssa-dse.cc (compute_trims): Improve logic deciding whether to align head/tail, writing more bytes but using fewer store insns. gcc/testsuite/ChangeLog PR tree-optimization/98335 * g++.dg/pr98335.C: New test case. * gcc.dg/pr86010.c: New test case. * gcc.dg/pr86010-2.c: New test case.
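The cost argument above (7 bytes at offset one versus 8 bytes at offset zero) can be illustrated with a toy store counter. This is not the heuristic implemented in compute_trims; count_stores is a hypothetical helper that assumes every store is naturally aligned, non-overlapping, and a power-of-two size no larger than max_size (itself assumed to be a power of two).

```c
/* Hypothetical cost model: number of naturally aligned power-of-two
   stores needed to cover the byte range [offset, offset + len).  */
static unsigned
count_stores (unsigned offset, unsigned len, unsigned max_size)
{
  unsigned count = 0;

  while (len > 0)
    {
      unsigned size = max_size;

      /* Shrink the store until it is aligned and fits the remainder.  */
      while (size > 1 && (offset % size != 0 || size > len))
        size /= 2;

      offset += size;
      len -= size;
      count++;
    }

  return count;
}
```

Under these assumptions, count_stores (1, 7, 8) needs three stores (1, 2 and 4 bytes) while count_stores (0, 8, 8) needs one, so keeping the dead leading byte in the store is the cheaper choice.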
2022-03-11[Committed] Update g++.dg/other/pr84964.C for ia32 (and similar) targets.Roger Sayle1-3/+3
The "sorry, unimplemented" message in the new g++.dg/other/pr84964.C is apparently dependent upon whether the target passes multi-gigabyte arguments on the stack. This tweaks the testcase to just confirm that it no longer ICEs, not the specific set of warnings/errors triggered. 2022-03-11 Roger Sayle <roger@nextmovesoftware.com> gcc/testsuite/ChangeLog PR c++/84964 * g++.dg/other/pr84964.C: Tweak test to check for the ICE, not for the (target-dependent) sorry.
2022-03-11tree-optimization/104880 - update-address-taken and cmpxchgRichard Biener2-3/+56
The following addresses optimistic non-addressable marking of an argument of __atomic_compare_exchange_n which broke when I added DECL_NOT_GIMPLE_REG_P since we cannot guarantee we can rewrite it when TREE_ADDRESSABLE is unset. Instead we have to restore TREE_ADDRESSABLE in that case. 2022-03-11 Richard Biener <rguenther@suse.de> PR tree-optimization/104880 * tree-ssa.cc (execute_update_address_taken): Remember if we optimistically made something not addressable and prepare to undo it. * g++.dg/opt/pr104880.cc: New testcase.
2022-03-11target/104762 - vectorization costs of CONSTRUCTORsRichard Biener1-6/+11
After accounting for GPR -> XMM move cost for vec_construct the base cost needs adjustments to not double-cost those. This also lowers the cost when such move is not necessary. 2022-03-11 Richard Biener <rguenther@suse.de> PR target/104762 * config/i386/i386.cc (ix86_builtin_vectorization_cost): Do not cost the first lane of SSE pieces as inserts for vec_construct.
2022-03-11lto-plugin: Honor link_output_name for -foffload-objects file nameTobias Burnus1-1/+8
lto-plugin/ChangeLog: * lto-plugin.c (all_symbols_read_handler): With -save-temps, use link_output_name for -foffload-objects's file name, if available.