path: root/gcc
Age | Commit message | Author | Files | Lines
2023-06-29 | c++: unpropagated CONSTRUCTOR_MUTABLE_POISON [PR110463] | Patrick Palka | 2 | -0/+20
Here we're incorrectly accepting the mutable member accesses because cp_fold neglects to propagate CONSTRUCTOR_MUTABLE_POISON when folding a CONSTRUCTOR. PR c++/110463 gcc/cp/ChangeLog: * cp-gimplify.cc (cp_fold) <case CONSTRUCTOR>: Propagate CONSTRUCTOR_MUTABLE_POISON. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/constexpr-mutable6.C: New test.
2023-06-29 | Update documentation to clarify a GCC extension [PR c/77650] | Qing Zhao | 4 | -1/+88
on a structure with a C99 flexible array member being nested in another structure. "The GCC extension accepts a structure containing an ISO C99 "flexible array member", or a union containing such a structure (possibly recursively) to be a member of a structure. There are two situations: * A structure containing a C99 flexible array member, or a union containing such a structure, is the last field of another structure, for example: struct flex { int length; char data[]; }; union union_flex { int others; struct flex f; }; struct out_flex_struct { int m; struct flex flex_data; }; struct out_flex_union { int n; union union_flex flex_data; }; In the above, both 'out_flex_struct.flex_data.data[]' and 'out_flex_union.flex_data.f.data[]' are considered as flexible arrays too. * A structure containing a C99 flexible array member, or a union containing such a structure, is not the last field of another structure, for example: struct flex { int length; char data[]; }; struct mid_flex { int m; struct flex flex_data; int n; }; In the above, accessing a member of the array 'mid_flex.flex_data.data[]' might have undefined behavior. Compilers do not handle such a case consistently. Any code relying on this case should be modified to ensure that flexible array members only end up at the ends of structures. Please use the warning option '-Wflex-array-member-not-at-end' to identify all such cases in the source code and modify them. This extension is now deprecated. " PR c/77650 gcc/c-family/ChangeLog: * c.opt: New option -Wflex-array-member-not-at-end. gcc/c/ChangeLog: * c-decl.cc (finish_struct): Issue warnings for new option. gcc/ChangeLog: * doc/extend.texi: Document GCC extension on a structure containing a flexible array member to be a member of another structure. gcc/testsuite/ChangeLog: * gcc.dg/variable-sized-type-flex-array.c: New test.
2023-06-29 | Introduce IR bit TYPE_INCLUDES_FLEXARRAY for the GCC extension | Qing Zhao | 7 | -4/+36
on a structure with a C99 flexible array member being nested in another structure. The GCC extension accepts the case when a struct with a flexible array member is embedded into another struct or union (possibly recursively) as the last field. This patch introduces the IR bit TYPE_INCLUDES_FLEXARRAY (reusing the existing IR bit TYPE_NO_NAMED_ARGS_STDARG_P), sets it correctly in the C FE, streams it correctly in the middle end, and prints it during IR dumping. gcc/c/ChangeLog: * c-decl.cc (finish_struct): Set TYPE_INCLUDES_FLEXARRAY for struct/union type. gcc/lto/ChangeLog: * lto-common.cc (compare_tree_sccs_1): Compare bit TYPE_NO_NAMED_ARGS_STDARG_P or TYPE_INCLUDES_FLEXARRAY properly for its corresponding type. gcc/ChangeLog: * print-tree.cc (print_node): Print new bit type_include_flexarray. * tree-core.h (struct tree_type_common): Use bit no_named_args_stdarg_p as type_include_flexarray for RECORD_TYPE or UNION_TYPE. * tree-streamer-in.cc (unpack_ts_type_common_value_fields): Stream in bit no_named_args_stdarg_p properly for its corresponding type. * tree-streamer-out.cc (pack_ts_type_common_value_fields): Stream out bit no_named_args_stdarg_p properly for its corresponding type. * tree.h (TYPE_INCLUDES_FLEXARRAY): New macro TYPE_INCLUDES_FLEXARRAY.
2023-06-29 | Move maybe_set_nonzero_bits() to its only user. | Aldy Hernandez | 3 | -66/+65
gcc/ChangeLog: * tree-vrp.cc (maybe_set_nonzero_bits): Move from here... * tree-ssa-dom.cc (maybe_set_nonzero_bits): ...to here. * tree-vrp.h (maybe_set_nonzero_bits): Remove.
2023-06-29 | Tidy up the range normalization code. | Aldy Hernandez | 2 | -51/+50
There are a few spots where a range is altered in place, but we fail to normalize the range. This patch makes sure we always call normalize_kind(), and that normalize_kind in turn calls verify_range to make sure everything is canonical. gcc/ChangeLog: * value-range.cc (frange::set): Do not call verify_range. (frange::normalize_kind): Verify range. (frange::union_nans): Do not call verify_range. (frange::union_): Same. (frange::intersect): Same. (irange::irange_single_pair_union): Call normalize_kind if necessary. (irange::union_): Same. (irange::intersect): Same. (irange::set_range_from_nonzero_bits): Verify range. (irange::set_nonzero_bits): Call normalize_kind if necessary. (irange::get_nonzero_bits): Tweak comment. (irange::intersect_nonzero_bits): Call normalize_kind if necessary. (irange::union_nonzero_bits): Same. * value-range.h (irange::normalize_kind): Verify range.
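The invariant this commit describes (every in-place mutation ends in a normalization step, and normalization ends in verification) can be sketched outside of GCC. The toy_range type below is purely hypothetical and is not the irange/frange API; it assumes a fixed 0..255 domain and only illustrates the call pattern.

```cpp
#include <algorithm>
#include <cassert>

enum class range_kind { undefined, range, varying };

// Hypothetical closed integer range over the assumed domain [0, 255].
struct toy_range {
    int lo = 0;
    int hi = 255;
    range_kind kind = range_kind::varying;

    // Check the canonical form. Called only from normalize_kind().
    void verify_range() const {
        if (kind == range_kind::undefined)
            assert(lo == 1 && hi == 0);
        else if (kind == range_kind::varying)
            assert(lo == 0 && hi == 255);
        else
            assert(0 <= lo && lo <= hi && hi <= 255 && !(lo == 0 && hi == 255));
    }

    // Re-derive the kind from the bounds, then verify the result.
    void normalize_kind() {
        if (lo > hi) {
            kind = range_kind::undefined;  // empty range, canonical bounds 1..0
            lo = 1;
            hi = 0;
        } else if (lo == 0 && hi == 255) {
            kind = range_kind::varying;    // covers the whole domain
        } else {
            kind = range_kind::range;
        }
        verify_range();
    }

    // Build a range from bounds assumed to lie within [0, 255].
    static toy_range make(int lo, int hi) {
        toy_range r;
        r.lo = lo;
        r.hi = hi;
        r.normalize_kind();
        return r;
    }

    // In-place mutators always finish with normalize_kind().
    void intersect(const toy_range &o) {
        lo = std::max(lo, o.lo);
        hi = std::min(hi, o.hi);
        normalize_kind();
    }

    void union_(const toy_range &o) {
        if (o.kind == range_kind::undefined)
            return;
        if (kind == range_kind::undefined) {
            *this = o;
            return;
        }
        lo = std::min(lo, o.lo);
        hi = std::max(hi, o.hi);
        normalize_kind();
    }
};
```

Because verification lives in one place, a mutator that forgets to normalize is the only way to end up with a non-canonical value, which is the kind of spot the commit tidies up.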
2023-06-29 | cselib+expr+bitmap: Change return type of predicate functions from int to bool | Uros Bizjak | 6 | -96/+98
gcc/ChangeLog: * cselib.h (rtx_equal_for_cselib_1): Change return type from int to bool. (references_value_p): Ditto. (rtx_equal_for_cselib_p): Ditto. * expr.h (can_store_by_pieces): Ditto. (try_casesi): Ditto. (try_tablejump): Ditto. (safe_from_p): Ditto. * sbitmap.h (bitmap_equal_p): Ditto. * cselib.cc (references_value_p): Change return type from int to bool and adjust function body accordingly. (rtx_equal_for_cselib_1): Ditto. * expr.cc (is_aligning_offset): Ditto. (can_store_by_pieces): Ditto. (mostly_zeros_p): Ditto. (all_zeros_p): Ditto. (safe_from_p): Ditto. (is_aligning_offset): Ditto. (try_casesi): Ditto. (try_tablejump): Ditto. (store_constructor): Change "need_to_clear" and "const_bounds_p" variables to bool. * sbitmap.cc (bitmap_equal_p): Change return type from int to bool.
2023-06-29 | c++: cache partial template specialization selection | Patrick Palka | 5 | -31/+66
There's currently no cheap way to obtain the partial template specialization (and arguments relative to it) that was selected for a class or variable template specialization. Our only option is to compute the result from scratch via most_specialized_partial_spec. For class templates this isn't really an issue because we usually need this information just once, upon instantiation. But for variable templates we need it upon specialization and also later upon instantiation. We could implement an ad-hoc cache for variable templates only, but it'd be nice for this information to be readily available in general. To that end, this patch adds a TI_PARTIAL_INFO field to TEMPLATE_INFO that holds another TEMPLATE_INFO consisting of the partial template and arguments relative to it, which most_specialized_partial_spec then uses to transparently cache its (now TEMPLATE_INFO) result. Similarly, there's no easy way to go from the DECL_TEMPLATE_RESULT of a partial TEMPLATE_DECL back to that TEMPLATE_DECL. (Our best option is to walk the DECL_TEMPLATE_SPECIALIZATIONS list of the primary TEMPLATE_DECL.) So this patch also uses this new field to link these entities in both directions. gcc/cp/ChangeLog: * cp-tree.h (tree_template_info::partial): New data member. (TI_PARTIAL_INFO): New tree accessor. (most_specialized_partial_spec): Add defaulted bool parameter. * module.cc (trees_out::core_vals) <case TEMPLATE_INFO>: Stream TI_PARTIAL_INFO. (trees_in::core_vals) <case TEMPLATE_INFO>: Likewise. * parser.cc (specialization_of): Adjust after making most_specialized_partial_spec return TEMPLATE_INFO instead of TREE_LIST. * pt.cc (process_partial_specialization): Set TI_PARTIAL_INFO of 'decl' to point back to the partial TEMPLATE_DECL. Likewise (and pass rechecking=true to most_specialized_partial_spec). (instantiate_class_template): Likewise. (instantiate_template): Set TI_PARTIAL_INFO to the result of most_specialized_partial_spec after forming a variable template specialization. 
(most_specialized_partial_spec): Add 'rechecking' parameter. Exit early if the template is not primary. Use the TI_PARTIAL_INFO of the corresponding TEMPLATE_INFO as a cache unless 'rechecking' is true. Don't bother setting TREE_TYPE of each TREE_LIST. (instantiate_decl): Adjust after making most_specialized_partial_spec return TEMPLATE_INFO instead of TREE_LIST. * ptree.cc (cxx_print_xnode) <case TEMPLATE_INFO>: Dump TI_PARTIAL_INFO.
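For readers less familiar with what "the partial template specialization that was selected" means at the language level, here is a minimal sketch. It only illustrates the C++ feature (the compiler choosing the most specialized matching partial specialization of a variable template); it says nothing about GCC's internal TEMPLATE_INFO representation, and the name pointer_depth is made up for the example.

```cpp
// Primary variable template: used when no partial specialization matches.
template <typename T>
constexpr int pointer_depth = 0;

// Partial specialization: matches any pointer type T*.
template <typename T>
constexpr int pointer_depth<T *> = 1;

// More specialized partial specialization: matches any T**.
template <typename T>
constexpr int pointer_depth<T **> = 2;

// int** matches both partial specializations; partial ordering picks T**.
static_assert(pointer_depth<int> == 0, "primary template selected");
static_assert(pointer_depth<int *> == 1, "T* specialization selected");
static_assert(pointer_depth<int **> == 2, "T** specialization selected");
```

Each use such as pointer_depth<int **> requires the compiler to determine which partial specialization applies and with which deduced arguments (here T = int), which is the result the commit caches.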
2023-06-29 | tree-ssa-math-opts: Use element_precision. | Robin Dapp | 1 | -2/+2
The recent TYPE_PRECISION changes to detect improper usage cause an ICE in divmod_candidate_p for RVV when called with a vector type. Therefore, use element_precision instead. gcc/ChangeLog: * tree-ssa-math-opts.cc (divmod_candidate_p): Use element_precision.
2023-06-29 | [Committed] Add -mmove-max=128 -mstore-max=128 to pieces-memcmp-2.c | Roger Sayle | 1 | -1/+1
Adding -mmove-max=128 and -mstore-max=128 to the dg-options of the recently added gcc.target/i386/pieces-memcmp-2.c avoids changing the intent of this testcase when adding -march=cascadelake to RUNTESTFLAGS. Committed as obvious. 2023-06-29 Roger Sayle <roger@nextmovesoftware.com> gcc/testsuite/ChangeLog * gcc.target/i386/pieces-memcmp-2.c: Specify that 128-bit comparisons are desired, to see if 256-bit instructions are generated inappropriately (fixes test on -march=cascadelake).
2023-06-29 | tree-optimization/110460 - fend off vector types from vectorizer | Richard Biener | 1 | -2/+5
The following makes fending off existing vector types from vectorization also apply to word_mode vector types. I've chosen to add a positive list of allowed scalar types here for clarity. PR tree-optimization/110460 * tree-vect-stmts.cc (get_related_vectype_for_scalar_type): Only allow integral, pointer and scalar float type scalar_type.
2023-06-29 | Avoid adding loop-carried ops to long chains | Lili Cui | 1 | -69/+172
Avoid adding loop-carried ops to long chains, otherwise the whole chain will have dependencies across the loop iteration. Just keep loop-carried ops in a separate chain. E.g.

x_1 = phi(x_0, x_2)
y_1 = phi(y_0, y_2)

a + b + c + d + e + x1 + y1

SSA1 = a + b;
SSA2 = c + d;
SSA3 = SSA1 + e;
SSA4 = SSA3 + SSA2;
SSA5 = x1 + y1;
SSA6 = SSA4 + SSA5;

With the patch applied, these test cases improved by 32%~100%.

S242:
for (int i = 1; i < LEN_1D; ++i) { a[i] = a[i - 1] + s1 + s2 + b[i] + c[i] + d[i];}

Case 1:
for (int i = 1; i < LEN_1D; ++i) { a[i] = a[i - 1] + s1 + s2 + b[i] + c[i] + d[i] + e[i];}

Case 2:
for (int i = 1; i < LEN_1D; ++i) { a[i] = a[i - 1] + b[i - 1] + s1 + s2 + b[i] + c[i] + d[i] + e[i];}

The value is the execution time
A: original version
B: with FMA patch g:e5405f065bace0685cb3b8878d1dfc7a6e7ef409(base on A)
C: with current patch(base on B)

          A      B      C      B/A          C/A
s242      2.859  5.152  2.859  1.802028681  1
case 1    5.489  5.488  3.511  0.999818     0.64
case 2    7.216  7.499  4.885  1.039218     0.68

gcc/ChangeLog:

PR tree-optimization/110148
* tree-ssa-reassoc.cc (rewrite_expr_tree_parallel): Handle loop-carried ops in this function.
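To make the idea concrete at source level, the sketch below writes the S242-style recurrence two ways. It is only an illustration of the dependency structure, not what the reassociation pass emits: the function names are made up, and integer elements are assumed so that both groupings give identical results (for floating point, regrouping can change rounding).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using long_vec = std::vector<long>;

// Single chain: every addition sits behind the loop-carried a[i - 1],
// so none of the work for iteration i can start before iteration i - 1 ends.
void recurrence_single_chain(long_vec &a, const long_vec &b, const long_vec &c,
                             const long_vec &d, long s1, long s2) {
    assert(b.size() == a.size() && c.size() == a.size() && d.size() == a.size());
    for (std::size_t i = 1; i < a.size(); ++i)
        a[i] = ((((a[i - 1] + s1) + s2) + b[i]) + c[i]) + d[i];
}

// Split chains: the operands that do not depend on the previous iteration
// are summed on their own, and the loop-carried value is added last.
void recurrence_split_chain(long_vec &a, const long_vec &b, const long_vec &c,
                            const long_vec &d, long s1, long s2) {
    assert(b.size() == a.size() && c.size() == a.size() && d.size() == a.size());
    for (std::size_t i = 1; i < a.size(); ++i) {
        long t1 = s1 + s2;                 // independent of a[i - 1]
        long t2 = b[i] + c[i];             // independent of a[i - 1]
        long independent = (t1 + t2) + d[i];
        a[i] = a[i - 1] + independent;     // only one add on the carried path
    }
}
```

In the second form only a single addition per iteration lies on the path from a[i - 1] to a[i], which is the property the commit tries to preserve when building parallel chains.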
2023-06-29 | [testsuite] tolerate enabled but missing language frontends | Alexandre Oliva | 1 | -1/+1
When a language is enabled but we run the testsuite against a tree in which the frontend compiler is not present, help.exp fails. It recognizes the output pattern for a disabled language, but not a missing frontend. Extend the pattern so that it covers both cases. for gcc/testsuite/ChangeLog * lib/options.exp (check_for_options_with_filter): Handle missing frontend compiler like disabled language.
2023-06-29 | middle-end/110452 - bad code generation with AVX512 mask splat | Richard Biener | 2 | -0/+26
The following adds an alternate way of expanding a uniform mask vector constructor like _55 = _2 ? -1 : 0; vect_cst__56 = {_55, _55, _55, _55, _55, _55, _55, _55}; when the mask mode is a scalar int mode like for AVX512 or GCN. Instead of piecewise building the result via shifts and ors we can take advantage of uniformity and signedness of the component and simply sign-extend to the result. Instead of cmpl $3, %edi sete %cl movl %ecx, %esi leal (%rsi,%rsi), %eax leal 0(,%rsi,4), %r9d leal 0(,%rsi,8), %r8d orl %esi, %eax orl %r9d, %eax movl %ecx, %r9d orl %r8d, %eax movl %ecx, %r8d sall $4, %r9d sall $5, %r8d sall $6, %esi orl %r9d, %eax orl %r8d, %eax movl %ecx, %r8d orl %esi, %eax sall $7, %r8d orl %r8d, %eax kmovb %eax, %k1 we then get cmpl $3, %edi sete %cl negl %ecx kmovb %ecx, %k1 Code generation for non-uniform masks remains bad, but at least I see no easy way out for the most general case here. PR middle-end/110452 * expr.cc (store_constructor): Handle uniform boolean vectors with integer mode specially.
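The equivalence behind this change can be checked with ordinary integer arithmetic. The sketch below assumes an 8-lane mask held in an 8-bit integer; the function names are illustrative and this is not the code in store_constructor. It compares building the mask lane by lane with shifts and ORs against negating the 0/1 value and truncating.

```cpp
#include <cstdint>

// Piecewise construction: place the same boolean bit into each of 8 lanes.
std::uint8_t splat_mask_piecewise(bool cond) {
    std::uint32_t bit = cond ? 1u : 0u;
    std::uint32_t mask = 0;
    for (int lane = 0; lane < 8; ++lane)
        mask |= bit << lane;
    return static_cast<std::uint8_t>(mask);
}

// Uniform case: 0 - 1 wraps to all-ones in unsigned arithmetic, and 0 - 0
// stays zero, so truncating the negated bit yields the splatted mask.
std::uint8_t splat_mask_negate(bool cond) {
    std::uint32_t bit = cond ? 1u : 0u;
    return static_cast<std::uint8_t>(0u - bit);
}
```

This mirrors the shorter instruction sequence quoted in the commit, where the sete result is negated before being moved into the mask register.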
2023-06-29 | middle-end/110461 - pattern applying wrongly to vectors | Richard Biener | 2 | -0/+17
The following guards a match.pd pattern that wasn't supposed to apply to vectors and thus runs into TYPE_PRECISION checking. For vector support the constant case is lacking and the pattern would have missing optab support checking for the result operation. PR middle-end/110461 * match.pd (bitop (convert@2 @0) (convert?@3 @1)): Disable for VECTOR_TYPE_P. * gcc.dg/pr110461.c: New testcase.
2023-06-29 | c/110454 - ICE with bogus TYPE_PRECISION use | Richard Biener | 2 | -2/+12
The following sinks TYPE_PRECISION to properly guarded use places. PR c/110454 gcc/c/ * c-typeck.cc (convert_argument): Sink formal_prec compute to where TYPE_PRECISION is valid to use. gcc/testsuite/ * gcc.dg/Wtraditional-conversion-3.c: New testcase.
2023-06-29 | A couple of va_gc_atomic tweaks | Richard Sandiford | 2 | -15/+18
The only current user of va_gc_atomic is Ada's: vec<Entity_Id, va_gc_atomic> It uses the generic gt_pch_nx routines (with gt_pch_nx being the “note pointers” hooks), such as: template<typename T, typename A> void gt_pch_nx (vec<T, A, vl_embed> *v) { extern void gt_pch_nx (T &); for (unsigned i = 0; i < v->length (); i++) gt_pch_nx ((*v)[i]); } It then defines gt_pch_nx routines for Entity_Id &. The problem is that if we wanted to take the same approach for an array of unsigned ints, we'd need to define: inline void gt_pch_nx (unsigned int &) { } which would then be ambiguous with: inline void gt_pch_nx (unsigned int) { } The point of va_gc_atomic is that the elements don't need to be GCed, and so we have: template<typename T> void gt_ggc_mx (vec<T, va_gc_atomic, vl_embed> *v ATTRIBUTE_UNUSED) { /* Nothing to do. Vectors of atomic types wrt GC do not need to be traversed. */ } I think it's therefore reasonable to assume that no pointers will need to be processed for PCH either. The patch also relaxes the array_slice constructor for vec<T, va_gc> * so that it handles all embedded vectors. gcc/ * vec.h (gt_pch_nx): Add overloads for va_gc_atomic. (array_slice): Relax va_gc constructor to handle all vectors with a vl_embed layout. gcc/ada/ * gcc-interface/decl.cc (gt_pch_nx): Remove overloads for Entity_Id.
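The overload arrangement described in this message can be sketched with stand-in types. Everything below is hypothetical (tagged_vec, gc_tag, gc_atomic_tag and note_pointers are not GCC names, and the visit counter exists only so the behavior can be observed); the point is that a more specialized overload for the "atomic" tag does nothing, so no per-element hook is ever required for element types such as unsigned int.

```cpp
#include <vector>

struct gc_tag {};         // hypothetical stand-in for va_gc
struct gc_atomic_tag {};  // hypothetical stand-in for va_gc_atomic

// Hypothetical vector whose second parameter records the allocation policy.
template <typename T, typename A>
struct tagged_vec {
    std::vector<T> elems;
};

// An element type that does need a per-element hook.
struct node {
    int id;
};

// Per-element hook for node; 'visited' counts how many elements were walked.
inline void note_pointers(node &, int &visited) { ++visited; }

// Generic vector hook: walks every element and calls its hook.
template <typename T, typename A>
void note_pointers(tagged_vec<T, A> &v, int &visited) {
    for (T &e : v.elems)
        note_pointers(e, visited);
}

// More specialized overload for atomic vectors: nothing to do, so the
// generic body is never instantiated and no hook for T is needed.
template <typename T>
void note_pointers(tagged_vec<T, gc_atomic_tag> &, int &) {}
```

With this shape, a vector of unsigned int tagged as atomic compiles without anyone defining a hook taking unsigned int &, which sidesteps the ambiguity with a by-value overload mentioned above.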
2023-06-29 | RISC-V: Support vfadd static rounding mode by mode switching | Pan Li | 10 | -14/+206
This patch supports the vfadd static rounding mode, similar to the fixed-point case, so that the related fsrm instructions are inserted accordingly. Please *NOTE* this PATCH doesn't cover anything about FRM dynamic mode; that will be implemented in the following PATCH(s). Signed-off-by: Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/riscv.cc (riscv_emit_mode_set): Add emit for FRM. (riscv_mode_needed): Likewise. (riscv_entity_mode_after): Likewise. (riscv_mode_after): Likewise. (riscv_mode_entry): Likewise. (riscv_mode_exit): Likewise. * config/riscv/riscv.h (NUM_MODES_FOR_MODE_SWITCHING): Add number for FRM. * config/riscv/riscv.md: Add FRM register. * config/riscv/vector-iterators.md: Add FRM type. * config/riscv/vector.md (frm_mode): Define new attr for FRM mode. (fsrm): Define new insn for fsrm instruction. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/float-point-frm-insert-1.c: New test. * gcc.target/riscv/rvv/base/float-point-frm-insert-2.c: New test. * gcc.target/riscv/rvv/base/float-point-frm-insert-3.c: New test. * gcc.target/riscv/rvv/base/float-point-frm-insert-4.c: New test. * gcc.target/riscv/rvv/base/float-point-frm-insert-5.c: New test.
2023-06-29 | RISC-V: Allow rounding mode control for RVV floating-point add | Pan Li | 10 | -0/+189
According to the doc below, we need to support the rounding mode of the RVV floating-point, both the static and dynamic frm. https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/226 To ease tracking and development, we will take several steps to support all rounding modes for RVV floating-point. 1. Allow rounding mode control by one intrinsic (aka this patch), vfadd. 2. Support static rounding mode control by mode switch, like fixed-point. 3. Support dynamic rounding mode control by mode switch. 4. Support the remaining floating-point instructions for frm. Please *NOTE* this patch only allows rounding mode control for the vfadd intrinsic API, and the related frm will be covered by step 2. Signed-off-by: Pan Li <pan2.li@intel.com> Co-Authored by: Juzhe-Zhong <juzhe.zhong@rivai.ai> gcc/ChangeLog: * config/riscv/riscv-protos.h (enum floating_point_rounding_mode): Add macro for static frm min and max. * config/riscv/riscv-vector-builtins-bases.cc (class binop_frm): New class for floating-point with frm. (BASE): Add vfadd for frm. * config/riscv/riscv-vector-builtins-bases.h: Likewise. * config/riscv/riscv-vector-builtins-functions.def (vfadd_frm): Likewise. * config/riscv/riscv-vector-builtins-shapes.cc (struct alu_frm_def): New struct for alu with frm. (SHAPE): Add alu with frm. * config/riscv/riscv-vector-builtins-shapes.h: Likewise. * config/riscv/riscv-vector-builtins.cc (function_checker::report_out_of_range_and_not): New function for report out of range and not val. (function_checker::require_immediate_range_or): New function for checking in range or one val. * config/riscv/riscv-vector-builtins.h: Add function decl. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/float-point-frm-error.c: New test. * gcc.target/riscv/rvv/base/float-point-frm.c: New test.
2023-06-29 | x86: Update model values for Alderlake, Rocketlake and Raptorlake. | Cui, Lili | 1 | -2/+1
Update model values for Alderlake, Rocketlake and Raptorlake according to SDM. gcc/ChangeLog * common/config/i386/cpuinfo.h (get_intel_cpu): Remove model value 0xa8 from Rocketlake, move model value 0xbf from Alderlake to Raptorlake.
2023-06-28 | Fix collection and processing of autoprofile data for target libs | Eugene Rozenfeld | 3 | -5/+5
cc1, cc1plus, and lto built during STAGEautoprofile need to be built with debug info since they are used to build target libs. -gtoggle was turning off debug info for this stage. create_gcov should be passed prev-gcc/cc1, prev-gcc/cc1plus, and prev-gcc/lto instead of stage1-gcc/cc1, stage1-gcc/cc1plus, and stage1-gcc/lto when processing profile data collected while building target libraries. Tested on x86_64-pc-linux-gnu. ChangeLog: * Makefile.in: Remove -gtoggle for STAGEautoprofile * Makefile.tpl: Remove -gtoggle for STAGEautoprofile gcc/c/ChangeLog: * Make-lang.in: Pass correct stage cc1 when processing profile data collected while building target libraries gcc/cp/ChangeLog: * Make-lang.in: Pass correct stage cc1plus when processing profile data collected while building target libraries gcc/lto/ChangeLog: * Make-lang.in: Pass correct stage lto when processing profile data collected while building target libraries
2023-06-29 | Daily bump. | GCC Administrator | 8 | -1/+411
2023-06-28 | testsuite: check_effective_target_lra: CRIS is LRA | Hans-Peter Nilsson | 1 | -1/+1
Left-over from r14-383-gfaf8bea79b6256. * lib/target-supports.exp (check_effective_target_lra): Remove cris-*-* from expression for exceptions to LRA.
2023-06-28 | CRIS: Don't apply PATTERN to insn before validation (PR 110144) | Hans-Peter Nilsson | 1 | -1/+1
Oops. The validation was there, but PATTERN was applied before that. Noticeable only with rtl-checking (for example as in the report: "--enable-checking=yes,rtl") as this statement was only a (one of many) straggling olde-C declare-and-initialize-at-beginning-of-block thing. PR target/110144 * config/cris/cris.cc (cris_postdbr_cmpelim): Don't apply PATTERN to insn before validating it.
2023-06-28 | Enable early inlining into always_inline functions | Jan Hubicka | 6 | -45/+108
The early inliner currently skips always_inline functions, and moreover we ignore calls from always_inline in ipa_reverse_postorder. This disables most of the propagation done by early optimization, which is quite bad when early-inlined functions are not leaf functions, as is now quite common in libstdc++. Instead of fully disabling the inlining, this patch checks the calls in the callee. I am quite conservative about what can be inlined as this patch is a bit touchy anyway. To avoid problems with always_inline being optimized after early inline I extended inline_always_inline_functions to lazily compute fnsummary when needed. gcc/ChangeLog: PR middle-end/110334 * ipa-fnsummary.h (ipa_fn_summary): Add safe_to_inline_to_always_inline. * ipa-inline.cc (can_early_inline_edge_p): ICE if SSA is not built; do cycle checking for always_inline functions. (inline_always_inline_functions): Be recursive; watch for cycles; do not update overall summary. (early_inliner): Do not give up on always_inlines. * ipa-utils.cc (ipa_reverse_postorder): Do not skip always inlines. gcc/testsuite/ChangeLog: PR middle-end/110334 * g++.dg/opt/pr66119.C: Disable early inlining. * gcc.c-torture/compile/pr110334.c: New test. * gcc.dg/tree-ssa/pr110334.c: New test.
2023-06-28 | Fortran: ABI for scalar CHARACTER(LEN=1),VALUE dummy argument [PR110360] | Harald Anlauf | 2 | -5/+33
gcc/fortran/ChangeLog: PR fortran/110360 * trans-expr.cc (gfc_conv_procedure_call): For non-constant string argument passed to CHARACTER(LEN=1),VALUE dummy, ensure proper dereferencing and truncation of string to length 1. gcc/testsuite/ChangeLog: PR fortran/110360 * gfortran.dg/value_9.f90: Add tests for intermediate regression.
2023-06-28 | c++: ahead of time variable template-id coercion [PR89442] | Patrick Palka | 12 | -19/+32
This patch makes us coerce the arguments of a variable template-id ahead of time, as we do for class template-ids, which causes us to immediately diagnose template parm/arg kind mismatches and arity mismatches. Unfortunately this causes a regression in cpp1z/constexpr-if20.C: coercing the variable template-id m<ar, as> ahead of time means we strip it of typedefs, yielding m<typename C<i>::q, typename C<j>::q>, but in this stripped form we're directly using 'i' and so we expect to have captured it. This is a variable template version of PR107437. PR c++/89442 PR c++/107437 gcc/cp/ChangeLog: * cp-tree.h (lookup_template_variable): Add complain parameter. * parser.cc (cp_parser_template_id): Pass tf_warning_or_error to lookup_template_variable. * pt.cc (lookup_template_variable): Add complain parameter. Coerce template arguments here ... (finish_template_variable): ... instead of here. (lookup_and_finish_template_variable): Check for error_mark_node result from lookup_template_variable. (tsubst_copy) <case TEMPLATE_ID_EXPR>: Pass complain to lookup_template_variable. (instantiate_template): Use build2 instead of lookup_template_variable to build a TEMPLATE_ID_EXPR for most_specialized_partial_spec. gcc/testsuite/ChangeLog: * g++.dg/cpp/pr64127.C: Expect "expected unqualified-id at end of input" error. * g++.dg/cpp0x/alias-decl-ttp1.C: Fix template parameter/argument kind mismatch for variable template has_P_match_V. * g++.dg/cpp1y/pr72759.C: Expect "template argument 1 is invalid" error. * g++.dg/cpp1z/constexpr-if20.C: XFAIL test due to bogus "'i' is not captured" error. * g++.dg/cpp1z/noexcept-type21.C: Fix arity of variable template d. * g++.dg/diagnostic/not-a-function-template-1.C: Add default template argument to variable template A so that A<> is valid. * g++.dg/parse/error56.C: Don't expect "ISO C++ forbids declaration with no type" error. * g++.dg/parse/template30.C: Don't expect "parse error in template argument list" error. 
* g++.dg/cpp1y/var-templ82.C: New test.
2023-06-28 | d: Fix wrong code-gen when returning structs by value. | Iain Buclaw | 2 | -4/+60
Since r13-1104, structs have compute_record_mode called too early on them, causing them to return differently depending on the order that types are generated in, and whether there are forward references. This patch moves the call to compute_record_mode into its own function, and calls it after all fields have been given a size. PR d/106977 PR target/110406 gcc/d/ChangeLog: * types.cc (finish_aggregate_mode): New function. (finish_incomplete_fields): Call finish_aggregate_mode. (finish_aggregate_type): Replace call to compute_record_mode with finish_aggregate_mode. gcc/testsuite/ChangeLog: * gdc.dg/torture/pr110406.d: New test.
2023-06-28 | d: Fix d_signed_or_unsigned_type is invoked for vector types (PR110193) | Iain Buclaw | 1 | -2/+2
This function can be invoked on VECTOR_TYPE, but the implementation assumes it works on integer types only. To fix, added a check whether the type passed is any `__vector(T)' or non-integral type, and return early by calling `signed_or_unsigned_type_for()' instead. Problem was found by instrumenting TYPE_PRECISION and ICEing when applied on VECTOR_TYPEs. PR d/110193 gcc/d/ChangeLog: * types.cc (d_signed_or_unsigned_type): Handle being called with any vector or non-integral type.
2023-06-28 | c++: fix error reporting routines re-entered ICE [PR110175] | Marek Polacek | 2 | -2/+9
Here we get the "error reporting routines re-entered" ICE because of an unguarded use of warning_at. While at it, I added a check for a warning_at just above it. PR c++/110175 gcc/cp/ChangeLog: * typeck.cc (cp_build_unary_op): Check tf_warning before warning. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/decltype-110175.C: New test.
2023-06-28 | final+varasm: Change return type of predicate functions from int to bool | Uros Bizjak | 5 | -81/+83
Also change some internal variables to bool and change return type of compute_alignments to void. gcc/ChangeLog: * output.h (leaf_function_p): Change return type from int to bool. (final_forward_branch_p): Ditto. (only_leaf_regs_used): Ditto. (maybe_assemble_visibility): Ditto. * varasm.h (supports_one_only): Ditto. * rtl.h (compute_alignments): Change return type from int to void. * final.cc (app_on): Change return type from int to bool. (compute_alignments): Change return type from int to void and adjust function body accordingly. (shorten_branches): Change "something_changed" variable type from int to bool. (leaf_function_p): Change return type from int to bool and adjust function body accordingly. (final_forward_branch_p): Ditto. (only_leaf_regs_used): Ditto. * varasm.cc (contains_pointers_p): Change return type from int to bool and adjust function body accordingly. (compare_constant): Ditto. (maybe_assemble_visibility): Ditto. (supports_one_only): Ditto.
2023-06-28 | cprop_hardreg: fix ORIGINAL_REGNO/REG_ATTRS/REG_POINTER handling | Manolis Tsamis | 2 | -16/+65
Fixes: 6a2e8dcbbd4bab3 Propagation for the stack pointer in regcprop was enabled in 6a2e8dcbbd4bab3, but set ORIGINAL_REGNO/REG_ATTRS/REG_POINTER for stack_pointer_rtx which caused regressions (e.g., PR 110313, PR 110308). This fix adds special handling for stack_pointer_rtx in the places where maybe_mode_change is called. This also adds a check in maybe_mode_change to return the stack pointer only when the requested mode matches the mode of stack_pointer_rtx. PR debug/110308 gcc/ChangeLog: * regcprop.cc (maybe_mode_change): Check stack_pointer_rtx mode. (maybe_copy_reg_attrs): New function. (find_oldest_value_reg): Use maybe_copy_reg_attrs. (copyprop_hardreg_forward_1): Ditto. gcc/testsuite/ChangeLog: * g++.dg/torture/pr110308.C: New test. Signed-off-by: Manolis Tsamis <manolis.tsamis@vrull.eu> Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
2023-06-28 | tree-optimization/110434 - avoid <retval> ={v} {CLOBBER} from NRV | Richard Biener | 1 | -1/+11
When NRV replaces a local variable with <retval> it also replaces occurrences in clobbers. This leads to <retval> being clobbered before the return of it which is strictly invalid but harmless in practice since there's no pass after NRV which would remove earlier stores. The following fixes this nevertheless. PR tree-optimization/110434 * tree-nrv.cc (pass_nrv::execute): Remove CLOBBERs of VAR we replace with <retval>.
2023-06-28 | Make mve_fp_fpu[12].c accept single or double precision FPU | Christophe Lyon | 2 | -2/+2
These tests currently expect a directive containing .fpu fpv5-sp-d16 and thus may fail if the test is executed for instance with -march=armv8.1-m.main+mve.fp+fp.dp This patch accepts either fpv5-sp-d16 or fpv5-d16 to avoid the failure. 2023-06-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/testsuite/ * gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c: Fix .fpu scan-assembler. * gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c: Likewise.
2023-06-28 | Make nomve_fp_1.c require arm_fp | Christophe Lyon | 1 | -0/+2
If GCC is configured with the default (soft) -mfloat-abi, and we don't override the target_board test flags appropriately, gcc.target/arm/mve/general-c/nomve_fp_1.c fails for lack of -mfloat-abi=softfp or -mfloat-abi=hard, because it doesn't use dg-add-options arm_v8_1m_mve (on purpose, see comment in the test). Require and use the options needed for arm_fp to fix this problem. 2023-06-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/testsuite/ * gcc.target/arm/mve/general-c/nomve_fp_1.c: Require arm_fp.
2023-06-28 | tree-optimization/110451 - hoist invariant compare after interchange | Richard Biener | 2 | -1/+61
The following adjusts the cost model of invariant motion to consider [VEC_]COND_EXPRs and comparisons producing a data value as expensive. For 503.bwaves_r this avoids an unnecessarily high vectorization factor because of an integer comparison besides data operations on double. PR tree-optimization/110451 * tree-ssa-loop-im.cc (stmt_cost): [VEC_]COND_EXPR and tcc_comparison are expensive. * gfortran.dg/vect/pr110451.f: New testcase.
2023-06-28 | Fortran: Enable class expressions in structure constructors [PR49213] | Paul Thomas | 5 | -12/+166
2023-06-28 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/49213 * expr.cc (gfc_is_ptr_fcn): Remove reference to class_pointer. * resolve.cc (resolve_assoc_var): Call gfc_is_ptr_fcn to allow associate names with pointer function targets to be used in variable definition context. * trans-decl.cc (get_symbol_decl): Remove extraneous line. * trans-expr.cc (alloc_scalar_allocatable_subcomponent): Obtain size of intrinsic and character expressions. (gfc_trans_subcomponent_assign): Expand assignment to class components to include intrinsic and character expressions. gcc/testsuite/ PR fortran/49213 * gfortran.dg/pr49213.f90 : New test
2023-06-28 | i386: Add cbranchti4 pattern to i386.md (for -m32 compare_by_pieces). | Roger Sayle | 4 | -3/+45
This patch fixes some very odd (unanticipated) code generation by compare_by_pieces with -m32 -mavx, since the recent addition of the cbranchoi4 pattern. The issue is that cbranchoi4 is available with TARGET_AVX, but cbranchti4 is currently conditional on TARGET_64BIT, which results in the odd behaviour (thanks to OPTAB_WIDEN) that with -m32 -mavx, compare_by_pieces ends up (inefficiently) widening 128-bit comparisons to 256 bits before performing PTEST.

This patch fixes this by providing a cbranchti4 pattern that's available with either TARGET_64BIT or TARGET_SSE4_1.

For the test case below (again from PR 104610):

    int foo(char *a)
    {
      static const char t[] = "0123456789012345678901234567890";
      return __builtin_memcmp(a, &t[0], sizeof(t)) == 0;
    }

GCC with -m32 -O2 -mavx currently produces the bonkers:

    foo:    pushl   %ebp
            movl    %esp, %ebp
            andl    $-32, %esp
            subl    $64, %esp
            movl    8(%ebp), %eax
            vmovdqa .LC0, %xmm4
            movl    $0, 48(%esp)
            vmovdqu (%eax), %xmm2
            movl    $0, 52(%esp)
            movl    $0, 56(%esp)
            movl    $0, 60(%esp)
            movl    $0, 16(%esp)
            movl    $0, 20(%esp)
            movl    $0, 24(%esp)
            movl    $0, 28(%esp)
            vmovdqa %xmm2, 32(%esp)
            vmovdqa %xmm4, (%esp)
            vmovdqa (%esp), %ymm5
            vpxor   32(%esp), %ymm5, %ymm0
            vptest  %ymm0, %ymm0
            jne     .L2
            vmovdqu 16(%eax), %xmm7
            movl    $0, 48(%esp)
            movl    $0, 52(%esp)
            vmovdqa %xmm7, 32(%esp)
            vmovdqa .LC1, %xmm7
            movl    $0, 56(%esp)
            movl    $0, 60(%esp)
            movl    $0, 16(%esp)
            movl    $0, 20(%esp)
            movl    $0, 24(%esp)
            movl    $0, 28(%esp)
            vmovdqa %xmm7, (%esp)
            vmovdqa (%esp), %ymm1
            vpxor   32(%esp), %ymm1, %ymm0
            vptest  %ymm0, %ymm0
            je      .L6
    .L2:    movl    $1, %eax
            xorl    $1, %eax
            vzeroupper
            leave
            ret
    .L6:    xorl    %eax, %eax
            xorl    $1, %eax
            vzeroupper
            leave
            ret

With this patch, we now generate the (slightly) more sensible:

    foo:    vmovdqa .LC0, %xmm0
            movl    4(%esp), %eax
            vpxor   (%eax), %xmm0, %xmm0
            vptest  %xmm0, %xmm0
            jne     .L2
            vmovdqa .LC1, %xmm0
            vpxor   16(%eax), %xmm0, %xmm0
            vptest  %xmm0, %xmm0
            je      .L5
    .L2:    movl    $1, %eax
            xorl    $1, %eax
            ret
    .L5:    xorl    %eax, %eax
            xorl    $1, %eax
            ret

2023-06-28 Roger Sayle <roger@nextmovesoftware.com>

gcc/ChangeLog
    * config/i386/i386-expand.cc (ix86_expand_branch): Also use ptest for TImode comparisons on 32-bit architectures.
    * config/i386/i386.md (cbranch<mode>4): Change from SDWIM to SWIM1248x to exclude/avoid TImode being conditional on -m64.
    (cbranchti4): New define_expand for TImode on both TARGET_64BIT and/or with TARGET_SSE4_1.
    * config/i386/predicates.md (ix86_timode_comparison_operator): New predicate that depends upon TARGET_64BIT.
    (ix86_timode_comparison_operand): Likewise.

gcc/testsuite/ChangeLog
    * gcc.target/i386/pieces-memcmp-2.c: New test case.
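As a rough aid for trying the test case from this commit message on a host machine, the sketch below wraps `foo` in a small helper. `foo` is copied from the message; `foo_with_flipped_byte` is a hypothetical helper added only for illustration and is not part of the patch or of pieces-memcmp-2.c. The constant is 31 characters plus the terminating NUL, so the comparison covers 32 bytes, i.e. two 128-bit pieces.

```c
#include <string.h>

/* Test function from the commit message: compares 32 bytes
   (31 characters plus the terminating NUL) against a constant.  */
int foo(char *a)
{
  static const char t[] = "0123456789012345678901234567890";
  return __builtin_memcmp(a, &t[0], sizeof(t)) == 0;
}

/* Hypothetical helper (not in the patch): builds a buffer equal to the
   constant, optionally flips the low bit of byte POS, and returns what
   foo reports.  A negative POS leaves the buffer untouched.  */
int foo_with_flipped_byte(int pos)
{
  char buf[32];
  memcpy(buf, "0123456789012345678901234567890", sizeof(buf));
  if (pos >= 0 && pos < (int) sizeof(buf))
    buf[pos] ^= 1;
  return foo(buf);
}
```

Flipping a byte in either 16-byte half should make `foo` return 0, which corresponds to the two `vptest` branches in the assembly above. Whether the compiler actually emits that sequence depends on the target and flags (-m32 -O2 -mavx in the message).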
2023-06-28i386: Fix FAIL of gcc.target/i386/pr78794.c on ia32.Roger Sayle1-1/+25
This patch fixes the FAIL of gcc.target/i386/pr78794.c on ia32, which is caused by minor STV rtx_cost differences with -march=silvermont. It turns out that generic tuning results in pandn, but the lack of accurate parameterization for COMPARE in compute_convert_gain, combined with small differences in scalar<->SSE costs on silvermont, results in this DImode chain not being converted. The solution is to provide more accurate costs/gains for converting (DImode and SImode) comparisons. I'd been holding off on doing this as I'd thought it would be possible to turn pandn;ptestz into ptestc (for an even bigger scalar-to-vector win), but I've recently realized that these optimizations (as I've implemented them) occur in the wrong order (stv2 occurs after combine), so it isn't easy for STV to convert CCZmode into CCCmode. Doh! Perhaps something can be done in peephole2. 2023-06-28 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog PR target/78794 * config/i386/i386-features.cc (compute_convert_gain): Provide more accurate gains for conversion of scalar comparisons to PTEST.
2023-06-28tree-optimization/110443 - prevent SLP splat of gathersRichard Biener2-1/+23
The following prevents non-grouped load SLP in case the element to splat is from a gather operation. While it should be possible to support this it is not similar to the single element interleaving case I was trying to mimic here. PR tree-optimization/110443 * tree-vect-slp.cc (vect_build_slp_tree_1): Reject non-grouped gather loads. * gcc.dg/torture/pr110443.c: New testcase.
2023-06-28rs6000: Add two peephole patterns for "mr." insnHaochen Gui3-4/+155
When investigating the issue mentioned in PR87871#c30 - whether the compare and move pattern benefits before RA - I checked the assembly generated for SPEC2017 and found that certain insn sequences aren't converted to "mr." instructions. The following two sequences are never combined into the "mr." pattern, as there is no register link between them. This patch adds two peephole2 patterns to convert them to "mr." instructions.

    cmp 0,3,0
    mr 4,3

    mr 4,3
    cmp 0,3,0

The patch also creates a new mode iterator which is decided by TARGET_POWERPC64. This mode iterator is used in "mr." and its split pattern. The original P iterator is improper when -m32/-mpowerpc64 is set. In this situation, "mr." should compare the whole 64-bit register with 0 rather than only the low 32 bits.

gcc/
    * config/rs6000/rs6000.md (peephole2 for compare_and_move): New.
    (peephole2 for move_and_compare): New.
    (mode_iterator WORD): New. Set the mode to SI/DImode by TARGET_POWERPC64.
    (*mov<mode>_internal2): Change the mode iterator from P to WORD.
    (split pattern for compare_and_move): Likewise.

gcc/testsuite/
    * gcc.dg/rtl/powerpc/move_compare_peephole_32.c: New.
    * gcc.dg/rtl/powerpc/move_compare_peephole_64.c: New.
2023-06-28RISC-V: Support vfwmacc combine loweringJuzhe-Zhong5-6/+103
This patch adds combine patterns as follows:

1. (set (reg) (fma (float_extend:reg)(float_extend:reg)(reg)))
   This pattern allows combine: vfwcvt + vfwcvt + vfmacc ==> vfwmacc.

2. (set (reg) (fma (float_extend:reg)(reg)(reg)))
   This pattern is the intermediate IR that enhances the combine optimizations. In complicated situations, the combine pass cannot combine both operands of the multiplication at once, so it first combines one extension in a first stage: (set (reg) (fma (float_extend:reg)(reg)(reg))). It then combines the extension of the other operand in a second stage.

This can enhance combine optimization for the following case:

    #define TEST_TYPE(TYPE1, TYPE2) \
      __attribute__ ((noipa)) void vwadd_##TYPE1_##TYPE2 ( \
        TYPE1 *__restrict dst, TYPE1 *__restrict dst2, TYPE1 *__restrict dst3, \
        TYPE1 *__restrict dst4, TYPE2 *__restrict a, TYPE2 *__restrict b, \
        TYPE2 *__restrict a2, TYPE2 *__restrict b2, int n) \
      { \
        for (int i = 0; i < n; i++) \
          { \
            dst[i] += (TYPE1) a[i] * (TYPE1) b[i]; \
            dst2[i] += (TYPE1) a2[i] * (TYPE1) b[i]; \
            dst3[i] += (TYPE1) a2[i] * (TYPE1) a[i]; \
            dst4[i] += (TYPE1) a[i] * (TYPE1) b2[i]; \
          } \
      }

    #define TEST_ALL() \
      TEST_TYPE (int16_t, int8_t) \
      TEST_TYPE (uint16_t, uint8_t) \
      TEST_TYPE (int32_t, int16_t) \
      TEST_TYPE (uint32_t, uint16_t) \
      TEST_TYPE (int64_t, int32_t) \
      TEST_TYPE (uint64_t, uint32_t) \
      TEST_TYPE (float, _Float16) \
      TEST_TYPE (double, float)

    TEST_ALL ()

gcc/ChangeLog:
    * config/riscv/autovec-opt.md (*double_widen_fma<mode>): New pattern.
    (*single_widen_fma<mode>): Ditto.

gcc/testsuite/ChangeLog:
    * gcc.target/riscv/rvv/autovec/widen/widen-8.c: Add floating-point.
    * gcc.target/riscv/rvv/autovec/widen/widen-complicate-5.c: Ditto.
    * gcc.target/riscv/rvv/autovec/widen/widen_run-8.c: Ditto.
    * gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-8.c: New test.
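To make the shape of the loop concrete, here is a hand-written scalar version of the body for the (double, float) pair. It is only a sketch: the function name `widen_fma_double_float` is hypothetical, and the code says nothing about what any particular compiler emits for it. Each statement widens two narrower operands and accumulates, which is the operation the patterns above aim to map to a single widening multiply-accumulate. The reuse of `a`, `b`, and `a2` across statements is what makes the case "complicated" for combine.

```c
/* Hand-expanded scalar form of the loop body for TYPE1 = double,
   TYPE2 = float.  The function name is hypothetical.  */
void widen_fma_double_float(double *__restrict dst, double *__restrict dst2,
                            double *__restrict dst3, double *__restrict dst4,
                            float *__restrict a, float *__restrict b,
                            float *__restrict a2, float *__restrict b2,
                            int n)
{
  for (int i = 0; i < n; i++)
    {
      /* Both multiplication operands are extended before the
         multiply-accumulate.  */
      dst[i] += (double) a[i] * (double) b[i];
      dst2[i] += (double) a2[i] * (double) b[i];
      dst3[i] += (double) a2[i] * (double) a[i];
      dst4[i] += (double) a[i] * (double) b2[i];
    }
}
```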
2023-06-28rs6000: Splat vector small V2DI constants with vspltisw and vupkhswHaochen Gui6-1/+81
This patch adds a new insn for vector splat with small V2DI constants on P8. If the value of the constant is in the range [-16, 15] but not 0 or -1, it can be loaded with vspltisw and vupkhsw on P8. gcc/ PR target/104124 * config/rs6000/altivec.md (*altivec_vupkhs<VU_char>_direct): Rename to... (altivec_vupkhs<VU_char>_direct): ...this. * config/rs6000/predicates.md (vspltisw_vupkhsw_constant_split): New predicate to test if a constant can be loaded with vspltisw and vupkhsw. (easy_vector_constant): Call vspltisw_vupkhsw_constant_p to check if a vector constant can be synthesized with a vspltisw and a vupkhsw. * config/rs6000/rs6000-protos.h (vspltisw_vupkhsw_constant_p): Declare. * config/rs6000/rs6000.cc (vspltisw_vupkhsw_constant_p): New function to return true if OP mode is V2DI and can be synthesized with vupkhsw and vspltisw. * config/rs6000/vsx.md (*vspltisw_v2di_split): New insn to load up constants with vspltisw and vupkhsw. gcc/testsuite/ PR target/104124 * gcc.target/powerpc/pr104124.c: New.
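The value condition described above can be modelled in plain C as follows. This is a sketch, not the code in rs6000.cc: `small_v2di_splat_candidate` and `splat_five` are hypothetical names, the range is assumed to be inclusive at both ends (matching the 5-bit signed immediate of vspltisw), and the real predicate also checks the mode and that both elements are equal. The vector type uses the GCC generic vector extension.

```c
#include <stdbool.h>

/* Two 64-bit elements, using the GCC vector extension.  */
typedef long long v2di __attribute__ ((vector_size (16)));

/* Hypothetical model of the value test from the commit message:
   in [-16, 15] (assumed inclusive), excluding 0 and -1.  */
bool small_v2di_splat_candidate(long long value)
{
  return value >= -16 && value <= 15 && value != 0 && value != -1;
}

/* Example of a V2DI splat of a small constant, the kind of value
   the new insn is meant to handle.  */
v2di splat_five(void)
{
  return (v2di) { 5, 5 };
}
```

The test added by the commit, gcc.target/powerpc/pr104124.c, is the authoritative check of the generated instructions; the model here only illustrates which constants qualify.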
2023-06-28Enable ranger for ipa-propJan Hubicka2-2/+20
gcc/ChangeLog: PR tree-optimization/110377 * ipa-prop.cc (ipa_compute_jump_functions_for_edge): Pass statement to the ranger query. (ipa_analyze_node): Enable ranger. gcc/testsuite/ChangeLog: PR tree-optimization/110377 * gcc.dg/ipa/pr110377.c: New test.
2023-06-28Add testcase for PR 110444Andrew Pinski1-0/+11
This testcase was fixed after r14-2135-gd915762ea9043da85 and there was no testcase for it before so adding one is a good thing. Committed as obvious after testing the testcase to make sure it works. gcc/testsuite/ChangeLog: PR tree-optimization/110444 * gcc.c-torture/compile/pr110444-1.c: New test.
2023-06-28Prevent TYPE_PRECISION on VECTOR_TYPEsRichard Biener6-8/+10
The following makes sure that using TYPE_PRECISION on VECTOR_TYPE ICEs when tree checking is enabled. This should avoid wrong-code in cases like PR110182 and instead ICE. It also introduces a TYPE_PRECISION_RAW accessor and adjusts places I found that are eligible to use that. * tree.h (TYPE_PRECISION): Check for non-VECTOR_TYPE. (TYPE_PRECISION_RAW): Provide raw access to the precision field. * tree.cc (verify_type_variant): Compare TYPE_PRECISION_RAW. (gimple_canonical_types_compatible_p): Likewise. * tree-streamer-out.cc (pack_ts_type_common_value_fields): Stream TYPE_PRECISION_RAW. * tree-streamer-in.cc (unpack_ts_type_common_value_fields): Likewise. * lto-streamer-out.cc (hash_tree): Hash TYPE_PRECISION_RAW. gcc/lto/ * lto-common.cc (compare_tree_sccs_1): Use TYPE_PRECISION_RAW.
2023-06-28c++: inherited constructor attributesJason Merrill4-1/+43
Inherited constructors are like constructor clones; they don't exist from the language perspective, so they should copy the attributes in the same way. But it doesn't make sense to copy alias or ifunc attributes in either case. Unlike handle_copy_attribute, we do want to copy inlining attributes. The discussion of PR110334 pointed out that we weren't copying the always_inline attribute, leading to poor inlining choices. PR c++/110334 gcc/cp/ChangeLog: * cp-tree.h (clone_attrs): Declare. * method.cc (implicitly_declare_fn): Use it for inherited constructor. * optimize.cc (clone_attrs): New. (maybe_clone_body): Use it. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/nodiscard-inh1.C: New test.
2023-06-28Add leafy mode for zero-call-used-regsAlexandre Oliva8-3/+103
Introduce 'leafy' to auto-select between 'used' and 'all' for leaf and nonleaf functions, respectively. for gcc/ChangeLog * doc/extend.texi (zero-call-used-regs): Document leafy and variants thereof. * flag-types.h (zero_regs_flags): Add LEAFY_MODE, as well as LEAFY and variants. * function.cc (gen_call_used_regs_seq): Set only_used for leaf functions in leafy mode. * opts.cc (zero_call_used_regs_opts): Add leafy and variants. for gcc/testsuite/ChangeLog * c-c++-common/zero-scratch-regs-leafy-1.c: New. * c-c++-common/zero-scratch-regs-leafy-2.c: New. * gcc.target/i386/zero-scratch-regs-leafy-1.c: New. * gcc.target/i386/zero-scratch-regs-leafy-2.c: New.
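The selection rule can be pictured with a tiny model. Everything named here (`zero_regs_choice`, `leafy_choice`) is hypothetical and exists only to illustrate the rule stated in the message; it is not the code in function.cc, which per the ChangeLog sets only_used for leaf functions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two existing behaviours.  */
enum zero_regs_choice { ZERO_USED, ZERO_ALL };

/* Model of 'leafy': leaf functions behave like 'used',
   nonleaf functions behave like 'all'.  */
enum zero_regs_choice leafy_choice(bool is_leaf_function)
{
  return is_leaf_function ? ZERO_USED : ZERO_ALL;
}
```

Going by the ChangeLog entries for doc/extend.texi and opts.cc, a compiler containing this commit should accept the new mode name wherever the existing modes are accepted, for example `__attribute__ ((zero_call_used_regs ("leafy")))`; older compilers will not recognize it.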
2023-06-28[testsuite] note pitfall in how outputs.exp sets gldAlexandre Oliva1-1/+9
This patch documents a glitch in gcc.misc-tests/outputs.exp: it checks whether the linker is GNU ld, and uses that to decide whether to expect collect2 to create .ld1_args files under -save-temps, but collect2 bases that decision on whether HAVE_GNU_LD is set, which may be false even if the linker in use is GNU ld. Configuring --with-gnu-ld fixes this misalignment. Without that, atsave tests are likely to fail, because without HAVE_GNU_LD, collect2 won't use @file syntax to run the linker (so it won't create .ld1_args files). Long version: HAVE_GNU_LD is set when (i) DEFAULT_LINKER is set during configure, pointing at GNU ld; (ii) --with-gnu-ld is passed to configure; or (iii) config.gcc sets gnu_ld=yes. If a port doesn't set gnu_ld, and the toolchain isn't configured so as to assume GNU ld, configure and thus collect2 conservatively assume the linker doesn't support @file arguments. But outputs.exp can't see how configure set HAVE_GNU_LD (it may be used to test an installed compiler), and upon finding that the linker used by the compiler is GNU ld, it will expect collect2 to use @file arguments when running the linker. If that assumption doesn't hold, atsave tests will fail. for gcc/testsuite/ChangeLog * gcc.misc-tests/outputs.exp (gld): Note a known mismatch and record a workaround.
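The mismatch described above comes down to two independent facts: how configure set HAVE_GNU_LD, and which linker the compiler actually runs. The sketch below models that logic only; the struct and function names are hypothetical and do not correspond to code in collect2 or outputs.exp.

```c
#include <stdbool.h>

/* Hypothetical summary of the three ways HAVE_GNU_LD gets set.  */
struct ld_config {
  bool default_linker_is_gnu_ld;  /* (i) DEFAULT_LINKER points at GNU ld */
  bool with_gnu_ld;               /* (ii) --with-gnu-ld given to configure */
  bool port_sets_gnu_ld;          /* (iii) config.gcc sets gnu_ld=yes */
};

/* Model of the configure-time result.  */
bool have_gnu_ld(const struct ld_config *c)
{
  return c->default_linker_is_gnu_ld || c->with_gnu_ld || c->port_sets_gnu_ld;
}

/* The testsuite expects .ld1_args files whenever the linker in use is
   GNU ld, but collect2 only writes them when HAVE_GNU_LD is set.  */
bool atsave_expectation_mismatch(const struct ld_config *c,
                                 bool linker_in_use_is_gnu_ld)
{
  return linker_in_use_is_gnu_ld && !have_gnu_ld(c);
}
```

In this model, setting `with_gnu_ld` clears the mismatch, which mirrors the workaround the message suggests.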
2023-06-27c++: C++26 constexpr cast from void* [PR110344]Jason Merrill5-1/+665
P2768 allows static_cast from void* to ob* in constant evaluation if the pointer does in fact point to an object of the appropriate type. cxx_fold_indirect_ref already does the work of finding such an object if it happens to be a subobject rather than the outermost object at that address, as in constexpr-voidptr2.C. P2768 PR c++/110344 gcc/c-family/ChangeLog: * c-cppbuiltin.cc (c_cpp_builtins): Update __cpp_constexpr. gcc/cp/ChangeLog: * constexpr.cc (cxx_eval_constant_expression): In C++26, allow cast from void* to the type of a pointed-to object. gcc/testsuite/ChangeLog: * g++.dg/cpp26/constexpr-voidptr1.C: New test. * g++.dg/cpp26/constexpr-voidptr2.C: New test. * g++.dg/cpp26/feat-cxx26.C: New test.
2023-06-27testsuite: std_list handling for { target c++26 }Jason Merrill1-5/+5
As with c++23, we want to run { target c++26 } tests even though it isn't part of the default std_list. C++17 with Concepts TS is no longer an interesting target configuration. And bump the impcx target to use C++26 mode instead of 23. gcc/testsuite/ChangeLog: * lib/g++-dg.exp (g++-dg-runtest): Update for C++26.