path: root/gcc
Age | Commit message | Author | Files | Lines
2022-09-05 | Fortran/openmp: Partial OpenMP 5.2 doacross and omp_cur_iteration support | Tobias Burnus | 12 | -115/+407
Add the Fortran support to the ME/C/C++ commit r13-2388-ga651e6d59188da8992f8bfae2df1cb4e6316f9e6 gcc/fortran/ChangeLog: * dump-parse-tree.cc (show_omp_namelist, show_omp_clauses): Handle omp_cur_iteration and distinguish doacross/depend. * gfortran.h (enum gfc_omp_depend_doacross_op): Renamed from gfc_omp_depend_op. (enum gfc_omp_depend_doacross_op): Add OMP_DOACROSS_SINK_FIRST, Rename OMP_DEPEND_SINK to OMP_DOACROSS_SINK. (gfc_omp_namelist) Handle renaming, rename depend_op to depend_doacross_op. (struct gfc_omp_clauses): Add doacross_source. * openmp.cc (gfc_match_omp_depend_sink): Renamed to ... (gfc_match_omp_doacross_sink): ... this; handle omp_all_memory. (enum omp_mask2): Add OMP_CLAUSE_DOACROSS. (gfc_match_omp_clauses): Handle 'doacross' and syntax changes to depend. (gfc_match_omp_depobj): Simplify as sink/source are now impossible. (gfc_match_omp_ordered_depend): Request OMP_CLAUSE_DOACROSS. (resolve_omp_clauses): Update sink/source checks. (gfc_resolve_omp_directive): Resolve EXEC_OMP_ORDERED clauses. * parse.cc (decode_omp_directive): Handle 'ordered doacross'. * trans-openmp.cc (gfc_trans_omp_clauses): Handle doacross. (gfc_trans_omp_do): Fix OMP_FOR_ORIG_DECLS handling if 'ordered' clause is present. (gfc_trans_omp_depobj): Update for member name change. libgomp/ChangeLog: * libgomp.texi (OpenMP 5.2): Update doacross/omp_cur_iteration status. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/all-memory-1.f90: Update dg-error. * gfortran.dg/gomp/depend-iterator-2.f90: Likewise. * gfortran.dg/gomp/depobj-2.f90: Likewise. * gfortran.dg/gomp/doacross-5.f90: New test. * gfortran.dg/gomp/doacross-6.f90: New test. (cherry picked from commit 938cda536019cd6a1bc0dd2346381185b420bbf8)
2022-09-05 | Merge branch 'releases/gcc-12' into devel/omp/gcc-12 | Tobias Burnus | 18 | -48/+270
Merge up to r12-8742-gf4f72a25a9dfb5afbff8853bd51c1a891139dfd0 (5th Sep 2022)
2022-09-05 | nvptx: Silence unused variable warning in output_constant_pool_contents() | Jan-Benedict Glaw | 2 | -0/+10
Similar to the rs6000 code, nvptx defines ASM_OUTPUT_DEF_FROM_DECLS as well as ASM_OUTPUT_DEF. Make sure that the define's parameters are used by referencing them as (void) to silence a warning in output_constant_pool_contents(). 2022-09-30 Jan-Benedict Glaw <jbglaw@lug-owl.de> gcc/ * config/nvptx/nvptx.h (ASM_OUTPUT_DEF): Reference macro arguments. (cherry picked from commit 08de065293f8b08158e1089fbacce9dbaba95077)
2022-09-05 | openmp: Partial OpenMP 5.2 doacross and omp_cur_iteration support | Jakub Jelinek | 32 | -290/+933
The following patch implements part of the OpenMP 5.2 changes related to ordered loops and with the assumed resolution of https://github.com/OpenMP/spec/issues/3302 issues. The changes are: 1) the depend clause on stand-alone ordered constructs has been renamed to doacross (because depend clause has different syntax on other constructs) with some syntax changes below, depend clause is deprecated (we'll deprecate stuff on the GCC side only when we have everything else from 5.2 implemented) depend(source) -> doacross(source:) or doacross(source:omp_cur_iteration) depend(sink:vec) -> doacross(sink:vec) (where vec has the same syntax as before) 2) in 5.1 and before it has been significant whether ordered clause has or doesn't have an argument, if it didn't, only block-associated ordered could appear in the body, if it did, only stand-alone ordered could appear in the body, all loops had to be perfectly nested, no associated range-based for loops, no linear clause on work-sharing loop and ordered clause with an argument wasn't allowed on composite for simd. In 5.2, whether ordered clause has or doesn't have an argument is insignificant (except for bugs in the standard, #3302 mentions those), if the argument is missing, it is simply treated as equal to collapse argument (if any, otherwise 1). The implementation better should be able to differentiate between ordered and doacross loops at compile time which previously was through the absence or presence of the argument, now it is done through looking at the body of the construct lexically and looking for stand-alone ordered constructs. If there are any, it is to be handled as doacross loop, otherwise it is ordered loop (but in that case ordered argument if present must be equal to collapse argument - 5.2 says instead it must be one, but that is clearly wrong and mentioned in #3302) - stand-alone ordered constructs must appear lexically in the body (and had to before as well). 
For the restrictions mentioned above, the for simd restriction is gone (stand-alone ordered can't appear in simd construct, so that is enough), and the other rules are expected to be changed into something related to presence of stand-alone ordered constructs in the body 3) 5.2 allows a new syntax, doacross(sink:omp_cur_iteration-1), which means wait for previous iteration in the iteration space of all the associated loops The following patch implements that, except that we sorry for now on the doacross(sink:omp_cur_iteration-1) syntax during omp expansion because library side isn't done yet for it. It doesn't implement it for the Fortran FE either. Incrementally, I'd like to change the way we differentiate between stand-alone and block-associated ordered constructs, because the current way of looking for presence of doacross clause doesn't work well if those clauses are removed because they had been invalid (wrong syntax or unknown variables in it etc.) and of course implement doacross(sink:omp_cur_iteration-1). 2022-09-03 Jakub Jelinek <jakub@redhat.com> gcc/ * tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_DOACROSS. (enum omp_clause_depend_kind): Remove OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK, add OMP_CLAUSE_DEPEND_INVALID. (enum omp_clause_doacross_kind): New type. (struct tree_omp_clause): Add subcode.doacross_kind member. * tree.h (OMP_CLAUSE_DEPEND_SINK_NEGATIVE): Remove. (OMP_CLAUSE_DOACROSS_KIND): Define. (OMP_CLAUSE_DOACROSS_SINK_NEGATIVE): Define. (OMP_CLAUSE_DOACROSS_DEPEND): Define. (OMP_CLAUSE_ORDERED_DOACROSS): Define. * tree.cc (omp_clause_num_ops, omp_clause_code_name): Add OMP_CLAUSE_DOACROSS entries. * tree-nested.cc (convert_nonlocal_omp_clauses, convert_local_omp_clauses): Handle OMP_CLAUSE_DOACROSS. * tree-pretty-print.cc (dump_omp_clause): Don't handle OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK. Handle OMP_CLAUSE_DOACROSS. 
* gimplify.cc (gimplify_omp_depend): Don't handle OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK. (gimplify_scan_omp_clauses): Likewise. Handle OMP_CLAUSE_DOACROSS. (gimplify_adjust_omp_clauses): Handle OMP_CLAUSE_DOACROSS. (find_standalone_omp_ordered): New function. (gimplify_omp_for): When OMP_CLAUSE_ORDERED is present, search body for OMP_ORDERED with OMP_CLAUSE_DOACROSS and if found, set OMP_CLAUSE_ORDERED_DOACROSS. (gimplify_omp_ordered): Don't handle OMP_CLAUSE_DEPEND_SINK or OMP_CLAUSE_DEPEND_SOURCE, instead check OMP_CLAUSE_DOACROSS, adjust diagnostics that presence or absence of ordered clause parameter is irrelevant. Handle doacross(sink:omp_cur_iteration-1). Use actual user name of the clause - doacross or depend - in diagnostics. * omp-general.cc (omp_extract_for_data): Don't set fd->ordered if !OMP_CLAUSE_ORDERED_DOACROSS (t). If OMP_CLAUSE_ORDERED_DOACROSS (t) but !OMP_CLAUSE_ORDERED_EXPR (t), set fd->ordered to -1 and set it after the loop in that case to fd->collapse. * omp-low.cc (check_omp_nesting_restrictions): Don't handle OMP_CLAUSE_DEPEND_SOURCE nor OMP_CLAUSE_DEPEND_SINK, instead check OMP_CLAUSE_DOACROSS. Use actual user name of the clause - doacross or depend - in diagnostics. Diagnose mixing of stand-alone and block associated ordered constructs binding to the same loop. (lower_omp_ordered_clauses): Don't handle OMP_CLAUSE_DEPEND_SINK, instead handle OMP_CLAUSE_DOACROSS. (lower_omp_ordered): Look for OMP_CLAUSE_DOACROSS instead of OMP_CLAUSE_DEPEND. (lower_depend_clauses): Don't handle OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK. * omp-expand.cc (expand_omp_ordered_sink): Emit a sorry for doacross(sink:omp_cur_iteration-1). (expand_omp_ordered_source_sink): Use OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of OMP_CLAUSE_DEPEND_SINK_NEGATIVE. Use actual user name of the clause - doacross or depend - in diagnostics. (expand_omp): Look for OMP_CLAUSE_DOACROSS clause instead of OMP_CLAUSE_DEPEND. (build_omp_regions_1): Likewise. 
(omp_make_gimple_edges): Likewise. * lto-streamer-out.cc (hash_tree): Handle OMP_CLAUSE_DOACROSS. * tree-streamer-in.cc (unpack_ts_omp_clause_value_fields): Likewise. * tree-streamer-out.cc (pack_ts_omp_clause_value_fields): Likewise. gcc/c-family/ * c-pragma.h (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_DOACROSS. * c-omp.cc (c_finish_omp_depobj): Check also for OMP_CLAUSE_DOACROSS clause and diagnose it. Don't handle OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK. Assert kind is not OMP_CLAUSE_DEPEND_INVALID. gcc/c/ * c-parser.cc (c_parser_omp_clause_name): Handle doacross. (c_parser_omp_clause_depend_sink): Renamed to ... (c_parser_omp_clause_doacross_sink): ... this. Add depend_p argument. Handle parsing of doacross(sink:omp_cur_iteration-1). Use OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of OMP_CLAUSE_DEPEND_SINK_NEGATIVE, build OMP_CLAUSE_DOACROSS instead of OMP_CLAUSE_DEPEND and set OMP_CLAUSE_DOACROSS_DEPEND flag on it. (c_parser_omp_clause_depend): Use OMP_CLAUSE_DOACROSS_SINK and OMP_CLAUSE_DOACROSS_SOURCE instead of OMP_CLAUSE_DEPEND_SINK and OMP_CLAUSE_DEPEND_SOURCE, build OMP_CLAUSE_DOACROSS for depend(source) and set OMP_CLAUSE_DOACROSS_DEPEND on it. (c_parser_omp_clause_doacross): New function. (c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_DOACROSS. (c_parser_omp_depobj): Use OMP_CLAUSE_DEPEND_INVALID instead of OMP_CLAUSE_DEPEND_SOURCE. (c_parser_omp_for_loop): Don't diagnose here linear clause together with ordered with argument. (c_parser_omp_simd): Don't diagnose ordered clause with argument on for simd. (OMP_ORDERED_DEPEND_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_DOACROSS. (c_parser_omp_ordered): Handle also doacross and adjust for it diagnostic wording. * c-typeck.cc (c_finish_omp_clauses): Handle OMP_CLAUSE_DOACROSS. Don't handle OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK. gcc/cp/ * parser.cc (cp_parser_omp_clause_name): Handle doacross. (cp_parser_omp_clause_depend_sink): Renamed to ... 
(cp_parser_omp_clause_doacross_sink): ... this. Add depend_p argument. Handle parsing of doacross(sink:omp_cur_iteration-1). Use OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of OMP_CLAUSE_DEPEND_SINK_NEGATIVE, build OMP_CLAUSE_DOACROSS instead of OMP_CLAUSE_DEPEND and set OMP_CLAUSE_DOACROSS_DEPEND flag on it. (cp_parser_omp_clause_depend): Use OMP_CLAUSE_DOACROSS_SINK and OMP_CLAUSE_DOACROSS_SOURCE instead of OMP_CLAUSE_DEPEND_SINK and OMP_CLAUSE_DEPEND_SOURCE, build OMP_CLAUSE_DOACROSS for depend(source) and set OMP_CLAUSE_DOACROSS_DEPEND on it. (cp_parser_omp_clause_doacross): New function. (cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_DOACROSS. (cp_parser_omp_depobj): Use OMP_CLAUSE_DEPEND_INVALID instead of OMP_CLAUSE_DEPEND_SOURCE. (cp_parser_omp_for_loop): Don't diagnose here linear clause together with ordered with argument. (cp_parser_omp_simd): Don't diagnose ordered clause with argument on for simd. (OMP_ORDERED_DEPEND_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_DOACROSS. (cp_parser_omp_ordered): Handle also doacross and adjust for it diagnostic wording. * pt.cc (tsubst_omp_clause_decl): Use OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of OMP_CLAUSE_DEPEND_SINK_NEGATIVE. (tsubst_omp_clauses): Handle OMP_CLAUSE_DOACROSS. (tsubst_expr): Use OMP_CLAUSE_DEPEND_INVALID instead of OMP_CLAUSE_DEPEND_SOURCE. * semantics.cc (cp_finish_omp_clause_depend_sink): Rename to ... (cp_finish_omp_clause_doacross_sink): ... this. (finish_omp_clauses): Handle OMP_CLAUSE_DOACROSS. Don't handle OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK. gcc/fortran/ * trans-openmp.cc (gfc_trans_omp_clauses): Use OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of OMP_CLAUSE_DEPEND_SINK_NEGATIVE, build OMP_CLAUSE_DOACROSS clause instead of OMP_CLAUSE_DEPEND and set OMP_CLAUSE_DOACROSS_DEPEND on it. gcc/testsuite/ * c-c++-common/gomp/doacross-2.c: Adjust expected diagnostics. * c-c++-common/gomp/doacross-5.c: New test. * c-c++-common/gomp/doacross-6.c: New test. 
* c-c++-common/gomp/nesting-2.c: Adjust expected diagnostics. * c-c++-common/gomp/ordered-3.c: Likewise. * c-c++-common/gomp/sink-3.c: Likewise. * gfortran.dg/gomp/nesting-2.f90: Likewise. (cherry picked from commit a651e6d59188da8992f8bfae2df1cb4e6316f9e6)
2022-09-05 | Daily bump. | GCC Administrator | 1 | -1/+1
2022-09-04 | Daily bump. | GCC Administrator | 3 | -1/+18
2022-09-02 | rs6000: Don't ICE when we disassemble an MMA variable [PR101322] | Peter Bergner | 2 | -1/+23
When we expand an MMA disassemble built-in with C++ using a pointer that is cast to a valid MMA type, the type isn't passed down to the expand machinery and we end up using the base type of the pointer which leads to an ICE. This patch enforces we always use the correct MMA type regardless of the pointer type being used. 2022-08-31 Peter Bergner <bergner@linux.ibm.com> gcc/ PR target/101322 * config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_mma_builtin): Enforce the use of a valid MMA pointer type. gcc/testsuite/ PR target/101322 * g++.target/powerpc/pr101322.C: New test. (cherry picked from commit 2985049049f12b0aa3366ca244d387820385b9e8)
2022-09-03 | Daily bump. | GCC Administrator | 2 | -1/+44
2022-09-02 | cselib: add function to check if SET is redundant [PR106187] | Richard Earnshaw | 7 | -20/+90
A SET operation that writes memory may have the same value as an earlier store but if the alias sets of the new and earlier store do not conflict then the set is not truly redundant. This can happen, for example, if objects of different types share a stack slot. To fix this we define a new function in cselib that first checks for equality and if that is successful then finds the earlier store in the value history and checks the alias sets. The routine is used in two places elsewhere in the compiler: cfgcleanup and postreload. gcc/ChangeLog: PR rtl-optimization/106187 * alias.h (mems_same_for_tbaa_p): Declare. * alias.cc (mems_same_for_tbaa_p): New function. * dse.cc (record_store): Use it instead of open-coding alias check. * cselib.h (cselib_redundant_set_p): Declare. * cselib.cc: Include alias.h (cselib_redundant_set_p): New function. * cfgcleanup.cc: (mark_effect): Use cselib_redundant_set_p instead of rtx_equal_for_cselib_p. * postreload.cc (reload_cse_simplify): Use cselib_redundant_set_p. (reload_cse_noop_set_p): Delete. (cherry picked from commit 64ce76d940501cb04d14a0d36752b4f93473531c)
2022-09-02 | arm: correctly handle misaligned MEMs on MVE [PR105463] | Richard Earnshaw | 2 | -21/+73
Vector operations in MVE must be aligned to the element size, so if we are asked for a misaligned move in a wider mode we must recast it to a form suitable for the known alignment (larger elements have better address offset ranges, so there is some advantage to using wider element sizes if possible). Whilst fixing this, also rework the predicates used for validating operands - the Neon predicates are not right for MVE. gcc/ChangeLog: PR target/105463 * config/arm/mve.md (*movmisalign<mode>_mve_store): Use mve_memory_operand. (*movmisalign<mode>_mve_load): Likewise. * config/arm/vec-common.md (movmisalign<mode>): Convert to generator form... (@movmisalign<mode>): ... thus. Use generic predicates and then rework operands if they are not valid. For MVE rework to a narrower element size if the alignment is not high enough. (cherry picked from commit 6a116728e27c4da65d84483c0e75561a7479d4d5)
2022-09-02 | AArch64: Fix bootstrap failure due to dump_printf_loc format attribute uses [PR106782] | Tamar Christina | 1 | -1/+2
This fixes the bootstrap failure on AArch64 following -Werror=format by correcting the print format modifiers in the backend. gcc/ChangeLog: PR other/106782 * config/aarch64/aarch64.cc (aarch64_vector_costs::prefer_unrolled_loop): Replace %u with HOST_WIDE_INT_PRINT_UNSIGNED. (cherry picked from commit b98c5262d02c13cdbbf3b985859b436adec94d90)
2022-09-02 | Daily bump. | GCC Administrator | 2 | -1/+15
2022-09-01 | amdgcn: OpenMP SIMD routine support | Andrew Stubbs | 9 | -0/+91
Enable and configure SIMD clones for amdgcn. This affects both the __simd__ function attribute, and the OpenMP "declare simd" directive. Note that the masked SIMD variants are generated, but the middle end doesn't actually support calling them yet. gcc/ChangeLog: * config/gcn/gcn.cc (gcn_simd_clone_compute_vecsize_and_simdlen): New. (gcn_simd_clone_adjust): New. (gcn_simd_clone_usable): New. (TARGET_SIMD_CLONE_ADJUST): New. (TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN): New. (TARGET_SIMD_CLONE_USABLE): New. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-simd-clone-1.c: Add dg-warning. * gcc.dg/vect/vect-simd-clone-2.c: Add dg-warning. * gcc.dg/vect/vect-simd-clone-3.c: Add dg-warning. * gcc.dg/vect/vect-simd-clone-4.c: Add dg-warning. * gcc.dg/vect/vect-simd-clone-5.c: Add dg-warning. * gcc.dg/vect/vect-simd-clone-8.c: Add dg-warning. (cherry picked from commit b73c49f6f88dd7f7569f9a72c8ceb04598d4c15c)
2022-09-01 | omp-simd-clone: Unbreak bootstrap | Jakub Jelinek | 2 | -3/+13
This patch fixes -Werror=sign-compare errors during stage2/stage3. 2022-08-30 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> Jakub Jelinek <jakub@redhat.com> * omp-simd-clone.cc (simd_clone_adjust_return_type, simd_clone_adjust_argument_types): Use known_eq (veclen, 0U) instead of known_eq (veclen, 0) to avoid -Wsign-compare warnings. (cherry picked from commit 437bde93dcde8309bb23ee255924c697e8e70df9)
2022-09-01 | omp-simd-clone: Allow fixed-lane vectors | Andrew Stubbs | 4 | -5/+31
The vecsize_int/vecsize_float has an assumption that all arguments will use the same bitsize, and vary the number of lanes according to the element size, but this is inappropriate on targets where the number of lanes is fixed and the bitsize varies (i.e. amdgcn). With this change the vecsize can be left zero and the vectorization factor will be the same for all types. gcc/ChangeLog: * doc/tm.texi: Regenerate. * omp-simd-clone.cc (simd_clone_adjust_return_type): Allow zero vecsize. (simd_clone_adjust_argument_types): Likewise. * target.def (compute_vecsize_and_simdlen): Document the new vecsize_int and vecsize_float semantics. (cherry picked from commit f134a25ee8c29646f35f7e466109f6a7f5b9e824)
2022-09-01 | amdgcn: Vector procedure call ABI | Andrew Stubbs | 4 | -31/+61
Adjust the (unofficial) procedure calling ABI such that vector arguments are passed in vector registers, not on the stack. Scalar arguments continue to be passed in scalar registers, making a total of 12 argument registers. The return value is also moved to a vector register (even for scalars; it would be possible to retain the scalar location, using untyped_call, but there's no obvious advantage in doing so).

After this change the ABI is as follows:

s0-s13  : Reserved for kernel launch parameters.
s14-s15 : Frame pointer.
s16-s17 : Stack pointer.
s18-s19 : Link register.
s20-s21 : Exec Save.
s22-s23 : CC Save.
s24-s25 : Scalar arguments. NO LONGER RETURN VALUE.
s26-s29 : Additional scalar arguments (makes 6 total).
s30-s31 : Static Chain.
v0      : Prologue/epilogue scratch.
v1      : Constant 0, 1, 2, 3, 4, ... 63.
v2-v7   : Prologue/epilogue scratch.
v8-v9   : Return value & vector arguments. NEW.
v10-v13 : Additional vector arguments (makes 6 total). NEW.

gcc/ChangeLog:
* config/gcn/gcn.cc (gcn_function_value): Allow vector return values.
(num_arg_regs): Allow vector arguments.
(gcn_function_arg): Likewise.
(gcn_function_arg_advance): Likewise.
(gcn_arg_partial_bytes): Likewise.
(gcn_return_in_memory): Likewise.
(gcn_expand_epilogue): Get return value from v8.
* config/gcn/gcn.h (RETURN_VALUE_REG): Set to v8.
(FIRST_PARM_REG): USE FIRST_SGPR_REG for clarity.
(FIRST_VPARM_REG): New.
(FUNCTION_ARG_REGNO_P): Allow vector parameters.
(struct gcn_args): Add vnum field.
(LIBCALL_VALUE): All vector return values.
* config/gcn/gcn.md (gcn_call_value): Add vector constraints.
(gcn_call_value_indirect): Likewise.
(cherry picked from commit 4e1914625dec4aa09a5671c6294e877dbf4518f5)
2022-09-01 | Fix up dump_printf_loc format attribute and adjust uses [PR106782] | Jakub Jelinek | 3 | -4/+7
As discussed on IRC, the r13-2299-g68c61c2daa1f bug only got missed because dump_printf_loc had incorrect format attribute and therefore almost no -Wformat=* checking was performed on it. 3, 0 are suitable for function with (whatever, whatever, const char *, va_list) arguments, not for (whatever, whatever, const char *, ...), that one should use 3, 4. There are 3 spots where the mismatch was worse though, two using %u or %d for unsigned HOST_WIDE_INT argument and one %T for enum argument (promoted to int) and this backport just fixes those spots. 2022-09-01 Jakub Jelinek <jakub@redhat.com> PR other/106782 * tree-vect-slp.cc (vect_print_slp_tree): Use HOST_WIDE_INT_PRINT_UNSIGNED instead of %u. * tree-vect-loop.cc (vect_estimate_min_profitable_iters): Use HOST_WIDE_INT_PRINT_UNSIGNED instead of %d. * tree-vect-slp-patterns.cc (vect_pattern_validate_optab): Use %G instead of %T and STMT_VINFO_STMT (SLP_TREE_REPRESENTATIVE (node)) instead of SLP_TREE_DEF_TYPE (node). (cherry picked from commit 953e08fde44a596e4ec2491efd15cd645e1ddc48)
2022-09-01 | Daily bump. | GCC Administrator | 1 | -1/+1
2022-08-31 | Revert OG12-only parts of "dwarf: Multi-register CFI address support" | Tobias Burnus | 3 | -30/+14
"dwarf: Multi-register CFI address support." was committed to upstream (GCC12) as r12-5833-g13b6c7639cfdca892a3f02b63596b097e1839f38. It is based on an earlier version that was committed to OG11 as ce8e18474780aaae1abbfdde0543c948ae97da42. The OG11 version contained code that was not part of the upstream version as the approach changed; however, when OG12 was created, those were accidentally applied to OG12. This commit removes (reverts) the additional code, i.e. the OG12 commits: Reverts: commit 3405728e403ce7054f09e9a69846a57f680060db "dwarf: Multi-register CFI address support" gcc/ChangeLog: * dwarf2cfi.cc (get_cfa_from_loc_descr): Support register spans with DW_OP_piece and DW_OP_LLVM_piece_end. * dwarf2out.cc (build_cfa_loc): Support register spans. include/ChangeLog: * dwarf2.def (DW_OP_LLVM_piece_end): New extension operator. Reverts: commit 29ba2e4eeff0381e04a37a3c471c56cd887d2035 "Fix mis-merge of 'dwarf: Multi-register CFI address support'" gcc/ * dwarf2cfi.cc (get_cfa_from_loc_descr): Check op against DW_OP_bregx.
2022-08-31 | Merge branch 'releases/gcc-12' into devel/omp/gcc-12 | Tobias Burnus | 58 | -74691/+76011
Merge up to r12-8732-g63997f222380a7b718a9045effa4841c00f30f70 (31st Aug 2022)
2022-08-31 | Daily bump. | GCC Administrator | 6 | -1/+55
2022-08-30 | Update gcc sv.po | Joseph Myers | 1 | -12/+9
* sv.po: Update.
2022-08-30 | Fortran: Fix OpenMP clause name in error message | Tobias Burnus | 2 | -1/+9
gcc/fortran/ChangeLog: * openmp.cc (gfc_check_omp_requires): Fix clause name in error. Co-authored-by: Chung-Lin Tang <cltang@codesourcery.com> (cherry picked from commit 8af266501795dd76d05faef498dbd3472a01b305)
2022-08-30 | OpenMP: Support reverse offload (middle end part) | Tobias Burnus | 47 | -64/+299
gcc/ChangeLog: * internal-fn.cc (expand_GOMP_TARGET_REV): New. * internal-fn.def (GOMP_TARGET_REV): New. * lto-cgraph.cc (lto_output_node, verify_node_partition): Mark 'omp target device_ancestor_host' as in_other_partition and don't error if absent. * omp-low.cc (create_omp_child_function): Mark as 'noclone'. * omp-expand.cc (expand_omp_target): For reverse offload, remove sorry, use device = GOMP_DEVICE_HOST_FALLBACK and create empty-body nohost function. * omp-offload.cc (execute_omp_device_lower): Handle IFN_GOMP_TARGET_REV. (pass_omp_target_link::execute): For ACCEL_COMPILER, don't nullify fn argument for reverse offload libgomp/ChangeLog: * libgomp.texi (OpenMP 5.0): Mark 'ancestor' as implemented but refer to 'requires'. * testsuite/libgomp.c-c++-common/reverse-offload-1-aux.c: New test. * testsuite/libgomp.c-c++-common/reverse-offload-1.c: New test. * testsuite/libgomp.fortran/reverse-offload-1-aux.f90: New test. * testsuite/libgomp.fortran/reverse-offload-1.f90: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/reverse-offload-1.c: Remove dg-sorry. * c-c++-common/gomp/target-device-ancestor-4.c: Likewise. * gfortran.dg/gomp/target-device-ancestor-4.f90: Likewise. * gfortran.dg/gomp/target-device-ancestor-5.f90: Likewise. * c-c++-common/goacc/classify-kernels-parloops.c: Add 'noclone' to scan-tree-dump-times. * c-c++-common/goacc/classify-kernels-unparallelized-parloops.c: Likewise. * c-c++-common/goacc/classify-kernels-unparallelized.c: Likewise. * c-c++-common/goacc/classify-kernels.c: Likewise. * c-c++-common/goacc/classify-parallel.c: Likewise. * c-c++-common/goacc/classify-serial.c: Likewise. * c-c++-common/goacc/kernels-counter-vars-function-scope.c: Likewise. * c-c++-common/goacc/kernels-loop-2.c: Likewise. * c-c++-common/goacc/kernels-loop-3.c: Likewise. * c-c++-common/goacc/kernels-loop-data-2.c: Likewise. * c-c++-common/goacc/kernels-loop-data-enter-exit-2.c: Likewise. * c-c++-common/goacc/kernels-loop-data-enter-exit.c: Likewise. 
* c-c++-common/goacc/kernels-loop-data-update.c: Likewise. * c-c++-common/goacc/kernels-loop-data.c: Likewise. * c-c++-common/goacc/kernels-loop-g.c: Likewise. * c-c++-common/goacc/kernels-loop-mod-not-zero.c: Likewise. * c-c++-common/goacc/kernels-loop-n.c: Likewise. * c-c++-common/goacc/kernels-loop-nest.c: Likewise. * c-c++-common/goacc/kernels-loop.c: Likewise. * c-c++-common/goacc/kernels-one-counter-var.c: Likewise. * c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c: Likewise. * gfortran.dg/goacc/classify-kernels-parloops.f95: Likewise. * gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95: Likewise. * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise. * gfortran.dg/goacc/classify-kernels.f95: Likewise. * gfortran.dg/goacc/classify-parallel.f95: Likewise. * gfortran.dg/goacc/classify-serial.f95: Likewise. * gfortran.dg/goacc/kernels-loop-2.f95: Likewise. * gfortran.dg/goacc/kernels-loop-data-2.f95: Likewise. * gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95: Likewise. * gfortran.dg/goacc/kernels-loop-data-enter-exit.f95: Likewise. * gfortran.dg/goacc/kernels-loop-data-update.f95: Likewise. * gfortran.dg/goacc/kernels-loop-data.f95: Likewise. * gfortran.dg/goacc/kernels-loop-n.f95: Likewise. * gfortran.dg/goacc/kernels-loop.f95: Likewise. * gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95: Likewise. (cherry picked from commit d6621a2f3176dd6a593d4f5fa7f85db0234b40d2)
2022-08-30 | gfortran.dg/gomp/depend-6.f90: minor fix + dump update | Tobias Burnus | 2 | -37/+41
Contains the minor fix of upstream commit r13-2152-gf05e3b2c63f3307ba405900f1a80c25b2e87b0a3 to avoid setting the same array element twice, which is: Exactly the same as previous commit for depend-4.f90, r13-2151. Additionally, it updates the expected tree dumps due to differences between GCC 12/OG12 and mainline (GCC13); the latter seems to use a restricted pointer of type '&a[1]' while OG12 has pointerplus expressions like 'a + 4'. The changes include -m32/-m64 handling for the depobj var and in two cases, the expected count increases from 1 to 2 for code like 'D.\[0-9\]+ = daa'. The latter is likewise the same as done for depend-4.f90 on the OG12 branch in commit d77133b29fc51de4e0f37de5f6ad2b5496124346. gcc/testsuite/ * gfortran.dg/gomp/depend-6.f90: Update expected tree dumps.
2022-08-30 | Fortran/OpenMP: Fix strictly structured blocks parsing | Tobias Burnus | 2 | -1/+22
gcc/fortran/ChangeLog: * parse.cc (parse_omp_structured_block): When parsing strictly structured blocks, issue an error if the end-directive comes before the 'end block'. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/strictly-structured-block-4.f90: New test. (cherry picked from commit 33f24eb58748e9db7c827662753757c5c2217eb4)
2022-08-30 | c++: __has_builtin gives the wrong answer [PR106759] | Marek Polacek | 2 | -0/+129
We've supported __is_nothrow_constructible since r11-4386, but names_builtin_p didn't know about it, so it gave the wrong answer for #if __has_builtin(__is_nothrow_constructible) ... #endif I've tested all C++-only built-ins and only two were missing. PR c++/106759 gcc/cp/ChangeLog: * cp-objcp-common.cc (names_builtin_p): Handle RID_IS_NOTHROW_ASSIGNABLE and RID_IS_NOTHROW_CONSTRUCTIBLE. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: New test. (cherry picked from commit fe915f35b7d8dc768a2b977c09aa02f933e1d1e9)
2022-08-30 | sve: Fix fcmuo combine patterns [PR106524] | Tamar Christina | 2 | -2/+13
There's no encoding for fcmuo with zero. This restricts the combine patterns from accepting zero registers. gcc/ChangeLog: PR target/106524 * config/aarch64/aarch64-sve.md (*fcmuo<mode>_nor_combine, *fcmuo<mode>_bic_combine): Don't accept comparisons against zero. gcc/testsuite/ChangeLog: PR target/106524 * gcc.target/aarch64/sve/pr106524.c: New test. (cherry picked from commit f4ff20d464f90c85919ce2e7fa63e204dcda4e40)
2022-08-30 | Daily bump. | GCC Administrator | 5 | -1/+106
2022-08-29 | rs6000: Allow conversions of MMA pointer types [PR106017] | Peter Bergner | 2 | -22/+19
GCC incorrectly disables conversions between MMA pointer types, which are allowed with clang. The original intent was to disable conversions between MMA types and other types, but pointer conversions should have been allowed. The fix is to just remove the MMA pointer conversion handling code altogether. gcc/ PR target/106017 * config/rs6000/rs6000.cc (rs6000_invalid_conversion): Remove handling of MMA pointer conversions. gcc/testsuite/ PR target/106017 * gcc.target/powerpc/pr106017.c: New test. (cherry picked from commit 1ae1325f24cea1698b56e4299d95446a1f7b90a2)
2022-08-29 | x86: Cast stride to __PTRDIFF_TYPE__ in AMX intrinsics | H.J. Lu | 1 | -3/+3
On 64-bit Windows, long is 32 bits and can't be used as stride in memory operand when base is a pointer which is 64 bits. Cast stride to __PTRDIFF_TYPE__, instead of long. PR target/106714 * config/i386/amxtileintrin.h (_tile_loadd_internal): Cast to __PTRDIFF_TYPE__. (_tile_stream_loadd_internal): Likewise. (_tile_stored_internal): Likewise. (cherry picked from commit aeb9b58225916bc84a0cd02c6fc77bbb92167e53)
2022-08-29 | fortran: Expand ieee_arithmetic module's ieee_value inline [PR106579] | Jakub Jelinek | 1 | -0/+125
The following patch expands IEEE_VALUE function inline in the FE, but only for the powerpc64le-linux IEEE quad real(kind=16) case. 2022-08-26 Jakub Jelinek <jakub@redhat.com> PR fortran/106579 * trans-intrinsic.cc: Include realmpfr.h. (conv_intrinsic_ieee_value): New function. (gfc_conv_ieee_arithmetic_function): Handle ieee_value. (cherry picked from commit 0c2d6aa1be2ea85e751852834986ae52d58134d3)
2022-08-29 | fortran: Expand ieee_arithmetic module's ieee_class inline [PR106579] | Jakub Jelinek | 3 | -1/+116
The following patch expands IEEE_CLASS inline in the FE but only for the powerpc64le-linux IEEE quad real(kind=16), using the __builtin_fpclassify builtin and explicit check of the MSB mantissa bit in place of missing __builtin_signbit builtin. 2022-08-26 Jakub Jelinek <jakub@redhat.com> PR fortran/106579 gcc/fortran/ * f95-lang.cc (gfc_init_builtin_functions): Initialize BUILT_IN_FPCLASSIFY. * libgfortran.h (IEEE_OTHER_VALUE, IEEE_SIGNALING_NAN, IEEE_QUIET_NAN, IEEE_NEGATIVE_INF, IEEE_NEGATIVE_NORMAL, IEEE_NEGATIVE_DENORMAL, IEEE_NEGATIVE_SUBNORMAL, IEEE_NEGATIVE_ZERO, IEEE_POSITIVE_ZERO, IEEE_POSITIVE_DENORMAL, IEEE_POSITIVE_SUBNORMAL, IEEE_POSITIVE_NORMAL, IEEE_POSITIVE_INF): New enum. * trans-intrinsic.cc (conv_intrinsic_ieee_class): New function. (gfc_conv_ieee_arithmetic_function): Handle ieee_class. libgfortran/ * ieee/ieee_helper.c (IEEE_OTHER_VALUE, IEEE_SIGNALING_NAN, IEEE_QUIET_NAN, IEEE_NEGATIVE_INF, IEEE_NEGATIVE_NORMAL, IEEE_NEGATIVE_DENORMAL, IEEE_NEGATIVE_SUBNORMAL, IEEE_NEGATIVE_ZERO, IEEE_POSITIVE_ZERO, IEEE_POSITIVE_DENORMAL, IEEE_POSITIVE_SUBNORMAL, IEEE_POSITIVE_NORMAL, IEEE_POSITIVE_INF): Move to gcc/fortran/libgfortran.h. (cherry picked from commit db630423a97ec6690a8eb0e5c3cb186c91e3740d)
2022-08-29i386: Fix up mode iterators that weren't expanded [PR106721]Jakub Jelinek1-2/+2
Currently, when md file reader sees <something> and something is valid mode (or code) attribute but which doesn't include case for the current mode (or code), it just keeps the <something> untouched. I went through all cases matching <[a-zA-Z] in tmp-mddump.md after make mddump. One of the cases was related to the V*HF mode additions and there was one typo. 2022-08-24 Jakub Jelinek <jakub@redhat.com> PR target/106721 * config/i386/sse.md (i128vldq): Add V16HF entry. (avx512er_vmrcp28<mode><mask_name><round_saeonly_name>): Fix typo, mask_opernad3 -> mask_operand3. (cherry picked from commit 846e5c009e360f0c4fe58ff0d3aee03ebe3ca1a9)
2022-08-29c++: Implement P2327R1 - De-deprecating volatile compound operationsJakub Jelinek5-14/+28
From what I can see, this has been voted in as a DR and as it means we warn less often than before in -std={gnu,c}++2{0,3} modes or with -Wvolatile, I wonder if it shouldn't be backported to affected release branches as well. 2022-08-16 Jakub Jelinek <jakub@redhat.com> * typeck.cc (cp_build_modify_expr): Implement P2327R1 - De-deprecating volatile compound operations. Don't warn for |=, &= or ^= with volatile lhs. * expr.cc (mark_use) <case MODIFY_EXPR>: Adjust warning wording, leave out simple. * g++.dg/cpp2a/volatile1.C: Adjust for de-deprecation of volatile compound |=, &= and ^= operations. * g++.dg/cpp2a/volatile3.C: Likewise. * g++.dg/cpp2a/volatile5.C: Likewise. (cherry picked from commit 6e790ca4615443fa395ac5cdba1ab6c87810985c)
2022-08-29ifcvt: Fix up noce_convert_multiple_sets [PR106590]Jakub Jelinek2-4/+112
The following testcase is miscompiled on x86_64-linux. The problem is in the noce_convert_multiple_sets optimization. We essentially have: if (g == 1) { g = 1; f = 23; } else { g = 2; f = 20; } and for each insn try to create a conditional move sequence. There is code to detect overlap with the regs used in the condition and the destinations, so we actually try to construct: tmp_g = g == 1 ? 1 : 2; f = g == 1 ? 23 : 20; g = tmp_g; which is fine. But, we actually try to create two different conditional move sequences in each case, seq1 with the whole (eq (reg/v:HI 82 [ g ]) (const_int 1 [0x1])) condition and seq2 with cc_cmp (eq (reg:CCZ 17 flags) (const_int 0 [0])) to rely on the earlier present comparison. In each case, we compare the rtx costs and choose the cheaper sequence (seq1 if both have the same cost). The problem is that with the skylake tuning, tmp_g = g == 1 ? 1 : 2; is actually expanded as tmp_g = (g == 1) + 1; in seq1 (which clobbers (reg 17 flags)) and as a cmov in seq2 (which doesn't). The tuning says both have the same cost, so we pick seq1. Next we check sequences for f = g == 1 ? 23 : 20; and here the seq2 cmov is cheaper, but it uses (reg 17 flags) which has been clobbered earlier. The following patch fixes that by detecting if we in the chosen sequence clobber some register mentioned in cc_cmp or rev_cc_cmp, and if yes, arranges for only seq1 (i.e. sequences that emit the comparison itself) to be used after that. 2022-08-15 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/106590 * ifcvt.cc (check_for_cc_cmp_clobbers): New function. (noce_convert_multiple_sets_1): If SEQ sets or clobbers any regs mentioned in cc_cmp or rev_cc_cmp, don't consider seq2 for any further conditional moves. * gcc.dg/torture/pr106590.c: New test. (cherry picked from commit 3a74a7bf62f47ed0d19866576378724be932ee17)
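The source-level shape of the transformation described above can be sketched in C. This is only an illustration with hypothetical names, not code from ifcvt.cc or from the new test: it shows why g is first computed into a temporary so that the second conditional move still sees the original g.

```c
/* Hypothetical pair of values, standing in for the g and f of the
   commit message.  */
struct sketch_gf { short g; int f; };

/* Branchy form, as written in the commit message.  */
static struct sketch_gf sketch_branchy(short g)
{
  struct sketch_gf r;
  if (g == 1) { r.g = 1; r.f = 23; }
  else        { r.g = 2; r.f = 20; }
  return r;
}

/* If-converted form: both selections read the original g, and g is only
   overwritten from the temporary afterwards.  */
static struct sketch_gf sketch_if_converted(short g)
{
  short tmp_g = (g == 1) ? 1 : 2;   /* g overlaps the condition, so use a temp */
  int f = (g == 1) ? 23 : 20;       /* still tests the original g */
  struct sketch_gf r;
  g = tmp_g;
  r.g = g;
  r.f = f;
  return r;
}
```

The bug fixed by the patch sits one level lower: the temporary protects g, but nothing protected the flags register that seq2 reuses, which is what check_for_cc_cmp_clobbers now detects.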
2022-08-29Daily bump.GCC Administrator1-1/+1
2022-08-28Daily bump.GCC Administrator1-1/+1
2022-08-27Daily bump.GCC Administrator4-1/+32
2022-08-26Fortran: improve error recovery while simplifying size of bad array [PR103694]Harald Anlauf2-2/+14
gcc/fortran/ChangeLog: PR fortran/103694 * simplify.cc (simplify_size): The size expression of an array cannot be simplified if an error occurs while resolving the array spec. gcc/testsuite/ChangeLog: PR fortran/103694 * gfortran.dg/pr103694.f90: New test. (cherry picked from commit 55d8c5409325001c89c35c3d04d425dec9127146)
2022-08-26Don't gimple fold ymm-version vblendvpd/vblendvps/vpblendvb w/o TARGET_AVX2liuhongt3-5/+27
256-bit vector integer comparison is only available under TARGET_AVX2, and gimple folding for vblendvpd/vblendvps/vpblendvb relies on it, so restrict the gimple fold condition to TARGET_AVX2. gcc/ChangeLog: PR target/106704 * config/i386/i386-builtin.def (BDESC): Add CODE_FOR_avx_blendvpd256/CODE_FOR_avx_blendvps256 to corresponding builtins. * config/i386/i386.cc (ix86_gimple_fold_builtin): Don't fold IX86_BUILTIN_PBLENDVB256, IX86_BUILTIN_BLENDVPS256, IX86_BUILTIN_BLENDVPD256 w/o TARGET_AVX2. gcc/testsuite/ChangeLog: * gcc.target/i386/pr106704.c: New test.
2022-08-26Daily bump.GCC Administrator3-1/+23
2022-08-25LoongArch: Fix pr106459 by use HWIT instead of 1UL.Chenghua Xu3-9/+25
gcc/ChangeLog: PR target/106459 * config/loongarch/loongarch.cc (loongarch_build_integer): Use HOST_WIDE_INT. * config/loongarch/loongarch.h (IMM_REACH): Likewise. (HWIT_1U): New define. (LU12I_OPERAND): Use HOST_WIDE_INT. (LU32I_OPERAND): Likewise. (LU52I_OPERAND): Likewise. (HWIT_UC_0xFFF): Likewise. gcc/testsuite/ChangeLog: * gcc.target/loongarch/pr106459.c: New test. (cherry picked from commit b169b67d7dafe2b786f87c31d6b2efc603fd880c)
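The general hazard behind replacing 1UL with a host-wide integer can be sketched as follows. This assumes the problem is the usual one, that unsigned long may be only 32 bits on some hosts so shifts by 32 or more are not safe; the macro and function names are hypothetical and are not the definitions in loongarch.h.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit "host wide" unsigned one.  Unlike 1UL,
   its width does not depend on the host's long.  */
#define SKETCH_HWI_1U ((uint64_t) 1)

/* Mask of the low `bits` bits.  Valid for 0 < bits < 64 on every host,
   because the shifted operand is always 64 bits wide.  */
static inline uint64_t sketch_low_mask(unsigned bits)
{
  return (SKETCH_HWI_1U << bits) - 1;
}
```

Written with 1UL instead, the same expression would shift a 32-bit value by up to 52 positions on a host with 32-bit long, which is undefined behavior in C.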
2022-08-25Daily bump.GCC Administrator3-1/+37
2022-08-23vect: Don't allow vect_emulated_vector_p type in vectorizable_call [PR106322]Kewen Lin3-0/+109
As PR106322 shows, for a vector type whose TYPE_MODE is a scalar integral mode instead of a vector mode, querying target support with that scalar integral mode can return wrong information. For example, for the test case in PR106322 on 32-bit ppc64, the vectorizer gets vector type "vector(2) short unsigned int" for scalar type "short unsigned int", and its mode is SImode instead of V2HImode. The target support query checks the umul_highpart optab with SImode and considers it supported, so the vectorizer goes on to generate a .MULH IFN call for that vector type. It is wrong to use SImode support for a multiply highpart on that vector type. This patch teaches the vectorizable_call analysis not to allow vect_emulated_vector_p type for both vectype_in and vectype_out, as Richi suggested. PR tree-optimization/106322 gcc/ChangeLog: * tree-vect-stmts.cc (vectorizable_call): Don't allow vect_emulated_vector_p type for both vectype_in and vectype_out. gcc/testsuite/ChangeLog: * gcc.target/i386/pr106322.c: New test. * gcc.target/powerpc/pr106322.c: New test. (cherry picked from commit 5239e2bd48fb1e6a1d1b06a1bac49bee0a742e98)
2022-08-23rs6000: Adjust mov optabs for opaque modes [PR103353]Kewen.Lin2-6/+55
As PR103353 shows, we may want to continue to expand built-in function __builtin_vsx_lxvp, even if we have already emitted error messages about some missing required conditions. As shown in that PR, without one explicit mov optab on OOmode provided, it would call emit_move_insn recursively. So this patch is to allow the mov pattern to be generated during expanding phase if compiler has already seen errors. PR target/103353 gcc/ChangeLog: * config/rs6000/mma.md (define_expand movoo): Move TARGET_MMA condition check to preparation statements and add handlings for !TARGET_MMA. (define_expand movxo): Likewise. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr103353.c: New test. (cherry picked from commit 9367e3a65f874dffc8f8a3b6760e77fd9ed67117)
2022-08-24Daily bump.GCC Administrator5-1/+39
2022-08-23Update gcc .po filesJoseph Myers19-74616/+74925
* be.po, da.po, de.po, el.po, es.po, fi.po, fr.po, hr.po, id.po, ja.po, nl.po, ru.po, sr.po, sv.po, tr.po, uk.po, vi.po, zh_CN.po, zh_TW.po: Update.
2022-08-23Merge branch 'releases/gcc-12' into devel/omp/gcc-12Tobias Burnus2-1/+3
Merge up to r12-8705-g4218d3abfde1aa3dadfdacb55893f08489e8a064 (23rd Aug 2022)
2022-08-23gfortran.dg/gomp/depend-4.f90: minor fix + dump updateTobias Burnus2-37/+43
Contains the minor fix of upstream commit r13-2151-g6b2a584ed5bed1b851ee7b4668ac705f20cbb2c4 to avoid setting the same array element twice. Additionally, it updates the expected tree dumps due to differences between GCC 12/OG12 and mainline (GCC13); the latter seems to use a restricted pointer of type '&a[1]' while OG12 has pointerplus expressions like 'a + 4'. The changes include -m32/-m64 handling for the depobj var and in two cases, the expected count increases from 1 to 2 for code like 'D.\[0-9\]+ = daa'. gcc/testsuite/ * gfortran.dg/gomp/depend-4.f90: Update expected tree dumps.