Age Commit message Author Files Lines
2023-05-25[libstdc++] [testsuite] xfail to_chars/long_double on x86-vxworksAlexandre Oliva1-1/+1
Just as on aarch64, x86's wider long double experiences loss of precision with from_chars implemented in terms of double. Expect the execution to fail. for libstdc++-v3/ChangeLog * testsuite/20_util/to_chars/long_double.cc: Expect execution fail on x86-vxworks. (cherry picked from commit 7daa166fe89fca4ff1baa063c00a5a690f7e462f)
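The precision loss mentioned above can be illustrated with a minimal sketch. The helper names below are hypothetical and not part of libstdc++; the sketch only shows that a long double value which needs the full mantissa does not survive a detour through double on targets where long double is wider.

```cpp
#include <cassert>
#include <cfloat>

// Hypothetical helper (not a libstdc++ function): reports whether a long
// double value is unchanged after being narrowed to double and widened again.
inline bool survives_double_round_trip(long double x)
{
  // volatile keeps the compiler from skipping the narrowing store.
  volatile double narrowed = static_cast<double>(x);
  return static_cast<long double>(narrowed) == x;
}

// The smallest long double above 1.0; representing it needs the lowest
// mantissa bit of the long double format.
inline long double one_plus_epsilon()
{
  return 1.0L + LDBL_EPSILON;
}
```

On a target where long double and double share a format, both values round-trip; where long double is wider, `one_plus_epsilon()` should not.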
2023-05-25libstdc++: Add missing constexpr to simdMatthias Kretz6-250/+418
The constexpr API is only available with -std=gnu++XX (and proposed for C++26). The proposal is to have the complete simd API usable in constant expressions. This patch resolves several issues with using simd in constant expressions. Issues why constant_evaluated branches are necessary: * subscripting vector builtins is not allowed in constant expressions * if the implementation needs/uses memcpy * if the implementation would otherwise call SIMD intrinsics/builtins Signed-off-by: Matthias Kretz <m.kretz@gsi.de> libstdc++-v3/ChangeLog: PR libstdc++/109261 * include/experimental/bits/simd.h (_SimdWrapper::_M_set): Avoid vector builtin subscripting in constant expressions. (resizing_simd_cast): Avoid memcpy if constant_evaluated. (const_where_expression, where_expression, where) (__extract_part, simd_mask, _SimdIntOperators, simd): Add either _GLIBCXX_SIMD_CONSTEXPR (on public APIs), or constexpr (on internal APIs). * include/experimental/bits/simd_builtin.h (__vector_permute) (__vector_shuffle, __extract_part, _GnuTraits::_SimdCastType1) (_GnuTraits::_SimdCastType2, _SimdImplBuiltin) (_MaskImplBuiltin::_S_store): Add constexpr. (_CommonImplBuiltin::_S_store_bool_array) (_SimdImplBuiltin::_S_load, _SimdImplBuiltin::_S_store) (_SimdImplBuiltin::_S_reduce, _MaskImplBuiltin::_S_load): Add constant_evaluated case. * include/experimental/bits/simd_fixed_size.h (_S_masked_load): Reword comment. (__tuple_element_meta, __make_meta, _SimdTuple::_M_apply_r) (_SimdTuple::_M_subscript_read, _SimdTuple::_M_subscript_write) (__make_simd_tuple, __optimize_simd_tuple, __extract_part) (__autocvt_to_simd, _Fixed::__traits::_SimdBase) (_Fixed::__traits::_SimdCastType, _SimdImplFixedSize): Add constexpr. (_SimdTuple::operator[], _M_set): Add constexpr and add constant_evaluated case. (_MaskImplFixedSize::_S_load): Add constant_evaluated case. * include/experimental/bits/simd_scalar.h: Add constexpr. 
* include/experimental/bits/simd_x86.h (_CommonImplX86): Add constexpr and add constant_evaluated case. (_SimdImplX86::_S_equal_to, _S_not_equal_to, _S_less) (_S_less_equal): Value-initialize to satisfy constexpr evaluation. (_MaskImplX86::_S_load): Add constant_evaluated case. (_MaskImplX86::_S_store): Add constexpr and constant_evaluated case. Value-initialize local variables. (_MaskImplX86::_S_logical_and, _S_logical_or, _S_bit_not) (_S_bit_and, _S_bit_or, _S_bit_xor): Add constant_evaluated case. * testsuite/experimental/simd/pr109261_constexpr_simd.cc: New test. (cherry picked from commit da579188807ede4ee9466d0b5bf51559c96a0b51)
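The constant_evaluated branches described in this commit follow a common pattern, sketched below under stated assumptions: `Vec4` and `load4` are hypothetical names (the real `_SimdWrapper` and `_S_load` are considerably more involved), and `__builtin_is_constant_evaluated` is assumed to be available, as it is in recent GCC and Clang. C++14 or later is assumed for the loop in a constexpr function.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical four-lane wrapper used only for this illustration.
struct Vec4
{
  int lane[4];
};

// Copies four ints out of memory. memcpy is not usable during constant
// evaluation, so that path assigns element by element instead.
constexpr Vec4 load4(const int* src)
{
  Vec4 v{};  // value-initialize: constant evaluation rejects uninitialized reads
  if (__builtin_is_constant_evaluated())
    {
      for (int i = 0; i < 4; ++i)
        v.lane[i] = src[i];
    }
  else
    std::memcpy(v.lane, src, sizeof v.lane);
  return v;
}

constexpr int table[4] = {1, 2, 3, 4};
static_assert(load4(table).lane[2] == 3, "compile-time path is usable");
```

The same function serves both contexts: the static_assert exercises the element-wise branch, while a call with run-time data takes the memcpy branch.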
2023-05-25libstdc++: Fix type of first argument to vec_cntm callMatthias Kretz2-6/+36
Signed-off-by: Matthias Kretz <m.kretz@gsi.de> libstdc++-v3/ChangeLog: PR libstdc++/109949 * include/experimental/bits/simd.h (__intrinsic_type): If __ALTIVEC__ is defined, map gnu::vector_size types to their corresponding __vector T types without losing unsignedness of integer types. Also prefer long long over long. * include/experimental/bits/simd_ppc.h (_S_popcount): Cast mask object to the expected unsigned vector type. (cherry picked from commit efd2b55d8562c6e80cb7ee8b9b1f9418f0c00cd9)
2023-05-25libstdc++: Add missing constexpr to simd_neonMatthias Kretz1-40/+36
Signed-off-by: Matthias Kretz <m.kretz@gsi.de> libstdc++-v3/ChangeLog: PR libstdc++/109261 * include/experimental/bits/simd_neon.h (_S_reduce): Add constexpr and make NEON implementation conditional on not __builtin_is_constant_evaluated. (cherry picked from commit b0a483b0a011f9cbc8b25053eae809c77dae2a12)
2023-05-25libstdc++: Fix SFINAE for __is_intrinsic_type on ARMMatthias Kretz1-3/+9
On ARM, NEON doesn't support double, so __is_intrinsic_type_v<double, whatever> should say false (instead of being ill-formed). Signed-off-by: Matthias Kretz <m.kretz@gsi.de> libstdc++-v3/ChangeLog: PR libstdc++/109261 * include/experimental/bits/simd.h (__intrinsic_type): Specialize __intrinsic_type<double, 8> and __intrinsic_type<double, 16> in any case, but provide the member type only with __aarch64__. (cherry picked from commit aa8b363171a95b8f867a74f29c75f9577e9087e1)
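The idea of always providing the specialization but only conditionally providing the member type can be sketched as follows. All names are hypothetical stand-ins (the `Supported` flag plays the role of the `__aarch64__` check); this is not the libstdc++ implementation.

```cpp
#include <cassert>
#include <type_traits>

// Portable void_t helper so the sketch does not depend on C++17.
template <typename...>
struct make_void_sketch { using type = void; };
template <typename... Ts>
using void_t_sketch = typename make_void_sketch<Ts...>::type;

// Hypothetical stand-in for __intrinsic_type: the template is always
// instantiable, but `type` exists only when the target supports T.
template <typename T, bool Supported>
struct intrinsic_type_sketch {};

template <typename T>
struct intrinsic_type_sketch<T, true> { using type = T; };

// Detection trait: yields false when `type` is missing instead of making
// the program ill-formed.
template <typename T, bool Supported, typename = void>
struct is_intrinsic_sketch : std::false_type {};

template <typename T, bool Supported>
struct is_intrinsic_sketch<
    T, Supported,
    void_t_sketch<typename intrinsic_type_sketch<T, Supported>::type>>
  : std::true_type {};

static_assert(is_intrinsic_sketch<float, true>::value, "member type present");
static_assert(!is_intrinsic_sketch<double, false>::value,
              "member type absent, query still well-formed");
```

Because the missing member is only named in a substitution context, the query degrades to `false` rather than a hard error.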
2023-05-25libstdc++: Resolve -Wunused-variable warnings in stdx::simd and testsMatthias Kretz6-5/+17
Signed-off-by: Matthias Kretz <m.kretz@gsi.de> libstdc++-v3/ChangeLog: * include/experimental/bits/simd_builtin.h (_S_fpclassify): Move __infn into #ifdef'ed block. * testsuite/experimental/simd/tests/fpclassify.cc: Declare constants only when used. * testsuite/experimental/simd/tests/frexp.cc: Likewise. * testsuite/experimental/simd/tests/logarithm.cc: Likewise. * testsuite/experimental/simd/tests/trunc_ceil_floor.cc: Likewise. * testsuite/experimental/simd/tests/ldexp_scalbn_scalbln_modf.cc: Move totest and expect1 into #ifdef'ed block. (cherry picked from commit a7129e82bed1bd4f513fc3c3f401721e2c96a865)
2023-05-25Daily bump.GCC Administrator1-1/+1
2023-05-24Use OpenACC code to process OpenMP target regionsChung-Lin Tang29-58/+1079
(forward ported from devel/omp/gcc-12) This is a backport of: https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619003.html This patch implements '-fopenmp-target=acc', which enables internally handling a subset of OpenMP target regions as OpenACC parallel regions. This basically includes target, teams, parallel, distribute, for/do constructs, and atomics. Essentially, we adjust the internal kinds to OpenACC type, and let OpenACC code paths handle them, with various needed adjustments throughout middle-end and nvptx backend. When using this "OMPACC" mode, if there are cases the patch doesn't handle, it issues a warning, and reverts to normal processing for that target region. gcc/ChangeLog: * builtins.cc (expand_builtin_omp_builtins): New function. (expand_builtin): Add expand cases for BUILT_IN_GOMP_BARRIER, BUILT_IN_OMP_GET_THREAD_NUM, BUILT_IN_OMP_GET_NUM_THREADS, BUILT_IN_OMP_GET_TEAM_NUM, and BUILT_IN_OMP_GET_NUM_TEAMS using expand_builtin_omp_builtins, enabled under -fopenmp-target=acc. * cgraphunit.cc (analyze_functions): Add call to omp_ompacc_attribute_tagging, enabled under -fopenmp-target=acc. * common.opt (fopenmp-target=): Add new option and enums. * config/nvptx/mkoffload.cc (main): Handle -fopenmp-target=. * config/nvptx/nvptx-protos.h (nvptx_expand_omp_get_num_threads): New prototype. (nvptx_mem_shared_p): Likewise. * config/nvptx/nvptx.cc (omp_num_threads_sym): New global static RTX symbol for number of threads in team. (omp_num_threads_align): New var for alignment of omp_num_threads_sym. (need_omp_num_threads): New bool for if any function references omp_num_threads_sym. (nvptx_option_override): Initialize omp_num_threads_sym/align. (write_as_kernel): Disable normal OpenMP kernel entry under OMPACC mode. (nvptx_declare_function_name): Disable shim function under OMPACC mode. Disable soft-stack under OMPACC mode. Add generation of neutering init code under OMPACC mode. (nvptx_output_set_softstack): Return "" under OMPACC mode. 
(nvptx_expand_call): Set parallelism to vector for function calls with "ompacc for" attached. (nvptx_expand_oacc_fork): Set mode to GOMP_DIM_VECTOR under OMPACC mode. (nvptx_expand_oacc_join): Likewise. (nvptx_expand_omp_get_num_threads): New function. (nvptx_mem_shared_p): New function. (nvptx_mach_max_workers): Return 1 under OMPACC mode. (nvptx_mach_vector_length): Return 32 under OMPACC mode. (nvptx_single): Add adjustments for OMPACC mode, which have parallel-construct fork/joins, and regions of code where neutering is dynamically determined. (nvptx_reorg): Enable neutering under OMPACC mode when "ompacc for" attribute is attached to function. Disable uniform-simt when under OMPACC mode. (nvptx_file_end): Write __nvptx_omp_num_threads out when needed. (nvptx_goacc_fork_join): Return true under OMPACC mode. * config/nvptx/nvptx.h (struct GTY(()) machine_function): Add omp_parallel_predicate and omp_fn_entry_num_threads_reg fields. * config/nvptx/nvptx.md (unspecv): Add UNSPECV_GET_TID, UNSPECV_GET_NTID, UNSPECV_GET_CTAID, UNSPECV_GET_NCTAID, UNSPECV_OMP_PARALLEL_FORK, UNSPECV_OMP_PARALLEL_JOIN entries. (nvptx_shared_mem_operand): New predicate. (gomp_barrier): New expand pattern. (omp_get_num_threads): New expand pattern. (omp_get_num_teams): New insn pattern. (omp_get_thread_num): Likewise. (omp_get_team_num): Likewise. (get_ntid): Likewise. (nvptx_omp_parallel_fork): Likewise. (nvptx_omp_parallel_join): Likewise. * flag-types.h (omp_target_mode_kind): New flag value enum. * gimplify.cc (struct gimplify_omp_ctx): Add 'bool ompacc' field. (gimplify_scan_omp_clauses): Handle OMP_CLAUSE__OMPACC_. (gimplify_adjust_omp_clauses): Likewise. (gimplify_omp_ctx_ompacc_p): New function. (gimplify_omp_for): Handle combined loops under OMPACC. * lto-wrapper.cc (append_compiler_options): Add OPT_fopenmp_target_. * omp-builtins.def (BUILT_IN_OMP_GET_THREAD_NUM): Remove CONST. (BUILT_IN_OMP_GET_NUM_THREADS): Likewise. 
* omp-expand.cc (remove_exit_barrier): Disable addressable-var processing for parallel construct child functions under OMPACC mode. (expand_oacc_for): Add OMPACC mode handling. (get_target_arguments): Force thread_limit clause value to 1 under OMPACC mode. (expand_omp): Under OMPACC mode, avoid child function expanding of GIMPLE_OMP_PARALLEL. * omp-general.cc (omp_extract_for_data): Adjustments for OMPACC mode. * omp-low.cc (struct omp_context): Add 'bool ompacc_p' field. (scan_sharing_clauses): Handle OMP_CLAUSE__OMPACC_. (ompacc_ctx_p): New function. (scan_omp_parallel): Handle OMPACC mode, avoid creating child function. (scan_omp_target): Tag "ompacc"/"ompacc for" attributes for target construct child function, remove OMP_CLAUSE__OMPACC_ clauses. (lower_oacc_head_mark): Handle OMPACC mode cases. (lower_omp_for): Adjust OMP_FOR kind from OpenMP to OpenACC kinds, add vector/gang clauses as needed. Add other OMPACC handling. (lower_omp_taskreg): Add call to lower_oacc_head_tail for OMPACC case. (lower_omp_target): Do OpenACC gang privatization under OMPACC case. (lower_omp_teams): Forward OpenACC privatization variables to outer target region under OMPACC mode. (lower_omp_1): Do OpenACC gang privatization under OMPACC case for GIMPLE_BIND. * omp-offload.cc (ompacc_supported_clauses_p): New function. (struct target_region_data): New struct type for tree walk. (scan_fndecl_for_ompacc): New function. (scan_omp_target_region_r): New function. (scan_omp_target_construct_r): New function. (omp_ompacc_attribute_tagging): New function. (oacc_dim_call): Add OMPACC case handling. (execute_oacc_device_lower): Make parts explicitly only OpenACC enabled. (pass_oacc_device_lower::gate): Enable pass under OMPACC mode. * omp-offload.h (omp_ompacc_attribute_tagging): New prototype. * opts.cc (finish_options): Only allow -fopenmp-target= when -fopenmp and no -fopenacc. * target-insns.def (gomp_barrier): New defined insn pattern. (omp_get_thread_num): Likewise. 
(omp_get_num_threads): Likewise. (omp_get_team_num): Likewise. (omp_get_num_teams): Likewise. * tree-core.h (enum omp_clause_code): Add new OMP_CLAUSE__OMPACC_ entry for internal clause. * tree-nested.cc (convert_nonlocal_omp_clauses): Handle OMP_CLAUSE__OMPACC_. * tree-pretty-print.cc (dump_omp_clause): Handle OMP_CLAUSE__OMPACC_. * tree.cc (omp_clause_num_ops): Add OMP_CLAUSE__OMPACC_ entry. (omp_clause_code_name): Likewise. * tree.h (OMP_CLAUSE__OMPACC__FOR): New macro for OMP_CLAUSE__OMPACC_. * tree-ssa-loop.cc (pass_oacc_only::gate): Enable pass under OMPACC mode cases. libgomp/ChangeLog: * config/nvptx/team.c (__nvptx_omp_num_threads): New global variable in shared memory. (cherry picked from commit 5f881613fa9128edae5bbfa4e19f9752809e4bd7)
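For orientation, the sketch below shows the kind of combined construct the message lists (target, teams, distribute, parallel, for). The function names are hypothetical, and whether a given region is actually handled in OMPACC mode is decided by the compiler; without OpenMP enabled the pragma is ignored and the loop runs on the host.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example kernel: y = a * x + y over n elements.
inline void saxpy(float a, const float* x, float* y, int n)
{
#pragma omp target teams distribute parallel for map(to: x[0:n]) map(tofrom: y[0:n])
  for (int i = 0; i < n; ++i)
    y[i] = a * x[i] + y[i];
}

// Small driver returning one result element so the effect can be checked.
inline float saxpy_demo()
{
  std::vector<float> x(8, 1.0f), y(8, 2.0f);
  saxpy(3.0f, x.data(), y.data(), 8);
  return y[7];
}
```

Per the message above, such code would be built with -fopenmp together with -fopenmp-target=acc, and the compiler warns and reverts to normal processing for regions it cannot handle this way.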
2023-05-24Daily bump.GCC Administrator3-1/+18
2023-05-23Improve cost computation for single-bit bit insertions.Georg-Johann Lay1-0/+48
Some miscomputation of rtx_costs led to sub-optimal code for single-bit bit insertions. This patch implements TARGET_INSN_COST, which has a chance to see the whole insn during insn combination; in particular the SET_DEST of (set (zero_extract (...) ...)). gcc/ * config/avr/avr.cc (avr_insn_cost): New static function. (TARGET_INSN_COST): Define to that function.
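A single-bit bit insertion at the source level looks roughly like the sketch below. The helper name is hypothetical, and how the compiler represents the operation internally (for example as a zero_extract destination during insn combination) depends on the target and optimization level.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: copy bit 0 of `src` into bit `pos` of `dst`,
// leaving all other bits of `dst` unchanged. Assumes pos < 8.
inline std::uint8_t insert_bit(std::uint8_t dst, std::uint8_t src, unsigned pos)
{
  const std::uint8_t mask = static_cast<std::uint8_t>(1u << pos);
  return static_cast<std::uint8_t>((dst & ~mask) | ((src & 1u) << pos));
}
```

For example, `insert_bit(0x00, 1, 3)` yields `0x08` and `insert_bit(0xFF, 0, 0)` yields `0xFE`.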
2023-05-23Fix handling of non-integral bit-fields in native_encode_initializerEric Botcazou3-10/+66
The encoder for CONSTRUCTORs assumes that all bit-fields (DECL_BIT_FIELD) have integral types, but that's not the case in Ada where they may have pretty much any type, resulting in a wrong encoding for them. gcc/ * fold-const.cc (native_encode_initializer) <CONSTRUCTOR>: Apply the specific treatment for bit-fields only if they have an integral type and filter out non-integral bit-fields that do not start and end on a byte boundary. gcc/testsuite/ * gnat.dg/opt101.adb: New test. * gnat.dg/opt101_pkg.ads: New helper.
2023-05-23Daily bump.GCC Administrator3-1/+29
2023-05-22Merge branch 'releases/gcc-13' into devel/omp/gcc-13Kwok Cheung Yeung1785-49300/+91275
2023-05-22match.pd: Ensure (op CONSTANT_CLASS_P CONSTANT_CLASS_P) is simplified [PR109505]Jakub Jelinek2-10/+22
On the following testcase we hang, because POLY_INT_CST is CONSTANT_CLASS_P, but BIT_AND_EXPR with it and INTEGER_CST doesn't simplify and the (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) simplification actually relies on the (CST1 & CST2) simplification, otherwise it is a deoptimization, trading 2 ops for 3 and furthermore running into /* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both operands are another bit-wise operation with a common input. If so, distribute the bit operations to save an operation and possibly two if constants are involved. For example, convert (A | B) & (A | C) into A | (B & C) Further simplification will occur if B and C are constants. */ simplification which simplifies that (x & CST2) | (CST1 & CST2) back to CST2 & (x | CST1). I went through all other places I could find where we have a simplification with 2 CONSTANT_CLASS_P operands and perform some operation on those two, while the other spots aren't that severe (just trade 2 operations for another 2 if the two constants don't simplify, rather than as in the above case trading 2 ops for 3), I still think all those spots really intend to optimize only if the 2 constants simplify. So, the following patch adds to those a ! modifier to ensure that, even at GENERIC that modifier means !EXPR_P which is exactly what we want IMHO. 2023-05-21 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/109505 * match.pd ((x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2), Combine successive equal operations with constants, (A +- CST1) +- CST2 -> A + CST3, (CST1 - A) +- CST2 -> CST3 - A, CST1 - (CST2 - A) -> CST3 + A): Use ! on ops with 2 CONSTANT_CLASS_P operands. * gcc.target/aarch64/sve/pr109505.c: New test. (cherry picked from commit f211757f6fa9515e3fd1a4f66f1a8b48e500c9de)
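The algebra behind the first simplification can be checked with a small sketch; the function names are hypothetical. The rewritten form has three operations unless (CST1 & CST2) folds to a single constant, which is why the patch requires that fold to succeed.

```cpp
#include <cassert>
#include <cstdint>

// Original form: two operations.
constexpr std::uint32_t before(std::uint32_t x, std::uint32_t c1, std::uint32_t c2)
{
  return (x | c1) & c2;
}

// Distributed form: three operations, or two once (c1 & c2) is folded
// into a single constant.
constexpr std::uint32_t after(std::uint32_t x, std::uint32_t c1, std::uint32_t c2)
{
  return (x & c2) | (c1 & c2);
}

static_assert(before(0x1234u, 0x00F0u, 0x0FF0u) == after(0x1234u, 0x00F0u, 0x0FF0u),
              "both forms compute the same value");
```

With a constant such as POLY_INT_CST that does not fold, the distributed form only adds work, and the reverse simplification quoted above can then turn it back, which matches the hang the message describes.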
2023-05-21vect: Don't retry if the previous analysis failsKewen Lin1-1/+1
While working on a cost-tweaking patch, I found that a newly added test case produces different dumps with stage-1 and bootstrapped GCC. The apparent reason is that vect_analyze_loop_2 does not set slp_done_for_suggested_uf as expected, so the subsequent retry uses a garbage value of slp_done_for_suggested_uf. In fact, slp_done_for_suggested_uf is only set when the previous analysis succeeds; for the test case in question the previous analysis fails, so the value of slp_done_for_suggested_uf should not be used at all. In function vect_analyze_loop_1, we only return success when res is true, and res is the result of the first analysis. This means we never try to vectorize with unroll_vinfo if the previous analysis fails. So this patch should not break anything; it just stops some useless analysis early. gcc/ChangeLog: * tree-vect-loop.cc (vect_analyze_loop_1): Don't retry analysis with suggested unroll factor once the previous analysis fails. (cherry picked from commit a04bf39f61ce7814d197d712760f08c206daf4f1)
2023-05-22Daily bump.GCC Administrator2-1/+17
2023-05-21Darwin, libgcc : Adjust min version supported for the OS.Iain Sandoe6-8/+63
Tools from later versions of the OS deprecate or fail to support earlier OS revisions. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> libgcc/ChangeLog: * config.host: Arrange to set min Darwin OS versions from the configured host version. * config/darwin10-unwind-find-enc-func.c: Do not use current headers, but declare the necessary structures locally to the versions in use for Mac OSX 10.6. * config/t-darwin: Amend to handle configured min OS versions. * config/t-darwin-min-1: New. * config/t-darwin-min-5: New. * config/t-darwin-min-8: New. (cherry picked from commit 20b8779ea9bd82b26eeb195b30f695168cd7ae1d)
2023-05-21Daily bump.GCC Administrator3-1/+16
2023-05-20target/105753: Fix ICE in add_clobbers due to extra PARALLEL in insn.Triffid Hunter2-62/+77
This patch removes the superfluous parallel in [u]divmod patterns in the AVR backend. The effect of the extra parallel is that add_clobbers reaches gcc_unreachable() because the clobbers for [u]divmod are missing. If an insn has multiple parts like clobbers, the parallel around the parts of the insn pattern is implicit. gcc/ PR target/105753 Backport from 2023-05-20 https://gcc.gnu.org/r14-1016 * config/avr/avr.md (divmodpsi, udivmodpsi, divmodsi, udivmodsi): Remove superfluous "parallel" in insn pattern. ([u]divmod<mode>4): Tidy code. Use gcc_unreachable() instead of printing error text to assembly. gcc/testsuite/ PR target/105753 Backport from 2023-05-20 https://gcc.gnu.org/r14-1016 * gcc.target/avr/torture/pr105753.c: New test.
2023-05-20Daily bump.GCC Administrator5-1/+38
2023-05-19riscv/linux: Don't add -latomic with -pthreadAndreas Schwab1-10/+0
Now that we have support for inline subword atomic operations, it is no longer necessary to link against libatomic. This also fixes testsuite failures because the framework does not properly set up the linker flags for finding libatomic. The use of atomic operations is also independent of the use of libpthread. gcc/ * config/riscv/linux.h (LIB_SPEC): Don't redefine.
2023-05-19Implement LDPT_REGISTER_CLAIM_FILE_HOOK_V2 linker plugin hook [PR109128]Joseph Myers2-3/+44
This is one part of the fix for PR109128, along with a corresponding binutils's linker change. Without this patch, what happens in the linker, when an unused object in a .a file has offload data, is that elf_link_is_defined_archive_symbol calls bfd_link_plugin_object_p, which ends up calling the plugin's claim_file_handler, which then records the object as one with offload data. That is, the linker never decides to use the object in the first place, but use of this _p interface (called as part of trying to decide whether to use the object) results in the plugin deciding to use its offload data (and a consequent mismatch in the offload data present at runtime). The new hook allows the linker plugin to distinguish calls to claim_file_handler that know the object is being used by the linker (from ldmain.c:add_archive_element), from calls that don't know it's being used by the linker (from elf_link_is_defined_archive_symbol); in the latter case, the plugin should avoid recording the object as one with offload data. PR middle-end/109128 include/ * plugin-api.h (ld_plugin_claim_file_handler_v2) (ld_plugin_register_claim_file_v2) (LDPT_REGISTER_CLAIM_FILE_HOOK_V2): New. (struct ld_plugin_tv): Add tv_register_claim_file_v2. lto-plugin/ * lto-plugin.c (register_claim_file_v2): New. (claim_file_handler_v2): New. (claim_file_handler): Wrap claim_file_handler_v2. (onload): Handle LDPT_REGISTER_CLAIM_FILE_HOOK_V2. (cherry picked from commit c49d51fa8134f6c7e6c7cf6e4e3007c4fea617c5)
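The distinction the new hook introduces can be sketched as follows. Every type and function name here is a hypothetical stand-in (the real declarations live in include/plugin-api.h and lto-plugin.c), and the sketch assumes the older entry point simply treats the object as used.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the linker's description of an input object.
struct input_file_sketch
{
  std::string name;
  bool has_offload_data;
};

// Hypothetical plugin state: objects whose offload data will be used.
struct plugin_state_sketch
{
  std::vector<std::string> offload_objects;
};

// v2-style handler: `known_used` is nonzero only when the linker has already
// decided to pull the object in. A mere probe must not record offload data.
inline void claim_file_v2_sketch(plugin_state_sketch& state,
                                 const input_file_sketch& file,
                                 int* claimed, int known_used)
{
  *claimed = 1;  // simplification: this sketch claims every file it is offered
  if (file.has_offload_data && known_used)
    state.offload_objects.push_back(file.name);
}

// v1-style entry point as a thin wrapper (assumption: object counts as used).
inline void claim_file_v1_sketch(plugin_state_sketch& state,
                                 const input_file_sketch& file, int* claimed)
{
  claim_file_v2_sketch(state, file, claimed, /*known_used=*/1);
}
```

A probe with `known_used` set to 0 leaves `offload_objects` untouched, which is the behaviour the message asks for when the call comes from the archive-symbol check.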
2023-05-19OpenMP: Constructors and destructors for "declare target" static aggregatesJulian Brown9-41/+336
This patch adds support for running constructors and destructors for static (file-scope) aggregates for C++ objects which are marked with "declare target" directives on OpenMP offload targets. At present, space is allocated on the target for such aggregates, but nothing ever constructs them properly, so they end up zero-initialised. The approach taken is to generate a set of constructors to run on the target: this currently works for AMD GCN, but fails on NVPTX due to lack of constructor/destructor support there so far on mainline. (See the new test static-aggr-constructor-destructor-3.C for a reason why running constructors on the target is preferable to e.g. constructing on the host and then copying the resulting object to the target.) 2023-05-12 Julian Brown <julian@codesourcery.com> gcc/cp/ * decl2.cc (tree-inline.h): Include. (static_init_fini_fns): Bump to four entries. Update comment. (start_objects, start_partial_init_fini_fn): Add 'omp_target' parameter. Support "declare target" decls. Update forward declaration. (emit_partial_init_fini_fn): Add 'host_fn' parameter. Return tree for the created function. Support "declare target". (OMP_SSDF_IDENTIFIER): New macro. (partition_vars_for_init_fini): Support partitioning "declare target" variables also. (generate_ctor_or_dtor_function): Add 'omp_target' parameter. Support "declare target" decls. (c_parse_final_cleanups): Support constructors/destructors on OpenMP offload targets. gcc/ * omp-builtins.def (BUILT_IN_OMP_IS_INITIAL_DEVICE): New builtin. * tree.cc (get_file_function_name): Support names for on-target constructor/destructor functions. libgomp/ * testsuite/libgomp.c++/static-aggr-constructor-destructor-1.C: New test. * testsuite/libgomp.c++/static-aggr-constructor-destructor-2.C: New test. * testsuite/libgomp.c++/static-aggr-constructor-destructor-3.C: New test.
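A minimal sketch of the situation this patch addresses is shown below; `Counter`, `counter`, and `read_counter` are hypothetical names. Without OpenMP enabled the pragmas are ignored and everything runs on the host.

```cpp
#include <cassert>

#pragma omp declare target
struct Counter
{
  int value;
  // Must also run for the device copy; otherwise `value` stays
  // zero-initialised there.
  Counter() : value(42) {}
};

// File-scope object with a non-trivial constructor, mapped to the device.
static Counter counter;
#pragma omp end declare target

// Reads the object's state from inside a target region.
inline int read_counter()
{
  int result = -1;
#pragma omp target map(from: result)
  result = counter.value;
  return result;
}
```

On the host this returns 42. On an offload target, the same answer depends on the constructor having been run for the device copy, which is what the patch arranges for targets with constructor support.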
2023-05-19c++: desig init in presence of list ctor [PR109871]Patrick Palka2-8/+24
add_list_candidates has logic to reject designated initialization of a non-aggregate type, but this is inadvertently being suppressed if the type has a list constructor due to the order of case analysis, which in the below testcase leads to us incorrectly treating the initializer list as if it's non-designated. This patch fixes this by making us check for invalid designated initialization sooner. PR c++/109871 gcc/cp/ChangeLog: * call.cc (add_list_candidates): Check for invalid designated initialization sooner and even for types that have a list constructor. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/desig27.C: New test. (cherry picked from commit d5e5007c4b534391c0a97be56f6024fde1a88682)
2023-05-19c++: add feature-test macro for auto(x)Patrick Palka2-0/+7
This adds the feature-test macro for P0849R8, as per https://github.com/cplusplus/CWG/issues/281. gcc/c-family/ChangeLog: * c-cppbuiltin.cc (c_cpp_builtins): Predefine __cpp_auto_cast for C++23. gcc/testsuite/ChangeLog: * g++.dg/cpp23/feat-cxx2b.C: Test __cpp_auto_cast. (cherry picked from commit 32b81d897629b6c3bd9e2780831a1c45b38b5ac3)
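User code can key on the macro along these lines; `copy_of` is a hypothetical helper, and the fallback branch is taken whenever the compiler or language mode does not define __cpp_auto_cast.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns a prvalue copy of its argument.
inline std::string copy_of(const std::string& s)
{
#if defined(__cpp_auto_cast)
  return auto(s);         // C++23 decay-copy syntax
#else
  return std::string(s);  // equivalent spelling for older language modes
#endif
}
```

Both branches produce the same result, so the macro only selects the spelling.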
2023-05-19openmp: Fix initialization for 'unroll full'Frederik Harwath18-8/+79
The index variable initialization for the 'omp unroll' directive with 'full' clause got lost and the testsuite did not catch it. Add the initialization and add -Wall to some tests to detect uninitialized variable uses and other potential problems in the code generation. gcc/ChangeLog: * omp-transform-loops.cc (full_unroll): Add initialization of index variable. libgomp/ChangeLog: * testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-unroll-full-1.c: Use -Wall and add -Wno-unknown-pragmas to disable warnings about empty pragmas. Use -O2. * testsuite/libgomp.c++/loop-transforms/matrix-no-directive-unroll-full-1.C: Copy of testsuite/libgomp.c-c++-common/matrix-no-directive-unroll-full-1.c, but using -O0 which works only for C++. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-1.c: Use -Wall and use -Wno-unknown-pragmas to disable warnings about empty pragmas. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-distribute-parallel-for-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-for-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-for-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-simd-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-parallel-for-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-teams-distribute-parallel-for-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-taskloop-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-teams-distribute-parallel-for-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-simd-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/unroll-non-rect-1.c: Likewise. 
* testsuite/libgomp.c-c++-common/loop-transforms/unroll-1.c: Likewise and fix broken function calls found by -Wall.
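The directive affected by this fix is used as in the sketch below; `sum_first_four` is a hypothetical function. Full unrolling needs a constant trip count, and compilers without support for the directive ignore the pragma and run the plain loop.

```cpp
#include <cassert>

// Hypothetical example: sums four elements with a fully unrolled loop.
inline int sum_first_four(const int* a)
{
  int sum = 0;
#pragma omp unroll full
  for (int i = 0; i < 4; ++i)  // `i` must still start at 0 after unrolling
    sum += a[i];
  return sum;
}
```

If the index variable's initialization were dropped in the generated code, the result would depend on an uninitialized value, which is the class of bug the added -Wall runs are meant to expose.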
2023-05-19Daily bump.GCC Administrator4-1/+1778
2023-05-18Fortran: CLASS pointer function result in variable definition context [PR109846]Harald Anlauf2-1/+40
gcc/fortran/ChangeLog: PR fortran/109846 * expr.cc (gfc_check_vardef_context): Check appropriate pointer attribute for CLASS vs. non-CLASS function result in variable definition context. gcc/testsuite/ChangeLog: PR fortran/109846 * gfortran.dg/ptr-func-5.f90: New test. (cherry picked from commit fa0569e90efe8a5cb895a3f50dd502f849940828)
2023-05-18Fortran: overloading of intrinsic binary operators [PR109641]Harald Anlauf4-0/+162
Fortran allows overloading of intrinsic operators also for operands of numeric intrinsic types. The intrinsic operator versions are used according to the rules of F2018 table 10.2 and imply type conversion as long as the operand ranks are conformable. Otherwise no type conversion shall be performed to allow the resolution of a matching user-defined operator. gcc/fortran/ChangeLog: PR fortran/109641 * arith.cc (eval_intrinsic): Check conformability of ranks of operands for intrinsic binary operators before performing type conversions. * gfortran.h (gfc_op_rank_conformable): Add prototype. * resolve.cc (resolve_operator): Check conformability of ranks of operands for intrinsic binary operators before performing type conversions. (gfc_op_rank_conformable): New helper function to compare ranks of operands of binary operator. gcc/testsuite/ChangeLog: PR fortran/109641 * gfortran.dg/overload_5.f90: New test. (cherry picked from commit 185da7c2014ba41f38dd62cc719873ebf020b076)
2023-05-18Fix test failures in gfortran.dg/gomp/target-exit-data.f90Kwok Cheung Yeung2-2/+6
The order of the map clauses in the Gimple representation has changed. 2023-05-10 Kwok Cheung Yeung <kcy@codesourcery.com> * gfortran.dg/gomp/target-exit-data.f90: Update expected outputs.
2023-05-18Fix expected output in map-10a.f90Kwok Cheung Yeung2-10/+14
2023-05-03 Kwok Cheung Yeung <kcy@codesourcery.com> * gfortran.dg/gomp/map-10a.f90: Update expected outputs.
2023-05-18Fix ICE in gfortran.dg/goacc/omp_data_optimize-1.f90Kwok Cheung Yeung2-2/+9
NULLs cannot be added to a hash_set in GCC 13, so test for NULL before adding. Should probably be a fixup to 'Kernels loops annotation: Fortran'. 2023-04-27 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/fortran/ * openmp.cc (compute_goto_targets): Check label before adding to goto_targets.
2023-05-18Fix ICE in libgomp.oacc-c-c++-common/noncontig_array-* testsKwok Cheung Yeung2-0/+13
The GOMP_MAP_NONCONTIG_ARRAY_* map types need to be handled in omp_group_base (which was added in GCC 13). This should probably be a fixup to 'Merge non-contiguous array support patches'. 2023-04-18 Kwok Cheung Yeung <kcy@codesourcery.com> * gimplify.cc (omp_group_base): Handle GOMP_MAP_NONCONTIG_ARRAY_* map types.
2023-05-18Fix ICE in libgomp.oacc-fortran/declare-allocatable*.f90 testsKwok Cheung Yeung2-0/+7
GOMP_MAP_DECLARE_ALLOCATE and GOMP_MAP_DECLARE_DEALLOCATE need to be handled in omp_group_base (which was added in GCC 13). This should probably be a fixup to 'Fortran "declare create"/allocate support for OpenACC'. 2023-04-17 Kwok Cheung Yeung <kcy@codesourcery.com> * gimplify.cc (omp_group_base): Handle GOMP_MAP_DECLARE_ALLOCATE and GOMP_MAP_DECLARE_DEALLOCATE.
2023-05-18openmp: Fix loop transformation testsFrederik Harwath4-2/+12
libgomp/ChangeLog: * testsuite/libgomp.fortran/loop-transforms/tile-2.f90: Add reduction clause. * testsuite/libgomp.fortran/loop-transforms/unroll-1.f90: Initialize var. * testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90: Add reduction and initialization.
2023-05-18amdgcn, openmp: Fix concurrency in low-latency allocatorAndrew Stubbs2-0/+7
The previous code works fine on Fiji and Vega 10 devices, but bogs down in the spin locks on Vega 20 or newer. Adding the sleep instructions fixes the problem. libgomp/ChangeLog: * basic-allocator.c (basic_alloc_free): Use BASIC_ALLOC_YIELD. (basic_alloc_realloc): Use BASIC_ALLOC_YIELD.
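The general technique of backing off inside a spin loop can be sketched on the host as follows. This is only an analogue: `yielding_spin_lock` is a hypothetical class, and the libgomp allocator runs on the GPU and uses its own BASIC_ALLOC_YIELD macro rather than `std::this_thread::yield`.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Host-side analogue of a spin lock that yields while waiting.
class yielding_spin_lock
{
  std::atomic<bool> locked_{false};

public:
  // Returns true if the lock was free and is now held by the caller.
  bool try_lock()
  {
    return !locked_.exchange(true, std::memory_order_acquire);
  }

  void lock()
  {
    while (!try_lock())
      std::this_thread::yield();  // back off instead of hammering the lock word
  }

  void unlock()
  {
    locked_.store(false, std::memory_order_release);
  }
};
```

Yielding between attempts gives the current holder a chance to make progress, which is the role the sleep instructions play in the commit above.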
2023-05-18'-foffload-memory=pinned' using offloading device interfaces for non-contiguous array supportThomas Schwinge2-5/+53
Changes related to og12 commit 15d0f61a7fecdc8fd12857c40879ea3730f6d99f "Merge non-contiguous array support patches". libgomp/ * target.c (gomp_map_vars_internal) <non-contiguous array support>: Handle 'always_pinned_mode'.
2023-05-18'-foffload-memory=pinned' using offloading device interfacesThomas Schwinge15-153/+1339
Implemented for nvptx offloading via 'cuMemHostAlloc', 'cuMemHostRegister'. gcc/ * doc/invoke.texi (-foffload-memory=pinned): Document. include/ * cuda/cuda.h (CUresult): Add 'CUDA_ERROR_HOST_MEMORY_ALREADY_REGISTERED'. (CUdevice_attribute): Add 'CU_DEVICE_ATTRIBUTE_READ_ONLY_HOST_REGISTER_SUPPORTED'. (CU_MEMHOSTREGISTER_READ_ONLY): Add. (cuMemHostGetFlags, cuMemHostRegister, cuMemHostUnregister): Add. libgomp/ * libgomp-plugin.h (GOMP_OFFLOAD_page_locked_host_free): Add 'struct goacc_asyncqueue *' formal parameter. (GOMP_OFFLOAD_page_locked_host_register) (GOMP_OFFLOAD_page_locked_host_unregister) (GOMP_OFFLOAD_page_locked_host_p): Add. * libgomp.h (always_pinned_mode) (gomp_page_locked_host_register_dev) (gomp_page_locked_host_unregister_dev): Add. (struct splay_tree_key_s): Add 'page_locked_host_p'. (struct gomp_device_descr): Add 'GOMP_OFFLOAD_page_locked_host_register', 'GOMP_OFFLOAD_page_locked_host_unregister', 'GOMP_OFFLOAD_page_locked_host_p'. * libgomp.texi (-foffload-memory=pinned): Document. * plugin/cuda-lib.def (cuMemHostGetFlags, cuMemHostRegister_v2) (cuMemHostRegister, cuMemHostUnregister): Add. * plugin/plugin-nvptx.c (struct ptx_device): Add 'read_only_host_register_supported'. (nvptx_open_device): Initialize it. (free_host_blocks, free_host_blocks_lock) (nvptx_run_deferred_page_locked_host_free) (nvptx_page_locked_host_free_callback, nvptx_page_locked_host_p) (GOMP_OFFLOAD_page_locked_host_register) (nvptx_page_locked_host_unregister_callback) (GOMP_OFFLOAD_page_locked_host_unregister) (GOMP_OFFLOAD_page_locked_host_p) (nvptx_run_deferred_page_locked_host_unregister) (nvptx_move_page_locked_host_unregister_blocks_aq1_aq2_callback): Add. (GOMP_OFFLOAD_fini_device, GOMP_OFFLOAD_page_locked_host_alloc) (GOMP_OFFLOAD_run): Call 'nvptx_run_deferred_page_locked_host_free'. (struct goacc_asyncqueue): Add 'page_locked_host_unregister_blocks_lock', 'page_locked_host_unregister_blocks'. 
(nvptx_goacc_asyncqueue_construct) (nvptx_goacc_asyncqueue_destruct): Handle those. (GOMP_OFFLOAD_page_locked_host_free): Handle 'struct goacc_asyncqueue *' formal parameter. (GOMP_OFFLOAD_openacc_async_test) (nvptx_goacc_asyncqueue_synchronize): Call 'nvptx_run_deferred_page_locked_host_unregister'. (GOMP_OFFLOAD_openacc_async_serialize): Call 'nvptx_move_page_locked_host_unregister_blocks_aq1_aq2_callback'. * config/linux/allocator.c (linux_memspace_alloc) (linux_memspace_calloc, linux_memspace_free) (linux_memspace_realloc): Remove 'always_pinned_mode' handling. (GOMP_enable_pinned_mode): Move... * target.c: ... here. (always_pinned_mode, verify_always_pinned_mode) (gomp_verify_always_pinned_mode, gomp_page_locked_host_alloc_dev) (gomp_page_locked_host_free_dev) (gomp_page_locked_host_aligned_alloc_dev) (gomp_page_locked_host_aligned_free_dev) (gomp_page_locked_host_register_dev) (gomp_page_locked_host_unregister_dev): Add. (gomp_copy_host2dev, gomp_map_vars_internal) (gomp_remove_var_internal, gomp_unmap_vars_internal) (get_gomp_offload_icvs, gomp_load_image_to_device) (gomp_target_rev, omp_target_memcpy_copy) (omp_target_memcpy_rect_worker): Handle 'always_pinned_mode'. (gomp_copy_host2dev, gomp_copy_dev2host): Handle 'verify_always_pinned_mode'. (GOMP_target_ext): Add 'assert'. (gomp_page_locked_host_alloc): Use 'gomp_page_locked_host_alloc_dev'. (gomp_page_locked_host_free): Use 'gomp_page_locked_host_free_dev'. (omp_target_associate_ptr): Adjust. (gomp_load_plugin_for_device): Handle 'page_locked_host_register', 'page_locked_host_unregister', 'page_locked_host_p'. * oacc-mem.c (memcpy_tofrom_device): Handle 'always_pinned_mode'. * libgomp_g.h (GOMP_enable_pinned_mode): Adjust. * testsuite/libgomp.c/alloc-pinned-7.c: Remove.
2023-05-18OpenACC: Pass pre-allocated 'ptrblock' to 'goacc_noncontig_array_create_ptrblock' [PR76739]Thomas Schwinge4-6/+13
... to simplify later changes. No functional change. Follow-up for og12 commit 15d0f61a7fecdc8fd12857c40879ea3730f6d99f "Merge non-contiguous array support patches". PR other/76739 libgomp/ * target.c (gomp_map_vars_internal): Pass pre-allocated 'ptrblock' to 'goacc_noncontig_array_create_ptrblock'. * oacc-parallel.c (goacc_noncontig_array_create_ptrblock): Adjust. * oacc-int.h (goacc_noncontig_array_create_ptrblock): Adjust.
2023-05-18libgomp: Document OpenMP 'pinned' memoryThomas Schwinge2-0/+13
libgomp/ * libgomp.texi (AMD Radeon, nvptx): Document OpenMP 'pinned' memory.
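As a rough illustration of what 'pinned' memory looks like from the OpenMP API side, the sketch below requests page-locked host memory through an allocator that carries the standard 'pinned' trait. The type 'pinned_buffer' and the two helper functions are made up for this example. The plain 'malloc' fallback (used when the file is built without -fopenmp, or when the runtime declines to create the allocator) is an assumption added so that the snippet compiles anywhere; how libgomp actually satisfies the request on a given target is what libgomp.texi documents.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical wrapper around one buffer that was requested as pinned.  */
typedef struct {
    void *data;
    int requested_pinned;   /* 1 if the memory came from the pinned allocator */
#ifdef _OPENMP
    omp_allocator_handle_t allocator;
#endif
} pinned_buffer;

/* Allocate SIZE zeroed bytes; returns 0 on success, -1 on failure.  */
int pinned_buffer_init(pinned_buffer *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    buf->data = NULL;
    buf->requested_pinned = 0;
#ifdef _OPENMP
    /* Standard OpenMP 5.0 allocator API: ask for the 'pinned' trait.  */
    omp_alloctrait_t traits[] = { { omp_atk_pinned, omp_atv_true } };
    buf->allocator = omp_init_allocator(omp_default_mem_space, 1, traits);
    if (buf->allocator != omp_null_allocator) {
        buf->data = omp_alloc(size, buf->allocator);
        if (buf->data != NULL)
            buf->requested_pinned = 1;
        else
            omp_destroy_allocator(buf->allocator);
    }
#endif
    /* Assumed fallback for this sketch only: ordinary pageable memory.  */
    if (buf->data == NULL)
        buf->data = malloc(size);
    if (buf->data == NULL)
        return -1;
    memset(buf->data, 0, size);
    return 0;
}

/* Release the buffer with the allocator it came from.  */
void pinned_buffer_destroy(pinned_buffer *buf)
{
    assert(buf != NULL);
#ifdef _OPENMP
    if (buf->requested_pinned) {
        omp_free(buf->data, buf->allocator);
        omp_destroy_allocator(buf->allocator);
        buf->data = NULL;
        buf->requested_pinned = 0;
        return;
    }
#endif
    free(buf->data);
    buf->data = NULL;
}
```

A caller would pair 'pinned_buffer_init (&buf, size)' with 'pinned_buffer_destroy (&buf)'. Pinned allocations are typically limited by the locked-memory resource limit, so the result should be checked.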
2023-05-18openmp: Handle GIMPLE_OMP_METADIRECTIVE in walk_omp_for_loopsFrederik Harwath2-0/+20
gcc/ChangeLog: * omp-transform-loops.cc (walk_omp_for_loops): Handle GIMPLE_OMP_METADIRECTIVE.
2023-05-18openmp: Add C/C++ support for loop transformations on inner loopsFrederik Harwath29-77/+858
Add the parsing of loop transformations on inner loops of a loop-nest. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_nested_loop_transform_clauses): Add argument for the level of loop-nest at which the clauses appear, ... (c_parser_omp_tile): ... adjust use here, (c_parser_omp_unroll): ... and here, (c_parser_omp_for_loop): ... and here. Stop treating loop transformations like intervening code, parse them, and adjust the loop-nest depth if necessary for tiling. gcc/cp/ChangeLog: * parser.cc (cp_parser_is_pragma): New function. (cp_parser_omp_nested_loop_transform_clauses): Add argument for the level of loop-nest at which the clauses appear, ... (cp_parser_omp_tile): ... adjust use here, (cp_parser_omp_unroll): ... and here, (cp_parser_omp_for_loop): ... and here. Stop treating loop gcc/testsuite/ChangeLog: * c-c++-common/gomp/loop-transforms/unroll-inner-1.c: New test. * c-c++-common/gomp/loop-transforms/unroll-inner-2.c: New test. libgomp/ChangeLog * testsuite/libgomp.c++/loop-transforms/tile-1.C: Deleted, replaced by matrix-* tests. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-1.h: New header file for new tests. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-constant-iter.h: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-helper.h: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-unroll-full-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-distribute-parallel-for-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-for-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-for-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-simd-1.c: New test. 
* testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-parallel-for-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-teams-distribute-parallel-for-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-taskloop-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-teams-distribute-parallel-for-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-simd-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-transform-variants-1.h: New test. * testsuite/libgomp.c-c++-common/loop-transforms/unroll-non-rect-1.c: New test.
2023-05-18openmp: Add Fortran support for loop transformations on inner loopsFrederik Harwath35-138/+1196
So far the implementation of the "omp tile" and "omp unroll" directives restricted their use to the outermost loop of a loop-nest. This commit changes the Fortran front end to parse and verify the directives on inner loops. The transformation clauses are extended to carry the information about the level of the loop-nest at which a transformation should be applied. The middle end transformation pass is adjusted to apply the transformations at the right level of a loop nest and to take their effect on the loop nest depth into account. gcc/fortran/ChangeLog: * openmp.cc (omp_unroll_removes_loop_nest): Move down in file. (resolve_loop_transform_generic): Remove, and ... (resolve_omp_unroll): ... inline and adapt here. Move function. (find_nested_loop_in_block): New function. (find_nested_loop_in_chain): New function, used ... (is_outer_iteration_variable): ... here, and ... (expr_is_invariant): ... here. (resolve_omp_do): Adjust code for resolving loop transformations. (resolve_omp_tile): Likewise. * trans-openmp.cc (gfc_trans_omp_clauses): Set OMP_CLAUSE_TRANSFORM_LEVEL on new clause. (compute_transformed_depth): New function to compute the depth ("collapse") of a transformed loop nest, used ... (gfc_trans_omp_do): ... here. gcc/ChangeLog: * omp-transform-loops.cc (gimple_assign_rhs_to_tree): Fix type in comment. (gomp_for_uncollapse): Adjust "collapse" value after uncollapse. (partial_unroll): Add argument for the loop nest level to be transformed. (tile): Likewise. (transform_gomp_for): Pass level to transformation functions. (optimize_transformation_clauses): Handle transformation clauses for all levels recursively. * tree-pretty-print.cc (dump_omp_clause): Print OMP_CLAUSE_TRANSFORM_LEVEL for OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_TILE. * tree.cc: Increase number of operands of OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_TILE. * tree.h (OMP_CLAUSE_TRANSFORM_LEVEL): New macro to access clause operand 0. 
(OMP_CLAUSE_UNROLL_PARTIAL_EXPR): Use operand 1 instead of 0. (OMP_CLAUSE_TILE_SIZES): Likewise. gcc/cp/ChangeLog * parser.cc (cp_parser_omp_clause_unroll_full): Set new OMP_CLAUSE_TRANSFORM_LEVEL operand to default value. (cp_parser_omp_clause_unroll_partial): Likewise. (cp_parser_omp_tile_sizes): Likewise. (cp_parser_omp_loop_transform_clause): Likewise. (cp_parser_omp_nested_loop_transform_clauses): Likewise. (cp_parser_omp_unroll): Likewise. * pt.cc (tsubst_omp_clauses): Adjust OMP_CLAUSE_UNROLL_PARTIAL and OMP_CLAUSE_TILE handling to changed number of operands. gcc/c/ChangeLog * c-parser.cc (c_parser_omp_clause_unroll_full): Set new OMP_CLAUSE_TRANSFORM_LEVEL operand to default value. (c_parser_omp_clause_unroll_partial): Likewise. (c_parser_omp_tile_sizes): Likewise. (c_parser_omp_loop_transform_clause): Likewise. (c_parser_omp_nested_loop_transform_clauses): Likewise. (c_parser_omp_unroll): Likewise. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/loop-transforms/unroll-8.f90: Adjust. * gfortran.dg/gomp/loop-transforms/unroll-9.f90: Adjust. * gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90: Adjust. * gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90: Adjust. * gfortran.dg/gomp/loop-transforms/inner-loops.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90: New test. 
* gfortran.dg/gomp/loop-transforms/tile-3.f90: Adapt to changed diagnostic messages. libgomp/ChangeLog: * testsuite/libgomp.fortran/loop-transforms/inner-1.f90: New test.
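To make "a transformation on an inner loop" concrete, here is a minimal C sketch (the commit itself concerns the Fortran front end, whose syntax differs). The outer directive collapses two loop levels and the inner loop carries its own "omp tile". The function name 'add_outer_product' and the constant 'N' are invented for the example. Whether a particular compiler accepts this nesting depends on its OpenMP 5.1 support, and a compiler that does not know the directives ignores them and runs the plain loop nest with the same result.

```c
#include <assert.h>

enum { N = 8 };

/* Accumulate the outer product of A and B into the leading n-by-n
   block of C.  The transformation directive sits on the inner loop,
   at level 2 of the loop nest associated with the outer directive.  */
void add_outer_product(int n, int c[N][N], const int a[N], const int b[N])
{
    assert(n >= 0 && n <= N);
#pragma omp parallel for collapse(2)
    for (int i = 0; i < n; i++)
#pragma omp tile sizes(4)
        for (int j = 0; j < n; j++)
            c[i][j] += a[i] * b[j];
}
```

Each element of 'c' is written by exactly one iteration, so the collapsed, tiled nest needs no further synchronization.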
2023-05-18openmp: Add C/C++ support for "omp tile"Frederik Harwath29-102/+1885
This commit adds the C and C++ front end support for the "omp tile" directive. The middle end support for the transformation is implemented in a previous commit. gcc/c-family/ChangeLog: * c-omp.cc (c_omp_directives): Add PRAGMA_OMP_TILE. * c-pragma.cc (omp_pragmas_simd): Likewise. * c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_TILE. (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_TILE gcc/c/ChangeLog: * c-parser.cc (c_parser_nested_omp_unroll_clauses): Rename and generalize ... (c_parser_omp_nested_loop_transform_clauses): ... to this. (c_parser_omp_for_loop): Handle "omp tile" parsing in loop nests. (c_parser_omp_tile_sizes): Parse single "sizes" clause. (c_parser_omp_loop_transform_clause): New function. (c_parser_omp_tile): New function for parsing "omp tile" (c_parser_omp_unroll): Adjust to renaming. (c_parser_omp_construct): Handle PRAGMA_OMP_TILE. gcc/cp/ChangeLog: * parser.cc (cp_parser_omp_clause_unroll_partial): Adjust. (cp_parser_nested_omp_unroll_clauses): Rename ... (cp_parser_omp_nested_loop_transform_clauses): ... to this. (cp_parser_omp_for_loop): Handle "omp tile" parsing in loop nests. (cp_parser_omp_tile_sizes): New function, parses single "sizes" clause (cp_parser_omp_tile): New function for parsing "omp tile". (cp_parser_omp_loop_transform_clause): New function. (cp_parser_omp_unroll): Adjust to renaming. (cp_parser_omp_construct): Handle PRAGMA_OMP_TILE. (cp_parser_pragma): Likewise. * pt.cc (tsubst_omp_clauses): Handle OMP_CLAUSE_TILE. * semantics.cc (finish_omp_clauses): Likewise. gcc/ChangeLog: * gimplify.cc (omp_for_drop_tile_clauses): New function, ... (gimplify_omp_for): ... used here. libgomp/ChangeLog: * testsuite/libgomp.c++/loop-transforms/tile-1.C: New test. * testsuite/libgomp.c++/loop-transforms/tile-2.C: New test. * testsuite/libgomp.c++/loop-transforms/tile-3.C: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/loop-transforms/tile-1.c: New test. * c-c++-common/gomp/loop-transforms/tile-2.c: New test. 
* c-c++-common/gomp/loop-transforms/tile-3.c: New test. * c-c++-common/gomp/loop-transforms/tile-4.c: New test. * c-c++-common/gomp/loop-transforms/tile-5.c: New test. * c-c++-common/gomp/loop-transforms/tile-6.c: New test. * c-c++-common/gomp/loop-transforms/tile-7.c: New test. * c-c++-common/gomp/loop-transforms/tile-8.c: New test. * c-c++-common/gomp/loop-transforms/unroll-2.c: Adapt to changed diagnostic messages. * g++.dg/gomp/loop-transforms/tile-1.h: New test. * g++.dg/gomp/loop-transforms/tile-1a.C: New test. * g++.dg/gomp/loop-transforms/tile-1b.C: New test.
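For orientation, a stand-alone use of the directive in C might look like the following sketch. The function 'transpose_tiled' and the constant 'DIM' are names made up for this example and do not come from the testsuite. The "sizes" clause gives one tile size per associated loop. A compiler without support for the directive skips the pragma and produces the same output.

```c
#include <assert.h>

enum { DIM = 8 };

/* Transpose the leading n-by-n block of SRC into DST.  The directive
   asks for the two loops to be processed in 4x4 tiles.  */
void transpose_tiled(int n, int dst[DIM][DIM], int src[DIM][DIM])
{
    assert(n >= 0 && n <= DIM);
#pragma omp tile sizes(4, 4)
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            dst[j][i] = src[i][j];
}
```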
2023-05-18openmp: Add Fortran support for "omp tile"Frederik Harwath37-120/+2118
This commit implements the Fortran front end support for the "omp tile" directive and the corresponding middle end transformation. gcc/fortran/ChangeLog: * gfortran.h (enum gfc_statement): Add ST_OMP_TILE, ST_OMP_END_TILE. (enum gfc_exec_op): Add EXEC_OMP_TILE. (loop_transform_p): New declaration. (struct gfc_omp_clauses): Add "tile_sizes" field. * dump-parse-tree.cc (show_omp_clauses): Handle "tile_sizes" dumping. (show_omp_node): Handle EXEC_OMP_TILE. (show_code_node): Likewise. * match.h (gfc_match_omp_tile): New declaration. * openmp.cc (gfc_free_omp_clauses): Free "tile_sizes" field. (match_tile_sizes): New function. (OMP_TILE_CLAUSES): New macro. (gfc_match_omp_tile): New function. (resolve_omp_do): Handle EXEC_OMP_TILE. (resolve_omp_tile): New function. (omp_code_to_statement): Handle EXEC_OMP_TILE. (gfc_resolve_omp_directive): Likewise. * parse.cc (decode_omp_directive): Handle ST_OMP_END_TILE and ST_OMP_TILE. (next_statement): Handle ST_OMP_TILE. (gfc_ascii_statement): Likewise. (parse_omp_do): Likewise. (parse_executable): Likewise. * resolve.cc (gfc_resolve_blocks): Handle EXEC_OMP_TILE. (gfc_resolve_code): Likewise. * st.cc (gfc_free_statement): Likewise. * trans-openmp.cc (gfc_trans_omp_clauses): Handle "tile_sizes" field. (loop_transform_p): New function. (gfc_expr_list_len): New function. (gfc_trans_omp_do): Handle EXEC_OMP_TILE. (gfc_trans_omp_directive): Likewise. * trans.cc (trans_code): Likewise. gcc/ChangeLog: * gimplify.cc (gimplify_scan_omp_clauses): Handle OMP_CLAUSE_TILE. (gimplify_adjust_omp_clauses): Likewise. (gimplify_omp_loop): Likewise. * omp-transform-loops.cc (walk_omp_for_loops): New declaration. (subst_var_in_op): New function. (subst_var): New function. (gomp_for_number_of_iterations): Adjust. (gomp_for_iter_count_type): New function. (gimple_assign_rhs_to_tree): New function. (subst_defs): New function. (gomp_for_uncollapse): Adjust. (transformation_clause_p): Add OMP_CLAUSE_TILE. (tile): New function. 
(transform_gomp_for): Handle OMP_CLAUSE_TILE. (optimize_transformation_clauses): Handle OMP_CLAUSE_TILE. * omp-general.cc (omp_loop_transform_clause_p): Add OMP_CLAUSE_TILE. * tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_TILE. * tree-pretty-print.cc (dump_omp_clause): Handle OMP_CLAUSE_TILE. * tree.cc: Add OMP_CLAUSE_TILE. * tree.h (OMP_CLAUSE_TILE_SIZES): New macro. libgomp/ChangeLog: * testsuite/libgomp.fortran/loop-transforms/tile-1.f90: New test. * testsuite/libgomp.fortran/loop-transforms/tile-2.f90: New test. * testsuite/libgomp.fortran/loop-transforms/tile-unroll-1.f90: New test. * testsuite/libgomp.fortran/loop-transforms/tile-unroll-2.f90: New test. * testsuite/libgomp.fortran/loop-transforms/tile-unroll-3.f90: New test. * testsuite/libgomp.fortran/loop-transforms/tile-unroll-4.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-tile-1.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-tile-2.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/loop-transforms/tile-1.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-1a.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-2.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-3.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-4.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-unroll-1.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90: New test.
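The effect of tiling on iteration order can be sketched by hand. The C code below is an illustration of the general technique only, not the code that the 'tile' function in omp-transform-loops.cc generates. It walks an n-by-n iteration space in 2x2 tiles and records the order in which the points are visited. The names 'tiled_visit_order' and 'TILE' are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

enum { TILE = 2 };

/* Smaller of two ints; used to clip partial tiles at the boundary.  */
static int min_int(int x, int y) { return x < y ? x : y; }

/* Visit the points (i, j) of an n-by-n space tile by tile and store
   i * n + j for each visited point in OUT.  Returns the number of
   points visited.  OUT must have room for n * n entries.  */
int tiled_visit_order(int n, int *out)
{
    assert(n >= 0 && out != NULL);
    int count = 0;
    /* Outer pair: iterate over the tiles.  */
    for (int ii = 0; ii < n; ii += TILE)
        for (int jj = 0; jj < n; jj += TILE)
            /* Inner pair: iterate over the points inside one tile.  */
            for (int i = ii; i < min_int(ii + TILE, n); i++)
                for (int j = jj; j < min_int(jj + TILE, n); j++)
                    out[count++] = i * n + j;
    return count;
}
```

For n = 4 the first tile covers points 0, 1, 4, 5 before the second tile starts at point 2, which is the reordering that tiling is meant to achieve.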
2023-05-18openacc: Rename OMP_CLAUSE_TILE to OMP_CLAUSE_OACC_TILEFrederik Harwath21-47/+93
OMP_CLAUSE_TILE will be used for the OpenMP 5.1 loop transformation construct "omp tile". gcc/ChangeLog: * tree-core.h (enum omp_clause_code): Rename OMP_CLAUSE_TILE. * tree.h (OMP_CLAUSE_TILE_LIST): Rename to ... (OMP_CLAUSE_OACC_TILE_LIST): ... this. (OMP_CLAUSE_TILE_ITERVAR): Rename to ... (OMP_CLAUSE_OACC_TILE_ITERVAR): ... this. (OMP_CLAUSE_TILE_COUNT): Rename to ... (OMP_CLAUSE_OACC_TILE_COUNT): this. * gimplify.cc (gimplify_scan_omp_clauses): Adjust to renamings. (gimplify_adjust_omp_clauses): Likewise. (gimplify_omp_for): Likewise. * omp-general.cc (omp_extract_for_data): Likewise. * omp-low.cc (scan_sharing_clauses): Likewise. (lower_oacc_head_mark): Likewise. * tree-nested.cc (convert_nonlocal_omp_clauses): Likewise. (convert_local_omp_clauses): Likewise. * tree-pretty-print.cc (dump_omp_clause): Likewise. * tree.cc: Likewise. gcc/c-family/ChangeLog: * c-omp.cc (c_oacc_split_loop_clauses): Adjust to renamings. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_clause_collapse): Adjust to renamings. (c_parser_oacc_clause_tile): Likewise. (c_parser_omp_for_loop): Likewise. * c-typeck.cc (c_finish_omp_clauses): Likewise. gcc/cp/ChangeLog: * parser.cc (cp_parser_oacc_clause_tile): Adjust to renamings. (cp_parser_omp_clause_collapse): Likewise. (cp_parser_omp_for_loop): Likewise. * pt.cc (tsubst_omp_clauses): Likewise. * semantics.cc (finish_omp_clauses): Likewise. (finish_omp_for): Likewise. gcc/fortran/ChangeLog: * openmp.cc (enum omp_mask2): Adjust to renamings. (gfc_match_omp_clauses): Likewise. * trans-openmp.cc (gfc_trans_omp_clauses): Likewise.
2023-05-18openmp: Add C/C++ support for "omp unroll" directiveFrederik Harwath29-8/+1327
This commit implements the C and the C++ front end changes to support the "omp unroll" directive. The execution of the loop transformation relies on the pass that has been added as a part of the earlier Fortran patch. gcc/c-family/ChangeLog: * c-gimplify.cc (c_genericize_control_stmt): Handle OMP_UNROLL. * c-omp.cc: Add "unroll" to omp_directives[]. * c-pragma.cc: Add "unroll" to omp_pragmas_simd[]. * c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_UNROLL to pragma_kind and adjust PRAGMA_OMP__LAST_. (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_FULL and PRAGMA_OMP_CLAUSE_PARTIAL. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_clause_name): Handle "full" and "partial" clauses. (check_no_duplicate_clause): Change return type to bool and return check result. (c_parser_omp_clause_unroll_full): New function for parsing the "full" clause. (c_parser_omp_clause_unroll_partial): New function for parsing the "partial" clause. (c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_FULL and PRAGMA_OMP_CLAUSE_PARTIAL. (c_parser_nested_omp_unroll_clauses): New function for parsing "omp unroll" directives following another directive. (OMP_UNROLL_CLAUSE_MASK): New definition. (c_parser_omp_unroll): New function for parsing "omp unroll" loops that are not associated with another directive. (c_parser_omp_construct): Handle PRAGMA_OMP_UNROLL. * c-typeck.cc (c_finish_omp_clauses): Handle OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_UNROLL_NONE. gcc/cp/ChangeLog: * cp-gimplify.cc (cp_gimplify_expr): Handle OMP_UNROLL. (cp_fold_r): Likewise. (cp_genericize_r): Likewise. * parser.cc (cp_parser_omp_clause_name): Handle "full" clause. (check_no_duplicate_clause): Change return type to bool and return check result. (cp_parser_omp_clause_unroll_full): New function for parsing the "full" clause. (cp_parser_omp_clause_unroll_partial): New function for parsing the "partial" clause. (cp_parser_omp_all_clauses): Handle OMP_CLAUSE_UNROLL and OMP_CLAUSE_FULL. 
(cp_parser_nested_omp_unroll_clauses): New function for parsing "omp unroll" directives following another directive. (cp_parser_omp_for_loop): Handle "omp unroll" directives between directive and loop. (OMP_UNROLL_CLAUSE_MASK): New definition. (cp_parser_omp_unroll): New function for parsing "omp unroll" loops that are not associated with another directive. (cp_parser_omp_construct): Handle PRAGMA_OMP_UNROLL. (cp_parser_pragma): Handle PRAGMA_OMP_UNROLL. * pt.cc (tsubst_omp_clauses): Handle OMP_CLAUSE_UNROLL_PARTIAL, OMP_CLAUSE_UNROLL_FULL, and OMP_CLAUSE_UNROLL_NONE. (tsubst_expr): Handle OMP_UNROLL. * semantics.cc (finish_omp_clauses): Handle OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_UNROLL_NONE. libgomp/ChangeLog: * testsuite/libgomp.c++/loop-transforms/unroll-1.C: New test. * testsuite/libgomp.c++/loop-transforms/unroll-2.C: New test. * testsuite/libgomp.c-c++-common/loop-transforms/unroll-1.c: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/loop-transforms/unroll-1.c: New test. * c-c++-common/gomp/loop-transforms/unroll-2.c: New test. * c-c++-common/gomp/loop-transforms/unroll-3.c: New test. * c-c++-common/gomp/loop-transforms/unroll-4.c: New test. * c-c++-common/gomp/loop-transforms/unroll-5.c: New test. * c-c++-common/gomp/loop-transforms/unroll-6.c: New test. * g++.dg/gomp/loop-transforms/unroll-1.C: New test. * g++.dg/gomp/loop-transforms/unroll-2.C: New test. * g++.dg/gomp/loop-transforms/unroll-3.C: New test.
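As a small usage sketch, the two clauses handled here could be written in C as follows. The function names 'sum_unroll_partial' and 'dot4_unroll_full' are made up for the example. "partial(4)" requests an unroll factor of 4, while "full" requires a loop whose trip count is a compile-time constant. A compiler without support for the directive ignores the pragmas, and the results are unchanged.

```c
#include <assert.h>

/* Sum the first N elements of A; the loop is requested to be
   unrolled by a factor of 4.  */
long sum_unroll_partial(const int *a, int n)
{
    assert(n >= 0);
    long total = 0;
#pragma omp unroll partial(4)
    for (int i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Dot product of two 4-element vectors; the constant trip count
   allows the loop to be unrolled completely.  */
int dot4_unroll_full(const int x[4], const int y[4])
{
    int acc = 0;
#pragma omp unroll full
    for (int i = 0; i < 4; i++)
        acc += x[i] * y[i];
    return acc;
}
```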
2023-05-18openmp: Add Fortran support for "omp unroll" directiveFrederik Harwath57-17/+3577
This commit implements the OpenMP 5.1 "omp unroll" directive for Fortran. The Fortran front end changes encompass the parsing and the verification of nesting restrictions etc. The actual loop transformation is implemented in a new language-independent "omp_transform_loops" pass which runs before omp lowering. No attempt is made to re-use existing unrolling optimizations because a separate implementation allows for better control of the unrolling. The new pass will also serve as a foundation for the implementation of further OpenMP loop transformations. This commit only implements the support for "omp unroll" on the outermost loop of a loop nest. The support for inner loops will be added later. gcc/ChangeLog: * Makefile.in: Add omp_transform_loops.o. * gimple-pretty-print.cc (dump_gimple_omp_for): Handle "full" and "partial" clauses. * gimple.h (enum gf_mask): Add GF_OMP_FOR_KIND_TRANSFORM_LOOP. * gimplify.cc (is_gimple_stmt): Handle OMP_UNROLL. (gimplify_scan_omp_clauses): Handle OMP_UNROLL_FULL, OMP_UNROLL_NONE, and OMP_UNROLL_PARTIAL. (gimplify_adjust_omp_clauses): Handle OMP_UNROLL_FULL, OMP_UNROLL_NONE, and OMP_UNROLL_PARTIAL. (gimplify_omp_for): Handle OMP_UNROLL. (gimplify_expr): Likewise. * params.opt: Add omp-unroll-full-max-iteration and omp-unroll-default-factor. * passes.def: Add pass_omp_transform_loops before pass_lower_omp. * tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_UNROLL_NONE, OMP_CLAUSE_UNROLL_FULL, and OMP_CLAUSE_UNROLL_PARTIAL. * tree-pass.h (make_pass_omp_transform_loops): Declare. * tree-pretty-print.cc (dump_omp_clause): Handle OMP_CLAUSE_UNROLL_NONE, OMP_CLAUSE_UNROLL_FULL, and OMP_CLAUSE_UNROLL_PARTIAL. (dump_generic_node): Handle OMP_UNROLL. * tree.cc (omp_clause_num_ops): Add number of operands for OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_NONE, and OMP_CLAUSE_UNROLL_PARTIAL. (omp_clause_code_names): Add name strings for OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_NONE, and OMP_CLAUSE_UNROLL_PARTIAL. 
* tree.def (OMP_UNROLL): Define. * tree.h (OMP_CLAUSE_UNROLL_PARTIAL_EXPR): Define. * omp-transform-loops.cc: New file. * omp-general.cc (omp_loop_transform_clause_p): New function. * omp-general.h (omp_loop_transform_clause_p): New declaration. gcc/fortran/ChangeLog: * dump-parse-tree.cc (show_omp_clauses): Handle "unroll full" and "unroll partial". (show_omp_node): Handle OMP_UNROLL. (show_code_node): Handle EXEC_OMP_UNROLL. * gfortran.h (enum gfc_statement): Add ST_OMP_UNROLL, ST_OMP_END_UNROLL. (enum gfc_exec_op): Add EXEC_OMP_UNROLL. * match.h (gfc_match_omp_unroll): Declare. * openmp.cc (enum omp_mask2): Add OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_NONE, OMP_CLAUSE_UNROLL_PARTIAL. (gfc_match_omp_clauses): Handle "omp unroll partial". (OMP_UNROLL_CLAUSES): New macro definition. (gfc_match_omp_unroll): Match "full" clause. (omp_unroll_removes_loop_nest): New function. (resolve_omp_unroll): New function. (resolve_omp_do): Accept and verify "omp unroll" directives between directive and loop. (omp_code_to_statement): Handle EXEC_OMP_UNROLL. (gfc_resolve_omp_directive): Likewise. * parse.cc (decode_omp_directive): Handle "unroll" and "end unroll". (next_statement): Handle ST_OMP_UNROLL. (gfc_ascii_statement): Handle ST_OMP_UNROLL and ST_OMP_END_UNROLL. (parse_omp_do): Accept ST_OMP_UNROLL and ST_OMP_END_UNROLL before/after loop. (parse_executable): Handle ST_OMP_UNROLL. * resolve.cc (gfc_resolve_blocks): Handle EXEC_OMP_UNROLL. (gfc_resolve_code): Likewise. * st.cc (gfc_free_statement): Likewise. * trans-openmp.cc (gfc_trans_omp_clauses): Handle unroll clauses. (gfc_trans_omp_do): Handle OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_PARTIAL, OMP_CLAUSE_UNROLL_NONE creation. (gfc_trans_omp_directive): Handle EXEC_OMP_UNROLL. * trans.cc (trans_code): Likewise. libgomp/ChangeLog: * testsuite/libgomp.fortran/loop-transforms/unroll-1.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-2.f90: New test. 
* testsuite/libgomp.fortran/loop-transforms/unroll-3.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-4.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-5.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-6.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-7.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-8.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/loop-transforms/unroll-1.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-2.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-3.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-4.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-5.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-6.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-7.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-9.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90: New test.
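What partial unrolling does to a loop can be shown by hand. The C sketch below unrolls a summation loop by a factor of 2, using a separate step for the leftover iteration. This is one common way to write the transformation and is meant purely as an illustration. It is an assumption of this example, not a description of the code the "omp_transform_loops" pass emits, and the function name 'sum_unrolled_by_2' is invented.

```c
#include <assert.h>

/* Sum the first N elements of A with the loop body duplicated once
   per iteration (unroll factor 2).  */
long sum_unrolled_by_2(const int *a, int n)
{
    assert(n >= 0);
    long total = 0;
    int i = 0;
    /* Main loop: two original iterations per trip.  */
    for (; i + 1 < n; i += 2) {
        total += a[i];
        total += a[i + 1];
    }
    /* Remainder: at most one iteration is left when N is odd.  */
    if (i < n)
        total += a[i];
    return total;
}
```

The unrolled loop performs the same additions in the same order as the original loop, so the result is identical for every n.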
2023-05-18Add 'libgomp.c/alloc-ompx_host_mem_alloc-1.c'Thomas Schwinge2-0/+79
OpenMP 'ompx_host_mem_alloc' is available for host and nvptx offloading as of og12 commit 84914e197d91a67b3d27db0e4c69a433462983a5 "openmp, nvptx: ompx_unified_shared_mem_alloc", and for GCN offloading as of og12 commit c77c45a641fedc3fe770e909cc010fb1735bdbbd "amdgcn, libgomp: low-latency allocator". libgomp/ * testsuite/libgomp.c/alloc-ompx_host_mem_alloc-1.c: New.
2023-05-18Miscellaneous clean-up re OpenMP 'ompx_host_mem_space'Thomas Schwinge2-0/+9
Like done for nvptx in og12 commit 23f52e49368d7b26a1b1a72d6bb903d31666e961 "Miscellaneous clean-up re OpenMP 'ompx_unified_shared_mem_space', 'ompx_host_mem_space'". Clean-up for og12 commit c77c45a641fedc3fe770e909cc010fb1735bdbbd "amdgcn, libgomp: low-latency allocator". No functional change. libgomp/ * config/gcn/allocator.c (gcn_memspace_free): Explicitly handle 'memspace == ompx_host_mem_space'.