aboutsummaryrefslogtreecommitdiff
path: root/libgomp/testsuite
AgeCommit message (Collapse)AuthorFilesLines
2021-05-28OpenMP: Add iterator support to Fortran's depend; add affinity clauseTobias Burnus1-0/+89
gcc/c-family/ChangeLog: * c-pragma.h (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_AFFINITY. gcc/c/ChangeLog: * c-parser.c (c_parser_omp_clause_affinity): New. (c_parser_omp_clause_name, c_parser_omp_variable_list, c_parser_omp_all_clauses, OMP_TASK_CLAUSE_MASK): Handle affinity clause. * c-typeck.c (handle_omp_array_sections_1, handle_omp_array_sections, c_finish_omp_clauses): Likewise. gcc/cp/ChangeLog: * parser.c (cp_parser_omp_clause_affinity): New. (cp_parser_omp_clause_name, cp_parser_omp_var_list_no_open, cp_parser_omp_all_clauses, OMP_TASK_CLAUSE_MASK): Handle affinity clause. * semantics.c (handle_omp_array_sections_1, handle_omp_array_sections, finish_omp_clauses): Likewise. gcc/fortran/ChangeLog: * dump-parse-tree.c (show_iterator): New. (show_omp_namelist): Handle iterators. (show_omp_clauses): Handle affinity. * gfortran.h (gfc_free_omp_namelist): New union with 'udr' and new 'ns'. * match.c (gfc_free_omp_namelist): Add are to choose union element. * openmp.c (gfc_free_omp_clauses, gfc_match_omp_detach, gfc_match_omp_clause_reduction, gfc_match_omp_flush): Update call to gfc_free_omp_namelist. (gfc_match_omp_variable_list): Likewise; permit preceeding whitespace. (enum omp_mask1): Add OMP_CLAUSE_AFFINITY. (gfc_match_iterator): New. (gfc_match_omp_clauses): Use it; update call to gfc_free_omp_namelist. (OMP_TASK_CLAUSES): Add OMP_CLAUSE_AFFINITY. (gfc_match_omp_taskwait): Match depend clause. (resolve_omp_clauses): Handle affinity; update for udr/union change. (gfc_resolve_omp_directive): Resolve clauses of taskwait. * st.c (gfc_free_statement): Update gfc_free_omp_namelist call. * trans-openmp.c (gfc_trans_omp_array_reduction_or_udr): Likewise (handle_iterator): New. (gfc_trans_omp_clauses): Handle iterators for depend/affinity clause. (gfc_trans_omp_taskwait): Handle depend clause. (gfc_trans_omp_directive): Update call. gcc/ChangeLog: * gimplify.c (gimplify_omp_affinity): New. (gimplify_scan_omp_clauses): Call it; remove affinity clause afterwards. * tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_AFFINITY. * tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE_AFFINITY. * tree.c (omp_clause_num_ops, omp_clause_code_name): Add clause. (walk_tree_1): Handle OMP_CLAUSE_AFFINITY. libgomp/ChangeLog: * testsuite/libgomp.fortran/depend-iterator-2.f90: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/affinity-1.c: New test. * c-c++-common/gomp/affinity-2.c: New test. * c-c++-common/gomp/affinity-3.c: New test. * c-c++-common/gomp/affinity-4.c: New test. * c-c++-common/gomp/affinity-5.c: New test. * c-c++-common/gomp/affinity-6.c: New test. * c-c++-common/gomp/affinity-7.c: New test. * gfortran.dg/gomp/affinity-clause-1.f90: New test. * gfortran.dg/gomp/affinity-clause-2.f90: New test. * gfortran.dg/gomp/affinity-clause-3.f90: New test. * gfortran.dg/gomp/affinity-clause-4.f90: New test. * gfortran.dg/gomp/affinity-clause-5.f90: New test. * gfortran.dg/gomp/affinity-clause-6.f90: New test. * gfortran.dg/gomp/depend-iterator-1.f90: New test. * gfortran.dg/gomp/depend-iterator-2.f90: New test. * gfortran.dg/gomp/depend-iterator-3.f90: New test. * gfortran.dg/gomp/taskwait.f90: New test.
2021-05-27libgomp: Add openacc_{cuda,cublas,cudart} effective targets and use them in ↵Jakub Jelinek28-7/+92
openacc testsuite When gcc is configured for nvptx offloading with --without-cuda-driver and full CUDA isn't installed, many libgomp.oacc-*/* tests fail, some of them because cuda.h header can't be found, others because the tests can't be linked against -lcuda, -lcudart or -lcublas. I usually only have akmod-nvidia and xorg-x11-drv-nvidia-cuda rpms installed, so libcuda.so.1 can be dlopened and the offloading works, but linking against those libraries isn't possible nor are the headers around (for the plugin itself there is the fallback libgomp/plugin/cuda/cuda.h). The following patch adds 3 new effective targets and uses them in tests that needs those. 2021-05-27 Jakub Jelinek <jakub@redhat.com> * testsuite/lib/libgomp.exp (check_effective_target_openacc_cuda, check_effective_target_openacc_cublas, check_effective_target_openacc_cudart): New. * testsuite/libgomp.oacc-fortran/host_data-4.f90: Require effective target openacc_cublas. * testsuite/libgomp.oacc-fortran/host_data-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/host_data-3.f: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-91.c: Require effective target openacc_cuda. * testsuite/libgomp.oacc-c-c++-common/lib-70.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-90.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-75.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-69.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-74.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-81.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-72.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-85.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/pr87835.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-82.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-73.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-83.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-78.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-76.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-84.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-79.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/host_data-1.c: Require effective targets openacc_cublas and openacc_cudart. * testsuite/libgomp.oacc-c-c++-common/context-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/context-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/context-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/context-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/acc_get_property-nvptx.c: Require effective target openacc_cudart. * testsuite/libgomp.oacc-c-c++-common/asyncwait-1.c: Add -DUSE_CUDA_H for effective target openacc_cuda and add && defined USE_CUDA_H to preprocessor conditionals. Guard -lcuda also on openacc_cuda effective target.
2021-05-26openmp: Fix up handling of target constructs in offloaded routines [PR100573]Jakub Jelinek3-27/+83
OpenMP Nesting of Regions restrictions say: - If a target update, target data, target enter data, or target exit data construct is encountered during execution of a target region, the behavior is unspecified. - If a target construct is encountered during execution of a target region and a device clause in which the ancestor device-modifier appears is not present on the construct, the behavior is unspecified. That wording is about the dynamic (runtime) behavior, not about lexical nesting, so while it is UB if omp target * is encountered in the target region, we need to make it compile and link (for lexical nesting of target * inside of target we actually emit a warning). To make this work, I had to do multiple changes. One was to mark .omp_data_{sizes,kinds}.* variables when static as "omp declare target". Another one was to add stub GOMP_target* entrypoints to nvptx and gcn libgomp.a. The entrypoint functions shouldn't be called or passed in the offload regions, otherwise libgomp: cuLaunchKernel error: too many resources requested for launch was reported; fixed by changing those arguments of calls to GOMP_target_ext to NULL. And we didn't mark the entrypoints "omp target entrypoint" when the caller has been "omp declare target". 2021-05-26 Jakub Jelinek <jakub@redhat.com> PR libgomp/100573 gcc/ * omp-low.c: Include omp-offload.h. (create_omp_child_function): If current_function_decl has "omp declare target" attribute and is_gimple_omp_offloaded, remove that attribute from the copy of attribute list and add "omp target entrypoint" attribute instead. (lower_omp_target): Mark .omp_data_sizes.* and .omp_data_kinds.* variables for offloading if in omp_maybe_offloaded_ctx. * omp-offload.c (pass_omp_target_link::execute): Nullify second argument to GOMP_target_data_ext in offloaded code. libgomp/ * config/nvptx/target.c (GOMP_target_ext, GOMP_target_data_ext, GOMP_target_end_data, GOMP_target_update_ext, GOMP_target_enter_exit_data): New dummy entrypoints. * config/gcn/target.c (GOMP_target_ext, GOMP_target_data_ext, GOMP_target_end_data, GOMP_target_update_ext, GOMP_target_enter_exit_data): Likewise. * testsuite/libgomp.c-c++-common/for-3.c (DO_PRAGMA, OMPTEAMS, OMPFROM, OMPTO): Define. (main): Remove #pragma omp target teams around all the tests. * testsuite/libgomp.c-c++-common/target-41.c: New test. * testsuite/libgomp.c-c++-common/target-42.c: New test.
2021-05-25openmp: Fix reduction clause handling on teams distribute simd [PR99928]Jakub Jelinek1-0/+16
When a directive isn't combined with worksharing-loop, it takes much simpler clause splitting path for reduction, and that one was missing handling of teams when combined with simd. 2021-05-25 Jakub Jelinek <jakub@redhat.com> PR middle-end/99928 gcc/c-family/ * c-omp.c (c_omp_split_clauses): Copy reduction to teams when teams is combined with simd and not with taskloop or for. gcc/testsuite/ * c-c++-common/gomp/pr99928-8.c: Remove xfails from omp teams r21 and r28 checks. * c-c++-common/gomp/pr99928-9.c: Likewise. * c-c++-common/gomp/pr99928-10.c: Likewise. libgomp/ * testsuite/libgomp.c-c++-common/reduction-17.c: New test.
2021-05-24OpenMP/Fortran: Handle polymorphic scalars in data-sharing FIRSTPRIVATE ↵Tobias Burnus3-0/+991
[PR86470] gcc/fortran/ChangeLog: PR fortran/86470 * trans-expr.c (gfc_copy_class_to_class): Add unshare_expr. * trans-openmp.c (gfc_is_polymorphic_nonptr, gfc_is_unlimited_polymorphic_nonptr): New. (gfc_omp_clause_copy_ctor, gfc_omp_clause_dtor): Handle polymorphic scalars. libgomp/ChangeLog: PR fortran/86470 * testsuite/libgomp.fortran/class-firstprivate-1.f90: New test. * testsuite/libgomp.fortran/class-firstprivate-2.f90: New test. * testsuite/libgomp.fortran/class-firstprivate-3.f90: New test. gcc/testsuite/ChangeLog: PR fortran/86470 * gfortran.dg/gomp/class-firstprivate-1.f90: New test. * gfortran.dg/gomp/class-firstprivate-2.f90: New test. * gfortran.dg/gomp/class-firstprivate-3.f90: New test. * gfortran.dg/gomp/class-firstprivate-4.f90: New test.
2021-05-22[OpenACC privatization] Prune uninteresting/varying diagnostics in ↵Thomas Schwinge1-6/+3
'libgomp.oacc-fortran/privatized-ref-2.f90' Minor fix-up for my recent commit 11b8286a83289f5b54e813f14ff56d730c3f3185 "[OpenACC privatization] Largely extend diagnostics and corresponding testsuite coverage [PR90115]". libgomp/ PR testsuite/90115 * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Prune uninteresting/varying diagnostics. Reported-by: Sunil K Pandey <skpandey@sc.intel.com>
2021-05-21[OpenACC privatization] Reject 'static', 'external' in blocks [PR90115]Thomas Schwinge2-20/+8
gcc/ PR middle-end/90115 * omp-low.c (oacc_privatization_candidate_p): Reject 'static', 'external' in blocks. gcc/testsuite/ PR middle-end/90115 * c-c++-common/goacc/privatization-1-compute-loop.c: Update. * c-c++-common/goacc/privatization-1-compute.c: Likewise. * c-c++-common/goacc/privatization-1-routine_gang-loop.c: Likewise. * c-c++-common/goacc/privatization-1-routine_gang.c: Likewise. libgomp/ PR middle-end/90115 * testsuite/libgomp.oacc-c-c++-common/static-variable-1.c: Update. * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise.
2021-05-21[OpenACC privatization] Largely extend diagnostics and corresponding ↵Thomas Schwinge70-218/+1509
testsuite coverage [PR90115] gcc/ PR middle-end/90115 * flag-types.h (enum openacc_privatization): New. * params.opt (-param=openacc-privatization): New. * doc/invoke.texi (openacc-privatization): Document it. * omp-general.h (get_openacc_privatization_dump_flags): New function. * omp-low.c (oacc_privatization_candidate_p): Add diagnostics. * omp-offload.c (execute_oacc_device_lower) <IFN_UNIQUE_OACC_PRIVATE>: Re-work diagnostics. * target.def (goacc.adjust_private_decl): Add 'location_t' parameter. * doc/tm.texi: Regenerate. * config/gcn/gcn-protos.h (gcn_goacc_adjust_private_decl): Adjust. * config/gcn/gcn-tree.c (gcn_goacc_adjust_private_decl): Likewise. * config/nvptx/nvptx.c (nvptx_goacc_adjust_private_decl): Likewise. Preserve it for... (nvptx_goacc_expand_var_decl): ... use here. gcc/testsuite/ PR middle-end/90115 * c-c++-common/goacc/privatization-1-compute-loop.c: New file. * c-c++-common/goacc/privatization-1-compute.c: Likewise. * c-c++-common/goacc/privatization-1-routine_gang-loop.c: Likewise. * c-c++-common/goacc/privatization-1-routine_gang.c: Likewise. * gfortran.dg/goacc/privatization-1-compute-loop.f90: Likewise. * gfortran.dg/goacc/privatization-1-compute.f90: Likewise. * gfortran.dg/goacc/privatization-1-routine_gang-loop.f90: Likewise. * gfortran.dg/goacc/privatization-1-routine_gang.f90: Likewise. * c-c++-common/goacc-gomp/nesting-1.c: Update. * c-c++-common/goacc/private-reduction-1.c: Likewise. * gfortran.dg/goacc/private-3.f95: Likewise. libgomp/ PR middle-end/90115 * testsuite/libgomp.oacc-fortran/private-atomic-1-vector.f90: New file. * testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c: Update. * testsuite/libgomp.oacc-c-c++-common/host_data-7.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-6.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-6.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-7.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-g-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-g-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-v-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-v-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-w-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/parallel-reduction.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/private-atomic-1-gang.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/private-atomic-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/private-variables.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/static-variable-1.c: Likewise. * testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f: Likewise. * testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f: Likewise. * testsuite/libgomp.oacc-fortran/declare-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/host_data-5.F90: Likewise. * testsuite/libgomp.oacc-fortran/if-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-3.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-6.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-3.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-4.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-5.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-6.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-7.f90: Likewise. * testsuite/libgomp.oacc-fortran/optional-private.f90: Likewise. * testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise. * testsuite/libgomp.oacc-fortran/private-atomic-1-gang.f90: Likewise. * testsuite/libgomp.oacc-fortran/private-atomic-1-worker.f90: Likewise. * testsuite/libgomp.oacc-fortran/private-variables.f90: Likewise. * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/routine-7.f90: Likewise.
2021-05-21openacc: Add support for gang local storage allocation in shared memory ↵Julian Brown3-0/+95
[PR90115] This patch implements a method to track the "private-ness" of OpenACC variables declared in offload regions in gang-partitioned, worker-partitioned or vector-partitioned modes. Variables declared implicitly in scoped blocks and those declared "private" on enclosing directives (e.g. "acc parallel") are both handled. Variables that are e.g. gang-private can then be adjusted so they reside in GPU shared memory. The reason for doing this is twofold: correct implementation of OpenACC semantics, and optimisation, since shared memory might be faster than the main memory on a GPU. Handling of private variables is intimately tied to the execution model for gangs/workers/vectors implemented by a particular target: for current targets, we use (or on mainline, will soon use) a broadcasting/neutering scheme. That is sufficient for code that e.g. sets a variable in worker-single mode and expects to use the value in worker-partitioned mode. The difficulty (semantics-wise) comes when the user wants to do something like an atomic operation in worker-partitioned mode and expects a worker-single (gang private) variable to be shared across each partitioned worker. Forcing use of shared memory for such variables makes that work properly. In terms of implementation, the parallelism level of a given loop is not fixed until the oaccdevlow pass in the offload compiler, so the patch delays fixing the parallelism level of variables declared on or within such loops until the same point. This is done by adding a new internal UNIQUE function (OACC_PRIVATE) that lists (the address of) each private variable as an argument, and other arguments set so as to be able to determine the correct parallelism level to use for the listed variables. This new internal function fits into the existing scheme for demarcating OpenACC loops, as described in comments in the patch. Two new target hooks are introduced: TARGET_GOACC_ADJUST_PRIVATE_DECL and TARGET_GOACC_EXPAND_VAR_DECL. The first can tweak a variable declaration at oaccdevlow time, and the second at expand time. The first or both of these target hooks can be used by a given offload target, depending on its strategy for implementing private variables. This patch updates the TARGET_GOACC_ADJUST_PRIVATE_DECL target hook in the AMD GCN backend to the current name and prototype. (An earlier version of the hook was already present, but dormant.) gcc/ PR middle-end/90115 * doc/tm.texi.in (TARGET_GOACC_EXPAND_VAR_DECL) (TARGET_GOACC_ADJUST_PRIVATE_DECL): Add documentation hooks. * doc/tm.texi: Regenerate. * expr.c (expand_expr_real_1): Expand decls using the expand_var_decl OpenACC hook if defined. * internal-fn.c (expand_UNIQUE): Handle IFN_UNIQUE_OACC_PRIVATE. * internal-fn.h (IFN_UNIQUE_CODES): Add OACC_PRIVATE. * omp-low.c (omp_context): Add oacc_privatization_candidates field. (lower_oacc_reductions): Add PRIVATE_MARKER parameter. Insert before fork. (lower_oacc_head_tail): Add PRIVATE_MARKER parameter. Modify private marker's gimple call arguments, and pass it to lower_oacc_reductions. (oacc_privatization_scan_clause_chain) (oacc_privatization_scan_decl_chain, lower_oacc_private_marker): New functions. (lower_omp_for, lower_omp_target, lower_omp_1): Use these. * omp-offload.c (convert.h): Include. (oacc_loop_xform_head_tail): Treat private-variable markers like fork/join when transforming head/tail sequences. (struct var_decl_rewrite_info): Add struct. (oacc_rewrite_var_decl, is_sync_builtin_call): New functions. (execute_oacc_device_lower): Support rewriting gang-private variables using target hook, and fix up addr_expr and var_decl nodes afterwards. * target.def (adjust_private_decl, expand_var_decl): New hooks. * config/gcn/gcn-protos.h (gcn_goacc_adjust_gangprivate_decl): Rename to... (gcn_goacc_adjust_private_decl): ...this. * config/gcn/gcn-tree.c (gcn_goacc_adjust_gangprivate_decl): Rename to... (gcn_goacc_adjust_private_decl): ...this. Add LEVEL parameter. * config/gcn/gcn.c (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename definition using gcn_goacc_adjust_gangprivate_decl... (TARGET_GOACC_ADJUST_PRIVATE_DECL): ...to this, using gcn_goacc_adjust_private_decl. * config/nvptx/nvptx.c (tree-pretty-print.h): Include. (gang_private_shared_size): New global variable. (gang_private_shared_align): Likewise. (gang_private_shared_sym): Likewise. (gang_private_shared_hmap): Likewise. (nvptx_option_override): Initialize these. (nvptx_file_end): Output gang_private_shared_sym. (nvptx_goacc_adjust_private_decl, nvptx_goacc_expand_var_decl): New functions. (nvptx_set_current_function): Clear gang_private_shared_hmap. (TARGET_GOACC_ADJUST_PRIVATE_DECL): Define hook. (TARGET_GOACC_EXPAND_VAR_DECL): Likewise. libgomp/ PR middle-end/90115 * testsuite/libgomp.oacc-c-c++-common/private-atomic-1-gang.c: New test. * testsuite/libgomp.oacc-fortran/private-atomic-1-gang.f90: Likewise. * testsuite/libgomp.oacc-fortran/private-atomic-1-worker.f90: Likewise. Co-Authored-By: Chung-Lin Tang <cltang@codesourcery.com> Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2021-05-21Don't skip 'libgomp.oacc-fortran/privatized-ref-2.f90' for nvptx offloadingThomas Schwinge1-1/+11
libgomp/ * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Don't skip for nvptx offloading.
2021-05-21Add 'libgomp.oacc-fortran/privatized-ref-2.f90'Tobias Burnus1-0/+101
libgomp/ * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: New.
2021-05-19Add 'libgomp.oacc-c-c++-common/private-atomic-1.c' [PR83812]Thomas Schwinge1-0/+37
... to at least document/test/XFAIL nvptx offloading: PR83812 "operation not supported on global/shared address space". libgomp/ PR target/83812 * testsuite/libgomp.oacc-c-c++-common/private-atomic-1.c: New.
2021-05-19Add 'libgomp.oacc-c-c++-common/loop-gwv-2.c'Julian Brown1-0/+95
libgomp/ * testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c: New.
2021-05-18[libgomp, testsuite] Don't shadow global 'offload_targets' variableThomas Schwinge1-3/+3
See local 'offload_targets' variable in 'libgomp/testsuite/lib/libgomp.exp:libgomp_check_effective_target_offload_target' vs. global 'libgomp/testsuite/libgomp-test-support.exp.in:offload_targets' variable. libgomp/ * testsuite/lib/libgomp.exp (check_effective_target_offload_target_nvptx): Don't shadow global 'offload_targets' variable.
2021-05-18'libgomp.c-c++-common/reduction-{5,6}.c': Restrict '-latomic' to nvptx ↵Thomas Schwinge2-2/+2
offloading compilation Fix-up for recent commit 33b647956caa977d1ae489f9baed9cef70b4f382 "OpenMP: Fix SIMT for complex/float reduction with && and ||"; see commit d42088e453042f4f8ba9190a7e29efd937ea2181 "Avoid -latomic for amdgcn offloading". libgomp/ * testsuite/libgomp.c-c++-common/reduction-5.c: Restrict '-latomic' to nvptx offloading compilation. * testsuite/libgomp.c-c++-common/reduction-6.c: Likewise.
2021-05-18'libgomp.c/target-44.c': Restrict '-latomic' to nvptx offloading compilationThomas Schwinge1-1/+1
Fix-up for recent commit f87990a2a8fc9e20d30462a0a4c9047582af0cd9 "[openmp, simt] Disable SIMT for user-defined reduction"; see commit d42088e453042f4f8ba9190a7e29efd937ea2181 "Avoid -latomic for amdgcn offloading". libgomp/ * testsuite/libgomp.c/target-44.c: Restrict '-latomic' to nvptx offloading compilation.
2021-05-17openmp: Notify team barrier of pending tasks in omp_fulfill_eventKwok Cheung Yeung1-0/+59
The team barrier should be notified of any new tasks that become runnable as the result of a completing task, otherwise the barrier threads might not resume processing available tasks, resulting in a hang. 2021-05-17 Kwok Cheung Yeung <kcy@codesourcery.com> libgomp/ * task.c (omp_fulfill_event): Call gomp_team_barrier_set_task_pending if new tasks generated. * testsuite/libgomp.c-c++-common/task-detach-13.c: New.
2021-05-14Fortran/OpenMP: Support 'omp parallel master'Tobias Burnus1-0/+14
gcc/fortran/ChangeLog: * dump-parse-tree.c (show_omp_node, show_code_node): Handle EXEC_OMP_PARALLEL_MASTER. * frontend-passes.c (gfc_code_walker): Likewise. * gfortran.h (enum gfc_statement): Add ST_OMP_PARALLEL_MASTER and ST_OMP_END_PARALLEL_MASTER. (enum gfc_exec_op): Add EXEC_OMP_PARALLEL_MASTER.. * match.h (gfc_match_omp_parallel_master): Handle it. * openmp.c (gfc_match_omp_parallel_master, resolve_omp_clauses, omp_code_to_statement, gfc_resolve_omp_directive): Likewise. * parse.c (decode_omp_directive, case_exec_markers, gfc_ascii_statement, parse_omp_structured_block, parse_executable): Likewise. * resolve.c (gfc_resolve_blocks, gfc_resolve_code): Likewise. * st.c (gfc_free_statement): Likewise. * trans-openmp.c (gfc_trans_omp_parallel_master, gfc_trans_omp_workshare, gfc_trans_omp_directive): Likewise. * trans.c (trans_code): Likewise. libgomp/ChangeLog: * testsuite/libgomp.fortran/parallel-master.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/parallel-master-1.f90: New test. * gfortran.dg/gomp/parallel-master-2.f90: New test.
2021-05-13testsuite: prune new LTO warningMartin Liska4-0/+4
libgomp/ChangeLog: PR testsuite/100569 * testsuite/libgomp.c/omp-nested-3.c: Prune new LTO warning. * testsuite/libgomp.c/pr46032-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/data-clauses-kernels-ipa-pta.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/data-clauses-parallel-ipa-pta.c: Likewise. gcc/testsuite/ChangeLog: PR testsuite/100569 * gcc.dg/atomic/c11-atomic-exec-2.c: Prune new LTO warning. * gcc.dg/torture/pr94947-1.c: Likewise.
2021-05-13OpenMP: detach - fix firstprivate handlingTobias Burnus2-0/+41
gcc/ChangeLog: * omp-low.c (finish_taskreg_scan): Use the proper detach decl. libgomp/ChangeLog: * testsuite/libgomp.c-c++-common/task-detach-12.c: New test. * testsuite/libgomp.fortran/task-detach-12.f90: New test.
2021-05-11openmp: Fix up taskloop reduction ICE if taskloop has no iterations [PR100471]Jakub Jelinek1-0/+21
When a taskloop doesn't have any iterations, GOMP_taskloop* takes an early return, doesn't create any tasks and more importantly, doesn't create a taskgroup and doesn't register task reductions. But, the code emitted in the callers assumes task reductions have been registered and performs the reduction handling and task reduction unregistration. The pointer to the task reduction private variables is reused, on input it is the alignment and only on output it is the pointer, so in the case taskloop with no iterations the caller attempts to dereference the alignment value as if it was a pointer and crashes. We could in the early returns register the task reductions only to have them looped over and unregistered in the caller, but I think it is better to tell the caller there is nothing to task reduce and bypass all that. 2021-05-11 Jakub Jelinek <jakub@redhat.com> PR middle-end/100471 * omp-low.c (lower_omp_task_reductions): For OMP_TASKLOOP, if data is 0, bypass the reduction loop including GOMP_taskgroup_reduction_unregister call. * taskloop.c (GOMP_taskloop): If GOMP_TASK_FLAG_REDUCTION and not GOMP_TASK_FLAG_NOGROUP, when doing early return clear the task reduction pointer. * testsuite/libgomp.c/task-reduction-4.c: New test.
2021-05-07OpenMP: Fix SIMT for complex/float reduction with && and ||Tobias Burnus2-0/+389
2021-05-07 Tobias Burnus <tobias@codesourcery.com> Tom de Vries <tdevries@suse.de> gcc/ChangeLog: * omp-low.c (lower_rec_simd_input_clauses): Set max_vf = 1 if a truth_value_p reduction variable is nonintegral. libgomp/ChangeLog: * testsuite/libgomp.c-c++-common/reduction-5.c: New test, testing complex/floating-point || + && reduction with 'omp target'. * testsuite/libgomp.c-c++-common/reduction-6.c: Likewise.
2021-05-04OpenMP: Support complex/float in && and || reductionTobias Burnus4-0/+770
C/C++ permit logical AND and logical OR also with floating-point or complex arguments by doing an unequal zero comparison; the result is an 'int' with value one or zero. Hence, those are also permitted as reduction variable, even though it is not the most sensible thing to do. gcc/c/ChangeLog: * c-typeck.c (c_finish_omp_clauses): Accept float + complex for || and && reductions. gcc/cp/ChangeLog: * semantics.c (finish_omp_reduction_clause): Accept float + complex for || and && reductions. gcc/ChangeLog: * omp-low.c (lower_rec_input_clauses, lower_reduction_clauses): Handle && and || with floating-point and complex arguments. gcc/testsuite/ChangeLog: * gcc.dg/gomp/clause-1.c: Use 'reduction(&:..)' instead of '...(&&:..)'. libgomp/ChangeLog: * testsuite/libgomp.c-c++-common/reduction-1.c: New test. * testsuite/libgomp.c-c++-common/reduction-2.c: New test. * testsuite/libgomp.c-c++-common/reduction-3.c: New test.
2021-05-04OpenMP/Fortran - fix pasto + testcase in depobj [PR100397]Tobias Burnus1-6/+9
gcc/fortran/ChangeLog: PR testsuite/100397 * trans-openmp.c (gfc_trans_omp_depobj): Fix pasto in enum values. libgomp/ChangeLog: PR testsuite/100397 * testsuite/libgomp.fortran/depobj-1.f90 (dep2, dep3): Move var declaration to scope of non-'depend'-guarded assignment to avoid races.
2021-05-03[openmp, simt] Disable SIMT for user-defined reductionTom de Vries1-0/+27
The test-case included in this patch contains this target region: ... for (int i0 = 0 ; i0 < N0 ; i0++ ) counter_N0.i += 1; ... When running with nvptx accelerator, the counter variable is expected to be N0 after the region, but instead is N0 / 32. The problem is that rather than getting the result for all warp lanes, we get it for just one lane. This is caused by the implementation of SIMT being incomplete. It handles regular reductions, but appearantly not user-defined reductions. For now, handle this by disabling SIMT in this case, specifically by setting sctx->max_vf to 1. Tested libgomp on x86_64-linux with nvptx accelerator. gcc/ChangeLog: 2021-05-03 Tom de Vries <tdevries@suse.de> PR target/100321 * omp-low.c (lower_rec_input_clauses): Disable SIMT for user-defined reduction. libgomp/ChangeLog: 2021-05-03 Tom de Vries <tdevries@suse.de> PR target/100321 * testsuite/libgomp.c/target-44.c: New test.
2021-04-30modulo-sched: skip loops with strange register defs [PR100225]Roman Zhuykov1-0/+1627
PR84878 fix adds an assertion which can fail, e.g. when stack pointer is adjusted inside the loop. We have to prevent it and search earlier for any 'strange' instruction. The solution is to skip the whole loop if using 'note_stores' we found that one of hard registers is in 'df->regular_block_artificial_uses' set. Also patch properly prohibit not single-set instruction in loop body. gcc/ChangeLog: PR rtl-optimization/100225 PR rtl-optimization/84878 * modulo-sched.c (sms_schedule): Use note_stores to skip loops where we have an instruction which touches (writes) any hard register from df->regular_block_artificial_uses set. Allow not-single-set instruction only right before basic block tail. gcc/testsuite/ChangeLog: PR rtl-optimization/100225 PR rtl-optimization/84878 * gcc.dg/pr100225.c: New test. libgomp/ChangeLog: * testsuite/libgomp.oacc-c-c++-common/atomic_capture-3.c: New test.
2021-04-29[omp, simt] Handle alternative IVTom de Vries1-0/+48
Consider the test-case libgomp.c/pr81778.c added in this commit, with this core loop (note: CANARY_SIZE set to 0 for simplicity): ... int s = 1; #pragma omp target simd for (int i = N - 1; i > -1; i -= s) a[i] = 1; ... which, given that N is 32, sets a[0..31] to 1. After omp-expand, this looks like: ... <bb 5> : simduid.7 = .GOMP_SIMT_ENTER (simduid.7); .omp_simt.8 = .GOMP_SIMT_ENTER_ALLOC (simduid.7); D.3193 = -s; s.9 = s; D.3204 = .GOMP_SIMT_LANE (); D.3205 = -s.9; D.3206 = (int) D.3204; D.3207 = D.3205 * D.3206; i = D.3207 + 31; D.3209 = 0; D.3210 = -s.9; D.3211 = D.3210 - i; D.3210 = -s.9; D.3212 = D.3211 / D.3210; D.3213 = (unsigned int) D.3212; D.3213 = i >= 0 ? D.3213 : 0; <bb 19> : if (D.3209 < D.3213) goto <bb 6>; [87.50%] else goto <bb 7>; [12.50%] <bb 6> : a[i] = 1; D.3215 = -s.9; D.3219 = .GOMP_SIMT_VF (); D.3216 = (int) D.3219; D.3220 = D.3215 * D.3216; i = D.3220 + i; D.3209 = D.3209 + 1; goto <bb 19>; [100.00%] ... On nvptx, the first time bb6 is executed, i is in the 0..31 range (depending on the lane that is executing) at bb entry. So we have the following sequence: - a[0..31] is set to 1 - i is updated to -32..-1 - D.3209 is updated to 1 (being 0 initially) - bb19 is executed, and if condition (D.3209 < D.3213) == (1 < 32) evaluates to true - bb6 is once more executed, which should not happen because all the elements that needed to be handled were already handled. - consequently, elements that should not be written are written - with CANARY_SIZE == 0, we may run into a libgomp error: ... libgomp: cuCtxSynchronize error: an illegal memory access was encountered ... and with CANARY_SIZE unmodified, we run into: ... Expected 0, got 1 at base[-961] Aborted (core dumped) ... The cause of this is as follows: - because the step s is a variable rather than a constant, an alternative IV (D.3209 in our example) is generated in expand_omp_simd, and the loop condition is tested in terms of the alternative IV rather than the original IV (i in our example). - the SIMT code in expand_omp_simd works by modifying step and initial value. - The initial value fd->loop.n1 is loaded into a variable n1, which is modified by the SIMT code and then used there-after. - The step fd->loop.step is loaded into a variable step, which is modified by the SIMT code, but afterwards there are uses of both step and fd->loop.step. - There are uses of fd->loop.step in the alternative IV handling code, which should use step instead. Fix this by introducing an additional variable orig_step, which is not modified by the SIMT code and replacing all remaining uses of fd->loop.step by either step or orig_step. Build on x86_64-linux with nvptx accelerator, tested libgomp. This fixes for-5.c and for-6.c FAILs I'm currently seeing on a quadro m1200 with driver 450.66. gcc/ChangeLog: 2020-10-02 Tom de Vries <tdevries@suse.de> * omp-expand.c (expand_omp_simd): Add step_orig, and replace uses of fd->loop.step by either step or orig_step. libgomp/ChangeLog: 2020-10-02 Tom de Vries <tdevries@suse.de> * testsuite/libgomp.c/pr81778.c: New test.
2021-04-29[omp, simt] Fix expand_GOMP_SIMT_*Tom de Vries1-0/+24
When running the test-case included in this patch using an nvptx accelerator, it fails in execution. The problem is that the expansion of GOMP_SIMT_XCHG_BFLY is optimized away during pass_jump as "trivially dead insns". This is caused by this code in expand_GOMP_SIMT_XCHG_BFLY: ... class expand_operand ops[3]; create_output_operand (&ops[0], target, mode); ... expand_insn (targetm.code_for_omp_simt_xchg_bfly, 3, ops); ... which doesn't guarantee that target is assigned to by the expanded insn. F.i., if target is: ... (gdb) call debug_rtx ( target ) (subreg/s/u:QI (reg:SI 40 [ _61 ]) 0) ... then after expand_insn, we have: ... (gdb) call debug_rtx ( ops[0].value ) (reg:QI 57) ... See commit 3af3bec2e4d "internal-fn: Avoid dropping the lhs of some calls [PR94941]" for a similar problem. Fix this in the same way, by adding: ... if (!rtx_equal_p (target, ops[0].value)) emit_move_insn (target, ops[0].value); ... where applicable in the expand_GOMP_SIMT_* functions. Tested libgomp on x86_64 with nvptx accelerator. gcc/ChangeLog: 2021-04-28 Tom de Vries <tdevries@suse.de> PR target/100232 * internal-fn.c (expand_GOMP_SIMT_ENTER_ALLOC) (expand_GOMP_SIMT_LAST_LANE, expand_GOMP_SIMT_ORDERED_PRED) (expand_GOMP_SIMT_VOTE_ANY, expand_GOMP_SIMT_XCHG_BFLY) (expand_GOMP_SIMT_XCHG_IDX): Ensure target is assigned to.
2021-04-26OpenACC: Fix pattern in dg-bogus in Fortran testcases againTobias Burnus4-9/+9
It turned out that a compiler built without offloading support and one with can produce slightly different diagnostic. Offloading support implies ENABLE_OFFLOAD which implies that g->have_offload is set when offloading is actually needed. In cgraphunit.c, the latter causes flag_generate_offload = 1, which in turn affects tree.c's free_lang_data. The result is that the front-end specific diagnostic gets reset ('tree_diagnostics_defaults (global_dc)'), which affects in this case 'Warning' vs. 'warning' via the Fortran frontend. Result: 'Warning:' vs. 'warning:'. Side note: Other FE also override the diagnostic, leading to similar differences, e.g. the C++ FE outputs mangled function names differently, cf. patch thread. libgomp/ChangeLog: * testsuite/libgomp.oacc-fortran/par-reduction-2-1.f: Use [Ww]arning in dg-bogus as FE diagnostic and default diagnostic differ and the result depends on ENABLE_OFFLOAD. * testsuite/libgomp.oacc-fortran/par-reduction-2-2.f: Likewise. * testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise. * testsuite/libgomp.oacc-fortran/parallel-reduction.f90: Likewise. gcc/testsuite/ChangeLog: * gfortran.dg/goacc/classify-serial.f95: Use [Ww]arning in dg-bogus as FE diagnostic and default diagnostic differ and the result depends on ENABLE_OFFLOAD. * gfortran.dg/goacc/kernels-decompose-2.f95: Likewise. * gfortran.dg/goacc/routine-module-mod-1.f90: Likewise.
2021-04-26OpenACC: Fix pattern in dg-bogus in Fortran testcasesTobias Burnus4-9/+9
libgomp/ChangeLog: * testsuite/libgomp.oacc-fortran/par-reduction-2-1.f: Correct spelling in dg-bogus to match -Wopenacc-parallelism. * testsuite/libgomp.oacc-fortran/par-reduction-2-2.f: Likewise. * testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise. * testsuite/libgomp.oacc-fortran/parallel-reduction.f90: Likewise. gcc/testsuite/ChangeLog: * gfortran.dg/goacc/classify-serial.f95: Correct spelling in dg-bogus to match -Wopenacc-parallelism. * gfortran.dg/goacc/kernels-decompose-2.f95: Likewise. * gfortran.dg/goacc/routine-module-mod-1.f90: Likewise.
2021-04-26Add '-Wopenacc-parallelism'Thomas Schwinge29-2/+209
... to diagnose potentially suboptimal choices regarding OpenACC parallelism. Not enabled by default: too noisy ("*potentially* suboptimal choices"); see XFAILed 'dg-bogus'es. gcc/c-family/ * c.opt (Wopenacc-parallelism): New. gcc/fortran/ * lang.opt (Wopenacc-parallelism): New. gcc/ * omp-offload.c (oacc_validate_dims): Implement '-Wopenacc-parallelism'. * doc/invoke.texi (-Wopenacc-parallelism): Document. gcc/testsuite/ * c-c++-common/goacc/diag-parallelism-1.c: New. * c-c++-common/goacc/acc-icf.c: Specify '-Wopenacc-parallelism', and match diagnostics, as appropriate. * c-c++-common/goacc/classify-kernels-unparallelized.c: Likewise. * c-c++-common/goacc/classify-kernels.c: Likewise. * c-c++-common/goacc/classify-parallel.c: Likewise. * c-c++-common/goacc/classify-routine.c: Likewise. * c-c++-common/goacc/classify-serial.c: Likewise. * c-c++-common/goacc/kernels-decompose-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-2.c: Likewise. * c-c++-common/goacc/parallel-dims-1.c: Likewise. * c-c++-common/goacc/parallel-reduction.c: Likewise. * c-c++-common/goacc/pr70688.c: Likewise. * c-c++-common/goacc/routine-1.c: Likewise. * c-c++-common/goacc/routine-level-of-parallelism-2.c: Likewise. * c-c++-common/goacc/uninit-dim-clause.c: Likewise. * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise. * gfortran.dg/goacc/classify-kernels.f95: Likewise. * gfortran.dg/goacc/classify-parallel.f95: Likewise. * gfortran.dg/goacc/classify-routine.f95: Likewise. * gfortran.dg/goacc/classify-serial.f95: Likewise. * gfortran.dg/goacc/kernels-decompose-1.f95: Likewise. * gfortran.dg/goacc/kernels-decompose-2.f95: Likewise. * gfortran.dg/goacc/parallel-tree.f95: Likewise. * gfortran.dg/goacc/routine-4.f90: Likewise. * gfortran.dg/goacc/routine-level-of-parallelism-1.f90: Likewise. * gfortran.dg/goacc/routine-module-mod-1.f90: Likewise. * gfortran.dg/goacc/routine-multiple-directives-1.f90: Likewise. * gfortran.dg/goacc/uninit-dim-clause.f95: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c: Specify '-Wopenacc-parallelism', and match diagnostics, as appropriate. * testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-w-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/mode-transitions.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/par-reduction-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/par-reduction-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/parallel-reduction.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/pr85381-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/private-variables.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/reduction-7.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-g-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-w-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-wv-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/static-variable-1.c: Likewise. * testsuite/libgomp.oacc-fortran/optional-private.f90: Likewise. * testsuite/libgomp.oacc-fortran/par-reduction-2-1.f: Likewise. * testsuite/libgomp.oacc-fortran/par-reduction-2-2.f: Likewise. * testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise. * testsuite/libgomp.oacc-fortran/parallel-reduction.f90: Likewise. * testsuite/libgomp.oacc-fortran/pr84028.f90: Likewise. * testsuite/libgomp.oacc-fortran/private-variables.f90: Likewise. * testsuite/libgomp.oacc-fortran/reduction-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/reduction-5.f90: Likewise. * testsuite/libgomp.oacc-fortran/reduction-6.f90: Likewise. * testsuite/libgomp.oacc-fortran/routine-7.f90: Likewise. Co-Authored-By: Nathan Sidwell <nathan@codesourcery.com> Co-Authored-By: Tom de Vries <vries@codesourcery.com> Co-Authored-By: Julian Brown <julian@codesourcery.com> Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>
2021-04-26[OpenACC] Don't compile libgomp testcases with '-w'Thomas Schwinge10-12/+1
We'd like to actually catch compiler diagnostics (and currently there aren't any). libgomp/ * testsuite/libgomp.oacc-c-c++-common/par-reduction-1.c: Don't compile with '-w'. * testsuite/libgomp.oacc-c-c++-common/par-reduction-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/parallel-reduction.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/reduction-6.c: Likewise. * testsuite/libgomp.oacc-fortran/parallel-reduction.f90: Likewise. * testsuite/libgomp.oacc-fortran/reduction-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/reduction-5.f90: Likewise. * testsuite/libgomp.oacc-fortran/reduction-6.f90: Likewise. * testsuite/libgomp.oacc-fortran/reduction-7.f90: Likewise.
2021-04-22Avoid -latomic for amdgcn offloadingRichard Biener1-1/+1
libatomic isn't built for amdgcn but reduction-16.c adds it via -foffload=-latomic when offloading for nvptx is enabled. The following avoids linker errors when offloading to amdgcn is enabled as well. 2021-04-21 Richard Biener <rguenther@suse.de> libgomp/ * testsuite/libgomp.c-c++-common/reduction-16.c: Use -latomic only on nvptx-none.
2021-04-21libgomp.fortran/depobj-1.f90: Fix omp_depend_kindTobias Burnus1-2/+1
libgomp/ * testsuite/libgomp.fortran/depobj-1.f90: Use omp_lib's omp_depend_kind instead of defining it as 16.
2021-04-21libgomp/testsuite: Fix checks for dg-excess-errorsTobias Burnus5-11/+23
For the tests modified below, the effective target line has to be effective when compiling for an offload target, except that variable-not-offloaded.c would compile with unified-share memory and pr86416-*.c if long double/float128 is supported. The previous check used a run-time device ability check. This new variant now enables those dg- lines when _compiling_ for nvptx or gcn. libgomp/ChangeLog: * testsuite/lib/libgomp.exp (offload_target_to_openacc_device_type): New, based on check_effective_target_offload_target_nvptx. (check_effective_target_offload_target_nvptx): Call it. (check_effective_target_offload_target_amdgcn): New. * testsuite/libgomp.c-c++-common/function-not-offloaded.c: Require target offload_target_nvptx || offload_target_amdgcn. * testsuite/libgomp.c-c++-common/variable-not-offloaded.c: Likewise. * testsuite/libgomp.c/pr86416-1.c: Likewise. * testsuite/libgomp.c/pr86416-2.c: Likewise.
2021-04-21Fortran/OpenMP: Add 'omp depobj' and 'depend(mutexinoutset:'Tobias Burnus1-0/+113
gcc/fortran/ChangeLog: * dump-parse-tree.c (show_omp_namelist): Handle depobj + mutexinoutset in the depend clause. (show_omp_clauses, show_omp_node, show_code_node): Handle depobj. * gfortran.h (enum gfc_statement): Add ST_OMP_DEPOBJ. (enum gfc_omp_depend_op): Add OMP_DEPEND_UNSET, OMP_DEPEND_MUTEXINOUTSET and OMP_DEPEND_DEPOBJ. (gfc_omp_clauses): Add destroy, depobj_update and depobj. (enum gfc_exec_op): Add EXEC_OMP_DEPOBJ * match.h (gfc_match_omp_depobj): Match 'omp depobj'. * openmp.c (gfc_match_omp_clauses): Add depobj + mutexinoutset to depend clause. (gfc_match_omp_depobj, resolve_omp_clauses, gfc_resolve_omp_directive): Handle 'omp depobj'. * parse.c (decode_omp_directive, next_statement, gfc_ascii_statement): Likewise. * resolve.c (gfc_resolve_code): Likewise. * st.c (gfc_free_statement): Likewise. * trans-openmp.c (gfc_trans_omp_clauses): Handle depobj + mutexinoutset in the depend clause. (gfc_trans_omp_depobj, gfc_trans_omp_directive): Handle EXEC_OMP_DEPOBJ. * trans.c (trans_code): Likewise. libgomp/ChangeLog: * testsuite/libgomp.fortran/depobj-1.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/depobj-1.f90: New test. * gfortran.dg/gomp/depobj-2.f90: New test.
2021-04-19[OpenACC 'kernels'] '-fopenacc-kernels=[...]' -> '--param=openacc-kernels=[...]'Thomas Schwinge4-4/+4
This configuration knob is temporary, and isn't really meant to be exposed to users. gcc/ * params.opt (-param=openacc-kernels=): Add. * omp-oacc-kernels-decompose.cc (pass_omp_oacc_kernels_decompose::gate): Use it. * doc/invoke.texi (-fopenacc-kernels=@var{mode}): Move... (--param): ... here, 'openacc-kernels'. gcc/c-family/ * c.opt (fopenacc-kernels=): Remove. gcc/fortran/ * lang.opt (fopenacc-kernels=): Remove. gcc/testsuite/ * c-c++-common/goacc/if-clause-2.c: '-fopenacc-kernels=[...]' -> '--param=openacc-kernels=[...]'. * c-c++-common/goacc/kernels-decompose-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-2.c: Likewise. * c-c++-common/goacc/kernels-decompose-ice-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-ice-2.c: Likewise. * gfortran.dg/goacc/kernels-decompose-1.f95: Likewise. * gfortran.dg/goacc/kernels-decompose-2.f95: Likewise. * gfortran.dg/goacc/kernels-tree.f95: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose-ice-1.c: '-fopenacc-kernels=[...]' -> '--param=openacc-kernels=[...]'. * testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Likewise. * testsuite/libgomp.oacc-fortran/pr94358-1.f90: Likewise.
2021-04-15XFAIL OpenMP/nvptx execution-time hangs for simple nested OpenMP ↵Thomas Schwinge4-3/+22
'target'/'parallel'/'task' constructs [PR99555] ... still awaiting proper resolution, of course. libgomp/ PR target/99555 * testsuite/lib/libgomp.exp (check_effective_target_offload_device_nvptx): New. * testsuite/libgomp.c/pr99555-1.c <nvptx offload device>: Until resolved, make sure that we exit quickly, with error status, XFAILed. * testsuite/libgomp.c-c++-common/task-detach-6.c: Likewise. * testsuite/libgomp.fortran/task-detach-6.f90: Likewise.
2021-04-14testsuite: Fix up libgomp.fortran/alloc-1.F90 testcase [PR100071]Jakub Jelinek1-1/+2
As can be seen under valgrind, the testcase didn't bind in the last part the fortran pointers properly to the c pointers. 2021-04-14 Jakub Jelinek <jakub@redhat.com> PR testsuite/100071 * testsuite/libgomp.fortran/alloc-1.F90: Call c_f_pointer after last cp = omp_alloc with cp, p arguments instead of cq, q and call c_f_pointer after last cq = omp_alloc with cq, q.
2021-04-11[OpenACC] Fix an ICE where a loop with GT condition is collapsed.Hafiz Abid Qadeer2-4/+28
We have seen an ICE both on trunk and devel/omp/gcc-10 branches which can be reprodued with this simple testcase. It occurs if an OpenACC loop has a collapse clause and any of the loop being collapsed uses GT or GE condition. This issue is specific to OpenACC. int main (void) { int ix, iy; int dim_x = 16, dim_y = 16; { for (iy = dim_y - 1; iy > 0; --iy) for (ix = dim_x - 1; ix > 0; --ix) ; } } The problem is caused by a failing assertion in expand_oacc_collapse_init. It checks that cond_code for fd->loop should be same as cond_code for all the loops that are being collapsed. As the cond_code for fd->loop is LT_EXPR with collapse clause (set at the end of omp_extract_for_data), this assertion forces that all the loop in collapse clause should use < operator. There does not seem to be anything in the code which demands this condition as loop with > condition works ok otherwise. I digged old mailing list a bit but could not find any discussion on this change. Looking at the code, expand_oacc_for checks that fd->loop->cond_code is either LT_EXPR or GT_EXPR. I guess the original intention was to have similar checks on the loop which are being collapsed. But the way check was written does not acheive that. I have fixed it by modifying the check in the assertion to be same as check on fd->loop->cond_code. I tested goacc and libgomp (with nvptx offloading) and did not see any regression. I have added new tests to check collapse with GT/GE condition. PR middle-end/98088 gcc/ * omp-expand.c (expand_oacc_collapse_init): Update condition in a gcc_assert. gcc/testsuite/ * c-c++-common/goacc/collapse-2.c: New. libgomp/ * testsuite/libgomp.oacc-c-c++-common/collapse-2.c: Add check for loop with GT/GE condition. * testsuite/libgomp.oacc-c-c++-common/collapse-3.c: Likewise.
2021-04-09Add 'libgomp.oacc-c-c++-common/static-variable-1.c' [PR84991, PR84992, PR90779]Thomas Schwinge1-0/+460
libgomp/ PR middle-end/84991 PR middle-end/84992 PR middle-end/90779 * testsuite/libgomp.oacc-c-c++-common/static-variable-1.c: New.
2021-03-29libgomp: Fix on_device_arch.c aux-file handling [PR99555]Tobias Burnus5-6/+6
libgomp/ChangeLog: PR target/99555 * testsuite/lib/on_device_arch.c: Move to ... * testsuite/libgomp.c-c++-common/on_device_arch.h: ... here. * testsuite/libgomp.fortran/on_device_arch.c: New file; #include on_device_arch.h. * testsuite/libgomp.c-c++-common/task-detach-6.c: #include on_device_arch.h instead of using dg-additional-source. * testsuite/libgomp.c/pr99555-1.c: Likewise. * testsuite/libgomp.fortran/task-detach-6.f90: Update to use on_device_arch.c without relative paths.
2021-03-25Avoid OpenMP/nvptx execution-time hangs for simple nested OpenMP ↵Thomas Schwinge4-0/+69
'target'/'parallel'/'task' constructs [PR99555] ... awaiting proper resolution, of course. libgomp/ PR target/99555 * testsuite/lib/on_device_arch.c: New file. * testsuite/libgomp.c/pr99555-1.c: Likewise. * testsuite/libgomp.c-c++-common/task-detach-6.c: Until resolved, skip for nvptx offloading, with error status. * testsuite/libgomp.fortran/task-detach-6.f90: Likewise.
2021-03-25'libgomp.oacc-fortran/derivedtypes-arrays-1.f90' OpenACC 'serial' construct ↵Thomas Schwinge1-0/+1
diagnostic for nvptx offloading Fixup for recent commit d28f3da11d8c0aed9b746689d723022a9b5ec04c "openacc: Fix lowering for derived-type mappings through array elements". With nvptx offloading we see the usual: [...]/libgomp.oacc-fortran/derivedtypes-arrays-1.f90: In function 'MAIN__._omp_fn.0': [...]/libgomp.oacc-fortran/derivedtypes-arrays-1.f90:90:40: warning: using vector_length (32), ignoring 1 libgomp/ * testsuite/libgomp.oacc-fortran/derivedtypes-arrays-1.f90: OpenACC 'serial' construct diagnostic for nvptx offloading.
2021-03-15OpenMP: Fix 'omp declare target' handling for vars [PR99509]Tobias Burnus1-0/+22
For variables with 'declare target' attribute, varpool_node::get_create marks variables as offload; however, if the node already exists, it is not updated. C/C++ may tag decl with 'declare target implicit', which may only be after varpool creation turned into 'declare target' or 'declare target link'; in this case, the tagging has to happen in the FE. gcc/c/ChangeLog: PR c++/99509 * c-decl.c (finish_decl): For 'omp declare target implicit' vars, ensure that the varpool node is marked as offloadable. gcc/cp/ChangeLog: PR c++/99509 * decl.c (cp_finish_decl): For 'omp declare target implicit' vars, ensure that the varpool node is marked as offloadable. libgomp/ChangeLog: PR c++/99509 * testsuite/libgomp.c-c++-common/declare_target-1.c: New test.
2021-03-12Fortran/OpenMP: Fix use_device_{ptr,addr} with assumed-size array [PR98858]Tobias Burnus1-0/+91
gcc/ChangeLog: PR fortran/98858 * gimplify.c (omp_add_variable): Handle NULL_TREE as size occuring for assumed-size arrays in use_device_{ptr,addr}. libgomp/ChangeLog: PR fortran/98858 * testsuite/libgomp.fortran/use_device_ptr-3.f90: New test.
2021-02-27libgomp, testsuite : Require alias support for PR96390 testcase.Iain Sandoe1-0/+1
This fails everywhere on Darwin, which does not have support for symbol aliases. Add a dg-require-alias to UNSUPPORT it. libgomp/ChangeLog: * testsuite/libgomp.c-c++-common/pr96390.c: Require alias support from the target.
2021-02-25openmp: Fix intermittent hanging of task-detach-6 libgomp tests [PR98738]Kwok Cheung Yeung22-42/+416
This adds support for the task detach clause to taskwait and taskgroup, and simplifies the handling of the detach clause by moving most of the extra handling required for detach tasks to omp_fulfill_event. 2021-02-25 Kwok Cheung Yeung <kcy@codesourcery.com> Jakub Jelinek <jakub@redhat.com> libgomp/ PR libgomp/98738 * libgomp.h (enum gomp_task_kind): Add GOMP_TASK_DETACHED. (struct gomp_task): Replace detach and completion_sem fields with union containing completion_sem and detach_team. Add deferred_p field. (struct gomp_team): Remove task_detach_queue. * task.c: Include assert.h. (gomp_init_task): Initialize deferred_p and completion_sem fields. Rearrange initialization order of fields. (task_fulfilled_p): Delete. (GOMP_task): Use address of task as the event handle. Remove initialization of detach field. Initialize deferred_p field. Use automatic local for completion_sem. Initialize detach_team field for deferred tasks. (gomp_barrier_handle_tasks): Remove handling of task_detach_queue. Set kind of suspended detach task to GOMP_TASK_DETACHED and decrement task_running_count. Move finish_cancelled block out of else branch. Relocate call to gomp_team_barrier_done. (GOMP_taskwait): Handle tasks with completion events that have not been fulfilled. (GOMP_taskgroup_end): Likewise. (omp_fulfill_event): Use address of task as event handle. Post to completion_sem for undeferred tasks. Clear detach_team if task has not finished. For finished tasks, handle post-execution tasks, call gomp_team_barrier_wake if necessary, and free task. * team.c (gomp_new_team): Remove initialization of task_detach_queue. (free_team): Remove free of task_detach_queue. * testsuite/libgomp.c-c++-common/task-detach-1.c: Fix formatting. * testsuite/libgomp.c-c++-common/task-detach-2.c: Fix formatting. * testsuite/libgomp.c-c++-common/task-detach-3.c: Fix formatting. * testsuite/libgomp.c-c++-common/task-detach-4.c: Fix formatting. * testsuite/libgomp.c-c++-common/task-detach-5.c: Fix formatting. Change data-sharing of detach events on enclosing parallel to private. * testsuite/libgomp.c-c++-common/task-detach-6.c: Likewise. Remove taskwait directive. * testsuite/libgomp.c-c++-common/task-detach-7.c: New. * testsuite/libgomp.c-c++-common/task-detach-8.c: New. * testsuite/libgomp.c-c++-common/task-detach-9.c: New. * testsuite/libgomp.c-c++-common/task-detach-10.c: New. * testsuite/libgomp.c-c++-common/task-detach-11.c: New. * testsuite/libgomp.fortran/task-detach-1.f90: Fix formatting. * testsuite/libgomp.fortran/task-detach-2.f90: Fix formatting. * testsuite/libgomp.fortran/task-detach-3.f90: Fix formatting. * testsuite/libgomp.fortran/task-detach-4.f90: Fix formatting. * testsuite/libgomp.fortran/task-detach-5.f90: Fix formatting. Change data-sharing of detach events on enclosing parallel to private. * testsuite/libgomp.fortran/task-detach-6.f90: Likewise. Remove taskwait directive. * testsuite/libgomp.fortran/task-detach-7.f90: New. * testsuite/libgomp.fortran/task-detach-8.f90: New. * testsuite/libgomp.fortran/task-detach-9.f90: New. * testsuite/libgomp.fortran/task-detach-10.f90: New. * testsuite/libgomp.fortran/task-detach-11.f90: New.
2021-02-22Fortran/OpenMP: Fix optional dummy procedures [PR99171]Tobias Burnus1-0/+393
gcc/fortran/ChangeLog: PR fortran/99171 * trans-openmp.c (gfc_omp_is_optional_argument): Regard optional dummy procs as nonoptional as no special treatment is needed. libgomp/ChangeLog: PR fortran/99171 * testsuite/libgomp.fortran/dummy-procs-1.f90: New test.
2021-02-17openacc: Strided array sections and components of derived-type arraysJulian Brown1-3/+2
This patch disallows selecting components of array sections in update directives for OpenACC, as specified in OpenACC 3.0, "2.14.4. Update Directive": In Fortran, members of variables of derived type may appear, including a subarray of a member. Members of subarrays of derived type may not appear. The diagnostic for attempting to use the same construct on other directives has also been improved. gcc/fortran/ * openmp.c (resolve_omp_clauses): Disallow selecting components of arrays of derived type. gcc/testsuite/ * gfortran.dg/goacc/array-with-dt-2.f90: Remove expected errors. * gfortran.dg/goacc/array-with-dt-6.f90: New test. * gfortran.dg/goacc/mapping-tests-2.f90: Update expected error. * gfortran.dg/goacc/ref_inquiry.f90: Update expected errors. * gfortran.dg/gomp/ref_inquiry.f90: Likewise. libgomp/ * testsuite/libgomp.oacc-fortran/array-stride-dt-1.f90: Remove expected errors.