path: root/libgomp/testsuite
Age | Commit message | Author | Files | Lines
2022-03-16OpenACC privatization diagnostics vs. 'assert' [PR102841]Thomas Schwinge1-3/+3
It's an orthogonal concern why these diagnostics do appear at all for non-offloaded OpenACC constructs (where they're not relevant at all); PR90115. Depending on how 'assert' is implemented, it may cause temporaries to be created, and/or may lower into 'COND_EXPR's, and 'gcc/gimplify.cc:gimplify_cond_expr' uses 'create_tmp_var (type, "iftmp")'. Fix-up for commit 11b8286a83289f5b54e813f14ff56d730c3f3185 "[OpenACC privatization] Largely extend diagnostics and corresponding testsuite coverage [PR90115]". PR testsuite/102841 libgomp/ * testsuite/libgomp.oacc-c-c++-common/host_data-7.c: Adjust.
2022-03-12OpenACC 'kernels' decomposition: resolve wrong-code cases unless manually ↵Thomas Schwinge7-34/+48
making certain variables addressable [PR100280, PR104892] Currently in OpenACC 'kernels' decomposition, there is special handling of 'GOMP_MAP_FORCE_TOFROM', documented to be done to avoid "internal compiler errors in later passes". For performance reasons, the current repetitive to/from device copying for every region is not ideal, compared to using 'present' clauses, as done for almost all other 'GOMP_MAP_*'. Also, the current special handling (incomplete, evidently) is the reason for the PR104892 misbehavior. For PR100280 etc. we've resolved all such known ICEs -- removing the special handling for 'GOMP_MAP_FORCE_TOFROM' now resolves PR104892. PR middle-end/100280 PR middle-end/104892 gcc/ * omp-oacc-kernels-decompose.cc (omp_oacc_kernels_decompose_1): Remove special handling of 'GOMP_MAP_FORCE_TOFROM'. gcc/testsuite/ * c-c++-common/goacc/kernels-decompose-2.c: Adjust. * c-c++-common/goacc/kernels-decompose-pr100400-1-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr100400-1-2.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr100400-1-3.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr100400-1-4.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104061-1-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104061-1-2.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104061-1-3.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104061-1-4.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104132-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104133-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104774-1.c: Likewise. * gfortran.dg/goacc/classify-kernels.f95: Likewise. * gfortran.dg/goacc/kernels-decompose-2.f95: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/declare-vla.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/default-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-reduction-1.c: Likewise. 
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise. * testsuite/libgomp.oacc-fortran/asyncwait-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-reduction-1.f90: Likewise.
2022-03-12OpenACC 'kernels' decomposition: wrong-code cases unless manually making ↵Thomas Schwinge5-16/+59
certain variables addressable [PR104892] Document a few examples of the status quo. PR middle-end/104892 libgomp/ * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Point to PR104892. * testsuite/libgomp.oacc-c-c++-common/default-1.c: Likewise, enable '--param=openacc-kernels=decompose' and adjust. * testsuite/libgomp.oacc-c-c++-common/kernels-reduction-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise. * testsuite/libgomp.oacc-fortran/kernels-reduction-1.f90: Likewise.
2022-03-12Enhance further testcases to verify handling of OpenACC privatization level ↵Thomas Schwinge4-55/+266
[PR90115] As originally introduced in commit 11b8286a83289f5b54e813f14ff56d730c3f3185 "[OpenACC privatization] Largely extend diagnostics and corresponding testsuite coverage [PR90115]". PR middle-end/90115 libgomp/ * testsuite/libgomp.oacc-c-c++-common/default-1.c: Enhance. * testsuite/libgomp.oacc-c-c++-common/kernels-reduction-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise. * testsuite/libgomp.oacc-fortran/kernels-reduction-1.f90: Likewise.
2022-03-12OpenACC 'kernels' decomposition: Mark variables used in 'present' clauses as ↵Thomas Schwinge4-67/+88
addressable [PR100280, PR104086] ... like in recent commit 9b32c1669aad5459dd053424f9967011348add83 "OpenACC 'kernels' decomposition: Mark variables used in synthesized data clauses as addressable [PR100280]". Otherwise, we may run into 'gcc/omp-low.cc:lower_omp_target': 13125 else if (is_gimple_reg (var)) 13126 { 13127 gcc_assert (offloaded); PR middle-end/100280 PR middle-end/104086 gcc/ * omp-oacc-kernels-decompose.cc (omp_oacc_kernels_decompose_1): Mark variables used in 'present' clauses as addressable. * omp-low.cc (scan_sharing_clauses) <OMP_CLAUSE_MAP>: Gracefully handle duplicate 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE'. gcc/testsuite/ * c-c++-common/goacc/kernels-decompose-pr104086-1.c: Adjust, extend. libgomp/ * testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose-ice-1.c: Merge this... * testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose.c: ..., and this... * testsuite/libgomp.oacc-c-c++-common/declare-vla.c: ... into this, and adjust. * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Extend.
2022-03-10Fix multiple issue in the testcase allocate-1.f90.Hafiz Abid Qadeer1-14/+12
1. Thomas reported in https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589039.html that this testcase fails randomly. The cause was the fixed pool size, which was exhausted when there were many threads. Fixed by removing the pool_size trait, so that the default pool size is used, which should be big enough.
2. Array indices have been changed to check the last element in the array.
3. Remove a redundant assignment and move some code to better match the C testcase.
libgomp/ChangeLog: * testsuite/libgomp.fortran/allocate-1.f90: Remove pool_size trait. Test last index in w and v array. Remove redundant assignment to V(1). Move alignment checks at the end of parallel region.
2022-03-10[nvptx] Disable warp sync in simt regionTom de Vries1-0/+18
I ran into a hang for this code:
...
#pragma omp target map(tofrom: counter_N0)
#pragma omp simd
for (int i = 0 ; i < 1 ; i++ )
  {
#pragma omp atomic update
    counter_N0 = counter_N0 + 1 ;
  }
...
This has to do with the nature of -muniform-simt. It has two modes of operation: inside and outside an SIMT region.
Outside an SIMT region, a warp pretends to execute a single thread, but actually executes in all threads, to keep the local registers in all threads consistent. This approach works unless the insn that is executed is a syscall or an atomic insn. In that case, the insn is predicated, such that it executes in only one thread. If the predicated insn writes a result to a register, then that register is propagated to the other threads, after which the local registers in all threads are consistent again.
Inside an SIMT region, a warp executes in all threads. However, the predication and propagation for syscalls and atomic insns is also present here, because nvptx_reorg_uniform_simt works on all code. Care has been taken though to ensure that the predication and propagation is a nop. That is, inside an SIMT region:
- the predicate evaluates to true for each thread, and
- the propagation insn copies a register from each thread to the same thread.
That works fine, until we use -mptx=6.0, and instead of using the deprecated warp propagation insn shfl, we start using shfl.sync:
...
@%r33 atom.add.u32 _, [%r29], 1;
shfl.sync.idx.b32 %r30, %r30, %r32, 31, 0xffffffff;
...
The shfl.sync specifies a member mask indicating all threads, but given that the loop only has a single iteration, only thread 0 will execute the insn, where it will hang waiting for the other threads.
Fix this by predicating the shfl.sync (and likewise, bar.warp.sync and the uniform warp check) such that it only executes outside the SIMT region.
Tested on x86_64 with nvptx accelerator.
gcc/ChangeLog: 2022-03-08 Tom de Vries <tdevries@suse.de> PR target/104783 * config/nvptx/nvptx.cc (nvptx_init_unisimt_predicate) (nvptx_output_unisimt_switch): Handle unisimt_outside_simt_predicate. (nvptx_get_unisimt_outside_simt_predicate): New function. (predicate_insn): New function, factored out of ... (nvptx_reorg_uniform_simt): ... here. Predicate all emitted insns. * config/nvptx/nvptx.h (struct machine_function): Add unisimt_outside_simt_predicate field. * config/nvptx/nvptx.md (define_insn "nvptx_warpsync") (define_insn "nvptx_uniform_warp_check"): Make predicable. libgomp/ChangeLog: 2022-03-10 Tom de Vries <tdevries@suse.de> * testsuite/libgomp.c/pr104783.c: New test.
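For readers who want to try the fragment from this commit message, here is a minimal sketch that wraps it in a function. The function name `count_one_iteration` is an assumption made for illustration; it is not taken from the actual `pr104783.c` test, whose contents are not shown in this log. Compiled without `-fopenmp`, the pragmas are ignored and the loop simply runs on the host.

```c
/* Hypothetical wrapper around the fragment quoted in the commit message.
   With offloading enabled, the single loop iteration performs one atomic
   update of the mapped counter on the device.  */
int count_one_iteration(void)
{
  int counter_N0 = 0;

  #pragma omp target map(tofrom: counter_N0)
  #pragma omp simd
  for (int i = 0; i < 1; i++)
    {
      #pragma omp atomic update
      counter_N0 = counter_N0 + 1;
    }

  /* The counter is copied back at the end of the target region.  */
  return counter_N0;
}
```

A caller would check that the function returns 1. According to the commit message, the hang showed up with `-mptx=6.0` on an nvptx accelerator; on other configurations the function is expected to just return.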
2022-03-10[OpenACC privatization] Analyze 'lookup_decl'-translated DECL [PR90115, ↵Thomas Schwinge17-61/+97
PR102330, PR104774] ... so that it matches what we analyze and what we action on. Fix-up for commit 29a2f51806c5b30e17a8d0e9ba7915a3c53c34ff "openacc: Add support for gang local storage allocation in shared memory [PR90115]". PR middle-end/90115 PR middle-end/102330 PR middle-end/104774 gcc/ * omp-low.cc (oacc_privatization_candidate_p) (oacc_privatization_scan_clause_chain) (oacc_privatization_scan_decl_chain, lower_oacc_private_marker): Analyze 'lookup_decl'-translated DECL. gcc/testsuite/ * c-c++-common/goacc/kernels-decompose-pr104061-1-3.c: Adjust. * c-c++-common/goacc/kernels-decompose-pr104061-1-4.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104132-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104133-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104774-1.c: Likewise. * c-c++-common/goacc/privatization-1-compute-loop.c: Likewise. * c-c++-common/goacc/privatization-1-compute.c: Likewise. * c-c++-common/goacc/privatization-1-routine_gang-loop.c: Likewise. * c-c++-common/goacc/privatization-1-routine_gang.c: Likewise. * gfortran.dg/goacc-gomp/pr102330-1.f90: Likewise, and subsume... * gfortran.dg/goacc-gomp/pr102330-2.f90: ... this file, and... * gfortran.dg/goacc-gomp/pr102330-3.f90: ... this file. * gfortran.dg/goacc/privatization-1-compute-loop.f90: Adjust. * gfortran.dg/goacc/privatization-1-compute.f90: Likewise. * gfortran.dg/goacc/privatization-1-routine_gang-loop.f90: Likewise. * gfortran.dg/goacc/privatization-1-routine_gang.f90: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Enhance. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-1.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-4.c: Likewise. 
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-6.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-7.c: Likewise. * testsuite/libgomp.oacc-fortran/optional-private.f90: Likewise. * testsuite/libgomp.oacc-fortran/privatized-ref-1.f95: Likewise. * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise.
2022-03-10Enhance further testcases to verify handling of OpenACC privatization level ↵Thomas Schwinge4-12/+64
[PR90115] As originally introduced in commit 11b8286a83289f5b54e813f14ff56d730c3f3185 "[OpenACC privatization] Largely extend diagnostics and corresponding testsuite coverage [PR90115]". PR middle-end/90115 gcc/testsuite/ * c-c++-common/goacc/nesting-1.c: Enhance. * gcc.dg/goacc/nested-function-1.c: Likewise. * gcc.dg/goacc/nested-function-2.c: Likewise. * gfortran.dg/goacc/nested-function-1.f90: Likewise. libgomp/ * testsuite/libgomp.oacc-fortran/routine-1.f90: Enhance. * testsuite/libgomp.oacc-fortran/routine-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/routine-3.f90: Likewise. * testsuite/libgomp.oacc-fortran/routine-9.f90: Likewise.
2022-03-04Fix 'libgomp.oacc-c-c++-common/kernels-decompose-1.c' expected diagnosticsThomas Schwinge1-0/+2
Fix-up for recent commit 8935589b496f755e08cadf26d8ceddf0dd6e0968 "OMP lowering: Regimplify 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE' DECLs [PR100280, PR104132, PR104133]": adjust for a GCN offloading workaround added just before commit: '(volatile void *) &f1;'. PR testsuite/104791 libgomp/ * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Fix expected diagnostics.
2022-03-04Test 'libgomp.oacc-*/kernels-private-vars-*' with ↵Thomas Schwinge33-99/+361
'--param=openacc-kernels=decompose' [PR104784] Before recent commit 8935589b496f755e08cadf26d8ceddf0dd6e0968 "OMP lowering: Regimplify 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE' DECLs [PR100280, PR104132, PR104133]", 'libgomp.oacc-c' testing already worked fine, but 'libgomp.oacc-c++' testing ICEed. Via the commit mentioned, the C++ testing ICEs are now resolved, but the underlying issue remains to be looked into: PR104784 "OpenACC 'kernels' decomposition: C vs. C++ differences". PR middle-end/104784 libgomp/ * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-1.c: Test with '--param=openacc-kernels=decompose'. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-6.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-3.c: Likewise. 
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-6.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-7.c: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-3.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-6.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-3.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-4.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-5.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-6.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-7.f90: Likewise.
2022-03-04Test '-fopt-info-omp-all' in 'libgomp.oacc-*/kernels-private-vars-*'Thomas Schwinge33-514/+811
libgomp/ * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-1.c: Test '-fopt-info-omp-all'. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-6.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-6.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-7.c: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-3.f90: Likewise. 
* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-6.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-3.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-4.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-5.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-6.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-7.f90: Likewise.
2022-03-04OMP lowering: Regimplify 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE' DECLs ↵Thomas Schwinge1-4/+58
[PR100280, PR104132, PR104133] ... by generalizing the existing 'gcc/omp-low.cc:task_shared_vars'. Fix-up for commit 9b32c1669aad5459dd053424f9967011348add83 "OpenACC 'kernels' decomposition: Mark variables used in synthesized data clauses as addressable [PR100280]". PR middle-end/100280 PR middle-end/104132 PR middle-end/104133 gcc/ * omp-low.cc (task_shared_vars): Rename to 'make_addressable_vars'. Adjust all users. (scan_sharing_clauses) <OMP_CLAUSE_MAP> Use it for 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE' DECLs, too. gcc/testsuite/ * c-c++-common/goacc/kernels-decompose-pr104061-1-3.c: Adjust. * c-c++-common/goacc/kernels-decompose-pr104061-1-4.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104132-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104133-1.c: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Extend.
2022-03-04OpenACC 'kernels' decomposition: Move 'TREE_ADDRESSABLE' setting into OMP ↵Thomas Schwinge2-4/+8
lowering [PR100280] ... in preparation for later changes. No functional change. Follow-up to commit 9b32c1669aad5459dd053424f9967011348add83 "OpenACC 'kernels' decomposition: Mark variables used in synthesized data clauses as addressable [PR100280]". PR middle-end/100280 gcc/ * tree.h (OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE): New. * tree-core.h: Document it. * omp-low.cc (scan_sharing_clauses) <OMP_CLAUSE_MAP>: Handle 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE'. * omp-oacc-kernels-decompose.cc (maybe_build_inner_data_region): Set 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE' instead of 'TREE_ADDRESSABLE'. gcc/testsuite/ * c-c++-common/goacc/classify-kernels-unparallelized.c: Adjust. * c-c++-common/goacc/classify-kernels.c: Likewise. * c-c++-common/goacc/kernels-decompose-2.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr100280-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104061-1-2.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104061-1-3.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104061-1-4.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104132-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104133-1.c: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Likewise.
2022-03-04Add diagnostic: "note: OpenACC 'kernels' decomposition: variable '[...]' ↵Thomas Schwinge2-0/+4
declared in block made addressable" [PR100280] Follow-up to commit 9b32c1669aad5459dd053424f9967011348add83 "OpenACC 'kernels' decomposition: Mark variables used in synthesized data clauses as addressable [PR100280]". PR middle-end/100280 gcc/ * omp-oacc-kernels-decompose.cc (maybe_build_inner_data_region): Add diagnostic: "note: OpenACC 'kernels' decomposition: variable '[...]' declared in block made addressable". gcc/testsuite/ * c-c++-common/goacc/classify-kernels-unparallelized.c: Add '--param=openacc-privatization=noisy'. * c-c++-common/goacc/classify-kernels.c: Likewise. * c-c++-common/goacc/kernels-decompose-2.c: Adjust. * c-c++-common/goacc/kernels-decompose-pr100280-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104061-1-2.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104061-1-3.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104061-1-4.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104132-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104133-1.c: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Likewise.
2022-02-28[libgomp, testsuite, nvptx] Add -mptx=_ in declare-variant-3-sm*.cTom de Vries6-6/+6
When running with target board unix/-foffload=-mptx=3.1, we run into: ... lto1: error: PTX version (-mptx) needs to be at least 4.2 to support \ selected -misa (sm_53)^M mkoffload: fatal error: x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned \ 1 exit status^M compilation terminated.^M ... FAIL: libgomp.c/declare-variant-3-sm53.c (test for excess errors) ... Fix this by adding -foffload=-mptx=_ in the libgomp.c/declare-variant-3-sm*.c test-cases. Tested on x86_64 with nvptx accelerator. libgomp/ChangeLog: 2022-02-28 Tom de Vries <tdevries@suse.de> * testsuite/libgomp.c/declare-variant-3-sm30.c: Add -foffload=-mptx=_. * testsuite/libgomp.c/declare-variant-3-sm35.c: Same. * testsuite/libgomp.c/declare-variant-3-sm53.c: Same. * testsuite/libgomp.c/declare-variant-3-sm70.c: Same. * testsuite/libgomp.c/declare-variant-3-sm75.c: Same. * testsuite/libgomp.c/declare-variant-3-sm80.c: Same.
2022-02-24[libgomp, testsuite, nvptx] Add libgomp.c/declare-variant-3-sm*.cTom de Vries7-0/+108
Add openmp test-cases that test the omp declare variant construct: ... #pragma omp declare variant (f30) match (device={isa("sm_30")}) ... using the available nvptx isas. Only the one for sm_30 is a dg-do run test-case, the other ones are dg-do link. Tested on x86_64 with nvptx accelerator. libgomp/ChangeLog: 2022-02-24 Tom de Vries <tdevries@suse.de> * testsuite/libgomp.c/declare-variant-3-sm30.c: New test. * testsuite/libgomp.c/declare-variant-3-sm35.c: New test. * testsuite/libgomp.c/declare-variant-3-sm53.c: New test. * testsuite/libgomp.c/declare-variant-3-sm70.c: New test. * testsuite/libgomp.c/declare-variant-3-sm75.c: New test. * testsuite/libgomp.c/declare-variant-3-sm80.c: New test. * testsuite/libgomp.c/declare-variant-3.h: New header file.
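The `declare variant` line quoted above could be used roughly as follows. This is a sketch, not the contents of `declare-variant-3.h`: the base function `f`, its return values and the wrapper `call_f_on_device` are assumptions chosen for illustration; only `f30` and the `match` selector come from the commit message.

```c
#pragma omp declare target
/* Variant intended for nvptx code compiled for the sm_30 ISA.  */
int f30(void) { return 30; }

/* Base function; calls are redirected to f30 only where the selector
   matches.  */
#pragma omp declare variant (f30) match (device={isa("sm_30")})
int f(void) { return 0; }
#pragma omp end declare target

/* Hypothetical helper: call f inside a target region and return what the
   selected function produced.  */
int call_f_on_device(void)
{
  int v = -1;

  #pragma omp target map(from: v)
  v = f();

  return v;
}
```

On a host fallback, or when built without `-fopenmp`, the base function runs and the helper returns 0; a return value of 30 would indicate that the `sm_30` variant was selected.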
2022-02-22Fix OpenACC gang-redundant execution in ↵Thomas Schwinge1-10/+32
'libgomp.oacc-fortran/privatized-ref-2.f90' This was a latent problem, and this commit here now resolves a regression that after recent commit a78b1ab1df9ca44acc5638e8f9d0ae2e62bd65ed "amdgcn: Tune default OpenMP/OpenACC GPU utilization" we had (only) seen on a GCN offloading '-march=gfx908' system: {+WARNING: program timed out.+} [-PASS:-]{+FAIL:+} libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa -O0 execution test Same for other optimization levels. Make sure that we're not executing non-parallelized code in gang-redundant mode, by putting these parts into their own 'parallel' constructs, which then default to 'num_gangs(1)'. libgomp/ * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Fix OpenACC gang-redundant execution.
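The fix described above was made in a Fortran test; the same idea can be sketched in C. The function name, array size and loop bodies below are assumptions for illustration only. The non-parallelized setup is placed in its own `parallel` construct, which per the commit message then defaults to `num_gangs(1)`, and only the loop that is meant to be parallel carries a `loop gang` directive.

```c
#define N 16

/* Hypothetical example: keep sequential setup code out of the
   gang-partitioned compute region.  */
int fill_and_sum(void)
{
  int a[N];
  int sum = 0;

  /* Sequential setup in its own 'parallel' construct, so that it is not
     executed redundantly by many gangs.  */
  #pragma acc parallel copyout(a)
  {
    for (int i = 0; i < N; i++)
      a[i] = 1;
  }

  /* The actual parallel work, partitioned across gangs.  */
  #pragma acc parallel loop gang copyin(a) reduction(+:sum)
  for (int i = 0; i < N; i++)
    sum += a[i];

  return sum;
}
```

Without `-fopenacc` the pragmas are ignored and both loops run sequentially on the host, giving the same result.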
2022-02-22[libgomp, nvptx] Fix hang in gomp_team_barrier_wait_endTom de Vries3-28/+0
Consider the following omp fragment.
...
#pragma omp target
#pragma omp parallel num_threads (2)
#pragma omp task
  ;
...
This hangs at -O0 for nvptx.
Investigating the behaviour gives us the following trace of events:
- both threads execute GOMP_task, where they:
  - deposit a task, and
  - execute gomp_team_barrier_wake
- thread 1 executes gomp_team_barrier_wait_end and, not being the last thread, proceeds to wait at the team barrier
- thread 0 executes gomp_team_barrier_wait_end and, being the last thread, it calls gomp_barrier_handle_tasks, where it:
  - executes both tasks and marks the team barrier done
  - executes a gomp_team_barrier_wake which wakes up thread 1
- thread 1 exits the team barrier
- thread 0 returns from gomp_barrier_handle_tasks and goes to wait at the team barrier.
- thread 0 hangs.
To understand why there is a hang here, it's good to understand how things are set up for nvptx. The libgomp/config/nvptx/bar.c implementation is a copy of the libgomp/config/linux/bar.c implementation, with uses of both futex_wake and do_wait replaced with uses of ptx insn bar.sync:
...
if (bar->total > 1)
  asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
...
The point where thread 0 goes to wait at the team barrier, corresponds in the linux implementation with a do_wait. In the linux case, the call to do_wait doesn't hang, because it's waiting for bar->generation to become a certain value, and if bar->generation already has that value, it just proceeds, without any need for coordination with other threads. In the nvptx case, the bar.sync waits until thread 1 joins it in the same logical barrier, which never happens: thread 1 is lingering in the thread pool at the thread pool barrier (using a different logical barrier), waiting to join a new team.
The easiest way to fix this is to revert to the posix implementation for bar.{c,h}. That however falls back on a busy-waiting approach, and does not take advantage of the ptx bar.sync insn.
Instead, we revert to the linux implementation for bar.c, and implement bar.c local functions futex_wait and futex_wake using the bar.sync insn. The bar.sync insn takes an argument specifying how many threads are participating, and that doesn't play well with the futex syntax where it's not clear in advance how many threads will be woken up. This is solved by waking up all waiting threads each time a futex_wait or futex_wake happens, and possibly going back to sleep with an updated thread count. Tested libgomp on x86_64 with nvptx accelerator. libgomp/ChangeLog: 2021-04-20 Tom de Vries <tdevries@suse.de> PR target/99555 * config/nvptx/bar.c (generation_to_barrier): New function, copied from config/rtems/bar.c. (futex_wait, futex_wake): New function. (do_spin, do_wait): New function, copied from config/linux/wait.h. (gomp_barrier_wait_end, gomp_barrier_wait_last) (gomp_team_barrier_wake, gomp_team_barrier_wait_end): (gomp_team_barrier_wait_cancel_end, gomp_team_barrier_cancel): Remove and replace with include of config/linux/bar.c. * config/nvptx/bar.h (gomp_barrier_t): Add fields waiters and lock. (gomp_barrier_init): Init new fields. * testsuite/libgomp.c-c++-common/task-detach-6.c: Remove nvptx-specific workarounds. * testsuite/libgomp.c/pr99555-1.c: Same. * testsuite/libgomp.fortran/task-detach-6.f90: Same.
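The omp fragment from this commit message can be turned into a small observable function, sketched below. The counter `done`, the atomic update inside the task and the function name `run_tasks` are additions made for illustration; the original fragment uses an empty task body.

```c
/* Hypothetical wrapper: each thread of the two-thread team deposits one
   task; the implicit barrier at the end of the parallel region is where
   the tasks are executed and where the hang was observed.  */
int run_tasks(void)
{
  int done = 0;

  #pragma omp target map(tofrom: done)
  #pragma omp parallel num_threads (2)
  {
    #pragma omp task
    {
      #pragma omp atomic update
      done = done + 1;
    }
  }

  return done;
}
```

With two threads in the team the function returns 2; if fewer threads are granted, or if the code is built without `-fopenmp`, it returns 1. According to the commit message, the hang occurred at `-O0` on nvptx before this fix.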
2022-02-22[libgomp, testsuite, nvptx] Fix pr96390.c without CUDATom de Vries2-0/+2
When running the libgomp testsuite on x86_64 with nvptx accelerator, we run into: ... XPASS: libgomp.c/../libgomp.c-c++-common/pr96390.c (test for excess errors) FAIL: libgomp.c/../libgomp.c-c++-common/pr96390.c execution test ... The problem is that we're expecting the following ptxas error: ... XFAIL: libgomp.c/../libgomp.c-c++-common/pr96390.c (test for excess errors) Excess errors: ptxas /tmp/ccZYDw8N.o, line 90; error : Call to 'baz' requires call prototype ptxas /tmp/ccZYDw8N.o, line 90; error : Unknown symbol 'baz' ... But it's not triggered because ptxas is not in the path, so nvptx-none-as defaults to --no-verify. So instead, we run into the same error at execution time. Fix this by forcing verification using: ... /* { dg-additional-options "-foffload=-Wa,--verify" \ { target offload_target_nvptx } } */ ... such that we run into the xfail in this way instead: ... XFAIL: libgomp.c/../libgomp.c-c++-common/pr96390.c (test for excess errors) Excess errors: nvptx-as: error trying to exec 'ptxas': execvp: No such file or directory nvptx-as: ptxas returned 255 exit status ... Tested on x86_64-linux with nvptx accelerator. libgomp/ChangeLog: 2022-02-21 Tom de Vries <tdevries@suse.de> PR testsuite/104146 * testsuite/libgomp.c++/pr96390.C: Add additional-option -foffload=-Wa,--verify for nvptx. * testsuite/libgomp.c-c++-common/pr96390.c: Same.
2022-02-15Fortran/OpenMP: Fix depend-clause handlingTobias Burnus1-0/+109
gcc/fortran/ChangeLog: * trans-openmp.cc (gfc_trans_omp_clauses, gfc_trans_omp_depobj): Depend on the proper addr, for ptr/alloc depend on pointee. libgomp/ChangeLog: * testsuite/libgomp.fortran/depend-4.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/depend-4.f90: New test. * gfortran.dg/gomp/depend-5.f90: New test.
2022-02-10OpenMP/C++: Permit mapping classes with virtual members [PR102204]Tobias Burnus1-0/+50
PR c++/102204 gcc/cp/ChangeLog: * decl2.cc (cp_omp_mappable_type_1): Remove check for virtual members as those are permitted since OpenMP 5.0. libgomp/ChangeLog: * testsuite/libgomp.c++/target-virtual-1.C: New test. gcc/testsuite/ChangeLog: * g++.dg/gomp/unmappable-1.C: Remove previously expected dg-message.
2022-02-09C, C++, Fortran, OpenMP: Add 'has_device_addr' clause to 'target' construct.Marcel Vollweiler10-0/+478
This patch adds the 'has_device_addr' clause to the OpenMP 'target' construct which was introduced in OpenMP 5.1 (OpenMP API 5.1 specification pp. 197ff): has_device_addr(list) "The has_device_addr clause indicates that its list items already have device addresses and therefore they may be directly accessed from a target device. If the device address of a list item is not for the device on which the target region executes, accessing the list item inside the region results in unspecified behavior. The list items may include array sections." (p. 200) "A list item may not be specified in both an is_device_ptr clause and a has_device_addr clause on the directive." (p. 202) "A list item that appears in an is_device_ptr or a has_device_addr clause must not be specified in any data-sharing attribute clause on the same target construct." (p. 203) gcc/c-family/ChangeLog: * c-omp.cc (c_omp_split_clauses): Added OMP_CLAUSE_HAS_DEVICE_ADDR case. * c-pragma.h (enum pragma_kind): Added 5.1 in comment. (enum pragma_omp_clause): Added PRAGMA_OMP_CLAUSE_HAS_DEVICE_ADDR. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_clause_name): Parse 'has_device_addr' clause. (c_parser_omp_variable_list): Handle array sections. (c_parser_omp_clause_has_device_addr): Added. (c_parser_omp_all_clauses): Added PRAGMA_OMP_CLAUSE_HAS_DEVICE_ADDR case. (c_parser_omp_target_exit_data): Added HAS_DEVICE_ADDR to OMP_CLAUSE_MASK. * c-typeck.cc (handle_omp_array_sections): Handle clause restrictions. (c_finish_omp_clauses): Handle array sections. gcc/cp/ChangeLog: * parser.cc (cp_parser_omp_clause_name): Parse 'has_device_addr' clause. (cp_parser_omp_var_list_no_open): Handle array sections. (cp_parser_omp_all_clauses): Added PRAGMA_OMP_CLAUSE_HAS_DEVICE_ADDR case. (cp_parser_omp_target_update): Added HAS_DEVICE_ADDR to OMP_CLAUSE_MASK. * semantics.cc (handle_omp_array_sections): Handle clause restrictions. (finish_omp_clauses): Handle array sections. 
gcc/fortran/ChangeLog: * dump-parse-tree.cc (show_omp_clauses): Added OMP_LIST_HAS_DEVICE_ADDR case. * gfortran.h: Added OMP_LIST_HAS_DEVICE_ADDR. * openmp.cc (enum omp_mask2): Added OMP_CLAUSE_HAS_DEVICE_ADDR. (gfc_match_omp_clauses): Parse HAS_DEVICE_ADDR clause. (resolve_omp_clauses): Same. * trans-openmp.cc (gfc_trans_omp_variable_list): Added OMP_LIST_HAS_DEVICE_ADDR case. (gfc_trans_omp_clauses): Firstprivatize of array descriptors. gcc/ChangeLog: * gimplify.cc (gimplify_scan_omp_clauses): Added cases for OMP_CLAUSE_HAS_DEVICE_ADDR and handle array sections. (gimplify_adjust_omp_clauses): Added OMP_CLAUSE_HAS_DEVICE_ADDR case. * omp-low.cc (scan_sharing_clauses): Handle OMP_CLAUSE_HAS_DEVICE_ADDR. (lower_omp_target): Same. * tree-core.h (enum omp_clause_code): Same. * tree-nested.cc (convert_nonlocal_omp_clauses): Same. (convert_local_omp_clauses): Same. * tree-pretty-print.cc (dump_omp_clause): Same. * tree.cc: Same. libgomp/ChangeLog: * libgomp.texi: Updated entry for HAS_DEVICE_ADDR. * target.c (copy_firstprivate_data): Copy only if host address is not NULL. * testsuite/libgomp.c++/target-has-device-addr-2.C: New test. * testsuite/libgomp.c++/target-has-device-addr-4.C: New test. * testsuite/libgomp.c++/target-has-device-addr-5.C: New test. * testsuite/libgomp.c++/target-has-device-addr-6.C: New test. * testsuite/libgomp.c-c++-common/target-has-device-addr-1.c: New test. * testsuite/libgomp.c/target-has-device-addr-3.c: New test. * testsuite/libgomp.fortran/target-has-device-addr-1.f90: New test. * testsuite/libgomp.fortran/target-has-device-addr-2.f90: New test. * testsuite/libgomp.fortran/target-has-device-addr-3.f90: New test. * testsuite/libgomp.fortran/target-has-device-addr-4.f90: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/clauses-1.c: Added has_device_addr to test cases. * g++.dg/gomp/attrs-1.C: Added has_device_addr to test cases. * g++.dg/gomp/attrs-2.C: Added has_device_addr to test cases. 
* c-c++-common/gomp/target-has-device-addr-1.c: New test. * c-c++-common/gomp/target-has-device-addr-2.c: New test. * c-c++-common/gomp/target-is-device-ptr-1.c: New test. * c-c++-common/gomp/target-is-device-ptr-2.c: New test. * gfortran.dg/gomp/is_device_ptr-3.f90: New test. * gfortran.dg/gomp/target-has-device-addr-1.f90: New test. * gfortran.dg/gomp/target-has-device-addr-2.f90: New test.
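As a rough illustration of the clause this commit adds, the C sketch below pairs 'use_device_addr' on a 'target data' region with 'has_device_addr' on a nested 'target' region. The function name 'bump_on_device' is made up for this example and is not taken from the new tests. It assumes a compiler with OpenMP 5.1 'has_device_addr' support and '-fopenmp'; without OpenMP the pragmas are ignored and the code simply runs on the host.

```c
/* Hypothetical helper for illustration; not part of the GCC testsuite.  */
int bump_on_device(int start)
{
  int x = start;

  /* Map x, then let 'x' refer to its device address inside the region.  */
  #pragma omp target data map(tofrom: x) use_device_addr(x)
  {
    /* x already has a device address, so the target region accesses it
       directly instead of mapping it again.  */
    #pragma omp target has_device_addr(x)
    x += 1;
  }

  /* The 'tofrom' mapping copies the updated value back to the host.  */
  return x;
}
```

Calling 'bump_on_device (41)' is expected to return 42, whether the region is offloaded or falls back to the host.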
2022-02-08libgomp: Fix segfault with posthumous orphan tasks [PR104385]Jakub Jelinek1-0/+26
The following patch fixes crashes with posthumous orphan tasks. When a parent task finishes, gomp_clear_parent clears the parent pointers of its children tasks present in the parent->children_queue. But children that are still waiting for dependencies aren't in that queue yet, they will be added there only when the sibling they are waiting for exits. Unfortunately we were adding those tasks into the queues with the original task->parent which then causes crashes because that task is gone and freed. The following patch fixes that by clearing the parent field when we schedule such task for running by adding it into the queues and we know that the sibling task which is about to finish has NULL parent. 2022-02-08 Jakub Jelinek <jakub@redhat.com> PR libgomp/104385 * task.c (gomp_task_run_post_handle_dependers): If parent is NULL, clear task->parent. * testsuite/libgomp.c/pr104385.c: New test.
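The parent-pointer problem described above can be modelled with a toy data structure. This is only a sketch under simplifying assumptions: 'struct toy_task' and the 'toy_*' functions are hypothetical stand-ins, not the libgomp code in task.c, and the real queues and locking are left out.

```c
#include <stddef.h>

/* Toy stand-in for a task; the real libgomp struct has many more fields.  */
struct toy_task {
  struct toy_task *parent;
};

/* Models the parent finishing: only children that are already queued
   get their parent pointer cleared.  */
void toy_clear_parent(struct toy_task **queued, size_t n)
{
  for (size_t i = 0; i < n; i++)
    queued[i]->parent = NULL;
}

/* Models scheduling a task that was waiting on 'finished_sibling'.
   If the sibling has already been orphaned, the shared parent is gone,
   so the waiting task is orphaned too before it is queued.  */
void toy_schedule_depender(struct toy_task *waiting,
                           const struct toy_task *finished_sibling)
{
  if (finished_sibling->parent == NULL)
    waiting->parent = NULL;
}

/* Returns 1 if the waiting task ends up without a dangling parent.  */
int toy_demo(void)
{
  struct toy_task parent = { NULL };
  struct toy_task sibling = { &parent };  /* already in the children queue */
  struct toy_task waiting = { &parent };  /* blocked on 'sibling', not queued */
  struct toy_task *queue[] = { &sibling };

  toy_clear_parent(queue, 1);             /* the parent finishes */
  toy_schedule_depender(&waiting, &sibling);
  return waiting.parent == NULL;
}
```

Without the check in 'toy_schedule_depender', 'waiting.parent' would still point at the finished parent, which is the dangling pointer the commit message describes.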
2022-02-04libgomp.fortran/allocate-1.f90: Fix minor cleanupTobias Burnus1-3/+0
libgomp/ChangeLog: * testsuite/libgomp.fortran/allocate-1.f90: Remove spurious STOP of previous commit.
2022-02-04libgomp.fortran/allocate-1.f90: Minor cleanupTobias Burnus2-56/+60
libgomp/ChangeLog: * testsuite/libgomp.fortran/allocate-1.c (is_64bit_aligned): Renamed from is_64bit_aligned_. * testsuite/libgomp.fortran/allocate-1.f90: Fix interface decl and use it, more implicit none, remove unused argument.
2022-02-01[nvptx] Add some support for .local atomicsTom de Vries3-21/+0
The ptx insn atom doesn't support local memory. When doing an atomic operation on local memory, we run into: ... operation not supported on global/shared address space ... This is the cuGetErrorString message for CUDA_ERROR_INVALID_ADDRESS_SPACE. The message is somewhat confusing given that actually the operation is not supported on local address space. Fix this by falling back on a non-atomic version when detecting a frame-related memory operand. This only solves some cases that are detected at compile-time. It does however fix the OpenACC private-atomic-* test-cases. Tested on x86_64 with nvptx accelerator. gcc/ChangeLog: 2022-01-27 Tom de Vries <tdevries@suse.de> * config/nvptx/nvptx.md (define_insn "atomic_compare_and_swap<mode>_1") (define_insn "atomic_exchange<mode>") (define_insn "atomic_fetch_add<mode>") (define_insn "atomic_fetch_addsf") (define_insn "atomic_fetch_<logic><mode>"): Output non-atomic version if memory operand is frame-relative. gcc/testsuite/ChangeLog: 2022-01-31 Tom de Vries <tdevries@suse.de> * gcc.target/nvptx/stack-atomics-run.c: New test. libgomp/ChangeLog: 2022-01-27 Tom de Vries <tdevries@suse.de> * testsuite/libgomp.oacc-c-c++-common/private-atomic-1.c: Remove PR83812 workaround. * testsuite/libgomp.oacc-fortran/private-atomic-1-vector.f90: Same. * testsuite/libgomp.oacc-fortran/private-atomic-1-worker.f90: Same.
2022-02-01[libgomp, testsuite] Fix insufficient resources in test-casesTom de Vries3-3/+25
When running libgomp test-case broadcast-many.c on an nvptx accelerator (T400, driver version 470.86), I run into: ... libgomp: The Nvidia accelerator has insufficient resources to launch \ 'main$_omp_fn$0' with num_workers = 32 and vector_length = 32; \ recompile the program with 'num_workers = x and vector_length = y' on \ that offloaded region or '-fopenacc-dim=:x:y' where x * y <= 896. FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/broadcast-many.c \ -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none \ -O0 execution test ... The error does not occur when using GOMP_NVPTX_JIT=-O0. Fix this by using 896 / 32 == 28 workers for ACC_DEVICE_TYPE_nvidia. Likewise for some other test-cases. Tested libgomp on x86_64 with nvptx accelerator. libgomp/ChangeLog: 2022-01-27 Tom de Vries <tdevries@suse.de> * testsuite/libgomp.oacc-c-c++-common/broadcast-many.c: Reduce num_workers for nvidia accelerator to fix libgomp error 'insufficient resources'. * testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-4.c: Same. * testsuite/libgomp.oacc-c-c++-common/reduction-7.c: Same.
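The worker count chosen above follows directly from the constraint 'x * y <= 896' in the error message. A minimal sketch of that arithmetic, with hypothetical helper names and the limit passed in as a parameter (the value 896 is specific to the reported setup):

```c
/* Largest num_workers such that num_workers * vector_length <= limit.
   Assumes limit >= 0 and vector_length > 0.  */
int max_workers(int limit, int vector_length)
{
  return limit / vector_length;
}

/* Returns 1 if the launch dimensions stay within the limit.  */
int dims_fit(int limit, int num_workers, int vector_length)
{
  return num_workers * vector_length <= limit;
}
```

With a limit of 896 and a vector length of 32, 'max_workers' gives 28, while the original 32 workers would need 32 * 32 = 1024 and so does not fit.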
2022-02-01[libgomp, testsuite] Reduce recursion depth in declare_target-*.f90Tom de Vries2-3/+7
When running the libgomp testsuite with GOMP_NVPTX_JIT=-O0 using an nvptx accelerator (Nvidia T400, 2GB), I run into: ... libgomp: cuCtxSynchronize error: unspecified launch failure \ (perhaps abort was called) libgomp: cuMemFree_v2 error: unspecified launch failure libgomp: device finalization failed FAIL: libgomp.fortran/examples-4/declare_target-1.f90 -O0 execution test ... The test-case contains: ... ! Reduced from 25 to 23, otherwise execution runs out of thread stack on ! Nvidia Titan V. if (fib (23) /= fib_wrapper (23)) stop 2 ... Fix this by reducing the fib/fib_wrapper argument from 23 to 22. Same for declare_target-2.f90. Tested on x86_64 with nvptx accelerator. libgomp/ChangeLog: 2022-01-27 Tom de Vries <tdevries@suse.de> * testsuite/libgomp.fortran/examples-4/declare_target-1.f90: Reduce recursion depth. * testsuite/libgomp.fortran/examples-4/declare_target-2.f90: Same.
2022-01-21Strengthen a few OpenACC test casesThomas Schwinge15-57/+202
Rather than rubber-stamp whatever requested vs. actual device kernel launch configuration happens, actually (again) verify the requested values (modulo expected variations). This better highlights that "AMD GCN has an upper limit of 'num_workers(16)'", and the deficiency that "AMD GCN uses the autovectorizer for the vector dimension: the use of a function call in vector-partitioned code [...] is not currently supported". And, this removes several instances of race conditions, where variables are concurrently written to in OpenACC gang-redundant mode. libgomp/ * testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c: Strengthen. * testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-v-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-v-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-w-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-gwv-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-v-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-w-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-wv-1.c: Likewise.
2022-01-19nvptx: update fix for -Wformat-diagMartin Liska16-33/+33
gcc/ChangeLog: * config/nvptx/nvptx.cc (nvptx_goacc_validate_dims_1): Update warning messages. libgomp/ChangeLog: * testsuite/libgomp.oacc-c++/privatized-ref-2.C: Update scanning patterns. * testsuite/libgomp.oacc-c++/privatized-ref-3.C: Likewise. * testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/pr85486.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/pr95270-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-nohost-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/struct-copyout-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/struct-copyout-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-64-1.c: Likewise. * testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/derivedtypes-arrays-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-loop-2.f95: Likewise. * testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise. * testsuite/libgomp.oacc-fortran/privatized-ref-1.f95: Likewise. Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
2022-01-18nvptx: fix -Wformat-diag warningsMartin Liska16-33/+33
gcc/ChangeLog: * config/nvptx/nvptx.cc (nvptx_goacc_validate_dims_1): Wrap keyword. * config/nvptx/nvptx.md: Remove trailing dot. libgomp/ChangeLog: * testsuite/libgomp.oacc-c++/privatized-ref-2.C: Update keyword in dg-warning. * testsuite/libgomp.oacc-c++/privatized-ref-3.C: Likewise. * testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/pr85486.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/pr95270-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-nohost-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/struct-copyout-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/struct-copyout-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-64-1.c: Likewise. * testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/derivedtypes-arrays-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-loop-2.f95: Likewise. * testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise. * testsuite/libgomp.oacc-fortran/privatized-ref-1.f95: Likewise.
2022-01-17Extend test cases for references in OpenACC 'private' clausesThomas Schwinge3-22/+364
libgomp/ * testsuite/libgomp.oacc-c++/privatized-ref-2.C: Extend. * testsuite/libgomp.oacc-c++/privatized-ref-3.C: Likewise. * testsuite/libgomp.oacc-fortran/privatized-ref-1.f95: Likewise.
2022-01-17Test cases for references in OpenACC 'private' clausesJulian Brown3-0/+211
libgomp/ * testsuite/libgomp.oacc-fortran/privatized-ref-1.f95: New test. * testsuite/libgomp.oacc-c++/privatized-ref-2.C: New test. * testsuite/libgomp.oacc-c++/privatized-ref-3.C: New test. Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
2022-01-13Add support for allocate clause (OpenMP 5.0).Hafiz Abid Qadeer2-0/+340
This patch adds support for OpenMP 5.0 allocate clause for Fortran. It does not yet support the allocator-modifier as specified in OpenMP 5.1. The allocate clause is already supported in C/C++. gcc/fortran/ChangeLog: * dump-parse-tree.c (show_omp_clauses): Handle OMP_LIST_ALLOCATE. * gfortran.h (OMP_LIST_ALLOCATE): New enum value. * openmp.c (enum omp_mask1): Add OMP_CLAUSE_ALLOCATE. (gfc_match_omp_clauses): Handle OMP_CLAUSE_ALLOCATE (OMP_PARALLEL_CLAUSES, OMP_DO_CLAUSES, OMP_SECTIONS_CLAUSES) (OMP_TASK_CLAUSES, OMP_TASKLOOP_CLAUSES, OMP_TARGET_CLAUSES) (OMP_TEAMS_CLAUSES, OMP_DISTRIBUTE_CLAUSES) (OMP_SINGLE_CLAUSES): Add OMP_CLAUSE_ALLOCATE. (OMP_TASKGROUP_CLAUSES): New. (gfc_match_omp_taskgroup): Use OMP_TASKGROUP_CLAUSES instead of OMP_CLAUSE_TASK_REDUCTION. (resolve_omp_clauses): Handle OMP_LIST_ALLOCATE. (resolve_omp_do): Avoid warning when loop iteration variable is in allocate clause. * trans-openmp.c (gfc_trans_omp_clauses): Handle translation of allocate clause. (gfc_split_omp_clauses): Update for OMP_LIST_ALLOCATE. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/allocate-1.f90: New test. * gfortran.dg/gomp/allocate-2.f90: New test. * gfortran.dg/gomp/allocate-3.f90: New test. * gfortran.dg/gomp/collapse1.f90: Update error message. * gfortran.dg/gomp/openmp-simd-4.f90: Likewise. * gfortran.dg/gomp/clauses-1.f90: Uncomment allocate clause. libgomp/ChangeLog: * testsuite/libgomp.fortran/allocate-1.c: New test. * testsuite/libgomp.fortran/allocate-1.f90: New test. * libgomp.texi: Remove string that says that allocate clause support is for C/C++ only.
2022-01-13Improve Intel MIC offloading XFAILing for 'omp_get_device_num'Thomas Schwinge4-3/+36
After recent commit be661959a6b6d8f9c3c8608a746789e7b2ec3ca4 "libgomp/testsuite: Improve omp_get_device_num() tests", we're now iterating over all OpenMP target devices. Intel MIC (emulated) offloading still doesn't properly implement device-side 'omp_get_device_num', and we thus regress: PASS: libgomp.c/../libgomp.c-c++-common/target-45.c (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.c/../libgomp.c-c++-common/target-45.c execution test PASS: libgomp.c++/../libgomp.c-c++-common/target-45.c (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.c++/../libgomp.c-c++-common/target-45.c execution test PASS: libgomp.fortran/target10.f90 -O0 (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.fortran/target10.f90 -O0 execution test PASS: libgomp.fortran/target10.f90 -O1 (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.fortran/target10.f90 -O1 execution test PASS: libgomp.fortran/target10.f90 -O2 (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.fortran/target10.f90 -O2 execution test PASS: libgomp.fortran/target10.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.fortran/target10.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test PASS: libgomp.fortran/target10.f90 -O3 -g (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.fortran/target10.f90 -O3 -g execution test PASS: libgomp.fortran/target10.f90 -Os (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.fortran/target10.f90 -Os execution test Improve the XFAILing added in commit bb75b22aba254e8ff144db27b1c8b4804bad73bb "Allow matching Intel MIC in OpenMP 'declare variant'" for the case that *any* Intel MIC offload device is available. libgomp/ * testsuite/libgomp.c-c++-common/on_device_arch.h (any_device_arch, any_device_arch_intel_mic): New. * testsuite/lib/libgomp.exp (check_effective_target_offload_device_any_intel_mic): New. 
* testsuite/libgomp.c-c++-common/target-45.c: Use it. * testsuite/libgomp.fortran/target10.f90: Likewise.
2022-01-13Document current '-Wuninitialized' diagnostics for ↵Thomas Schwinge1-1/+13
'libgomp.oacc-fortran/routine-10.f90' [PR102192] libgomp/ PR tree-optimization/102192 * testsuite/libgomp.oacc-fortran/routine-10.f90: Document current '-Wuninitialized' diagnostics.
2022-01-13Document current '-Wuninitialized'/'-Wmaybe-uninitialized' diagnostics for ↵Thomas Schwinge12-3/+108
OpenACC test cases ... including "note: '[...]' was declared here" emitted since recent commit 9695e1c23be5b5c55d572ced152897313ddb96ae "Improve -Wuninitialized note location". For those that seemed incorrect to me, I've placed XFAILed 'dg-bogus'es, including one more instance of PR77504 etc., and several instances where for "local variables" of reference-data-type reductions (etc.?) we emit bogus (?) diagnostics. For implicit data clauses (including 'firstprivate'), we seem to be missing diagnostics, so I've placed XFAILed 'dg-warning's. gcc/testsuite/ * c-c++-common/goacc/builtin-goacc-parlevel-id-size.c: Document current '-Wuninitialized' diagnostics. * c-c++-common/goacc/mdc-1.c: Likewise. * c-c++-common/goacc/nested-reductions-1-kernels.c: Likewise. * c-c++-common/goacc/nested-reductions-1-parallel.c: Likewise. * c-c++-common/goacc/nested-reductions-1-routine.c: Likewise. * c-c++-common/goacc/nested-reductions-2-kernels.c: Likewise. * c-c++-common/goacc/nested-reductions-2-parallel.c: Likewise. * c-c++-common/goacc/nested-reductions-2-routine.c: Likewise. * c-c++-common/goacc/uninit-dim-clause.c: Likewise. * c-c++-common/goacc/uninit-firstprivate-clause.c: Likewise. * c-c++-common/goacc/uninit-if-clause.c: Likewise. * gfortran.dg/goacc/array-with-dt-1.f90: Likewise. * gfortran.dg/goacc/array-with-dt-2.f90: Likewise. * gfortran.dg/goacc/array-with-dt-3.f90: Likewise. * gfortran.dg/goacc/array-with-dt-4.f90: Likewise. * gfortran.dg/goacc/array-with-dt-5.f90: Likewise. * gfortran.dg/goacc/derived-chartypes-1.f90: Likewise. * gfortran.dg/goacc/derived-chartypes-2.f90: Likewise. * gfortran.dg/goacc/derived-chartypes-3.f90: Likewise. * gfortran.dg/goacc/derived-chartypes-4.f90: Likewise. * gfortran.dg/goacc/derived-classtypes-1.f95: Likewise. * gfortran.dg/goacc/derived-types-2.f90: Likewise. * gfortran.dg/goacc/host_data-tree.f95: Likewise. * gfortran.dg/goacc/kernels-tree.f95: Likewise. * gfortran.dg/goacc/modules.f95: Likewise. 
* gfortran.dg/goacc/nested-reductions-1-kernels.f90: Likewise. * gfortran.dg/goacc/nested-reductions-1-parallel.f90: Likewise. * gfortran.dg/goacc/nested-reductions-1-routine.f90: Likewise. * gfortran.dg/goacc/nested-reductions-2-kernels.f90: Likewise. * gfortran.dg/goacc/nested-reductions-2-parallel.f90: Likewise. * gfortran.dg/goacc/nested-reductions-2-routine.f90: Likewise. * gfortran.dg/goacc/parallel-tree.f95: Likewise. * gfortran.dg/goacc/pr93464.f90: Likewise. * gfortran.dg/goacc/privatization-1-compute-loop.f90: Likewise. * gfortran.dg/goacc/privatization-1-compute.f90: Likewise. * gfortran.dg/goacc/privatization-1-routine_gang-loop.f90: Likewise. * gfortran.dg/goacc/privatization-1-routine_gang.f90: Likewise. * gfortran.dg/goacc/uninit-dim-clause.f95: Likewise. * gfortran.dg/goacc/uninit-firstprivate-clause.f95: Likewise. * gfortran.dg/goacc/uninit-if-clause.f95: Likewise. * gfortran.dg/goacc/uninit-use-device-clause.f95: Likewise. * gfortran.dg/goacc/wait.f90: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: Document current '-Wuninitialized' diagnostics. * testsuite/libgomp.oacc-fortran/data-5.f90: Likewise. * testsuite/libgomp.oacc-fortran/gemm-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/gemm.f90: Likewise. * testsuite/libgomp.oacc-fortran/optional-reduction.f90: Likewise. * testsuite/libgomp.oacc-fortran/parallel-reduction.f90: Likewise. * testsuite/libgomp.oacc-fortran/pr70643.f90: Likewise. * testsuite/libgomp.oacc-fortran/pr96628-part1.f90: Likewise. * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/reduction-5.f90: Likewise. * testsuite/libgomp.oacc-fortran/reduction-7.f90: Likewise. * testsuite/libgomp.oacc-fortran/reference-reductions.f90: Likewise.
2022-01-13Host and offload targets have no common meaning of address spacesThomas Schwinge1-4/+0
gcc/ * tree-streamer-out.c (pack_ts_base_value_fields): Don't pack 'TYPE_ADDR_SPACE' for offloading. * tree-streamer-in.c (unpack_ts_base_value_fields): Don't unpack 'TYPE_ADDR_SPACE' for offloading. libgomp/ * testsuite/libgomp.c/address-space-1.c: Remove 'dg-xfail-run-if' for 'offload_device_intel_mic'.
2022-01-13Wait at end of OpenACC asynchronous kernels regionsJulian Brown1-1/+0
In OpenACC 'kernels' decomposition, we're improperly nesting synchronous and asynchronous data and compute regions, giving rise to data races when the asynchronicity is actually executed, as is visible in at least one test case with GCN offloading. The proper fix is to correctly use the asynchronous interfaces, making the currently synchronous data regions fully asynchronous (see also <https://gcc.gnu.org/PR97390> "[OpenACC] 'async' clause on 'data' construct", which is to share the same implementation), but that's for later; for now add some more synchronization. gcc/ * omp-oacc-kernels-decompose.cc (add_wait): New function, split out of... (add_async_clauses_and_wait): ...here. Call new outlined function. (decompose_kernels_region_body): Add wait at the end of explicitly-asynchronous kernels regions. libgomp/ * testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c: Remove GCN offloading execution XFAIL. Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2022-01-13OpenACC 'kernels' decomposition: Mark variables used in synthesized data ↵Thomas Schwinge3-28/+33
clauses as addressable [PR100280] ... as otherwise 'gcc/omp-low.c:lower_omp_target' has to create a temporary: 13073 else if (is_gimple_reg (var)) 13074 { 13075 gcc_assert (offloaded); 13076 tree avar = create_tmp_var (TREE_TYPE (var)); 13077 mark_addressable (avar); ..., which (a) is only implemented for actually *offloaded* regions (but not data regions), and (b) the subsequently synthesized code for writing to and later reading back from the temporary fundamentally conflicts with OpenACC 'async' (as used by OpenACC 'kernels' decomposition). That's all not trivial to make work, so let's just avoid this case. gcc/ PR middle-end/100280 * omp-oacc-kernels-decompose.cc (maybe_build_inner_data_region): Mark variables used in synthesized data clauses as addressable. gcc/testsuite/ PR middle-end/100280 * c-c++-common/goacc/kernels-decompose-pr100280-1.c: New. * c-c++-common/goacc/classify-kernels-parloops.c: Likewise. * c-c++-common/goacc/classify-kernels-unparallelized-parloops.c: Likewise. * c-c++-common/goacc/classify-kernels-unparallelized.c: Test '--param openacc-kernels=decompose'. * c-c++-common/goacc/classify-kernels.c: Likewise. * c-c++-common/goacc/kernels-decompose-2.c: Update. * c-c++-common/goacc/kernels-decompose-ice-1.c: Remove. * c-c++-common/goacc/kernels-decompose-ice-2.c: Likewise. * gfortran.dg/goacc/classify-kernels-parloops.f95: New. * gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95: Likewise. * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Test '--param openacc-kernels=decompose'. * gfortran.dg/goacc/classify-kernels.f95: Likewise. libgomp/ PR middle-end/100280 * testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose-ice-1.c: Update. * testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Likewise. Suggested-by: Julian Brown <julian@codesourcery.com>
2022-01-13Enhance OpenACC 'kernels' decomposition testingThomas Schwinge11-91/+434
gcc/testsuite/ * c-c++-common/goacc/kernels-decompose-1.c: Enhance. * c-c++-common/goacc/kernels-decompose-2.c: Likewise. * c-c++-common/goacc/kernels-decompose-ice-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-ice-2.c: Likewise. * gfortran.dg/goacc/kernels-decompose-1.f95: Likewise. * gfortran.dg/goacc/kernels-decompose-2.f95: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose-ice-1.c: Enhance. * testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/declare-vla.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/f-asyncwait-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/f-asyncwait-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Likewise. * testsuite/libgomp.oacc-fortran/asyncwait-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/asyncwait-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/asyncwait-3.f90: Likewise. * testsuite/libgomp.oacc-fortran/pr94358-1.f90: Likewise.
2022-01-04libgomp/testsuite: Improve omp_get_device_num() testsTobias Burnus2-19/+27
Related to r12-6208-gebc853deb7cc0487de9ef6e891a007ba853d1933 "libgomp: Fix GOMP_DEVICE_NUM_VAR stringification during offload image load". That commit fixed an issue with omp_get_device_num() on gcn/nvptx that resulted in it always returning the value 0. This commit modifies the tests to iterate over all devices, so that on a system with multiple non-host devices they would have detected that always-zero issue. libgomp/ChangeLog: * testsuite/libgomp.c-c++-common/target-45.c: Iterate over all devices. * testsuite/libgomp.fortran/target10.f90: Likewise.
2022-01-03Update copyright years.Jakub Jelinek4-4/+4
2021-12-13Fortran: Handle compare in OpenMP atomicTobias Burnus1-0/+313
gcc/fortran/ChangeLog: PR fortran/103576 * openmp.c (is_scalar_intrinsic_expr): Fix condition. (resolve_omp_atomic): Fix/update checks, accept compare. * trans-openmp.c (gfc_trans_omp_atomic): Handle compare. libgomp/ChangeLog: * libgomp.texi (OpenMP 5.1): Set Fortran support for atomic to 'Y'. * testsuite/libgomp.fortran/atomic-19.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/atomic-25.f90: Remove sorry, fix + add checks. * gfortran.dg/gomp/atomic-26.f90: Likewise. * gfortran.dg/gomp/atomic-21.f90: New test.
2021-12-10openmp: Fix libgomp.c++ testsuite errors for non-offload configsChung-Lin Tang3-1/+6
Some testcases for libgomp.c++ only work for non-shared address space offloading, because they exercise the zero-length array section behavior for offloaded address space, testing for NULL/non-NULL cases. libgomp/ChangeLog: * testsuite/libgomp.c++/target-lambda-1.C: Only run under "target offload_device_nonshared_as" * testsuite/libgomp.c++/target-this-3.C: Likewise. * testsuite/libgomp.c++/target-this-4.C: Likewise.
2021-12-09OpenMP 5.0: Remove array section base-pointer mapping semantics and other ↵Chung-Lin Tang10-54/+102
front-end adjustments This patch implements three pieces of functionality: (1) Adjust array section mapping to have standards-conforming behavior: mapping array sections should *NOT* also map the base-pointer: struct S { int *ptr; ... }; struct S s; Instead of generating this during gimplify: map(to:*_1 [len: 400]) map(attach:s.ptr [bias: 0]) Now, adjust to: (i.e. do not map the base-pointer together. The attach operation is still generated, and if s.ptr is already mapped prior, attachment will happen) The correct way of achieving the base-pointer-also-mapped behavior would be to use: (A small Fortran front-end patch to trans-openmp.c:gfc_trans_omp_array_section is also included, which removes generation of a GOMP_MAP_ALWAYS_POINTER for array types, which appears incorrect and causes a regression in libgomp.fortran/struct-elem-map-1.f90) (2) Related to the first item above, are fixes in libgomp/target.c to not overwrite attached pointers when handling device<->host copies, mainly for the "always" case. (3) The third is a set of changes to the C/C++ front-ends to extend the allowed component access syntax in map clauses. These changes are enabled for both OpenACC and OpenMP. gcc/c/ChangeLog: * c-parser.c (struct omp_dim): New struct type for use inside c_parser_omp_variable_list. (c_parser_omp_variable_list): Allow multiple levels of array and component accesses in array section base-pointer expression. (c_parser_omp_clause_to): Set 'allow_deref' to true in call to c_parser_omp_var_list_parens. (c_parser_omp_clause_from): Likewise. * c-typeck.c (handle_omp_array_sections_1): Extend allowed range of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. (c_finish_omp_clauses): Extend allowed range of expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. gcc/cp/ChangeLog: * parser.c (struct omp_dim): New struct type for use inside cp_parser_omp_var_list_no_open. 
(cp_parser_omp_var_list_no_open): Allow multiple levels of array and component accesses in array section base-pointer expression. (cp_parser_omp_all_clauses): Set 'allow_deref' to true in call to cp_parser_omp_var_list for to/from clauses. * semantics.c (handle_omp_array_sections_1): Extend allowed range of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. (handle_omp_array_sections): Adjust pointer map generation of references. (finish_omp_clauses): Extend allowed range of expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. gcc/fortran/ChangeLog: * trans-openmp.c (gfc_trans_omp_array_section): Do not generate GOMP_MAP_ALWAYS_POINTER map for main array maps of ARRAY_TYPE type. gcc/ChangeLog: * gimplify.c (extract_base_bit_offset): Add 'tree *offsetp' parameter, accommodate case where 'offset' return of get_inner_reference is non-NULL. (is_or_contains_p): Further robustify conditions. (omp_target_reorder_clauses): In alloc/to/from sorting phase, also move following GOMP_MAP_ALWAYS_POINTER maps along. Add new sorting phase where we make sure pointers with an attach/detach map are ordered correctly. (gimplify_scan_omp_clauses): Add modifications to avoid creating GOMP_MAP_STRUCT and associated alloc map for attach/detach maps. gcc/testsuite/ChangeLog: * c-c++-common/goacc/deep-copy-arrayofstruct.c: Adjust testcase. * c-c++-common/gomp/target-enter-data-1.c: New testcase. * c-c++-common/gomp/target-implicit-map-2.c: New testcase. libgomp/ChangeLog: * target.c (gomp_map_vars_existing): Make sure attached pointer is not overwritten during cross-host/device copying. (gomp_update): Likewise. (gomp_exit_data): Likewise. * testsuite/libgomp.c++/target-11.C: Adjust testcase. * testsuite/libgomp.c++/target-12.C: Likewise. * testsuite/libgomp.c++/target-15.C: Likewise. * testsuite/libgomp.c++/target-16.C: Likewise. * testsuite/libgomp.c++/target-17.C: Likewise. * testsuite/libgomp.c++/target-21.C: Likewise. 
* testsuite/libgomp.c++/target-23.C: Likewise. * testsuite/libgomp.c/target-23.c: Likewise. * testsuite/libgomp.c/target-29.c: Likewise. * testsuite/libgomp.c-c++-common/target-implicit-map-2.c: New testcase.
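To illustrate item (1) of this commit from the user's side, the C sketch below maps both the pointer member and the array section it points to. This is an assumption about typical usage rather than text from the commit: the function 'sum_section' and the member 'n' are hypothetical, and the explicit 'map(to: s.ptr, s.ptr[:n])' form is one way to request that the base pointer is mapped along with the section. Without '-fopenmp' the pragma is ignored and the loop runs on the host.

```c
#include <stdlib.h>

struct S { int *ptr; int n; };

/* Hypothetical example: sums n elements reachable through s.ptr inside a
   target region.  Returns -1 if allocation fails; expects n > 0.  */
long sum_section(int n)
{
  struct S s;
  long sum = 0;

  s.n = n;
  s.ptr = malloc(n * sizeof *s.ptr);
  if (s.ptr == NULL)
    return -1;
  for (int i = 0; i < n; i++)
    s.ptr[i] = i;

  /* Map the base pointer s.ptr explicitly in addition to the section;
     mapping only s.ptr[:n] no longer maps the base pointer as well.  */
  #pragma omp target map(to: s.ptr, s.ptr[:n]) map(tofrom: sum)
  for (int i = 0; i < n; i++)
    sum += s.ptr[i];

  free(s.ptr);
  return sum;
}
```

For 'n = 100' the elements are 0 through 99, so the function is expected to return 4950.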
2021-12-08openmp: Improve OpenMP target support for C++ (PR92120)Chung-Lin Tang8-0/+459
This patch implements several C++ specific mapping capabilities introduced for OpenMP 5.0, including implicit mapping of this[:1] for non-static member functions, zero-length array section mapping of pointer-typed members, lambda captured variable access in target regions, and use of lambda objects inside target regions. Several adjustments to the C/C++ front-ends to allow more member-access syntax as valid are also included. PR middle-end/92120 gcc/cp/ChangeLog: * cp-tree.h (finish_omp_target): New declaration. (finish_omp_target_clauses): Likewise. * parser.c (cp_parser_omp_clause_map): Adjust call to cp_parser_omp_var_list_no_open to set 'allow_deref' argument to true. (cp_parser_omp_target): Factor out code, adjust into calls to new function finish_omp_target. * pt.c (tsubst_expr): Add call to finish_omp_target_clauses for OMP_TARGET case. * semantics.c (handle_omp_array_sections_1): Add handling to create 'this->member' from 'member' FIELD_DECL. Remove case of rejecting 'this' when not in declare simd. (handle_omp_array_sections): Likewise. (finish_omp_clauses): Likewise. Adjust to allow 'this[]' in OpenMP map clauses. Handle 'A->member' case in map clauses. Remove case of rejecting 'this' when not in declare simd. (struct omp_target_walk_data): New struct for walking over target-directive tree body. (finish_omp_target_clauses_r): New function for tree walk. (finish_omp_target_clauses): New function. (finish_omp_target): New function. gcc/c/ChangeLog: * c-parser.c (c_parser_omp_clause_map): Set 'allow_deref' argument in call to c_parser_omp_variable_list to 'true'. * c-typeck.c (handle_omp_array_sections_1): Add strip of MEM_REF in array base handling. (c_finish_omp_clauses): Handle 'A->member' case in map clauses. gcc/ChangeLog: * gimplify.c ("tree-hash-traits.h"): Add include. (gimplify_scan_omp_clauses): Change struct_map_to_clause to type hash_map<tree_operand, tree> *. Adjust struct map handling to handle cases of *A and A->B expressions. 
Under !DECL_P case of GOMP_CLAUSE_MAP handling, add STRIP_NOPS for indir_p case, add to struct_deref_set for map(*ptr_to_struct) cases. Add MEM_REF case when handling component_ref_p case. Add unshare_expr and gimplification when created GOMP_MAP_STRUCT is not a DECL. Add code to add firstprivate pointer for *pointer-to-struct case. (gimplify_adjust_omp_clauses): Move GOMP_MAP_STRUCT removal code for exit data directives code to earlier position. * omp-low.c (lower_omp_target): Handle GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds. * tree-pretty-print.c (dump_omp_clause): Likewise. gcc/testsuite/ChangeLog: * gcc.dg/gomp/target-3.c: New testcase. * g++.dg/gomp/target-3.C: New testcase. * g++.dg/gomp/target-lambda-1.C: New testcase. * g++.dg/gomp/target-lambda-2.C: New testcase. * g++.dg/gomp/target-this-1.C: New testcase. * g++.dg/gomp/target-this-2.C: New testcase. * g++.dg/gomp/target-this-3.C: New testcase. * g++.dg/gomp/target-this-4.C: New testcase. * g++.dg/gomp/target-this-5.C: New testcase. * g++.dg/gomp/this-2.C: Adjust testcase. include/ChangeLog: * gomp-constants.h (enum gomp_map_kind): Add GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds. (GOMP_MAP_POINTER_P): Include GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION. libgomp/ChangeLog: * libgomp.h (gomp_attach_pointer): Add bool parameter. * oacc-mem.c (acc_attach_async): Update call to gomp_attach_pointer. (goacc_enter_data_internal): Likewise. * target.c (gomp_map_vars_existing): Update assert condition to include GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION. (gomp_map_pointer): Add 'bool allow_zero_length_array_sections' parameter, add support for mapping a pointer with NULL target. (gomp_attach_pointer): Add 'bool allow_zero_length_array_sections' parameter, add support for attaching a pointer with NULL target. 
(gomp_map_vars_internal): Update calls to gomp_map_pointer and gomp_attach_pointer, add handling for GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION cases. * testsuite/libgomp.c++/target-23.C: New testcase. * testsuite/libgomp.c++/target-lambda-1.C: New testcase. * testsuite/libgomp.c++/target-lambda-2.C: New testcase. * testsuite/libgomp.c++/target-this-1.C: New testcase. * testsuite/libgomp.c++/target-this-2.C: New testcase. * testsuite/libgomp.c++/target-this-3.C: New testcase. * testsuite/libgomp.c++/target-this-4.C: New testcase. * testsuite/libgomp.c++/target-this-5.C: New testcase.
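The C++ mapping features listed in this commit message can be sketched as follows. This is a minimal illustration and not one of the new testcases: the names Accumulator, sum_with_base and apply_lambda are hypothetical. It assumes a compiler with this patch and -fopenmp; without -fopenmp the pragmas are ignored and the code simply runs on the host.

```cpp
#include <cassert>

// Hypothetical example type; not taken from the GCC testsuite.
struct Accumulator {
    int base;
    int *data;  // pointer-typed member, mapped through an array section
    int n;

    // Non-static member function: per the commit message, 'this[:1]' is
    // mapped implicitly, so 'base' and 'n' are usable inside the region.
    int sum_with_base() {
        assert(data != nullptr || n == 0);
        int total = 0;
        #pragma omp target map(tofrom: total) map(to: data[:n])
        {
            for (int i = 0; i < n; ++i)
                total += data[i] + base;
        }
        return total;
    }
};

// A lambda object and its captured variable used inside a target region.
int apply_lambda(int x) {
    int offset = 10;
    auto add = [offset](int v) { return v + offset; };
    int result = 0;
    #pragma omp target map(from: result)
    {
        result = add(x);
    }
    return result;
}
```

Calling sum_with_base() on an object whose data points at a three-element array exercises the member-access path, and apply_lambda(5) exercises the lambda path. Whether either region is actually offloaded depends on the toolchain configuration.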
2021-12-02fortran: OpenMP/OpenACC array mapping alignment fix (PR90030)Chung-Lin Tang2-0/+32
Fix issue with the Fortran front-end when mapping arrays: when creating the data MEM_REF for the map clause, there was a convention of casting the referencing pointer to 'c_char *' by fold_convert (build_pointer_type (char_type_node), ptr). This causes the alignment passed to the libgomp runtime for array data to be hardwired to '1', which in turn causes alignment errors on the offload target. This patch fixes this by removing the char_type_node pointer converts, and adding gcc_asserts to ensure POINTER_TYPE_P (TREE_TYPE (ptr)). PR fortran/90030 gcc/fortran/ChangeLog: * trans-openmp.c (gfc_omp_finish_clause): Remove fold_convert to pointer to char_type_node, add gcc_assert of POINTER_TYPE_P. (gfc_trans_omp_array_section): Likewise. (gfc_trans_omp_clauses): Likewise. gcc/testsuite/ChangeLog: * gfortran.dg/goacc/finalize-1.f: Adjust scan test. * gfortran.dg/gomp/affinity-clause-1.f90: Likewise. * gfortran.dg/gomp/affinity-clause-5.f90: Likewise. * gfortran.dg/gomp/defaultmap-4.f90: Likewise. * gfortran.dg/gomp/defaultmap-5.f90: Likewise. * gfortran.dg/gomp/defaultmap-6.f90: Likewise. * gfortran.dg/gomp/map-3.f90: Likewise. * gfortran.dg/gomp/pr78260-2.f90: Likewise. * gfortran.dg/gomp/pr78260-3.f90: Likewise. libgomp/ChangeLog: * testsuite/libgomp.oacc-fortran/pr90030.f90: New test. * testsuite/libgomp.fortran/pr90030.f90: New test.
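The effect described above, where the alignment is derived from the pointee type and collapses to 1 after a cast to a character pointer, can be illustrated with a C++ analogy. This is only a sketch of the idea and not the Fortran front-end code: pointee_alignment and is_aligned_to are hypothetical helpers introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: the alignment implied by a pointer's pointee type.
template <typename T>
constexpr std::size_t pointee_alignment(const T *) {
    return alignof(T);
}

// Hypothetical helper: check an address against a required alignment.
inline bool is_aligned_to(const void *p, std::size_t alignment) {
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

Passing a double array to pointee_alignment yields alignof(double), while the same address viewed as 'const char *' yields 1. That loss of type information is what the commit message describes for the removed fold_convert.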
2021-11-30[OpenACC] Allow gang reductions inside serial constructsKwok Cheung Yeung1-1/+0
... fixing a regression introduced in the preceding commit 2b7dac2c0dcb087da9e4018943c023c0678234a3 "Make OpenACC orphan gang reductions errors". gcc/fortran/ * openmp.c (oacc_is_serial, oacc_is_parallel_or_serial): New. (resolve_oacc_loop_blocks): Use oacc_is_parallel_or_serial instead of oacc_is_parallel. libgomp/ * testsuite/libgomp.oacc-fortran/parallel-dims.f90: Remove temporary skip. Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
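For orientation, the kind of construct concerned is a gang reduction on a loop nested inside a 'serial' construct. The commit itself changes the Fortran front-end; the sketch below is a C++ analogue under that assumption, with sum_in_serial as a hypothetical function name. Without -fopenacc the pragmas are ignored and the loop runs sequentially on the host.

```cpp
#include <cassert>

// Hypothetical example: sum an array inside an OpenACC 'serial' construct,
// using a gang reduction on the enclosed loop.
int sum_in_serial(const int *v, int n) {
    assert(n >= 0);
    int sum = 0;
    #pragma acc serial copyin(v[0:n]) copy(sum)
    {
        #pragma acc loop gang reduction(+:sum)
        for (int i = 0; i < n; ++i)
            sum += v[i];
    }
    return sum;
}
```

A 'serial' construct executes with a single gang, so the reduction is trivial at run time, but the clause is still meant to be accepted there.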