aboutsummaryrefslogtreecommitdiff
path: root/libgomp/testsuite
AgeCommit message (Collapse)AuthorFilesLines
2022-03-10[OpenACC privatization] Analyze 'lookup_decl'-translated DECL [PR90115, ↵Thomas Schwinge17-61/+97
PR102330, PR104774] ... so that it matches what we analyze and what we action on. Fix-up for commit 29a2f51806c5b30e17a8d0e9ba7915a3c53c34ff "openacc: Add support for gang local storage allocation in shared memory [PR90115]". PR middle-end/90115 PR middle-end/102330 PR middle-end/104774 gcc/ * omp-low.cc (oacc_privatization_candidate_p) (oacc_privatization_scan_clause_chain) (oacc_privatization_scan_decl_chain, lower_oacc_private_marker): Analyze 'lookup_decl'-translated DECL. gcc/testsuite/ * c-c++-common/goacc/kernels-decompose-pr104061-1-3.c: Adjust. * c-c++-common/goacc/kernels-decompose-pr104061-1-4.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104132-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104133-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104774-1.c: Likewise. * c-c++-common/goacc/privatization-1-compute-loop.c: Likewise. * c-c++-common/goacc/privatization-1-compute.c: Likewise. * c-c++-common/goacc/privatization-1-routine_gang-loop.c: Likewise. * c-c++-common/goacc/privatization-1-routine_gang.c: Likewise. * gfortran.dg/goacc-gomp/pr102330-1.f90: Likewise, and subsume... * gfortran.dg/goacc-gomp/pr102330-2.f90: ... this file, and... * gfortran.dg/goacc-gomp/pr102330-3.f90: ... this file. * gfortran.dg/goacc/privatization-1-compute-loop.f90: Adjust. * gfortran.dg/goacc/privatization-1-compute.f90: Likewise. * gfortran.dg/goacc/privatization-1-routine_gang-loop.f90: Likewise. * gfortran.dg/goacc/privatization-1-routine_gang.f90: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Enhance. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-1.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-6.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-7.c: Likewise. * testsuite/libgomp.oacc-fortran/optional-private.f90: Likewise. * testsuite/libgomp.oacc-fortran/privatized-ref-1.f95: Likewise. * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise.
2022-03-10Enhance further testcases to verify handling of OpenACC privatization level ↵Thomas Schwinge4-12/+64
[PR90115] As originally introduced in commit 11b8286a83289f5b54e813f14ff56d730c3f3185 "[OpenACC privatization] Largely extend diagnostics and corresponding testsuite coverage [PR90115]". PR middle-end/90115 gcc/testsuite/ * c-c++-common/goacc/nesting-1.c: Enhance. * gcc.dg/goacc/nested-function-1.c: Likewise. * gcc.dg/goacc/nested-function-2.c: Likewise. * gfortran.dg/goacc/nested-function-1.f90: Likewise. libgomp/ * testsuite/libgomp.oacc-fortran/routine-1.f90: Enhance. * testsuite/libgomp.oacc-fortran/routine-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/routine-3.f90: Likewise. * testsuite/libgomp.oacc-fortran/routine-9.f90: Likewise.
2022-03-04Fix 'libgomp.oacc-c-c++-common/kernels-decompose-1.c' expected diagnosticsThomas Schwinge1-0/+2
Fix-up for recent commit 8935589b496f755e08cadf26d8ceddf0dd6e0968 "OMP lowering: Regimplify 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE' DECLs [PR100280, PR104132, PR104133]": adjust for a GCN offloading workaround added just before commit: '(volatile void *) &f1;'. PR testsuite/104791 libgomp/ * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Fix expected diagnostics.
2022-03-04Test 'libgomp.oacc-*/kernels-private-vars-*' with ↵Thomas Schwinge33-99/+361
'--param=openacc-kernels=decompose' [PR104784] Before recent commit 8935589b496f755e08cadf26d8ceddf0dd6e0968 "OMP lowering: Regimplify 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE' DECLs [PR100280, PR104132, PR104133]", 'libgomp.oacc-c' testing already worked fine, but 'libgomp.oacc-c++' testing ICEed. Via the commit mentioned, the C++ testing ICEs are now resolved, but the underlying issue remains to be looked into: PR104784 "OpenACC 'kernels' decomposition: C vs. C++ differences". PR middle-end/104784 libgomp/ * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-1.c: Test with '--param=openacc-kernels=decompose'. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-6.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-6.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-7.c: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-3.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-6.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-3.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-4.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-5.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-6.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-7.f90: Likewise.
2022-03-04Test '-fopt-info-omp-all' in 'libgomp.oacc-*/kernels-private-vars-*'Thomas Schwinge33-514/+811
libgomp/ * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-1.c: Test '-fopt-info-omp-all'. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-6.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-6.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-7.c: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-3.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-6.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-3.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-4.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-5.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-6.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-7.f90: Likewise.
2022-03-04OMP lowering: Regimplify 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE' DECLs ↵Thomas Schwinge1-4/+58
[PR100280, PR104132, PR104133] ... by generalizing the existing 'gcc/omp-low.cc:task_shared_vars'. Fix-up for commit 9b32c1669aad5459dd053424f9967011348add83 "OpenACC 'kernels' decomposition: Mark variables used in synthesized data clauses as addressable [PR100280]". PR middle-end/100280 PR middle-end/104132 PR middle-end/104133 gcc/ * omp-low.cc (task_shared_vars): Rename to 'make_addressable_vars'. Adjust all users. (scan_sharing_clauses) <OMP_CLAUSE_MAP> Use it for 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE' DECLs, too. gcc/testsuite/ * c-c++-common/goacc/kernels-decompose-pr104061-1-3.c: Adjust. * c-c++-common/goacc/kernels-decompose-pr104061-1-4.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104132-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104133-1.c: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Extend.
2022-03-04OpenACC 'kernels' decomposition: Move 'TREE_ADDRESSABLE' setting into OMP ↵Thomas Schwinge2-4/+8
lowering [PR100280] ... in preparation for later changes. No functional change. Follow-up to commit 9b32c1669aad5459dd053424f9967011348add83 "OpenACC 'kernels' decomposition: Mark variables used in synthesized data clauses as addressable [PR100280]". PR middle-end/100280 gcc/ * tree.h (OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE): New. * tree-core.h: Document it. * omp-low.cc (scan_sharing_clauses) <OMP_CLAUSE_MAP>: Handle 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE'. * omp-oacc-kernels-decompose.cc (maybe_build_inner_data_region): Set 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE' instead of 'TREE_ADDRESSABLE'. gcc/testsuite/ * c-c++-common/goacc/classify-kernels-unparallelized.c: Adjust. * c-c++-common/goacc/classify-kernels.c: Likewise. * c-c++-common/goacc/kernels-decompose-2.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr100280-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104061-1-2.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104061-1-3.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104061-1-4.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104132-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104133-1.c: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Likewise.
2022-03-04Add diagnostic: "note: OpenACC 'kernels' decomposition: variable '[...]' ↵Thomas Schwinge2-0/+4
declared in block made addressable" [PR100280] Follow-up to commit 9b32c1669aad5459dd053424f9967011348add83 "OpenACC 'kernels' decomposition: Mark variables used in synthesized data clauses as addressable [PR100280]". PR middle-end/100280 gcc/ * omp-oacc-kernels-decompose.cc (maybe_build_inner_data_region): Add diagnostic: "note: OpenACC 'kernels' decomposition: variable '[...]' declared in block made addressable". gcc/testsuite/ * c-c++-common/goacc/classify-kernels-unparallelized.c: Add '--param=openacc-privatization=noisy'. * c-c++-common/goacc/classify-kernels.c: Likewise. * c-c++-common/goacc/kernels-decompose-2.c: Adjust. * c-c++-common/goacc/kernels-decompose-pr100280-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104061-1-2.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104061-1-3.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104061-1-4.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104132-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-pr104133-1.c: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Likewise.
2022-02-28[libgomp, testsuite, nvptx] Add -mptx=_ in declare-variant-3-sm*.cTom de Vries6-6/+6
When running with target board unix/-foffload=-mptx=3.1, we run into: ... lto1: error: PTX version (-mptx) needs to be at least 4.2 to support \ selected -misa (sm_53)^M mkoffload: fatal error: x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned \ 1 exit status^M compilation terminated.^M ... FAIL: libgomp.c/declare-variant-3-sm53.c (test for excess errors) ... Fix this by adding -foffload=-mptx=_ in the libgomp.c/declare-variant-3-sm*.c test-cases. Tested on x86_64 with nvptx accelerator. libgomp/ChangeLog: 2022-02-28 Tom de Vries <tdevries@suse.de> * testsuite/libgomp.c/declare-variant-3-sm30.c: Add -foffload=-mptx=_. * testsuite/libgomp.c/declare-variant-3-sm35.c: Same. * testsuite/libgomp.c/declare-variant-3-sm53.c: Same. * testsuite/libgomp.c/declare-variant-3-sm70.c: Same. * testsuite/libgomp.c/declare-variant-3-sm75.c: Same. * testsuite/libgomp.c/declare-variant-3-sm80.c: Same.
2022-02-24[libgomp, testsuite, nvptx] Add libgomp.c/declare-variant-3-sm*.cTom de Vries7-0/+108
Add openmp test-cases that test the omp declare variant construct: ... #pragma omp declare variant (f30) match (device={isa("sm_30")}) ... using the available nvptx isas. Only the one for sm_30 is a dg-do run test-case, the other ones are dg-do link. Tested on x86_64 with nvptx accelerator. libgomp/ChangeLog: 2022-02-24 Tom de Vries <tdevries@suse.de> * testsuite/libgomp.c/declare-variant-3-sm30.c: New test. * testsuite/libgomp.c/declare-variant-3-sm35.c: New test. * testsuite/libgomp.c/declare-variant-3-sm53.c: New test. * testsuite/libgomp.c/declare-variant-3-sm70.c: New test. * testsuite/libgomp.c/declare-variant-3-sm75.c: New test. * testsuite/libgomp.c/declare-variant-3-sm80.c: New test. * testsuite/libgomp.c/declare-variant-3.h: New header file.
2022-02-22Fix OpenACC gang-redundant execution in ↵Thomas Schwinge1-10/+32
'libgomp.oacc-fortran/privatized-ref-2.f90' This was a latent problem, and this commit here now resolves a regression that after recent commit a78b1ab1df9ca44acc5638e8f9d0ae2e62bd65ed "amdgcn: Tune default OpenMP/OpenACC GPU utilization" we had (only) seen on a GCN offloading '-march=gfx908' system: {+WARNING: program timed out.+} [-PASS:-]{+FAIL:+} libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa -O0 execution test Same for other optimization levels. Make sure that we're not executing non-parallelized code in gang-redundant mode, by putting these parts into their own 'parallel' constructs, which then default to 'num_gangs(1)'. libgomp/ * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Fix OpenACC gang-redundant execution.
2022-02-22[libgomp, nvptx] Fix hang in gomp_team_barrier_wait_endTom de Vries3-28/+0
Consider the following omp fragment. ... #pragma omp target #pragma omp parallel num_threads (2) #pragma omp task ; ... This hangs at -O0 for nvptx. Investigating the behaviour gives us the following trace of events: - both threads execute GOMP_task, where they: - deposit a task, and - execute gomp_team_barrier_wake - thread 1 executes gomp_team_barrier_wait_end and, not being the last thread, proceeds to wait at the team barrier - thread 0 executes gomp_team_barrier_wait_end and, being the last thread, it calls gomp_barrier_handle_tasks, where it: - executes both tasks and marks the team barrier done - executes a gomp_team_barrier_wake which wakes up thread 1 - thread 1 exits the team barrier - thread 0 returns from gomp_barrier_handle_tasks and goes to wait at the team barrier. - thread 0 hangs. To understand why there is a hang here, it's good to understand how things are setup for nvptx. The libgomp/config/nvptx/bar.c implementation is a copy of the libgomp/config/linux/bar.c implementation, with uses of both futex_wake and do_wait replaced with uses of ptx insn bar.sync: ... if (bar->total > 1) asm ("bar.sync 1, %0;" : : "r" (32 * bar->total)); ... The point where thread 0 goes to wait at the team barrier, corresponds in the linux implementation with a do_wait. In the linux case, the call to do_wait doesn't hang, because it's waiting for bar->generation to become a certain value, and if bar->generation already has that value, it just proceeds, without any need for coordination with other threads. In the nvtpx case, the bar.sync waits until thread 1 joins it in the same logical barrier, which never happens: thread 1 is lingering in the thread pool at the thread pool barrier (using a different logical barrier), waiting to join a new team. The easiest way to fix this is to revert to the posix implementation for bar.{c,h}. That however falls back on a busy-waiting approach, and does not take advantage of the ptx bar.sync insn. Instead, we revert to the linux implementation for bar.c, and implement bar.c local functions futex_wait and futex_wake using the bar.sync insn. The bar.sync insn takes an argument specifying how many threads are participating, and that doesn't play well with the futex syntax where it's not clear in advance how many threads will be woken up. This is solved by waking up all waiting threads each time a futex_wait or futex_wake happens, and possibly going back to sleep with an updated thread count. Tested libgomp on x86_64 with nvptx accelerator. libgomp/ChangeLog: 2021-04-20 Tom de Vries <tdevries@suse.de> PR target/99555 * config/nvptx/bar.c (generation_to_barrier): New function, copied from config/rtems/bar.c. (futex_wait, futex_wake): New function. (do_spin, do_wait): New function, copied from config/linux/wait.h. (gomp_barrier_wait_end, gomp_barrier_wait_last) (gomp_team_barrier_wake, gomp_team_barrier_wait_end): (gomp_team_barrier_wait_cancel_end, gomp_team_barrier_cancel): Remove and replace with include of config/linux/bar.c. * config/nvptx/bar.h (gomp_barrier_t): Add fields waiters and lock. (gomp_barrier_init): Init new fields. * testsuite/libgomp.c-c++-common/task-detach-6.c: Remove nvptx-specific workarounds. * testsuite/libgomp.c/pr99555-1.c: Same. * testsuite/libgomp.fortran/task-detach-6.f90: Same.
2022-02-22[libgomp, testsuite, nvptx] Fix pr96390.c without CUDATom de Vries2-0/+2
When running the libgomp testsuite on x86_64 with nvptx accelerator, we run into: ... XPASS: libgomp.c/../libgomp.c-c++-common/pr96390.c (test for excess errors) FAIL: libgomp.c/../libgomp.c-c++-common/pr96390.c execution test ... The problem is that we're expecting the following ptxas error: ... XFAIL: libgomp.c/../libgomp.c-c++-common/pr96390.c (test for excess errors) Excess errors: ptxas /tmp/ccZYDw8N.o, line 90; error : Call to 'baz' requires call prototype ptxas /tmp/ccZYDw8N.o, line 90; error : Unknown symbol 'baz' ... But it's not triggered because ptxas is not in the path, so nvptx-none-as defaults to --no-verify. So instead, we run into the same error at execution time. Fix this by forcing verification using: ... /* { dg-additional-options "-foffload=-Wa,--verify" \ { target offload_target_nvptx } } */ ... such that we run into the xfail in this way instead: ... XFAIL: libgomp.c/../libgomp.c-c++-common/pr96390.c (test for excess errors) Excess errors: nvptx-as: error trying to exec 'ptxas': execvp: No such file or directory nvptx-as: ptxas returned 255 exit status ... Tested on x86_64-linux with nvptx accelerator. libgomp/ChangeLog: 2022-02-21 Tom de Vries <tdevries@suse.de> PR testsuite/104146 * testsuite/libgomp.c++/pr96390.C: Add additional-option -foffload=-Wa,--verify for nvptx. * testsuite/libgomp.c-c++-common/pr96390.c: Same.
2022-02-15Fortran/OpenMP: Fix depend-clause handlingTobias Burnus1-0/+109
gcc/fortran/ChangeLog: * trans-openmp.cc (gfc_trans_omp_clauses, gfc_trans_omp_depobj): Depend on the proper addr, for ptr/alloc depend on pointee. libgomp/ChangeLog: * testsuite/libgomp.fortran/depend-4.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/depend-4.f90: New test. * gfortran.dg/gomp/depend-5.f90: New test.
2022-02-10OpenMP/C++: Permit mapping classes with virtual members [PR102204]Tobias Burnus1-0/+50
PR c++/102204 gcc/cp/ChangeLog: * decl2.cc (cp_omp_mappable_type_1): Remove check for virtual members as those are permitted since OpenMP 5.0. libgomp/ChangeLog: * testsuite/libgomp.c++/target-virtual-1.C: New test. gcc/testsuite/ChangeLog: * g++.dg/gomp/unmappable-1.C: Remove previously expected dg-message.
2022-02-09C, C++, Fortran, OpenMP: Add 'has_device_addr' clause to 'target' construct.Marcel Vollweiler10-0/+478
This patch adds the 'has_device_addr' clause to the OpenMP 'target' construct which was introduced in OpenMP 5.1 (OpenMP API 5.1 specification pp. 197ff): has_device_addr(list) "The has_device_addr clause indicates that its list items already have device addresses and therefore they may be directly accessed from a target device. If the device address of a list item is not for the device on which the target region executes, accessing the list item inside the region results in unspecified behavior. The list items may include array sections." (p. 200) "A list item may not be specified in both an is_device_ptr clause and a has_device_addr clause on the directive." (p. 202) "A list item that appears in an is_device_ptr or a has_device_addr clause must not be specified in any data-sharing attribute clause on the same target construct." (p. 203) gcc/c-family/ChangeLog: * c-omp.cc (c_omp_split_clauses): Added OMP_CLAUSE_HAS_DEVICE_ADDR case. * c-pragma.h (enum pragma_kind): Added 5.1 in comment. (enum pragma_omp_clause): Added PRAGMA_OMP_CLAUSE_HAS_DEVICE_ADDR. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_clause_name): Parse 'has_device_addr' clause. (c_parser_omp_variable_list): Handle array sections. (c_parser_omp_clause_has_device_addr): Added. (c_parser_omp_all_clauses): Added PRAGMA_OMP_CLAUSE_HAS_DEVICE_ADDR case. (c_parser_omp_target_exit_data): Added HAS_DEVICE_ADDR to OMP_CLAUSE_MASK. * c-typeck.cc (handle_omp_array_sections): Handle clause restrictions. (c_finish_omp_clauses): Handle array sections. gcc/cp/ChangeLog: * parser.cc (cp_parser_omp_clause_name): Parse 'has_device_addr' clause. (cp_parser_omp_var_list_no_open): Handle array sections. (cp_parser_omp_all_clauses): Added PRAGMA_OMP_CLAUSE_HAS_DEVICE_ADDR case. (cp_parser_omp_target_update): Added HAS_DEVICE_ADDR to OMP_CLAUSE_MASK. * semantics.cc (handle_omp_array_sections): Handle clause restrictions. (finish_omp_clauses): Handle array sections. gcc/fortran/ChangeLog: * dump-parse-tree.cc (show_omp_clauses): Added OMP_LIST_HAS_DEVICE_ADDR case. * gfortran.h: Added OMP_LIST_HAS_DEVICE_ADDR. * openmp.cc (enum omp_mask2): Added OMP_CLAUSE_HAS_DEVICE_ADDR. (gfc_match_omp_clauses): Parse HAS_DEVICE_ADDR clause. (resolve_omp_clauses): Same. * trans-openmp.cc (gfc_trans_omp_variable_list): Added OMP_LIST_HAS_DEVICE_ADDR case. (gfc_trans_omp_clauses): Firstprivatize of array descriptors. gcc/ChangeLog: * gimplify.cc (gimplify_scan_omp_clauses): Added cases for OMP_CLAUSE_HAS_DEVICE_ADDR and handle array sections. (gimplify_adjust_omp_clauses): Added OMP_CLAUSE_HAS_DEVICE_ADDR case. * omp-low.cc (scan_sharing_clauses): Handle OMP_CLAUSE_HAS_DEVICE_ADDR. (lower_omp_target): Same. * tree-core.h (enum omp_clause_code): Same. * tree-nested.cc (convert_nonlocal_omp_clauses): Same. (convert_local_omp_clauses): Same. * tree-pretty-print.cc (dump_omp_clause): Same. * tree.cc: Same. libgomp/ChangeLog: * libgomp.texi: Updated entry for HAS_DEVICE_ADDR. * target.c (copy_firstprivate_data): Copy only if host address is not NULL. * testsuite/libgomp.c++/target-has-device-addr-2.C: New test. * testsuite/libgomp.c++/target-has-device-addr-4.C: New test. * testsuite/libgomp.c++/target-has-device-addr-5.C: New test. * testsuite/libgomp.c++/target-has-device-addr-6.C: New test. * testsuite/libgomp.c-c++-common/target-has-device-addr-1.c: New test. * testsuite/libgomp.c/target-has-device-addr-3.c: New test. * testsuite/libgomp.fortran/target-has-device-addr-1.f90: New test. * testsuite/libgomp.fortran/target-has-device-addr-2.f90: New test. * testsuite/libgomp.fortran/target-has-device-addr-3.f90: New test. * testsuite/libgomp.fortran/target-has-device-addr-4.f90: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/clauses-1.c: Added has_device_addr to test cases. * g++.dg/gomp/attrs-1.C: Added has_device_addr to test cases. * g++.dg/gomp/attrs-2.C: Added has_device_addr to test cases. * c-c++-common/gomp/target-has-device-addr-1.c: New test. * c-c++-common/gomp/target-has-device-addr-2.c: New test. * c-c++-common/gomp/target-is-device-ptr-1.c: New test. * c-c++-common/gomp/target-is-device-ptr-2.c: New test. * gfortran.dg/gomp/is_device_ptr-3.f90: New test. * gfortran.dg/gomp/target-has-device-addr-1.f90: New test. * gfortran.dg/gomp/target-has-device-addr-2.f90: New test.
2022-02-08libgomp: Fix segfault with posthumous orphan tasks [PR104385]Jakub Jelinek1-0/+26
The following patch fixes crashes with posthumous orphan tasks. When a parent task finishes, gomp_clear_parent clears the parent pointers of its children tasks present in the parent->children_queue. But children that are still waiting for dependencies aren't in that queue yet, they will be added there only when the sibling they are waiting for exits. Unfortunately we were adding those tasks into the queues with the original task->parent which then causes crashes because that task is gone and freed. The following patch fixes that by clearing the parent field when we schedule such task for running by adding it into the queues and we know that the sibling task which is about to finish has NULL parent. 2022-02-08 Jakub Jelinek <jakub@redhat.com> PR libgomp/104385 * task.c (gomp_task_run_post_handle_dependers): If parent is NULL, clear task->parent. * testsuite/libgomp.c/pr104385.c: New test.
2022-02-04libgomp.fortran/allocate-1.f90: Fix minor cleanupTobias Burnus1-3/+0
libgomp/ChangeLog: * testsuite/libgomp.fortran/allocate-1.f90: Remove spurious STOP of previous commit.
2022-02-04libgomp.fortran/allocate-1.f90: Minor cleanupTobias Burnus2-56/+60
libgomp/ChangeLog: * testsuite/libgomp.fortran/allocate-1.c (is_64bit_aligned): Renamed from is_64bit_aligned_. * testsuite/libgomp.fortran/allocate-1.f90: Fix interface decl and use it, more implicit none, remove unused argument.
2022-02-01[nvptx] Add some support for .local atomicsTom de Vries3-21/+0
The ptx insn atom doesn't support local memory. In case of doing an atomic operation on local memory, we run into: ... operation not supported on global/shared address space ... This is the cuGetErrorString message for CUDA_ERROR_INVALID_ADDRESS_SPACE. The message is somewhat confusing given that actually the operation is not supported on local address space. Fix this by falling back on a non-atomic version when detecting a frame-related memory operand. This only solves some cases that are detected at compile-time. It does however fix the openacc private-atomic-* test-cases. Tested on x86_64 with nvptx accelerator. gcc/ChangeLog: 2022-01-27 Tom de Vries <tdevries@suse.de> * config/nvptx/nvptx.md (define_insn "atomic_compare_and_swap<mode>_1") (define_insn "atomic_exchange<mode>") (define_insn "atomic_fetch_add<mode>") (define_insn "atomic_fetch_addsf") (define_insn "atomic_fetch_<logic><mode>"): Output non-atomic version if memory operands is frame-relative. gcc/testsuite/ChangeLog: 2022-01-31 Tom de Vries <tdevries@suse.de> * gcc.target/nvptx/stack-atomics-run.c: New test. libgomp/ChangeLog: 2022-01-27 Tom de Vries <tdevries@suse.de> * testsuite/libgomp.oacc-c-c++-common/private-atomic-1.c: Remove PR83812 workaround. * testsuite/libgomp.oacc-fortran/private-atomic-1-vector.f90: Same. * testsuite/libgomp.oacc-fortran/private-atomic-1-worker.f90: Same.
2022-02-01[libgomp, testsuite] Fix insufficient resources in test-casesTom de Vries3-3/+25
When running libgomp test-case broadcast-many.c on an nvptx accelerator (T400, driver version 470.86), I run into: ... libgomp: The Nvidia accelerator has insufficient resources to launch \ 'main$_omp_fn$0' with num_workers = 32 and vector_length = 32; \ recompile the program with 'num_workers = x and vector_length = y' on \ that offloaded region or '-fopenacc-dim=:x:y' where x * y <= 896. FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/broadcast-many.c \ -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none \ -O0 execution test ... The error does not occur when using GOMP_NVPTX_JIT=-O0. Fix this by using 896 / 32 == 28 workers for ACC_DEVICE_TYPE_nvidia. Likewise for some other test-cases. Tested libgomp on x86_64 with nvptx accelerator. libgomp/ChangeLog: 2022-01-27 Tom de Vries <tdevries@suse.de> * testsuite/libgomp.oacc-c-c++-common/broadcast-many.c: Reduce num_workers for nvidia accelerator to fix libgomp error 'insufficient resources'. * testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-4.c: Same. * testsuite/libgomp.oacc-c-c++-common/reduction-7.c: Same.
2022-02-01[libgomp, testsuite] Reduce recursion depth in declare_target-*.f90Tom de Vries2-3/+7
When running the libgomp testsuite with GOMP_NVPTX_JIT=-O0 using an nvptx accelerator (Nvidia T400, 2GB), I run into: ... libgomp: cuCtxSynchronize error: unspecified launch failure \ (perhaps abort was called) libgomp: cuMemFree_v2 error: unspecified launch failure libgomp: device finalization failed FAIL: libgomp.fortran/examples-4/declare_target-1.f90 -O0 execution test ... The test-case contains: ... ! Reduced from 25 to 23, otherwise execution runs out of thread stack on ! Nvidia Titan V. if (fib (23) /= fib_wrapper (23)) stop 2 ... Fix this by reducing the fib/fib_wrapper argument from 23 to 22. Same for declare_target-2.f90. Tested on x86_64 with nvptx accelerator. libgomp/ChangeLog: 2022-01-27 Tom de Vries <tdevries@suse.de> * testsuite/libgomp.fortran/examples-4/declare_target-1.f90: Reduce recursion depth. * testsuite/libgomp.fortran/examples-4/declare_target-2.f90: Same.
2022-01-21Strengthen a few OpenACC test casesThomas Schwinge15-57/+202
Rather than rubber-stamp whatever requested vs. actual device kernel launch configuration happens, actually (again) verify the requested values (modulo expected variations). This better highlights that "AMD GCN has an upper limit of 'num_workers(16)'", and the deficiency that "AMD GCN uses the autovectorizer for the vector dimension: the use of a function call in vector-partitioned code [...] is not currently supported". And, this removes several instances of race conditions, where variables are concurrently written to in OpenACC gang-redundant mode. libgomp/ * testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c: Strengthen. * testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-v-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-v-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-w-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-gwv-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-v-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-w-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-wv-1.c: Likewise.
2022-01-19nvptx: update fix for -Wformat-diagMartin Liska16-33/+33
gcc/ChangeLog: * config/nvptx/nvptx.cc (nvptx_goacc_validate_dims_1): Update warning messages. libgomp/ChangeLog: * testsuite/libgomp.oacc-c++/privatized-ref-2.C: Update scanning patterns. * testsuite/libgomp.oacc-c++/privatized-ref-3.C: Likewise. * testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/pr85486.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/pr95270-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-nohost-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/struct-copyout-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/struct-copyout-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-64-1.c: Likewise. * testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/derivedtypes-arrays-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-loop-2.f95: Likewise. * testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise. * testsuite/libgomp.oacc-fortran/privatized-ref-1.f95: Likewise. Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
2022-01-18nvptx: fix -Wformat-diag warningsMartin Liska16-33/+33
gcc/ChangeLog: * config/nvptx/nvptx.cc (nvptx_goacc_validate_dims_1): Wrap keyword. * config/nvptx/nvptx.md: Remove trailing dot. libgomp/ChangeLog: * testsuite/libgomp.oacc-c++/privatized-ref-2.C: Update keyword in dg-warning. * testsuite/libgomp.oacc-c++/privatized-ref-3.C: Likewise. * testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/pr85486.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/pr95270-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-nohost-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/struct-copyout-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/struct-copyout-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-64-1.c: Likewise. * testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/derivedtypes-arrays-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-loop-2.f95: Likewise. * testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise. * testsuite/libgomp.oacc-fortran/privatized-ref-1.f95: Likewise.
2022-01-17Extend test cases for references in OpenACC 'private' clausesThomas Schwinge3-22/+364
libgomp/ * testsuite/libgomp.oacc-c++/privatized-ref-2.C: Extend. * testsuite/libgomp.oacc-c++/privatized-ref-3.C: Likewise. * testsuite/libgomp.oacc-fortran/privatized-ref-1.f95: Likewise.
2022-01-17Test cases for references in OpenACC 'private' clausesJulian Brown3-0/+211
libgomp/ * testsuite/libgomp.oacc-fortran/privatized-ref-1.f95: New test. * testsuite/libgomp.oacc-c++/privatized-ref-2.C: New test. * testsuite/libgomp.oacc-c++/privatized-ref-3.C: New test. Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
2022-01-13Add support for allocate clause (OpenMP 5.0).Hafiz Abid Qadeer2-0/+340
This patch adds support for OpenMP 5.0 allocate clause for fortran. It does not yet support the allocator-modifier as specified in OpenMP 5.1. The allocate clause is already supported in C/C++. gcc/fortran/ChangeLog: * dump-parse-tree.c (show_omp_clauses): Handle OMP_LIST_ALLOCATE. * gfortran.h (OMP_LIST_ALLOCATE): New enum value. * openmp.c (enum omp_mask1): Add OMP_CLAUSE_ALLOCATE. (gfc_match_omp_clauses): Handle OMP_CLAUSE_ALLOCATE (OMP_PARALLEL_CLAUSES, OMP_DO_CLAUSES, OMP_SECTIONS_CLAUSES) (OMP_TASK_CLAUSES, OMP_TASKLOOP_CLAUSES, OMP_TARGET_CLAUSES) (OMP_TEAMS_CLAUSES, OMP_DISTRIBUTE_CLAUSES) (OMP_SINGLE_CLAUSES): Add OMP_CLAUSE_ALLOCATE. (OMP_TASKGROUP_CLAUSES): New. (gfc_match_omp_taskgroup): Use OMP_TASKGROUP_CLAUSES instead of OMP_CLAUSE_TASK_REDUCTION. (resolve_omp_clauses): Handle OMP_LIST_ALLOCATE. (resolve_omp_do): Avoid warning when loop iteration variable is in allocate clause. * trans-openmp.c (gfc_trans_omp_clauses): Handle translation of allocate clause. (gfc_split_omp_clauses): Update for OMP_LIST_ALLOCATE. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/allocate-1.f90: New test. * gfortran.dg/gomp/allocate-2.f90: New test. * gfortran.dg/gomp/allocate-3.f90: New test. * gfortran.dg/gomp/collapse1.f90: Update error message. * gfortran.dg/gomp/openmp-simd-4.f90: Likewise. * gfortran.dg/gomp/clauses-1.f90: Uncomment allocate clause. libgomp/ChangeLog: * testsuite/libgomp.fortran/allocate-1.c: New test. * testsuite/libgomp.fortran/allocate-1.f90: New test. * libgomp.texi: Remove string that says that allocate clause support is for C/C++ only.
2022-01-13Improve Intel MIC offloading XFAILing for 'omp_get_device_num'Thomas Schwinge4-3/+36
After recent commit be661959a6b6d8f9c3c8608a746789e7b2ec3ca4 "libgomp/testsuite: Improve omp_get_device_num() tests", we're now iterating over all OpenMP target devices. Intel MIC (emulated) offloading still doesn't properly implement device-side 'omp_get_device_num', and we thus regress: PASS: libgomp.c/../libgomp.c-c++-common/target-45.c (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.c/../libgomp.c-c++-common/target-45.c execution test PASS: libgomp.c++/../libgomp.c-c++-common/target-45.c (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.c++/../libgomp.c-c++-common/target-45.c execution test PASS: libgomp.fortran/target10.f90 -O0 (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.fortran/target10.f90 -O0 execution test PASS: libgomp.fortran/target10.f90 -O1 (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.fortran/target10.f90 -O1 execution test PASS: libgomp.fortran/target10.f90 -O2 (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.fortran/target10.f90 -O2 execution test PASS: libgomp.fortran/target10.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.fortran/target10.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test PASS: libgomp.fortran/target10.f90 -O3 -g (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.fortran/target10.f90 -O3 -g execution test PASS: libgomp.fortran/target10.f90 -Os (test for excess errors) [-PASS:-]{+FAIL:+} libgomp.fortran/target10.f90 -Os execution test Improve the XFAILing added in commit bb75b22aba254e8ff144db27b1c8b4804bad73bb "Allow matching Intel MIC in OpenMP 'declare variant'" for the case that *any* Intel MIC offload device is available. libgomp/ * testsuite/libgomp.c-c++-common/on_device_arch.h (any_device_arch, any_device_arch_intel_mic): New. * testsuite/lib/libgomp.exp (check_effective_target_offload_device_any_intel_mic): New. * testsuite/libgomp.c-c++-common/target-45.c: Use it. * testsuite/libgomp.fortran/target10.f90: Likewise.
2022-01-13Document current '-Wuninitialized' diagnostics for ↵Thomas Schwinge1-1/+13
'libgomp.oacc-fortran/routine-10.f90' [PR102192] libgomp/ PR tree-optimization/102192 * testsuite/libgomp.oacc-fortran/routine-10.f90: Document current '-Wuninitialized' diagnostics.
2022-01-13Document current '-Wuninitialized'/'-Wmaybe-uninitialized' diagnostics for ↵Thomas Schwinge12-3/+108
OpenACC test cases ... including "note: '[...]' was declared here" emitted since recent commit 9695e1c23be5b5c55d572ced152897313ddb96ae "Improve -Wuninitialized note location". For those that seemed incorrect to me, I've placed XFAILed 'dg-bogus'es, including one more instance of PR77504 etc., and several instances where for "local variables" of reference-data-type reductions (etc.?) we emit bogus (?) diagnostics. For implicit data clauses (including 'firstprivate'), we seem to be missing diagnostics, so I've placed XFAILed 'dg-warning's. gcc/testsuite/ * c-c++-common/goacc/builtin-goacc-parlevel-id-size.c: Document current '-Wuninitialized' diagnostics. * c-c++-common/goacc/mdc-1.c: Likewise. * c-c++-common/goacc/nested-reductions-1-kernels.c: Likewise. * c-c++-common/goacc/nested-reductions-1-parallel.c: Likewise. * c-c++-common/goacc/nested-reductions-1-routine.c: Likewise. * c-c++-common/goacc/nested-reductions-2-kernels.c: Likewise. * c-c++-common/goacc/nested-reductions-2-parallel.c: Likewise. * c-c++-common/goacc/nested-reductions-2-routine.c: Likewise. * c-c++-common/goacc/uninit-dim-clause.c: Likewise. * c-c++-common/goacc/uninit-firstprivate-clause.c: Likewise. * c-c++-common/goacc/uninit-if-clause.c: Likewise. * gfortran.dg/goacc/array-with-dt-1.f90: Likewise. * gfortran.dg/goacc/array-with-dt-2.f90: Likewise. * gfortran.dg/goacc/array-with-dt-3.f90: Likewise. * gfortran.dg/goacc/array-with-dt-4.f90: Likewise. * gfortran.dg/goacc/array-with-dt-5.f90: Likewise. * gfortran.dg/goacc/derived-chartypes-1.f90: Likewise. * gfortran.dg/goacc/derived-chartypes-2.f90: Likewise. * gfortran.dg/goacc/derived-chartypes-3.f90: Likewise. * gfortran.dg/goacc/derived-chartypes-4.f90: Likewise. * gfortran.dg/goacc/derived-classtypes-1.f95: Likewise. * gfortran.dg/goacc/derived-types-2.f90: Likewise. * gfortran.dg/goacc/host_data-tree.f95: Likewise. * gfortran.dg/goacc/kernels-tree.f95: Likewise. * gfortran.dg/goacc/modules.f95: Likewise. * gfortran.dg/goacc/nested-reductions-1-kernels.f90: Likewise. * gfortran.dg/goacc/nested-reductions-1-parallel.f90: Likewise. * gfortran.dg/goacc/nested-reductions-1-routine.f90: Likewise. * gfortran.dg/goacc/nested-reductions-2-kernels.f90: Likewise. * gfortran.dg/goacc/nested-reductions-2-parallel.f90: Likewise. * gfortran.dg/goacc/nested-reductions-2-routine.f90: Likewise. * gfortran.dg/goacc/parallel-tree.f95: Likewise. * gfortran.dg/goacc/pr93464.f90: Likewise. * gfortran.dg/goacc/privatization-1-compute-loop.f90: Likewise. * gfortran.dg/goacc/privatization-1-compute.f90: Likewise. * gfortran.dg/goacc/privatization-1-routine_gang-loop.f90: Likewise. * gfortran.dg/goacc/privatization-1-routine_gang.f90: Likewise. * gfortran.dg/goacc/uninit-dim-clause.f95: Likewise. * gfortran.dg/goacc/uninit-firstprivate-clause.f95: Likewise. * gfortran.dg/goacc/uninit-if-clause.f95: Likewise. * gfortran.dg/goacc/uninit-use-device-clause.f95: Likewise. * gfortran.dg/goacc/wait.f90: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: Document current '-Wuninitialized' diagnostics. * testsuite/libgomp.oacc-fortran/data-5.f90: Likewise. * testsuite/libgomp.oacc-fortran/gemm-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/gemm.f90: Likewise. * testsuite/libgomp.oacc-fortran/optional-reduction.f90: Likewise. * testsuite/libgomp.oacc-fortran/parallel-reduction.f90: Likewise. * testsuite/libgomp.oacc-fortran/pr70643.f90: Likewise. * testsuite/libgomp.oacc-fortran/pr96628-part1.f90: Likewise. * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/reduction-5.f90: Likewise. * testsuite/libgomp.oacc-fortran/reduction-7.f90: Likewise. * testsuite/libgomp.oacc-fortran/reference-reductions.f90: Likewise.
2022-01-13Host and offload targets have no common meaning of address spacesThomas Schwinge1-4/+0
gcc/ * tree-streamer-out.c (pack_ts_base_value_fields): Don't pack 'TYPE_ADDR_SPACE' for offloading. * tree-streamer-in.c (unpack_ts_base_value_fields): Don't unpack 'TYPE_ADDR_SPACE' for offloading. libgomp/ * testsuite/libgomp.c/address-space-1.c: Remove 'dg-xfail-run-if' for 'offload_device_intel_mic'.
2022-01-13Wait at end of OpenACC asynchronous kernels regionsJulian Brown1-1/+0
In OpenACC 'kernels' decomposition, we're improperly nesting synchronous and asynchronous data and compute regions, giving rise to data races when the asynchronicity is actually executed, as is visible in at least on test case with GCN offloading. The proper fix is to correctly use the asynchronous interfaces, making the currently synchronous data regions fully asynchronous (see also <https://gcc.gnu.org/PR97390> "[OpenACC] 'async' clause on 'data' construct", which is to share the same implementation), but that's for later; for now add some more synchronization. gcc/ * omp-oacc-kernels-decompose.cc (add_wait): New function, split out of... (add_async_clauses_and_wait): ...here. Call new outlined function. (decompose_kernels_region_body): Add wait at the end of explicitly-asynchronous kernels regions. libgomp/ * testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c: Remove GCN offloading execution XFAIL. Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2022-01-13OpenACC 'kernels' decomposition: Mark variables used in synthesized data ↵Thomas Schwinge3-28/+33
clauses as addressable [PR100280] ... as otherwise 'gcc/omp-low.c:lower_omp_target' has to create a temporary: 13073 else if (is_gimple_reg (var)) 13074 { 13075 gcc_assert (offloaded); 13076 tree avar = create_tmp_var (TREE_TYPE (var)); 13077 mark_addressable (avar); ..., which (a) is only implemented for actualy *offloaded* regions (but not data regions), and (b) the subsequently synthesized code for writing to and later reading back from the temporary fundamentally conflicts with OpenACC 'async' (as used by OpenACC 'kernels' decomposition). That's all not trivial to make work, so let's just avoid this case. gcc/ PR middle-end/100280 * omp-oacc-kernels-decompose.cc (maybe_build_inner_data_region): Mark variables used in synthesized data clauses as addressable. gcc/testsuite/ PR middle-end/100280 * c-c++-common/goacc/kernels-decompose-pr100280-1.c: New. * c-c++-common/goacc/classify-kernels-parloops.c: Likewise. * c-c++-common/goacc/classify-kernels-unparallelized-parloops.c: Likewise. * c-c++-common/goacc/classify-kernels-unparallelized.c: Test '--param openacc-kernels=decompose'. * c-c++-common/goacc/classify-kernels.c: Likewise. * c-c++-common/goacc/kernels-decompose-2.c: Update. * c-c++-common/goacc/kernels-decompose-ice-1.c: Remove. * c-c++-common/goacc/kernels-decompose-ice-2.c: Likewise. * gfortran.dg/goacc/classify-kernels-parloops.f95: New. * gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95: Likewise. * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Test '--param openacc-kernels=decompose'. * gfortran.dg/goacc/classify-kernels.f95: Likewise. libgomp/ PR middle-end/100280 * testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose-ice-1.c: Update. * testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Likewise. Suggested-by: Julian Brown <julian@codesourcery.com>
2022-01-13Enhance OpenACC 'kernels' decomposition testingThomas Schwinge11-91/+434
gcc/testsuite/ * c-c++-common/goacc/kernels-decompose-1.c: Enhance. * c-c++-common/goacc/kernels-decompose-2.c: Likewise. * c-c++-common/goacc/kernels-decompose-ice-1.c: Likewise. * c-c++-common/goacc/kernels-decompose-ice-2.c: Likewise. * gfortran.dg/goacc/kernels-decompose-1.f95: Likewise. * gfortran.dg/goacc/kernels-decompose-2.f95: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose-ice-1.c: Enhance. * testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/declare-vla.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/f-asyncwait-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/f-asyncwait-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Likewise. * testsuite/libgomp.oacc-fortran/asyncwait-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/asyncwait-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/asyncwait-3.f90: Likewise. * testsuite/libgomp.oacc-fortran/pr94358-1.f90: Likewise.
2022-01-04libgomp/testsuite: Improve omp_get_device_num() testsTobias Burnus2-19/+27
Related to r12-6208-gebc853deb7cc0487de9ef6e891a007ba853d1933 "libgomp: Fix GOMP_DEVICE_NUM_VAR stringification during offload image load" That commit fixed an issue with omp_get_device_num() on gcn/nvptx that resulted in having always the value 0. This commit modifies the tests to iterate over all devices such that on a multi-nonhost-device system it had detected that always-zero issue. libgomp/ChangeLog: * testsuite/libgomp.c-c++-common/target-45.c: Iterate over all devices. * testsuite/libgomp.fortran/target10.f90: Likewise.
2022-01-03Update copyright years.Jakub Jelinek4-4/+4
2021-12-13Fortran: Handle compare in OpenMP atomicTobias Burnus1-0/+313
gcc/fortran/ChangeLog: PR fortran/103576 * openmp.c (is_scalar_intrinsic_expr): Fix condition. (resolve_omp_atomic): Fix/update checks, accept compare. * trans-openmp.c (gfc_trans_omp_atomic): Handle compare. libgomp/ChangeLog: * libgomp.texi (OpenMP 5.1): Set Fortran support for atomic to 'Y'. * testsuite/libgomp.fortran/atomic-19.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/atomic-25.f90: Remove sorry, fix + add checks. * gfortran.dg/gomp/atomic-26.f90: Likewise. * gfortran.dg/gomp/atomic-21.f90: New test.
2021-12-10openmp: Fix libgomp.c++ testsuite errors for non-offload configsChung-Lin Tang3-1/+6
Some testcases for libgomp.c++ only works for non-shared address space offloading, because it exercises the zero-length array section behavior for offloaded address space, testing for NULL/non-NULL cases. libgomp/ChangeLog: * testsuite/libgomp.c++/target-lambda-1.C: Only run under "target offload_device_nonshared_as" * testsuite/libgomp.c++/target-this-3.C: Likewise. * testsuite/libgomp.c++/target-this-4.C: Likewise.
2021-12-09OpenMP 5.0: Remove array section base-pointer mapping semantics and other ↵Chung-Lin Tang10-54/+102
front-end adjustments This patch implements three pieces of functionality: (1) Adjust array section mapping to have standards conforming behavior, mapping array sections should *NOT* also map the base-pointer: struct S { int *ptr; ... }; struct S s; Instead of generating this during gimplify: map(to:*_1 [len: 400]) map(attach:s.ptr [bias: 0]) Now, adjust to: (i.e. do not map the base-pointer together. The attach operation is still generated, and if s.ptr is already mapped prior, attachment will happen) The correct way of achieving the base-pointer-also-mapped behavior would be to use: (A small Fortran front-end patch to trans-openmp.c:gfc_trans_omp_array_section is also included, which removes generation of a GOMP_MAP_ALWAYS_POINTER for array types, which appears incorrect and causes a regression in libgomp.fortranlibgomp.fortran/struct-elem-map-1.f90) (2) Related to the first item above, are fixes in libgomp/target.c to not overwrite attached pointers when handling device<->host copies, mainly for the "always" case. (3) The third is a set of changes to the C/C++ front-ends to extend the allowed component access syntax in map clauses. These changes are enabled for both OpenACC and OpenMP. gcc/c/ChangeLog: * c-parser.c (struct omp_dim): New struct type for use inside c_parser_omp_variable_list. (c_parser_omp_variable_list): Allow multiple levels of array and component accesses in array section base-pointer expression. (c_parser_omp_clause_to): Set 'allow_deref' to true in call to c_parser_omp_var_list_parens. (c_parser_omp_clause_from): Likewise. * c-typeck.c (handle_omp_array_sections_1): Extend allowed range of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. (c_finish_omp_clauses): Extend allowed ranged of expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. gcc/cp/ChangeLog: * parser.c (struct omp_dim): New struct type for use inside cp_parser_omp_var_list_no_open. (cp_parser_omp_var_list_no_open): Allow multiple levels of array and component accesses in array section base-pointer expression. (cp_parser_omp_all_clauses): Set 'allow_deref' to true in call to cp_parser_omp_var_list for to/from clauses. * semantics.c (handle_omp_array_sections_1): Extend allowed range of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. (handle_omp_array_sections): Adjust pointer map generation of references. (finish_omp_clauses): Extend allowed ranged of expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. gcc/fortran/ChangeLog: * trans-openmp.c (gfc_trans_omp_array_section): Do not generate GOMP_MAP_ALWAYS_POINTER map for main array maps of ARRAY_TYPE type. gcc/ChangeLog: * gimplify.c (extract_base_bit_offset): Add 'tree *offsetp' parameter, accomodate case where 'offset' return of get_inner_reference is non-NULL. (is_or_contains_p): Further robustify conditions. (omp_target_reorder_clauses): In alloc/to/from sorting phase, also move following GOMP_MAP_ALWAYS_POINTER maps along. Add new sorting phase where we make sure pointers with an attach/detach map are ordered correctly. (gimplify_scan_omp_clauses): Add modifications to avoid creating GOMP_MAP_STRUCT and associated alloc map for attach/detach maps. gcc/testsuite/ChangeLog: * c-c++-common/goacc/deep-copy-arrayofstruct.c: Adjust testcase. * c-c++-common/gomp/target-enter-data-1.c: New testcase. * c-c++-common/gomp/target-implicit-map-2.c: New testcase. libgomp/ChangeLog: * target.c (gomp_map_vars_existing): Make sure attached pointer is not overwritten during cross-host/device copying. (gomp_update): Likewise. (gomp_exit_data): Likewise. * testsuite/libgomp.c++/target-11.C: Adjust testcase. * testsuite/libgomp.c++/target-12.C: Likewise. * testsuite/libgomp.c++/target-15.C: Likewise. * testsuite/libgomp.c++/target-16.C: Likewise. * testsuite/libgomp.c++/target-17.C: Likewise. * testsuite/libgomp.c++/target-21.C: Likewise. * testsuite/libgomp.c++/target-23.C: Likewise. * testsuite/libgomp.c/target-23.c: Likewise. * testsuite/libgomp.c/target-29.c: Likewise. * testsuite/libgomp.c-c++-common/target-implicit-map-2.c: New testcase.
2021-12-08openmp: Improve OpenMP target support for C++ (PR92120)Chung-Lin Tang8-0/+459
This patch implements several C++ specific mapping capabilities introduced for OpenMP 5.0, including implicit mapping of this[:1] for non-static member functions, zero-length array section mapping of pointer-typed members, lambda captured variable access in target regions, and use of lambda objects inside target regions. Several adjustments to the C/C++ front-ends to allow more member-access syntax as valid is also included. PR middle-end/92120 gcc/cp/ChangeLog: * cp-tree.h (finish_omp_target): New declaration. (finish_omp_target_clauses): Likewise. * parser.c (cp_parser_omp_clause_map): Adjust call to cp_parser_omp_var_list_no_open to set 'allow_deref' argument to true. (cp_parser_omp_target): Factor out code, adjust into calls to new function finish_omp_target. * pt.c (tsubst_expr): Add call to finish_omp_target_clauses for OMP_TARGET case. * semantics.c (handle_omp_array_sections_1): Add handling to create 'this->member' from 'member' FIELD_DECL. Remove case of rejecting 'this' when not in declare simd. (handle_omp_array_sections): Likewise. (finish_omp_clauses): Likewise. Adjust to allow 'this[]' in OpenMP map clauses. Handle 'A->member' case in map clauses. Remove case of rejecting 'this' when not in declare simd. (struct omp_target_walk_data): New struct for walking over target-directive tree body. (finish_omp_target_clauses_r): New function for tree walk. (finish_omp_target_clauses): New function. (finish_omp_target): New function. gcc/c/ChangeLog: * c-parser.c (c_parser_omp_clause_map): Set 'allow_deref' argument in call to c_parser_omp_variable_list to 'true'. * c-typeck.c (handle_omp_array_sections_1): Add strip of MEM_REF in array base handling. (c_finish_omp_clauses): Handle 'A->member' case in map clauses. gcc/ChangeLog: * gimplify.c ("tree-hash-traits.h"): Add include. (gimplify_scan_omp_clauses): Change struct_map_to_clause to type hash_map<tree_operand, tree> *. Adjust struct map handling to handle cases of *A and A->B expressions. Under !DECL_P case of GOMP_CLAUSE_MAP handling, add STRIP_NOPS for indir_p case, add to struct_deref_set for map(*ptr_to_struct) cases. Add MEM_REF case when handling component_ref_p case. Add unshare_expr and gimplification when created GOMP_MAP_STRUCT is not a DECL. Add code to add firstprivate pointer for *pointer-to-struct case. (gimplify_adjust_omp_clauses): Move GOMP_MAP_STRUCT removal code for exit data directives code to earlier position. * omp-low.c (lower_omp_target): Handle GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds. * tree-pretty-print.c (dump_omp_clause): Likewise. gcc/testsuite/ChangeLog: * gcc.dg/gomp/target-3.c: New testcase. * g++.dg/gomp/target-3.C: New testcase. * g++.dg/gomp/target-lambda-1.C: New testcase. * g++.dg/gomp/target-lambda-2.C: New testcase. * g++.dg/gomp/target-this-1.C: New testcase. * g++.dg/gomp/target-this-2.C: New testcase. * g++.dg/gomp/target-this-3.C: New testcase. * g++.dg/gomp/target-this-4.C: New testcase. * g++.dg/gomp/target-this-5.C: New testcase. * g++.dg/gomp/this-2.C: Adjust testcase. include/ChangeLog: * gomp-constants.h (enum gomp_map_kind): Add GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION map kinds. (GOMP_MAP_POINTER_P): Include GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION. libgomp/ChangeLog: * libgomp.h (gomp_attach_pointer): Add bool parameter. * oacc-mem.c (acc_attach_async): Update call to gomp_attach_pointer. (goacc_enter_data_internal): Likewise. * target.c (gomp_map_vars_existing): Update assert condition to include GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION. (gomp_map_pointer): Add 'bool allow_zero_length_array_sections' parameter, add support for mapping a pointer with NULL target. (gomp_attach_pointer): Add 'bool allow_zero_length_array_sections' parameter, add support for attaching a pointer with NULL target. (gomp_map_vars_internal): Update calls to gomp_map_pointer and gomp_attach_pointer, add handling for GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION, and GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION cases. * testsuite/libgomp.c++/target-23.C: New testcase. * testsuite/libgomp.c++/target-lambda-1.C: New testcase. * testsuite/libgomp.c++/target-lambda-2.C: New testcase. * testsuite/libgomp.c++/target-this-1.C: New testcase. * testsuite/libgomp.c++/target-this-2.C: New testcase. * testsuite/libgomp.c++/target-this-3.C: New testcase. * testsuite/libgomp.c++/target-this-4.C: New testcase. * testsuite/libgomp.c++/target-this-5.C: New testcase.
2021-12-02fortran: OpenMP/OpenACC array mapping alignment fix (PR90030)Chung-Lin Tang2-0/+32
Fix issue with the Fortran front-end when mapping arrays: when creating the data MEM_REF for the map clause, there was a convention of casting the referencing pointer to 'c_char *' by fold_convert (build_pointer_type (char_type_node), ptr). This causes the alignment passed to the libgomp runtime for array data hardwared to '1', and causes alignment errors on the offload target. This patch fixes this by removing the char_type_node pointer converts, and adding gcc_asserts to ensure POINTER_TYPE_P (TREE_TYPE (ptr)). PR fortran/90030 gcc/fortran/ChangeLog: * trans-openmp.c (gfc_omp_finish_clause): Remove fold_convert to pointer to char_type_node, add gcc_assert of POINTER_TYPE_P. (gfc_trans_omp_array_section): Likewise. (gfc_trans_omp_clauses): Likewise. gcc/testsuite/ChangeLog: * gfortran.dg/goacc/finalize-1.f: Adjust scan test. * gfortran.dg/gomp/affinity-clause-1.f90: Likewise. * gfortran.dg/gomp/affinity-clause-5.f90: Likewise. * gfortran.dg/gomp/defaultmap-4.f90: Likewise. * gfortran.dg/gomp/defaultmap-5.f90: Likewise. * gfortran.dg/gomp/defaultmap-6.f90: Likewise. * gfortran.dg/gomp/map-3.f90: Likewise. * gfortran.dg/gomp/pr78260-2.f90: Likewise. * gfortran.dg/gomp/pr78260-3.f90: Likewise. libgomp/ChangeLog: * testsuite/libgomp.oacc-fortran/pr90030.f90: New test. * testsuite/libgomp.fortran/pr90030.f90: New test.
2021-11-30[OpenACC] Allow gang reductions inside serial constructsKwok Cheung Yeung1-1/+0
... fixing a regression introduced in the preceding commit 2b7dac2c0dcb087da9e4018943c023c0678234a3 "Make OpenACC orphan gang reductions errors". gcc/fortran/ * openmp.c (oacc_is_serial, oacc_is_parallel_or_serial): New. (resolve_oacc_loop_blocks): Use oacc_is_parallel_or_serial instead of oacc_is_parallel. libgomp/ * testsuite/libgomp.oacc-fortran/parallel-dims.f90: Remove temporary skip. Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2021-11-30Make OpenACC orphan gang reductions errorsCesar Philippidis1-0/+1
This patch promotes all OpenACC gang reductions on orphan loops as errors. Accord to the spec, orphan loops are those which are not lexically nested inside an OpenACC parallel or kernels regions. I.e., acc loops inside acc routines. At first I thought this could be a warning because the gang reduction finalizer uses an atomic update. However, because there is no synchronization between gangs, there is way to guarantee that reduction will have completed once a single gang entity returns from the acc routine call. gcc/c/ * c-typeck.c (c_finish_omp_clauses): Emit an error on orphan OpenACC gang reductions. gcc/cp/ * semantics.c (finish_omp_clauses): Emit an error on orphan OpenACC gang reductions. gcc/fortran/ * openmp.c (oacc_is_parallel, oacc_is_kernels): New 'static' functions. (resolve_oacc_loop_blocks): Emit an error on orphan OpenACC gang reductions. gcc/ * omp-general.h (enum oacc_loop_flags): Add OLF_REDUCTION enum. * omp-low.c (lower_oacc_head_mark): Use it to mark OpenACC reductions. * omp-offload.c (oacc_loop_auto_partitions): Don't assign gang level parallelism to orphan reductions. gcc/testsuite/ * c-c++-common/goacc/nested-reductions-1-routine.c: Adjust. * c-c++-common/goacc/nested-reductions-2-routine.c: Likewise. * gcc.dg/goacc/loop-processing-1.c: Likewise. * gfortran.dg/goacc/nested-reductions-1-routine.f90: Likewise. * gfortran.dg/goacc/nested-reductions-2-routine.f90: Likewise. * c-c++-common/goacc/orphan-reductions-1.c: New test. * c-c++-common/goacc/orphan-reductions-2.c: New test. * gfortran.dg/goacc/orphan-reductions-1.f90: New test. * gfortran.dg/goacc/orphan-reductions-2.f90: New test. libgomp/ * testsuite/libgomp.oacc-fortran/parallel-dims.f90: Temporarily skip. Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2021-11-24openmp: Fix up handling of kind(host) and kind(nohost) in ACCEL_COMPILERs ↵Jakub Jelinek1-0/+45
[PR103384] As the testcase shows, we weren't handling kind(host) and kind(nohost) properly in the ACCEL_COMPILERs, the code written in there is valid for the host compiler only, where if we are maybe offloaded, we defer resolution after IPA, otherwise return 0 for kind(nohost) and accept it for kind(host). Note, omp_maybe_offloaded is false after IPA. If ACCEL_COMPILER is defined, it is the other way around, but also we know we are after IPA. 2021-11-24 Jakub Jelinek <jakub@redhat.com> PR middle-end/103384 gcc/ * omp-general.c (omp_context_selector_matches): For ACCEL_COMPILER, return 0 for kind(host) and continue for kind(nohost). libgomp/ * testsuite/libgomp.c/declare-variant-2.c: New test.
2021-11-15Fortran: openmp: Add support for thread_limit clause on targetTobias Burnus1-0/+41
gcc/fortran/ChangeLog: * openmp.c (OMP_TARGET_CLAUSES): Add thread_limit. * trans-openmp.c (gfc_split_omp_clauses): Add thread_limit also to teams. libgomp/ChangeLog: * testsuite/libgomp.fortran/thread-limit-1.f90: New test.
2021-11-15openmp: Add support for thread_limit clause on targetJakub Jelinek1-0/+23
OpenMP 5.1 says that thread_limit clause can also appear on target, and similarly to teams should affect the thread-limit-var ICV. On combined target teams, the clause goes to both. We actually passed thread_limit internally on target already before, but only used it for gcn/ptx offloading to hint how many threads should be created and for ptx didn't set thread_limit_var in that case. Similarly for host fallback. Also, I found that we weren't copying the args array that contains encoded thread_limit and num_teams clause for target (etc.) for async target. 2021-11-15 Jakub Jelinek <jakub@redhat.com> gcc/ * gimplify.c (optimize_target_teams): Only add OMP_CLAUSE_THREAD_LIMIT to OMP_TARGET_CLAUSES if it isn't there already. gcc/c-family/ * c-omp.c (c_omp_split_clauses) <case OMP_CLAUSE_THREAD_LIMIT>: Duplicate to both OMP_TARGET and OMP_TEAMS. gcc/c/ * c-parser.c (OMP_TARGET_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_THREAD_LIMIT. gcc/cp/ * parser.c (OMP_TARGET_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_THREAD_LIMIT. libgomp/ * task.c (gomp_create_target_task): Copy args array as well. * target.c (gomp_target_fallback): Add args argument. Set gomp_icv (true)->thread_limit_var if thread_limit is present. (GOMP_target): Adjust gomp_target_fallback caller. (GOMP_target_ext): Likewise. (gomp_target_task_fn): Likewise. * config/nvptx/team.c (gomp_nvptx_main): Set gomp_global_icv.thread_limit_var. * testsuite/libgomp.c-c++-common/thread-limit-1.c: New test.
2021-11-15libgomp: Add a testcase for omp_get_num_teams inside of target inside of ↵Jakub Jelinek1-0/+17
host teams This is https://github.com/OpenMP/spec/issues/3183 There is an agreement that we should return 1 team inside of target, even if that target is inside of host teams. We were doing that when offloading and not during host fallback, r12-5151 should fix that even for host fallback. 2021-11-15 Jakub Jelinek <jakub@redhat.com> * testsuite/libgomp.c/teams-5.c: New test.
2021-11-12openmp: Relax handling of implicit map vs. existing device mappingsChung-Lin Tang1-0/+31
This patch implements relaxing the requirements when a map with the implicit attribute encounters an overlapping existing map. As the OpenMP 5.0 spec describes on page 320, lines 18-27 (and 5.1 spec, page 352, lines 13-22): "If a single contiguous part of the original storage of a list item with an implicit data-mapping attribute has corresponding storage in the device data environment prior to a task encountering the construct that is associated with the map clause, only that part of the original storage will have corresponding storage in the device data environment as a result of the map clause." 2021-11-12 Chung-Lin Tang <cltang@codesourcery.com> include/ChangeLog: * gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_3): Define special bit macro. (GOMP_MAP_IMPLICIT): New special map kind bits value. (GOMP_MAP_FLAG_SPECIAL_BITS): Define helper mask for whole set of special map kind bits. (GOMP_MAP_IMPLICIT_P): New predicate macro for implicit map kinds. gcc/ChangeLog: * tree.h (OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P): New access macro for 'implicit' bit, using 'base.deprecated_flag' field of tree_node. * tree-pretty-print.c (dump_omp_clause): Add support for printing implicit attribute in tree dumping. * gimplify.c (gimplify_adjust_omp_clauses_1): Set OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P to 1 if map clause is implicitly created. (gimplify_adjust_omp_clauses): Adjust place of adding implicitly created clauses, from simple append, to starting of list, after non-map clauses. * omp-low.c (lower_omp_target): Add GOMP_MAP_IMPLICIT bits into kind values passed to libgomp for implicit maps. gcc/testsuite/ChangeLog: * c-c++-common/gomp/target-implicit-map-1.c: New test. * c-c++-common/goacc/combined-reduction.c: Adjust scan test pattern. * c-c++-common/goacc/firstprivate-mappings-1.c: Likewise. * c-c++-common/goacc/mdc-1.c: Likewise. * g++.dg/goacc/firstprivate-mappings-1.C: Likewise. libgomp/ChangeLog: * target.c (gomp_map_vars_existing): Add 'bool implicit' parameter, add implicit map handling to allow a "superset" existing map as valid case. (get_kind): Adjust to filter out GOMP_MAP_IMPLICIT bits in return value. (get_implicit): New function to extract implicit status. (gomp_map_fields_existing): Adjust arguments in calls to gomp_map_vars_existing, and add uses of get_implicit. (gomp_map_vars_internal): Likewise. * testsuite/libgomp.c-c++-common/target-implicit-map-1.c: New test.
2021-11-12openmp: Honor OpenMP 5.1 num_teams lower boundJakub Jelinek2-1/+71
The following patch implements what I've been talking about earlier, honor that for explicit num_teams clause we create at least the lower-bound (if not specified, upper-bound) teams in the league. For host fallback, it still means we only have one thread doing all the teams, sequentially one after another. For PTX and GCN, I think the new teams-2.c test and maybe teams-4.c too will or might fail. For these offloads, I think it is ok to remove symbols no longer used from libgomp.a. If num_teams_lower is bigger than the provided num_blocks or num_workgroups, we should arrange for gomp_num_teams_var to be num_teams_lower - 1, stop using the %ctaid.x or __builtin_gcn_dim_pos (0) for omp_get_team_num () and instead use for it some .shared var that GOMP_teams4 initializes to %ctaid.x or __builtin_gcn_dim_pos (0) when first and for !first increment that by num_blocks or num_workgroups each time and only return false when we are above num_teams_lower. Any help with actually implementing this for the 2 architectures highly appreciated. 2021-11-12 Jakub Jelinek <jakub@redhat.com> gcc/ * omp-builtins.def (BUILT_IN_GOMP_TEAMS): Remove. (BUILT_IN_GOMP_TEAMS4): New. * builtin-types.def (BT_FN_VOID_UINT_UINT): Remove. (BT_FN_BOOL_UINT_UINT_UINT_BOOL): New. * omp-low.c (lower_omp_teams): Use GOMP_teams4 instead of GOMP_teams, pass to it also num_teams lower-bound expression or a dup of upper-bound if it is missing and a flag whether it is the first call or not. gcc/fortran/ * types.def (BT_FN_VOID_UINT_UINT): Remove. (BT_FN_BOOL_UINT_UINT_UINT_BOOL): New. libgomp/ * libgomp_g.h (GOMP_teams4): Declare. * libgomp.map (GOMP_5.1): Export GOMP_teams4. * target.c (GOMP_teams4): New function. * config/nvptx/target.c (GOMP_teams): Remove. (GOMP_teams4): New function. * config/gcn/target.c (GOMP_teams): Remove. (GOMP_teams4): New function. * testsuite/libgomp.c/teams-4.c (main): Expect exactly 2 teams instead of <= 2. * testsuite/libgomp.c-c++-common/teams-2.c: New test.