aboutsummaryrefslogtreecommitdiff
path: root/libgomp/testsuite
AgeCommit message (Collapse)AuthorFilesLines
2023-08-23libgomp, testsuite: Do not call nonstandard functionsFrancois-Xavier Coudert2-0/+28
The following functions are not standard, and not always available (e.g., on darwin). They should not be called unless available: gamma, gammaf, scalb, scalbf, significand, and significandf. libgomp/ChangeLog: * testsuite/lib/libgomp.exp: Add effective target. * testsuite/libgomp.c/simd-math-1.c: Avoid calling nonstandard functions.
2023-08-19omp-expand.cc: Fix wrong code with non-rectangular loop nest [PR111017]Tobias Burnus1-0/+72
Before commit r12-5295-g47de0b56ee455e, all gimple_build_cond in expand_omp_for_* were inserted with gsi_insert_before (gsi_p, cond_stmt, GSI_SAME_STMT); except the one dealing with the multiplicative factor that was gsi_insert_after (gsi, cond_stmt, GSI_CONTINUE_LINKING); That commit for PR103208 fixed the issue of some missing regimplify of operands of GIMPLE_CONDs by moving the condition handling to the new function expand_omp_build_cond. While that function has an 'bool after = false' argument to switch between the two variants. However, all callers ommited this argument. This commit reinstates the prior behavior by passing 'true' for the factor != 0 condition, fixing the included testcase. PR middle-end/111017 gcc/ * omp-expand.cc (expand_omp_for_init_vars): Pass after=true to expand_omp_build_cond for 'factor != 0' condition, resulting in pre-r12-5295-g47de0b56ee455e code for the gimple insert. libgomp/ * testsuite/libgomp.c-c++-common/non-rect-loop-1.c: New test.
2023-07-26OpenMP: Call cuMemcpy2D/cuMemcpy3D for nvptx for omp_target_memcpy_rectTobias Burnus3-6/+537
When copying a 2D or 3D rectangular memmory block, the performance is better when using CUDA's cuMemcpy2D/cuMemcpy3D instead of copying the data one by one. That's what this commit does. Additionally, it permits device-to-device copies, if neccessary using a temporary variable on the host. include/ChangeLog: * cuda/cuda.h (CUlimit): Add CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_INVALID_HANDLE. (CUarray, CUmemorytype, CUDA_MEMCPY2D, CUDA_MEMCPY3D, CUDA_MEMCPY3D_PEER): New typdefs. (cuMemcpy2D, cuMemcpy2DAsync, cuMemcpy2DUnaligned, cuMemcpy3D, cuMemcpy3DAsync, cuMemcpy3DPeer, cuMemcpy3DPeerAsync): New prototypes. libgomp/ChangeLog: * libgomp-plugin.h (GOMP_OFFLOAD_memcpy2d, GOMP_OFFLOAD_memcpy3d): New prototypes. * libgomp.h (struct gomp_device_descr): Add memcpy2d_func and memcpy3d_func. * libgomp.texi (nvtpx): Document when cuMemcpy2D/cuMemcpy3D is used. * oacc-host.c (memcpy2d_func, .memcpy3d_func): Init with NULL. * plugin/cuda-lib.def (cuMemcpy2D, cuMemcpy2DUnaligned, cuMemcpy3D): Invoke via CUDA_ONE_CALL. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_memcpy2d, GOMP_OFFLOAD_memcpy3d): New. * target.c (omp_target_memcpy_rect_worker): (omp_target_memcpy_rect_check, omp_target_memcpy_rect_copy): Permit all device-to-device copyies; invoke new plugins for 2D and 3D copying when available. (gomp_load_plugin_for_device): DLSYM the new plugin functions. * testsuite/libgomp.c/target-12.c: Fix dimension bug. * testsuite/libgomp.fortran/target-12.f90: Likewise. * testsuite/libgomp.fortran/target-memcpy-rect-1.f90: New test.
2023-07-19OpenMP/Fortran: Non-rectangular loops with constant steps other than 1 or -1 ↵Tobias Burnus4-648/+481
[PR107424] Before this commit, gfortran produced with OpenMP for 'do i = 1,10,2' the code for (count.0 = 0; count.0 < 5; count.0 = count.0 + 1) i = count.0 * 2 + 1; While such an inner loop can be collapsed, a non-rectangular could not. With this commit and for all constant loop steps, a simple loop such as 'for (i = 1; i <= 10; i = i + 2)' is created. (Before only for the constant steps of 1 and -1.) The constant step permits to know the direction (increasing/decreasing) that is required for the loop condition. The new code is only valid if one assumes no overflow of the loop variable. However, the Fortran standard can be read that this must be ensured by the user. Namely, the Fortran standard requires (F2023, 10.1.5.2.4): "The execution of any numeric operation whose result is not defined by the arithmetic used by the processor is prohibited." And, for DO loops, F2023's "11.1.7.4.3 The execution cycle" has the following: The number of loop iterations handled by an iteration count, which would permit code like 'do i = huge(i)-5, huge(i),4'. However, in step (3), this count is not only decremented by one but also: "... The DO variable, if any, is incremented by the value of the incrementation parameter m3." And for the example above, 'i' would be 'huge(i)+3' in the last execution cycle, which exceeds the largest model number and should render the example as invalid. PR fortran/107424 gcc/fortran/ChangeLog: * trans-openmp.cc (gfc_nonrect_loop_expr): Accept all constant loop steps. (gfc_trans_omp_do): Likewise; use sign to determine loop direction. libgomp/ChangeLog: * libgomp.texi (Impl. Status 5.0): Add link to new PR110735. * testsuite/libgomp.fortran/non-rectangular-loop-1.f90: Enable commented tests. * testsuite/libgomp.fortran/non-rectangular-loop-1a.f90: Remove test file; tests are in non-rectangular-loop-1.f90. * testsuite/libgomp.fortran/non-rectangular-loop-5.f90: Change testcase to use a non-constant step to retain the 'sorry' test. * testsuite/libgomp.fortran/non-rectangular-loop-6.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/linear-2.f90: Update dump to remove the additional count variable.
2023-07-17OpenMP/Fortran: Parsing support for 'uses_allocators'Tobias Burnus2-0/+267
The 'uses_allocators' clause to the 'target' construct accepts predefined allocators and can also be used to define a new allocator for a target region. As predefined allocators in GCC do not require special handling, those can and are ignored after parsing, such that this feature now works. On the other hand, defining a new allocator will fail for now with a 'sorry, unimplemented'. Note that both the OpenMP 5.0/5.1 and 5.2 syntax for uses_allocators is supported by this commit. 2023-07-17 Tobias Burnus <tobias@codesoucery.com> Chung-Lin Tang <cltang@codesourcery.com> gcc/fortran/ChangeLog: * dump-parse-tree.cc (show_omp_namelist, show_omp_clauses): Dump uses_allocators clause. * gfortran.h (gfc_free_omp_namelist): Add memspace_sym to u union and traits_sym to u2 union. (OMP_LIST_USES_ALLOCATORS): New enum value. (gfc_free_omp_namelist): Add 'bool free_mem_traits_space' arg. * match.cc (gfc_free_omp_namelist): Likewise. * openmp.cc (gfc_free_omp_clauses, gfc_match_omp_variable_list, gfc_match_omp_to_link, gfc_match_omp_doacross_sink, gfc_match_omp_clause_reduction, gfc_match_omp_allocate, gfc_match_omp_flush): Update call. (gfc_match_omp_clauses): Likewise. Parse uses_allocators clause. (gfc_match_omp_clause_uses_allocators): New. (enum omp_mask2): Add new OMP_CLAUSE_USES_ALLOCATORS. (OMP_TARGET_CLAUSES): Accept it. (resolve_omp_clauses): Resolve uses_allocators clause * st.cc (gfc_free_statement): Update gfc_free_omp_namelist call. * trans-openmp.cc (gfc_trans_omp_clauses): Handle OMP_LIST_USES_ALLOCATORS; fail with sorry unless predefined allocator. (gfc_split_omp_clauses): Handle uses_allocators. libgomp/ChangeLog: * testsuite/libgomp.fortran/uses_allocators_1.f90: New test. * testsuite/libgomp.fortran/uses_allocators_2.f90: New test. Co-authored-by: Chung-Lin Tang <cltang@codesourcery.com>
2023-07-13testsuite: dg-require LTO for libgomp LTO testsDavid Edelsohn3-0/+3
Some test cases in libgomp testsuite pass -flto as an option, but the testcases do not require LTO target support. This patch adds the necessary DejaGNU requirement for LTO support to the testcases.. libgomp/ChangeLog: * testsuite/libgomp.c++/target-map-class-2.C: Require LTO. * testsuite/libgomp.c-c++-common/requires-4.c: Require LTO. * testsuite/libgomp.c-c++-common/requires-4a.c: Require LTO. Signed-off-by: David Edelsohn <dje.gcc@gmail.com>
2023-07-12libgomp: Use libnuma for OpenMP's partition=nearest allocation traitTobias Burnus2-0/+502
As with the memkind library, it is only used when found at runtime; it does not need to be present when building GCC. The included testcase does not check whether the memory has been placed on the nearest node as the Linux kernel memory handling too often ignores that hint, using a different node for the allocation. However, when running with 'numactl --preferred=<node> ./executable', it is clearly visible that the feature works by comparing malloc/default vs. nearest placement (using get_mempolicy to obtain the node for a mem addr). libgomp/ChangeLog: * allocator.c: Add ifdef for LIBGOMP_USE_LIBNUMA. (enum gomp_numa_memkind_kind): Renamed from gomp_memkind_kind; add GOMP_MEMKIND_LIBNUMA. (struct gomp_libnuma_data, gomp_init_libnuma, gomp_get_libnuma): New. (omp_init_allocator): Handle partition=nearest with libnuma if avail. (omp_aligned_alloc, omp_free, omp_aligned_calloc, omp_realloc): Add numa_alloc_local (+ memset), numa_free, and numa_realloc calls as needed. * config/linux/allocator.c (LIBGOMP_USE_LIBNUMA): Define * libgomp.texi: Fix a typo; use 'fi' instead of its ligature char. (Memory allocation): Renamed from 'Memory allocation with libmemkind'; updated for libnuma usage. * testsuite/libgomp.c-c++-common/alloc-11.c: New test. * testsuite/libgomp.c-c++-common/alloc-12.c: New test.
2023-06-19Fix DejaGnu directive syntax error in 'libgomp.c/target-51.c'Thomas Schwinge1-1/+1
ERROR: libgomp.c/target-51.c: unknown dg option: \} for "}" Fix-up for recent commit 01fe115ba7eafebcf97bbac9e157038a003d0c85 "libgomp.c/target-51.c: Accept more error-msg variants in dg-output". libgomp/ * testsuite/libgomp.c/target-51.c: Fix DejaGnu directive syntax error.
2023-06-19libgomp.c/target-51.c: Accept more error-msg variants in dg-outputTobias Burnus1-2/+1
Depending on the details, the testcase can fail with different but related messages; all of the following all could be observed for this testcase: libgomp: OMP_TARGET_OFFLOAD is set to MANDATORY, but device cannot be used for offloading libgomp: OMP_TARGET_OFFLOAD is set to MANDATORY, but device not found libgomp: OMP_TARGET_OFFLOAD is set to MANDATORY, but only the host device is available Before, the last two were tested for with 'target offload_device' and '! offload_device', respectively. Now, all three are accepted by matching '.*' already after 'but' and without distinguishing whether the effective target is an offload_device or not. (For completeness, there is a fourth error that follows this pattern: 'OMP_TARGET_OFFLOAD is set to MANDATORY, but device is finalized'.) libgomp/ * testsuite/libgomp.c/target-51.c: Accept more error msg variants as expected dg-output.
2023-06-19OpenMP (C/C++): Keep pointer value of unmapped ptr with default mapping ↵Tobias Burnus6-14/+390
[PR110270] For C/C++ pointers, default implicit mapping firstprivatizes the pointer but if the memory it points to is mapped, the it is updated to point to the device memory (by attaching a zero sized array section of the pointed-to storage). However, if the pointed-to storage wasn't mapped, the pointer was set to NULL on the device side (OpenMP 5.0/5.1 semantic). With this commit, the pointer retains the on-host address in that case (OpenMP 5.2 semantic). The new semantic avoids an explicit map/firstprivate/is_device_ptr in the following sensible cases: Special values (e.g. pointer or 0x1, 0x2 etc.), explicitly device allocated memory (e.g. omp_target_alloc), and with (unified) shared memory. (Note: With (U)SM, mappings still must be tracked, at least when omp_target_associate_ptr does not fail when passing in two destinct pointers.) libgomp/ PR middle-end/110270 * target.c (gomp_map_vars_internal): Copy host value instead of NULL for GOMP_MAP_ZERO_LEN_ARRAY_SECTION if not mapped. * libgomp.texi (OpenMP 5.2 Impl.): Mark as 'Y'. * testsuite/libgomp.c/target-19.c: Update expected value. * testsuite/libgomp.c++/target-18.C: Likewise. * testsuite/libgomp.c++/target-19.C: Likewise. * testsuite/libgomp.c-c++-common/requires-unified-addr-2.c: New test. * testsuite/libgomp.c-c++-common/target-implicit-map-3.c: New test. * testsuite/libgomp.c-c++-common/target-implicit-map-4.c: New test.
2023-06-16libgomp: Fix OMP_TARGET_OFFLOAD=mandatoryTobias Burnus2-0/+43
It turned out that gomp_init_targets_once() was not run when directly calling 'omp target' or 'omp target (enter/exit) data' causing an abort with OMP_TARGET_OFFLOAD=mandatory wrongly claiming that no device is available. It was called a tiny bit later but few lines too late for updating the default-device-var. libgomp/ChangeLog: * target.c (resolve_device): Call gomp_get_num_devices early to ensure gomp_init_targets_once was called before using default-device-var. * testsuite/libgomp.c/target-55.c: New test. * testsuite/libgomp.c/target-55a.c: New test.
2023-06-15libgomp: Extend OMP_ALLOCATOR, add affinity env var docTobias Burnus6-0/+104
Support OpenMP 5.1's syntax for OMP_ALLOCATOR as well, which permits besides predefined allocators also predefined memspaces optionally followed by traits. Additionally, this commit adds the previously lacking documentation for OMP_ALLOCATOR, OMP_AFFINITY_FORMAT and OMP_DISPLAY_AFFINITY. libgomp/ChangeLog: * env.c (gomp_def_allocator_envvar): New var. (parse_allocator): Handle OpenMP 5.1 syntax. (cleanup_env): New. (omp_display_env): Output gomp_def_allocator_envvar for an allocator with traits. * libgomp.texi (OMP_ALLOCATOR, OMP_AFFINITY_FORMAT, OMP_DISPLAY_AFFINITY): New. * testsuite/libgomp.c/allocator-1.c: New test. * testsuite/libgomp.c/allocator-2.c: New test. * testsuite/libgomp.c/allocator-3.c: New test. * testsuite/libgomp.c/allocator-4.c: New test. * testsuite/libgomp.c/allocator-5.c: New test. * testsuite/libgomp.c/allocator-6.c: New test.
2023-06-14Align a 'OMP_TARGET_OFFLOAD=mandatory' diagnostic with othersThomas Schwinge1-1/+1
On 2023-06-14T11:42:22+0200, Tobias Burnus <tobias@codesourcery.com> wrote: > On 14.06.23 10:09, Thomas Schwinge wrote: >> Let me know if I should also adjust the new 'target { ! offload_device }' >> diagnostic "[...] MANDATORY but only the host device is available" to >> include a comma before 'but', for consistency with the other existing >> diagnostics (cited above)? > > I think it makes sense to be consistent. Thus: Yes, please add the commas. Fix-up for recent commit 18c8b56c7d67a9e37acf28822587786f0fc0efbc "OpenMP: Set default-device-var with OMP_TARGET_OFFLOAD=mandatory". libgomp/ * target.c (resolve_device): Align a 'OMP_TARGET_OFFLOAD=mandatory' diagnostic with others. * testsuite/libgomp.c/target-51.c: Adjust.
2023-06-14driver: Forward '-lgfortran', '-lm' to offloading compilationThomas Schwinge5-7/+0
..., so that users don't manually need to specify '-foffload-options=-lgfortran', '-foffload-options=-lm' in addition to '-lgfortran', '-lm' (specified manually, or implicitly by the driver). gcc/ * gcc.cc (driver_handle_option): Forward host '-lgfortran', '-lm' to offloading compilation. * config/gcn/mkoffload.cc (main): Adjust. * config/nvptx/mkoffload.cc (main): Likewise. * doc/invoke.texi (foffload-options): Update example. libgomp/ * testsuite/libgomp.fortran/fortran.exp (lang_link_flags): Don't set. * testsuite/libgomp.oacc-fortran/fortran.exp (lang_link_flags): Likewise. * testsuite/libgomp.c/simd-math-1.c: Remove '-foffload-options=-lm'. * testsuite/libgomp.fortran/fortran-torture_execute_math.f90: Likewise. * testsuite/libgomp.oacc-fortran/fortran-torture_execute_math.f90: Likewise.
2023-06-14Add 'libgomp.{,oacc-}fortran/fortran-torture_execute_math.f90'Thomas Schwinge2-0/+9
..., via 'include'ing the existing 'gfortran.fortran-torture/execute/math.f90', which therefore is enhanced for optional OpenACC 'serial', OpenMP 'target' usage. gcc/testsuite/ * gfortran.fortran-torture/execute/math.f90: Enhance for optional OpenACC 'serial', OpenMP 'target' usage. libgomp/ * testsuite/libgomp.fortran/fortran-torture_execute_math.f90: New. * testsuite/libgomp.oacc-fortran/fortran-torture_execute_math.f90: Likewise.
2023-06-14Fix typo in 'libgomp.c/target-51.c'Thomas Schwinge1-1/+1
..., and therefore, given 'target offload_device': PASS: libgomp.c/target-51.c (test for excess errors) PASS: libgomp.c/target-51.c execution test [-FAIL:-]{+PASS:+} libgomp.c/target-51.c output pattern test Fix-up for recent commit 18c8b56c7d67a9e37acf28822587786f0fc0efbc "OpenMP: Set default-device-var with OMP_TARGET_OFFLOAD=mandatory". libgomp/ * testsuite/libgomp.c/target-51.c: Fix typo.
2023-06-14OpenMP: Set default-device-var with OMP_TARGET_OFFLOAD=mandatoryTobias Burnus8-0/+210
OMP_TARGET_OFFLOAD=mandatory handling was before inconsistent. Hence, in OpenMP 5.2 it was clarified/extended by having implications on the default-device-var; additionally, omp_initial_device and omp_invalid_device enum values/PARAMETERs were added; support for it was added in r13-1066-g1158fe43407568 including aborting for omp_invalid_device and non-conforming device numbers. Only the mandatory handling was missing. Namely, while the default-device-var is usually initialized to value 0, with 'mandatory' it must have the value 'omp_invalid_device' if and only if zero non-host devices are available. (The OMP_DEFAULT_DEVICE env var overrides this as it comes semantically after the initialization.) To achieve this, default-device-var is now initialized to MIN_INT. If there is no 'mandatory', it is set to 0 directly after env var parsing. Otherwise, it is updated in gomp_target_init to either 0 or omp_invalid_device. To ensure INT_MIN is never seen by the user, both the omp_get_default_device API routine and omp_display_env (user call and OMP_DISPLAY_ENV env var) call gomp_init_targets_once() in that case. libgomp/ChangeLog: * env.c (gomp_default_icv_values): Init default_device_var to an nonconforming value - INT_MIN. (initialize_env): After env-var parsing, set default_device_var to device 0 unless OMP_TARGET_OFFLOAD=mandatory. (omp_display_env): If default_device_var is INT_MIN, call gomp_init_targets_once. * icv-device.c (omp_get_default_device): Likewise. * libgomp.texi (OMP_DEFAULT_DEVICE): Update init description. (OpenMP 5.2 Impl. Status): Mark OMP_TARGET_OFFLOAD=mandatory as 'Y'. * target.c (resolve_device): Improve error message device-num < 0 with 'mandatory' and no no-host devices available. (gomp_target_init): Set default-device-var if INT_MIN. * testsuite/libgomp.c/target-48.c: New test. * testsuite/libgomp.c/target-49.c: New test. * testsuite/libgomp.c/target-50.c: New test. * testsuite/libgomp.c/target-50a.c: New test. * testsuite/libgomp.c/target-51.c: New test. * testsuite/libgomp.c/target-52.c: New test. * testsuite/libgomp.c/target-53.c: New test. * testsuite/libgomp.c/target-54.c: New test.
2023-06-13libgomp/testsuite: Add requires-unified-addr-1.{c,f90} [PR109837]Tobias Burnus2-0/+185
Add a testcase for 'omp requires unified_address' that is currently supported by all devices but was not tested for. libgomp/ PR libgomp/109837 * testsuite/libgomp.c-c++-common/requires-unified-addr-1.c: New test. * testsuite/libgomp.fortran/requires-unified-addr-1.f90: New test.
2023-06-12OpenMP: Cleanups related to the 'present' modifierTobias Burnus7-13/+49
Reduce number of enum values passed to libgomp as GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} have the same semantic as GOMP_MAP_FORCE_PRESENT (i.e. abort if not present, otherwise ignore); that's different to GOMP_MAP_ALWAYS_PRESENT_{TO,TOFROM,FROM} which also abort if not present but copy data when present. This is is a follow-up to the commit r14-1579-g4ede915d5dde93 done 6 days ago. Additionally, the commit improves a libgomp run-time and a C/C++ compile-time error wording and extends testcases a tiny bit. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_clause_map): Reword error message for clearness especially with 'omp target (enter/exit) data.' gcc/cp/ChangeLog: * parser.cc (cp_parser_omp_clause_map): Reword error message for clearness especially with 'omp target (enter/exit) data.' * semantics.cc (handle_omp_array_sections): Handle GOMP_MAP_{ALWAYS_,}PRESENT_{TO,TOFROM,FROM,ALLOC} enum values. gcc/ChangeLog: * gimplify.cc (gimplify_adjust_omp_clauses_1): Use GOMP_MAP_FORCE_PRESENT for 'present alloc' implicit mapping. (gimplify_adjust_omp_clauses): Change GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} to the equivalent GOMP_MAP_FORCE_PRESENT. * omp-low.cc (lower_omp_target): Remove handling of no-longer valid GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC}; update map kinds used for to/from clauses with present modifier. include/ChangeLog: * gomp-constants.h (enum gomp_map_kind): Change the enum values GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} to be compiler only. (GOMP_MAP_PRESENT_P): Update to include also GOMP_MAP_FORCE_PRESENT. libgomp/ChangeLog: * target.c (gomp_to_device_kind_p, gomp_map_vars_internal): Replace GOMP_MAP_PRESENT_{FROM,TO,TOFROM,ACLLOC} by GOMP_MAP_FORCE_PRESENT. (gomp_map_vars_internal, gomp_update): Likewise; unify and improve error message. * testsuite/libgomp.c-c++-common/target-present-2.c: Update for changed error message. * testsuite/libgomp.fortran/target-present-1.f90: Likewise. * testsuite/libgomp.fortran/target-present-2.f90: Likewise. * testsuite/libgomp.oacc-c-c++-common/present-1.c: Likewise. * testsuite/libgomp.c-c++-common/target-present-1.c: Likewise and extend testcase to check that data is copied when needed. * testsuite/libgomp.c-c++-common/target-present-3.c: Likewise. * testsuite/libgomp.fortran/target-present-3.f90: Likewise. gcc/testsuite/ChangeLog: * c-c++-common/gomp/defaultmap-4.c: Update scan-tree-dump. * c-c++-common/gomp/map-9.c: Likewise. * gfortran.dg/gomp/defaultmap-8.f90: Likewise. * gfortran.dg/gomp/map-11.f90: Likewise. * gfortran.dg/gomp/target-update-1.f90: Likewise. * gfortran.dg/gomp/map-12.f90: Likewise; also check original dump. * c-c++-common/gomp/map-6.c: Update dg-error and also check clause error with 'target (enter/exit) data'.
2023-06-07testsuite/libgomp.*/target-present-*.{c,f90}: Improve and fixTobias Burnus6-25/+35
One of the testcases lacked variables in a map clause such that the fail occurred too early. Additionally, it would have failed for all those non-host devices where 'present' is always true, i.e. non-host devices which can access all of the host memory (shared-memory devices). [There are currently none.] The commit now runs the code on all devices, which should succeed for host fallback and for shared-memory devices, finding potenial issues that way. Additionally, a checkpoint (required stdout output) is used to ensure that the execution won't fail (with the same error) before reaching the expected fail location. 2023-06-07 Thomas Schwinge <thomas@codesourcery.com> Tobias Burnus <tobias@codesourcery.com> libgomp/ * testsuite/libgomp.c-c++-common/target-present-1.c: Run code also for non-offload_device targets; check that it runs successfully for those and for all until a checkpoint for all * testsuite/libgomp.c-c++-common/target-present-2.c: Likewise. * testsuite/libgomp.c-c++-common/target-present-3.c: Likewise. * testsuite/libgomp.fortran/target-present-1.f90: Likewise. * testsuite/libgomp.fortran/target-present-3.f90: Likewise. * testsuite/libgomp.fortran/target-present-2.f90: Likewise; add missing vars to map clause.
2023-06-06openmp: Add support for the 'present' modifierTobias Burnus6-0/+163
This implements support for the OpenMP 5.1 'present' modifier, which can be used in map clauses in the 'target', 'target data', 'target data enter' and 'target data exit' constructs, and in the 'to' and 'from' clauses of the 'target update' construct. It is also supported in defaultmap. The modifier triggers a fatal runtime error if the data specified by the clause is not already present on the target device. It can also be combined with 'always' in map clauses. 2023-06-06 Kwok Cheung Yeung <kcy@codesourcery.com> Tobias Burnus <tobias@codesourcery.com> gcc/c/ * c-parser.cc (c_parser_omp_clause_defaultmap, c_parser_omp_clause_map): Parse 'present'. (c_parser_omp_clause_to, c_parser_omp_clause_from): Remove. (c_parser_omp_clause_from_to): New; parse to/from clauses with optional present modifer. (c_parser_omp_all_clauses): Update call. (c_parser_omp_target_data, c_parser_omp_target_enter_data, c_parser_omp_target_exit_data): Handle new map enum values for 'present' mapping. gcc/cp/ * parser.cc (cp_parser_omp_clause_defaultmap, cp_parser_omp_clause_map): Parse 'present'. (cp_parser_omp_clause_from_to): New; parse to/from clauses with optional 'present' modifier. (cp_parser_omp_all_clauses): Update call. (cp_parser_omp_target_data, cp_parser_omp_target_enter_data, cp_parser_omp_target_exit_data): Handle new enum value for 'present' mapping. * semantics.cc (finish_omp_target): Likewise. gcc/fortran/ * dump-parse-tree.cc (show_omp_namelist): Display 'present' map modifier. (show_omp_clauses): Display 'present' motion modifier for 'to' and 'from' clauses. * gfortran.h (enum gfc_omp_map_op): Add entries with 'present' modifiers. (struct gfc_omp_namelist): Add 'present_modifer'. * openmp.cc (gfc_match_motion_var_list): New, handles optional 'present' modifier for to/from clauses. (gfc_match_omp_clauses): Call it for to/from clauses; parse 'present' in defaultmap and map clauses. (resolve_omp_clauses): Allow 'present' modifiers on 'target', 'target data', 'target enter' and 'target exit' directives. * trans-openmp.cc (gfc_trans_omp_clauses): Apply 'present' modifiers to tree node for 'map', 'to' and 'from' clauses. Apply 'present' for defaultmap. gcc/ * gimplify.cc (omp_notice_variable): Apply GOVD_MAP_ALLOC_ONLY flag and defaultmap flags if the defaultmap has GOVD_MAP_FORCE_PRESENT flag set. (omp_get_attachment): Handle map clauses with 'present' modifier. (omp_group_base): Likewise. (gimplify_scan_omp_clauses): Reorder present maps to come first. Set GOVD flags for present defaultmaps. (gimplify_adjust_omp_clauses_1): Set map kind for present defaultmaps. * omp-low.cc (scan_sharing_clauses): Handle 'always, present' map clauses. (lower_omp_target): Handle map clauses with 'present' modifier. Handle 'to' and 'from' clauses with 'present'. * tree-core.h (enum omp_clause_defaultmap_kind): Add OMP_CLAUSE_DEFAULTMAP_PRESENT defaultmap kind. * tree-pretty-print.cc (dump_omp_clause): Handle 'map', 'to' and 'from' clauses with 'present' modifier. Handle present defaultmap. * tree.h (OMP_CLAUSE_MOTION_PRESENT): New #define. include/ * gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_5): New. (GOMP_MAP_FLAG_FORCE): Redefine. (GOMP_MAP_FLAG_PRESENT, GOMP_MAP_FLAG_ALWAYS_PRESENT): New. (enum gomp_map_kind): Add map kinds with 'present' modifiers. (GOMP_MAP_COPY_TO_P, GOMP_MAP_COPY_FROM_P): Evaluate to true for map variants with 'present' (GOMP_MAP_ALWAYS_TO_P, GOMP_MAP_ALWAYS_FROM_P): Evaluate to true for map variants with 'always, present' modifiers. (GOMP_MAP_ALWAYS): Redefine. (GOMP_MAP_FORCE_P, GOMP_MAP_PRESENT_P): New. libgomp/ * libgomp.texi (OpenMP 5.1 Impl. status): Set 'present' support for defaultmap to 'Y', add 'Y' entry for 'present' on to/from/map clauses. * target.c (gomp_to_device_kind_p): Add map kinds with 'present' modifier. (gomp_map_vars_existing): Use new GOMP_MAP_FORCE_P macro. (gomp_map_vars_internal, gomp_update, gomp_target_rev): Emit runtime error if memory region not present. * testsuite/libgomp.c-c++-common/target-present-1.c: New test. * testsuite/libgomp.c-c++-common/target-present-2.c: New test. * testsuite/libgomp.c-c++-common/target-present-3.c: New test. * testsuite/libgomp.fortran/target-present-1.f90: New test. * testsuite/libgomp.fortran/target-present-2.f90: New test. * testsuite/libgomp.fortran/target-present-3.f90: New test. gcc/testsuite/ * c-c++-common/gomp/map-6.c: Update dg-error, extend to test for duplicated 'present' and extend scan-dump tests for 'present'. * gfortran.dg/gomp/defaultmap-1.f90: Update dg-error. * gfortran.dg/gomp/map-7.f90: Extend parse and dump test for 'present'. * gfortran.dg/gomp/map-8.f90: Extend for duplicate 'present' modifier checking. * c-c++-common/gomp/defaultmap-4.c: New test. * c-c++-common/gomp/map-9.c: New test. * c-c++-common/gomp/target-update-1.c: New test. * gfortran.dg/gomp/defaultmap-8.f90: New test. * gfortran.dg/gomp/map-11.f90: New test. * gfortran.dg/gomp/map-12.f90: New test. * gfortran.dg/gomp/target-update-1.f90: New test.
2023-06-02Support parallel testing in libgomp: fallback Perl 'flock' [PR66005]Thomas Schwinge2-1/+20
Follow-up to commit 6c3b30ef9e0578509bdaf59c13da4a212fe6c2ba "Support parallel testing in libgomp, part II [PR66005]" ("..., and enable if 'flock' is available for serializing execution testing"), where we saw: > On my Dell Precision 7530 laptop: > > $ uname -srvi > Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64 > $ grep '^model name' < /proc/cpuinfo | uniq -c > 12 model name : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz > $ nvidia-smi -L > GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea) > > ... [...]: case (c) standard configuration, no offloading > configured, [...] > $ \time make check-target-libgomp > > Case (c), baseline; [...]: > > 1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata 505148maxresident)k > 1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata 505212maxresident)k > > Case (c), parallelized [using 'flock']: > > [...] > -j12 GCC_TEST_PARALLEL_SLOTS=12 > 2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata 505216maxresident)k > 2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata 505212maxresident)k Quite the same when instead of 'flock' using this fallback Perl 'flock': 2565.23user 194.35system 4:46.77elapsed 962%CPU (0avgtext+0avgdata 505216maxresident)k 2549.38user 200.20system 4:46.08elapsed 961%CPU (0avgtext+0avgdata 505216maxresident)k PR testsuite/66005 gcc/ * doc/install.texi: Document (optional) Perl usage for parallel testing of libgomp. libgomp/ * testsuite/lib/libgomp.exp: 'flock' through stdout. * testsuite/flock: New. * configure.ac (FLOCK): Point to that if no 'flock' available, but 'perl' is. * configure: Regenerate.
2023-06-02Remove stale Autoconf checks for PerlThomas Schwinge1-1/+0
Subversion r110220 (Git commit 03b8fe495d716c004f5491eb2347537f115ab2d8) for PR25884 "libgomp should not require perl to compile" removed all '$(PERL)' usage from libgomp -- but didn't remove the then-unused Autoconf Perl check itself. Later, this Autoconf Perl check appears to have been copied from libgomp into other GCC libraries, likewise unused. libgomp/ * configure.ac (PERL): Remove. * configure: Regenerate. * Makefile.in: Likewise. * testsuite/Makefile.in: Likewise. libatomic/ * configure.ac (PERL): Remove. * configure: Regenerate. * Makefile.in: Likewise. * testsuite/Makefile.in: Likewise. libgm2/ * configure.ac (PERL): Remove. * configure: Regenerate. * Makefile.in: Likewise. * libm2cor/Makefile.in: Likewise. * libm2iso/Makefile.in: Likewise. * libm2log/Makefile.in: Likewise. * libm2min/Makefile.in: Likewise. * libm2pim/Makefile.in: Likewise. libitm/ * configure.ac (PERL): Remove. * configure: Regenerate. * Makefile.in: Likewise. * testsuite/Makefile.in: Likewise.
2023-05-26Fortran/OpenMP: Add parsing support for allocators/allocate directivesTobias Burnus1-6/+6
gcc/fortran/ChangeLog: * dump-parse-tree.cc (show_omp_namelist): Update allocator, fix align dump. (show_omp_node, show_code_node): Handle EXEC_OMP_ALLOCATE. * gfortran.h (enum gfc_statement): Add ST_OMP_ALLOCATE and ..._EXEC. (enum gfc_exec_op): Add EXEC_OMP_ALLOCATE. (struct gfc_omp_namelist): Add 'allocator' to 'u2' union. (struct gfc_namespace): Add omp_allocate. (gfc_resolve_omp_allocate): New. * match.cc (gfc_free_omp_namelist): Free 'u2.allocator'. * match.h (gfc_match_omp_allocate, gfc_match_omp_allocators): New. * openmp.cc (gfc_omp_directives): Uncomment allocate/allocators. (gfc_match_omp_variable_list): Add bool arg for rejecting listening common-block vars separately. (gfc_match_omp_clauses): Update for u2.allocators. (OMP_ALLOCATORS_CLAUSES, gfc_match_omp_allocate, gfc_match_omp_allocators, is_predefined_allocator, gfc_resolve_omp_allocate): New. (resolve_omp_clauses): Update 'allocate' clause checks. (omp_code_to_statement, gfc_resolve_omp_directive): Handle OMP ALLOCATE/ALLOCATORS. * parse.cc (in_exec_part): New global var. (check_omp_allocate_stmt, parse_openmp_allocate_block): New. (decode_omp_directive, case_exec_markers, case_omp_decl, gfc_ascii_statement, parse_omp_structured_block): Handle OMP allocate/allocators. (verify_st_order, parse_executable): Set in_exec_part. * resolve.cc (gfc_resolve_blocks, resolve_codes): Handle allocate/allocators. * st.cc (gfc_free_statement): Likewise. * trans.cc (trans_code): Likewise. * trans-openmp.cc (gfc_trans_omp_directive): Likewise. (gfc_trans_omp_clauses, gfc_split_omp_clauses): Update for u2.allocator, fix for u.align. libgomp/ChangeLog: * testsuite/libgomp.fortran/allocate-4.f90: Update dg-error. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/allocate-2.f90: Update dg-error. * gfortran.dg/gomp/allocate-4.f90: New test. * gfortran.dg/gomp/allocate-5.f90: New test. * gfortran.dg/gomp/allocate-6.f90: New test. * gfortran.dg/gomp/allocate-7.f90: New test. * gfortran.dg/gomp/allocators-1.f90: New test. * gfortran.dg/gomp/allocators-2.f90: New test.
2023-05-21libgomp: Honor OpenMP's nteams-var ICV as upper limit on num teams [PR109875]Tobias Burnus4-0/+228
The nteams-var ICV exists per device and can be set either via the routine omp_set_num_teams or as environment variable (OMP_NUM_TEAMS with optional _ALL/_DEV/_DEV_<num> suffix); it is default-initialized to zero. The number of teams created is described under the num_teams clause. If the clause is absent, the number of teams is implementation defined but at least one team must exist and, if nteams-var is positive, at most nteams-var teams may exist. The latter condition was not honored in a target region before this commit, such that too many teams were created. Already before this commit, both the num_teams([lower:]upper) clause (on the host and in target regions) and, only on the host, the nteams-var ICV were honored. And as only one teams is created for host fallback, unless the clause specifies otherwise, the nteams-var ICV was and is effectively honored. libgomp/ChangeLog: PR libgomp/109875 * config/gcn/target.c (GOMP_teams4): Honor nteams-var ICV. * config/nvptx/target.c (GOMP_teams4): Likewise. * testsuite/libgomp.c-c++-common/teams-nteams-icv-1.c: New test. * testsuite/libgomp.c-c++-common/teams-nteams-icv-2.c: New test. * testsuite/libgomp.c-c++-common/teams-nteams-icv-3.c: New test. * testsuite/libgomp.c-c++-common/teams-nteams-icv-4.c: New test.
2023-05-17Fortran/OpenMP: Fix mapping of array descriptors and deferred-length stringsTobias Burnus5-1/+1551
Previously, array descriptors might have been mapped as 'alloc' instead of 'to' for 'alloc', not updating the array bounds. The 'alloc' could also appear for 'data exit', failing with a libgomp assert. In some cases, either array descriptors or deferred-length string's length variable was not mapped. And, finally, some offset calculations with array-sections mappings went wrong. Additionally, the patch now unmaps for scalar allocatables/pointers the GOMP_MAP_POINTER, avoiding stale mappings. The testcases contain some comment-out tests which require follow-up work and for which PR exist. Those mostly relate to deferred-length strings which have several issues beyong OpenMP support. gcc/fortran/ChangeLog: * trans-decl.cc (gfc_get_symbol_decl): Add attributes such as 'declare target' also to hidden artificial variable for deferred-length character variables. * trans-openmp.cc (gfc_trans_omp_array_section, gfc_trans_omp_clauses, gfc_trans_omp_target_exit_data): Improve mapping of array descriptors and deferred-length string variables. gcc/ChangeLog: * gimplify.cc (gimplify_scan_omp_clauses): Remove Fortran special case. libgomp/ChangeLog: * testsuite/libgomp.fortran/target-enter-data-3.f90: Uncomment 'target exit data'. * testsuite/libgomp.fortran/target-enter-data-4.f90: New test. * testsuite/libgomp.fortran/target-enter-data-5.f90: New test. * testsuite/libgomp.fortran/target-enter-data-6.f90: New test. * testsuite/libgomp.fortran/target-enter-data-7.f90: New test. gcc/testsuite/ * gfortran.dg/goacc/finalize-1.f: Update dg-tree; shows a fix for 'finalize' as a ptr is now 'delete' instead of 'release'. * gfortran.dg/gomp/pr78260-2.f90: Likewise as elem-size calc moved to if (allocated) block * gfortran.dg/gomp/target-exit-data.f90: Likewise as a var is now a replaced by a MEM< _25 > expression. * gfortran.dg/gomp/map-9.f90: Update dg-scan-tree-dump. * gfortran.dg/gomp/map-10.f90: New test.
2023-05-15Support parallel testing in libgomp, part II [PR66005]Thomas Schwinge5-4/+35
..., and enable if 'flock' is available for serializing execution testing. Regarding the default of 19 parallel slots, this turned out to be a local minimum for wall time when testing this on: $ uname -srvi Linux 4.2.0-42-generic #49~14.04.1-Ubuntu SMP Wed Jun 29 20:22:11 UTC 2016 x86_64 $ grep '^model name' < /proc/cpuinfo | uniq -c 32 model name : Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz ... in two configurations: case (a) standard configuration, no offloading configured, case (b) offloading for GCN and nvptx configured but no devices available. For both cases, default plus '-m32' variant. $ \time make check-target-libgomp RUNTESTFLAGS="--target_board=unix\{,-m32\}" Case (a), baseline: 6432.23user 332.38system 47:32.28elapsed 237%CPU (0avgtext+0avgdata 505044maxresident)k 6382.43user 319.21system 47:06.04elapsed 237%CPU (0avgtext+0avgdata 505172maxresident)k This is what people have been complaining about, rightly so, in <https://gcc.gnu.org/PR66005> "libgomp make check time is excessive" and elsewhere. Case (a), parallelized: -j12 GCC_TEST_PARALLEL_SLOTS=10 3088.49user 267.74system 6:43.82elapsed 831%CPU (0avgtext+0avgdata 505188maxresident)k -j15 GCC_TEST_PARALLEL_SLOTS=15 3308.08user 294.79system 5:56.04elapsed 1011%CPU (0avgtext+0avgdata 505360maxresident)k -j17 GCC_TEST_PARALLEL_SLOTS=17 3539.93user 298.99system 5:27.86elapsed 1170%CPU (0avgtext+0avgdata 505112maxresident)k -j18 GCC_TEST_PARALLEL_SLOTS=18 3697.50user 317.18system 5:14.63elapsed 1275%CPU (0avgtext+0avgdata 505360maxresident)k -j19 GCC_TEST_PARALLEL_SLOTS=19 3765.94user 324.27system 5:13.22elapsed 1305%CPU (0avgtext+0avgdata 505128maxresident)k -j20 GCC_TEST_PARALLEL_SLOTS=20 3684.66user 312.32system 5:15.26elapsed 1267%CPU (0avgtext+0avgdata 505100maxresident)k -j23 GCC_TEST_PARALLEL_SLOTS=23 4040.59user 347.10system 5:29.12elapsed 1333%CPU (0avgtext+0avgdata 505200maxresident)k -j26 GCC_TEST_PARALLEL_SLOTS=26 3973.24user 377.96system 5:24.70elapsed 1340%CPU (0avgtext+0avgdata 505160maxresident)k -j32 GCC_TEST_PARALLEL_SLOTS=32 4004.42user 346.10system 5:16.11elapsed 1376%CPU (0avgtext+0avgdata 505160maxresident)k Yay! Case (b), baseline; 2+ h: 7227.58user 700.54system 2:14:33elapsed 98%CPU (0avgtext+0avgdata 994264maxresident)k Case (b), parallelized: -j12 GCC_TEST_PARALLEL_SLOTS=10 7377.46user 777.52system 16:06.63elapsed 843%CPU (0avgtext+0avgdata 994344maxresident)k -j15 GCC_TEST_PARALLEL_SLOTS=15 8019.18user 721.42system 12:13.56elapsed 1191%CPU (0avgtext+0avgdata 994228maxresident)k -j17 GCC_TEST_PARALLEL_SLOTS=17 8530.11user 716.95system 10:45.92elapsed 1431%CPU (0avgtext+0avgdata 994176maxresident)k -j18 GCC_TEST_PARALLEL_SLOTS=18 8776.79user 645.89system 10:27.20elapsed 1502%CPU (0avgtext+0avgdata 994248maxresident)k -j19 GCC_TEST_PARALLEL_SLOTS=19 9332.37user 641.76system 10:15.09elapsed 1621%CPU (0avgtext+0avgdata 994260maxresident)k -j20 GCC_TEST_PARALLEL_SLOTS=20 9609.54user 789.88system 10:26.94elapsed 1658%CPU (0avgtext+0avgdata 994284maxresident)k -j23 GCC_TEST_PARALLEL_SLOTS=23 10362.40user 911.14system 10:44.47elapsed 1749%CPU (0avgtext+0avgdata 994208maxresident)k -j26 GCC_TEST_PARALLEL_SLOTS=26 11159.44user 850.99system 11:09.25elapsed 1794%CPU (0avgtext+0avgdata 994256maxresident)k -j32 GCC_TEST_PARALLEL_SLOTS=32 11453.50user 939.52system 11:00.38elapsed 1876%CPU (0avgtext+0avgdata 994240maxresident)k On my Dell Precision 7530 laptop: $ uname -srvi Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64 $ grep '^model name' < /proc/cpuinfo | uniq -c 12 model name : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz $ nvidia-smi -L GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea) ... in two configurations: case (c) standard configuration, no offloading configured, case (d) offloading for nvptx configured and device available. For both cases, only default variant, no '-m32'. $ \time make check-target-libgomp Case (c), baseline; roughly half of case (a) (just one variant): 1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata 505148maxresident)k 1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata 505212maxresident)k Case (c), parallelized: -j12 GCC_TEST_PARALLEL_SLOTS=2 1143.83user 110.76system 10:20.46elapsed 202%CPU (0avgtext+0avgdata 505216maxresident)k -j12 GCC_TEST_PARALLEL_SLOTS=6 1737.08user 143.94system 4:59.48elapsed 628%CPU (0avgtext+0avgdata 505200maxresident)k 1730.31user 143.02system 4:58.75elapsed 627%CPU (0avgtext+0avgdata 505152maxresident)k -j12 GCC_TEST_PARALLEL_SLOTS=8 2192.63user 169.34system 4:52.96elapsed 806%CPU (0avgtext+0avgdata 505216maxresident)k 2219.04user 167.67system 4:53.19elapsed 814%CPU (0avgtext+0avgdata 505152maxresident)k -j12 GCC_TEST_PARALLEL_SLOTS=10 2463.93user 184.98system 4:48.39elapsed 918%CPU (0avgtext+0avgdata 505200maxresident)k 2455.62user 183.68system 4:47.40elapsed 918%CPU (0avgtext+0avgdata 505216maxresident)k -j12 GCC_TEST_PARALLEL_SLOTS=12 2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata 505216maxresident)k 2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata 505212maxresident)k -j20 GCC_TEST_PARALLEL_SLOTS=20 [oversubscribe] 2613.18user 199.51system 4:44.06elapsed 990%CPU (0avgtext+0avgdata 505216maxresident)k Case (d), baseline (compared to case (b): only nvptx offloading compilation, but also nvptx offloading execution); ~1 h: 2841.93user 653.68system 1:02:26elapsed 93%CPU (0avgtext+0avgdata 909792maxresident)k 2842.03user 654.39system 1:02:24elapsed 93%CPU (0avgtext+0avgdata 909880maxresident)k Case (d), parallelized: -j12 GCC_TEST_PARALLEL_SLOTS=2 2856.39user 606.87system 33:58.64elapsed 169%CPU (0avgtext+0avgdata 909948maxresident)k -j12 GCC_TEST_PARALLEL_SLOTS=6 3444.90user 666.86system 18:37.57elapsed 367%CPU (0avgtext+0avgdata 909856maxresident)k 3462.13user 667.13system 18:36.87elapsed 369%CPU (0avgtext+0avgdata 909872maxresident)k -j12 GCC_TEST_PARALLEL_SLOTS=8 3929.74user 716.22system 18:02.36elapsed 429%CPU (0avgtext+0avgdata 909832maxresident)k -j12 GCC_TEST_PARALLEL_SLOTS=10 4152.84user 736.16system 17:43.05elapsed 459%CPU (0avgtext+0avgdata 909872maxresident)k -j12 GCC_TEST_PARALLEL_SLOTS=12 4209.60user 749.00system 17:35.20elapsed 469%CPU (0avgtext+0avgdata 909840maxresident)k -j20 GCC_TEST_PARALLEL_SLOTS=20 [oversubscribe] 4255.54user 756.78system 17:29.06elapsed 477%CPU (0avgtext+0avgdata 909868maxresident)k Worth noting is that with nvptx offloading, there is one execution test case that times out ('libgomp.fortran/reverse-offload-5.f90'). This effectively stalls progress for almost 5 min: quickly other executions test cases queue up on the lock for all parallel slots. That's working as expected; just noting this as it accordingly does skew the wall time numbers. PR testsuite/66005 libgomp/ * configure.ac: Look for 'flock'. * testsuite/Makefile.am (gcc_test_parallel_slots): Enable parallel testing. * testsuite/config/default.exp: Don't 'load_lib "standard.exp"' here... * testsuite/lib/libgomp.exp: ... but here, instead. (libgomp_load): Override for parallel testing. * testsuite/libgomp-site-extra.exp.in (FLOCK): Set. * configure: Regenerate. * Makefile.in: Regenerate. * testsuite/Makefile.in: Regenerate.
2023-05-15Support parallel testing in libgomp, part I [PR66005]Rainer Orth3-23/+138
..., while still hard-coding the number of parallel slots to one. PR testsuite/66005 libgomp/ * testsuite/Makefile.am (PWD_COMMAND): New variable. (%/site.exp): New target. (check_p_numbers0, check_p_numbers1, check_p_numbers2) (check_p_numbers3, check_p_numbers4, check_p_numbers5) (check_p_numbers6, check_p_numbers, gcc_test_parallel_slots) (check_p_subdirs) (check_DEJAGNU_libgomp_targets): New variables. ($(check_DEJAGNU_libgomp_targets)): New target. ($(check_DEJAGNU_libgomp_targets)): New dependency. (check-DEJAGNU $(check_DEJAGNU_libgomp_targets)): New targets. * testsuite/Makefile.in: Regenerate. * testsuite/lib/libgomp.exp: For parallel testing, 'load_file ../libgomp-test-support.exp'. Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
2023-05-15libgomp testsuite: As appropriate, use the 'gcc', 'g++', 'gfortran' driver ↵Thomas Schwinge7-61/+43
[PR91884] ..., that is, 'GCC_UNDER_TEST', 'GXX_UNDER_TEST', 'GFORTRAN_UNDER_TEST' instead of 'GCC_UNDER_TEST' for all of them. No need anymore for 'gcc -lstdc++ -x c++' for C++ code, or 'gcc -lgfortran' plus conditional '-lquadmath' for Fortran code. (Getting rid of explicit '-foffload=-lgfortran' is for another day.) PR testsuite/91884 libgomp/ * configure.ac: 'AC_SUBST(CXX)'. * configure: Regenerate. * Makefile.in: Likewise. * testsuite/Makefile.in: Likewise. * testsuite/libgomp-site-extra.exp.in (GXX_UNDER_TEST) (GFORTRAN_UNDER_TEST): Set. * testsuite/lib/libgomp.exp (libgomp_init): Adjust. * testsuite/libgomp.c++/c++.exp: Use 'GXX_UNDER_TEST'. * testsuite/libgomp.oacc-c++/c++.exp: Likewise. * testsuite/libgomp.fortran/fortran.exp: Use 'GFORTRAN_UNDER_TEST'. * testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.
2023-05-15libgomp testsuite: Have each '*.exp' file specify the compiler to use [PR91884]Thomas Schwinge8-15/+17
..., which is still 'GCC_UNDER_TEST' for all of them; no change in behavior. PR testsuite/91884 libgomp/ * testsuite/lib/libgomp.exp (libgomp_target_compile): Don't specify compiler. * testsuite/libgomp.c++/c++.exp (ALWAYS_CFLAGS): Specify compiler. * testsuite/libgomp.c/c.exp (ALWAYS_CFLAGS): Likewise. * testsuite/libgomp.fortran/fortran.exp (ALWAYS_CFLAGS): Likewise. * testsuite/libgomp.graphite/graphite.exp (ALWAYS_CFLAGS): Likewise. * testsuite/libgomp.oacc-c++/c++.exp (ALWAYS_CFLAGS): Likewise. * testsuite/libgomp.oacc-c/c.exp (ALWAYS_CFLAGS): Likewise. * testsuite/libgomp.oacc-fortran/fortran.exp (ALWAYS_CFLAGS): Likewise.
2023-05-12LTO: Fix writing of toplevel asm with offloading [PR109816]Tobias Burnus2-0/+104
When offloading was enabled, top-level 'asm' were added to the offloading section, confusing assemblers which did not support the syntax. Additionally, with offloading and -flto, the top-level assembler code did not end up in the host files. As r14-321-g9a41d2cdbcd added top-level 'asm' to one libstdc++ header file, the issue became more apparent, causing fails with nvptx for some C++ testcases. PR libstdc++/109816 gcc/ChangeLog: * lto-cgraph.cc (output_symtab): Guard lto_output_toplevel_asms by '!lto_stream_offload_p'. libgomp/ChangeLog: * testsuite/libgomp.c++/target-map-class-1.C: New test. * testsuite/libgomp.c++/target-map-class-2.C: New test.
2023-05-12libgomp testsuite: Generalize 'lang_library_path' into a list of ↵Thomas Schwinge5-47/+63
'lang_library_paths' ..., and use that for libquadmath, too. libgomp/ * testsuite/lib/libgomp.exp (libgomp_target_compile): Generalize 'lang_library_path' into a list of 'lang_library_paths'. * testsuite/libgomp.c++/c++.exp: Adjust. * testsuite/libgomp.oacc-c++/c++.exp: Likewise. * testsuite/libgomp.fortran/fortran.exp: Adjust. Use that for libquadmath, too. * testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.
2023-05-12libgomp testsuite: Get rid of 'lang_test_file_found'Thomas Schwinge5-282/+243
Instead, 'return' early from the '*.exp' files that we're not able to test. Also, change 'puts' into 'verbose -log'. While re-indenting the previous 'if { $lang_test_file_found } { [...] }' code, also simplify 'ld_library_path' setup. libgomp/ * testsuite/lib/libgomp.exp (libgomp_target_compile): Don't look at 'lang_test_file_found'. * testsuite/libgomp.c++/c++.exp: Don't set and use it, and instead 'return' early if not able to test. Simplify 'ld_library_path' setup. * testsuite/libgomp.fortran/fortran.exp: Likewise. * testsuite/libgomp.oacc-c++/c++.exp: Likewise. * testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.
2023-05-12libgomp C++, Fortran testsuites: Resolve 'lang_test_file_found' firstThomas Schwinge4-73/+68
libgomp/ * testsuite/libgomp.c++/c++.exp: Resolve 'lang_test_file_found' first. * testsuite/libgomp.fortran/fortran.exp: Likewise. * testsuite/libgomp.oacc-c++/c++.exp: Likewise. * testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.
2023-05-12libgomp testsuite: Localize 'lang_[...]' etc.Thomas Schwinge7-49/+36
..., instead of letting them bleed into the next '*.exp' file, requiring clean-up there. libgomp/ * testsuite/libgomp.c++/c++.exp: Localize 'lang_[...]' etc. * testsuite/libgomp.c/c.exp: Likewise. * testsuite/libgomp.fortran/fortran.exp: Likewise. * testsuite/libgomp.graphite/graphite.exp: Likewise. * testsuite/libgomp.oacc-c++/c++.exp: Likewise. * testsuite/libgomp.oacc-c/c.exp: Likewise. * testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.
2023-05-09libgomp testsuite: Use 'lang_test_file_found' instead of 'lang_test_file'Thomas Schwinge8-24/+8
The value of 'lang_test_file' isn't actually used anywhere. libgomp/ * testsuite/libgomp.c++/c++.exp: Don't set 'lang_test_file'. * testsuite/libgomp.fortran/fortran.exp: Likewise. * testsuite/libgomp.oacc-c++/c++.exp: Likewise. * testsuite/libgomp.oacc-fortran/fortran.exp: Likewise. * testsuite/libgomp.c/c.exp: Unset 'lang_test_file_found' instead of 'lang_test_file'. * testsuite/libgomp.oacc-c/c.exp: Likewise. * testsuite/libgomp.graphite/graphite.exp: Likewise. * testsuite/lib/libgomp.exp (libgomp_target_compile): Look for 'lang_test_file_found' instead of 'lang_test_file'.
2023-05-09libgomp testsuite: Only use 'blddir' if setThomas Schwinge3-4/+7
(It is unclear to me why the current working directory needs to be in 'LD_LIBRARY_PATH'; leaving that alone for now.) libgomp/ * testsuite/lib/libgomp.exp (libgomp_init): Only use 'blddir' if set. * testsuite/libgomp.c++/c++.exp: Likewise. * testsuite/libgomp.oacc-c++/c++.exp: Likewise.
2023-05-09libgomp C++ testsuite: Don't compute 'blddir' twiceThomas Schwinge2-6/+0
It has already been set in 'libgomp/testsuite/lib/libgomp.exp:libgomp_init'. libgomp/ * testsuite/libgomp.c++/c++.exp (blddir): Don't set. * testsuite/libgomp.oacc-c++/c++.exp (blddir): Likewise.
2023-05-08libgomp C++ testsuite: Use 'lang_include_flags' instead of 'libstdcxx_includes'Thomas Schwinge2-8/+6
With nvptx offloading configured, and supported, and CUDA available: $ make check-target-libgomp RUNTESTFLAGS="--all c.exp=context-1.c c++.exp=context-1.c" [...] Running [...]/libgomp.oacc-c/c.exp ... PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0 (test for excess errors) PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0 execution test PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 (test for excess errors) PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 execution test UNSUPPORTED: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O2 Running [...]/libgomp.oacc-c++/c++.exp ... PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0 (test for excess errors) PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0 execution test PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 (test for excess errors) PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 execution test UNSUPPORTED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O2 [...] ..., but for 'c++.exp=context-1.c' alone, we currently get all-UNSUPPORTED: $ make check-target-libgomp RUNTESTFLAGS_="--all c++.exp=context-1.c" [...] Running [...]/libgomp.oacc-c++/c++.exp ... UNSUPPORTED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0 UNSUPPORTED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 UNSUPPORTED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O2 [...] That is, if 'c.exp' executes first, it does successfully evaluate 'dg-require-effective-target openacc_cublas' -- and does cache this result (so it isn't reevaluated for 'c++.exp'). However, for 'c++.exp' alone (that is, without the 'c.exp' result cached), we run into: spawn -ignore SIGHUP [xgcc] [...] -x c++ openacc_cublas2311907.c [...] In file included from /usr/include/cuda_fp16.h:3673, from /usr/include/cublas_api.h:75, from /usr/include/cublas_v2.h:65, from openacc_cublas2311907.c:3: /usr/include/cuda_fp16.hpp:67:10: fatal error: utility: No such file or directory We're missing include paths to C++/libstdc++ build-tree headers. Fix this by using the mechanism introduced for Fortran in r212268 (commit f707da16f714f7fe5a42391748212c84dfec639b) re "libgomp.fortran/fortran.exp - add -fintrinsic-modules-path ${blddir}". libgomp/ * testsuite/libgomp.c++/c++.exp: Use 'lang_include_flags' instead of 'libstdcxx_includes'. * testsuite/libgomp.oacc-c++/c++.exp: Likewise.
2023-05-04OpenACC: Further attach/detach clause fixes for Fortran [PR109622]Julian Brown4-2/+58
This patch moves several tests introduced by the following patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616939.html commit r14-325-gcacf65d74463600815773255e8b82b4043432bd7 into the proper location for OpenACC testing (thanks to Thomas for spotting my mistake!), and also fixes a few additional problems -- missing diagnostics for non-pointer attaches, and a case where a pointer was incorrectly dereferenced. Tests are also adjusted for vector-length warnings on nvidia accelerators. 2023-04-29 Julian Brown <julian@codesourcery.com> PR fortran/109622 gcc/fortran/ * openmp.cc (resolve_omp_clauses): Add diagnostic for non-pointer/non-allocatable attach/detach. * trans-openmp.cc (gfc_trans_omp_clauses): Remove dereference for pointer-to-scalar derived type component attach/detach. Fix attach/detach handling for descriptors. gcc/testsuite/ * gfortran.dg/goacc/pr109622-5.f90: New test. * gfortran.dg/goacc/pr109622-6.f90: New test. libgomp/ * testsuite/libgomp.fortran/pr109622.f90: Move test... * testsuite/libgomp.oacc-fortran/pr109622.f90: ...to here. Ignore vector length warning. * testsuite/libgomp.fortran/pr109622-2.f90: Move test... * testsuite/libgomp.oacc-fortran/pr109622-2.f90: ...to here. Add missing copyin/copyout variable. Ignore vector length warnings. * testsuite/libgomp.fortran/pr109622-3.f90: Move test... * testsuite/libgomp.oacc-fortran/pr109622-3.f90: ...to here. Ignore vector length warnings. * testsuite/libgomp.oacc-fortran/pr109622-4.f90: New test.
2023-04-28OpenACC: Stand-alone attach/detach clause fixes for Fortran [PR109622]Julian Brown3-0/+96
This patch fixes several cases where multiple attach or detach mapping nodes were being created for stand-alone attach or detach clauses in Fortran. After the introduction of stricter checking later during compilation, these extra nodes could cause ICEs, as seen in the PR. The patch also fixes cases that "happened to work" previously where the user attaches/detaches a pointer to array using a descriptor, and (I think!) the "_data" field has offset zero, hence the same address as the descriptor as a whole. 2023-04-27 Julian Brown <julian@codesourcery.com> PR fortran/109622 gcc/fortran/ * trans-openmp.cc (gfc_trans_omp_clauses): Attach/detach clause fixes. gcc/testsuite/ * gfortran.dg/goacc/attach-descriptor.f90: Adjust expected output. libgomp/ * testsuite/libgomp.fortran/pr109622.f90: New test. * testsuite/libgomp.fortran/pr109622-2.f90: New test. * testsuite/libgomp.fortran/pr109622-3.f90: New test.
2023-04-25'omp scan' struct block seq update for OpenMP 5.xTobias Burnus3-0/+248
While OpenMP 5.0 required a single structured block before and after the 'omp scan' directive, OpenMP 5.1 changed this to a 'structured block sequence, denoting 2 or more executable statements in OpenMP 5.1 (whoops!) and zero or more in OpenMP 5.2. This commit updates C/C++ to accept zero statements (but till requires the '{' ... '}' for the final-loop-body) and updates Fortran to accept zero or more than one statements. If there is no preceeding or succeeding executable statement, a warning is shown. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_scan_loop_body): Handle zero exec statements before/after 'omp scan'. gcc/cp/ChangeLog: * parser.cc (cp_parser_omp_scan_loop_body): Handle zero exec statements before/after 'omp scan'. gcc/fortran/ChangeLog: * openmp.cc (gfc_resolve_omp_do_blocks): Handle zero or more than one exec statements before/after 'omp scan'. * trans-openmp.cc (gfc_trans_omp_do): Likewise. libgomp/ChangeLog: * testsuite/libgomp.c-c++-common/scan-1.c: New test. * testsuite/libgomp.c/scan-23.c: New test. * testsuite/libgomp.fortran/scan-2.f90: New test. gcc/testsuite/ChangeLog: * g++.dg/gomp/attrs-7.C: Update dg-error/dg-warning. * gfortran.dg/gomp/loop-2.f90: Likewise. * gfortran.dg/gomp/reduction5.f90: Likewise. * gfortran.dg/gomp/reduction6.f90: Likewise. * gfortran.dg/gomp/scan-1.f90: Likewise. * gfortran.dg/gomp/taskloop-2.f90: Likewise. * c-c++-common/gomp/scan-6.c: New test. * gfortran.dg/gomp/scan-8.f90: New test.
2023-03-28testsuite: Fix weak_undefined handling on DarwinRainer Orth1-0/+1
The patch that introduced the weak_undefined effective-target keyword and corresponding dg-add-options support commit 378ec7b87a5265dbe2d489c245fac98ef37fa638 Author: Alexandre Oliva <oliva@adacore.com> Date: Thu Mar 23 00:45:05 2023 -0300 [testsuite] test for weak_undefined support and add options badly broke the affected tests on macOS like so: ERROR: gcc.dg/addr_equal-1.c: unknown dg option: 89 for " dg-add-options 5 weak_undefined " ERROR: gcc.dg/addr_equal-1.c: unknown dg option: 89 for " dg-add-options 5 weak_undefined " add_options_for_weak_undefined tries to call an non-existant proc "89". Even after fixing this by escaping the brackets, two tests still failed to link since they lacked the corresponding calls do dg-add-options weak_undefined. Tested on x86_64-apple-darwin20.6.0 and i386-pc-solaris2.11. 2023-03-27 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/testsuite: * lib/target-supports.exp (add_options_for_weak_undefined): Escape brackets. * gcc.dg/visibility-22.c: Add weak_undefined options. libgomp: * testsuite/libgomp.oacc-c-c++-common/routine-nohost-2.c: Add weak_undefined options.
2023-03-10Use 'GOMP_MAP_VARS_TARGET' for OpenACC compute constructs [PR90596]Thomas Schwinge1-44/+14
Thereby considerably simplify the device plugins' 'GOMP_OFFLOAD_openacc_exec', 'GOMP_OFFLOAD_openacc_async_exec' functions: in terms of lines of code, but in particular conceptually: no more device memory allocation, host to device data copying, device memory deallocation -- 'GOMP_MAP_VARS_TARGET' does all that for us. This depends on commit 2b2340e236c0bba8aaca358ea25a5accd8249fbd "Allow libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral' data", where I said that "a use will emerge later", which is this one here. PR libgomp/90596 libgomp/ * target.c (gomp_map_vars_internal): Allow for 'param_kind == GOMP_MAP_VARS_OPENACC | GOMP_MAP_VARS_TARGET'. * oacc-parallel.c (GOACC_parallel_keyed): Pass 'GOMP_MAP_VARS_TARGET' to 'goacc_map_vars'. * plugin/plugin-gcn.c (alloc_by_agent, gcn_exec) (GOMP_OFFLOAD_openacc_exec, GOMP_OFFLOAD_openacc_async_exec): Adjust, simplify. (gomp_offload_free): Remove. * plugin/plugin-nvptx.c (nvptx_exec, GOMP_OFFLOAD_openacc_exec) (GOMP_OFFLOAD_openacc_async_exec): Adjust, simplify. (cuda_free_argmem): Remove. * testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c: Adjust.
2023-03-10Simplify OpenACC 'no_create' clause implementationThomas Schwinge2-6/+36
For 'OFFSET_INLINED', 'gomp_map_val' does the right thing, and we may then simplify the device plugins accordingly. This is a follow-up to Subversion r279551 (Git commit a6163563f2ce502bd4ef444bd5de33570bb8eeb1) "Add OpenACC 2.6's no_create", Subversion r279622 (Git commit 5bcd470bf0749e1f56d05dd43aa9584ff2e3a090) "Use gomp_map_val for OpenACC host-to-device address translation". libgomp/ * target.c (gomp_map_vars_internal): Use 'OFFSET_INLINED' for 'GOMP_MAP_IF_PRESENT'. * plugin/plugin-gcn.c (gcn_exec, GOMP_OFFLOAD_openacc_exec) (GOMP_OFFLOAD_openacc_async_exec): Adjust. * plugin/plugin-nvptx.c (nvptx_exec, GOMP_OFFLOAD_openacc_exec) (GOMP_OFFLOAD_openacc_async_exec): Likewise. * testsuite/libgomp.oacc-c-c++-common/no_create-1.c: Add 'async' testing. * testsuite/libgomp.oacc-c-c++-common/no_create-2.c: Likewise.
2023-03-10Document/verify another aspect of OpenACC 'async' semantics in ↵Thomas Schwinge1-2/+2
'libgomp.oacc-c-c++-common/data-3.c' ... that I almost broke with later implementation changes. libgomp/ * testsuite/libgomp.oacc-c-c++-common/data-3.c: Document/verify another aspect of OpenACC 'async' semantics.
2023-03-10Fix OpenACC/GCN 'acc_ev_enqueue_launch_end' positionThomas Schwinge1-19/+183
For an OpenACC compute construct, we've currently got: - [...] - acc_ev_enqueue_launch_start - launch kernel - free memory - acc_ev_free - acc_ev_enqueue_launch_end This confused another thing that I'm working on, so I adjusted that to: - [...] - acc_ev_enqueue_launch_start - launch kernel - acc_ev_enqueue_launch_end - free memory - acc_ev_free Correspondingly, verify 'acc_ev_alloc', 'acc_ev_free' in 'libgomp.oacc-c-c++-common/acc_prof-parallel-1.c'. libgomp/ * plugin/plugin-gcn.c (gcn_exec): Fix 'acc_ev_enqueue_launch_end' position. * testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c: Verify 'acc_ev_alloc', 'acc_ev_free'.
2023-03-09libgomp: Fix default value of GOMP_SPINCOUNT [PR 109062]Hongyu Wang1-0/+14
When OMP_WAIT_POLICY is not specified, current implementation will cause icv flag GOMP_ICV_WAIT_POLICY unset, so global variable wait_policy will remain its uninitialized value. Initialize it to -1 to make GOMP_SPINCOUNT behavior consistent with its description. libgomp/ChangeLog: PR libgomp/109062 * env.c (wait_policy): Initialize to -1. (initialize_icvs): Initialize icvs->wait_policy to -1. * testsuite/libgomp.c-c++-common/pr109062.c: New test.
2023-03-02amdgcn: Enable SIMD vectorization of math functionsKwok Cheung Yeung1-0/+217
Calls to vectorized versions of routines in the math library will now be inserted when vectorizing code containing supported math functions. 2023-03-02 Kwok Cheung Yeung <kcy@codesourcery.com> Paul-Antoine Arras <pa@codesourcery.com> gcc/ * builtins.cc (mathfn_built_in_explicit): New. * config/gcn/gcn.cc: Include case-cfn-macros.h. (mathfn_built_in_explicit): Add prototype. (gcn_vectorize_builtin_vectorized_function): New. (gcn_libc_has_function): New. (TARGET_LIBC_HAS_FUNCTION): Define. (TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION): Define. gcc/testsuite/ * gcc.target/gcn/simd-math-1.c: New testcase. * gcc.target/gcn/simd-math-2.c: New testcase. libgomp/ * testsuite/libgomp.c/simd-math-1.c: New testcase.
2023-03-01OpenMP/Fortran: Fix handling of optional is_device_ptr + bind(C) [PR108546]Tobias Burnus2-0/+99
For is_device_ptr, optional checks should only be done before calling libgomp, afterwards they are NULL either because of absent or, by chance, because it is unallocated or unassociated (for pointers/allocatables). Additionally, it fixes an issue with explicit mapping for 'type(c_ptr)'. PR middle-end/108546 gcc/fortran/ChangeLog: * trans-openmp.cc (gfc_trans_omp_clauses): Fix mapping of type(C_ptr) variables. gcc/ChangeLog: * omp-low.cc (lower_omp_target): Remove optional handling on the receiver side, i.e. inside target (data), for use_device_ptr. libgomp/ChangeLog: * testsuite/libgomp.fortran/is_device_ptr-3.f90: New test. * testsuite/libgomp.fortran/use_device_ptr-optional-4.f90: New test.