Diffstat (limited to 'libgomp/libgomp.texi')
-rw-r--r-- libgomp/libgomp.texi | 25
1 file changed, 15 insertions, 10 deletions
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 4217c29..dfd189b 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -258,7 +258,7 @@ The OpenMP 4.5 specification is fully supported.
device memory mapped by an array section @tab P @tab
@item Mapping of Fortran pointer and allocatable variables, including pointer
and allocatable components of variables
- @tab P @tab Mapping of vars with allocatable components unsupported
+ @tab Y @tab
@item @code{defaultmap} extensions @tab Y @tab
@item @code{declare mapper} directive @tab N @tab
@item @code{omp_get_supported_active_levels} routine @tab Y @tab
@@ -2316,7 +2316,7 @@ the initial device.
@end multitable
@item @emph{See also}:
-@ref{omp_target_memcpy_rect_async}, @ref{omp_target_memcpy}
+@ref{omp_target_memcpy_rect_async}, @ref{omp_target_memcpy}, @ref{Offload-Target Specifics}
@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.8.6
@@ -2391,7 +2391,7 @@ the initial device.
@end multitable
@item @emph{See also}:
-@ref{omp_target_memcpy_rect}, @ref{omp_target_memcpy_async}
+@ref{omp_target_memcpy_rect}, @ref{omp_target_memcpy_async}, @ref{Offload-Target Specifics}
@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.8.8
@@ -6888,7 +6888,7 @@ The implementation remark:
@code{device(ancestor:1)}) are processed serially per @code{target} region
such that the next reverse offload region is only executed after the previous
one returned.
-@item OpenMP code that has a @code{requires} directive with
+@item OpenMP code that has a @code{requires} directive with @code{self_maps} or
@code{unified_shared_memory} is only supported if all AMD GPUs have the
@code{HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT} property; for
discrete GPUs, this may require setting the @code{HSA_XNACK} environment
@@ -6911,6 +6911,11 @@ The implementation remark:
@code{omp_thread_mem_alloc}, all use low-latency memory as first
preference, and fall back to main graphics memory when the low-latency
pool is exhausted.
+@item The OpenMP routines @code{omp_target_memcpy_rect} and
+ @code{omp_target_memcpy_rect_async} and the @code{target update}
+ directive for non-contiguous list items use the 3D memory-copy function
+ of the HSA library. Higher dimensions call this function in a loop and
+ are therefore supported.
@item The unique identifier (UID), used with OpenMP's API UID routines, is the
value returned by the HSA runtime library for @code{HSA_AMD_AGENT_INFO_UUID}.
For GPUs, it is currently @samp{GPU-} followed by 16 lower-case hex digits,
@@ -7040,7 +7045,7 @@ The implementation remark:
Per device, reverse offload regions are processed serially such that
the next reverse offload region is only executed after the previous
one returned.
-@item OpenMP code that has a @code{requires} directive with
+@item OpenMP code that has a @code{requires} directive with @code{self_maps} or
@code{unified_shared_memory} runs on nvptx devices if and only if
all of those support the @code{pageableMemoryAccess} property;@footnote{
@uref{https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#um-requirements}}
@@ -7048,11 +7053,6 @@ The implementation remark:
devices (``host fallback'').
@item The default per-warp stack size is 128 kiB; see also @code{-msoft-stack}
in the GCC manual.
-@item The OpenMP routines @code{omp_target_memcpy_rect} and
- @code{omp_target_memcpy_rect_async} and the @code{target update}
- directive for non-contiguous list items will use the 2D and 3D
- memory-copy functions of the CUDA library. Higher dimensions will
- call those functions in a loop and are therefore supported.
@item Low-latency memory (@code{omp_low_lat_mem_space}) is supported when the
@code{access} trait is set to @code{cgroup}, and libgomp has
been built for PTX ISA version 4.1 or higher (such as in GCC's
@@ -7070,6 +7070,11 @@ The implementation remark:
@code{omp_thread_mem_alloc}, all use low-latency memory as first
preference, and fall back to main graphics memory when the low-latency
pool is exhausted.
+@item The OpenMP routines @code{omp_target_memcpy_rect} and
+ @code{omp_target_memcpy_rect_async} and the @code{target update}
+ directive for non-contiguous list items use the 2D and 3D memory-copy
+ functions of the CUDA library. Higher dimensions call those functions
+ in a loop and are therefore supported.
@item The unique identifier (UID), used with OpenMP's API UID routines, consists
of the @samp{GPU-} prefix followed by the 16-byte UUID as returned by
the CUDA runtime library. This UUID is output in grouped lower-case