author Thomas Schwinge <tschwinge@baylibre.com> 2024-11-12 09:54:35 +0100
committer Thomas Schwinge <tschwinge@baylibre.com> 2024-12-06 09:48:33 +0100
commit c80ecfa0927a1ada31864c709220a2adb7c96662
tree d33c65415e7382282b1368ce2a768597b1badb12
parent ab5bd6ac68c6d9f870fcaf0de4a73f3dec920db9
Clarify libgomp nvptx 'omp_low_lat_mem_space' documentation

PTX '%dynamic_smem_size' was "Introduced in PTX ISA version 4.1", and
"Requires 'sm_20' or higher".  Given that GCC/nvptx generally supports
'sm_20', only the PTX ISA version matters here, and that's all fine if just
using GCC's defaults.

Follow-up to commit e9a19ead498fcc89186b724c6e76854f7751a89b
"openmp, nvptx: low-lat memory access traits".

	libgomp/
	* libgomp.texi: Clarify nvptx 'omp_low_lat_mem_space' documentation.
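
For readers unfamiliar with the feature being documented, the following is a minimal sketch (not part of this commit) of how an allocator for 'omp_low_lat_mem_space' with the 'access' trait set to 'cgroup' might be requested inside a target region, using only the standard OpenMP allocator routines. The function name 'sum_with_low_lat_scratch' is hypothetical, and the plain 'malloc' branch is an assumption added only so that the sketch also builds without '-fopenmp'; whether the allocation really ends up in low-latency memory depends on the offload configuration described in the documentation change.

```c
#include <stddef.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: sum 0..n-1 via a scratch buffer that is requested
   from low-latency memory when built with OpenMP.  Returns -1 if no
   scratch buffer could be allocated.  */
int
sum_with_low_lat_scratch (int n)
{
  int result = -1;

#ifdef _OPENMP
#pragma omp target map(tofrom: result)
#endif
  {
    int *scratch = NULL;

#ifdef _OPENMP
    /* Request the 'cgroup' access trait named in the documentation.  */
    omp_alloctrait_t traits[] = { { omp_atk_access, omp_atv_cgroup } };
    omp_allocator_handle_t al
      = omp_init_allocator (omp_low_lat_mem_space, 1, traits);
    if (al != omp_null_allocator)
      scratch = (int *) omp_alloc (n * sizeof (int), al);
#else
    /* Assumption: ordinary heap memory stands in when OpenMP is disabled.  */
    scratch = (int *) malloc (n * sizeof (int));
#endif

    if (scratch)
      {
        int sum = 0;
        for (int i = 0; i < n; i++)
          scratch[i] = i;
        for (int i = 0; i < n; i++)
          sum += scratch[i];
        result = sum;

#ifdef _OPENMP
        omp_free (scratch, al);
#else
        free (scratch);
#endif
      }

#ifdef _OPENMP
    /* Release the allocator handle if one was created.  */
    if (al != omp_null_allocator)
      omp_destroy_allocator (al);
#endif
  }

  return result;
}
```

Calling 'sum_with_low_lat_scratch (8)' is expected to return 28 when the allocation succeeds. As the documentation in the diff notes, the per-team pool defaults to 8 kiB on nvptx and may be adjusted at runtime through the 'GOMP_NVPTX_LOWLAT_POOL' environment variable.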
-rw-r--r-- libgomp/libgomp.texi 6
1 files changed, 4 insertions, 2 deletions
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 453c356..6b8000c 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -6972,8 +6972,10 @@ The implementation remark:
memory-copy functions of the CUDA library. Higher dimensions will
call those functions in a loop and are therefore supported.
@item Low-latency memory (@code{omp_low_lat_mem_space}) is supported when the
- the @code{access} trait is set to @code{cgroup}, the ISA is at least
- @code{sm_53}, and the PTX version is at least 4.1. The default pool size
+ the @code{access} trait is set to @code{cgroup}, and libgomp has
+ been built for PTX ISA version 4.1 or higher (such as in GCC's
+ default configuration). @c -mptx=4.1
+ The default pool size
is 8 kiB per team, but may be adjusted at runtime by setting environment
variable @code{GOMP_NVPTX_LOWLAT_POOL=@var{bytes}}. The maximum value is
limited by the available hardware, and care should be taken that the