Diffstat (limited to 'llvm/docs')
-rw-r--r-- llvm/docs/AMDGPUUsage.rst | 68
-rw-r--r-- llvm/docs/CodingStandards.rst | 25
-rw-r--r-- llvm/docs/CommandGuide/lit.rst | 1
3 files changed, 29 insertions, 65 deletions
diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index e062032..7780c0a 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -6512,6 +6512,13 @@ operations.
``buffer/global/flat_load/store/atomic`` instructions to global memory are
termed vector memory operations.
+``global_load_lds`` or ``buffer/global_load`` instructions with the ``lds`` flag
+are LDS DMA loads. They interact with caches as if the loaded data were
+being loaded to registers and not to LDS, and therefore support the same
+cache modifiers. They cannot be performed atomically. They implement volatile
+(via aux/cpol bit 31) and nontemporal (via metadata) as if they were loads
+from the global address space.
+
Private address space uses ``buffer_load/store`` using the scratch V#
(GFX6-GFX8), or ``scratch_load/store`` (GFX9-GFX11). Since only a single thread
is accessing the memory, atomic memory orderings are not meaningful, and all
@@ -13232,9 +13239,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
store atomic release - workgroup - global 1. s_waitcnt lgkmcnt(0) &
- generic vmcnt(0) & vscnt(0)
- - If CU wavefront execution
- mode, omit vmcnt(0) and
- vscnt(0).
- If OpenCL, omit
lgkmcnt(0).
- Could be split into
@@ -13280,8 +13284,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
2. buffer/global/flat_store
store atomic release - workgroup - local 1. s_waitcnt vmcnt(0) & vscnt(0)
- - If CU wavefront execution
- mode, omit.
- If OpenCL, omit.
- Could be split into
separate s_waitcnt
@@ -13369,9 +13371,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
atomicrmw release - workgroup - global 1. s_waitcnt lgkmcnt(0) &
- generic vmcnt(0) & vscnt(0)
- - If CU wavefront execution
- mode, omit vmcnt(0) and
- vscnt(0).
- If OpenCL, omit lgkmcnt(0).
- Could be split into
separate s_waitcnt
@@ -13416,8 +13415,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
2. buffer/global/flat_atomic
atomicrmw release - workgroup - local 1. s_waitcnt vmcnt(0) & vscnt(0)
- - If CU wavefront execution
- mode, omit.
- If OpenCL, omit.
- Could be split into
separate s_waitcnt
@@ -13501,9 +13498,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
fence release - workgroup *none* 1. s_waitcnt lgkmcnt(0) &
vmcnt(0) & vscnt(0)
- - If CU wavefront execution
- mode, omit vmcnt(0) and
- vscnt(0).
- If OpenCL and
address space is
not generic, omit
@@ -13630,9 +13624,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
atomicrmw acq_rel - workgroup - global 1. s_waitcnt lgkmcnt(0) &
vmcnt(0) & vscnt(0)
- - If CU wavefront execution
- mode, omit vmcnt(0) and
- vscnt(0).
- If OpenCL, omit
lgkmcnt(0).
- Must happen after
@@ -13684,8 +13675,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
2. buffer/global_atomic
3. s_waitcnt vm/vscnt(0)
- - If CU wavefront execution
- mode, omit.
- Use vmcnt(0) if atomic with
return and vscnt(0) if
atomic with no-return.
@@ -13710,8 +13699,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
atomicrmw acq_rel - workgroup - local 1. s_waitcnt vmcnt(0) & vscnt(0)
- - If CU wavefront execution
- mode, omit.
- If OpenCL, omit.
- Could be split into
separate s_waitcnt
@@ -13771,9 +13758,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
atomicrmw acq_rel - workgroup - generic 1. s_waitcnt lgkmcnt(0) &
vmcnt(0) & vscnt(0)
- - If CU wavefront execution
- mode, omit vmcnt(0) and
- vscnt(0).
- If OpenCL, omit lgkmcnt(0).
- Could be split into
separate s_waitcnt
@@ -13819,9 +13803,9 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
3. s_waitcnt lgkmcnt(0) &
vmcnt(0) & vscnt(0)
- - If CU wavefront execution
- mode, omit vmcnt(0) and
- vscnt(0).
+ - If atomic with return, omit
+ vscnt(0); if atomic with
+ no-return, omit vmcnt(0).
- If OpenCL, omit lgkmcnt(0).
- Must happen before
the following
@@ -13994,9 +13978,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
fence acq_rel - workgroup *none* 1. s_waitcnt lgkmcnt(0) &
vmcnt(0) & vscnt(0)
- - If CU wavefront execution
- mode, omit vmcnt(0) and
- vscnt(0).
- If OpenCL and
address space is
not generic, omit
@@ -14226,9 +14207,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
load atomic seq_cst - workgroup - global 1. s_waitcnt lgkmcnt(0) &
- generic vmcnt(0) & vscnt(0)
- - If CU wavefront execution
- mode, omit vmcnt(0) and
- vscnt(0).
- Could be split into
separate s_waitcnt
vmcnt(0), s_waitcnt
@@ -14337,8 +14315,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
1. s_waitcnt vmcnt(0) & vscnt(0)
- - If CU wavefront execution
- mode, omit.
- Could be split into
separate s_waitcnt
vmcnt(0) and s_waitcnt
@@ -15340,8 +15316,6 @@ the instruction in the code sequence that references the table.
| ``s_wait_storecnt 0x0``
| ``s_wait_loadcnt 0x0``
| ``s_wait_dscnt 0x0``
- | **CU wavefront execution mode:**
- | ``s_wait_dscnt 0x0``
- If OpenCL, omit ``s_wait_dscnt 0x0``.
- The waits can be
@@ -15387,8 +15361,6 @@ the instruction in the code sequence that references the table.
| ``s_wait_storecnt 0x0``
| ``s_wait_loadcnt 0x0``
| ``s_wait_dscnt 0x0``
- | **CU wavefront execution mode:**
- | ``s_wait_dscnt 0x0``
- If OpenCL, omit.
- The waits can be
@@ -15482,8 +15454,6 @@ the instruction in the code sequence that references the table.
| ``s_wait_storecnt 0x0``
| ``s_wait_loadcnt 0x0``
| ``s_wait_dscnt 0x0``
- | **CU wavefront execution mode:**
- | ``s_wait_dscnt 0x0``
- If OpenCL, omit ``s_wait_dscnt 0x0``.
- If OpenCL and CU wavefront
@@ -15533,8 +15503,6 @@ the instruction in the code sequence that references the table.
| ``s_wait_storecnt 0x0``
| ``s_wait_loadcnt 0x0``
| ``s_wait_dscnt 0x0``
- | **CU wavefront execution mode:**
- | ``s_wait_dscnt 0x0``
- If OpenCL, omit all.
- The waits can be
@@ -15626,8 +15594,6 @@ the instruction in the code sequence that references the table.
| ``s_wait_storecnt 0x0``
| ``s_wait_loadcnt 0x0``
| ``s_wait_dscnt 0x0``
- | **CU wavefront execution mode:**
- | ``s_wait_dscnt 0x0``
- If OpenCL, omit ``s_wait_dscnt 0x0``.
- If OpenCL and
@@ -15757,8 +15723,6 @@ the instruction in the code sequence that references the table.
| ``s_wait_storecnt 0x0``
| ``s_wait_loadcnt 0x0``
| ``s_wait_dscnt 0x0``
- | **CU wavefront execution mode:**
- | ``s_wait_dscnt 0x0``
- If OpenCL, omit ``s_wait_dscnt 0x0``.
- Must happen after
@@ -15815,8 +15779,6 @@ the instruction in the code sequence that references the table.
| **Atomic without return:**
| ``s_wait_storecnt 0x0``
- - If CU wavefront execution
- mode, omit.
- Must happen before
the following
``global_inv``.
@@ -15841,8 +15803,6 @@ the instruction in the code sequence that references the table.
| ``s_wait_storecnt 0x0``
| ``s_wait_loadcnt 0x0``
| ``s_wait_dscnt 0x0``
- | **CU wavefront execution mode:**
- | ``s_wait_dscnt 0x0``
- If OpenCL, omit.
- The waits can be
@@ -15904,8 +15864,6 @@ the instruction in the code sequence that references the table.
| ``s_wait_storecnt 0x0``
| ``s_wait_loadcnt 0x0``
| ``s_wait_dscnt 0x0``
- | **CU wavefront execution mode:**
- | ``s_wait_dscnt 0x0``
- If OpenCL, omit ``s_wait_loadcnt 0x0``.
- The waits can be
@@ -16157,8 +16115,6 @@ the instruction in the code sequence that references the table.
| ``s_wait_storecnt 0x0``
| ``s_wait_loadcnt 0x0``
| ``s_wait_dscnt 0x0``
- | **CU wavefront execution mode:**
- | ``s_wait_dscnt 0x0``
- If OpenCL and
address space is
@@ -16387,8 +16343,6 @@ the instruction in the code sequence that references the table.
| ``s_wait_storecnt 0x0``
| ``s_wait_loadcnt 0x0``
| ``s_wait_dscnt 0x0``
- | **CU wavefront execution mode:**
- | ``s_wait_dscnt 0x0``
- If OpenCL, omit
``s_wait_dscnt 0x0``
@@ -16495,8 +16449,6 @@ the instruction in the code sequence that references the table.
| ``s_wait_storecnt 0x0``
| ``s_wait_loadcnt 0x0``
| ``s_wait_dscnt 0x0``
- | **CU wavefront execution mode:**
- | ``s_wait_dscnt 0x0``
- If OpenCL, omit all.
- The waits can be
diff --git a/llvm/docs/CodingStandards.rst b/llvm/docs/CodingStandards.rst
index 65dd794..8677d89 100644
--- a/llvm/docs/CodingStandards.rst
+++ b/llvm/docs/CodingStandards.rst
@@ -860,27 +860,40 @@ your private interface remains private and undisturbed by outsiders.
It's okay to put extra implementation methods in a public class itself. Just
make them private (or protected) and all is well.
-Use Namespace Qualifiers to Implement Previously Declared Functions
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Use Namespace Qualifiers to Define Previously Declared Symbols
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-When providing an out-of-line implementation of a function in a source file, do
-not open namespace blocks in the source file. Instead, use namespace qualifiers
-to help ensure that your definition matches an existing declaration. Do this:
+When providing an out-of-line definition for various symbols (variables,
+functions, opaque classes) in a source file, do not open namespace blocks in the
+source file. Instead, use namespace qualifiers to help ensure that your
+definition matches an existing declaration. Do this:
.. code-block:: c++
// Foo.h
namespace llvm {
+ extern int FooVal;
int foo(const char *s);
- }
+
+ namespace detail {
+ class FooImpl;
+ } // namespace detail
+ } // namespace llvm
// Foo.cpp
#include "Foo.h"
using namespace llvm;
+
+ int llvm::FooVal;
+
int llvm::foo(const char *s) {
// ...
}
+ class detail::FooImpl {
+ // ...
+ };
+
Doing this helps to avoid bugs where the definition does not match the
declaration from the header. For example, the following C++ code defines a new
overload of ``llvm::foo`` instead of providing a definition for the existing
diff --git a/llvm/docs/CommandGuide/lit.rst b/llvm/docs/CommandGuide/lit.rst
index 6a721eb..70daae4 100644
--- a/llvm/docs/CommandGuide/lit.rst
+++ b/llvm/docs/CommandGuide/lit.rst
@@ -641,7 +641,6 @@ TestRunner.py:
%{S:real} %S after expanding all symbolic links and substitute drives
%{p:real} %p after expanding all symbolic links and substitute drives
%{t:real} %t after expanding all symbolic links and substitute drives
- %{T:real} %T after expanding all symbolic links and substitute drives
%{/s:real} %/s after expanding all symbolic links and substitute drives
%{/S:real} %/S after expanding all symbolic links and substitute drives
%{/p:real} %/p after expanding all symbolic links and substitute drives
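
Note on the CodingStandards.rst hunk: the pattern it documents can be tried in a single translation unit. The following is a minimal sketch, not taken from the patch. The namespace ``demo``, the member function ``twice``, and the helper ``useImpl`` are hypothetical names chosen for illustration, and the header and source file are merged into one file so that the example compiles on its own.

```cpp
#include <cassert>
#include <cstring>

// --- What would live in a header (hypothetical Demo.h) ---
namespace demo {
extern int FooVal;
int foo(const char *s);

namespace detail {
class FooImpl; // opaque: declared here, defined in the source file
} // namespace detail
} // namespace demo

// --- What would live in the source file (hypothetical Demo.cpp) ---
using namespace demo;

// Qualified definitions. Each one must match a declaration above;
// a mismatched name or signature is a compile error, not a new symbol.
int demo::FooVal = 0;

int demo::foo(const char *s) {
  assert(s && "expected a non-null string");
  FooVal = static_cast<int>(std::strlen(s));
  return FooVal;
}

class detail::FooImpl {
public:
  int twice(const char *s) const { return 2 * foo(s); }
}; // note the semicolon that ends the class definition

// Hypothetical helper so the opaque class can be exercised by callers.
int useImpl(const char *s) { return detail::FooImpl().twice(s); }
```

If the qualified definition were written as ``int demo::foo(char *s)``, the compiler would reject it because no such function is declared in ``demo``. Writing the same definition inside a reopened ``namespace demo { ... }`` block would instead silently declare a new overload.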