diff options
Diffstat (limited to 'llvm/docs')
-rw-r--r-- | llvm/docs/AMDGPUUsage.rst | 68 | ||||
-rw-r--r-- | llvm/docs/CodingStandards.rst | 25 | ||||
-rw-r--r-- | llvm/docs/CommandGuide/lit.rst | 1 |
3 files changed, 29 insertions, 65 deletions
diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst index e062032..7780c0a 100644 --- a/llvm/docs/AMDGPUUsage.rst +++ b/llvm/docs/AMDGPUUsage.rst @@ -6512,6 +6512,13 @@ operations. ``buffer/global/flat_load/store/atomic`` instructions to global memory are termed vector memory operations. +``global_load_lds`` or ``buffer/global_load`` instructions with the `lds` flag +are LDS DMA loads. They interact with caches as if the loaded data were +being loaded to registers and not to LDS, and so therefore support the same +cache modifiers. They cannot be performed atomically. They implement volatile +(via aux/cpol bit 31) and nontemporal (via metadata) as if they were loads +from the global address space. + Private address space uses ``buffer_load/store`` using the scratch V# (GFX6-GFX8), or ``scratch_load/store`` (GFX9-GFX11). Since only a single thread is accessing the memory, atomic memory orderings are not meaningful, and all @@ -13232,9 +13239,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`. store atomic release - workgroup - global 1. s_waitcnt lgkmcnt(0) & - generic vmcnt(0) & vscnt(0) - - If CU wavefront execution - mode, omit vmcnt(0) and - vscnt(0). - If OpenCL, omit lgkmcnt(0). - Could be split into @@ -13280,8 +13284,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`. 2. buffer/global/flat_store store atomic release - workgroup - local 1. s_waitcnt vmcnt(0) & vscnt(0) - - If CU wavefront execution - mode, omit. - If OpenCL, omit. - Could be split into separate s_waitcnt @@ -13369,9 +13371,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`. atomicrmw release - workgroup - global 1. s_waitcnt lgkmcnt(0) & - generic vmcnt(0) & vscnt(0) - - If CU wavefront execution - mode, omit vmcnt(0) and - vscnt(0). - If OpenCL, omit lgkmcnt(0). - Could be split into separate s_waitcnt @@ -13416,8 +13415,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`. 2. buffer/global/flat_atomic atomicrmw release - workgroup - local 1. s_waitcnt vmcnt(0) & vscnt(0) - - If CU wavefront execution - mode, omit. - If OpenCL, omit. - Could be split into separate s_waitcnt @@ -13501,9 +13498,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`. fence release - workgroup *none* 1. s_waitcnt lgkmcnt(0) & vmcnt(0) & vscnt(0) - - If CU wavefront execution - mode, omit vmcnt(0) and - vscnt(0). - If OpenCL and address space is not generic, omit @@ -13630,9 +13624,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`. atomicrmw acq_rel - workgroup - global 1. s_waitcnt lgkmcnt(0) & vmcnt(0) & vscnt(0) - - If CU wavefront execution - mode, omit vmcnt(0) and - vscnt(0). - If OpenCL, omit lgkmcnt(0). - Must happen after @@ -13684,8 +13675,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`. 2. buffer/global_atomic 3. s_waitcnt vm/vscnt(0) - - If CU wavefront execution - mode, omit. - Use vmcnt(0) if atomic with return and vscnt(0) if atomic with no-return. @@ -13710,8 +13699,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`. atomicrmw acq_rel - workgroup - local 1. s_waitcnt vmcnt(0) & vscnt(0) - - If CU wavefront execution - mode, omit. - If OpenCL, omit. - Could be split into separate s_waitcnt @@ -13771,9 +13758,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`. atomicrmw acq_rel - workgroup - generic 1. s_waitcnt lgkmcnt(0) & vmcnt(0) & vscnt(0) - - If CU wavefront execution - mode, omit vmcnt(0) and - vscnt(0). - If OpenCL, omit lgkmcnt(0). - Could be split into separate s_waitcnt @@ -13819,9 +13803,9 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`. 3. s_waitcnt lgkmcnt(0) & vmcnt(0) & vscnt(0) - - If CU wavefront execution - mode, omit vmcnt(0) and - vscnt(0). + - If atomic with return, omit + vscnt(0), if atomic with + no-return, omit vmcnt(0). - If OpenCL, omit lgkmcnt(0). - Must happen before the following @@ -13994,9 +13978,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`. fence acq_rel - workgroup *none* 1. s_waitcnt lgkmcnt(0) & vmcnt(0) & vscnt(0) - - If CU wavefront execution - mode, omit vmcnt(0) and - vscnt(0). - If OpenCL and address space is not generic, omit @@ -14226,9 +14207,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`. load atomic seq_cst - workgroup - global 1. s_waitcnt lgkmcnt(0) & - generic vmcnt(0) & vscnt(0) - - If CU wavefront execution - mode, omit vmcnt(0) and - vscnt(0). - Could be split into separate s_waitcnt vmcnt(0), s_waitcnt @@ -14337,8 +14315,6 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`. 1. s_waitcnt vmcnt(0) & vscnt(0) - - If CU wavefront execution - mode, omit. - Could be split into separate s_waitcnt vmcnt(0) and s_waitcnt @@ -15340,8 +15316,6 @@ the instruction in the code sequence that references the table. | ``s_wait_storecnt 0x0`` | ``s_wait_loadcnt 0x0`` | ``s_wait_dscnt 0x0`` - | **CU wavefront execution mode:** - | ``s_wait_dscnt 0x0`` - If OpenCL, omit ``s_wait_dscnt 0x0``. - The waits can be @@ -15387,8 +15361,6 @@ the instruction in the code sequence that references the table. | ``s_wait_storecnt 0x0`` | ``s_wait_loadcnt 0x0`` | ``s_wait_dscnt 0x0`` - | **CU wavefront execution mode:** - | ``s_wait_dscnt 0x0`` - If OpenCL, omit. - The waits can be @@ -15482,8 +15454,6 @@ the instruction in the code sequence that references the table. | ``s_wait_storecnt 0x0`` | ``s_wait_loadcnt 0x0`` | ``s_wait_dscnt 0x0`` - | **CU wavefront execution mode:** - | ``s_wait_dscnt 0x0`` - If OpenCL, omit ``s_wait_dscnt 0x0``. - If OpenCL and CU wavefront @@ -15533,8 +15503,6 @@ the instruction in the code sequence that references the table. | ``s_wait_storecnt 0x0`` | ``s_wait_loadcnt 0x0`` | ``s_wait_dscnt 0x0`` - | **CU wavefront execution mode:** - | ``s_wait_dscnt 0x0`` - If OpenCL, omit all. - The waits can be @@ -15626,8 +15594,6 @@ the instruction in the code sequence that references the table. | ``s_wait_storecnt 0x0`` | ``s_wait_loadcnt 0x0`` | ``s_wait_dscnt 0x0`` - | **CU wavefront execution mode:** - | ``s_wait_dscnt 0x0`` - If OpenCL, omit ``s_wait_dscnt 0x0``. - If OpenCL and @@ -15757,8 +15723,6 @@ the instruction in the code sequence that references the table. | ``s_wait_storecnt 0x0`` | ``s_wait_loadcnt 0x0`` | ``s_wait_dscnt 0x0`` - | **CU wavefront execution mode:** - | ``s_wait_dscnt 0x0`` - If OpenCL, omit ``s_wait_dscnt 0x0``. - Must happen after @@ -15815,8 +15779,6 @@ the instruction in the code sequence that references the table. | **Atomic without return:** | ``s_wait_storecnt 0x0`` - - If CU wavefront execution - mode, omit. - Must happen before the following ``global_inv``. @@ -15841,8 +15803,6 @@ the instruction in the code sequence that references the table. | ``s_wait_storecnt 0x0`` | ``s_wait_loadcnt 0x0`` | ``s_wait_dscnt 0x0`` - | **CU wavefront execution mode:** - | ``s_wait_dscnt 0x0`` - If OpenCL, omit. - The waits can be @@ -15904,8 +15864,6 @@ the instruction in the code sequence that references the table. | ``s_wait_storecnt 0x0`` | ``s_wait_loadcnt 0x0`` | ``s_wait_dscnt 0x0`` - | **CU wavefront execution mode:** - | ``s_wait_dscnt 0x0`` - If OpenCL, omit ``s_wait_loadcnt 0x0``. - The waits can be @@ -16157,8 +16115,6 @@ the instruction in the code sequence that references the table. | ``s_wait_storecnt 0x0`` | ``s_wait_loadcnt 0x0`` | ``s_wait_dscnt 0x0`` - | **CU wavefront execution mode:** - | ``s_wait_dscnt 0x0`` - If OpenCL and address space is @@ -16387,8 +16343,6 @@ the instruction in the code sequence that references the table. | ``s_wait_storecnt 0x0`` | ``s_wait_loadcnt 0x0`` | ``s_wait_dscnt 0x0`` - | **CU wavefront execution mode:** - | ``s_wait_dscnt 0x0`` - If OpenCL, omit ``s_wait_dscnt 0x0`` @@ -16495,8 +16449,6 @@ the instruction in the code sequence that references the table. | ``s_wait_storecnt 0x0`` | ``s_wait_loadcnt 0x0`` | ``s_wait_dscnt 0x0`` - | **CU wavefront execution mode:** - | ``s_wait_dscnt 0x0`` - If OpenCL, omit all. - The waits can be diff --git a/llvm/docs/CodingStandards.rst b/llvm/docs/CodingStandards.rst index 65dd794..8677d89 100644 --- a/llvm/docs/CodingStandards.rst +++ b/llvm/docs/CodingStandards.rst @@ -860,27 +860,40 @@ your private interface remains private and undisturbed by outsiders. It's okay to put extra implementation methods in a public class itself. Just make them private (or protected) and all is well. -Use Namespace Qualifiers to Implement Previously Declared Functions -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Use Namespace Qualifiers to Define Previously Declared Symbols +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -When providing an out-of-line implementation of a function in a source file, do -not open namespace blocks in the source file. Instead, use namespace qualifiers -to help ensure that your definition matches an existing declaration. Do this: +When providing an out-of-line definition for various symbols (variables, +functions, opaque classes) in a source file, do not open namespace blocks in the +source file. Instead, use namespace qualifiers to help ensure that your +definition matches an existing declaration. Do this: .. code-block:: c++ // Foo.h namespace llvm { + extern int FooVal; int foo(const char *s); - } + + namespace detail { + class FooImpl; + } // namespace detail + } // namespace llvm // Foo.cpp #include "Foo.h" using namespace llvm; + + int llvm::FooVal; + int llvm::foo(const char *s) { // ... } + class detail::FooImpl { + // ... + } + Doing this helps to avoid bugs where the definition does not match the declaration from the header. For example, the following C++ code defines a new overload of ``llvm::foo`` instead of providing a definition for the existing diff --git a/llvm/docs/CommandGuide/lit.rst b/llvm/docs/CommandGuide/lit.rst index 6a721eb..70daae4 100644 --- a/llvm/docs/CommandGuide/lit.rst +++ b/llvm/docs/CommandGuide/lit.rst @@ -641,7 +641,6 @@ TestRunner.py: %{S:real} %S after expanding all symbolic links and substitute drives %{p:real} %p after expanding all symbolic links and substitute drives %{t:real} %t after expanding all symbolic links and substitute drives - %{T:real} %T after expanding all symbolic links and substitute drives %{/s:real} %/s after expanding all symbolic links and substitute drives %{/S:real} %/S after expanding all symbolic links and substitute drives %{/p:real} %/p after expanding all symbolic links and substitute drives |