Diffstat (limited to 'llvm/docs/AMDGPUModifierSyntax.rst')
-rw-r--r-- llvm/docs/AMDGPUModifierSyntax.rst 109
1 files changed, 109 insertions, 0 deletions
diff --git a/llvm/docs/AMDGPUModifierSyntax.rst b/llvm/docs/AMDGPUModifierSyntax.rst
index 334bdaf..8a60663 100644
--- a/llvm/docs/AMDGPUModifierSyntax.rst
+++ b/llvm/docs/AMDGPUModifierSyntax.rst
@@ -1078,6 +1078,73 @@ Examples:
offset:0xfffff
offset:-x
+.. _amdgpu_synid_smem_offset24s:
+
+offset24s
+~~~~~~~~~
+
+Specifies a signed 24-bit offset, in bytes. The default value is 0.
+
+ ============================= ====================================================================
+ Syntax                        Description
+ ============================= ====================================================================
+ offset:{-0x800000..0x7FFFFF}  Specifies an offset as an
+                               :ref:`integer number <amdgpu_synid_integer_number>`
+                               or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
+ ============================= ====================================================================
+
+Examples:
+
+.. parsed-literal::
+
+ offset:-1
+ offset:0xfffff
+ offset:-x
+
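As a rough illustration (not part of the assembler or of this change), the range check implied by "signed 24-bit" can be sketched in Python. The helper names ``fits_signed`` and ``encode_signed`` are hypothetical, and the sketch assumes a conventional two's-complement field; the range the assembler actually accepts may differ.

```python
def fits_signed(value: int, bits: int = 24) -> bool:
    """Return True if value is representable as a two's-complement field of `bits` bits."""
    lowest = -(1 << (bits - 1))       # -0x800000 for 24 bits
    highest = (1 << (bits - 1)) - 1   #  0x7FFFFF for 24 bits
    return lowest <= value <= highest


def encode_signed(value: int, bits: int = 24) -> int:
    """Return the raw field bits for value, or raise ValueError if it does not fit."""
    if not fits_signed(value, bits):
        raise ValueError(f"offset {value:#x} does not fit in a signed {bits}-bit field")
    return value & ((1 << bits) - 1)  # keep only the low `bits` bits


if __name__ == "__main__":
    # Mirrors the literal examples above (offset:-1, offset:0xfffff).
    for offset in (-1, 0xFFFFF):
        print(f"{offset:#x} -> {encode_signed(offset):#08x}")
```

Under this assumption, a negative offset such as ``-1`` is stored with all 24 field bits set.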
+.. _amdgpu_synid_th:
+
+th
+~~
+
+Specifies the temporal hint of a memory operation.
+
+ =============================== =========================================================
+ Syntax                          Description
+ =============================== =========================================================
+ TH_{LOAD|STORE}_RT              Regular
+ TH_{LOAD|STORE}_NT              Non-temporal
+ TH_{LOAD|STORE}_HT              High-temporal
+ TH_{LOAD|STORE}_LU              Last use. Not available in SYS scope.
+ TH_{LOAD|STORE}_WB              Regular (CU, SE); High-temporal with write-back (MALL)
+ TH_{LOAD|STORE}_NT_RT           Non-temporal (CU, SE); Regular (MALL)
+ TH_{LOAD|STORE}_RT_NT           Regular (CU, SE); Non-temporal (MALL)
+ TH_{LOAD|STORE}_NT_HT           Non-temporal (CU, SE); High-temporal (MALL)
+ TH_{LOAD|STORE}_NT_WB           Non-temporal (CU, SE); High-temporal with write-back (MALL)
+ TH_{LOAD|STORE}_BYPASS          Available for SYS scope only.
+ TH_ATOMIC_RT                    Regular
+ TH_ATOMIC_RT_RETURN             Regular. For atomic instructions that return values.
+ TH_ATOMIC_NT                    Non-temporal
+ TH_ATOMIC_NT_RETURN             Non-temporal. For atomic instructions that return values.
+ TH_ATOMIC_CASCADE_RT            Cascading atomic; Regular.
+ TH_ATOMIC_CASCADE_NT            Cascading atomic; Non-temporal.
+ =============================== =========================================================
+
+.. _amdgpu_synid_scope:
+
+scope
+~~~~~
+
+Specifies the scope of a memory operation.
+
+ =============================== =========================================================
+ Syntax                          Description
+ =============================== =========================================================
+ SCOPE_CU                        Coherency within a Compute Unit.
+ SCOPE_SE                        Coherency within a Shader Engine.
+ SCOPE_DEV                       Coherency within a single device.
+ SCOPE_SYS                       Coherency across the full system.
+ =============================== =========================================================
+
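The two scope-related notes in the ``th`` table (``LU`` is not available in SYS scope; ``BYPASS`` is available for SYS scope only) can be expressed as a small consistency check. The following Python sketch is illustrative only and is not how the assembler validates operands; ``check_th_scope`` and the set names are hypothetical, and the accepted spellings are simply copied from the two tables above.

```python
import re

# Hint suffixes copied from the th table; nothing here comes from the assembler itself.
TH_LOAD_STORE = {"RT", "NT", "HT", "LU", "WB",
                 "NT_RT", "RT_NT", "NT_HT", "NT_WB", "BYPASS"}
TH_ATOMIC = {"RT", "RT_RETURN", "NT", "NT_RETURN", "CASCADE_RT", "CASCADE_NT"}
SCOPES = {"SCOPE_CU", "SCOPE_SE", "SCOPE_DEV", "SCOPE_SYS"}


def check_th_scope(th: str, scope: str) -> list:
    """Return a list of problems found in a th/scope pair (empty if none)."""
    problems = []
    if scope not in SCOPES:
        problems.append(f"unknown scope: {scope}")
    match = re.fullmatch(r"TH_(LOAD|STORE|ATOMIC)_(\w+)", th)
    if match is None:
        problems.append(f"unknown temporal hint: {th}")
        return problems
    kind, hint = match.groups()
    allowed = TH_ATOMIC if kind == "ATOMIC" else TH_LOAD_STORE
    if hint not in allowed:
        problems.append(f"unknown temporal hint: {th}")
    elif kind != "ATOMIC":
        # The two scope restrictions stated in the th table.
        if hint == "LU" and scope == "SCOPE_SYS":
            problems.append("LU is not available in SYS scope")
        if hint == "BYPASS" and scope != "SCOPE_SYS":
            problems.append("BYPASS is available for SYS scope only")
    return problems


if __name__ == "__main__":
    print(check_th_scope("TH_LOAD_LU", "SCOPE_SYS"))
    print(check_th_scope("TH_STORE_BYPASS", "SCOPE_SYS"))
```

Returning a list of messages rather than a boolean keeps the sketch usable for reporting several problems at once.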
VINTRP/VINTERP/LDSDIR Modifiers
-------------------------------
@@ -1117,6 +1184,27 @@ The default value is zero. This is a safe value, but it may be suboptimal.
issuing this instruction.
================ ======================================================
+.. _amdgpu_synid_wait_va_vdst:
+
+wait_va_vdst
+~~~~~~~~~~~~
+
+Manually specifies a wait on the VA_VDST counter before issuing this instruction. VA_VDST must be less
+than or equal to this value before the instruction is issued. If set to 15, no wait is performed.
+
+If unspecified, the default is zero. This is a safe value, but it may be suboptimal.
+
+This modifier is shorthand for resolving the WAR hazard in which a VALU instruction reads a VGPR that is
+subsequently written by a parameter load. Because there is no VA_VSRC counter, VA_VDST is used as a proxy
+to detect when the VALU instruction has completed.
+
+Examples:
+
+.. parsed-literal::
+
+ v_mov_b32 v1, v0
+ ds_param_load v0, . . . wait_va_vdst:0
+
.. _amdgpu_synid_wait_vdst:
wait_vdst
@@ -1135,6 +1223,27 @@ The default value is zero. This is a safe value, but it may be suboptimal.
issuing this instruction.
================== ======================================================
+.. _amdgpu_synid_wait_vm_vsrc:
+
+wait_vm_vsrc
+~~~~~~~~~~~~
+
+Manually specifies a wait on the VM_VSRC counter before issuing this instruction. VM_VSRC must be less
+than or equal to this value before the instruction is issued. If set to 1, no wait is performed.
+
+If unspecified, the default is zero. This is a safe value, but it may be suboptimal.
+
+This modifier is shorthand for resolving the WAR hazard in which a VMEM instruction reads a VGPR that is
+subsequently written by a parameter load.
+
+Examples:
+
+.. parsed-literal::
+
+ buffer_load_b32 v1, v0, s0, 0
+ ds_param_load v0, . . . wait_vm_vsrc:0
+
+
DPP8 Modifiers
--------------