author Hans Boehm <hboehm@google.com> 2023-05-11 16:35:29 -0700
committer Andrew Waterman <andrew@sifive.com> 2023-05-16 14:03:52 -0700
commit d53848e827c974df8e9dd6bc917d9f7aabe87932 (patch)
tree ba07251a4761679be308bb2293f9175f0e8ae263
parent 9eee159cf6656ff03aeb29fa216781bd628195c0 (diff)
Fixes for Atomics wording (branch: atomics-wording-v2)

Fix L.aq and S.rl comment: The aq and rl bits are not both necessary for C++ seq_cst semantics. See table 55 (formerly A.7).

seq_cst LR/SC: The SC.aqrl recommendation here was inconsistent with the mapping in 54 (A.6) and unnecessary for 55 (A.7). It is probably correct for a trailing fence mapping, but that's not discussed anywhere.

Tweak load-acquire and AMO sequential consistency wording. Demote this discussion to commentary.

A.6 and A.7 (now 54 and 55) discussion: Note that 54 (A.6) and 55 (A.7) are incompatible as is, and there is no smooth transition from 54 to 55.
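
Editorial illustration (not part of the commit): the incompatibility between the two mapping tables concerns a sequentially consistent store followed by a sequentially consistent load, which is the pattern exercised by the classic store-buffering litmus test. The C++ sketch below shows the guarantee that any correct mapping has to preserve. The helper name `count_forbidden_outcomes` and the iteration count are assumptions made for this example only. A run that reports zero does not prove a mapping correct; it only fails to catch a violation.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper for illustration; it does not appear in the commit.
// Runs the store-buffering litmus test `iterations` times and returns how
// often both threads read 0, an outcome that memory_order_seq_cst forbids.
int count_forbidden_outcomes(int iterations) {
    int forbidden = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0};
        std::atomic<int> y{0};
        int r0 = -1;
        int r1 = -1;

        // Thread 0: seq_cst store to x, then seq_cst load from y.
        std::thread t0([&] {
            x.store(1, std::memory_order_seq_cst);
            r0 = y.load(std::memory_order_seq_cst);
        });
        // Thread 1: seq_cst store to y, then seq_cst load from x.
        std::thread t1([&] {
            y.store(1, std::memory_order_seq_cst);
            r1 = x.load(std::memory_order_seq_cst);
        });
        t0.join();
        t1.join();

        // If the hardware could reorder each store after the following
        // load, both threads could read 0. A conforming implementation
        // of seq_cst must never allow that.
        if (r0 == 0 && r1 == 0) {
            ++forbidden;
        }
    }
    return forbidden;
}
```

On a conforming implementation, `count_forbidden_outcomes(n)` is 0 for every `n`. If a toolchain compiled the stores with one mapping and the loads with the other, without the strengthening this commit describes, the store and load in each thread could be reordered and a nonzero count would become possible.
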
-rw-r--r-- src/a-st-ext.adoc 41
-rw-r--r-- src/mm-eplan.adoc 5
2 files changed, 34 insertions, 12 deletions
diff --git a/src/a-st-ext.adoc b/src/a-st-ext.adoc
index 7cb5d73..66c2f3b 100644
--- a/src/a-st-ext.adoc
+++ b/src/a-st-ext.adoc
@@ -195,16 +195,25 @@ An SC instruction can never be observed by another RISC-V hart before
the LR instruction that established the reservation. The LR/SC sequence
can be given acquire semantics by setting the _aq_ bit on the LR
instruction. The LR/SC sequence can be given release semantics by
-setting the _rl_ bit on the SC instruction. Setting the _aq_ bit on the
-LR instruction, and setting both the _aq_ and the _rl_ bit on the SC
-instruction makes the LR/SC sequence sequentially consistent, meaning
-that it cannot be reordered with earlier or later memory operations from
-the same hart.
+by setting the _rl_ bit on the SC instruction.
-If neither bit is set on both LR and SC, the LR/SC sequence can be
+[NOTE]
+====
+Assuming suitable mappings for other atomic operations, setting the
+_aq_ bit on the LR instruction, and setting the
+_rl_ bit on the SC instruction makes the LR/SC
+sequence sequentially consistent in the C\++ `memory_order_seq_cst`
+sense. Such a sequence does not act as a fence for ordering ordinary
+load and store instructions before and after the sequence. Specific
+instruction mappings for other C++ atomic operations,
+or stronger notions of "sequential consistency", may require both
+bits to be set on either or both of the LR or SC instruction.
+
+If neither bit is set on either LR or SC, the LR/SC sequence can be
observed to occur before or after surrounding memory operations from the
same RISC-V hart. This can be appropriate when the LR/SC sequence is
used to implement a parallel reduction operation.
+====
Software should not set the _rl_ bit on an LR instruction unless the
_aq_ bit is also set, nor should software set the _aq_ bit on an SC
@@ -430,11 +439,19 @@ relinquishment.
We recommend the use of the AMO Swap idiom shown above for both lock
acquire and release to simplify the implementation of speculative lock
elision. cite:[Rajwar:2001:SLE]
-
====
-The instructions in the "A" extension can also be used to provide
-sequentially consistent loads and stores. A sequentially consistent load
-can be implemented as an LR with both _aq_ and _rl_ set. A sequentially
-consistent store can be implemented as an AMOSWAP that writes the old
-value to x0 and has both _aq_ and _rl_ set.
+[NOTE]
+====
+The instructions in the "A" extension can be used to provide sequentially
+consistent loads and stores, but this constrains hardware
+reordering of memory accesses more than necessary.
+A C++ sequentially consistent load can be implemented as
+an LR with _aq_ set. However, the LR/SC eventual
+success guarantee may slow down concurrent loads from the same effective
+address. A sequentially consistent store can be implemented as an AMOSWAP
+that writes the old value to `x0` and has _rl_ set. However the superfluous
+load may impose ordering constraints that are unnecessary for this use case.
+Specific compilation conventions may require both the _aq_ and _rl_
+bits to be set in either or both the LR and AMOSWAP instructions.
+====
diff --git a/src/mm-eplan.adoc b/src/mm-eplan.adoc
index dcb1d3f..1243b1d 100644
--- a/src/mm-eplan.adoc
+++ b/src/mm-eplan.adoc
@@ -1450,6 +1450,11 @@ with _aq_ and _rl_ modifiers are introduced, then the mappings in
the two mappings only interoperate correctly if
`atomic_<op>(memory_order_seq_cst)` is mapped using an LR that has both
_aq_ and _rl_ set.
+Even more importantly, a <<c11mappings>> sequentially consistent store,
+followed by a <<c11mappings_hypothetical>> sequentially consistent load
+can be reordered unless the <<c11mappings>> mapping of stores is
+strengthened by either adding a second fence or mapping the store
+to `amoswap.rl` instead.
[[c11mappings]]
.Mappings from C/C++ primitives to RISC-V primitives.