author | Hans Boehm <hboehm@google.com> | 2023-05-11 16:35:29 -0700 |
---|---|---|
committer | Andrew Waterman <andrew@sifive.com> | 2023-05-16 14:03:52 -0700 |
commit | d53848e827c974df8e9dd6bc917d9f7aabe87932 (patch) | |
tree | ba07251a4761679be308bb2293f9175f0e8ae263 | |
parent | 9eee159cf6656ff03aeb29fa216781bd628195c0 (diff) | |
download | riscv-isa-manual-atomics-wording-v2.zip riscv-isa-manual-atomics-wording-v2.tar.gz riscv-isa-manual-atomics-wording-v2.tar.bz2 |
Fixes for Atomics wording
Fix L.aq and S.rl comment:
The aq and rl bits are not both necessary for C++ seq_cst semantics.
See table 55 (formerly A.7).
seq_cst LR/SC:
The SC.aqrl recommendation here was inconsistent with the mapping in
table 54 (A.6) and unnecessary for table 55 (A.7). It is probably correct
for a trailing fence mapping, but that's not discussed anywhere.
Tweak load-acquire and AMO sequential consistency wording.
Demote this discussion to commentary.
A.6 and A.7 (now 54 and 55) discussion:
Note that tables 54 (A.6) and 55 (A.7) are incompatible as is, and there
is no smooth transition from 54 to 55.
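
For readers less familiar with the C++ side of the seq_cst LR/SC item above, the following is a minimal sketch of the kind of source-level operation it concerns: a `memory_order_seq_cst` read-modify-write written as a compare-exchange retry loop. The helper names `fetch_add_seq_cst` and `run_counter` are hypothetical and are not part of this commit. Whether a compiler lowers the loop to an LR/SC sequence, and which _aq_/_rl_ bits it sets, depends on the toolchain and the mapping table it follows.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: fetch-and-add built from a compare-exchange retry loop.
// On RISC-V a compiler may lower such a loop to an LR/SC sequence; the exact
// aq/rl bits are an assumption that depends on the chosen mapping.
int fetch_add_seq_cst(std::atomic<int>& target, int delta) {
    int observed = target.load(std::memory_order_relaxed);
    // compare_exchange_weak may fail spuriously, so it is retried until it
    // succeeds. On failure, `observed` is refreshed with the current value.
    while (!target.compare_exchange_weak(observed, observed + delta,
                                         std::memory_order_seq_cst,
                                         std::memory_order_relaxed)) {
    }
    return observed;  // value seen just before the successful update
}

// Hypothetical driver: several threads increment one shared counter.
int run_counter(int threads, int increments_per_thread) {
    assert(threads > 0 && increments_per_thread >= 0);
    std::atomic<int> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                fetch_add_seq_cst(counter, 1);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter.load(std::memory_order_seq_cst);
}
```

Calling `run_counter(4, 10000)` should return 40000 on any conforming implementation, because each successful compare-exchange is an atomic update. The sketch says nothing about how ordinary (non-atomic) loads and stores around the loop are ordered, which is the point the revised note makes.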
-rw-r--r-- | src/a-st-ext.adoc | 41 |
-rw-r--r-- | src/mm-eplan.adoc | 5 |
2 files changed, 34 insertions, 12 deletions
diff --git a/src/a-st-ext.adoc b/src/a-st-ext.adoc
index 7cb5d73..66c2f3b 100644
--- a/src/a-st-ext.adoc
+++ b/src/a-st-ext.adoc
@@ -195,16 +195,25 @@ An SC instruction can never be observed by another RISC-V hart before
 the LR instruction that established the reservation. The LR/SC sequence
 can be given acquire semantics by setting the _aq_ bit on the LR
 instruction. The LR/SC sequence can be given release semantics by
-setting the _rl_ bit on the SC instruction. Setting the _aq_ bit on the
-LR instruction, and setting both the _aq_ and the _rl_ bit on the SC
-instruction makes the LR/SC sequence sequentially consistent, meaning
-that it cannot be reordered with earlier or later memory operations from
-the same hart.
+by setting the _rl_ bit on the SC instruction.
 
-If neither bit is set on both LR and SC, the LR/SC sequence can be
+[NOTE]
+====
+Assuming suitable mappings for other atomic operations, setting the
+_aq_ bit on the LR instruction, and setting the
+_rl_ bit on the SC instruction makes the LR/SC
+sequence sequentially consistent in the C\++ `memory_order_seq_cst`
+sense. Such a sequence does not act as a fence for ordering ordinary
+load and store instructions before and after the sequence. Specific
+instruction mappings for other C++ atomic operations,
+or stronger notions of "sequential consistency", may require both
+bits to be set on either or both of the LR or SC instruction.
+
+If neither bit is set on either LR or SC, the LR/SC sequence can be
 observed to occur before or after surrounding memory operations from the
 same RISC-V hart. This can be appropriate when the LR/SC sequence is
 used to implement a parallel reduction operation.
+====
 
 Software should not set the _rl_ bit on an LR instruction unless the
 _aq_ bit is also set, nor should software set the _aq_ bit on an SC
@@ -430,11 +439,19 @@ relinquishment.
 We recommend the use of the AMO Swap idiom shown above for both lock
 acquire and release to simplify the implementation of speculative lock
 elision. cite:[Rajwar:2001:SLE]
-
 ====
 
-The instructions in the "A" extension can also be used to provide
-sequentially consistent loads and stores. A sequentially consistent load
-can be implemented as an LR with both _aq_ and _rl_ set. A sequentially
-consistent store can be implemented as an AMOSWAP that writes the old
-value to x0 and has both _aq_ and _rl_ set.
+[NOTE]
+====
+The instructions in the "A" extension can be used to provide sequentially
+consistent loads and stores, but this constrains hardware
+reordering of memory accesses more than necessary.
+A C++ sequentially consistent load can be implemented as
+an LR with _aq_ set. However, the LR/SC eventual
+success guarantee may slow down concurrent loads from the same effective
+address. A sequentially consistent store can be implemented as an AMOSWAP
+that writes the old value to `x0` and has _rl_ set. However the superfluous
+load may impose ordering constraints that are unnecessary for this use case.
+Specific compilation conventions may require both the _aq_ and _rl_
+bits to be set in either or both the LR and AMOSWAP instructions.
+====
diff --git a/src/mm-eplan.adoc b/src/mm-eplan.adoc
index dcb1d3f..1243b1d 100644
--- a/src/mm-eplan.adoc
+++ b/src/mm-eplan.adoc
@@ -1450,6 +1450,11 @@ with _aq_ and _rl_ modifiers are introduced, then the mappings in
 the two mappings only interoperate correctly if
 `atomic_<op>(memory_order_seq_cst)` is mapped using an LR that has both
 _aq_ and _rl_ set.
+Even more importantly, a <<c11mappings>> sequentially consistent store,
+followed by a <<c11mappings_hypothetical>> sequentially consistent load
+can be reordered unless the <<c11mappings>> mapping of stores is
+strengthened by either adding a second fence or mapping the store
+to `amoswap.rl` instead.
 
 [[c11mappings]]
 .Mappings from C/C++ primitives to RISC-V primitives.
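
The store-then-load reordering that the `mm-eplan.adoc` hunk warns about corresponds to the classic store-buffering litmus test. The following is a rough C++ sketch of that test, not code from this commit; the function name `count_forbidden_outcomes` is hypothetical. With every access at `memory_order_seq_cst`, the C++ memory model forbids both loads returning 0, so a mapping that lets a sequentially consistent store be reordered with a later sequentially consistent load could make this outcome observable.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical litmus driver: runs the store-buffering test repeatedly and
// returns how many iterations produced the outcome r1 == 0 && r2 == 0,
// which C++ seq_cst semantics forbid.
int count_forbidden_outcomes(int iterations) {
    assert(iterations >= 0);
    int forbidden = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0};
        std::atomic<int> y{0};
        int r1 = -1;
        int r2 = -1;

        // Thread A: seq_cst store to x, then seq_cst load of y.
        std::thread a([&] {
            x.store(1, std::memory_order_seq_cst);
            r1 = y.load(std::memory_order_seq_cst);
        });
        // Thread B: seq_cst store to y, then seq_cst load of x.
        std::thread b([&] {
            y.store(1, std::memory_order_seq_cst);
            r2 = x.load(std::memory_order_seq_cst);
        });
        a.join();
        b.join();

        // join() synchronizes with thread completion, so r1 and r2 are
        // safe to read here.
        if (r1 == 0 && r2 == 0) {
            ++forbidden;
        }
    }
    return forbidden;
}
```

On a conforming implementation `count_forbidden_outcomes(n)` returns 0 for any `n`. A passing run does not prove a mapping correct, since the forbidden outcome may simply be rare, so a test like this is best treated as a sanity check alongside reasoning about the mapping tables themselves.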