Diffstat (limited to 'src/memory.tex')
-rw-r--r-- src/memory.tex 14
1 files changed, 7 insertions, 7 deletions
diff --git a/src/memory.tex b/src/memory.tex
index 08fca50..f35d247 100644
--- a/src/memory.tex
+++ b/src/memory.tex
@@ -619,7 +619,7 @@ Figure~\ref{fig:litmus:address}, even though {\tt a1} XOR {\tt a1} is zero and
hence has no effect on the address accessed by the second load.
The benefit of using dependencies as a lightweight synchronization mechanism is that the ordering enforcement requirement is limited only to the specific two instructions in question.
-Other non-dependent instructions may be freely-reordered by aggressive implementations.
+Other non-dependent instructions may be freely reordered by aggressive implementations.
One alternative would be to use a load-acquire, but this would enforce ordering for the first load with respect to {\em all} subsequent instructions.
Another would be to use a FENCE~R,R, but this would include all previous and all subsequent loads, making this option more expensive.
@@ -826,7 +826,7 @@ memory access $a$ precedes memory access $b$ in global memory order if $a$ prece
\begin{enumerate}
\item $a$ precedes $b$ in preserved program order as defined in Chapter~\ref{ch:memorymodel}, with the exception that acquire and release ordering annotations apply only from one memory operation to another memory operation and from one I/O operation to another I/O operation, but not from a memory operation to an I/O nor vice versa
\item $a$ and $b$ are accesses to overlapping addresses in an I/O region
- \item $a$ and $b$ are accesses to the same strongly-ordered I/O region
+ \item $a$ and $b$ are accesses to the same strongly ordered I/O region
\item $a$ and $b$ are accesses to I/O regions, and the channel associated with the I/O region accessed by either $a$ or $b$ is channel 1
\item $a$ and $b$ are accesses to I/O regions associated with the same channel (except for channel 0)
\end{enumerate}
@@ -859,7 +859,7 @@ Ordering fences simply ensure that memory operations stay in order, while comple
RISC-V does not explicitly distinguish between ordering and completion fences.
Instead, this distinction is simply inferred from different uses of the FENCE bits.
-For implementations that conform to the RISC-V Unix Platform Specification, I/O devices and DMA operations are required to access memory coherently and via strongly-ordered I/O channels.
+For implementations that conform to the RISC-V Unix Platform Specification, I/O devices and DMA operations are required to access memory coherently and via strongly ordered I/O channels.
Therefore, accesses to regular main memory regions that are concurrently accessed by external devices can also use the standard synchronization mechanisms.
Implementations that do not conform to the Unix Platform Specification and/or in which devices do not access memory coherently will need to use mechanisms (which are currently platform-specific or device-specific) to enforce coherency.
@@ -895,7 +895,7 @@ The ordering guarantees in this section may not apply beyond a platform-specific
Table~\ref{tab:tsomappings} provides a mapping from TSO memory operations onto RISC-V memory instructions.
Normal x86 loads and stores are all inherently acquire-RCpc and release-RCpc operations: TSO enforces all load-load, load-store, and store-store ordering by default.
Therefore, under RVWMO, all TSO loads must be mapped onto a load followed by FENCE~R,RW, and all TSO stores must be mapped onto FENCE~RW,W followed by a store.
-TSO atomic read-modify-writes and x86 instructions using the LOCK prefix are fully-ordered and can be implemented either via an AMO with both {\em aq} and {\em rl} set, or via an LR with {\em aq} set, the arithmetic operation in question, an SC with both {\em aq} and {\em rl} set, and a conditional branch checking the success condition.
+TSO atomic read-modify-writes and x86 instructions using the LOCK prefix are fully ordered and can be implemented either via an AMO with both {\em aq} and {\em rl} set, or via an LR with {\em aq} set, the arithmetic operation in question, an SC with both {\em aq} and {\em rl} set, and a conditional branch checking the success condition.
In the latter case, the {\em rl} annotation on the LR turns out (for non-obvious reasons) to be redundant and can be omitted.
Alternatives to Table~\ref{tab:tsomappings} are also possible.
@@ -1044,7 +1044,7 @@ There are a few ways around this problem, including:
\begin{enumerate}
\item Always use FENCE~RW,W/FENCE~R,RW, and never use {\em aq}/{\em rl}. This suffices but is undesirable, as it defeats the purpose of the {\em aq}/{\em rl} modifiers.
\item Always use {\em aq}/{\em rl}, and never use FENCE~RW,W/FENCE~R,RW. This does not currently work due to the lack of load and store opcodes with {\em aq} and {\em rl} modifiers.
- \item Strengthen the mappings of release operations such that they would enforce sufficient orderings in the presence of either type of acquire mapping. This is the currently-recommended solution, and the one shown in Table~\ref{tab:linuxmappings}.
+ \item Strengthen the mappings of release operations such that they would enforce sufficient orderings in the presence of either type of acquire mapping. This is the currently recommended solution, and the one shown in Table~\ref{tab:linuxmappings}.
\end{enumerate}
\begin{figure}[h!]
@@ -1228,7 +1228,7 @@ Note however that the two mappings only interoperate correctly if {\tt atomic\_<
Any AMO can be emulated by an LR/SC pair, but care must be taken to ensure that any PPO orderings that originate from the LR are also made to originate from the SC, and that any PPO orderings that terminate at the SC are also made to terminate at the LR.
For example, the LR must also be made to respect any data dependencies that the AMO has, given that load operations do not otherwise have any notion of a data dependency.
Likewise, the effect a FENCE~R,R elsewhere in the same hart must also be made to apply to the SC, which would not otherwise respect that fence.
-The emulator may achieve this effect by simply mapping AMOs onto {\tt lr.aq;~<op>;~sc.aqrl}, matching the mapping used elsewhere for fully-ordered atomics.
+The emulator may achieve this effect by simply mapping AMOs onto {\tt lr.aq;~<op>;~sc.aqrl}, matching the mapping used elsewhere for fully ordered atomics.
\section{Implementation Guidelines}
@@ -1272,7 +1272,7 @@ Architectures are free to implement any of the memory model rules as conservativ
\item forbid any forwarding of a value from a store in the store buffer to a subsequent AMO or LR to the same address
\item forbid any forwarding of a value from an AMO or SC in the store buffer to a subsequent load to the same address
\item implement TSO on all memory accesses, and ignore any main memory fences that do not include PW and SR ordering (e.g., as Ztso implementations will do)
- \item implement all atomics to be RCsc or even fully-ordered, regardless of annotation
+ \item implement all atomics to be RCsc or even fully ordered, regardless of annotation
\end{itemize}
Architectures that implement RVTSO can safely do the following: