author elisa <elisa@riscv.org> 2021-10-05 15:06:55 -0700
committer elisa <elisa@riscv.org> 2021-10-05 15:06:55 -0700
commit 5921e762efd81682f720130a0d72ce8a1a0da16e
tree 98f1ec7488a804b717b9d2c5e323791834e76739
parent c6ae16c883f6b937c9696c427f046bfc9f8b25f6
adoc formatting and table fixes for intro, a, c, counters, m, rvmo chapters
-rw-r--r-- src/a-st-ext.adoc 21
-rw-r--r-- src/c-st-ext.adoc 200
-rw-r--r-- src/counters.adoc 36
-rw-r--r-- src/intro.adoc 2
-rw-r--r-- src/m-st-ext.adoc 21
-rw-r--r-- src/riscv-isa-unpr-conv-review.pdf bin 7444541 -> 6729005 bytes
-rw-r--r-- src/rvwmo.adoc 686
7 files changed, 477 insertions, 489 deletions
diff --git a/src/a-st-ext.adoc b/src/a-st-ext.adoc
index 7468009..34cdb2c 100644
--- a/src/a-st-ext.adoc
+++ b/src/a-st-ext.adoc
@@ -1,7 +1,7 @@
[[atomics]]
-== `A` Standard Extension for Atomic Instructions, Version 2.1
+== A Standard Extension for Atomic Instructions, Version 2.1
-The standard atomic-instruction extension, named `A`, contains
+The standard atomic-instruction extension, named A, contains
instructions that atomically read-modify-write memory to support
synchronization between multiple RISC-V harts running in the same memory
space. The two forms of atomic instruction provided are
@@ -111,7 +111,7 @@ software should only assume the failure code will be non-zero.
[NOTE]
====
-We reserve a failure code of 1 to mean `unspecified` so that simple
+We reserve a failure code of 1 to mean *unspecified* so that simple
implementations may return this value using the existing mux required
for the SLT/SLTU instructions. More specific failure codes might be
defined in future versions or extensions to the ISA.
@@ -223,6 +223,7 @@ instruction unless the _rl_ bit is also set. LR._rl_ and SC._aq_
instructions are not guaranteed to provide any stronger ordering than
those with both bits clear, but may result in lower performance.
+.Sample code for compare-and-swap function using LR/SC.
....
# a0 holds address of memory location
# a1 holds expected value
@@ -256,10 +257,10 @@ sequence in the case of failure, and must comprise at most 16
instructions placed sequentially in memory.
* An LR/SC sequence begins with an LR instruction and ends with an SC
instruction. The dynamic code executed between the LR and SC
-instructions can only contain instructions from the base `I`
+instructions can only contain instructions from the base _I_
instruction set, excluding loads, stores, backward jumps, taken backward
-branches, JALR, FENCE, and SYSTEM instructions. If the `C` extension
-is supported, then compressed forms of the aforementioned `I`
+branches, JALR, FENCE, and SYSTEM instructions. If the _C_ extension
+is supported, then compressed forms of the aforementioned _I_
instructions are also permitted.
* The code to retry a failing LR/SC sequence can contain backwards jumps
and/or branches to repeat the LR/SC sequence, but otherwise has the same
@@ -371,7 +372,7 @@ is not naturally aligned, an address-misaligned exception or an
access-fault exception will be generated. The access-fault exception can
be generated for a memory access that would otherwise be able to
complete except for the misalignment, if the misaligned access should
-not be emulated. The ``Zam`` extension, described in
+not be emulated. The _Zam_ extension, described in
<<zam>>, relaxes this requirement and specifies the
semantics of misaligned AMOs.
@@ -379,7 +380,7 @@ The operations supported are swap, integer add, bitwise AND, bitwise OR,
bitwise XOR, and signed and unsigned integer maximum and minimum.
Without ordering constraints, these AMOs can be used to implement
parallel reduction operations, where typically the return value would be
-discarded by writing to `x0`.
+discarded by writing to _x0_.
[NOTE]
====
@@ -388,7 +389,7 @@ parallel systems better than LR/SC or CAS. A simple microarchitecture
can implement AMOs using the LR/SC primitives, provided the
implementation can guarantee the AMO eventually completes. More complex
implementations might also implement AMOs at memory controllers, and can
-optimize away fetching the original value when the destination is `x0`.
+optimize away fetching the original value when the destination is *x0*.
The set of AMOs was chosen to support the C11/C++11 atomic memory
operations efficiently, and also to support parallel reductions in
@@ -444,7 +445,7 @@ acquire and release to simplify the implementation of speculative lock
elision cite:[Rajwar:2001:SLE].
====
-The instructions in the `A` extension can also be used to provide
+The instructions in the _A_ extension can also be used to provide
sequentially consistent loads and stores. A sequentially consistent load
can be implemented as an LR with both _aq_ and _rl_ set. A sequentially
consistent store can be implemented as an AMOSWAP that writes the old
diff --git a/src/c-st-ext.adoc b/src/c-st-ext.adoc
index 79904d8..f750a33 100644
--- a/src/c-st-ext.adoc
+++ b/src/c-st-ext.adoc
@@ -1,11 +1,11 @@
[[compressed]]
-== `C` Standard Extension for Compressed Instructions, Version 2.0
+== C Standard Extension for Compressed Instructions, Version 2.0
This chapter describes the current proposal for the RISC-V standard
-compressed instruction-set extension, named `C`, which reduces static
+compressed instruction-set extension, named _C_, which reduces static
and dynamic code size by adding short 16-bit instruction encodings for
common operations. The C extension can be added to any of the base ISAs
-(RV32, RV64, RV128), and we use the generic term `RVC` to cover any of
+(RV32, RV64, RV128), and we use the generic term _RVC_ to cover any of
these. Typically, 50%–60% of the RISC-V instructions in a program can be
replaced with RVC instructions, resulting in a 25%–30% code-size
reduction.
@@ -17,8 +17,8 @@ of common 32-bit RISC-V instructions when:
* the immediate or address offset is small, or
-* one of the registers is the zero register (`x0`), the ABI link register
-(`x1`), or the ABI stack pointer (`x2`), or
+* one of the registers is the zero register (_x0_), the ABI link register
+(_x1_), or the ABI stack pointer (_x2_), or
* the destination register and the first source register are identical, or
@@ -177,7 +177,7 @@ Table <<rvcopcodemap>> shows the nine compressed instruction
formats. CR, CI, and CSS can use any of the 32 RVI registers, but CIW,
CL, CS, CA, and CB are limited to just 8 of them.
Table <<registers>> lists these popular registers, which
-correspond to registers `x8` to `x15`. Note that there is a separate
+correspond to registers _x8_ to _x15_. Note that there is a separate
version of load and store instructions that use the stack pointer as the
base address register, since saving to and restoring from the stack are
so prevalent, and that they use the CI and CSS formats to allow access
@@ -187,7 +187,7 @@ ADDI4SPN instruction.
[NOTE]
====
The RISC-V ABI was changed to make the frequently used registers map to
-registers `x8`–`x15`. This simplifies the decompression decoder by
+registers *x8*–*x15*. This simplifies the decompression decoder by
having a contiguous naturally aligned set of register numbers, and is
also compatible with the RV32E base ISA, which only has 16 integer
registers.
@@ -195,14 +195,14 @@ registers.
(((compressed, loads and stores)))
Compressed register-based floating-point loads and stores also use the
-CL and CS formats respectively, with the eight registers mapping to `f8`
-to `f15`.
+CL and CS formats respectively, with the eight registers mapping to _f8_
+to _f15_.
(((calling convention, standard)))
[NOTE]
====
The standard RISC-V calling convention maps the most frequently used
-floating-point registers to registers `f8` to `f15`, which allows the
+floating-point registers to registers *f8* to *f15*, which allows the
same register decompression decoding as for integer register numbers.
====
(((register source specifiers, c-ext)))
@@ -223,7 +223,7 @@ position in every instruction, thereby simplifying implementations.
====
For many RVC instructions, zero-valued immediates are disallowed and
-`x0` is not a valid 5-bit register specifier. These restrictions free up
+_x0_ is not a valid 5-bit register specifier. These restrictions free up
encoding space for other instructions requiring fewer operand bits.
//include::images/wavedrom/cr-register.adoc[]
@@ -267,13 +267,13 @@ CS, CA, and CB formats.
|===
|RVC Register Number |000 |001 |010 |011 |100 |101 |110 |111
-|Integer Register Number |`x8` |`x9` |`x10` |`x11` |`x12` |`x13` |`x14`|`x15`
+|Integer Register Number |_x8_ |_x9_ |_x10_ |_x11_ |_x12_ |_x13_ |_x14_|_x15_
-|Integer Register ABI Name |`s0` |`s1` |`a0` |`a1` |`a2` |`a3` |`a4`|`a5`
+|Integer Register ABI Name |_s0_ |_s1_ |_a0_ |_a1_ |_a2_ |_a3_ |_a4_|_a5_
-|Floating-Point Register Number |`f8` |`f9` |`f10` |`f11` |`f12` |`f13`|`f14` |`f15`
+|Floating-Point Register Number |_f8_ |_f9_ |_f10_ |_f11_ |_f12_ |_f13_|_f14_ |_f15_
-|Floating-Point Register ABI Name |`fs0` |`fs1` |`fa0` |`fa1` |`fa2`|`fa3` |`fa4` |`fa5`
+|Floating-Point Register ABI Name |_fs0_ |_fs1_ |_fa0_ |_fa1_ |_fa2_|_fa3_ |_fa4_ |_fa5_
|===
@@ -285,7 +285,7 @@ bytes: latexmath:[$\times$]4 for words, latexmath:[$\times$]8 for double
words, and latexmath:[$\times$]16 for quad words.
RVC provides two variants of loads and stores. One uses the ABI stack
-pointer, `x2`, as the base address and can target any data register. The
+pointer, _x2_, as the base address and can target any data register. The
other can reference one of 8 base address registers and one of 8 data
registers.
@@ -300,7 +300,7 @@ These instructions use the CI format.
C.LWSP loads a 32-bit value from memory into register _rd_. It computes
an effective address by adding the _zero_-extended offset, scaled by 4,
-to the stack pointer, `x2`. It expands to `lw rd, offset(x2)`. C.LWSP is
+to the stack pointer, _x2_. It expands to _lw rd, offset(x2)_. C.LWSP is
only valid when latexmath:[$\textit{rd}{\neq}\texttt{x0}$]; the code
points with latexmath:[$\textit{rd}{=}\texttt{x0}$] are reserved.
(((operations, memory)))
@@ -308,28 +308,28 @@ points with latexmath:[$\textit{rd}{=}\texttt{x0}$] are reserved.
C.LDSP is an RV64C/RV128C-only instruction that loads a 64-bit value
from memory into register _rd_. It computes its effective address by
adding the zero-extended offset, scaled by 8, to the stack pointer,
-`x2`. It expands to `ld rd, offset(x2)`. C.LDSP is only valid when
+_x2_. It expands to _ld rd, offset(x2)_. C.LDSP is only valid when
latexmath:[$\textit{rd}{\neq}\texttt{x0}$]; the code points with
latexmath:[$\textit{rd}{=}\texttt{x0}$] are reserved.
C.LQSP is an RV128C-only instruction that loads a 128-bit value from
memory into register _rd_. It computes its effective address by adding
-the zero-extended offset, scaled by 16, to the stack pointer, `x2`. It
-expands to `lq rd, offset(x2)`. C.LQSP is only valid when
+the zero-extended offset, scaled by 16, to the stack pointer, _x2_. It
+expands to _lq rd, offset(x2)_. C.LQSP is only valid when
latexmath:[$\textit{rd}{\neq}\texttt{x0}$]; the code points with
latexmath:[$\textit{rd}{=}\texttt{x0}$] are reserved.
C.FLWSP is an RV32FC-only instruction that loads a single-precision
floating-point value from memory into floating-point register _rd_. It
computes its effective address by adding the _zero_-extended offset,
-scaled by 4, to the stack pointer, `x2`. It expands to
-`flw rd, offset(x2)`.
+scaled by 4, to the stack pointer, _x2_. It expands to
+_flw rd, offset(x2)_.
C.FLDSP is an RV32DC/RV64DC-only instruction that loads a
double-precision floating-point value from memory into floating-point
register _rd_. It computes its effective address by adding the
-_zero_-extended offset, scaled by 8, to the stack pointer, `x2`. It
-expands to `fld rd, offset(x2)`.
+_zero_-extended offset, scaled by 8, to the stack pointer, _x2_. It
+expands to _fld rd, offset(x2)_.
include::images/wavedrom/sp-base-ls-2.adoc[]
[sp-base-ls-2]
@@ -340,29 +340,29 @@ These instructions use the CSS format.
C.SWSP stores a 32-bit value in register _rs2_ to memory. It computes an
effective address by adding the _zero_-extended offset, scaled by 4, to
-the stack pointer, `x2`. It expands to `sw rs2, offset(x2)`.
+the stack pointer, _x2_. It expands to _sw rs2, offset(x2)_.
C.SDSP is an RV64C/RV128C-only instruction that stores a 64-bit value in
register _rs2_ to memory. It computes an effective address by adding the
-_zero_-extended offset, scaled by 8, to the stack pointer, `x2`. It
-expands to `sd rs2, offset(x2)`.
+_zero_-extended offset, scaled by 8, to the stack pointer, _x2_. It
+expands to _sd rs2, offset(x2)_.
C.SQSP is an RV128C-only instruction that stores a 128-bit value in
register _rs2_ to memory. It computes an effective address by adding the
-_zero_-extended offset, scaled by 16, to the stack pointer, `x2`. It
-expands to `sq rs2, offset(x2)`.
+_zero_-extended offset, scaled by 16, to the stack pointer, _x2_. It
+expands to _sq rs2, offset(x2)_.
C.FSWSP is an RV32FC-only instruction that stores a single-precision
floating-point value in floating-point register _rs2_ to memory. It
computes an effective address by adding the _zero_-extended offset,
-scaled by 4, to the stack pointer, `x2`. It expands to
-`fsw rs2, offset(x2)`.
+scaled by 4, to the stack pointer, _x2_. It expands to
+_fsw rs2, offset(x2)_.
C.FSDSP is an RV32DC/RV64DC-only instruction that stores a
double-precision floating-point value in floating-point register _rs2_
to memory. It computes an effective address by adding the
-_zero_-extended offset, scaled by 8, to the stack pointer, `x2`. It
-expands to `fsd rs2, offset(x2)`.
+_zero_-extended offset, scaled by 8, to the stack pointer, _x2_. It
+expands to _fsd rs2, offset(x2)_.
[NOTE]
====
@@ -422,31 +422,31 @@ These instructions use the CL format.
C.LW loads a 32-bit value from memory into register
_rd latexmath:[$'$]_. It computes an effective address by adding the
_zero_-extended offset, scaled by 4, to the base address in register
-_rs1 latexmath:[$'$]_. It expands to `lw rd, offset(rs1)`.
+_rs1 latexmath:[$'$]_. It expands to _lw rd, offset(rs1)_.
C.LD is an RV64C/RV128C-only instruction that loads a 64-bit value from
memory into register _rd latexmath:[$'$]_. It computes an effective
address by adding the _zero_-extended offset, scaled by 8, to the base
address in register _rs1 latexmath:[$'$]_. It expands to
-`ld rd', offset(rs1')`.
+_ld rd', offset(rs1')_.
C.LQ is an RV128C-only instruction that loads a 128-bit value from
memory into register _rd latexmath:[$'$]_. It computes an effective
address by adding the _zero_-extended offset, scaled by 16, to the base
address in register _rs1 latexmath:[$'$]_. It expands to
-`lq rd, offset(rs1)`.
+_lq rd, offset(rs1)_.
C.FLW is an RV32FC-only instruction that loads a single-precision
floating-point value from memory into floating-point register
_rd latexmath:[$'$]_. It computes an effective address by adding the
_zero_-extended offset, scaled by 4, to the base address in register
-_rs1 latexmath:[$'$]_. It expands to `flw rd, offset(rs1)`.
+_rs1 latexmath:[$'$]_. It expands to _flw rd, offset(rs1)_.
C.FLD is an RV32DC/RV64DC-only instruction that loads a double-precision
floating-point value from memory into floating-point register
_rd latexmath:[$'$]_. It computes an effective address by adding the
_zero_-extended offset, scaled by 8, to the base address in register
-_rs1 latexmath:[$'$]_. It expands to `fld rd, offset(rs1)`.
+_rs1 latexmath:[$'$]_. It expands to _fld rd, offset(rs1)_.
S@S@S@Y@S@Y +
& & & & & +
@@ -464,32 +464,32 @@ These instructions use the CS format.
C.SW stores a 32-bit value in register _rs2 latexmath:[$'$]_ to memory.
It computes an effective address by adding the _zero_-extended offset,
scaled by 4, to the base address in register _rs1 latexmath:[$'$]_. It
-expands to `sw rs2, offset(rs1)`.
+expands to _sw rs2, offset(rs1)_.
C.SD is an RV64C/RV128C-only instruction that stores a 64-bit value in
register _rs2 latexmath:[$'$]_ to memory. It computes an effective
address by adding the _zero_-extended offset, scaled by 8, to the base
address in register _rs1 latexmath:[$'$]_. It expands to
-`sd rs2, offset(rs1)`.
+_sd rs2, offset(rs1)_.
C.SQ is an RV128C-only instruction that stores a 128-bit value in
register _rs2 latexmath:[$'$]_ to memory. It computes an effective
address by adding the _zero_-extended offset, scaled by 16, to the base
address in register _rs1 latexmath:[$'$]_. It expands to
-`sq rs2, offset(rs1)`.
+_sq rs2, offset(rs1)_.
C.FSW is an RV32FC-only instruction that stores a single-precision
floating-point value in floating-point register _rs2 latexmath:[$'$]_ to
memory. It computes an effective address by adding the _zero_-extended
offset, scaled by 4, to the base address in register
-_rs1 latexmath:[$'$]_. It expands to `fsw rs2, offset(rs1)`.
+_rs1 latexmath:[$'$]_. It expands to _fsw rs2, offset(rs1)_.
C.FSD is an RV32DC/RV64DC-only instruction that stores a
double-precision floating-point value in floating-point register
_rs2 latexmath:[$'$]_ to memory. It computes an effective address by
adding the _zero_-extended offset, scaled by 8, to the base address in
register _rs1 latexmath:[$'$]_. It expands to
-`fsd rs2, offset(rs1)`.
+_fsd rs2, offset(rs1)_.
=== Control Transfer Instructions
@@ -511,14 +511,14 @@ offset[11latexmath:[$\vert$]4latexmath:[$\vert$]9:8latexmath:[$\vert$]10latexmat
These instructions use the CJ format.
C.J performs an unconditional control transfer. The offset is
-sign-extended and added to the `pc` to form the jump target address. C.J
+sign-extended and added to the _pc_ to form the jump target address. C.J
can therefore target a latexmath:[$\pm$] range. C.J expands to
-`jal x0, offset`.
+_jal x0, offset_.
C.JAL is an RV32C-only instruction that performs the same operation as
C.J, but additionally writes the address of the instruction following
-the jump (`pc`+2) to the link register, `x1`. C.JAL expands to
-`jal x1, offset`.
+the jump (_pc_+2) to the link register, _x1_. C.JAL expands to
+_jal x1, offset_.
E@T@T@Y +
& & & +
@@ -530,14 +530,14 @@ C.JALR & srclatexmath:[$\neq$]0 & 0 & C2 +
These instructions use the CR format.
C.JR (jump register) performs an unconditional control transfer to the
-address in register _rs1_. C.JR expands to `jalr x0, 0(rs1)`. C.JR is
+address in register _rs1_. C.JR expands to _jalr x0, 0(rs1)_. C.JR is
only valid when latexmath:[$\textit{rs1}{\neq}\texttt{x0}$]; the code
point with latexmath:[$\textit{rs1}{=}\texttt{x0}$] is reserved.
C.JALR (jump and link register) performs the same operation as C.JR, but
additionally writes the address of the instruction following the jump
-(`pc`+2) to the link register, `x1`. C.JALR expands to
-`jalr x1, 0(rs1)`. C.JALR is only valid when
+(_pc_+2) to the link register, _x1_. C.JALR expands to
+_jalr x1, 0(rs1)_. C.JALR is only valid when
latexmath:[$\textit{rs1}{\neq}\texttt{x0}$]; the code point with
latexmath:[$\textit{rs1}{=}\texttt{x0}$] corresponds to the C.EBREAK
instruction.
@@ -562,14 +562,14 @@ offset[7:6latexmath:[$\vert$]2:1latexmath:[$\vert$]5] & C1 +
These instructions use the CB format.
C.BEQZ performs conditional control transfers. The offset is
-sign-extended and added to the `pc` to form the branch target address.
+sign-extended and added to the _pc_ to form the branch target address.
It can therefore target a latexmath:[$\pm$] range. C.BEQZ takes the
branch if the value in register _rs1 latexmath:[$'$]_ is zero. It
-expands to `beq rs1, x0, offset`.
+expands to _beq rs1, x0, offset_.
C.BNEZ is defined analogously, but it takes the branch if
_rs1 latexmath:[$'$]_ contains a nonzero value. It expands to
-`bne rs1, x0, offset`.
+_bne rs1, x0, offset_.
=== Integer Computational Instructions
@@ -591,17 +591,17 @@ latexmath:[$\textrm{dest}{\neq}{\left\{0,2\right\}}$] & nzimm[16:12] &
C1 +
C.LI loads the sign-extended 6-bit immediate, _imm_, into register _rd_.
-C.LI expands into `addi rd, x0, imm`. C.LI is only valid when
-_rd_latexmath:[$\neq$]`x0`; the code points with _rd_=`x0` encode HINTs.
+C.LI expands into _addi rd, x0, imm_. C.LI is only valid when
+_rd_latexmath:[$\neq$]_x0_; the code points with _rd_=_x0_ encode HINTs.
C.LUI loads the non-zero 6-bit immediate field into bits 17–12 of the
destination register, clears the bottom 12 bits, and sign-extends bit 17
into all higher bits of the destination. C.LUI expands into
-`lui rd, nzimm`. C.LUI is only valid when
+_lui rd, nzimm_. C.LUI is only valid when
latexmath:[$\textit{rd}{\neq}{\left\{\texttt{x0},\texttt{x2}\right\}}$],
and when the immediate is not equal to zero. The code points with
-_nzimm_=0 are reserved; the remaining code points with _rd_=`x0` are
-HINTs; and the remaining code points with _rd_=`x2` correspond to the
+_nzimm_=0 are reserved; the remaining code points with _rd_=_x0_ are
+HINTs; and the remaining code points with _rd_=_x2_ correspond to the
C.ADDI16SP instruction.
==== Integer Register-Immediate Operations
@@ -621,29 +621,29 @@ C1 +
C.ADDI adds the non-zero sign-extended 6-bit immediate to the value in
register _rd_ then writes the result to _rd_. C.ADDI expands into
-`addi rd, rd, nzimm`. C.ADDI is only valid when
-_rd_latexmath:[$\neq$]`x0` and _nzimm_latexmath:[$\neq$]0. The code
-points with _rd_=`x0` encode the C.NOP instruction; the remaining code
+_addi rd, rd, nzimm_. C.ADDI is only valid when
+_rd_latexmath:[$\neq$]_x0_ and _nzimm_latexmath:[$\neq$]0. The code
+points with _rd_=_x0_ encode the C.NOP instruction; the remaining code
points with _nzimm_=0 encode HINTs.
C.ADDIW is an RV64C/RV128C-only instruction that performs the same
computation but produces a 32-bit result, then sign-extends result to 64
-bits. C.ADDIW expands into `addiw rd, rd, imm`. The immediate can be
-zero for C.ADDIW, where this corresponds to ` sext.w rd`. C.ADDIW is
-only valid when _rd_latexmath:[$\neq$]`x0`; the code points with
-_rd_=`x0` are reserved.
+bits. C.ADDIW expands into _addiw rd, rd, imm_. The immediate can be
+zero for C.ADDIW, where this corresponds to _sext.w rd_. C.ADDIW is
+only valid when _rd_latexmath:[$\neq$]_x0_; the code points with
+_rd_=_x0_ are reserved.
C.ADDI16SP shares the opcode with C.LUI, but has a destination field of
-`x2`. C.ADDI16SP adds the non-zero sign-extended 6-bit immediate to the
-value in the stack pointer (`sp`=`x2`), where the immediate is scaled to
+_x2_. C.ADDI16SP adds the non-zero sign-extended 6-bit immediate to the
+value in the stack pointer (_sp_=_x2_), where the immediate is scaled to
represent multiples of 16 in the range (-512,496). C.ADDI16SP is used to
adjust the stack pointer in procedure prologues and epilogues. It
-expands into `addi x2, x2, nzimm`. C.ADDI16SP is only valid when
+expands into _addi x2, x2, nzimm_. C.ADDI16SP is only valid when
_nzimm_latexmath:[$\neq$]0; the code point with _nzimm_=0 is reserved.
[NOTE]
====
-In the standard RISC-V calling convention, the stack pointer `sp` is
+In the standard RISC-V calling convention, the stack pointer *sp* is
always 16-byte aligned.
====
@@ -656,9 +656,9 @@ nzuimm[5:4latexmath:[$\vert$]9:6latexmath:[$\vert$]2latexmath:[$\vert$]3]
& dest & C0 +
C.ADDI4SPN is a CIW-format instruction that adds a _zero_-extended
-non-zero immediate, scaled by 4, to the stack pointer, `x2`, and writes
-the result to `rd`. This instruction is used to generate pointers to
-stack-allocated variables, and expands to `addi rd, x2, nzuimm`.
+non-zero immediate, scaled by 4, to the stack pointer, _x2_, and writes
+the result to _rd_. This instruction is used to generate pointers to
+stack-allocated variables, and expands to _addi rd, x2, nzuimm_.
C.ADDI4SPN is only valid when _nzuimm_latexmath:[$\neq$]0; the code
points with _nzuimm_=0 are reserved.
@@ -672,13 +672,13 @@ C.SLLI is a CI-format instruction that performs a logical left shift of
the value in register _rd_ then writes the result to _rd_. The shift
amount is encoded in the _shamt_ field. For RV128C, a shift amount of
zero is used to encode a shift of 64. C.SLLI expands into
-`slli rd, rd, shamt`, except for RV128C with `shamt=0`, which expands to
-`slli rd, rd, 64`.
+_slli rd, rd, shamt_, except for RV128C with _shamt=0_, which expands to
+_slli rd, rd, 64_.
For RV32C, _shamt[5]_ must be zero; the code points with _shamt[5]_=1
are designated for custom extensions. For RV32C and RV64C, the shift
amount must be non-zero; the code points with _shamt_=0 are HINTs. For
-all base ISAs, the code points with _rd_=`x0` are HINTs, except those
+all base ISAs, the code points with _rd_=_x0_ are HINTs, except those
with _shamt[5]_=1 in RV32C.
S@W@Y@S@T@Y +
@@ -694,15 +694,15 @@ _rd latexmath:[$'$]_. The shift amount is encoded in the _shamt_ field.
For RV128C, a shift amount of zero is used to encode a shift of 64.
Furthermore, the shift amount is sign-extended for RV128C, and so the
legal shift amounts are 1–31, 64, and 96–127. C.SRLI expands into
-`srli rd', rd', shamt`, except for RV128C with `shamt=0`, which
-expands to `srli rd, rd, 64`.
+_srli rd', rd', shamt_, except for RV128C with _shamt=0_, which
+expands to _srli rd, rd, 64_.
For RV32C, _shamt[5]_ must be zero; the code points with _shamt[5]_=1
are designated for custom extensions. For RV32C and RV64C, the shift
amount must be non-zero; the code points with _shamt_=0 are HINTs.
C.SRAI is defined analogously to C.SRLI, but instead performs an
-arithmetic right shift. C.SRAI expands to `srai rd, rd, shamt`.
+arithmetic right shift. C.SRAI expands to _srai rd, rd, shamt_.
[NOTE]
====
@@ -727,7 +727,7 @@ C.ANDI & imm[5] & C.ANDI & dest & imm[4:0] & C1 +
C.ANDI is a CB-format instruction that computes the bitwise AND of the
value in register _rd latexmath:[$'$]_ and the sign-extended 6-bit
immediate, then writes the result to _rd latexmath:[$'$]_. C.ANDI
-expands to `andi rd, rd, imm`.
+expands to _andi rd, rd, imm_.
==== Integer Register-Register Operations
@@ -741,7 +741,7 @@ C.ADD & destlatexmath:[$\neq$]0 & srclatexmath:[$\neq$]0 & C2 +
These instructions use the CR format.
C.MV copies the value in register _rs2_ into register _rd_. C.MV expands
-into `add rd, x0, rs2`. C.MV is only valid when
+into _add rd, x0, rs2_. C.MV is only valid when
latexmath:[$\textit{rs2}{\neq}\texttt{x0}$]; the code points with
latexmath:[$\textit{rs2}{=}\texttt{x0}$] correspond to the C.JR
instruction. The code points with
@@ -758,7 +758,7 @@ hardware cost.
====
C.ADD adds the values in registers _rd_ and _rs2_ and writes the result
-to register _rd_. C.ADD expands into `add rd, rd, rs2`. C.ADD is only
+to register _rd_. C.ADD expands into _add rd, rd, rs2_. C.ADD is only
valid when latexmath:[$\textit{rs2}{\neq}\texttt{x0}$]; the code points
with latexmath:[$\textit{rs2}{=}\texttt{x0}$] correspond to the C.JALR
and C.EBREAK instructions. The code points with
@@ -781,34 +781,34 @@ These instructions use the CA format.
C.AND computes the bitwise AND of the values in registers
_rd latexmath:[$'$]_ and _rs2 latexmath:[$'$]_, then writes the result
to register _rd latexmath:[$'$]_. C.AND expands into
-`and rd, rd, rs2`.
+_and rd, rd, rs2_.
C.OR computes the bitwise OR of the values in registers
_rd latexmath:[$'$]_ and _rs2 latexmath:[$'$]_, then writes the result
to register _rd latexmath:[$'$]_. C.OR expands into
-`or rd&#8242;, rd&#8242;, rs2&#8242;`.
+_or rd&#8242;, rd&#8242;, rs2&#8242;_.
C.XOR computes the bitwise XOR of the values in registers
_rd latexmath:[$'$]_ and _rs2 latexmath:[$'$]_, then writes the result
to register _rd latexmath:[$'$]_. C.XOR expands into
-`xor rd', rd', rs2'.
+_xor rd', rd', rs2'_.
C.SUB subtracts the value in register _rs2 latexmath:[$'$]_ from the
value in register _rd latexmath:[$'$]_, then writes the result to
register _rd latexmath:[$'$]_. C.SUB expands into
-`sub rd', rd', rs2'.
+_sub rd', rd', rs2'_.
C.ADDW is an RV64C/RV128C-only instruction that adds the values in
registers _rd latexmath:[$'$]_ and _rs2 latexmath:[$'$]_, then
sign-extends the lower 32 bits of the sum before writing the result to
register _rd latexmath:[$'$]_. C.ADDW expands into
-`addw rd', rd', rs2'`.
+_addw rd', rd', rs2'_.
C.SUBW is an RV64C/RV128C-only instruction that subtracts the value in
register _rs2 latexmath:[$'$]_ from the value in register
_rd latexmath:[$'$]_, then sign-extends the lower 32 bits of the
difference before writing the result to register _rd latexmath:[$'$]_.
-C.SUBW expands into `subw rd', rd', rs2'`.
+C.SUBW expands into _subw rd', rd', rs2'_.
[NOTE]
====
@@ -849,8 +849,8 @@ SW@T@T@Y +
C.NOP & 0 & 0 & 0 & C1 +
C.NOP is a CI-format instruction that does not change any user-visible
-state, except for advancing the `pc` and incrementing any applicable
-performance counters. C.NOP expands to `nop`. C.NOP is only valid when
+state, except for advancing the _pc_ and incrementing any applicable
+performance counters. C.NOP expands to _nop_. C.NOP is only valid when
_imm_=0; the code points with _imm_latexmath:[$\neq$]0 encode HINTs.
==== Breakpoint Instruction
@@ -861,7 +861,7 @@ E@U@Y +
& 10 & 2 +
C.EBREAK & 0 & C2 +
-Debuggers can use the C.EBREAK instruction, which expands to `ebreak`,
+Debuggers can use the C.EBREAK instruction, which expands to _ebreak_,
to cause control to be transferred back to the debugging environment.
C.EBREAK shares the opcode with the C.ADD instruction, but with _rd_ and
_rs2_ both zero, thus can also use the CR format.
@@ -886,12 +886,12 @@ C instructions will eventually complete.
A portion of the RVC encoding space is reserved for microarchitectural
HINTs. Like the HINTs in the RV32I base ISA (see
<<rv32i-hints>>, these instructions do not
-modify any architectural state, except for advancing the `pc` and any
+modify any architectural state, except for advancing the _pc_ and any
applicable performance counters. HINTs are executed as no-ops on
implementations that ignore them.
RVC HINTs are encoded as computational instructions that do not modify
-the architectural state, either because _rd_=`x0` (e.g.
+the architectural state, either because _rd_=_x0_ (e.g.
C.ADD _x0_, _t0_), or because _rd_ is overwritten with a copy of itself
(e.g. C.ADDI _t0_, 0).
@@ -930,22 +930,22 @@ no standard HINTs will ever be defined in this subspace.
|C.NOP |_nzimm_latexmath:[$\neq$]0 |63 .6+^.>s|_Reserved for future standard
use_
-|C.ADDI | _rd_latexmath:[$\neq$]`x0`, _nzimm_=0 |31
+|C.ADDI | _rd_latexmath:[$\neq$]_x0_, _nzimm_=0 |31
-|C.LI | _rd_=`x0` |64
+|C.LI | _rd_=_x0_ |64
-|C.LUI | _rd_=`x0`, _nzimm_latexmath:[$\neq$]0 |63
+|C.LUI | _rd_=_x0_, _nzimm_latexmath:[$\neq$]0 |63
-|C.MV | _rd_=`x0`, _rs2_latexmath:[$\neq$]`x0` |31
+|C.MV | _rd_=_x0_, _rs2_latexmath:[$\neq$]_x0_ |31
-|C.ADD | _rd_=`x0`, _rs2_latexmath:[$\neq$]`x0` |31
+|C.ADD | _rd_=_x0_, _rs2_latexmath:[$\neq$]_x0_ |31
-|C.SLLI |_rd_=`x0`, _nzimm_latexmath:[$\neq$]0 |31 (RV32), 63 (RV64/128) .5+^.>s|_Designated
+|C.SLLI |_rd_=_x0_, _nzimm_latexmath:[$\neq$]0 |31 (RV32), 63 (RV64/128) .5+^.>s|_Designated
for custom use_
-|C.SLLI64 | _rd_=`x0` |1
+|C.SLLI64 | _rd_=_x0_ |1
-|C.SLLI64 | _rd_latexmath:[$\neq$]`x0`, RV32 and RV64 only |31
+|C.SLLI64 | _rd_latexmath:[$\neq$]_x0_, RV32 and RV64 only |31
|C.SRLI64 | RV32 and RV64 only |8
diff --git a/src/counters.adoc b/src/counters.adoc
index 670ed6e..a92b3fc 100644
--- a/src/counters.adoc
+++ b/src/counters.adoc
@@ -1,10 +1,10 @@
[[perf-counters]]
== Counters
-RISC-V ISAs provide a set of up to 32latexmath:[$\times$]64-bit
+RISC-V ISAs provide a set of up to 32_X_64-bit
performance counters and timers that are accessible via unprivileged
-XLEN read-only CSR registers `0xC00`–`0xC1F` (with the upper 32 bits
-accessed via CSR registers `0xC80`–`0xC9F` on RV32). The first three of
+XLEN read-only CSR registers _0xC00_–_0xC1F_ (with the upper 32 bits
+accessed via CSR registers _0xC80_–_0xC9F_ on RV32). The first three of
these (CYCLE, TIME, and INSTRET) have dedicated functions (cycle count,
real-time clock, and instructions-retired respectively), while the
remaining counters, if implemented, provide programmable event counting.
@@ -22,8 +22,8 @@ RV32I provides a number of 64-bit read-only user-level counters, which
are mapped into the 12-bit CSR address space and accessed in 32-bit
pieces using CSRRS instructions. In RV64I, the CSR instructions can
manipulate 64-bit CSRs. In particular, the RDCYCLE, RDTIME, and
-RDINSTRET pseudoinstructions read the full 64 bits of the `cycle`,
-`time`, and `instret` counters. Hence, the RDCYCLEH, RDTIMEH, and
+RDINSTRET pseudoinstructions read the full 64 bits of the _cycle_,
+_time_, and _instret_ counters. Hence, the RDCYCLEH, RDTIMEH, and
RDINSTRETH instructions are RV32I-only.
[NOTE]
@@ -33,7 +33,7 @@ timing side-channel attacks.
====
(((counters, pseudoinstruction)))
-The RDCYCLE pseudoinstruction reads the low XLEN bits of the `cycle`
+The RDCYCLE pseudoinstruction reads the low XLEN bits of the _cycle_
CSR which holds a count of the number of clock cycles executed by the
processor core on which the hart is running from an arbitrary start time
in the past. RDCYCLEH is an RV32I-only instruction that reads bits 63–32
@@ -46,9 +46,9 @@ environment should provide a means to determine the current rate
[TIP]
====
RDCYCLE is intended to return the number of cycles executed by the
-processor core, not the hart. Precisely defining what is
+processor core, not the hart. Precisely defining what is a "core"
difficult given some implementation choices (e.g., AMD Bulldozer).
-Precisely defining what is a `clock cycle` is also difficult given the
+Precisely defining what is a "clock cycle" is also difficult given the
range of implementations (including software emulations), but the intent
is that RDCYCLE is used for performance monitoring along with the other
performance counters. In particular, where there is one hart/core, one
@@ -71,7 +71,7 @@ threading implementations. For example, should we only count cycles for
which any instruction was issued to execution for this hart, and/or
cycles any instruction retired, or include cycles this hart was
occupying machine resources but couldn’t execute due to stalls while
-other harts went into execution? Likely, `all of the above` would be
+other harts went into execution? Likely, _all of the above_ would be
needed to have understandable performance stats. This complexity of
defining a per-hart cycle count, and also the need in any case for a
total per-core cycle count when tuning multithreaded code led to just
@@ -79,8 +79,8 @@ standardizing the per-core cycle counter, which also happens to work
well for the common single hart/core case.
(((counters, handling sleep cycles)))
-Standardizing what happens during `sleep` is not practical given that
-what `sleep` means is not standardized across execution environments,
+Standardizing what happens during "sleep" is not practical given that
+what "sleep" means is not standardized across execution environments,
but if the entire core is paused (entirely clock-gated or powered-down
in deep sleep), then it is not executing clock cycles, and the cycle
count shouldn’t be increasing per the spec. There are many details,
@@ -90,12 +90,12 @@ execution-environment-specific details.
Even though there is no precise definition that works for all platforms,
this is still a useful facility for most platforms, and an imprecise,
-common, `usually correct` standard here is better than no standard.
+common, "usually correct" standard here is better than no standard.
The intent of RDCYCLE was primarily performance monitoring/tuning, and
the specification was written with that goal in mind.
====
-The RDTIME pseudoinstruction reads the low XLEN bits of the ` time` CSR,
+The RDTIME pseudoinstruction reads the low XLEN bits of the *time* CSR,
which counts wall-clock real time that has passed from an arbitrary
start time in the past. RDTIMEH is an RV32I-only instruction that reads
bits 63–32 of the same real-time counter. The underlying 64-bit counter
@@ -116,14 +116,14 @@ portable, rather than using RDCYCLE to measure wall-clock time.
(((counters, pseudoinstructions)))
The RDINSTRET pseudoinstruction reads the low XLEN bits of the
-` instret` CSR, which counts the number of instructions retired by this
+*instret* CSR, which counts the number of instructions retired by this
hart from some arbitrary start point in the past. RDINSTRETH is an
RV32I-only instruction that reads bits 63–32 of the same instruction
counter. The underlying 64-bit counter should never overflow in
practice.
The following code sequence will read a valid 64-bit cycle counter value
-into `x3`:`x2`, even if the counter overflows its lower half between
+into _x3_:_x2_, even if the counter overflows its lower half between
reading its upper and lower halves.
.Sample code for reading the 64-bit cycle counter in RV32.
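The listing introduced by this title lies outside the diff context. The Python sketch below is not that listing. It only models one common way to obtain a consistent 64-bit value from two 32-bit reads: read the high half, read the low half, re-read the high half, and retry if the high half changed. `FakeCycleCounter` and `read_cycle64` are hypothetical names, and the counter advancing by one on every read is an assumption made so that the rollover case is easy to trigger.

```python
class FakeCycleCounter:
    """Hypothetical 64-bit counter that advances by `step` on every read."""

    MASK64 = (1 << 64) - 1

    def __init__(self, start, step=1):
        self.value = start & self.MASK64
        self.step = step

    def _tick(self):
        # Return the current value, then advance the counter.
        v = self.value
        self.value = (self.value + self.step) & self.MASK64
        return v

    def read_low(self):
        # Models a read of the low 32 bits (RDCYCLE on RV32).
        return self._tick() & 0xFFFFFFFF

    def read_high(self):
        # Models a read of bits 63-32 (RDCYCLEH).
        return self._tick() >> 32


def read_cycle64(counter):
    """Read high, low, high again; retry if the high half changed."""
    while True:
        hi = counter.read_high()
        lo = counter.read_low()
        if counter.read_high() == hi:
            return (hi << 32) | lo


# Start one tick before the low half rolls over.
counter = FakeCycleCounter(0x1_FFFF_FFFF)
result = read_cycle64(counter)
```

Without the re-read, the first pass in this scenario would combine a stale high half with a low half read after the rollover, yielding a value far below the true count. The retry discards that pass.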
@@ -168,9 +168,9 @@ implementations with a richer set of counters.
(((counters, performance)))
There is CSR space allocated for 29 additional unprivileged 64-bit
-hardware performance counters, `hpmcounter3`–`hpmcounter31`. For RV32,
+hardware performance counters, _hpmcounter3_–_hpmcounter31_. For RV32,
the upper 32 bits of these performance counters is accessible via
-additional CSRs `hpmcounter3h`–` hpmcounter31h`. These counters count
+additional CSRs _hpmcounter3h_–_hpmcounter31h_. These counters count
platform-specific events and are configured via additional privileged
registers. The number and width of these additional counters, and the
set of events they count is platform-specific.
@@ -184,6 +184,6 @@ counted.
It would be useful to eventually standardize event settings to count
ISA-level metrics, such as the number of floating-point instructions
executed for example, and possibly a few common microarchitectural
-metrics, such as `L1 instruction cache misses`.
+metrics, such as _L1 instruction cache misses_.
====
diff --git a/src/intro.adoc b/src/intro.adoc
index f455e54..3b318bc 100644
--- a/src/intro.adoc
+++ b/src/intro.adoc
@@ -730,5 +730,3 @@ behavior and values and use the term _unspecified_ for cases that are intentiona
unconstrained. These cases may be constrained or defined by other
extensions, platform standards, or implementations.
-
-Susan Anstey
diff --git a/src/m-st-ext.adoc b/src/m-st-ext.adoc
index ae5e91c..6538750 100644
--- a/src/m-st-ext.adoc
+++ b/src/m-st-ext.adoc
@@ -1,8 +1,8 @@
[[mstandard]]
-== _M_ Standard Extension for Integer Multiplication and Division, Version 2.0
+== M Standard Extension for Integer Multiplication and Division, Version 2.0
This chapter describes the standard integer multiplication and division
-instruction extension, which is named _M_ and contains instructions
+instruction extension, which is named M and contains instructions
that multiply or divide values held in two integer registers.
[TIP]
@@ -23,12 +23,12 @@ image::image_placeholder.png[]
(((MUL, MULHU)))
(((MUL, MULHSU)))
-MUL performs an XLEN-bit_X_XLEN-bit multiplication of
+MUL performs an XLEN-bit X XLEN-bit multiplication of
_rs1_ by _rs2_ and places the lower XLEN bits in the destination
register. MULH, MULHU, and MULHSU perform the same multiplication but
-return the upper XLEN bits of the full 2_X_XLEN-bit
-product, for signed_X_signed,
-unsigned_X_unsigned, and _rs1X_ unsigned _rs2_ multiplication, respectively.
+return the upper XLEN bits of the full 2 X XLEN-bit
+product, for signed X signed,
+unsigned X unsigned, and _rs1X_ unsigned _rs2_ multiplication, respectively.
If both the high and low bits of the same product are required, then
the recommended code sequence is: MULH[[S]U]
_rdh, rs1, rs2_; MUL _rdl, rs1, rs2_ (source register specifiers must be
@@ -104,11 +104,10 @@ overflow cannot occur.
[cols="<,^,^,^,^,^,^",options="header",]
|===
|Condition |Dividend |Divisor |DIVU[W] |REMU[W] |DIV[W] |REM[W]
-|Division by zero |latexmath:[$x$] |0 |latexmath:[$2^{L}-1$]
-|latexmath:[$x$] |latexmath:[$-1$] |latexmath:[$x$]
-|Overflow (signed only) |latexmath:[$-2^{L-1}$] |latexmath:[$-1$] |– |–
-|latexmath:[$-2^{L-1}$] |0
+|Division by zero |latexmath:[$x$] |0 |latexmath:[$2^{L}-1$] |latexmath:[$x$] |latexmath:[$-1$] |latexmath:[$x$]
+
+|Overflow (signed only) |latexmath:[$-2^{L-1}$] |latexmath:[$-1$] |– |– |latexmath:[$-2^{L-1}$] |0
|===
In <<divby0>>, L is the width of the operation in bits: XLEN for DIV[U] and REM[U], or 32 for DIV[U]W and REM[U]W.
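The table rows above can be checked against a small Python model. This is a sketch, not the specification: the function names are hypothetical, results are returned as unsigned L-bit patterns, and signed division is assumed to round toward zero with the remainder taking the sign of the dividend, as the wider M chapter states outside this hunk.

```python
def to_signed(x, L):
    """Interpret the low L bits of x as a two's-complement integer."""
    x &= (1 << L) - 1
    return x - (1 << L) if x >> (L - 1) else x


def divu(a, b, L):
    mask = (1 << L) - 1
    a, b = a & mask, b & mask
    return mask if b == 0 else a // b      # division by zero -> 2^L - 1


def remu(a, b, L):
    mask = (1 << L) - 1
    a, b = a & mask, b & mask
    return a if b == 0 else a % b          # division by zero -> dividend


def div(a, b, L):
    mask = (1 << L) - 1
    sa, sb = to_signed(a, L), to_signed(b, L)
    if sb == 0:
        return mask                        # -1 as an L-bit pattern
    q = abs(sa) // abs(sb)                 # magnitude, rounded toward zero
    if (sa < 0) != (sb < 0):
        q = -q
    return q & mask                        # overflow case wraps to -2^(L-1)


def rem(a, b, L):
    mask = (1 << L) - 1
    sa, sb = to_signed(a, L), to_signed(b, L)
    if sb == 0:
        return sa & mask                   # division by zero -> dividend
    r = abs(sa) % abs(sb)
    return (-r if sa < 0 else r) & mask    # overflow case yields 0
```

Passing `L=32` models the DIV[U]W and REM[U]W variants on RV64 only up to the final sign extension, which this sketch leaves out.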
@@ -147,7 +146,7 @@ of the corresponding M-extension instructions.
[NOTE]
====
-The Zmmul extension enables low-cost implementations that require
+The *Zmmul* extension enables low-cost implementations that require
multiplication operations but not division. For many microcontroller
applications, division operations are too infrequent to justify the cost
of divider hardware. By contrast, multiplication operations are more
diff --git a/src/riscv-isa-unpr-conv-review.pdf b/src/riscv-isa-unpr-conv-review.pdf
index 1200701..917caf4 100644
--- a/src/riscv-isa-unpr-conv-review.pdf
+++ b/src/riscv-isa-unpr-conv-review.pdf
Binary files differ
diff --git a/src/rvwmo.adoc b/src/rvwmo.adoc
index e49d990..885e7ca 100644
--- a/src/rvwmo.adoc
+++ b/src/rvwmo.adoc
@@ -3,7 +3,7 @@
This chapter defines the RISC-V memory consistency model. A memory
consistency model is a set of rules specifying the values that can be
-returned by loads of memory. RISC-V uses a memory model called `RVWMO`
+returned by loads of memory. RISC-V uses a memory model called RVWMO
(RISC-V Weak Memory Ordering) which is designed to provide flexibility
for architects to build high-performance scalable designs while
simultaneously supporting a tractable programming model.
@@ -17,14 +17,14 @@ instructions from the first hart being executed in a different order.
Therefore, multithreaded code may require explicit synchronization to
guarantee ordering between memory instructions from different harts. The
base RISC-V ISA provides a FENCE instruction for this purpose, described
-in <<fence>>, while the atomics extension `A`
+in <<fence>>, while the atomics extension ^A^
additionally defines load-reserved/store-conditional and atomic
read-modify-write instructions.
(((atomics, misaligned)))
-The standard ISA extension for misaligned atomics `Zam`
+The standard ISA extension for misaligned atomics _Zam_
(<<zam>>) and the standard ISA extension for total
-store ordering `Ztso` (<<ztso>>) augment RVWMO
+store ordering _Ztso_ (<<ztso>>) augment RVWMO
with additional rules specific to those extensions.
The appendices to this specification provide both axiomatic and
@@ -33,12 +33,14 @@ additional explanatory material.
((FENCE))
((SFENCE))
+[NOTE]
+====
This chapter defines the memory model for regular main memory
operations. The interaction of the memory model with I/O memory,
instruction fetches, FENCE.I, page table walks, and SFENCE.VMA is not
(yet) formalized. Some or all of the above may be formalized in a future
revision of this specification. The RV128 base ISA and future ISA
-extensions such as the `V` vector and `J` JIT extensions will need
+extensions such as the V vector and J JIT extensions will need
to be incorporated into a future revision as well.
Memory consistency models supporting overlapping memory accesses of
@@ -47,6 +49,7 @@ research and are not yet fully understood. The specifics of how memory
accesses of different sizes interact under RVWMO are specified to the
best of our current abilities, but they are subject to revision should
new issues be uncovered.
+====
[[rvwmo]]
=== Definition of the RVWMO Memory Model
@@ -86,10 +89,13 @@ multiple memory operations if XLENlatexmath:[$<$]64, as stated in
gives rise to a single memory operation that is both a load operation
and a store operation simultaneously.
+[NOTE]
+====
Instructions in the RV128 base instruction set and in future ISA
extensions such as V (vector) and P (SIMD) may give rise to multiple
memory operations. However, the memory model for these extensions has
not yet been formalized.
+====
A misaligned load or store instruction may be decomposed into a set of
component memory operations of any granularity. An FLD or FSD
@@ -98,19 +104,22 @@ a set of component memory operations of any granularity. The memory
operations generated by such instructions are not ordered with respect
to each other in program order, but they are ordered normally with
respect to the memory operations generated by preceding and subsequent
-instructions in program order. The atomics extension `A` does not
+instructions in program order. The atomics extension ^A^ does not
require execution environments to support misaligned atomic instructions
-at all; however, if misaligned atomics are supported via the `Zam`
+at all; however, if misaligned atomics are supported via the _Zam_
extension, LRs, SCs, and AMOs may be decomposed subject to the
constraints of the atomicity axiom for misaligned atomics, which is
defined in <<zam>>.
((decomposition))
+[NOTE]
+====
The decomposition of misaligned memory operations down to byte
granularity facilitates emulation on implementations that do not
natively support misaligned accesses. Such implementations might, for
example, simply iterate over the bytes of a misaligned access one by
one.
+====
An LR instruction and an SC instruction are said to be _paired_ if the
LR precedes the SC in program order and if there are no other LR or SC
@@ -121,38 +130,41 @@ whether an SC must succeed, may succeed, or must fail is defined in
<<lrsc>>.
Load and store operations may also carry one or more ordering
-annotations from the following set: `acquire-RCpc`, `acquire-RCsc`,
-`release-RCpc`, and `release-RCsc`. An AMO or LR instruction with
-_aq_ set has an `acquire-RCsc` annotation. An AMO or SC instruction
-with _rl_ set has a `release-RCsc` annotation. An AMO, LR, or SC
-instruction with both _aq_ and _rl_ set has both `acquire-RCsc` and
-`release-RCsc` annotations.
-
-For convenience, we use the term `acquire annotation` to refer to an
+annotations from the following set: _acquire-RCpc_, _acquire-RCsc_,
+_release-RCpc_, and _release-RCsc_. An AMO or LR instruction with
+_aq_ set has an _acquire-RCsc_ annotation. An AMO or SC instruction
+with _rl_ set has a _release-RCsc_ annotation. An AMO, LR, or SC
+instruction with both _aq_ and _rl_ set has both _acquire-RCsc_ and
+_release-RCsc_ annotations.
+
+For convenience, we use the term _acquire annotation_ to refer to an
acquire-RCpc annotation or an acquire-RCsc annotation. Likewise, a
-`release annotation` refers to a release-RCpc annotation or a
-release-RCsc annotation. An `RCpc annotation` refers to an
-acquire-RCpc annotation or a release-RCpc annotation. An `RCsc
-annotation` refers to an acquire-RCsc annotation or a release-RCsc
+_release annotation_ refers to a release-RCpc annotation or a
+release-RCsc annotation. An _RCpc annotation_ refers to an
+acquire-RCpc annotation or a release-RCpc annotation. An _RCsc
+annotation_ refers to an acquire-RCsc annotation or a release-RCsc
annotation.
-In the memory model literature, the term `RCpc` stands for release
+[NOTE]
+====
+In the memory model literature, the term *RCpc* stands for release
consistency with processor-consistent synchronization operations, and
-the term `RCsc` stands for release consistency with sequentially
-consistent synchronization operations .
+the term *RCsc* stands for release consistency with sequentially
+consistent synchronization operations.
While there are many different definitions for acquire and release
annotations in the literature, in the context of RVWMO these terms are
concisely and completely defined by Preserved Program Order rules
<<rcsc>>.
-`RCpc` annotations are currently only used when implicitly assigned to
-every memory access per the standard extension `Ztso`
+*RCpc* annotations are currently only used when implicitly assigned to
+every memory access per the standard extension *Ztso*
(<<ztso>>). Furthermore, although the ISA does not
currently contain native load-acquire or store-release instructions, nor
RCpc variants thereof, the RVWMO model itself is designed to be
forwards-compatible with the potential addition of any or all of the
above into the ISA in a future extension.
+====
[[mem-dependencies]]
==== Syntactic Dependencies
@@ -160,7 +172,7 @@ above into the ISA in a future extension.
The definition of the RVWMO memory model depends in part on the notion
of a syntactic dependency, defined as follows.
-In the context of defining dependencies, a `register` refers either to
+In the context of defining dependencies, a _register_ refers either to
an entire general-purpose register, some portion of a CSR, or an entire
CSR. The granularity at which dependencies are tracked through CSRs is
specific to each CSR and is defined in
@@ -173,79 +185,81 @@ destination registers. This section provides a general definition of all
of these terms; however, <<source-dest-regs>> provides a
complete listing of the specifics for each instruction.
-In general, a register latexmath:[$r$] other than `x0` is a _source
-register_ for an instruction latexmath:[$i$] if any of the following
+In general, a register _r_ other than _x0_ is a _source
+register_ for an instruction _i_ if any of the following
hold:
-* In the opcode of latexmath:[$i$], _rs1_, _rs2_, or _rs3_ is set to
-latexmath:[$r$]
-* latexmath:[$i$] is a CSR instruction, and in the opcode of
-latexmath:[$i$], _csr_ is set to latexmath:[$r$], unless latexmath:[$i$]
-is CSRRW or CSRRWI and _rd_ is set to `x0`
-* latexmath:[$r$] is a CSR and an implicit source register for
-latexmath:[$i$], as defined in <<source-dest-regs>>
-* latexmath:[$r$] is a CSR that aliases with another source register for
-latexmath:[$i$]
+* In the opcode of _i_, _rs1_, _rs2_, or _rs3_ is set to
+_r_
+* _i_ is a CSR instruction, and in the opcode of
+_i_, _csr_ is set to _r_, unless _i_
+is CSRRW or CSRRWI and _rd_ is set to _x0_
+* _r_ is a CSR and an implicit source register for
+_i_, as defined in <<source-dest-regs>>
+* _r_ is a CSR that aliases with another source register for
+_i_
Memory instructions also further specify which source registers are
_address source registers_ and which are _data source registers_.
-In general, a register latexmath:[$r$] other than `x0` is a _destination
-register_ for an instruction latexmath:[$i$] if any of the following
+In general, a register _r_ other than _x0_ is a _destination
+register_ for an instruction _i_ if any of the following
hold:
-* In the opcode of latexmath:[$i$], _rd_ is set to latexmath:[$r$]
-* latexmath:[$i$] is a CSR instruction, and in the opcode of
-latexmath:[$i$], _csr_ is set to latexmath:[$r$], unless latexmath:[$i$]
-is CSRRS or CSRRC and _rs1_ is set to `x0` or latexmath:[$i$] is CSRRSI
+* In the opcode of _i_, _rd_ is set to _r_
+* _i_ is a CSR instruction, and in the opcode of
+_i_, _csr_ is set to _r_, unless _i_
+is CSRRS or CSRRC and _rs1_ is set to _x0_ or _i_ is CSRRSI
or CSRRCI and uimm[4:0] is set to zero.
-* latexmath:[$r$] is a CSR and an implicit destination register for
-latexmath:[$i$], as defined in <<source-dest-regs>>
-* latexmath:[$r$] is a CSR that aliases with another destination
-register for latexmath:[$i$]
+* _r_ is a CSR and an implicit destination register for
+_i_, as defined in <<source-dest-regs>>
+* _r_ is a CSR that aliases with another destination
+register for _i_
Most non-memory instructions _carry a dependency_ from each of their
source registers to each of their destination registers. However, there
are exceptions to this rule; see <<source-dest-regs>>.
-Instruction latexmath:[$j$] has a _syntactic dependency_ on instruction
-latexmath:[$i$] via destination register latexmath:[$s$] of
-latexmath:[$i$] and source register latexmath:[$r$] of latexmath:[$j$]
+Instruction _j_ has a _syntactic dependency_ on instruction
+_i_ via destination register _s_ of
+_i_ and source register _r_ of _j_
if either of the following hold:
-* latexmath:[$s$] is the same as latexmath:[$r$], and no instruction
-program-ordered between latexmath:[$i$] and latexmath:[$j$] has
-latexmath:[$r$] as a destination register
-* There is an instruction latexmath:[$m$] program-ordered between
-latexmath:[$i$] and latexmath:[$j$] such that all of the following hold:
-. latexmath:[$j$] has a syntactic dependency on latexmath:[$m$] via
-destination register latexmath:[$q$] and source register latexmath:[$r$]
-. latexmath:[$m$] has a syntactic dependency on latexmath:[$i$] via
-destination register latexmath:[$s$] and source register latexmath:[$p$]
-. latexmath:[$m$] carries a dependency from latexmath:[$p$] to
-latexmath:[$q$]
-
-Finally, in the definitions that follow, let latexmath:[$a$] and
-latexmath:[$b$] be two memory operations, and let latexmath:[$i$] and
-latexmath:[$j$] be the instructions that generate latexmath:[$a$] and
-latexmath:[$b$], respectively.
-
-latexmath:[$b$] has a _syntactic address dependency_ on latexmath:[$a$]
-if latexmath:[$r$] is an address source register for latexmath:[$j$] and
-latexmath:[$j$] has a syntactic dependency on latexmath:[$i$] via source
-register latexmath:[$r$]
-
-latexmath:[$b$] has a _syntactic data dependency_ on latexmath:[$a$] if
-latexmath:[$b$] is a store operation, latexmath:[$r$] is a data source
-register for latexmath:[$j$], and latexmath:[$j$] has a syntactic
-dependency on latexmath:[$i$] via source register latexmath:[$r$]
-
-latexmath:[$b$] has a _syntactic control dependency_ on latexmath:[$a$]
-if there is an instruction latexmath:[$m$] program-ordered between
-latexmath:[$i$] and latexmath:[$j$] such that latexmath:[$m$] is a
-branch or indirect jump and latexmath:[$m$] has a syntactic dependency
-on latexmath:[$i$].
-
+* _s_ is the same as _r_, and no instruction
+program-ordered between _i_ and _j_ has
+_r_ as a destination register
+* There is an instruction _m_ program-ordered between
+_i_ and _j_ such that all of the following hold:
+. _j_ has a syntactic dependency on _m_ via
+destination register _q_ and source register _r_
+. _m_ has a syntactic dependency on _i_ via
+destination register _s_ and source register _p_
+. _m_ carries a dependency from _p_ to
+_q_
+
+Finally, in the definitions that follow, let ^A^ and
+_b_ be two memory operations, and let _i_ and
+_j_ be the instructions that generate l^A^ and
+_b_, respectively.
+
+_b_ has a _syntactic address dependency_ on l^A^
+if _r_ is an address source register for _j_ and
+_j_ has a syntactic dependency on _i_ via source
+register _r_
+
+_b_ has a _syntactic data dependency_ on l^A^ if
+_b_ is a store operation, _r_ is a data source
+register for _j_, and _j_ has a syntactic
+dependency on _i_ via source register _r_
+
+_b_ has a _syntactic control dependency_ on l^A^
+if there is an instruction _m_ program-ordered between
+_i_ and _j_ such that _m_ is a
+branch or indirect jump and _m_ has a syntactic dependency
+on _i_.
+
+[NOTE]
+====
Generally speaking, non-AMO load instructions do not have data source
registers, and unconditional non-AMO store instructions do not have
destination registers. However, a successful SC instruction is
@@ -253,6 +267,7 @@ considered to have the register specified in _rd_ as a destination
register, and hence it is possible for an instruction to have a
syntactic dependency on a successful SC instruction that precedes it in
program order.
+====
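The recursive definition of a syntactic dependency given earlier in this section can be sketched in Python. Everything here is illustrative: `Insn`, `carried`, and `depends` are hypothetical names, a program is a plain list in program order, and each instruction lists its source and destination registers by name. By default an instruction is assumed to carry a dependency from every source to every destination; an explicit `carries` set overrides that.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Insn:
    name: str
    srcs: tuple = ()
    dsts: tuple = ()
    carries: Optional[frozenset] = None  # (src, dst) pairs; None means all pairs

    def carried(self):
        if self.carries is not None:
            return self.carries
        return frozenset((p, q) for p in self.srcs for q in self.dsts)


def depends(prog, i, j, s, r):
    """True if prog[j] has a syntactic dependency on prog[i] via
    destination register s of prog[i] and source register r of prog[j]."""
    if "x0" in (s, r):                    # x0 is never a source or destination
        return False
    if not (i < j and s in prog[i].dsts and r in prog[j].srcs):
        return False
    # Direct case: same register, not overwritten in between.
    if s == r and all(r not in prog[k].dsts for k in range(i + 1, j)):
        return True
    # Transitive case: some m in between carries p -> q.
    for m in range(i + 1, j):
        for p, q in prog[m].carried():
            if depends(prog, m, j, q, r) and depends(prog, i, m, s, p):
                return True
    return False


prog = [
    Insn("lw a0, 0(s0)", srcs=("s0",), dsts=("a0",), carries=frozenset()),
    Insn("addi a1, a0, 4", srcs=("a0",), dsts=("a1",)),
    Insn("add a2, a1, a3", srcs=("a1", "a3"), dsts=("a2",)),
    Insn("sw a4, 0(a2)", srcs=("a2", "a4")),
]
chained = depends(prog, 0, 3, "a0", "a2")
```

In this example `a2` is the address source register of the store, so the store would have a syntactic address dependency on the load. Overwriting `a0` between the load and its consumer breaks the chain, because the direct case then fails and nothing carries the value forward.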
==== Preserved Program Order
@@ -263,52 +278,52 @@ _preserved program order_.
The complete definition of preserved program order is as follows (and
note that AMOs are simultaneously both loads and stores): memory
-operation latexmath:[$a$] precedes memory operation latexmath:[$b$] in
+operation l^A^ precedes memory operation _b_ in
preserved program order (and hence also in the global memory order) if
-latexmath:[$a$] precedes latexmath:[$b$] in program order,
-latexmath:[$a$] and latexmath:[$b$] both access regular main memory
+l^A^ precedes _b_ in program order,
+l^A^ and _b_ both access regular main memory
(rather than I/O regions), and any of the following hold:
[[overlapping-orering]]
* Overlapping-Address Orderings:
-. latexmath:[$b$] is a store, and
-latexmath:[$a$] and latexmath:[$b$] access overlapping memory addresses
-. latexmath:[$a$] and latexmath:[$b$] are loads,
-latexmath:[$x$] is a byte read by both latexmath:[$a$] and
-latexmath:[$b$], there is no store to latexmath:[$x$] between
-latexmath:[$a$] and latexmath:[$b$] in program order, and
-latexmath:[$a$] and latexmath:[$b$] return values for latexmath:[$x$]
+. _b_ is a store, and
+l^A^ and _b_ access overlapping memory addresses
+. l^A^ and _b_ are loads,
+_x_ is a byte read by both l^A^ and
+_b_, there is no store to _x_ between
+l^A^ and _b_ in program order, and
+l^A^ and _b_ return values for _x_
written by different memory operations
-. latexmath:[$a$] is
-generated by an AMO or SC instruction, latexmath:[$b$] is a load, and
-latexmath:[$b$] returns a value written by latexmath:[$a$]
+. l^A^ is
+generated by an AMO or SC instruction, _b_ is a load, and
+_b_ returns a value written by l^A^
* Explicit Synchronization
. There is a FENCE instruction that
-orders latexmath:[$a$] before latexmath:[$b$]
-. latexmath:[$a$] has an acquire
+orders l^A^ before _b_
+. l^A^ has an acquire
annotation
-. latexmath:[$b$] has a release annotation
-. latexmath:[$a$] and latexmath:[$b$] both have
+. _b_ has a release annotation
+. l^A^ and _b_ both have
RCsc annotations
-. {empty} latexmath:[$a$] is paired with
-latexmath:[$b$]
+. {empty} l^A^ is paired with
+_b_
* Syntactic Dependencies
-. latexmath:[$b$] has a syntactic address
-dependency on latexmath:[$a$]
-. latexmath:[$b$] has a syntactic data
-dependency on latexmath:[$a$]
-. latexmath:[$b$] is a store, and
-latexmath:[$b$] has a syntactic control dependency on latexmath:[$a$]
+. _b_ has a syntactic address
+dependency on l^A^
+. _b_ has a syntactic data
+dependency on l^A^
+. _b_ is a store, and
+_b_ has a syntactic control dependency on l^A^
* Pipeline Dependencies
-. latexmath:[$b$] is a
-load, and there exists some store latexmath:[$m$] between
-latexmath:[$a$] and latexmath:[$b$] in program order such that
-latexmath:[$m$] has an address or data dependency on latexmath:[$a$],
-and latexmath:[$b$] returns a value written by latexmath:[$m$]
-. latexmath:[$b$] is a store, and
-there exists some instruction latexmath:[$m$] between latexmath:[$a$]
-and latexmath:[$b$] in program order such that latexmath:[$m$] has an
-address dependency on latexmath:[$a$]
+. _b_ is a
+load, and there exists some store _m_ between
+l^A^ and _b_ in program order such that
+_m_ has an address or data dependency on l^A^,
+and _b_ returns a value written by _m_
+. _b_ is a store, and
+there exists some instruction _m_ between l^A^
+and _b_ in program order such that _m_ has an
+address dependency on l^A^
==== Memory Model Axioms
@@ -320,25 +335,25 @@ axiom_, and the _progress axiom_.
[[ax-load]]
===== Load Value Axiom
-Each byte of each load latexmath:[$i$] returns the value written to that
+Each byte of each load _i_ returns the value written to that
byte by the store that is the latest in global memory order among the
following stores:
-. Stores that write that byte and that precede latexmath:[$i$] in the
+. Stores that write that byte and that precede _i_ in the
global memory order
-. Stores that write that byte and that precede latexmath:[$i$] in
+. Stores that write that byte and that precede _i_ in
program order
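The two candidate sets in the load value axiom can be illustrated for a single byte with a short Python sketch. The names `Op` and `load_value` are hypothetical, the global memory order is passed in as a list rather than derived, and returning `initial` when no store qualifies is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    hart: int
    po: int          # position in that hart's program order
    kind: str        # "load" or "store"
    addr: int        # address of the single byte accessed
    value: int = 0   # value written (stores only)


def load_value(gmo, load, initial=0):
    """Value returned for the byte read by `load`, given global memory order `gmo`."""
    pos = {op: n for n, op in enumerate(gmo)}
    candidates = [
        s for s in gmo
        if s.kind == "store" and s.addr == load.addr
        and (pos[s] < pos[load]                            # earlier in global memory order
             or (s.hart == load.hart and s.po < load.po))  # earlier in program order
    ]
    if not candidates:
        return initial
    # Among the candidates, the latest in global memory order wins.
    return max(candidates, key=pos.__getitem__).value


st0 = Op(hart=0, po=0, kind="store", addr=0x10, value=1)
ld0 = Op(hart=0, po=1, kind="load", addr=0x10)
st1 = Op(hart=1, po=0, kind="store", addr=0x10, value=2)
ld1 = Op(hart=1, po=1, kind="load", addr=0x10)

# Hart 0's store enters the global memory order after its own later load.
gmo = [st1, ld1, ld0, st0]
forwarded = load_value(gmo, ld0)
```

Here `ld0` sits before `st0` in the global memory order, yet it still returns 1, because `st0` precedes it in program order and is the latest candidate in the global memory order. This is the case the second candidate set exists for: a hart reading its own store before that store becomes globally visible.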
[[ax-atom]]
===== Atomicity Axiom
-If latexmath:[$r$] and latexmath:[$w$] are paired load and store
+If _r_ and _w_ are paired load and store
operations generated by aligned LR and SC instructions in a hart
-latexmath:[$h$], latexmath:[$s$] is a store to byte latexmath:[$x$], and
-latexmath:[$r$] returns a value written by latexmath:[$s$], then
-latexmath:[$s$] must precede latexmath:[$w$] in the global memory order,
-and there can be no store from a hart other than latexmath:[$h$] to byte
-latexmath:[$x$] following latexmath:[$s$] and preceding latexmath:[$w$]
+_h_, _s_ is a store to byte _x_, and
+_r_ returns a value written by _s_, then
+_s_ must precede _w_ in the global memory order,
+and there can be no store from a hart other than _h_ to byte
+_x_ following _s_ and preceding _w_
in the global memory order.
The atomicity axiom theoretically supports LR/SC pairs of different widths and to
@@ -359,9 +374,9 @@ infinite sequence of other memory operations.
[cols="<,<,<",options="header",]
|===
|Name |Portions Tracked as Independent Units |Aliases
-|`fflags` |Bits 4, 3, 2, 1, 0 |`fcsr`
-|`frm` |entire CSR |`fcsr`
-|`fcsr` |Bits 7-5, 4, 3, 2, 1, 0 |`fflags`, `frm`
+|_fflags_ |Bits 4, 3, 2, 1, 0 |_fcsr_
+|_frm_ |entire CSR |_fcsr_
+|_fcsr_ |Bits 7-5, 4, 3, 2, 1, 0 |_fflags_, _frm_
|===
Note: read-only CSRs are not listed, as they do not participate in the
@@ -375,291 +390,296 @@ registers for each instruction. These listings are used in the
definition of syntactic dependencies in
<<mem-dependencies>>.
-The term `accumulating CSR` is used to describe a CSR that is both a
+The term _accumulating CSR_ is used to describe a CSR that is both a
source and a destination register, but which carries a dependency only
from itself to itself.
Instructions carry a dependency from each source register in the
-`Source Registers` column to each destination register in the
-`Destination Registers` column, from each source register in the
-`Source Registers` column to each CSR in the `Accumulating CSRs`
-column, and from each CSR in the `Accumulating CSRs` column to itself,
+_Source Registers_ column to each destination register in the
+_Destination Registers_ column, from each source register in the
+_Source Registers_ column to each CSR in the _Accumulating CSRs_
+column, and from each CSR in the _Accumulating CSRs_ column to itself,
except where annotated otherwise.
Key:
-latexmath:[$^A$]Address source register
+- ^A^: Address source register
-latexmath:[$^D$]Data source register
+- ^D^: Data source register
-latexmath:[$^\dagger$]The instruction does not carry a dependency from
+- latexmath:[$^\dagger$]: The instruction does not carry a dependency from
any source register to any destination register
-latexmath:[$^\ddagger$]The instruction carries dependencies from source
+- latexmath:[$^\ddagger$]: The instruction carries dependencies from source
register(s) to destination register(s) as specified
-[cols="<,<,<,<",]
+.RV32I Base Integer Instruction Set
+[%header,cols="<,<,<,<"]
|===
-|*RV32I Base Integer Instruction Set* | | |
-| |Source |Destination |Accumulating
-| |Registers |Registers |CSRs
+||Source Registers |Destination Registers|Accumulating CSRs
+
|LUI | |_rd_ |
+
|AUIPC | |_rd_ |
+
|JAL | |_rd_ |
-|JALRlatexmath:[$^\dagger$] |_rs1_ |_rd_ |
+
+|JALR latexmath:[$^\dagger$] |_rs1_ |_rd_ |
+
|BEQ |_rs1_, _rs2_ | |
+
|BNE |_rs1_, _rs2_ | |
+
|BLT |_rs1_, _rs2_ | |
+
|BGE |_rs1_, _rs2_ | |
+
|BLTU |_rs1_, _rs2_ | |
+
|BGEU |_rs1_, _rs2_ | |
-|LBlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$] |_rd_ |
-|LHlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$] |_rd_ |
-|LWlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$] |_rd_ |
-|LBUlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$] |_rd_ |
-|LHUlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$] |_rd_ |
-|SB |_rs1_latexmath:[$^A$], _rs2_latexmath:[$^D$] | |
-|SH |_rs1_latexmath:[$^A$], _rs2_latexmath:[$^D$] | |
-|SW |_rs1_latexmath:[$^A$], _rs2_latexmath:[$^D$] | |
+
+|LB latexmath:[$^\dagger$] | _rs1_ ^A^ | _rd_ |
+
+|LH latexmath:[$^\dagger$] | _rs1_ ^A^ | _rd_ |
+
+|LW latexmath:[$^\dagger$] | _rs1_ ^A^ | _rd_ |
+
+|LBU latexmath:[$^\dagger$] | _rs1_ ^A^ | _rd_ |
+
+|LHU latexmath:[$^\dagger$] | _rs1_ ^A^ | _rd_ |
+
+|SB |_rs1_ ^A^, _rs2_ ^D^ | |
+
+|SH |_rs1_ ^A^, _rs2_ ^D^ | |
+
+|SW |_rs1_ ^A^, _rs2_ ^D^ | |
+
|ADDI |_rs1_ |_rd_ |
+
|SLTI |_rs1_ |_rd_ |
+
|SLTIU |_rs1_ |_rd_ |
+
|XORI |_rs1_ |_rd_ |
+
|ORI |_rs1_ |_rd_ |
+
|ANDI |_rs1_ |_rd_ |
+
|SLLI |_rs1_ |_rd_ |
+
|SRLI |_rs1_ |_rd_ |
+
|SRAI |_rs1_ |_rd_ |
+
|ADD |_rs1_, _rs2_ |_rd_ |
+
|SUB |_rs1_, _rs2_ |_rd_ |
+
|SLL |_rs1_, _rs2_ |_rd_ |
+
|SLT |_rs1_, _rs2_ |_rd_ |
+
|SLTU |_rs1_, _rs2_ |_rd_ |
+
|XOR |_rs1_, _rs2_ |_rd_ |
+
|SRL |_rs1_, _rs2_ |_rd_ |
+
|SRA |_rs1_, _rs2_ |_rd_ |
+
|OR |_rs1_, _rs2_ |_rd_ |
+
|AND |_rs1_, _rs2_ |_rd_ |
+
|FENCE | | |
+
|FENCE.I | | |
+
|ECALL | | |
+
|EBREAK | | |
-|===
-[cols="<,<,<,<,<",]
-|===
-|RV32I Base Integer Instruction Set (continued) | | | |
+|CSRRW latexmath:[$^\ddagger$] |_rs1_, _csr_^*^ |_rd_, _csr_ |^*^unless _rd_=_x0_
+
+|CSRRS latexmath:[$^\ddagger$] |_rs1_, _csr_ |_rd_^*^, _csr_ |^*^unless _rs1_=_x0_
-| |Source |Destination |Accumulating |
+|CSRRC latexmath:[$^\ddagger$] |_rs1_, _csr_ |_rd_^*^, _csr_ |^*^unless _rs1_=_x0_
-| |Registers |Registers |CSRs |
+4+|latexmath:[$\ddagger$]carries a dependency from _rs1_ to _csr_ and from _csr_ to _rd_
-|CSRRWlatexmath:[$^\ddagger$] |_rs1_, _csr_latexmath:[$^*$] |_rd_, _csr_
-| |latexmath:[$^*$]unless _rd_=`x0`
-|CSRRSlatexmath:[$^\ddagger$] |_rs1_, _csr_ |_rd_latexmath:[$^*$], _csr_
-| |latexmath:[$^*$]unless _rs1_=`x0`
+|CSRRWI latexmath:[$^\ddagger$] |_csr_ ^*^ |_rd_, _csr_ |^*^unless _rd_=_x0_
-|CSRRClatexmath:[$^\ddagger$] |_rs1_, _csr_ |_rd_latexmath:[$^*$], _csr_
-| |latexmath:[$^*$]unless _rs1_=`x0`
+|CSRRSI latexmath:[$^\ddagger$] |_csr_ |_rd_, _csr_^*^ |^*^unless uimm[4:0]=0
-| |latexmath:[$\ddagger$]carries a dependency from _rs1_ to _csr_ and
-from _csr_ to _rd_ | | |
+|CSRRCI latexmath:[$^\ddagger$] |_csr_ |_rd_, _csr_^*^ |^*^unless uimm[4:0]=0
+
+4+|latexmath:[$\ddagger$]carries a dependency from _csr_ to _rd_
|===
-[cols="<,<,<,<,<",]
+.RV64I Base Integer Instruction Set
+[%header, cols="<,<,<,<",]
|===
-|RV32I Base Integer Instruction Set (continued) | | | |
+||Source Registers |Destination Registers |Accumulating CSRs
-| |Source |Destination |Accumulating |
+|_LWU_ latexmath:[$^\dagger$] |_rs1_ ^A^ |_rd_ |
-| |Registers |Registers |CSRs |
+|_LD_ latexmath:[$^\dagger$] |_rs1_ ^A^ |_rd_ |
-|CSRRWIlatexmath:[$^\ddagger$] |_csr_latexmath:[$^*$] |_rd_, _csr_ |
-|latexmath:[$^*$]unless _rd_=`x0`
+|SD |_rs1_ ^A^, _rs2_ ^D^ | |
-|CSRRSIlatexmath:[$^\ddagger$] |_csr_ |_rd_, _csr_latexmath:[$^*$] |
-|latexmath:[$^*$]unless uimm[4:0]=0
+|SLLI | _rs1_ | _rd_ |
-|CSRRCIlatexmath:[$^\ddagger$] |_csr_ |_rd_, _csr_latexmath:[$^*$] |
-|latexmath:[$^*$]unless uimm[4:0]=0
+|SRLI | _rs1_ | _rd_ |
-| |latexmath:[$\ddagger$]carries a dependency from _csr_ to _rd_ | | |
-|===
+|SRAI | _rs1_ | _rd_ |
-[cols="<,<,<,<",]
-|===
-|*RV64I Base Integer Instruction Set* | | |
-| |Source |Destination |Accumulating
-| |Registers |Registers |CSRs
-|LWUlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$] |_rd_ |
-|LDlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$] |_rd_ |
-|SD |_rs1_latexmath:[$^A$], _rs2_latexmath:[$^D$] | |
-|SLLI |_rs1_ |_rd_ |
-|SRLI |_rs1_ |_rd_ |
-|SRAI |_rs1_ |_rd_ |
-|ADDIW |_rs1_ |_rd_ |
-|SLLIW |_rs1_ |_rd_ |
-|SRLIW |_rs1_ |_rd_ |
-|SRAIW |_rs1_ |_rd_ |
-|ADDW |_rs1_, _rs2_ |_rd_ |
-|SUBW |_rs1_, _rs2_ |_rd_ |
-|SLLW |_rs1_, _rs2_ |_rd_ |
-|SRLW |_rs1_, _rs2_ |_rd_ |
-|SRAW |_rs1_, _rs2_ |_rd_ |
+|ADDIW | _rs1_ | _rd_ |
+
+|SLLIW | _rs1_ | _rd_ |
+
+|SRLIW | _rs1_ | _rd_ |
+
+|SRAIW | _rs1_ | _rd_ |
+
+|ADDW | _rs1_, _rs2_ |_rd_ |
+
+|SUBW | _rs1_, _rs2_ |_rd_ |
+
+|SLLW | _rs1_, _rs2_ |_rd_ |
+
+|SRLW | _rs1_, _rs2_ |_rd_ |
+
+|SRAW | _rs1_, _rs2_ |_rd_ |
|===
-[cols="<,<,<,<",]
+.RV32M Standard Extension
+[%header,cols="<,<,<,<",]
|===
-|*RV32M Standard Extension* | | |
-| |Source |Destination |Accumulating
-| |Registers |Registers |CSRs
-|MUL |_rs1_, _rs2_ |_rd_ |
-|MULH |_rs1_, _rs2_ |_rd_ |
+| |Source Registers |Destination Registers |Accumulating CSRs
+
+|MUL | _rs1_, _rs2_ |_rd_ |
+
+|MULH | _rs1_, _rs2_ |_rd_ |
+
|MULHSU |_rs1_, _rs2_ |_rd_ |
+
|MULHU |_rs1_, _rs2_ |_rd_ |
+
|DIV |_rs1_, _rs2_ |_rd_ |
+
|DIVU |_rs1_, _rs2_ |_rd_ |
+
|REM |_rs1_, _rs2_ |_rd_ |
+
|REMU |_rs1_, _rs2_ |_rd_ |
|===
-[cols="<,<,<,<",]
+.RV64M Standard Extension
+[%header, cols="<,<,<,<",]
|===
-|*RV64M Standard Extension* | | |
-| |Source |Destination |Accumulating
-| |Registers |Registers |CSRs
+||Source Registers |Destination Registers |Accumulating CSRs
+
|MULW |_rs1_, _rs2_ |_rd_ |
+
|DIVW |_rs1_, _rs2_ |_rd_ |
+
|DIVUW |_rs1_, _rs2_ |_rd_ |
+
|REMW |_rs1_, _rs2_ |_rd_ |
+
|REMUW |_rs1_, _rs2_ |_rd_ |
|===
-[cols="<,<,<,<,<",]
+.RV32A Standard Extension
+[%header,cols="<,<,<,<,<",]
|===
-|*RV32A Standard Extension* | | | |
+||Source Registers |Destination Registers |Accumulating CSRs|
-| |Source |Destination |Accumulating |
+|LR.W latexmath:[$^\dagger$] | _rs1_ ^A^ | _rd_ | |
-| |Registers |Registers |CSRs |
+|SC.W latexmath:[$^\dagger$] | _rs1_ ^A^, _rs2_ ^D^ | _rd_ ^*^ | | ^*^ if successful
-|LR.Wlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$] |_rd_ | |
+|AMOSWAP.W latexmath:[$^\dagger$] |_rs1_ ^A^, _rs2_ ^D^ |_rd_ | |
-|SC.Wlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$],
-_rs2_latexmath:[$^D$] |_rd_latexmath:[$^*$] | |latexmath:[$^*$]if
-successful
+|AMOADD.W latexmath:[$^\dagger$] |_rs1_ ^A^, _rs2_ ^D^ |_rd_ | |
-|AMOSWAP.Wlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$],
-_rs2_latexmath:[$^D$] |_rd_ | |
+|AMOXOR.W latexmath:[$^\dagger$] |_rs1_ ^A^, _rs2_ ^D^ |_rd_ | |
-|AMOADD.Wlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$],
-_rs2_latexmath:[$^D$] |_rd_ | |
+|AMOAND.W latexmath:[$^\dagger$] |_rs1_ ^A^, _rs2_ ^D^ |_rd_ | |
-|AMOXOR.Wlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$],
-_rs2_latexmath:[$^D$] |_rd_ | |
+|AMOOR.W latexmath:[$^\dagger$] |_rs1_ ^A^, _rs2_^D^ |_rd_ | |
-|AMOAND.Wlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$],
-_rs2_latexmath:[$^D$] |_rd_ | |
+|AMOMIN.W latexmath:[$^\dagger$] |_rs1_ ^A^, _rs2_ ^D^ |_rd_ | |
-|AMOOR.Wlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$],
-_rs2_latexmath:[$^D$] |_rd_ | |
+|AMOMAX.W latexmath:[$^\dagger$] |_rs1_ ^A^, _rs2_ ^D^ |_rd_ | |
-|AMOMIN.Wlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$],
-_rs2_latexmath:[$^D$] |_rd_ | |
+|AMOMINU.W latexmath:[$^\dagger$] |_rs1_ ^A^, _rs2_ ^D^ |_rd_ | |
-|AMOMAX.Wlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$],
-_rs2_latexmath:[$^D$] |_rd_ | |
-
-|AMOMINU.Wlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$],
-_rs2_latexmath:[$^D$] |_rd_ | |
-
-|AMOMAXU.Wlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$],
-_rs2_latexmath:[$^D$] |_rd_ | |
+|AMOMAXU.W latexmath:[$^\dagger$] |_rs1_ ^A^, _rs2_ ^D^ |_rd_ | |
|===
-[cols="<,<,<,<,<",]
+.RV64A Standard Extension
+[%header,cols="<,<,<,<,<",]
|===
-|*RV64A Standard Extension* | | | |
-| |Source |Destination |Accumulating |
+| |Source Registers |Destination Registers |Accumulating CSRs|
-| |Registers |Registers |CSRs |
+|LR.D latexmath:[$^\dagger$] |_rs1_ ^A^ |_rd_ | |
-|LR.Dlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$] |_rd_ | |
-
-|SC.Dlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$],
-_rs2_latexmath:[$^D$] |_rd_latexmath:[$^*$] | |latexmath:[$^*$]if
+|SC.D latexmath:[$^\dagger$] |_rs1_ ^A^, _rs2_ ^D^ |_rd_ ^*^ | |^*^if
successful
-|AMOSWAP.Dlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$],
-_rs2_latexmath:[$^D$] |_rd_ | |
+|AMOSWAP.D latexmath:[$^\dagger$] |_rs1_ ^A^, _rs2_ ^D^ |_rd_ | |
-|AMOADD.Dlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$],
-_rs2_latexmath:[$^D$] |_rd_ | |
+|AMOADD.D latexmath:[$^\dagger$] |_rs1_ ^A^, _rs2_ ^D^ |_rd_ | |
-|AMOXOR.Dlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$],
-_rs2_latexmath:[$^D$] |_rd_ | |
+|AMOXOR.D latexmath:[$^\dagger$] |_rs1_ ^A^, _rs2_ ^D^ |_rd_ | |
-|AMOAND.Dlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$],
-_rs2_latexmath:[$^D$] |_rd_ | |
+|AMOAND.D latexmath:[$^\dagger$] |_rs1_ ^A^, _rs2_^D^ |_rd_ | |
-|AMOOR.Dlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$],
-_rs2_latexmath:[$^D$] |_rd_ | |
+|AMOOR.D latexmath:[$^\dagger$] |_rs1_ ^A^, _rs2_^D^ |_rd_ | |
-|AMOMIN.Dlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$],
-_rs2_latexmath:[$^D$] |_rd_ | |
+|AMOMIN.D latexmath:[$^\dagger$] |_rs1_ ^A^, _rs2_^D^ |_rd_ | |
-|AMOMAX.Dlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$],
-_rs2_latexmath:[$^D$] |_rd_ | |
+|AMOMAX.D latexmath:[$^\dagger$] |_rs1_ ^A^, _rs2_^D^ |_rd_ | |
-|AMOMINU.Dlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$],
-_rs2_latexmath:[$^D$] |_rd_ | |
+|AMOMINU.D latexmath:[$^\dagger$] |_rs1_ ^A^, _rs2_^D^ |_rd_ | |
-|AMOMAXU.Dlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$],
-_rs2_latexmath:[$^D$] |_rd_ | |
+|AMOMAXU.D latexmath:[$^\dagger$] |_rs1_ ^A^, _rs2_^D^ |_rd_ | |
|===
+.RV32F Standard Extension
[cols="<,<,<,<,<",]
|===
-|*RV32F Standard Extension* | | | |
-| |Source |Destination |Accumulating |
+| |Source Registers |Destination Registers |Accumulating CSRs |
-| |Registers |Registers |CSRs |
-|FLWlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$] |_rd_ | |
+|FLW latexmath:[$^\dagger$] |_rs1_ ^A^ |_rd_ | |
-|FSW |_rs1_latexmath:[$^A$], _rs2_latexmath:[$^D$] | | |
+|FSW |_rs1_ ^A^, _rs2_^D^ | | |
-|FMADD.S |_rs1_, _rs2_, _rs3_, frmlatexmath:[$^*$] |_rd_ |NV, OF, UF, NX
-|latexmath:[$^*$]if rm=111
+|FMADD.S |_rs1_, _rs2_, _rs3_, frm^*^ |_rd_ |NV, OF, UF, NX |^*^if rm=111
-|FMSUB.S |_rs1_, _rs2_, _rs3_, frmlatexmath:[$^*$] |_rd_ |NV, OF, UF, NX
-|latexmath:[$^*$]if rm=111
+|FMSUB.S |_rs1_, _rs2_, _rs3_, frm^*^ |_rd_ |NV, OF, UF, NX |^*^if rm=111
-|FNMSUB.S |_rs1_, _rs2_, _rs3_, frmlatexmath:[$^*$] |_rd_ |NV, OF, UF,
-NX |latexmath:[$^*$]if rm=111
+|FNMSUB.S |_rs1_, _rs2_, _rs3_, frm^*^ |_rd_ |NV, OF, UF, NX |^*^if rm=111
-|FNMADD.S |_rs1_, _rs2_, _rs3_, frmlatexmath:[$^*$] |_rd_ |NV, OF, UF,
-NX |latexmath:[$^*$]if rm=111
+|FNMADD.S |_rs1_, _rs2_, _rs3_, frm^*^ |_rd_ |NV, OF, UF, NX |^*^if rm=111
-|FADD.S |_rs1_, _rs2_, frmlatexmath:[$^*$] |_rd_ |NV, OF, NX
-|latexmath:[$^*$]if rm=111
+|FADD.S |_rs1_, _rs2_, frm^*^ |_rd_ |NV, OF, NX |^*^if rm=111
-|FSUB.S |_rs1_, _rs2_, frmlatexmath:[$^*$] |_rd_ |NV, OF, NX
-|latexmath:[$^*$]if rm=111
+|FSUB.S |_rs1_, _rs2_, frm^*^ |_rd_ |NV, OF, NX |^*^if rm=111
-|FMUL.S |_rs1_, _rs2_, frmlatexmath:[$^*$] |_rd_ |NV, OF, UF, NX
-|latexmath:[$^*$]if rm=111
+|FMUL.S |_rs1_, _rs2_, frm^*^ |_rd_ |NV, OF, UF, NX |^*^if rm=111
-|FDIV.S |_rs1_, _rs2_, frmlatexmath:[$^*$] |_rd_ |NV, DZ, OF, UF, NX
-|latexmath:[$^*$]if rm=111
+|FDIV.S |_rs1_, _rs2_, frm^*^ |_rd_ |NV, DZ, OF, UF, NX |^*^if rm=111
-|FSQRT.S |_rs1_, frmlatexmath:[$^*$] |_rd_ |NV, NX |latexmath:[$^*$]if
-rm=111
+|FSQRT.S |_rs1_, frm^*^ |_rd_ |NV, NX |^*^if rm=111
|FSGNJ.S |_rs1_, _rs2_ |_rd_ | |
@@ -671,11 +691,9 @@ rm=111
|FMAX.S |_rs1_, _rs2_ |_rd_ |NV |
-|FCVT.W.S |_rs1_, frmlatexmath:[$^*$] |_rd_ |NV, NX |latexmath:[$^*$]if
-rm=111
+|FCVT.W.S |_rs1_, frm^*^ |_rd_ |NV, NX |^*^if rm=111
-|FCVT.WU.S |_rs1_, frmlatexmath:[$^*$] |_rd_ |NV, NX |latexmath:[$^*$]if
-rm=111
+|FCVT.WU.S |_rs1_, frm^*^ |_rd_ |NV, NX |^*^if rm=111
|FMV.X.W |_rs1_ |_rd_ | |
@@ -687,76 +705,57 @@ rm=111
|FCLASS.S |_rs1_ |_rd_ | |
-|FCVT.S.W |_rs1_, frmlatexmath:[$^*$] |_rd_ |NX |latexmath:[$^*$]if
-rm=111
+|FCVT.S.W |_rs1_, frm^*^ |_rd_ |NX |^*^if rm=111
-|FCVT.S.WU |_rs1_, frmlatexmath:[$^*$] |_rd_ |NX |latexmath:[$^*$]if
-rm=111
+|FCVT.S.WU |_rs1_, frm^*^ |_rd_ |NX |^*^if rm=111
|FMV.W.X |_rs1_ |_rd_ | |
|===
-[cols="<,<,<,<,<",]
+.RV64F Standard Extension
+[%header,cols="<,<,<,<,<",]
|===
-|*RV64F Standard Extension* | | | |
+| |Source Registers |Destination Registers |Accumulating CSRs|
-| |Source |Destination |Accumulating |
+|FCVT.L.S |_rs1_, frm^*^ |_rd_ |NV, NX |^*^if rm=111
-| |Registers |Registers |CSRs |
+|FCVT.LU.S |_rs1_, frm^*^ |_rd_ |NV, NX |^*^if rm=111
-|FCVT.L.S |_rs1_, frmlatexmath:[$^*$] |_rd_ |NV, NX |latexmath:[$^*$]if
-rm=111
+|FCVT.S.L |_rs1_, frm^*^ |_rd_ |NX |^*^if rm=111
-|FCVT.LU.S |_rs1_, frmlatexmath:[$^*$] |_rd_ |NV, NX |latexmath:[$^*$]if
-rm=111
-
-|FCVT.S.L |_rs1_, frmlatexmath:[$^*$] |_rd_ |NX |latexmath:[$^*$]if
-rm=111
-
-|FCVT.S.LU |_rs1_, frmlatexmath:[$^*$] |_rd_ |NX |latexmath:[$^*$]if
-rm=111
+|FCVT.S.LU |_rs1_, frm^*^ |_rd_ |NX |^*^if rm=111
|===
-[cols="<,<,<,<,<",]
+.RV32D Standard Extension
+[%header,cols="<,<,<,<,<",]
|===
-|*RV32D Standard Extension* | | | |
-| |Source |Destination |Accumulating |
+| |Source Registers |Destination Registers |Accumulating CSRs |
-| |Registers |Registers |CSRs |
-|FLDlatexmath:[$^\dagger$] |_rs1_latexmath:[$^A$] |_rd_ | |
+|FLD latexmath:[$^\dagger$] |_rs1_ ^A^ |_rd_ | |
-|FSD |_rs1_latexmath:[$^A$], _rs2_latexmath:[$^D$] | | |
+|FSD |_rs1_ ^A^, _rs2_^D^ | | |
-|FMADD.D |_rs1_, _rs2_, _rs3_, frmlatexmath:[$^*$] |_rd_ |NV, OF, UF, NX
-|latexmath:[$^*$]if rm=111
+|FMADD.D |_rs1_, _rs2_, _rs3_, frm^*^ |_rd_ |NV, OF, UF, NX |^*^if rm=111
-|FMSUB.D |_rs1_, _rs2_, _rs3_, frmlatexmath:[$^*$] |_rd_ |NV, OF, UF, NX
-|latexmath:[$^*$]if rm=111
+|FMSUB.D |_rs1_, _rs2_, _rs3_, frm^*^ |_rd_ |NV, OF, UF, NX |^*^if rm=111
-|FNMSUB.D |_rs1_, _rs2_, _rs3_, frmlatexmath:[$^*$] |_rd_ |NV, OF, UF,
-NX |latexmath:[$^*$]if rm=111
+|FNMSUB.D |_rs1_, _rs2_, _rs3_, frm^*^ |_rd_ |NV, OF, UF, NX |^*^if rm=111
-|FNMADD.D |_rs1_, _rs2_, _rs3_, frmlatexmath:[$^*$] |_rd_ |NV, OF, UF,
-NX |latexmath:[$^*$]if rm=111
+|FNMADD.D |_rs1_, _rs2_, _rs3_, frm^*^ |_rd_ |NV, OF, UF, NX |^*^if rm=111
-|FADD.D |_rs1_, _rs2_, frmlatexmath:[$^*$] |_rd_ |NV, OF, NX
-|latexmath:[$^*$]if rm=111
+|FADD.D |_rs1_, _rs2_, frm^*^ |_rd_ |NV, OF, NX |^*^if rm=111
-|FSUB.D |_rs1_, _rs2_, frmlatexmath:[$^*$] |_rd_ |NV, OF, NX
-|latexmath:[$^*$]if rm=111
+|FSUB.D |_rs1_, _rs2_, frm^*^ |_rd_ |NV, OF, NX |^*^if rm=111
-|FMUL.D |_rs1_, _rs2_, frmlatexmath:[$^*$] |_rd_ |NV, OF, UF, NX
-|latexmath:[$^*$]if rm=111
+|FMUL.D |_rs1_, _rs2_, frm^*^ |_rd_ |NV, OF, UF, NX |^*^if rm=111
-|FDIV.D |_rs1_, _rs2_, frmlatexmath:[$^*$] |_rd_ |NV, DZ, OF, UF, NX
-|latexmath:[$^*$]if rm=111
+|FDIV.D |_rs1_, _rs2_, frm^*^ |_rd_ |NV, DZ, OF, UF, NX |^*^if rm=111
-|FSQRT.D |_rs1_, frmlatexmath:[$^*$] |_rd_ |NV, NX |latexmath:[$^*$]if
-rm=111
+|FSQRT.D |_rs1_, frm^*^ |_rd_ |NV, NX |^*^if rm=111
|FSGNJ.D |_rs1_, _rs2_ |_rd_ | |
@@ -768,8 +767,7 @@ rm=111
|FMAX.D |_rs1_, _rs2_ |_rd_ |NV |
-|FCVT.S.D |_rs1_, frmlatexmath:[$^*$] |_rd_ |NV, OF, UF, NX
-|latexmath:[$^*$]if rm=111
+|FCVT.S.D |_rs1_, frm^*^ |_rd_ |NV, OF, UF, NX |^*^if rm=111
|FCVT.D.S |_rs1_ |_rd_ |NV |
@@ -781,11 +779,9 @@ rm=111
|FCLASS.D |_rs1_ |_rd_ | |
-|FCVT.W.D |_rs1_, frmlatexmath:[$^*$] |_rd_ |NV, NX |latexmath:[$^*$]if
-rm=111
+|FCVT.W.D |_rs1_, frm^*^ |_rd_ |NV, NX |^*^if rm=111
-|FCVT.WU.D |_rs1_, frmlatexmath:[$^*$] |_rd_ |NV, NX |latexmath:[$^*$]if
-rm=111
+|FCVT.WU.D |_rs1_, frm^*^ |_rd_ |NV, NX |^*^if rm=111
|FCVT.D.W |_rs1_ |_rd_ | |
@@ -793,27 +789,21 @@ rm=111
|===
-[cols="<,<,<,<,<",]
+.RV64D Standard Extension
+[%header,cols="<,<,<,<,<",]
|===
-|*RV64D Standard Extension* | | | |
-
-| |Source |Destination |Accumulating |
-| |Registers |Registers |CSRs |
+| |Source Registers |Destination Registers |Accumulating CSRs |
-|FCVT.L.D |_rs1_, frmlatexmath:[$^*$] |_rd_ |NV, NX |latexmath:[$^*$]if
-rm=111
+|FCVT.L.D |_rs1_, frm^*^ |_rd_ |NV, NX |^*^if rm=111
-|FCVT.LU.D |_rs1_, frmlatexmath:[$^*$] |_rd_ |NV, NX |latexmath:[$^*$]if
-rm=111
+|FCVT.LU.D |_rs1_, frm^*^ |_rd_ |NV, NX |^*^if rm=111
|FMV.X.D |_rs1_ |_rd_ | |
-|FCVT.D.L |_rs1_, frmlatexmath:[$^*$] |_rd_ |NX |latexmath:[$^*$]if
-rm=111
+|FCVT.D.L |_rs1_, frm^*^ |_rd_ |NX |^*^if rm=111
-|FCVT.D.LU |_rs1_, frmlatexmath:[$^*$] |_rd_ |NX |latexmath:[$^*$]if
-rm=111
+|FCVT.D.LU |_rs1_, frm^*^ |_rd_ |NX |^*^if rm=111
|FMV.D.X |_rs1_ |_rd_ | |