author Daniel Lustig <dlustig@nvidia.com> 2021-10-13 16:56:13 -0400
committer Daniel Lustig <dlustig@nvidia.com> 2021-10-13 16:56:13 -0400
commit 500d45193cddf6b6324fb3cf2224e2b8b176dbae
tree 716fbc7d060c491e99eef65c45248563d6c4b32d
parent 42bd6ecc07f48b424dc46851b70e6cc5f3140ff2
parent 399c9a759eb4540a65c60e2cc236164821ff2346
Merge branch 'master' into virtual-memory
-rw-r--r--  .travis.yml  3
-rw-r--r--  README.md  6
-rw-r--r--  build/Makefile  8
-rw-r--r--  marchid.md  2
-rw-r--r--  src/a.tex  5
-rw-r--r--  src/c.tex  5
-rw-r--r--  src/csr.tex  28
-rw-r--r--  src/f.tex  3
-rw-r--r--  src/hypervisor.tex  422
-rw-r--r--  src/instr-table.tex  450
-rw-r--r--  src/intro.tex  7
-rw-r--r--  src/l.tex  20
-rw-r--r--  src/machine.tex  555
-rw-r--r--  src/memory.tex  12
-rw-r--r--  src/n.tex  235
-rw-r--r--  src/naming.tex  5
-rw-r--r--  src/preface.tex  15
-rw-r--r--  src/priv-csrs.tex  62
-rw-r--r--  src/priv-instr-table.tex  10
-rw-r--r--  src/priv-preface.tex  33
-rw-r--r--  src/riscv-privileged.tex  7
-rw-r--r--  src/riscv-spec.bib  17
-rw-r--r--  src/riscv-spec.tex  6
-rw-r--r--  src/rv32.tex  28
-rw-r--r--  src/rvwmo.tex  2
-rw-r--r--  src/supervisor.tex  276
-rw-r--r--  src/t.tex  16
-rw-r--r--  src/zfh.tex  422
-rw-r--r--  src/zfinx.tex  159
-rw-r--r--  src/zihintpause.tex  2
30 files changed, 2155 insertions, 666 deletions
diff --git a/.travis.yml b/.travis.yml
index b72e692..f87d851 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -1,3 +1,6 @@
+branches:
+ only:
+ - master
dist: focal
before_install:
- sudo apt-get -qq update && sudo apt-get install -y --no-install-recommends texlive-fonts-recommended texlive-latex-extra texlive-fonts-extra dvipng texlive-latex-recommended
diff --git a/README.md b/README.md
index 4fbb50c..ff2e0cc 100644
--- a/README.md
+++ b/README.md
@@ -1,9 +1,9 @@
-RISC-V Instruction Set Manual [![Build Status](https://travis-ci.org/riscv/riscv-isa-manual.svg?branch=master)](https://travis-ci.org/riscv/riscv-isa-manual)
+RISC-V Instruction Set Manual [![Build Status](https://travis-ci.com/riscv/riscv-isa-manual.svg?branch=master)](https://travis-ci.com/riscv/riscv-isa-manual)
=============================
This repository contains the LaTeX source for the draft RISC-V Instruction Set
-Manual. At the time of this writing, none of these specifications have been
-formally adopted by the RISC-V Foundation.
+Manual. The preface of each document indicates the version of each
+standard that has been formally ratified by RISC-V International.
This work is licensed under a Creative Commons Attribution 4.0 International
License. See the LICENSE file for details.
diff --git a/build/Makefile b/build/Makefile
index 8b895f2..7a64a4c 100644
--- a/build/Makefile
+++ b/build/Makefile
@@ -8,19 +8,19 @@
# build directory, copy the makefile there, and change the srcdir
# variable accordingly.
#
-# Note that the makefile assumes that the default dvips/ps2pdfwr
-# commands "do the right thing" for fonts in pdfs. This is true on
+# Note that the makefile assumes that the default dvips/ps2pdfwr
+# commands "do the right thing" for fonts in pdfs. This is true on
# Athena/Linux and Fedora Core but is not true for older redhat installs ...
#
# At a minimum you should just change the main variable to be
-# the basename of your toplevel tex file. If you use a bibliography
+# the basename of your toplevel tex file. If you use a bibliography
# then you should set the bibfile variable to be the name of your
# .bib file (assumed to be in the source directory).
#
srcdir = ../src
-docs_with_bib = riscv-spec riscv-privileged
+docs_with_bib = riscv-spec riscv-privileged
docs_without_bib =
srcs = $(wildcard $(srcdir)/*.tex)
diff --git a/marchid.md b/marchid.md
index 655da50..64bcb7e 100644
--- a/marchid.md
+++ b/marchid.md
@@ -42,3 +42,5 @@ CV32E40S | OpenHW Group | [Arjan Bink](mailto:arjan.bink
Ibex | lowRISC | [lowRISC Hardware Team](mailto:hardware@lowrisc.org) | 22 | https://github.com/lowRISC/ibex
RudolV | Jörg Mische | [Jörg Mische](mailto:bobbl@gmx.de) | 23 | https://github.com/bobbl/rudolv
Steel Core | Rafael Calcada | [Rafael Calcada](mailto:rafaelcalcada@gmail.com) | 24 | https://github.com/rafaelcalcada/steel-core
+XiangShan | ICT, CAS | [XiangShan Team](mailto:xiangshan-all@ict.ac.cn) | 25 | https://github.com/OpenXiangShan/XiangShan
+Hummingbirdv2 E203 | Nuclei System Technology | [Can Hu](mailto:canhu@nucleisys.com), Nuclei System Technology | 26 | https://github.com/riscv-mcu/e203_hbirdv2
diff --git a/src/a.tex b/src/a.tex
index 5d64cbc..9f50dd4 100644
--- a/src/a.tex
+++ b/src/a.tex
@@ -131,10 +131,7 @@ might cause a move away from DW-CAS.
More generally, a multi-word atomic primitive is desirable, but there is
still considerable debate about what form this should take, and
-guaranteeing forward progress adds complexity to a system. Our
-current thoughts are to include a small limited-capacity transactional
-memory buffer along the lines of the original transactional memory
-proposals as an optional standard extension ``T''.
+guaranteeing forward progress adds complexity to a system.
\end{commentary}
The failure code with value 1 is reserved to encode an unspecified
diff --git a/src/c.tex b/src/c.tex
index fc174da..92dd11c 100644
--- a/src/c.tex
+++ b/src/c.tex
@@ -221,10 +221,7 @@ the number of immediate muxes required.
The immediate fields are scrambled in the instruction formats instead
of in sequential order so that as many bits as possible are in the
same position in every instruction, thereby simplifying
-implementations. For example, immediate bits 17---10 are always sourced from
-the same instruction bit positions. Five other immediate bits (5, 4,
-3, 1, and 0) have just two source instruction bits, while four (9, 7,
-6, and 2) have three sources and one (8) has four sources.
+implementations.
\end{commentary}
For many RVC instructions, zero-valued immediates are disallowed and
diff --git a/src/csr.tex b/src/csr.tex
index 539f42e..4c7f00e 100644
--- a/src/csr.tex
+++ b/src/csr.tex
@@ -165,9 +165,10 @@ be side effects of that write.
trap, the trap is not considered a side effect of the write but merely
an indirect effect.
- The CSRs defined so far in this volume
- do not have any architectural side effects on reads or writes.
- Custom extensions might add CSRs for which accesses have side effects.
+ Standard CSRs do not have any side effects on reads.
+ Standard CSRs may have side effects on writes.
+ Custom extensions might add CSRs for which accesses have side effects
+ on either reads or writes.
\end{commentary}
Some CSRs, such as the instructions-retired counter, {\tt instret},
@@ -232,15 +233,18 @@ result, the order of CSR accesses with respect to all other accesses is
constrained by the same mechanisms that constrain the order of memory-mapped
I/O accesses to such a region.
-These CSR-ordering constraints are imposed primarily to support ordering main
-memory and memory-mapped I/O accesses with respect to reads of the {\tt time}
-CSR. With the exception of the {\tt time}, {\tt cycle}, and {\tt mcycle} CSRs,
-the CSRs defined thus far in Volumes I and II of this specification are not
-directly accessible to other harts or devices and cause no side effects visible
-to other harts or devices. Thus, accesses to CSRs other than the
-aforementioned three can be freely reordered in the global memory order
-with respect to FENCE instructions
-without violating this specification.
+These CSR-ordering constraints are imposed to support ordering main
+memory and memory-mapped I/O accesses with respect to CSR accesses that
+are visible to, or affected by, devices or other harts.
+Examples include the {\tt time}, {\tt cycle}, and {\tt mcycle}
+CSRs, in addition to CSRs that reflect pending interrupts, like {\tt mip} and
+{\tt sip}.
+Note that implicit reads of such CSRs (e.g., taking an interrupt because of
+a change in {\tt mip}) are also ordered as device input.
+
+Most CSRs (including, e.g., the {\tt fcsr}) are not visible to other harts;
+their accesses can be freely reordered in the global memory order with respect
+to FENCE instructions without violating this specification.
\end{commentary}
The hardware platform may define that accesses to certain CSRs are
diff --git a/src/f.tex b/src/f.tex
index 81545fd..4152ae5 100644
--- a/src/f.tex
+++ b/src/f.tex
@@ -147,8 +147,7 @@ the three least-significant bits of integer register {\em rs1} into
{\tt frm}. FRFLAGS and FSFLAGS are defined analogously for the
Accrued Exception Flags field {\tt fflags}.
-Bits 31--8 of the {\tt fcsr} are reserved for other standard extensions,
-including the ``L'' standard extension for decimal floating-point. If
+Bits 31--8 of the {\tt fcsr} are reserved for other standard extensions. If
these extensions are not present, implementations shall ignore writes to
these bits and supply a zero value when read. Standard software should
preserve the contents of these bits.
diff --git a/src/hypervisor.tex b/src/hypervisor.tex
index e93b803..40ebec8 100644
--- a/src/hypervisor.tex
+++ b/src/hypervisor.tex
@@ -1,8 +1,11 @@
-\chapter{Hypervisor Extension, Version 0.6.1}
+\chapter{Hypervisor Extension, Version 1.0.0-rc}
\label{hypervisor}
-{\bf Warning! This draft specification may change before being
-accepted as standard by the RISC-V Foundation.}
+This chapter is in the Frozen state.
+A substantive change that is not backward-compatible is highly
+unlikely, and will occur only as the result of some truly critical
+issue being identified.
+For more info see: \texttt{http://riscv.org/spec-state}.
This chapter describes the RISC-V hypervisor extension, which virtualizes the
supervisor-level architecture to support the efficient hosting of guest
@@ -24,6 +27,12 @@ In HS-mode, an OS or hypervisor interacts with the machine through the same
SBI as an OS normally does from S-mode. An HS-mode hypervisor is expected to
implement the SBI for its VS-mode guest.
+The hypervisor extension depends on an ``I'' base integer ISA with
+32 {\tt x} registers (RV32I or RV64I), not RV32E, which has only
+16 {\tt x} registers.
+CSR {\tt mtval} must not be hardwired to zero, and
+{\tt satp}.MODE must not be hardwired to Bare.
+
The hypervisor extension is enabled by setting bit 7 in the {\tt misa} CSR,
which corresponds to the letter H.
RISC-V harts that implement the hypervisor extension are encouraged
@@ -53,34 +62,37 @@ When V=1, the hart is either in virtual S-mode (VS-mode), or in virtual U-mode
When V=0, the hart is either in M-mode, in HS-mode, or in U-mode atop an OS
running in HS-mode.
The virtualization mode also indicates whether two-stage address translation
-is active (V=1) or inactive (V=0). Table~\ref{h-operating-modes} lists the
-possible operating modes of a RISC-V hart with the hypervisor extension.
+is active (V=1) or inactive (V=0). Table~\ref{tab:HPrivModes} lists the
+possible privilege modes of a RISC-V hart with the hypervisor extension.
\begin{table*}[h!]
\begin{center}
\begin{tabular}{|c|c||l|l|l|}
\hline
- Virtualization & Privilege & \multirow{2}{*}{Abbreviation} & \multirow{2}{*}{Name} & Two-Stage \\
- Mode (V) & Encoding & & & Translation \\ \hline
- 0 & 0 & U-mode & User mode & Off \\
- 0 & 1 & HS-mode & Hypervisor-extended supervisor mode & Off \\
- 0 & 3 & M-mode & Machine mode & Off \\
+ Virtualization & Nominal & \multirow{2}{*}{Abbreviation} & \multirow{2}{*}{Name} & Two-Stage \\
+ Mode (V) & Privilege & & & Translation \\ \hline
+ 0 & U & U-mode & User mode & Off \\
+ 0 & S & HS-mode & Hypervisor-extended supervisor mode & Off \\
+ 0 & M & M-mode & Machine mode & Off \\
\hline
- 1 & 0 & VU-mode & Virtual user mode & On \\
- 1 & 1 & VS-mode & Virtual supervisor mode & On \\
+ 1 & U & VU-mode & Virtual user mode & On \\
+ 1 & S & VS-mode & Virtual supervisor mode & On \\
\hline
\end{tabular}
\end{center}
-\caption{Operating modes with the hypervisor extension.}
-\label{h-operating-modes}
+\caption{Privilege modes with the hypervisor extension.}
+\label{tab:HPrivModes}
\end{table*}
-For purposes of interrupt global enables, HS-mode is considered more privileged
-than VS-mode, and VS-mode is considered more privileged than VU-mode.
+For privilege modes U and VU, the \textit{nominal privilege mode} is~U,
+and for privilege modes HS and VS, the nominal privilege mode is~S.
+
+HS-mode is more privileged
+than VS-mode, and VS-mode is more privileged than VU-mode.
VS-mode interrupts are globally disabled when executing in U-mode.
\begin{commentary}
-This description does not consider the possibility of U-mode or VU-mode interrupts and will be revised if the N extension for user-level interrupts is ultimately adopted.
+This description does not consider the possibility of U-mode or VU-mode interrupts and will be revised if an extension for user-level interrupts is adopted.
\end{commentary}
\section{Hypervisor and Virtual Supervisor CSRs}
@@ -90,7 +102,7 @@ interrupt, and address-translation subsystems.
Additional CSRs are provided to HS-mode, but not to VS-mode, to manage
two-stage address translation and to control the behavior of a VS-mode guest:
{\tt hstatus}, {\tt hedeleg}, {\tt hideleg}, {\tt hvip}, {\tt hip}, {\tt hie},
-{\tt hgeip}, {\tt hgeie},
+{\tt hgeip}, {\tt hgeie}, {\tt henvcfg}, {\tt henvcfgh},
{\tt hcounteren}, {\tt htimedelta}, {\tt htimedeltah}, {\tt htval},
{\tt htinst}, and {\tt hgatp}.
@@ -116,8 +128,9 @@ do so.
Conversely, when V=0, the VS CSRs do not ordinarily affect the behavior of
the machine other than being readable and writable by CSR instructions.
-A few standard supervisor CSRs ({\tt scounteren} and, if the N extension
-is implemented, {\tt sedeleg} and {\tt sideleg}) have no matching VS CSR.
+Some standard supervisor CSRs ({\tt senvcfg},
+{\tt scounteren}, and {\tt scontext},
+possibly others) have no matching VS CSR.
These supervisor CSRs continue to have their usual function and
accessibility even when V=1, except with VS-mode and VU-mode substituting for
HS-mode and U-mode.
@@ -187,7 +200,7 @@ for tracking and controlling the exception behavior of a VS-mode guest.
\end{center}
}
\vspace{-0.1in}
-\caption{Hypervisor status register ({\tt hstatus}) for RV32.}
+\caption{Hypervisor status register ({\tt hstatus}) when HSXLEN=32.}
\label{hstatusreg-rv32}
\end{figure*}
@@ -244,7 +257,7 @@ HSXLEN-34 & 2 & 9 & 1 & 1 & 1 & \\
\end{center}
}
\vspace{-0.1in}
-\caption{Hypervisor status register ({\tt hstatus}) for RV64.}
+\caption{Hypervisor status register ({\tt hstatus}) when HSXLEN=64.}
\label{hstatusreg}
\end{figure*}
@@ -299,13 +312,14 @@ a virtual machine's memory.
\end{commentary}
The SPV bit (Supervisor Previous Virtualization mode) is written by the implementation
-whenever a trap is taken into HS-mode. Just as the SPP bit in {\tt sstatus} is set to the privilege
+whenever a trap is taken into HS-mode.
+Just as the SPP bit in {\tt sstatus} is set to the (nominal) privilege
mode at the time of the trap, the SPV bit in {\tt hstatus} is set to the value of the virtualization
mode V at the time of the trap. When an SRET instruction is executed when V=0,
V is set to SPV.
When V=1 and a trap is taken into HS-mode, bit SPVP (Supervisor Previous
-Virtual Privilege) is set to the privilege mode at the time of the trap,
+Virtual Privilege) is set to the nominal privilege mode at the time of the trap,
the same as {\tt sstatus}.SPP.
But if V=0 before a trap, SPVP is left unchanged on trap entry.
SPVP controls the effective privilege of explicit memory accesses made by
@@ -322,12 +336,14 @@ HS-mode and U-mode.
Field GVA (Guest Virtual Address) is written by the implementation
whenever a trap is taken into HS-mode.
-For any trap (access fault, page fault, or guest-page fault) that writes
+For any trap (breakpoint, address misaligned,
+access fault, page fault, or guest-page fault) that writes
a guest virtual address to {\tt stval}, GVA is set to~1.
For any other trap into HS-mode, GVA is set to~0.
\begin{commentary}
-For memory faults, GVA is redundant with field SPV (the two bits are set
+For breakpoint and memory access traps,
+GVA is redundant with field SPV (the two bits are set
the same) except when the explicit memory access of an HLV, HLVX, or HSV
instruction causes a fault.
In that case, SPV=0 but GVA=1.
@@ -403,6 +419,7 @@ Bit & Attribute & Corresponding Exception \\
7 & Writable & Store/AMO access fault \\
8 & Writable & Environment call from U-mode or VU-mode \\
9 & Read-only 0 & Environment call from HS-mode \\
+10 & Read-only 0 & Environment call from VS-mode \\
11 & Read-only 0 & Environment call from M-mode \\
12 & Writable & Instruction page fault \\
13 & Writable & Load page fault \\
@@ -456,8 +473,7 @@ interrupt causes (codes 16 and above).
Register {\tt hvip} is an HSXLEN-bit read/write register that a
hypervisor can write to indicate virtual interrupts intended for VS-mode.
-The bit positions writable in {\tt hideleg} shall also be writable in
-{\tt hvip}, and the other bits of {\tt hvip} shall be hardwired to zeros.
+Bits of {\tt hvip} that are not writable are hardwired to zeros.
\begin{figure}[h!]
{\footnotesize
@@ -478,6 +494,7 @@ HSXLEN \\
The standard portion (bits 15:0) of {\tt hvip} is formatted as shown in
Figure~\ref{hvipreg-standard}.
+Bits VSEIP, VSTIP, and VSSIP of {\tt hvip} are writable.
Setting VSEIP=1 in {\tt hvip} asserts a VS-level external interrupt;
setting VSTIP asserts a VS-level timer interrupt; and setting VSSIP
asserts a VS-level software interrupt.
@@ -516,9 +533,6 @@ Registers {\tt hip} and {\tt hie} are HSXLEN-bit read/write registers
that supplement HS-level's {\tt sip} and {\tt sie} respectively.
The {\tt hip} register indicates pending VS-level and hypervisor-specific
interrupts, while {\tt hie} contains enable bits for the same interrupts.
-As with {\tt sip} and {\tt sie}, an interrupt \textit{i} will be taken in
-HS-mode if bit~\textit{i} is set in both {\tt hip} and {\tt hie}, and if
-supervisor-level interrupts are globally enabled.
\begin{figure}[h!]
{\footnotesize
@@ -566,6 +580,15 @@ software to emulate the hypervisor extension on platforms that do not
implement it in hardware.
\end{commentary}
+An interrupt~\textit{i} will trap to HS-mode whenever all of the
+following are true:
+(a)~either the current operating mode is HS-mode and the SIE bit in the
+{\tt sstatus} register is set, or the current operating mode has less
+privilege than HS-mode;
+(b)~bit~\textit{i} is set in both {\tt sip} and {\tt sie}, or in both
+{\tt hip} and {\tt hie}; and
+(c)~bit~\textit{i} is not set in {\tt hideleg}.
+
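Editorial sketch (not part of the patch): conditions (a) through (c) in the added paragraph can be read as a single predicate. The Python below is a minimal illustration under stated assumptions: the function name, the string mode labels, and the integer bitmask arguments are hypothetical conveniences, not anything defined by the specification.

```python
# Hypothetical helper: evaluates conditions (a)-(c) for interrupt number i.
# CSR values are passed in as plain integers; mode is one of
# "M", "HS", "U", "VS", "VU" (labels chosen for this sketch only).

LESS_PRIVILEGED_THAN_HS = {"U", "VS", "VU"}

def traps_to_hs(i, mode, sstatus_sie, sip, sie, hip, hie, hideleg):
    bit = 1 << i
    # (a) HS-mode with sstatus.SIE set, or a mode with less privilege than HS-mode.
    globally_enabled = (mode == "HS" and bool(sstatus_sie)) or (
        mode in LESS_PRIVILEGED_THAN_HS
    )
    # (b) bit i set in both sip and sie, or in both hip and hie.
    pending_and_enabled = bool(sip & sie & bit) or bool(hip & hie & bit)
    # (c) bit i not set in hideleg.
    not_delegated = not (hideleg & bit)
    return globally_enabled and pending_and_enabled and not_delegated
```

For instance, with bit 9 set in both `sip` and `sie` and clear in `hideleg`, the predicate holds in HS-mode only when `sstatus_sie` is true, and it never holds in M-mode.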
If bit~\textit{i} of {\tt sie} is hardwired to zero, the same bit in
register {\tt hip} may be writable or may be read-only.
When bit~\textit{i} in {\tt hip} is writable, a pending interrupt
@@ -790,6 +813,102 @@ cause a supervisor-level (HS-level) guest external interrupt.
The enable bits in {\tt hgeie} do not affect the VS-level external
interrupt signal selected from {\tt hgeip} by {\tt hstatus}.VGEIN.
+\subsection{%
+ Hypervisor Environment Configuration Registers
+ ({\tt henvcfg} and {\tt henvcfgh})%
+}
+
+The {\tt henvcfg} CSR is an HSXLEN-bit read/write register,
+formatted for HSXLEN=64 as shown in Figure~\ref{fig:henvcfg},
+that controls certain
+characteristics of the execution environment when virtualization mode
+V=1.
+
+\begin{figure}[h!]
+{\footnotesize
+\begin{center}
+\begin{tabular}{c@{}Kcc@{}W@{}Wc}
+\instbit{63} &
+\instbitrange{62}{8} &
+\instbit{7} &
+\instbit{6} &
+\instbitrange{5}{4} &
+\instbitrange{3}{1} &
+\instbit{0} \\
+\hline
+\multicolumn{1}{|c|}{VSTCE} &
+\multicolumn{1}{c|}{\wpri} &
+\multicolumn{1}{c|}{CBZE} &
+\multicolumn{1}{c|}{CBCFE} &
+\multicolumn{1}{c|}{CBIE} &
+\multicolumn{1}{c|}{\wpri} &
+\multicolumn{1}{c|}{FIOM} \\
+\hline
+1 & 55 & 1 & 1 & 2 & 3 & 1 \\
+\end{tabular}
+\end{center}
+}
+\vspace{-0.1in}
+\caption{Hypervisor environment configuration register ({\tt henvcfg}) for HSXLEN=64.}
+\label{fig:henvcfg}
+\end{figure}
+
+If bit FIOM (Fence of I/O implies Memory) is set to one in
+{\tt henvcfg}, FENCE instructions executed when V=1 are modified
+so the requirement to order accesses to device I/O implies also the
+requirement to order main memory accesses.
+Table~\ref{tab:henvcfg-FIOM} details the modified interpretation of
+FENCE instruction bits PI, PO, SI, and SO when FIOM=1 and V=1.
+
+Similarly, when FIOM=1 and V=1,
+if an atomic instruction that accesses a region ordered as device I/O
+has its {\em aq} and/or {\em rl} bit set, then that instruction is ordered
+as though it accesses both device I/O and memory.
+
+\begin{table}[h!]
+\begin{center}
+\begin{tabular}{|c|l|}
+\hline
+Instruction bit & Meaning when set \\
+\hline
+PI & Predecessor device input and memory reads (PR implied) \\
+PO & Predecessor device output and memory writes (PW implied) \\
+\hline
+SI & Successor device input and memory reads (SR implied) \\
+SO & Successor device output and memory writes (SW implied) \\
+\hline
+\end{tabular}
+\end{center}
+\vspace{-0.1in}
+\caption{%
+Modified interpretation of FENCE predecessor and successor sets when
+FIOM=1 and virtualization mode V=1.%
+}
+\label{tab:henvcfg-FIOM}
+\end{table}
+
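Editorial sketch (not part of the patch): the FIOM table above amounts to widening the FENCE predecessor and successor sets. The Python below is a minimal illustration, assuming FENCE bits are represented as a set of the names PI, PO, PR, PW, SI, SO, SR, SW; the function and constant names are hypothetical.

```python
# Each device-I/O bit implies the matching main-memory bit when FIOM=1 and V=1.
FIOM_IMPLIED = {"PI": "PR", "PO": "PW", "SI": "SR", "SO": "SW"}

def effective_fence_bits(bits, fiom, v):
    """Return the FENCE bits as interpreted under henvcfg.FIOM and mode V."""
    effective = set(bits)
    if fiom and v:
        for io_bit, mem_bit in FIOM_IMPLIED.items():
            if io_bit in effective:
                effective.add(mem_bit)
    return effective
```

With FIOM=0 or V=0 the set is returned unchanged, matching the statement that the modification applies only when FIOM=1 and V=1.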
+The definition of the VSTCE field will be furnished by the
+forthcoming Sstc extension.
+Its allocation within {\tt henvcfg} may change prior to the ratification
+of that extension.
+
+The definition of the CBZE field will be furnished by the
+forthcoming Zicboz extension.
+Its allocation within {\tt henvcfg} may change prior to the ratification
+of that extension.
+
+The definitions of the CBCFE and CBIE fields will be furnished by the
+forthcoming Zicbom extension.
+Their allocations within {\tt henvcfg} may change prior to the ratification
+of that extension.
+
+When HSXLEN=32, {\tt henvcfg} contains the same fields as bits 31:0
+of {\tt henvcfg} when HSXLEN=64.
+Additionally, when HSXLEN=32, {\tt henvcfgh} is a 32-bit read/write register that
+contains the same fields as bits 63:32 of {\tt henvcfg} when
+HSXLEN=64.
+Register {\tt henvcfgh} does not exist when HSXLEN=64.
+
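Editorial sketch (not part of the patch): the field layout in the {\tt henvcfg} figure and the HSXLEN=32 split into {\tt henvcfg} and {\tt henvcfgh} can be illustrated as follows. The bit positions are copied from the figure in this revision; as the text notes, the VSTCE, CBZE, CBCFE, and CBIE allocations may change before ratification. The function and table names are hypothetical.

```python
# Field name -> (least-significant bit, width), per the HSXLEN=64 figure.
HENVCFG_FIELDS = {
    "FIOM": (0, 1),
    "CBIE": (4, 2),
    "CBCFE": (6, 1),
    "CBZE": (7, 1),
    "VSTCE": (63, 1),
}

def decode_henvcfg(value):
    """Extract each named field from a 64-bit henvcfg value."""
    return {
        name: (value >> lsb) & ((1 << width) - 1)
        for name, (lsb, width) in HENVCFG_FIELDS.items()
    }

def split_for_hsxlen32(value):
    """Return (henvcfg, henvcfgh) as seen when HSXLEN=32."""
    low = value & 0xFFFFFFFF           # bits 31:0  -> henvcfg
    high = (value >> 32) & 0xFFFFFFFF  # bits 63:32 -> henvcfgh
    return low, high
```

Under this layout VSTCE (bit 63) lands in bit 31 of {\tt henvcfgh} when HSXLEN=32, while the remaining defined fields stay in {\tt henvcfg}.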
\subsection{Hypervisor Counter-Enable Register ({\tt hcounteren})}
The counter-enable register {\tt hcounteren} is a 32-bit register that
@@ -1043,8 +1162,8 @@ in HS-mode will raise an illegal instruction exception.
\end{center}
}
\vspace{-0.1in}
-\caption{RV32 Hypervisor guest address translation and protection register
-{\tt hgatp}.}
+\caption{Hypervisor guest address translation and protection register
+{\tt hgatp} when HSXLEN=32.}
\label{rv32hgatp}
\end{figure}
@@ -1067,42 +1186,42 @@ in HS-mode will raise an illegal instruction exception.
\end{center}
}
\vspace{-0.1in}
-\caption{RV64 Hypervisor guest address translation and protection register
-{\tt hgatp}, for MODE values Bare, Sv39x4, Sv48x4, and Sv57x4.}
+\caption{Hypervisor guest address translation and protection register
+{\tt hgatp} when HSXLEN=64, for MODE values Bare, Sv39x4, Sv48x4, and Sv57x4.}
\label{rv64hgatp}
\end{figure}
-Table~\ref{tab:hgatp-mode} shows the encodings of the MODE field for RV32 and
-RV64.
+Table~\ref{tab:hgatp-mode} shows the encodings of the MODE field when HSXLEN=32 and
+HSXLEN=64.
When MODE=Bare, guest physical addresses are equal to supervisor physical
addresses, and there is no further memory protection for a guest virtual
machine beyond the physical memory protection scheme described in
Section~\ref{sec:pmp}.
In this case, the remaining fields in {\tt hgatp} must be set to zeros.
-For RV32, the only other valid setting for MODE is Sv32x4, which is a
+When HSXLEN=32, the only other valid setting for MODE is Sv32x4, which is a
modification of the usual Sv32 paged virtual-memory scheme, extended to support
34-bit guest physical addresses.
-For RV64, modes Sv39x4, Sv48x4, and Sv57x4 are defined as modifications of the
+When HSXLEN=64, modes Sv39x4, Sv48x4, and Sv57x4 are defined as modifications of the
Sv39, Sv48, and Sv57 paged virtual-memory schemes.
All of these paged virtual-memory schemes are described in
Section~\ref{sec:guest-addr-translation}.
-The remaining MODE settings for RV64 are reserved for future use and may define
+The remaining MODE settings when HSXLEN=64 are reserved for future use and may define
different interpretations of the other fields in {\tt hgatp}.
\begin{table}[h]
\begin{center}
\begin{tabular}{|c|c|l|}
\hline
-\multicolumn{3}{|c|}{RV32} \\
+\multicolumn{3}{|c|}{HSXLEN=32} \\
\hline
Value & Name & Description \\
\hline
0 & Bare & No translation or protection. \\
1 & Sv32x4 & Page-based 34-bit virtual addressing (2-bit extension of Sv32). \\
\hline \hline
-\multicolumn{3}{|c|}{RV64} \\
+\multicolumn{3}{|c|}{HSXLEN=64} \\
\hline
Value & Name & Description \\
\hline
@@ -1119,8 +1238,8 @@ Value & Name & Description \\
\label{tab:hgatp-mode}
\end{table}
-RV64 implementations are not required to support all defined RV64 MODE
-settings.
+Implementations are not required to support all defined MODE
+settings when HSXLEN=64.
A write to {\tt hgatp} with an unsupported MODE value is not ignored as it is
for {\tt satp}.
@@ -1219,7 +1338,7 @@ instructions that normally read or modify {\tt sstatus} actually access
\end{center}
}
\vspace{-0.1in}
-\caption{Virtual supervisor status register ({\tt vsstatus}) for RV32.}
+\caption{Virtual supervisor status register ({\tt vsstatus}) when VSXLEN=32.}
\label{vsstatusreg-rv32}
\end{figure*}
@@ -1280,7 +1399,7 @@ instructions that normally read or modify {\tt sstatus} actually access
\end{center}
}
\vspace{-0.1in}
-\caption{Virtual supervisor status register ({\tt vsstatus}) for RV64.}
+\caption{Virtual supervisor status register ({\tt vsstatus}) when VSXLEN=64.}
\label{vsstatusreg}
\end{figure*}
@@ -1640,7 +1759,7 @@ Section~\ref{sec:two-stage-translation}).
\end{center}
}
\vspace{-0.1in}
-\caption{RV32 virtual supervisor address translation and protection register {\tt vsatp}.}
+\caption{Virtual supervisor address translation and protection register {\tt vsatp} when VSXLEN=32.}
\label{rv32vsatpreg}
\end{figure}
@@ -1661,7 +1780,7 @@ Section~\ref{sec:two-stage-translation}).
\end{center}
}
\vspace{-0.1in}
-\caption{RV64 virtual supervisor address translation and protection register {\tt vsatp}, for MODE
+\caption{Virtual supervisor address translation and protection register {\tt vsatp} when VSXLEN=64, for MODE
values Bare, Sv39, Sv48, and Sv57.}
\label{rv64vsatpreg}
\end{figure*}
@@ -1748,17 +1867,26 @@ except that \textit{execute} permission takes the place of \textit{read}
permission during address translation.
That is, the memory being read must be executable in both stages of
address translation, but read permission is not required.
-HLVX.WU is valid for RV32, even though LWU and HLV.WU are not.
-(For RV32, HLVX.WU can be considered a variant of HLV.W, as sign
-extension is irrelevant for 32-bit values.)
+For the supervisor physical address that results from address
+translation, the supervisor physical memory attributes must grant both
+\textit{execute} and \textit{read} permissions.
+(The \textit{supervisor physical memory attributes} are the machine's
+physical memory attributes as modified by physical memory protection,
+Section~\ref{sec:pmp}, for supervisor level.)
The {\tt hgatp} and {\tt vsatp} registers are considered {\em active}
for the purposes of the address-translation algorithm when executing
virtual-machine load/store instructions (HLV, HLVX, or HSV).
+\begin{commentary}
HLVX cannot override machine-level physical memory protection (PMP),
so attempting to read memory that PMP designates as execute-only still
results in an access-fault exception.
+\end{commentary}
+
+HLVX.WU is valid for RV32, even though LWU and HLV.WU are not.
+(For RV32, HLVX.WU can be considered a variant of HLV.W, as sign
+extension is irrelevant for 32-bit values.)
Attempts to execute a virtual-machine load/store instruction (HLV, HLVX,
or HSV) when V=1 cause a virtual instruction trap.
@@ -1899,7 +2027,7 @@ behavior of several existing {\tt mstatus} fields.
Figure~\ref{hypervisor-mstatus} shows the modified {\tt mstatus} register
when the hypervisor extension is implemented and MXLEN=64.
When MXLEN=32, the hypervisor extension adds MPV and GVA not to {\tt mstatus}
-but to {\tt mstatush}, which must exist.
+but to {\tt mstatush}.
Figure~\ref{hypervisor-mstatush} shows the {\tt mstatush} register when
the hypervisor extension is implemented and MXLEN=32.
@@ -2027,14 +2155,16 @@ The format of {\tt mstatus} is unchanged for RV32.}
\end{figure*}
The MPV bit (Machine Previous Virtualization Mode) is written by the implementation
-whenever a trap is taken into M-mode. Just as the MPP bit is set to the privilege
+whenever a trap is taken into M-mode.
+Just as the MPP field is set to the (nominal) privilege
mode at the time of the trap, the MPV bit is set to the value of the virtualization
mode V at the time of the trap. When an MRET instruction is executed, the
virtualization mode V is set to MPV, unless MPP=3, in which case V remains 0.
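Editorial sketch (not part of the patch): the MPV/MPP bookkeeping described above can be modeled in a few lines. The Python below is an illustration only; the function names are hypothetical, and the privilege encodings 0, 1, and 3 for U, S, and M are the values used elsewhere in this chapter (e.g. "MPP=3").

```python
PRIV_U, PRIV_S, PRIV_M = 0, 1, 3

def record_trap_to_m(v, nominal_priv):
    """State written on a trap into M-mode: MPV gets V, MPP gets the nominal privilege."""
    return {"MPV": v, "MPP": nominal_priv}

def v_after_mret(mpv, mpp):
    """Virtualization mode after MRET: MPV, unless MPP=3, in which case V remains 0."""
    return 0 if mpp == PRIV_M else mpv
```

A trap taken from VS-mode (V=1, nominal privilege S) therefore records MPV=1 and MPP=1, and a subsequent MRET returns with V=1.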
Field GVA (Guest Virtual Address) is written by the implementation
whenever a trap is taken into M-mode.
-For any trap (access fault, page fault, or guest-page fault) that writes
+For any trap (breakpoint, address misaligned,
+access fault, page fault, or guest-page fault) that writes
a guest virtual address to {\tt mtval}, GVA is set to~1.
For any other trap into M-mode, GVA is set to~0.
@@ -2051,7 +2181,7 @@ field, MPRV, of {\tt mstatus}.
When MPRV=0, translation and protection behave as normal.
When MPRV=1, explicit memory accesses are translated and protected, and
endianness is applied, as though the current virtualization mode were set
-to MPV and the current privilege mode were set to MPP.
+to MPV and the current nominal privilege mode were set to MPP.
Table~\ref{h-mprv} enumerates the cases.
\begin{table*}[h!]
@@ -2059,7 +2189,7 @@ Table~\ref{h-mprv} enumerates the cases.
\begin{tabular}{|c|c|c||p{4.5in}|}
\hline
MPRV & MPV & MPP & Effect \\ \hline \hline
- 0 & -- & -- & Normal access; current privilege and virtualization modes apply. \\ \hline
+ 0 & -- & -- & Normal access; current privilege mode applies. \\ \hline
1 & 0 & 0 & U-level access with HS-level translation and protection only. \\ \hline
1 & 0 & 1 & HS-level access with HS-level translation and protection only. \\ \hline
1 & -- & 3 & M-level access with no translation. \\ \hline
@@ -2075,7 +2205,7 @@ memory accesses.}
MPRV does not affect the virtual-machine load/store instructions, HLV,
HLVX, and HSV.
The explicit loads and stores of these instructions always act as though
-V=1 and the privilege mode were {\tt hstatus}.SPVP, overriding MPRV.
+V=1 and the nominal privilege mode were {\tt hstatus}.SPVP, overriding MPRV.
The {\tt mstatus} register is a superset of the HS-level {\tt sstatus}
register but is not a superset of {\tt vsstatus}.
@@ -2529,7 +2659,7 @@ preclude it.
\subsection{Guest-Page Faults}
Guest-page-fault traps may be delegated from M-mode to HS-mode under the
-control of CSR {\tt medeleg}, but cannot be delegated to other operating
+control of CSR {\tt medeleg}, but cannot be delegated to other privilege
modes.
On a guest-page fault, CSR {\tt mtval} or {\tt stval} is written with the
faulting guest virtual address as usual, and {\tt mtval2} or {\tt htval} is
@@ -2670,22 +2800,93 @@ HS-mode and VS-mode ECALLs use different cause values so they can be delegated
separately.
\end{commentary}
-When V=1, a virtual instruction trap (not an illegal instruction trap) is
-taken for:
+When V=1, a virtual instruction exception (code 22) is normally
+raised instead of an illegal instruction exception if the attempted
+instruction is \textit{HS-qualified}
+but is prevented from executing when V=1 due to
+insufficient privilege or because the instruction is expressly disabled
+by a supervisor or hypervisor CSR such as {\tt scounteren} or {\tt hcounteren}.
+An instruction is \textit{HS-qualified} if it would be valid to execute
+in HS-mode (for some values of the instruction's register operands),
+assuming fields TSR and TVM of CSR {\tt mstatus} are both zero.
+
+Special rules apply for CSR instructions that access \mbox{32-bit}
+high-half CSRs such as {\tt cycleh} and {\tt htimedeltah}.
+When V=1 and XLEN$>$32, an attempt to access a high-half
+supervisor-level CSR, high-half hypervisor CSR, high-half VS CSR,
+or high-half unprivileged CSR always raises an illegal instruction
+exception.
+And in VS-mode, if the XLEN for VU-mode is greater than 32, an attempt
+to access a high-half user-level CSR (distinct from an unprivileged
+CSR) always raises an illegal instruction exception.
+On the other hand, when V=1 and XLEN=32, an invalid attempt to access a
+high-half S-level, hypervisor, VS, or unprivileged CSR raises a virtual
+instruction exception instead of an illegal instruction exception
+if the same CSR instruction for the partner \textit{low-half} CSR
+(e.g.\@ {\tt cycle} or {\tt htimedelta}) is HS-qualified.
+Likewise, in VS-mode, if the XLEN for VU-mode is 32, an invalid attempt
+to access a high-half user-level CSR raises a virtual instruction
+exception instead of an illegal instruction exception if the same CSR
+instruction for the partner low-half CSR is HS-qualified.
+
+\begin{commentary}
+The RISC-V Privileged Architecture currently defines no user-level
+CSRs, but they might be added by a future version of this standard or
+by an extension.
+\end{commentary}
+
+Specifically, a virtual instruction exception is raised for the
+following cases:
\begin{itemize}
\item
-attempts to access a counter CSR when the corresponding bit in
+in VS-mode,
+attempts to access a non-high-half counter CSR when the corresponding bit in
{\tt hcounteren} is~0 and the same bit in {\tt mcounteren} is~1;
\item
-attempts to execute a hypervisor instruction (HLV, HLVX, HSV, or HFENCE)
-or to access an implemented hypervisor CSR or VS CSR;
+in VS-mode, if XLEN=32, attempts to access a high-half
+counter CSR when the corresponding bit in {\tt hcounteren} is~0 and the
+same bit in {\tt mcounteren} is~1;
\item
-in VU-mode, attempts to execute WFI or a
-supervisor instruction (SRET or SFENCE),
-or to access an implemented supervisor CSR;
+in VU-mode, attempts to access a non-high-half counter CSR when the
+corresponding bit in either {\tt hcounteren} or {\tt scounteren} is~0
+and the same bit in {\tt mcounteren} is~1;
+
+\item
+in VU-mode, if XLEN=32, attempts to access a high-half counter CSR when
+the corresponding bit in either {\tt hcounteren} or {\tt scounteren}
+is~0 and the same bit in {\tt mcounteren} is~1;
+
+\item
+in VS-mode or VU-mode,
+attempts to execute a hypervisor instruction (HLV, HLVX, HSV, or HFENCE);
+
+\item
+in VS-mode or VU-mode, attempts to access an implemented non-high-half
+hypervisor CSR or VS CSR when the same access (read/write) would be
+allowed in HS-mode, assuming {\tt mstatus}.TVM=0;
+
+\item
+in VS-mode or VU-mode, if XLEN=32, attempts to access an implemented
+high-half hypervisor CSR or high-half VS CSR when the same access
+(read/write) to the CSR's low-half partner would be allowed in HS-mode,
+assuming {\tt mstatus}.TVM=0;
+
+\item
+in VU-mode, attempts to execute WFI when {\tt mstatus}.TW=0, or to
+execute a supervisor instruction (SRET or SFENCE);
+
+\item
+in VU-mode, attempts to access an implemented non-high-half supervisor
+CSR when the same access (read/write) would be allowed in HS-mode,
+assuming {\tt mstatus}.TVM=0;
+
+\item
+in VU-mode, if XLEN=32, attempts to access an implemented high-half
+supervisor CSR when the same access to the CSR's low-half partner would
+be allowed in HS-mode, assuming {\tt mstatus}.TVM=0;
\item
in VS-mode, attempts to execute WFI when {\tt hstatus}.VTW=1 and
@@ -2693,39 +2894,94 @@ in VS-mode, attempts to execute WFI when {\tt hstatus}.VTW=1 and
implementation-specific, bounded time;
\item
-in VS-mode, attempts to execute SRET when {\tt hstatus}.VTSR=1; or
+in VS-mode, attempts to execute SRET when {\tt hstatus}.VTSR=1; and
\item
in VS-mode, attempts to execute an SFENCE instruction or to access
{\tt satp}, when {\tt hstatus}.VTVM=1.
\end{itemize}
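The counter-CSR cases at the top of the list above can be sketched as a small decision function. This is an illustrative model only, not part of the specification: the function name, the string mode labels, and the result labels are hypothetical. It covers only accesses that the high-half rules do not already force to an illegal instruction exception, and it assumes that a clear bit in {\tt mcounteren} yields an illegal instruction exception, a rule that comes from the machine-level chapter rather than from this list.

```python
def counter_access_outcome(mode, bit, mcounteren, hcounteren, scounteren):
    """Classify a counter CSR access made while V=1.

    mode is "VS" or "VU" (hypothetical labels); bit is the counter's index
    in the *counteren registers. Returns "ok", "virtual", or "illegal".
    """
    def enabled(reg):
        return (reg >> bit) & 1 == 1

    # Assumption: with the mcounteren bit clear, the access is not
    # permitted below M-mode and is reported as an illegal instruction.
    if not enabled(mcounteren):
        return "illegal"
    if mode == "VS":
        # VS-mode: only hcounteren gates the access.
        return "ok" if enabled(hcounteren) else "virtual"
    if mode == "VU":
        # VU-mode: a clear bit in either hcounteren or scounteren blocks it.
        if enabled(hcounteren) and enabled(scounteren):
            return "ok"
        return "virtual"
    raise ValueError("mode must be 'VS' or 'VU'")
```

For example, with bit 0 set in {\tt mcounteren} and {\tt hcounteren} but clear in {\tt scounteren}, the sketch reports a virtual instruction exception for VU-mode and a permitted access for VS-mode.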
+Other extensions to the \mbox{RISC-V} Privileged Architecture may add
+to the set of circumstances that cause a virtual instruction exception
+when V=1.
On a virtual instruction trap, {\tt mtval} or {\tt stval} is written the
same as for an illegal instruction trap.
\begin{commentary}
-When V=1, privileged instructions that are invalid in VS-mode or
-VU-mode generally cause a virtual instruction trap instead of an illegal
-instruction trap.
-The same goes for attempts to access hypervisor- or supervisor-level CSRs
-that fail due to insufficient privilege when V=1, or attempts to access
-CSRs to which access has been expressly disabled by a hypervisor CSR
-(e.g.\ {\tt hcounteren}).
-It is not unusual that hypervisors must emulate such instructions, to
+It is not unusual that hypervisors must emulate the
+instructions that raise virtual instruction exceptions, to
support nested hypervisors or for other reasons.
-When not emulating an instruction, a hypervisor should convert a virtual
-instruction trap into an illegal instruction exception for the guest
-virtual machine.
-
Machine level is expected ordinarily to delegate virtual instruction
traps directly to HS-level, whereas illegal instruction traps are likely
to be processed first in M-mode before being conditionally delegated (by
software) to HS-level.
Consequently, virtual instruction traps are expected typically to be
handled faster than illegal instruction traps.
+
+When not emulating the trapping instruction,
+a hypervisor should convert a virtual
+instruction trap into an illegal instruction exception for the guest
+virtual machine.
+\end{commentary}
+
+\begin{commentary}
+Because TSR and TVM in {\tt mstatus} are intended to impact only S-mode
+(HS-mode), they are ignored for determining exceptions in VS-mode.
\end{commentary}
+\begin{table*}[htbp]
+\begin{center}
+\begin{tabular}{|l|r|l|}
+ \hline
+ Priority & Exc.\@ Code & Description \\
+ \hline
+ {\em Highest} & 3 & Instruction address breakpoint \\
+ \hline
+ & & During instruction address translation: \\
+ & 12, 20, 1 & \quad First encountered page fault,
+ guest-page fault, or access fault \\
+ \hline
+ & & With physical address for instruction: \\
+ & 1 & \quad Instruction access fault \\
+ \hline
+ & 2 & Illegal instruction \\
+ & 22 & Virtual instruction \\
+ & 0 & Instruction address misaligned \\
+ & 8, 9, 10, 11 & Environment call \\
+ & 3 & Environment break \\
+ & 3 & Load/store/AMO address breakpoint \\
+ \hline
+ & & Optionally: \\
+ & 4, 6 & \quad Load/store/AMO address misaligned \\
+ \hline
+ & & During address translation for an explicit
+ memory access: \\
+ & 13, 15, 21, 23, 5, 7 & \quad First encountered page fault,
+ guest-page fault, or access fault \\
+ \hline
+ & & With physical address for an explicit
+ memory access: \\
+ & 5, 7 & \quad Load/store/AMO access fault \\
+ \hline
+ & & If not higher priority: \\
+ {\em Lowest} & 4, 6 & \quad Load/store/AMO address misaligned \\
+ \hline
+\end{tabular}
+\end{center}
+\caption{%
+Synchronous exception priority when the hypervisor extension is
+implemented.%
+}
+\label{tab:HSyncExcPrio}
+\end{table*}
+
+If an instruction may raise multiple synchronous exceptions, the
+decreasing priority order of Table~\ref{tab:HSyncExcPrio} indicates
+which exception is taken and reported in {\tt mcause} or {\tt scause}.
+
+\FloatBarrier
+
\subsection{Trap Entry}
When a trap occurs in HS-mode or U-mode, it goes to M-mode, unless
@@ -3237,22 +3493,22 @@ faulting instruction.
\subsection{Trap Return}
The MRET instruction is used to return from a trap taken into M-mode.
-MRET first determines what the new operating mode will be according to
+MRET first determines what the new privilege mode will be according to
the values of MPP and MPV in {\tt mstatus} or {\tt mstatush}, as encoded in
Table~\ref{h-mpp}.
MRET then in {\tt mstatus}/{\tt mstatush} sets MPV=0, MPP=0, MIE=MPIE, and MPIE=1.
-Lastly, MRET sets the virtualization and privilege modes as previously
+Lastly, MRET sets the privilege mode as previously
determined, and sets {\tt pc}={\tt mepc}.
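The MRET sequence just described can be sketched as follows. This is a minimal model under stated assumptions: the dictionary keys and the function name are hypothetical, and the mode decoding (new privilege level equals MPP, and V equals MPV unless MPP=3, in which case V=0) is an assumed reading of Table~\ref{h-mpp}, which is not reproduced in this excerpt.

```python
def mret(state):
    """Model of MRET on a hart with the hypervisor extension.

    state is a dict with keys MPP, MPV, MPIE, MIE, mepc, priv, V, pc
    (hypothetical names chosen for this sketch).
    """
    # Step 1: determine the new privilege mode from MPP and MPV
    # before any field is overwritten.
    new_priv = state["MPP"]
    new_v = 0 if state["MPP"] == 3 else state["MPV"]  # assumed decoding

    # Step 2: update the status fields in mstatus/mstatush.
    state["MPV"] = 0
    state["MPP"] = 0
    state["MIE"] = state["MPIE"]
    state["MPIE"] = 1

    # Step 3: switch mode and jump to mepc.
    state["priv"] = new_priv
    state["V"] = new_v
    state["pc"] = state["mepc"]
    return state
```

The ordering matters in the sketch: the target mode is computed first because the second step clears MPP and MPV.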
The SRET instruction is used to return from a trap taken into HS-mode or
VS-mode. Its behavior depends on the current virtualization mode.
When executed in M-mode or HS-mode (i.e., V=0), SRET first determines
-what the new operating mode will be according to the values in
+what the new privilege mode will be according to the values in
{\tt hstatus}.SPV and {\tt sstatus}.SPP, as encoded in Table~\ref{h-spp}.
SRET then sets {\tt hstatus}.SPV=0, and in {\tt sstatus} sets SPP=0,
SIE=SPIE, and SPIE=1.
-Lastly, SRET sets the virtualization and privilege modes as previously
+Lastly, SRET sets the privilege mode as previously
determined, and sets {\tt pc}={\tt sepc}.
When executed in VS-mode (i.e., V=1), SRET sets the privilege mode according to
diff --git a/src/instr-table.tex b/src/instr-table.tex
index 604d51e..a143d75 100644
--- a/src/instr-table.tex
+++ b/src/instr-table.tex
@@ -2421,11 +2421,457 @@
\multicolumn{1}{c|}{rd} &
\multicolumn{1}{c|}{1010011} & FCVT.Q.LU \\
\cline{2-11}
-
+
+\end{tabular}
+\end{center}
+\end{small}
+
+\end{table}
+
+
+
+\newpage
+
+\begin{table}[p]
+\begin{small}
+\begin{center}
+\begin{tabular}{p{0in}p{0.4in}p{0.05in}p{0.05in}p{0.05in}p{0.05in}p{0.4in}p{0.6in}p{0.4in}p{0.6in}p{0.7in}l}
+& & & & & & & & & & \\
+ &
+\multicolumn{1}{l}{\instbit{31}} &
+\multicolumn{1}{r}{\instbit{27}} &
+\instbit{26} &
+\instbit{25} &
+\multicolumn{1}{l}{\instbit{24}} &
+\multicolumn{1}{r}{\instbit{20}} &
+\instbitrange{19}{15} &
+\instbitrange{14}{12} &
+\instbitrange{11}{7} &
+\instbitrange{6}{0} \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{funct7} &
+\multicolumn{2}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{funct3} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{opcode} & R-type \\
+\cline{2-11}
+
+
+&
+\multicolumn{2}{|c|}{rs3} &
+\multicolumn{2}{c|}{funct2} &
+\multicolumn{2}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{funct3} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{opcode} & R4-type \\
+\cline{2-11}
+
+
+&
+\multicolumn{6}{|c|}{imm[11:0]} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{funct3} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{opcode} & I-type \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{imm[11:5]} &
+\multicolumn{2}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{funct3} &
+\multicolumn{1}{c|}{imm[4:0]} &
+\multicolumn{1}{c|}{opcode} & S-type \\
+\cline{2-11}
+
+
+
+&
+\multicolumn{10}{c}{} & \\
+&
+\multicolumn{10}{c}{\bf RV32Zfh Standard Extension} & \\
+\cline{2-11}
+
+
+&
+\multicolumn{6}{|c|}{imm[11:0]} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{001} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{0000111} & FLH \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{imm[11:5]} &
+\multicolumn{2}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{001} &
+\multicolumn{1}{c|}{imm[4:0]} &
+\multicolumn{1}{c|}{0100111} & FSH \\
+\cline{2-11}
+
+
+&
+\multicolumn{2}{|c|}{rs3} &
+\multicolumn{2}{c|}{10} &
+\multicolumn{2}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1000011} & FMADD.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{2}{|c|}{rs3} &
+\multicolumn{2}{c|}{10} &
+\multicolumn{2}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1000111} & FMSUB.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{2}{|c|}{rs3} &
+\multicolumn{2}{c|}{10} &
+\multicolumn{2}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1001011} & FNMSUB.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{2}{|c|}{rs3} &
+\multicolumn{2}{c|}{10} &
+\multicolumn{2}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1001111} & FNMADD.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{0000010} &
+\multicolumn{2}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FADD.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{0000110} &
+\multicolumn{2}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FSUB.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{0001010} &
+\multicolumn{2}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FMUL.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{0001110} &
+\multicolumn{2}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FDIV.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{0101110} &
+\multicolumn{2}{c|}{00000} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FSQRT.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{0010010} &
+\multicolumn{2}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{000} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FSGNJ.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{0010010} &
+\multicolumn{2}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{001} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FSGNJN.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{0010010} &
+\multicolumn{2}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{010} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FSGNJX.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{0010110} &
+\multicolumn{2}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{000} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FMIN.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{0010110} &
+\multicolumn{2}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{001} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FMAX.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{0100000} &
+\multicolumn{2}{c|}{00010} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FCVT.S.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{0100001} &
+\multicolumn{2}{c|}{00010} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FCVT.D.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{0100011} &
+\multicolumn{2}{c|}{00010} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FCVT.Q.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{0100010} &
+\multicolumn{2}{c|}{00000} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FCVT.H.S \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{0100010} &
+\multicolumn{2}{c|}{00001} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FCVT.H.D \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{0100010} &
+\multicolumn{2}{c|}{00011} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FCVT.H.Q \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{1110010} &
+\multicolumn{2}{c|}{00000} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{000} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FMV.X.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{1010010} &
+\multicolumn{2}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{010} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FEQ.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{1010010} &
+\multicolumn{2}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{001} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FLT.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{1010010} &
+\multicolumn{2}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{000} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FLE.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{1110010} &
+\multicolumn{2}{c|}{00000} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{001} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FCLASS.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{1100010} &
+\multicolumn{2}{c|}{00000} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FCVT.W.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{1100010} &
+\multicolumn{2}{c|}{00001} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FCVT.WU.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{1101010} &
+\multicolumn{2}{c|}{00000} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FCVT.H.W \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{1101010} &
+\multicolumn{2}{c|}{00001} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FCVT.H.WU \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{1111010} &
+\multicolumn{2}{c|}{00000} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{000} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FMV.H.X \\
+\cline{2-11}
+
+
+&
+\multicolumn{10}{c}{} & \\
+&
+\multicolumn{10}{c}{\bf RV64Zfh Standard Extension (in addition to RV32Zfh)} & \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{1100010} &
+\multicolumn{2}{c|}{00010} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FCVT.L.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{1100010} &
+\multicolumn{2}{c|}{00011} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FCVT.LU.H \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{1101010} &
+\multicolumn{2}{c|}{00010} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FCVT.H.L \\
+\cline{2-11}
+
+
+&
+\multicolumn{4}{|c|}{1101010} &
+\multicolumn{2}{c|}{00011} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{1010011} & FCVT.H.LU \\
+\cline{2-11}
+
\end{tabular}
\end{center}
\end{small}
\caption{Instruction listing for RISC-V}
\end{table}
-
+
diff --git a/src/intro.tex b/src/intro.tex
index 112e7b4..330902b 100644
--- a/src/intro.tex
+++ b/src/intro.tex
@@ -311,8 +311,8 @@ For this purpose, we divide each RISC-V
instruction-set encoding space (and related encoding spaces such as
the CSRs) into three disjoint categories: {\em standard}, {\em
reserved}, and {\em custom}. Standard extensions and encodings
-are defined by the Foundation; any extensions not defined by the
-Foundation are {\em non-standard}.
+are defined by RISC-V International; any extensions not defined by
+RISC-V International are {\em non-standard}.
Each base ISA and its standard extensions use only standard encodings,
and shall not conflict with each other in their uses of these encodings.
Reserved encodings are currently not defined but are saved for future
@@ -422,8 +422,7 @@ Ordinarily, if an instruction attempts to access memory at an inaccessible
address, an exception is raised for the instruction.
Vacant locations in the address space are never accessible.
-Except when specified otherwise, implicit reads that do not raise an
-exception and that have no side effects
+Except when specified otherwise, implicit reads that do not raise an exception
may occur arbitrarily early and speculatively, even before the machine could
possibly prove that the read will be needed. For instance, a valid
implementation could attempt to read all of main memory at the earliest
diff --git a/src/l.tex b/src/l.tex
deleted file mode 100644
index 30c688d..0000000
--- a/src/l.tex
+++ /dev/null
@@ -1,20 +0,0 @@
-\chapter{``L'' Standard Extension for Decimal Floating-Point, Version 0.0}
-
-{\bf This chapter is a draft proposal that has not been ratified by
- the Foundation.}
-
-This chapter is a placeholder for the specification of a standard
-extension named ``L'' designed to support decimal floating-point
-arithmetic as defined in the IEEE 754-2008 standard.
-
-\section{Decimal Floating-Point Registers}
-
-Existing floating-point registers are used to hold 64-bit and 128-bit
-decimal floating-point values, and the existing floating-point load
-and store instructions are used to move values to and from memory.
-
-\begin{commentary}
-Due to the large opcode space required by the fused multiply-add
-instructions, the decimal floating-point instruction extension will
-require five 25-bit major opcodes in a 30-bit encoding space.
-\end{commentary}
diff --git a/src/machine.tex b/src/machine.tex
index 6b82746..6b62662 100644
--- a/src/machine.tex
+++ b/src/machine.tex
@@ -126,15 +126,15 @@ Bit & Character & Description \\
8 & I & RV32I/64I/128I base ISA \\
9 & J & {\em Tentatively reserved for Dynamically Translated Languages extension} \\
10 & K & {\em Reserved} \\
- 11 & L & {\em Tentatively reserved for Decimal Floating-Point extension} \\
+ 11 & L & {\em Reserved} \\
12 & M & Integer Multiply/Divide extension \\
- 13 & N & User-level interrupts supported \\
+ 13 & N & {\em Tentatively reserved for User-Level Interrupts extension} \\
14 & O & {\em Reserved} \\
15 & P & {\em Tentatively reserved for Packed-SIMD extension} \\
16 & Q & Quad-precision floating-point extension \\
17 & R & {\em Reserved} \\
18 & S & Supervisor mode implemented \\
- 19 & T & {\em Tentatively reserved for Transactional Memory extension} \\
+ 19 & T & {\em Reserved} \\
20 & U & User mode implemented \\
21 & V & {\em Tentatively reserved for Vector extension} \\
22 & W & {\em Reserved} \\
@@ -222,7 +222,7 @@ codes in the Bank field, and encodes the final byte in the Offset field,
discarding the parity bit. For example, the JEDEC manufacturer ID
{\tt 0x7f 0x7f 0x7f 0x7f 0x7f 0x7f 0x7f 0x7f 0x7f 0x7f 0x7f 0x7f 0x8a}
(twelve continuation codes followed by {\tt 0x8a}) would be encoded in the
-{\tt mvendorid} field as {\tt 0x60a}.
+{\tt mvendorid} CSR as {\tt 0x60a}.
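The encoding in the example above can be checked with a short sketch. The function name is hypothetical, and the sketch assumes the Offset field occupies bits 6:0 with the Bank field immediately above it, which is consistent with the {\tt 0x60a} result quoted in the text.

```python
def encode_mvendorid(jedec_id_bytes):
    """Pack a JEDEC manufacturer ID byte sequence into an mvendorid value."""
    continuation = 0x7F
    *prefix, final = jedec_id_bytes
    if any(b != continuation for b in prefix):
        raise ValueError("all bytes before the last must be 0x7f continuation codes")
    bank = len(prefix)        # number of one-byte continuation codes
    offset = final & 0x7F     # final byte with the parity bit (bit 7) discarded
    return (bank << 7) | offset


# Twelve continuation codes followed by 0x8a, as in the example above.
value = encode_mvendorid([0x7F] * 12 + [0x8A])
print(hex(value))  # 0x60a
```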
\begin{commentary}
In JEDEC's parlance, the bank number is one greater than the number of
@@ -231,8 +231,8 @@ that is one less than the JEDEC bank number.
\end{commentary}
\begin{commentary}
-Previously the vendor ID was to be a number allocated by the RISC-V
-Foundation, but this duplicates the work of JEDEC in maintaining a
+Previously the vendor ID was to be a number allocated by RISC-V
+International, but this duplicates the work of JEDEC in maintaining a
manufacturer ID standard. At time of writing, registering a
manufacturer ID with JEDEC has a one-time cost of \$500.
\end{commentary}
@@ -263,8 +263,8 @@ MXLEN \\
\label{marchreg}
\end{figure*}
-Open-source project architecture IDs are allocated globally by the
-RISC-V Foundation, and have non-zero architecture IDs with a zero
+Open-source project architecture IDs are allocated globally by
+RISC-V International, and have non-zero architecture IDs with a zero
most-significant-bit (MSB). Commercial architecture IDs are allocated
by each commercial vendor independently, but must have the MSB set and
cannot contain zero in the remaining MXLEN-1 bits.
@@ -276,7 +276,7 @@ occurs rather than a particular organization. Commercial fabrications
of open-source designs should (and might be required by the license
to) retain the original architecture ID. This will aid in reducing
fragmentation and tool support costs, as well as provide attribution.
-Open-source architecture IDs should be administered by the Foundation
+Open-source architecture IDs are administered by RISC-V International
and should only be allocated to released, functioning open-source
projects. Commercial architecture IDs can be managed independently by
any registered vendor but are required to have IDs disjoint from the
@@ -525,9 +525,6 @@ Bits 30:4 of {\tt mstatush} generally contain the same fields found in
bits 62:36 of {\tt mstatus} for RV64.
Fields SD, SXL, and UXL do not exist in {\tt mstatush}.
-The {\tt mstatush} register is not required to be implemented if every field
-would be hardwired to zero.
-
\begin{figure*}[h!]
{\footnotesize
\begin{center}
@@ -919,8 +916,21 @@ Status & FS Meaning & XS Meaning\\
\label{fsxsencoding}
\end{table*}
-In systems that do not implement S-mode and do not have a
-floating-point unit, the FS field is hardwired to zero.
+If the F extension is implemented, the FS field shall not be
+hardwired to zero.
+
+If neither the F extension nor S-mode is implemented, then FS is
+hardwired to zero.
+If S-mode is implemented but the F extension is not, FS may optionally
+be hardwired to zero.
+
+\begin{commentary}
+Implementations with S-mode but without the F extension are
+permitted, but not required, to hardwire the FS field to zero.
+Some such implementations will choose {\em not} to hardwire the FS
+field to zero, so as to enable emulation of the F extension for
+both S-mode and U-mode via invisible traps into M-mode.
+\end{commentary}
In systems without additional user extensions requiring new state, the
XS field is hardwired to zero. Every additional extension with state
@@ -1218,8 +1228,7 @@ must exist, and setting a bit in
{\tt medeleg} or {\tt mideleg} will delegate the corresponding trap, when
occurring in S-mode or U-mode, to the S-mode trap handler.
In systems without S-mode, the {\tt medeleg} and {\tt mideleg} registers
-should not exist (unless the N extension for user-mode interrupts is
-implemented).
+should not exist.
\begin{commentary}
In versions 1.9.1 and earlier, these registers existed but were
@@ -1369,17 +1378,25 @@ MXLEN \\
\label{miereg}
\end{figure}
-An interrupt \textit{i} will be taken if bit \textit{i} is set in both
-{\tt mip} and {\tt mie}, and if interrupts are globally enabled. By
-default, M-mode interrupts are globally enabled if the hart's current
-privilege mode is less than M, or if the current privilege mode is M
-and the MIE bit in the {\tt mstatus} register is set. If bit \textit{i}
-in {\tt mideleg} is set, however, interrupts are considered to be
-globally enabled if the hart's current privilege mode equals the
-delegated privilege mode and that mode's interrupt enable
-bit (\textit{x}\/IE in {\tt mstatus} for mode~\textit{x}) is set,
-or if the current
-privilege mode is less than the delegated privilege mode.
+An interrupt~\textit{i} will trap to M-mode (causing the privilege mode
+to change to M-mode) if all of the following are true:
+(a)~either the current privilege mode is M and the MIE bit in the
+{\tt mstatus} register is set, or the current privilege mode has less
+privilege than M-mode;
+(b)~bit~\textit{i} is set in both {\tt mip} and {\tt mie}; and
+(c)~if register {\tt mideleg} exists, bit~\textit{i} is not set in
+{\tt mideleg}.
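Conditions (a) through (c) can be written out as a single predicate. The sketch below is illustrative only: the function and parameter names are hypothetical, privilege modes use the standard encodings (U=0, S=1, M=3), and passing {\tt None} for {\tt mideleg} stands for a hart on which that register does not exist.

```python
M_MODE = 3  # standard privilege-level encoding: U=0, S=1, M=3


def interrupt_traps_to_m(i, priv, mstatus_mie, mip, mie, mideleg=None):
    """Return True if interrupt i would trap to M-mode."""
    # (a) M-mode with mstatus.MIE set, or any less-privileged mode.
    globally_enabled = (priv == M_MODE and bool(mstatus_mie)) or priv < M_MODE
    # (b) bit i is set in both mip and mie.
    pending_and_enabled = ((mip >> i) & 1) == 1 and ((mie >> i) & 1) == 1
    # (c) if mideleg exists, bit i is not set in it.
    not_delegated = mideleg is None or ((mideleg >> i) & 1) == 0
    return globally_enabled and pending_and_enabled and not_delegated
```

Note that the predicate is independent of {\tt mstatus}.MIE whenever the hart is running below M-mode, which matches condition (a).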
+
+These conditions for an interrupt trap to occur must be evaluated in a bounded
+amount of time from when an interrupt becomes, or ceases to be,
+pending in {\tt mip}, and must
+also be evaluated immediately following the execution of an {\em x}\/RET
+instruction or an explicit write to a CSR on which these interrupt trap
+conditions expressly depend (including {\tt mip}, {\tt mie}, {\tt mstatus},
+and {\tt mideleg}).
+
+Interrupts to M-mode take priority over any interrupts to lower privilege
+modes.
Each individual bit in register {\tt mip} may be writable or may be
read-only.
@@ -1512,6 +1529,10 @@ to memory-mapped control registers, which are used by remote harts to
provide machine-level interprocessor interrupts.
A hart can write its
own MSIP bit using the same memory-mapped control register.
+If a system has only one hart, or if a platform standard supports the
+delivery of machine-level interprocessor interrupts through external
+interrupts (MEI) instead, then {\tt mip}.MSIP and {\tt mie}.MSIE may
+both be hardwired to zeros.
If supervisor mode is not implemented, bits SEIP, STIP, and SSIP of
{\tt mip} and SEIE, STIE, and SSIE of {\tt mie} are hardwired to zeros.
@@ -1535,6 +1556,14 @@ Only the software-writable SEIP bit participates in the
read-modify-write sequence of a CSRRS or CSRRC instruction.
\begin{commentary}
+ For example, if we name the software-writable SEIP bit {\tt B} and the
+ signal from the external interrupt controller {\tt E}, then if \mbox{\tt csrrs
+ t0, mip, t1} is executed, {\tt t0[9]} is written with \mbox{\tt B || E}, then
+ {\tt B} is written with \mbox{\tt B || t1[9]}.
+ If \mbox{\tt csrrw t0, mip, t1} is executed, then {\tt t0[9]} is written with
+ \mbox{\tt B || E}, and {\tt B} is simply written with {\tt t1[9]}.
+ In neither case does {\tt B} depend upon {\tt E}.
+
The SEIP field behavior is designed to allow a higher privilege
layer to mimic external interrupts cleanly, without losing any real
external interrupts. The behavior of the CSR instructions is
@@ -1550,30 +1579,12 @@ written by M-mode software to deliver timer interrupts to S-mode.
If supervisor mode is implemented, bits {\tt mip}.SSIP and {\tt mie}.SSIE
are the interrupt-pending and interrupt-enable bits for supervisor-level
software interrupts.
-SSIP is writable in {\tt mip}.
-
-\begin{commentary}
-Interprocessor
-interrupts at supervisor level are implemented through
-implementation-specific mechanisms, e.g., via calls to an SEE,
-which might ultimately result in
-a machine-mode write to the receiving hart's MSIP bit.
-
-We allow a hart to directly write only its own SSIP bit, not those of other
-harts, as other harts might be
-virtualized and possibly descheduled by higher privilege levels. We
-rely on calls to the SEE to provide interprocessor interrupts
-for this reason. Machine-mode harts are not virtualized and can
-directly interrupt other harts by setting their MSIP bits, typically
-using uncached I/O writes to memory-mapped control registers depending
-on the platform specification.
-\end{commentary}
+SSIP is writable in {\tt mip} and may also be set to 1 by a platform-specific
+interrupt controller.
-Multiple simultaneous interrupts destined for different privilege modes are
-handled in decreasing order of destined privilege mode. Multiple simultaneous
-interrupts destined for the same privilege mode are handled in the following
+Multiple simultaneous
+interrupts destined for M-mode are handled in the following
decreasing priority order: MEI, MSI, MTI, SEI, SSI, STI.
-Synchronous exceptions are of lower priority than all interrupts.
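The fixed priority order for interrupts destined for M-mode can be sketched as a lookup over a bit mask of candidates. The function name is hypothetical; the bit positions are the standard {\tt mip}/{\tt mie} positions (MEI=11, MSI=3, MTI=7, SEI=9, SSI=1, STI=5), and the input is assumed to already be filtered to interrupts that are pending, enabled, and not delegated.

```python
# Decreasing priority order, paired with the standard mip/mie bit positions.
M_MODE_PRIORITY = [
    ("MEI", 11),
    ("MSI", 3),
    ("MTI", 7),
    ("SEI", 9),
    ("SSI", 1),
    ("STI", 5),
]


def highest_priority_interrupt(candidates):
    """Return the name of the highest-priority interrupt set in candidates.

    candidates is a bit mask of interrupts destined for M-mode.
    Returns None when no standard interrupt bit is set.
    """
    for name, bit in M_MODE_PRIORITY:
        if (candidates >> bit) & 1:
            return name
    return None
```

A timer and a software interrupt arriving together, for instance, resolve to MSI, because MSI precedes MTI in the list.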
\begin{commentary}
The machine-level interrupt fixed-priority ordering rules were developed
@@ -1600,9 +1611,6 @@ Synchronous exceptions are of lower priority than all interrupts.
Software interrupts are located in the lowest four bits of {\tt mip}
as these are often written by software, and this position allows the
use of a single CSR instruction with a five-bit immediate.
-
- Synchronous exceptions are given the lowest priority to minimize
- worst-case interrupt latency.
\end{commentary}
Restricted views of the {\tt mip} and {\tt mie} registers appear as
@@ -1668,10 +1676,11 @@ A future revision of this specification will define a mechanism to generate an
interrupt when a hardware performance monitor counter overflows.
\end{commentary}
-On RV32 only, reads of the {\tt mcycle}, {\tt minstret}, and {\tt
-mhpmcounter{\em n}} CSRs return the low 32 bits, while reads of the {\tt
-mcycleh}, {\tt minstreth}, and {\tt mhpmcounter{\em n}h} CSRs return bits
-63--32 of the corresponding counter.
+When MXLEN=32, reads of the {\tt mcycle}, {\tt minstret}, and {\tt
+mhpmcounter{\em n}} CSRs return bits 31--0 of the corresponding counter, and
+writes change only bits 31--0; reads of the {\tt mcycleh}, {\tt minstreth},
+and {\tt mhpmcounter{\em n}h} CSRs return bits 63--32 of the corresponding
+counter, and writes change only bits 63--32.
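The split access to a 64-bit counter when MXLEN=32 can be modelled as below. The class and method names are hypothetical; the low-half methods stand for accesses to a CSR such as {\tt mcycle}, and the high-half methods for its partner such as {\tt mcycleh}.

```python
MASK32 = 0xFFFFFFFF


class Counter64:
    """A 64-bit counter accessed through two 32-bit CSR halves."""

    def __init__(self, value=0):
        self.value = value & ((1 << 64) - 1)

    def read_low(self):
        # Returns bits 31-0 of the counter.
        return self.value & MASK32

    def read_high(self):
        # Returns bits 63-32 of the counter.
        return (self.value >> 32) & MASK32

    def write_low(self, x):
        # Changes only bits 31-0; bits 63-32 are preserved.
        self.value = (self.value & (MASK32 << 32)) | (x & MASK32)

    def write_high(self, x):
        # Changes only bits 63-32; bits 31-0 are preserved.
        self.value = (self.value & MASK32) | ((x & MASK32) << 32)
```

Because each write touches only its own half, software that needs to set the full 64-bit value has to issue two CSR writes.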
\begin{figure}[h!]
{\footnotesize
@@ -1774,7 +1783,7 @@ The {\tt cycle}, {\tt instret}, and {\tt hpmcounter{\em n}} CSRs are
read-only shadows of {\tt mcycle}, {\tt minstret}, and {\tt mhpmcounter{\em
n}}, respectively. The {\tt time} CSR is a read-only shadow of the
memory-mapped {\tt mtime} register. Analogously, on RV32I the {\tt cycleh},
-{\tt instreth} and {\tt hpmcounter{\em n}} CSRs are read-only shadows of
+{\tt instreth} and {\tt hpmcounter{\em n}h} CSRs are read-only shadows of
{\tt mcycleh}, {\tt minstreth} and {\tt mhpmcounter{\em n}h}, respectively.
On RV32I the {\tt timeh} CSR is a read-only shadow of the upper 32 bits of
the memory-mapped {\tt mtime} register, while {\tt time} shadows only the
@@ -1912,8 +1921,9 @@ Though masked, {\tt mepc[1]} remains writable when IALIGN=32.
{\tt mepc} is a \warl\ register that must be able to hold all valid
virtual addresses. It need not be capable of holding all possible invalid
-addresses. Implementations may convert some invalid address patterns into
-other invalid addresses prior to writing them to {\tt mepc}.
+addresses.
+Prior to writing {\tt mepc}, implementations may convert an invalid address
+into some other invalid address that {\tt mepc} is capable of holding.
\begin{commentary}
When address translation is not in effect, virtual addresses and physical
@@ -2051,7 +2061,7 @@ a policy on whether these need to be distinguished, and if so, whether
a given opcode should be treated as illegal or privileged.
\end{commentary}
-If an instruction raises multiple synchronous exceptions, the
+If an instruction may raise multiple synchronous exceptions, the
decreasing priority order of Table~\ref{exception-priority}
indicates which exception is taken and reported in {\tt mcause}.
The priority of any custom synchronous exceptions is implementation-defined.
@@ -2061,24 +2071,37 @@ The priority of any custom synchronous exceptions is implementation-defined.
\begin{tabular}{|l|r|l|}
\hline
- Priority & Exception Code & Description \\
+ Priority & Exc.\@ Code & Description \\
+ \hline
+ {\em Highest} & 3 & Instruction address breakpoint \\
+ \hline
+ & & During instruction address translation: \\
+ & 12, 1 & \quad First encountered page fault or
+ access fault \\
+ \hline
+ & & With physical address for instruction: \\
+ & 1 & \quad Instruction access fault \\
\hline
- {\em Highest} & 3 & Instruction address breakpoint \\ \hline
- & 12 & Instruction page fault \\ \hline
- & 1 & Instruction access fault \\ \hline
& 2 & Illegal instruction \\
& 0 & Instruction address misaligned \\
& 8, 9, 11 & Environment call \\
& 3 & Environment break \\
- & 3 & Load/Store/AMO address breakpoint \\ \hline
- {\em Optionally, these may have}
- & 6 & Store/AMO address misaligned \\
- {\em lowest priority instead.}
- & 4 & Load address misaligned \\ \hline
- & 15 & Store/AMO page fault \\
- & 13 & Load page fault \\ \hline
- & 7 & Store/AMO access fault \\
- & 5 & Load access fault \\
+ & 3 & Load/store/AMO address breakpoint \\
+ \hline
+ & & Optionally: \\
+ & 4, 6 & \quad Load/store/AMO address misaligned \\
+ \hline
+ & & During address translation for an explicit
+ memory access: \\
+ & 13, 15, 5, 7 & \quad First encountered page fault or
+ access fault \\
+ \hline
+ & & With physical address for an explicit
+ memory access: \\
+ & 5, 7 & \quad Load/store/AMO access fault \\
+ \hline
+ & & If not higher priority: \\
+ {\em Lowest} & 4, 6 & \quad Load/store/AMO address misaligned \\
\hline
\end{tabular}
@@ -2087,7 +2110,11 @@ The priority of any custom synchronous exceptions is implementation-defined.
\label{exception-priority}
\end{table*}
-Note that load/store/AMO address-misaligned exceptions may have
+When a virtual address is translated into
+a physical address, the address translation
+algorithm determines what specific exception may be raised.
+
+Load/store/AMO address-misaligned exceptions may have
either higher or lower priority than load/store/AMO page-fault and
access-fault exceptions.
\begin{commentary}
@@ -2125,26 +2152,15 @@ software in handling the trap. Otherwise, {\tt mtval} is never written by the
implementation, though it may be explicitly written by software. The hardware
platform will specify which exceptions must set {\tt mtval} informatively and
which may unconditionally set it to zero.
+If the hardware platform specifies that no exceptions set {\tt mtval} to a
+nonzero value, then {\tt mtval} is hardwired to zero.
-When a breakpoint,
-address-misaligned, access-fault, or page-fault exception occurs
-on an instruction fetch, load, or store, {\tt
- mtval} is written with the faulting virtual address. On an illegal
-instruction trap, {\tt mtval} may be written with the first XLEN or ILEN
-bits of the faulting instruction as described below. For other traps,
-{\tt mtval} is set to zero, but a future standard may redefine {\tt
- mtval}'s setting for other traps.
+If {\tt mtval} is written with a nonzero value when a breakpoint,
+address-misaligned, access-fault, or page-fault exception occurs on an
+instruction fetch, load, or store, then {\tt mtval} will contain the faulting
+virtual address.
\begin{commentary}
- The {\tt mtval} register replaces the {\tt mbadaddr} register in
- the previous specification. In addition to providing bad addresses,
- the register can now provide the bad instruction that triggered an
- illegal instruction trap (and may in future be used to return other
- information). Returning the instruction bits accelerates instruction emulation and also
- removes some races that might be present when trying to emulate
- illegal instructions.
-\end{commentary}
-\begin{commentary}
When page-based virtual memory is enabled, {\tt mtval} is written with
the faulting virtual address, even for physical-memory access-fault exceptions.
This design reduces datapath cost for most implementations, particularly
@@ -2168,29 +2184,28 @@ MXLEN \\
\label{mtvalreg}
\end{figure}
-For misaligned loads and stores that cause access-fault or page-fault exceptions,
-{\tt mtval} will contain the virtual address of the portion of the access that
-caused the fault. For instruction access-fault or page-fault exceptions on
-systems with variable-length instructions, {\tt mtval} will contain the
-virtual address of the portion of the instruction that caused the fault while
-{\tt mepc} will point to the beginning of the instruction.
-
-The {\tt mtval} register can optionally also be used to return the
-faulting instruction bits on an illegal instruction exception ({\tt
- mepc} points to the faulting instruction in memory).
-
-If this feature is not provided, then {\tt mtval} is set to zero on
-an illegal instruction fault.
-
-If this feature is provided, after an illegal instruction trap, {\tt mtval}
-will contain the shortest of:
+If {\tt mtval} is written with a nonzero value when a misaligned load or store
+causes an access-fault or page-fault exception, then {\tt mtval} will contain
+the virtual address of the portion of the access that caused the fault.
+
+If {\tt mtval} is written with a nonzero value when an instruction
+access-fault or page-fault exception occurs on a system with variable-length
+instructions, then {\tt mtval} will contain the virtual address of the portion
+of the instruction that caused the fault, while {\tt mepc} will point to the
+beginning of the instruction.
+
+The {\tt mtval} register can optionally also be used to return the faulting
+instruction bits on an illegal instruction exception ({\tt mepc} points to the
+faulting instruction in memory).
+If {\tt mtval} is written with a nonzero value when an illegal-instruction
+exception occurs, then {\tt mtval} will contain the shortest of:
\begin{compactitem}
\item the actual faulting instruction
\item the first ILEN bits of the faulting instruction
-\item the first XLEN bits of the faulting instruction
+\item the first MXLEN bits of the faulting instruction
\end{compactitem}
-The value loaded into {\tt mtval} is right-justified and all unused upper
-bits are cleared to zero.
+The value loaded into {\tt mtval} on an illegal-instruction exception is
+right-justified and all unused upper bits are cleared to zero.
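The "shortest of" rule above can be sketched as follows. The function name and parameters are hypothetical, and the sketch assumes that the first bits of an instruction are its least-significant bits when the instruction is held as an integer, which matches the right-justified result.

```python
def mtval_illegal_instruction(instr, instr_len, ilen, mxlen):
    """Model the value captured in mtval on an illegal-instruction exception.

    instr     -- faulting instruction bits as an integer (bit 0 is the
                 first bit of the instruction; an assumption of this sketch)
    instr_len -- length of the faulting instruction in bits
    ilen      -- ILEN in bits
    mxlen     -- MXLEN in bits
    """
    # Keep the shortest of: the instruction, its first ILEN bits,
    # or its first MXLEN bits.
    width = min(instr_len, ilen, mxlen)
    # Right-justify; all unused upper bits are zero.
    return instr & ((1 << width) - 1)
```

For example, a 16-bit instruction on an MXLEN=64 hart yields a value whose bits 63--16 are zero.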
\begin{commentary}
Capturing the faulting instruction in {\tt mtval} reduces the
@@ -2202,7 +2217,7 @@ bits are cleared to zero.
instruction memory, as might occur in a dynamic translation system.
A requirement is that the entire instruction (or at least the first
- XLEN bits) are fetched into {\tt mtval} before taking the trap.
+ MXLEN bits) are fetched into {\tt mtval} before taking the trap.
This should not constrain implementations, which would typically
fetch the entire instruction before attempting to decode the
instruction, and avoids complicating software handlers.
@@ -2215,15 +2230,221 @@ bits are cleared to zero.
appropriate trap handling before runtime).
\end{commentary}
-If the hardware platform specifies that no exceptions set {\tt mtval} to a
-nonzero value, then it may be hardwired to zero. Otherwise,
-{\tt mtval} is a \warl\ register that must be able to hold all valid
-virtual addresses and the value 0. It need not be capable of holding all
-possible invalid addresses. Implementations may convert some invalid address
-patterns into other invalid addresses prior to writing them to {\tt mtval}.
+For other traps, {\tt mtval} is set to zero, but a future standard may
+redefine {\tt mtval}'s setting for other traps.
+
+If {\tt mtval} is not hardwired to zero, it is a \warl\ register that must be
+able to hold all valid virtual addresses and the value zero.
+It need not be capable of holding all
+possible invalid addresses.
+Prior to writing {\tt mtval}, implementations may convert an invalid address
+into some other invalid address that {\tt mtval} is capable of holding.
If the feature to return the faulting instruction bits is implemented, {\tt
mtval} must also be able to hold all values less than $2^N$, where $N$ is the
-smaller of XLEN and ILEN.
+smaller of MXLEN and ILEN.
+
+\subsection{Machine Configuration Pointer Register ({\tt mconfigptr})}
+
+{\tt mconfigptr} is an MXLEN-bit read-only CSR, formatted as shown in
+Figure~\ref{mconfigptrreg}, that holds the physical address of a configuration
+data structure.
+Software can traverse this data structure to discover information about
+the harts, the platform, and their configuration.
+
+\begin{figure}[h!]
+{\footnotesize
+\begin{center}
+\begin{tabular}{@{}J}
+\instbitrange{MXLEN-1}{0} \\
+\hline
+\multicolumn{1}{|c|}{\tt mconfigptr} \\
+\hline
+MXLEN \\
+\end{tabular}
+\end{center}
+}
+\vspace{-0.1in}
+\caption{Machine Configuration Pointer register.}
+\label{mconfigptrreg}
+\end{figure}
+
+The pointer alignment in bits must be no smaller than the greatest supported
+MXLEN: i.e., if the greatest supported MXLEN is $8\times n$, then
+{\tt mconfigptr}[$\log_2n$-1:0] must be hardwired to zero.
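As a worked instance of the alignment rule above: for a greatest supported MXLEN of 64, $n=8$ and bits 2:0 must read as zero. A short sketch, with hypothetical helper names:

```python
def mconfigptr_zero_mask(max_mxlen):
    """Mask of the low mconfigptr bits that must be hardwired to zero."""
    n = max_mxlen // 8   # greatest supported MXLEN is 8*n bits
    return n - 1         # covers bits log2(n)-1 down to 0


def is_aligned_mconfigptr(value, max_mxlen):
    """Check that a candidate pointer value satisfies the alignment rule."""
    return value & mconfigptr_zero_mask(max_mxlen) == 0
```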
+
+{\tt mconfigptr} must be implemented, but it may be hardwired to zero to
+indicate the configuration data structure does not exist or that an
+alternative mechanism must be used to locate it.
+
+\begin{commentary}
+The format and schema of the configuration data structure have yet to be standardized.
+\end{commentary}
+
+\begin{commentary}
+While {\tt mconfigptr} will simply be hardwired in some implementations, other
+implementations may provide a means to configure the value returned on CSR
+reads.
+For example, {\tt mconfigptr} might present the value of a memory-mapped
+register that is programmed by the platform or by M-mode software towards the
+beginning of the boot process.
+\end{commentary}
+
+\subsection{%
+ Machine Environment Configuration Registers
+ ({\tt menvcfg} and {\tt menvcfgh})%
+}
+
+The {\tt menvcfg} CSR is an MXLEN-bit read/write register,
+formatted for MXLEN=64 as shown in Figure~\ref{fig:menvcfg},
+that controls certain characteristics of the execution environment
+for modes less privileged than M.
+
+\begin{figure}[h!]
+{\footnotesize
+\begin{center}
+\begin{tabular}{c@{}Kcc@{}W@{}Wc}
+\instbit{63} &
+\instbitrange{62}{8} &
+\instbit{7} &
+\instbit{6} &
+\instbitrange{5}{4} &
+\instbitrange{3}{1} &
+\instbit{0} \\
+\hline
+\multicolumn{1}{|c|}{STCE} &
+\multicolumn{1}{c|}{\wpri} &
+\multicolumn{1}{c|}{CBZE} &
+\multicolumn{1}{c|}{CBCFE} &
+\multicolumn{1}{c|}{CBIE} &
+\multicolumn{1}{c|}{\wpri} &
+\multicolumn{1}{c|}{FIOM} \\
+\hline
+1 & 55 & 1 & 1 & 2 & 3 & 1 \\
+\end{tabular}
+\end{center}
+}
+\vspace{-0.1in}
+\caption{Machine environment configuration register ({\tt menvcfg}) for MXLEN=64.}
+\label{fig:menvcfg}
+\end{figure}
+
+If bit FIOM (Fence of I/O implies Memory) is set to one in {\tt menvcfg},
+FENCE instructions executed in modes less privileged than M are modified so
+the requirement to order accesses to device I/O implies also the requirement
+to order main memory accesses.
+Table~\ref{tab:menvcfg-FIOM} details the modified interpretation of
+FENCE instruction bits PI, PO, SI, and SO for modes less privileged than M
+when FIOM=1.
+
+Similarly, for modes less privileged than M when FIOM=1,
+if an atomic instruction that accesses a region ordered as device I/O
+has its {\em aq} and/or {\em rl} bit set, then that instruction is ordered
+as though it accesses both device I/O and memory.
+
+If S-mode is not supported, or if {\tt satp}.MODE is hardwired to Bare,
+the implementation may hardwire FIOM to zero.
+
+\begin{table}[h!]
+\begin{center}
+\begin{tabular}{|c|l|}
+\hline
+Instruction bit & Meaning when set \\
+\hline
+PI & Predecessor device input and memory reads (PR implied) \\
+PO & Predecessor device output and memory writes (PW implied) \\
+\hline
+SI & Successor device input and memory reads (SR implied) \\
+SO & Successor device output and memory writes (SW implied) \\
+\hline
+\end{tabular}
+\end{center}
+\vspace{-0.1in}
+\caption{%
+Modified interpretation of FENCE predecessor and successor sets
+for modes less privileged than M when FIOM=1.%
+}
+\label{tab:menvcfg-FIOM}
+\end{table}
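The modified interpretation in the table above amounts to OR-ing the implied memory bits into each 4-bit predecessor or successor set. The sketch below assumes the usual FENCE ordering of each set (I, O, R, W from most- to least-significant bit); the function and parameter names are hypothetical.

```python
# Bit positions within a 4-bit FENCE predecessor or successor set.
FENCE_I, FENCE_O, FENCE_R, FENCE_W = 0b1000, 0b0100, 0b0010, 0b0001


def effective_fence_set(bits, fiom, in_m_mode):
    """Return the set as interpreted by the hart.

    bits      -- the 4-bit PI/PO/PR/PW or SI/SO/SR/SW field
    fiom      -- value of menvcfg.FIOM
    in_m_mode -- True if the FENCE executes in M-mode (FIOM has no effect)
    """
    bits &= 0xF
    if fiom and not in_m_mode:
        if bits & FENCE_I:
            bits |= FENCE_R   # device input implies memory reads
        if bits & FENCE_O:
            bits |= FENCE_W   # device output implies memory writes
    return bits
```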
+
+\begin{commentary}
+Bit FIOM is needed in {\tt menvcfg} so M-mode can emulate the
+hypervisor extension of Chapter~\ref{hypervisor}, which has an
+equivalent FIOM bit in the hypervisor CSR {\tt henvcfg}.
+\end{commentary}
+
+The definition of the STCE field will be furnished by the
+forthcoming Sstc extension.
+Its allocation within {\tt menvcfg} may change prior to the ratification
+of that extension.
+
+The definition of the CBZE field will be furnished by the
+forthcoming Zicboz extension.
+Its allocation within {\tt menvcfg} may change prior to the ratification
+of that extension.
+
+The definitions of the CBCFE and CBIE fields will be furnished by the
+forthcoming Zicbom extension.
+Their allocations within {\tt menvcfg} may change prior to the ratification
+of that extension.
+
+When MXLEN=32, {\tt menvcfg} contains the same fields as bits 31:0
+of {\tt menvcfg} when MXLEN=64.
+Additionally, when MXLEN=32, {\tt menvcfgh} is a 32-bit read/write register that
+contains the same fields as bits 63:32 of {\tt menvcfg} when
+MXLEN=64.
+Register {\tt menvcfgh} does not exist when MXLEN=64.
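A sketch of decoding the MXLEN=64 layout shown in the figure above, and of how the same 64 bits divide between {\tt menvcfg} and {\tt menvcfgh} when MXLEN=32. The field positions are copied from the figure and, as the text notes, may change before the relevant extensions are ratified; the helper names are hypothetical.

```python
# (low bit, width) of each field in the MXLEN=64 layout of the figure.
MENVCFG_FIELDS = {
    "FIOM": (0, 1),
    "CBIE": (4, 2),
    "CBCFE": (6, 1),
    "CBZE": (7, 1),
    "STCE": (63, 1),
}


def decode_menvcfg(value):
    """Extract each named field from a 64-bit menvcfg value."""
    return {
        name: (value >> low) & ((1 << width) - 1)
        for name, (low, width) in MENVCFG_FIELDS.items()
    }


def split_for_mxlen32(value):
    """Return (menvcfg, menvcfgh) as they would appear when MXLEN=32."""
    return value & 0xFFFFFFFF, (value >> 32) & 0xFFFFFFFF
```

With this layout, STCE lands in bit 31 of {\tt menvcfgh} when MXLEN=32.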
+
+If U-mode is not supported, then registers {\tt menvcfg} and {\tt menvcfgh} do
+not exist.
+
+\subsection{Machine Security Configuration Register ({\tt mseccfg})}
+\label{sec:mseccfg}
+
+{\tt mseccfg} is an optional MXLEN-bit read/write register, formatted as shown
+in Figure~\ref{fig:mseccfg}, that controls security features.
+
+When MXLEN=32 only, {\tt mseccfgh} is a 32-bit read/write register that
+contains the same fields as {\tt mseccfg} bits 63:32 when MXLEN=64.
+
+\begin{figure*}[h!]
+{\footnotesize
+\begin{center}
+\setlength{\tabcolsep}{4pt}
+\begin{tabular}{MccFccc}
+\instbitrange{XLEN-1}{10} &
+\instbit{9} &
+\instbit{8} &
+\instbitrange{7}{3} &
+\instbit{2} &
+\instbit{1} &
+\instbit{0} \\
+\hline
+\multicolumn{1}{|c|}{\wpri} &
+\multicolumn{1}{c|}{SSEED} &
+\multicolumn{1}{c|}{USEED} &
+\multicolumn{1}{c|}{\wpri} &
+\multicolumn{1}{c|}{RLB} &
+\multicolumn{1}{c|}{MMWP} &
+\multicolumn{1}{c|}{MML} \\
+\hline
+XLEN-10 & 1 & 1 & 5 & 1 & 1 & 1 \\
+\end{tabular}
+\end{center}
+}
+\vspace{-0.1in}
+\caption{Machine security configuration register ({\tt mseccfg}).}
+\label{fig:mseccfg}
+\end{figure*}
+
+The definitions of the SSEED and USEED fields will be furnished by the
+forthcoming entropy-source extension, Zkr.
+Their allocations within {\tt mseccfg} may change prior to the ratification
+of that extension.
+
+The definitions of the RLB, MMWP, and MML fields will be furnished by the
+forthcoming PMP-enhancement extension, Smepmp.
+Their allocations within {\tt mseccfg} may change prior to the ratification
+of that extension.
\section{Machine-Level Memory-Mapped Registers}
@@ -2339,7 +2560,7 @@ to the intermediate value of the comparand:
\end{figure}
For RV64, naturally aligned 64-bit memory accesses to the {\tt mtime} and {\tt
- mtimecmp} registers are atomic.
+ mtimecmp} registers are additionally supported and are atomic.
\section{Machine-Mode Privileged Instructions}
@@ -2432,15 +2653,6 @@ mode stack. In addition to manipulating the privilege stack as
described in Section~\ref{privstack}, {\em x}\/RET sets the {\tt pc}
to the value stored in the {\em x}\/{\tt epc} register.
-\begin{commentary}
-Previously, there was only a single ERET instruction (which was also
-earlier known as SRET). To support the addition of user-level
-interrupts, we needed to add a separate URET instruction to continue
-to allow classic virtualization of OS code using the ERET instruction.
-It then became more orthogonal to support a different {\em x}\/RET
-instruction per privilege level.
-\end{commentary}
-
If the A extension is supported, the {\em x}\/RET instruction is
allowed to clear any outstanding LR address reservation but is not
required to. Trap handlers should explicitly clear the reservation if
@@ -2531,9 +2743,9 @@ discarded before the WFI is executed.
As implementations are free to implement WFI as a NOP, software must
explicitly check for any relevant pending but disabled interrupts in
the code following a WFI, and should loop back to the WFI if no
-suitable interrupt was detected. The {\tt mip}, {\tt sip},
-or {\tt uip} registers can be interrogated to determine the presence
-of any interrupt in machine, supervisor, or user mode
+suitable interrupt was detected. The {\tt mip} or {\tt sip}
+registers can be interrogated to determine the presence
+of any interrupt in machine or supervisor mode
respectively.
The operation of WFI is unaffected by the delegation register settings.
@@ -2550,6 +2762,46 @@ extensions that wait on memory locations changing, or message
arrival.
\end{commentary}
+\subsection{Custom SYSTEM Instructions}
+\label{sec:customsys}
+
+The subspace of the SYSTEM major opcode shown in Figure~\ref{fig:customsys}
+is designated for custom use.
+It is recommended that these instructions use bits 29:28 to designate the
+minimum required privilege mode, as do other SYSTEM instructions.
+
+\begin{figure}[h!]
+\begin{center}
+\begin{tabular}{Y@{}S@{}F@{}Y@{}Rc}
+\\
+\instbitrange{31}{26} &
+\instbitrange{25}{15} &
+\instbitrange{14}{12} &
+\instbitrange{11}{7} &
+\instbitrange{6}{0} \\
+\cline{1-5}
+\multicolumn{1}{|c|}{funct6} &
+\multicolumn{1}{c|}{\em custom} &
+\multicolumn{1}{c|}{funct3} &
+\multicolumn{1}{c|}{\em custom} &
+\multicolumn{1}{c|}{opcode} &
+Recommended Purpose \\
+\cline{1-5}
+6 & 11 & 3 & 5 & 7 \\
+100011 & {\em custom} & 0 & {\em custom} & SYSTEM & Unprivileged or User-Level \\
+110011 & {\em custom} & 0 & {\em custom} & SYSTEM & Unprivileged or User-Level \\
+100111 & {\em custom} & 0 & {\em custom} & SYSTEM & Supervisor-Level \\
+110111 & {\em custom} & 0 & {\em custom} & SYSTEM & Supervisor-Level \\
+101011 & {\em custom} & 0 & {\em custom} & SYSTEM & Hypervisor-Level \\
+111011 & {\em custom} & 0 & {\em custom} & SYSTEM & Hypervisor-Level \\
+101111 & {\em custom} & 0 & {\em custom} & SYSTEM & Machine-Level \\
+111111 & {\em custom} & 0 & {\em custom} & SYSTEM & Machine-Level \\
+\end{tabular}
+\end{center}
+\caption{SYSTEM instruction encodings designated for custom use.}
+\label{fig:customsys}
+\end{figure}
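The eight encodings in the figure above share a pattern: bit 31 and bits 27:26 are set, funct3 is zero, and bits 29:28 give the recommended minimum privilege mode. A sketch of a decoder under those assumptions, with hypothetical names (the SYSTEM major opcode is 1110011):

```python
SYSTEM_OPCODE = 0b1110011
PRIV_NAMES = {0b00: "U", 0b01: "S", 0b10: "H", 0b11: "M"}


def custom_system_privilege(instr):
    """Return the recommended minimum privilege level ("U", "S", "H", "M")
    if instr lies in the custom SYSTEM subspace of the figure, else None."""
    opcode = instr & 0x7F
    funct3 = (instr >> 12) & 0x7
    funct6 = (instr >> 26) & 0x3F
    if opcode != SYSTEM_OPCODE or funct3 != 0:
        return None
    # All listed funct6 values match 1xxx11 (bit 30 and bits 29:28 vary).
    if funct6 & 0b100011 != 0b100011:
        return None
    return PRIV_NAMES[(instr >> 28) & 0x3]
```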
+
\section{Reset}
\label{sec:reset}
@@ -2564,13 +2816,14 @@ reset vector. The {\tt mcause} register is set to a value indicating the
cause of the reset.
Writable PMP registers' A and L fields are set to 0, unless the platform
mandates a different reset value for some PMP registers' A and L fields.
+No \warl\ field contains an illegal value.
All other hart state is \unspecified.
The {\tt mcause} values after reset have implementation-specific
interpretation, but the value 0 should be returned on implementations
that do not distinguish different reset conditions. Implementations
that distinguish different reset conditions should only use 0 to
-indicate the most complete reset (e.g., hard reset).
+indicate the most complete reset.
\begin{commentary}
Some designs may have multiple causes of reset (e.g., power-on reset,
@@ -2978,6 +3231,19 @@ because they will be fixed as either uncached, read-only, hardware
cache-coherent, or only accessed by one agent.
\end{commentary}
+If a PMA indicates non-cacheability, then accesses to that region must
+be satisfied by the memory itself, not by any caches.
+
+\begin{commentary}
+For implementations with a cacheability-control mechanism, the situation
+may arise that a program uncacheably accesses a memory location that is
+currently cache-resident.
+In this situation, the cached copy must be ignored.
+This constraint is necessary to prevent more-privileged modes' speculative
+cache refills from affecting the behavior of less-privileged modes'
+uncacheable accesses.
+\end{commentary}
+
\subsection{Idempotency PMAs}
Idempotency PMAs describe whether reads and writes to an address
@@ -3007,6 +3273,21 @@ misaligned access using multiple smaller accesses, which could cause
unexpected side effects.
\end{commentary}
+For non-idempotent regions, implicit reads and writes must not be performed
+early or speculatively, with the following exceptions.
+When a non-speculative implicit read is performed, an implementation is
+permitted to additionally read any of the bytes within a naturally aligned
+power-of-2 region containing the address of the non-speculative implicit read.
+Furthermore, when a non-speculative instruction fetch is performed, an
+implementation is permitted to additionally read any of the bytes within the
+{\em next} naturally aligned power-of-2 region of the same size (with the
+address of the region taken modulo $2^{\text{XLEN}}$).
+The results of these additional reads may be used to satisfy subsequent early
+or speculative implicit reads.
+The size of these naturally aligned power-of-2 regions is
+implementation-defined, but, for systems with page-based virtual memory, must
+not exceed the smallest supported page size.
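The region arithmetic in the paragraph above can be sketched as follows. The function name and parameters are hypothetical; the region size is implementation-defined and is simply passed in here.

```python
def permitted_read_regions(addr, region_size, xlen, is_fetch):
    """Return the base addresses of the naturally aligned regions that may
    additionally be read alongside a non-speculative implicit read at addr."""
    if region_size <= 0 or region_size & (region_size - 1):
        raise ValueError("region_size must be a power of 2")
    # Naturally aligned region containing the address.
    base = addr & ~(region_size - 1)
    regions = [base]
    if is_fetch:
        # For instruction fetches, also the next region of the same size,
        # with the region address taken modulo 2**XLEN.
        regions.append((base + region_size) % (1 << xlen))
    return regions
```

The modulo matters only for the last region of the address space, where the next region wraps around to address zero.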
+
\section{Physical Memory Protection}
\label{sec:pmp}
@@ -3052,7 +3333,8 @@ PMP entries are described by an 8-bit configuration register and one MXLEN-bit
address register. Some PMP settings additionally use the address register
associated with the preceding PMP entry.
Up to 64 PMP entries are supported.
-Implementations may implement zero, 16, or 64 PMP CSRs.
+Implementations may implement zero, 16, or 64 PMP CSRs; the lowest-numbered
+PMP CSRs must be implemented first.
All PMP CSR fields are \warl\ and may be hardwired to zero.
PMP CSRs are only accessible to M-mode.
@@ -3255,9 +3537,10 @@ space, so the RV64 PMP address registers impose the same limit.
Figure~\ref{pmpcfg} shows the layout of a PMP configuration register. The R,
W, and X bits, when set, indicate that the PMP entry permits read, write, and
instruction execution, respectively. When one of these bits is clear, the
-corresponding access type is denied. The combination R=0 and W=1 is reserved
-for future use. The remaining two fields, A and L, are
-described in the following sections.
+corresponding access type is denied.
+The R, W, and X fields form a collective \warl\ field for which the
+combinations with R=0 and W=1 are reserved.
+The remaining two fields, A and L, are described in the following sections.
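A sketch of the reserved-combination check described above, assuming the usual placement of R, W, and X in bits 0, 1, and 2 of a PMP configuration register. The helper name is hypothetical.

```python
PMP_R = 1 << 0
PMP_W = 1 << 1
PMP_X = 1 << 2


def pmp_rwx_is_reserved(cfg):
    """True if the R/W/X bits of cfg form a reserved combination,
    i.e. R=0 and W=1, regardless of X."""
    return (cfg & PMP_R) == 0 and (cfg & PMP_W) != 0
```

Because the three bits form one collective WARL field, the check considers R and W together instead of validating each bit on its own.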
\begin{figure}[h!]
{\footnotesize
diff --git a/src/memory.tex b/src/memory.tex
index f35d247..2cab253 100644
--- a/src/memory.tex
+++ b/src/memory.tex
@@ -668,7 +668,7 @@ Moreover, since SC is defined to carry dependencies from its source registers to
{
\tt\small
\begin{tabular}{cl||cl}
- \multicolumn{4}{c}{Initial values: 0(s0)=1; 0(s1)=1} \\
+ \multicolumn{4}{c}{Initial values: 0(s0)=1; 0(s2)=1} \\
\\
\multicolumn{2}{c}{Hart 0} & \multicolumn{2}{c}{Hart 1} \\
\hline
@@ -1230,6 +1230,15 @@ For example, the LR must also be made to respect any data dependencies that the
Likewise, the effect of a FENCE~R,R elsewhere in the same hart must also be made to apply to the SC, which would not otherwise respect that fence.
The emulator may achieve this effect by simply mapping AMOs onto {\tt lr.aq;~<op>;~sc.aqrl}, matching the mapping used elsewhere for fully ordered atomics.
+These C11/C++11 mappings require the platform to provide the following Physical Memory Attributes (as defined in the RISC-V Privileged ISA) for all memory:
+\begin{itemize}
+ \item main memory
+ \item coherent
+ \item AMOArithmetic
+ \item RsrvEventual
+\end{itemize}
+Platforms with different attributes may require different mappings, or require platform-specific SW (e.g., memory-mapped I/O).
+
\section{Implementation Guidelines}
The RVWMO and RVTSO memory models by no means preclude microarchitectures from employing sophisticated speculation techniques or other forms of optimization in order to deliver higher performance.
@@ -1328,7 +1337,6 @@ We expect that any or all of the following possible future extensions would be c
\begin{itemize}
\item `V' vector ISA extensions
- \item A transactional memory subset of the `T' ISA extension
\item `J' JIT extension
\item Native encodings for load and store opcodes with {\em aq} and {\em rl} set
\item Fences limited to certain addresses
diff --git a/src/n.tex b/src/n.tex
deleted file mode 100644
index 902aba0..0000000
--- a/src/n.tex
+++ /dev/null
@@ -1,235 +0,0 @@
-\chapter{``N'' Standard Extension for User-Level Interrupts, Version 1.1}
-\label{chap:n}
-
-\begin{commentary}
- This is a placeholder for a more complete writeup of the N
- extension, and to form a basis for discussion.
-
- An ongoing topic of discussion is whether, for systems needing only M and
- U privilege modes, the N extension should be supplanted by S-mode without
- virtual memory (i.e., with {\tt satp} hardwired to zero).
- This approach would have similar hardware cost and would simplify the
- architecture.
-\end{commentary}
-
-This chapter presents a proposal for adding RISC-V user-level
-interrupt and exception handling. When the N extension is present,
-and the outer execution environment has delegated designated
-interrupts and exceptions to user-level, then hardware can transfer
-control directly to a user-level trap handler without invoking the
-outer execution environment.
-
-\begin{commentary}
-User-level interrupts are primarily intended to support secure
-embedded systems with only M-mode and U-mode present, but can also be
-supported in systems running Unix-like operating systems to support
-user-level trap handling.
-
-When used in an Unix environment, the user-level interrupts would
-likely not replace conventional signal handling, but could be used as
-a building block for further extensions that generate user-level
-events such as garbage collection barriers, integer overflow,
-floating-point traps.
-\end{commentary}
-
-\section{Additional CSRs}
-
-New user-visible CSRs are added to support the N extension.
-Their encodings are listed in Table~\ref{ucsrnames} in
-Chapter~\ref{chap:priv-csrs}.
-
-\subsection{User Status Register ({\tt ustatus})}
-
-The {\tt ustatus} register is a UXLEN-bit read/write register
-formatted as shown in Figure~\ref{ustatusreg}. The {\tt ustatus}
-register keeps track of and controls the hart's current operating
-state.
-
-\begin{figure*}[h!]
-\begin{center}
-\setlength{\tabcolsep}{4pt}
-\begin{tabular}{KccFc}
-\\
-\instbitrange{UXLEN}{5} &
-\instbit{4} &
-\instbitrange{3}{1} &
-\instbit{0} \\
-\hline
-\multicolumn{1}{|c|}{\wpri} &
-\multicolumn{1}{c|}{UPIE} &
-\multicolumn{1}{c|}{\wpri} &
-\multicolumn{1}{c|}{UIE} \\
-\hline
-UXLEN-5 & 1 & 3 & 1 \\
-\end{tabular}
-\end{center}
-\vspace{-0.1in}
-\caption{User-mode status register ({\tt ustatus}).}
-\label{ustatusreg}
-\end{figure*}
-
-The user interrupt-enable bit UIE disables user-level interrupts when
-clear. The value of UIE is copied into UPIE when a user-level trap is
-taken, and the value of UIE is set to zero to provide atomicity for
-the user-level trap handler.
-
-The UIE and UPIE bits are mirrored in the {\tt mstatus} and {\tt sstatus}
-registers in the same bit positions.
-
-\begin{commentary}
- There is no UPP bit to hold the previous privilege mode as it can
- only be user mode.
-\end{commentary}
-
-A new instruction, URET, is used to return from traps in U-mode.
-URET copies UPIE into UIE, then sets UPIE, before copying {\tt uepc}
-to the {\tt pc}.
-\begin{commentary}
- UPIE is set after the UPIE/UIE stack is popped to enable interrupts
- and help catch coding errors.
-\end{commentary}
-
-\subsection{User Interrupt Registers ({\tt uip} and {\tt uie})}
-
-The {\tt uip} register is a UXLEN-bit read/write register containing
-information on pending interrupts, while {\tt uie} is the corresponding
-UXLEN-bit read/write register containing interrupt enable bits.
-
-\begin{figure*}[h!]
-{\footnotesize
-\begin{center}
-\setlength{\tabcolsep}{4pt}
-\begin{tabular}{KcFcFc}
-\instbitrange{UXLEN-1}{9} &
-\instbit{8} &
-\instbitrange{7}{5} &
-\instbit{4} &
-\instbitrange{3}{1} &
-\instbit{0} \\
-\hline
-\multicolumn{1}{|c|}{\wpri} &
-\multicolumn{1}{c|}{UEIP} &
-\multicolumn{1}{c|}{\wpri} &
-\multicolumn{1}{c|}{UTIP} &
-\multicolumn{1}{c|}{\wpri} &
-\multicolumn{1}{c|}{USIP} \\
-\hline
-UXLEN-9 & 1 & 3 & 1 & 3 & 1 \\
-\end{tabular}
-\end{center}
-}
-\vspace{-0.1in}
-\caption{User interrupt-pending register ({\tt uip}).}
-\label{uipreg}
-\end{figure*}
-
-\begin{figure*}[h!]
-{\footnotesize
-\begin{center}
-\setlength{\tabcolsep}{4pt}
-\begin{tabular}{KcFcFc}
-\instbitrange{UXLEN-1}{9} &
-\instbit{8} &
-\instbitrange{7}{5} &
-\instbit{4} &
-\instbitrange{3}{1} &
-\instbit{0} \\
-\hline
-\multicolumn{1}{|c|}{\wpri} &
-\multicolumn{1}{c|}{UEIE} &
-\multicolumn{1}{c|}{\wpri} &
-\multicolumn{1}{c|}{UTIE} &
-\multicolumn{1}{c|}{\wpri} &
-\multicolumn{1}{c|}{USIE} \\
-\hline
-UXLEN-9 & 1 & 3 & 1 & 3 & 1 \\
-\end{tabular}
-\end{center}
-}
-\vspace{-0.1in}
-\caption{User interrupt-enable register ({\tt uie}).}
-\label{uiereg}
-\end{figure*}
-
-Three types of interrupts are defined: software interrupts, timer interrupts,
-and external interrupts. A user-level software interrupt is triggered
-on the current hart by writing 1 to its user software interrupt-pending
-(USIP) bit in the {\tt uip} register. A pending user-level software
-interrupt can be cleared by writing 0 to the USIP bit in {\tt uip}.
-User-level software interrupts are disabled when the USIE bit in the
-{\tt uie} register is clear.
-
-The ABI should provide a mechanism to send interprocessor interrupts to other
-harts, which will ultimately cause the USIP bit to be set in the recipient
-hart's {\tt uip} register.
-
-All bits besides USIP in the {\tt uip} register are read-only.
-
-A user-level timer interrupt is pending if the UTIP bit in the {\tt uip}
-register is set. User-level timer interrupts are disabled when the UTIE
-bit in the {\tt uie} register is clear. The ABI should provide a
-mechanism to clear a pending timer interrupt.
-
-A user-level external interrupt is pending if the UEIP bit in the
-{\tt uip} register is set. User-level external interrupts are disabled
-when the UEIE bit in the {\tt uie} register is clear. The ABI
-should provide facilities to mask, unmask, and query the cause of external
-interrupts.
-
-The {\tt uip} and {\tt uie} registers are subsets of the {\tt mip} and {\tt
-mie} registers.
-Reading any field, or writing any writable field, of {\tt uip}/{\tt uie}
-effects a read or write of the homonymous field of {\tt mip}/{\tt mie}.
-If S-mode is implemented, the {\tt uip} and {\tt uie} registers are also
-subsets of the {\tt sip} and {\tt sie} registers.
-
-\subsection{Machine Trap Delegation Registers ({\tt medeleg} and {\tt mideleg})}
-
-In systems with the N extension, the {\tt medeleg} and {\tt mideleg}
-registers, described in Chapter~\ref{machine}, must be implemented.
-
-In systems that implement S-mode, {\tt medeleg} and {\tt mideleg}
-behave as described in Chapter~\ref{machine}.
-In systems with only M and U privilege modes, setting a bit in {\tt medeleg}
-or {\tt mideleg} delegates the corresponding trap in U-mode to the U-mode trap
-handler.
-
-\subsection{Supervisor Trap Delegation Registers ({\tt sedeleg} and {\tt sideleg})}
-
-For systems with both S-mode and the N extension, new CSRs {\tt
-sedeleg} and {\tt sideleg} are added.
-These registers have the same layout as the machine trap delegation registers,
-{\tt medeleg} and {\tt mideleg}.
-
-{\tt sedeleg} and {\tt sideleg} allow S-mode to delegate traps to U-mode.
-Only bits corresponding to traps that have been delegated to S-mode are
-writable; the others are hardwired to zero.
-Setting a bit in {\tt sedeleg} or {\tt sideleg} delegates the corresponding
-trap in U-mode to the U-mode trap handler.
-
-\subsection{Other CSRs}
-
-The {\tt uscratch}, {\tt uepc}, {\tt ucause}, {\tt utvec}, and {\tt utval}
-CSRs are defined analogously to the {\tt mscratch}, {\tt mepc}, {\tt mcause},
-{\tt mtvec}, and {\tt mtval} CSRs.
-
-\begin{commentary}
- A more complete writeup is to follow.
-\end{commentary}
-
-\section{N Extension Instructions}
-
-The URET instruction is added to perform the analogous function to
-MRET and SRET.
-
-\section{Reducing Context-Swap Overhead}
-
-The user-level interrupt-handling registers add considerable state to
-the user-level context, yet will usually rarely be active in normal
-use. In particular, {\tt uepc}, {\tt ucause}, and {\tt utval} are
-only valid during execution of a trap handler.
-
-An NS field can be added to {\tt mstatus} and {\tt sstatus} following
-the format of the FS and XS fields to reduce context-switch overhead
-when the values are not live. Execution of URET will place the {\tt
- uepc}, {\tt ucause}, and {\tt utval} back into initial state.
diff --git a/src/naming.tex b/src/naming.tex
index 6aa7d6c..b123f69 100644
--- a/src/naming.tex
+++ b/src/naming.tex
@@ -83,7 +83,7 @@ Chapter~\ref{chap:zifencei}; ``Zifencei2'' and ``Zifencei2p0'' name version
2.0 of same.
The first letter following the ``Z'' conventionally indicates the most closely
-related alphabetical extension category, IMAFDQLCBKJTPVN. For the ``Zam''
+related alphabetical extension category, IMAFDQLCBKJTPV. For the ``Zam''
extension for misaligned atomics, for example, the letter ``a'' indicates the
extension is related to the ``A'' standard extension. If multiple ``Z''
extensions are named, they should be ordered first by category, then
@@ -164,15 +164,12 @@ Double-Precision Floating-Point & D & F \\
General & G & IMADZifencei \\
\hline
Quad-Precision Floating-Point & Q & D\\
-Decimal Floating-Point & L & \\
16-bit Compressed Instructions & C & \\
Bit Manipulation & B & \\
Cryptography Extensions & K & \\
Dynamic Languages & J & \\
-Transactional Memory & T & \\
Packed-SIMD Extensions & P & \\
Vector Extensions & V & \\
-User-Level Interrupts & N & \\
Control and Status Register Access & Zicsr & \\
Instruction-Fetch Fence & Zifencei & \\
Misaligned Atomics & Zam & A \\
diff --git a/src/preface.tex b/src/preface.tex
index 1e420c6..7525409 100644
--- a/src/preface.tex
+++ b/src/preface.tex
@@ -39,7 +39,14 @@ The document contains the following versions of the RISC-V ISA modules:
\em V & \em 0.7 & \em Draft \\
\bf Zicsr & \bf 2.0 & \bf Ratified \\
\bf Zifencei & \bf 2.0 & \bf Ratified \\
+ \bf Zihintpause & \bf 2.0 & \bf Ratified \\
\em Zam & \em 0.1 & \em Draft \\
+ \em Zfh & \em 0.1 & \em Draft \\
+ \em Zfhmin & \em 0.1 & \em Draft \\
+ \em Zfinx & \em 1.0 & \em Frozen \\
+ \em Zdinx & \em 1.0 & \em Frozen \\
+ \em Zhinx & \em 1.0 & \em Frozen \\
+ \em Zhinxmin & \em 1.0 & \em Frozen \\
\em Ztso & \em 0.1 & \em Frozen \\
\hline
\end{tabular}
@@ -53,6 +60,8 @@ The document contains the following versions of the RISC-V ISA modules:
%\itemsep 1pt
%\end{itemize}
+\FloatBarrier
+
\section*{Preface to Document Version 20191213-Base-Ratified}
This document describes the RISC-V unprivileged architecture.
@@ -113,6 +122,8 @@ The changes in this version of the document include:
\item Defined PAUSE hint instruction.
\end{itemize}
+\FloatBarrier
+
\section*{Preface to Document Version 20190608-Base-Ratified}
This document describes the RISC-V unprivileged architecture.
@@ -231,6 +242,8 @@ The changes in this version of the document include:
extension draft document.
\end{itemize}
+\FloatBarrier
+
\section*{Preface to Document Version 2.2}
This is version 2.2 of the document describing the RISC-V
@@ -306,6 +319,8 @@ The major changes in this version of the document include:
\item The C extension has been frozen and renumbered version 2.0.
\end{itemize}
+\FloatBarrier
+
\section*{Preface to Document Version 2.1}
This is version 2.1 of the document describing the RISC-V user-level
diff --git a/src/priv-csrs.tex b/src/priv-csrs.tex
index c1bd6d8..5510452 100644
--- a/src/priv-csrs.tex
+++ b/src/priv-csrs.tex
@@ -10,10 +10,10 @@ The privileged architecture requires the Zicsr extension; which other
privileged instructions are required depends on the privileged-architecture
feature set.
-In addition to the user-level
+In addition to the unprivileged
state described in Volume I of this manual, an implementation may
contain additional CSRs, accessible by some subset of the privilege
-levels using the CSR instructions described in the user-level manual.
+levels using the CSR instructions described in Volume~I.
In this chapter, we map out the CSR address space. The following
chapters describe the function of each of the CSRs according to
privilege level, as well as the other privileged instructions which
@@ -22,6 +22,9 @@ Note that although CSRs and instructions are associated with one
privilege level, they are also accessible at all higher privilege
levels.
+Standard CSRs do not have side effects on reads but may have side effects
+on writes.
+
\section{CSR Address Mapping Conventions}
The standard RISC-V ISA sets aside a 12-bit encoding space (csr[11:0])
@@ -52,7 +55,7 @@ less-privileged software.
\multicolumn{3}{|c|}{CSR Address} & Hex & \multicolumn{1}{c|}{Use and Accessibility}\\ \cline{1-3}
[11:10] & [9:8] & [7:4] & & \\
\hline
-\multicolumn{5}{|c|}{User CSRs} \\
+\multicolumn{5}{|c|}{Unprivileged and User-Level CSRs} \\
\hline
\tt 00 &\tt 00 &\tt XXXX & \tt 0x000-0x0FF & Standard read/write \\
\tt 01 &\tt 00 &\tt XXXX & \tt 0x400-0x4FF & Standard read/write \\
@@ -61,7 +64,7 @@ less-privileged software.
\tt 11 &\tt 00 &\tt 10XX & \tt 0xC80-0xCBF & Standard read-only \\
\tt 11 &\tt 00 &\tt 11XX & \tt 0xCC0-0xCFF & Custom read-only \\
\hline
-\multicolumn{5}{|c|}{Supervisor CSRs} \\
+\multicolumn{5}{|c|}{Supervisor-Level CSRs} \\
\hline
\tt 00 &\tt 01 &\tt XXXX & \tt 0x100-0x1FF & Standard read/write \\
\tt 01 &\tt 01 &\tt 0XXX & \tt 0x500-0x57F & Standard read/write \\
@@ -74,7 +77,7 @@ less-privileged software.
\tt 11 &\tt 01 &\tt 10XX & \tt 0xD80-0xDBF & Standard read-only \\
\tt 11 &\tt 01 &\tt 11XX & \tt 0xDC0-0xDFF & Custom read-only \\
\hline
-\multicolumn{5}{|c|}{Hypervisor CSRs} \\
+\multicolumn{5}{|c|}{Hypervisor and VS CSRs} \\
\hline
\tt 00 &\tt 10 &\tt XXXX & \tt 0x200-0x2FF & Standard read/write \\
\tt 01 &\tt 10 &\tt 0XXX & \tt 0x600-0x67F & Standard read/write \\
@@ -87,7 +90,7 @@ less-privileged software.
\tt 11 &\tt 10 &\tt 10XX & \tt 0xE80-0xEBF & Standard read-only \\
\tt 11 &\tt 10 &\tt 11XX & \tt 0xEC0-0xEFF & Custom read-only \\
\hline
-\multicolumn{5}{|c|}{Machine CSRs} \\
+\multicolumn{5}{|c|}{Machine-Level CSRs} \\
\hline
\tt 00 &\tt 11 &\tt XXXX & \tt 0x300-0x3FF & Standard read/write \\
\tt 01 &\tt 11 &\tt 0XXX & \tt 0x700-0x77F & Standard read/write \\
@@ -139,8 +142,8 @@ accesses. Currently, the counters are the only shadowed CSRs.
Tables~\ref{ucsrnames}--\ref{mcsrnames1} list the CSRs that have
currently been allocated CSR addresses. The timers, counters, and
-floating-point CSRs are standard user-level CSRs, as well as the
-additional user trap registers added by the N extension. The other
+floating-point CSRs are standard unprivileged CSRs.
+The other
registers are used by privileged code, as described in the following
chapters. Note that not all registers are required on all
implementations.
@@ -151,28 +154,14 @@ implementations.
\hline
Number & Privilege & Name & Description \\
\hline
-\multicolumn{4}{|c|}{User Trap Setup} \\
-\hline
-\tt 0x000 & URW &\tt ustatus & User status register. \\
-\tt 0x004 & URW &\tt uie & User interrupt-enable register. \\
-\tt 0x005 & URW &\tt utvec & User trap handler base address. \\
-\hline
-\multicolumn{4}{|c|}{User Trap Handling} \\
-\hline
-\tt 0x040 & URW &\tt uscratch & Scratch register for user trap handlers. \\
-\tt 0x041 & URW &\tt uepc & User exception program counter. \\
-\tt 0x042 & URW &\tt ucause & User trap cause. \\
-\tt 0x043 & URW &\tt utval & User bad address or instruction. \\
-\tt 0x044 & URW &\tt uip & User interrupt pending. \\
-\hline
-\multicolumn{4}{|c|}{User Floating-Point CSRs} \\
+\multicolumn{4}{|c|}{Unprivileged Floating-Point CSRs} \\
\hline
\tt 0x001 & URW &\tt fflags & Floating-Point Accrued Exceptions. \\
\tt 0x002 & URW &\tt frm & Floating-Point Dynamic Rounding Mode. \\
\tt 0x003 & URW &\tt fcsr & Floating-Point Control and Status
Register ({\tt frm} + {\tt fflags}). \\
\hline
-\multicolumn{4}{|c|}{User Counter/Timers} \\
+\multicolumn{4}{|c|}{Unprivileged Counter/Timers} \\
\hline
\tt 0xC00 & URO &\tt cycle & Cycle counter for RDCYCLE instruction. \\
\tt 0xC01 & URO &\tt time & Timer for RDTIME instruction. \\
@@ -191,7 +180,7 @@ Register ({\tt frm} + {\tt fflags}). \\
\hline
\end{tabular}
\end{center}
-\caption{Currently allocated RISC-V user-level CSR addresses.}
+\caption{Currently allocated RISC-V unprivileged CSR addresses.}
\label{ucsrnames}
\end{table}
@@ -204,12 +193,14 @@ Number & Privilege & Name & Description \\
\multicolumn{4}{|c|}{Supervisor Trap Setup} \\
\hline
\tt 0x100 & SRW &\tt sstatus & Supervisor status register. \\
-\tt 0x102 & SRW &\tt sedeleg & Supervisor exception delegation register. \\
-\tt 0x103 & SRW &\tt sideleg & Supervisor interrupt delegation register. \\
\tt 0x104 & SRW &\tt sie & Supervisor interrupt-enable register. \\
\tt 0x105 & SRW &\tt stvec & Supervisor trap handler base address. \\
\tt 0x106 & SRW &\tt scounteren & Supervisor counter enable. \\
\hline
+\multicolumn{4}{|c|}{Supervisor Configuration} \\
+\hline
+\tt 0x10A & SRW &\tt senvcfg & Supervisor environment configuration register. \\
+\hline
\multicolumn{4}{|c|}{Supervisor Trap Handling} \\
\hline
\tt 0x140 & SRW &\tt sscratch & Scratch register for supervisor trap handlers. \\
@@ -256,6 +247,11 @@ Number & Privilege & Name & Description \\
\tt 0x64A & HRW &\tt htinst & Hypervisor trap instruction (transformed). \\
\tt 0xE12 & HRO &\tt hgeip & Hypervisor guest external interrupt pending. \\
\hline
+\multicolumn{4}{|c|}{Hypervisor Configuration} \\
+\hline
+\tt 0x60A & HRW &\tt henvcfg & Hypervisor environment configuration register. \\
+\tt 0x61A & HRW &\tt henvcfgh & Additional hypervisor env. conf. register, RV32 only. \\
+\hline
\multicolumn{4}{|c|}{Hypervisor Protection and Translation} \\
\hline
\tt 0x680 & HRW &\tt hgatp & Hypervisor guest address translation and protection. \\
@@ -267,7 +263,7 @@ Number & Privilege & Name & Description \\
\multicolumn{4}{|c|}{Hypervisor Counter/Timer Virtualization Registers} \\
\hline
\tt 0x605 & HRW &\tt htimedelta & Delta for VS/VU-mode timer. \\
-\tt 0x615 & HRW &\tt htimedeltah & Upper 32 bits of {\tt htimedelta}, RV32 only. \\
+\tt 0x615 & HRW &\tt htimedeltah & Upper 32 bits of {\tt htimedelta}, HSXLEN=32 only. \\
\hline
\multicolumn{4}{|c|}{Virtual Supervisor Registers} \\
\hline
@@ -283,7 +279,7 @@ Number & Privilege & Name & Description \\
\hline
\end{tabular}
\end{center}
-\caption{Currently allocated RISC-V hypervisor-level CSR addresses.}
+\caption{Currently allocated RISC-V hypervisor and VS CSR addresses.}
\label{hcsrnames}
\end{table}
@@ -300,6 +296,7 @@ Number & Privilege & Name & Description \\
\tt 0xF12 & MRO &\tt marchid & Architecture ID. \\
\tt 0xF13 & MRO &\tt mimpid & Implementation ID. \\
\tt 0xF14 & MRO &\tt mhartid & Hardware thread ID. \\
+\tt 0xF15 & MRO &\tt mconfigptr & Pointer to configuration data structure. \\
\hline
\multicolumn{4}{|c|}{Machine Trap Setup} \\
\hline
@@ -322,6 +319,13 @@ Number & Privilege & Name & Description \\
\tt 0x34A & MRW &\tt mtinst & Machine trap instruction (transformed). \\
\tt 0x34B & MRW &\tt mtval2 & Machine bad guest physical address. \\
\hline
+\multicolumn{4}{|c|}{Machine Configuration} \\
+\hline
+\tt 0x30A & MRW &\tt menvcfg & Machine environment configuration register. \\
+\tt 0x31A & MRW &\tt menvcfgh & Additional machine env. conf. register, RV32 only. \\
+\tt 0x747 & MRW &\tt mseccfg & Machine security configuration register. \\
+\tt 0x757 & MRW &\tt mseccfgh & Additional machine security conf. register, RV32 only. \\
+\hline
\multicolumn{4}{|c|}{Machine Memory Protection} \\
\hline
%\tt 0x380 & MRW &\tt mbase & Base register. \\
diff --git a/src/priv-instr-table.tex b/src/priv-instr-table.tex
index 5603d7b..ef300f5 100644
--- a/src/priv-instr-table.tex
+++ b/src/priv-instr-table.tex
@@ -44,17 +44,7 @@
&
\multicolumn{10}{c}{\bf Trap-Return Instructions} & \\
\cline{2-11}
-
-&
-\multicolumn{4}{|c|}{0000000} &
-\multicolumn{2}{c|}{00010} &
-\multicolumn{1}{c|}{00000} &
-\multicolumn{1}{c|}{000} &
-\multicolumn{1}{c|}{00000} &
-\multicolumn{1}{c|}{1110011} & URET \\
-\cline{2-11}
-
&
\multicolumn{4}{|c|}{0001000} &
diff --git a/src/priv-preface.tex b/src/priv-preface.tex
index ae2aacf..097497a 100644
--- a/src/priv-preface.tex
+++ b/src/priv-preface.tex
@@ -1,9 +1,8 @@
\chapter{Preface}
-This is {\bf a draft of} version 1.12 of the RISC-V privileged
-architecture proposal.
-The document contains the following versions of the RISC-V ISA
-modules:
+This document describes the RISC-V privileged architecture. This
+release, version \privrev, will be used for public review of the
+following modules:
{
\begin{table}[hbt]
@@ -12,17 +11,16 @@ modules:
\hline
Module & Version & Status\\
\hline
- \em Machine ISA & \em 1.12 & \em Draft \\
- \em Supervisor ISA & \em 1.12 & \em Draft \\
- \em Hypervisor ISA & \em 0.6 & \em Draft \\
- \em N Extension & \em 1.1 & \em Draft \\
+ \em Machine ISA & \em 1.12 & \em Frozen \\
+ \em Supervisor ISA & \em 1.12 & \em Frozen \\
+ \em Hypervisor ISA & \em 1.0 & \em Frozen \\
\hline
\end{tabular}
\end{table}
}
The Machine and Supervisor ISAs, version 1.11, have been ratified by
-the RISC-V Foundation. Version 1.12 of these modules, described in
+RISC-V International. Version 1.12 of these modules, described in
this document, is a minor revision to version 1.11.
The following changes have been made since version 1.11, which, while not
@@ -55,9 +53,17 @@ Additionally, the following compatible changes have been made since version
\begin{itemize}
\parskip 0pt
\itemsep 1pt
-\item Moved N extension into its own chapter.
-\item Defined the RV32-only CSR {\tt mstatush}, which contains most of the
- same fields as the upper 32 bits of RV64's {\tt mstatus}.
+\item Removed the N extension.
+\item Defined the mandatory RV32-only CSR {\tt mstatush}, which contains
+ most of the same fields as the upper 32 bits of RV64's {\tt mstatus}.
+\item Defined the mandatory CSR {\tt mconfigptr}, which if nonzero
+ contains the address of a configuration data structure.
+\item Defined optional {\tt mseccfg} and {\tt mseccfgh} CSRs, which control
+ the machine's security configuration.
+\item Defined {\tt menvcfg}, {\tt henvcfg}, and {\tt senvcfg} CSRs
+ (and RV32-only {\tt menvcfgh} and {\tt henvcfgh} CSRs),
+ which control various characteristics of the execution environment.
+\item Designated part of SYSTEM major opcode for custom use.
\item Permitted the unconditional delegation of less-privileged interrupts.
\item Added optional big-endian and bi-endian support.
\item Made priority of load/store/AMO address-misaligned exceptions
@@ -72,6 +78,9 @@ Additionally, the following compatible changes have been made since version
\item Added Sv57 and Sv57x4 address translation modes.
\item Software breakpoint exceptions are permitted to write either 0
or the PC to {\em x}\/{\tt tval}.
+\item Clarified that bare S-mode need not support the SFENCE.VMA instruction.
+\item Specified relaxed constraints for implicit reads of non-idempotent
+ regions.
\end{itemize}
Finally, the hypervisor architecture proposal has been extensively revised.
diff --git a/src/riscv-privileged.tex b/src/riscv-privileged.tex
index 99ad6fd..8ae0844 100644
--- a/src/riscv-privileged.tex
+++ b/src/riscv-privileged.tex
@@ -10,8 +10,8 @@
\input{preamble}
-\newcommand{\privrev}{1.12-draft}
-\newcommand{\privmonthyear}{November 2020}
+\newcommand{\privrev}{20210921-{\em draft}}
+\newcommand{\privmonthyear}{September 2021}
\setcounter{secnumdepth}{3}
\setcounter{tocdepth}{3}
@@ -57,7 +57,7 @@ Creative Commons Attribution 4.0 International License.
Please cite as: ``The RISC-V Instruction Set
Manual, Volume II: Privileged Architecture, Document Version \privrev'', Editors
-Andrew Waterman and Krste Asanovi\'{c}, RISC-V Foundation, \privmonthyear.
+Andrew Waterman, Krste Asanovi\'{c}, and John Hauser, RISC-V International, \privmonthyear.
\markboth{Volume II: RISC-V Privileged Architectures V\privrev}
{Volume II: RISC-V Privileged Architectures V\privrev}
@@ -78,7 +78,6 @@ Andrew Waterman and Krste Asanovi\'{c}, RISC-V Foundation, \privmonthyear.
\input{machine}
\input{supervisor}
\input{hypervisor}
-\input{n}
\input{priv-insns}
\input{priv-history}
diff --git a/src/riscv-spec.bib b/src/riscv-spec.bib
index 9f6d946..3d7157f 100644
--- a/src/riscv-spec.bib
+++ b/src/riscv-spec.bib
@@ -494,3 +494,20 @@ month = {June},
year = {2010},
address = {Toronto, Canada}}
+@article{roux:hal-01091186,
+ TITLE = {{Innocuous Double Rounding of Basic Arithmetic Operations}},
+ AUTHOR = {Roux, Pierre},
+ URL = {https://hal.archives-ouvertes.fr/hal-01091186},
+ JOURNAL = {{Journal of Formalized Reasoning}},
+ PUBLISHER = {{ASDD-AlmaDL}},
+ VOLUME = {7},
+ NUMBER = {1},
+ PAGES = {131-142},
+ YEAR = {2014},
+ MONTH = Nov,
+ DOI = {10.6092/issn.1972-5787/4359},
+ KEYWORDS = {Coq ; double rounding ; floating-point arithmetic},
+ PDF = {https://hal.archives-ouvertes.fr/hal-01091186/file/submission.pdf},
+ HAL_ID = {hal-01091186},
+ HAL_VERSION = {v1},
+}
diff --git a/src/riscv-spec.tex b/src/riscv-spec.tex
index 8ef9653..1b0e3b6 100644
--- a/src/riscv-spec.tex
+++ b/src/riscv-spec.tex
@@ -59,7 +59,7 @@ Creative Commons Attribution 4.0 International License.
Please cite as: ``The RISC-V Instruction Set
Manual, Volume I: User-Level ISA, Document Version \specrev'', Editors
-Andrew Waterman and Krste Asanovi\'{c}, RISC-V Foundation, \specmonthyear.
+Andrew Waterman and Krste Asanovi\'{c}, RISC-V International, \specmonthyear.
\markboth{Volume I: RISC-V Unprivileged ISA V\specrev}
@@ -90,15 +90,15 @@ Andrew Waterman and Krste Asanovi\'{c}, RISC-V Foundation, \specmonthyear.
\input{f}
\input{d}
\input{q}
+\input{zfh}
\input{rvwmo}
-\input{l}
\input{c}
\input{b}
\input{j}
-\input{t}
\input{p}
\input{v}
\input{zam}
+\input{zfinx}
\input{ztso}
\input{gmaps}
\input{extensions}
diff --git a/src/rv32.tex b/src/rv32.tex
index afc6730..70e27d6 100644
--- a/src/rv32.tex
+++ b/src/rv32.tex
@@ -1213,7 +1213,24 @@ hart or external device can observe any operation in the {\em
predecessor} set preceding the FENCE.
Chapter~\ref{ch:memorymodel} provides a precise description of the
RISC-V memory consistency model.
-
+
+The FENCE instruction also orders memory reads and writes made by the
+hart as observed by memory reads and writes made by an external
+device. However, FENCE does not order observations of events made by
+an external device using any other signaling mechanism.
+
+\begin{commentary}
+A device might observe an access to a memory location via some
+external communication mechanism, e.g., a memory-mapped control
+register that drives an interrupt signal to an interrupt controller.
+This communication is outside the scope of the FENCE ordering
+mechanism and hence the FENCE instruction can provide no guarantee on
+when a change in the interrupt signal is visible to the interrupt
+controller. Specific devices might provide additional ordering
+guarantees to reduce software overhead but those are outside the scope
+of the RISC-V memory model.
+\end{commentary}
+
The EEI will define what I/O operations are possible, and in
particular, which memory addresses when accessed by load and store instructions will be treated and
ordered as device input and device output operations respectively
@@ -1249,7 +1266,7 @@ The fence mode field {\em fm} defines the semantics of the FENCE. A
FENCE with {\em fm}=0000 orders all memory operations in its
predecessor set before all memory operations in its successor set.
-The optional FENCE.TSO instruction is encoded as a FENCE instruction
+The FENCE.TSO instruction is encoded as a FENCE instruction
with {\em fm}=1000, {\em predecessor}=RW, and {\em successor}=RW.
FENCE.TSO orders all load
operations in its predecessor set before all memory operations in its
@@ -1259,10 +1276,9 @@ operations in the FENCE.TSO's predecessor set unordered with non-AMO
loads in its successor set.
\begin{commentary}
- The FENCE.TSO encoding was added as an optional extension to the
- original base FENCE instruction encoding. The base definition
- requires that implementations ignore any set bits and treat the
- FENCE as global, and so this is a backwards-compatible extension.
+ Because \mbox{FENCE RW,RW} imposes a superset of the orderings that
+ FENCE.TSO imposes, it is correct to ignore the {\em fm} field and
+ implement FENCE.TSO as \mbox{FENCE RW,RW}.
\end{commentary}
The unused fields in the FENCE instructions---{\em rs1} and {\em rd}---are
diff --git a/src/rvwmo.tex b/src/rvwmo.tex
index 228e582..f52ea8c 100644
--- a/src/rvwmo.tex
+++ b/src/rvwmo.tex
@@ -14,7 +14,7 @@ The standard ISA extension for misaligned atomics ``Zam'' (Chapter~\ref{sec:zam}
The appendices to this specification provide both axiomatic and operational formalizations of the memory consistency model as well as additional explanatory material.
\begin{commentary}
- This chapter defines the memory model for regular main memory operations. The interaction of the memory model with I/O memory, instruction fetches, FENCE.I, page table walks, and SFENCE.VMA is not (yet) formalized. Some or all of the above may be formalized in a future revision of this specification. The RV128 base ISA and future ISA extensions such as the ``V'' vector, ``T'' transactional memory, and ``J'' JIT extensions will need to be incorporated into a future revision as well.
+ This chapter defines the memory model for regular main memory operations. The interaction of the memory model with I/O memory, instruction fetches, FENCE.I, page table walks, and SFENCE.VMA is not (yet) formalized. Some or all of the above may be formalized in a future revision of this specification. The RV128 base ISA and future ISA extensions such as the ``V'' vector and ``J'' JIT extensions will need to be incorporated into a future revision as well.
Memory consistency models supporting overlapping memory accesses of different widths simultaneously remain an active area of academic research and are not yet fully understood. The specifics of how memory accesses of different sizes interact under RVWMO are specified to the best of our current abilities, but they are subject to revision should new issues be uncovered.
\end{commentary}
diff --git a/src/supervisor.tex b/src/supervisor.tex
index e212062..b911b68 100644
--- a/src/supervisor.tex
+++ b/src/supervisor.tex
@@ -38,8 +38,8 @@ the supervisor-level CSR descriptions.
The {\tt sstatus} register is an SXLEN-bit read/write register
-formatted as shown in Figure~\ref{sstatusreg-rv32} for RV32 and
-Figure~\ref{sstatusreg} for RV64. The {\tt sstatus}
+formatted as shown in Figure~\ref{sstatusreg-rv32} when SXLEN=32 and
+Figure~\ref{sstatusreg} when SXLEN=64. The {\tt sstatus}
register keeps track of the processor's current operating state.
\begin{figure*}[h!]
@@ -87,7 +87,7 @@ register keeps track of the processor's current operating state.
\end{center}
}
\vspace{-0.1in}
-\caption{Supervisor-mode status register ({\tt sstatus}) for RV32.}
+\caption{Supervisor-mode status register ({\tt sstatus}) when SXLEN=32.}
\label{sstatusreg-rv32}
\end{figure*}
@@ -148,7 +148,7 @@ register keeps track of the processor's current operating state.
\end{center}
}
\vspace{-0.1in}
-\caption{Supervisor-mode status register ({\tt sstatus}) for RV64.}
+\caption{Supervisor-mode status register ({\tt sstatus}) when SXLEN=64.}
\label{sstatusreg}
\end{figure*}
@@ -185,8 +185,8 @@ which may differ from the value of XLEN for S-mode, termed {\em SXLEN}. The
encoding of UXL is the same as that of the MXL field of {\tt misa}, shown in
Table~\ref{misabase}.
-For RV32 systems, the UXL field does not exist, and UXLEN=32. For RV64
-systems, it is a \warl\ field that encodes the current value of UXLEN.
+When SXLEN=32, the UXL field does not exist, and UXLEN=32. When
+SXLEN=64, it is a \warl\ field that encodes the current value of UXLEN.
In particular, an implementation may make UXL be a read-only field whose
value always ensures that UXLEN=SXLEN.
@@ -368,12 +368,22 @@ SXLEN \\
\label{siereg}
\end{figure}
-An interrupt \textit{i} will be taken if bit \textit{i} is set in both
-{\tt sip} and {\tt sie}, and if supervisor-level interrupts are globally
-enabled.
-Supervisor-level interrupts are globally enabled if the hart's current
-privilege mode is less than S, or if the current privilege mode is S
-and the SIE bit in the {\tt sstatus} register is set.
+An interrupt~\textit{i} will trap to S-mode if both of the
+following are true:
+(a)~either the current privilege mode is S and the SIE bit in the
+{\tt sstatus} register is set, or the current privilege mode has less
+privilege than S-mode; and
+(b)~bit~\textit{i} is set in both {\tt sip} and {\tt sie}.
+
+These conditions for an interrupt trap to occur must be evaluated in a bounded
+amount of time from when an interrupt becomes, or ceases to be,
+pending in {\tt sip}, and must
+also be evaluated immediately following the execution of an SRET instruction
+or an explicit write to a CSR on which these interrupt trap conditions
+expressly depend (including {\tt sip}, {\tt sie} and {\tt sstatus}).
+
+Interrupts to S-mode take priority over any interrupts to lower privilege
+modes.
Each individual bit in register {\tt sip} may be writable or may be
read-only.
@@ -465,11 +475,8 @@ the execution environment.
Bits {\tt sip}.SSIP and {\tt sie}.SSIE are the interrupt-pending and
interrupt-enable bits for supervisor-level software interrupts.
-If implemented, SSIP is writable in {\tt sip}.
-A supervisor-level software interrupt is triggered
-on the current hart by writing 1 to SSIP,
-while a pending supervisor-level software
-interrupt can be cleared by writing 0 to SSIP.
+If implemented, SSIP is writable in {\tt sip} and may also be set
+to 1 by a platform-specific interrupt controller.
\begin{commentary}
Interprocessor interrupts are sent to other harts by implementation-specific
@@ -501,7 +508,6 @@ they are shown as hardwired to 0 in Figures~\ref{sipreg-standard} and
Multiple simultaneous
interrupts destined for supervisor mode are handled in the following
decreasing priority order: SEI, SSI, STI.
-Synchronous exceptions are of lower priority than all interrupts.
\subsection{Supervisor Timers and Performance Counters}
@@ -614,8 +620,9 @@ Though masked, {\tt sepc[1]} remains writable when IALIGN=32.
{\tt sepc} is a \warl\ register that must be able to hold all valid
virtual addresses. It need not be capable of holding all possible invalid
-addresses. Implementations may convert some invalid address patterns into
-other invalid addresses prior to writing them to {\tt sepc}.
+addresses.
+Prior to writing {\tt sepc}, implementations may convert an invalid address
+into some other invalid address that {\tt sepc} is capable of holding.
When a trap is taken into S-mode, {\tt sepc} is written with the
virtual address of the instruction that was interrupted or that
@@ -729,14 +736,10 @@ which exceptions must set {\tt stval} informatively and which may
unconditionally set it to zero.
-When a breakpoint,
-address-misaligned, access-fault, or page-fault exception occurs
-on an instruction fetch, load, or store, {\tt stval}
-is written with the faulting virtual address. On an illegal instruction trap,
-{\tt stval} may be written with the first XLEN or ILEN bits of the faulting
-instruction as described below. For other exceptions, {\tt stval} is set to
-zero, but a future standard may redefine {\tt stval}'s setting for other
-exceptions.
+If {\tt stval} is written with a nonzero value when a breakpoint,
+address-misaligned, access-fault, or page-fault exception occurs on an
+instruction fetch, load, or store, then {\tt stval} will contain the faulting
+virtual address.
\begin{figure}[h!]
{\footnotesize
@@ -755,39 +758,156 @@ SXLEN \\
\label{stvalreg}
\end{figure}
-For misaligned loads and stores that cause access-fault or page-fault
-exceptions, {\tt stval} will contain the virtual address of the
-portion of the access that caused the fault. For
-instruction access-fault or page-fault exceptions on systems
-with variable-length instructions, {\tt stval} will contain the
-virtual address of the portion of the instruction that caused
-the fault while {\tt sepc} will point to the beginning of the
-instruction.
-
-The {\tt stval} register can optionally also be used to return the
-faulting instruction bits on an illegal instruction exception ({\tt
- sepc} points to the faulting instruction in memory).
-
-If this feature is not provided, then {\tt stval} is set to zero on
-an illegal instruction fault.
-
-If this feature is provided, after an illegal instruction trap, {\tt stval}
-will contain the shortest of:
+If {\tt stval} is written with a nonzero value when a misaligned load or store
+causes an access-fault or page-fault exception, then {\tt stval} will contain
+the virtual address of the portion of the access that caused the fault.
+
+If {\tt stval} is written with a nonzero value when an instruction access-fault
+or page-fault exception occurs on a system with variable-length instructions,
+then {\tt stval} will contain the virtual address of the portion of the
+instruction that caused the fault, while {\tt sepc} will point to the beginning
+of the instruction.
+
+The {\tt stval} register can optionally also be used to return the faulting
+instruction bits on an illegal instruction exception ({\tt sepc} points to the
+faulting instruction in memory).
+If {\tt stval} is written with a nonzero value when an illegal-instruction
+exception occurs, then {\tt stval} will contain the shortest of:
\begin{compactitem}
\item the actual faulting instruction
\item the first ILEN bits of the faulting instruction
-\item the first XLEN bits of the faulting instruction
+\item the first SXLEN bits of the faulting instruction
\end{compactitem}
-The value loaded into {\tt stval} is right-justified and all unused upper
-bits are cleared to zero.
+The value loaded into {\tt stval} on an illegal-instruction exception is
+right-justified and all unused upper bits are cleared to zero.
+
+For other traps, {\tt stval} is set to zero, but a future standard may
+redefine {\tt stval}'s setting for other traps.
{\tt stval} is a \warl\ register that must be able to hold all valid
virtual addresses and the value 0. It need not be capable of holding all
-possible invalid addresses. Implementations may convert some invalid address
-patterns into other invalid addresses prior to writing them to {\tt stval}.
+possible invalid addresses.
+Prior to writing {\tt stval}, implementations may convert an invalid address
+into some other invalid address that {\tt stval} is capable of holding.
If the feature to return the faulting instruction bits is implemented, {\tt
stval} must also be able to hold all values less than $2^N$, where $N$ is the
-smaller of XLEN and ILEN.
+smaller of SXLEN and ILEN.
+
+\subsection{Supervisor Environment Configuration Register ({\tt senvcfg})}
+
+The {\tt senvcfg} CSR is an SXLEN-bit read/write register,
+formatted as shown in Figure~\ref{fig:senvcfg},
+that controls certain characteristics of the U-mode execution environment.
+
+\begin{figure}[h!]
+{\footnotesize
+\begin{center}
+\begin{tabular}{@{}Kcc@{}W@{}Wc}
+\instbitrange{SXLEN-1}{8} &
+\instbit{7} &
+\instbit{6} &
+\instbitrange{5}{4} &
+\instbitrange{3}{1} &
+\instbit{0} \\
+\hline
+\multicolumn{1}{|c|}{\wpri} &
+\multicolumn{1}{c|}{CBZE} &
+\multicolumn{1}{c|}{CBCFE} &
+\multicolumn{1}{c|}{CBIE} &
+\multicolumn{1}{c|}{\wpri} &
+\multicolumn{1}{c|}{FIOM} \\
+\hline
+SXLEN-8 & 1 & 1 & 2 & 3 & 1 \\
+\end{tabular}
+\end{center}
+}
+\vspace{-0.1in}
+\caption{Supervisor environment configuration register ({\tt senvcfg}).}
+\label{fig:senvcfg}
+\end{figure}
+
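The field layout in Figure~\ref{fig:senvcfg} can be sketched as C bit masks. This is only an illustration: the macro and function names are hypothetical, the register value is modeled as a 64-bit integer, and the CBZE, CBCFE, and CBIE positions are taken from the figure as drafted here, so they may move before the corresponding extensions are ratified.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; bit positions follow the senvcfg figure in this draft. */
#define SENVCFG_FIOM        (UINT64_C(1) << 0)   /* bit 0 */
#define SENVCFG_CBIE_SHIFT  4                    /* bits 5:4 */
#define SENVCFG_CBIE_MASK   (UINT64_C(3) << SENVCFG_CBIE_SHIFT)
#define SENVCFG_CBCFE       (UINT64_C(1) << 6)   /* bit 6 */
#define SENVCFG_CBZE        (UINT64_C(1) << 7)   /* bit 7 */

/* Returns 1 if FIOM is set in the given senvcfg value, else 0. */
static int senvcfg_fiom(uint64_t v)
{
    return (v & SENVCFG_FIOM) != 0;
}

/* Extracts the 2-bit CBIE field. */
static unsigned senvcfg_cbie(uint64_t v)
{
    return (unsigned)((v & SENVCFG_CBIE_MASK) >> SENVCFG_CBIE_SHIFT);
}

/* Returns v with the CBIE field replaced; other bits are left unchanged. */
static uint64_t senvcfg_set_cbie(uint64_t v, unsigned cbie)
{
    assert(cbie <= 3);
    return (v & ~SENVCFG_CBIE_MASK) | ((uint64_t)cbie << SENVCFG_CBIE_SHIFT);
}
```

The \wpri\ fields (bits 3:1 and SXLEN-1:8) are deliberately left without names; a careful implementation would preserve them on writes with a read-modify-write sequence.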
+If bit FIOM (Fence of I/O implies Memory) is set to one in {\tt senvcfg},
+FENCE instructions executed in U-mode are modified so that
+the requirement to order accesses to device I/O also implies the requirement
+to order main memory accesses.
+Table~\ref{tab:senvcfg-FIOM} details the modified interpretation of
+FENCE instruction bits PI, PO, SI, and SO in U-mode when FIOM=1.
+
+Similarly, for U-mode when FIOM=1,
+if an atomic instruction that accesses a region ordered as device I/O
+has its {\em aq} and/or {\em rl} bit set, then that instruction is ordered
+as though it accesses both device I/O and memory.
+
+If {\tt satp}.MODE is hardwired to Bare, the implementation may hardwire FIOM to zero.
+
+\begin{table}[h!]
+\begin{center}
+\begin{tabular}{|c|l|}
+\hline
+Instruction bit & Meaning when set \\
+\hline
+PI & Predecessor device input and memory reads (PR implied) \\
+PO & Predecessor device output and memory writes (PW implied) \\
+\hline
+SI & Successor device input and memory reads (SR implied) \\
+SO & Successor device output and memory writes (SW implied) \\
+\hline
+\end{tabular}
+\end{center}
+\vspace{-0.1in}
+\caption{%
+Modified interpretation of FENCE predecessor and successor sets in U-mode when FIOM=1.}
+\label{tab:senvcfg-FIOM}
+\end{table}
+
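The modified interpretation in Table~\ref{tab:senvcfg-FIOM} can be sketched as a small transformation on a 4-bit predecessor or successor set. This is an illustrative model only: the names are hypothetical, and the bit order (I, O, R, W from most to least significant) is assumed to match the FENCE instruction's pred and succ fields.

```c
#include <assert.h>

/* Hypothetical encoding of one FENCE ordering set, assuming the bit order
 * I, O, R, W (most to least significant) used by the pred/succ fields. */
#define FENCE_W 0x1u  /* memory writes */
#define FENCE_R 0x2u  /* memory reads */
#define FENCE_O 0x4u  /* device output */
#define FENCE_I 0x8u  /* device input */

/*
 * Returns the effective ordering set for a U-mode FENCE.
 * With FIOM=1, I implies R and O implies W; with FIOM=0 the set is unchanged.
 */
static unsigned fiom_effective_set(unsigned set, int fiom)
{
    assert((set & ~0xFu) == 0);
    if (fiom) {
        if (set & FENCE_I)
            set |= FENCE_R;
        if (set & FENCE_O)
            set |= FENCE_W;
    }
    return set;
}
```

Applying the same function to both the predecessor and successor sets models the full effect of FIOM=1 on a single FENCE under these assumptions.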
+\begin{commentary}
+Bit FIOM exists for a specific circumstance when an I/O device is
+being emulated for U-mode and both of the following are true:
+(a)~the emulated device has a memory buffer that should be I/O space
+but is actually mapped to main memory via address translation, and
+(b)~multiple physical harts are involved in accessing this emulated
+device from U-mode.
+
+A hypervisor running in S-mode without the benefit of the hypervisor
+extension of Chapter~\ref{hypervisor} may need to emulate a device for
+U-mode if paravirtualization cannot be employed.
+If the same hypervisor provides a virtual machine (VM) with multiple
+virtual harts, mapped one-to-one to real harts, then multiple harts may
+concurrently access the emulated device, perhaps because:
+(a)~the guest OS within the VM assigns device interrupt handling to one
+hart while the device is also accessed by a different hart outside of
+an interrupt handler, or
+(b)~control of the device (or partial control) is being migrated
+from one hart to another, such as for interrupt load balancing within
+the VM.
+For such cases, guest software within the VM is expected to properly
+coordinate access to the (emulated) device across multiple harts using
+mutex locks and/or interprocessor interrupts as usual, which in part
+entails executing I/O fences.
+But those I/O fences may not be sufficient if some of the device
+``I/O'' is actually main memory, unknown to the guest.
+Setting FIOM=1 modifies those fences (and all other I/O fences executed
+in U-mode) to include main memory, too.
+
+Software can always avoid the need to set FIOM by never using main
+memory to emulate a device memory buffer that should be I/O space.
+However, this choice usually requires trapping all U-mode accesses
+to the emulated buffer, which might have a noticeable impact on
+performance.
+The alternative offered by FIOM is sufficiently inexpensive to implement that
+we consider it worth supporting even if only rarely enabled.
+\end{commentary}
+
+
+The definition of the CBZE field will be furnished by the
+forthcoming Zicboz extension.
+Its allocation within {\tt senvcfg} may change prior to the ratification
+of that extension.
+
+The definitions of the CBCFE and CBIE fields will be furnished by the
+forthcoming Zicbom extension.
+Their allocations within {\tt senvcfg} may change prior to the ratification
+of that extension.
\subsection{Supervisor Address Translation and Protection ({\tt satp}) Register}
\label{sec:satp}
@@ -819,7 +939,10 @@ register are described in Section~\ref{virt-control}.
\end{center}
}
\vspace{-0.1in}
-\caption{RV32 Supervisor address translation and protection register {\tt satp}.}
+\caption{%
+Supervisor address translation and protection register {\tt satp}
+when SXLEN=32.%
+}
\label{rv32satp}
\end{figure}
@@ -851,8 +974,10 @@ main memory be representable.
\end{center}
}
\vspace{-0.1in}
-\caption{RV64 Supervisor address translation and protection register {\tt satp}, for MODE
-values Bare, Sv39, Sv48, and Sv57.}
+\caption{%
+Supervisor address translation and protection register {\tt satp}
+when SXLEN=64, for MODE values Bare, Sv39, Sv48, and Sv57.%
+}
\label{rv64satp}
\end{figure}
@@ -864,21 +989,21 @@ translations, or vice-versa. This approach also slightly reduces the cost of
a context switch.
\end{commentary}
-Table~\ref{tab:satp-mode} shows the encodings of the MODE field for RV32 and
-RV64. When MODE=Bare, supervisor virtual addresses are equal to
+Table~\ref{tab:satp-mode} shows the encodings of the MODE field when SXLEN=32 and
+SXLEN=64. When MODE=Bare, supervisor virtual addresses are equal to
supervisor physical addresses, and there is no additional memory protection
beyond the physical memory protection scheme described in
Section~\ref{sec:pmp}.
To select MODE=Bare, software must write zero to the remaining fields of
-{\tt satp} (bits 30--0 for RV32, or bits 59--0 for RV64).
+{\tt satp} (bits 30--0 when SXLEN=32, or bits 59--0 when SXLEN=64).
Attempting to select MODE=Bare with a nonzero pattern in the remaining fields
has an \unspecified\ effect on the value that the remaining fields assume
and an \unspecified\ effect on address translation and protection behavior.
-For RV32, the {\tt satp} encodings corresponding to MODE=Bare and ASID[8:7]=3 are designated
+When SXLEN=32, the {\tt satp} encodings corresponding to MODE=Bare and ASID[8:7]=3 are designated
for custom use, whereas the encodings corresponding to MODE=Bare and ASID[8:7]$\ne$3 are
reserved for future standard use.
-For RV64, all {\tt satp} encodings corresponding to MODE=Bare are reserved for future
+When SXLEN=64, all {\tt satp} encodings corresponding to MODE=Bare are reserved for future
standard use.
\begin{commentary}
@@ -889,10 +1014,10 @@ additional translation and protection modes, particularly in RV32, for which
all patterns of the existing MODE field have already been allocated.
\end{commentary}
-For RV32, the only other valid setting for MODE is Sv32, a paged
+When SXLEN=32, the only other valid setting for MODE is Sv32, a paged
virtual-memory scheme described in Section~\ref{sec:sv32}.
-For RV64, three paged virtual-memory schemes are defined: Sv39, Sv48, and Sv57,
+When SXLEN=64, three paged virtual-memory schemes are defined: Sv39, Sv48, and Sv57,
described in Sections~\ref{sec:sv39}, \ref{sec:sv48}, and \ref{sec:sv57}, respectively.
One additional scheme, Sv64, will be defined in a later version
of this specification. The remaining MODE settings are reserved
@@ -907,14 +1032,14 @@ no effect; no fields in {\tt satp} are modified.
\begin{center}
\begin{tabular}{|c|c|l|}
\hline
-\multicolumn{3}{|c|}{RV32} \\
+\multicolumn{3}{|c|}{SXLEN=32} \\
\hline
Value & Name & Description \\
\hline
0 & Bare & No translation or protection. \\
1 & Sv32 & Page-based 32-bit virtual addressing (see Section~\ref{sec:sv32}). \\
\hline \hline
-\multicolumn{3}{|c|}{RV64} \\
+\multicolumn{3}{|c|}{SXLEN=64} \\
\hline
Value & Name & Description \\
\hline
@@ -1253,7 +1378,7 @@ When Sv32 is written to the MODE field in the {\tt satp} register (see
Section~\ref{sec:satp}), the supervisor operates in a 32-bit paged
virtual-memory system. In this mode, supervisor and user virtual addresses
are translated into supervisor physical addresses by traversing a radix-tree
-page table. Sv32 is supported on RV32 systems and is designed to include
+page table. Sv32 is supported when SXLEN=32 and is designed to include
mechanisms sufficient for supporting modern Unix-based operating systems.
\begin{commentary}
@@ -1267,6 +1392,19 @@ to implement software TLB refills using a machine-mode trap handler as
an extension to M-mode.
\end{commentary}
+\begin{commentary}
+Some ISAs architecturally expose \emph{virtually indexed, physically tagged}
+caches, in that accesses to the same physical address via different virtual
+addresses might not be coherent unless the virtual addresses lie within the
+same cache set.
+Implicitly, this specification does not permit such behavior to be
+architecturally exposed.
+\end{commentary}
+
+For implementations that hardwire {\tt satp}.MODE to Bare, attempts to
+execute an SFENCE.VMA instruction might raise an illegal instruction
+exception.
+
\subsection{Addressing and Memory Protection}
\label{sec:translation}
@@ -1677,8 +1815,8 @@ execution of the algorithm began.
\section{Sv39: Page-Based 39-bit Virtual-Memory System}
\label{sec:sv39}
-This section describes a simple paged virtual-memory system designed
-for RV64 systems, which supports 39-bit virtual address spaces. The
+This section describes a simple paged virtual-memory system
+for SXLEN=64, which supports 39-bit virtual address spaces. The
design of Sv39 follows the overall scheme of Sv32, and this section
details only the differences between the schemes.
@@ -1836,8 +1974,8 @@ Section~\ref{sv32algorithm}, except LEVELS equals 3 and PTESIZE equals 8.
\section{Sv48: Page-Based 48-bit Virtual-Memory System}
\label{sec:sv48}
-This section describes a simple paged virtual-memory system designed
-for RV64 systems, which supports 48-bit virtual address spaces. Sv48
+This section describes a simple paged virtual-memory system
+for SXLEN=64, which supports 48-bit virtual address spaces. Sv48
is intended for systems for which a 39-bit virtual address space is
insufficient. It closely follows the design of Sv39, simply adding an
additional level of page table, and so this chapter only details the
diff --git a/src/t.tex b/src/t.tex
deleted file mode 100644
index d7f8efa..0000000
--- a/src/t.tex
+++ /dev/null
@@ -1,16 +0,0 @@
-\chapter{``T'' Standard Extension for Transactional Memory, Version 0.0}
-\label{sec:tm}
-
-This chapter is a placeholder for a future standard extension to
-provide transactional memory operations.
-
-\begin{commentary}
-Despite much research over the last twenty years, and initial
-commercial implementations, there is still much debate on the best way
-to support atomic operations involving multiple addresses.
-
-Our current thoughts are to include a small limited-capacity
-transactional memory buffer along the lines of the original
-transactional memory proposals.
-\end{commentary}
-
diff --git a/src/zfh.tex b/src/zfh.tex
new file mode 100644
index 0000000..3ccebf9
--- /dev/null
+++ b/src/zfh.tex
@@ -0,0 +1,422 @@
+\chapter{``Zfh'' and ``Zfhmin'' Standard Extensions for Half-Precision Floating-Point,
+ Version 0.1}
+
+{\bf Warning! This draft specification may change before being
+accepted as standard by RISC-V International.}
+
+This chapter describes the Zfh standard extension for 16-bit half-precision
+binary floating-point instructions compliant with the IEEE 754-2008 arithmetic
+standard.
+The Zfh extension depends on the single-precision floating-point extension, F.
+The NaN-boxing scheme described in Section~\ref{nanboxing} is extended to
+allow a half-precision value to be NaN-boxed inside a single-precision value
+(which may be recursively NaN-boxed inside a double- or quad-precision value
+when the D or Q extension is present).
+
+\begin{commentary}
+This extension primarily provides instructions that consume half-precision
+operands and produce half-precision results.
+However, it is also common to compute on half-precision data using higher
+intermediate precision.
+Although this extension provides explicit conversion instructions that suffice
+to implement that pattern, future extensions might further accelerate such
+computation with additional instructions that implicitly widen their
+operands---e.g., half$\times$half$+$single$\rightarrow$single---or implicitly
+narrow their results---e.g., half$+$single$\rightarrow$half.
+\end{commentary}
+
+\section{Half-Precision Load and Store Instructions}
+
+New 16-bit variants of LOAD-FP and STORE-FP instructions are added,
+encoded with a new value for the funct3 width field.
+
+\vspace{-0.2in}
+\begin{center}
+\begin{tabular}{M@{}R@{}F@{}R@{}O}
+\\
+\instbitrange{31}{20} &
+\instbitrange{19}{15} &
+\instbitrange{14}{12} &
+\instbitrange{11}{7} &
+\instbitrange{6}{0} \\
+\hline
+\multicolumn{1}{|c|}{imm[11:0]} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{width} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{opcode} \\
+\hline
+12 & 5 & 3 & 5 & 7 \\
+offset[11:0] & base & H & dest & LOAD-FP \\
+\end{tabular}
+\end{center}
+
+\vspace{-0.2in}
+\begin{center}
+\begin{tabular}{O@{}R@{}R@{}F@{}R@{}O}
+\\
+\instbitrange{31}{25} &
+\instbitrange{24}{20} &
+\instbitrange{19}{15} &
+\instbitrange{14}{12} &
+\instbitrange{11}{7} &
+\instbitrange{6}{0} \\
+\hline
+\multicolumn{1}{|c|}{imm[11:5]} &
+\multicolumn{1}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{width} &
+\multicolumn{1}{c|}{imm[4:0]} &
+\multicolumn{1}{c|}{opcode} \\
+\hline
+7 & 5 & 5 & 3 & 5 & 7 \\
+offset[11:5] & src & base & H & offset[4:0] & STORE-FP \\
+\end{tabular}
+\end{center}
+
+FLH and FSH are only guaranteed to execute atomically if the effective address
+is naturally aligned.
+
+FLH and FSH do not modify the bits being transferred; in particular, the
+payloads of non-canonical NaNs are preserved.
+FLH NaN-boxes the result written to {\em rd}, whereas FSH ignores all but
+the lower 16 bits in {\em rs2}.
+
+\section{Half-Precision Computational Instructions}
+
+A new supported format is added to the format field of most
+instructions, as shown in Table~\ref{tab:fpextfmth}.
+
+\begin{table}[htp]
+\begin{center}
+\begin{tabular}{|c|c|l|}
+\hline
+{\em fmt} field &
+Mnemonic &
+Meaning \\
+\hline
+00 & S & 32-bit single-precision \\
+01 & D & 64-bit double-precision \\
+10 & H & 16-bit half-precision \\
+11 & Q & 128-bit quad-precision \\
+\hline
+\end{tabular}
+\end{center}
+\caption{Format field encoding.}
+\label{tab:fpextfmth}
+\end{table}
+
+The half-precision floating-point computational instructions are
+defined analogously to their single-precision counterparts, but operate on
+half-precision operands and produce half-precision results.
+
+\vspace{-0.2in}
+\begin{center}
+\begin{tabular}{R@{}F@{}R@{}R@{}F@{}R@{}O}
+\\
+\instbitrange{31}{27} &
+\instbitrange{26}{25} &
+\instbitrange{24}{20} &
+\instbitrange{19}{15} &
+\instbitrange{14}{12} &
+\instbitrange{11}{7} &
+\instbitrange{6}{0} \\
+\hline
+\multicolumn{1}{|c|}{funct5} &
+\multicolumn{1}{c|}{fmt} &
+\multicolumn{1}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{opcode} \\
+\hline
+5 & 2 & 5 & 5 & 3 & 5 & 7 \\
+FADD/FSUB & H & src2 & src1 & RM & dest & OP-FP \\
+FMUL/FDIV & H & src2 & src1 & RM & dest & OP-FP \\
+FMIN-MAX & H & src2 & src1 & MIN/MAX & dest & OP-FP \\
+FSQRT & H & 0 & src & RM & dest & OP-FP \\
+\end{tabular}
+\end{center}
+
+\vspace{-0.2in}
+\begin{center}
+\begin{tabular}{R@{}F@{}R@{}R@{}F@{}R@{}O}
+\\
+\instbitrange{31}{27} &
+\instbitrange{26}{25} &
+\instbitrange{24}{20} &
+\instbitrange{19}{15} &
+\instbitrange{14}{12} &
+\instbitrange{11}{7} &
+\instbitrange{6}{0} \\
+\hline
+\multicolumn{1}{|c|}{rs3} &
+\multicolumn{1}{c|}{fmt} &
+\multicolumn{1}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{opcode} \\
+\hline
+5 & 2 & 5 & 5 & 3 & 5 & 7 \\
+src3 & H & src2 & src1 & RM & dest & F[N]MADD/F[N]MSUB \\
+\end{tabular}
+\end{center}
+
+\section{Half-Precision Convert and Move Instructions}
+
+New floating-point-to-integer and integer-to-floating-point conversion
+instructions are added. These instructions are defined analogously to the
+single-precision-to-integer and integer-to-single-precision conversion
+instructions. FCVT.W.H or FCVT.L.H converts a half-precision floating-point
+number to a signed 32-bit or 64-bit integer, respectively. FCVT.H.W or
+FCVT.H.L converts a 32-bit or 64-bit signed integer, respectively, into a
+half-precision floating-point number. FCVT.WU.H, FCVT.LU.H, FCVT.H.WU, and
+FCVT.H.LU variants convert to or from unsigned integer values. FCVT.L[U].H and
+FCVT.H.L[U] are RV64-only instructions.
+
+\vspace{-0.2in}
+\begin{center}
+\begin{tabular}{R@{}F@{}R@{}R@{}F@{}R@{}O}
+\\
+\instbitrange{31}{27} &
+\instbitrange{26}{25} &
+\instbitrange{24}{20} &
+\instbitrange{19}{15} &
+\instbitrange{14}{12} &
+\instbitrange{11}{7} &
+\instbitrange{6}{0} \\
+\hline
+\multicolumn{1}{|c|}{funct5} &
+\multicolumn{1}{c|}{fmt} &
+\multicolumn{1}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{opcode} \\
+\hline
+5 & 2 & 5 & 5 & 3 & 5 & 7 \\
+FCVT.{\em int}.H & H & W[U]/L[U] & src & RM & dest & OP-FP \\
+FCVT.H.{\em int} & H & W[U]/L[U] & src & RM & dest & OP-FP \\
+\end{tabular}
+\end{center}
+
+New floating-point-to-floating-point conversion instructions are added. These
+instructions are defined analogously to the double-precision
+floating-point-to-floating-point conversion instructions.
+FCVT.S.H or FCVT.H.S converts a half-precision floating-point number to
+a single-precision floating-point number, or vice-versa, respectively.
+If the D extension is present, FCVT.D.H or FCVT.H.D converts a half-precision
+floating-point number to a double-precision floating-point number, or
+vice-versa, respectively.
+If the Q extension is present, FCVT.Q.H or FCVT.H.Q converts a half-precision
+floating-point number to a quad-precision floating-point number, or
+vice-versa, respectively.
+
+\vspace{-0.2in}
+\begin{center}
+\begin{tabular}{R@{}F@{}R@{}R@{}F@{}R@{}O}
+\\
+\instbitrange{31}{27} &
+\instbitrange{26}{25} &
+\instbitrange{24}{20} &
+\instbitrange{19}{15} &
+\instbitrange{14}{12} &
+\instbitrange{11}{7} &
+\instbitrange{6}{0} \\
+\hline
+\multicolumn{1}{|c|}{funct5} &
+\multicolumn{1}{c|}{fmt} &
+\multicolumn{1}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{opcode} \\
+\hline
+5 & 2 & 5 & 5 & 3 & 5 & 7 \\
+FCVT.S.H & S & H & src & RM & dest & OP-FP \\
+FCVT.H.S & H & S & src & RM & dest & OP-FP \\
+FCVT.D.H & D & H & src & RM & dest & OP-FP \\
+FCVT.H.D & H & D & src & RM & dest & OP-FP \\
+FCVT.Q.H & Q & H & src & RM & dest & OP-FP \\
+FCVT.H.Q & H & Q & src & RM & dest & OP-FP \\
+\end{tabular}
+\end{center}
+
+Floating-point to floating-point sign-injection instructions, FSGNJ.H,
+FSGNJN.H, and FSGNJX.H, are defined analogously to the single-precision
+sign-injection instructions.
+
+\vspace{-0.2in}
+\begin{center}
+\begin{tabular}{R@{}F@{}R@{}R@{}F@{}R@{}O}
+\\
+\instbitrange{31}{27} &
+\instbitrange{26}{25} &
+\instbitrange{24}{20} &
+\instbitrange{19}{15} &
+\instbitrange{14}{12} &
+\instbitrange{11}{7} &
+\instbitrange{6}{0} \\
+\hline
+\multicolumn{1}{|c|}{funct5} &
+\multicolumn{1}{c|}{fmt} &
+\multicolumn{1}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{opcode} \\
+\hline
+5 & 2 & 5 & 5 & 3 & 5 & 7 \\
+FSGNJ & H & src2 & src1 & J[N]/JX & dest & OP-FP \\
+\end{tabular}
+\end{center}
+
+
+Instructions are provided to move bit patterns between the floating-point and
+integer registers.
+FMV.X.H moves the half-precision value in floating-point register {\em rs1} to
+a representation in IEEE 754-2008 standard encoding in integer register {\em
+rd}, filling the upper XLEN-16 bits with copies of the floating-point number's
+sign bit.
+
+FMV.H.X moves the half-precision value encoded in IEEE 754-2008 standard
+encoding from the lower 16 bits of integer register {\em rs1} to the
+floating-point register {\em rd}, NaN-boxing the result.
+
+FMV.X.H and FMV.H.X do not modify the bits being transferred; in particular,
+the payloads of non-canonical NaNs are preserved.
+
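The two move instructions can be modeled on raw bit patterns. The following is a sketch under stated assumptions rather than a definition: the function names are hypothetical, the floating-point register is assumed to be 32 bits wide (F without D or Q), and XLEN is assumed to be 64.

```c
#include <stdint.h>

/*
 * Hypothetical model of FMV.H.X with a 32-bit floating-point register:
 * the low 16 bits of the integer register are copied unchanged and the
 * upper 16 bits are set to all ones (NaN-boxing).
 */
static uint32_t fmv_h_x(uint64_t xreg)
{
    return UINT32_C(0xFFFF0000) | (uint32_t)(xreg & 0xFFFF);
}

/*
 * Hypothetical model of FMV.X.H with XLEN=64: the low 16 bits of the
 * floating-point register are copied unchanged and the upper XLEN-16 bits
 * are filled with copies of the half-precision sign bit (bit 15).
 */
static uint64_t fmv_x_h(uint32_t freg)
{
    uint64_t v = freg & 0xFFFFu;
    if (v & 0x8000u)
        v |= ~UINT64_C(0xFFFF);
    return v;
}
```

Neither function inspects the exponent or significand, which reflects the statement that NaN payloads pass through unmodified.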
+\vspace{-0.2in}
+\begin{center}
+\begin{tabular}{R@{}F@{}R@{}R@{}F@{}R@{}O}
+\\
+\instbitrange{31}{27} &
+\instbitrange{26}{25} &
+\instbitrange{24}{20} &
+\instbitrange{19}{15} &
+\instbitrange{14}{12} &
+\instbitrange{11}{7} &
+\instbitrange{6}{0} \\
+\hline
+\multicolumn{1}{|c|}{funct5} &
+\multicolumn{1}{c|}{fmt} &
+\multicolumn{1}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{opcode} \\
+\hline
+5 & 2 & 5 & 5 & 3 & 5 & 7 \\
+FMV.X.H & H & 0 & src & 000 & dest & OP-FP \\
+FMV.H.X & H & 0 & src & 000 & dest & OP-FP \\
+\end{tabular}
+\end{center}
+
+\section{Half-Precision Floating-Point Compare Instructions}
+
+The half-precision floating-point compare instructions are
+defined analogously to their single-precision counterparts, but operate on
+half-precision operands.
+
+\vspace{-0.2in}
+\begin{center}
+\begin{tabular}{S@{}F@{}R@{}R@{}F@{}R@{}O}
+\\
+\instbitrange{31}{27} &
+\instbitrange{26}{25} &
+\instbitrange{24}{20} &
+\instbitrange{19}{15} &
+\instbitrange{14}{12} &
+\instbitrange{11}{7} &
+\instbitrange{6}{0} \\
+\hline
+\multicolumn{1}{|c|}{funct5} &
+\multicolumn{1}{c|}{fmt} &
+\multicolumn{1}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{opcode} \\
+\hline
+5 & 2 & 5 & 5 & 3 & 5 & 7 \\
+FCMP & H & src2 & src1 & EQ/LT/LE & dest & OP-FP \\
+\end{tabular}
+\end{center}
+
+\section{Half-Precision Floating-Point Classify Instruction}
+
+The half-precision floating-point classify instruction, FCLASS.H, is
+defined analogously to its single-precision counterpart, but operates on
+half-precision operands.
+
+\vspace{-0.2in}
+\begin{center}
+\begin{tabular}{S@{}F@{}R@{}R@{}F@{}R@{}O}
+\\
+\instbitrange{31}{27} &
+\instbitrange{26}{25} &
+\instbitrange{24}{20} &
+\instbitrange{19}{15} &
+\instbitrange{14}{12} &
+\instbitrange{11}{7} &
+\instbitrange{6}{0} \\
+\hline
+\multicolumn{1}{|c|}{funct5} &
+\multicolumn{1}{c|}{fmt} &
+\multicolumn{1}{c|}{rs2} &
+\multicolumn{1}{c|}{rs1} &
+\multicolumn{1}{c|}{rm} &
+\multicolumn{1}{c|}{rd} &
+\multicolumn{1}{c|}{opcode} \\
+\hline
+5 & 2 & 5 & 5 & 3 & 5 & 7 \\
+FCLASS & H & 0 & src & 001 & dest & OP-FP \\
+\end{tabular}
+\end{center}
+
+\section{``Zfhmin'' Standard Extension for Minimal Half-Precision Floating-Point Support}
+
+{\bf Warning! This draft specification may change before being
+accepted as standard by RISC-V International.}
+
+This section describes the Zfhmin standard extension, which provides minimal
+support for 16-bit half-precision binary floating-point instructions.
+The Zfhmin extension is a subset of the Zfh extension, consisting only
+of data transfer and conversion instructions.
+Like Zfh, the Zfhmin extension depends on the single-precision floating-point
+extension, F.
+The expectation is that Zfhmin software primarily uses the half-precision
+format for storage, performing most computation in higher precision.
+
+The Zfhmin extension includes the following instructions from the Zfh
+extension: FLH, FSH, FMV.X.H, FMV.H.X, FCVT.S.H, and FCVT.H.S.
+If the D extension is present, the FCVT.D.H and FCVT.H.D instructions are
+also included.
+If the Q extension is present, the FCVT.Q.H and FCVT.H.Q instructions are
+additionally included.
+
+\begin{commentary}
+Zfhmin does not include the FSGNJ.H instruction, because it suffices to
+instead use the FSGNJ.S instruction to move half-precision values between
+floating-point registers.
+\end{commentary}
+
+\begin{commentary}
+Half-precision addition, subtraction, multiplication, division, and
+square-root operations can be faithfully emulated by converting the
+half-precision operands to single-precision, performing the operation
+using single-precision arithmetic, then converting back to
+half-precision~\cite{roux:hal-01091186}.
+Performing half-precision fused multiply-addition using this method incurs
+a 1-ulp error on some inputs for the RNE and RMM rounding modes.
+
+Conversion from 8- or 16-bit integers to half-precision can be emulated by
+first converting to single-precision, then converting to half-precision.
+Conversion from 32-bit integer can be emulated by first converting to
+double-precision.
+If the D extension is not present and a 1-ulp error under RNE or RMM is
+tolerable, 32-bit integers can be first converted to single-precision instead.
+The same remark applies to conversions from 64-bit integers without the Q
+extension.
+\end{commentary}
diff --git a/src/zfinx.tex b/src/zfinx.tex
new file mode 100644
index 0000000..0dab033
--- /dev/null
+++ b/src/zfinx.tex
@@ -0,0 +1,159 @@
+\chapter{``Zfinx'', ``Zdinx'', ``Zhinx'', ``Zhinxmin'': Standard Extensions for Floating-Point in Integer Registers, Version 1.0.0-rc}
+\label{sec:zfinx}
+
+This chapter is in the Frozen state. Change is extremely unlikely. A high threshold will be used,
+and a change will only occur because of some truly critical issue being identified during the
+public review cycle. Any other desired or needed changes can be the subject of a follow-on
+new extension. For more info see: http://riscv.org/spec-state.
+
+This chapter defines the ``Zfinx'' extension (pronounced ``z-f-in-x'')
+that provides instructions similar to those in the standard
+floating-point F extension for single-precision floating-point
+instructions but which operate on the {\tt x} registers instead of the
+{\tt f} registers. This chapter also defines the ``Zdinx'',
+``Zhinx'', and ``Zhinxmin'' extensions that provide similar
+instructions for other floating-point precisions.
+
+\begin{commentary}
+The F extension uses separate {\tt f} registers for floating-point
+computation, to reduce register pressure and simplify the provision of
+register-file ports for wide superscalars.
+However, the additional \wunits{128}{B} of architectural state increases the
+minimal implementation cost.
+By eliminating the {\tt f} registers, the Zfinx extension substantially
+reduces the cost of simple RISC-V implementations with floating-point
+instruction-set support.
+Zfinx also reduces context-switch cost.
+
+In general, software that assumes the presence of the F extension
+is incompatible with software that assumes the presence of the Zfinx
+extension, and vice versa.
+\end{commentary}
+
+The Zfinx extension adds all of the instructions that the F extension
+adds, {\em except} for the transfer instructions FLW, FSW, FMV.W.X,
+FMV.X.W, C.FLW[SP], and C.FSW[SP].
+
+\begin{commentary}
+Zfinx software uses integer loads and stores to transfer floating-point values
+from and to memory.
+Transfers between registers use either integer arithmetic or floating-point
+sign-injection instructions.
+\end{commentary}
+
+The Zfinx variants of these F-extension instructions have the same semantics,
+except that whenever such an instruction would have accessed an {\tt f}
+register, it instead accesses the {\tt x} register with the same number.
+
+\section{Processing of Narrower Values}
+
+Floating-point operands of width \mbox{{\em w} $<$ XLEN bits} occupy bits
+\mbox{{\em w}-1:0} of an {\tt x} register.
+Floating-point operations on {\em w}-bit operands ignore operand bits
+\mbox{XLEN-1:{\em w}}.
+
+Floating-point operations that produce \mbox{{\em w} $<$ XLEN-bit} results
+fill bits \mbox{XLEN-1:{\em w}} with copies of bit \mbox{{\em w}-1} (the
+sign bit).
+
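The result rule above amounts to sign-extending a {\em w}-bit pattern to XLEN bits. A minimal sketch follows, assuming XLEN is at most 64 and using a hypothetical helper name; it treats the floating-point result purely as a bit pattern.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of how a w-bit floating-point result is placed in an
 * x register under Zfinx: bits w-1:0 hold the result and bits XLEN-1:w are
 * copies of bit w-1. The returned value is right-justified in 64 bits.
 */
static uint64_t zfinx_result(uint64_t value, unsigned w, unsigned xlen)
{
    assert(w >= 1 && w < xlen && xlen <= 64);

    uint64_t low_mask = (UINT64_C(1) << w) - 1;   /* w < 64 here */
    uint64_t r = value & low_mask;

    /* Replicate the sign bit (bit w-1) into the upper bits. */
    if ((r >> (w - 1)) & 1)
        r |= ~low_mask;

    /* Keep only XLEN bits when modeling a register narrower than 64 bits. */
    if (xlen < 64)
        r &= (UINT64_C(1) << xlen) - 1;
    return r;
}
```

For instance, under these assumptions the single-precision pattern for $-1.0$ (0xBF800000) held in an RV64 register reads back as 0xFFFFFFFFBF800000.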
+\begin{commentary}
+The NaN-boxing scheme employed in the {\tt f} registers was designed to
+efficiently support recoded floating-point formats.
+Recoding is less practical for Zfinx, though, since the same registers
+hold both floating-point and integer operands.
+Hence, the need for NaN boxing is diminished.
+
+Sign-extending 32-bit floating-point numbers when held in RV64 {\tt x}
+registers matches the existing RV64 calling conventions, which require all
+32-bit types to be sign-extended when passed or returned in {\tt x} registers.
+To keep the architecture more regular, we extend this pattern to 16-bit
+floating-point numbers in both RV32 and RV64.
+\end{commentary}
+
+\section{Zdinx}
+
+The Zdinx extension provides analogous double-precision floating-point
+instructions.
+The Zdinx extension requires the Zfinx extension.
+
+The Zdinx extension adds all of the instructions that the D extension
+adds, {\em except} for the transfer instructions FLD, FSD, FMV.D.X,
+FMV.X.D, C.FLD[SP], and C.FSD[SP].
+
+The Zdinx variants of these D-extension instructions have the same semantics,
+except that whenever such an instruction would have accessed an {\tt f}
+register, it instead accesses the {\tt x} register with the same number.
+
+\section{Processing of Wider Values}
+
+Double-precision operands in RV32Zdinx
+are held in aligned {\tt x}-register pairs, i.e.,
+register numbers must be even.
+Use of misaligned (odd-numbered) registers for double-width floating-point
+operands is {\em reserved}.
+
+Regardless of endianness, the lower-numbered register holds the low-order
+bits, and the higher-numbered register holds the high-order bits: e.g., bits
+31:0 of a double-precision operand in RV32Zdinx might be held in register
+{\tt x14}, with bits 63:32 of that operand held in {\tt x15}.
+
+When a double-width floating-point result is written to {\tt x0}, the entire
+write has no effect: e.g., for RV32Zdinx, writing a double-precision result
+to {\tt x0} does not cause {\tt x1} to be written.
+
+When {\tt x0} is used as a double-width floating-point operand, the entire
+operand is zero---i.e., {\tt x1} is not accessed.
+
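The register-pair rules for RV32Zdinx can be sketched as a pair of accessors over a 32-entry register file. This is an illustrative model with hypothetical names; because odd register numbers are reserved, the sketch simply reports them as an error rather than choosing a behavior.

```c
#include <stdint.h>

/*
 * Hypothetical model of reading a double-width operand in RV32Zdinx.
 * Returns 0 on success, -1 for a reserved (odd or out-of-range) register.
 */
static int zdinx_read_pair(const uint32_t x[32], unsigned r, uint64_t *out)
{
    if (r >= 32 || (r & 1))
        return -1;              /* odd-numbered registers are reserved */
    if (r == 0) {
        *out = 0;               /* x0 as an operand: whole value is zero */
        return 0;
    }
    /* Lower-numbered register holds bits 31:0, the next holds bits 63:32. */
    *out = ((uint64_t)x[r + 1] << 32) | x[r];
    return 0;
}

/*
 * Hypothetical model of writing a double-width result in RV32Zdinx.
 * Returns 0 on success, -1 for a reserved (odd or out-of-range) register.
 */
static int zdinx_write_pair(uint32_t x[32], unsigned r, uint64_t v)
{
    if (r >= 32 || (r & 1))
        return -1;
    if (r == 0)
        return 0;               /* write to x0 is dropped; x1 is untouched */
    x[r] = (uint32_t)v;
    x[r + 1] = (uint32_t)(v >> 32);
    return 0;
}
```

The split is by register number, not by memory order, which is why endianness does not appear anywhere in the sketch.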
+\begin{commentary}
+Load-pair and store-pair instructions are not provided, so transferring
+double-precision operands in RV32Zdinx from or to memory requires
+two loads or stores.
+Register moves need only a single FSGNJ.D instruction, however.
+\end{commentary}
+
+\section{Zhinx}
+
+The Zhinx extension provides analogous half-precision floating-point
+instructions.
+The Zhinx extension requires the Zfinx extension.
+
+The Zhinx extension adds all of the instructions that the Zfh extension
+adds, {\em except} for the transfer instructions FLH, FSH, FMV.H.X,
+and FMV.X.H.
+
+The Zhinx variants of these Zfh-extension instructions have the same semantics,
+except that whenever such an instruction would have accessed an {\tt f}
+register, it instead accesses the {\tt x} register with the same number.
+
+\section{Zhinxmin}
+
+The Zhinxmin extension provides minimal support for 16-bit half-precision
+floating-point instructions that operate on the {\tt x} registers.
+The Zhinxmin extension requires the Zfinx extension.
+
+The Zhinxmin extension includes the following instructions from the Zhinx
+extension: FCVT.S.H and FCVT.H.S.
+If the Zdinx extension is present, the FCVT.D.H and FCVT.H.D instructions are
+also included.
+
+\begin{commentary}
+In the future, an RV64Zqinx quad-precision extension could be defined analogously
+to RV32Zdinx.
+An RV32Zqinx extension could also be defined but would require
+quad-register groups.
+\end{commentary}
+
+\section{Privileged Architecture Implications}
+
+In the standard privileged architecture defined in Volume II, the
+{\tt mstatus} field FS is hardwired to 0 if the Zfinx extension is
+implemented, and FS no longer affects the trapping behavior of
+floating-point instructions or {\tt fcsr} accesses.
+
+The {\tt misa} bits F, D, and Q are hardwired to 0 when the Zfinx
+extension is implemented.
+
+\begin{commentary}
+A future discoverability mechanism might be used to probe the existence
+of the Zfinx, Zhinx, and Zdinx extensions.
+\end{commentary}
diff --git a/src/zihintpause.tex b/src/zihintpause.tex
index fd652a2..7baf9cb 100644
--- a/src/zihintpause.tex
+++ b/src/zihintpause.tex
@@ -1,4 +1,4 @@
-\chapter{``Zihintpause'' Pause Hint, Version 1.0}
+\chapter{``Zihintpause'' Pause Hint, Version 2.0}
\label{chap:zihintpause}
The PAUSE instruction is a HINT that indicates the current hart's rate of