author | Andrew Waterman <andrew@sifive.com> | 2021-02-11 15:17:20 -0800 |
---|---|---|
committer | GitHub <noreply@github.com> | 2021-02-11 15:17:20 -0800 |
commit | d7a3a96a1d9bc8b8e671aeb7dcb5fd17fac0b7d3 (patch) | |
tree | cb174c7c2b476e03b99ee0f99a83cdcf4e4e846c | |
parent | 0ca4c684008f655c6c5b8933a82789f2fc069139 (diff) | |
parent | a478e1a19663b071fca208a3971f85e43279a93e (diff) | |
Merge pull request #398 from riscv/pause
Add PAUSE hint
-rw-r--r-- | src/preface.tex | 1 |
-rw-r--r-- | src/riscv-spec.tex | 1 |
-rw-r--r-- | src/rv32.tex | 49 |
-rw-r--r-- | src/rv64.tex | 15 |
-rw-r--r-- | src/zihintpause.tex | 50 |
5 files changed, 102 insertions, 14 deletions
diff --git a/src/preface.tex b/src/preface.tex
index 8e5442e..1e420c6 100644
--- a/src/preface.tex
+++ b/src/preface.tex
@@ -110,6 +110,7 @@ The changes in this version of the document include:
   December 2019.
 \item Defined big-endian ISA variant.
 \item Moved N extension for user-mode interrupts into Volume II.
+\item Defined PAUSE hint instruction.
 \end{itemize}
 
 \section*{Preface to Document Version 20190608-Base-Ratified}
diff --git a/src/riscv-spec.tex b/src/riscv-spec.tex
index 7a4200e..8ef9653 100644
--- a/src/riscv-spec.tex
+++ b/src/riscv-spec.tex
@@ -79,6 +79,7 @@ Andrew Waterman and Krste Asanovi\'{c}, RISC-V Foundation, \specmonthyear.
 \input{intro}
 \input{rv32}
 \input{zifencei}
+\input{zihintpause}
 \input{rv32e}
 \input{rv64}
 \input{rv128}
diff --git a/src/rv32.tex b/src/rv32.tex
index 7c7970f..3c67e90 100644
--- a/src/rv32.tex
+++ b/src/rv32.tex
@@ -1383,29 +1383,44 @@ supervisor-level operating system or debugger.
 
 RV32I reserves a large encoding space for HINT instructions, which are
 usually used to communicate performance hints to the
-microarchitecture. HINTs are encoded as integer computational
-instructions with {\em rd}={\tt x0}. Hence, like the NOP instruction,
-HINTs do not change any architecturally visible state, except for
-advancing the {\tt pc} and any applicable performance counters.
+microarchitecture.
+Like the NOP instruction, HINTs do not change any architecturally visible
+state, except for advancing the {\tt pc} and any applicable performance
+counters.
 Implementations are always allowed to ignore the encoded hints.
+Most RV32I HINTs are encoded as integer computational instructions with
+{\em rd}={\tt x0}.
+The other RV32I HINTs are encoded as FENCE instructions with a null
+predecessor or successor set and with {\em fm}=0.
+
 \begin{commentary}
-This HINT encoding has been chosen so that simple implementations can ignore
-HINTs altogether, and instead execute a HINT as a regular computational
+These HINT encodings have been chosen so that simple implementations can ignore
+HINTs altogether, and instead execute a HINT as a regular
 instruction that happens not to mutate the architectural state. For
 example, ADD is a HINT if the destination register is {\tt x0}; the
 five-bit {\em rs1} and {\em rs2} fields encode arguments to the HINT.
 However, a simple implementation can simply execute the HINT as an ADD of
 {\em rs1} and {\em rs2} that writes {\tt x0}, which has no
 architecturally visible effect.
+
+As another example, a FENCE instruction with a zero {\em pred} field and
+a zero {\em fm} field is a HINT; the {\em succ}, {\em rs1}, and {\em rd}
+fields encode the arguments to the HINT.
+A simple implementation can simply execute the HINT as a FENCE that orders the
+null set of prior memory accesses before whichever subsequent memory accesses
+are encoded in the {\em succ} field.
+Since the intersection of the predecessor and successor sets is null, the
+instruction imposes no memory orderings, and so it has no architecturally
+visible effect.
 \end{commentary}
 
 Table~\ref{tab:rv32i-hints} lists all RV32I HINT code points. 91\% of the HINT
-space is reserved for standard HINTs, but none are presently defined. The
-remainder of the HINT space is designated for custom HINTs; no standard HINTs
+space is reserved for standard HINTs. The
+remainder of the HINT space is designated for custom HINTs: no standard HINTs
 will ever be defined in this subspace.
 
 \begin{commentary}
-No standard hints are presently defined. We anticipate
+We anticipate
 standard hints to eventually include memory-system spatial and
 temporal locality hints, branch prediction hints, thread-scheduling
 hints, security tags, and instrumentation flags for
@@ -1417,7 +1432,7 @@ simulation/emulation.
   \begin{tabular}{|l|l|c|l|}
     \hline
     Instruction & Constraints & Code Points & Purpose \\ \hline \hline
-    LUI & {\em rd}={\tt x0} & $2^{20}$ & \multirow{15}{*}{\em Reserved for future standard use} \\ \cline{1-3}
+    LUI & {\em rd}={\tt x0} & $2^{20}$ & \multirow{25}{*}{\em Reserved for future standard use} \\ \cline{1-3}
     AUIPC & {\em rd}={\tt x0} & $2^{20}$ & \\ \cline{1-3}
     \multirow{2}{*}{ADDI} & {\em rd}={\tt x0}, and either & \multirow{2}{*}{$2^{17}-1$} & \\
                           & {\em rs1}$\neq${\tt x0} or {\em imm}$\neq$0 & & \\ \cline{1-3}
@@ -1432,8 +1447,18 @@ simulation/emulation.
     SLL & {\em rd}={\tt x0} & $2^{10}$ & \\ \cline{1-3}
     SRL & {\em rd}={\tt x0} & $2^{10}$ & \\ \cline{1-3}
     SRA & {\em rd}={\tt x0} & $2^{10}$ & \\ \cline{1-3}
-    \multirow{2}{*}{FENCE}& {\em fm=0}, and either & \multirow{2}{*}{$2^{5}-1$} & \\
-                          & {\em pred}=0 or {\em succ}=0 & & \\ \hline \hline
+    \multirow{3}{*}{FENCE}& {\em rd}={\tt x0}, {\em rs1}$\neq${\tt x0}, & \multirow{3}{*}{$2^{10}-63$}& \\
+                          & {\em fm}=0, and either & & \\
+                          & {\em pred}=0 or {\em succ}=0 & & \\ \cline{1-3}
+    \multirow{3}{*}{FENCE}& {\em rd}$\neq${\tt x0}, {\em rs1}={\tt x0}, & \multirow{3}{*}{$2^{10}-63$}& \\
+                          & {\em fm}=0, and either & & \\
+                          & {\em pred}=0 or {\em succ}=0 & & \\ \cline{1-3}
+    \multirow{2}{*}{FENCE}& {\em rd}={\em rs1}={\tt x0}, {\em fm}=0, & \multirow{2}{*}{15} & \\
+                          & {\em pred}=0, {\em succ}$\neq$0 & & \\ \cline{1-3}
+    \multirow{2}{*}{FENCE}& {\em rd}={\em rs1}={\tt x0}, {\em fm}=0, & \multirow{2}{*}{15} & \\
+                          & {\em pred}$\neq$W, {\em succ}=0 & & \\ \hline
+    \multirow{2}{*}{FENCE}& {\em rd}={\em rs1}={\tt x0}, {\em fm}=0, & \multirow{2}{*}{1} & \multirow{2}{*}{PAUSE} \\
+                          & {\em pred}=W, {\em succ}=0 & & \\ \hline \hline
     SLTI & {\em rd}={\tt x0} & $2^{17}$ & \multirow{7}{*}{\em Designated for custom use} \\ \cline{1-3}
     SLTIU & {\em rd}={\tt x0} & $2^{17}$ & \\ \cline{1-3}
     SLLI & {\em rd}={\tt x0} & $2^{10}$ & \\ \cline{1-3}
diff --git a/src/rv64.tex b/src/rv64.tex
index 79b1a30..253f2d3 100644
--- a/src/rv64.tex
+++ b/src/rv64.tex
@@ -274,7 +274,7 @@ will ever be defined in this subspace.
   \begin{tabular}{|l|l|c|l|}
     \hline
     Instruction & Constraints & Code Points & Purpose \\ \hline \hline
-    LUI & {\em rd}={\tt x0} & $2^{20}$ & \multirow{21}{*}{\em Reserved for future standard use} \\ \cline{1-3}
+    LUI & {\em rd}={\tt x0} & $2^{20}$ & \multirow{32}{*}{\em Reserved for future standard use} \\ \cline{1-3}
     AUIPC & {\em rd}={\tt x0} & $2^{20}$ & \\ \cline{1-3}
     \multirow{2}{*}{ADDI} & {\em rd}={\tt x0}, and either & \multirow{2}{*}{$2^{17}-1$} & \\
                           & {\em rs1}$\neq${\tt x0} or {\em imm}$\neq$0 & & \\ \cline{1-3}
@@ -295,7 +295,18 @@ will ever be defined in this subspace.
     SLLW & {\em rd}={\tt x0} & $2^{10}$ & \\ \cline{1-3}
     SRLW & {\em rd}={\tt x0} & $2^{10}$ & \\ \cline{1-3}
     SRAW & {\em rd}={\tt x0} & $2^{10}$ & \\ \cline{1-3}
-    FENCE & {\em pred}=0 or {\em succ}=0 & $2^{5}-1$ & \\ \hline \hline
+    \multirow{3}{*}{FENCE}& {\em rd}={\tt x0}, {\em rs1}$\neq${\tt x0}, & \multirow{3}{*}{$2^{10}-63$}& \\
+                          & {\em fm}=0, and either & & \\
+                          & {\em pred}=0 or {\em succ}=0 & & \\ \cline{1-3}
+    \multirow{3}{*}{FENCE}& {\em rd}$\neq${\tt x0}, {\em rs1}={\tt x0}, & \multirow{3}{*}{$2^{10}-63$}& \\
+                          & {\em fm}=0, and either & & \\
+                          & {\em pred}=0 or {\em succ}=0 & & \\ \cline{1-3}
+    \multirow{2}{*}{FENCE}& {\em rd}={\em rs1}={\tt x0}, {\em fm}=0, & \multirow{2}{*}{15} & \\
+                          & {\em pred}=0, {\em succ}$\neq$0 & & \\ \cline{1-3}
+    \multirow{2}{*}{FENCE}& {\em rd}={\em rs1}={\tt x0}, {\em fm}=0, & \multirow{2}{*}{15} & \\
+                          & {\em pred}$\neq$W, {\em succ}=0 & & \\ \hline
+    \multirow{2}{*}{FENCE}& {\em rd}={\em rs1}={\tt x0}, {\em fm}=0, & \multirow{2}{*}{1} & \multirow{2}{*}{PAUSE} \\
+                          & {\em pred}=W, {\em succ}=0 & & \\ \hline \hline
     SLTI & {\em rd}={\tt x0} & $2^{17}$ & \multirow{10}{*}{\em Designated for custom use} \\ \cline{1-3}
     SLTIU & {\em rd}={\tt x0} & $2^{17}$ & \\ \cline{1-3}
     SLLI & {\em rd}={\tt x0} & $2^{11}$ & \\ \cline{1-3}
diff --git a/src/zihintpause.tex b/src/zihintpause.tex
new file mode 100644
index 0000000..a8a6326
--- /dev/null
+++ b/src/zihintpause.tex
@@ -0,0 +1,50 @@
+\chapter{``Zihintpause'' Pause Hint, Version 1.0}
+\label{chap:zihintpause}
+
+The PAUSE instruction is a HINT that indicates the current hart's rate of
+instruction retirement should be temporarily reduced or paused. The duration of its
+effect must be bounded and may be zero. No architectural state is changed.
+
+\begin{commentary}
+Software can use the PAUSE instruction to reduce energy consumption while
+executing spin-wait code sequences. Multithreaded cores might temporarily
+relinquish execution resources to other harts when PAUSE is executed.
+It is recommended that a PAUSE instruction generally be included in the code
+sequence for a spin-wait loop.
+
+A future extension might add primitives similar to the x86 MONITOR/MWAIT
+instructions, which provide a more efficient mechanism to wait on writes to
+a specific memory location.
+However, these instructions would not supplant PAUSE.
+PAUSE is more appropriate when polling for non-memory events, when polling for
+multiple events, or when software does not know precisely what events it is
+polling for.
+
+The duration of a PAUSE instruction's effect may vary significantly within and
+among implementations.
+In typical implementations this duration should be much less than the time to perform a context switch, probably more on the rough order of an on-chip cache miss latency or a cacheless access to main memory.
+
+A series of PAUSE instructions can be used to create a cumulative delay loosely
+proportional to the number of PAUSE instructions.
+In spin-wait loops in portable code, however, only one PAUSE instruction should
+be used before re-evaluating loop conditions, else the hart might stall longer
+than optimal on some implementations, degrading system performance.
+\end{commentary}
+
+PAUSE is encoded as a FENCE instruction with {\em pred}=W, {\em succ}=0,
+{\em fm}=0, {\em rd}={\tt x0}, and {\em rs1}={\tt x0}.
+
+\begin{commentary}
+PAUSE is encoded as a hint within the FENCE opcode because some
+implementations are expected to deliberately stall the PAUSE instruction until outstanding
+memory transactions have completed.
+Because the successor set is null, however, PAUSE does not {\em mandate} any
+particular memory ordering---hence, it truly is a HINT.
+
+Like other FENCE instructions, PAUSE cannot be used within LR/SC sequences
+without voiding the forward-progress guarantee.
+
+The choice of a predecessor set of W is arbitrary, since the successor set is
+null.
+Other HINTs similar to PAUSE might be encoded with other predecessor sets.
+\end{commentary}
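
As a rough cross-check of the FENCE rows this commit adds to the HINT tables, the sketch below packs the PAUSE fields into an instruction word and counts the code points in each new row by brute force. It assumes the usual FENCE field layout from the base ISA (fm in bits 31:28, pred in 27:24, succ in 23:20, rs1 in 19:15, funct3=000, rd in 11:7, MISC-MEM opcode 0001111) and the I, O, R, W bit order within pred and succ; none of that is spelled out in this diff. The helper names `encode_fence`, `hint_row`, and `count_hint_rows` are hypothetical and exist only for illustration.

```python
from collections import Counter

# Assumed FENCE layout (I-type, MISC-MEM major opcode); not stated in the diff.
OPCODE_MISC_MEM = 0b0001111
FUNCT3_FENCE = 0b000

# Assumed bit order of the 4-bit pred/succ sets: I, O, R, W from MSB to LSB.
I, O, R, W = 0b1000, 0b0100, 0b0010, 0b0001


def encode_fence(fm, pred, succ, rs1=0, rd=0):
    """Pack FENCE fields into a 32-bit instruction word (hypothetical helper)."""
    fields = (("fm", fm, 4), ("pred", pred, 4), ("succ", succ, 4),
              ("rs1", rs1, 5), ("rd", rd, 5))
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
    return ((fm << 28) | (pred << 24) | (succ << 20) | (rs1 << 15)
            | (FUNCT3_FENCE << 12) | (rd << 7) | OPCODE_MISC_MEM)


def hint_row(pred, succ, rs1, rd):
    """Name the table row matched by an fm=0 FENCE, or None if no row matches."""
    null_set = pred == 0 or succ == 0
    if rd == 0 and rs1 != 0 and null_set:
        return "rd=x0, rs1!=x0"
    if rd != 0 and rs1 == 0 and null_set:
        return "rd!=x0, rs1=x0"
    if rd == 0 and rs1 == 0:
        if pred == 0 and succ != 0:
            return "pred=0, succ!=0"
        if succ == 0:
            return "PAUSE" if pred == W else "pred!=W, succ=0"
    return None


def count_hint_rows():
    """Enumerate every fm=0 FENCE encoding and tally the code points per row."""
    counts = Counter()
    for rd in range(32):
        for rs1 in range(32):
            for pred in range(16):
                for succ in range(16):
                    row = hint_row(pred, succ, rs1, rd)
                    if row is not None:
                        counts[row] += 1
    return counts


PAUSE = encode_fence(fm=0, pred=W, succ=0)
counts = count_hint_rows()
print(f"{PAUSE:#010x}")  # 0x0100000f
for row, n in sorted(counts.items()):
    print(f"{row}: {n}")
```

Under those assumptions the tallies agree with the table: 2^10-63 code points for each of the two rows with exactly one nonzero register field, 15 for each of the two reserved rows with rd=rs1=x0, and a single code point for PAUSE.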