author | Bill Traynor <wmat@riscv.org> | 2024-02-27 11:34:14 -0500 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-02-27 11:34:14 -0500 |
commit | 14c5798ba6272d1faf419626dd31c9659b98cbfe (patch) | |
tree | 7e339961fb9fdedd38dbb4e769f0eeb992f77f29 | |
parent | d28d56b5225ae44c811fa1422758e0e951edddc0 (diff) | |
parent | b1940473272185c5bd2059c4663ed7537b85b3e3 (diff) | |
Merge branch 'main' into cmo
Signed-off-by: Bill Traynor <wmat@riscv.org>
-rw-r--r-- | marchid.md | 1 |
-rw-r--r-- | src/c-st-ext.adoc | 4 |
-rw-r--r-- | src/f-st-ext.adoc | 2 |
-rw-r--r-- | src/images/bytefield/hstatusreg-rv32.edn | 4 |
-rw-r--r-- | src/images/bytefield/hstatusreg.edn | 4 |
-rw-r--r-- | src/images/bytefield/hypv-mstatus.edn | 6 |
-rw-r--r-- | src/images/bytefield/vsstatusreg.edn | 6 |
-rw-r--r-- | src/images/wavedrom/ct-unconditional-2.adoc | 2 |
-rw-r--r-- | src/intro.adoc | 2 |
-rw-r--r-- | src/mm-eplan.adoc | 2 |
-rw-r--r-- | src/mm-formal.adoc | 68 |
-rw-r--r-- | src/resources/themes/riscv-spec.yml | 9 |
-rw-r--r-- | src/riscv-unprivileged.adoc | 15 |
-rw-r--r-- | src/rv-32-64g.adoc | 9 |
-rw-r--r-- | src/rv32.adoc | 2 |
-rw-r--r-- | src/zawrs.adoc | 105 |
-rw-r--r-- | src/zfh.adoc | 2 |
17 files changed, 188 insertions, 55 deletions
@@ -60,3 +60,4 @@ RV6 | Nikola Lukić | [Nikola Lukić](mailto:lukicn
 ApogeoRV | Gabriele Tripi | [Gabriele Tripi](mailto:tripi.gabriele2002@gmail.com) | 40 | https://github.com/GabbedT/ApogeoRV
 MicroRV32 | AGRA, Group of Computer Architecture, University of Bremen | [RISC-V @ AGRA](mailto:riscv@informatik.uni-bremen.de) | 41 | https://github.com/agra-uni-bremen/microrv32
 QEMU | qemu.org | [QEMU Mailing List](mailto:qemu-riscv@nongnu.org) | 42 | https://qemu.org
+KianV | Hirosh Dabui | [Hirosh Dabui](mailto:hirosh@dabui.de) | 43 | https://github.com/splinedrive/kianRiscV
diff --git a/src/c-st-ext.adoc b/src/c-st-ext.adoc
index cfd9538..ca248f6 100644
--- a/src/c-st-ext.adoc
+++ b/src/c-st-ext.adoc
@@ -298,7 +298,7 @@ registers.
 ==== Stack-Pointer-Based Loads and Stores
 include::images/wavedrom/c-sp-load-store.adoc[]
-[c-sp-load-store]
+[[c-sp-load-store]]
 //.Stack-Pointer-Based Loads and Stores--these instructions use the CI format.
 These instructions use the CI format.
@@ -336,7 +336,7 @@ _zero_-extended offset, scaled by 8, to the stack pointer, `x2`. It expands to
 `fld rd, offset(x2)`.
 include::images/wavedrom/c-sp-load-store-css.adoc[]
-[c-sp-load-store-css]
+[[c-sp-load-store-css]]
 //.Stack-Pointer-Based Loads and Stores--these instructions use the CSS format.
 These instructions use the CSS format.
diff --git a/src/f-st-ext.adoc b/src/f-st-ext.adoc
index 54d43ca..24941ed 100644
--- a/src/f-st-ext.adoc
+++ b/src/f-st-ext.adoc
@@ -37,7 +37,7 @@ floating-point register file state can reduce context-switch overhead.
 [[fprs]]
 .RISC-V standard F extension single-precision floating-point state
-[col[s="<|^|>"|option[s="header",width="50%",align="center"grid="rows"]
+[cols="<,^,>",options="header",width="50%",align="center",grid="rows"]
 |===
 | [.small]#FLEN-1#| >| [.small]#0#
 3+^| [.small]#f0#
diff --git a/src/images/bytefield/hstatusreg-rv32.edn b/src/images/bytefield/hstatusreg-rv32.edn
index 02db585..2762ce6 100644
--- a/src/images/bytefield/hstatusreg-rv32.edn
+++ b/src/images/bytefield/hstatusreg-rv32.edn
@@ -51,9 +51,9 @@
 (draw-box "6" {:span 5 :borders {}})
 (draw-box "2" {:span 2 :borders {}})
 (draw-box "1" {:borders {}})
-(draw-box "2" {:span 2 :borders {}})
+(draw-box "1" {:span 2 :borders {}})
 (draw-box "1" {:span 3 :borders {}})
 (draw-box "1" {:span 3 :borders {}})
-(draw-box "2" {:span 2 :borders {}})
+(draw-box "1" {:span 2 :borders {}})
 (draw-box "5" {:span 2 :borders {}})
 ----
diff --git a/src/images/bytefield/hstatusreg.edn b/src/images/bytefield/hstatusreg.edn
index cff75db..cce601e 100644
--- a/src/images/bytefield/hstatusreg.edn
+++ b/src/images/bytefield/hstatusreg.edn
@@ -8,7 +8,7 @@
 (def boxes-per-row 32)
 (draw-box nil {:span 3 :borders {}})
-(draw-box "HSXLEN-1" {:span 8 :borders {} :text-anchor "start"})
+(draw-box "63" {:span 8 :borders {} :text-anchor "start"})
 (draw-box "34" {:borders {}})
 (draw-box "33" {:span 2 :borders {} :text-anchor "start"})
 (draw-box "32" {:span 2 :borders {} :text-anchor "end"})
@@ -31,7 +31,7 @@
 (draw-box nil {:span 3 :borders {}})
 (draw-box nil {:span 3 :borders {}})
-(draw-box "HSXLEN-34" {:span 9 :borders {}})
+(draw-box "30" {:span 9 :borders {}})
 (draw-box "2" {:span 4 :borders {}})
 (draw-box "9" {:span 6 :borders {}})
 (draw-box "1" {:span 2 :borders {}})
diff --git a/src/images/bytefield/hypv-mstatus.edn b/src/images/bytefield/hypv-mstatus.edn
index 2ed4a4d..885dc00 100644
--- a/src/images/bytefield/hypv-mstatus.edn
+++ b/src/images/bytefield/hypv-mstatus.edn
@@ -7,8 +7,8 @@
 (def right-margin 30)
 (def boxes-per-row 32)
-(draw-box "MSXLEN-1" {:span 3 :borders {}})
-(draw-box "MXLEN-2" {:span 4 :text-anchor "start" :borders {}})
+(draw-box "63" {:span 3 :borders {}})
+(draw-box "62" {:span 4 :text-anchor "start" :borders {}})
 (draw-box "40" {:span 4 :text-anchor "end" :borders {}})
 (draw-box "39" {:span 3 :borders {}})
 (draw-box "38" {:span 3 :borders {}})
@@ -31,7 +31,7 @@
 (draw-box nil {:borders {:top :border-unrelated :bottom :border-unrelated}})
 (draw-box "1" {:span 3 :borders {}})
-(draw-box "MXLEN-41" {:span 8 :borders {}})
+(draw-box "23" {:span 8 :borders {}})
 (draw-box "1" {:span 3 :borders {}})
 (draw-box "1" {:span 3 :borders {}})
 (draw-box "1" {:span 3 :borders {}})
diff --git a/src/images/bytefield/vsstatusreg.edn b/src/images/bytefield/vsstatusreg.edn
index 87f4725..95780a6 100644
--- a/src/images/bytefield/vsstatusreg.edn
+++ b/src/images/bytefield/vsstatusreg.edn
@@ -7,8 +7,8 @@
 (def right-margin 30)
 (def boxes-per-row 32)
-(draw-box "VSXLEN-1" {:span 3 :borders {}})
-(draw-box "VSXLEN-2" {:span 5 :text-anchor "start" :borders {}})
+(draw-box "63" {:span 3 :borders {}})
+(draw-box "62" {:span 5 :text-anchor "start" :borders {}})
 (draw-box "34" {:span 5 :text-anchor "end" :borders {}})
 (draw-box "33" {:span 2 :text-anchor "start" :borders {}})
 (draw-box "32" {:span 2 :text-anchor "end" :borders {}})
@@ -30,7 +30,7 @@
 (draw-box nil {:span 2 :borders {}})
 (draw-box "1" {:span 3 :borders {}})
-(draw-box "VSXLEN-35" {:span 10 :borders {}})
+(draw-box "29" {:span 10 :borders {}})
 (draw-box "2" {:span 4 :borders {}})
 (draw-box "12" {:span 6 :borders {}})
 (draw-box "1" {:span 2 :borders {}})
diff --git a/src/images/wavedrom/ct-unconditional-2.adoc b/src/images/wavedrom/ct-unconditional-2.adoc
index ef33a9e..4dda824 100644
--- a/src/images/wavedrom/ct-unconditional-2.adoc
+++ b/src/images/wavedrom/ct-unconditional-2.adoc
@@ -4,7 +4,7 @@
 ....
 {reg: [
  {bits: 7, name: 'opcode', attr: ['7', 'JALR'], type: 8},
- {bits: 5, name: 'rd', attr: ['6', 'dest'], type: 2},
+ {bits: 5, name: 'rd', attr: ['5', 'dest'], type: 2},
  {bits: 3, name: 'funct3', attr: ['3', '0'], type: 8},
  {bits: 5, name: 'rs1', attr: ['5', 'base'], type: 4},
  {bits: 12, name: 'imm[11:0]', attr: ['12', 'offset[11:0]'], type: 3},
diff --git a/src/intro.adoc b/src/intro.adoc
index 53379e7..78d7a34 100644
--- a/src/intro.adoc
+++ b/src/intro.adoc
@@ -195,7 +195,7 @@ environment but must do so in a way that guest harts operate like
 independent hardware threads. In particular, if there are more guest
 harts than host harts then the execution environment must be able to
 preempt the guest harts and must not wait indefinitely for guest
-software on a guest hart to “yield" control of the guest hart.
+software on a guest hart to "yield" control of the guest hart.
 ====
 === RISC-V ISA Overview
diff --git a/src/mm-eplan.adoc b/src/mm-eplan.adoc
index 1243b1d..470a3ab 100644
--- a/src/mm-eplan.adoc
+++ b/src/mm-eplan.adoc
@@ -922,7 +922,7 @@ instruction will be followed by a conditional branch checking whether
 the outcome was successful; this implies that there will be a control
 dependency from the store operation generated by the SC instruction to
 any memory operations following the branch. PPO
-rule <<ppo-ctrl>> in turn implies that any subsequent store
+rule <<ppo, 11>> in turn implies that any subsequent store
 operations will appear later in the global memory order than the store
 operation generated by the SC. However, since control, address, and
 data dependencies are defined over memory operations, and since an
diff --git a/src/mm-formal.adoc b/src/mm-formal.adoc
index 2a49696..fb89914 100644
--- a/src/mm-formal.adoc
+++ b/src/mm-formal.adoc
@@ -525,7 +525,7 @@ a construction of the post-transition model state for each.
 Transitions for all instructions:
-latexmath:[$\bullet$] <<fetch, Fetch instruction>>: This transition represents a fetch and decode of a new instruction instance, as a program order successor of a previously fetched
+* <<fetch, Fetch instruction>>: This transition represents a fetch and decode of a new instruction instance, as a program order successor of a previously fetched
 instruction instance (or the initial fetch address).
 The model assumes the instruction memory is fixed; it does not describe
@@ -534,16 +534,17 @@ not generate memory load operations, and the shared memory is not
 involved in the transition. Instead, the model depends on an external
 oracle that provides an opcode when given a memory location.
-latexmath:[$\circ$] <<reg_write, Register write>>: This is a write of a register value.
+[circle]
+* <<reg_write, Register write>>: This is a write of a register value.
-latexmath:[$\circ$] <<reg_read, Register read>>: This is a read of a register value from the most recent
+* <<reg_read, Register read>>: This is a read of a register value from the most recent
 program-order-predecessor instruction instance that writes to that
 register.
-latexmath:[$\circ$] <<sail_interp, Pseudocode internal step>>: This covers pseudocode internal computation: arithmetic, function
+* <<sail_interp, Pseudocode internal step>>: This covers pseudocode internal computation: arithmetic, function
 calls, etc.
-latexmath:[$\circ$] <<finish, Finish instruction>>: At this point the instruction pseudocode is done, the instruction cannot be restarted, memory accesses cannot be discarded, and all memory
+* <<finish, Finish instruction>>: At this point the instruction pseudocode is done, the instruction cannot be restarted, memory accesses cannot be discarded, and all memory
 effects have taken place.
 For conditional branch and indirect jump instructions, any program order successors that were fetched from an address that is not the one that was written to the _pc_ register are
@@ -552,15 +553,20 @@ them.
 Transitions specific to load instructions:
-latexmath:[$\circ$] <<initiate_load, Initiate memory load operations>>: At this point the memory footprint of the load instruction is
+[circle]
+* <<initiate_load, Initiate memory load operations>>: At this point the memory footprint of the load instruction is
 provisionally known (it could change if earlier instructions are
 restarted) and its individual memory load operations can start being
 satisfied.
-latexmath:[$\bullet$] <<sat_from_forwarding, Satisfy memory load operation by forwarding from unpropogated stores>>: This partially or entirely satisfies a single memory load operation
-by forwarding, from program-order-previous memory store operations.
-latexmath:[$\bullet$] <<sat_from_mem, Satisfy memory load operation from memory>>: This entirely satisfies the outstanding slices of a single memory
+
+[disc]
+* <<sat_from_forwarding, Satisfy memory load operation by forwarding from unpropogated stores>>: This partially or entirely satisfies a single memory load operation by forwarding, from program-order-previous memory store operations.
+
+* <<sat_from_mem, Satisfy memory load operation from memory>>: This entirely satisfies the outstanding slices of a single memory
 load operation, from memory.
-latexmath:[$\circ$] <<complete_loads, Complete load operations>>: At this point all the memory load operations of the instruction have
+
+[circle]
+* <<complete_loads, Complete load operations>>: At this point all the memory load operations of the instruction have
 been entirely satisfied and the instruction pseudocode can continue
 executing. A load instruction can be subject to being restarted until
 the transition. But, under some conditions, the model might treat a load
@@ -568,44 +574,56 @@ instruction as non-restartable even before it is finished (e.g. see ).
 Transitions specific to store instructions:
-latexmath:[$\circ$] <<initiate_store_footprint, Initiate memory store operation footprints>>: At this point the memory footprint of the store is provisionally
+[circle]
+* <<initiate_store_footprint, Initiate memory store operation footprints>>: At this point the memory footprint of the store is provisionally
 known.
-latexmath:[$\circ$] <<instantiate_store_value, Instantiate memory store operation values>>: At this point the memory store operations have their values and
+
+* <<instantiate_store_value, Instantiate memory store operation values>>: At this point the memory store operations have their values and
 program-order-successor memory load operations can be satisfied by
 forwarding from them.
-latexmath:[$\circ$] <<commit_stores, Commit store instruction>>: At this point the store operations are guaranteed to happen (the
+
+* <<commit_stores, Commit store instruction>>: At this point the store operations are guaranteed to happen (the
 instruction can no longer be restarted or discarded), and they can start
 being propagated to memory.
-latexmath:[$\bullet$] <<prop_store, Propagate store operation>>: This propagates a single memory store operation to memory.
-latexmath:[$\circ$] <<complete_stores, Complete store operations>>: At this point all the memory store operations of the instruction
+
+[disc]
+* <<prop_store, Propagate store operation>>: This propagates a single memory store operation to memory.
+
+[circle]
+* <<complete_stores, Complete store operations>>: At this point all the memory store operations of the instruction
 have been propagated to memory, and the instruction pseudocode can
 continue executing.
 Transitions specific to `sc` instructions:
-latexmath:[$\bullet$] <<early_sc_fail, Early sc fail>>: This causes the `sc` to fail, either a spontaneous fail or because
-it is not paired with a program-order-previous `lr`.
-latexmath:[$\bullet$] <<paired_sc, Paired sc>>: This transition indicates the `sc` is paired with an `lr` and might
+[disc]
+* <<early_sc_fail, Early sc fail>>: This causes the `sc` to fail, either a spontaneous fail or becauset is not paired with a program-order-previous `lr`.
+
+* <<paired_sc, Paired sc>>: This transition indicates the `sc` is paired with an `lr` and might
 succeed.
-latexmath:[$\bullet$] <<commit_sc, Commit and propagate store operation of an sc>>: This is an atomic execution of the transitions <<commit_stores, Commit store instruction>> and <<prop_store, Propagate store operation>>, it is enabled
+
+* <<commit_sc, Commit and propagate store operation of an sc>>: This is an atomic execution of the transitions <<commit_stores, Commit store instruction>> and <<prop_store, Propagate store operation>>, it is enabled
 only if the stores from which the `lr` read from have not been
 overwritten.
-latexmath:[$\bullet$] <<late_sc_fail, Late sc fail>>: This causes the `sc` to fail, either a spontaneous fail or because
+
+* <<late_sc_fail, Late sc fail>>: This causes the `sc` to fail, either a spontaneous fail or because
 the stores from which the `lr` read from have been overwritten.
 Transitions specific to AMO instructions:
-latexmath:[$\bullet$] <<do_amo, Satisfy, commit and propagate operations of an AMO>>: This is an atomic execution of all the transitions needed to satisfy
+[disc]
+* <<do_amo, Satisfy, commit and propagate operations of an AMO>>: This is an atomic execution of all the transitions needed to satisfy
 the load operation, do the required arithmetic, and propagate the store
 operation.
 Transitions specific to fence instructions:
-latexmath:[$\circ$] <<commit_fence, Commit fence>>
+[circle]
+* <<commit_fence, Commit fence>>
 The transitions labeled latexmath:[$\circ$] can always be taken eagerly,
 as soon as their precondition is satisfied, without excluding other
-behavior; the latexmath:[$\bullet$] cannot. Although is marked with a
+behavior; the latexmath:[$\bullet$] cannot. Although <<fetch, Fetch instruction>> is marked with a
 latexmath:[$\bullet$], it can be taken eagerly as long as it is not
 taken infinitely many times.
@@ -1214,7 +1232,7 @@ time if:
 . every memory store operation that has been forwarded to
 latexmath:[$i'$] is propagated;
 . the conditions of <<commit_stores, Commit store instruction>> is satisfied;
-. the conditions of <<prop_stores, Commit store instruction>> is satisfied (notice that an `sc` instruction can
+. the conditions of <<prop_store, Propagate store instruction>> is satisfied (notice that an `sc` instruction can
 only have one memory store operation); and
 . for every store slice latexmath:[$msos$] from latexmath:[$msoss$],
 latexmath:[$msos$] has not been overwritten, in the shared memory, by a
@@ -1224,7 +1242,7 @@ since latexmath:[$msos$] was propagated to memory.
 Action:
 . apply the actions of <<commit_stores, Commit store instruction>>; and
-. apply the action of <<prop_stores, Commit store instruction>>.
+. apply the action of <<prop_store, Propagate store instruction>>.
 [[late_sc_fail]]
 ===== Late `sc` fail
diff --git a/src/resources/themes/riscv-spec.yml b/src/resources/themes/riscv-spec.yml
index 4aa9535..5cb07c9 100644
--- a/src/resources/themes/riscv-spec.yml
+++ b/src/resources/themes/riscv-spec.yml
@@ -164,14 +164,17 @@ admonition:
   padding: [0, $horizontal_rhythm, 0, $horizontal_rhythm]
   icon:
     note:
-      name: pencil-square-o
+      # name: pencil-square-o
+      name: far-edit
       stroke_color: 6489b3
     tip:
-      name: comments-o
+      #name: comments-o
+      name: far-comments
       stroke_color: 646b74
       size: 24
     important:
-      name: info
+      #name: info
+      name: fas-info-circle
       stroke_color: 5f8c8b
     warning:
       stroke_color: 9c4d4b
diff --git a/src/riscv-unprivileged.adoc b/src/riscv-unprivileged.adoc
index dcfdf47..0936b42 100644
--- a/src/riscv-unprivileged.adoc
+++ b/src/riscv-unprivileged.adoc
@@ -51,16 +51,12 @@ endif::[]
 _Contributors to all versions of the spec in alphabetical order (please contact editors to suggest corrections): Arvind, Krste Asanović, Rimas Avižienis, Jacob Bachmeyer, Christopher F. Batten,
-Allen J. Baum, Alex Bradbury, Scott Beamer, Preston Briggs, Christopher Celio, Chuanhua
-Chang, David Chisnall, Paul Clayton, Palmer Dabbelt, Ken Dockser, Roger Espasa, Greg Favor,
-Shaked Flur, Stefan Freudenberger, Marc Gauthier, Andy Glew, Jan Gray, Michael Hamburg, John
+Allen J. Baum, Abel Bernabeu, Alex Bradbury, Scott Beamer, Preston Briggs, Christopher Celio, Chuanhua
+Chang, David Chisnall, Paul Clayton, Palmer Dabbelt, Ken Dockser, Paul Donahue, Aaron Durbin, Roger Espasa, Greg Favor, Shaked Flur, Stefan Freudenberger, Marc Gauthier, Andy Glew, Jan Gray, Michael Hamburg, John
 Hauser, David Horner, Bruce Hoult, Bill Huffman, Alexandre Joannou, Olof Johansson, Ben Keller,
-David Kruckemyer, Yunsup Lee, Paul Loewenstein, Daniel Lustig, Yatin Manerkar, Luc Maranget,
-Margaret Martonosi, Joseph Myers, Vijayanand Nagarajan, Rishiyur Nikhil, Jonas Oberhauser,
-Stefan O'Rear, Albert Ou, John Ousterhout, David Patterson, Christopher Pulte, Jose Renau,
-Josh Scheid, Colin Schmidt, Peter Sewell, Susmit Sarkar, Michael Taylor, Wesley Terpstra, Matt
-Thomas, Tommy Thorn, Caroline Trippel, Ray VanDeWalker, Muralidaran Vijayaraghavan, Megan
-Wachs, Andrew Waterman, Robert Watson, Derek Williams, Andrew Wright, Reinoud Zandijk,
+David Kruckemyer, Tariq Kurd, Yunsup Lee, Paul Loewenstein, Daniel Lustig, Yatin Manerkar, Luc Maranget,
+Margaret Martonosi, Phil McCoy, Christoph Müllner, Joseph Myers, Vijayanand Nagarajan, Rishiyur Nikhil, Jonas Oberhauser, Stefan O'Rear, Albert Ou, John Ousterhout, David Patterson, Christopher Pulte, Jose Renau,
+Josh Scheid, Colin Schmidt, Peter Sewell, Susmit Sarkar, Ved Shanbhogue, Michael Taylor, Wesley Terpstra, Matt Thomas, Tommy Thorn, Philipp Tomsich, Caroline Trippel, Ray VanDeWalker, Muralidaran Vijayaraghavan, Megan Wachs, Andrew Waterman, Robert Watson, David Weaver, Derek Williams, Andrew Wright, Reinoud Zandijk,
 and Sizhuo Zhang._
 _This document is released under a Creative Commons Attribution 4.0 International License._
@@ -127,6 +123,7 @@ include::zfa.adoc[]
 include::ztso-st-ext.adoc[]
 //ztso.tex
 include::cmo.adoc[]
+include::zawrs.adoc[]
 include::rv-32-64g.adoc[]
 //gmaps.tex
 include::extending.adoc[]
diff --git a/src/rv-32-64g.adoc b/src/rv-32-64g.adoc
index 7714436..1818ddf 100644
--- a/src/rv-32-64g.adoc
+++ b/src/rv-32-64g.adoc
@@ -442,6 +442,15 @@ ISA.
 2+|1101010 |00011 |rs1 |rm |rd |1010011 |FCVT.H.LU
 |===
+[%autowidth.stretch,float="center",align="center",cols="^2m,^2m,^2m,^2m,<2m,>3m, <4m, >4m, <4m, >4m, <4m, >4m, <4m, >4m, <6m"]
+|===
+15+^|Zawrs Standard Extension
+
+6+^|000000001101 2+^|00000 2+^|000 2+^|00000 2+^|1110011 <|WRS.NTO
+6+^|000000011101 2+^|00000 2+^|000 2+^|00000 2+^|1110011 <|WRS.STO
+|===
+
+
 <<rvgcsrnames>> lists the CSRs that have currently been allocated CSR
 addresses. The timers, counters, and floating-point CSRs are the only
 CSRs defined in this specification.
diff --git a/src/rv32.adoc b/src/rv32.adoc
index 9ce3fb0..bd38ac8 100644
--- a/src/rv32.adoc
+++ b/src/rv32.adoc
@@ -50,7 +50,7 @@ holds the address of the current instruction.
 [[gprs]]
 .RISC-V base unprivileged integer register state.
-[col[s="<|^|>"|option[s="header",width="50%",align="center"grid="rows"]
+[cols="<,^,>",options="header",width="50%",align="center",grid="rows"]
 |===
 <| [.small]#XLEN-1#| >| [.small]#0#
 3+^| [.small]#x0/zero#
diff --git a/src/zawrs.adoc b/src/zawrs.adoc
new file mode 100644
index 0000000..456c582
--- /dev/null
+++ b/src/zawrs.adoc
@@ -0,0 +1,105 @@
+== "Zawrs" Statndard extension for Wait-on-Reservation-Set instructions, Version 1.01
+
+The Zawrs extension defines a pair of instructions to be used in polling loops
+that allows a core to enter a low-power state and wait on a store to a memory
+location. Waiting for a memory location to be updated is a common pattern in
+many use cases such as:
+
+. Contenders for a lock waiting for the lock variable to be updated.
+
+. Consumers waiting on the tail of an empty queue for the producer to queue
+ work/data. The producer may be code executing on a RISC-V hart, an accelerator
+ device, an external I/O agent.
+
+. Code waiting on a flag to be set in memory indicative of an event occurring.
+ For example, software on a RISC-V hart may wait on a "done" flag to be set in
+ memory by an accelerator device indicating completion of a job previously
+ submitted to the device.
+
+Such use cases involve polling on memory locations, and such busy loops can be a
+wasteful expenditure of energy. To mitigate the wasteful looping in such usages,
+a `WRS.NTO` (WRS-with-no-timeout) instruction is provided. Instead of polling
+for a store to a specific memory location, software registers a reservation set
+that includes all the bytes of the memory location using the `LR` instruction.
+Then a subsequent `WRS.NTO` instruction would cause the hart to temporarily
+stall execution in a low-power state until a store occurs to the reservation set
+or an interrupt is observed.
+
+Sometimes the program waiting on a memory update may also need to carry out a
+task at a future time or otherwise place an upper bound on the wait. To support
+such use cases a second instruction `WRS.STO` (WRS-with-short-timeout) is
+provided that works like `WRS.NTO` but bounds the stall duration to an
+implementation-define short timeout such that the stall is terminated on the
+timeout if no other conditions have occurred to terminate the stall. The
+program using this instruction may then determine if its deadline has been
+reached.
+
+[NOTE]
+====
+The instructions in the Zawrs extension are only useful in conjunction with the
+LR instruction, which is provided by the A extension, and which we also expect
+to be provided by a narrower Zalrsc extension in the future.
+====
+[[Zawrs]]
+=== Wait-on-Reservation-Set Instructions
+
+The `WRS.NTO` and `WRS.STO` instructions cause the hart to temporarily stall
+execution in a low-power state as long as the reservation set is valid and no
+pending interrupts, even if disabled, are observed. For `WRS.STO` the stall
+duration is bounded by an implementation defined short timeout. These
+instructions are available in all privilege modes. These instructions are not
+supported in a constrained `LR`/`SC` loop.
+
+[wavedrom, ,svg]
+....
+{reg: [
+ {bits: 7, name: 'opcode', attr: ['SYSTEM(0x73)'] },
+ {bits: 5, name: 'rd', attr: ['0'] },
+ {bits: 3, name: 'funct3', attr: ['0'] },
+ {bits: 5, name: 'rs1', attr: ['0'] },
+ {bits: 12, name: 'funct12', attr:['WRS.NTO(0x0d)', 'WRS.STO(0x1d)'] },
+], config:{lanes: 1, hspace:1024}}
+....
+
+<<<
+
+Hart execution may be stalled while the following conditions are all satisfied:
+[loweralpha]
+ . The reservation set is valid
+ . If `WRS.STO`, a "short" duration since start of stall has not elapsed
+ . No pending interrupt is observed (see the rules below)
+
+While stalled, an implementation is permitted to occasionally terminate the
+stall and complete execution for any reason.
+
+`WRS.NTO` and `WRS.STO` instructions follow the rules of the `WFI` instruction
+for resuming execution on a pending interrupt.
+
+When the `TW` (Timeout Wait) bit in `mstatus` is set and `WRS.NTO` is executed
+in any privilege mode other than M mode, and it does not complete within an
+implementation-specific bounded time limit, the `WRS.NTO` instruction will cause
+an illegal instruction exception.
+
+When executing in VS or VU mode, if the `VTW` bit is set in `hstatus`, the
+`TW` bit in `mstatus` is clear, and the `WRS.NTO` does not complete within an
+implementation-specific bounded time limit, the `WRS.NTO` instruction will cause
+a virtual instruction exception.
+
+[NOTE]
+====
+Since the `WRS.STO` and `WRS.NTO` instructions can complete execution for
+reasons other than stores to the reservation set, software will likely need
+a means of looping until the required stores have occurred.
+
+The duration of a `WRS.STO` instruction's timeout may vary significantly within
+and among implementations. In typical implementations this duration should be
+roughly in the range of 10 to 100 times an on-chip cache miss latency or a
+cacheless access to main memory.
+
+`WRS.NTO`, unlike `WFI`, is not specified to cause an illegal instruction
+exception if executed in U-mode when the governing `TW` bit is 0. `WFI` is
+typically not expected to be used in U-mode and on many systems may promptly
+cause an illegal instruction exception if used at U-mode. Unlike `WFI`,
+`WRS.NTO` is expected to be used by software in U-mode when waiting on
+memory but without a deadline for that wait.
+====
\ No newline at end of file
diff --git a/src/zfh.adoc b/src/zfh.adoc
index f16514c..9e8710e 100644
--- a/src/zfh.adoc
+++ b/src/zfh.adoc
@@ -91,7 +91,7 @@ floating-point number to a quad-precision floating-point number, or vice-versa,
 respectively.
 include::images/wavedrom/half-prec-flpt-to-flpt-conv.adoc[]
-[half-prec-flpt-to-flpt-conv]
+[[half-prec-flpt-to-flpt-conv]]
 Floating-point to floating-point sign-injection instructions, FSGNJ.H,
 FSGNJN.H, and FSGNJX.H are defined analogously to the single-precision
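
Reviewer's sketch (not part of the commit): the Zawrs encodings added in `src/rv-32-64g.adoc` and `src/zawrs.adoc` can be cross-checked by packing the fields from the wavedrom diagram into a 32-bit word. The snippet below is a minimal illustration under that reading of the diff; the names `encode_wrs`, `FUNCT12` and `OPCODE_SYSTEM` are hypothetical helpers, not identifiers from the manual or from any toolchain.

```python
# Sketch only: assemble the 32-bit WRS.NTO / WRS.STO instruction words from
# the field layout shown in the zawrs.adoc wavedrom block
# (opcode[6:0], rd[11:7], funct3[14:12], rs1[19:15], funct12[31:20]).
# All names here are illustrative assumptions.

OPCODE_SYSTEM = 0x73  # 'SYSTEM(0x73)' in the diagram, 1110011 in the table

# funct12 values as written in the rv-32-64g.adoc opcode table rows.
FUNCT12 = {
    "WRS.NTO": 0b000000001101,  # 0x00d
    "WRS.STO": 0b000000011101,  # 0x01d
}


def encode_wrs(mnemonic: str) -> int:
    """Pack the fields LSB-first; rd, funct3 and rs1 are all zero."""
    rd = funct3 = rs1 = 0
    word = OPCODE_SYSTEM
    word |= rd << 7
    word |= funct3 << 12
    word |= rs1 << 15
    word |= FUNCT12[mnemonic] << 20
    return word


if __name__ == "__main__":
    for name in FUNCT12:
        print(f"{name}: 0x{encode_wrs(name):08x}")
```

Under these assumptions the two words differ only in bit 24, which matches the `0x0d` versus `0x1d` funct12 values in the diagram and the two binary rows in the opcode table.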