From 6ef41027236597115860994797186b2947fe7dbd Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Andr=C3=A9=20Sintzoff?= Date: Fri, 31 May 2024 09:53:34 +0200 Subject: machine.adoc: fix table title - move title before the table - replace redundant FS by VS --- src/machine.adoc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'src') diff --git a/src/machine.adoc b/src/machine.adoc index d13fd4f..79f6b32 100644 --- a/src/machine.adoc +++ b/src/machine.adoc @@ -967,6 +967,8 @@ unconfigure or disable/enable instructions. <<< +[[fsxsstates]] +.FS, VS, and XS state transitions. [width=75,align=center,float=center,cols="<,<,<,<,<"] |=== |Current State + @@ -1070,9 +1072,7 @@ Off Off |=== -[[fsxsstates]] [width=75,align=center,float=center,cols="<,<,<,<,<"] -.FS, FS, and XS state transitions. |=== 5+^|Execute instruction to enable unit -- cgit v1.1 From c2886d5bf50adc178c6eade4d1a5147d8a60d981 Mon Sep 17 00:00:00 2001 From: Andrew Waterman Date: Fri, 31 May 2024 16:08:29 -0700 Subject: Integrate vector EGS spec --- src/v-st-ext.adoc | 142 +++++++++++++++++++++++++++++++++++++++++++++++++ src/vector-crypto.adoc | 4 +- 2 files changed, 143 insertions(+), 3 deletions(-) (limited to 'src') diff --git a/src/v-st-ext.adoc b/src/v-st-ext.adoc index 5d9d364..e4f8e22 100644 --- a/src/v-st-ext.adoc +++ b/src/v-st-ext.adoc @@ -1258,6 +1258,7 @@ NOTE: The `vsetivli` instruction provides more compact code when the dimensions of vectors are small and known to fit inside the vector registers, in which case there is no stripmining overhead. +[[constraints-on-setting-vl]] ==== Constraints on Setting `vl` The `vset{i}vl{i}` instructions first set VLMAX according to their `vtype` @@ -5181,6 +5182,147 @@ We considered requiring more complete scalar half-precision support, but we reasoned that, for many half-precision vector workloads, performing the scalar computation in single-precision will suffice. 
+[[vector-element-groups]] +=== Vector Element Groups + +Some vector instructions treat operands as a vector of one or more +_element_ _groups_, where each element group is a fixed number of +elements. For example, complex numbers can be viewed as a two-element +group (one real element and one imaginary element). +As another example, the SHA-256 cryptographic instructions in the Zvknha +extension operate on 128-bit values represented as a 4-element group of 32-bit +elements. + +This section describes recommendations and terminology for generic +instruction set design for vector instructions that operate on element +groups. + +==== Element Group Size + +The _element_ _group_ _size_ (EGS) is the number of elements in one +group, and must be a power-of-two (POT). + +NOTE: Support for non-POT EGS was considered but causes many practical +complications and so has been dropped. Error checking for `vl` is a +little more difficult. For LMUL>1, non-POT EGSs will result in groups +straddling the individual vector registers in a vector register +group. Non-POT EGS can also cause large increases in the +lowest-common-multiple of element group sizes, which adds constraints +to `vl` setting in order to avoid splitting an element group across +stripmine iterations in vector-length-agnostic code. + +The element group size is statically encoded in the instruction, often +implicitly as part of the opcode. + +Executing a vector instruction with EGS > VLMAX causes an illegal +instruction exception to be raised. + +NOTE: The vector instructions in the base V vector ISA can be viewed +as all having an element group size of 1 for all operands statically +encoded in the instruction. + +NOTE: Many operations only make sense with a certain number of +elements per group (e.g., complex operations require a element group +size of 2 and SHA-256 requires an element group size of 4). 
+ +==== Setting `vl` + +Each source and destination operand to a vector instruction might be +defined as either a single element group or a vector of element +groups. When an operand is a vector of element groups, the `vl` +setting must correspond to an integer multiple of the element group +size, with other values of `vl` reserved. + +NOTE: For example, a SHA-256 instruction would require that `vl` is a +multiple of 4. + +When element group instructions are present, an additional constraint +is placed on the setting of `vl` based on an AVL value +(augmenting <<constraints-on-setting-vl>>). +EGSMAX is the largest EGS supported by the +implementation. When AVL > VLMAX, the value of `vl` must be set to +either VLMAX or a positive integer multiple of EGSMAX. + +NOTE: As the base vector extension only has element group size of 1, +this constraint is backwards-compatible. + +NOTE: This constraint prevents element groups being broken across +stripmining iterations in vector-length-agnostic code when a +VLMAX-size vector would otherwise be able to accomodate a whole number +of element groups. + +NOTE: If EEW is encoded statically in the instruction, or if an +instruction has multiple operands containing vectors of element groups +with different EEW, an appropriate SEW must be chosen for `vsetvl` +instructions. + +NOTE: Additional constraints may be required for some element group +instructions to ensure legal length values for all operands. + +==== Determining EEW + +The `vtype` SEW can be used to indicate or calculate the effective +element size (EEW) of one or more operands of an element group +instruction. Where the operand is an element group, SEW and EEW refer +to the number of bits in each individual element within a group not +the number of bits in the group as a whole. + +Alternatively, the opcode might encode EEW of all operands statically +and ignore the value of SEW when the operation only makes sense for a +single size on each operand. 
+ +NOTE: Many operations are only defined for one EEW, e.g., SHA-256 +requires EEW=32. Encoding EEWs statically in the instruction removes +a dynamic dependency on the SEW value and the need to check for errors +in SEW values. However, ignoring SEW also prevents reuse of the +static opcode with a different dynamic SEW, and in many cases, the SEW +setting will be needed for regular vector instructions used to process +the individual elements in the vector. + +==== Determining EMUL + +The `vtype` LMUL setting can be used to indicate or calculate the +effective length multiplier (EMUL) for one or more operands. Element +group instructions tend to exhibit a much wider range of relationships +between various operand EEW/EMUL values. For example, an instruction +might take a vector of length N of 4-element groups with EEW=8b and +reduce each group to produce a vector length N of 1-element groups +with EEW=32b. In this case, the input and output EMUL values are equal +even though the EEW settings differ by a factor of 4. + +Each source and destination operand to a vector instruction may have a +different element group size, different EMUL, and/or different EEW. + +==== Element Group Width + +The _element_ _group_ _width_ (EGW) is the number of bits in the +element group as a whole. +For example, the SHA-256 instructions in the Zvknha extension operate on an +EGW of 128, with EGS=4 and EEW=32. +It is possible to use LMUL to concatenate multiple vector registers together +to support larger EGW>VLEN. + +NOTE: If software using large-EGW instructions need be portable +across a range of implementations, some of which may have VLEN<EGW and +hence require LMUL>1, then software can only use a subset of the +architectural registers. Profiles can set minimum VLEN requirements +to inform authors of such software. + +NOTE: Element group operations by their nature will gather data from +across a wider portion of a vector datapath than regular vector +instructions. 
Some element group instructions might allow temporal +execution of individual element operations in a larger group, while +others will require all EGW bits of a group to be presented to a +functional unit at the same time. + +==== Masking + +No ratified extensions include masked element-group instructions. +Future extensions might extend the element-group scheme to support +element-level masking, or might define the concept of a _mask element group_ +(which might, e.g., update the destination element group if any mask bit in +the mask element group is set). + === Vector Instruction Listing include::images/wavedrom/v-inst-table.adoc[] diff --git a/src/vector-crypto.adoc b/src/vector-crypto.adoc index 82e5f21..a87a589 100644 --- a/src/vector-crypto.adoc +++ b/src/vector-crypto.adoc @@ -172,9 +172,7 @@ operands that are combined (for example, each SHA-2 operand is comprised of 4 wo these operands are a single value (for example, in the AES round instructions, each operand is 128-bit block or round key). -We treat these operands as a vector of one or more _element groups_ as defined in the -link:https://github.com/riscv/riscv-v-spec/blob/master/element_groups.adoc[RISC-V Vector Element Groups] -specification. +We treat these operands as a vector of one or more _element groups_ as defined in <<vector-element-groups>>. Each vector crypto instruction that operates on element groups explicitly specifies their three defining parameters: EGW, EGS, and EEW. 
-- cgit v1.1 From 339a7cb4c69f1d004edda9f98b815c1005a4aab6 Mon Sep 17 00:00:00 2001 From: Yang Liu Date: Sat, 1 Jun 2024 09:29:25 +0800 Subject: Make vector CSR titles more consistent and remove some trailing spaces (#1439) --- src/v-st-ext.adoc | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) (limited to 'src') diff --git a/src/v-st-ext.adoc b/src/v-st-ext.adoc index e4f8e22..b8cd859 100644 --- a/src/v-st-ext.adoc +++ b/src/v-st-ext.adoc @@ -1,5 +1,5 @@ [[vector]] -== "V" Standard Extension for Vector Operations, Version 1.0 +== "V" Standard Extension for Vector Operations, Version 1.0 [NOTE] ==== @@ -180,7 +180,7 @@ is anticipated that a future extended 64-bit instruction encoding would allow these fields to be specified statically in the instruction encoding. -===== Vector selected element width `vsew[2:0]` +===== Vector Selected Element Width (`vsew[2:0]`) The value in `vsew` sets the dynamic _selected_ _element_ _width_ (SEW). By default, a vector register is viewed as being divided into @@ -452,7 +452,7 @@ when it cares about the non-participating elements, but given the historical meaning of the instruction prior to introduction of these flags, it was decided to always require them in future assembly code. -===== Vector Type Illegal `vill` +===== Vector Type Illegal (`vill`) The `vill` bit is used to encode that a previous `vset{i}vl{i}` instruction attempted to write an unsupported value to `vtype`. @@ -602,7 +602,7 @@ roundoff_signed(v, d) = (signed(v) >> d) + r ---- are used to represent this operation in the instruction descriptions below. -==== Vector Fixed-Point Saturation Flag `vxsat` +==== Vector Fixed-Point Saturation Flag (`vxsat`) The `vxsat` CSR has a single read-write least-significant bit (`vxsat[0]`) that indicates if a fixed-point instruction has had to @@ -843,7 +843,7 @@ that it can be aligned with the other datawidths in the same column that also have an LMUL setting, such that all have the same VLMAX. 
|=== -| 7+^| SEW/LMUL +| 7+^| SEW/LMUL | | 1 | 2 | 4 | 8 | 16 | 32 | 64 | SEW= 8 | 8 | 4 | 2 | 1 | 1/2 | 1/4 | 1/8 @@ -1734,7 +1734,7 @@ can be used to probe for valid effective addresses. The unit-stride versions only allow probing a region immediately contiguous to a known region, and so reduce the security impact when used in unprivileged code. However, code running in S-mode can establish arbitrary page -translations that allow probing of random guest physical addresses +translations that allow probing of random guest physical addresses provided by a hypervisor. Strided and scatter/gather fault-only-first instructions are not provided due to lack of encoding space, but they can also represent a larger security hole, allowing even unprivileged @@ -5064,7 +5064,7 @@ All Zve* extensions support all vector mask instructions (Section <>). All Zve* extensions support all vector permutation instructions -(Section <>), except that Zve32x and Zve64x +(Section <>), except that Zve32x and Zve64x do not include those with floating-point operands, and Zve64f does not include those with EEW=64 floating-point operands. -- cgit v1.1