author | Andrew Waterman <andrew@sifive.com> | 2017-02-01 20:41:47 -0800 |
---|---|---|
committer | Andrew Waterman <andrew@sifive.com> | 2017-02-01 20:41:47 -0800 |
commit | ab6f8c9bd7bc85361fcf35667d1fddfaf367a53f (patch) | |
tree | 716a2118ca0565dbb4e7903723f283ae4dd13c46 /src/extensions.tex | |
parent | 207a7c6ee51aa2fd74d4618cd1369ddc21706b9e (diff) | |
Reorganize directory structure
Diffstat (limited to 'src/extensions.tex')
-rw-r--r-- | src/extensions.tex | 381 |
1 files changed, 381 insertions, 0 deletions
diff --git a/src/extensions.tex b/src/extensions.tex
new file mode 100644
index 0000000..4346422
--- /dev/null
+++ b/src/extensions.tex
@@ -0,0 +1,381 @@
+\chapter{Extending RISC-V}
+\label{extensions}
+
+In addition to supporting standard general-purpose software
+development, another goal of RISC-V is to provide a basis for more
+specialized instruction-set extensions or more customized
+accelerators. The instruction encoding spaces and optional
+variable-length instruction encoding are designed to make it easier to
+leverage software development effort for the standard ISA toolchain
+when building more customized processors. For example, the intent is
+to continue to provide full software support for implementations that
+only use the standard I base, perhaps together with many non-standard
+instruction-set extensions.
+
+This chapter describes various ways in which the base RISC-V ISA can
+be extended, together with the scheme for managing instruction-set
+extensions developed by independent groups. This volume only deals
+with the user-level ISA, although the same approach and terminology is
+used for supervisor-level extensions described in the second volume.
+
+\section{Extension Terminology}
+
+This section defines some standard terminology for describing RISC-V
+extensions.
+\vspace{-0.2in}
+\subsection*{Standard versus Non-Standard Extension}
+
+Any RISC-V processor implementation must support a base integer ISA
+(RV32I or RV64I). In addition, an implementation may support one or
+more extensions. We divide extensions into two broad categories: {\em
+ standard} versus {\em non-standard}.
+\begin{itemize}
+\item A standard extension is one that is generally useful and that is
+ designed to not conflict with any other standard extension.
+ Currently, ``MAFDQLCBTPV'', described in other chapters of this
+ manual, are either complete or planned standard extensions.
+\item A non-standard extension may be highly specialized and may
+ conflict with other standard or non-standard extensions. We
+ anticipate a wide variety of non-standard extensions will be
+ developed over time, with some eventually being promoted to standard
+ extensions.
+\end{itemize}
+
+\vspace{-0.2in}
+\subsection*{Instruction Encoding Spaces and Prefixes}
+
+An instruction encoding space is some number of instruction bits
+within which a base ISA or ISA extension is encoded. RISC-V supports
+varying instruction lengths, but even within a single instruction
+length, there are various sizes of encoding space available. For
+example, the base ISA is defined within a 30-bit encoding space (bits
+31--2 of the 32-bit instruction), while the atomic extension ``A''
+fits within a 25-bit encoding space (bits 31--7).
+
+We use the term {\em prefix} to refer to the bits to the {\em right}
+of an instruction encoding space (since RISC-V is little-endian, the
+bits to the right are stored at earlier memory addresses, hence form a
+prefix in instruction-fetch order). The prefix for the standard base
+ISA encoding is the two-bit ``11'' field held in bits 1--0 of the
+32-bit word, while the prefix for the standard atomic extension ``A''
+is the seven-bit ``0101111'' field held in bits 6--0 of the 32-bit
+word representing the AMO major opcode. A quirk of the encoding
+format is that the 3-bit funct3 field used to encode a minor opcode is
+not contiguous with the major opcode bits in the 32-bit instruction
+format, but is considered part of the prefix for 22-bit instruction
+spaces.
+
+Although an instruction encoding space could be of any size, adopting
+a smaller set of common sizes simplifies packing independently
+developed extensions into a single global encoding.
+Table~\ref{encodingspaces} gives the suggested sizes for RISC-V.
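The prefix and encoding-space split described above can be illustrated with a small sketch. This is not part of the manual: the helper names (`prefix`, `payload`, `prefix22`) and the sample word are hypothetical, and the funct3 position (bits 14--12) is the standard 32-bit format placement rather than something stated in this section.

```python
# Minimal sketch: splitting a 32-bit instruction word into
# "prefix" (low bits) and "encoding space" (remaining high bits).
# All helper names here are illustrative, not from the manual.

BASE_PREFIX = 0b11        # two-bit prefix of the standard base ISA (bits 1--0)
AMO_PREFIX = 0b0101111    # seven-bit prefix of the standard A extension (bits 6--0)


def prefix(insn: int, space_bits: int) -> int:
    """Return the contiguous low-order prefix for a 30- or 25-bit space."""
    width = 32 - space_bits
    return insn & ((1 << width) - 1)


def payload(insn: int, space_bits: int) -> int:
    """Return the bits that lie inside the encoding space itself."""
    width = 32 - space_bits
    return (insn & 0xFFFFFFFF) >> width


def prefix22(insn: int) -> int:
    """Prefix of a 22-bit space: funct3 (assumed bits 14--12) joined to bits 6--0.

    The two fields are not contiguous in the instruction word, so they are
    gathered here into one 10-bit value purely for comparison purposes.
    """
    major = insn & 0x7F
    funct3 = (insn >> 12) & 0x7
    return (funct3 << 7) | major


# A made-up word: arbitrary 25-bit payload placed under the AMO prefix.
word = (0x123456 << 7) | AMO_PREFIX

print(bin(prefix(word, 30)))   # two-bit prefix of the enclosing 30-bit space
print(bin(prefix(word, 25)))   # seven-bit AMO prefix
print(hex(payload(word, 25)))  # the 25-bit payload
```

The 25-bit space sits inside the 30-bit space, which is why the same word reports both the ``11'' prefix and the ``0101111'' prefix.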
+
+\begin{table}[H]
+\begin{center}
+\begin{tabular}{|c|l|r|r|r|r|}
+\hline
+\multicolumn{1}{|c|}{Size} & \multicolumn{1}{|c|}{Usage} &
+\multicolumn{4}{|c|}{\# Available in standard instruction length} \\ \cline{3-6}
+ & &
+\multicolumn{1}{|c|}{16-bit} &
+\multicolumn{1}{|c|}{32-bit} &
+\multicolumn{1}{|c|}{48-bit} &
+\multicolumn{1}{|c|}{64-bit} \\ \hline \hline
+14-bit & Quadrant of compressed 16-bit encoding & 3 & & & \\ \hline \hline
+22-bit & Minor opcode in base 32-bit encoding & & $2^{8}$ & $2^{20}$ & $2^{35}$ \\ \hline
+25-bit & Major opcode in base 32-bit encoding & & 32 & $2^{17}$ & $2^{32}$ \\ \hline
+30-bit & Quadrant of base 32-bit encoding & & 1 & $2^{12}$ & $2^{27}$ \\ \hline \hline
+32-bit & Minor opcode in 48-bit encoding & & & $2^{10}$ & $2^{25}$ \\ \hline
+37-bit & Major opcode in 48-bit encoding & & & 32 & $2^{20}$ \\ \hline
+40-bit & Quadrant of 48-bit encoding & & & 4 & $2^{17}$ \\ \hline \hline
+45-bit & Sub-minor opcode in 64-bit encoding & & & & $2^{12}$ \\ \hline
+48-bit & Minor opcode in 64-bit encoding & & & & $2^{9}$ \\ \hline
+52-bit & Major opcode in 64-bit encoding & & & & 32\\ \hline
+\end{tabular}
+\end{center}
+\caption{Suggested standard RISC-V instruction encoding space sizes.}
+\label{encodingspaces}
+\end{table}
+
+\vspace{-0.2in}
+\subsection*{Greenfield versus Brownfield Extensions}
+
+We use the term {\em greenfield extension} to describe an extension
+that begins populating a new instruction encoding space, and hence can
+only cause encoding conflicts at the prefix level. We use the term
+{\em brownfield extension} to describe an extension that fits around
+existing encodings in a previously defined instruction space. A
+brownfield extension is necessarily tied to a particular greenfield
+parent encoding, and there may be multiple brownfield extensions to
+the same greenfield parent encoding. For example, the base ISAs are
+greenfield encodings of a 30-bit instruction space, while the FDQ
+floating-point extensions are all brownfield extensions adding to the
+parent base ISA 30-bit encoding space.
+
+Note that we consider the standard A extension to have a greenfield
+encoding as it defines a new previously empty 25-bit encoding space in
+the leftmost bits of the full 32-bit base instruction encoding, even
+though its standard prefix locates it within the 30-bit encoding space
+of the base ISA. Changing only its single 7-bit prefix could move the
+A extension to a different 30-bit encoding space while only worrying
+about conflicts at the prefix level, not within the encoding space
+itself.
+
+\begin{table}[H]
+{
+\begin{center}
+\begin{tabular}{|r|c|c|}
+\hline
+ & Adds state & No new state \\ \hline
+Greenfield & RV32I(30), RV64I(30) & A(25) \\\hline
+Brownfield & F(I), D(F), Q(D) & M(I) \\
+\hline
+\end{tabular}
+\end{center}
+}
+\caption{Two-dimensional characterization of standard instruction-set
+ extensions.}
+\label{exttax}
+\end{table}
+
+Table~\ref{exttax} shows the bases and standard extensions placed in a
+simple two-dimensional taxonomy. One axis is whether the extension is
+greenfield or brownfield, while the other axis is whether the
+extension adds architectural state. For greenfield extensions, the
+size of the instruction encoding space is given in parentheses. For
+brownfield extensions, the name of the extension (greenfield or
+brownfield) it builds upon is given in parentheses. Additional
+user-level architectural state usually implies changes to the
+supervisor-level system or possibly to the standard calling
+convention.
+
+Note that RV64I is not considered an extension of RV32I, but a
+different complete base encoding.
+
+\vspace{-0.2in}
+\subsection*{Standard-Compatible Global Encodings}
+
+A complete or {\em global} encoding of an ISA for an actual RISC-V
+implementation must allocate a unique non-conflicting prefix for every
+included instruction encoding space. The bases and every standard
+extension have each had a standard prefix allocated to ensure they can
+all coexist in a global encoding.
+
+A {\em standard-compatible} global encoding is one where the base and
+every included standard extension have their standard prefixes. A
+standard-compatible global encoding can include non-standard
+extensions that do not conflict with the included standard extensions.
+A standard-compatible global encoding can also use standard prefixes
+for non-standard extensions if the associated standard extensions are
+not included in the global encoding. In other words, a standard
+extension must use its standard prefix if included in a
+standard-compatible global encoding, but otherwise its prefix is free
+to be reallocated. These constraints allow a common toolchain to
+target the standard subset of any RISC-V standard-compatible global
+encoding.
+
+\vspace{-0.2in}
+\subsection*{Guaranteed Non-Standard Encoding Space}
+
+To support development of proprietary custom extensions, portions of
+the encoding space are guaranteed to never be used by standard
+extensions.
+
+\section{RISC-V Extension Design Philosophy}
+
+We intend to support a large number of independently developed
+extensions by encouraging extension developers to operate within
+instruction encoding spaces, and by providing tools to pack these into
+a standard-compatible global encoding by allocating unique prefixes.
+Some extensions are more naturally implemented as brownfield
+augmentations of existing extensions, and will share whatever prefix
+is allocated to their parent greenfield extension. The standard
+extension prefixes avoid spurious incompatibilities in the encoding of
+core functionality, while allowing custom packing of more esoteric
+extensions.
+
+This capability of repacking RISC-V extensions into different
+standard-compatible global encodings can be used in a number of ways.
+
+One use-case is developing highly specialized custom accelerators,
+designed to run kernels from important application domains. These
+might want to drop all but the base integer ISA and add in only the
+extensions that are required for the task in hand. The base ISA has
+been designed to place minimal requirements on a hardware
+implementation, and has been encoded to use only a small fraction of a
+32-bit instruction encoding space.
+
+Another use-case is to build a research prototype for a new type of
+instruction-set extension. The researchers might not want to expend
+the effort to implement a variable-length instruction-fetch unit, and
+so would like to prototype their extension using a simple 32-bit
+fixed-width instruction encoding. However, this new extension might
+be too large to coexist with standard extensions in the 32-bit space.
+If the research experiments do not need all of the standard
+extensions, a standard-compatible global encoding might drop the
+unused standard extensions and reuse their prefixes to place the
+proposed extension in a non-standard location to simplify engineering
+of the research prototype. Standard tools will still be able to
+target the base and any standard extensions that are present to reduce
+development time. Once the instruction-set extension has been
+evaluated and refined, it could then be made available for packing
+into a larger variable-length encoding space to avoid conflicts with
+all standard extensions.
+
+The following sections describe increasingly sophisticated strategies
+for developing implementations with new instruction-set extensions.
+These are mostly intended for use in highly customized, educational,
+or experimental architectures rather than for the main line of RISC-V
+ISA development.
+
+\section{Extensions within fixed-width 32-bit instruction format}
+\label{fix32b}
+
+In this section, we discuss adding extensions to implementations that
+only support the base fixed-width 32-bit instruction format.
+
+\begin{commentary}
+We anticipate the simplest fixed-width 32-bit encoding will be popular for
+many restricted accelerators and research prototypes.
+\end{commentary}
+
+\subsection*{Available 30-bit instruction encoding spaces}
+
+In the standard encoding, three of the available 30-bit instruction
+encoding spaces (those with 2-bit prefixes 00, 01, and 10) are used to
+enable the optional compressed instruction extension. However, if the
+compressed instruction-set extension is not required, then these three
+further 30-bit encoding spaces become available. This quadruples the
+available encoding space within the 32-bit format.
+
+\subsection*{Available 25-bit instruction encoding spaces}
+
+A 25-bit instruction encoding space corresponds to a major opcode in
+the base and standard extension encodings.
+
+There are four major opcodes expressly reserved for custom extensions
+(Table~\ref{opcodemap}), each of which represents a 25-bit encoding
+space. Two of these are reserved for eventual use in the RV128 base
+encoding (will be OP-IMM-64 and OP-64), but can be used for standard
+or non-standard extensions for RV32 and RV64.
+
+The two opcodes reserved for RV64 (OP-IMM-32 and OP-32) can also be
+used for standard and non-standard extensions to RV32 only.
+
+If an implementation does not require floating-point, then the seven
+major opcodes reserved for standard floating-point extensions
+(LOAD-FP, STORE-FP, MADD, MSUB, NMSUB, NMADD, OP-FP) can be reused for
+non-standard extensions. Similarly, the AMO major opcode can be
+reused if the standard atomic extensions are not required.
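As a rough illustration of the rules in the preceding paragraphs, the sketch below lists which 25-bit spaces (major opcodes) an implementation could hand to non-standard extensions. The function name `reusable_major_opcodes` and its parameters are hypothetical. The 5-bit values are bits 6--2 of the instruction as given in the manual's opcode map, which is referenced but not reproduced in this section; only the AMO value can be checked against the ``0101111'' prefix quoted earlier.

```python
# Minimal sketch, not from the manual: which major opcodes may be reused
# for non-standard extensions in a fixed-width 32-bit implementation.
# Values are instruction bits 6--2 (bits 1--0 are the "11" prefix).

CUSTOM = {"custom-0": 0b00010, "custom-1": 0b01010,
          "custom-2/rv128": 0b10110, "custom-3/rv128": 0b11110}
RV64_ONLY = {"OP-IMM-32": 0b00110, "OP-32": 0b01110}
FLOATING_POINT = {"LOAD-FP": 0b00001, "STORE-FP": 0b01001, "MADD": 0b10000,
                  "MSUB": 0b10001, "NMSUB": 0b10010, "NMADD": 0b10011,
                  "OP-FP": 0b10100}
ATOMIC = {"AMO": 0b01011}


def reusable_major_opcodes(xlen: int, has_fp: bool, has_atomics: bool) -> dict:
    """Return {name: bits 6--2} of major opcodes free for non-standard use."""
    free = dict(CUSTOM)              # expressly reserved for custom extensions
    if xlen == 32:
        free.update(RV64_ONLY)       # only RV32 may take the RV64 opcodes
    if not has_fp:
        free.update(FLOATING_POINT)  # seven opcodes, if no F/D/Q
    if not has_atomics:
        free.update(ATOMIC)          # AMO, if no A extension
    return free


# Example: an RV32 accelerator with neither floating-point nor atomics.
for name, bits in sorted(reusable_major_opcodes(32, False, False).items()):
    print(f"{name:16s} {bits:05b}11")
```

For the RV32 configuration without floating-point or atomics this yields 14 reusable major opcodes; an RV64 implementation that keeps both F and A is left with the four custom opcodes.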
+
+If an implementation does not require instructions longer than
+32-bits, then an additional four major opcodes are available (those
+marked in gray in Table~\ref{opcodemap}).
+
+The base RV32I encoding uses only 11 major opcodes plus 3 reserved
+opcodes, leaving up to 18 available for extensions. The base RV64I
+encoding uses only 13 major opcodes plus 3 reserved opcodes, leaving
+up to 16 available for extensions.
+
+\subsection*{Available 22-bit instruction encoding spaces}
+
+A 22-bit encoding space corresponds to a funct3 minor opcode space in
+the base and standard extension encodings. Several major opcodes have
+a funct3 field minor opcode that is not completely occupied, leaving
+available several 22-bit encoding spaces.
+
+Usually a major opcode selects the format used to encode operands in
+the remaining bits of the instruction, and ideally, an extension
+should follow the operand format of the major opcode to simplify
+hardware decoding.
+
+\subsection*{Other spaces}
+
+Smaller spaces are available under certain major opcodes, and not all
+minor opcodes are entirely filled.
+
+\section{Adding aligned 64-bit instruction extensions}
+
+The simplest approach to provide space for extensions that are too
+large for the base 32-bit fixed-width instruction format is to add
+naturally aligned 64-bit instructions. The implementation must still
+support the 32-bit base instruction format, but can require that
+64-bit instructions are aligned on 64-bit boundaries to simplify
+instruction fetch, with a 32-bit NOP instruction used as alignment
+padding where necessary.
+
+To simplify use of standard tools, the 64-bit instructions should be
+encoded as described in Figure~\ref{instlengthcode}. However, an
+implementation might choose a non-standard instruction-length encoding
+for 64-bit instructions, while retaining the standard encoding for
+32-bit instructions. For example, if compressed instructions are not
+required, then a 64-bit instruction could be encoded using one or more
+zero bits in the first two bits of an instruction.
+
+\begin{commentary}
+We anticipate processor generators that produce instruction-fetch
+units capable of automatically handling any combination of supported
+variable-length instruction encodings.
+\end{commentary}
+
+\section{Supporting VLIW encodings}
+
+Although RISC-V was not designed as a base for a pure VLIW machine,
+VLIW encodings can be added as extensions using several alternative
+approaches. In all cases, the base 32-bit encoding has to be supported
+to allow use of any standard software tools.
+
+\subsection*{Fixed-size instruction group}
+
+The simplest approach is to define a single large naturally aligned
+instruction format (e.g., 128 bits) within which VLIW operations are
+encoded. In a conventional VLIW, this approach would tend to waste
+instruction memory to hold NOPs, but a RISC-V-compatible
+implementation would have to also support the base 32-bit
+instructions, confining the VLIW code size expansion to
+VLIW-accelerated functions.
+
+\subsection*{Encoded-Length Groups}
+
+Another approach is to use the standard length encoding from
+Figure~\ref{instlengthcode} to encode parallel instruction groups,
+allowing NOPs to be compressed out of the VLIW instruction. For
+example, a 64-bit instruction could hold two 28-bit operations, while
+a 96-bit instruction could hold three 28-bit operations, and so on.
+Alternatively, a 48-bit instruction could hold one 42-bit operation,
+while a 96-bit instruction could hold two 42-bit operations, and so
+on.
+
+This approach has the advantage of retaining the base ISA encoding for
+instructions holding a single operation, but has the disadvantage of
+requiring a new 28-bit or 42-bit encoding for operations within the
+VLIW instructions, and misaligned instruction fetch for larger groups.
+One simplification is to not allow VLIW instructions to straddle
+certain microarchitecturally significant boundaries (e.g., cache lines
+or virtual memory pages).
+
+\subsection*{Fixed-Size Instruction Bundles}
+
+Another approach, similar to Itanium, is to use a larger naturally
+aligned fixed instruction bundle size (e.g., 128 bits) across which
+parallel operation groups are encoded. This simplifies instruction
+fetch, but shifts the complexity to the group execution engine. To
+remain RISC-V compatible, the base 32-bit instruction would still have
+to be supported.
+
+\subsection*{End-of-Group bits in Prefix}
+
+None of the above approaches retains the RISC-V encoding for the
+individual operations within a VLIW instruction. Yet another approach
+is to repurpose the two prefix bits in the fixed-width 32-bit
+encoding. One prefix bit can be used to signal ``end-of-group'' if
+set, while the second bit could indicate execution under a predicate
+if clear. Standard RISC-V 32-bit instructions generated by tools
+unaware of the VLIW extension would have both prefix bits set (11) and
+thus have the correct semantics, with each instruction at the end of a
+group and not predicated.
+
+The main disadvantage of this approach is that the base ISA lacks the
+complex predication support usually required in an aggressive VLIW
+system, and it is difficult to add space to specify more predicate
+registers in the standard 30-bit encoding space.
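The end-of-group scheme can be sketched as a small grouping routine. The text does not say which of the two prefix bits carries which meaning, so the assignment below (bit 0 for end-of-group, bit 1 for predication) is an assumption made only for illustration, as are the function names.

```python
# Minimal sketch of the "end-of-group bits in prefix" idea.
# ASSUMPTION: bit 0 set   -> instruction ends its group
#             bit 1 clear -> instruction executes under a predicate
# The manual does not fix this bit assignment.

END_OF_GROUP_BIT = 0x1
UNPREDICATED_BIT = 0x2


def is_end_of_group(insn: int) -> bool:
    return bool(insn & END_OF_GROUP_BIT)


def is_predicated(insn: int) -> bool:
    return not (insn & UNPREDICATED_BIT)


def split_groups(insns):
    """Partition a stream of 32-bit words into parallel issue groups."""
    groups, current = [], []
    for insn in insns:
        current.append(insn)
        if is_end_of_group(insn):
            groups.append(current)
            current = []
    if current:                 # stream ended without an end-of-group marker
        groups.append(current)
    return groups


ADDI_X1 = 0x00100093            # addi x1, x0, 1 (standard word, prefix 11)
NOP = 0x00000013                # addi x0, x0, 0 (standard word, prefix 11)
GROUPED_ADDI = ADDI_X1 & ~END_OF_GROUP_BIT  # same operation, group continues

stream = [GROUPED_ADDI, NOP, ADDI_X1]
for group in split_groups(stream):
    print([hex(w) for w in group])
```

Words emitted by VLIW-unaware tools keep the prefix 11, so under this assumed assignment each forms a single-instruction group and is never predicated, which matches the compatibility argument in the text.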