aboutsummaryrefslogtreecommitdiff
path: root/src/p.tex
diff options
context:
space:
mode:
authorAndrew Waterman <andrew@sifive.com>2017-02-01 20:41:47 -0800
committerAndrew Waterman <andrew@sifive.com>2017-02-01 20:41:47 -0800
commitab6f8c9bd7bc85361fcf35667d1fddfaf367a53f (patch)
tree716a2118ca0565dbb4e7903723f283ae4dd13c46 /src/p.tex
parent207a7c6ee51aa2fd74d4618cd1369ddc21706b9e (diff)
downloadriscv-isa-manual-ab6f8c9bd7bc85361fcf35667d1fddfaf367a53f.zip
riscv-isa-manual-ab6f8c9bd7bc85361fcf35667d1fddfaf367a53f.tar.gz
riscv-isa-manual-ab6f8c9bd7bc85361fcf35667d1fddfaf367a53f.tar.bz2
Reorganize directory structure
Diffstat (limited to 'src/p.tex')
-rw-r--r--src/p.tex92
1 files changed, 92 insertions, 0 deletions
diff --git a/src/p.tex b/src/p.tex
new file mode 100644
index 0000000..f66acf7
--- /dev/null
+++ b/src/p.tex
@@ -0,0 +1,92 @@
+\chapter{``P'' Standard Extension for Packed-SIMD Instructions,
+ Version 0.1}
+\label{sec:packedsimd}
+
+In this chapter, we outline a standard packed-SIMD extension for
+RISC-V. We've reserved the instruction subset name ``P'' for a future
+standard set of packed-SIMD extensions. Many other extensions can
+build upon a packed-SIMD extension, taking advantage of the wide data
+registers and datapaths separate from the integer unit.
+
+\begin{commentary}
+Packed-SIMD extensions, first introduced with the Lincoln Labs TX-2~\cite{tx2},
+have become a popular way to provide higher throughput on data-parallel
+codes. Earlier commercial microprocessor implementations include the
+Intel i860, HP PA-RISC MAX~\cite{lee-max-ieeemicro1996}, SPARC
+VIS~\cite{tremblay-vis-ieeemicro1996}, MIPS
+MDMX~\cite{gwennap-mdmx-mpr1996}, PowerPC
+AltiVec~\cite{diefendorff-altivec-ieeemicro2000}, Intel x86
+MMX/SSE~\cite{peleg-mmx-ieeemicro1996, raman-sse-ieeemicro2000}, while
+recent designs include Intel x86 AVX~\cite{lomont-avx-irm2011} and ARM
+Neon~\cite{goodacre-armisa-computer2005}. We describe a standard
+framework for adding packed-SIMD in this chapter, but are not actively
+working on such a design. In our opinion, packed-SIMD designs represent
+a reasonable design point when reusing existing wide datapath resources,
+but if significant additional resources are to be devoted to
+data-parallel execution then designs based on traditional vector
+architectures are a better choice and should use the V extension.
+
+\end{commentary}
+
+A RISC-V packed-SIMD extension reuses the floating-point registers
+({\tt f0}-{\tt f31}). These registers can be defined to have widths
+of FLEN=32 to FLEN=1024. The standard floating-point instruction
+subsets require registers of width 32 bits (``F''), 64 bits (``D''),
+or 128 bits (``Q'').
+
+\begin{commentary}
+It is natural to use the floating-point registers for packed-SIMD
+values rather than the integer registers (PA-RISC and Alpha
+packed-SIMD extensions) as this frees the integer registers for
+control and address values, simplifies reuse of scalar floating-point
+units for SIMD floating-point execution, and leads naturally to a
+decoupled integer/floating-point hardware design. The floating-point
+load and store instruction encodings also have space to handle wider
+packed-SIMD registers. However, reusing the floating-point registers
+for packed-SIMD values does make it more difficult to use a recoded
+internal format for floating-point values.
+\end{commentary}
+
+The existing floating-point load and store instructions are used to
+load and store various-sized words from memory to the {\tt f}
+registers. The base ISA supports 32-bit and 64-bit loads and stores,
+but the LOAD-FP and STORE-FP instruction encodings allows 8 different
+widths to be encoded as shown in Table~\ref{psimdwidth}. When used
+with packed-SIMD operations, it is desirable to support non-naturally
+aligned loads and stores in hardware.
+
+\begin{table}[htp]
+\begin{center}
+\begin{tabular}{|c|l|r|}
+\hline
+{\em width} field &
+Code &
+Size in bits\\
+\hline
+000 & B & 8 \\
+001 & H & 16 \\
+010 & W & 32 \\
+011 & D & 64 \\
+100 & Q & 128 \\
+101 & Q2 & 256 \\
+110 & Q4 & 512 \\
+111 & Q8 & 1024 \\
+\hline
+\end{tabular}
+\end{center}
+\caption{LOAD-FP and STORE-FP width encoding.}
+\label{psimdwidth}
+\end{table}
+
+Packed-SIMD computational instructions operate on packed values in
+{\tt f} registers. Each value can be 8-bit, 16-bit, 32-bit, 64-bit,
+or 128-bit, and both integer and floating-point representations can be
+supported. For example, a 64-bit packed-SIMD extension can treat each
+register as 1$\times$64-bit, 2$\times$32-bit, 4$\times$16-bit, or
+8$\times$8-bit packed values.
+
+\begin{commentary}
+Simple packed-SIMD extensions might fit in unused 32-bit instruction
+opcodes, but more extensive packed-SIMD extensions will likely require
+a dedicated 30-bit instruction space.
+\end{commentary}