From 8dfbd56cfb2f2504178f6c70b4801e75cf676b57 Mon Sep 17 00:00:00 2001
From: Krste Asanovic <krste@eecs.berkeley.edu>
Date: Tue, 5 Mar 2019 08:28:24 -0800
Subject: Version 20190305-Base-Ratification for ratification vote.

---
 src/p.tex | 93 +++------------------------------------------------------------
 1 file changed, 3 insertions(+), 90 deletions(-)

(limited to 'src/p.tex')

diff --git a/src/p.tex b/src/p.tex
index 074e475..dac4e4f 100644
--- a/src/p.tex
+++ b/src/p.tex
@@ -1,5 +1,5 @@
 \chapter{``P'' Standard Extension for Packed-SIMD Instructions,
-  Version 0.1}
+  Version 0.2}
 \label{sec:packedsimd}
 
 \begin{commentary}
@@ -8,94 +8,7 @@
   standardizing on the V extension for large floating-point SIMD
   operations.  However, there was interest in packed-SIMD fixed-point
   operations for use in the integer registers of small RISC-V
-  implementations.
+  implementations. A task group is working to define the new P
+  extension.
 \end{commentary}
 
-In this chapter, we outline a standard packed-SIMD extension for
-RISC-V.  We've reserved the instruction-set extension name ``P'' for a future
-standard set of packed-SIMD extensions.  Many other extensions can
-build upon a packed-SIMD extension, taking advantage of the wide data
-registers and datapaths separate from the integer unit.
-
-\begin{commentary}
-Packed-SIMD extensions, first introduced with the Lincoln Labs TX-2~\cite{tx2},
-have become a popular way to provide higher throughput on data-parallel
-codes. Earlier commercial microprocessor implementations include the
-Intel i860, HP PA-RISC MAX~\cite{lee-max-ieeemicro1996}, SPARC
-VIS~\cite{tremblay-vis-ieeemicro1996}, MIPS
-MDMX~\cite{gwennap-mdmx-mpr1996}, PowerPC
-AltiVec~\cite{diefendorff-altivec-ieeemicro2000}, Intel x86
-MMX/SSE~\cite{peleg-mmx-ieeemicro1996, raman-sse-ieeemicro2000}, while
-recent designs include Intel x86 AVX~\cite{lomont-avx-irm2011} and ARM
-Neon~\cite{goodacre-armisa-computer2005}.  We describe a standard
-framework for adding packed-SIMD in this chapter, but are not actively
-working on such a design.  In our opinion, packed-SIMD designs represent
-a reasonable design point when reusing existing wide datapath resources,
-but if significant additional resources are to be devoted to
-data-parallel execution then designs based on traditional vector
-architectures are a better choice and should use the V extension.
-
-\end{commentary}
-
-A RISC-V packed-SIMD extension reuses the floating-point registers
-({\tt f0}-{\tt f31}).  These registers can be defined to have widths
-of FLEN=32 to FLEN=1024.  The standard floating-point instruction-set
-extensions require registers of width 32 bits (``F''), 64 bits (``D''),
-or 128 bits (``Q'').
-
-\begin{commentary}
-It is natural to use the floating-point registers for packed-SIMD
-values rather than the integer registers (PA-RISC and Alpha
-packed-SIMD extensions) as this frees the integer registers for
-control and address values, simplifies reuse of scalar floating-point
-units for SIMD floating-point execution, and leads naturally to a
-decoupled integer/floating-point hardware design.  The floating-point
-load and store instruction encodings also have space to handle wider
-packed-SIMD registers.  However, reusing the floating-point registers
-for packed-SIMD values does make it more difficult to use a recoded
-internal format for floating-point values.
-\end{commentary}
-
-The existing floating-point load and store instructions are used to
-load and store various-sized words from memory to the {\tt f}
-registers.  The base ISA supports 32-bit and 64-bit loads and stores,
-but the LOAD-FP and STORE-FP instruction encodings allows 8 different
-widths to be encoded as shown in Table~\ref{psimdwidth}.  When used
-with packed-SIMD operations, it is desirable to support non-naturally
-aligned loads and stores in hardware.
-
-\begin{table}[htp]
-\begin{center}
-\begin{tabular}{|c|l|r|}
-\hline
-{\em width} field &
-Code &
-Size in bits\\
-\hline
-000 & B  &  8   \\
-001 & H  & 16   \\
-010 & W  & 32   \\
-011 & D  & 64   \\
-100 & Q  & 128  \\
-101 & Q2 & 256  \\
-110 & Q4 & 512  \\
-111 & Q8 & 1024 \\
-\hline
-\end{tabular}
-\end{center}
-\caption{LOAD-FP and STORE-FP width encoding.}
-\label{psimdwidth}
-\end{table}
-
-Packed-SIMD computational instructions operate on packed values in
-{\tt f} registers.  Each value can be 8-bit, 16-bit, 32-bit, 64-bit,
-or 128-bit, and both integer and floating-point representations can be
-supported.  For example, a 64-bit packed-SIMD extension can treat each
-register as 1$\times$64-bit, 2$\times$32-bit, 4$\times$16-bit, or
-8$\times$8-bit packed values.
-
-\begin{commentary}
-Simple packed-SIMD extensions might fit in unused 32-bit instruction
-opcodes, but more extensive packed-SIMD extensions will likely require
-a dedicated 30-bit instruction space.
-\end{commentary}
-- 
cgit v1.1