\chapter{``Q'' Standard Extension for Quad-Precision Floating-Point,
  Version 2.0}

This chapter describes the Q standard extension for 128-bit binary
floating-point instructions compliant with the IEEE 754-2008
arithmetic standard. The 128-bit or quad-precision binary
floating-point instruction subset is named ``Q'', and requires
RV64IFD.  The floating-point registers are now extended to hold either
a single, double, or quad-precision floating-point value (FLEN=128).
The NaN-boxing scheme described in Section~\ref{nanboxing} is now
extended recursively to allow a single-precision value to be NaN-boxed
inside a double-precision value which is itself NaN-boxed inside a
quad-precision value.

\section{Quad-Precision Load and Store Instructions}

New 128-bit variants of LOAD-FP and STORE-FP instructions are added,
encoded with a new value for the funct3 width field.

\vspace{-0.2in}
\begin{center}
\begin{tabular}{M@{}R@{}F@{}R@{}O}
\\
\instbitrange{31}{20} &
\instbitrange{19}{15} &
\instbitrange{14}{12} &
\instbitrange{11}{7} &
\instbitrange{6}{0} \\
\hline
\multicolumn{1}{|c|}{imm[11:0]} &
\multicolumn{1}{c|}{rs1} &
\multicolumn{1}{c|}{width} &
\multicolumn{1}{c|}{rd} &
\multicolumn{1}{c|}{opcode} \\
\hline
12 & 5 & 3 & 5 & 7 \\
offset[11:0] & base & Q & dest & LOAD-FP \\
\end{tabular}
\end{center}

\vspace{-0.2in}
\begin{center}
\begin{tabular}{O@{}R@{}R@{}F@{}R@{}O}
\\
\instbitrange{31}{25} &
\instbitrange{24}{20} &
\instbitrange{19}{15} &
\instbitrange{14}{12} &
\instbitrange{11}{7} &
\instbitrange{6}{0} \\
\hline
\multicolumn{1}{|c|}{imm[11:5]} &
\multicolumn{1}{c|}{rs2} &
\multicolumn{1}{c|}{rs1} &
\multicolumn{1}{c|}{width} &
\multicolumn{1}{c|}{imm[4:0]} &
\multicolumn{1}{c|}{opcode} \\
\hline
7 & 5 & 5 & 3 & 5 & 7 \\
offset[11:5] & src & base & Q & offset[4:0] & STORE-FP \\
\end{tabular}
\end{center}

FLQ and FSQ are only guaranteed to execute atomically if the effective address
is naturally aligned and XLEN=128.

\section{Quad-Precision Computational Instructions}

A new supported format is added to the format field of most
instructions, as shown in Table~\ref{tab:fpextfmt}.

\begin{table}[htp]
\begin{center}
\begin{tabular}{|c|c|l|}
\hline
{\em fmt} field &
Mnemonic &
Meaning \\
\hline
00 & S & 32-bit single-precision \\
01 & D & 64-bit double-precision \\
10 & - & {\em reserved} \\
11 & Q & 128-bit quad-precision \\
\hline
\end{tabular}
\end{center}
\caption{Format field encoding.}
\label{tab:fpextfmt}
\end{table}

The quad-precision floating-point computational instructions are
defined analogously to their double-precision counterparts, but operate on
quad-precision operands and produce quad-precision results.

\vspace{-0.2in}
\begin{center}
\begin{tabular}{R@{}F@{}R@{}R@{}F@{}R@{}O}
\\
\instbitrange{31}{27} &
\instbitrange{26}{25} &
\instbitrange{24}{20} &
\instbitrange{19}{15} &
\instbitrange{14}{12} &
\instbitrange{11}{7} &
\instbitrange{6}{0} \\
\hline
\multicolumn{1}{|c|}{funct5} &
\multicolumn{1}{c|}{fmt} &
\multicolumn{1}{c|}{rs2} &
\multicolumn{1}{c|}{rs1} &
\multicolumn{1}{c|}{rm} &
\multicolumn{1}{c|}{rd} &
\multicolumn{1}{c|}{opcode} \\
\hline
5 & 2 & 5 & 5 & 3 & 5 & 7 \\
FADD/FSUB & Q & src2 & src1 & RM  & dest & OP-FP  \\
FMUL/FDIV & Q & src2 & src1 & RM  & dest & OP-FP  \\
FMIN-MAX  & Q & src2 & src1 & MIN/MAX & dest & OP-FP  \\
FSQRT     & Q & 0    & src  & RM  & dest & OP-FP  \\
\end{tabular}
\end{center}

\vspace{-0.2in}
\begin{center}
\begin{tabular}{R@{}F@{}R@{}R@{}F@{}R@{}O}
\\
\instbitrange{31}{27} &
\instbitrange{26}{25} &
\instbitrange{24}{20} &
\instbitrange{19}{15} &
\instbitrange{14}{12} &
\instbitrange{11}{7} &
\instbitrange{6}{0} \\
\hline
\multicolumn{1}{|c|}{rs3} &
\multicolumn{1}{c|}{fmt} &
\multicolumn{1}{c|}{rs2} &
\multicolumn{1}{c|}{rs1} &
\multicolumn{1}{c|}{rm} &
\multicolumn{1}{c|}{rd} &
\multicolumn{1}{c|}{opcode} \\
\hline
5 & 2 & 5 & 5 & 3 & 5 & 7 \\
src3 & Q & src2 & src1 & RM  & dest & F[N]MADD/F[N]MSUB  \\
\end{tabular}
\end{center}

\section{Quad-Precision Convert and Move Instructions}

\vspace{-0.2in}
\begin{center}
\begin{tabular}{R@{}F@{}R@{}R@{}F@{}R@{}O}
\\
\instbitrange{31}{27} &
\instbitrange{26}{25} &
\instbitrange{24}{20} &
\instbitrange{19}{15} &
\instbitrange{14}{12} &
\instbitrange{11}{7} &
\instbitrange{6}{0} \\
\hline
\multicolumn{1}{|c|}{funct5} &
\multicolumn{1}{c|}{fmt} &
\multicolumn{1}{c|}{rs2} &
\multicolumn{1}{c|}{rs1} &
\multicolumn{1}{c|}{rm} &
\multicolumn{1}{c|}{rd} &
\multicolumn{1}{c|}{opcode} \\
\hline
5 & 2 & 5 & 5 & 3 & 5 & 7 \\
FCVT.{\em int}.{\em fmt} & Q & W[U]/L[U] & src & RM  & dest & OP-FP  \\
FCVT.{\em fmt}.{\em int} & Q & W[U]/L[U] & src & RM  & dest & OP-FP  \\
\end{tabular}
\end{center}

New floating-point to floating-point conversion instructions FCVT.S.Q,
FCVT.Q.S, FCVT.D.Q, FCVT.Q.D are added.

\vspace{-0.2in}
\begin{center}
\begin{tabular}{R@{}F@{}R@{}R@{}F@{}R@{}O}
\\
\instbitrange{31}{27} &
\instbitrange{26}{25} &
\instbitrange{24}{20} &
\instbitrange{19}{15} &
\instbitrange{14}{12} &
\instbitrange{11}{7} &
\instbitrange{6}{0} \\
\hline
\multicolumn{1}{|c|}{funct5} &
\multicolumn{1}{c|}{fmt} &
\multicolumn{1}{c|}{rs2} &
\multicolumn{1}{c|}{rs1} &
\multicolumn{1}{c|}{rm} &
\multicolumn{1}{c|}{rd} &
\multicolumn{1}{c|}{opcode} \\
\hline
5 & 2 & 5 & 5 & 3 & 5 & 7 \\
FCVT.{\em fmt}.{\em fmt} & S & Q & src & RM  & dest & OP-FP  \\
FCVT.{\em fmt}.{\em fmt} & Q & S & src & RM  & dest & OP-FP  \\
FCVT.{\em fmt}.{\em fmt} & D & Q & src & RM  & dest & OP-FP  \\
FCVT.{\em fmt}.{\em fmt} & Q & D & src & RM  & dest & OP-FP  \\
\end{tabular}
\end{center}

Floating-point to floating-point sign-injection instructions, FSGNJ.Q,
FSGNJN.Q, and FSGNJX.Q are defined analogously to the double-precision
sign-injection instruction.

\vspace{-0.2in}
\begin{center}
\begin{tabular}{R@{}F@{}R@{}R@{}F@{}R@{}O}
\\
\instbitrange{31}{27} &
\instbitrange{26}{25} &
\instbitrange{24}{20} &
\instbitrange{19}{15} &
\instbitrange{14}{12} &
\instbitrange{11}{7} &
\instbitrange{6}{0} \\
\hline
\multicolumn{1}{|c|}{funct5} &
\multicolumn{1}{c|}{fmt} &
\multicolumn{1}{c|}{rs2} &
\multicolumn{1}{c|}{rs1} &
\multicolumn{1}{c|}{rm} &
\multicolumn{1}{c|}{rd} &
\multicolumn{1}{c|}{opcode} \\
\hline
5 & 2 & 5 & 5 & 3 & 5 & 7 \\
FSGNJ & Q & src2 & src1 & J[N]/JX & dest & OP-FP  \\
\end{tabular}
\end{center}

FMV.X.Q and FMV.Q.X instructions are not provided, so quad-precision bit
patterns must be moved to the integer registers via memory.

\begin{commentary}
RV128 supports FMV.X.Q and FMV.Q.X in the Q extension.
\end{commentary}

\section{Quad-Precision Floating-Point Compare Instructions}

Floating-point compare instructions perform the specified comparison (equal,
less than, or less than or equal) between floating-point registers {\em rs1}
and {\em rs2} and record the Boolean result in integer register {\em rd}.

\vspace{-0.2in}
\begin{center}
\begin{tabular}{S@{}F@{}R@{}R@{}F@{}R@{}O}
\\
\instbitrange{31}{27} &
\instbitrange{26}{25} &
\instbitrange{24}{20} &
\instbitrange{19}{15} &
\instbitrange{14}{12} &
\instbitrange{11}{7} &
\instbitrange{6}{0} \\
\hline
\multicolumn{1}{|c|}{funct5} &
\multicolumn{1}{c|}{fmt} &
\multicolumn{1}{c|}{rs2} &
\multicolumn{1}{c|}{rs1} &
\multicolumn{1}{c|}{rm} &
\multicolumn{1}{c|}{rd} &
\multicolumn{1}{c|}{opcode} \\
\hline
5 & 2 & 5 & 5 & 3 & 5 & 7 \\
FCMP & Q & src2 & src1 & EQ/LT/LE & dest & OP-FP  \\
\end{tabular}
\end{center}

\section{Quad-Precision Floating-Point Classify Instruction}

The quad-precision floating-point classify instruction, FCLASS.Q, is
defined analogously to its double-precision counterpart, but operates on
quad-precision operands.

\vspace{-0.2in}
\begin{center}
\begin{tabular}{S@{}F@{}R@{}R@{}F@{}R@{}O}
\\
\instbitrange{31}{27} &
\instbitrange{26}{25} &
\instbitrange{24}{20} &
\instbitrange{19}{15} &
\instbitrange{14}{12} &
\instbitrange{11}{7} &
\instbitrange{6}{0} \\
\hline
\multicolumn{1}{|c|}{funct5} &
\multicolumn{1}{c|}{fmt} &
\multicolumn{1}{c|}{rs2} &
\multicolumn{1}{c|}{rs1} &
\multicolumn{1}{c|}{rm} &
\multicolumn{1}{c|}{rd} &
\multicolumn{1}{c|}{opcode} \\
\hline
5 & 2 & 5 & 5 & 3 & 5 & 7 \\
FCLASS & Q & 0 & src & 001 & dest & OP-FP  \\
\end{tabular}
\end{center}