\chapter{``M'' Standard Extension for Integer Multiplication and Division, Version 2.0} This chapter describes the standard integer multiplication and division instruction extension, which is named ``M'' and contains instructions that multiply or divide values held in two integer registers. \begin{commentary} We separate integer multiply and divide out from the base to simplify low-end implementations, or for applications where integer multiply and divide operations are either infrequent or better handled in attached accelerators. \end{commentary} \section{Multiplication Operations} \vspace{-0.2in} \begin{center} \begin{tabular}{S@{}R@{}R@{}S@{}R@{}O} \\ \instbitrange{31}{25} & \instbitrange{24}{20} & \instbitrange{19}{15} & \instbitrange{14}{12} & \instbitrange{11}{7} & \instbitrange{6}{0} \\ \hline \multicolumn{1}{|c|}{funct7} & \multicolumn{1}{c|}{rs2} & \multicolumn{1}{c|}{rs1} & \multicolumn{1}{c|}{funct3} & \multicolumn{1}{c|}{rd} & \multicolumn{1}{c|}{opcode} \\ \hline 7 & 5 & 5 & 3 & 5 & 7 \\ MULDIV & multiplier & multiplicand & MUL/MULH[[S]U] & dest & OP \\ MULDIV & multiplier & multiplicand & MULW & dest & OP-32 \\ \end{tabular} \end{center} MUL performs an XLEN-bit$\times$XLEN-bit multiplication and places the lower XLEN bits in the destination register. MULH, MULHU, and MULHSU perform the same multiplication but return the upper XLEN bits of the full 2$\times$XLEN-bit product, for signed$\times$signed, unsigned$\times$unsigned, and signed$\times$unsigned multiplication respectively. If both the high and low bits of the same product are required, then the recommended code sequence is: MULH[[S]U] {\em rdh, rs1, rs2}; MUL {\em rdl, rs1, rs2} (source register specifiers must be in same order and {\em rdh} cannot be the same as {\em rs1} or {\em rs2}). Microarchitectures can then fuse these into a single multiply operation instead of performing two separate multiplies. MULW is only valid for RV64, and multiplies the lower 32 bits of the source registers, placing the sign-extension of the lower 32 bits of the result into the destination register. MUL can be used to obtain the upper 32 bits of the 64-bit product, but signed arguments must be proper 32-bit signed values, whereas unsigned arguments must have their upper 32 bits clear. \section{Division Operations} \vspace{-0.2in} \begin{center} \begin{tabular}{S@{}R@{}R@{}O@{}R@{}O} \\ \instbitrange{31}{25} & \instbitrange{24}{20} & \instbitrange{19}{15} & \instbitrange{14}{12} & \instbitrange{11}{7} & \instbitrange{6}{0} \\ \hline \multicolumn{1}{|c|}{funct7} & \multicolumn{1}{c|}{rs2} & \multicolumn{1}{c|}{rs1} & \multicolumn{1}{c|}{funct3} & \multicolumn{1}{c|}{rd} & \multicolumn{1}{c|}{opcode} \\ \hline 7 & 5 & 5 & 3 & 5 & 7 \\ MULDIV & divisor & dividend & DIV[U]/REM[U] & dest & OP \\ MULDIV & divisor & dividend & DIV[U]W/REM[U]W & dest & OP-32 \\ \end{tabular} \end{center} DIV and DIVU perform signed and unsigned integer division of XLEN bits by XLEN bits. REM and REMU provide the remainder of the corresponding division operation. If both the quotient and remainder are required from the same division, the recommended code sequence is: DIV[U] {\em rdq, rs1, rs2}; REM[U] {\em rdr, rs1, rs2} ({\em rdq} cannot be the same as {\em rs1} or {\em rs2}). Microarchitectures can then fuse these into a single divide operation instead of performing two separate divides. The semantics for division by zero and division overflow are summarized in Table~\ref{tab:divby0}. The quotient of division by zero has all bits set, i.e. $2^{XLEN}-1$ for unsigned division or $-1$ for signed division. The remainder of division by zero equals the dividend. Signed division overflow occurs only when the most-negative integer, $-2^{XLEN-1}$, is divided by $-1$. The quotient of signed division overflow is equal to the dividend, and the remainder is zero. Unsigned division overflow cannot occur. \vspace{0.1in} \begin{table}[h] \center \begin{tabular}{|l|c|c||c|c|c|c|} \hline Condition & Dividend & Divisor & DIVU & REMU & DIV & REM \\ \hline Division by zero & $x$ & 0 & $2^{XLEN}-1$ & $x$ & $-1$ & $x$ \\ Overflow (signed only) & $-2^{XLEN-1}$ & $-1$ & -- & -- & $-2^{XLEN-1}$ & 0 \\ \hline \end{tabular} \caption{Semantics for division by zero and division overflow.} \label{tab:divby0} \end{table} \begin{commentary} We considered raising exceptions on integer divide by zero, with these exceptions causing a trap in most execution environments. However, this would be the only arithmetic trap in the standard ISA (floating-point exceptions set flags and write default values, but do not cause traps) and would require language implementors to interact with the execution environment's trap handlers for this case. Further, where language standards mandate that a divide-by-zero exception must cause an immediate control flow change, only a single branch instruction needs to be added to each divide operation, and this branch instruction can be inserted after the divide and should normally be very predictably not taken, adding little runtime overhead. The value of all bits set is returned for both unsigned and signed divide by zero to simplify the divider circuitry. The value of all 1s is both the natural value to return for unsigned divide, representing the largest unsigned number, and also the natural result for simple unsigned divider implementations. Signed division is often implemented using an unsigned division circuit and specifying the same overflow result simplifies the hardware. \end{commentary} DIVW and DIVUW instructions are only valid for RV64, and divide the lower 32 bits of {\em rs1} by the lower 32 bits of {\em rs2}, treating them as signed and unsigned integers respectively, placing the 32-bit quotient in {\em rd}, sign-extended to 64 bits. REMW and REMUW instructions are only valid for RV64, and provide the corresponding signed and unsigned remainder operations respectively. Both REMW and REMUW always sign-extend the 32-bit result to 64 bits, including on a divide by zero.