-rw-r--r-- | src/assembly.tex |  3
-rw-r--r-- | src/preface.tex  |  2
-rw-r--r-- | src/rv32.tex     | 57
3 files changed, 53 insertions, 9 deletions
diff --git a/src/assembly.tex b/src/assembly.tex
index de2d2e5..d58f33d 100644
--- a/src/assembly.tex
+++ b/src/assembly.tex
@@ -18,7 +18,8 @@ registers and their role in the standard calling convention.
 \tt x2 & \tt sp & Stack pointer & Callee \\
 \tt x3 & \tt gp & Global pointer & --- \\
 \tt x4 & \tt tp & Thread pointer & --- \\
-{\tt x5}--{\tt 7} & {\tt t0}--{\tt 2} & Temporaries & Caller \\
+\tt x5 & {\tt t0} & Temporary/alternate link register& Caller \\
+{\tt x6}--{\tt 7} & {\tt t1}--{\tt 2} & Temporaries & Caller \\
 \tt x8 & {\tt s0}/\tt fp & Saved register/frame pointer & Callee \\
 \tt x9 & {\tt s1} & Saved register & Callee \\
 {\tt x10}--{\tt 11} & {\tt a0}--{\tt 1} & Function arguments/return values & Caller \\
diff --git a/src/preface.tex b/src/preface.tex
index 3b0cf2c..9c5a13d 100644
--- a/src/preface.tex
+++ b/src/preface.tex
@@ -41,6 +41,8 @@ The major changes in this version of the document include:
 \parskip 0pt
 \itemsep 1pt
 \item Improvements to the description and commentary.
+\item Modified implicit hinting suggestion on JALR to support more efficient
+  macro-op fusion of LUI;JALR pairs.
 \item Clarification of constraints on load-reserved/store-conditional sequences.
 \item Clarified purpose and behavior of high-order bits of {\tt fcsr}.
 \item Corrected the description of the FNMADD.{\em fmt} and FNMSUB.{\em fmt}
diff --git a/src/rv32.tex b/src/rv32.tex
index c1571ae..c226e5c 100644
--- a/src/rv32.tex
+++ b/src/rv32.tex
@@ -642,6 +642,15 @@ following the jump ({\tt pc}+4) into register {\em rd}. The standard
 software calling convention uses {\tt x1} as the return address register and
 {\tt x5} as an alternate link register.
 
+\begin{commentary}
+The alternate link register supports calling millicode routines (e.g.,
+those to save and restore registers in compressed code) while
+preserving the regular return address register. The register {\tt x5}
+was chosen as the alternate link register as it maps to a temporary in
+the standard calling convention, and has an encoding that is only one
+bit different than the regular link register.
+\end{commentary}
+
 Plain unconditional jumps (assembler pseudo-op J) are encoded as a JAL with
 {\em rd}={\tt x0}.
 
@@ -719,14 +728,6 @@ information. Although there is potentially a slight loss of error checking
 in this case, in practice jumps to an incorrect instruction address will
 usually quickly raise an exception.
 
-Return-address prediction stacks are a common feature of high-performance
-instruction-fetch units. We note that {\em rd} and {\em rs1} can be used to
-guide an implementation's instruction-fetch prediction logic, indicating
-whether JALR instructions should push ({\em rd}$=${\tt x1}/{\tt x5}), pop
-({\em rs1}$=${\tt x1}/{\tt x5}), or not touch (otherwise)
-a return-address stack. Similarly, a JAL instruction should push the return
-address onto the return-address stack only when {\em rd}$=${\tt x1}/{\tt x5}.
-
 When used with a base {\em rs1}$=${\tt x0}, JALR can be used to implement
 a single instruction subroutine call to the lowest \wunits{2}{KiB} or highest
 \wunits{2}{KiB} address region from anywhere in the address space, which could
@@ -743,6 +744,46 @@ that support extensions with 16-bit aligned instructions, such as the
 compressed instruction set extension, C.
 \end{commentary}
 
+Return-address prediction stacks are a common feature of
+high-performance instruction-fetch units, but require accurate
+detection of instructions used for procedure calls and returns to be
+effective. For RISC-V, hints as to the instruction's usage are encoded
+implicitly via the register numbers used. A JAL instruction should
+push the return address onto a return-address stack (RAS) only when
+{\em rd}$=${\tt x1}/{\tt x5}. JALR instructions should push/pop a
+RAS as shown in Table~\ref{rashints}.
+\begin{table}[hbt]
+\centering
+\begin{tabular}{|c|c|c|l|}
+  \hline
+  \em rd & \em rs1 & {\em rs1}$=${\em rd} & RAS action \\
+  \hline
+  !{\em link} & !{\em link} & - & none \\
+  !{\em link} & {\em link} & - & pop \\
+  {\em link} & !{\em link} & - & push \\
+  {\em link} & {\em link} & 0 & push and pop \\
+  {\em link} & {\em link} & 1 & push \\
+  \hline
+\end{tabular}
+\caption{Return-address stack prediction hints encoded in register
+  specifiers used in the instruction. In the above, {\em link} is
+  true when the register is either {\tt x1} or {\tt x5}.}
+\label{rashints}
+\end{table}
+
+\begin{commentary}
+Some other ISAs added explicit hint bits to the encoding. We use
+implicit hinting tied to register numbers and calling convention to
+reduce the encoding space used for these hints.
+
+When two different link registers ({\tt x1} and {\tt x5}) are given as
+{\em rs1} and {\em rd}, then the RAS is both pushed and popped to
+support coroutines. If {\em rs1} and {\em rd} are the same link
+register (either {\tt x1} or {\tt x5}), the RAS is only pushed to
+enable macro-op fusion of the sequence:\linebreak
+{\tt lui ra, imm20; jalr ra, ra, imm11}
+\end{commentary}
+
 \subsubsection*{Conditional Branches}
 
 All branch instructions use the SB-type instruction format. The
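
Note on the RAS hint table added in the src/rv32.tex hunk: the five rows reduce to a small decision function on the register numbers. The following Python sketch is illustrative only and is not part of the commit; the names `RasAction`, `is_link`, `jalr_ras_action`, and `jal_ras_action` are hypothetical, and the relative order of the push and the pop in the "push and pop" row is not modeled because the table does not state it.

```python
from enum import Enum

# Link registers per the table caption: x1 (ra) and x5 (t0).
LINK_REGS = {1, 5}


class RasAction(Enum):
    """Hypothetical labels mirroring the 'RAS action' column of the table."""
    NONE = "none"
    POP = "pop"
    PUSH = "push"
    PUSH_AND_POP = "push and pop"


def is_link(reg: int) -> bool:
    """True when the register number is x1 or x5."""
    return reg in LINK_REGS


def jalr_ras_action(rd: int, rs1: int) -> RasAction:
    """Map a JALR's rd/rs1 register numbers to the hinted RAS action."""
    rd_link, rs1_link = is_link(rd), is_link(rs1)
    if not rd_link and not rs1_link:
        return RasAction.NONE
    if not rd_link:
        return RasAction.POP        # rs1 is a link register, rd is not
    if not rs1_link:
        return RasAction.PUSH       # rd is a link register, rs1 is not
    # Both are link registers: same register means push only (LUI;JALR
    # fusion case); different registers means push and pop (coroutines).
    return RasAction.PUSH if rd == rs1 else RasAction.PUSH_AND_POP


def jal_ras_action(rd: int) -> RasAction:
    """A JAL pushes only when rd is a link register."""
    return RasAction.PUSH if is_link(rd) else RasAction.NONE
```

For example, `jalr_ras_action(0, 1)` (a return through `x1` with `rd`=`x0`) gives `RasAction.POP`, while `jalr_ras_action(1, 1)` (the second half of the `lui ra; jalr ra, ra` pair) gives `RasAction.PUSH`.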