-rw-r--r-- src/assembly.tex 3
-rw-r--r-- src/preface.tex 2
-rw-r--r-- src/rv32.tex 57
3 files changed, 53 insertions, 9 deletions
diff --git a/src/assembly.tex b/src/assembly.tex
index de2d2e5..d58f33d 100644
--- a/src/assembly.tex
+++ b/src/assembly.tex
@@ -18,7 +18,8 @@ registers and their role in the standard calling convention.
\tt x2 & \tt sp & Stack pointer & Callee \\
\tt x3 & \tt gp & Global pointer & --- \\
\tt x4 & \tt tp & Thread pointer & --- \\
- {\tt x5}--{\tt 7} & {\tt t0}--{\tt 2} & Temporaries & Caller \\
+ \tt x5 & {\tt t0} & Temporary/alternate link register& Caller \\
+ {\tt x6}--{\tt 7} & {\tt t1}--{\tt 2} & Temporaries & Caller \\
\tt x8 & {\tt s0}/\tt fp & Saved register/frame pointer & Callee \\
\tt x9 & {\tt s1} & Saved register & Callee \\
{\tt x10}--{\tt 11} & {\tt a0}--{\tt 1} & Function arguments/return values & Caller \\
diff --git a/src/preface.tex b/src/preface.tex
index 3b0cf2c..9c5a13d 100644
--- a/src/preface.tex
+++ b/src/preface.tex
@@ -41,6 +41,8 @@ The major changes in this version of the document include:
\parskip 0pt
\itemsep 1pt
\item Improvements to the description and commentary.
+\item Modified implicit hinting suggestion on JALR to support more efficient
+ macro-op fusion of LUI;JALR pairs.
\item Clarification of constraints on load-reserved/store-conditional sequences.
\item Clarified purpose and behavior of high-order bits of {\tt fcsr}.
\item Corrected the description of the FNMADD.{\em fmt} and FNMSUB.{\em fmt}
diff --git a/src/rv32.tex b/src/rv32.tex
index c1571ae..c226e5c 100644
--- a/src/rv32.tex
+++ b/src/rv32.tex
@@ -642,6 +642,15 @@ following the jump ({\tt pc}+4) into register {\em rd}. The standard
software calling convention uses {\tt x1} as the return address
register and {\tt x5} as an alternate link register.
+\begin{commentary}
+The alternate link register supports calling millicode routines (e.g.,
+those to save and restore registers in compressed code) while
+preserving the regular return address register. The register {\tt x5}
+was chosen as the alternate link register as it maps to a temporary in
+the standard calling convention, and has an encoding that is only one
+bit different than the regular link register.
+\end{commentary}
+
Plain unconditional jumps (assembler pseudo-op J) are encoded as a JAL
with {\em rd}={\tt x0}.
@@ -719,14 +728,6 @@ information. Although there is potentially a slight loss of error
checking in this case, in practice jumps to an incorrect instruction
address will usually quickly raise an exception.
-Return-address prediction stacks are a common feature of high-performance
-instruction-fetch units. We note that {\em rd} and {\em rs1} can be used to
-guide an implementation's instruction-fetch prediction logic, indicating
-whether JALR instructions should push ({\em rd}$=${\tt x1}/{\tt x5}), pop
-({\em rs1}$=${\tt x1}/{\tt x5}), or not touch (otherwise)
-a return-address stack. Similarly, a JAL instruction should push the return
-address onto the return-address stack only when {\em rd}$=${\tt x1}/{\tt x5}.
-
When used with a base {\em rs1}$=${\tt x0}, JALR can be used to implement
a single instruction subroutine call to the lowest \wunits{2}{KiB} or highest
\wunits{2}{KiB} address region from anywhere in the address space, which could
@@ -743,6 +744,46 @@ that support extensions with 16-bit aligned instructions, such as the
compressed instruction set extension, C.
\end{commentary}
+Return-address prediction stacks are a common feature of
+high-performance instruction-fetch units, but require accurate
+detection of instructions used for procedure calls and returns to be
+effective. For RISC-V, hints as to the instructions' usage are encoded
+implicitly via the register numbers used. A JAL instruction should
+push the return address onto a return-address stack (RAS) only when
+{\em rd}$=${\tt x1}/{\tt x5}. JALR instructions should push/pop a
+RAS as shown in Table~\ref{rashints}.
+\begin{table}[hbt]
+\centering
+\begin{tabular}{|c|c|c|l|}
+ \hline
+ \em rd & \em rs1 & {\em rs1}$=${\em rd} & RAS action \\
+ \hline
+ !{\em link} & !{\em link} & - & none \\
+ !{\em link} & {\em link} & - & pop \\
+ {\em link} & !{\em link} & - & push \\
+ {\em link} & {\em link} & 0 & push and pop \\
+ {\em link} & {\em link} & 1 & push \\
+ \hline
+\end{tabular}
+\caption{Return-address stack prediction hints encoded in register
+ specifiers used in the instruction. In the above, {\em link} is
+ true when the register is either {\tt x1} or {\tt x5}.}
+\label{rashints}
+\end{table}
+
+\begin{commentary}
+Some other ISAs added explicit hint bits to the encoding. We use
+implicit hinting tied to register numbers and calling convention to
+reduce the encoding space used for these hints.
+
+When two different link registers ({\tt x1} and {\tt x5}) are given as
+{\em rs1} and {\em rd}, then the RAS is both pushed and popped to
+support coroutines. If {\em rs1} and {\em rd} are the same link
+register (either {\tt x1} or {\tt x5}), the RAS is only pushed to
+enable macro-op fusion of the sequence:\linebreak
+{\tt lui ra, imm20; jalr ra, ra, imm11}
+\end{commentary}
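
The hint table added in this hunk can be read as a small decision function. The following is a minimal sketch in Python, not part of the commit and not a description of any real fetch unit. The names `is_link`, `jal_ras_action`, and `jalr_ras_action` are hypothetical, registers are passed as plain integers 0 to 31, and the returned strings simply mirror the "RAS action" column of the table.

```python
# Illustrative model of the RAS hint table; all names here are assumptions.
LINK_REGS = {1, 5}  # x1 (ra) and x5 (t0, the alternate link register)


def is_link(reg: int) -> bool:
    """True when the register number is x1 or x5."""
    return reg in LINK_REGS


def jal_ras_action(rd: int) -> str:
    """JAL pushes only when rd is a link register."""
    return "push" if is_link(rd) else "none"


def jalr_ras_action(rd: int, rs1: int) -> str:
    """Return the RAS action for a JALR, following the table rows in order."""
    rd_link, rs1_link = is_link(rd), is_link(rs1)
    if not rd_link and not rs1_link:
        return "none"
    if not rd_link:          # only rs1 is a link register
        return "pop"
    if not rs1_link:         # only rd is a link register
        return "push"
    # Both are link registers: the same register means push only
    # (the LUI;JALR fusion case); different registers mean push and pop.
    return "push" if rd == rs1 else "push and pop"
```

Under these assumptions, a plain return such as `jalr x0, x1, 0` maps to `jalr_ras_action(0, 1)` and yields `"pop"`, while the fused call `jalr ra, ra, ...` maps to `jalr_ras_action(1, 1)` and yields `"push"`.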
+
\subsubsection*{Conditional Branches}
All branch instructions use the SB-type instruction format. The