Diffstat (limited to 'src/supervisor.adoc')
-rw-r--r-- | src/supervisor.adoc | 84 |
1 files changed, 51 insertions, 33 deletions
diff --git a/src/supervisor.adoc b/src/supervisor.adoc
index b212620..fee952f 100644
--- a/src/supervisor.adoc
+++ b/src/supervisor.adoc
@@ -149,6 +149,20 @@ and load and store effective addresses are taken modulo
 latexmath:[$2^{\text{UXLEN}}$]. For example, when UXLEN=32 and SXLEN=64,
 user-mode memory accesses reference the lowest 4 GiB of the address space.
 
+Some HINT instructions are encoded as integer computational instructions that
+overwrite their destination register with its current value, e.g.,
+`c.addi x8, 0`.
+When such a HINT is executed with XLEN < SXLEN and bits SXLEN..XLEN of the
+destination register not all equal to bit XLEN-1, it is implementation-defined
+whether bits SXLEN..XLEN of the destination register are unchanged or are
+overwritten with copies of bit XLEN-1.
+
+NOTE: This definition allows implementations to elide register writeback for
+some HINTs, while allowing them to execute other HINTs in the same manner as
+other integer computational instructions.
+The implementation choice is observable only by S-mode with SXLEN > UXLEN; it
+is invisible to U-mode.
+
 [[sum]]
 ===== Memory Privilege in `sstatus` Register
 
@@ -253,13 +267,13 @@ An `SRET` instruction sets the `SDT` bit to 0.
 
 [NOTE]
 ====
-A trap handler after saving the state needed for resuming from the trap,
-including `scause`, `sepc`, and `stval` among others, should clear the `SDT` bit
-when it is reentrant.
+After a trap handler has saved the state, such as `scause`, `sepc`,
+and `stval`, needed for resuming from the trap and is reentrant, it
+should clear the `SDT` bit.
 
-Resetting of the `SDT` by an `SRET` enables the trap handler to detect double
-trap occuring during the tail phase, where it restores critical state to return
-from a trap.
+Resetting the `SDT` by an `SRET` enables the trap handler to detect a double
+trap that may occur during the tail phase, where it restores critical state
+to return from a trap.
 
 The consequence of this specification is that if a critical error condition was
 caused by a guest page-fault, then the GPA will not be available in `mtval2`
@@ -270,7 +284,7 @@ instruction in this phase of trap handling is not common. However, not recording
 the GPA is considered benign because, if required, it can still be obtained --
 albeit with added effort -- through the process of walking the page tables.
 
-For a double trap originating in VS-mode, M-mode should redirect the exception
+For a double trap that originates in VS-mode, M-mode should redirect the exception
 to HS-mode by copying the values of M-mode CSRs updated by the trap to HS-mode
 CSRs and should use an `MRET` to resume execution at the address in `stvec`.
 
@@ -697,6 +711,7 @@ instruction bits is implemented, `stval` must also be able to hold all
 values less than latexmath:[$2^N$], where latexmath:[$N$] is the smaller
 of SXLEN and ILEN.
 
+[[sec:senvcfg]]
 ==== Supervisor Environment Configuration (`senvcfg`) Register
 
 The `senvcfg` CSR is an SXLEN-bit read/write register, formatted as
@@ -1096,13 +1111,6 @@ If the value held in _rs1_ is not a valid virtual address, then the
 SFENCE.VMA instruction has no effect. No exception is raised in this
 case.
 
-When __rs2__≠``x0``, bits SXLEN-1:ASIDMAX of the value held
-in _rs2_ are reserved for future standard use. Until their use is
-defined by a standard extension, they should be zeroed by software and
-ignored by current implementations. Furthermore, if
-ASIDLEN<ASIDMAX, the implementation shall ignore bits
-ASIDMAX-1:ASIDLEN of the value held in _rs2_.
-
 [NOTE]
 ====
 It is always legal to over-fence, e.g., by fencing only based on a
@@ -1114,6 +1122,13 @@ choice not to raise an exception when an invalid virtual address is held
 in _rs1_ facilitates this type of simplification.
 ====
 
+When __rs2__≠``x0``, bits SXLEN-1:ASIDMAX of the value held
+in _rs2_ are reserved for future standard use. Until their use is
+defined by a standard extension, they should be zeroed by software and
+ignored by current implementations. Furthermore, if
+ASIDLEN<ASIDMAX, the implementation shall ignore bits
+ASIDMAX-1:ASIDLEN of the value held in _rs2_.
+
 An implicit read of the memory-management data structures may return any
 translation for an address that was valid at any time since the most
 recent SFENCE.VMA that subsumes that address. The ordering implied by
@@ -1169,7 +1184,7 @@ without the need to execute an SFENCE.VMA instruction. Changing
 immediately, without the need to execute an SFENCE.VMA instruction.
 Likewise, changes to `satp`.ASID take effect immediately.
 
-[TIP]
+[NOTE]
 ====
 The following common situations typically require executing an
 SFENCE.VMA instruction:
@@ -2227,29 +2242,32 @@ exceptions when A/D bits need be set, instead takes effect. The Svade extension
 is also defined in <<translation>>.
 
 [[sec:svvptc]]
-== "Svvptc" Extension for Eliding Memory-Management Fences on Making PTEs Valid, Version 1.0
+== "Svvptc" Extension for Obviating Memory-Management Instructions after Marking PTEs Valid, Version 1.0
 
-When the Svvptc extension is implemented, explicit stores that update the Valid
-bit of leaf and/or non-leaf PTEs from 0 to 1 and are visible to a hart will
-eventually become visible within a bounded timeframe to subsequent implicit
+When the Svvptc extension is implemented, explicit stores by a hart that update
+the Valid bit of leaf and/or non-leaf PTEs from 0 to 1 and are visible to a hart
+will eventually become visible within a bounded timeframe to subsequent implicit
 accesses by that hart to such PTEs.
 
 [NOTE]
 ====
-Typically, PTEs are marked as Valid by the operating system following a
-page-fault exception or during system calls for memory mapping. In such cases,
-the trap handler commonly employs an `SRET` instruction to return from the trap.
-When Svvptc is implemented, the stores it executes to change the Valid bit
-of the PTEs from 0 to 1 then become visible to implicit references to those PTEs
-within a bounded timeframe. This visibility pertains to the instructions like
-the one causing the page fault or those accessing new memory regions. A
-memory-management fence can be used to force immediate visibility of these PTE
-updates to all implicit references associated with instructions following the
-memory-management fence. However, when Svvptc is implemented, visibility (in a
-bounded amount of time) is guaranteed and use of a memory-management fence is
-not required in these scenarios. While this approach might lead to an occasional
-gratuitous page-fault, the performance benefit of omitting the memory-management
-fence instructions outweighs the occasional cost of a gratuitous page fault.
+Svvptc relieves an operating system from executing certain memory-management
+instructions, such as `SFENCE.VMA` or `SINVAL.VMA`, which would normally be used
+to synchronize the hart's address-translation caches when a memory-resident PTE
+is changed from Invalid to Valid. Synchronizing the hart's address-translation
+caches with other forms of updates to a memory-resident PTE, including when a
+PTE is changed from Valid to Invalid, requires the use of suitable
+memory-management instructions. Svvptc guarantees that a change to a PTE from
+Invalid to Valid is made visible within a bounded time, thereby making the
+execution of these memory-management instructions redundant. The performance
+benefit of eliding these instructions outweighs the cost of an occasional
+gratuitous additional page fault that may occur.
+
+Depending on the microarchitecture, some possible ways to facilitate
+implementation of Svvptc include: not having any address-translation caches, not
+storing Invalid PTEs in the address-translation caches, automatically evicting
+Invalid PTEs using a bounded timer, or making address-translation caches
+coherent with store instructions that modify PTEs.
 ====
 
 ////