author | Stewart Smith <stewart@linux.ibm.com> | 2018-05-28 17:49:39 +1000 |
---|---|---|
committer | Stewart Smith <stewart@linux.ibm.com> | 2018-05-28 17:49:39 +1000 |
commit | c55a54bbf38b6b3144105885e173ae7b6afab091 (patch) | |
tree | dd6c606aa14bcbca9133fc0c635220085efe0a6f | |
parent | cc52c56200956485aee67cb933b2a3d0132cc7fd (diff) | |
skiboot 6.0.4 release notes (tag: v6.0.4)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
-rw-r--r-- | doc/release-notes/skiboot-6.0.4.rst | 55 |
1 files changed, 55 insertions, 0 deletions
diff --git a/doc/release-notes/skiboot-6.0.4.rst b/doc/release-notes/skiboot-6.0.4.rst
new file mode 100644
index 0000000..0db6aac
--- /dev/null
+++ b/doc/release-notes/skiboot-6.0.4.rst
@@ -0,0 +1,55 @@
+.. _skiboot-6.0.4:
+
+=============
+skiboot-6.0.4
+=============
+
+skiboot 6.0.4 was released on Monday May 28th, 2018. It replaces
+:ref:`skiboot-6.0.3` as the current stable release in the 6.0.x series.
+
+It is recommended that 6.0.4 be used instead of any previous 6.0.x version.
+
+Over :ref:`skiboot-6.0.3`, we have two bug fixes: one helps with performance
+(especially in HPC environments), and one is an opal-prd fix.
+
+Changes are:
+
+- SLW: Remove stop1_lite and stop2_lite
+
+  stop1_lite has been removed since it adds no additional benefit
+  over stop0_lite. stop2_lite has been removed since currently it adds
+  minimal benefit over stop2. However, the benefit is eclipsed by the time
+  required to ungate the clocks.
+
+  Moreover, Lite states don't give up the SMT resources and can potentially
+  have a performance impact on sibling threads.
+
+  Since current OSs (Linux) aren't smart enough to make good decisions
+  with these stop states, we're (temporarily) removing them from what
+  we expose to the OS, the idea being to bring them back in a new
+  DT representation so that only an OS that knows what to do will
+  do things with them.
+- opal-prd: Do not error out on first failure for soft/hard offline.
+
+  The memory errors (CEs and UEs) that are detected as part of background
+  memory scrubbing are reported by PRD asynchronously to opal-prd along with
+  affected memory ranges. hservice_memory_error() converts these ranges into
+  page granularity before hooking them up to the soft/hard offline-ing
+  infrastructure.
+
+  But the current implementation of hservice_memory_error() does not hook up
+  all the pages to soft/hard offline-ing if any of the page offline actions
+  fails. For example, hard offline can fail for:
+
+  - Pages that are not part of buddy managed pool.
+  - Pages that are reserved by kernel using memblock_reserved()
+  - Pages that are in use by kernel.
+
+  But for the pages that are in use by a user space application, the hard
+  offline marks the page as hwpoison, sends a SIGBUS signal to kill the
+  affected application as recovery action and returns success.
+
+  Hence, it is possible that some of the pages in that memory range are in
+  use by an application or free. By stopping on the first error we lose the
+  opportunity to hwpoison the subsequent pages which may be free or in use by
+  an application. This patch fixes this issue.
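
The opal-prd change described in the release notes comes down to a loop that records a per-page failure and keeps going instead of returning at the first error. The following is a minimal C sketch of that pattern, not the actual opal-prd code: `offline_range`, `offline_page_fn`, `struct demo_ctx` and `demo_offline` are hypothetical names introduced for illustration, and the per-page action is abstracted behind a callback rather than modelling how opal-prd really requests an offline.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-page action: returns 0 on success, non-zero on failure. */
typedef int (*offline_page_fn)(uint64_t addr, void *ctx);

/*
 * Attempt to offline every page in [start, end] (inclusive), one page at a
 * time. A failure is counted but does not stop the loop.
 * Returns 0 if every page succeeded, 1 if at least one page failed,
 * and -1 on invalid arguments.
 */
int offline_range(uint64_t start, uint64_t end, uint64_t page_size,
		  offline_page_fn offline, void *ctx, size_t *failed)
{
	size_t nfail = 0;
	uint64_t addr;

	if (page_size == 0 || start > end || offline == NULL)
		return -1;

	for (addr = start; ; addr += page_size) {
		if (offline(addr, ctx) != 0)
			nfail++;	/* note the failure, keep going */
		if (end - addr < page_size)
			break;		/* next page would be past the end */
	}

	if (failed)
		*failed = nfail;
	return nfail ? 1 : 0;
}

/* Illustrative stand-in for the real offline action: one page always fails. */
struct demo_ctx {
	uint64_t bad_addr;	/* page that cannot be offlined */
	size_t attempts;	/* pages the loop actually tried */
};

int demo_offline(uint64_t addr, void *ctx)
{
	struct demo_ctx *d = ctx;

	d->attempts++;
	return addr == d->bad_addr ? -1 : 0;
}
```

With `bad_addr` set to `0x1000` and a range of `0x0` to `0x3000` at a page size of `0x1000`, all four pages are attempted and a single failure is reported, whereas a loop that returned on the first error would have stopped after the second page.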