From aaa3e159cb2ce6baa3d4ca1a283c5f918944c18b Mon Sep 17 00:00:00 2001 From: Stewart Smith Date: Mon, 28 May 2018 18:06:27 +1000 Subject: skiboot 5.4.10 release notes Signed-off-by: Stewart Smith --- doc/release-notes/skiboot-5.4.10.rst | 58 ++++++++++++++++++++++++++++++++++++ 1 file changed, 58 insertions(+) create mode 100644 doc/release-notes/skiboot-5.4.10.rst diff --git a/doc/release-notes/skiboot-5.4.10.rst b/doc/release-notes/skiboot-5.4.10.rst new file mode 100644 index 0000000..97e0f5c --- /dev/null +++ b/doc/release-notes/skiboot-5.4.10.rst @@ -0,0 +1,58 @@ +.. _skiboot-5.4.10: + +============== +skiboot-5.4.10 +============== + +skiboot-5.4.10 was released on Monday May 28th, 2018. It replaces +:ref:`skiboot-5.4.9` as the current stable release in the 5.4.x series. + +Over :ref:`skiboot-5.4.9`, we have a few bug fixes: + +- opal-prd: Do not error out on first failure for soft/hard offline. + + The memory errors (CEs and UEs) that are detected as part of background + memory scrubbing are reported by PRD asynchronously to opal-prd along with + affected memory ranges. hservice_memory_error() converts these ranges into + page granularity before hooking up them to soft/hard offline-ing + infrastructure. + + But the current implementation of hservice_memory_error() does not hookup + all the pages to soft/hard offline-ing if any of the page offline action + fails. e.g hard offline can fail for: + + - Pages that are not part of buddy managed pool. + - Pages that are reserved by kernel using memblock_reserved() + - Pages that are in use by kernel. + + But for the pages that are in use by user space application, the hard + offline marks the page as hwpoison, sends SIGBUS signal to kill the + affected application as recovery action and returns success. + + Hence, It is possible that some of the pages in that memory range are in + use by application or free. By stopping on first error we loose the + opportunity to hwpoison the subsequent pages which may be free or in use by + application. This patch fixes this issue. +- OPAL_PCI_SET_POWER_STATE: fix locking in error paths + + Otherwise we could exit OPAL holding locks, potentially leading + to all sorts of problems later on. +- p8-i2c: Limit number of retry attempts + + Current we will attempt to start an I2C transaction until it succeeds. + In the event that the OCC does not release the lock on an I2C bus this + results in an async token being held forever and the kernel thread that + started the transaction will block forever while waiting for an async + completion message. Fix this by limiting the number of attempts to + start the transaction. +- FSP/CONSOLE: Disable notification on unresponsive consoles + + Commit fd6b71fc fixed the situation where ipmi console was open (hvc0) but got + data on different console (hvc1). + + During FSP R/R OPAL closes all consoles. After R/R complete FSP requests to + open hvc1 and sends data on this. If hvc1 registration failed or not opened in + host kernel then it will not read data and results in RCU stalls. + + Note that this is workaround for older kernel where we don't have separate irq + for each console. Latest kernel works fine without this patch. -- cgit v1.1