author Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> 2018-05-11 19:28:43 +0530
committer Stewart Smith <stewart@linux.ibm.com> 2018-05-28 12:24:06 +1000
commit 6ca368e2e2254eac8682b6af43758ba134aa3763
tree 76323aabb4712c219e9a358d5f3fa8a2e4cb8053 /external
parent 0fcedc753e6b56da85fa33033682748f70715464
opal-prd: Do not error out on first failure for soft/hard offline.
The memory errors (CEs and UEs) detected during background memory scrubbing are reported by PRD asynchronously to opal-prd, along with the affected memory ranges. hservice_memory_error() converts these ranges to page granularity before hooking them up to the soft/hard offlining infrastructure.

However, the current implementation of hservice_memory_error() stops at the first page whose offline action fails, so the remaining pages in the range are never submitted for soft/hard offlining. For example, hard offline can fail for:
- Pages that are not part of the buddy-managed pool.
- Pages that are reserved by the kernel using memblock_reserved().
- Pages that are in use by the kernel.

For pages that are in use by a user-space application, on the other hand, hard offline marks the page as hwpoison, sends SIGBUS to kill the affected application as a recovery action, and returns success.

Hence, some of the pages in the memory range may be free or in use by an application. By stopping on the first error we lose the opportunity to hwpoison the subsequent pages, which may be free or in use by an application. This patch fixes the issue by continuing past failures and returning the last error code after all pages have been tried.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit e9ee7c7d357160a704c8248a1787124f94df8c54)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
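The continue-on-failure behavior described in the commit message can be sketched in isolation as follows. This is a minimal illustration, not the opal-prd code: offline_range(), the offline_fn callback, stub_offline_page(), and the attempted counter are hypothetical names introduced here, and the callback stands in for whatever mechanism actually submits a page for offlining. Only the pattern of recording the failure in ret and carrying on with the next page is taken from the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical callback: returns 0 on success, negative on failure. */
typedef int (*offline_fn)(uint64_t addr);

/*
 * Try to offline every page in [start, end]. A failure is logged and
 * remembered, but the loop keeps going so later pages still get a chance.
 * Returns 0 if every page succeeded, otherwise the last failure code.
 */
static int offline_range(uint64_t start, uint64_t end, uint64_t page_size,
			 offline_fn offline_page, unsigned *attempted)
{
	uint64_t addr;
	int rc, ret = 0;

	assert(page_size != 0);

	for (addr = start; addr <= end; addr += page_size) {
		rc = offline_page(addr);
		(*attempted)++;
		if (rc < 0) {
			fprintf(stderr, "failed to offline page %016" PRIx64 "\n",
				addr);
			ret = rc;	/* remember the failure, do not return */
		}
	}

	return ret;
}

/* Hypothetical stub: pretends two specific pages cannot be offlined. */
static int stub_offline_page(uint64_t addr)
{
	return (addr == 0x10000 || addr == 0x30000) ? -1 : 0;
}
```

With the stub above, a call over the four 64 KiB pages starting at 0x10000 still attempts all four pages even though the first one fails, and the caller sees a non-zero return value. As in the patch, only the most recent failure code is preserved; a caller that needs per-page results would have to collect them separately.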
Diffstat (limited to 'external')
-rw-r--r-- external/opal-prd/opal-prd.c 6
1 files changed, 3 insertions, 3 deletions
diff --git a/external/opal-prd/opal-prd.c b/external/opal-prd/opal-prd.c
index 5a15f1d..d5b1700 100644
--- a/external/opal-prd/opal-prd.c
+++ b/external/opal-prd/opal-prd.c
@@ -696,7 +696,7 @@ int hservice_memory_error(uint64_t i_start_addr, uint64_t i_endAddr,
 {
 	const char *sysfsfile, *typestr;
 	char buf[ADDR_STRING_SZ];
-	int memfd, rc, n;
+	int memfd, rc, n, ret = 0;
 	uint64_t addr;

 	switch(i_errorType) {
@@ -732,11 +732,11 @@ int hservice_memory_error(uint64_t i_start_addr, uint64_t i_endAddr,
 			pr_log(LOG_CRIT, "MEM: Failed to offline memory! "
 				"page addr: %016lx type: %d: %m",
 				addr, i_errorType);
-			return rc;
+			ret = rc;
 		}
 	}

-	return 0;
+	return ret;
 }

 uint64_t hservice_get_interface_capabilities(uint64_t set)