author | Nicholas Piggin <npiggin@gmail.com> | 2018-03-01 17:12:20 +1000 |
---|---|---|
committer | Stewart Smith <stewart@linux.vnet.ibm.com> | 2018-03-01 20:36:53 -0600 |
commit | 56a85b41d23147e7dbe6d78d5a46d13910bc8495 (patch) | |
tree | c45de89a8fb6a1be5872a635cc126f3c8bc80a4d /include/pci.h | |
parent | 8ea3ac76137be3f02d4131b36a66f6917190e384 (diff) | |
core/hmi: report processor recovery reason from core FIR bits on P9
When an error is encountered that causes processor recovery, an HMI is
generated if the recovery was successful. The reason is recorded in
the core FIR, which gets copied into the WOF.
In this case, dump the WOF register and an error string into the OPAL
msglog.
A broken init setting led to HMIs reported in Linux as:
    [ 3.591547] Harmless Hypervisor Maintenance interrupt [Recovered]
    [ 3.591648] Error detail: Processor Recovery done
    [ 3.591714] HMER: 2040000000000000
This patch would have been useful because it tells us exactly that
the problem is in the d-side ERAT:
    [ 414.489690798,7] HMI: Received HMI interrupt: HMER = 0x2040000000000000
    [ 414.489693339,7] HMI: [Loc: UOPWR.0000000-Node0-Proc0]: P:0 C:1 T:1: Processor recovery occurred.
    [ 414.489699837,7] HMI: Core WOF = 0x0000000410000000 recovered error:
    [ 414.489701543,7] HMI: LSU - SRAM (DCACHE parity, etc)
    [ 414.489702341,7] HMI: LSU - ERAT multi hit
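As an illustration of how a WOF value maps to the reason strings in the log above, here is a minimal sketch and not the code from this patch. It assumes IBM MSB-0 bit numbering, and the two bit positions (29 and 35) are inferred only from the sample value 0x0000000410000000 and the two strings printed for it. The names `wof_bit_info`, `wof_bits` and `decode_core_wof` are hypothetical, and `PPC_BIT` is defined locally so the snippet stands alone.

```c
#include <stddef.h>
#include <stdint.h>

/* IBM (MSB-0) numbering: bit 0 is the most significant of 64 bits.
 * Defined locally for this sketch. */
#define PPC_BIT(n) (1ULL << (63 - (n)))

/* Hypothetical table entry: one WOF bit and its description. */
struct wof_bit_info {
	int bit;
	const char *reason;
};

/* Assumption: only the two bits visible in the sample log are listed,
 * with positions inferred from WOF = 0x0000000410000000. */
static const struct wof_bit_info wof_bits[] = {
	{ 29, "LSU - SRAM (DCACHE parity, etc)" },
	{ 35, "LSU - ERAT multi hit" },
};

/* Store up to max matching reason strings in out[], in table order.
 * Returns the number of entries stored. */
static size_t decode_core_wof(uint64_t wof, const char **out, size_t max)
{
	size_t i, n = 0;

	for (i = 0; i < sizeof(wof_bits) / sizeof(wof_bits[0]) && n < max; i++) {
		if (wof & PPC_BIT(wof_bits[i].bit))
			out[n++] = wof_bits[i].reason;
	}
	return n;
}
```

Calling `decode_core_wof(0x0000000410000000ULL, reasons, 4)` with this table stores both strings from the log, because `PPC_BIT(29)` is 0x0000000400000000 and `PPC_BIT(35)` is 0x0000000010000000.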
In the future it would be good to unify this reporting, so that Linux can
print something more useful. Until then, this gives some good data.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>