author Frederic Barrat <fbarrat@linux.ibm.com> 2019-10-09 21:38:11 +0200
committer Oliver O'Halloran <oohall@gmail.com> 2019-10-22 17:31:52 +1100
commit 233e863c8b1dccad8be7c39336d232a4a3994e6b (patch)
tree 74be71577f108115f1d45f1bca29964956c927ba
parent 9d5faafc56f5cac7ba848bc684835353e039f048 (diff)
npu2-opencapi: Log a warning when resetting a broken device
On P9, the NPU doesn't support recovery if the link goes down
unexpectedly. Recovery was not fully verified. We mark the device as
broken when we receive an error interrupt from the NPU.

However, there's nothing to prevent the OS from trying to reset the
device. It may or may not work; it's unsupported territory, so let's
log a message to make it clear, as it could help when debugging.

We haven't hit any cases where the reset goes badly enough that we'd
want to prevent it, so let it go for now. We can revisit later if we
have evidence that it's causing more problems than it is worth.

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
-rw-r--r-- hw/npu2-opencapi.c 4
1 files changed, 4 insertions, 0 deletions
diff --git a/hw/npu2-opencapi.c b/hw/npu2-opencapi.c
index 5658ec6..fc9e50c 100644
--- a/hw/npu2-opencapi.c
+++ b/hw/npu2-opencapi.c
@@ -1203,6 +1203,10 @@ static int64_t npu2_opencapi_poll_link(struct pci_slot *slot)
 	case OCAPI_SLOT_LINK_TRAINED:
 		otl_enabletx(chip_id, dev->npu->xscom_base, dev);
 		pci_slot_set_state(slot, OCAPI_SLOT_NORMAL);
+		if (dev->flags & NPU2_DEV_BROKEN) {
+			OCAPIERR(dev, "Resetting a device which hit a previous error. Device recovery is not supported, so future behavior is undefined\n");
+			dev->flags &= ~NPU2_DEV_BROKEN;
+		}
 		check_perf_counters(dev);
 		dev->phb_ocapi.scan_map = 1;
 		return OPAL_SUCCESS;
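
The flag handling in this hunk can be illustrated outside of skiboot with a minimal, standalone sketch. Everything in it is hypothetical: `struct sketch_dev`, `SKETCH_DEV_BROKEN`, `sketch_mark_broken()` and `sketch_link_trained()` are stand-ins invented for illustration and are not skiboot definitions. It only models the idea described in the commit message: an error marks the device as broken, and the next successful link training logs one warning and clears the flag instead of refusing the reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; not skiboot definitions. */
#define SKETCH_DEV_BROKEN 0x1u

struct sketch_dev {
	uint32_t flags;		/* bitmask of device state flags */
	unsigned int warnings;	/* number of warnings logged so far */
};

/* Models the error-interrupt path: remember that the device failed. */
static void sketch_mark_broken(struct sketch_dev *dev)
{
	assert(dev != NULL);
	dev->flags |= SKETCH_DEV_BROKEN;
}

/*
 * Models the point where the link has retrained after a reset.
 * If the device was previously marked broken, log a warning and
 * clear the flag; the reset itself is not blocked.
 * Returns true if a warning was logged.
 */
static bool sketch_link_trained(struct sketch_dev *dev)
{
	assert(dev != NULL);
	if (!(dev->flags & SKETCH_DEV_BROKEN))
		return false;

	fprintf(stderr, "warning: resetting a device which hit a previous "
		"error; recovery is not supported\n");
	dev->warnings++;
	dev->flags &= ~SKETCH_DEV_BROKEN;
	return true;
}
```

In this sketch, calling `sketch_link_trained()` on a device that never failed logs nothing, while calling it after `sketch_mark_broken()` logs exactly one warning and leaves the flag cleared, so a later reset of the same device stays silent unless a new error occurs.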