hw/phb4: Clear the PEC FIRs when taking the ETU out of reset

The documented PEC recovery procedure is to clear the PEC FIR registers when the ETU/PHB is in reset. However, any xscom access targeting a PHB register while it is in reset will raise a new error (PFIR bit 3), so it is possible to get out of reset and still have a FIR register showing errors.

It has been observed that the OCC, through its 24x7 service, can do such an xscom access at boot time if we end up in the CRESET path. So the current behavior of logging an error is not desirable.

The recommendation from the logic designer is to keep the existing mechanism to clear the FIR registers and add an extra step to clear any new errors immediately after taking the ETU out of reset. That's what this patch is doing.

Fixes: https://github.com/open-power/skiboot/issues/273
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
author: Frederic Barrat <fbarrat@linux.ibm.com> 2023-02-07 16:08:37 +0100
committer: Reza Arbab <arbab@linux.ibm.com> 2023-02-27 14:05:47 -0600
commit: 80e2b1dc7396d5a02d14b90cd6e86dfbacd85d1d (patch)
tree: d447d90ee3e0c4458bf9d46bb6d5f52ac72e5e1c
parent: c353f5a76aff5ea4ed0ed34ac059904877fdf114 (diff)
download: skiboot-80e2b1dc7396d5a02d14b90cd6e86dfbacd85d1d.zip
skiboot-80e2b1dc7396d5a02d14b90cd6e86dfbacd85d1d.tar.gz
skiboot-80e2b1dc7396d5a02d14b90cd6e86dfbacd85d1d.tar.bz2
1 files changed, 17 insertions, 13 deletions
diff --git a/hw/phb4.c b/hw/phb4.c
index e015646..b1fa08f 100644
--- a/hw/phb4.c
+++ b/hw/phb4.c
@@ -3576,8 +3576,19 @@ static int64_t phb4_creset(struct pci_slot *slot)
 			xscom_write(p->chip_id, p->pe_stk_xscom +
 				    XPEC_NEST_STK_PCI_NFIR_CLR, ~p->nfir_cache);
 
-			/* Re-read errors in PFIR and NFIR and reset any new
-			 * error reported.
+			/* Clear PHB from reset */
+			xscom_write(p->chip_id,
+				    p->pci_stk_xscom + XPEC_PCI_STK_ETU_RESET, 0x0);
+			p->flags &= ~PHB4_ETU_IN_RESET;
+
+			/*
+			 * Re-read errors in PFIR and NFIR and reset
+			 * any new error reported while the ETU was in
+			 * reset.
+			 * A xscom access when the ETU is in reset
+			 * will set PFIR bit 3 and the OCC is known to
+			 * access PHB performance counters, so such an
+			 * error is not uncommon.
 			 */
 			xscom_read(p->chip_id, p->pci_stk_xscom +
 				   XPEC_PCI_STK_PCI_FIR, &p->pfir_cache);
@@ -3585,21 +3596,14 @@ static int64_t phb4_creset(struct pci_slot *slot)
 				   XPEC_NEST_STK_PCI_NFIR, &p->nfir_cache);
 
 			if (p->pfir_cache || p->nfir_cache) {
-				PHBERR(p, "CRESET: PHB still fenced !!\n");
-				phb4_dump_pec_err_regs(p);
-
-				/* Reset the PHB errors */
 				xscom_write(p->chip_id, p->pci_stk_xscom +
-					    XPEC_PCI_STK_PCI_FIR, 0);
+					    XPEC_PCI_STK_PCI_FIR_CLR,
+					    ~p->pfir_cache);
 				xscom_write(p->chip_id, p->pe_stk_xscom +
-					    XPEC_NEST_STK_PCI_NFIR, 0);
+					    XPEC_NEST_STK_PCI_NFIR_CLR,
+					    ~p->nfir_cache);
 			}
 
-			/* Clear PHB from reset */
-			xscom_write(p->chip_id,
-				    p->pci_stk_xscom + XPEC_PCI_STK_ETU_RESET, 0x0);
-			p->flags &= ~PHB4_ETU_IN_RESET;
-
 			pci_slot_set_state(slot, PHB4_SLOT_CRESET_REINIT);
 			/* After lifting PHB reset, wait while logic settles */
 			return pci_slot_set_sm_timeout(slot, msecs_to_tb(10));
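
To illustrate why the ordering matters, here is a minimal standalone sketch of the sequence in the hunks above. It is not skiboot code: `struct sim_pec`, `sim_phb_access()`, `sim_fir_clr()` and `sim_creset()` are hypothetical names, the position of "PFIR bit 3" assumes IBM bit numbering (bit 0 is the most significant bit), and the clear register is assumed to have write-AND semantics, which is what writing `~p->pfir_cache` to `XPEC_PCI_STK_PCI_FIR_CLR` suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: "PFIR bit 3" in IBM numbering, where bit 0 is the MSB. */
#define SIM_PFIR_BIT3 (1ULL << (63 - 3))

/* Hypothetical model of the PEC state relevant to this patch. */
struct sim_pec {
	uint64_t pfir;		/* simulated PEC PCI FIR */
	bool etu_in_reset;	/* simulated ETU reset state */
};

/* A PHB register access while the ETU is in reset latches PFIR bit 3. */
static void sim_phb_access(struct sim_pec *pec)
{
	if (pec->etu_in_reset)
		pec->pfir |= SIM_PFIR_BIT3;
}

/* Assumed write-AND semantics: only bits set in the mask survive. */
static void sim_fir_clr(struct sim_pec *pec, uint64_t mask)
{
	pec->pfir &= mask;
}

/* Run the recovery sequence and return the final simulated PFIR value. */
uint64_t sim_creset(bool clear_after_release)
{
	/* Start in reset with an arbitrary pre-existing error latched. */
	struct sim_pec pec = {
		.pfir = 0x8000000000000000ULL,
		.etu_in_reset = true,
	};
	uint64_t cache;

	/* Step 1: clear the cached errors while the ETU is in reset. */
	cache = pec.pfir;
	sim_fir_clr(&pec, ~cache);

	/* A stray access (e.g. from the OCC) arrives before the release. */
	sim_phb_access(&pec);

	/* Step 2: take the ETU out of reset. */
	assert(pec.etu_in_reset);
	pec.etu_in_reset = false;

	/* Step 3 (the step this patch moves after the release):
	 * re-read the FIR and clear anything raised during reset. */
	if (clear_after_release) {
		cache = pec.pfir;
		if (cache)
			sim_fir_clr(&pec, ~cache);
	}

	return pec.pfir;
}
```

In this model, `sim_creset(true)` ends with a clean FIR, while `sim_creset(false)` leaves bit 3 latched, which corresponds to the leftover error described in the commit message.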