author | Frederic Barrat <fbarrat@linux.ibm.com> | 2023-02-07 16:08:37 +0100 |
---|---|---|
committer | Reza Arbab <arbab@linux.ibm.com> | 2023-02-27 14:05:47 -0600 |
commit | 80e2b1dc7396d5a02d14b90cd6e86dfbacd85d1d (patch) | |
tree | d447d90ee3e0c4458bf9d46bb6d5f52ac72e5e1c /hw | |
parent | c353f5a76aff5ea4ed0ed34ac059904877fdf114 (diff) | |
hw/phb4: Clear the PEC FIRs when taking the ETU out of reset
The documented PEC recovery procedure is to clear the PEC FIR
registers when the ETU/PHB is in reset. However, any xscom access
targeting a PHB register while it is in reset will raise a new
error (PFIR bit 3), so it is possible to get out of reset and still
have a FIR register showing errors. It has been observed that the OCC,
through its 24x7 service, can perform such an xscom access at boot time if
we end up in the CRESET path. Logging an error in that case, as the
current code does, is therefore not desirable.
The recommendation from the logic designer is to keep the existing
mechanism to clear the FIR registers and add an extra step to clear
any new errors immediately after taking the ETU out of reset. That's
what this patch is doing.
Fixes: https://github.com/open-power/skiboot/issues/273
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
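For readers less familiar with the register conventions, here is a minimal sketch of what "PFIR bit 3" means as a mask. It assumes the PFIR follows the usual IBM bit numbering, where bit 0 is the most significant bit of the 64-bit register. The macro mirrors the common `PPC_BIT` definition; the helper names `ppc_bit` and `pfir_has_bit3` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the MSB of a 64-bit value. */
#define PPC_BIT(bit) (0x8000000000000000ULL >> (bit))

/* Checked variant for run-time bit numbers (hypothetical helper). */
static inline uint64_t ppc_bit(unsigned int bit)
{
	assert(bit < 64);
	return PPC_BIT(bit);
}

/*
 * Hypothetical helper: true if the error described in the commit
 * message (PFIR bit 3) is set in a value read from the PFIR.
 */
static inline int pfir_has_bit3(uint64_t pfir)
{
	return (pfir & PPC_BIT(3)) != 0;
}
```

Under that assumption the mask for bit 3 is `0x1000000000000000`, not `0x8` as it would be with LSB-0 numbering.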
Diffstat (limited to 'hw')
-rw-r--r-- | hw/phb4.c | 30 |
1 files changed, 17 insertions, 13 deletions
@@ -3576,8 +3576,19 @@ static int64_t phb4_creset(struct pci_slot *slot)
 		xscom_write(p->chip_id, p->pe_stk_xscom +
 			    XPEC_NEST_STK_PCI_NFIR_CLR, ~p->nfir_cache);
 
-		/* Re-read errors in PFIR and NFIR and reset any new
-		 * error reported.
+		/* Clear PHB from reset */
+		xscom_write(p->chip_id,
+			    p->pci_stk_xscom + XPEC_PCI_STK_ETU_RESET, 0x0);
+		p->flags &= ~PHB4_ETU_IN_RESET;
+
+		/*
+		 * Re-read errors in PFIR and NFIR and reset
+		 * any new error reported while the ETU was in
+		 * reset.
+		 * A xscom access when the ETU is in reset
+		 * will set PFIR bit 3 and the OCC is known to
+		 * access PHB performance counters, so such an
+		 * error is not uncommon.
 		 */
 		xscom_read(p->chip_id, p->pci_stk_xscom +
 			   XPEC_PCI_STK_PCI_FIR, &p->pfir_cache);
@@ -3585,21 +3596,14 @@ static int64_t phb4_creset(struct pci_slot *slot)
 			   XPEC_NEST_STK_PCI_NFIR, &p->nfir_cache);
 
 		if (p->pfir_cache || p->nfir_cache) {
-			PHBERR(p, "CRESET: PHB still fenced !!\n");
-			phb4_dump_pec_err_regs(p);
-
-			/* Reset the PHB errors */
 			xscom_write(p->chip_id, p->pci_stk_xscom +
-				    XPEC_PCI_STK_PCI_FIR, 0);
+				    XPEC_PCI_STK_PCI_FIR_CLR,
+				    ~p->pfir_cache);
 			xscom_write(p->chip_id, p->pe_stk_xscom +
-				    XPEC_NEST_STK_PCI_NFIR, 0);
+				    XPEC_NEST_STK_PCI_NFIR_CLR,
+				    ~p->nfir_cache);
 		}
 
-		/* Clear PHB from reset */
-		xscom_write(p->chip_id,
-			    p->pci_stk_xscom + XPEC_PCI_STK_ETU_RESET, 0x0);
-		p->flags &= ~PHB4_ETU_IN_RESET;
-
 		pci_slot_set_state(slot, PHB4_SLOT_CRESET_REINIT);
 		/* After lifting PHB reset, wait while logic settles */
 		return pci_slot_set_sm_timeout(slot, msecs_to_tb(10));
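The ordering this patch establishes can be illustrated with a small stand-alone model. This is a sketch only, not skiboot code: `struct toy_pec` and every `toy_*` function are hypothetical. It also assumes that the `*_CLR` registers have write-AND semantics, so that writing `~cache` clears exactly the bits that were set in `cache`. That assumption is consistent with the values the patch writes, but it is not stated in the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the MSB of a 64-bit value. */
#define PPC_BIT(bit) (0x8000000000000000ULL >> (bit))

/* Toy model of one PEC stack; not skiboot's data structure. */
struct toy_pec {
	uint64_t pfir;		/* modelled PCI FIR contents */
	bool etu_in_reset;	/* modelled ETU reset state */
};

/* Assumed write-AND behaviour of a *_CLR register. */
static void toy_fir_clr_write(struct toy_pec *pec, uint64_t and_mask)
{
	pec->pfir &= and_mask;
}

/* Models an xscom access to a PHB register, e.g. from the OCC. */
static void toy_phb_access(struct toy_pec *pec)
{
	if (pec->etu_in_reset)
		pec->pfir |= PPC_BIT(3);	/* access during reset */
}

/* Clear, leave reset, then clear whatever was raised meanwhile. */
static uint64_t toy_creset(struct toy_pec *pec)
{
	uint64_t cache;

	assert(pec->etu_in_reset);

	cache = pec->pfir;
	toy_fir_clr_write(pec, ~cache);	/* documented step, in reset */

	toy_phb_access(pec);		/* stray access sets bit 3 */

	pec->etu_in_reset = false;	/* take the ETU out of reset */

	cache = pec->pfir;		/* re-read after reset */
	if (cache)
		toy_fir_clr_write(pec, ~cache);

	return pec->pfir;
}
```

In this model, an access that arrives between the first clear and the end of reset leaves bit 3 set, and the second clear removes it. Once the ETU is out of reset, further accesses no longer raise the bit, which is why the extra clear is placed after the reset is lifted.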