Age | Commit message (Collapse) | Author | Files | Lines |
|
Small cleanup when reading the PEC config when setting up CAPI, in
preparation for P10. Scom addresses vary between P9 and P10 and we'll
be accessing more than one PCI chiplet. No functional change.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
The workaround forces a state machine deep in the PHB to start from
scratch and to block its evolution until after the link has been
reset. It applies on all paths where the link can go down
unexpectedly, though it's probably useless on the creset path, since
we're going to deep-reset the PHB anyway. But it doesn't hurt and it
keeps the set/unset path symmetrical.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Update init sequence to take into account Gen5.
Define default equlization settings if HDAT is not used.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
The PHB5 introduces a new Address-Based Interrupt mode which extends
the notification offloading to the ESB pages. When ABT is activated,
the PHB maps the interrupt source number into the interrupt command
address. The PHB triggers the interrupt using directly the IC ESB page
of the interrupt number and does not use the notify page of the IC
anymore.
The PHB interrrupt configuration under ABT is a little different. The
'Interrupt Notify Base Address' register points to the base address of
the IC ESB pages and not to the notify page of the IC anymore as on
P9. The 'Interrupt Notify Base Index' register is unused.
This should improve overall performance. The P10 IC can handle higher
interrupt rates compared to P9 and the PHB latency should be improved
under ABT. Debug is easier as the interrupt number is now exposed on
the PowerBUS.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[FB: port to phb4.c]
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
The POWER9 DD2.0 introduced a StoreEOI operation which had benefits
over the LoadEOI operation : less latency and improved performance for
interrupt handling. Because of load vs. store ordering issues in some
cases, it had to be deactivates. The POWER10 processor has a set
of new features in the XIVE2 and the PHB5 controllers to address this
problem.
At the interrupt controller level, XIVE2 adds a new load offset to the
ESB page which offers the capability to order loads after stores. It
should be enforced by the OS when doing loads if StoreEOI is to be
used.
But this is not enough. The firmware should also carefully configure
the PHB interrupt sources to make sure that operations on the PQ state
bits of a source are routed to a single logic unit : the XIVE2 IC.
The PHB5 introduces a new configuration PQ disable (bit 9) bit for
this purpose.
It disables the check of the PQ state bits when processing new MSI
interrupts. When set, the PHB ignores its local PQ state bits and
forwards unconditionally any MSI trigger to the XIVE2 interrupt
controller. The XIVE2 IC knows from the trigger message that the PQ
bits have not been checked and performs the check using the local PQ
bits. This configuration bit only applies to MSIs and LSIs are still
checked on the PHB to handle the assertion level.
This requires a new XIVE interface to register a HW interrupt source
using the IC ESB pages of the allocated HW interrupt numbers, and not
the ESB pages of the HW source. This is what this change proposes for
MSIs, LSI still being handled the old way.
PQ disable is a requirement for StoreEOI.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[FB: port to phb4.c]
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
The PHB5 logic on P10 is pretty close to the P9's version. So
we keep our base phb4 implementation and just add the few changes
within if statements.
Signed-off-by: Jordan Niethe <jpn@ozlabs.au.ibm.com>
[clg: misc cleanups and fixes ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[Fixed compilation issue - Vasant]
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
[Nick: Unify PHB4/PHB5 drivers ]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[Mikey: set default lane eq settings for phb5]
Signed-off-by: Michael Neuling <mikey@neuling.org>
[FB: squash commits + small cleanup ]
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
This patch implements a circumvention for HW557787. It disables the
TCE cache line buffer as, under heavy loads, there's a possibility of
an entry being re-allocated incorrectly.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
SPDX makes it a simpler diff.
I have audited the commit history of each file to ensure that they are
exclusively authored by IBM and thus we have the right to relicense.
The motivation behind this is twofold:
1) We want to enable experiments with coreboot, which is GPLv2 licensed
2) An upcoming firmware component wants to incorporate code from skiboot
and code from the Linux kernel, which is GPLv2 licensed.
I have gone through the IBM internal way of gaining approval for this.
The following files are not exclusively authored by IBM, so are *not*
included in this update (I will be seeking approval from contributors):
core/direct-controls.c
core/flash.c
core/pcie-slot.c
external/common/arch_flash_unknown.c
external/common/rules.mk
external/gard/Makefile
external/gard/rules.mk
external/opal-prd/Makefile
external/pflash/Makefile
external/xscom-utils/Makefile
hdata/vpd.c
hw/dts.c
hw/ipmi/ipmi-watchdog.c
hw/phb4.c
include/cpu.h
include/phb4.h
include/platform.h
libflash/libffs.c
libstb/mbedtls/sha512.c
libstb/mbedtls/sha512.h
platforms/astbmc/barreleye.c
platforms/astbmc/garrison.c
platforms/astbmc/mihawk.c
platforms/astbmc/nicole.c
platforms/astbmc/p8dnu.c
platforms/astbmc/p8dtu.c
platforms/astbmc/p9dsu.c
platforms/astbmc/vesnin.c
platforms/rhesus/ec/config.h
platforms/rhesus/ec/gpio.h
platforms/rhesus/gpio.c
platforms/rhesus/rhesus.c
platforms/astbmc/talos.c
platforms/astbmc/romulus.c
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
[oliver: fixed up the drift]
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Use Software Package Data Exchange (SPDX) to indicate license for each
file that is unique to skiboot.
At the same time, ensure the (C) who and years are correct.
See https://spdx.org/
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
[oliver: Added a few missing files]
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
This is not a shipping product and is no longer supported by Linux
or other firmware components.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
We had a bunch of remaining definitions for registers that
don't actually exist in PHB4 anymore (copied from PHB3).
This removes them along with a handful of minor style cleanups
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Some PHB4 PHYs can get stuck in a bad state where they are constantly
retraining the link. This happens transparently to skiboot and Linux
but will causes PCIe to be slow. Resetting the PHB4 clears the
problem.
We can detect this case by looking at the RX errors count where we
check for link stability. This patch does this by modifying the link
optimal code to check for RX errors. If errors are occurring we
retrain the link irrespective of the chip rev or card.
Normally when this problem occurs, the RX error count is maxed out at
255. When there is no problem, the count is 0. We chose 8 as the max
rx errors value to give us some margin for a few errors. There is also
a knob that can be used to set the error threshold for when we should
retrain the link. ie
nvram -p ibm,skiboot --update-config phb-rx-err-max=8
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
We reallocate additional 16/8 DMA-Read engines allocated to stack0/1
on PEC2 respectively. This is needed to improve bandwidth available to
the Mellanox CX5 adapter when trying to read GPU memory (GPU-Direct).
If kernel cxl driver indicates a request to allocate maximum possible
DMA read engines when calling enable_capi_mode() and card is attached
to PEC2/stack0 slot then we assume its a Mellanox CX5 adapter. We then
allocate additional 16/8 extra DMA read engines to stack0 and stack1
respectively on PEC2. This is done by populating the
XPEC_PCI_PRDSTKOVR and XPEC_NEST_READ_STACK_OVERRIDE as suggested by
the h/w team.
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
By default when a PCIe device issues a read request via the PHB it is first
issued with nodal scope. When accessing GPU memory the NPU does not know at the
time of response if the requested memory page is off node or not. Therefore
every read of GPU memory by a PHB is retried with larger scope which introduces
bandwidth and latency issues.
On smaller boxes which have pump mode enabled nodal and group scoped reads are
treated the same and both types of request are broadcast to one chip. Therefore
we can avoid the retry by disabling nodal scope on the PHB for these boxes. On
larger boxes nodal (single chip) and group (multiple chip) scoped reads are
treated differently. Therefore we avoid disabling nodal scope on large boxes
which have pump mode disabled to avoid all PHB requests being broadcast to
multiple chips.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Presently in CAPI mode the number of STQ/DMA-read engines allocated on
PEC2 for CAPP is fixed to 6 and 0-30 respectively irrespective of the
PCI link width. These values are only suitable for x8 cards and
quickly run out if a x16 card is plugged to a PEC2 attached slot. This
usually manifests as CAPP reporting TLBI timeout due to these messages
getting stalled due to insufficient STQs.
To fix this we update enable_capi_mode() to check if PEC2 chiplet is
in x16 mode and if yes then we allocate 4/0-47 STQ/DMA-read engines
for the CAPP traffic.
Cc: stable # v5.7+
Fixes: 37ea3cfdc852("capi: Enable capi mode for PHB4")
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Enables reporting of slot status information, etc in the config space of
the root complex. Currently this is only used to set the slot power
limit in our generic PCI code, but we might use it for other things
later on.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
P9 supports PCI tunneled operations (atomics and as_notify) that require
setting the PHB ASN Compare/Mask register with a 16-bit indication.
This register is currently initialized by enable_capi_mode(). But, as
tunneled operations may also work in PCI mode, the ASN Compare/Mask
register should rather be initialized in phb4_init_ioda3().
This patch also adds "ibm,phb-indications" to the device tree, to tell
Linux the values of CAPI, ASN, and NBW indications, when supported.
Tunneled operations tested by IBM in CAPI mode, by Mellanox Technologies
in PCI mode.
Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This fixes these nvidia cards training at only GEN2 spends rather than
GEN3 by disabling PCIe lane equalisation.
Firstly we check if the card is in a whitelist. If it is and the link
has not trained optimally, retry with lane equalisation off. We do
this on all POWER9 chip revisions since this is a device issue, not
a POWER9 chip issue.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This patch introduces a new function phb4_dump_app_err_regs() that
dumps CAPP error registers in case the PEC nestfir register indicates
that the fence was due to a CAPP error (BIT-24).
Contents of these registers are helpful in diagnosing CAPP
issues. Registers that are dumped in phb4_dump_app_err_regs() are:
* CAPP FIR Register
* CAPP APC Master Error Report Register
* CAPP Snoop Error Report Register
* CAPP Transport Error Report Register
* CAPP TLBI Error Report Register
* CAPP Error Status and Control Register
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Christophe Lombard<clombard@linux.vnet.ibm.com>
Acked-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Freeze events such as MMIO loads can cause the PHB to lose it's
limited powerbus credits. If all credits are used and a further MMIO
will cause a checkstop.
To work around this, we escalate the troublesome freeze events to a
fence. The fence will cause a full PHB reset which resets the powerbus
credits and avoids the checkstop.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The CAPI initialization sequence has been updated in DD2.
This patch adapts to the changes, retaining compatibility with DD1.
The patch includes some changes to DD1 fix-ups as well.
Tests performed on some of the old/new hardware.
Some CAPP registers are initialized through the initfile p9.cxa.scom as
the CAPP FIR, Transport Control and Snoop control registers.
The following features will be added soon:
- CAPP recovery.
- Credit setup for Non Blocking Write + force quiesce.
- Disable CAPI mode.
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This add more details to the PCI training tracing (aka Rick Mata
mode). It enables the PCIe Link Training and Status State
Machine (LTSSM) tracing and details on speed and link width.
Output now looks like this when enabled (via nvram):
[ 1.096995141,3] PHB#0000[0:0]: TRACE:0x0000001101000000 0ms GEN1:x16:detect
[ 1.102849137,3] PHB#0000[0:0]: TRACE:0x0000102101000000 11ms presence GEN1:x16:polling
[ 1.104341838,3] PHB#0000[0:0]: TRACE:0x0000182101000000 14ms training GEN1:x16:polling
[ 1.104357444,3] PHB#0000[0:0]: TRACE:0x00001c5101000000 14ms training GEN1:x16:recovery
[ 1.104580394,3] PHB#0000[0:0]: TRACE:0x00001c5103000000 14ms training GEN3:x16:recovery
[ 1.123259359,3] PHB#0000[0:0]: TRACE:0x00001c5104000000 51ms training GEN4:x16:recovery
[ 1.141737656,3] PHB#0000[0:0]: TRACE:0x0000144104000000 87ms presence GEN4:x16:L0
[ 1.141752318,3] PHB#0000[0:0]: TRACE:0x0000154904000000 87ms trained GEN4:x16:L0
[ 1.141757964,3] PHB#0000[0:0]: TRACE: Link trained.
[ 1.096834019,3] PHB#0001[0:1]: TRACE:0x0000001101000000 0ms GEN1:x16:detect
[ 1.105578525,3] PHB#0001[0:1]: TRACE:0x0000102101000000 17ms presence GEN1:x16:polling
[ 1.112763075,3] PHB#0001[0:1]: TRACE:0x0000183101000000 31ms training GEN1:x16:config
[ 1.112778956,3] PHB#0001[0:1]: TRACE:0x00001c5081000000 31ms training GEN1:x08:recovery
[ 1.113002083,3] PHB#0001[0:1]: TRACE:0x00001c5083000000 31ms training GEN3:x08:recovery
[ 1.114833873,3] PHB#0001[0:1]: TRACE:0x0000144083000000 35ms presence GEN3:x08:L0
[ 1.114848832,3] PHB#0001[0:1]: TRACE:0x0000154883000000 35ms trained GEN3:x08:L0
[ 1.114854650,3] PHB#0001[0:1]: TRACE: Link trained.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
P9 supports PCI peer-to-peer: a PCI device can write directly to the
mmio space of another PCI device. It completely by-passes the CPU.
It requires some configuration on the PHBs involved:
1. on the initiating side, the address for the read/write operation is
in the mmio space of the target, i.e. well outside the range normally
allowed. So we disable range-checking on the TVT entry in bypass mode.
2. on the target side, we need to explicitly enable p2p by setting a
bit in a configuration register. It has the side-effect of reserving
an outbound (as seen from the CPU) store queue for p2p. Therefore we
only enable p2p on the PHBs using it, as we don't want to waste the
resource if we don't have to.
P9 supports p2p mmio writes. Reads are currently only supported if the
two devices are under the same PHB but that is expected to change in
the future, and it raises questions about intermediate switches
configuration, so we report an error for the time being.
The patch adds a new OPAL call to allow the OS to declare a p2p
(initiator, target) pair.
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Add a mode to PHB4 to trace training process closely. This activates
as soon as PERST is deasserted and produces human readable output of
the process.
This may increase training times since it duplicates some of the
training code. This code has it's own simple checks for fence and
timeout but will fall through to the default training code once done.
Output produced, looks like the "TRACE:" lines below:
[ 3.410799664,7] PHB#0001[0:1]: FRESET: Starts
[ 3.410802000,7] PHB#0001[0:1]: FRESET: Prepare for link down
[ 3.410806624,7] PHB#0001[0:1]: FRESET: Assert skipped
[ 3.410808848,7] PHB#0001[0:1]: FRESET: Deassert
[ 3.410812176,3] PHB#0001[0:1]: TRACE: 0x0000000101000000 0ms
[ 3.417170176,3] PHB#0001[0:1]: TRACE: 0x0000100101000000 12ms presence
[ 3.436289104,3] PHB#0001[0:1]: TRACE: 0x0000180101000000 49ms training
[ 3.436373312,3] PHB#0001[0:1]: TRACE: 0x00001d0811000000 49ms trained
[ 3.436420752,3] PHB#0001[0:1]: TRACE: Link trained.
[ 3.436967856,7] PHB#0001[0:1]: LINK: Start polling
[ 3.437482240,7] PHB#0001[0:1]: LINK: Electrical link detected
[ 3.437996864,7] PHB#0001[0:1]: LINK: Link is up
[ 4.438000048,7] PHB#0001[0:1]: LINK: Link is stable
Enabled via nvram using:
nvram -p ibm,skiboot --update-config pci-tracing=true
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
On PHB4 the number of index bits in the IODA table address register
was bumped to 10 bits to accomodate for 1024 MSIs and 1024 TVEs (DD2).
However our macro only defined the field to be 9 bits, thus causing
"interesting" behaviours on some systems.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Support StoreEOI, full complements of PEs (twice as big TVT)
and other updates.
Also renumber init steps to match spec 063
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Enable the Coherently attached processor interface. The PHB is used as
a CAPI interface.
CAPI Adapters can be connected to either PEC0 or PEC2. Single port
CAPI adapter can be connected to either PEC0 or PEC2, but Dual-Port
Adapter can be only connected to PEC2
CAPP0 attached to PHB0(PEC0 - single port)
CAPP1 attached to PHB3(PEC2 - single or dual port)
As we did for PHB3, a new specific file 'phb4-capp.h' is created to
contain the CAPP register definitions.
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
During a hot reset the PCI link will drop, so we need to mask link down
events to prevent unnecessary errors.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The root complex config space size on PHB4 is 2048. This patch sets
that size and enforces it when trying to read/write the config space
in the root complex.
Without this someone reading the config space via /sysfs in linux will
cause an EEH on the PHB.
If too high, reads returns 1s and writes are silently dropped.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Add some missing register definitions, delete some duplicates and put things
in order.
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Neuling <mikey@neuling.org>
|
|
Fix some of the bit definitions.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This includes some DD2.0 support
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This adds the base support for the PHB4. It currently only support
the M32 window, EEH or in general error recovery aren't supported
yet.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[stewart@linux.vnet.ibm.com: update (C) year, fix indenting]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|