Age | Commit message | Author | Files | Lines |
|
[ Upstream commit 9597a12ef4b3644e4b8644f659bec04ca139b7f9 ]
Some PHB4 PHYs can get stuck in a bad state where they are constantly
retraining the link. This happens transparently to skiboot and Linux
but will cause PCIe to be slow. Resetting the PHB4 clears the
problem.
We can detect this case by looking at the RX errors count where we
check for link stability. This patch does this by modifying the link
optimal code to check for RX errors. If errors are occurring we
retrain the link irrespective of the chip rev or card.
Normally when this problem occurs, the RX error count is maxed out at
255. When there is no problem, the count is 0. We chose 8 as the max
rx errors value to give us some margin for a few errors. There is also
a knob that can be used to set the error threshold for when we should
retrain the link, e.g.:
nvram -p ibm,skiboot --update-config phb-rx-err-max=8
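
The threshold check described above can be sketched as follows. This is
only an illustration, not the skiboot implementation: the names
RX_ERR_MAX_DEFAULT and link_needs_retrain() are hypothetical, and it
assumes that a count strictly above the threshold triggers a retrain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name; the value 8 is the default quoted in the commit message. */
#define RX_ERR_MAX_DEFAULT 8

/*
 * The commit message says the count saturates at 255, so it is modelled
 * here as an 8-bit counter. Assumption: only counts strictly greater than
 * the threshold are treated as an unstable link.
 */
static bool link_needs_retrain(uint8_t rx_err_count, unsigned int rx_err_max)
{
	/* A threshold above 255 could never be exceeded by an 8-bit counter. */
	assert(rx_err_max <= UINT8_MAX);
	return rx_err_count > rx_err_max;
}
```

With the default threshold, a saturated counter (255) would request a
retrain, while a clean link (0) or a handful of errors would not.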
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
We've been carting around this field since the original p7ioc-phb code.
As far as I can tell we never actually use it for anything other than
checking if the PHB has been marked as broken or not. The _FENCED
state is set in a few places, but we never use it, preferring to just
check the MMIO register.
This patch just replaces it with a boolean that indicates if
the PHB has been marked as broken and removes the giant, mostly
wrong, comment explaining its usage that is copied and pasted
into each phb header file.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This fixes these NVIDIA cards training at only GEN2 speeds rather than
GEN3 by disabling PCIe lane equalisation.
Firstly we check if the card is in a whitelist. If it is and the link
has not trained optimally, retry with lane equalisation off. We do
this on all POWER9 chip revisions since this is a device issue, not
a POWER9 chip issue.
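
A minimal sketch of the decision described above, under stated
assumptions: the struct, the whitelist contents and the function names
are hypothetical. 0x10de is the NVIDIA PCI vendor ID, but the device ID
below is a placeholder and not one of the cards this patch covers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct pci_dev_id {
	uint16_t vendor;
	uint16_t device;
};

/* Hypothetical whitelist; the device ID is a placeholder for illustration. */
static const struct pci_dev_id eq_off_whitelist[] = {
	{ 0x10de, 0xffff },
};

static bool in_eq_off_whitelist(uint16_t vendor, uint16_t device)
{
	size_t i;

	for (i = 0; i < sizeof(eq_off_whitelist) / sizeof(eq_off_whitelist[0]); i++) {
		if (eq_off_whitelist[i].vendor == vendor &&
		    eq_off_whitelist[i].device == device)
			return true;
	}
	return false;
}

/*
 * Retry with lane equalisation off only when the card is whitelisted AND
 * the link did not train optimally. The chip revision is deliberately not
 * an input, since the message treats this as a device issue.
 */
static bool retry_with_eq_off(uint16_t vendor, uint16_t device, bool link_optimal)
{
	return in_eq_off_whitelist(vendor, device) && !link_optimal;
}
```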
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Make link retries a #define rather than open coding it in the PHB4
init code.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This code was never used (since retries is set to 0), it's not very
useful and it makes the code harder to read. So let's just remove it.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
In this patch we tuned our link timing to be more aggressive:
cf960e2884 phb4: Improve reset and link training timing
Cards should take only 32ms but unfortunately we've seen some take
up to 440ms. Hence bump our timer up to 1000ms.
This can hurt boot times on systems where slots indicate a hotplug
status but no electrical link is present (which we've seen). Since we
have to wait 1 second between PERST and touching config space anyway,
it shouldn't hurt too much.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This improves PHB reset and link training timing. Justifications and
reasons are included in the patch.
Polling intervals are decreased from 100ms to 10ms.
Added is a new state called PHB4_SLOT_LINK_STABLE which is now needed
since the link training can be so fast that we touch config space too
quickly (PCIe spec requires 1 second between PERST de-assert and
device config space reads). We use this new state to sanity check the
PHB and link before moving onto the PCI bus scan, where we no longer
recover from these error conditions.
Also added is simplified documentation of the PHB reset and training flow.
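
The timing constraint behind the new link-stable state can be sketched
as below. This is an illustration only: the macro and function names are
hypothetical, and the 1000ms and 10ms figures are the ones quoted in this
message rather than values read from the source.

```c
#include <assert.h>
#include <stdint.h>

/* Figures quoted in the commit message; the names are hypothetical. */
#define PERST_TO_CONFIG_MS	1000	/* wait between PERST de-assert and config reads */
#define POLL_INTERVAL_MS	10	/* new polling interval */

static_assert(PERST_TO_CONFIG_MS % POLL_INTERVAL_MS == 0,
	      "total wait should be a whole number of polls");

/* Time still to wait before config space may be touched. */
static uint32_t config_wait_remaining_ms(uint32_t ms_since_perst_deassert)
{
	if (ms_since_perst_deassert >= PERST_TO_CONFIG_MS)
		return 0;
	return PERST_TO_CONFIG_MS - ms_since_perst_deassert;
}

/* Number of polls needed to cover the remaining wait, rounded up. */
static uint32_t polls_remaining(uint32_t ms_since_perst_deassert)
{
	uint32_t rem = config_wait_remaining_ms(ms_since_perst_deassert);

	return (rem + POLL_INTERVAL_MS - 1) / POLL_INTERVAL_MS;
}
```

For a link that trains 32ms after PERST de-assert, this leaves 968ms, or
97 polls, during which the PHB and link can be sanity checked before the
bus scan.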
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Support StoreEOI, full complements of PEs (twice as big TVT)
and other updates.
Also renumber init steps to match spec 063
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This implements complete reset (creset) functionality for POWER9 DD1.
Only partially tested and contends with some DD1 errata, but it's a start.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Witherspoon systems come with a 'shared' PCI slot: physically, it
looks like a x16 slot, but it's actually two x8 slots connected to two
PHBs of two different chips. Taking advantage of it requires some
logic on the PCI adapter. Only the Mellanox CX5 adapter is known to
support it at the time of this writing.
This patch enables support for the shared slot on witherspoon if a x16
adapter is detected. Each x8 slot has a presence bit, so both bits
need to be set for the activation to take place. Slot sharing is
activated through a gpio.
Note that there's no easy way to be sure that the card is indeed a
shared-slot compatible PCI adapter and not a normal x16 card. Plugging
a normal x16 adapter on the shared slot should be avoided on
witherspoon, as the link won't train on the second slot, resulting in
a timeout and a longer boot time. Only the first slot is usable and
the x16 adapter will end up using only half the lanes.
If the PCI card plugged on the physical slot is only x8 (or less),
then the presence bit of the second slot is not set, so this patch
does nothing. The x8 (or less) adapter should work like on any other
physical slot.
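
The activation condition above can be sketched as a check on the two
presence bits. The bit positions and names are hypothetical and chosen
only for illustration; the real platform code reads them from hardware
and drives a gpio, which is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout for the presence bits of the two x8 slots. */
#define PRESENCE_SLOT0	0x1u
#define PRESENCE_SLOT1	0x2u
#define PRESENCE_MASK	(PRESENCE_SLOT0 | PRESENCE_SLOT1)

/*
 * Slot sharing is activated only when both x8 presence bits are set,
 * i.e. when a x16 adapter is detected. A x8 (or narrower) card sets only
 * one bit, so nothing is activated.
 */
static bool shared_slot_should_activate(uint8_t presence)
{
	/* Only the two presence bits are meaningful in this sketch. */
	assert((presence & ~PRESENCE_MASK) == 0);
	return (presence & PRESENCE_MASK) == PRESENCE_MASK;
}
```

Note that, as the message says, this check cannot tell a shared-slot
capable adapter from a normal x16 card.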
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
[stewart@linux.vnet.ibm.com: re-org code, move into platform file]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Currently we print "PHB4" and mean either "PHB version 4" or "PHB
number 4" which can be quite confusing.
This makes it clearer when it's one or the other.
Also fixes some cut and paste errors in comments from PHB3.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This includes some DD2.0 support
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The PCI slot pfreset() operation is obsolete as nobody uses it. This
removes it and the related PCI slot states. No functional changes
introduced.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Various backends define their own numbered PCI slot states [A] for
flexibility. The PCI core also defines its own PCI slot states [B].
For any given PCI slot state, the major numbers of [A] and [B]
should be the same so that the corresponding operation can be found.
This means [A] and [B] are related to some extent, but the code
that defines the PCI slot states in the backends doesn't reflect it.
This makes the major PCI slot state defined in each backend the same
as the corresponding one defined in the PCI core. The minor PCI slot
states are made incremental from their base number (major
PCI slot state). No functional changes introduced.
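
The numbering scheme can be sketched as follows. All macro names and
values here are hypothetical and are not taken from the skiboot headers;
the point is only that backend minor states are expressed as offsets
from the core major state, so masking off the minor part always yields
the core state.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical core definitions: the low byte holds the minor state. */
#define SLOT_STATE_MAJOR_MASK	0xffffff00u
#define CORE_SLOT_STATE_LINK	0x00000300u

/* Hypothetical backend states, defined relative to the core major state. */
#define BACKEND_SLOT_LINK	CORE_SLOT_STATE_LINK
#define BACKEND_SLOT_LINK_START	(BACKEND_SLOT_LINK + 1)
#define BACKEND_SLOT_LINK_WAIT	(BACKEND_SLOT_LINK + 2)

/* The relationship between [A] and [B] is now checked at build time. */
static_assert((BACKEND_SLOT_LINK_WAIT & SLOT_STATE_MAJOR_MASK) == CORE_SLOT_STATE_LINK,
	      "backend minor states must stay within their core major state");

/* Recover the major state used to look up the corresponding operation. */
static uint32_t slot_state_major(uint32_t state)
{
	return state & SLOT_STATE_MAJOR_MASK;
}
```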
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This adds the base support for the PHB4. It currently only supports
the M32 window; EEH and error recovery in general aren't supported
yet.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[stewart@linux.vnet.ibm.com: update (C) year, fix indenting]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|