Age | Commit message (Collapse) | Author | Files | Lines |
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Add a mode to PHB4 to trace training process closely. This activates
as soon as PERST is deasserted and produces human readable output of
the process.
This may increase training times since it duplicates some of the
training code. This code has its own simple checks for fence and
timeout but will fall through to the default training code once done.
The output produced looks like the "TRACE:" lines below:
[ 3.410799664,7] PHB#0001[0:1]: FRESET: Starts
[ 3.410802000,7] PHB#0001[0:1]: FRESET: Prepare for link down
[ 3.410806624,7] PHB#0001[0:1]: FRESET: Assert skipped
[ 3.410808848,7] PHB#0001[0:1]: FRESET: Deassert
[ 3.410812176,3] PHB#0001[0:1]: TRACE: 0x0000000101000000 0ms
[ 3.417170176,3] PHB#0001[0:1]: TRACE: 0x0000100101000000 12ms presence
[ 3.436289104,3] PHB#0001[0:1]: TRACE: 0x0000180101000000 49ms training
[ 3.436373312,3] PHB#0001[0:1]: TRACE: 0x00001d0811000000 49ms trained
[ 3.436420752,3] PHB#0001[0:1]: TRACE: Link trained.
[ 3.436967856,7] PHB#0001[0:1]: LINK: Start polling
[ 3.437482240,7] PHB#0001[0:1]: LINK: Electrical link detected
[ 3.437996864,7] PHB#0001[0:1]: LINK: Link is up
[ 4.438000048,7] PHB#0001[0:1]: LINK: Link is stable
Enabled via nvram using:
nvram -p ibm,skiboot --update-config pci-tracing=true
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This improves PHB reset and link training timing. Justifications and
reasons are included in the patch.
Polling intervals are decreased from 100ms to 10ms.
Added is a new state called PHB4_SLOT_LINK_STABLE which is now needed
since the link training can be so fast that we touch config space too
quickly (PCIe spec requires 1 second between PERST de-assert and
device config space reads). We use this new state to sanity check the
PHB and link before moving onto the PCI bus scan, where we no longer
recover from these error conditions.
Also added is simplified documentation of the PHB reset and training flow.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
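The one second rule described above can be sketched as a small helper. This is only an illustration under stated assumptions, not the skiboot code: the function name, the constant name and the use of millisecond timestamps are all hypothetical, and the caller is assumed to pass a "now" that is not earlier than the PERST de-assert time.

```c
#include <stdint.h>

/* Assumed constant: minimum gap between PERST de-assert and the first
 * config space read, taken from the 1 second figure quoted above. */
#define LINK_STABLE_MS 1000

/* Hypothetical helper: how many more milliseconds to wait before config
 * space may be touched, given when PERST was de-asserted. */
static uint64_t link_stable_remaining_ms(uint64_t perst_deassert_ms,
					 uint64_t now_ms)
{
	uint64_t elapsed = now_ms - perst_deassert_ms;

	if (elapsed >= LINK_STABLE_MS)
		return 0;
	return LINK_STABLE_MS - elapsed;
}
```

A state machine could return this value as its next wait time, so that a link which trains quickly still waits out the rest of the second.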
|
|
This adds a function phb4_check_reg() to sanity check when we do MMIO
reads from the PHB to make sure it's not fenced.
This also adds some uses of this function in common locations where
these may occur on PHB reset and link training.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
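A minimal sketch of this kind of check, assuming that a fenced PHB returns all ones on MMIO reads (an assumption, not stated in the message). The struct and function names are hypothetical stand-ins, not the real phb4 types.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the PHB state; only the fence flag matters. */
struct phb_sketch {
	bool fenced;
};

/* Returns true if the value read can be trusted. An all-ones value is
 * only suspicious, so it is rejected only when the PHB is also fenced. */
static bool check_reg_sketch(const struct phb_sketch *p, uint64_t val)
{
	if (val == UINT64_MAX)
		return !p->fenced;
	return true;
}
```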
|
|
Currently we retry if we don't detect an electrical link. This is
pointless as all devices should respond in the given time.
This patch removes this retry and just returns OPAL_HARDWARE if we
don't detect an electrical link.
This has the additional benefit of improving boot times on machines
that have badly wired presence detect (i.e. it says a device is present
when there isn't one).
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
phb4_retry_state() returns a good error code, so just use that rather
than complicating the caller.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Currently we assume on boot that PERST is asserted so that we can skip
having to assert it ourselves.
This instead reads the PERST status and determines if we need to
assert it based on that.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Byte swap TLP headers so they are the same as the PCIe spec.
Also remove redundant print.
Suggested-by: Rob Lippert <rlippert@google.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
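For illustration, a byte swap over a logged TLP header might look like the sketch below. Treating the header as an array of 32-bit words is an assumption, and both function names are hypothetical.

```c
#include <stdint.h>

/* Portable 32-bit byte swap. */
static uint32_t bswap32_sketch(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) |
	       ((v & 0x0000ff00u) << 8)  |
	       ((v & 0x00ff0000u) >> 8)  |
	       ((v & 0xff000000u) >> 24);
}

/* Swap every 32-bit word of a TLP header in place (assumed layout). */
static void tlp_header_swap(uint32_t *hdr, unsigned int nwords)
{
	for (unsigned int i = 0; i < nwords; i++)
		hdr[i] = bswap32_sketch(hdr[i]);
}
```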
|
|
If the link doesn't have an electrical link or the link doesn't train
we should make that more obvious to the user.
This boosts these prints to error level.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Better log why the slot didn't work and make it a PR_ERR so users
see it by default.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Force verbose EEH. This is heavy handed and we should turn it off
later as things stabilise, but is useful for now.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Mostly errata workarounds, some DD1 specific.
The step Init_5 was moved to Init_16, so the numbering was updated to
reflect this.
(mikey: added section on ignoring errata ER20161123)
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
We should be checking the array version, not the HDIF header version.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Regions with the no-map property should be handled separately to
"normal" firmware reservations. When creating mem_region regions
from a reserved-memory DT node use the no-map property to select
the right reservation type.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Missed a few.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Currently we just add these to a list of pre-boot reserved regions
which is then converted into the contents of the /reserved-memory/
node just before Skiboot jumps into the firmware kernel.
This approach is insufficient because we need to add the ibm,prd-instance
labels to the various hostboot reserved regions. To do this we want to
create these reservation nodes inside the HDAT parser rather than having
the mem_region flattening code handle it. On P8 systems Hostboot placed
its memory reservations under the /ibm,hostboot/ node and this patch
makes the HDAT parser do the same.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Add a lock so that only one thread can print a backtrace at a time.
This should prevent multiple threads from garbling each other's
backtraces.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
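The idea can be sketched with a C11 atomic flag used as a simple spinlock. This is a stand-in technique chosen to keep the example self-contained; skiboot has its own lock primitives, and the function name and output format here are hypothetical.

```c
#include <stdatomic.h>
#include <stdio.h>

/* One global lock shared by every thread that wants to print. */
static atomic_flag bt_lock = ATOMIC_FLAG_INIT;

/* Print all frames of one backtrace while holding the lock, so that
 * lines from different threads cannot interleave. */
static void backtrace_print_locked(const char *const *frames, int count)
{
	while (atomic_flag_test_and_set_explicit(&bt_lock,
						 memory_order_acquire))
		; /* spin until the current printer is done */

	for (int i = 0; i < count; i++)
		fprintf(stderr, " S%d: %s\n", i, frames[i]);

	atomic_flag_clear_explicit(&bt_lock, memory_order_release);
}
```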
|
|
We use irq for reading input from console, but not in output path.
Hence do not enable input irq in write path.
Fixes: 583c8203 (fsp/console: Allocate irq for each hvc console)
CC: Sam Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Acked-By: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Fast-reboot has a memory leak which causes the system to crash after about
250 fast-reboots. The patch fixes the memory leak.
The cause of the leak was the pci_device's being freed, without freeing
the pci_slot within it.
Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
At the moment, we mark them both as being able to fail, as we're
hitting an assert in one of the unit tests on debian stretch, and
that hasn't yet been chased down.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
OpenBMC stack added IPMI OEM extension to log eSEL events.
Let's enable eSEL logging from OPAL side.
See: https://github.com/openbmc/openpower-host-ipmi-oem/blob/d9296050bcece5c2eca5ede0932d944b0ced66c9/oemhandler.cpp#L142
(yes, that is the documentation)
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
[stewart@linux.vnet.ibm.com: remove pnor access request, add link to OpenBMC doc]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Fixes build warnings when running with higher optimization than -O0
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
We end up with a bit of a nasty hack to count the libflash symlinks
in gard and pflash as part of libflash code coverage, but it does
work and is unlikely to break anytime soon.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This enables us to do coverage reports on gard/pflash.
Reviewed-by: Cyril Bur <cyril.bur@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Reviewed-by: Cyril Bur <cyril.bur@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Mostly unused parameter warnings due to callbacks
Reviewed-by: Cyril Bur <cyril.bur@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This enables some extra linked list checking
Reviewed-by: Cyril Bur <cyril.bur@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
mostly missing prototypes and unused parameters.
Reviewed-by: Cyril Bur <cyril.bur@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This commit extends pflash with an option to retrieve and print
information for a particular partition, including the content from
"pflash -i" and a verbose list of set miscellaneous flags. -i option
is also updated to print a short list of flags in addition to the
ECC flag, with one character per flag. A test of the new option is
included in libflash/test.
Signed-off-by: Michael Tritz <mtritz@us.ibm.com>
Reviewed-by: Cyril Bur <cyril.bur@au1.ibm.com>
[stewart@linux.vnet.ibm.com: various test fixes, enable gcov]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Fixes: 7801be0fcf2a2 ('skiboot: Add opal calls to init/start/stop IMC devices')
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Add IPMI sensor data under /bmc node.
CC: Joel Stanley <joel@jms.id.au>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Tested-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
..so that it can be used in other places as well.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Today we have an issue where the NUMA nodes corresponding
to GPUs have the same affinity/distance as normal memory
nodes. Our reference-points today support two levels
[0x4, 0x4] for normal systems and [0x4, 0x3] for Power8E
systems. This patch adds a new level [0x4, X, 0x2] and
uses node-id at all levels for the GPU.
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: Alistair Popple <alistair@popple.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Acked-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Currently we only retry once when we have a link training failure.
This changes this to be 3 retries as 1 retry is not giving us enough
reliability.
This will increase the boot time, especially on systems where we
incorrectly detect a link presence when there really is nothing
present. I'll post a followup patch to optimise our timings to help
mitigate this later.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
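A bounded retry budget of this kind can be sketched as follows. Only the count of 3 comes from the message above; the struct, field and function names are hypothetical.

```c
#include <stdbool.h>

#define LINK_RETRIES 3	/* retry count quoted in the message above */

/* Hypothetical per-slot retry state. */
struct slot_retry_sketch {
	int retries_left;
};

/* Returns true if another training attempt should be made, or false
 * once the retry budget has been used up. */
static bool link_retry_sketch(struct slot_retry_sketch *s)
{
	if (s->retries_left <= 0)
		return false;
	s->retries_left--;
	return true;
}
```

With the budget initialised to LINK_RETRIES, a link that never trains is attempted once and then retried three more times before the caller gives up.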
|
|
This reworks the pci link training retry code so that we can do more
than one retry.
This will now also print an error if a link fails to train.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
For PHB4 it's possible that the phy may end up in a bad state where it
can no longer receive data. This can manifest as the link not
retraining. A simple PERST will not clear this. The PHB must be
completely reset.
This changes the retry state to CRESET to do this.
This issue may also manifest itself as the link training in a degraded
state (lower speed or narrower width). This patch doesn't attempt to
fix that (will come later).
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Currently we recursively call run_sm() in phb4_retry_state(). This is
unnecessary and overly complex.
This just returns with a small wait time. 1ms should be a very small
overhead compared to having to do the actual retry.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
PCI link training is responsible for a huge chunk of the skiboot boot
time, so add the ability to trace it waiting in the main state
machine.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Currently during boot there is a long delay while we wait for the PHBs to
be reset and train. During this time, there is no output from skiboot
and the last message doesn't give an indication of what's happening.
This boosts the PHB reset message from info to notice so users can see
what's happening during this long period of waiting.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The MPIPL procedure says to only set bit 26 when forcing the PEC into
freeze mode. Currently we set bits 24-27.
This changes the code to follow spec and only set bit 26.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
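To make the difference concrete, the two masks can be written out as below. This assumes the bit numbers use IBM numbering (bit 0 is the most significant bit of a 64-bit register). The macros are defined locally for the example, and the two constant names are hypothetical.

```c
#include <stdint.h>

/* IBM bit numbering: bit 0 is the MSB of a 64-bit value (assumed). */
#define PPC_BIT(bit)		(0x8000000000000000ULL >> (bit))
#define PPC_BITMASK(bs, be)	((PPC_BIT(bs) - PPC_BIT(be)) | PPC_BIT(bs))

/* Old behaviour described above: bits 24-27 all set. */
static const uint64_t freeze_old = PPC_BITMASK(24, 27);

/* New behaviour: only bit 26 set. */
static const uint64_t freeze_new = PPC_BIT(26);
```

Under that numbering freeze_old is 0x000000F000000000 and freeze_new is 0x0000002000000000, so the change drops three of the four bits that were previously written.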
|
|
According to the workbook, pfir must be cleared before the nfir.
The way we have it now causes the nfir to not clear properly in some
error circumstances.
This swaps the order to match the workbook.
Also updates the comments to be clearer.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
When waiting in PHB4_SLOT_CRESET_WAIT_CQ for transactions to end, we
incorrectly move onto the next state. Generally we don't hit this as
the transactions have ended already anyway.
This removes the incorrect state transition.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Set default lane equalisation if there is nothing in the device-tree.
Default value taken from hdat and confirmed by hardware team. Neatens
the code up a bit too.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The lane-eq data we get from hdat is all 7s but what we end up with in the
device tree is:
xscom@603fc00000000/pbcq@4010c00/stack@0/ibm,lane-eq
00000000 31c339e0 00000000 0000000c
00000000 00000000 00000000 00000000
00000000 31c30000 77777777 77777777
77777777 77777777 77777777 77777777
This fixes grabbing the properties from hdat and fixes the call to put
them in the device tree.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
We had a few problems:
- We used the wrong register to trigger the reset (spec bug)
- We should clear the PFIR and NFIR while the reset is asserted
- ... and in the right order !
- We should only apply the DD1 workaround after the reset has
been lifted.
- We should ensure we use ASB whenever we are fenced or doing a
CRESET
- Make config ops write with ASB
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This replaces use of MMIO registers with the new accessors
in places that can be called during recovery procedures at
times when the PHB can be fenced.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Those will pick between ASB (ie, XSCOM) accesses and direct MMIO
based on PHB flags, thus allowing transparent access whether the
PHB is fenced or not.
Mark as unused for now so we don't get a warning.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
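A rough sketch of an accessor that picks its access path from a flag. Everything here is hypothetical (struct, flag name, function pointers and the fake backends); it only illustrates the selection logic, not the real phb4 register accessors.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical PHB handle with two ways of reaching a register. */
struct phb_acc_sketch {
	bool use_asb;	/* assumed flag: set while fenced or in CRESET */
	uint64_t (*mmio_read)(uint32_t offset);
	uint64_t (*asb_read)(uint32_t offset);
};

/* Callers use one function; the flag decides which path is taken. */
static uint64_t phb_read_sketch(const struct phb_acc_sketch *p,
				uint32_t offset)
{
	if (p->use_asb)
		return p->asb_read(offset);
	return p->mmio_read(offset);
}

/* Fake backends so the selection can be observed in a test. */
static uint64_t fake_mmio_read(uint32_t offset)
{
	return 0x1000u + offset;
}

static uint64_t fake_asb_read(uint32_t offset)
{
	return 0x2000u + offset;
}
```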
|