|
It appears that our reset code wasn't entirely correct, and what we're
meant to do is reset each port and wait for command complete. In the
event where that fails, we can then bitbang things to recover to a state
where at least the i2c engine isn't in a weird state.
Practically, this means that "i2cdetect -y 10; i2cdetect -y 10" (where 10
is the bus where a TPM is attached, typically p8e1p2) doesn't hard lock
the machine (things are still bad and you won't reboot successfully, but
it's *better*).
One downside to this patch is that we spend a *long* time in OPAL (tens
of ms) when doing the reset. This is something that we really need to fix,
as it's not at all nice. The full fix for this though will involve changing
a decent chunk of the p8-i2c code, as we don't want to write *any* registers
while doing this extended reset (while existing code checks status a bit
later).
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Useful for debugging
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Otherwise we'd default to 2 seconds (TIMER_POLL) during boot on
chips with a functional i2c interrupt, leading to slow i2c
during boot (or hitting timeouts instead).
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
We add some routines that let a caller get the xscom lock once and
then do a bunch of xscoms while holding it.
In some situations without this, it could take long enough to get
the xscom lock that the 1ms timeout would expire and we'd falsely
think the SLW timer didn't work when in fact it did.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
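A minimal sketch of the lock-once pattern described above. Every name here (demo_bus, demo_bus_lock(), demo_read_locked() and so on) is hypothetical and only models the idea; it is not the xscom API, and the "lock" is a plain flag plus a counter so the example stays self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the xscom lock; counts how often it is taken. */
struct demo_bus {
	bool locked;
	unsigned int acquisitions;
	uint64_t regs[4];
};

static void demo_bus_lock(struct demo_bus *bus)
{
	assert(!bus->locked);
	bus->locked = true;
	bus->acquisitions++;
}

static void demo_bus_unlock(struct demo_bus *bus)
{
	assert(bus->locked);
	bus->locked = false;
}

/* "Locked" accessors: the caller must already hold the lock. */
static uint64_t demo_read_locked(struct demo_bus *bus, unsigned int reg)
{
	assert(bus->locked && reg < 4);
	return bus->regs[reg];
}

static void demo_write_locked(struct demo_bus *bus, unsigned int reg,
			      uint64_t val)
{
	assert(bus->locked && reg < 4);
	bus->regs[reg] = val;
}

/* Time-critical sequence: one lock acquisition covers every access. */
static uint64_t demo_arm_and_check(struct demo_bus *bus, uint64_t val)
{
	uint64_t readback;

	demo_bus_lock(bus);
	demo_write_locked(bus, 0, val);
	readback = demo_read_locked(bus, 0);
	demo_bus_unlock(bus);
	return readback;
}
```

Because the lock is taken once up front, the time spent waiting for it is paid before the timed sequence starts rather than between individual accesses.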
|
|
Doing everything asynchronously is brilliant, it's exactly what we
want to do.
Except... the tpm driver wants to do things synchronously, which isn't
so cool.
For reasons that are not yet completely known, we spend an awful lot of
time in the main thread *not* running pollers (potentially seconds), which
doesn't bode well for I2C timeouts.
Since the TPM measure is done in a secondary thread, we do *not* run pollers
there either (as of 323c8aeb54bd4e0b9004091fcbb4a9daeda2f576 - which is
roughly as of skiboot 2.1.1).
But we still need to crank the i2c state machine, so we introduce a call
to do just that. It will return how long the poll interval should be, so
that we can time_wait() for a more appropriate time for whatever i2c
implementation is sitting behind things.
Without this, it was "easy" to get to a situation where the i2c state machine
wasn't cranked at all, and you'd hit the i2c timeout (for the issued operation)
before the poller to crank i2c was ever called.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Tested-by: Claudio Carvalho <cclaudio@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
If we were to terminate in a poller, we'd call op_display() which
called pollers which hit the recursive poller warning, which ended
in not much fun at all.
This patch will skip the running of pollers and instead run
the FSP poller to set the op-panel display before attn.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
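The recursion problem can be modelled with a small re-entrancy guard. The sketch below is purely illustrative: the demo_* functions and counters are hypothetical, and the "direct" display update merely stands in for running only the FSP poller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool demo_in_poller;
static unsigned int demo_recursion_warnings;
static unsigned int demo_display_updates;

/* Run one poller, refusing (and counting a warning) on recursive entry. */
static void demo_run_pollers(void (*poller)(void))
{
	assert(poller != NULL);
	if (demo_in_poller) {
		demo_recursion_warnings++;
		return;
	}
	demo_in_poller = true;
	poller();
	demo_in_poller = false;
}

static void demo_noop_poller(void)
{
}

/* Old behaviour: updating the display runs the generic pollers. */
static void demo_display_via_pollers(void)
{
	demo_run_pollers(demo_noop_poller);
	demo_display_updates++;
}

/* New behaviour: update the display without the generic pollers. */
static void demo_display_direct(void)
{
	demo_display_updates++;
}

/* Pollers that hit a fatal error and need to update the display. */
static void demo_failing_poller_old(void)
{
	demo_display_via_pollers();
}

static void demo_failing_poller_new(void)
{
	demo_display_direct();
}
```

Running demo_failing_poller_old through demo_run_pollers() trips the guard once, while demo_failing_poller_new does not.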
|
|
Currently we print "PHB4" and mean either "PHB version 4" or "PHB
number 4" which can be quite confusing.
This makes it clearer when it's one or the other.
Also fixes some cut and paste errors in comments from PHB3.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Fix enabling config space on DD1.
Without this PCI devices disappear on kexec.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Fix some of the bit definitions.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Make sure we set consistent values between Init_4 and Init_14 and
set the default to Gen4 not Gen3
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This includes some DD2.0 support
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The HW checks that the top 2 bits aren't both clear to differentiate
an unallocated entry from a valid one. So we need to put some value
there.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This will be reworked when we support EQ and VP allocation, for now
remove the unused field
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Some devices such as NX or the NPU will use some of the XIVE
provided IPIs for their own interrupts. Thus we need a way for
those to provide a custom irq_source_ops for portions of the IPI
space in order for them to provide their own attributes() and
if needed, interrupts() callbacks.
We achieve that by creating a second list of sources which can
overlap the primary.
The global stock of IPIs is registered by XIVE in the secondary
list which is searched when no match is found in the primary.
A new API xive_register_ipi_source() is provided for those devices
to create an overlapping source structure in the primary list for
a subset of the IPIs. Those IPIs must have been previously allocated
using xive_alloc_ipi_irqs().
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
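The two-list lookup can be sketched like this. It is a simplified model, not the actual source registration code: struct demo_source and the demo_find_* helpers are hypothetical, and ranges are half-open [start, end).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interrupt source covering the range [start, end). */
struct demo_source {
	uint32_t start;
	uint32_t end;
	struct demo_source *next;
};

static struct demo_source *demo_find_in(struct demo_source *list, uint32_t isn)
{
	for (struct demo_source *s = list; s; s = s->next) {
		assert(s->start < s->end);
		if (isn >= s->start && isn < s->end)
			return s;
	}
	return NULL;
}

/*
 * Search the primary list first; fall back to the secondary list (the
 * global stock of IPIs) only when nothing in the primary list matches.
 */
static struct demo_source *demo_find_source(struct demo_source *primary,
					    struct demo_source *secondary,
					    uint32_t isn)
{
	struct demo_source *s = demo_find_in(primary, isn);

	return s ? s : demo_find_in(secondary, isn);
}
```

A device that registers an overlapping range in the primary list therefore takes precedence for its subset, while every other IPI still resolves to the global source.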
|
|
To be used by such things as VAS
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
We only want to directly EOI the interrupt used to emulate the MFRR,
for all the other "IPI" (aka XIVE produced interrupts), we want to
go via the normal source mechanism.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
It will just generate spurious powerbus traffic and ESB state
changes.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Properly documenting assumptions and behaviour related to
interrupts occurring while masked. This reflects the documentation
update.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
In multi-chip environments, the XIVEs need to communicate to
each other via these ports, so they need to be configured
properly
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The OPAL API uses mangled server numbers with the link in the
bottom 2 bits like a real XICS does, we need to account for it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
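As a rough illustration of the mangling, assuming only what the message states (the link occupies the bottom 2 bits and the server number sits above it), the conversion could look like the following. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a server number and a 2-bit link into the mangled form. */
static uint32_t demo_mangle_server(uint32_t server, uint32_t link)
{
	assert(link < 4);
	return (server << 2) | link;
}

/* Recover the plain server number by dropping the link bits. */
static uint32_t demo_unmangle_server(uint32_t mangled)
{
	return mangled >> 2;
}

/* Extract the link from the bottom 2 bits. */
static uint32_t demo_mangled_link(uint32_t mangled)
{
	return mangled & 0x3;
}
```

For example, demo_mangle_server(5, 1) yields 21, and unmangling 21 gives back server 5 and link 1.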
|
|
The comment and implementation didn't match, we were putting the
block_id in the part of the field reserved for the CPPR.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
We need to inject the chip id in the MMIO address
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Some sim models have the LPC interrupts stuck asserted on secondary
chips so we add a device-tree option that makes us set the policy
for these to "Linux" instead of "OPAL".
Since they aren't referenced in the device-tree this will de-facto
prevent them from being enabled
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Fixes: 5611389876a748e19b7593d4eb426ced7a6ed31f
Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
An out of tree platform (p8dtu) uses a different IPMI OEM command
for IPMI_PARTIAL_ADD_ESEL. This exposed some assumptions about the BMC
implementation in our core code.
Now, with platform.bmc, each platform can dictate (or detect) the BMC
that is present. We allow it to be set at runtime rather than purely
statically in struct platform as it's possible to have differing BMC
implementations on the one machine (e.g. AMI BMC or OpenBMC).
Acked-by: Jeremy Kerr <jk@ozlabs.org>
[stewart@linux.vnet.ibm.com: remove enum, update (C) years]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Currently Hostboot populates the /bmc/sensors device-tree node and the
corresponding sensors only on BMC platforms. On FSP platforms, Hostboot does
not populate any FSP sensors (management sensors), and no firmware progress
sensor exists. As a result, OPAL incorrectly sets the firmware status
on sensor id "00", which does not exist.
On a FSP system:
cat /sys/firmware/opal/msglog | grep -i setting
[ 21.189204883,6] IPMI: setting fw progress sensor 00 to 07
[ 21.189559121,6] IPMI: setting fw progress sensor 00 to 13
cat /sys/firmware/opal/msglog | grep -i skiboot
[ 84.127416495,5] SkiBoot skiboot-5.4.0-rc3 starting...
On a BMC system:
cat /sys/firmware/opal/msglog | grep -i setting
[ 3.166286901,6] IPMI: setting fw progress sensor 05 to 14
[ 14.259153338,6] IPMI: setting fw progress sensor 05 to 07
[ 14.469070593,5] IPMI: Resetting boot count on successful boot
[ 15.001210324,6] IPMI: setting fw progress sensor 05 to 13
This patch fixes the incorrect setting on FSP systems: the sensor is now set
only if OPAL has initialised the IPMI sensors and a sensor of the given
sensor type exists in the device tree.
After patch:
On a FSP system:
cat /sys/firmware/opal/msglog | grep -i setting
On a BMC system:
cat /sys/firmware/opal/msglog | grep -i setting
[ 3.164859816,6] IPMI: setting fw progress sensor 05 to 14
[ 14.024941077,6] IPMI: setting fw progress sensor 05 to 07
[ 14.211514767,5] IPMI: Resetting boot count on successful boot
[ 14.252554375,6] IPMI: setting fw progress sensor 05 to 13
Signed-off-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
[stewart@linux.vnet.ibm.com: return OPAL_UNSUPPORTED on !sensors_present,
make ipmi_sensor_type_present() static in ipmi-sensor.c]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
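A minimal sketch of the guarded sensor update, assuming a simple in-memory table in place of the device tree. The demo_* names and return codes are hypothetical; the only outside fact used is that 0x0F is the IPMI "System Firmware Progress" sensor type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative return codes; the real OPAL values are not reproduced here. */
enum { DEMO_SUCCESS = 0, DEMO_UNSUPPORTED = -1 };

/* IPMI sensor type for "System Firmware Progress". */
#define DEMO_TYPE_FW_PROGRESS 0x0F

struct demo_sensor {
	uint8_t type;
	uint8_t id;
	uint8_t value;
};

struct demo_sensor_table {
	bool present;			/* false if no sensors node was found */
	struct demo_sensor *entries;
	size_t count;
};

static struct demo_sensor *demo_find_sensor(struct demo_sensor_table *t,
					    uint8_t type)
{
	for (size_t i = 0; i < t->count; i++)
		if (t->entries[i].type == type)
			return &t->entries[i];
	return NULL;
}

/* Set a sensor only if sensors exist at all and one matches the type. */
static int demo_set_sensor(struct demo_sensor_table *t, uint8_t type,
			   uint8_t value)
{
	struct demo_sensor *s;

	assert(t != NULL);
	if (!t->present)
		return DEMO_UNSUPPORTED;
	s = demo_find_sensor(t, type);
	if (!s)
		return DEMO_UNSUPPORTED;
	s->value = value;
	return DEMO_SUCCESS;
}
```

With an empty table, as on an FSP system in this model, the call returns early instead of writing to a non-existent sensor id.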
|
|
elog_reject_head() routine makes the state 'elog_read_from_fsp_head_state'
either 'ELOG_STATE_REJECTED' or 'ELOG_STATE_NONE' depending on the current
state of 'elog_read_from_fsp_head_state'.
We can remove this elog_reject_head() from 'opal_kexec_elog_notify()' as just
after that it is called inside 'fsp_opal_resend_pending_logs()'. So, it is
redundant inside opal_kexec_elog_notify() routine.
Signed-off-by: Mukesh Ojha <mukesh02@linux.vnet.ibm.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
We use the 'elog_enabled' flag to check whether the host OS is ready to
receive error logs. This has nothing to do with reading error logs from
the service processor.
This patch removes the check, keeping 'elog_enabled' free of FSP-specific
code so that it can be moved into core/errorlog.c in upcoming patches.
With this change, in some corner cases we may end up reading the same error
log twice from the FSP. This happens because we call 'elog_reject_head' inside
'fsp_opal_resend_pending_logs', which sets the state to either
'ELOG_STATE_REJECTED' or 'ELOG_STATE_NONE'. So, a call to the
'fsp_elog_check_and_fetch_head' routine ends up reading an error
log from the FSP which was already read. This case happens twice in a reboot,
whenever 'fsp_opal_resend_pending_logs' gets called.
So, we can ignore it.
Signed-off-by: Mukesh Ojha <mukesh02@linux.vnet.ibm.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Fix typos, alignment, and letter-case mismatches to make the code
clearer.
Signed-off-by: Mukesh Ojha <mukesh02@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
[stewart@linux.vnet.ibm.com: unlock before return (suggested by Mahesh/Andrew),
disable only on non-cancelling fsp codeupdate call (suggested by Vasant)]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
PCI slot pfreset() operation is obsoleted as nobody uses it. This
removes it and the related PCI slot states. No functional changes
introduced.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
For a PCI slot behind a root port, prepare_link_change() should be
the same as the PHB's. Otherwise, the UTL events cannot be masked when
the slot is reset, leading to an EEH error because of the UTL link-down
event.
Cc: stable # 5.3.0+
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This masks the surprise link down event on RC or downstream ports
if the PCI slots behind them support PCI surprise hotplug. The
event should be handled by the PCI hotplug driver instead of the EEH
subsystem.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This is an experimental patch that implements "Fast reboot" on P8
machines.
The basic idea is that when the OS calls OPAL reboot, we gather all
the threads in the system using a combination of patching the reset
vector and soft-resetting them, then cleanup a few bits of hardware
(we do re-probe PCIe for example), and reload & restart the bootloader.
For Trusted Boot, this means we *add* measurements to the TPM, so you
will get *different* PCR values as compared to a full IPL. This makes
sense as if you want to be sure you are running something known then,
well, do a full IPL as soft reset should never be trusted to clear any
malicious code.
This is very experimental and needs a lot of testing and also auditing
code for other bits of HW that might need to be cleaned up.
BenH TODO: I also need to check if we are properly PERST'ing PCI devices.
This is partially based on old code I had to do that on P7. I only
support it on P8 though as there are issues with the PSI interrupts
on P7 that cannot be reliably solved.
Even though this should be considered somewhat experimental, we've had
a lot of success on a variety of machines. Dozens/hundreds of reboots
across Tuleta, Garrison and Habanero.
Currently, we've hidden it behind a NVRAM config option, which *is*
liable to change in the future (to ensure that only those who know
what they're doing enable it)
You can enable the experimental support via nvram option:
nvram -p ibm,skiboot --update-config experimental-fast-reset=feeling-lucky
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[stewart@linux.vnet.ibm.com: hide behind nvram option, include Mambo fixes
from Mikey]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
In the PCI post-fundamental reset code, a hot reset is performed at the
end. This is causing issues at boot time as a reset signal is being sent
downstream before the links are up, which is causing issues on adapters
behind switches. No errors result in skiboot, but the adapters are not
usable in Linux as a result.
Hot resets also occur in the FSP platform-specific code for conventional
PCI slots, which could cause issues.
This patch fixes some adapters not being configurable in Linux on some
systems. The issue was not present in skiboot 5.2.x.
Cc: stable # 5.3.x
Signed-off-by: Russell Currey <ruscur@russell.cc>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The ECRC generation and check can't be enabled on Broadcom's NIC
(14e4:168a) when it sits behind a PMC PCIe switch downstream port
(11f8:8546). Otherwise, the NIC's config space cannot be accessed
and reads return 0xFF's because of an EEH error, even after the error
is cleared. The issue was reported on Firestone.
This disables ECRC generation and check on Broadcom's NIC when it
sits behind a PMC PCIe switch downstream port. With this applied,
the NIC can be detected successfully.
Reported-by: Li Meng <shlimeng@cn.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
[stewart@linux.vnet.ibm.com: add description of device workaround is for]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This reverts commit cf39c2a7dd1a2ee9b19a5490f7fa25690b8e8ae3.
Fixes: cf39c2a7dd1a2ee9b19a5490f7fa25690b8e8ae3
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
We should use the API properly.
This reverts commit 0657bccb778cbe71fc8c00879826ca0217b7010d.
Fixes: 0657bccb778cbe71fc8c00879826ca0217b7010d
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This adjusts the CAPP header offset if CAPP is a secure boot container.
Signed-off-by: Claudio Carvalho <cclaudio@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This change preloads the whole CAPP partition.
We decided to build a container for the whole CAPP lid as opposed to
having one for the TOC and one for each subpartition.
Signed-off-by: Claudio Carvalho <cclaudio@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
During an OCC reset cycle the system is forced to the Psafe pstate.
When the OCC becomes active, the system has to be restored to the
last pstate requested by the host. So the host needs to be notified
of the OCC_RESET event, or else the system will remain in the
Psafe state until the host requests a new pstate after the OCC
reset cycle.
This patch defines 'OPAL_PRD_MSG_TYPE_OCC_RESET_NOTIFY' to
notify OPAL when opal-prd issues an OCC reset. OPAL will queue an
OCC_RESET message to the host when it receives an opal_prd_msg of
type '*_OCC_RESET_NOTIFY'.
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Allocate an irq number for each hvc console and set its interrupt-parent
property so that Linux can use the opal irqchip instead of the
OPAL_EVENT_CONSOLE_INPUT interface.
Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Fixes: 81154ba9b2d418cd5f9eda3a6f89ca6631556510
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
|
|
Currently the reserved PE is set to NPU_NUM_OF_PES, which is one
greater than the maximum PE resulting in the following kernel errors
at boot:
[ 0.000000] pnv_ioda_reserve_pe: Invalid PE 4 on PHB#4
[ 0.000000] pnv_ioda_reserve_pe: Invalid PE 4 on PHB#5
Due to a HW erratum, PE#0 is already reserved in the kernel, so update
the opal-reserved-pe device-tree property to match this.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|