Age | Commit message | Author | Files | Lines |
|
Linux sends us a 0 when shutting down. This means we don't need to pass
the u64 to the IPMI driver. Add a check that the value is what we expect
in case Linux changes its behaviour in the future.
When rebooting, we should send the BMC a HARD_RESET command (0x03), not
POWER_CYCLE (0x02).
While we are here, trim some whitespace and drop opal from the IPMI
function name for readability.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
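A minimal sketch of the request handling described in the commit above.
The command values 0x02 and 0x03 and the expected shutdown value 0 come
from the message; every function and macro name is hypothetical, and
rejecting an unexpected value (rather than only warning about it) is an
assumption about what the check does.

```c
#include <stdint.h>

/* Chassis control values named in the commit message. */
#define SKETCH_CHASSIS_POWER_CYCLE	0x02
#define SKETCH_CHASSIS_HARD_RESET	0x03

/* Value Linux is said to pass when shutting down. */
#define SKETCH_EXPECTED_SHUTDOWN_REQUEST	0

/*
 * Returns 0 if the shutdown request carries the expected value and
 * -1 otherwise. Rejecting the request is an assumption for this sketch.
 */
int sketch_check_shutdown_request(uint64_t request)
{
	return request == SKETCH_EXPECTED_SHUTDOWN_REQUEST ? 0 : -1;
}

/* A reboot asks the BMC for a hard reset, not a power cycle. */
uint8_t sketch_reboot_command(void)
{
	return SKETCH_CHASSIS_HARD_RESET;
}
```

Because the request value is checked rather than forwarded, the u64 no
longer needs to reach the IPMI layer at all.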
|
|
We moved the rtc code out to its own ipmi driver, so we don't need these
headers anymore.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Previously we were doing synchronous messaging and cranking the bt
state machine from within OPAL. This was not ideal as it could
potentially take control away from the OS for long periods of
time if the BMC is busy. This patch solves the problem using the
opal_poll api to do asynchronous messaging.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The initial implementation of the ipmi stack was still tightly coupled
with the backend (in this case bt). This patch refactors the ipmi code
to use a generic backend device.
The core ipmi messaging functionality and the implementation of
specific commands have also been split into different files.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The original implementation of the bt and ipmi layers required the bt,
ipmi and message data to be allocated separately. This is sub-optimal
as it could cause excessive memory fragmentation. This patch fixes the
problem by adding a function to the bt layer to allocate space for
both the required data and bt/ipmi message.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
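One way to picture the single allocation described in the commit above
is a transport header that embeds the ipmi message and ends in a
flexible array for the request data. This is only a sketch: the
structure layouts, field names and function names are all hypothetical
and are not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical layouts; the real structures are not shown in the log. */
struct sketch_ipmi_msg {
	uint8_t netfn;
	uint8_t cmd;
	size_t req_size;
	uint8_t *data;			/* points into the same allocation */
};

struct sketch_bt_msg {
	int state;			/* transport bookkeeping */
	struct sketch_ipmi_msg ipmi;	/* embedded, not separately allocated */
	uint8_t payload[];		/* request data follows the headers */
};

/* Allocate the bt header, the ipmi message and the data in one block. */
struct sketch_ipmi_msg *sketch_bt_alloc_msg(size_t req_size)
{
	struct sketch_bt_msg *bt = calloc(1, sizeof(*bt) + req_size);

	if (!bt)
		return NULL;
	bt->ipmi.req_size = req_size;
	bt->ipmi.data = bt->payload;
	return &bt->ipmi;
}

void sketch_bt_free_msg(struct sketch_ipmi_msg *msg)
{
	if (!msg)
		return;
	/* Step back from the embedded member to the start of the block. */
	free((char *)msg - offsetof(struct sketch_bt_msg, ipmi));
}
```

The caller only ever sees the ipmi message, while one calloc/free pair
covers all three pieces.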
|
|
There is no use for time, remove it.
Fixes the following error (and the warning once stdint.h was included):
libc/test/run-time.c:40:2: error: unknown type name ‘uint32_t’
uint32_t time;
libc/test/run-time.c:41:11: warning: unused variable ‘time’
[-Wunused-variable]
uint32_t time;
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
When the fast/deep power management modes for the cpu idle states
are initialized, bits which are not relevant in this context are also
being set. Fix this.
Besides this, the EX_PM_GP1 register will be read/written into by the
OCC as well. We touch this register during initialization of fast/deep
cpuidle modes and during initialization of pstate transitions. The register
contents can thus get messed up due to potential race conditions between
the OCC and sapphire settings.
Hence make use of the AND and OR scoms to do the settings, letting
the hardware take care of the necessary synchronization.
We can also get rid of the setting of deep mode during slw_reinit since
we enable the required deep winkle mode during slw_init itself. This means
effectively removing the slw_prepare_chip() and its children functions.
They are no longer useful.
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
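The reason AND/OR writes avoid the race in the commit above can be
shown with a small software model. This is a sketch only: the register
is a plain variable, the "atomic" hardware ports are ordinary functions,
and all names and bit positions are hypothetical.

```c
#include <stdint.h>

/* Software stand-in for a register that also offers AND and OR write ports. */
struct sketch_reg {
	uint64_t value;
};

/* In hardware these updates happen inside the register; modelled as plain C. */
void sketch_write_and(struct sketch_reg *r, uint64_t mask)
{
	r->value &= mask;
}

void sketch_write_or(struct sketch_reg *r, uint64_t mask)
{
	r->value |= mask;
}

/*
 * Replace only the bits covered by field_mask. No copy of the other
 * bits is ever read and written back, so a concurrent writer of those
 * bits cannot have its update overwritten by a stale value.
 */
void sketch_update_field(struct sketch_reg *r, uint64_t field_mask,
			 uint64_t new_bits)
{
	sketch_write_and(r, ~field_mask);
	sketch_write_or(r, new_bits & field_mask);
}
```

With a read-modify-write of the whole register, bits changed by the
other agent between the read and the write would be lost; here they are
never part of the value being written.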
|
|
As Murali reported, the upstream port of the PLX 8724 (rev ba) switch does
have PCI/PCIE capabilities. However, it doesn't set the capability
indicator (bit#4) in the PCI_STATUS config register (offset: 0x6). So
the PCIE capability can't be probed successfully from the port and we
don't configure the MPS correctly. Eventually, this caused a mismatched
MPS on the PHB and we ran into an EEH error.
The patch fixes it by ignoring the capability indicator on PLX 8724
(rev ba) switch upstream port when looking for its PCIE capability
so that its MPS can be configured properly.
Reported-by: Murali N. Iyer <mniyer@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
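The quirk in the commit above amounts to one extra condition on the
"does this device have a capability list" test. A minimal sketch
follows; the structure, the function names and the vendor/device/revision
constants used to identify the part are assumptions for illustration
and are not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 4 of PCI_STATUS (config offset 0x06): capability list present. */
#define SKETCH_PCI_STATUS_CAP_LIST	(1u << 4)

/* Assumed identification of the affected part. */
#define SKETCH_PLX_VENDOR_ID	0x10b5
#define SKETCH_PLX_8724_DEVICE	0x8724
#define SKETCH_PLX_8724_REV_BA	0xba

struct sketch_pci_dev {
	uint16_t vendor;
	uint16_t device;
	uint8_t revision;
	bool is_switch_upstream_port;
	uint16_t status;		/* cached PCI_STATUS value */
};

static bool sketch_is_plx8724_ba_upstream(const struct sketch_pci_dev *pd)
{
	return pd->vendor == SKETCH_PLX_VENDOR_ID &&
	       pd->device == SKETCH_PLX_8724_DEVICE &&
	       pd->revision == SKETCH_PLX_8724_REV_BA &&
	       pd->is_switch_upstream_port;
}

/* Walk the capability list if the indicator is set, or if the quirk applies. */
bool sketch_should_walk_cap_list(const struct sketch_pci_dev *pd)
{
	if (pd->status & SKETCH_PCI_STATUS_CAP_LIST)
		return true;
	return sketch_is_plx8724_ba_upstream(pd);
}
```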
|
|
When waiting for a job to complete, users call cpu_wait_job. This causes
the pollers to run and take locks, e.g. dummy_console takes con_lock.
These locks may also be contended by the job that is running, and this
contention between the pollers and the job stops the job making forward
progress.
The 10us delay was picked as a reasonable compromise.
With this change Palmetto with DD2 PCI probe time goes from 8e10 to 4e8
ticks.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Instead of running the pollers flat out, call them once every
5ms. This helps in situations where pollers are taking locks that are
also taken by tasks completing on other CPUs.
The 5ms time is arbitrary; it was chosen such that most callers of
time_wait will call the pollers at least twice.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
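The throttling in the commit above can be modelled with a simulated
clock. This sketch is not the real time_wait: the clock advances by one
microsecond per loop iteration, the pollers are reduced to a counter,
and all names are hypothetical. Only the 5ms interval comes from the
message.

```c
#include <stdint.h>

#define SKETCH_POLL_INTERVAL_US	5000	/* 5ms, from the commit message */

struct sketch_clock {
	uint64_t now_us;
	uint64_t last_poll_us;
	unsigned int poll_calls;
};

/* Stand-in for running the pollers: just count the calls. */
static void sketch_run_pollers(struct sketch_clock *c)
{
	c->poll_calls++;
}

/* Wait for duration_us, running the pollers at most once per interval. */
void sketch_time_wait(struct sketch_clock *c, uint64_t duration_us)
{
	uint64_t end = c->now_us + duration_us;

	while (c->now_us < end) {
		if (c->now_us - c->last_poll_us >= SKETCH_POLL_INTERVAL_US) {
			sketch_run_pollers(c);
			c->last_poll_us = c->now_us;
		}
		c->now_us++;	/* stand-in for one spin of the real wait loop */
	}
}
```

In this model a 12ms wait runs the pollers twice, while a 1ms wait does
not run them at all, which leaves the locks free for work on other CPUs
most of the time.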
|
|
Code cleanup.
Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Use real functionality based flags instead of a mode list in the DT
and other cleanups & missing bits (this one actually builds !)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Based on email from JT Kellington, Dave Larson, and Joe McGill and feedback
from Ben H.
handle_malfunction reads the bits in the malf alert reg, checks for
is_capp_recoverable, and returns 1 if recoverable. It also calls into phb3 to
put phb3 in capp error recovery state. Returns 0 if not capp recoverable and
it's a TODO to add the logic to check the other FIRs.
Don't send message when malf alert empty. Use return code -1 to tell
opal_handle_hmi to swallow the event. Also, with locking, only one thread per
core will send the message instead of all threads.
Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Take a lock before handle_hmi_event per Ben's suggestion. So, when we clear
events, only one thread per core will report it.
Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Add a flag indicating the CAPP unit is in recovery. When a capp recoverable
malfunction HMI comes in, the HMI handler will call into
phb3_set_capp_recovery, which will set the flag and send the event to
Linux.
EEH will call phb3_next_error which will tell it the phb is fenced.
EEH will then call into sapphire to reinitialize the phb which contains steps
3-5 of capp recovery procedure. The code increases wait time of PERST to 1s to
ensure fpga download is complete before polling linkup.
EEH will then rebind the cxl driver and it will complete recovery once it
initializes and turns snoops on, steps 7-8, completing capp recovery procedure.
Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
For user initiated capp recovery, provide a mode to turn snoops off. The perst
alone does not turn snoops off and we need to do this as part of the capp
recovery procedure before reinitializing the phb.
A second mode turns snoops back on after recovery. The driver needs to do this
after it reinitializes the PSL otherwise tlbies could come in before the psl is
initialized. Also write 0 to capp error status and control as part of the
recovery procedure.
Put modes as flag defines in opal.h so the driver can pick them up.
Add a dt property "ibm,capi-modes" which tells the driver which modes sapphire
supports, for backwards compatibility with older opals. Also, the driver can
disable reset in sysfs if not supported.
Move the mode checking into phb3.c so it's all in one place.
Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
FLUSH_SUE_STATE_MAP change fixes a problem with recovery. We were using an old
lab value that marked PTE entries in a shared state. After recovery, PTE
entries were getting flushed out to memory with an SUE, resulting in a machine
check. The new value means PTE entries are dropped on recovery.
For APC_MASTER_PB_CTRL, the spec says to use the initfile value and bit 3
should be set. The initfile is missing bit 3, so do a RMW. Bit 3 enables CAPP
combined response.
CAPP_EPOCH_TIMER_CTRL enables epoch timers and the recovery timer when recovery
is enabled. Also relax epoch timer period mask due to a bug.
TRANSPORT_CONTROL reg set bit 37 - rfs_benign_ptr_data in addition to spec
value. Should be set in initfile in future.
Rename APC_MASTER_CONFIG to APC_MASTER_CAPI_CTRL to match workbook name.
Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
This patch changes fsp_opal_get_dpo_status function to return
OPAL_WRONG_STATE when not in DPO pending state. This will help
the host to differentiate whether the system is in DPO pending
state or not and then analyse the returned timeout value correctly.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Right now if the OPAL message queuing fails, the FSP never gets
the ack back for the original DPO initiation message it had sent
previously. With this patch, if the OPAL message queuing fails to
send the DPO message to the host, it still acks the FSP about the
original message but with error flags.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Acked-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
This patch adds a positive return statement after handling the
DPO message from FSP. Previously it was returning a negative
value for all the possible cases.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Acked-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
This patch cleans up multiple printf statements and also
introduces a couple of defines to reflect the byte position
signatures present on the FSP DPO initiation command.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The FPGA used on some open power machines generates regular pulses instead
of levels. Reading the status might then fail since it's not latched, so
also check the latched event bit in the XIVR.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Create a device-node which will be used by Linux for matching
and use a saner default time if IPMI doesn't work.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The platform probe code might want to add things to it.
While at it, make add_cpu_idle_state_properties() local to slw.c
and call it from slw_init() instead of from add_opal_node().
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
For PCIe devices, there are 2 bits used to control completion
timeout as follows:
PCIe Cap + 0x24, Device Capabilities 2 Register, bit#4
PCIe Cap + 0x28, Device Control 2 Register, bit#4
The patch adds function pci_disable_completion_timeout(), which
is called during bootup or after PE reset.
This is in response to bug#114961.
Suggested-by: Michael A. Perez <perezma@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
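A sketch of how the two bits listed in the commit above could be used,
operating on a byte image of config space instead of real hardware. The
register offsets and bit#4 come from the message; the helper names, the
in-memory config image and the "only set the control bit when the
capability bit says it is supported" policy are assumptions for
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PCIE_DEVCAP2	0x24	/* Device Capabilities 2, from PCIe cap */
#define SKETCH_PCIE_DEVCTL2	0x28	/* Device Control 2, from PCIe cap */
#define SKETCH_CTO_BIT		(1u << 4)	/* bit#4 in both registers */

/* Config space is little-endian; cfg is a byte image of it. */
static uint16_t sketch_cfg_read16(const uint8_t *cfg, uint32_t off)
{
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

static void sketch_cfg_write16(uint8_t *cfg, uint32_t off, uint16_t val)
{
	cfg[off] = (uint8_t)(val & 0xff);
	cfg[off + 1] = (uint8_t)(val >> 8);
}

/*
 * ecap is the offset of the PCIe capability. Returns true if the
 * completion timeout was disabled, false if the device does not
 * advertise support for disabling it.
 */
bool sketch_disable_completion_timeout(uint8_t *cfg, uint32_t ecap)
{
	/* Bit 4 sits in the low half of the 32-bit capabilities register. */
	uint16_t cap = sketch_cfg_read16(cfg, ecap + SKETCH_PCIE_DEVCAP2);
	uint16_t ctl;

	if (!(cap & SKETCH_CTO_BIT))
		return false;

	ctl = sketch_cfg_read16(cfg, ecap + SKETCH_PCIE_DEVCTL2);
	sketch_cfg_write16(cfg, ecap + SKETCH_PCIE_DEVCTL2,
			   (uint16_t)(ctl | SKETCH_CTO_BIT));
	return true;
}
```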
|
|
The patch adds function pci_device_init(), which is called by
phb->ops->device_init() to apply common initialization on the
specified PCI device during bootup or after PE reset.
Currently, we only put the logic of MPS configuration to the
function, but more will be put there.
Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Keep it 0 for open-power platforms where OCC is going to be preloaded,
which also avoids an annoying 1 minute delay on early openpower and bml
when there is no OCC firmware to wait for.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Probably due to the way we spin, we seem to still be hitting the
odd case where we fail to reinit due to a secondary not having quite
reached the right state inside skiboot. Let's bump the timeout up.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The patch provides the in-band support for reading the 'console-select'
system parameter. It also adds the console support to honour the system
param for switching the console type in P8 systems.
Tested-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jeremy Kerr <jeremy.kerr@au.ibm.com>
|
|
Commit 9f64cb20 introduced a spurious unconditional byteswap, which we
don't need for HAVE_BIG_ENDIAN.
Signed-off-by: Jeremy Kerr <jeremy.kerr@au.ibm.com>
|
|
Match the fast-sleep name between OPAL and HB
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Jeremy Kerr <jeremy.kerr@au.ibm.com>
|
|
|
|
Add IPMI GET_SEL_TIME and SET_SEL_TIME commands to the IPMI stack.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Our libc now has a proper implementation of mktime, which makes adding
tm structures together easy. This patch makes the FSP RTC functions
use the library functions and removes the generic time calculation
code from the FSP RTC driver.
The OPAL<->tm conversion functions are also made public as they will
be useful for the IPMI RTC implementation.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
For the case where the survserver on the fsp server is dead (for whatever
reason [1]), even before the first time query via sysparam of the surv
status by sapphire, we get an error response to the sysparam query.
We should apparently trigger a HIR in that case (same as phyp).
[1] survserver has a real bug on a 'fsptelinit --disablerecovery'
followed by a 'kill -9 <survserver_pid>'
Fixes https://bugzilla.linux.ibm.com/show_bug.cgi?id=114646.
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Jeremy Kerr <jeremy.kerr@au.ibm.com>
|
|
|
|
Both the FSP RTC and the upcoming IPMI RTC implementation need to
manipulate time in various ways. Rather than re-implementing slightly
different versions of the calculations twice, let's implement some
standard library functions (with tests) and use those.
This patch adds mktime and gmtime_r to the libc.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Commit e810dcbc (ATTN: Set up attention area to handle attention) broke
tests, as the family of CPU_TO_BEXX macros are not compile-time constants.
hdata/test/../spira.c:60:4: error: initializer element is not constant
.addr = CPU_TO_BE64((unsigned long)&(cpu_ctl_spat_area) + SKIBOOT_BASE)
There is no test coverage of this code, so for now we can comment out
these areas in order to allow the tests to pass.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
We are missing a prlog for tests. This adds a dumb version that ignores
the log level and uses printf to display all messages.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
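A "dumb" test logger of the kind described in the commit above can be
very small. This is a sketch under assumptions: the function name and
signature are hypothetical, and returning the printf character count is
a choice made here, not something the message specifies.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical test stub: log_level is accepted but ignored. */
int sketch_test_prlog(int log_level, const char *fmt, ...)
{
	va_list ap;
	int n;

	(void)log_level;	/* every message is shown in tests */
	va_start(ap, fmt);
	n = vprintf(fmt, ap);
	va_end(ap);
	return n;
}
```

Code under test can then log at any level and the output always reaches
stdout, which is usually what a unit test wants.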
|
|
This pulls in a fix for warnings in our tests:
hdata/test/../spira.c:64:64: warning: suggest parentheses around ‘+’
in operand of ‘&’ [-Wparentheses]
.addr = CPU_TO_BE64((unsigned long)&(cpu_ctl_sp_attn_area1) + SKIBOOT_BASE)
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
I misread the spec when implementing the chassis control message.
This fixes the message, as well as correcting the naming of the IPMI
fields to better reflect what they represent.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Acked-by: Jeremy Kerr <jeremy.kerr@au.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
For the case where the survserver on the fsp server is dead (for whatever
reason [1]), even before the first time query via sysparam of the surv
status by sapphire, we get an error response to the sysparam query.
We should apparently trigger a HIR in that case (same as phyp).
[1] survserver has a real bug on a 'fsptelinit --disablerecovery'
followed by a 'kill -9 <survserver_pid>'
Fixes https://bugzilla.linux.ibm.com/show_bug.cgi?id=114646.
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Jeremy Kerr <jeremy.kerr@au.ibm.com>
|
|
This change fixes a bug found by Alistair Popple: we have a stray '9' in
the count of non-leap-years in 400 years. This will cause an incorrect
result from tm_add if the TOD cache is >400 years old.
Signed-off-by: Jeremy Kerr <jeremy.kerr@au.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
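For reference, the count mentioned in the commit above can be checked
by brute force. Under the Gregorian rules a 400-year cycle has 97 leap
years and therefore 303 non-leap years, for 146097 days in total. The
function names below are hypothetical and this is not the tm_add code
itself.

```c
#include <stdbool.h>

static bool sketch_is_leap(int year)
{
	return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Count the non-leap years in the 400 years starting at 'start'. */
int sketch_non_leap_years_in_cycle(int start)
{
	int count = 0;

	for (int y = start; y < start + 400; y++) {
		if (!sketch_is_leap(y))
			count++;
	}
	return count;
}
```

The result does not depend on the starting year, since the leap year
pattern repeats every 400 years.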
|
|
Now that the log automatically timestamps entries, remove the tb print
in the error paths.
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
It seems that when we committed the IPMI/BT driver we updated the
device tree compatible property for the iBT interface. Unfortunately
Palmetto still requires a DT fixup for this node and somewhere along
the way there was a typo.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
This patch adds an OPAL interface to fetch the DPO timeout. This
functionality is required to synchronously query Sapphire about
how many seconds remain before a forced system shutdown, which
is useful in cases where the host has missed the OPAL_MSG_DPO for
some reason like system boot, reboot or kexec operations. This
ensures the host can still query the DPO timeout status and act.
This patch also adds a helper routine to convert time base into seconds.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
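A sketch of the time base conversion and the remaining-time query
described in the commit above. The 512MHz time base frequency is an
assumption (it is the POWER8 value and is not stated in the message),
and the function names, parameters and clamping to zero are
hypothetical.

```c
#include <stdint.h>

/* Assumption: a 512MHz time base; the commit message does not state it. */
#define SKETCH_TB_HZ	512000000ull

/* Convert time base ticks into whole seconds. */
uint64_t sketch_tb_to_secs(uint64_t tb)
{
	return tb / SKETCH_TB_HZ;
}

/*
 * Seconds left before the forced shutdown, given the time base value
 * when DPO was initiated, the current time base value and the total
 * timeout. Returns 0 once the deadline has passed.
 */
uint64_t sketch_dpo_secs_remaining(uint64_t start_tb, uint64_t now_tb,
				   uint64_t timeout_secs)
{
	uint64_t elapsed = sketch_tb_to_secs(now_tb - start_tb);

	return elapsed >= timeout_secs ? 0 : timeout_secs - elapsed;
}
```

A host that missed the original message could call such a query after
boot or kexec and still learn how long it has left.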
|
|
This patch moves the DPO message handling from FSP core code into
a separate file to make it cleaner and to add OPAL interfaces
in the subsequent patch. It does not change anything functionally.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
This patch changes the log message prefix from EPOW to FSPEPOW
as this standard is followed everywhere in the FSP specific code
base. This also changes a bit in the file header.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
For better debuggability, the patch adds git version and backtrace
details to user data section (along with file info which was already
present).
After adding required details TermImmedData looks like:
TermImmedData |
| 00000000 63386631 6639322D 64697274 793A0000 c8f1f92-dirty:.. |
| 00000010 00000000 00000000 00000000 00000000 ................ |
| 00000020 00000000 00000000 43505520 30303030 ........CPU 0000 |
| 00000030 30303264 20426163 6B747261 63653A0A 002d Backtrace:. |
| 00000040 20533A20 30303030 30303030 33316162 S: 0000000031ab |
| 00000050 36626130 20523A20 30303030 30303030 6ba0 R: 00000000 |
| 00000060 33303031 33306238 0A20533A 20303030 300130b8. S: 000 |
| 00000070 30303030 30333161 62366334 3020523A 0000031ab6c40 R: |
| 00000080 20303030 30303030 30333030 34623738 000000003004b78 |
| 00000090 380A2053 3A203030 30303030 30303331 8. S: 0000000031 |
| 000000A0 61623663 63302052 3A203030 30303030 ab6cc0 R: 000000 |
| 000000B0 30303330 30313736 31300A20 533A2030 0030017610. S: 0 |
| 000000C0 30303030 30303033 31616236 64343020 000000031ab6d40 |
| 000000D0 523A2030 30303030 30303033 30303035 R: 0000000030005 |
| 000000E0 3133340A 20533A20 30303030 30303030 134. S: 00000000 |
| 000000F0 33316162 36663030 20523A20 30303030 31ab6f00 R: 0000 |
| 00000100 30303030 33303030 32353534 0A000000 000030002554.... |
| 00000110 00000000 00000000 00000000 00000000 ................ |
| 00000120 00000000 00000000 00000000 00000000 ................ |
| 00000130 00000000 00000000 00000000 00000000 ................ |
| 00000140 00000000 00000000 00000000 00000000 ................ |
| 00000150 00000000 00000000 00000000 00000000 ................ |
| 00000160 00000000 00000000 00000000 00000000 ................ |
| 00000170 00000000 00000000 00000000 00000000 ................ |
| 00000180 00000000 00000000 00000000 00000000 ................ |
| 00000190 00000000 00000000 00000000 00000000 ................ |
| 000001A0 00000000 00000000 00000000 00000000 ................ |
| 000001B0 00000000 00000000 00000000 00000000 ................ |
| 000001C0 00000000 00000000 00000000 636F7265 ............core |
| 000001D0 2F6F7061 6C2E633A 3233383A 30000000 /opal.c:238:0... |
|------------------------------------------------------------------------------|
Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Generating src dynamically results in:
1. Difficulty in documenting and for field people to understand.
2. It might also conflict with existing srcs.
Hence add default SRC in SRC section.
Assert function call address in hex word 2.
errl -d <elog-entry-id>:
..
| Reference Code : BB821410 |
| Hex Words 2 - 5 : 30017610 00000000 00000000 00000000 |
..
Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|