|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Arch headers need to be linked in before compiling.
Signed-off-by: Brad Bishop <bradleyb@fuzziesquirrel.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
(cherry picked from commit 5660d300fb23f299c4a306be4a213eb608158b6c)
|
|
Parse the entire pstate table provided by OCC and filter out the
entries that are outside the Pmax and Pmin limits. This can
occur when turbo mode is disabled and OCC limits the Pmax to
nominal pstate, but includes turbo pstates in the pstate table.
We end up with wrong pstates in such cases if we do not parse
the pstate table to filter out the correct range.
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
(cherry picked from commit eca02ee2e62cee115d921a01cea061782ce47cc7)
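The filtering described above can be sketched as follows. This is only an illustration in Python, not skiboot's C code. The `PstateEntry` fields, the sample frequencies, and the assumption that pstate ids are ordered so that Pmin <= Pmax numerically are all made up for the example.

```python
from typing import NamedTuple


class PstateEntry(NamedTuple):
    pstate_id: int  # hypothetical field name for the pstate number
    freq_khz: int   # hypothetical field name for the frequency


def filter_pstates(table, pmin, pmax):
    """Keep only the entries whose id lies within [pmin, pmax]."""
    return [e for e in table if pmin <= e.pstate_id <= pmax]


# Turbo disabled: Pmax is capped at nominal (-2 here), yet the table
# still lists the two turbo entries above it.
occ_table = [
    PstateEntry(0, 4000000),   # turbo
    PstateEntry(-1, 3900000),  # turbo
    PstateEntry(-2, 3500000),  # nominal, equal to Pmax
    PstateEntry(-3, 3000000),
    PstateEntry(-4, 2500000),  # Pmin
]
usable = filter_pstates(occ_table, pmin=-4, pmax=-2)
```

With these sample values, `usable` no longer contains the two turbo entries.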
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
(cherry picked from commit 98b80af1001027cc59dce040831c1f54d41e4f88)
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
(cherry picked from commit 0c94f97edaee2d0b1c00066db24fd4555650dbe4)
|
|
Current travis-ci no longer seems to do this cleanly.
Just disable it for now; it was never a great test anyway.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
(cherry picked from commit 3df1760b4851d65ce04a748dc915284e1b377ddb)
|
|
GCC 6 warns when we look at any stack frame other than our own, i.e. any
argument to __builtin_frame_address other than zero.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
(cherry picked from commit 793f6f5b32c96f2774bd955b6062c74a672317ca)
|
|
|
|
Backport of user-visible typo fixes.
Partially cherry picked from commit 4c95b5e04e3c4f72e4005574f67cd6e365d3276f.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
Fixes: f46c1e506d199332b0f9741278c8ec35b3e39135
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
(cherry picked from commit 348dacfaca9f139db2603f5c2e78d87e21938ca6)
|
|
On PCI Express, devices need to know their own bus number in order
to provide the correct source identification (aka RID) in upstream
packets they might send, such as error messages or DMAs.
However, while devices know (and hard wire) their own device and
function number, they know nothing about bus numbers by default; those
are decoded by bridges for routing. All they know is that if their
parent bridge sends a "type 0" configuration access, they should decode
it provided the device and function numbers match.
The PCIe spec thus defines that when a device receives such a configuration
access and it's a write, it should "capture" the bus number in the source
field of the packet, and re-use it as the originator bus number of all
subsequent outgoing requests.
In order to ensure that a device has this bus number firmly established
before it's likely to send error packets upstream, we should thus do a
dummy configuration write to it as soon as possible after probing.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
[stewart@linux.vnet.ibm.com: fix Evolution broken patch, write vdid rather than &vdid as per Gavin suggestion]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
(cherry picked from commit f46c1e506d199332b0f9741278c8ec35b3e39135)
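A toy model of the capture behaviour, written in Python for illustration only. The `PcieDevice` class, the register contents, and the choice to report "no RID yet" as `None` are assumptions of this sketch. Only the RID layout (bus in bits 15:8, device in bits 7:3, function in bits 2:0) is standard.

```python
class PcieDevice:
    """Toy endpoint that learns its bus number from a type 0 config write."""

    def __init__(self, dev, fn):
        self.dev = dev
        self.fn = fn
        self.bus = None   # not established until a config write is seen
        self.config = {}  # offset -> value

    def config_access(self, bus, dev, fn, offset, value=None):
        """Handle a type 0 configuration access from the parent bridge."""
        if (dev, fn) != (self.dev, self.fn):
            return None   # not addressed to this function
        if value is not None:
            self.bus = bus  # a write: capture the bus number
            self.config[offset] = value
            return None
        return self.config.get(offset, 0xFFFFFFFF)

    def requester_id(self):
        """RID for upstream packets, or None if no bus number was captured."""
        if self.bus is None:
            return None
        return (self.bus << 8) | (self.dev << 3) | self.fn


endpoint = PcieDevice(dev=0, fn=0)
endpoint.config[0x00] = 0x12345678  # made-up vendor/device id
vdid = endpoint.config_access(bus=5, dev=0, fn=0, offset=0x00)  # probe read
rid_before = endpoint.requester_id()
endpoint.config_access(bus=5, dev=0, fn=0, offset=0x00, value=vdid)  # dummy write
rid_after = endpoint.requester_id()
```

The probe read alone leaves the bus number unknown in this model. The dummy write of the value just read is what establishes it.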
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
On the P8+ Garrison platform, the root port's pref window register might
not be writable and we have to emulate the window because of a hardware
defect. To detect that, we read the register content, write the
inverted value and read the register content again. The register is
regarded as read-only if the values from the two consecutive reads are
the same. However, the original register content isn't written back, which
corrupts the pref window register if it's writable.
This fixes the above issue by writing the original content back to
the register at the end.
Fixes: d40160f6 ("PHB3: Emulate root complex pref 64-bits window")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
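The probe-and-restore sequence can be sketched like this. The code is Python for illustration rather than the PHB3 code itself. `Reg32` is a hypothetical stand-in for the config register, and the sample values are arbitrary.

```python
class Reg32:
    """Hypothetical 32-bit register that may silently ignore writes."""

    def __init__(self, value, writable=True):
        self.value = value & 0xFFFFFFFF
        self.writable = writable

    def read(self):
        return self.value

    def write(self, value):
        if self.writable:
            self.value = value & 0xFFFFFFFF


def is_read_only(reg):
    """Read, write the inverted value, read again, then restore the original."""
    original = reg.read()
    reg.write(~original & 0xFFFFFFFF)
    read_only = reg.read() == original
    reg.write(original)  # the step the fix adds: put the old content back
    return read_only


rw_reg = Reg32(0x0000FFF0)
ro_reg = Reg32(0x00000001, writable=False)
rw_is_read_only = is_read_only(rw_reg)
ro_is_read_only = is_read_only(ro_reg)
```

Without the final write, `rw_reg` would be left holding the inverted value after the probe.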
|
|
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The makefiles under external/* utilize the $(CROSS_COMPILE) variable
to determine the cross-compiler prefix. In a few places,
$(CROSS_COMPILE)gcc is called instead of $(CC). The issue with this is
that the Yocto build passes some compile flags as part of $(CC) instead of
$(CFLAGS); the most important of these is '--sysroot=...'. Without the
proper --sysroot flag, the pflash compile fails to find critical libc
headers like stdio.h.
This change delegates setting of $(CC) and $(LD) to
external/common/rules.mk, which is widely used in the external tree, and
ensures that:
1) $(CC) is used instead of $(CROSS_COMPILE)gcc.
2) CC is only set when not passed from the environment.
Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
(cherry picked from commit 3137d249ba10ad6fa7a52486cdacddfab7419189)
|
|
Add the slot location names for the PCI and NPU slots.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Claimed-to-be-Tested-By: Abhijit Saikia <Abhijit.Saikia@in.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The PHB slot location code uses the ibm,phb-index property to find
slot location names. Because the NPU is implemented as a different PHB
type, its phb-index property overlaps with those of the other PHBs in the
system.
This patch changes the existing usage of phb-index to npu-index which
allows the phb-index property to be assigned a unique value which can
then be matched by the PHB slot location code.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Merge PHB3 race fix
|
|
When we mask an interrupt, we may race with another interrupt coming
in from the hardware. If this occurs, the P and/or Q bit may end up
being set but we never EOI/clear them. This could result in a lost
interrupt or the next interrupt that comes in after re-enabling never
being presented.
This patch ensures that when masking an interrupt, any pending P/Q
bits are cleared.
This fixes a bug seen with some CAPI workloads which have lots of
interrupt masking at the same time as high interrupt load. The fix is
not specific to CAPI though.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
When we EOI we need to clear the present (P) bit in the Interrupt
Vector Cache (IVC). We must clear P ensuring that any additional
interrupts that come in aren't lost while also maintaining coherency
with the Interrupt Vector Table (IVT).
To do this, the hardware provides a conditional update bit in the
IVC. This bit ensures that generation counts between the IVT and the
IVC updates are synchronised.
Unfortunately, we never set this bit to conditionally update the P
bit in the IVC based on the generation count. Also, we didn't set
what we wanted the new generation count to be if the update was
successful.
This patch sets both of these. It also reworks and documents
the code so that mortals may eventually be able to understand this
process.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Tested-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
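The idea of a conditional update guarded by a generation count can be shown with a generic model. This is not the PHB3 register layout: the `CacheEntry` class, the wrapping 2-bit counter, and the method names are assumptions made for the sketch, which is written in Python for brevity.

```python
class CacheEntry:
    """Generic cached interrupt state with a present bit and a generation."""

    def __init__(self):
        self.present = False
        self.gen = 0

    def raise_interrupt(self):
        # Assumed behaviour: each new interrupt sets P and bumps the generation.
        self.present = True
        self.gen = (self.gen + 1) % 4

    def conditional_clear(self, expected_gen, new_gen):
        """Clear P only if no interrupt arrived since expected_gen was read."""
        if self.gen != expected_gen:
            return False  # state changed underneath us; leave P set
        self.present = False
        self.gen = new_gen
        return True


entry = CacheEntry()
entry.raise_interrupt()
seen_gen = entry.gen           # generation observed when starting the EOI
entry.raise_interrupt()        # another interrupt races with the EOI
stale_update = entry.conditional_clear(seen_gen, seen_gen)
still_present = entry.present
fresh_update = entry.conditional_clear(entry.gen, entry.gen)
```

The stale update is rejected, so the racing interrupt is not lost. Only an update made against the current generation clears the present bit.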
|
|
Problem description:
During FSP termination/reset, the FSP received an mbox command from OPAL for
"Fetching platform management function data". As the FSP was in the
termination state, the DMAE operation failed to write memory data to the
hypervisor, so the FSP sent an mbox command with response status 0x24 to OPAL.
OPAL committed a predictive log with SRC BB822411 and sent back
response status 0xFE, so FSP IPMI cannot interpret the
failure at the host, and IPMI logs the error.
Fix: When OPAL receives bad response status 0x24 from the FSP
due to a DMAE error, commit an informational log and return response status
SUCCESS; for all other bad response statuses, commit a predictive log.
Signed-off-by: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
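The decision described in the fix amounts to a small mapping from the FSP status to a log severity and a reply status. A minimal sketch in Python follows. The function name, the numeric encoding of SUCCESS, and the choice to keep replying 0xFE for other failures are assumptions; only 0x24 and 0xFE come from the message above.

```python
FSP_STATUS_DMA_ERROR = 0x24  # status named in the commit message
STATUS_SUCCESS = 0x00        # assumed encoding of SUCCESS
STATUS_FAILURE = 0xFE        # status the old code sent back


def handle_fsp_response(status):
    """Return (log_severity, reply_status) for a response status from the FSP."""
    if status == STATUS_SUCCESS:
        return (None, STATUS_SUCCESS)
    if status == FSP_STATUS_DMA_ERROR:
        # DMAE failure while the FSP is terminating: note it, report success.
        return ("informational", STATUS_SUCCESS)
    # Any other bad status is still treated as a real failure (assumed reply).
    return ("predictive", STATUS_FAILURE)


dma_case = handle_fsp_response(0x24)
other_case = handle_fsp_response(0x30)  # arbitrary other bad status
```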
|
|
|
|
The current code sends a partial HMI event (4 * 64 bits instead of
5 * 64 bits) to the host. The last 64 bits contain chip id/pir info for
reporting checkstop events. This bug affects only checkstop events.
Host console output without this patch:
[ 305.628283] Fatal Hypervisor Maintenance interrupt [Not recovered]
[ 305.628341] Error detail: Malfunction Alert
[ 305.628388] HMER: 8040000000000000
[ 305.628423] CPU PIR: 00000000
[ 305.628458] [Unit: VSU] Logic core check stop
Host console output with this patch:
[ 200.122883] Fatal Hypervisor Maintenance interrupt [Not recovered]
[ 200.122941] Error detail: Malfunction Alert
[ 200.122986] HMER: 8040000000000000
[ 200.123021] CPU PIR: 000008e8
[ 200.123055] [Unit: VSU] Logic core check stop
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
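The effect of sending four words instead of five can be reproduced with a small packing example. This Python sketch is only an illustration. The word layout (HMER in word 0, PIR in word 4) and the assumption that the receiver sees missing words as zero are hypothetical, chosen to match the console output above.

```python
import struct

HMI_EVENT_WORDS = 5  # the event is 5 * 64 bits, per the commit message


def build_event(hmer, pir):
    # Hypothetical layout: word 0 holds HMER, word 4 holds the PIR.
    return [hmer, 0, 0, 0, pir]


def send_event(words, count):
    """Serialise only the first `count` 64-bit words, big-endian."""
    return struct.pack(">%dQ" % count, *words[:count])


def host_decode_pir(payload):
    """Host side: treat missing trailing words as zero, then read word 4."""
    padded = payload.ljust(HMI_EVENT_WORDS * 8, b"\x00")
    return struct.unpack(">5Q", padded)[4]


event = build_event(hmer=0x8040000000000000, pir=0x8E8)
pir_before = host_decode_pir(send_event(event, 4))  # truncated event
pir_after = host_decode_pir(send_event(event, 5))   # full event
```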
|
|
If the NPU detects an unrecoverable error, it will send a HMI. This is
problematic since unhandled HMIs will checkstop the entire system, which
is not the intended behaviour of a NPU failure. Instead, the NPU
emulated PCI devices should be fenced as part of EEH.
Add support for handling NPU HMIs. This works by finding the NPU
responsible for the HMI, checking its error registers, and sending a
recoverable HMI event. The NPU itself cannot actually recover, but the
system should not be brought down. Fence mode is set on the NPU, such
that any further operations on the NPU will trigger EEH, and it will be
subsequently fenced from the system.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Regardless of whether a handler for a specific component has raised an
event to deal with a HMI or not, skiboot will raise an extra HMI at the
end of the detection. This is problematic because, if one component reports
it is recoverable but another reports it is not, the last handler to be
called will have priority.
Rework this to instead only send a HMI event if no handler has raised an
event itself. This will send an unknown, unrecoverable HMI event
since the cause could not be found.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
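The reworked control flow can be summarised in a few lines. This is a Python sketch, not the skiboot code. The handler signature (a callable returning a list of events) and the event dictionaries are hypothetical.

```python
def decode_hmi(handlers):
    """Run every component handler; add a catch-all event only if none fired."""
    events = []
    for handler in handlers:
        events.extend(handler())
    if not events:
        # No component claimed the HMI: report it as unknown and unrecoverable.
        events.append({"type": "unknown", "recoverable": False})
    return events


def quiet_handler():
    return []


def npu_handler():
    return [{"type": "npu", "recoverable": True}]


claimed = decode_hmi([quiet_handler, npu_handler])
unclaimed = decode_hmi([quiet_handler])
```

When a handler raises its own event, no extra event is appended, so a later catch-all can no longer override a recoverable report.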
|
|
If reading the FIR with XSCOM failed, the existing code would not raise
a HMI under the assumption that the CPU was asleep and nothing is wrong.
Now that it is possible to check whether or not the CPU was asleep,
raise an unrecoverable HMI if the read failed for other reasons, and
skip the CPU if it was asleep.
If the CPU is asleep when it's not meant to be and that is the cause of
the HMI, an unrecoverable "catchall" HMI will be raised since no other
components will claim responsibility for it.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
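A sketch of the scan logic described above, in Python for illustration. The return codes are modelled as plain strings rather than OPAL's numeric values, and `scan_cores`, `read_fir`, and the sample FIR data are hypothetical names and values.

```python
OPAL_SUCCESS = "OPAL_SUCCESS"
OPAL_HARDWARE = "OPAL_HARDWARE"
OPAL_WRONG_STATE = "OPAL_WRONG_STATE"  # returned when the CPU is asleep


def scan_cores(read_fir, cores):
    """Return a list of (core, description) events found while scanning."""
    events = []
    for core in cores:
        rc, fir = read_fir(core)
        if rc == OPAL_WRONG_STATE:
            continue  # the CPU is asleep: skip it
        if rc != OPAL_SUCCESS:
            events.append((core, "unrecoverable: FIR read failed"))
            continue
        if fir:
            events.append((core, "FIR=%#x" % fir))
    return events


# Made-up results: core 0 is clean, core 1 is asleep, core 2 cannot be read.
fake_results = {
    0: (OPAL_SUCCESS, 0),
    1: (OPAL_WRONG_STATE, None),
    2: (OPAL_HARDWARE, None),
}
found = scan_cores(lambda core: fake_results[core], [0, 1, 2])
```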
|
|
An error message was clearly copy-pasted from the one for the previous
register, so fix it.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
If npu.h were to be used by anything that hasn't included io.h, it fails
to find the out_be64 symbol. Fix that up by making it a requirement of
npu.h.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Acked-by: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Adding a for_each_phb, similar to for_each_cpu, makes PHB traversal easy.
Suggested-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
xscom_read and xscom_write return OPAL_SUCCESS if they worked, and
OPAL_HARDWARE if they didn't. This doesn't provide information about why
the operation failed, such as if the CPU happens to be asleep.
This is particularly useful in error scanning: if every CPU is being
scanned for errors, sleeping CPUs likely aren't the cause of failures.
So, return OPAL_WRONG_STATE in xscom_read and xscom_write if the CPU is
sleeping.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
Commit a5299ba2 dropped non-severe events from logging to BMC, but I forgot
to release the error log structure.
Fixes: a5299ba2 (IPMI: Only log events that needs attention)
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Include 'extract-gcov' in make clean.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
We pre-allocate an IPMI message for the PANIC event and use that memory to
send the PANIC event to the BMC. Presently we return NULL if we have not
initialized the PANIC event message, so we cannot log early failure events.
This patch tries to initialize the IPMI message instead of returning NULL.
Also initialize elog before ipmi_sel_init; otherwise we will not be able
to create an elog message.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Presently we use the queue method (ipmi_queue_msg) to send eSEL logs
to the BMC.
There are cases like assert() where we want to commit messages
synchronously. This patch checks the log severity and logs PANIC
messages synchronously to the BMC (similar to what we do on FSP-based
systems).
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
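The severity check boils down to choosing between two delivery paths. A minimal sketch in Python follows. The log dictionaries, the `"PANIC"` string, and the `commit_esel` and `send_sync` names are hypothetical; only the idea of queueing by default and sending PANIC logs synchronously comes from the message above.

```python
def commit_esel(log, queue, send_sync):
    """Send PANIC logs immediately; queue everything else for later delivery."""
    if log["severity"] == "PANIC":
        send_sync(log)  # must reach the BMC before we stop running
    else:
        queue.append(log)


pending = []  # stands in for the asynchronous message queue
sent_now = []  # records what the synchronous path delivered

commit_esel({"severity": "INFO", "text": "routine event"}, pending, sent_now.append)
commit_esel({"severity": "PANIC", "text": "assert failed"}, pending, sent_now.append)
```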
|
|
When flash controllers get deconfigured or yanked out from under these
tools, flash accesses tend to just return all 0xFF bytes.
libffs is usually the first thing to do reads and will fail parsing its
partition structures. This patch adds reporting when it fails to parse
because it got all 0xFF bytes.
The idea is that this will help debugging by splitting the possible reasons
for a failed init into 1) a flash controller issue or reading erased flash,
and 2) flash corruption or invalid partition data. These two cases are
nice to be able to separate as early as possible as they usually mean two
quite different types of bugs.
Signed-off-by: Cyril Bur <cyril.bur@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
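The two failure classes can be told apart with a check along these lines. This is a Python sketch rather than the libffs code. The magic value, the function name, and the returned labels are assumptions made for the example.

```python
ASSUMED_MAGIC = b"PART"  # assumed header magic, for illustration only


def classify_header(buf):
    """Classify a header read from flash as ok, blank, or corrupt."""
    if len(buf) > 0 and all(byte == 0xFF for byte in buf):
        # Erased flash, or a controller that is no longer answering reads.
        return "all-0xff"
    if buf[:4] != ASSUMED_MAGIC:
        return "corrupt"
    return "ok"


blank = classify_header(b"\xff" * 16)
garbage = classify_header(b"\x00\x01\x02\x03" + b"\xff" * 12)
valid = classify_header(b"PART" + b"\x00" * 12)
```

Checking for the all-0xFF pattern before validating the magic is what lets the blank case be reported separately from ordinary corruption.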
|
|
Commit a5299ba2 dropped non-severe events from logging to BMC, but I forgot
to release the error log structure.
Fixes: a5299ba2 (IPMI: Only log events that needs attention)
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Remove *-gcov and *.d files.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
These aren't used, so remove them.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|