GCC 6 warns when we look at any stack frame other than our own, i.e. when
any argument other than zero is passed to __builtin_frame_address.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
(cherry picked from commit 793f6f5b32c96f2774bd955b6062c74a672317ca)
|
|
|
|
Fixes: f46c1e506d199332b0f9741278c8ec35b3e39135
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
(cherry picked from commit 348dacfaca9f139db2603f5c2e78d87e21938ca6)
|
|
On PCI Express, devices need to know their own bus number in order
to provide the correct source identification (aka RID) in upstream
packets they might send, such as error messages or DMAs.
However, while devices know (and hard wire) their own device and
function number, they know nothing about bus numbers by default; those
are decoded by bridges for routing. All they know is that if their
parent bridge sends a "type 0" configuration access, they should decode
it provided the device and function numbers match.
The PCIe spec thus defines that when a device receives such a configuration
access and it's a write, it should "capture" the bus number in the source
field of the packet, and re-use it as the originator bus number of all
subsequent outgoing requests.
In order to ensure that a device has this bus number firmly established
before it's likely to send error packets upstream, we should thus do a
dummy configuration write to it as soon as possible after probing.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
[stewart@linux.vnet.ibm.com: fix Evolution broken patch, write vdid rather than &vdid as per Gavin suggestion]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
(cherry picked from commit f46c1e506d199332b0f9741278c8ec35b3e39135)
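The capture behaviour described in the commit above can be illustrated with a toy model of the device side. This is only a sketch under stated assumptions: `toy_pcie_dev`, `toy_cfg_write_type0` and `toy_requester_id` are hypothetical names, and none of this is skiboot code. It shows why a device that has never seen a configuration write has no valid bus number to put in its requester ID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a PCIe endpoint; every name here is hypothetical. */
struct toy_pcie_dev {
	uint8_t devfn;        /* hard-wired device/function number */
	uint8_t captured_bus; /* bus number latched from config writes */
	bool bus_valid;       /* false until the first type 0 config write */
};

/* Deliver a type 0 configuration write carrying (bus, devfn).
 * Returns true if the device decoded the access. */
bool toy_cfg_write_type0(struct toy_pcie_dev *dev, uint8_t bus, uint8_t devfn)
{
	assert(dev != NULL);
	if (devfn != dev->devfn)
		return false;        /* not addressed to this function */
	dev->captured_bus = bus;     /* "capture" the bus number */
	dev->bus_valid = true;
	return true;
}

/* Requester ID layout: bus in bits 15:8, device/function in bits 7:0. */
uint16_t toy_requester_id(const struct toy_pcie_dev *dev)
{
	return (uint16_t)((dev->captured_bus << 8) | dev->devfn);
}
```

In this model, a single dummy write right after probing is enough to make every later upstream packet carry the correct bus number.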
|
|
|
|
The current code sends a partial HMI event (4 * 64 bits instead of
5 * 64 bits) to the host. The last 64 bits contain chip id/PIR info for
reporting checkstop events. This bug affects only checkstop events.
Host console output without this patch:
[ 305.628283] Fatal Hypervisor Maintenance interrupt [Not recovered]
[ 305.628341] Error detail: Malfunction Alert
[ 305.628388] HMER: 8040000000000000
[ 305.628423] CPU PIR: 00000000
[ 305.628458] [Unit: VSU] Logic core check stop
Host console output with this patch:
[ 200.122883] Fatal Hypervisor Maintenance interrupt [Not recovered]
[ 200.122941] Error detail: Malfunction Alert
[ 200.122986] HMER: 8040000000000000
[ 200.123021] CPU PIR: 000008e8
[ 200.123055] [Unit: VSU] Logic core check stop
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
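A bug of the kind fixed above tends to come from hard-coding the number of 64-bit words in a message. The sketch below derives the count from the structure size instead. The layout of `toy_hmi_event` is an assumption made for illustration (40 bytes, five words) and is not the real OPAL HMI event structure; `toy_pack_event` is likewise hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical event layout: 40 bytes, i.e. five 64-bit words. */
struct toy_hmi_event {
	uint8_t version, severity, type, disposition;
	uint8_t reserved[4];
	uint64_t hmer;
	uint64_t tfmr;
	uint64_t payload[2]; /* the last word stands in for chip id / PIR */
};

/* Number of 64-bit message parameters needed for one event. */
#define TOY_EVT_WORDS (sizeof(struct toy_hmi_event) / sizeof(uint64_t))

/* Copy the whole event into an array of 64-bit message parameters.
 * Returns the number of words written. */
size_t toy_pack_event(const struct toy_hmi_event *evt,
		      uint64_t *params, size_t nparams)
{
	assert(evt != NULL && params != NULL);
	assert(nparams >= TOY_EVT_WORDS);
	memcpy(params, evt, sizeof(*evt));
	return TOY_EVT_WORDS;
}
```

With the count tied to `sizeof`, growing the structure by one word cannot silently truncate the message.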
|
|
If the NPU detects an unrecoverable error, it will send a HMI. This is
problematic since unhandled HMIs will checkstop the entire system, which
is not the intended behaviour of a NPU failure. Instead, the NPU
emulated PCI devices should be fenced as part of EEH.
Add support for handling NPU HMIs. This works by finding the NPU
responsible for the HMI, checking its error registers, and sending a
recoverable HMI event. The NPU itself cannot actually recover, but the
system should not be brought down. Fence mode is set on the NPU, such
that any further operations on the NPU will trigger EEH, and it will be
subsequently fenced from the system.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Regardless of whether a handler for a specific component has raised an
event to deal with a HMI or not, skiboot will raise an extra HMI at the
end of the detection. This is problematic, as if one component reports
it is recoverable but another reports it is not, the last handler to be
called will have priority.
Rework this to instead only send a HMI event if no handler has raised an
event themselves. This will send an unknown, unrecoverable HMI event
since the cause could not be found.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
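The reworked flow above amounts to tracking whether any handler raised its own event and only falling back to a catch-all when none did. A minimal sketch, assuming a hypothetical handler type that returns true when it has raised an event (the real skiboot handlers have different signatures):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical handler type: returns true if it raised its own event. */
typedef bool (*toy_hmi_handler)(void);

/* Run every handler. Returns true if a catch-all "unknown,
 * unrecoverable" event must be sent because nobody claimed the HMI. */
bool toy_needs_catchall(const toy_hmi_handler *handlers, size_t n)
{
	bool event_raised = false;

	assert(handlers != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (handlers[i]())
			event_raised = true;
	}
	return !event_raised;
}

/* Two trivial handlers used to exercise the function. */
bool toy_handler_claims(void)  { return true; }
bool toy_handler_ignores(void) { return false; }
```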
|
|
If reading the FIR with XSCOM failed, the existing code would not raise
a HMI under the assumption that the CPU was asleep and nothing is wrong.
Now that it is possible to check whether or not the CPU was asleep,
raise an unrecoverable HMI if the read failed for other reasons, and
skip the CPU if it was asleep.
If the CPU is asleep when it's not meant to be and that is the cause of
the HMI, an unrecoverable "catchall" HMI will be raised since no other
components will claim responsibility for it.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
An error message was clearly copy-pasted from the one for the preceding
register, so fix it.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Similar to for_each_cpu, adding a for_each_phb makes PHB traversal easy.
Suggested-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
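An iteration macro of this kind might look roughly like the sketch below. It assumes PHBs live in a sparse table with NULL for unused slots; the table, `toy_next_phb` and `toy_for_each_phb` are hypothetical stand-ins, not the definitions added by the commit.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PHBS 8

struct toy_phb {
	int opal_id;
};

/* Sparse table: unused slots are NULL. */
struct toy_phb *toy_phbs[TOY_MAX_PHBS];

/* Advance *i to the next registered PHB and return it, or NULL. */
struct toy_phb *toy_next_phb(size_t *i)
{
	assert(i != NULL);
	for (; *i < TOY_MAX_PHBS; (*i)++)
		if (toy_phbs[*i])
			return toy_phbs[*i];
	return NULL;
}

/* Visit every registered PHB, skipping empty slots. */
#define toy_for_each_phb(phb, idx) \
	for ((idx) = 0; ((phb) = toy_next_phb(&(idx))) != NULL; (idx)++)

/* Example user: sum the ids of all registered PHBs. */
int toy_sum_phb_ids(void)
{
	struct toy_phb *phb;
	size_t idx;
	int sum = 0;

	toy_for_each_phb(phb, idx)
		sum += phb->opal_id;
	return sum;
}
```

Callers no longer need to repeat the bounds check and the NULL test at every traversal site.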
|
|
None of the functions declared in the header file are public. This
removes the header file. No logical changes introduced.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The UART is a simulated ns16550 with memory mapped registers.
A /simics dt node is detected and a SIMICS_QUIRK is added to chip quirks
similar to MAMBO_CALLOUTS. It can contain an ns16550 dt node with a property
console-bar.
The LPC UART code is reused and this will work without an LPC bus in the model.
Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
As per the IPMI message format, the NetFn value is present in the first
6 bits, while the last 2 bits contain the LUN value. This needs to be
taken into account when printing the NetFn value in OPAL logs, which is
useful when debugging failures.
[root@fir01 /]# ipmitool raw 0x0a 0x48
47 b1 d0 56
[root@fir01 /]#
From OPAL Logs
--------------
[133969609199,7] BT: seq 0x3d netfn 0x0a cmd 0x48: Message sent to host
[133975465455,7] BT: seq 0x3d netfn 0x0a cmd 0x48: IPMI MSG done
From BMC Logs
-------------
IPMIMain: [693 WARNING][corecmdselect.c:913]
Request: Channel:f; Netfn:a; Cmd:48;
IPMIMain: [693 INFO][corecmdselect.c:924]
Response: Channel:f; Netfn:a; Cmd:48; Data:0 47 b1 d0 56
Signed-off-by: Vipin K Parashar <vipin@linux.vnet.ibm.com>
Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
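The bit layout described above can be sketched as follows. The helper names are hypothetical; only the split of the byte into an upper six-bit NetFn and a lower two-bit LUN comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* The combined byte holds NetFn in its upper six bits and the LUN
 * in its lower two bits. */
uint8_t toy_ipmi_netfn(uint8_t netfn_lun)
{
	return (uint8_t)(netfn_lun >> 2);
}

uint8_t toy_ipmi_lun(uint8_t netfn_lun)
{
	return (uint8_t)(netfn_lun & 0x3);
}

/* Build the combined byte from its two fields. */
uint8_t toy_ipmi_pack(uint8_t netfn, uint8_t lun)
{
	assert(netfn < 64 && lun < 4);
	return (uint8_t)((netfn << 2) | lun);
}
```

For the `ipmitool raw 0x0a 0x48` request shown in the logs, NetFn 0x0a with LUN 0 packs to the byte 0x28, so printing the raw byte instead of the shifted value would show a misleading NetFn.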
|
|
Currently the null branch catcher blows away the first 16 bytes of
memory.
This patch saves them away in case we need to reinstate them later.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Make note that this will be broken for little endian.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
If we fail an assert() before we add a mem region, such as missing chip-id in a
dt xscom node, we don't get a backtrace:
[836883,0] Assert fail: core/device.c:870:id != 0xffffffff
[848859,0] Aborting!
CPU 0000 Backtrace:
This patch adjusts the top_of_ram value that the fp stack frame is compared
against so that it assumes one stack early on, and we get a backtrace:
[440546,0] Assert fail: core/device.c:822:id != 0xffffffff
[452522,0] Aborting!
CPU 0000 Backtrace:
S: 0000000031c03b70 R: 00000000300135d0 .backtrace+0x24
S: 0000000031c03bf0 R: 0000000030017f38 ._abort+0x4c
S: 0000000031c03c70 R: 0000000030017fb4 .assert_fail+0x34
S: 0000000031c03cf0 R: 0000000030021830 .dt_get_chip_id+0x24
S: 0000000031c03d60 R: 00000000300143cc .init_chips+0x23c
S: 0000000031c03e30 R: 0000000030013ab8 .main_cpu_entry+0x110
S: 0000000031c03f00 R: 000000003000254c boot_entry+0x19c
Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
get_available_nr_cores_in_chip() takes 'chip_id' as an argument and
returns the number of available cores in the chip.
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
Currently skiboot adds an empty ranges property for each PCI bridge,
representing a 1:1 mapping, which the kernel can later update if needed.
However, this does not appear to be the case, which leads to an issue in
the kernel where the translation of assigned-addresses properties is
mishandled and prematurely drops the PCI memory flags (i.e. the first
cell of an address).
Instead, explicitly describe a 1:1 mapping in each bridge's ranges
property, allowing assigned-addresses to be properly handled.
Signed-off-by: Sam Mendoza-Jonas <sam@mendozajonas.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
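To make the "explicit 1:1 mapping" concrete: for a PCI-to-PCI bridge, one ranges entry is a child address (3 cells), a parent address (3 cells) and a size (2 cells). The sketch below fills one such entry with identical child and parent addresses. The function name and the example values are assumptions, the cells are kept in host byte order for simplicity, and the actual entries written by skiboot may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* child address (3) + parent address (3) + size (2) */
#define TOY_RANGE_CELLS 8

/* Fill one identity-mapped ranges entry. 'flags' is the first address
 * cell (the one carrying the PCI address-space flags); repeating it on
 * both sides means translation leaves the address untouched. */
void toy_fill_identity_range(uint32_t *cells, uint32_t flags,
			     uint64_t base, uint64_t size)
{
	assert(cells != NULL);
	cells[0] = flags;                    /* child: flags cell */
	cells[1] = (uint32_t)(base >> 32);   /* child: address high */
	cells[2] = (uint32_t)base;           /* child: address low */
	cells[3] = flags;                    /* parent: same flags */
	cells[4] = (uint32_t)(base >> 32);   /* parent: same address */
	cells[5] = (uint32_t)base;
	cells[6] = (uint32_t)(size >> 32);   /* size high */
	cells[7] = (uint32_t)size;           /* size low */
}
```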
|
|
commit 2e4cc4dca8c0d31138adc52076b38d80c5a6bef0 upstream
Find the phb index with capp_phb3_attached_mask.
Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Merge device tree sorting
|
|
Moved the dt_dump() into test/dt_common.c so that it can be shared
between hdata/test/hdata_to_dt.c and core/test/run-device.c
run-device.c contains two tests, one basic sorting test and a
generate-and-sort test.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
[stewart@linux.vnet.ibm.com: remove trailing whitespace]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
When unflattening (or building from hdat) a device tree, child nodes are added in
the order in which they are encountered. For nodes that have a <basename>@<unit>
style name with a common basename it is useful to have them in the tree sorted
by the unit in ascending order. Currently this requires the source data to
present them in sorted order, but this isn't always the case.
This patch modifies the node insertion process to insert new nodes in the
correct location so the list of child nodes is sorted.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
[stewart@linux.vnet.ibm.com: remove trailing whitespace]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
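The insertion strategy described above can be sketched with a singly linked list. This is a simplified illustration, not skiboot's implementation: `toy_node`, `toy_cmp_names` and `toy_insert_sorted` are hypothetical, and the comparison assumes unit addresses are plain hexadecimal numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct toy_node {
	const char *name;        /* "<basename>@<unit>" or "<basename>" */
	struct toy_node *next;
};

/* Nodes sharing a basename are ordered by unit address, parsed as
 * hex; everything else falls back to a plain string comparison. */
int toy_cmp_names(const char *a, const char *b)
{
	const char *au = strchr(a, '@');
	const char *bu = strchr(b, '@');

	if (au && bu && (au - a) == (bu - b) &&
	    !strncmp(a, b, (size_t)(au - a))) {
		unsigned long long an = strtoull(au + 1, NULL, 16);
		unsigned long long bn = strtoull(bu + 1, NULL, 16);
		return (an > bn) - (an < bn);
	}
	return strcmp(a, b);
}

/* Insert 'n' before the first node that sorts after it, so the list
 * stays sorted whatever order the nodes arrive in. */
void toy_insert_sorted(struct toy_node **head, struct toy_node *n)
{
	assert(head != NULL && n != NULL);
	while (*head && toy_cmp_names((*head)->name, n->name) <= 0)
		head = &(*head)->next;
	n->next = *head;
	*head = n;
}
```

Comparing the unit numerically matters: a plain string sort would place "pci@10" before "pci@2".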
|
|
commit 56bc1890b229072513788992d1d29b6f173c13de upstream
We create our own inttypes.h to get the correct printf formatting for
64bit numbers.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
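For context, the point of the `inttypes.h` format macros is that the length modifier for a 64-bit integer differs between targets. A small sketch using the standard `PRIx64` macro (the wrapper `toy_format_u64` is a hypothetical helper):

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit value portably. PRIx64 expands to whatever length
 * modifier uint64_t needs on the target, so no cast is required. */
int toy_format_u64(char *buf, size_t len, uint64_t v)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "0x%016" PRIx64, v);
}
```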
|
|
Find the phb index with capp_phb3_attached_mask.
Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
uart consoles only flush output when polled. The Linux kernel calls
these pollers frequently, except when in a panic state. As such, panic
messages are not fully printed unless the system is configured to reboot
after panic.
This patch adds a new call to the OPAL API to flush the buffer. If the
system has a uart console (i.e. BMC machines), it will incrementally
flush the buffer, returning whether there is more to be flushed. If
the system has a different console, the function will have no effect.
This will allow the Linux kernel to ensure that panic messages have been
fully printed out.
The existing synchronous flushing mechanism used in OPAL's shutdown and
reboot routines has been refactored into a helper that repeatedly calls
the new partial flush function.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
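The relationship between the incremental flush and the synchronous helper might be sketched like this. The return codes, the `toy_console` structure and both function names are assumptions made for the example; they are not the real OPAL constants or skiboot functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes; not the real OPAL constants. */
enum toy_rc { TOY_SUCCESS = 0, TOY_PARTIAL = 1 };

struct toy_console {
	size_t pending;  /* bytes still buffered */
	size_t chunk;    /* bytes the device accepts per call */
};

/* Flush at most one chunk and report whether more remains. */
enum toy_rc toy_flush_partial(struct toy_console *con)
{
	size_t n;

	assert(con != NULL && con->chunk > 0);
	n = con->pending < con->chunk ? con->pending : con->chunk;
	con->pending -= n;
	return con->pending ? TOY_PARTIAL : TOY_SUCCESS;
}

/* Synchronous helper: call the partial flush until nothing is left.
 * Returns how many calls were needed. */
unsigned toy_flush_sync(struct toy_console *con)
{
	unsigned calls = 0;

	do {
		calls++;
	} while (toy_flush_partial(con) == TOY_PARTIAL);
	return calls;
}
```

A caller that cannot block for long (such as a panicking kernel) can use the partial call in its own loop, while shutdown paths use the synchronous wrapper.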
|
|
p5ioc2 is used by approximately 2 machines in the world, and has never
ever been a supported configuration.
Not only is the code virtually unused and very tricky to test, but
keeping it around is making life unnecessarily difficult:
- It's more complexity to manage for things such as PCI slot support
- It's more code for static analysis to cover, which means more time
fixing bugs that affect no-one.
- It's bloating every single install of skiboot for no benefit.
- It's reducing coverage stats, which is sad.
Drop p5ioc2.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
No functional change, but static analysis showed up the oddity of
something that is generally unsigned (opal_id) having a signed
value assigned to it.
Took the opportunity to use a define to increase readability.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
We create our own inttypes.h to get the correct printf formatting for
64bit numbers.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Modify the OCC reset order such that the master OCC is reset after the
slave OCCs are reset. In Tuleta/Alpine systems 'proc0' will always be
the master OCC, which has to be stopped last when FSP sends the OCC_RESET
command to OPAL.
This fixes BZ 119718, SW289036
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
the DT
Commit 75e9440 (PCI: Trace device node from PCI device) introduced the
ability for PCI devices to do fixups based on information added to the
device tree as part of device initialisation.
However that patch called the fixups during device initialisation
meaning not all devices present in the system had been added to the
device tree. Depending on device initialisation order this means some
devices were not detected by the fixup code.
This patch defers the calls to the fixup code until after all PCI
devices have been added to the device tree.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
OPAL needs an extra compatible property "ibm,opal-sensor" to make
module autoload work smoothly in Linux for the ibmpowernv driver.
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The error paths here are a bit suspicious anyway as we're allocating
memory in a failure path for having failed to allocate memory.
One can hope that the memory allocated to display the error is less
than the memory we attempted to allocate in the first place.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Even though we're heading to abort(), it seems like some static
analysis checkers still think we're leaking. So.... well, let's
just make them happy and free the memory. It's harmless to do
that.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
No resulting code changes due to skiboot being all BE
core/opal-msg.c:58:29: warning: incorrect type in assignment (different base types)
core/opal-msg.c:58:29: expected restricted beint32_t [usertype] msg_type
core/opal-msg.c:58:29: got int enum opal_msg_type [signed] msg_type
core/opal-msg.c:120:50: warning: restricted beint64_t degrades to integer
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
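The sparse warnings above come from assigning plain integers to fields annotated as big-endian. The sketch below mimics that discipline in plain C by wrapping the value in a struct, so the compiler itself rejects a direct assignment. `toy_be32` and the helpers are hypothetical stand-ins for skiboot's annotated types and conversion macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an annotated big-endian type: the bytes are always
 * stored most significant first, whatever the host byte order. */
typedef struct { uint8_t b[4]; } toy_be32;

toy_be32 toy_cpu_to_be32(uint32_t v)
{
	toy_be32 out = { { (uint8_t)(v >> 24), (uint8_t)(v >> 16),
			   (uint8_t)(v >> 8), (uint8_t)v } };
	return out;
}

uint32_t toy_be32_to_cpu(toy_be32 v)
{
	return ((uint32_t)v.b[0] << 24) | ((uint32_t)v.b[1] << 16) |
	       ((uint32_t)v.b[2] << 8) | (uint32_t)v.b[3];
}

/* Storing a host-order value into a big-endian field must go through
 * an explicit conversion; "*field = msg_type" would not compile. */
void toy_set_msg_type(toy_be32 *field, uint32_t msg_type)
{
	assert(field != NULL);
	*field = toy_cpu_to_be32(msg_type);
}
```

On a big-endian build the conversion is a no-op in effect, which matches the note that no resulting code changes are expected.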
|
|
core/utils.c:25:35: warning: constant 0xdeadf00dbaad300d is so big it is unsigned long
core/utils.c:25:15: warning: symbol '__stack_chk_guard' was not declared. Should it be static?
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Need to flip things appropriately for endian annotations
No actual functional changes since skiboot is still BE, but we're
a bit more explicit about the fact the ABI is BE.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This fixes many spurious sparse warnings
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
core/trace.c:106:23: warning: incorrect type in assignment (different base types)
core/trace.c:106:23: expected restricted beint16_t [usertype] prev_len
core/trace.c:106:23: got int
Never read anywhere (by anyone), but silences a warning by doing the right thing.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
core/trace.c:43:42: warning: incorrect type in assignment (different base types)
core/trace.c:43:42: expected restricted beint64_t static [toplevel] [usertype] mask
core/trace.c:43:42: got int
core/trace.c:44:46: warning: incorrect type in assignment (different base types)
core/trace.c:44:46: expected restricted beint32_t static [toplevel] [usertype] max_size
core/trace.c:44:46: got unsigned long
Shouldn't affect any runtime code, just cleans up the warnings.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Completely flush the output buffer of the console driver before
power down and reboot. Implements the flushing function for uart
consoles, which includes the astbmc and rhesus platforms.
Adds a new function, flush(), to the con_ops struct that allows
each console driver to specify how their output buffers are flushed.
In the cec_power_down and cec_reboot functions, the flush function
of the driver is called if it exists.
This fixes an issue where some console output is sometimes lost before
power down or reboot in uart consoles. If this issue is also prevalent
in other console types then it can be fixed later by adding a .flush
to that driver's con_ops.
Signed-off-by: Russell Currey <ruscur@russell.cc>
[stewart@linux.vnet.ibm.com: reduce diff size, change flush function name]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
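The optional hook described above follows a common pattern: an ops table with a function pointer that may be left NULL. A minimal sketch, assuming a cut-down `toy_con_ops` with only the flush member (the real con_ops struct has other members and a different flush signature):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down console ops table. */
struct toy_con_ops {
	void (*flush)(void);  /* NULL if the driver has nothing to flush */
};

/* A uart-like driver that implements the hook. */
int toy_uart_flushes;
void toy_uart_flush(void) { toy_uart_flushes++; }
const struct toy_con_ops toy_uart_ops = { .flush = toy_uart_flush };

/* A driver that leaves the hook unset. */
const struct toy_con_ops toy_dummy_ops = { .flush = NULL };

/* Power-down / reboot path: flush only if the driver provides a hook.
 * Returns 1 if the hook ran, 0 otherwise. */
int toy_flush_before_shutdown(const struct toy_con_ops *ops)
{
	assert(ops != NULL);
	if (!ops->flush)
		return 0;
	ops->flush();
	return 1;
}
```

Other console types can opt in later simply by filling in the member, without touching the shutdown path.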
|
|
In some simulation environments, we simulate a system close to an
ibm-fsp system but with a crucial difference: we don't simulate OCCs.
This means that for a P8 (well, a simulated one) that looks like it's
part of an ibm-fsp system, we'd wait around for about a minute to be
asked to start OCCs and for the OCCs to start. Obviously, this would
never happen and we'd hit the OCC initialization timeout, (correctly)
logging an error.
However, in this simulation environment it isn't an error, and the
information needed to work that out is (at least now) provided in hdat
under 'OCC Functional State'.
Previously, the ibm,occ-functional-state property was just passed
through the device tree to the host through the XSCOM node and
skiboot ignored it.
This patch takes note of occ-functional-state and skips waiting for
OCCs on any chips that have been marked as having a non-functional
OCC.
In such simulation environments this means we:
a) don't log an error that isn't really an error
b) boot 1 minute quicker as we don't hit the timeout.
Tested-by: Gajendra B Bandhu1 <gbandhu1@in.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
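The skip logic above boils down to excluding chips marked non-functional from the set being waited on. A sketch under stated assumptions: `toy_chip` and `toy_occs_to_wait_for` are hypothetical, and the flag simply mirrors the idea of the ibm,occ-functional-state property rather than reading it from a device tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_chip {
	bool occ_functional; /* mirrors an "OCC functional state" flag */
	bool occ_started;    /* has this chip's OCC come up yet? */
};

/* Count the OCCs we still have to wait for. Chips whose OCC is marked
 * non-functional are skipped, so a result of zero means there is
 * nothing to wait for and no timeout error to log. */
size_t toy_occs_to_wait_for(const struct toy_chip *chips, size_t n)
{
	size_t waiting = 0;

	assert(chips != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (chips[i].occ_functional && !chips[i].occ_started)
			waiting++;
	}
	return waiting;
}
```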
|
|
Completely flush the output buffer of the console driver before
power down and reboot. Implements the flushing function for uart
consoles, which includes the astbmc and rhesus platforms.
Adds a new function, flush(), to the con_ops struct that allows
each console driver to specify how their output buffers are flushed.
In the cec_power_down and cec_reboot functions, the flush function
of the driver is called if it exists.
This fixes an issue where some console output is sometimes lost before
power down or reboot in uart consoles. If this issue is also prevalent
in other console types then it can be fixed later by adding a .flush
to that driver's con_ops.
Signed-off-by: Russell Currey <ruscur@russell.cc>
[stewart@linux.vnet.ibm.com: reduce diff size, change flush function name]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
05f52a8dd7c7e402896e049fd24f83d56b70aff4 core: Setup the OPAL DT node
before platform probe
add_cpu_idle_state_properties() was made local to slw.c in the above
commit, which caused p7 systems to not populate the nap idle state in
the DT. Move add_cpu_idle_state_properties() to add_opal_node() to fix
this bug.
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|