Age | Commit message | Author | Files | Lines |
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Today, when run on an IBM Power system, opal-prd complains in syslog
with a set of messages similar to these:
opal-prd: CTRL: Starting PRD daemon
opal-prd: I2C: Found Chip: 00000000 engine 1 port 0
opal-prd: I2C: Found Chip: 00000010 engine 1 port 0
opal-prd: CTRL: Listening on control socket /run/opal-prd-control
opal-prd: FW: Can't open PRD device /dev/opal-prd: No such file or directory
opal-prd: FW: Error initialising PRD channel
opal-prd: CTRL: stopping PRD daemon
These are difficult to interpret for someone unfamiliar with Power
firmware.
The patch below detects whether the platform supports PRD by looking
at the device tree property:
/sys/firmware/devicetree/base/ibm,opal/diagnostics/compatible
and stops opal-prd early in the main routine with an explicit
message for the user.
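A minimal sketch of that early check, assuming it is enough to test whether the property file exists. The helper name prd_node_present() is hypothetical and not taken from the patch; the real code may also inspect the property's contents.

```c
#include <stdbool.h>
#include <unistd.h>

/* Device tree property path quoted in the commit message. */
#define PRD_COMPATIBLE_PATH \
	"/sys/firmware/devicetree/base/ibm,opal/diagnostics/compatible"

/*
 * Hypothetical helper: returns true if the given property file exists.
 * A daemon could call this with PRD_COMPATIBLE_PATH at the top of main()
 * and exit with a clear message when it returns false.
 */
static bool prd_node_present(const char *path)
{
	return access(path, F_OK) == 0;
}
```

Checking before opening /dev/opal-prd lets the daemon print one explicit message instead of the chain of errors shown above.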
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The story of extract-gcov is not necessarily a pleasant one, involving
GCC internals, padding of data structures, differences in data structures
that are designed to change whenever GCC wants to and a strong desire
to not implement a VFS in skiboot or some other streaming interface
(and associated userspace and other such blergh).
This patch makes us be all explicit about padding in the structures,
enabling -Wpadded for extract-gcov.c.
We also get all strict over the size of things and add support for
gcc 5.1, which added an extra counter.
There is likely GCC hacking in my future to make this a lot less
fragile.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Cyril Bur <cyril.bur@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
In commit 8f5b8616, we introduced a dependency on libgcc, for the
__builtin_parityl() function in commit 6cfaa3ba.
However, if we're building with a biarch compiler, we may not have a
libgcc available.
This commit removes the __builtin_parityl() call, replaces it with the
equivalent instructions, and removes the dependency on libgcc.
Although this is untested, I have confirmed that the __builtin_parityl()
function emits the same instructions (on power7 and power8, with
gcc-4.9) as we're using in the parity() function.
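For readers unfamiliar with what a parity helper computes, here is a portable sketch in plain C. It is an illustration only: it uses an XOR fold rather than the Power instructions the commit emits via inline asm, and the name parity64() is an assumption.

```c
#include <stdint.h>

/*
 * Portable illustration (not the inline asm from the commit):
 * returns 1 if an odd number of bits are set in v, 0 otherwise.
 * Each step XORs the upper half of the remaining bits into the lower
 * half, so bit 0 ends up holding the XOR of all 64 bits.
 */
static inline unsigned int parity64(uint64_t v)
{
	v ^= v >> 32;
	v ^= v >> 16;
	v ^= v >> 8;
	v ^= v >> 4;
	v ^= v >> 2;
	v ^= v >> 1;
	return (unsigned int)(v & 1);
}
```

Like the inline asm approach, this needs no libgcc support, though it costs a few more instructions than a dedicated parity instruction.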
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
[stewart@linux.vnet.ibm.com: only use inline asm for skiboot build]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Since we are anyway on the way to standby and apparently the other
hypervisor also does this.
Tested-by: Vipin K Parashar <vipin@linux.vnet.ibm.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
When doing an MSI EOI, we update the P and Q bits in the IVE. That causes
the corresponding cache line to be dirty in the L3 which will cause a
subsequent update by the PHB (upon receiving the next MSI) to get a few
retries until it gets flushed.
We can improve the situation (and thus performance) by doing a dcbf
instruction to force a flush of the update we do in SW.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
When we got FSP_STATUS_TOD_RESET or similar, we would return OPAL_BUSY,
which would cause the Linux OPAL RTC driver to retry in a loop until
we stopped saying we were busy.
The problem with this is that some errors, such as FSP_STATUS_TOD_RESET,
are in fact permanent until we (say) set the time explicitly, so no
matter how hard that little Linux driver tries, it's never going to
break out of that loop.
The fix is to correct our use of the state machine introduced way back in
6cf8b663e7d7cb1e827b6d9c90e694ea583f6f87 so that we return an error code
to Linux.
Reported-by: Cédric Le Goater <clg@fr.ibm.com>
Fixes: 6cf8b663e7d7cb1e827b6d9c90e694ea583f6f87
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Reviewed-by: Cédric Le Goater <clg@fr.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
In a scenario where the DPO has been initiated, but the FSP then went into
reset before the CEC power down came in, OPAL may not give up the link since
it may never see the PSI interrupt. So, if we are in dpo_pending and an FSP
reset is detected via the DISR, give up the PSI link voluntarily.
Tested-by: Vipin K Parashar <vipin@linux.vnet.ibm.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
OPAL needs an extra compatible property "ibm,opal-sensor" to make
module autoload work smoothly in Linux for the ibmpowernv driver.
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Completely flush the output buffer of the console driver before
power down and reboot. Implements the flushing function for uart
consoles, which includes the astbmc and rhesus platforms.
Adds a new function, flush(), to the con_ops struct that allows
each console driver to specify how their output buffers are flushed.
In the cec_power_down and cec_reboot functions, the flush function
of the driver is called if it exists.
This fixes an issue where some console output is sometimes lost before
power down or reboot in uart consoles. If this issue is also prevalent
in other console types then it can be fixed later by adding a .flush
to that driver's con_ops.
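The optional-hook pattern described above can be sketched as follows. The struct is a simplified stand-in for skiboot's con_ops, and every name ending in _sketch or starting with example_ is hypothetical.

```c
#include <stddef.h>

/* Simplified stand-in for con_ops: only the members needed here. */
struct con_ops_sketch {
	size_t (*write)(const char *buf, size_t len);
	void (*flush)(void);	/* optional: NULL if the driver has none */
};

/* Example driver that implements flush (it just counts calls). */
static int example_uart_flush_calls;

static void example_uart_flush(void)
{
	example_uart_flush_calls++;
}

static const struct con_ops_sketch example_uart_ops = {
	.write = NULL,
	.flush = example_uart_flush,
};

/* Example driver without a flush hook. */
static const struct con_ops_sketch example_other_ops = {
	.write = NULL,
	.flush = NULL,
};

/*
 * Hypothetical helper for the power down and reboot paths: call the
 * driver's flush hook if one exists. Returns 1 if flush was called.
 */
static int console_flush_if_supported(const struct con_ops_sketch *ops)
{
	if (ops && ops->flush) {
		ops->flush();
		return 1;
	}
	return 0;
}
```

Leaving the hook NULL by default means drivers that don't need flushing require no changes.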
Signed-off-by: Russell Currey <ruscur@russell.cc>
[stewart@linux.vnet.ibm.com: reduce diff size, change flush function name]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
On some BMC firmware revisions, we need to copy over a pflash
binary and we need to ensure that the executable bit is set.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
On NX checkstop, OPAL needs to signal PRD about it by setting the
NXDMAENGFIR[38] bit. Otherwise PRD will not be able to do NX unit checkstop
error analysis. NXDMAENGFIR[38] is a spare bit and is used to report a
software-initiated attention for NX checkstop.
The behavior of this bit and all other FIR bits is documented in the RAS
spreadsheet.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
A while loop in the wait_for_subcore_threads() function spins until one
thread from each subcore completes the pre-cleanup task and sets a
cleanup done bit.
while (!(*(this_cpu()->core_hmi_state_ptr) & HMI_STATE_CLEANUP_DONE))
cpu_relax();
Without a memory barrier, we see that the compiler optimizes the above while
loop so that it does not re-fetch the data from the memory pointed to by
this_cpu()->core_hmi_state_ptr. This makes the CPU spin forever
even though the other CPUs have modified the data, causing a soft lockup in
the kernel.
There are two ways to fix this: 1) add the volatile qualifier to force
a re-read of the fresh value from memory, or 2) add a barrier() call to
cpu_relax(). The second approach will avoid similar bugs in future.
This patch uses the second approach to fix this issue.
This patch also introduces a timeout for the while loop to handle the
worst-case situation where all other threads are badly stuck without setting
the cleanup done bit. In that situation the timeout helps to avoid soft
lockups and reports the failure to the kernel.
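A minimal sketch of the second approach combined with a bounded wait, assuming a GCC-compatible compiler. The flag value, the function names and the use of an iteration count as the timeout are illustrative assumptions, not the patch itself.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Compiler barrier (GCC/Clang syntax): the "memory" clobber tells the
 * compiler that memory may have changed, so values cached in registers
 * must be re-read after this point.
 */
#define barrier_sketch() __asm__ __volatile__("" ::: "memory")

static inline void cpu_relax_sketch(void)
{
	barrier_sketch();
}

#define STATE_CLEANUP_DONE_SKETCH 0x1u	/* hypothetical flag value */

/*
 * Spin until the flag is set in *state or until max_spins iterations
 * have passed. Returns true if the flag was seen, false on timeout.
 */
static bool wait_for_flag_sketch(const uint64_t *state, uint64_t max_spins)
{
	while (!(*state & STATE_CLEANUP_DONE_SKETCH)) {
		if (max_spins-- == 0)
			return false;
		cpu_relax_sketch();
	}
	return true;
}
```

Putting the barrier inside the relax helper means every spin loop that uses it re-reads memory, which is why the commit prefers it over marking individual variables volatile.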
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
[stewart@linux.vnet.ibm.com: add explanation as to why we don't use timebase for timeout]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Fixes: 8f433d6cd4f92b4f878e5ddc414e2800a2fb7140
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
This removes the unnecessary message below from phb3_sm_fundamental_reset(),
as a subsequent message already indicates the situation.
Performing PERST...
Also, this lowers the log level of all messages in this
function to DEBUG.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
When issuing a fundamental reset on the IPR adapter below, which sits
behind the root complex, there is a 50% chance that the link
fails to come up after the reset. In that case, the adapter's
config space is blocked and it's not usable.
host# lspci -ns 0004:01:00.0
0004:01:00.0 0104: 1014:034a (rev 01)
host# lspci -s 0004:01:00.0
0004:01:00.0 RAID bus controller: IBM PCI-E IPR SAS Adapter
(ASIC) (rev 01)
This introduces another PHB3 state (PHB3_STATE_FRESET_START)
that allows the fundamental reset to be redone if the link doesn't come
up in time at the first attempt, to improve the robustness of the PHB's
fundamental reset. If the link comes up after the first reset,
the 2nd reset won't be issued at all.
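The retry idea can be sketched as a tiny state machine. This is an illustration under assumptions: the state names, the attempt limit of two, and the single link_up input are hypothetical and much simpler than the real PHB3 state machine.

```c
#include <stdbool.h>

enum freset_state_sketch {
	FRESET_START_SKETCH,		/* (re)issue the fundamental reset */
	FRESET_WAIT_LINK_SKETCH,	/* wait for the link to come up */
	FRESET_DONE_SKETCH,
	FRESET_FAILED_SKETCH,
};

struct freset_sm_sketch {
	enum freset_state_sketch state;
	int attempts;
};

#define FRESET_MAX_ATTEMPTS_SKETCH 2	/* first try plus one retry */

/*
 * Advance the state machine by one step. link_up is only consulted in
 * the wait state and means "the link came up within the allowed time".
 */
static void freset_step_sketch(struct freset_sm_sketch *sm, bool link_up)
{
	switch (sm->state) {
	case FRESET_START_SKETCH:
		sm->attempts++;
		sm->state = FRESET_WAIT_LINK_SKETCH;
		break;
	case FRESET_WAIT_LINK_SKETCH:
		if (link_up)
			sm->state = FRESET_DONE_SKETCH;
		else if (sm->attempts < FRESET_MAX_ATTEMPTS_SKETCH)
			sm->state = FRESET_START_SKETCH;	/* retry */
		else
			sm->state = FRESET_FAILED_SKETCH;
		break;
	default:
		break;	/* terminal states: nothing to do */
	}
}
```

Having a dedicated start state gives the wait state somewhere to loop back to, and the retry is skipped entirely when the link comes up the first time.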
Reported-by: Paul Nguyen <nguyenp@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This bug has existed since day 1 (of public release). What was going on
was that we were incorrectly using PSI_DMA_LOC_COD_BUF as the *address*
to write to for the FSP to read, rather than using it purely as the
TCE table.
What we *should* have been doing (and this patch now does) is allocating
some (aligned) memory and using it.
With this patch, we no longer write over some poor random memory location
that could be in use by the host OS for something important. For example,
in the (internal) bug report of this, it was futex_hash_bucket in Linux
being replaced with our structure for replying to FSP_CMD_GET_LED_LIST (which
is around 4KB), and Linux doesn't like it when you replace a bunch of lock
data structures with essentially garbage.
Since this is FSP LED code specific, this only affects FSP based systems.
Reported-by: Dionysius d. Bell <belldi@us.ibm.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Tag skiboot-5.1.6
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
|
|
Fixes: 55ae15b
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This removes the unnecessary message below from phb3_sm_fundamental_reset(),
as a subsequent message already indicates the situation.
Performing PERST...
Also, this lowers the log level of all messages in this
function to DEBUG.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
When issuing a fundamental reset on the IPR adapter below, which sits
behind the root complex, there is a 50% chance that the link
fails to come up after the reset. In that case, the adapter's
config space is blocked and it's not usable.
host# lspci -ns 0004:01:00.0
0004:01:00.0 0104: 1014:034a (rev 01)
host# lspci -s 0004:01:00.0
0004:01:00.0 RAID bus controller: IBM PCI-E IPR SAS Adapter
(ASIC) (rev 01)
This introduces another PHB3 state (PHB3_STATE_FRESET_START)
that allows the fundamental reset to be redone if the link doesn't come
up in time at the first attempt, to improve the robustness of the PHB's
fundamental reset. If the link comes up after the first reset,
the 2nd reset won't be issued at all.
Reported-by: Paul Nguyen <nguyenp@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
When deciding if a BT message has timed out we should first check for
a message response. This will ensure that messages will not time out
if there was a delay calling the pollers.
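The ordering matters more than the details, so here is a small sketch of it. The function, its parameters and the result names are hypothetical; time values are in arbitrary ticks.

```c
#include <stdbool.h>
#include <stdint.h>

enum bt_poll_result_sketch {
	BT_POLL_PENDING_SKETCH,
	BT_POLL_COMPLETED_SKETCH,
	BT_POLL_TIMED_OUT_SKETCH,
};

/*
 * Decide the fate of an in-flight message. The response check comes
 * first, so a message whose reply has already arrived is never
 * declared timed out just because the poller ran late.
 */
static enum bt_poll_result_sketch
bt_poll_sketch(bool response_ready, uint64_t now, uint64_t sent_at,
	       uint64_t timeout)
{
	if (response_ready)
		return BT_POLL_COMPLETED_SKETCH;
	if (now - sent_at >= timeout)
		return BT_POLL_TIMED_OUT_SKETCH;
	return BT_POLL_PENDING_SKETCH;
}
```

With the checks in the opposite order, a late poller call would report a timeout even though the response was already waiting.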
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
In root causing a bug on AST BMC Alistair found that pollers weren't
being run for around 3800ms.
This was due to a wonderful accident that's probably about a year or more
old where:
In cpu_wait_job we have:
unsigned long ticks = usecs_to_tb(5);
...
time_wait(ticks);
While in time_wait(), when deciding whether to run pollers:
unsigned long period = msecs_to_tb(5);
...
if (remaining >= period) {
Obviously, this means we never run pollers. Not ideal.
This patch ensures we run pollers every 5ms in cpu_wait_job(), as well
as displaying how long we waited for a job if that wait was >1 second.
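The unit mismatch is easy to see with numbers. The sketch below assumes a 512 MHz timebase purely for illustration; the conversion helpers are simplified stand-ins with hypothetical names, not skiboot's usecs_to_tb() and msecs_to_tb().

```c
#include <stdint.h>

/* Assumed timebase frequency, for illustration only. */
#define TB_HZ_SKETCH 512000000ULL

static inline uint64_t usecs_to_tb_sketch(uint64_t us)
{
	return us * (TB_HZ_SKETCH / 1000000);
}

static inline uint64_t msecs_to_tb_sketch(uint64_t ms)
{
	return ms * (TB_HZ_SKETCH / 1000);
}

/* Mirrors the "if (remaining >= period)" test quoted above. */
static int would_run_pollers_sketch(uint64_t remaining, uint64_t period)
{
	return remaining >= period;
}
```

Under this assumption a 5 microsecond wait is 2560 ticks while the 5 millisecond poller period is 2560000 ticks, so the comparison can never succeed for the short waits issued by cpu_wait_job().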
Reported-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Some BMC firmware versions don't ship pflash.
Support a PFLASH_TO_COPY environment variable that points to a pflash
binary built for the BMC, which will be copied over and used to pflash
the partition or whole pnor.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
It works just like P8, we copy the code for now rather than make
it somewhat common due to our locking differences and to limit
the risk close to release. We can refactor later.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
We didn't pass the right "is_write" argument for writes and
the string used for logging was somewhat confusing.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
When struct phb3::has_link is set to true, the downstream link of
root port is up, not down. This fixes the incorrect comments.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Our normal sequence for a soft power action (IPMI 'power soft' or
'power cycle') involves receiving a SEL from the BMC, sending a message
to Linux's opal platform support which instructs the host OS to shut
down, and finally the host requesting OPAL to cut power.
When the host is not yet up we will send the message to /dev/null, and
no action will be taken. This patch changes that behaviour to perform
the action immediately if we know how.
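A rough sketch of that decision, assuming a single "host has booted" flag is available. The enum names and the mapping of 'power cycle' to a reboot are assumptions for illustration, not a description of the actual patch.

```c
#include <stdbool.h>

enum power_action_sketch {
	ACTION_SOFT_OFF_SKETCH,		/* IPMI 'power soft' */
	ACTION_POWER_CYCLE_SKETCH,	/* IPMI 'power cycle' */
};

enum power_outcome_sketch {
	OUTCOME_NOTIFIED_HOST_SKETCH,	/* host OS asked to shut down */
	OUTCOME_POWERED_DOWN_SKETCH,	/* firmware cut power directly */
	OUTCOME_REBOOTED_SKETCH,	/* firmware rebooted directly */
};

/*
 * If the host is up, hand the request to it as before. Otherwise there
 * is nobody to receive the message, so act on it immediately.
 */
static enum power_outcome_sketch
handle_soft_power_sketch(enum power_action_sketch action, bool host_booted)
{
	if (host_booted)
		return OUTCOME_NOTIFIED_HOST_SKETCH;
	if (action == ACTION_SOFT_OFF_SKETCH)
		return OUTCOME_POWERED_DOWN_SKETCH;
	return OUTCOME_REBOOTED_SKETCH;
}
```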
Signed-off-by: Joel Stanley <joel@jms.id.au>
[stewart@linux.vnet.ibm.com: modify checking of OPAL_BOOT_COMPLETE flag, typo]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This tells us when we've entered the host. The first use case is knowing
whether we can rely on host communication working, such as receiving and
acting on an opal_msg.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
[stewart@linux.vnet.ibm.com: use real bit field rather than C bitfield]
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
We've seen various IPMI timeouts during testing (mainly hit
by petitboot) but it seems that 5 seconds is the magic value
that matches everywhere.
This echoes what we use in petitboot, so at least being consistent
with ourselves is a good idea.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
We disallow injecting errors into the reserved PE#, which is 255 rather
than 0 on PHB3. Because the check used 0, OPAL_PARAM was returned when
injecting an error into PE#0.
This fixes the above issue by checking against the correct PE number, 255.
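The corrected check amounts to comparing against 255 instead of 0. In the sketch below the return codes and the function name are hypothetical stand-ins, and only the reserved-PE test is shown.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the OPAL return codes. */
#define SKETCH_RC_SUCCESS	0
#define SKETCH_RC_PARAMETER	(-1)

/* Reserved PE number on PHB3, per the commit message. */
#define PHB3_RESERVED_PE_SKETCH	255

/* Reject error injection into the reserved PE only. */
static int check_err_inject_pe_sketch(uint32_t pe_no)
{
	if (pe_no == PHB3_RESERVED_PE_SKETCH)
		return SKETCH_RC_PARAMETER;
	return SKETCH_RC_SUCCESS;
}
```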
Reported-by: Pradeep Ramanna <pramann2@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
In the event of a lot of OCC events (or many CPU cores), we could
send many OCC messages to the host. If the host wasn't calling
opal_get_msg really often, skiboot would malloc() additional
messages until we ran out of skiboot heap and things didn't end up
being much fun.
Certain hardware exercisers seem to steal all the time Linux would
otherwise use to call opal_get_msg, causing these messages to queue up
and produce "opalmsg: No available node in the free list, allocating" warnings
followed by tonnes of backtraces of failing memory allocations.
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
SLW image)
Memory regions in skiboot have an interesting life cycle. First, we get
a bunch from the initial device tree or hdat specifying some existing
reserved ranges (as well as adding some of our own if they're missing)
but we also get ranges for the entirety of RAM.
The idea is that we can do node local allocations for per node resources
(which we do) and then, just prior to booting linux, we copy the reserved
memory regions to expose to linux along with a set of reserved regions
to cover the node local allocations.
The problem was that mem_range_is_reserved() wanted subtly different
semantics for memory region type than region_is_reserved() provided.
That is, we were overriding the meaning of REGION_SKIBOOT_HEAP to mean both
"this is reserved by skiboot" *and* "this is a memory region that covers
all of memory and will be shrunk to cover just the memory we have allocated
for it just before we boot the payload (linux)".
So what would happen is we would ask "hey, is the memory holding the SLW
image reserved?" and we'd get the answer of "yes" but referring to the memory
region that covers the entirety of memory in a NUMA node, *not* meaning
our intent of "this will be reserved when we start linux".
To fix this, introduce a new memory region type REGION_MEMORY. This has
the semantics of a memory region that covers a block of memory that we can
allocate from (using local_alloc) and that the part that was allocated
will be passed to linux as reserved, but that the entire range will not
be reserved.
So our new semantics are:
- region_is_reservable() is true if the region *MAY* be reserved
(i.e. is one of the regions that cover the whole of memory OR is explicitly reserved)
- region_is_reserved() is true if the region *WILL* be reserved
(i.e. is explicitly reserved)
This way we check that the SLW image is explicitly reserved and if it isn't,
we reserve it.
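The two predicates can be sketched as follows. REGION_SKIBOOT_HEAP and REGION_MEMORY come from the text above; the other enum members, the _SKETCH and _sketch suffixes, and the exact comparisons are assumptions made for illustration.

```c
#include <stdbool.h>

enum region_type_sketch {
	REGION_SKIBOOT_HEAP_SKETCH,	/* reserved by skiboot */
	REGION_MEMORY_SKETCH,		/* covers a block we allocate from */
	REGION_RESERVED_SKETCH,		/* assumed: explicitly reserved range */
	REGION_OS_SKETCH,		/* assumed: memory left to the OS */
};

/* True if the region MAY end up reserved (whole or in part). */
static bool region_is_reservable_sketch(enum region_type_sketch type)
{
	return type != REGION_OS_SKETCH;
}

/* True if the region WILL be reserved in its entirety. */
static bool region_is_reserved_sketch(enum region_type_sketch type)
{
	return type != REGION_OS_SKETCH && type != REGION_MEMORY_SKETCH;
}
```

With this split, asking whether the SLW image is reserved no longer gets a misleading "yes" from the region that merely covers all of a node's memory.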
Fixes: 58033e44
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|