Age | Commit message (Collapse) | Author | Files | Lines |
|
This makes it easier for future changes to ensure that the watchdog
stops ticking and doesn't requeue itself for execution in the
background. This way it is safe for resets to be performed after the
ticks are assumed to be stopped and it won't start the timer again.
Signed-off-by: William A. Kennington III <wak@google.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
The op-build linux kernel has been configured to support the ipmi
watchdog. This driver will always handle the watchdog by either leaving
it enabled if configured, or by disabling it during module load if no
configuration is provided. This increases the coverage of the watchdog
during the boot process. The watchdog should no longer be disabled at
any point during skiboot execution.
Signed-off-by: William A. Kennington III <wak@google.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
There is no clarification for why this change was needed, but presumably
this is due to a buggy BMC implementation where the Watchdog Set command
was processed concurrently or after the initial Watchdog Reset. This
inversion would cause the watchdog to stop since the DONT_STOP bit was
not set. Since we are now using the DONT_STOP bit during initialization,
the watchdog should not be stopped even if an inversion occurs.
Signed-off-by: William A. Kennington III <wak@google.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
The IPMI standard supports setting a DONT_STOP bit during an Watchdog
Set operation. Most of the time we don't want to stop the Watchdog when
updating the settings so we should be using this bit. This patch makes
it possible for callers of set_wdt to prevent the watchdog from being
stopped. This only changes the behavior of the watchdog during the
initial settings update when initializing skiboot. The watchdog is no
longer disabled and then immediately re-enabled.
Signed-off-by: William A. Kennington III <wak@google.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
The IPMI specification denotes that action 0x1 is Host Reset and 0x3 is
Host Power Cycle. Use the correct name for Reset in our watchdog code.
Signed-off-by: William A. Kennington III <wak@google.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
The PHB callback 'get_link_state' is always reporting the link width,
irrespective of the link status and even when the link is down. It is
causing too much work (and failures) when the PHB is probed during pci
init.
The fix is to look at the link status first and report the link as
down when appropriate.
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Now that links may train in parallel, traces shown during training can
be all mixed up. So add a prefix to all the traces to clearly identify
the chip and link the trace refers to:
OCAPI[<chip id>:<link id>]: this is a very useful message
The lower-level hardware procedures (npu2-hw-procedures.c) also print
traces which would need work. But that code is being reworked to be
better integrated with opencapi and nvidia, so leave it alone for now.
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Reorder our link training steps so that they are executed on
fundamental reset instead of during the initial setup. Skiboot always
call a fundamental reset on all the PHBs during pci init.
It is done through a state machine, similarly to what is done for
'real' PHBs.
This is the first step for a longer term goal to be able to trigger an
adapter reset from linux. We'll need the reset callbacks of the PHB to
be defined. We have to handle the various delays differently, since a
linux thread shouldn't stay stuck waiting in opal for too long.
No functional changes.
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Rework a bit the code to reset the opencapi adapter:
- make clearer which i2c pin is resetting which device
- break the reset operation in smaller chunks. This is really to
prepare for a future patch.
No functional changes.
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Presence detection is not part of the opencapi specification. So each
platform may choose to implement it the way it wants.
All current platforms implement it through an i2c device where we can
query a pin to know if a device is connected or not. ZZ and Zaius have
a similar design and even use the same i2c information and pin
numbers.
However, presence detection on older ZZ planar (older than v4) doesn't
work, so we don't activate it for now, until our lab systems are
upgraded and it's better tested.
Presence detection on witherspoon is still being worked on. It's
shaping up to be quite different, so we may have to revisit the topic
in a later patch.
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
stop1_lite has been removed since it adds no additional benefit
over stop0_lite. stop2_lite has been removed since currently it adds
minimal benefit over stop2. However, the benefit is eclipsed by the time
required to ungate the clocks
Moreover, Lite states don't give up the SMT resources, can potentially
have a performance impact on sibling threads.
Since current OSs (Linux) aren't smart enough to make good decisions
with these stop states, we're (temporarly) removing them from what
we expose to the OS, the idea being to bring them back in a new
DT representation so that only an OS that knows what to do will
do things with them.
Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[stewart: add to explanation]
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Caught by scan-build, also constant-ify the input
parameter.
Signed-off-by: Balbir singh <bsingharora@gmail.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Currently we only call set_opal_console() to establish the backend
used by the OPAL console API if we find at least one FSP serial
port in HDAT.
On systems where there is none (IPMI only), we fail to set it,
causing the console code to try to use the dummy console causing
an assertion failure during boot due to clashing on the device-tree
node names.
So always set it if an FSP is present
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
This is required by the architecture and the implementations, I've
observed failures to wake up on big cores without this.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Force reset was added as an attempt to work around some issues with TPM
devices locking up their I2C bus. In that particular case the problem
was that the device would hold the SCL line down permanently due to a
device firmware bug. The force reset doesn't actually do anything to
alleviate the situation here, it just happens to reset the internal
master state enough to make the I2C driver appear to work until
something tries to access the bus again.
On P9 systems with secure boot enabled there is the added problem
of the "diagostic mode" not being supported on I2C masters A,B,C and
D. Diagnostic mode allows the SCL and SDA lines to be driven directly
by software. Without this force reset is impossible to implement.
This patch removes the force reset functionality entirely since:
a) it doesn't do what it's supposed to, and
b) it's butt ugly code
Additionally, turn p8_i2c_reset_engine() into p8_i2c_reset_port().
There's no need to reset every port on a master in response to an
error that occurred on a specific port.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Add support for setting a default timeout for the I2C port to the
device-tree. This is consumed by skiboot.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
The transport control register needs to be loaded in two steps: Once
the register values have been set, we have to write bit 63 to a '1',
which loads the register values into the ci store buffer logic.
Bit 63 always reads back as a zero but to load the ci store buffer
values in capp the transition of 0 to 1 of bit 63 must be seen.
A new comment is added in the code to avoid confusion and to precise
the feature of this register.
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Without the WOF registers it's hard to figure out what went wrong first,
so print those when we print the FIRs when a fence is detected.
Suggested-by: Mike Perez <perezma@us.ibm.com>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Stability improvements in microcode for stop4/stop5 are
available in upstream hcode images. Stop4 and stop5 can
be safely enabled by default.
Use ~0xE0000000 to cut all but stop0,1,2 in case there
are any issues with stop4/5.
example:
nvram -p ibm,skiboot --update-config opal-stop-state-disable-mask=0x1FFFFFFF
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
BMC Get device ID command gives BMC firmware version details. Lets add this
to device tree. User space tools will use this information to display BMC
version details.
Stewart,
I have added bmc information under /ibm,firmware-version node as its firmware
version. But may be we should add new node (/bmc/firmware). So that we can
keep BMC related information separately. Let me know your thoughts on this.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
The minor version increments of the pstate table are backward
compatible. The minor version is changed when the pstate table
remains same and the existing reserved bytes are used for pointing
new data. So use only major version number while parsing the pstate
table. This will allow old skiboot to parse the pstate table and
handle minor version updates.
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
hw/fsp/fsp.c:1011:17: warning: passing an object that undergoes default argument promotion to
'va_start' has undefined behavior [-Wvarargs]
va_start(list, add_words);
^
hw/fsp/fsp.c:1007:59: note: parameter of type 'u8' (aka 'unsigned char') is declared here
void fsp_fillmsg(struct fsp_msg *msg, u32 cmd_sub_mod, u8 add_words, ...)
^
[CC] platforms/ibm-fsp/apollo-pci.o
hw/fsp/fsp.c:1026:17: warning: passing an object that undergoes default argument promotion to
'va_start' has undefined behavior [-Wvarargs]
va_start(list, add_words);
^
hw/fsp/fsp.c:1016:47: note: parameter of type 'u8' (aka 'unsigned char') is declared here
struct fsp_msg *fsp_mkmsg(u32 cmd_sub_mod, u8 add_words, ...)
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
These make clang angry:
hw/imc.c:690:29: warning: equality comparison with extraneous parentheses [-Wparentheses-equality]
if ((wakeup_engine_state == WAKEUP_ENGINE_PRESENT)) {
~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
hw/imc.c:690:29: note: remove extraneous parentheses around the comparison to silence this warning
if ((wakeup_engine_state == WAKEUP_ENGINE_PRESENT)) {
~ ^ ~
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
After commit 35c66b8ce5a2 ("SLW: Move MAMBO simulator checks to
slw_init"), mambo boot no longer calls add_cpu_idle_state_properties()
and as such we never enable stop states.
After adding the call back, we get more testing coverage as well
as faster mambo SMT boots.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
CFG Write Request Timeout was incorrectly set to informational and not
fatal for both non-CAPI and CAPI, so set it to fatal. This was a
mistake in the specification. Correcting this fixes a niche bug in
escalation (which is necessary on pre-DD2.2) that can cause a checkstop
due to a NCU timeout.
In addition, set the values in the timeout control registers to match.
This fixes an extremely rare and unreproducible bug, though the current
timings don't make sense since they're higher than the NCU timeout (16)
which will checkstop the machine anyway.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> # CAPI
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
SBE on P9 provides one shot programmable timer facility. We can use this
to implement OPAL timers and hence limit the reliance on the Linux
heartbeat (similar to HW timer facility provided by SLW on P8).
Design:
- We will continue to run Linux heartbeat.
- Each chip has SBE. This patch always schedules timer on SBE on master chip.
- Start timer option starts new timer or modifies an active timer for the
specified timeout.
- SBE expects timeout value in microseconds. We track timeout value in TB.
Hence we convert tb to microseconds before sending request to SBE.
- We are requesting ack from SBE for timer message. It gaurantees that
SBE has scheduled timer.
- Disabling SBE timer
We expect SBE to send timer expiry interrupt whenever timer expires. We
wait for 10 more ms before disabling timer.
In future we can consider below alternative approaches:
- Presently SBE timer disable is permanent (until we reboot system).
SBE sends "I'm back" interrupt after reset. We can consider restarting
timer after SBE reset.
- Reset SBE and start timer again.
- Each chip has SBE. On multi chip system we can try to schedule timer
on different chip.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Lets move P8 timer support code from slw.c to sbe-p8.c (as suggested
by BenH). There is a difference between timer support in P8 and P9.
Hence I think it makes sense to name it as sbe-p8.c.
Note that this is pure code movement and renaming functions/variables.
No functionality changes.
Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
SBE (Self Boot Engine) on P9 has two different jobs:
- Boot the chip up to the point the core is functional
- Provide various services like timer, scom, stash MPIPL, etc., at runtime
OPAL can communicate to SBE via a set of data and control registers provided
by the PSU block in P9 chip.
- Four 8 byte registers for Host to send command packets to SBE
- Four 8 byte registers for SBE to send response packets to Host
- Two doorbell registers (1 on each side) to alert either party
when data is placed in above mentioned data register
Protocol constraints:
Only one command is accepted in the command buffer until the response for the
command is enqueued in the response buffer by SBE.
Usage:
We will use SBE for various purposes like timer, MPIPL, etc.
This patch implements the SBE MBOX spec for OPAL to communicate with
SBE.
Design consideration:
- Each chip has SBE. We need to track SBE messages per chip. Hence added
per chip sbe structure and list of messages to that chip
- SBE accepts only one command at a time. Hence serialized MBOX commands.
- OPAL gets interrupted once SBE sets doorbell register
- OPAL has to clear doorbell register after reading response
- Every command class has timeout option. Timed out messages are discarded
- SBE MBOX commands can be classified into four types :
- Those that must be sent to the master only (ex: sending MDST/MDDT info)
- Those that must be sent to slaves only (ex: continue MPIPL)
- Those that must be sent to all chips (ex: close insecure window)
- Those that can be sent to any chip (ex: timer)
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Cc: Russell Currey <ruscur@russell.cc>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Found with sparse and some added lock annotations.
CC: stable # 5.10+
Fixes: de82c2e0e
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
For some reason skiboot populates nodes in /cpus/ for the cores on
chips that are deconfigured. As a result Linux includes the threads
of those cores in it's set of possible CPUs in the system and attempts
to set the SPR values that should be used when waking a thread from
a deep sleep state.
However, in the case where we have deconfigured chip we don't create
a xscom node for that chip and as a result we don't have a proc_chip
structure for that chip either. In turn, this results in an assertion
failure when calling opal_slw_set_reg() since it expects the chip
structure to exist. Fix this up and print an error instead.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
The ibm,slot-label property is to name the slot that appears under a
PCIe bridge. In the past we (ab)used the slot tables to attach names
to GPU devices and their corresponding NVLinks which resulted in npu2.c
using slot-label as a location code rather than as a way to name slots.
Fix this up since it's confusing.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
The NPU workbook defines a way of fencing a brick and
getting the brick out of fence state. We do have an implementation
of bringing the brick out of fenced/quiesced state. We do
the latter in our procedures, but to support run time reset
we need to do the former.
The fencing ensures that access to memory behind the links
will not lead to HMI's, but instead SUE's will be populated
in cache (in the case of speculation). The expectation is then
that prior to and after reset, the operating system components
will flush the cache for the region of memory behind the GPU.
This patch does the following:
1. Implements a npu2_dev_fence_brick() function to set/clear
fence state
2. Clear FIR bits prior to clearing the fence status
3. Clear's the fence status
4. We take the powerbus out of CQ fence much later now,
in credits_check() which is the last hardware procedure
called after link training.
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-By: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
The NPU_SM_CONFIG0 register currently needs to be configured in Skiboot to
select NVLink mode, however Hostboot should configure other bits in this
register.
For some reason Skiboot was explicitly clearing bit-6
(CONFIG_DISABLE_VG_NOT_SYS). It is unclear why this bit was getting cleared
as recent Hostboot versions explicitly set it to the correct value based on
the specific system configuration. Therefore Skiboot should not alter it.
Bit-58 (CONFIG_NVLINK_MODE) selects if NVLink mode should be enabled or
not. Hostboot does not configure this bit so Skiboot should continue to
configure it.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Debugging issues related to unconnected NVLinks can be a little less
irritating if we use the NPU2DEV{DBG,INF}() macros instead of prlog().
In short, change this:
NPU2: comparing GPU 'GPU2' and NPU2 'GPU1'
NPU2: comparing GPU 'GPU3' and NPU2 'GPU1'
NPU2: comparing GPU 'GPU4' and NPU2 'GPU1'
NPU2: comparing GPU 'GPU5' and NPU2 'GPU1'
:
npu2_dev_bind_pci_dev: No PCI device for NPU2 device 0006:00:01.0 to bind to. If you expect a GPU to be there, this is a problem.
to this:
NPU6:0:1.0 Comparing GPU 'GPU2' and NPU2 'GPU1'
NPU6:0:1.0 Comparing GPU 'GPU3' and NPU2 'GPU1'
NPU6:0:1.0 Comparing GPU 'GPU4' and NPU2 'GPU1'
NPU6:0:1.0 Comparing GPU 'GPU5' and NPU2 'GPU1'
:
NPU6:0:1.0 No PCI device found for slot 'GPU1'
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
There are two sets of core temperature sensors today. One is DTS scom
based core temperature sensors and the second group is the sensors
provided by OCC. DTS is the highest temperature among the different
temperature zones in the core while OCC core temperature sensors are
the average temperature of the core. DTS sensors are read directly by
the host by SCOMing the DTS sensors while OCC sensors are read and
updated by OCC to main memory.
Reading DTS sensors by SCOMing is a heavy and slower operation as
compared to reading OCC sensors which is as good as reading memory.
So dont add DTS sensors when OCC sensors are available.
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Acked-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Fix the issue where every thread on the chip sends HMI event to host for
TOD errors. TOD errors are reported to all the core/threads on the chip.
Any one thread can fix the error and send event. Rest of the threads don't
need to send HMI event unnecessarily.
This patch fixes this by modifying __chiptod_recover_tod_errors() function
to return -1 if no errors found. Without this change every thread that
see TFMR[51]=1 sends HMI event to the host kernel.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
There are some TOD errors which do not affect working of TOD and TB. They
stay in valid state. Hence we don't need rendez vous for TOD errors that
does not affect TB working.
TOD errors that affects TOD/TB will report a global error on TFMR[44]
alongwith bit 51, and they will go in rendez vous path as expected.
But the TOD errors that does not affect TB register sets only TFMR bit 51.
The TFMR bit 51 is cleared when any single thread clears the TOD error.
Once cleared, the bit 51 is reflected to all the cores on that chip. Any
thread that reads the TFMR register after the error is cleared will see
TFMR bit 51 reset. Hence the threads that see TFMR[51]=1, falls through
rendez-vous path and threads that see TFMR[51]=0, returns doing
nothing. This ends up in a soft lockups in host kernel.
This patch fixes this issue by not considering TOD interrupt (TFMR[51])
as a core-global error and hence avoiding rendez-vous path completely.
Instead threads that see TFMR[51]=1 will now take different path that
just do the TOD error recovery.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
For TOD errors, all the cores in the chip get HMIs. Any one thread from any
core can fix the issue and TFMR will have error conditions cleared. Rest of
the threads need take any action if TOD errors are already cleared. Hence
thread 0 of every core should get a fresh copy of TFMR before going ahead
recovery path. Initialize recover = -1, so that if no errors found that
thread need not send a HMI event to linux. This helps in stop flooding host
with hmi event by every thread even there are no errors found.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
This patch reworks the HMI handling for TFAC errors by introducing
4 rendez-vous points improve the thread synchronization while handling
timebase errors that requires all thread to clear dirty data from TB/HDEC
register before clearing the errors.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Currently we restore PCIe bus numbers right after the link is
up. Unfortunately as this point we haven't done CRS so config space
may not be accessible.
This moves the bus number restore till after CRS has happened.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Enables reporting of slot status information, etc in the config space of
the root complex. Currently this is only used to set the slot power
limit in our generic PCI code, but we might use it for other things
later on.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Deprecate the old "opal-interrupts", it's still there, but the new
property follows the standard and allow us to specify whether an
interrupt is level or edge sensitive.
Similarly create "interrupt-names" whose content is identical to
"opal-interrupts-names".
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
When the phb is used as a CAPI interface, the current mmio windows list
is cleaned before adding the capi and the prefetchable memory (M64)
windows, which implies that the non-prefetchable BAR is no more
configured.
This patch allows to set only the mbt bar to pass capi mmio window and
to keep, as defined, the other mmio values (M32 and M64).
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
When setting up an opencapi link, we set the transport muxes first,
then set the PHY training config register, which includes disabling
nvlink mode for the bricks. That's the order of the init sequence, as
found in the NPU workbook.
In reality, doing so works, but it raises 2 FIR bits in the PowerBus
OLL FIR Register for the 2 links when we configure the transport
muxes. Presumably because nvlink is not disabled yet and we are
configuring the transport muxes for opencapi.
bit 60: link0 internal error
bit 61: link1 internal error
Overall the current setup ends up being correct and everything works,
but we raise 2 FIR bits.
So tweak the order of operations to disable nvlink before configuring
the transport muxes. Incidentally, this is what the scripts from the
opencapi enablement team were doing all along.
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
When we setup a link, we always enable ODL0 and ODL1 at the same time
in the PHY training config register, even though we are setting up
only one OTL/ODL, so it raises a "link internal error" FIR bit in the
PowerBus OLL FIR Register for the second link. The error is harmless,
as we'll eventually setup the second link, but there's no reason to
raise that FIR bit.
The fix is simply to only enable the ODL we are using for the link.
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Fix the sensor type to match HWMON sensor types. Add compatible flag
to indicate the environmental sensor groups so that operations on
these groups can be handled by HWMON linux interface.
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
On ZZ, stop4,5,11 are enabled for PHYP, even though doing
so may cause problems with OPAL due to bugs in hcode.
For other platforms, this isn't so much of an issue as
we can just control stop states by the MRW. However the
rebuild-the-world approach to changing values there is a bit
annoying if you just want to rule out a specific stop state
from being problematic.
Provide an nvram option to override what's disabled in OPAL.
The OPAL mask is currently ~0xE0000000 (i.e. all but stop 0,1,2)
You can set an NVRAM override with:
nvram -p ibm,skiboot --update-config opal-stop-state-disable-mask=0xFFFFFFF
This nvram override will disable *all* stop states.
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|