|
The SBE on P9 provides a one-shot programmable timer facility. We can use this
to implement OPAL timers and hence limit the reliance on the Linux
heartbeat (similar to the HW timer facility provided by SLW on P8).
Design:
- We will continue to run the Linux heartbeat.
- Each chip has an SBE. This patch always schedules the timer on the SBE of the master chip.
- The start timer option starts a new timer or modifies an active timer for the
specified timeout.
- The SBE expects the timeout value in microseconds. We track the timeout value in TB.
Hence we convert TB to microseconds before sending the request to the SBE.
- We request an ack from the SBE for the timer message. It guarantees that
the SBE has scheduled the timer.
- Disabling the SBE timer
We expect the SBE to send a timer expiry interrupt whenever the timer expires. If
the interrupt is late, we wait for 10 more ms before disabling the timer.
In future we can consider the alternative approaches below:
- Presently SBE timer disable is permanent (until we reboot the system).
The SBE sends an "I'm back" interrupt after reset. We can consider restarting
the timer after an SBE reset.
- Reset the SBE and start the timer again.
- Each chip has an SBE. On a multi-chip system we can try to schedule the timer
on a different chip.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
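The timebase-to-microseconds conversion mentioned in the design notes above can be sketched roughly as follows. This is an illustration only, not the code from the patch: it assumes a 512 MHz timebase, and the names `TB_HZ_SKETCH`, `tb_to_usecs_sketch()` and `timeout_usecs_sketch()` are hypothetical. Rounding up is a design choice of this sketch, so that a timer is never programmed to fire before the requested deadline.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the timebase ticks at 512 MHz, i.e. 512 ticks per microsecond. */
#define TB_HZ_SKETCH        512000000ULL
#define TB_PER_USEC_SKETCH  (TB_HZ_SKETCH / 1000000ULL)

/* Convert a timebase delta to microseconds, rounding up. */
static uint64_t tb_to_usecs_sketch(uint64_t tb_delta)
{
	/* Guard against overflow in the round-up addition. */
	assert(tb_delta <= UINT64_MAX - (TB_PER_USEC_SKETCH - 1));
	return (tb_delta + TB_PER_USEC_SKETCH - 1) / TB_PER_USEC_SKETCH;
}

/* Compute the timeout to request, given the current and target timebase. */
static uint64_t timeout_usecs_sketch(uint64_t now_tb, uint64_t target_tb)
{
	/* A target in the past means "expire as soon as possible". */
	if (target_tb <= now_tb)
		return 0;
	return tb_to_usecs_sketch(target_tb - now_tb);
}
```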
|
|
Let's move P8 timer support code from slw.c to sbe-p8.c (as suggested
by BenH). There is a difference between timer support in P8 and P9.
Hence I think it makes sense to name it as sbe-p8.c.
Note that this is pure code movement and renaming functions/variables.
No functionality changes.
Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
SBE (Self Boot Engine) on P9 has two different jobs:
- Boot the chip up to the point the core is functional
- Provide various services like timer, scom, stash MPIPL, etc., at runtime
OPAL can communicate to SBE via a set of data and control registers provided
by the PSU block in P9 chip.
- Four 8 byte registers for Host to send command packets to SBE
- Four 8 byte registers for SBE to send response packets to Host
- Two doorbell registers (1 on each side) to alert either party
when data is placed in above mentioned data register
Protocol constraints:
Only one command is accepted in the command buffer until the response for the
command is enqueued in the response buffer by SBE.
Usage:
We will use SBE for various purposes like timer, MPIPL, etc.
This patch implements the SBE MBOX spec for OPAL to communicate with
SBE.
Design consideration:
- Each chip has an SBE. We need to track SBE messages per chip. Hence added a
per-chip sbe structure and a list of messages to that chip
- The SBE accepts only one command at a time. Hence MBOX commands are serialized.
- OPAL gets interrupted once the SBE sets the doorbell register
- OPAL has to clear the doorbell register after reading the response
- Every command class has a timeout option. Timed out messages are discarded
- SBE MBOX commands can be classified into four types:
- Those that must be sent to the master only (ex: sending MDST/MDDT info)
- Those that must be sent to slaves only (ex: continue MPIPL)
- Those that must be sent to all chips (ex: close insecure window)
- Those that can be sent to any chip (ex: timer)
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
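The "one command at a time" constraint and the per-chip message list described above could be modelled along these lines. This is a minimal sketch under stated assumptions, not the structures or functions from the patch: `struct mbox_msg`, `struct chip_mbox`, `hw_send()`, `mbox_kick()`, `mbox_queue()` and `mbox_response()` are all hypothetical names, and the hardware write is replaced by a counter. Locking and timeouts are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message: four 8-byte command registers plus a list link. */
struct mbox_msg {
	uint64_t reg[4];
	struct mbox_msg *next;
};

/* Hypothetical per-chip state: a FIFO of messages and an in-flight flag. */
struct chip_mbox {
	struct mbox_msg *head;   /* oldest queued message */
	struct mbox_msg *tail;   /* newest queued message */
	bool in_flight;          /* true while the head awaits a response */
	unsigned int sent;       /* counts simulated hardware writes */
};

/* Stand-in for writing the command registers and ringing the doorbell. */
static void hw_send(struct chip_mbox *m, const struct mbox_msg *msg)
{
	(void)msg;
	m->sent++;
}

/* Send the head message, but only if no command is outstanding. */
static void mbox_kick(struct chip_mbox *m)
{
	if (m->in_flight || !m->head)
		return;
	m->in_flight = true;
	hw_send(m, m->head);
}

/* Queue a message; it is sent immediately only if nothing is pending. */
static void mbox_queue(struct chip_mbox *m, struct mbox_msg *msg)
{
	msg->next = NULL;
	if (m->tail)
		m->tail->next = msg;
	else
		m->head = msg;
	m->tail = msg;
	mbox_kick(m);
}

/* On the response doorbell: retire the head message and send the next. */
static struct mbox_msg *mbox_response(struct chip_mbox *m)
{
	struct mbox_msg *done = m->head;

	assert(m->in_flight && done);
	m->head = done->next;
	if (!m->head)
		m->tail = NULL;
	m->in_flight = false;
	mbox_kick(m);
	return done;
}
```

Queuing two messages back to back results in a single hardware write; the second message goes out only when `mbox_response()` retires the first.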
|
|
Cc: Russell Currey <ruscur@russell.cc>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Found with sparse and some added lock annotations.
CC: stable # 5.10+
Fixes: de82c2e0e
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
For some reason skiboot populates nodes in /cpus/ for the cores on
chips that are deconfigured. As a result Linux includes the threads
of those cores in its set of possible CPUs in the system and attempts
to set the SPR values that should be used when waking a thread from
a deep sleep state.
However, in the case where we have a deconfigured chip we don't create
a xscom node for that chip and as a result we don't have a proc_chip
structure for that chip either. In turn, this results in an assertion
failure when calling opal_slw_set_reg() since it expects the chip
structure to exist. Fix this up and print an error instead.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
The ibm,slot-label property is to name the slot that appears under a
PCIe bridge. In the past we (ab)used the slot tables to attach names
to GPU devices and their corresponding NVLinks which resulted in npu2.c
using slot-label as a location code rather than as a way to name slots.
Fix this up since it's confusing.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
The NPU workbook defines a way of fencing a brick and
getting the brick out of fence state. We do have an implementation
of bringing the brick out of fenced/quiesced state. We do
the latter in our procedures, but to support run time reset
we need to do the former.
The fencing ensures that access to memory behind the links
will not lead to HMIs, but instead SUEs will be populated
in cache (in the case of speculation). The expectation is then
that prior to and after reset, the operating system components
will flush the cache for the region of memory behind the GPU.
This patch does the following:
1. Implements an npu2_dev_fence_brick() function to set/clear
fence state
2. Clears FIR bits prior to clearing the fence status
3. Clears the fence status
4. We take the powerbus out of CQ fence much later now,
in credits_check() which is the last hardware procedure
called after link training.
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-By: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
The NPU_SM_CONFIG0 register currently needs to be configured in Skiboot to
select NVLink mode, however Hostboot should configure other bits in this
register.
For some reason Skiboot was explicitly clearing bit-6
(CONFIG_DISABLE_VG_NOT_SYS). It is unclear why this bit was getting cleared
as recent Hostboot versions explicitly set it to the correct value based on
the specific system configuration. Therefore Skiboot should not alter it.
Bit-58 (CONFIG_NVLINK_MODE) selects if NVLink mode should be enabled or
not. Hostboot does not configure this bit so Skiboot should continue to
configure it.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
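The intended register update described above (change only the NVLink mode bit, leave everything else as Hostboot set it) amounts to a read-modify-write on a single bit. A rough sketch, assuming IBM bit numbering where bit 0 is the most significant bit of the 64-bit register; the `PPC_BIT()` helper and `sm_config0_set_nvlink()` are written here for illustration, and the actual register access is omitted:

```c
#include <stdbool.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of a 64-bit value. */
#define PPC_BIT(n)                 (1ULL << (63 - (n)))

/* Bit positions as given in the commit message. */
#define CONFIG_DISABLE_VG_NOT_SYS  PPC_BIT(6)
#define CONFIG_NVLINK_MODE         PPC_BIT(58)

/*
 * Return the new register value. Only the NVLink mode bit changes;
 * CONFIG_DISABLE_VG_NOT_SYS and all other bits are passed through.
 */
static uint64_t sm_config0_set_nvlink(uint64_t val, bool enable)
{
	if (enable)
		val |= CONFIG_NVLINK_MODE;
	else
		val &= ~CONFIG_NVLINK_MODE;
	return val;
}
```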
|
|
Debugging issues related to unconnected NVLinks can be a little less
irritating if we use the NPU2DEV{DBG,INF}() macros instead of prlog().
In short, change this:
NPU2: comparing GPU 'GPU2' and NPU2 'GPU1'
NPU2: comparing GPU 'GPU3' and NPU2 'GPU1'
NPU2: comparing GPU 'GPU4' and NPU2 'GPU1'
NPU2: comparing GPU 'GPU5' and NPU2 'GPU1'
:
npu2_dev_bind_pci_dev: No PCI device for NPU2 device 0006:00:01.0 to bind to. If you expect a GPU to be there, this is a problem.
to this:
NPU6:0:1.0 Comparing GPU 'GPU2' and NPU2 'GPU1'
NPU6:0:1.0 Comparing GPU 'GPU3' and NPU2 'GPU1'
NPU6:0:1.0 Comparing GPU 'GPU4' and NPU2 'GPU1'
NPU6:0:1.0 Comparing GPU 'GPU5' and NPU2 'GPU1'
:
NPU6:0:1.0 No PCI device found for slot 'GPU1'
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
There are two sets of core temperature sensors today. One is DTS scom
based core temperature sensors and the second group is the sensors
provided by OCC. DTS is the highest temperature among the different
temperature zones in the core while OCC core temperature sensors are
the average temperature of the core. DTS sensors are read directly by
the host by SCOMing the DTS sensors while OCC sensors are read and
updated by OCC to main memory.
Reading DTS sensors by SCOMing is a heavy and slower operation as
compared to reading OCC sensors which is as good as reading memory.
So don't add DTS sensors when OCC sensors are available.
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Acked-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Fix the issue where every thread on the chip sends an HMI event to the host for
TOD errors. TOD errors are reported to all the cores/threads on the chip.
Any one thread can fix the error and send the event. The rest of the threads don't
need to send an HMI event.
This patch fixes this by modifying the __chiptod_recover_tod_errors() function
to return -1 if no errors are found. Without this change every thread that
sees TFMR[51]=1 sends an HMI event to the host kernel.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
There are some TOD errors which do not affect the working of TOD and TB. They
stay in a valid state. Hence we don't need a rendez-vous for TOD errors that
do not affect TB working.
TOD errors that affect TOD/TB will report a global error on TFMR[44]
along with bit 51, and they will go down the rendez-vous path as expected.
But TOD errors that do not affect the TB register set only TFMR bit 51.
TFMR bit 51 is cleared when any single thread clears the TOD error.
Once cleared, bit 51 is reflected to all the cores on that chip. Any
thread that reads the TFMR register after the error is cleared will see
TFMR bit 51 reset. Hence the threads that see TFMR[51]=1 fall through to the
rendez-vous path and the threads that see TFMR[51]=0 return doing
nothing. This ends up in soft lockups in the host kernel.
This patch fixes this issue by not considering the TOD interrupt (TFMR[51])
as a core-global error and hence avoiding the rendez-vous path completely.
Instead, threads that see TFMR[51]=1 will now take a different path that
just does the TOD error recovery.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
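The decision described above can be summarised as a small classification of the TFMR value. The sketch below only covers the two bits named in the commit message (44 and 51); the real handler looks at more conditions, and the enum and function names are hypothetical.

```c
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of a 64-bit value. */
#define PPC_BIT(n)                  (1ULL << (63 - (n)))

/* Hypothetical names for the two TFMR bits discussed in the text. */
#define TFMR_GLOBAL_ERROR_SKETCH    PPC_BIT(44)
#define TFMR_TOD_INTERRUPT_SKETCH   PPC_BIT(51)

enum tfmr_action_sketch {
	TFMR_NOTHING_TO_DO,      /* error already cleared by another thread */
	TFMR_TOD_RECOVERY_ONLY,  /* bit 51 alone: no rendez-vous needed */
	TFMR_RENDEZVOUS,         /* TB affected: all threads must synchronise */
};

static enum tfmr_action_sketch classify_tfmr_sketch(uint64_t tfmr)
{
	/* An error that affects TOD/TB sets bit 44 (along with bit 51). */
	if (tfmr & TFMR_GLOBAL_ERROR_SKETCH)
		return TFMR_RENDEZVOUS;
	/* Bit 51 on its own is not treated as a core-global error. */
	if (tfmr & TFMR_TOD_INTERRUPT_SKETCH)
		return TFMR_TOD_RECOVERY_ONLY;
	return TFMR_NOTHING_TO_DO;
}
```

Because a thread that reads TFMR late may see bit 51 already cleared, keeping bit 51 out of the rendez-vous decision means threads never wait on peers that will not arrive.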
|
|
For TOD errors, all the cores in the chip get HMIs. Any one thread from any
core can fix the issue and TFMR will have the error conditions cleared. The rest of
the threads need not take any action if the TOD errors are already cleared. Hence
thread 0 of every core should get a fresh copy of TFMR before going ahead with the
recovery path. Initialize recover = -1, so that if no errors are found that
thread need not send an HMI event to Linux. This helps stop flooding the host
with HMI events from every thread even when no errors are found.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
This patch reworks the HMI handling for TFAC errors by introducing
4 rendez-vous points to improve the thread synchronization while handling
timebase errors that require all threads to clear dirty data from the TB/HDEC
registers before clearing the errors.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Currently we restore PCIe bus numbers right after the link is
up. Unfortunately at this point we haven't done CRS so config space
may not be accessible.
This moves the bus number restore to after CRS has happened.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Enables reporting of slot status information, etc in the config space of
the root complex. Currently this is only used to set the slot power
limit in our generic PCI code, but we might use it for other things
later on.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Deprecate the old "opal-interrupts". It's still there, but the new
property follows the standard and allows us to specify whether an
interrupt is level or edge sensitive.
Similarly create "interrupt-names" whose content is identical to
"opal-interrupts-names".
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
When the PHB is used as a CAPI interface, the current MMIO windows list
is cleaned before adding the CAPI and the prefetchable memory (M64)
windows, which implies that the non-prefetchable BAR is no longer
configured.
This patch sets only the MBT BAR needed to pass the CAPI MMIO window and
keeps the other MMIO values (M32 and M64) as already defined.
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
When setting up an opencapi link, we set the transport muxes first,
then set the PHY training config register, which includes disabling
nvlink mode for the bricks. That's the order of the init sequence, as
found in the NPU workbook.
In reality, doing so works, but it raises 2 FIR bits in the PowerBus
OLL FIR Register for the 2 links when we configure the transport
muxes. Presumably because nvlink is not disabled yet and we are
configuring the transport muxes for opencapi.
bit 60: link0 internal error
bit 61: link1 internal error
Overall the current setup ends up being correct and everything works,
but we raise 2 FIR bits.
So tweak the order of operations to disable nvlink before configuring
the transport muxes. Incidentally, this is what the scripts from the
opencapi enablement team were doing all along.
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
When we set up a link, we always enable ODL0 and ODL1 at the same time
in the PHY training config register, even though we are setting up
only one OTL/ODL, so it raises a "link internal error" FIR bit in the
PowerBus OLL FIR Register for the second link. The error is harmless,
as we'll eventually set up the second link, but there's no reason to
raise that FIR bit.
The fix is simply to only enable the ODL we are using for the link.
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Fix the sensor type to match HWMON sensor types. Add compatible flag
to indicate the environmental sensor groups so that operations on
these groups can be handled by HWMON linux interface.
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
On ZZ, stop4,5,11 are enabled for PHYP, even though doing
so may cause problems with OPAL due to bugs in hcode.
For other platforms, this isn't so much of an issue as
we can just control stop states by the MRW. However the
rebuild-the-world approach to changing values there is a bit
annoying if you just want to rule out a specific stop state
from being problematic.
Provide an nvram option to override what's disabled in OPAL.
The OPAL mask is currently ~0xE0000000 (i.e. all but stop 0,1,2)
You can set an NVRAM override with:
nvram -p ibm,skiboot --update-config opal-stop-state-disable-mask=0xFFFFFFF
This nvram override will disable *all* stop states.
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
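Reading the default mask above (~0xE0000000 leaves only stop 0, 1 and 2 enabled) suggests that stop level n corresponds to bit (31 - n) of the mask, with a set bit meaning "disabled". The following sketch checks a level against a mask under that assumption; the mapping is inferred from the text rather than taken from the patch, and `stop_level_disabled_sketch()` is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: stop level n maps to bit (31 - n) of the 32-bit mask,
 * so stop0 is the most significant bit. A set bit disables the level.
 */
static bool stop_level_disabled_sketch(uint32_t disable_mask, unsigned int level)
{
	/* Levels that do not fit in the mask are treated as unavailable. */
	if (level > 31)
		return true;
	return (disable_mask & (UINT32_C(1) << (31 - level))) != 0;
}
```

Under this reading, the default mask leaves stop 0, 1 and 2 enabled and disables stop 4, 5 and 11.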
|
|
A bad GPU or other condition may leave us with a subset of links that
never get initialized. If an ATSD is sent to one of those bricks, it
will never complete, leaving us waiting forever for a response:
watchdog: BUG: soft lockup - CPU#23 stuck for 23s! [acos:2050]
...
Modules linked in: nvidia_uvm(O) nvidia(O)
CPU: 23 PID: 2050 Comm: acos Tainted: G W O 4.14.0 #2
task: c0000000285cfc00 task.stack: c000001fea860000
NIP: c0000000000abdf0 LR: c0000000000acc48 CTR: c0000000000ace60
REGS: c000001fea863550 TRAP: 0901 Tainted: G W O (4.14.0)
MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28004484 XER: 20040000
CFAR: c0000000000abdf4 SOFTE: 1
GPR00: c0000000000acc48 c000001fea8637d0 c0000000011f7c00 c000001fea863820
GPR04: 0000000002000000 0004100026000000 c0000000012778c8 c00000000127a560
GPR08: 0000000000000001 0000000000000080 c000201cc7cb7750 ffffffffffffffff
GPR12: 0000000000008000 c000000003167e80
NIP [c0000000000abdf0] mmio_invalidate_wait+0x90/0xc0
LR [c0000000000acc48] mmio_invalidate.isra.11+0x158/0x370
ATSDs are only sent to bricks which have a valid entry in the XTS_BDF
table. So to prevent the hang, don't set NPU2_XTS_BDF_MAP_VALID unless
we make it all the way to creating a context for the BDF.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
The cxl driver will set the capi value, like other drivers already do.
Signed-off-by: Philippe Bergheaud <felix@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Add support to load the imc catalog from a lid file packaged
as part of the system firmware. Lid number allocated
is 0x80f00103.lid.
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
This happens normally if a slot doesn't have a working HW presence
detect and relies instead on in-band presence detect.
The message we display is scary and not very useful unless you
are debugging, so quieten it and change it to something more
meaningful.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-By: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
pause_microcode_at_boot() loops through all the chips' ucode
control blocks and pauses the ucode if it is in the running state.
But it does not fail if any chip's ucode is not initialised.
Add code to return a failure if the ucode is not initialized on any
of the chips. Since pause_microcode_at_boot() is called just before
attaching the IMC device nodes in imc_init(), add code to check the
function's return value.
Fixes: 9750eee802f8d ('hw/imc: pause microcode at boot')
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
The ASN indication is used for tunneled operations (as_notify and
atomics). Tunneled operation messages can be sent in PCI mode as
well as CAPI mode.
The address field of as_notify messages is hijacked to encode the
LPID/PID/TID of the target thread, so those messages should not go
through address translation. Therefore bit 59 is part of the ASN
indication.
This patch sets TVT#1 in bypass mode when capi mode is enabled,
to prevent as_notify messages from being dropped.
Signed-off-by: Philippe Bergheaud <felix@linux.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Hardware has limitations which would require putting a sync after each
store EOI to make sure the MMIO operations that change the ESB state
are ordered. This is a killer for performance and the PHBs do not
support the sync. So remove the store EOI for the moment, until
hardware is improved.
Also, while we are changing the XIVE source flags, let's fix the
settings for the PHB4s, which should follow these rules:
- SHIFT_BUG for DD10
- STORE_EOI for DD20 and if enabled
- TRIGGER_PAGE for DDx0 and if not STORE_EOI
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The function phb4_probe_stack() resets "ETU Reset Register" to
unfreeze the PHB before it performs mmio access on the PHB. However in
case the FIR/NFIR registers are set while entering this function,
the reset of "ETU Reset Register" won't unfreeze the PHB and it will
remain fenced. This leads to failure during initial CRESET of the PHB
as mmio access is still not enabled and an error message of the form
below is logged:
PHB#0000[0:0]: Initializing PHB4...
PHB#0000[0:0]: Default system config: 0xffffffffffffffff
PHB#0000[0:0]: New system config : 0xffffffffffffffff
PHB#0000[0:0]: Initial PHB CRESET is 0xffffffffffffffff
PHB#0000[0:0]: Waiting for DLP PG reset to complete...
<snip>
PHB#0000[0:0]: Timeout waiting for DLP PG reset !
PHB#0000[0:0]: Initialization failed
This is especially seen during the MPIPL flow, where the SBE
quiesces and fences the PHB so that it doesn't stomp on main
memory. However when skiboot enters phb4_probe_stack() after MPIPL,
the FIR/NFIR registers are set forcing PHB to re-enter fence after ETU
reset is done.
So to fix this issue the patch introduces new xscom writes to
phb4_probe_stack() to reset the FIR/NFIR registers before performing
ETU reset to enable mmio access to the PHB.
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Tested-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This patch updates do_capp_recovery_scoms() to poll the CAPP
Err/Status control register, check for CAPP-Recovery to complete/fail
based on indications of bits 1, 5 and 9, and then proceed with the
CAPP-Recovery scoms only if recovery completed successfully. This
prevents cases where we bring up the PCIe link while the recovery sequencer
on CAPP is still busy casting out cache lines.
In case CAPP-Recovery didn't complete successfully an error is returned
from do_capp_recovery_scoms() asking phb4_creset() to keep the phb4
fenced and mark it as broken.
The loop that implements polling of Err/Status register will also log
an error on the PHB when it continues for more than 168ms which is the
max time to failure for CAPP-Recovery.
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Reviewed-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This can happen under mambo, at least.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Peer-to-peer GPU bandwidth latency testing has produced some tunable
values that improve performance. Add them to our device initialization.
File these under things that need to be cleaned up with nice #defines
for the register names and bitfields when we get time.
A few of the settings are dependent on the system's particular NVLink
topology, so introduce a helper to determine how many links go to a
single GPU.
Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This gets used elsewhere to index items in the XTS tables.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
[arbab@linux.vnet.ibm.com: Added commit log]
Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This reverts commit fbdc91e693fc3103f7e2a65054ed32bfb26a2e17.
We don't need this as we need to do it a different way, with an explicit
set of registers as otherwise we trip other random FIR bits and everything
becomes even more terrible.
I suggest alcohol.
Cc: stable
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Cyril Bur <cyril.bur@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Fixes: CID 263056 and 263052
Signed-off-by: Cyril Bur <cyril.bur@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Fixes: ac4272bf ("fast-reboot: occ: Delete OCC child nodes in /ibm, opal/power-mgt")
Fixes: CID 263053
Signed-off-by: Cyril Bur <cyril.bur@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Fixes: CID 264267
Signed-off-by: Cyril Bur <cyril.bur@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The following pattern exists in several npu2 functions:
struct phb *phb = pci_get_phb(phb_id);
struct npu2 *p = phb_to_npu2_nvlink(phb);
The problem is that pci_get_phb() can return NULL and
phb_to_npu2_nvlink() dereferences its parameter. Coverity says that the
return value of pci_get_phb() is checked 43 out of 46 times which
suggests we should be more careful.
Furthermore, functions with the badly placed call to
phb_to_npu2_nvlink() do seem to check that the return value of
pci_get_phb() isn't NULL, but this check would be too little too late.
This patch just moves the call of phb_to_npu2_nvlink() to after the
NULL check for the return value of pci_get_phb().
Affected functions are:
opal_npu_map_lpar()
opal_npu_init_context()
opal_npu_destroy_context()
Fixes: CID 264274, 264273, 264272, 264271, 264266, 264265
Signed-off-by: Cyril Bur <cyril.bur@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
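The reordering described above can be illustrated with stand-in types. Everything in this sketch is hypothetical (`get_phb_sketch()`, `phb_to_npu2_sketch()`, `npu_op_sketch()`, the error constant and the struct layouts); it only shows the shape of the fix, where the conversion that dereferences the pointer is moved to after the NULL check.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real structures. */
struct npu2_sketch { int index; };
struct phb_sketch  { struct npu2_sketch *npu; };

#define ERR_PARAMETER_SKETCH  (-1)

/* Stand-in lookup: returns NULL for an unknown id, like pci_get_phb() can. */
static struct phb_sketch *get_phb_sketch(struct phb_sketch *table, size_t n,
					 uint64_t phb_id)
{
	return phb_id < n ? &table[phb_id] : NULL;
}

/* Stand-in conversion that dereferences its argument unconditionally. */
static struct npu2_sketch *phb_to_npu2_sketch(struct phb_sketch *phb)
{
	return phb->npu;
}

static int64_t npu_op_sketch(struct phb_sketch *table, size_t n,
			     uint64_t phb_id)
{
	struct phb_sketch *phb = get_phb_sketch(table, n, phb_id);
	struct npu2_sketch *p;

	if (!phb)
		return ERR_PARAMETER_SKETCH;

	/* Convert only once the pointer is known to be valid. */
	p = phb_to_npu2_sketch(phb);
	return p->index;
}
```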
|
|
Both scale_sensor() and scale_energy() take the value to scale as a
pointer. These functions do not NULL check the pointer before the first
time they dereference it, which is fine since passing NULL would be
completely pointless.
Both functions do perform a pointless NULL check later on. This
confuses coverity and really doesn't make much sense at all. Since
calling these functions with NULL as the sensor parameter makes no
sense, and currently there's a dereference before the check, just remove
the check.
Fixes: CID 264276 and 264275
Signed-off-by: Cyril Bur <cyril.bur@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This means that we no longer hit this bug if we fail to get valid pstates
from the OCC.
[console-pexpect]#echo 1 > //sys/firmware/opal/sensor_groups//occ-csm0/clear
echo 1 > //sys/firmware/opal/sensor_groups//occ-csm0/clear
[ 94.019971181,5] CPU ATTEMPT TO RE-ENTER FIRMWARE! PIR=083d cpu @0x33cf4000 -> pir=083d token=8
[ 94.020098392,5] CPU ATTEMPT TO RE-ENTER FIRMWARE! PIR=083d cpu @0x33cf4000 -> pir=083d token=8
[ 10.318805] Disabling lock debugging due to kernel taint
[ 10.318808] Severe Machine check interrupt [Not recovered]
[ 10.318812] NIP [000000003003e434]: 0x3003e434
[ 10.318813] Initiator: CPU
[ 10.318815] Error type: Real address [Load/Store (foreign)]
[ 10.318817] opal: Hardware platform error: Unrecoverable Machine Check exception
[ 10.318821] CPU: 117 PID: 2745 Comm: sh Tainted: G M 4.15.9-openpower1 #3
[ 10.318823] NIP: 000000003003e434 LR: 000000003003025c CTR: 0000000030030240
[ 10.318825] REGS: c00000003fa7bd80 TRAP: 0200 Tainted: G M (4.15.9-openpower1)
[ 10.318826] MSR: 9000000000201002 <SF,HV,ME,RI> CR: 48002888 XER: 20040000
[ 10.318831] CFAR: 0000000030030258 DAR: 394a00147d5a03a6 DSISR: 00000008 SOFTE: 1
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
[Additional changes from Shilpa]
Reviewed-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Tested-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
In case of error, opal_xive_set_vp_info() will return without
unlocking the xive object. This is most certainly a typo.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
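A common way to avoid this class of bug is to route every exit through a single unlock label. The sketch below is not the code from opal_xive_set_vp_info(); the lock, the structure, the flag mask and the error value are hypothetical stand-ins that only illustrate the pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical single-threaded stand-in for a lock. */
struct lock_sketch { bool held; };

static void sketch_lock(struct lock_sketch *l)
{
	assert(!l->held);
	l->held = true;
}

static void sketch_unlock(struct lock_sketch *l)
{
	assert(l->held);
	l->held = false;
}

struct xive_sketch {
	struct lock_sketch lock;
	uint64_t vp_flags;
};

#define VALID_FLAGS_SKETCH    UINT64_C(0x3)
#define ERR_PARAMETER_SKETCH  (-1)

static int64_t set_vp_info_sketch(struct xive_sketch *x, uint64_t flags)
{
	int64_t rc = 0;

	sketch_lock(&x->lock);
	if (flags & ~VALID_FLAGS_SKETCH) {
		rc = ERR_PARAMETER_SKETCH;
		goto out;	/* the error path still releases the lock */
	}
	x->vp_flags = flags;
out:
	sketch_unlock(&x->lock);
	return rc;
}
```

With a single exit label, a later change that adds another error check cannot easily forget the unlock.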
|
|
Major changes in the NPU between DD1 and DD2 necessitated a fair bit of
revision-specific code.
Now that all our lab machines are DD2, we no longer test anything on DD1
and it's time to get rid of it.
Remove DD1-specific code and abort probe if we're running on a DD1 machine.
Cc: Alistair Popple <alistair@popple.id.au>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-By: Alistair Popple <alistair@popple.id.au>
Acked-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
We've been carting around this field since the original p7ioc-phb code.
As far as I can tell we never actually use it for anything other than
checking if the PHB has been marked as broken or not. The _FENCED
state is set in a few places, but we never use it in favour of just
checking the MMIO register.
This patch just replaces it with a boolean that indicates if
the PHB has been marked as broken and removes the giant, mostly
wrong, comment explaining its usage that is copied and pasted
into each phb header file.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Using the DGEMM benchmark we observed a 5-9% drop in throughput with
and without stop4/5. In this benchmark the GPU waits on the CPU to wake up
and provide the subsequent data block to compute. The wakeup latency
accumulates over the run and shows up as a performance drop.
Linux enters stop4/5 more aggressively for its wakeup latency. Increasing
the residency from 1ms to 10ms makes the performance drop <1%.
Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Acked-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Tested-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
We coded a few workarounds in the special wakeup logic to handle
buggy firmware. Now that it is fixed, remove them as they break the
special wakeup protocol. As per the spec we should not de-assert
before the assert is complete. So follow this protocol.
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Tested-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Fast reboot does not yet work right with the NPU. It's been disabled on
NVLink and OpenCAPI machines. Do the same for NVLink2.
This amounts to a port of 3e4577939bbf ("npu: Fix broken fast reset")
from the npu code to npu2.
Cc: stable # 5.10.x
Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|