|
creset calls the hw procedure that resets the PHY. We only put the
PHYs into reset; we do not take them out of reset.
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Acked-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Use the generic dctl_{set/clear}_special_wakeup() in hostservices to
assert and de-assert core special wakeup for P8 and remove the
duplicated code.
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
In P9, we have to enable the "flush the instruction cache" bit along with
the "attn instruction support" bit to trigger attention.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
We re-enable channel tag streaming for PHB in CAPP mode as without it
PEC was waiting for cresp for each DMA write command before sending a
new DMA write command on the Powerbus. This resulted in much lower DMA
write performance than expected.
The patch updates enable_capi_mode() to remove the masking of the
channel_streaming_en bit in the PBCQ Hardware Configuration Register. It also
refactors the code that updates this register to use
xscom_write_mask instead of an xscom_read followed by an xscom_write.
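The difference between an open-coded read/modify/write and a masked write can be sketched as below. This is only an illustration: fake_reg, reg_read(), reg_write() and reg_write_mask() are hypothetical stand-ins operating on a plain variable, not the skiboot XSCOM accessors.

```c
#include <stdint.h>

/* Hypothetical stand-in for one hardware register. */
static uint64_t fake_reg;

static int reg_read(uint64_t *val)
{
	*val = fake_reg;
	return 0;
}

static int reg_write(uint64_t val)
{
	fake_reg = val;
	return 0;
}

/*
 * Update only the bits selected by mask and leave every other bit
 * untouched. Callers no longer have to open-code the read followed
 * by the write.
 */
static int reg_write_mask(uint64_t val, uint64_t mask)
{
	uint64_t old;
	int rc = reg_read(&old);

	if (rc)
		return rc;
	return reg_write((old & ~mask) | (val & mask));
}
```

With a helper of this shape, a caller that wants to leave one enable bit alone simply keeps that bit out of the mask.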
Cc: stable
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
dt_find_compatible_node() and dt_find_compatible_node_on_chip() are used to
find device nodes under a parent/root node with a given compatible
property.
dt_next(root, prev) is used to walk the child nodes of the given parent and
takes two arguments - root contains the parent node to walk whilst prev
contains the previous child to search from so that it can be used as an
iterator over all children nodes.
The first iteration of dt_find_compatible_node(root, prev) calls
dt_next(root, root) which is not a well defined operation as prev is
assumed to be a child of the root node. The result is that when a node
contains no children it will start returning the parent node's siblings
until it hits the top of the tree, at which point a NULL dereference is
attempted when looking for the root node's parent.
Dereferencing NULL can result in undesirable data exceptions during system
boot and untimely non-hilarious system crashes. dt_next() should not be
called with prev == root. Instead we add a check to dt_next() such that
passing prev = NULL will cause it to start iterating from the first child
node (if any).
Also add a unit test for this case to run-device.c.
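A minimal sketch of an iterator with the behaviour described above follows. It is not the skiboot implementation: struct node and next_node() are hypothetical, and the tree uses plain parent/first_child/next_sibling pointers rather than skiboot's list helpers.

```c
#include <stddef.h>

/* Hypothetical, simplified tree node. */
struct node {
	struct node *parent;
	struct node *first_child;
	struct node *next_sibling;
};

/*
 * Pre-order walk of the nodes below root. Passing prev == NULL starts
 * the walk at root's first child (if any). The walk never climbs above
 * root, so it cannot run off the top of the tree.
 */
static struct node *next_node(struct node *root, struct node *prev)
{
	if (!prev)
		return root->first_child;

	/* Descend first. */
	if (prev->first_child)
		return prev->first_child;

	/* Otherwise look for a sibling, climbing until we reach root. */
	while (prev && prev != root) {
		if (prev->next_sibling)
			return prev->next_sibling;
		prev = prev->parent;
	}
	return NULL;
}
```

A caller would loop with something like `for (n = next_node(root, NULL); n; n = next_node(root, n))`. On a node without children the first call returns NULL straight away.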
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Hostboot will expect the label field of the stb header to contain
"PAYLOAD" for skiboot or it will fail to load and run skiboot.
The failure looks something like this:
53.40896|ISTEP 20. 1 - host_load_payload
53.65840|secure|Secureboot Failure plid = 0x90000755, rc = 0x1E07
53.65881|System shutting down with error status 0x1E07
53.67547|================================================
53.67954|Error reported by secure (0x1E00) PLID 0x90000755
53.67560| Container's component ID does not match expected component ID
53.67561| ModuleId 0x09 SECUREBOOT::MOD_SECURE_VERIFY_COMPONENT
53.67845| ReasonCode 0x1e07 SECUREBOOT::RC_ROM_VERIFY
53.67998| UserData1 : 0x0000000000000000
53.67999| UserData2 : 0x0000000000000000
53.67999|------------------------------------------------
53.68000| Callout type : Procedure Callout
53.68000| Procedure : EPUB_PRC_HB_CODE
53.68001| Priority : SRCI_PRIORITY_HIGH
53.68001|------------------------------------------------
53.68002| Callout type : Procedure Callout
53.68003| Procedure : EPUB_PRC_FW_VERIFICATION_ERR
53.68003| Priority : SRCI_PRIORITY_HIGH
53.68004|------------------------------------------------
Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Tested-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Commit 85a1de35cbe4 ("fast-boot: occ: Re-parse the pstate table during fast-boot")
breaks fast-reboot on P8 platforms while re-initialising the OCC pstates. On P8
platforms OPAL adds two additional properties, #address-cells and #size-cells,
under the ibm,opal/power-mgt DT node. During fast-reboot, adding the same
properties to that node again results in duplicate properties, and fast-reboot
fails with the traces below.
[ 541.410373292,5] OCC: All Chip Rdy after 0 ms
[ 541.410488745,3] Duplicate property "#address-cells" in node /ibm,opal/power-mgt
[ 541.410694290,0] Aborting!
CPU 0058 Backtrace:
S: 0000000031d639d0 R: 000000003001367c .backtrace+0x48
S: 0000000031d63a60 R: 000000003001a03c ._abort+0x4c
S: 0000000031d63ae0 R: 00000000300267d8 .new_property+0xd8
S: 0000000031d63b70 R: 0000000030026a28 .__dt_add_property_cells+0x30
S: 0000000031d63c10 R: 000000003003ea3c .occ_pstates_init+0x984
S: 0000000031d63d90 R: 00000000300142d8 .load_and_boot_kernel+0x86c
S: 0000000031d63e70 R: 000000003002586c .fast_reboot_entry+0x358
S: 0000000031d63f00 R: 00000000300029f4 fast_reset_entry+0x2c
This patch fixes this issue by removing these two properties on P8 while doing
the OCC pstates re-init in the fast-reboot code path.
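The remove-then-add pattern can be sketched with a toy property store. Everything here is hypothetical (struct node, add_prop(), remove_prop(), reinit_cells(), and the cell values 2 and 1); the real device-tree code aborts on a duplicate, as the trace shows, whereas this sketch just returns false.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_PROPS 8

/* Hypothetical, fixed-size property store for one node. */
struct prop {
	const char *name;
	unsigned int val;
	bool used;
};

struct node {
	struct prop props[MAX_PROPS];
};

static struct prop *find_prop(struct node *n, const char *name)
{
	for (int i = 0; i < MAX_PROPS; i++)
		if (n->props[i].used && strcmp(n->props[i].name, name) == 0)
			return &n->props[i];
	return NULL;
}

/* Returns false on a duplicate name or when the store is full. */
static bool add_prop(struct node *n, const char *name, unsigned int val)
{
	if (find_prop(n, name))
		return false;
	for (int i = 0; i < MAX_PROPS; i++) {
		if (!n->props[i].used) {
			n->props[i].name = name;
			n->props[i].val = val;
			n->props[i].used = true;
			return true;
		}
	}
	return false;
}

static void remove_prop(struct node *n, const char *name)
{
	struct prop *p = find_prop(n, name);

	if (p)
		p->used = false;
}

/* Re-init path: drop any stale copies before adding the properties. */
static bool reinit_cells(struct node *n)
{
	remove_prop(n, "#address-cells");
	remove_prop(n, "#size-cells");
	/* Cell values are illustrative only. */
	return add_prop(n, "#address-cells", 2) &&
	       add_prop(n, "#size-cells", 1);
}
```

Calling reinit_cells() twice on the same node succeeds both times, which is the property a re-init path run on every fast-reboot needs.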
Fixes: 85a1de35cbe4
Signed-off-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Log HMI errors as step 1. OS will need to deduce
and interpret the HMI event.
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Acked-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
i2c.c fails to compile with gcc7 and -Werror=format-overflow used in
Debian Unstable and Ubuntu 18.04 :
i2c.c: In function ‘i2c_init’:
i2c.c:211:15: error: ‘%s’ directive writing up to 255 bytes into a
region of size 236 [-Werror=format-overflow=]
dpath is supposed to store an entire path.
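As a general illustration of the class of warning above (not the change made to i2c.c), a path can be built with snprintf and the result checked for truncation. build_path() is a hypothetical helper.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Join dir and name into out. Returns 0 on success and -1 if the
 * result did not fit into out_len bytes (including the terminator).
 */
static int build_path(char *out, size_t out_len,
		      const char *dir, const char *name)
{
	int n = snprintf(out, out_len, "%s/%s", dir, name);

	if (n < 0 || (size_t)n >= out_len)
		return -1;
	return 0;
}
```

Sizing the destination for a whole path and checking the return value lets the caller handle an over-long name explicitly instead of silently truncating it.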
Reported-by: Michel Normand <michel.mno@free.fr>
Signed-off-by: Frédéric Bonnard <frediz@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Add *.pyc (to catch doc/DtsLexer.pyc) and *.patch (to catch patch files we
leave lying around).
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Print DSISR and DAR, to help with deciphering machine check exceptions,
and improve the output a bit, decode NIP symbol, improve alignment, etc.
Also print a specific header for machine check, because we do expect to
see these if there is a hardware failure.
Before:
[ 0.005968779,3] ***********************************************
[ 0.005974102,3] Unexpected exception 200 !
[ 0.005978696,3] SRR0 : 000000003002ad80 SRR1 : 9000000000001000
[ 0.005985239,3] HSRR0: 00000000300027b4 HSRR1: 9000000030001000
[ 0.005991782,3] LR : 000000003002ad80 CTR : 0000000000000000
[ 0.005998130,3] CFAR : 00000000300b58bc
[ 0.006002769,3] CR : 40000004 XER: 20000000
[ 0.006008069,3] GPR00: 000000003002ad80 GPR16: 0000000000000000
[ 0.006015170,3] GPR01: 0000000031c03bd0 GPR17: 0000000000000000
[...]
After:
[ 0.003287941,3] ***********************************************
[ 0.003561769,3] Fatal MCE at 000000003002ad80 .nvram_init+0x24
[ 0.003579628,3] CFAR : 00000000300b5964
[ 0.003584268,3] SRR0 : 000000003002ad80 SRR1 : 9000000000001000
[ 0.003590812,3] HSRR0: 00000000300027b4 HSRR1: 9000000030001000
[ 0.003597355,3] DSISR: 00000000 DAR : 0000000000000000
[ 0.003603480,3] LR : 000000003002ad68 CTR : 0000000030093d80
[ 0.003609930,3] CR : 40000004 XER : 20000000
[ 0.003615698,3] GPR00: 00000000300149e8 GPR16: 0000000000000000
[ 0.003622799,3] GPR01: 0000000031c03bc0 GPR17: 0000000000000000
[...]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The current boot sequence inherits MSR[ME] from the IPL firmware, and
never changes it. Some environments disable MSR[ME] (e.g., mambo), and
others can enable it (hostboot).
This has two problems. First, MSR[ME] must be disabled while in the
process of taking over the interrupt vector from the previous
environment. Second, after installing our machine check handler,
MSR[ME] should be enabled to get some useful output rather than a
checkstop.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
get_symbol is difficult to use. Add snprintf_symbol helper which
prints a symbol into a buffer with length, and returns the number
of bytes used, similarly to snprintf. Use this in the stack dumping
code rather than open-coding it.
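The shape of such a helper can be sketched as follows. This is an assumption-laden illustration, not the skiboot function: the symbol table, its contents, and the name sketch_snprintf_symbol() are made up, and unlike snprintf this version returns the number of bytes actually stored.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical symbol table: [start, end) address ranges. */
struct sym {
	uint64_t start;
	uint64_t end;
	const char *name;
};

static const struct sym sym_table[] = {
	{ 0x1000, 0x1100, "alpha" },
	{ 0x1100, 0x1200, "beta" },
};

/*
 * Print "name+0xoffset" for addr into buf. Returns the number of bytes
 * stored (excluding the terminator), or 0 if addr matches no symbol.
 */
static size_t sketch_snprintf_symbol(char *buf, size_t len, uint64_t addr)
{
	size_t count = sizeof(sym_table) / sizeof(sym_table[0]);

	for (size_t i = 0; i < count; i++) {
		const struct sym *s = &sym_table[i];
		int n;

		if (addr < s->start || addr >= s->end)
			continue;
		n = snprintf(buf, len, "%s+0x%llx", s->name,
			     (unsigned long long)(addr - s->start));
		if (n < 0)
			return 0;
		if ((size_t)n < len)
			return (size_t)n;
		return len ? len - 1 : 0;	/* output was truncated */
	}
	if (len)
		buf[0] = '\0';
	return 0;
}
```

Because the return value is a byte count, a stack dumper can append the symbol to a partially filled line by advancing its write pointer by that amount.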
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
OCC shares the frequency list to host by copying the pstate table to
main memory in HOMER. This table is parsed during boot to create
device-tree properties for frequency and pstate IDs. OCC can update
the pstate table to present a new set of frequencies to the host. But
host will remain oblivious to these changes unless it is re-inited
with the updated device-tree CPU frequency properties. So this patch
allows the pstate table to be re-parsed and the device-tree
properties to be updated during fast-reboot.
OCC updates the pstate table when asked to do so using the pstate-table
bias command. This is mainly used by the WOF team for
characterization purposes.
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Tested-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
pci_reset() currently does a platform reboot if it fails. It
should not know about fast-reboot at this level, so instead have
it return an error, and the fast reboot caller will do the
platform reboot.
The code essentially does the same thing, but flexibility is
improved. Ideally the fast reboot code should perform pci_reset
and all such fail-able operations before the CPU resets itself
and destroys its own stack. That's not the case now, but that
should be the goal.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Switch to 512KB mode (directory size) as we don’t use bit 48 of the tag
in addressing the array. This mode is controlled by the Snoop CAPI
Configuration Register.
Set the maximum number of data polls received before signaling that the
TLBI hang detect timer has expired. The value of '0000' is equal to 16.
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The page size is encoded in the TVT data [59:63] as @shift+11 but
the tce_kill handler does not do the math right; this fixes it.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-By: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
To be on the safer side, move the imc catalog preload after the STB init
to make sure the imc catalog resource gets verified and measured
properly during loading when both secure and trusted boot modes
are on.
Signed-off-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
When we load a flash resource during OPAL init, STB calls trusted measure
to measure the given resource. If a flash resource gets loaded before STB
initialization, trusted measure cannot measure it properly.
So this patch fixes this issue by calling trusted measure only if the
corresponding trusted init was done.
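The guard amounts to checking an init flag before measuring. A minimal sketch, with hypothetical names (trusted_init_done, measured_count, measure_resource()) that do not come from libstb:

```c
#include <stdbool.h>

/* Hypothetical state: set once the trusted boot init has completed. */
static bool trusted_init_done;
static int measured_count;

/*
 * Measure a resource only when the trusted init has run. Returns 0
 * when the resource was measured and -1 when the call was skipped.
 */
static int measure_resource(const char *name)
{
	(void)name;	/* a real implementation would hash the resource */

	if (!trusted_init_done)
		return -1;
	measured_count++;
	return 0;
}
```

A resource loaded before init is skipped rather than measured incorrectly, which is why the message still calls for running the init earlier as the proper fix.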
The ideal fix is to make sure STB init is done first during init
and only then load the flash resources, so that STB can properly
verify and measure all the resources.
Signed-off-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Currently, at OPAL init time, we load various PNOR partition containers
from the flash device at various stages. When we load a flash
resource, STB calls the CVC verify and trusted measure (sha512) functions.
So when a flash resource gets loaded before STB initialization,
the CVC verify function fails to start the verify and enforce the boot.
Below is an example failure where our VERSION partition gets
loaded early in the boot stage, before STB initialization is done.
This is with secure mode off.
STB: VERSION NOT VERIFIED, invalid param. buf=0x305ed930, len=4096 key-hash=0x0 hash-size=0
In the same code path when secure mode is on, the boot process will abort.
So this patch fixes this issue by calling CVC verify only if
STB init was done.
We also need a permanent fix in the init path to ensure STB init gets
done first, before loading all other flash resources.
Signed-off-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Currently libstb logs the verify and hash calculation messages at
PR_INFO level. So when secure boot enforcement happens while
loading the last flash resource (e.g. BOOTKERNEL), the previous verify
and measure messages are not logged to the console, so it is not clear
to the end user which resources were verified and measured.
So this patch fixes this by increasing the log level to PR_NOTICE.
Signed-off-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
When exercising more than one CAPI accelerator simultaneously in
cache coherency mode, the verification team is seeing a deadlock. To
fix this a workaround of disabling CAPP virtual machines is
suggested. These 'virtual machines' let PSL queue multiple CAPP
commands for servicing by CAPP, thereby increasing
throughput. Below is the error scenario described by the h/w team:
" With virtual machines enabled we had a deadlock scenario where with 2
or more CAPI's in a system you could get in a deadlock scenario due to
cast-outs that are required break the deadlock (evict lines that
another CAPI is requesting) get stuck in the virtual machine queue by
a command ahead of it that is being retried by the same scenario in
the other CAPI. "
So this patch updates the CAPP APC Master Powerbus control
register during CAPP init to also set Bit(12), which disables CAPP
virtual machines. This forces processing of CAPP commands from PSL one
at a time, thereby preventing the above-mentioned deadlock scenario.
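Assuming Bit(12) is given in IBM (big-endian) bit numbering, where bit 0 is the most significant bit of the 64-bit register, setting it can be sketched as below. IBM_BIT() and disable_capp_vm() are illustrative names, and the register value is whatever was read beforehand.

```c
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of 64. */
#define IBM_BIT(n)	(0x8000000000000000ULL >> (n))

/* Return the register value with bit 12 (IBM numbering) set. */
static uint64_t disable_capp_vm(uint64_t reg)
{
	return reg | IBM_BIT(12);
}
```

In conventional little-endian bit numbering this is bit 51, so the mask is 0x0008000000000000.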
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Presently during a CRESET the CAPP recovery sequence can be executed
multiple times in case PBCQ on the PEC is still busy processing in/out
bound inflight transactions.
This patch updates phb4_creset() to perform capp-recovery sequence via
do_capp_recovery_scoms() only when PBCQ General Status Register
reports no pending transactions.
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The ibm,vpd blob contains a VN field. Use that to populate the vendor
property for various FRUs.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
APSS is not available on platforms like Zaius and Romulus, where OCC
can only measure Vdd (core) and Vdn (nest) power from the AVSbus
reading. So all the sensors for APSS channels will be populated
with 0. Different component power sensors like system and memory,
which point to the APSS channels, will also be 0.
As per the OCC team (Martha Broyles), a zeroed power sensor means that the
system does not have it. So this patch filters out these sensors.
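The filtering rule can be sketched as a pass that keeps only sensors with a non-zero reading. struct sensor, its fields, and keep_present() are hypothetical; the sketch does not reflect how the OCC sensor data is actually laid out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sensor record. */
struct sensor {
	const char *name;
	uint32_t reading;
};

/*
 * Copy the sensors with a non-zero reading from in[] to out[] and
 * return how many were kept. out[] must have room for n entries.
 */
static size_t keep_present(const struct sensor *in, size_t n,
			   struct sensor *out)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++)
		if (in[i].reading != 0)
			out[kept++] = in[i];
	return kept;
}
```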
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
For opencapi, the trigger page of an interrupt is mapped to user
space. The intent is to write the page to raise an interrupt but
there's nothing to prevent a user process from reading it, which has
the unfortunate consequence of checkstopping the system.
Mask the FIR bit raised when an MMIO operation targets an invalid
location. It's the recommendation from recent documentation and
hostboot is expected to mask it at some point. In the meantime, let's
play it safe.
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Each nvlink device is associated with a particular group of OBUS lanes via
a lane mask which is read from HDAT via the device-tree. However Skiboot's
interpretation of lane mask was different to what is exported from the
HDAT.
Specifically the lane mask bits in the HDAT are encoded in IBM bit ordering
for a 24-bit wide value. So for example in normal bit ordering lane-0 is
represented by having lane-mask bit 23 set and lane-23 is represented by
lane-mask bit 0. This patch alters the Skiboot interpretation to match what
is passed from HDAT.
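The mapping described above (lane 0 at conventional bit 23, lane 23 at conventional bit 0) can be sketched as follows. The helper names are hypothetical and only illustrate the bit arithmetic for a 24-bit mask.

```c
#include <stdint.h>

#define LANE_COUNT	24

/* Mask with only the given lane set, using IBM ordering in 24 bits. */
static uint32_t lane_to_mask(unsigned int lane)
{
	if (lane >= LANE_COUNT)
		return 0;
	return 1u << (LANE_COUNT - 1 - lane);
}

/* Non-zero if the given lane is present in mask. */
static int lane_in_mask(uint32_t mask, unsigned int lane)
{
	return (mask & lane_to_mask(lane)) != 0;
}
```

For example, a mask of 0xff0000 covers lanes 0 to 7 under this interpretation, not lanes 16 to 23.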
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Newer versions of Hostboot will not power up the NVLink PHY lanes by
default. The phy_reset procedure already powers up the lanes but they also
need to be powered up in order to access the DL.
The reset_ntl procedure is called by the device driver to bring the DL out
of reset and get it into a working state. Therefore we also need to add
lane and clock power up to the reset_ntl procedure.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
There are userspace tools that update the planar VPD via the sysfs
interface. Currently we do not get correct information from hostboot
about the exact type of the EEPROM so we need to manually fix it up
here. This needs to be done as a platform specific fix since there is
no standardised VPD EEPROM type.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
In Debian/Ubuntu, the packaging system likes to have a full cleanup that
restores the tree back to original one, so add some files to the distclean
target.
Signed-off-by: Frédéric Bonnard <frediz@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
For the needs of Debian/Ubuntu packaging, I inferred some initial man
pages from their help output.
Signed-off-by: Frédéric Bonnard <frediz@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The next_extry label doesn't do anything other than perform an addition
which requires a dereference of the NULL entry variable, just continue
the loop instead.
Fixes: 77190aa7 (hdata/vpd: Rework vpd node creation logic)
Signed-off-by: Cyril Bur <cyril.bur@au1.ibm.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
A likely copy and paste oversight.
Fixes: 0d84ea6b (core: Add support for quiescing OPAL)
Signed-off-by: Cyril Bur <cyril.bur@au1.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
The STOP API generates an SPR restore instruction for a given SPR.
This commit fixes the generation of the mtspr instruction by the API.
The problem will show up only when the API is changed to generate the
restore instruction using a GPR other than R0.
CQ: SW407799
Change-Id: I2a841a9aae417b7bcd92a323197d9c6a1f3cb149
Reviewed-on: http://ralgit01.raleigh.ibm.com/gerrit1/49525
Tested-by: FSP CI Jenkins <fsp-CI-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins Server <pfd-jenkins+hostboot@us.ibm.com>
Tested-by: Hostboot CI <hostboot-ci+hostboot@us.ibm.com>
Reviewed-by: RANGANATHPRASAD G. BRAHMASAMUDRA <prasadbgr@in.ibm.com>
Reviewed-by: STEWART E. SMITH <stewart@linux.vnet.ibm.com>
Dev-Ready: Gregory S. Still <stillgs@us.ibm.com>
Reviewed-by: Gregory S. Still <stillgs@us.ibm.com>
Reviewed-on: http://ralgit01.raleigh.ibm.com/gerrit1/49529
Reviewed-by: Hostboot Team <hostboot@us.ibm.com>
Tested-by: Jenkins OP Build CI <op-jenkins+hostboot@us.ibm.com>
Tested-by: Jenkins OP HW <op-hw-jenkins+hostboot@us.ibm.com>
Reviewed-by: Christian R. Geddes <crgeddes@us.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
This reverts commit 65f9abea8e8cfd7f711a5c54217b5505826ff497.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Register the location of the secure ROM, not the address of the location.
Fixes: 594c7a6ae3ccc
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Invalid accesses from the GPU can cause a specific PE to be frozen by the
NPU. Add an interrupt handler which reports the frozen PE to the operating
system as an EEH event.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Add a mechanism to enable/disable sw checkstop by looking at nvram option
opal-sw-xstop=<enable/disable>.
For now this patch disables the sw checkstop trigger unless explicitly
enabled through the nvram option 'opal-sw-xstop=enable' for p9. This gives
an opportunity to get the host kernel into the panic path or xmon for
unrecoverable HMIs or MCEs, to be able to debug the issue effectively.
To enable sw checkstop in opal issue following command:
# nvram -p ibm,skiboot --update-config opal-sw-xstop=enable
NOTE: This is a workaround patch to disable sw checkstop by default to gain
control in host kernel for better checkstop debugging. Once we have most of
the checkstop issues stabilized/resolved, revisit this patch to enable sw
checkstop by default.
For p8 platform it will remain enabled by default unless explicitly disabled.
To disable sw checkstop on p8 issue following command:
# nvram -p ibm,skiboot --update-config opal-sw-xstop=disable
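The resulting policy (explicit option wins, otherwise the default depends on the processor generation) can be sketched as below. enum proc_gen_sketch and sw_xstop_enabled() are hypothetical; opt stands for the value of opal-sw-xstop read from nvram, or NULL when the option is not set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical processor generation tag. */
enum proc_gen_sketch { GEN_P8, GEN_P9 };

/*
 * Decide whether the sw checkstop trigger is active. An explicit
 * "enable" or "disable" always wins; otherwise p8 defaults to enabled
 * and p9 defaults to disabled.
 */
static bool sw_xstop_enabled(enum proc_gen_sketch gen, const char *opt)
{
	if (opt && strcmp(opt, "enable") == 0)
		return true;
	if (opt && strcmp(opt, "disable") == 0)
		return false;
	return gen == GEN_P8;
}
```

Any other value of the option is treated like an unset option in this sketch; how the real code handles unknown values is not stated in the message.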
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
An old Witherspoon platform definition was added to aid the transition from
versions of Hostboot which didn't have the correct NVLink HDAT information
available and/or planar VPD. These systems should now be updated so remove
the possibly incorrect default assumption.
This may disable NVLink on old out-dated systems but it can easily be
restored with the appropriate FW and/or VPD updates. In any case there is
a 50% chance the existing default behaviour was incorrect as it only
supports 6 GPU systems. Using an incorrect platform definition leads to
undefined behaviour which is more difficult to detect/debug than not
creating the NVLink devices so remove the possibly incorrect default
behaviour.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
All the nodes under the vpd hierarchy have a unit address (their SLCA
index) but no reg properties. Add them and their size/address cells
to squash the warnings.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Hostboot doesn't give us accurate information about the DIMM SPD
devices. Hack around by assuming any EEPROM we find on the SPD I2C
master is an SPD eeprom.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
There's no such thing as a 412Kb EEPROM.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|
|
Fixes: 35c66b8ce5a27ad3312806e8bde9148a5e5b5df8
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
|