Age | Commit message | Author | Files | Lines
2020-04-08 | hdata/memory: Add support for memory-buffer mmio | Oliver O'Halloran | 1 | -14/+125
HDAT now allows associating a set of MMIO address ranges with an MSAREA. This is to allow for exporting the MMIO register space associated with a memory-buffer chip to the hypervisor so we can wire up access to that for PRD. The DT format is similar to the old centaur memory-buffer@<addr> nodes that we had on P8 OpenPower systems. The biggest difference is that the HDAT format allows for multiple memory ranges on each "chip" and each of these ranges may have a different register size. Cc: Klaus Heinrich Kiwi <klaus@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-04-08 | hw/centaur: Convert to use the new scom API | Oliver O'Halloran | 3 | -10/+16
Currently we assume any xscom_read / write targeted at a chipid with 0x8 as the top four bits is intended to be a centaur SCOM. On non-P8 platforms there is no reason to assume this, so convert it to use the new struct scom_controller infrastructure. Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-04-08 | hw/xscom: Add scom infrastructure | Oliver O'Halloran | 2 | -0/+87
Currently the top nibble of the "part ID" is used to determine the type of a xscom_read() / xscom_write() call. This was mainly done for the benefit of PRD on P8, which would do "targeted" SCOMs to EX (core) chiplets and rely on skiboot to find the actual SCOM address. Similarly, PRD also relied on this to access the SCOMs of centaur chips, which are accessed via FSI on P8. On P9, PRD moved to only doing non-targeted SCOMs where it would only ever supply a "part ID" which was the fabric ID of the chip to be SCOMed. The centaur support was also unnecessary since OPAL didn't support any P9 systems with Centaurs. However, on future systems we will have to support memory buffer chips again, so we need to expand the SCOM support to accommodate them. To do this, allow skiboot components to register SCOM read() and write() functions for a chip ID. This will allow us to ensure the P8 EX chiplet and Centaur SCOM code is only ever used on P8, freeing up the Part ID address space for other uses. Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
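The registration idea described above can be sketched roughly as follows. This is a minimal illustration, not skiboot's actual code: the struct layout, the function names (scom_ctrl_register, scom_read, scom_write), the return codes and the toy register-file backend are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical controller record: one per chip ("part ID") that wants
 * its own SCOM access method. Not the real skiboot structure. */
struct scom_ctrl {
	uint32_t part_id;
	int (*read)(struct scom_ctrl *c, uint64_t addr, uint64_t *val);
	int (*write)(struct scom_ctrl *c, uint64_t addr, uint64_t val);
	struct scom_ctrl *next;
};

static struct scom_ctrl *scom_ctrls; /* registered controllers */

static struct scom_ctrl *scom_ctrl_find(uint32_t part_id)
{
	struct scom_ctrl *c;

	for (c = scom_ctrls; c; c = c->next)
		if (c->part_id == part_id)
			return c;
	return NULL;
}

/* Returns 0 on success, -1 if the part ID is already claimed. */
int scom_ctrl_register(struct scom_ctrl *c)
{
	if (scom_ctrl_find(c->part_id))
		return -1;
	c->next = scom_ctrls;
	scom_ctrls = c;
	return 0;
}

/* Returns -2 when nothing is registered for part_id, so a caller
 * could fall back to its default access path. */
int scom_read(uint32_t part_id, uint64_t addr, uint64_t *val)
{
	struct scom_ctrl *c = scom_ctrl_find(part_id);

	return c ? c->read(c, addr, val) : -2;
}

int scom_write(uint32_t part_id, uint64_t addr, uint64_t val)
{
	struct scom_ctrl *c = scom_ctrl_find(part_id);

	return c ? c->write(c, addr, val) : -2;
}

/* Toy backend: a four-entry register file standing in for hardware. */
static uint64_t toy_regs[4];

static int toy_read(struct scom_ctrl *c, uint64_t addr, uint64_t *val)
{
	(void)c;
	if (addr >= 4)
		return -1;
	*val = toy_regs[addr];
	return 0;
}

static int toy_write(struct scom_ctrl *c, uint64_t addr, uint64_t val)
{
	(void)c;
	if (addr >= 4)
		return -1;
	toy_regs[addr] = val;
	return 0;
}

struct scom_ctrl toy_ctrl = {
	.part_id = 0x80000000u, /* arbitrary example ID */
	.read = toy_read,
	.write = toy_write,
};
```

Looking the handler up by the full part ID, rather than decoding its top nibble, is what frees the rest of the ID space for other uses.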
2020-04-01 | docs: Fix ref to skiboot-6.4 in 6.5 release notes [v6.6-rc1] | Oliver O'Halloran | 1 | -1/+1
I like to click things. Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-31 | platform/mihawk: support dynamic PCIe slot table | Joy Chu | 1 | -16/+212
Auto-detect the slot table for different riser cards by using an IPMI OEM command to communicate with the BMC. Signed-off-by: Joy Chu <joy_chu@wistron.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-30 | hw/phb4: Tune GPU direct performance on witherspoon in PCI mode | Frederic Barrat | 3 | -24/+78
Good GPU direct performance on witherspoon, with a Mellanox adapter on the shared slot, requires reallocating some DMA engines within PEC2, "stealing" some from PHB4&5 and giving extras to PHB3. This is currently done when using CAPI mode, but the same is true if the adapter stays in PCI mode. In preparation for upcoming versions of MOFED, which may not use CAPI mode, this patch reallocates DMA engines even in PCI mode for a series of Mellanox adapters that can be used with GPU direct, on witherspoon and on the shared slot only. The loss of DMA engines for PHB4&5 on witherspoon has not shown problems in testing, nor in current deployments where CAPI mode is used. Here is a comparison of the bandwidth numbers seen with the PHB in PCI mode (no CAPI) with and without this patch. Variations on smaller packet sizes can be attributed to jitter and are not that meaningful.
# OSU MPI-CUDA Bi-Directional Bandwidth Test v5.6.1
# Send Buffer on DEVICE (D) and Receive Buffer on DEVICE (D)
# Size      Bandwidth (MB/s)   Bandwidth (MB/s)
#           with patch         without patch
1           1.29               1.48
2           2.66               3.04
4           5.34               5.93
8           10.68              11.86
16          21.39              23.71
32          42.78              49.15
64          85.43              97.67
128         170.82             196.64
256         385.47             383.02
512         774.68             755.54
1024        1535.14            1495.30
2048        2599.31            2561.60
4096        5192.31            5092.47
8192        9930.30            9566.90
16384       18189.81           16803.42
32768       24671.48           21383.57
65536       28977.71           24104.50
131072      31110.55           25858.95
262144      32180.64           26470.61
524288      32842.23           26961.93
1048576     33184.87           27217.38
2097152     33342.67           27338.08
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Cc: skiboot-stable@lists.ozlabs.org # skiboot-op940.x Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-30 | hw/imc: Add error message on failing cases for imc_init | Madhavan Srinivasan | 1 | -3/+11
Add a couple more debug messages to help understand possible failures in imc_init(). Currently the only message printed is "IMC Devices not added", which is not very helpful when debugging. Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-30 | Revert "FSP: Disable PSI link whenever FSP tells OPAL about impending R/R" | Vasant Hegde | 3 | -9/+19
This reverts commit a4788a49f004a91bb8ca015336abf9ae119fbc52. The above patch was added to handle host power down with the FSP in R/R state, but the FSP does not like OPAL giving up the PSI link early in the R/R process. For an FSP-initiated R/R, OPAL should wait until we get the PSI interrupt, hence reverting the above commit. Also partially revert commit e04a34af to make fsp_dpo_pending a global variable. We have made several improvements in the way we handle FSP communication and also in the power down path. Now if the host sends a powerdown message when the FSP is in R/R, OPAL returns OPAL_BUSY_EVENT. The kernel will run poller() and retry the power down message after some time. So I think this patch will not have any side effect on the power down path. Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Acked-by: Ananth N Mavinakayanahalli <ananth@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-30 | skiboot v6.0.22 release notes | Vasant Hegde | 1 | -0/+21
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-20 | hw/prd: Hold FSP notifications while PRD is inactive | Oliver O'Halloran | 1 | -1/+12
On FSP systems we rely on a service on the FSP to send us a notification when the OCCs become active. On systems with NVDIMMs this is especially critical because the OCC is responsible for starting the NVDIMM save procedure when power fails. The message sent from the FSP isn't sent to OPAL itself, rather it's sent to the PRD service running on the host (via OPAL). If this service is not running, OPAL will currently send an error response back to the FSP and drop the message. This causes problems because the OCCs active message is generally sent while OPAL is still booting the system, so the PRD daemon never gets notified that the OCC is active. Once the OS is running we rely on PRD to report the protection status of the NVDIMMs on the system. However, because it never receives the notification from the FSP it will always report the DIMMs as un-protected because it thinks the OCCs are inactive. This patch fixes the issue by allowing a single message to be held in OPAL while PRD is inactive. Once OPAL receives a notification that PRD has started we deliver the message. It's worth pointing out that this is kind of janky and brittle and would probably break horribly if FSP notify messages were multi-part, since we could end up in a situation where only a single part of a multi-part message is queued, with the rest being dropped. However, the only user of the FSP notification message appears to be the OCC, and the OCC team says it's not a problem. I'll take their word for it. Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
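A rough sketch of the "hold a single message" behaviour is shown below. It is illustrative only: the names (fsp_notify, prd_started, deliver), the 64-byte slot and the choice to reject a second message while one is already held are assumptions for the example, not details taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define HELD_MAX 64 /* hypothetical upper bound on a notify message */

static bool prd_active;  /* has the PRD daemon started? */
static bool have_held;   /* is the single hold slot occupied? */
static unsigned char held_msg[HELD_MAX];
static size_t held_len;

/* Stand-in for handing a message to PRD; records what was delivered. */
static unsigned char delivered[HELD_MAX];
static int delivered_count;

static void deliver(const void *msg, size_t len)
{
	memcpy(delivered, msg, len);
	delivered_count++;
}

/* Returns 0 if the message was delivered or held, -1 if dropped. */
int fsp_notify(const void *msg, size_t len)
{
	if (len > HELD_MAX)
		return -1;
	if (prd_active) {
		deliver(msg, len);
		return 0;
	}
	if (have_held)
		return -1; /* only one slot: this is why multi-part messages would break */
	memcpy(held_msg, msg, len);
	held_len = len;
	have_held = true;
	return 0;
}

/* Called once PRD announces itself; flushes the held message, if any. */
void prd_started(void)
{
	prd_active = true;
	if (have_held) {
		deliver(held_msg, held_len);
		have_held = false;
	}
}
```

The single slot makes the multi-part limitation mentioned in the commit message concrete: only the first part would be held, and later parts would be dropped.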
2020-03-20 | skiboot v6.5.4 release notes | Vasant Hegde | 1 | -0/+16
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2020-03-13 | Re-license contributions from Yadro | Oliver O'Halloran | 3 | -3/+3
Cc: Ilya Kuznetsov <ilya@yadro.com> Cc: Artem Senichev <artemsen@gmail.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-13 | Re-license contributions from Dan Horák | Oliver O'Halloran | 3 | -3/+3
Cc: Dan Horák <dan@danny.cz> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-12 | Re-license contributions from Raptor Computer Systems | Oliver O'Halloran | 5 | -5/+5
The following files contain contributions from Timothy Pearson at Raptor Computer Systems. He has agreed to re-license these contributions as Dual Apache 2.0 / GPLv2+, so amend the SPDX tag to reflect that.
hw/phb4.c
include/phb4.h
include/platform.h
platforms/astbmc/talos.c
platforms/astbmc/romulus.c
Cc: Timothy Pearson <tpearson@raptorengineering.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-12 | Re-license IBM written files as Apache 2.0 OR GPLv2+ | Stewart Smith | 471 | -470/+471
SPDX makes it a simpler diff. I have audited the commit history of each file to ensure that they are exclusively authored by IBM and thus we have the right to relicense. The motivation behind this is twofold:
1) We want to enable experiments with coreboot, which is GPLv2 licensed
2) An upcoming firmware component wants to incorporate code from skiboot and code from the Linux kernel, which is GPLv2 licensed.
I have gone through the IBM internal way of gaining approval for this. The following files are not exclusively authored by IBM, so are *not* included in this update (I will be seeking approval from contributors):
core/direct-controls.c
core/flash.c
core/pcie-slot.c
external/common/arch_flash_unknown.c
external/common/rules.mk
external/gard/Makefile
external/gard/rules.mk
external/opal-prd/Makefile
external/pflash/Makefile
external/xscom-utils/Makefile
hdata/vpd.c
hw/dts.c
hw/ipmi/ipmi-watchdog.c
hw/phb4.c
include/cpu.h
include/phb4.h
include/platform.h
libflash/libffs.c
libstb/mbedtls/sha512.c
libstb/mbedtls/sha512.h
platforms/astbmc/barreleye.c
platforms/astbmc/garrison.c
platforms/astbmc/mihawk.c
platforms/astbmc/nicole.c
platforms/astbmc/p8dnu.c
platforms/astbmc/p8dtu.c
platforms/astbmc/p9dsu.c
platforms/astbmc/vesnin.c
platforms/rhesus/ec/config.h
platforms/rhesus/ec/gpio.h
platforms/rhesus/gpio.c
platforms/rhesus/rhesus.c
platforms/astbmc/talos.c
platforms/astbmc/romulus.c
Signed-off-by: Stewart Smith <stewart@linux.ibm.com> [oliver: fixed up the drift] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-11 | skiboot v6.5.3 release notes | Vasant Hegde | 1 | -0/+24
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-11 | tpm_i2c_nuvoton: check TPM vendor id register during probe | Eric Richter | 1 | -0/+14
The driver for the nuvoton i2c TPM does not currently check if there is a functional TPM at the bus and address given by the device tree. This patch adds a simple check of the TPM vendor id register, compares it against the known expected value for the chip, and skips registering the TPM if the chip is inaccessible or returns an unexpected id. Signed-off-by: Eric Richter <erichte@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
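The probe-time check could look something like the sketch below. Everything here is hypothetical: the function names, the callback type, the return codes and in particular the EXPECTED_VENDOR_ID constant, which is a placeholder and not the real register value for the chip.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder only; the real expected vendor ID is not given here. */
#define EXPECTED_VENDOR_ID 0x12345678u

/* Hypothetical accessor: reads the vendor id register of 'dev' into
 * 'out', returning 0 on success and non-zero on a bus error. */
typedef int (*vendor_id_read_fn)(void *dev, uint32_t *out);

/* Returns 0 if the TPM should be registered, -1 if the chip is
 * inaccessible, -2 if it answered with an unexpected ID. */
int tpm_probe_check(vendor_id_read_fn read_vendor_id, void *dev)
{
	uint32_t id = 0;

	if (read_vendor_id(dev, &id) != 0)
		return -1;
	if (id != EXPECTED_VENDOR_ID)
		return -2;
	return 0;
}

/* Fake accessor for experimentation: 'dev' points at the ID value the
 * pretend chip returns; a NULL 'dev' simulates an absent chip. */
int fake_vendor_id_read(void *dev, uint32_t *out)
{
	if (!dev)
		return -1;
	*out = *(uint32_t *)dev;
	return 0;
}
```

A probe routine would only go on to register the device when the check returns 0.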
2020-03-11 | external/mambo: rename env var PMEM_MODES to PMEM_MODE | acsawdey | 1 | -2/+2
This patch just renames the env var to be non-plural to match PMEM_DISK. A related patch going in to mambo will make the actual mode used in mambo be called "private" instead of "cow" to reflect the fact that you're getting a private copy of the image file and that any data written to it will be lost when mambo exits. Unlike the old bogusdisk driver, there is no separate .cow file where the writes go. Signed-off-by: Aaron Sawdey <acsawdey@linux.ibm.com> Acked-by: Michael Neuling <mikey@neuling.org> [oliver: fixed whitespace damage on the patch] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-11 | core/pci.c: cleanup pci_add_loc_code() | Klaus Heinrich Kiwi | 1 | -20/+15
Minor cleanups to add clarity after commit ab1b05d2 "PCI: create optional loc-code platform callback" Signed-off-by: Klaus Heinrich Kiwi <klaus@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-11 | platform/mihawk: add nvme devices slot table | Joy Chu | 1 | -16/+76
Add an NVMe slot table to support the Broadcom gen4 NVMe HBA card. Signed-off-by: Joy Chu <joy_chu@wistron.com> [oliver: fixed statement with no effect warning] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-11 | hw/fsp: Fix GENERIC_FAILURE mailbox status code | Oliver O'Halloran | 3 | -5/+4
The 0xEF return code is used to tell the hypervisor that the FSP was not able to replicate an NVRAM write to the secondary FSP. The GENERIC_FAILURE is using this code instead of the correct 0xFE code which indicates a generic error condition. We already have a FSP_STATUS_GENERIC_ERROR for 0xFE so convert the existing users of FSP_STATUS_GENERIC_FAILURE to use GENERIC_ERROR and remove the duplicate. Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-11 | hw/fsp: Remove stray va_end() in __fsp_fillmsg() | Oliver O'Halloran | 1 | -1/+0
__fsp_fillmsg() is called from fsp_fillmsg() and fsp_mkmsg(). Both callers wrap it in a va_start() / va_end() pair so using va_end() inside the function is almost certainly wrong. Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
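The ownership rule behind this fix can be shown with a small standalone example. The function names here are made up for illustration and are not the skiboot functions: the variadic wrapper that calls va_start() is the one that must call va_end(), while a helper that merely receives the va_list must not.

```c
#include <stdarg.h>

/* Helper that consumes arguments from a va_list owned by its caller.
 * It deliberately does NOT call va_end(): the va_list was started by
 * the caller, so ending it is the caller's job. */
static int sum_words(int count, va_list ap)
{
	int total = 0;

	while (count-- > 0)
		total += va_arg(ap, int);
	return total;
}

/* Variadic wrapper: owns the va_start() / va_end() pair. */
int sum_all(int count, ...)
{
	va_list ap;
	int total;

	va_start(ap, count);
	total = sum_words(count, ap);
	va_end(ap);
	return total;
}
```

Here sum_all(3, 1, 2, 3) returns 6, and each va_start() is matched by exactly one va_end() in the same function.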
2020-03-11 | errorlog: Increase the severity of abnormal reboot events | Vasant Hegde | 1 | -1/+1
Currently Linux will usually call opal_cec_reboot2() in response to unrecoverable HMIs and other serious hardware errors. OPAL handles platform errors by sending an error log to the BMC / FSP and triggering a software checkstop. Sending error logs to the BMC / FSP is normally an async operation, but in this path we need to ensure that error logs are sent out before the xstop is triggered. The easiest way to do that is to escalate the severity of the generated error log from "abnormal reboot" to "panic", since we force panic logs to be sent synchronously. It's also a more accurate description of what's happening. CC: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Reviewed-by: Klaus Heinrich Kiwi <klaus@linux.vnet.ibm.com> [oliver: commit message] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-11 | eSEL: Make sure PANIC logs are sent to BMC before calling assert | Vasant Hegde | 1 | -2/+15
eSEL logs are split into multiple smaller chunks and sent to the BMC. We use the ipmi_queue_msg_sync() interface for sending OPAL_ERROR_PANIC severity events to the BMC. But the callback handler (ipmi_cmd_done()) clears 'sync_msg' after getting the response to the first chunk, as it's not aware that we have more data to send. So in the assert()/checkstop path we may end up checkstopping the system before the error log is completely sent to the BMC, and we will miss a useful error log. This patch introduces a new wait loop in ipmi_elog_commit(). It will wait until the error log is sent to the BMC. I think this is safe because even if something goes wrong (like a BMC reset) we will hit the timeout and eventually come out of this loop. Alternatively we could add an additional check in the ipmi_cmd_done() path, but I didn't want to make this path aware of the message type. Reviewed-by: Klaus Heinrich Kiwi <klaus@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
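The shape of a bounded wait loop like the one described can be sketched as below. This is an assumption-laden illustration: the callback-based interface, the poll-count budget (standing in for a real timeout) and the toy "chunks remaining" state are all invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Poll until done(ctx) reports completion or the poll budget runs out.
 * Returns true on completion, false on timeout. */
bool wait_until_done(bool (*done)(void *ctx), void (*poll)(void *ctx),
		     void *ctx, uint32_t max_polls)
{
	while (!done(ctx)) {
		if (max_polls-- == 0)
			return false; /* give up, e.g. the BMC was reset */
		poll(ctx);
	}
	return true;
}

/* Toy state: number of log chunks still waiting to be sent. */
struct elog_xfer {
	int chunks_left;
};

bool xfer_done(void *ctx)
{
	return ((struct elog_xfer *)ctx)->chunks_left == 0;
}

/* Each poll pretends that one more chunk has been acknowledged. */
void xfer_poll(void *ctx)
{
	((struct elog_xfer *)ctx)->chunks_left--;
}
```

The budget is what keeps the loop safe: even if no response ever arrives, the caller falls out of the loop and can proceed with the checkstop.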
2020-03-11 | esel: Fix OEM SEL generator ID | Vasant Hegde | 1 | -1/+1
Fixes: a2c74d83 (ipmi: endian conversion) Cc: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-11 | ipmi-sel: Free ipmi_msg in error path | Vasant Hegde | 1 | -0/+1
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-11 | tpm_i2c_nuvoton: fix tpm_read_fifo overflow check | Mauro S. M. Rodrigues | 1 | -0/+1
The tpm_read_fifo() function expects a buflen parameter, which is the size of the buf parameter. Later it uses buflen to check for an overflow in case the TPM response is bigger than buf's capacity. The check is fine, but it doesn't interrupt the code flow, so even though we see error messages about the overflow, it doesn't prevent it. Adding a goto after setting the error return code fixes it. Signed-off-by: Mauro S. M. Rodrigues <maurosr@linux.vnet.ibm.com> Reviewed-by: Klaus Heinrich Kiwi <klausk@linux.vnet.ibm.com> Reviewed-by: Claudio Carvalho <cclaudio@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
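The pattern being fixed can be illustrated with a simplified stand-in. This is not the driver code: read_fifo, the 4-byte burst size and the in-memory "fifo" are assumptions chosen to keep the example self-contained.

```c
#include <stddef.h>
#include <string.h>

#define BURST 4 /* hypothetical number of bytes available per burst */

/* Copy a response of fifo_len bytes into buf (capacity buflen) in
 * bursts. Returns 0 on success, -1 if the response does not fit.
 * *out_len is set to the number of bytes actually copied. */
int read_fifo(const unsigned char *fifo, size_t fifo_len,
	      unsigned char *buf, size_t buflen, size_t *out_len)
{
	size_t count = 0;
	int rc = 0;

	while (count < fifo_len) {
		size_t burst = fifo_len - count;

		if (burst > BURST)
			burst = BURST;
		if (count + burst > buflen) {
			rc = -1;
			/* Without this goto the error would be recorded
			 * but the copy below would still overflow buf. */
			goto out;
		}
		memcpy(buf + count, fifo + count, burst);
		count += burst;
	}
out:
	*out_len = count;
	return rc;
}
```

Detecting the overflow is not enough on its own; the error path also has to skip the copy.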
2020-02-26 | firenze-pci: Fix infinite loop in firenze_pci_add_loc_code() | Oliver O'Halloran | 1 | -1/+1
If ibm,slot-location-code isn't in a PCI device's parent node the loop to search for it will never terminate since p = np->parent is always going to return the same result. Fixes: ab1b05d29f5e ("PCI: create optional loc-code platform callback") Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
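The bug class is easy to reproduce in isolation. In the sketch below (a simplified stand-in with made-up struct and function names, not the firenze code) the loop must advance from the cursor p; stepping with p = np->parent instead would revisit the same parent forever whenever that parent lacks the property.

```c
#include <stddef.h>

/* Minimal stand-in for a device tree node. */
struct node {
	struct node *parent;
	const char *slot_loc; /* NULL when the property is absent */
};

/* Walk up from np's parent until a node carrying a slot location is
 * found. Returns NULL if no ancestor has one. */
const char *find_slot_loc(const struct node *np)
{
	const struct node *p;

	/* Correct step: p = p->parent. The buggy form, p = np->parent,
	 * never moves past the first parent. */
	for (p = np->parent; p; p = p->parent)
		if (p->slot_loc)
			return p->slot_loc;
	return NULL;
}
```

With the corrected step the walk either finds the property on some ancestor or terminates at the root.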
2020-02-26 | memory: Sort memory regions list | Vasant Hegde | 3 | -6/+20
The local_alloc() function tries to find the first allocatable local region and allocate memory from it. If that region doesn't have sufficient memory, it logs the warning below and tries to allocate memory from the next available region.
Warning:
--------
[ 268.346728058,3] mem_alloc(0x800000, 0x800000, "hw/xive.c:1630", ibm,firmware-allocs-memory@0) failed !
[ 268.346732718,6] Memory regions:
[ 268.346734353,6] 0x000030500000..000030ffffff : ibm,firmware-heap
[ 268.346833468,5] 420 allocs of 0x00000058 bytes at core/device.c:41 (total 0x9060)
[ 268.346978805,5] 2965 allocs of 0x00000038 bytes at core/device.c:424 (total 0x28898)
[ 268.347035614,5] 434 allocs of 0x00000040 bytes at core/device.c:424 (total 0x6c80)
[ 268.347093567,5] 365 allocs of 0x00000028 bytes at libc/string/strdup.c:23 (total 0x3908)
[ 268.347136818,5] 84 allocs of 0x00000048 bytes at core/device.c:424 (total 0x17a0)
[ 268.347179123,5] 21 allocs of 0x00000030 bytes at libc/string/strdup.c:23 (total 0x3f0)
....
....
Hostboot reserves memory for various nodes and passes this information via HDAT. In some cases there will be small memory holes between two reservations (e.g. 16MB of free space between two reservations). The add_region() function adds a new region to the head of the regions list. mem_region_init() adds OPAL regions first and then hostboot regions, so these smaller regions end up at the head of the list. If we have smaller free regions then we may hit the above warning. Hostboot uses the top of memory for various reservations. So if we sort the memory regions, the allocator will use the bigger region (the region after OPAL memory) for local allocation, and we will not hit the above warning.
Memory region layout with this patch:
0 - 756MB : OS reserved region (for loading kernel)
756MB - ~856MB : OPAL memory (actual size depends on PIR)
856MB - ~956MB : Memory for MPIPL (actual size depends on OPAL runtime size)
956MB - ... : We will have free memory after 956MB which we can use for local_alloc(). Typically this will be multiple GBs. So it works fine.
.... - top_mem : Hostboot reservations + small holes
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
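The difference between adding at the head and keeping the list sorted can be sketched with a plain singly linked list. The struct and function names are hypothetical, and sorting by start address is an assumption made for the example; the real code uses skiboot's own list helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified memory region record. */
struct region {
	uint64_t start;
	uint64_t len;
	struct region *next;
};

/* Insert r so the list stays ordered by ascending start address,
 * instead of always pushing new regions onto the head. */
void region_add_sorted(struct region **head, struct region *r)
{
	struct region **pp = head;

	/* Advance to the first entry that starts at or after r. */
	while (*pp && (*pp)->start < r->start)
		pp = &(*pp)->next;
	r->next = *pp;
	*pp = r;
}
```

With the list in address order, a first-fit walk reaches the large free region just above OPAL memory before it reaches the small holes between hostboot reservations near the top of memory.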
2020-02-26 | list: Add list_add_after() | Vasant Hegde | 1 | -0/+19
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-02-26 | errorlog: Replace hardcode value with macro | Vasant Hegde | 5 | -5/+7
Suggested-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-02-26 | core/fast-reset: Do not enable by default | Oliver O'Halloran | 2 | -20/+25
Fast reboot started life as a debug hack and it escaped into the wild when Stewart enabled it by default. There were some reasons for this, but the main one is that a full reboot takes somewhere between one and five minutes. For those of us who spend all day rebooting their POWER systems this is great, but the utility for end users has always been pretty questionable. Rebooting a system should be a fairly infrequent activity in the field, with the main reasons for doing one being:
1) Kernel updates
2) Misbehaving hardware
Although 1) can be performed by kexec, we have found that it occasionally fails due to 2). The reason for 2) is usually hardware getting itself into a bad state. The universal fix for that type of hardware problem is turning the hardware off and back on again, so it's preferable that a reboot actually does that. This patch refactors the reboot handling OPAL calls so that fast-reboot is only used by default when explicitly enabled, or manually invoked. This allows developers to continue to use fast-reboot without expecting users to deal with its quirks (and understand how a "normal" reboot, fast-reboot and MPIPL differ). This has two user visible changes:
1. Full reboot is now the default. In order to get fast-reboot as the default the nvram option needs to be set:
nvram -p ibm,skiboot --update-config fast-reset=1
2. The nvram option to force a fast-reboot even when some part of skiboot has called disable_fast_reboot() has changed from 'fast-reset=im-feeling-lucky' to 'force-fast-reset=1' because it's impossible to actually use that 'feature' if fast-reboot is off by default.
nvram -p ibm,skiboot --update-config force-fast-reset=1
Cc: Stewart Smith <stewart@flamingspork.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-02-26 | core/platform: Add an explicit fast-reboot type | Oliver O'Halloran | 2 | -0/+5
The OPAL_CEC_REBOOT2 OPAL call allows a specific type of reboot to be requested. We can use this to allow the OS to request a fast-reboot explicitly rather than relying on nvram hacks to change the default behaviour. Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-02-24 | external/mambo: support mambo COW mode for PMEM disk | Aaron Sawdey | 1 | -1/+12
I've added support in mambo's "memory mmap" command to have a "cow" mode which just uses MAP_PRIVATE instead of MAP_SHARED on the file so that writes to the memory region are not sent back to the file. This allows multiple mambo instances to share the same filesystem image. This is implemented by having a PMEM_MODES environment variable. If this is set, it is expected to contain a comma separated list of modes (rw or cow) for the list of files in PMEM_DISK. If there are fewer modes than files, the remaining files default to "rw". Signed-off-by: Aaron Sawdey <acsawdey@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
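The defaulting rule for the mode list can be written out as a short sketch. Note the assumptions: the actual change lives in mambo's Tcl scripts, so this C function (parse_pmem_modes, a made-up name) only models the described semantics, and rejecting unknown mode strings is a choice made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Fill modes[0..nfiles-1] from a comma separated list such as
 * "cow,rw". Files without a listed mode default to "rw". Extra
 * modes beyond nfiles are ignored. Returns 0 on success, -1 if an
 * unknown mode string is encountered. env may be NULL (unset). */
int parse_pmem_modes(const char *env, const char **modes, size_t nfiles)
{
	const char *p = env;
	size_t i = 0;

	while (p && *p && i < nfiles) {
		size_t n = strcspn(p, ","); /* length of this token */

		if (n == 2 && strncmp(p, "rw", 2) == 0)
			modes[i] = "rw";
		else if (n == 3 && strncmp(p, "cow", 3) == 0)
			modes[i] = "cow";
		else
			return -1;
		i++;
		p += n;
		if (*p == ',')
			p++; /* skip the separator */
	}
	/* Fewer modes than files: the remainder default to "rw". */
	for (; i < nfiles; i++)
		modes[i] = "rw";
	return 0;
}
```

For three files and a mode list of "cow,rw", the result is cow, rw, rw.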
2020-02-24 | PCI: create optional loc-code platform callback | Klaus Heinrich Kiwi | 6 | -35/+89
Some platforms (mostly OpenPower-based) will favor a short, slot-label-based string for the "ibm,loc-code" DT property. Other platforms such as ZZ/FSP-based platforms will prefer the fully-qualified slot-location-code for it. This patch creates a new operation on the platform struct, allowing for an optional callback to create the "ibm,loc-code" property in a platform-specific way. If the callback is not defined, use the cleaned-up default that was in use so far. Signed-off-by: Klaus Heinrich Kiwi <klaus@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
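The optional-callback pattern can be sketched as follows. The struct names, fields and function names are invented for illustration and do not match the real platform struct; the point is only the "use the platform hook if present, else fall back to the default" dispatch.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical view of the data a loc-code could be built from. */
struct pci_dev_info {
	const char *slot_label; /* short label, e.g. from a slot table */
	const char *full_loc;   /* fully-qualified slot location code */
};

/* Hypothetical platform operations with one optional hook. */
struct platform_ops {
	void (*add_loc_code)(const struct pci_dev_info *d,
			     char *out, size_t n);
};

/* Default behaviour: use the short slot label. */
static void default_loc_code(const struct pci_dev_info *d,
			     char *out, size_t n)
{
	snprintf(out, n, "%s", d->slot_label);
}

/* Example platform override: prefer the full location code. */
void full_loc_code(const struct pci_dev_info *d, char *out, size_t n)
{
	snprintf(out, n, "%s", d->full_loc);
}

/* Dispatch: call the platform hook when defined, else the default. */
void make_loc_code(const struct platform_ops *plat,
		   const struct pci_dev_info *d, char *out, size_t n)
{
	if (plat && plat->add_loc_code)
		plat->add_loc_code(d, out, n);
	else
		default_loc_code(d, out, n);
}
```

Platforms that are happy with the default simply leave the hook NULL, so only those needing different behaviour carry extra code.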
2020-02-24 | capp: Add lid definition for P9 DD2.3 | Frederic Barrat | 1 | -0/+2
Add the definition of the CAPP microcode for DD2.3 to the lid map. Cc: skiboot-stable@lists.ozlabs.org # v6.5+ Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Tested-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-02-12 | opal-gard: Add support for new PVR POWER9P. | Mahesh Salgaonkar | 1 | -0/+1
Enable a new PVR for the gard tool to work on another P9 variant. This also makes op-test happy. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> [oliver: s/Axone/axone/] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-02-12 | npu2-opencapi: Allow platforms to identify physical slots | Frederic Barrat | 5 | -3/+74
This patch lets each platform define the name of the opencapi slots. It makes it easier to identify which physical card is generating errors or messages in the linux or skiboot log files. The patch provides slot names for mihawk and witherspoon. If the platform doesn't define any, then we default to 'OPENCAPI-xxxx'. There are various ways to find out about the slot names:
- skiboot log
- lspci command (if the PCI hotplug driver pnv-php is loaded)
- lshw
- checking the device tree
- and probably others....
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-02-12 | npu2-opencapi: Don't drive reset signal permanently | Frederic Barrat | 1 | -6/+40
A problem was found with the way we manage the I2C signal to reset adapters. Skiboot currently always drives the value of the opencapi reset signal. We set the I2C pin for reset in output mode and keep it in output mode permanently. And since the reset signal is inverted, it is explicitly set to high by the I2C controller pretty much all the time. When the opencapi card is powered off, for example on a reboot, actively driving the I2C reset pin to high keeps applying a voltage to part of the FPGA, which can leak current, put the FPGA into a bad state since it's unexpected, or even damage the card. To prevent damaging adapters, the recommendation from the hardware team is to switch the pin back to input mode at the end of a reset cycle. There are pull-up resistors on the planar of all the platforms to make sure the reset signal is high "naturally". When the slot is powered off, the reset pin won't be kept high by the I2C controller any more. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-02-12 | ibm-fsp/lxvpd: check for upstream port on slot labeling | Klaus Heinrich Kiwi | 2 | -4/+9
Certain FSP configurations include PCIe switches that can have LXVPD slot map entries using the same switch-id and dev-id, even if they are referring to different upstream and downstream ports of the same link. The slot matching function (lxvpd_get_slot()) will match the first occurrence, which can be the upstream port, and ignore the downstream port. The main symptom for the above is an incorrect label for those slots, but I believe other slot attributes could be incorrect as well (as we are associating a slot with an upstream port). This patch picks up an existing "upstream port" attribute from the 1005 version of the LXVPD slot map to prevent matching upstream ports in the slot matching function. Signed-off-by: Klaus Heinrich Kiwi <klaus@linux.vnet.ibm.com> [oliver: 80cols compliance] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-02-12 | platform: Log error to BMC even if diag data is missing | Vasant Hegde | 1 | -2/+2
Also fix "DESC" to ASCII conversion. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-02-12 | mpipl: Rework memory reservation for OPAL dump | Vasant Hegde | 3 | -23/+40
During boot, OPAL reserves the memory required to capture the OPAL dump and architected register data. During MPIPL, hostboot will copy the OPAL dump to this memory. Post MPIPL, the kernel will use this memory to create opalcore. We use mem_reserve_fw() for this reservation. At present this reservation happens late in the init path, so it may clash with memory allocated by local_alloc(). We have two options to fix the above issue:
- Use local_alloc() for allocating memory for OPAL dump. This works fine on first boot. We can use this method to reserve memory. But post MPIPL we still want to reserve the destination memory to make sure no one is stomping on this area. Also this reservation might have happened in between other local allocations. So in a post-MPIPL boot the allocator may not find enough memory in the first region for other local_alloc() requests and may throw a mem_alloc() error before trying to allocate from other regions.
- Early memory reservation for OPAL dump. Allocate and reserve memory just after memory region init.
This patch uses the second approach to fix the reservation issue. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-02-12 | FSP: Remove flash hook after completing code update | Vasant Hegde | 1 | -0/+6
In some corner cases, the FSP may not respond to the Deep IPL request after a code update, or it may delay processing the MBOX command. In such cases we may enter the code update path again, which is not required. Hence clear the flash hook after completing the code update. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-02-12 | mpipl: Disable fast-reboot during post MPIPL boot | Vasant Hegde | 1 | -0/+2
Otherwise the device tree will continue to have `mpipl-boot` and the kernel may think it's an MPIPL boot. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Reviewed-by: Stewart Smith <stewart@flamingspork.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-02-12 | mpipl: Release cpu data memory in free reserved memory path | Vasant Hegde | 1 | -0/+2
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-02-12 | hdata: Fix SP attention area address | Vasant Hegde | 1 | -1/+2
The SP attention area is aligned, but we were sending the wrong address, so `attn` fails on FSP-based systems. Align the SP attention area so that the FSP can locate the attention data. Fixes: 518e554 (spira: fix endian conversions in spira data structures) CC: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-01-29 | stable/build: Do not convert warnings to error | Vasant Hegde | 1 | -2/+11
During the skiboot build, by default we convert all warnings to errors. Because of this, skiboot stable branches sometimes fail to build with modern compilers, and we end up backporting build failure fixes to stable branches. Hence let's disable `-Werror` on skiboot stable branches (tagged versions). Suggested-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Reviewed-by: Dan Horák <dan@danny.cz> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-01-29 | platforms/nicole: Fixup the system VPD EEPROM size | Artem Senichev | 1 | -0/+23
Hostboot doesn't export the correct description for EEPROMs, as a result, all EEPROMs in the system work in "atmel,24c128" compatibility mode (16KiB). Nicole platform has 32KiB EEPROM for the system VPD. Signed-off-by: Artem Senichev <a.senichev@yadro.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-01-29 | skiboot.tcl: Add chip-id to pmem device tree node | Alistair Popple | 1 | -0/+1
Skiboot expects a chip-id to be associated with the pmem device tree node. Without one the following assertion is hit during boot: 1305405: (1305405): [ 0.001300406,3] *********************************************** 1316412: (1316412): [ 0.001306446,3] < assert failed at core/device.c:968 > 1319544: (1319544): [ 0.001316645,3] . 1322726: (1322726): [ 0.001319777,3] . 1325958: (1325958): [ 0.001322959,3] . 1329490: (1329490): [ 0.001326191,3] OO__) 1332972: (1332972): [ 0.001329723,3] <"__/ 1336404: (1336404): [ 0.001333205,3] ^ ^ 1612307: (1612307): [ 0.001581877,3] Fatal TRAP at 00000000300297f4 .dt_get_chip_id+0x28 MSR 9000000000021002 1618779: (1618779): [ 0.001612557,3] CFAR : 00000000300ce7b4 MSR : 9000000030001000 1625236: (1625236): [ 0.001619014,3] SRR0 : 00000000300297f4 SRR1 : 9000000000021002 1631693: (1631693): [ 0.001625471,3] HSRR0: 0000000030012624 HSRR1: 9000000030001000 1637745: (1637745): [ 0.001631928,3] DSISR: 00000000 DAR : 0000000000000000 1644023: (1644023): [ 0.001637980,3] LR : 00000000300297dc CTR : 0000000000000000 1649684: (1649684): [ 0.001644258,3] CR : 20000202 XER : 00040000 1656705: (1656705): [ 0.001649921,3] GPR00: 00000000300297dc GPR16: 0000000000000000 1663740: (1663740): [ 0.001656946,3] GPR01: 0000000031c13940 GPR17: 0000000000000000 1670775: (1670775): [ 0.001663981,3] GPR02: 0000000030127400 GPR18: 0000000000000000 1677810: (1677810): [ 0.001671016,3] GPR03: 00000000ffffffff GPR19: 0000000000000000 1684845: (1684845): [ 0.001678051,3] GPR04: 00000000300d0376 GPR20: 0000000000000000 1691796: (1691796): [ 0.001685086,3] GPR05: 000000000000000c GPR21: 0000000000000000 1698831: (1698831): [ 0.001692037,3] GPR06: 0000000031c10060 GPR22: 0000000000000000 1705960: (1705960): [ 0.001699072,3] GPR07: 0000000030500010 GPR23: 00000000300eca0c 1713089: (1713089): [ 0.001706201,3] GPR08: 0000000030502b48 GPR24: 00000000300cf2c8 1720124: (1720124): [ 0.001713330,3] GPR09: 0000000000000000 GPR25: 00000000300ce8d7 1727182: (1727182): [ 
0.001720365,3] GPR10: 0000000000000063 GPR26: 00000000300ce8fb 1734228: (1734228): [ 0.001727423,3] GPR11: 0000000000000003 GPR27: 00000000300d0398 1741358: (1741358): [ 0.001734469,3] GPR12: 0000000040000402 GPR28: 00000000300ce91d 1748488: (1748488): [ 0.001741599,3] GPR13: 0000000031c10000 GPR29: 00000000300d048f 1755618: (1755618): [ 0.001748729,3] GPR14: 00000000300026fc GPR30: 0000000030502cb8 1762748: (1762748): [ 0.001755859,3] GPR15: 0000000030000000 GPR31: 0000000030502cb8 2414283: (2414283): CPU 0000 Backtrace: 2414283: (2414283): S: 0000000031c13c40 R: 00000000300297dc .dt_get_chip_id+0x10 2414283: (2414283): S: 0000000031c13cb0 R: 000000003002ab68 .add_chip_dev_associativity+0x14 2414283: (2414283): S: 0000000031c13d50 R: 0000000030017c00 .mem_region_init+0x144 2414283: (2414283): S: 0000000031c13e30 R: 00000000300153d0 .main_cpu_entry+0x4d4 2414283: (2414283): S: 0000000031c13f00 R: 000000003000275c boot_entry+0x1bc Signed-off-by: Alistair Popple <alistair@popple.id.au> Tested-by: Praveen K Pandey <praveen@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-01-29 | npu2-opencapi: don't fence on masked XSL errors | Frederic Barrat | 1 | -2/+9
An upcoming change in the initfile is going to modify the default action and fence behavior of some of the NPU FIR2 bits. We're already overriding the settings of most of those. The one exception is for bits 41 and 42, which are XSL errors impacting 2 links that we mask (instead we rely on the subsequent OTL error, which is per link). The new initfile will fence-on-error for bits 41 and 42. And even if the FIRs are masked, the NPU logic could fence the links, which is not what we want. So this patch makes sure we don't fence on the FIRs we want to ignore. It has no effect on existing firmware. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>