path: root/hw
Age | Commit message | Author | Files | Lines
2021-09-29 | phb4: Disable TCE cache line buffer | Frederic Barrat | 1 | -0/+1
This patch implements a circumvention for HW557787. It disables the TCE cache line buffer as, under heavy loads, there's a possibility of an entry being re-allocated incorrectly. [ Upstream commit 15b93a301509ba7813343540e25b47ba395674b9 ] Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-01-06 | SBE: Account cancelled timer request | Vasant Hegde | 1 | -0/+3
[ Upstream commit b44c7594523d20945179e497c45ec9007981ac75 ]

Currently we do not account for cancelled timer requests. So in some corner cases we may schedule a new timer request with new-timer-value > inflight-timer-value. Let's explicitly check the new_target value against the inflight timer value.

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-01-06 | SBE: Rate limit timer requests | Vasant Hegde | 1 | -0/+22
[ Upstream commit 2e654443050acdd4deffdbb44723a847ca11e6b2 ]

We schedule a timer and wait for the `timer expiry` interrupt from the SBE. If we get a new timer request that is earlier than the inflight timer expiry value, we can update the timer (essentially sending a new timer chip-op; the SBE takes care of stopping the inflight timer and scheduling the new one).

The SBE runs at a much slower speed than the host CPU. If we do continuous timer updates like below, the SBE will be busy handling PSU-side timer messages and will not get time to handle FIFO-side requests.

  send timer chip-op -> Got ACK -> send timer chip-op

Hence this patch limits the number of continuous timer updates, and we restart sending timer requests as soon as we get the timer expiry interrupt. The rate limit value (2) is suggested by the SBE team.

With this patch: if our timer requests are 2ms, 1500us, 1000us and 800us (and requests arrive after sending each message), we will schedule the timer for 2ms and then update the timer for 1500us and 1000us (these updates happen after getting the ACK interrupt from the SBE). We will not send the 800us request. At 1000us we get `timer expiry` and we are good to send the next timer requests. (At this stage both the 1000us and 800us timeouts have happened. We will schedule the next timer request with timeout value 500us (1500-1000).)

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
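The rate limiting described in this commit can be modelled roughly as follows. This is a minimal sketch and not the skiboot code: `struct timer_state`, `timer_request()`, `timer_expired()` and `TIMER_UPDATE_LIMIT` are hypothetical names, and only the limit value (2) comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name; the value 2 is the limit quoted in the commit message. */
#define TIMER_UPDATE_LIMIT 2

struct timer_state {
	bool inflight;		/* a timer chip-op was sent and has not expired yet */
	uint64_t inflight_us;	/* expiry value of the in-flight timer */
	unsigned int updates;	/* consecutive updates since the last fresh schedule */
};

/* Returns true if a timer chip-op would be sent for this request. */
static bool timer_request(struct timer_state *t, uint64_t target_us)
{
	if (!t->inflight) {
		/* Nothing in flight: schedule a fresh timer. */
		t->inflight = true;
		t->inflight_us = target_us;
		t->updates = 0;
		return true;
	}
	if (target_us >= t->inflight_us)
		return false;	/* the in-flight timer fires first anyway */
	if (t->updates >= TIMER_UPDATE_LIMIT)
		return false;	/* rate limited until the expiry interrupt */
	t->inflight_us = target_us;
	t->updates++;
	return true;
}

/* Models the `timer expiry` interrupt: requests may flow again. */
static void timer_expired(struct timer_state *t)
{
	t->inflight = false;
	t->updates = 0;
}
```

Feeding this model the sequence from the commit message (2000us, 1500us, 1000us, 800us) sends the first three requests and holds back the fourth until `timer_expired()` is called.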
2021-01-06 | SBE: Check timer state before scheduling timer | Vasant Hegde | 1 | -2/+4
[ Upstream commit 47ab3a92298e72e44b9477a02b1312a09272a54a ]

Timer flow:
- OPAL sends timer chip-op to SBE and waits for ACK
- Until we get the ACK interrupt from SBE we will not schedule any new timer
- Once we get the ACK, either we wait for timer expiry -OR- schedule a new one if new-timer-request < inflight-timer-timeout value.
- If we get a new timer request while processing the current one, p9_sbe_update_timer_expiry code sets `has_new_target` and we schedule it in the ACK path (p9_sbe_timer_resp()).

p9_sbe_timer_resp() is a callback handler and it's called without a lock. It does not check whether the timer message is busy or not (timer_ctrl_msg). So in theory we may hit the scenario below and corrupt msg_list.

CPU 1 -> Timer ACK (callback handler) -- it's not holding any lock
CPU 2 -> Grabbed sbe_timer_lock -> scheduled timer --> done
CPU 3 -> p9_sbe_update_timer_expiry() -> sees timer is busy -> sets has_new_timer -> done
CPU 1 -> gets chance to grab sbe_timer_lock -> saw has_new_timer -> Called p9_sbe_timer_schedule() --> List corrupted !

This patch adds a timer message busy check in p9_sbe_timer_resp().

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
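The race in the CPU 1/2/3 scenario can be replayed in a single-threaded model. This is only a sketch under assumptions: every `sketch_*` name is hypothetical, the lock is omitted because the steps run one after another, and the `assert()` in `sketch_schedule()` stands in for the msg_list corruption that queuing a busy message would cause.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_timer {
	bool msg_busy;		/* timer message is queued to the SBE */
	bool has_new_target;	/* a newer request arrived while busy */
	unsigned int queued;	/* number of times the message was queued */
};

static void sketch_schedule(struct sketch_timer *t)
{
	/* Queuing a message that is already queued is the corruption case. */
	assert(!t->msg_busy);
	t->msg_busy = true;
	t->has_new_target = false;
	t->queued++;
}

/* Request path: remember the request if the message is busy. */
static void sketch_update_expiry(struct sketch_timer *t)
{
	if (t->msg_busy) {
		t->has_new_target = true;
		return;
	}
	sketch_schedule(t);
}

/* The SBE has answered: the message is free again. */
static void sketch_msg_done(struct sketch_timer *t)
{
	t->msg_busy = false;
}

/* ACK callback, including the busy check this commit describes. */
static void sketch_timer_resp(struct sketch_timer *t)
{
	if (t->msg_busy)
		return;		/* another CPU already re-queued the message */
	if (t->has_new_target)
		sketch_schedule(t);
}
```

Without the `msg_busy` test in `sketch_timer_resp()`, replaying the scenario would hit the assertion in `sketch_schedule()`.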
2021-01-06 | xscom: Fix xscom error logging caused due to xscom OPAL call | Gautham R. Shenoy | 1 | -2/+19
[ Upstream commit a4101173cacf79fcd91d395ab12aac9cb6840975 ] Commit 80fd2e963bd4 ("xscom: Don't log xscom errors caused by OPAL calls") ensured that xscom errors caused due to XSCOM read/write OPAL calls aren't logged in the error-log since the caller of the OPAL call is expected to handle it. However we are continuing to print the prerror() in the OPAL log regarding the same. This patch reduces the severity of the log from PR_ERROR to PR_INFO for the xscom read and write made via OPAL calls. Tested-by: Pavaman Subramaniyam <pavsubra@in.ibm.com> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Print info only for xscom read/writes made via opal calls Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-01-06 | xive/p9: Remove assert from xive_eq_for_target() | Cédric Le Goater | 1 | -1/+1
[ Upstream commit f07ea9564425d8005ab334dfa40f7cebe4e71fbf ] XIVE VPs are structures describing the vCPUs of guests. When starting a guest, these are allocated and enabled and some checks are done on the location of the associated ENDs, which describe the event queues. If the block of the VP and the block of the ENDs do not match, the XIVE driver asserts. Unfortunately, there is no way to check that a VP identifier is part of a VP block that was previously allocated and it is relatively easy to crash the host with a bogus VP id. That can be done with a QEMU hack on a machine using vsmt. Simply remove the assert, the OS should gracefully handle the error. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reported-by: Greg Kurz <groug@kaod.org> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2020-10-22 | FSP/NVRAM: Do not assert in vNVRAM statistics call | Vasant Hegde | 1 | -2/+1
[ Upstream commit 9ca8bf1bde56330075634bd3cb601d0f6ee90514 ] `msg` is valid pointer here. I don't recall why I added assert here :-( This is not correct. We shouldn't call assert here. Also we are not using `msg`. Hence convert it to `__unused`. Fixes: 19d4f98e ('FSP/NVRAM: Handle "get vNVRAM statistics" command') Cc: skiboot-stable@lists.ozlabs.org # v5.4.x + Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2020-09-09 | fsp/dump: Handle non-MPIPL scenario | Vasant Hegde | 1 | -4/+4
[ Upstream commit 0ad0ab3e24a322b79bec8451bc21e9bdd40a6657 ] If MPIPL is not enabled then we will not create `/ibm,opal/dump` node and we should continue to parse/retrieve SYSDUMP. I missed this scenario when I fixed similar issue last time :-( Fixes: 92b7968 (fsp: Skip sysdump retrieval only in MPIPL boot) Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2020-09-09 | hw/phb4: Verify AER support before initialising AER regs | Oliver O'Halloran | 1 | -0/+3
[ Upstream commit 0a5f2812a7e9007f2a89502a3f07bac34bfacbdb ]

Check the AER capability offset pointer is non-zero before enabling the AER messages. If the device doesn't support AER we end up writing garbage to config offset 0x0 + PCIECAP_AER_CAPCTL, or 0x18. For a normal device this is one of the BARs so this doesn't do much, but for a bridge this results in overriding:

0x18 - The primary bus number
0x19 - The secondary bus number
0x1A - The subordinate bus number
0x1B - The latency timer

0x1B is hardwired to zero for PCIe devices, but overwriting the bus number register can cause issues with routing of config space accesses. It's worth pointing out that we write actual values for the secondary and subordinate bus numbers before scanning the secondary bus, but the primary bus number is never restored.

Cc: skiboot-stable@lists.ozlabs.org
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
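The offset arithmetic behind this fix can be sketched as below. `sketch_aer_ctl_offset()` is a hypothetical helper, not a skiboot function; the only value taken from the commit message is PCIECAP_AER_CAPCTL being 0x18.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offset quoted in the commit message. */
#define PCIECAP_AER_CAPCTL 0x18

/*
 * Hypothetical helper: compute the config-space offset of the AER
 * control register, refusing when the device has no AER capability
 * (capability pointer of zero).
 */
static bool sketch_aer_ctl_offset(uint32_t aercap, uint32_t *offset)
{
	if (aercap == 0)
		return false;	/* no AER: 0x0 + 0x18 would hit the bus numbers */
	*offset = aercap + PCIECAP_AER_CAPCTL;
	return true;
}
```

With a zero capability pointer the unchecked sum is 0x18, which on a bridge is the primary bus number register, hence the early return.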
2020-09-09 | hw/phb4: Actually enable error reporting | Oliver O'Halloran | 1 | -0/+1
[ Upstream commit 9b594262eeec7699836ff50c8762241d1f2570a3 ] PHB3 had an errata about correctable errors and when Ben was doing the initial PHB4 port he deleted the corresponding config write to DEVCTL. Whoops. Cc: skiboot-stable@lists.ozlabs.org Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2020-07-03 | fsp: Skip sysdump retrieval only in MPIPL boot | Vasant Hegde | 1 | -3/+11
[ Upstream commit 92b79689cae560ff0cb3620a0221147bb947138c ]

It seems we should continue to retrieve SYSDUMP except in MPIPL boot.

Fixes: d6eb510 (fsp: Ignore platform dump notification on P9)
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2020-06-06 | occ: Fix false negatives in wait_for_all_occ_init() | Gautham R. Shenoy | 1 | -30/+147
[ Upstream commit ec3c45f3889cd5f7615db5615dd6824abe32f759 ]

Currently the wait_for_all_occ_init() function determines that the OCC associated with every chip has been initialized by verifying that the "Valid" bit in the pstate table of that OCC is set.

However, on chips where all the EX units are guarded, the OCC, even though it is active, does not update the pstate_table. As a result, OPAL concludes that the OCC is not functional and not only disables Pstate initialization, but also incorrectly reports that the OCCs were not initialized, thereby cutting other features such as sensors.

Fix this by ensuring that
* We check if there is at least one active EX unit in the chip before checking if the OCC is active.
* On platforms with OCC-OPAL communication interface version 0x90
  * wait_for_all_occ_init() only checks if the occ_state in the OCC dynamic area is set to "Active State".
  * move the "Valid" bit check to add_cpu_pstate_properties(), which is where we create the device-tree entries for frequency scaling.

Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Tested-by: Pavaman Subramaniyam <pavsubra@in.ibm.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2020-06-06 | uart: Drop console write data if BMC becomes unresponsive | Vasant Hegde | 1 | -26/+74
[ Upstream commit 6bf21350da32776aac8ba75bf48933854647bd7e ]

If the BMC becomes unresponsive (e.g. during BMC reboot) during a console write then we may get stuck in uart_wait_tx_room(). This will result in the CPU getting stuck in OPAL, which results in kernel lockups and in some cases the host becomes unresponsive.

This patch introduces a timeout option. If a UART operation doesn't complete within a predefined time then it will drop the write data and return.

Note that this patch fixes both the OPAL internal console as well as the console write APIs.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[Various fixes on top of Nick's proposal to have single timer - Vasant]
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
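The idea of a bounded wait that drops data instead of spinning forever can be illustrated as follows. This is a sketch under stated assumptions: the real patch uses a time-based timeout, whereas this model uses a poll budget, and `struct fake_uart` and the `sketch_*` functions are hypothetical stand-ins for the hardware and the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the UART; not the skiboot data structure. */
struct fake_uart {
	bool responsive;	/* false models a BMC that stopped draining the FIFO */
	char out[32];
	size_t len;
};

static bool sketch_tx_room(const struct fake_uart *u)
{
	return u->responsive && u->len < sizeof(u->out);
}

/* Bounded wait: gives up after max_polls instead of spinning forever. */
static bool sketch_wait_tx_room(const struct fake_uart *u, unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++)
		if (sketch_tx_room(u))
			return true;
	return false;
}

/* Returns the number of bytes written; the remainder is dropped on timeout. */
static size_t sketch_uart_write(struct fake_uart *u, const char *buf,
				size_t len, unsigned int max_polls)
{
	size_t done = 0;

	while (done < len) {
		if (!sketch_wait_tx_room(u, max_polls))
			break;
		u->out[u->len++] = buf[done++];
	}
	return done;
}
```

The caller gets control back with a short write count when the peer is unresponsive, which is the property that avoids the CPU getting stuck.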
2020-06-06 | hw/phys-map: Fix OCAPI_MEM BAR values | Andrew Donnellan | 1 | -3/+3
[ Upstream commit 75198f668911830bb5df27da59786199eac2e47c ] The comment next to the OCAPI_MEM entries in the Nimbus phys-map claims that we are "varying the upper 2 bits of the group ID" for each OpenCAPI link, as matches the chip address extension mask that will be set by future versions of Hostboot. The actual entries, on the other hand, vary the *lower* 2 bits of the group ID. Whoops. This didn't appear to cause us problems on the specific machines that we had access to at the time, but now that this is being tested a bit harder it's crashing machines... Fixes: bc72973d13215 ("hw/npu2-opencapi: Support multiple LPC devices") Cc: Frederic Barrat <fbarrat@linux.ibm.com> Reported-by: Wael El-Essawy <welessa@us.ibm.com> Reported-by: Milton Miller <miltonm@us.ibm.com> Reported-by: Jenny Huynh <jhuynh@us.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2020-06-06 | platform/mihawk: Tune equalization settings for opencapi | Frederic Barrat | 1 | -4/+19
[ Upstream commit afe6bc9051907d25082309895f8cfe44f59e2f25 ] The Bittware 250SOC adapter on Mihawk was showing a high count of CRC errors on one of the opencapi slots. The PHY team suggested new equalization settings to correct the errors. All existing adapters have been tested on mihawk to make sure the settings are compatible. However, the new settings should not be used on platforms other than mihawk. The changes specific to mihawk are: - Update the tx_ffe_pre_coeff and tx_ffe_post_coeff input parameters used during zcal - turn off the tx_ffe_boost parameter through scom Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Cc: skiboot-stable@lists.ozlabs.org # skiboot-op940.x Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2020-06-06 | PSI: Convert prerror to PR_NOTICE | Vasant Hegde | 1 | -1/+1
[ Upstream commit 071f00d661feaca05d9f610a21bd7c4d643e6b29 ] "Spurious interrupt" is not severe. Reduce message severity and keep msglog happy! Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2020-06-06 | sensors: occ: Fix a bug when sensor values are zero | Gautham R. Shenoy | 1 | -1/+2
[ Upstream commit 1beb1519f4c39c3d4c418aafa219236568c38c8d ] The commit 1b9a449d ("opal-api: add endian conversions to most opal calls") modified the code in opal_read_sensor() to make it Little-Endian safe. In the process, it changed the code so that if a sensor value was zero, it would simply return OPAL_SUCCESS without updating the return buffer. As a result, the return buffer contained bogus values which were reflected on those sensors being read by the Kernel. This patch fixes it by ensuring that the return buffer is updated with the value read from the sensor every time. Thanks to Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> for spotting the missing return-buffer update. cc: skiboot-stable@lists.ozlabs.org Fixes: commit 1b9a449d ("opal-api: add endian conversions to most opal calls") Reported-by: Pavaman Subramaniyam <pavsubra@in.ibm.com> Tested-by: Pavaman Subramaniyam <pavsubra@in.ibm.com> Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
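The shape of the bug can be shown with two tiny functions. This is a minimal sketch, not the opal_read_sensor() code: the names are hypothetical and the endian conversion that the original change introduced is left out.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SUCCESS 0	/* hypothetical stand-in for OPAL_SUCCESS */

/* Buggy shape: a zero reading returns success without touching *out. */
static int sketch_read_sensor_buggy(uint64_t raw, uint64_t *out)
{
	if (!raw)
		return SKETCH_SUCCESS;
	*out = raw;
	return SKETCH_SUCCESS;
}

/* Fixed shape: the return buffer is updated on every successful read. */
static int sketch_read_sensor_fixed(uint64_t raw, uint64_t *out)
{
	*out = raw;
	return SKETCH_SUCCESS;
}
```

With the buggy variant, a caller that passes an uninitialised buffer sees whatever was there before, even though the call reports success.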
2020-06-06 | sensors: occ: Fix the GPU detection code | Gautham R. Shenoy | 1 | -2/+20
[ Upstream commit f3ac046b386fea80286c72c3217acb407230a8c6 ] commit bebe096ee242 ("sensors: occ: Skip GPU sensors for non-gpu systems") assumes that presence of "ibm,power9-npu" compatible node indicates the presence of GPUs. However this is incorrect, as even OpenCAPI is supported via NPU. Thus ZZ systems, which have OpenCAPI connectors but not GPUs will have "ibm,power9-npu" compatible nodes. This results in OPAL creating device-tree entries for the GPU sensors on ZZ systems which don't even have GPUs. This patch fixes the GPU detection code in occ-sensors, by first checking for "ibm,ioda2-npu2-phb" compatible node which indicates the presence of nvlink. Only if such a node exists, do we check with the OCC for presence of GPUs on systems to confirm the presence of the GPU. Otherwise, we cut the GPU sensors. Thanks to Frederic Barrat <fbarrat@linux.ibm.com> for suggesting "ibm,ioda2-npu2-phb" for detecting the presence of nvlink GPUs. cc: skiboot-stable@lists.ozlabs.org Fixes: commit bebe096ee242 ("sensors: occ: Skip GPU sensors for non-gpu systems") Reported-by: Pavaman Subramaniyam <pavsubra@in.ibm.com> Tested-by: Pavaman Subramaniyam <pavsubra@in.ibm.com> Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2020-04-15 | MPIPL: Add support to save crash CPU details on FSP system | Vasant Hegde | 1 | -3/+0
OPAL uses different paths to trigger MPIPL:
- On BMC systems we call SBE S0 interrupt
- On FSP systems we call the `attn` instruction

Currently on BMC systems we collect crash CPU PIR details, which are needed to generate a proper dump. This happens just before calling SBE S0 interrupt. Since we don't use this path on FSP systems, OPAL is not saving the crashing CPU details. Hence by default `opalcore` does not point to the crashing CPU and does not show a proper backtrace; we have to go through all CPUs to find the crashing CPU's backtrace.

This patch moves this function to a common place so that if MPIPL is supported we collect crashing CPU data.

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-04-15 | fsp: Ignore platform dump notification on P9 | Vasant Hegde | 1 | -0/+3
After a system crash the FSP collects a dump and passes dump details via HDAT. OPAL/Linux uses this detail to extract SYSDUMP. On P9 FSP systems we have MPIPL support. The FSP folks say we have to ignore the platform dump notification passed by HDAT and use the inband MPIPL mechanism to extract the dump.

CC: Murulidhar Nataraju <murulidhar@in.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-04-08 | hw/ocmb: Add OCMB SCOM support | Oliver O'Halloran | 2 | -0/+168
Add a driver for the SCOM ranges of the OCMB. Unlike most chips the OCMB has two different (three if you count OpenCAPI config space) register spaces and we need to ensure that the right access size is used on each. Additionally the SCOM interface is a bit non-standard in that a full physical address is passed as the SCOM address rather than a register number so we don't need to perform any address transformations, we just need to verify that the address falls into one of the nominated address ranges. Cc: Klaus Heinrich Kiwi <klaus@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
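The "verify the address falls into a nominated range, then use that range's access size" check can be sketched like this. The range boundaries and access sizes below are invented for illustration only and the `sketch_*` names are hypothetical; the commit message does not give the real values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_range {
	uint64_t start;
	uint64_t end;			/* inclusive */
	unsigned int access_size;	/* bytes per access in this register space */
};

/* Hypothetical ranges: the real OCMB address ranges are not in the commit message. */
static const struct sketch_range sketch_ranges[] = {
	{ 0x08000000, 0x08ffffff, 8 },
	{ 0x09000000, 0x09ffffff, 4 },
};

static const struct sketch_range *sketch_find_range(uint64_t addr)
{
	for (size_t i = 0; i < sizeof(sketch_ranges) / sizeof(sketch_ranges[0]); i++)
		if (addr >= sketch_ranges[i].start && addr <= sketch_ranges[i].end)
			return &sketch_ranges[i];
	return NULL;
}

/* Returns the access size to use for addr, or -1 if addr is not in any range. */
static int sketch_access_size(uint64_t addr)
{
	const struct sketch_range *r = sketch_find_range(addr);

	return r ? (int)r->access_size : -1;
}
```

Because the caller already supplies a full physical address, no address transformation appears in the sketch, only the range lookup.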
2020-04-08 | hw/centaur: Convert to use the new scom API | Oliver O'Halloran | 2 | -8/+14
Currently we assume any xscom_read / write targeted at a chipid with 0x8 as the top four bits is intended to be a centaur SCOM. On non-P8 platforms there is no reason to assume this, so convert it to use the new struct scom_controller infrastructure.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-04-08 | hw/xscom: Add scom infrastructure | Oliver O'Halloran | 1 | -0/+75
Currently the top nibble of the "part ID" is used to determine the type of a xscom_read() / xscom_write() call. This was mainly done for the benefit of PRD on P8, which would do "targeted" SCOMs to EX (core) chiplets and rely on skiboot to find the actual scom address. Similarly, PRD also relied on this to access the SCOMs of centaur chips, which are accessed via FSI on P8.

On P9, PRD moved to only doing non-targeted scoms where it would only ever supply a "part ID" which was the fabric ID of the chip to be SCOMed. The centaur support was also unnecessary since OPAL didn't support any P9 systems with Centaurs.

However, on future systems we will have to support memory buffer chips again, so we need to expand the SCOM support to accommodate them. To do this, allow skiboot components to register a SCOM read() and write() function for a chip ID. This will allow us to ensure the P8 EX chiplet and Centaur SCOM code is only ever used on P8, freeing up the Part ID address space for other uses.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
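Registering per-chip read and write callbacks, and dispatching on the part ID, might look roughly like the sketch below. All names and fields here are hypothetical; the commit message does not show the actual registration interface, and the fallback behaviour for unregistered IDs is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*sketch_scom_read_fn)(void *priv, uint64_t addr, uint64_t *val);
typedef int (*sketch_scom_write_fn)(void *priv, uint64_t addr, uint64_t val);

/* Hypothetical controller record, one per registered part ID. */
struct sketch_scom_ctl {
	uint32_t part_id;
	void *priv;
	sketch_scom_read_fn read;
	sketch_scom_write_fn write;
	struct sketch_scom_ctl *next;
};

static struct sketch_scom_ctl *sketch_ctls;

static void sketch_scom_register(struct sketch_scom_ctl *c)
{
	c->next = sketch_ctls;
	sketch_ctls = c;
}

/*
 * Dispatch a read to the controller registered for part_id.
 * Returns -1 when nothing is registered; what the real code does
 * in that case (e.g. a plain XSCOM access) is not modelled here.
 */
static int sketch_scom_read(uint32_t part_id, uint64_t addr, uint64_t *val)
{
	for (struct sketch_scom_ctl *c = sketch_ctls; c; c = c->next)
		if (c->part_id == part_id)
			return c->read(c->priv, addr, val);
	return -1;
}

/* Fake device used for illustration: reads back addr + 1. */
static int sketch_fake_read(void *priv, uint64_t addr, uint64_t *val)
{
	(void)priv;
	*val = addr + 1;
	return 0;
}
```

The point of the design is that the lookup is by registered ID rather than by decoding the top nibble, so P8-only handlers simply are not registered elsewhere.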
2020-03-30 | hw/phb4: Tune GPU direct performance on witherspoon in PCI mode | Frederic Barrat | 1 | -24/+29
Good GPU direct performance on witherspoon, with a Mellanox adapter on the shared slot, requires reallocating some dma engines within PEC2, "stealing" some from PHB4&5 and giving extras to PHB3. It's currently done when using CAPI mode. But the same is true if the adapter stays in PCI mode.

In preparation for upcoming versions of MOFED, which may not use CAPI mode, this patch reallocates dma engines even in PCI mode for a series of Mellanox adapters that can be used with GPU direct, on witherspoon and on the shared slot only.

The loss of dma engines for PHB4&5 on witherspoon has not shown problems in testing, as well as in current deployments where CAPI mode is used.

Here is a comparison of the bandwidth numbers seen with the PHB in PCI mode (no CAPI) with and without this patch. Variations on smaller packet sizes can be attributed to jitter and are not that meaningful.

# OSU MPI-CUDA Bi-Directional Bandwidth Test v5.6.1
# Send Buffer on DEVICE (D) and Receive Buffer on DEVICE (D)
# Size      Bandwidth (MB/s)    Bandwidth (MB/s)
#           with patch          without patch
1                 1.29                1.48
2                 2.66                3.04
4                 5.34                5.93
8                10.68               11.86
16               21.39               23.71
32               42.78               49.15
64               85.43               97.67
128             170.82              196.64
256             385.47              383.02
512             774.68              755.54
1024           1535.14             1495.30
2048           2599.31             2561.60
4096           5192.31             5092.47
8192           9930.30             9566.90
16384         18189.81            16803.42
32768         24671.48            21383.57
65536         28977.71            24104.50
131072        31110.55            25858.95
262144        32180.64            26470.61
524288        32842.23            26961.93
1048576       33184.87            27217.38
2097152       33342.67            27338.08

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Cc: skiboot-stable@lists.ozlabs.org # skiboot-op940.x
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-30 | hw/imc: Add error message on failing cases for imc_init | Madhavan Srinivasan | 1 | -3/+11
Add a couple more debug messages to help understand possible failures in imc_init(). Currently the only message printed is "IMC Devices not added", which is not very helpful when debugging.

Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-30 | Revert "FSP: Disable PSI link whenever FSP tells OPAL about impending R/R" | Vasant Hegde | 2 | -9/+18
This reverts commit a4788a49f004a91bb8ca015336abf9ae119fbc52.

The above patch was added to handle host power down with the FSP in R/R state. But the FSP does not like OPAL giving up the PSI link early in the R/R process. For FSP-initiated R/R, OPAL should wait until we get the PSI interrupt. Hence reverting the above commit.

Also partially reverting commit e04a34af to make fsp_dpo_pending a global variable.

We have made several improvements in the way we handle FSP communication and also in the power down path. Now if the host sends a powerdown message when the FSP is in R/R, OPAL returns OPAL_BUSY_EVENT. The kernel will run poller() and retry the power down message after some time. So I think this patch will not have any side effect on the power down path.

Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-20 | hw/prd: Hold FSP notifications while PRD is inactive | Oliver O'Halloran | 1 | -1/+12
On FSP systems we rely on a service on the FSP to send us a notification when the OCCs become active. On systems with NVDIMMs this is especially critical because the OCC is responsible for starting the NVDIMM save procedure when power fails.

The message sent from the FSP isn't sent to OPAL itself; rather, it's sent to the PRD service running on the host (via OPAL). If this service is not running, OPAL will currently send an error response back to the FSP and drop the message. This causes problems because the OCCs active message is generally sent while OPAL is still booting the system, so the PRD daemon never gets notified that the OCC is active.

Once the OS is running we rely on PRD to report the protection status of the NVDIMMs on the system. However, because it never receives the notification from the FSP it will always report the DIMMs as un-protected because it thinks the OCCs are inactive.

This patch fixes the issue by allowing a single message to be held in OPAL while PRD is inactive. Once OPAL receives a notification that PRD has started we deliver the message.

It's worth pointing out that this is kind of janky and brittle and would probably break horribly if FSP notify messages were multi-part, since we could end up in a situation where only a single part of a multi-part message is queued, with the rest being dropped. However, the only user of the FSP notification message appears to be the OCC, and the OCC team says it's not a problem. I'll take their word for it.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
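The single-slot hold described here can be modelled as a small state machine. This is a sketch under assumptions: the `sketch_*` names are hypothetical, messages are reduced to an `int`, and dropping a second message while one is already held is an assumption (the commit message only says a single message is held).

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_prd {
	bool active;		/* the PRD daemon has started */
	bool held_valid;	/* one message is parked for later delivery */
	int held_msg;
	int last_delivered;
	unsigned int delivered;
};

static void sketch_deliver(struct sketch_prd *p, int msg)
{
	p->last_delivered = msg;
	p->delivered++;
}

/* Returns true if the message was delivered or held, false if dropped. */
static bool sketch_fsp_notify(struct sketch_prd *p, int msg)
{
	if (p->active) {
		sketch_deliver(p, msg);
		return true;
	}
	if (p->held_valid)
		return false;	/* assumption: only one slot, later messages drop */
	p->held_msg = msg;
	p->held_valid = true;
	return true;
}

/* PRD has started: flush the held message, if any. */
static void sketch_prd_started(struct sketch_prd *p)
{
	p->active = true;
	if (p->held_valid) {
		p->held_valid = false;
		sketch_deliver(p, p->held_msg);
	}
}
```

The single slot is also why a multi-part message would be a problem: only its first part would survive until PRD starts.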
2020-03-12 | Re-license contributions from Raptor Computer Systems | Oliver O'Halloran | 1 | -1/+1
The following files contain contributions from Timothy Pearson at Raptor Computer Systems. He has agreed to re-license these contributions as Dual Apache 2.0 / GPLv2+, so amend the SPDX tag to reflect that. hw/phb4.c include/phb4.h include/platform.h platforms/astbmc/talos.c platforms/astbmc/romulus.c Cc: Timothy Pearson <tpearson@raptorengineering.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-12 | Re-license IBM written files as Apache 2.0 OR GPLv2+ | Stewart Smith | 85 | -85/+85
SPDX makes it a simpler diff. I have audited the commit history of each file to ensure that they are exclusively authored by IBM and thus we have the right to relicense. The motivation behind this is twofold: 1) We want to enable experiments with coreboot, which is GPLv2 licensed 2) An upcoming firmware component wants to incorporate code from skiboot and code from the Linux kernel, which is GPLv2 licensed. I have gone through the IBM internal way of gaining approval for this. The following files are not exclusively authored by IBM, so are *not* included in this update (I will be seeking approval from contributors): core/direct-controls.c core/flash.c core/pcie-slot.c external/common/arch_flash_unknown.c external/common/rules.mk external/gard/Makefile external/gard/rules.mk external/opal-prd/Makefile external/pflash/Makefile external/xscom-utils/Makefile hdata/vpd.c hw/dts.c hw/ipmi/ipmi-watchdog.c hw/phb4.c include/cpu.h include/phb4.h include/platform.h libflash/libffs.c libstb/mbedtls/sha512.c libstb/mbedtls/sha512.h platforms/astbmc/barreleye.c platforms/astbmc/garrison.c platforms/astbmc/mihawk.c platforms/astbmc/nicole.c platforms/astbmc/p8dnu.c platforms/astbmc/p8dtu.c platforms/astbmc/p9dsu.c platforms/astbmc/vesnin.c platforms/rhesus/ec/config.h platforms/rhesus/ec/gpio.h platforms/rhesus/gpio.c platforms/rhesus/rhesus.c platforms/astbmc/talos.c platforms/astbmc/romulus.c Signed-off-by: Stewart Smith <stewart@linux.ibm.com> [oliver: fixed up the drift] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-11 | hw/fsp: Fix GENERIC_FAILURE mailbox status code | Oliver O'Halloran | 1 | -2/+2
The 0xEF return code is used to tell the hypervisor that the FSP was not able to replicate an NVRAM write to the secondary FSP. The GENERIC_FAILURE is using this code instead of the correct 0xFE code which indicates a generic error condition. We already have a FSP_STATUS_GENERIC_ERROR for 0xFE so convert the existing users of FSP_STATUS_GENERIC_FAILURE to use GENERIC_ERROR and remove the duplicate. Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-11 | hw/fsp: Remove stray va_end() in __fsp_fillmsg() | Oliver O'Halloran | 1 | -1/+0
__fsp_fillmsg() is called from fsp_fillmsg() and fsp_mkmsg(). Both callers wrap it in a va_start() / va_end() pair so using va_end() inside the function is almost certainly wrong. Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
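The pairing rule behind this fix is standard C: the function that calls va_start() is the one that calls va_end(), and a helper that merely receives the `va_list` should not. A minimal sketch with hypothetical function names:

```c
#include <assert.h>
#include <stdarg.h>

/*
 * Helper that receives an already-started va_list. It consumes
 * arguments but does not call va_end(): that is the caller's job.
 */
static int sketch_sum_v(int count, va_list ap)
{
	int total = 0;

	for (int i = 0; i < count; i++)
		total += va_arg(ap, int);
	return total;
}

/* Variadic wrapper: owns the va_start() / va_end() pair. */
static int sketch_sum(int count, ...)
{
	va_list ap;
	int total;

	va_start(ap, count);
	total = sketch_sum_v(count, ap);
	va_end(ap);
	return total;
}
```

Here `sketch_sum()` plays the role of the wrapping callers and `sketch_sum_v()` the role of the inner function that had the stray va_end().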
2020-03-11 | eSEL: Make sure PANIC logs are sent to BMC before calling assert | Vasant Hegde | 1 | -2/+15
eSEL logs are split into multiple smaller chunks and sent to the BMC. We use the ipmi_queue_msg_sync() interface for sending OPAL_ERROR_PANIC severity events to the BMC. But the callback handler (ipmi_cmd_done()) clears 'sync_msg' after getting the response to the first chunk, as it's not aware that we have more data to send. So in the assert()/checkstop path we may end up checkstopping the system before the error log is sent to the BMC completely. We will miss a useful error log.

This patch introduces a new wait loop in ipmi_elog_commit(). It will wait until the error log is sent to the BMC. I think this is safe because even if something goes wrong (like a BMC reset) we will hit the timeout and eventually come out of this loop.

Alternatively we can add an additional check in the ipmi_cmd_done() path. But I didn't want to make this path aware of the message type.

Reviewed-by: Klaus Heinrich Kiwi <klaus@linux.vnet.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-11 | esel: Fix OEM SEL generator ID | Vasant Hegde | 1 | -1/+1
Fixes: a2c74d83 (ipmi: endian conversion) Cc: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-11 | ipmi-sel: Free ipmi_msg in error path | Vasant Hegde | 1 | -0/+1
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-02-26 | errorlog: Replace hardcode value with macro | Vasant Hegde | 1 | -2/+2
Suggested-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-02-24 | capp: Add lid definition for P9 DD2.3 | Frederic Barrat | 1 | -0/+2
Add the definition of the CAPP microcode for DD2.3 to the lid map. Cc: skiboot-stable@lists.ozlabs.org # v6.5+ Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Tested-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-02-12 | npu2-opencapi: Allow platforms to identify physical slots | Frederic Barrat | 1 | -3/+13
This patch lets each platform define the name of the opencapi slots. It makes it easier to identify which physical card is generating errors or messages in the linux or skiboot log files.

The patch provides slot names for mihawk and witherspoon. If the platform doesn't define any, then we default to 'OPENCAPI-xxxx'.

There are various ways to find out about the slot names:
- skiboot log
- lspci command (if the PCI hotplug driver pnv-php is loaded)
- lshw
- checking the device tree
- and probably others....

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-02-12 | npu2-opencapi: Don't drive reset signal permanently | Frederic Barrat | 1 | -6/+40
A problem was found with the way we manage the I2C signal to reset adapters. Skiboot currently always drives the value of the opencapi reset signal. We set the I2C pin for reset in output mode and keep it in output mode permanently. And since the reset signal is inverted, it is explicitly set to high by the I2C controller pretty much all the time. When the opencapi card is powered off, for example on a reboot, actively driving the I2C reset pin to high keeps applying a voltage to part of the FPGA, which can leak current, send the FPGA in a bad state since it's unexpected or even damage the card. To prevent damaging adapters, the recommendation from the hardware team is to switch back the pin to input mode at the end of a reset cycle. There are pull-up resistors on the planar of all the platforms to make sure the reset signal is high "naturally". When the slot is powered off, the reset pin won't be kept high by the i2c controller any more. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-01-29 | npu2-opencapi: don't fence on masked XSL errors | Frederic Barrat | 1 | -2/+9
An upcoming change in the initfile is going to modify the default action and fence behavior of some of the NPU FIR2 bits. We're already overriding the settings of most of those. The one exception is for bits 41 and 42, which are XSL errors impacting 2 links that we mask (instead we rely on the subsequent OTL error, which is per link). The new initfile will fence-on-error for bits 41 and 42. And even if the FIRs are masked, the NPU logic could fence the links, which is not what we want. So this patch makes sure we don't fence on the FIRs we want to ignore. It has no effect on existing firmware. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-01-29hw/npu2-opencapi: Support multiple LPC devicesAndrew Donnellan2-19/+45
Currently, we only have a single range for LPC memory per chip, and we only allow a single device to use that range. With upcoming Hostboot/SBE changes, we'll use the chip address extension mask to give us multiple ranges by using the masked bits of the group ID. Each device can now allocate a whole 4TB non-mirrored region. We still don't do >4TB ranges. If the extension mask is not set correctly, we'll fall back to only permitting one device and printing an error suggesting a firmware upgrade. Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-01-29npu3: Register virtual PHBs with static IDsFrederic Barrat1-1/+2
Assigning opal IDs to virtual PHBs dynamically may lead to userland seeing the PCI domain ID for an adapter vary when adding or removing another adapter (GPU or opencapi). This patch switches to using static opal IDs for virtual PHBs, based on their ibm,phb-index property, which was made static by a previous patch. Note that the PCI domain IDs will increase on the second chip (or more, if we had more) because we now reserve 16 IDs per chip for PHBs. This affects Axone only. We don't change anything on P9 and npu2, to avoid altering how domain IDs have been shown on already GA'd platforms. Reviewed-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
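One way to picture the static assignment is a fixed block of 16 IDs per chip, indexed by phb-index. The sketch below is an assumption about the shape of the mapping, not the skiboot implementation; `PHB_IDS_PER_CHIP` and `static_opal_id` are names invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* The commit message says 16 IDs are reserved per chip for PHBs. */
#define PHB_IDS_PER_CHIP	16

/*
 * Hypothetical static mapping: each chip owns a block of 16 IDs and the
 * PHB takes the slot given by its phb-index inside that block. The
 * result no longer depends on the order in which PHBs are registered.
 */
static uint32_t static_opal_id(uint32_t chip_index, uint32_t phb_index)
{
	assert(phb_index < PHB_IDS_PER_CHIP);
	return chip_index * PHB_IDS_PER_CHIP + phb_index;
}
```

Under this assumed formula, `static_opal_id(0, 7)` is 7 and `static_opal_id(1, 7)` is 23, which shows why the domain IDs on the second chip become larger than they would with dynamic allocation.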
2020-01-29npu2, npu3: Remove ibm,phb-index property from the NPU dt nodeFrederic Barrat1-1/+0
The 'ibm,phb-index' property of the NPU node is now useless, as we can have multiple PHBs associated to the same NPU on P9. Let's remove it to avoid confusion. Reviewed-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-01-29npu3: Don't use the device tree to assign the phb-index of the PHBFrederic Barrat1-1/+1
On Axone, there's a 1-to-1 mapping between virtual PHBs and NPUs. We could keep assigning the phb-index of the virtual PHB from the value found in the npu node of the device tree, but to be consistent with P9/npu2 and avoid confusion, this patch assigns the phb-index when the virtual PHB is created, based on the npu index, similarly to what we do on P9. Reviewed-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-01-29npu2: Rework phb-index assignments for virtual PHBsFrederic Barrat3-2/+3
Until now, opencapi PHBs were not using the 'ibm,phb-index' property, as it was thought unnecessary. For nvlink, a phb-index was associated to the npu when parsing hdat data, and the nvlink PHB was reusing the same value. It turns out it helps to have the 'ibm,phb-index' property for opencapi PHBs after all. Otherwise it can lead to wrong results on platforms like mihawk when trying to match entries in the slot table. We end up with an opencapi device inheriting wrong properties in the device tree, because match_slot_phb_entry() defaults to phb-index 0 if it cannot find the property. Though it doesn't seem to cause any harm, it's wrong and a future patch is expected to start using the slot table for opencapi, so it needs fixing. The twist is that with opencapi, we can have multiple virtual PHBs for a single NPU on P9. There's one PHB per (opencapi) brick. Therefore there's no 1-to-1 mapping between the NPU and PHB index and it no longer makes sense to associate a phb-index to an npu. With this patch, opencapi PHBs created under an NPU use a fixed mapping for their phb-index, based on the brick index. The range of possible values is 7 to 12. Because there can only be one nvlink PHB per NPU, it is always using a phb-index of 7. A side effect is that 2 virtual PHBs on 2 different chips can have the same phb-index, which is similar to what happens for 'real' PCI PHBs, but is different from what was happening on a nvlink-only witherspoon so far. Reviewed-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
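The fixed mapping can be sketched as below. The commit message gives the range (7 to 12) and says the value is based on the brick index; the linear form "base plus brick index" with bricks numbered 0 to 5 is an assumption of this example, as are the names `NPU_PHB_INDEX_BASE`, `NPU_MAX_BRICKS`, `opencapi_phb_index` and `nvlink_phb_index`.

```c
#include <assert.h>

/* First phb-index used for virtual PHBs, per the commit message. */
#define NPU_PHB_INDEX_BASE	7
/* Assumed number of bricks, chosen so that the range ends at 12. */
#define NPU_MAX_BRICKS		6

/*
 * Hypothetical mapping for opencapi: one PHB per brick, so the
 * phb-index is derived from the brick index. Returns -1 for an
 * out-of-range brick.
 */
static int opencapi_phb_index(unsigned int brick_index)
{
	if (brick_index >= NPU_MAX_BRICKS)
		return -1;
	return NPU_PHB_INDEX_BASE + (int)brick_index;
}

/* nvlink: only one PHB per NPU, so it always uses the base index. */
static int nvlink_phb_index(void)
{
	return NPU_PHB_INDEX_BASE;
}
```

Since the result depends only on the brick index and not on the chip, two PHBs on different chips can get the same phb-index, which is the side effect noted in the commit message.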
2019-12-16fix simple sparse warningsNicholas Piggin3-4/+4
Should be no real code change; these mostly update type declarations that sparse complains about. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-12-16add more sparse endian annotationsNicholas Piggin4-10/+11
This fixes quite a few sparse endian annotations across the tree. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-12-16dt: assorted cleanupsNicholas Piggin2-17/+13
This replaces several instances of dt accesses with higher level primitives throughout the tree. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-12-16sfc-ctrl: endian conversionsNicholas Piggin1-10/+9
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-12-16prd: endian conversionsNicholas Piggin1-32/+32
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-12-16fsp: endian conversionsNicholas Piggin22-287/+298
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>