Age / Commit message / Author / Files / Lines
2019-05-10skiboot v6.3.1 release notesv6.3.1Vasant Hegde1-0/+60
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2019-05-10doc/bmc: Document SBE validation on P8 platformsSamuel Mendoza-Jonas1-0/+27
[ Upstream commit 5e8a373ebe4dea501245e1103de9ca3abc7ab976 ] Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Reviewed-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2019-05-10platforms/astbmc: Check for SBE validation stepSamuel Mendoza-Jonas10-3/+196
[ Upstream commit 1bc63b896405ccea4584d764a28d01858e81efc3 ] On some POWER8 astbmc systems an update to the SBE requires pausing at runtime to ensure integrity of the SBE. If this is required the BMC will set a chassis boot option IPMI flag using the OEM parameter 0x62. If Skiboot sees this flag is set it waits until the SBE update is complete and the flag is cleared. Unfortunately the mystery operation that validates the SBE also leaves it in a bad state and unable to be used for timer operations. To work around this the flag is checked as soon as possible (i.e. when IPMI and the console are set up), and once complete the system is rebooted. Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Reviewed-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2019-05-10include/ipmi: Fix incorrect chassis commandsSamuel Mendoza-Jonas1-7/+7
[ Upstream commit bc2b1de3beb2ee7904d936b10c8a57cd220d8ddc ] These commands are listed in the order they appear in the IPMI specification but with the wrong values - correct them! Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Reviewed-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2019-05-10ipmi: ensure forward progress on ipmi_queue_msg_sync()Stewart Smith4-1/+28
[ Upstream commit f01cd777adb16cbab93215d26159aa1c4606112c ] BT responses are handled using a timer doing the polling. To hope to get an answer to an IPMI synchronous message, the timer needs to run. We can't just check all timers though as there may be a timer that wants a lock that's held by a code path calling ipmi_queue_msg_sync(), and if we did enforce that as a requirement, it's a pretty subtle API that is asking to be broken. So, if we just run a poll function to crank anything that the IPMI backend needs, then we should be fine. This issue shows up very quickly under QEMU when loading the first flash resource with the IPMI HIOMAP backend. Reported-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com> Reviewed-by: Andrew Jeffery <andrew@aj.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
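The idea in the commit above (poll only what the IPMI backend needs while waiting for a synchronous reply, rather than running every timer) can be sketched roughly as follows. This is a toy model in Python, not skiboot code: `FakeBackend`, `queue_msg_sync` and the `max_polls` guard are hypothetical names invented for the illustration.

```python
from collections import deque


class FakeBackend:
    """Hypothetical stand-in for an IPMI backend such as BT (not skiboot code)."""

    def __init__(self, polls_until_reply):
        self.polls_left = polls_until_reply
        self.pending = deque()

    def queue(self, msg):
        self.pending.append(msg)

    def poll(self):
        # Crank only the backend's own state machine, not every system timer,
        # so no unrelated timer can try to take a lock the caller already holds.
        if not self.pending:
            return
        self.polls_left -= 1
        if self.polls_left <= 0:
            msg = self.pending.popleft()
            msg["done"] = True


def queue_msg_sync(backend, msg, max_polls=1000):
    """Queue msg, then poll the backend until the message completes.

    Returns the number of polls needed. max_polls is an illustrative guard
    added for this sketch so the loop cannot spin forever.
    """
    msg["done"] = False
    backend.queue(msg)
    polls = 0
    while not msg["done"]:
        if polls >= max_polls:
            raise TimeoutError("no response from backend")
        backend.poll()
        polls += 1
    return polls
```

The point of the design is that forward progress depends only on the backend's poll function, so callers holding unrelated locks cannot deadlock against a timer.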
2019-05-10pci/iov: Remove skiboot VF trackingOliver O'Halloran4-305/+1
[ Upstream commit 22057f868f3b2b1fd02647a738f6da0858b5eb6c ] This feature was added a few years ago in response to a request to make the MaxPayloadSize (MPS) field of a Virtual Function match the MPS of the Physical Function that hosts it. The SR-IOV specification states that the MPS field of the VF is "ResvP". This indicates the VF will use whatever MPS is configured on the PF and that the field should be treated as a reserved field in the config space of the VF. In other words, an SR-IOV spec compliant VF should always return zero in the MPS field. Adding hacks in OPAL to make it non-zero is... misguided at best. Additionally, there is a bug in the way pci_device structures are handled by VFs that results in a crash on fast-reboot that occurs if VFs are enabled and then disabled prior to rebooting. This patch fixes the bug by removing the code entirely. This patch has no impact on SR-IOV support on the host operating system. Cc: Sergey Miroshnichenko <s.miroshnichenko@yadro.com> Cc: skiboot-stable@lists.ozlabs.org Tested-by: Santwana Samantray <santwana.samantray@in.ibm.com> Tested-by: Satheesh Rajendran <satheera@in.ibm.com> [oliver: added tested-bys] Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2019-05-03skiboot v6.3 release notesv6.3Stewart Smith1-0/+1275
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-05-03Disable fast-reset for POWER8Stewart Smith1-2/+9
There is a bug with fast-reset when CPU cores are busy, which can be reproduced by running `stress` and then trying `reboot -ff` (this is what the op-test test cases FastRebootHostStress and FastRebootHostStressTorture do). What happens is the cores lock up, which isn't the best thing in the world when you want them to start executing instructions again. A workaround is to use instruction ramming, which while greatly increasing the reliability of fast-reset on p8, doesn't make it perfect. Instruction ramming is what pdbg was modified to do in order to have the sreset functionality work reliably on p8. pdbg patches: https://patchwork.ozlabs.org/project/pdbg/list/?series=96593&state=* Fixes: https://github.com/open-power/skiboot/issues/185 Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-05-03pci: Try harder to add meaningful ibm,loc-codeStewart Smith1-0/+15
We keep the existing logic of looking to the parent for the slot-label or slot-location-code, but add fallbacks: if all that fails, we look directly for the slot-location-code (as this should give us the correct loc code for things directly under the PHB), and otherwise we just look for a loc-code.

The applicable bit of PAPR here is:

R1–12.1–1. Each instance of a hardware entity (FRU) has a platform unique location code and any node in the OF device tree that describes a part of a hardware entity must include the “ibm,loc-code” property with a value that represents the location code for that hardware entity.

We weren't really fully obeying this at any recent (ever?) point in time. Now we should do okay, at least for PCI. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
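The fallback order described in this commit could look roughly like the sketch below. It is written in Python over plain dicts for brevity. The node layout (`props`, `parent`), the helper names, and the choice to check every ancestor for the slot label before checking any for the slot location code are assumptions for illustration, not skiboot's `dt_node` API or its exact search order.

```python
def _from_ancestors(node, prop):
    """Walk up from node's parent and return the first value of prop found."""
    anc = node.get("parent")
    while anc is not None:
        if prop in anc["props"]:
            return anc["props"][prop]
        anc = anc.get("parent")
    return None


def find_loc_code(node):
    """Return a location code for node, or None if nothing usable exists."""
    # Existing logic: look to the parent(s) for a slot label or slot loc code.
    for prop in ("ibm,slot-label", "ibm,slot-location-code"):
        val = _from_ancestors(node, prop)
        if val is not None:
            return val
    # New fallbacks: the node's own slot-location-code, then any loc-code.
    for prop in ("ibm,slot-location-code", "ibm,loc-code"):
        if prop in node["props"]:
            return node["props"][prop]
    return None
```

A node directly under the PHB has no slot-providing parent, which is the case the two new fallbacks are meant to cover.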
2019-05-02skiboot v6.3-rc3 release notesv6.3-rc3Stewart Smith1-0/+228
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-05-02Mark all partitions except full PNOR and boot kernel firmware read onlyTimothy Pearson1-0/+7
FFS partitions don't always align on erase blocks. Mark any partitions not known to align on erase blocks as read only to prevent silent corruption of adjacent partitions during erase / write from the host. Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
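To see why a misaligned partition is dangerous, consider what a whole-block erase actually touches. The sketch below is illustrative Python; the 64 KiB erase size and the partition offsets are made-up values, and the helper names are hypothetical. The commit itself marks partitions read only by name and does not compute alignment this way.

```python
def is_erase_aligned(start, size, erase_size):
    """True if the partition begins and ends on erase-block boundaries."""
    return start % erase_size == 0 and size % erase_size == 0


def erase_span(start, size, erase_size):
    """Return the (first, end) byte range wiped by erasing whole blocks
    that cover [start, start + size)."""
    first = (start // erase_size) * erase_size
    end = -(-(start + size) // erase_size) * erase_size  # round up
    return first, end


ERASE = 0x10000  # assumed 64 KiB erase block, for illustration only

# A partition at 0x31000 of size 0x9000 is not aligned; erasing it wipes
# 0x30000..0x40000, which includes bytes owned by its neighbours.
misaligned = erase_span(0x31000, 0x9000, ERASE)
```

Here `misaligned` covers more than the partition itself, which is the silent corruption the read-only marking prevents.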
2019-05-02Expose PNOR Flash partitions to host MTD driver via devicetreeTimothy Pearson2-12/+65
This makes it possible for the host to directly address each partition without requiring each application to directly parse the FFS headers. This has been in use for some time already to allow BOOTKERNFW partition updates from the host. Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-05-02Write boot progress to LPC ports 81 and 82Stewart Smith2-2/+102
There's a thought to write more extensive boot progress codes to LPC ports 81 and 82 to supplement/replace any reliance on port 80. We want to still emit port 80 for platforms like Zaius and Barreleye that have the physical display. Ports 81 and 82 can be monitored by a BMC though. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-05-02Write boot progress to LPC port 80hStewart Smith6-3/+195
This is an adaptation of what we currently do for op_display() on FSP machines, inventing an encoding for what we can write into the single byte at LPC port 80h. Port 80h is often used on x86 systems to indicate boot progress/status and dates back a decent amount of time. Since a byte isn't exactly very expressive for everything that can go on (and wrong) during boot, it's all about compromise. Some systems (such as Zaius/Barreleye G2) have a physical dual 7 segment display that displays these codes. So far, this has only been driven by hostboot (see hostboot commit 90ec2e65314c). Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-05-02Remove Talos DT match from Romulus fileTimothy Pearson1-2/+1
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-05-02Copy and convert Romulus descriptors to TalosTimothy Pearson2-1/+88
Talos II has some hardware differences from Romulus, therefore we cannot guarantee Talos II == Romulus in skiboot. Copy and slightly modify the Romulus files for Talos II. Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-05-02hw/phb4: Fix references to PHB3Oliver O'Halloran1-2/+2
Currently most of the functionality of phb4_lsi_attributes() is disabled when we have #defined DISABLE_ERR_INTS. This is the default behaviour and #undefing the constant results in skiboot not compiling because the code was not updated when it was copied across from PHB3. This patch fixes the problem by changing the names to the phb4 versions. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-05-02npu2: Disable Probe-to-Invalid-Return-Modified-or-Owned snarfing by defaultAlexey Kardashevskiy2-13/+57
V100 GPUs are known to violate the NVLink2 protocol in some cases (one is when memory was accessed by the CPU and then by the GPU using so called block linear mapping) and issue double probes to the NPU, which can cope with this problem only if CONFIG_ENABLE_SNARF_CPM ("disable/enable Probe.I.MO snarfing a cp_m") is not set in the CQ_SM Misc Config register #0. If the bit is set (which is the case today), the NPU issues the machine check stop. The snarfing feature is designed to detect 2 probes in flight and combine them into one.

This adds a new "opal-npu2-snarf-cpm" nvram variable which controls CONFIG_ENABLE_SNARF_CPM for all NVLinks to prevent the machine check stop from happening. This disables snarfing by default as otherwise a broken GPU driver can crash the entire box even when a GPU is passed through to a guest. This provides a dial to allow regression tests (might be useful for a bare metal). To enable snarfing, the user needs to run:

    sudo nvram -p ibm,skiboot --update-config opal-npu2-snarf-cpm=enable

and reboot the host system. While at it, define macros for register names as well to avoid touching the same lines over and over again. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-05-02core/init: LPC isn't just P8 (fix comment)Stewart Smith1-1/+1
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-05-02doc: Add (most) nvram debugging optionsStewart Smith6-1/+166
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-29hw/npu2: Show name of opencapi error interruptsFrederic Barrat1-2/+5
Add the name of which error interrupt is received. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-29core/pci: Use PHB io-base-location by default for PHB slotsOliver O'Halloran1-0/+9
On witherspoon only the GPU slots and the three pluggable PCI slots (SLOT0, 1, 2) have platform defined slot names. For builtin devices such as the SATA controller or the PLX switch that fans out to the GPU slots we have no location codes which some people consider an issue. This patch addresses the problem by making the ibm,slot-location-code for the root port device default to the ibm,io-base-location-code which is typically the location code for the system itself. e.g.

    pciex@600c3c0100000/ibm,loc-code "UOPWR.0000000-Node0-Proc0"
    pciex@600c3c0100000/pci@0/ibm,loc-code "UOPWR.0000000-Node0-Proc0"
    pciex@600c3c0100000/pci@0/usb-xhci@0/ibm,loc-code "UOPWR.0000000-Node0"

The PHB node, and the root complex nodes have a loc code of the processor they are attached to, while the usb-xhci device under the root port has a location code of the system itself. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-29hw/phb4: Read ibm,loc-code from PBCQ nodeOliver O'Halloran1-2/+2
On P9 the PBCQs are subdivided by stacks which implement the PCI Express logic. When phb4 was forked from phb3 most of the properties that were in the pbcq node moved into the stack node, but ibm,loc-code was not one of them. This patch fixes the phb4 init sequence to read the base location code from the PBCQ node (parent of the stack node) rather than the stack node itself. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-17hw/xscom: P9P rather than P9Stewart Smith1-1/+1
Fixes: 2c8f96534a978bb4cac3e4b7dd393a9cc4926555 Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-17hw/xscom: add missing P9P chip nameNicholas Piggin1-1/+1
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-17asm/head: balance branches to avoid link stack predictor mispredictsNicholas Piggin1-1/+6
The Linux wrapper for OPAL call and return is arranged like this:

    __opal_call:
        mflr   r0
        std    r0,PPC_STK_LROFF(r1)
        LOAD_REG_ADDR(r11, opal_return)
        mtlr   r11
        hrfid  -> OPAL

    opal_return:
        ld     r0,PPC_STK_LROFF(r1)
        mtlr   r0
        blr

When skiboot returns to Linux, it branches to LR (i.e., opal_return) with a blr. This unbalances the link stack predictor and will cause mispredicts back up the return stack. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
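A toy model can make the unbalancing effect concrete. The Python sketch below treats the link stack as a plain stack: call-type branches push a return address, and `blr` pops a prediction. The `balanced` variant assumes the fix takes the form of one extra call-type branch inside skiboot before its final `blr`; the commit message above does not show the actual change, so that detail and all names here are hypothetical.

```python
class LinkStack:
    """Toy return-address predictor: calls push, returns pop."""

    def __init__(self):
        self.stack = []
        self.mispredicts = 0

    def call(self, return_addr):
        self.stack.append(return_addr)

    def ret(self, actual_target):
        predicted = self.stack.pop() if self.stack else None
        if predicted != actual_target:
            self.mispredicts += 1


def opal_round_trip(balanced):
    """Count mispredicted returns for one OPAL call made two frames deep."""
    ls = LinkStack()
    ls.call("ret_to_A")      # A calls B
    ls.call("ret_to_B")      # B calls __opal_call
    # __opal_call enters OPAL via hrfid: not a call, so nothing is pushed.
    if balanced:
        ls.call("dummy")     # hypothetical extra call-type branch in skiboot
    ls.ret("opal_return")    # skiboot's blr to LR (opal_return)
    ls.ret("ret_to_B")       # opal_return's blr back into B
    ls.ret("ret_to_A")       # B's blr back into A
    return ls.mispredicts
```

In this model the unbalanced path mispredicts every return on the way back up, while the balanced path mispredicts only the first one.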
2019-04-17external/mambo: also invoke readline for the non-autorun caseNicholas Piggin1-0/+2
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-17asm/head.S: set POWER9 radix HID bit at entryNicholas Piggin4-20/+5
When running in virtual memory mode, the radix MMU hid bit should not be changed, so set this in the initial boot SPR setup. As a side effect, fast reboot also has HID0:RADIX bit set by the shared spr init, so no need for an explicit call. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-17opal-prd: Fix memory leak in is-fsp-system checkVasant Hegde1-1/+6
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-17opal-prd: Check malloc return valueVasant Hegde1-0/+4
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-17Makefile: Build with symbolsJoel Stanley1-1/+1
Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-17hw/phb4: Squash the IO bridge windowOliver O'Halloran1-0/+8
The PCI-PCI bridge spec says that bridges that implement an IO window should hardcode the IO base and limit registers to zero. Unfortunately, these registers only define the upper bits of the IO window and the low bits are assumed to be 0 for the base and 1 for the limit address. As a result, setting both to zero can be mis-interpreted as a 4K IO window. This patch fixes the problem the same way PHB3 does. It sets the IO base and limit values to 0xf000 and 0x1000 respectively which most software interprets as a disabled window.

lspci before patch:

    0000:00:00.0 PCI bridge: IBM Device 04c1 (prog-if 00 [Normal decode])
        I/O behind bridge: 00000000-00000fff

lspci after patch:

    0000:00:00.0 PCI bridge: IBM Device 04c1 (prog-if 00 [Normal decode])
        I/O behind bridge: None

Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
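The decoding rule described in this commit can be checked with a few lines of Python. This is a sketch that works on the 16-bit address values quoted in the commit message rather than on the raw config-space registers; `decode_io_window` is a hypothetical helper, not part of skiboot or lspci.

```python
def decode_io_window(base, limit):
    """Decode a bridge I/O window from 16-bit base/limit values.

    Only address bits 15:12 are programmable. The low 12 bits are taken
    as 0x000 for the base and 0xfff for the limit. Returns (start, end),
    or None when base > limit, which software treats as a disabled window.
    """
    start = base & 0xF000
    end = (limit & 0xF000) | 0x0FFF
    if start > end:
        return None
    return start, end


zeroed = decode_io_window(0x0000, 0x0000)    # both registers zero
squashed = decode_io_window(0xF000, 0x1000)  # values set by this patch
```

With both values zero the window decodes as 0x0000 to 0x0fff, matching the "before" lspci output; with 0xf000 and 0x1000 the base exceeds the limit and the window reads as disabled.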
2019-04-17build: link with --orphan-handling=warnNicholas Piggin1-0/+2
The linker can warn when the linker script does not explicitly place all sections. These orphan sections are placed according to heuristics, which may not always be desirable. Enable this warning. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-17opal-ci: Centos7 with latest crosstool toolchain (gcc 8.1.0)Stewart Smith2-4/+4
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-17build/lds: place remaining sections according to defaultsNicholas Piggin1-4/+11
Place remaining orphan linker sections according to default script as described by `ld --verbose`. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-17build/lds: place debug sections according to defaultsNicholas Piggin1-0/+45
Place debug orphan linker sections according to default script as described by `ld --verbose`. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-17build: -fno-asynchronous-unwind-tablesNicholas Piggin2-1/+2
skiboot does not use unwind tables; this option saves about 100kB, mostly from .text. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-17chiptod: Remove unused prototype from headerJordan Niethe1-1/+0
There is a prototype for chiptod_reset_tb() in include/chiptod.h. However no definition is ever provided, nor is it ever used. Remove the prototype. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-17hw/xscom: Enable sw xstop by default on p9Oliver O'Halloran1-24/+2
This was disabled at some point during bringup to make life easier for the lab folks trying to debug NVLink issues. This hack really should have never made it out into the wild though, so we now have the following situation occurring in the field:

1) A bad happens
2) The host kernel receives an unrecoverable HMI and calls into OPAL to request a platform reboot.
3) OPAL rejects the reboot attempt and returns to the kernel with OPAL_PARAMETER.
4) Kernel panics and attempts to kexec into a kdump kernel.

A side effect of the HMI seems to be CPUs becoming stuck which results in the initialisation of the kdump kernel taking an extremely long time (6+ hours). It's also been observed that after performing a dump the kdump kernel then crashes itself because OPAL has ended up in a bad state as a side effect of the HMI. All up, it's not very good so re-enable the software checkstop by default. If people still want to turn it off they can using the nvram override. Cc: skiboot-stable@lists.ozlabs.org Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Acked-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-17opal/hmi: Initialize the hmi event with old value of TFMR.Mahesh Salgaonkar1-1/+3
Do this before we fix TFAC errors. Otherwise the event at host console shows no thread error reported in TFMR register.

Without this patch the console event shows TFMR with no thread error (DEC parity error TFMR[59] injection):

    [ 53.737572] Severe Hypervisor Maintenance interrupt [Recovered]
    [ 53.737596] Error detail: Timer facility experienced an error
    [ 53.737611] HMER: 0840000000000000
    [ 53.737621] TFMR: 3212000870e04000

After this patch it shows old TFMR value on host console:

    [ 2302.267271] Severe Hypervisor Maintenance interrupt [Recovered]
    [ 2302.267305] Error detail: Timer facility experienced an error
    [ 2302.267320] HMER: 0840000000000000
    [ 2302.267330] TFMR: 3212000870e14010

Fixes: 674f7696f ("opal/hmi: Rework HMI handling of TFAC errors") Cc: skiboot-stable@lists.ozlabs.org Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
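The two TFMR values in this commit can be compared bit by bit. The Python sketch below assumes the usual IBM big-endian bit numbering, where bit 0 is the most significant bit of the 64-bit register; the helper names are hypothetical.

```python
def ibm_bit(n, width=64):
    """Mask for bit n in IBM numbering (bit 0 is the most significant)."""
    return 1 << (width - 1 - n)


def set_ibm_bits(value, width=64):
    """List the IBM bit numbers that are set in value."""
    return [n for n in range(width) if value & ibm_bit(n, width)]


tfmr_without_patch = 0x3212000870E04000  # value logged without the patch
tfmr_with_patch = 0x3212000870E14010     # value logged with the patch

# Bits that differ between the two logged values.
changed = set_ibm_bits(tfmr_without_patch ^ tfmr_with_patch)

# TFMR[59] is the injected DEC parity error named in the commit message.
dec_parity_visible = bool(tfmr_with_patch & ibm_bit(59))
```

Under that numbering, bit 59 is set only in the value logged with the patch, which is the thread error the unpatched event was hiding.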
2019-04-17core/pci: Prefer ibm, slot-label when finding loc codesOliver O'Halloran1-5/+10
On OpenPower systems the ibm,slot-label property is used to identify slots rather than the more verbose ibm,slot-location-code. The slot-label lookup is currently broken since it assumes that the ibm,slot-label is in the PCI device node rather than in the node of the device that provides the slot (e.g. root port or switch downstream port). This patch corrects the lookup code to search the parent node (and possibly its grandparents), similar to how we search for ibm,slot-location-code. Fixes: 1c3baae4f2b3 ("hdata/iohub: Look for IOVPD on P9") Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-11skiboot v6.3-rc2 release notesv6.3-rc2Stewart Smith1-0/+96
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-09test-ipmi-hiomap: Add read-one-byte testVasant Hegde1-0/+40
Add test case to read:
- 1 byte
- 1 block and 1 byte data

Cc: Andrew Jeffery <andrew@aj.id.au> Cc: skiboot-stable@lists.ozlabs.org Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Reviewed-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-09test-ipmi-hiomap: Fix lpc-read-successVasant Hegde1-1/+3
Cc: Andrew Jeffery <andrew@aj.id.au> Cc: skiboot-stable@lists.ozlabs.org Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Reviewed-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-09test-ipmi-hiomap: Add write-one-byte testVasant Hegde1-0/+38
Add test case to write:
- 1 byte
- 1 block and 1 byte data

Cc: Andrew Jeffery <andrew@aj.id.au> Cc: skiboot-stable@lists.ozlabs.org Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Reviewed-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-09test-ipmi-hiomap: Assert if size is zeroVasant Hegde1-1/+2
Cc: Andrew Jeffery <andrew@aj.id.au> Cc: skiboot-stable@lists.ozlabs.org Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Reviewed-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-09libflash/ipmi-hiomap: Fix blocks count issueVasant Hegde1-3/+18
We convert data size to block count and pass block count to BMC. If data size is not block aligned then we end up sending a block count less than the actual data. BMC will write partial data to flash memory.

Sample log:

    [ 594.388458416,7] HIOMAP: Marked flash dirty at 0x42010 for 8
    [ 594.398756487,7] HIOMAP: Flushed writes
    [ 594.409596439,7] HIOMAP: Marked flash dirty at 0x42018 for 3970
    [ 594.419897507,7] HIOMAP: Flushed writes

In this case HIOMAP sent data with block count=0 and hence BMC didn't flush data to flash. Let's fix this issue by adjusting block count before sending it to BMC. Cc: Andrew Jeffery <andrew@aj.id.au> Cc: skiboot-stable@lists.ozlabs.org Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Reviewed-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
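The arithmetic behind the bug can be reproduced with the offsets and sizes from the sample log. The Python sketch below assumes a 4 KiB block (a shift of 12), which the commit message does not state. Both helpers are hypothetical, and the real patch may adjust the count differently.

```python
BLOCK_SHIFT = 12  # assumption for illustration: 4 KiB blocks


def naive_block_count(size, shift=BLOCK_SHIFT):
    """Truncating conversion: any write smaller than a block becomes 0."""
    return size >> shift


def covering_block_count(offset, size, shift=BLOCK_SHIFT):
    """Number of blocks touched by the byte range [offset, offset + size)."""
    if size == 0:
        return 0
    first = offset >> shift
    last = (offset + size - 1) >> shift
    return last - first + 1


# The two writes from the sample log.
small = (0x42010, 8)
larger = (0x42018, 3970)
```

Both writes from the log truncate to zero blocks under the naive conversion, yet each touches one block, so one block must be reported for the BMC to flush anything.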
2019-04-09opal/hmi: Never trust a cow!Frederic Barrat1-1/+1
With opencapi, it's fairly common to trigger HMIs during AFU development on the FPGA, by not replying in time to an NPU command, for example. So shift the blame reported by that cow to avoid crowding my mailbox. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-09hw/npu2: Dump (more) npu2 registers on link error and HMIsFrederic Barrat4-57/+246
We were already logging some NPU registers during an HMI. This patch cleans up a bit how it is done and separates what is global from what is specific to nvlink or opencapi. Since we can now receive an error interrupt when an opencapi link goes down unexpectedly, we also dump the NPU state but we limit it to the registers of the brick which hit the error. The list of registers to dump was worked out with the hw team to allow for proper debugging. For each register, we print the name as found in the NPU workbook, the scom address and the register value. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-04-09hw/npu2: Report errors to the OS if an OpenCAPI brick is fencedFrederic Barrat2-4/+52
Now that the NPU may report interrupts due to the link going down unexpectedly, report those errors to the OS when queried by the 'next_error' PHB callback. The hardware doesn't support recovery of the link when it goes down unexpectedly. So we report the PHB as dead, so that the OS can log the proper message, notify the drivers and take the devices down. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>