Age | Commit message | Author | Files | Lines
2019-10-23 | skiboot v6.5.1 release notes (tag: v6.5.1) | Vasant Hegde | 1 | -0/+27
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2019-10-23 | core/ipmi: Fix use-after-free | Vasant Hegde | 1 | -3/+11
[ Upstream commit d75e82dbfbb9443efeb3f9a5921ac23605aab469 ]

Commit f01cd77 introduced a backend poller() for IPMI messages. In some corner cases, however, it is possible that we end up calling poller() after freeing the IPMI message.

  Thread 1 : ipmi_queue_msg_sync()
               Waiting for ipmi sync message to complete
  Thread 2 : bt_poll() -> ipmi_cmd_done() -> callback handler -> free message

Oliver hit this issue during a fast-reboot test with a skiboot DEBUG build. In a debug build we poison the memory after free, which helped us catch this issue.

  [ 460.295570781,3] ***********************************************
  [ 460.295773157,3] Fatal MCE at 0000000030035cb4 .ipmi_queue_msg_sync+0x110 MSR 9000000000201002
  [ 460.295887496,3] CFAR : 0000000030035ce8 MSR : 9000000000000000
  [ 460.295956419,3] SRR0 : 0000000030035cb4 SRR1 : 9000000000201002
  [ 460.296035015,3] HSRR0: 0000000030012624 HSRR1: 9000000002803002
  [ 460.296102413,3] DSISR: 00000008 DAR : 99999999999999d1
  [ 460.296169710,3] LR : 0000000030035ce4 CTR : 0000000030002880
  [ 460.296248482,3] CR : 28002422 XER : 20040000
  [ 460.296336621,3] GPR00: 0000000030035ce4 GPR16: 00000000301d36d8
  [ 460.296415449,3] GPR01: 0000000031c133d0 GPR17: 00000000300f5cd8
  [ 460.296482811,3] GPR02: 0000000030142700 GPR18: 0000000030407ff0
  [ 460.296550265,3] GPR03: 0000000000000100 GPR19: 0000000000000000
  [ 460.296629041,3] GPR04: 0000000028002424 GPR20: 0000000000000000
  [ 460.296696369,3] GPR05: 0000000020040000 GPR21: 0000000030121d73
  [ 460.296820977,3] GPR06: c000001fffffd480 GPR22: 0000000030121dd2
  [ 460.296888226,3] GPR07: c000001fffffd480 GPR23: 0000000030613400
  [ 460.296978218,3] GPR08: 0000000000000001 GPR24: 0000000000000001
  [ 460.297056871,3] GPR09: 9999999999999999 GPR25: 0000000031c13960
  [ 460.297124647,3] GPR10: 0000000000000000 GPR26: 0000000000000004
  [ 460.297203811,3] GPR11: 0000000000000000 GPR27: 0000000000000003
  [ 460.297271250,3] GPR12: 0000000028002424 GPR28: 0000000030613400
  [ 460.297339026,3] GPR13: 0000000031c10000 GPR29: 0000000030406b50
  [ 460.297417605,3] GPR14: 00000000300f58f8 GPR30: 0000000030406b40
  [ 460.297485176,3] GPR15: 00000000300f58d8 GPR31: 00000000309249c8

Reported-by: Oliver O'Halloran <oohall@gmail.com>
Fixes: f01cd77 (ipmi: ensure forward progress on ipmi_queue_msg_sync())
Cc: skiboot-stable@lists.ozlabs.org # v6.3+
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
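The race described above (a waiter polling a message that a completion callback has already freed) can be avoided in several ways. The following is a minimal single-threaded sketch of one of them, not the change made by the upstream commit: the waiter polls a flag that it owns itself, so it never dereferences the message after handing it over. All names here (`struct msg`, `queue_msg_sync`, `poll_backend`, `complete_and_free`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical message type; not the skiboot struct ipmi_msg. */
struct msg {
	void (*complete)(struct msg *m); /* completion callback */
	bool *done;                      /* storage owned by the waiter */
};

/* Completion handler: signal the waiter first, then free the message. */
static void complete_and_free(struct msg *m)
{
	*m->done = true;
	free(m);
}

/* Stand-in for the backend poller: completes the pending message, if any. */
static bool poll_backend(struct msg **pending)
{
	struct msg *m = *pending;

	if (!m)
		return false;
	*pending = NULL;
	m->complete(m); /* may free m */
	return true;
}

/* Synchronous queue: after the hand-over, only 'done' is read, never *m. */
static void queue_msg_sync(struct msg *m, struct msg **pending)
{
	bool done = false;

	assert(m && pending);
	m->done = &done;
	*pending = m;
	while (!done)
		(void)poll_backend(pending);
}
```

In a real multi-threaded firmware the flag would also need appropriate locking or barriers; the sketch only shows the ownership idea.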
2019-10-23 | blocklevel: smart_write: Fix unaligned writes to ECC partitions | Andrew Jeffery | 4 | -7/+65
[ Upstream commit 96ddf4b5d3e18ed9fbf726d0dabebe1ec2424fac ]

Currently trying to clear a gard record results in errors:

    $ ./opal-gard -pef part create /sys0/node0/proc1
    $ ./opal-gard -pef part list
    ID | Error | Type | Path
    ---------------------------------------------------------
    00000001 | 00000000 | Manual | /Sys0/Node0/Proc1
    =========================================================
    $ ./opal-gard -pef part clear 00000001
    Clearing gard record 0x00000001...done
    ECC: uncorrectable error: ffffff00ffffffff ff
    libflash ecc invalid
    $

A little wrapper around hexdump(1) helps show where the error lies by grouping output in blocks of nine such that the last byte is the ECC byte:

    $ declare -f ecchd
    ecchd ()
    {
        hexdump -e '"%08_ax""\t"' -e '9/1 "%02x ""\t""|"' -e '9/1 "%_p""|\n"' "$@"
    }

A clean GARD partition displays as:

    $ ecchd part
    0002c000 ff ff ff ff ff ff ff ff 00 |.........|
    *
    00030ffb ff ff ff ff ff |.....|
    $

Dumping the corrupt partition shows:

    $ ecchd part
    0002c000 ff ff ff ff ff ff ff ff 00 |.........|
    *
    0002c024 ff ff ff ff ff ff ff ff ff |.........|
    0002c02d ff ff ff 00 ff ff ff ff ff |.........|
    *
    0002c051 ff ff ff 00 ff ff ff ff 00 |.........|
    0002c05a ff ff ff ff ff ff ff ff 00 |.........|
    *
    00030ffb ff ff ff ff ff |.....|
    $

blocklevel_smart_write() turned out to not be quite as smart as it thought it was in that any unaligned write to ECC protected partitions aligned the calculated ECC values to the start of the write buffer and not relative to the start of the partition.

Fixes: 29d1e6f78109 ("libflash/blocklevel: add a smart write function which wraps up eraseing and writing")
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
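To make the alignment point concrete, here is a small sketch of the arithmetic, assuming the layout shown by the hexdump wrapper above (eight data bytes followed by one ECC byte). It is not the libflash code: `struct ecc_span` and `ecc_span_for_write()` are hypothetical names, and offsets are taken relative to the start of the ECC-protected partition, which is the reference point the commit message says the ECC words must be aligned to.

```c
#include <assert.h>
#include <stdint.h>

#define ECC_DATA_BYTES 8u /* data bytes covered by one ECC byte */
#define ECC_RAW_BYTES  9u /* 8 data bytes + 1 ECC byte as stored on flash */

/* Hypothetical description of the region an unaligned write must touch. */
struct ecc_span {
	uint64_t data_start; /* data offset rounded down to an 8-byte word */
	uint64_t data_len;   /* data length rounded up to whole words */
	uint64_t raw_start;  /* matching offset in the raw (data + ECC) layout */
	uint64_t raw_len;    /* matching length in the raw layout */
};

/* data_off is relative to the start of the partition, not to the buffer. */
static struct ecc_span ecc_span_for_write(uint64_t data_off, uint64_t len)
{
	struct ecc_span s;
	uint64_t end;

	assert(len > 0);
	s.data_start = data_off - (data_off % ECC_DATA_BYTES);
	end = data_off + len;
	end = (end + ECC_DATA_BYTES - 1) / ECC_DATA_BYTES * ECC_DATA_BYTES;
	s.data_len = end - s.data_start;
	s.raw_start = s.data_start / ECC_DATA_BYTES * ECC_RAW_BYTES;
	s.raw_len = s.data_len / ECC_DATA_BYTES * ECC_RAW_BYTES;
	return s;
}
```

For example, a 6-byte write at partition offset 5 straddles two ECC words, so both words (16 data bytes, 18 raw bytes) have to be read, patched, and re-encoded.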
2019-10-23 | blocklevel: smart_write: Tidy local variable declarations | Andrew Jeffery | 1 | -4/+7
[ Upstream commit a950fd789c1ce0fbdc4f5486ccdd8301d6258ba7 ] Group them by use (and name). It's not reverse christmas tree, but it's a bit easier on the eye. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2019-10-23 | blocklevel: smart_write: Avoid reuse of formal parameters | Andrew Jeffery | 1 | -6/+13
[ Upstream commit aa52f9439b91db5949c04acb8e1d2d21c37ddba5 ] Lays the ground-work for fixing unaligned writes to ECC protected partitions. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2019-10-23 | blocklevel: smart_write: Deny writes intersecting ECC protected regions | Andrew Jeffery | 1 | -1/+9
[ Upstream commit bdbbfcacccdb768db587ef73a2a28cf3dc8ceda9 ] Other code paths don't handle writes spanning mixed regions, and it's a headache, so deny it here too. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2019-10-23 | blocklevel: smart_write: Avoid indirectly testing formal parameters | Andrew Jeffery | 1 | -1/+1
[ Upstream commit 6867bd54c21b023b74e924abd6f4c3f1cd9959c2 ] The early-exit tests write_buf, but write_buf is assigned to buf on declaration. Test buf directly instead to avoid unnecessary indirection. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2019-10-23 | blocklevel: smart_write: Rename size variable for clarity | Andrew Jeffery | 1 | -8/+9
[ Upstream commit 518db2b216af5178a18f8b7ad06769b6acc51bcc ] We're writing in chunks, so let's make it clear that size is relative to the chunk that we're writing. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2019-10-23 | blocklevel: smart_write: Rename write buffer | Andrew Jeffery | 1 | -7/+7
[ Upstream commit 5c935e78f335597f502e1fda1911ca50bbeed561 ] The buffer is only used for ECC protected partitions, so let's call it ecc_buf for clarity. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2019-10-23 | blocklevel: smart_write: Terminate line for debug output in no-change case | Andrew Jeffery | 1 | -0/+2
[ Upstream commit 8f204c12ebc07111c99d2059e1df0b86d7a203ae ] Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2019-10-23 | gard: Fix data corruption when clearing single records | Andrew Jeffery | 4 | -1/+31
[ Upstream commit e08fee36a5196ca094c5fb7dd1421279b146531f ]

Attempting to clear a specific gard record leads to corruption of the target record rather than the expected removal:

    $ ./opal-gard -f romulus.pnor list
    No GARD entries to display
    $ ./opal-gard -f romulus.pnor create /sys0/node0/proc1
    $ ./opal-gard -f romulus.pnor list
    ID | Error | Type | Path
    ---------------------------------------------------------
    00000001 | 00000000 | Manual | /Sys0/Node0/Proc1
    =========================================================
    $ ./opal-gard -f romulus.pnor clear 00000001
    Clearing gard record 0x00000001...done
    $ ./opal-gard -f romulus.pnor list
    ID | Error | Type | Path
    ---------------------------------------------------------
    00000001 | 00000000 | Unknown | /Sys0/Node0/Proc1
    =========================================================

The GUARD partition needs to be compacted when clearing records as the end of the list is a sentinel represented by the erased-flash state. The compaction strategy is to read the trailing records and write them to the offset of the record to be removed, followed by writing the sentinel record at the offset of what was previously the last valid record.

The corruption occurs due to incorrect calculation of the offset at which the trailing records will be written.

Cc: Skiboot Stable <skiboot-stable@lists.ozlabs.org>
Fixes: 5616c42d900a ("libflash/blocklevel: Make read/write be ECC agnostic for callers")
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
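The compaction strategy described above can be sketched on an in-memory buffer. This is an illustration under simplifying assumptions, not the opal-gard implementation: records have a made-up fixed size (`REC_SIZE`), the partition is a plain byte array, and the erased-flash sentinel is modelled as a record filled with 0xff. The function name `clear_record()` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REC_SIZE 4u /* hypothetical record size, for illustration only */

/*
 * Remove record 'idx' from a list of 'nrecs' valid records:
 *  1. move the trailing records down to the offset of the removed record,
 *  2. write the sentinel (erased state, 0xff) where the last valid
 *     record used to be.
 */
static void clear_record(uint8_t *part, size_t nrecs, size_t idx)
{
	size_t trailing;

	assert(part && nrecs > 0 && idx < nrecs);
	trailing = (nrecs - 1 - idx) * REC_SIZE;
	memmove(part + idx * REC_SIZE,
		part + (idx + 1) * REC_SIZE, trailing);
	memset(part + (nrecs - 1) * REC_SIZE, 0xff, REC_SIZE);
}
```

The destination of the move is the offset of the record being removed; per the commit message, getting that offset wrong is what corrupted the target record.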
2019-10-23 | core/platform: Actually disable fast-reboot on P8 | Oliver O'Halloran | 1 | -3/+5
[ Upstream commit 923b5a5342a7a37bd376327e35c7fcb98138d41c ] There was an attempt. It was not successful. Fixes: 14f709b8eeda ("Disable fast-reset for POWER8") Cc: skiboot-stable@lists.ozlabs.org Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2019-10-23 | xive: fix return value of opal_xive_allocate_irq() | Cédric Le Goater | 1 | -1/+1
[ Upstream commit e97391ae2bb5a146a5041453f9185326654264d9 ] When the maximum number of interrupts per chip is reached, xive_try_allocate_irq() returns an internal XIVE error: XIVE_ALLOC_NO_SPACE. But its value 0xffffffff is interpreted as a positive value by its caller opal_xive_allocate_irq() and not as an error. opal_xive_allocate_irq() returns this value to Linux, which also considers 0xffffffff a valid interrupt number and tries to get the interrupt characteristics using opal_xive_get_irq_info(). This OPAL call finally fails, leading to all sorts of errors on the host, which is not prepared for such a scenario. The impacted code is the IPI setup and both XIVE KVM devices. Fix by returning OPAL_RESOURCE from xive_try_allocate_irq(), which is consistent with the other errors returned by this routine. This fixes the behavior in opal_xive_allocate_irq() and in Linux. A workaround could be introduced in Linux to consider 0xffffffff as an OPAL_RESOURCE value. This assumption is valid with the current XIVE IRQ number encoding. Fixes: 07946e68f47a ("xive: Add interrupt allocator") Reported-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> [oliver: Added fixes tag] Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
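The sign problem is easy to reproduce in isolation. The sketch below is a toy model rather than the skiboot allocator: `try_allocate_irq_old()`, `try_allocate_irq_new()`, and `MAX_IRQS` are hypothetical, and the numeric value used for `OPAL_RESOURCE` is an assumption that should be checked against opal-api.h. It shows that a 32-bit all-ones sentinel widened to a signed 64-bit return value is positive, so a caller testing for a negative error code lets it through.

```c
#include <assert.h>
#include <stdint.h>

#define XIVE_ALLOC_NO_SPACE 0xffffffffu /* sentinel named in the commit message */
#define OPAL_RESOURCE       (-10)       /* assumed value; verify in opal-api.h */
#define MAX_IRQS            2u          /* hypothetical per-chip limit */

/* Old behaviour: the unsigned sentinel widens to +4294967295. */
static int64_t try_allocate_irq_old(uint32_t *next)
{
	assert(next && *next <= MAX_IRQS);
	if (*next == MAX_IRQS)
		return XIVE_ALLOC_NO_SPACE;
	return (*next)++;
}

/* Fixed behaviour: a negative OPAL error code that callers can detect. */
static int64_t try_allocate_irq_new(uint32_t *next)
{
	assert(next && *next <= MAX_IRQS);
	if (*next == MAX_IRQS)
		return OPAL_RESOURCE;
	return (*next)++;
}
```

With the allocator exhausted, the old variant returns 4294967295, which passes an `rc < 0` error check; the new variant returns a negative code.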
2019-10-23 | MPIPL: struct opal_mpipl_fadump doesn't needs to be packed | Vasant Hegde | 1 | -1/+1
[ Upstream commit cc02885770f63d00d2483be4e1627d2cadfffa8a ]

    [CC] core/opal-dump.o
    core/opal-dump.c: In function ‘post_mpipl_get_opal_data’:
    core/opal-dump.c:471:11: warning: taking address of packed member of ‘struct opal_mpipl_fadump’ may result in an unaligned pointer value [-Waddress-of-packed-member]
      471 | region = opal_mpipl_data->region;
          | ^~~~~~~~~~~~~~~

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2019-10-23 | core/flash: Validate secure boot content size | Oliver O'Halloran | 1 | -0/+11
[ Upstream commit e2018d2a3d46491dc2abd758c67c1937910b3a67 ]

Currently we don't check if the secure boot payload size fits within the partition that we are reading it from. This results in strange failures later on in boot if we cross the boundary between an ECCed and a non-ECCed partition since libflash does not support reading from regions with mixed ECC status.

Without this patch:

    blocklevel_read: Can't cope with partial ecc
    FLASH: failed to read content size 15728640 BOOTKERNEL partition, rc 3

With:

    FLASH: Cannot load BOOTKERNEL. Content is larger than the partition

Cc: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Acked-by: Stewart Smith <stewart@flamingspork.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
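A bounds check of this kind is short, but it is worth writing so that it cannot overflow. The following is a sketch under assumptions, not the code added to core/flash: the helper name `content_fits()` and the idea that the payload starts at some `offset` inside the partition are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if 'content_size' bytes starting at 'offset' lie entirely
 * inside a partition of 'part_size' bytes. Written as a subtraction so
 * that a huge content_size cannot wrap around.
 */
static bool content_fits(uint64_t part_size, uint64_t offset,
			 uint64_t content_size)
{
	assert(offset <= part_size); /* payload start is inside the partition */
	return content_size <= part_size - offset;
}
```

A caller would refuse to load the payload when this returns false, which corresponds to the "Content is larger than the partition" message quoted above.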
2019-08-16 | skiboot 6.5 release notes (tag: v6.5) | Oliver O'Halloran | 1 | -0/+20
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-16 | ipmi: Use standard MIN() macro definition | Jordan Niethe | 1 | -3/+1
There is a MIN() macro definition in skiboot.h. Remove the redundant definition from here and use that one. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Acked-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-16 | hw/phb4: Use standard MIN/MAX macro definitions | Jordan Niethe | 1 | -6/+3
The max() macro definition incorrectly returns the minimum value. The max() macro is used to ensure that PERST has been asserted for 250ms and that we wait 100ms for the ETU logic in the CRESET_START PHB4 PCI slot state. However, by returning the minimum value there is no guarantee that either of these requirements is met. Correct macro definitions for MIN and MAX are already provided in skiboot.h. Remove the redundant/incorrect versions here and switch to using the standard ones. Fixes: 70edcbb4b39d ("hw/phb4: Skip FRESET PERST when coming from CRESET") Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
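The effect of an inverted comparison in max() can be shown with a few lines. These are plain illustrative macros, not the definitions from skiboot.h, and `reset_wait_ms()` with its 100 ms constant is a hypothetical stand-in for how the two delays might be combined.

```c
#include <assert.h>

#define MIN(a, b)        ((a) < (b) ? (a) : (b))
#define MAX(a, b)        ((a) > (b) ? (a) : (b))
/* The kind of mistake described above: named max, behaves like min. */
#define BROKEN_MAX(a, b) ((a) < (b) ? (a) : (b))

#define ETU_MIN_WAIT_MS 100u /* minimum ETU wait from the commit message */

/*
 * Hypothetical helper: wait long enough to cover both the remaining
 * PERST assertion time and the minimum ETU wait.
 */
static unsigned int reset_wait_ms(unsigned int perst_left_ms)
{
	unsigned int wait = MAX(perst_left_ms, ETU_MIN_WAIT_MS);

	/* Both requirements hold only if we really took the larger value. */
	assert(wait >= perst_left_ms && wait >= ETU_MIN_WAIT_MS);
	return wait;
}
```

With 250 ms of PERST time remaining, `MAX` yields 250 while `BROKEN_MAX` yields 100, so the broken macro would cut the PERST assertion short.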
2019-08-16 | hw/phb4: Prevent register accesses when in reset | Oliver O'Halloran | 2 | -0/+11
While the ETU is in reset we cannot access any of the PHB registers. If a PHB register is accessed via the XSCOM indirect interface then we'll cause an ETU reset error which may prevent the PHB from being re-initialised once the reset is lifted. Prevent register accesses while in reset by adding a flag that is set while the ETU reset bit is high and checking that flag in the XSCOM (ASB) backdoor register access path. Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
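The guard-flag pattern described above can be sketched with a toy device model. Nothing here is taken from hw/phb4.c: `struct phb_model`, `set_reset()`, `read_reg()`, and the `ERR_IN_RESET` code are hypothetical, and what the real access path returns while in reset is not stated in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NREGS        4u
#define ERR_IN_RESET (-1) /* hypothetical error code */

/* Toy model of a PHB with a software flag mirroring the ETU reset bit. */
struct phb_model {
	bool in_reset;
	uint64_t regs[NREGS];
};

/* Called wherever the ETU reset bit is raised or lowered. */
static void set_reset(struct phb_model *p, bool asserted)
{
	p->in_reset = asserted;
}

/* Backdoor read path: refuse to touch the hardware while in reset. */
static int read_reg(const struct phb_model *p, unsigned int idx, uint64_t *val)
{
	assert(p && val && idx < NREGS);
	if (p->in_reset)
		return ERR_IN_RESET;
	*val = p->regs[idx];
	return 0;
}
```

The design point is that the check lives in the access path itself, so every caller is covered without having to know about the reset state.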
2019-08-16 | npu: Fix device binding error message | Reza Arbab | 1 | -2/+6
Helping someone troubleshoot a Garrison machine, I noticed some of the BDFs printed here are wrong: npu_dev_bind_pci_dev: No PCI device for NPU device 0004:00:00.0 to bind to. If you expect a GPU to be there, this is a problem. npu_dev_bind_pci_dev: No PCI device for NPU device 0004:00:01.0 to bind to. If you expect a GPU to be there, this is a problem. npu_dev_bind_pci_dev: No PCI device for NPU device 0004:00:04.0 to bind to. If you expect a GPU to be there, this is a problem. npu_dev_bind_pci_dev: No PCI device for NPU device 0004:00:05.0 to bind to. If you expect a GPU to be there, this is a problem. Change the prlog() call to print them correctly. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-16 | npu3: Expose remaining ATSD launch registers | Reza Arbab | 2 | -9/+13
List all 16 ATSD registers in the device tree, not just the first 8. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-16 | npu3: Initialize NPU3_SNP_MISC_CFG0 | Reza Arbab | 2 | -0/+11
Enable powerbus snooping here, or else MMIO to any NTL/NDL registers will cause a checkstop. This was not an issue in Simics simulation, but was discovered rather quickly during bringup on a real Axone chip. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-16 | npu3: Rename NPU3_SM_MISC_CFGn register macros | Reza Arbab | 2 | -12/+12
The SM blocks have multiple MISC_CFG registers. For example, there are both CS.SM0.MCP.MISC.CONFIG0 and CS.SM0.SNP.MISC.CONFIG0. Rename our macro for the former to more clearly reflect this and avoid a clash when the latter is added. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-16 | pci: Use a macro for accessing PCI BDF Function Number | Jordan Niethe | 8 | -17/+18
Currently, when the Function Number bits of a BDF are needed, the bit operations to extract them are open-coded. There are many places where the Function Number is used, so add a macro to use instead of open-coding it every time. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-16 | pci: Use a macro for accessing PCI BDF Device Number | Jordan Niethe | 9 | -15/+16
Currently, when the Device Number bits of a BDF are needed, the bit operations to extract them are open-coded. There are many places where the Device Number is used, so add a macro to use instead of open-coding it every time. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-16 | pci: Use a macro for accessing PCI BDF Bus Number | Jordan Niethe | 9 | -18/+21
Currently, when the Bus Number bits of a BDF are needed, the bit operations to extract them are open-coded. There are many places where the Bus Number is used, so add a macro to use instead of open-coding it every time. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
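For readers unfamiliar with the encoding, the three BDF accessor commits replace shifts and masks like the ones below. The field layout (bus in bits 15:8, device in bits 7:3, function in bits 2:0) is the standard PCI one; the macro and helper names (`BDF_BUS`, `BDF_DEV`, `BDF_FUNC`, `bdf_make`) are hypothetical and are not claimed to match the names added to skiboot's pci.h.

```c
#include <assert.h>
#include <stdint.h>

/* Standard 16-bit PCI BDF layout: bus[15:8] device[7:3] function[2:0]. */
#define BDF_BUS(bdf)  (((bdf) >> 8) & 0xff)
#define BDF_DEV(bdf)  (((bdf) >> 3) & 0x1f)
#define BDF_FUNC(bdf) ((bdf) & 0x7)

/* Hypothetical helper that packs the three fields back into a BDF. */
static uint16_t bdf_make(unsigned int bus, unsigned int dev, unsigned int fn)
{
	assert(bus < 256 && dev < 32 && fn < 8);
	return (uint16_t)((bus << 8) | (dev << 3) | fn);
}
```

Using named accessors instead of repeating `(bdf >> 3) & 0x1f` at every call site removes a class of copy-and-paste mistakes in masks and shift counts.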
2019-08-16 | core/pci-dt-slots: Remove duplicate PCIDBG() definition | Jordan Niethe | 1 | -6/+0
PCIDBG() is already defined in pci.h, which is included by pci-dt-slot.c. It should not be defined again so remove this definition. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-16 | include/xscom: Use the name EQ rather than EP | Oliver O'Halloran | 2 | -11/+15
The P9 pervasive spec uses the term "EP" to refer to the combination of an EQ chiplet and its two child EX chiplets. Nothing else seems to use the term EP and in Skiboot all the uses of the XSCOM_ADDR_P9_EP() macro are to translate the address of EQ specific SCOM registers. Change the name of our address calculation macros to match the general terminology to make what it does clearer. Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-16 | include/xscom: Remove duplicate p9 definitions | Oliver O'Halloran | 1 | -5/+0
These are already defined in xscom-p9-regs.h Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-16 | include/xscom: Remove duplicate p8 definitions | Oliver O'Halloran | 1 | -40/+0
Duplicates of what's already in xscom-p8-regs.h Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-15 | MPIPL: Add documentation | Vasant Hegde | 6 | -0/+250
Document MPIPL device tree and OPAL APIs. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> [oliver: rebased] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-15 | MPIPL: Prepare architected registers data tag | Vasant Hegde | 1 | -0/+37
The post-MPIPL kernel needs the saved CPU register details to create vmcore/opalcore. This patch prepares the CPU register data tag and adds it to the tags list. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [oliver: rebased] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-15 | MPIPL: Reserve memory to capture architected registers data | Vasant Hegde | 4 | -1/+53
- Split SPIRAH memory to accommodate the architected register ntuple. Today we have 1K of memory for SPIRAH and it uses 288 bytes. Let's split this into two parts:
    SPIRAH (756 bytes)
    architected register memory (256 bytes)
- Update the SPIRAH architected register ntuple.
- Calculate the memory required to capture architected registers data. Ideally we should use HDAT-provided data (proc_dump_area->thread_size), but we are not getting this data during boot. Hence let's reserve fixed memory for architected registers data collection.
- Add the architected registers destination memory to the reserve-memory DT node.

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
[oliver: rebased]
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-15 | MPIPL: Clear tags and metadata | Vasant Hegde | 1 | -0/+6
After processing the dump, the kernel sends the OPAL_MPIPL_FREE_PRESERVED_MEMORY notification to OPAL. OPAL will clear the metadata section and tags. A subsequent opal_mpipl_query_tag() call will return OPAL_EMPTY. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [oliver: rebased] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-15 | MPIPL: Add OPAL API to query saved tags | Vasant Hegde | 2 | -2/+43
The pre-MPIPL kernel saves various information required to create the vmcore in a metadata area and passes the metadata area pointer to OPAL. OPAL will preserve this pointer across MPIPL. The post-MPIPL kernel will request the saved tags via this API. The kernel also needs the tags below:
  - Saved CPU registers data to access CPU registers
  - OPAL metadata area to create opalcore

Format:
  opal_mpipl_query_tag(enum opal_mpipl_tags tag, uint64_t *tag_val)

  tag :
    OPAL_MPIPL_TAG_CPU
      Pointer to CPU register data content metadata area
    OPAL_MPIPL_TAG_OPAL
      Pointer to OPAL metadata area
    OPAL_MPIPL_TAG_KERNEL
      During first boot, the kernel will set up its metadata area and ask OPAL to preserve the metadata area pointer across MPIPL. The post-MPIPL kernel calls this API to get the metadata pointer and uses that pointer to retrieve the metadata and create the dump.
    OPAL_MPIPL_TAG_BOOT_MEM
      During MPIPL registration the kernel will specify how much memory firmware can use for the post-MPIPL load. The post-MPIPL petitboot kernel will query for this tag to get the boot memory size.

Return values:
  OPAL_SUCCESS   : Operation success
  OPAL_PARAMETER : Invalid parameter

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
[oliver: rebased]
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-15 | MPIPL: Prepare OPAL data tag | Vasant Hegde | 2 | -0/+81
Post MPIPL kernel needs OPAL metadata to create opalcore. This patch sets up OPAL metadata tag. Next patch will add API to pass metadata pointer to kernel. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [oliver: rebased] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-15 | hdata: Add "mpipl-boot" property to "dump" node | Vasant Hegde | 2 | -0/+47
During MPIPL boot, hostboot updates HDAT to indicate that it's an MPIPL boot. Let's add an "mpipl-boot" property to the device tree so that the kernel can detect that it's an MPIPL boot and create a dump. Device tree property: /ibm,opal/dump/mpipl-boot - Indicates to the kernel that it's an MPIPL boot Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [oliver: rebased] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-15 | platform: Introduce new reboot type | Vasant Hegde | 3 | -0/+14
Enhance reboot2 call to support MPIPL. Payload will call this interface to initiate MPIPL. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [oliver: rebased] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-15 | HIOMAP: Reset bmc mbox in MPIPL path | Vasant Hegde | 10 | -4/+251
During boot, SBE and early hostboot do not use the HIOMAP protocol to get the image from PNOR. Instead they expect the PNOR TOC and Hostboot Boot Loader (HBBL) to be available at a particular address on the LPC bus. The mbox daemon on the BMC side takes care of this during normal boot. Once boot is complete, the mbox daemon switches to normal mode. During a normal reboot, the BMC-side mbox daemon gets a notification and takes care of loading the PNOR TOC and HBBL to the LPC bus again. In the MPIPL path, OPAL calls the SBE S0 interrupt to initiate MPIPL. The BMC will not be aware of this, but SBE expects the PNOR TOC and HBBL to be available on the LPC bus at the predefined address. Hence call HIOMAP Reset from OPAL in the assert path. This needs working LPC and IPMI drivers in OPAL. If we have an issue in these drivers then we may not be able to reset the BMC MBOX properly, and MPIPL may fail. We have to live with this until we find a way to initiate BMC on MPIPL. CC: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Reviewed-by: Andrew Jeffery <andrew@aj.id.au> [oliver: rebased] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-15 | MPIPL: Save crashing PIR | Vasant Hegde | 3 | -0/+14
Crashing CPU PIR is required to get proper backtrace from core file. Save crashing CPU PIR before triggering MPIPL. Post MPIPL OPAL will pass saved PIR to kernel and kernel will use that to create OPAL dump. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [oliver: rebased] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-15 | MPIPL: Add support to trigger MPIPL on BMC system | Vasant Hegde | 3 | -3/+84
On FSP based systems we call the 'attn' instruction. FSP detects the attention and initiates a memory preserving IPL. On BMC systems we have to call the SBE S0 interrupt to initiate a memory preserving IPL. This patch adds support to call the SBE S0 interrupt in the assert path.

Sequence:
  - S0 interrupt on secondary chip SBE
  - S0 interrupt on primary chip SBE

Note that this is hooked to the ipmi_terminate path. We have an HDAT flag for MPIPL support. If MPIPL is not supported then we don't create the 'ibm,opal/dump' node and we will fall back to the existing termination flow.

Finally, we want to log an error log to the BMC before triggering MPIPL. Hence this patch re-organizes ipmi_terminate() such that we call ipmi_log_terminate_event() before triggering MPIPL.

Note:
  - At present we do not have a proper way to detect whether SBE is alive or not. So we wait for a predefined time and then call normal reboot.

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
[oliver: rebased]
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-15 | SBE: Send OPAL relocated base address to SBE | Vasant Hegde | 3 | -0/+58
OPAL relocates itself during boot. During memory preserving IPL hostboot needs to access relocated OPAL base address to get MDST, MDDT tables. Hence send relocated base address to SBE via 'stash MPIPL config' chip-op. During next IPL SBE will send stashed data to hostboot... so that hostboot can access these data. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [oliver: rebased] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-15 | MPIPL: Add OPAL API to register tags | Vasant Hegde | 2 | -1/+41
This patch adds a new API to register tags.

  opal_mpipl_register_tag(enum opal_mpipl_tags tag, uint64_t tag_val)

  tag:
    OPAL_MPIPL_TAG_KERNEL
      During first boot, the kernel will set up its metadata area and ask OPAL to preserve the metadata area pointer across MPIPL. The post-MPIPL kernel requests OPAL to provide the metadata pointer and uses that pointer to retrieve the metadata and create the dump.
    OPAL_MPIPL_TAG_BOOT_MEM
      During MPIPL registration the kernel will specify how much memory firmware can use for the post-MPIPL load. The post-MPIPL petitboot kernel will query for this tag to get the boot memory size.

Return values:
  OPAL_SUCCESS   : Operation success
  OPAL_PARAMETER : Payload passed invalid tag

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
[oliver: rebased]
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-15 | MPIPL: Add OPAL API to register for dump region | Vasant Hegde | 3 | -1/+192
This patch adds a new API to register for a dump region.

  u64 opal_mpipl_update(u8 ops, u64 src, u64 dest, u64 size)

  ops :
    OPAL_MPIPL_ADD_RANGE
      Add a new entry to the MPIPL table. The kernel will send src, dest and size. During MPIPL, content from the source address is moved to the destination address.
        src  = Source start address
        dest = Destination start address
        size = size
    OPAL_MPIPL_REMOVE_RANGE
      Remove the kernel-requested entry from the MPIPL table.
        src  = Source start address
        dest = Destination start address
        size = ignore
    OPAL_MPIPL_REMOVE_ALL
      Remove all kernel-passed entries from the MPIPL table.
        src  = ignore
        dest = ignore
        size = ignore
    OPAL_MPIPL_FREE_PRESERVED_MEMORY
      Post MPIPL, the kernel will indicate to OPAL that it has processed the dump and that OPAL can clear/release the metadata area.
        src  = ignore
        dest = ignore
        size = ignore

Return values:
  OPAL_SUCCESS   : Operation success
  OPAL_PARAMETER : Payload passed invalid data
  OPAL_RESOURCE  : Ran out of MDST or MDDT table size
  OPAL_HARDWARE  : MPIPL not supported

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
[oliver: rebased]
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
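To illustrate the add/remove semantics listed above, here is a toy in-memory model of a range table. It is a sketch, not the OPAL implementation: the table size, the numeric values of the operation enum and return codes (believed to match opal-api.h for the return codes, but unverified), and the rule that a zero-sized range is rejected are all assumptions. `OPAL_MPIPL_FREE_PRESERVED_MEMORY` and `OPAL_HARDWARE` are left out because they depend on firmware state that the model does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed numeric values; verify against opal-api.h before relying on them. */
#define OPAL_SUCCESS   0
#define OPAL_PARAMETER (-1)
#define OPAL_RESOURCE  (-10)

/* Operation names from the commit message; enum values are placeholders. */
enum mpipl_op {
	OPAL_MPIPL_ADD_RANGE,
	OPAL_MPIPL_REMOVE_RANGE,
	OPAL_MPIPL_REMOVE_ALL,
};

#define MAX_RANGES 4 /* hypothetical table capacity */

struct range { uint64_t src, dest, size; };

static struct range table[MAX_RANGES];
static size_t nranges;

static int mpipl_update(enum mpipl_op op, uint64_t src, uint64_t dest,
			uint64_t size)
{
	size_t i;

	assert(nranges <= MAX_RANGES);
	switch (op) {
	case OPAL_MPIPL_ADD_RANGE:
		if (size == 0)             /* assumed validation rule */
			return OPAL_PARAMETER;
		if (nranges == MAX_RANGES) /* table full */
			return OPAL_RESOURCE;
		table[nranges++] = (struct range){ src, dest, size };
		return OPAL_SUCCESS;
	case OPAL_MPIPL_REMOVE_RANGE:
		/* size is ignored; match on src and dest only */
		for (i = 0; i < nranges; i++) {
			if (table[i].src == src && table[i].dest == dest) {
				table[i] = table[--nranges];
				return OPAL_SUCCESS;
			}
		}
		return OPAL_PARAMETER;
	case OPAL_MPIPL_REMOVE_ALL:
		nranges = 0;
		return OPAL_SUCCESS;
	}
	return OPAL_PARAMETER;
}
```

Removal here swaps the last entry into the freed slot, which keeps the table dense but does not preserve order; whether the real MDST/MDDT tables care about ordering is not stated in the commit message.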
2019-08-15 | MPIPL: Define OPAL metadata area | Vasant Hegde | 3 | -1/+30
We want to save some information (like crashing CPU PIR, kernel tags, etc) before triggering MPIPL. Post MPIPL we will use this information to retrieve dump metadata and create dump. MDRT table doesn't need 64K. Hence split MDRT table to accommodate metadata area. Finally define metadata structure. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [oliver: rebased] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-15 | MPIPL: Register for OPAL dump | Vasant Hegde | 4 | -1/+153
This patch adds support to register for OPAL dump. - Calculate memory required to capture OPAL dump - Reserve OPAL dump destination memory - Add OPAL dump details to MDST and MDDT table Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [oliver: rebased] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-15 | hdata: Create /ibm, opal/dump device tree node | Vasant Hegde | 2 | -0/+22
We use the MPIPL system parameter to detect whether MPIPL is supported or not. If it's supported, create a new device tree node (/ibm,opal/dump) to pass all dump related information to the kernel. This patch creates the new node and populates the properties below:
  - compatible - dump version (ibm,opal-dump)
  - fw-load-area - Memory used by OPAL to load kernel/initrd from PNOR (KERNEL_LOAD_BASE & INITRAMFS_LOAD_BASE). This is the temporary memory used by OPAL during boot. Later the Linux kernel is free to use this memory. During MPIPL boot OPAL will also overwrite this memory. OPAL will advertise these memory details to the kernel. If the kernel is using this memory and needs its content for proper dump creation, then it has to reserve destination memory to preserve these memory ranges. The kernel should also pass this detail during registration. During MPIPL, firmware will take care of preserving the memory, and the post-MPIPL kernel can create a proper dump.

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
[oliver: rebased]
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-15 | hdata: Adjust various structure offset after relocation | Vasant Hegde | 1 | -0/+23
ntuple addresses in SPIRAH are relative to the payload base. Update various addresses after relocation so that hostboot can access the new address to capture the dump. Note that we update the relocated SPIRAH. So in future, if we add early OPAL crash support, hostboot can still collect the dump using the original skiboot base. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [oliver: rebased] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-15 | hdata: Update spirah structure | Vasant Hegde | 1 | -3/+24
Update MDST, MDDT and MDRT ntuple inside SPIRAH. During MPIPL hostboot will use these details to preserve memory. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [oliver: rebased] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-08-15 | mem-map: Setup memory for MDRT table | Vasant Hegde | 1 | -1/+7
Hostboot fills MDRT table after moving memory content from source to destination memory. And OPAL relies on this table to extract the dump. We have to make sure this table is intact. Hence define memory relative to SKIBOOT_BASE so that our relocation doesn't overwrite this memory. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [oliver: rebased] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>