riscv-gnu-toolchain/qemu/roms/skiboot.git

Age	Commit message	Author	Files	Lines
2019-11-04	xive/p9: obsolete OPAL_XIVE_IRQ_SHIFT_BUG flags	Cédric Le Goater	4	-9/+3
	These were needed to work around HW bugs in PHB4 LSIs of POWER9 DD1.0 processors. HW395455 P9/PHB4: Wrong Interrupt ESB CI Load Opcode Location in 64K page mode Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-11-04	xive/p9: obsolete OPAL_XIVE_IRQ_*_VIA_FW flags	Cédric Le Goater	2	-14/+2
	These were needed to work around HW bugs in PHB4 LSIs of POWER9 DD1.0 processors. Keep the flags in case a similar issue shows up in the next generation of the XIVE logic, and also for Linux, which still has handlers in its XIVE layer. However, there is no need to keep the code in the POWER9 XIVE driver. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-11-04	xive/p9: remove dead code	Cédric Le Goater	1	-4/+0
	Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-11-04	xive/p9: remove code not using block group mode	Cédric Le Goater	1	-208/+1
	Block group mode is now required; it cannot be disabled. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-11-04	xive/p9: remove code not using indirect mode	Cédric Le Goater	1	-111/+12
	An indirect table is a one-page array of XIVE VSDs pointing to subpages containing XIVE virtual structures: NVTs, ENDs, EASs or ESBs. The OPAL XIVE driver uses direct tables for the EAS and ESB tables. The number of interrupts being 1M, the tables are respectively 256K and 8M per chip. We want the EAS lookup to be fast so we keep this table direct. The NVT and END are bigger structures (64 and 32 bytes). If the tables were direct, we would use 32M and 32M of OPAL memory per chip. They are indirect today and Linux allocates the pages on behalf of OPAL when a new indirect subpage is needed. We plan to increase the NVT space and END space in P10. Remove the USE_INDIRECT ifdef and the associated code that is no longer used. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
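The table sizes quoted in this message follow from simple arithmetic. The sketch below is not skiboot code: the entry widths for ESB (2 bits) and EAS (8 bytes) and the entry counts for ENDs (1M) and NVTs (512K) are assumptions chosen to reproduce the quoted figures, and `direct_table_bytes()` is a hypothetical helper. Under these assumptions the 256K figure corresponds to the 2-bit ESB entries and the 8M figure to the 8-byte EAS entries.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizing inputs; only the 1M interrupt count and the 64/32 byte
 * NVT/END sizes are stated in the commit message. */
#define NR_IRQS    (1u << 20)  /* "1M" interrupts per chip (from the message) */
#define NR_ENDS    (1u << 20)  /* assumption */
#define NR_NVTS    (1u << 19)  /* assumption */
#define ESB_BITS   2u          /* assumption: 2 state bits per interrupt */
#define EAS_BITS   64u         /* assumption: 8-byte entries */
#define END_BITS   (32u * 8u)  /* 32-byte structure (from the message) */
#define NVT_BITS   (64u * 8u)  /* 64-byte structure (from the message) */

/* Size in bytes of a direct (flat) table holding `entries` entries of
 * `entry_bits` bits each, rounded up to a whole byte. */
static uint64_t direct_table_bytes(uint64_t entries, uint64_t entry_bits)
{
    assert(entry_bits > 0);
    return (entries * entry_bits + 7) / 8;
}
```

With these inputs, `direct_table_bytes(NR_IRQS, ESB_BITS)` is 256K, `direct_table_bytes(NR_IRQS, EAS_BITS)` is 8M, and both the END and NVT tables would take 32M if they were direct, which is the cost the indirect layout avoids.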
2019-11-04	xive/p9: use MMIO access for VC_EQC_CONFIG	Cédric Le Goater	1	-1/+1
	There is no reason to issue loads on XSCOM when syncing the interrupt controller. All should be in place to use MMIOs. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-11-04	xive/p9: minor cleanup of the interface	Cédric Le Goater	6	-8/+8
	The XIVE driver exposes an API to the core OPAL layer and to other OPAL drivers. This is a minor cleanup preparing ground for future XIVE logic. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-11-04	xive/p9: introduce header files for the registers	Cédric Le Goater	4	-459/+490
	This moves the definitions of the P9 XIVE interrupt controller registers and of the P9 XIVE internal structures into a specific header file, and moves the definitions related to the thread interrupt context area to a common file. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-11-04	npu3: Make SALT CMD_REG writable	Reza Arbab	1	-3/+4
	CMD_REG should be writable, not read-only. Fix this, initializing it with a default "unset" value (0xffffffff). Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-11-04	npu3: Improve SALT log output	Reza Arbab	1	-3/+5
	Add a log line for when the PPE indicates it's not in the ready state, and make all the SALT lines start with a capital to look nicer. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-11-04	npu3: Add ibm,ioda2-npu3-phb to compatible property	Reza Arbab	1	-2/+7
	Though they are currently identical to the OS, it may become necessary to distinguish npu3 phbs from npu2 ones at some point. Add a unique string to the compatible property. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-11-04	platforms/swift: Remove spurious error message	Reza Arbab	1	-5/+0
	npu3_chip_possible_gpus() works by dividing the number of NVLink-mode bricks by the number of bricks connecting a single GPU. In a system with no GPUs, the latter value is unknown, so the function returns zero and we trip a somewhat misleading error message. The code afterward is safe to execute in any case, so there's no need to return either. Remove the check entirely. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-24	skiboot v6.5.1 release notes	Vasant Hegde	1	-0/+27
	Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-24	skiboot v6.3.4 release notes	Vasant Hegde	1	-0/+29
	Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-22	gard: Add support to run gard tests on FSP platform	Vasant Hegde	2	-6/+11
	The gard tool is not supported on FSP-based systems, but we can still run the gard tests on them. Acked-by: Stewart Smith <stewart@flamingspork.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-22	core/ipmi: Remove redundant variable	Vasant Hegde	1	-7/+3
	The previous commit d75e82dbf introduced an unnecessary variable/check. Remove it and add a barrier after setting sync_msg to NULL. Cc: Oliver O'Halloran <oohall@gmail.com> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Fixes: d75e82dbf (core/ipmi: Fix use-after-free) Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-22	chip: enable HOMER/OCC common area region in Qemu emulated PowerNV host	Balamuruhan S	2	-2/+3
	Recent work on Qemu adds support to emulate homer memory region and occ common area region with respective device models, so remove `QUIRK_NO_PBA` to enable HOMER/OCC common area region for Qemu emulated PowerNV host. Introduce `QUIRK_QEMU` in enum proc_chip_quirks that will be used for future work. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Balamuruhan S <bala24@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-22	occ-sensor: clean dt properties if sensor is not available	Balamuruhan S	1	-0/+4
	In `occ_sensor_init()` a device tree node is created for sensor-groups, and the `occ_sensor_sanity()` check is run before the device tree is initialized. If there are no sensors, as in Qemu, the sanity check fails, but the device tree still ends up with a wrongly populated sensor-groups node because the node that was created is not cleaned up. Signed-off-by: Balamuruhan S <bala24@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-22	npu2-opencapi: Log a warning when resetting a broken device	Frederic Barrat	1	-0/+4
	On P9, the NPU doesn't support recovery if the link goes down unexpectedly. It was not fully verified. We mark the device as broken when we receive an error interrupt from the NPU. However, there's nothing to prevent the OS from trying to reset the device. It may or may not work; it's unsupported territory, so let's log a message to make that clear, as it could help when debugging. We haven't hit any cases where the reset goes badly enough that we'd want to prevent it, so let it go for now. We can revisit later if we have evidence that it's causing more problems than it is worth. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-22	npu2-opencapi: Handle OPAL_UNMAP_PE operation on set_pe() callback	Frederic Barrat	1	-1/+6
	In a hot-unplug scenario, the OS will try to unmap the PE. Skiboot doesn't do anything with the Linux PE for opencapi other than being a mailbox, but at least let's be consistent. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-22	npu2-opencapi: Activate PCI hotplug on opencapi slot	Frederic Barrat	1	-4/+65
	Implement the get_power_state() and set_power_state() callbacks for the opencapi slot and add properties in the device tree to mark the opencapi slot as hot-pluggable. We don't really power off/on the opencapi adapter. The slot at play here is the virtual slot associated with the virtual opencapi PHB. The real PCIe slot that the card draws its power from is untouched (skiboot is not even aware which PCIe slot the card is seated in). So the 'fake' power off fences the card and puts it in reset so that the FPGA image can be updated. The 'fake' power on does not do much, as the unfencing happens on the subsequent link training. Opencapi slots are named 'OPENCAPI-xxxx' where xxxx is the opal ID of the PHB/slot. This is meant to easily identify the slot used by an AFU device, as the AFU device names are also built around that ID. For example, the device /dev/ocxl/AFP3.0006:00:00.1.0 uses the slot OPENCAPI-0006. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
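The slot naming scheme described in this message can be sketched as follows. This is an illustration only: `opencapi_slot_name()` is a hypothetical helper, and formatting the opal ID as four zero-padded hexadecimal digits is an assumption inferred from the OPENCAPI-0006 example, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: build the "OPENCAPI-xxxx" slot label from the opal ID
 * of the PHB/slot. Returns the number of characters snprintf() would write. */
static int opencapi_slot_name(char *buf, size_t len, uint16_t opal_id)
{
    assert(buf != NULL && len > 0);
    /* Assumption: xxxx is the ID as 4 zero-padded hex digits. */
    return snprintf(buf, len, "OPENCAPI-%04x", (unsigned int)opal_id);
}
```

For opal ID 6 this yields the label OPENCAPI-0006, matching the slot that the message associates with /dev/ocxl/AFP3.0006:00:00.1.0.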
2019-10-22	npu2-opencapi: Improve error reporting to the OS	Frederic Barrat	3	-4/+29
	When resetting an opencapi link, the brick will be fenced temporarily. Therefore we can't rely on the fencing state of the brick any more to check for the health of an opencapi PHB, as we could report errors if queried for a PHB state at the same time a link is being reset. Instead, we flag the device as 'broken' when an error interrupt is received, just before raising an event to the OS. When the OS is querying for the state of a PHB, we only have to look at the 'broken' attribute. Note that there's no recovery possible on P9 when an error interrupt is received unexpectedly, as recovery is not supported by hardware. So when a device/link is marked as 'broken', it stays broken. All the OS can do is log the error and notify the drivers. Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-22	npu2-opencapi: Detect PHY reset errors	Frederic Barrat	3	-5/+17
	PHY reset can fail! Though past problems are now fixed, let's handle any future failure. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-22	npu2-opencapi: Simplify freset states	Frederic Barrat	1	-13/+3
	Let's get rid of one transitional state, since there's no need to pause in between releasing the reset signals of the ODL and the adapter. Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-22	npu2-opencapi: Tweak fundamental reset sequence	Frederic Barrat	2	-24/+26
	Slightly modify the ordering of a few steps in our init sequence on fundamental reset, so that it can be called from the OS when the link is already up:
	- when the card is reset, the link goes down, so we need to fence the brick to prevent errors propagating to the NPU and OS
	- since fencing and unfencing don't require any delay, let's also fence/unfence during the very first reset at boot. It's useless but doesn't hurt and keeps the code simpler.
	- resetting the PHY must be done a bit later, while fenced and with the ODL and DLx in reset
	Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-22	npu2-opencapi: Rework link training timeout	Frederic Barrat	2	-4/+7
	Opencapi link state should be polled for up to 3 seconds. Current code assumes a tight retry loop during fundamental reset at boot, which is not going to be true on link retraining. So update the timeout detection code to use a timebase instead of a simple retry count which could be way too long. Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
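The difference between a retry count and a time-based deadline can be sketched as below. This is not the patch itself: the types and function names are hypothetical, and the caller is assumed to supply a monotonic timestamp in milliseconds (in firmware that would come from the timebase). The point is that the 3 second budget holds no matter how often or how slowly the poll step is invoked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LINK_TRAINING_TIMEOUT_MS 3000u  /* "up to 3 seconds" from the message */

enum link_poll_result { LINK_POLL_UP, LINK_POLL_RETRY, LINK_POLL_TIMEOUT };

/* Hypothetical poll state: only the absolute deadline is remembered. */
struct link_poll {
    uint64_t deadline_ms;
};

/* Record the deadline once, when link training starts. */
static void link_poll_start(struct link_poll *p, uint64_t now_ms)
{
    assert(p != NULL);
    p->deadline_ms = now_ms + LINK_TRAINING_TIMEOUT_MS;
}

/* One poll step: a trained link wins, otherwise compare against the deadline
 * instead of decrementing a retry counter. */
static enum link_poll_result link_poll_step(const struct link_poll *p,
                                            uint64_t now_ms, bool link_is_up)
{
    assert(p != NULL);
    if (link_is_up)
        return LINK_POLL_UP;
    if (now_ms >= p->deadline_ms)
        return LINK_POLL_TIMEOUT;
    return LINK_POLL_RETRY;
}
```

A retry counter only bounds the number of attempts, so its wall-clock duration depends on how long each attempt takes; a stored deadline bounds the elapsed time directly.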
2019-10-22	npu2-hw-procedures: Fix link retraining on reset	Frederic Barrat	1	-0/+16
	Link retraining was showing reliability problems due to some opencapi-only settings not being optimized. This patch updates some extra PHY state, as agreed with the PHY team. Though they mostly impact link retraining behavior, they should also be set at boot. Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-22	npu2-opencapi: Make sure the PCI slot has the proper ID	Frederic Barrat	1	-1/+2
	The PCI slot created for the opencapi PHB didn't have its ID properly defined because it was created before we assign an ID to the PHB. Simply switch the PCI slot creation and PHB registration calls to fix it. Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-22	npu2-hw-procedures: Move some opencapi PHY settings in one-off init	Frederic Barrat	1	-19/+16
	The PHY_RX_AC_COUPLED and PHY_RX_SPEED_SELECT for opencapi are group settings for the obus. They should be set in the one-off PHY init function at boot and not on the link reset path, as they theoretically impact more than one link. Since we cannot mix link type and/or speed on an optical bus, it has no practical impact; it just looks cleaner. Also use the OCAPIINF macro for the associated traces. Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-22	core/pci: Fix scan of devices for opencapi slots	Frederic Barrat	1	-5/+15
	Opencapi devices are found directly under the PHB and the PHB slot doesn't have an associated PCI device (root complex). So when scanning a PHB, devices are added directly under the PHB, like it's done at boot time. Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-22	core/pci: Train link of PHB slots when hotplugging	Frederic Barrat	1	-22/+85
	The link of PHB slots must be trained after powering on. This can be done by calling the fundamental reset callback of the slot. We could force a reset for all the slots and have a common path in set_power_state(). But this patch only resets the PHB slot. Some slot implementations do a power cycle during fundamental reset, so calling a reset after powering on would repeat that operation. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-22	core/pci: Use proper phandle during hotplug for PHB slots	Frederic Barrat	1	-6/+15
	PHB slots don't have an associated device (slot->pd = NULL). They were not used by the PCI hotplug framework so far, but with opencapi virtual PHBs, that's changing. With opencapi, devices are directly under the PHB (no root complex or intermediate bridge) and the slot used for hotplug is the PHB slot. This patch uses the proper phandle when replying asynchronously to the OS when using a PHB slot. Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-22	core/pci: Add missing lock in set_power_timer	Frederic Barrat	2	-0/+12
	set_power_timer() was not using any lock, though it alters the slot state and the devices found under it. There's a remote possibility that set_power_timer() is called through check_timers() by a thread already holding the phb lock, so we try to take the lock but yield and rearm the timer if somebody else already holds it. There really shouldn't be any contention here. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
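The "try the lock, otherwise rearm" pattern described here can be sketched with a C11 `atomic_flag` standing in for the phb lock. Everything below is hypothetical illustration rather than skiboot code: the `struct slot` layout, the helper names, and the `rearm_count` field, which merely stands in for rescheduling the timer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical slot: a lock, the state it protects, and a counter standing
 * in for "the timer was rescheduled". */
struct slot {
    atomic_flag lock;
    int power_state;
    int rearm_count;  /* only ever touched by the timer callback */
};

/* Non-blocking acquire: true if we now own the lock. */
static bool slot_try_lock(struct slot *s)
{
    return !atomic_flag_test_and_set(&s->lock);
}

static void slot_unlock(struct slot *s)
{
    atomic_flag_clear(&s->lock);
}

/* Timer callback sketch. Never blocks: if the lock is already held (possibly
 * by the very thread that is running the timers), back off and retry later.
 * Returns true when the state change was applied. */
static bool set_power_timer_sketch(struct slot *s, int new_state)
{
    assert(s != NULL);
    if (!slot_try_lock(s)) {
        s->rearm_count++;  /* stand-in for rearming the timer */
        return false;
    }
    s->power_state = new_state;
    slot_unlock(s);
    return true;
}
```

Blocking on the lock would deadlock in exactly the case the message describes, where the caller of check_timers() already holds it, which is why the sketch only ever tries the lock.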
2019-10-22	core/pci: Refactor common paths on slot hotplug	Frederic Barrat	1	-17/+26
	Refactor code executed to remove or rescan devices when a slot power state changes, synchronously or asynchronously through a timer callback. It will be more useful in a future patch. No functional changes. Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-15	core/ipmi: Fix use-after-free	Vasant Hegde	1	-3/+11
	Commit f01cd77 introduced a backend poller() for ipmi messages. But in some corner cases it's possible that we end up calling poller() after freeing the ipmi message.
	Thread 1 : ipmi_queue_msg_sync() Waiting for ipmi sync message to complete
	Thread 2 : bt_poll() -> ipmi_cmd_done() -> callback handler -> free message
	Oliver hit this issue during a fast-reboot test with a skiboot DEBUG build. In debug builds we poison the memory after free. That helped us to catch this issue.
	[ 460.295570781,3] ***********************************************
	[ 460.295773157,3] Fatal MCE at 0000000030035cb4 .ipmi_queue_msg_sync+0x110 MSR 9000000000201002
	[ 460.295887496,3] CFAR : 0000000030035ce8 MSR : 9000000000000000
	[ 460.295956419,3] SRR0 : 0000000030035cb4 SRR1 : 9000000000201002
	[ 460.296035015,3] HSRR0: 0000000030012624 HSRR1: 9000000002803002
	[ 460.296102413,3] DSISR: 00000008 DAR : 99999999999999d1
	[ 460.296169710,3] LR : 0000000030035ce4 CTR : 0000000030002880
	[ 460.296248482,3] CR : 28002422 XER : 20040000
	[ 460.296336621,3] GPR00: 0000000030035ce4 GPR16: 00000000301d36d8
	[ 460.296415449,3] GPR01: 0000000031c133d0 GPR17: 00000000300f5cd8
	[ 460.296482811,3] GPR02: 0000000030142700 GPR18: 0000000030407ff0
	[ 460.296550265,3] GPR03: 0000000000000100 GPR19: 0000000000000000
	[ 460.296629041,3] GPR04: 0000000028002424 GPR20: 0000000000000000
	[ 460.296696369,3] GPR05: 0000000020040000 GPR21: 0000000030121d73
	[ 460.296820977,3] GPR06: c000001fffffd480 GPR22: 0000000030121dd2
	[ 460.296888226,3] GPR07: c000001fffffd480 GPR23: 0000000030613400
	[ 460.296978218,3] GPR08: 0000000000000001 GPR24: 0000000000000001
	[ 460.297056871,3] GPR09: 9999999999999999 GPR25: 0000000031c13960
	[ 460.297124647,3] GPR10: 0000000000000000 GPR26: 0000000000000004
	[ 460.297203811,3] GPR11: 0000000000000000 GPR27: 0000000000000003
	[ 460.297271250,3] GPR12: 0000000028002424 GPR28: 0000000030613400
	[ 460.297339026,3] GPR13: 0000000031c10000 GPR29: 0000000030406b50
	[ 460.297417605,3] GPR14: 00000000300f58f8 GPR30: 0000000030406b40
	[ 460.297485176,3] GPR15: 00000000300f58d8 GPR31: 00000000309249c8
	Reported-by: Oliver O'Halloran <oohall@gmail.com> Fixes: f01cd77 (ipmi: ensure forward progress on ipmi_queue_msg_sync()) Cc: skiboot-stable@lists.ozlabs.org # v6.3+ Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-14	blocklevel: smart_write: Fix unaligned writes to ECC partitions	Andrew Jeffery	4	-7/+65
	Currently trying to clear a gard record results in errors:
	$ ./opal-gard -pef part create /sys0/node0/proc1
	$ ./opal-gard -pef part list
	ID | Error | Type | Path
	---------------------------------------------------------
	00000001 | 00000000 | Manual | /Sys0/Node0/Proc1
	=========================================================
	$ ./opal-gard -pef part clear 00000001
	Clearing gard record 0x00000001...done
	ECC: uncorrectable error: ffffff00ffffffff ff
	libflash ecc invalid
	$
	A little wrapper around hexdump(1) helps show where the error lies by grouping output in blocks of nine such that the last byte is the ECC byte:
	$ declare -f ecchd
	ecchd () {
	hexdump -e '"%08_ax""\t"' -e '9/1 "%02x ""\t""|"' -e '9/1 "%_p""|\n"' "$@"
	}
	A clean GARD partition displays as:
	$ ecchd part
	0002c000 ff ff ff ff ff ff ff ff 00 |.........|
	*
	00030ffb ff ff ff ff ff |.....|
	$
	Dumping the corrupt partition shows:
	$ ecchd part
	0002c000 ff ff ff ff ff ff ff ff 00 |.........|
	*
	0002c024 ff ff ff ff ff ff ff ff ff |.........|
	0002c02d ff ff ff 00 ff ff ff ff ff |.........|
	*
	0002c051 ff ff ff 00 ff ff ff ff 00 |.........|
	0002c05a ff ff ff ff ff ff ff ff 00 |.........|
	*
	00030ffb ff ff ff ff ff |.....|
	$
	blocklevel_smart_write() turned out to not be quite as smart as it thought it was in that any unaligned write to ECC protected partitions aligned the calculated ECC values to the start of the write buffer and not relative to the start of the partition.
	Fixes: 29d1e6f78109 ("libflash/blocklevel: add a smart write function which wraps up eraseing and writing") Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
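The alignment issue described in this message can be illustrated with a small sketch. It is not libflash code: `struct span` and `ecc_align_span()` are hypothetical, positions are expressed in data bytes (ECC bytes excluded), and the 8 data bytes per ECC byte grouping is taken from the nine-byte blocks shown by the hexdump wrapper. The idea is that the span to re-encode must start and end on 8-byte boundaries counted from the start of the partition, not from the start of the caller's buffer.

```c
#include <assert.h>
#include <stdint.h>

#define ECC_DATA_BYTES 8u  /* each ECC byte covers 8 data bytes */

/* Hypothetical result type: an ECC-aligned range of data bytes. */
struct span {
    uint64_t pos;  /* absolute position of the first byte */
    uint64_t len;  /* length in bytes, a multiple of ECC_DATA_BYTES */
};

/* Widen [pos, pos + len) so that both ends fall on ECC block boundaries
 * measured relative to part_start. */
static struct span ecc_align_span(uint64_t part_start, uint64_t pos, uint64_t len)
{
    assert(pos >= part_start);

    uint64_t rel = pos - part_start;            /* offset inside the partition */
    uint64_t lo = rel - (rel % ECC_DATA_BYTES); /* round start down */
    uint64_t hi = rel + len;
    if (hi % ECC_DATA_BYTES)                    /* round end up */
        hi += ECC_DATA_BYTES - (hi % ECC_DATA_BYTES);

    return (struct span){ .pos = part_start + lo, .len = hi - lo };
}
```

For a partition starting at 100, a 10-byte write at 105 widens to the 16 bytes starting at 100. Rounding the absolute position 105 down to a multiple of 8 would give 104 instead, which is not a block boundary of that partition.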
2019-10-14	blocklevel: smart_write: Tidy local variable declarations	Andrew Jeffery	1	-4/+7
	Group them by use (and name). It's not reverse christmas tree, but it's a bit easier on the eye. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-14	blocklevel: smart_write: Avoid reuse of formal parameters	Andrew Jeffery	1	-6/+13
	Lays the ground-work for fixing unaligned writes to ECC protected partitions. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-14	blocklevel: smart_write: Deny writes intersecting ECC protected regions	Andrew Jeffery	1	-1/+9
	Other code paths don't handle writes spanning mixed regions, and it's a headache, so deny it here too. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-14	blocklevel: smart_write: Avoid indirectly testing formal parameters	Andrew Jeffery	1	-1/+1
	The early-exit tests write_buf, but write_buf is assigned to buf on declaration. Test buf directly instead to avoid unnecessary indirection. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-14	blocklevel: smart_write: Rename size variable for clarity	Andrew Jeffery	1	-8/+9
	We're writing in chunks, so let's make it clear that size is relative to the chunk that we're writing. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-14	blocklevel: smart_write: Rename write buffer	Andrew Jeffery	1	-7/+7
	The buffer is only used for ECC protected partitions, so let's call it ecc_buf for clarity. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-14	blocklevel: smart_write: Terminate line for debug output in no-change case	Andrew Jeffery	1	-0/+2
	Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-14	gard: Fix data corruption when clearing single records	Andrew Jeffery	4	-1/+31
	Attempting to clear a specific gard record leads to corruption of the target record rather than the expected removal:
	$ ./opal-gard -f romulus.pnor list
	No GARD entries to display
	$ ./opal-gard -f romulus.pnor create /sys0/node0/proc1
	$ ./opal-gard -f romulus.pnor list
	ID | Error | Type | Path
	---------------------------------------------------------
	00000001 | 00000000 | Manual | /Sys0/Node0/Proc1
	=========================================================
	$ ./opal-gard -f romulus.pnor clear 00000001
	Clearing gard record 0x00000001...done
	$ ./opal-gard -f romulus.pnor list
	ID | Error | Type | Path
	---------------------------------------------------------
	00000001 | 00000000 | Unknown | /Sys0/Node0/Proc1
	=========================================================
	The GUARD partition needs to be compacted when clearing records as the end of the list is a sentinel represented by the erased-flash state. The compaction strategy is to read the trailing records and write them to the offset of the record to be removed, followed by writing the sentinel record at the offset of what was previously the last valid record. The corruption occurs due to incorrect calculation of the offset at which the trailing records will be written.
	Cc: Skiboot Stable <skiboot-stable@lists.ozlabs.org> Fixes: 5616c42d900a ("libflash/blocklevel: Make read/write be ECC agnostic for callers") Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
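The compaction strategy described in this message can be sketched on an in-memory copy of the partition. This is an illustration under assumptions, not the gard tool's code: `gard_remove_record()` is hypothetical, records are treated as opaque fixed-size byte blocks, and the sentinel is modelled as a record filled with 0xff (the erased-flash state).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Remove record `idx` from a packed list of `nr_valid` records of `rec_size`
 * bytes each. The trailing records slide down onto the removed slot and the
 * slot of the previously last valid record becomes the new sentinel.
 * Returns the new number of valid records. */
static size_t gard_remove_record(uint8_t *part, size_t rec_size,
                                 size_t nr_valid, size_t idx)
{
    assert(part != NULL && rec_size > 0 && idx < nr_valid);

    uint8_t *dst = part + idx * rec_size;   /* offset of the removed record */
    size_t trailing = nr_valid - idx - 1;   /* records after the removed one */

    /* Write the trailing records at the offset of the record being removed. */
    memmove(dst, dst + rec_size, trailing * rec_size);

    /* Write the sentinel where the last valid record used to be. */
    memset(part + (nr_valid - 1) * rec_size, 0xff, rec_size);

    return nr_valid - 1;
}
```

The destination offset is the step the message identifies as miscalculated: if the trailing records land anywhere other than `idx * rec_size`, the list is corrupted instead of compacted.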
2019-10-14	core/init: Checksum romem after patching out traps	Oliver O'Halloran	1	-2/+4
	Currently we checksum the read-only parts of skiboot's memory just before loading and booting petitboot. Commit 9ddc1a6bfaef ("core/util: trap based assertions") modifies the .text after this point since it needs to disable the trap instructions that we use to trigger an abort() before entering the kernel. We can fix this by moving the checksum to after the point where the traps are patched out. We could do the patching sooner, but since load_and_boot_kernel() is a fairly complex function it's preferable to keep the boot-time assertion infrastructure active until just before we enter the kernel. Reported-by: Carol L Soto <clsoto@us.ibm.com> Tested-by: Carol L Soto <clsoto@us.ibm.com> Tested-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Fixes: 9ddc1a6bfaef ("core/util: trap based assertions") Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-14	core/init: Don't checksum MPIPL data areas	Oliver O'Halloran	3	-1/+15
	Right now the romem checksum runs from _start until the start of our data area. This spans the area used for the MPIPL data structures since they're included in the SPIRA-H data area. Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-03	core/exceptions.c: do not include handler code in exception backtrace	Nicholas Piggin	3	-6/+31
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-03	core/util: branch-to-NULL assert for ELFv2 ABI	Nicholas Piggin	2	-6/+17
	The ELFv1 branch to NULL catcher puts a function descriptor at 0 which points to a function that asserts. For ELFv2, put a trap at address 0. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [oliver: commit message prefix] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-03	core/util: trap based assertions	Nicholas Piggin	10	-35/+130
	Using traps for assertions like Linux does gives a few advantages:
	- The asm code leading to the failure condition is nicer.
	- The interrupt gives a clean snapshot of machine state to dump.
	The difficulty with using traps for this in OPAL is that the runtime component will not deal well with the OS taking the 0x700 interrupt caused by a trap in OPAL. The long term goal is to improve the ability of the OS to inspect and debug OPAL at runtime. For now though, the traps are patched out before passing control to the OS, and the assert falls through to in-line failure handling.
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [oliver: commit prefix, added and renamed the FWTS label, fix tests] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-03	core/exceptions.c: rearrange code to allow more interrupt types	Nicholas Piggin	1	-4/+12
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>