Age | Commit message (Collapse) | Author | Files | Lines |
|
These were needed to work around HW bugs in PHB4 LSIs of POWER9 DD1.0
processors.
HW395455 P9/PHB4: Wrong Interrupt ESB CI Load Opcode Location in 64K
page mode
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
These were needed to work around HW bugs in PHB4 LSIs of POWER9 DD1.0
processors. Keep the flags in case a similar issue appears in the next
generation of the XIVE logic, and also for Linux, which still has
handlers in its XIVE layer.
However, there is no need to keep the code in the POWER9 XIVE driver.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Block group mode is now required; it cannot be disabled.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
An indirect table is a one page array of XIVE VSDs pointing to subpages
containing XIVE virtual structures: NVTs, ENDs, EASs or ESBs.
The OPAL XIVE driver uses direct tables for the EAS and ESB tables. The
number of interrupts being 1M, the tables are respectively 256K and 8M
per chip. We want the EAS lookup to be fast so we keep this table direct.
The NVT and END are bigger structures (64 and 32 bytes). If the table
were direct, we would use 32M and 32M of OPAL memory per chip. They are
indirect today and Linux allocates the pages on behalf of OPAL when a
new indirect subpage is needed. We plan to increase the NVT space and
END space in P10.
Remove USE_INDIRECT ifdef and associated code not used anymore.
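
The two-level lookup described above can be sketched as follows. This is a minimal illustration, not the skiboot driver code: the 64K page size, the 32-byte entry size, and all names (`indirect_table`, `indirect_lookup`) are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE   0x10000u                 /* assumed 64K page */
#define ENTRY_SIZE  32u                      /* assumed size of one virtual structure */
#define PER_SUBPAGE (PAGE_SIZE / ENTRY_SIZE) /* entries held by one subpage */
#define VSD_COUNT   (PAGE_SIZE / sizeof(uint64_t))

/* Hypothetical indirect table: one page worth of descriptors, each
 * pointing to a subpage of entries, or NULL if not yet provisioned. */
struct indirect_table {
	uint8_t *subpage[VSD_COUNT];
};

/* Return a pointer to entry 'idx', or NULL if its subpage is missing. */
void *indirect_lookup(struct indirect_table *t, uint32_t idx)
{
	uint32_t vsd = idx / PER_SUBPAGE;    /* which descriptor */
	uint32_t off = idx % PER_SUBPAGE;    /* which entry inside the subpage */

	assert(vsd < VSD_COUNT);
	if (!t->subpage[vsd])
		return NULL;
	return t->subpage[vsd] + (size_t)off * ENTRY_SIZE;
}
```

A NULL result corresponds to the case where a new subpage would have to be allocated before the entry can be used.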
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
There is no reason to issue loads on XSCOM when syncing the interrupt
controller. All should be in place to use MMIOs.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
The XIVE driver exposes an API to the core OPAL layer and to other
OPAL drivers. This is a minor cleanup preparing ground for future XIVE
logic.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
This is moving the definitions of the registers of the P9 XIVE
interrupt controller and the P9 XIVE internal structures in a specific
header file and moving the definitions related to the thread interrupt
context area to a common file.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
CMD_REG should be writable, not read-only. Fix this, initializing it
with a default "unset" value (0xffffffff).
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Add a log line for when the PPE indicates it's not in the ready state,
and make all the SALT lines start with a capital to look nicer.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Though they are currently identical to the OS, it may become necessary
to distinguish npu3 phbs from npu2 ones at some point. Add a unique
string to the compatible property.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
npu3_chip_possible_gpus() works by dividing the number of NVLink-mode
bricks by the number of bricks connecting a single GPU. In a system with
no GPUs, the latter value is unknown, so the function returns zero and
we trip a somewhat misleading error message.
The code afterward is safe to execute in any case, so there's no need to
return either. Remove the check entirely.
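
The arithmetic described above can be sketched as follows. The helper name and parameters are hypothetical and only mirror the description; they are not the signature of npu3_chip_possible_gpus().

```c
#include <assert.h>

/* Hypothetical helper mirroring the division described above.
 * bricks_per_gpu is 0 when no GPU has been detected. */
int possible_gpus(int nvlink_bricks, int bricks_per_gpu)
{
	assert(nvlink_bricks >= 0 && bricks_per_gpu >= 0);
	if (bricks_per_gpu == 0)
		return 0;   /* divisor unknown: no GPUs, which is not an error */
	return nvlink_bricks / bricks_per_gpu;
}
```

A zero result is a legitimate answer on a system without GPUs, which is why treating it as a failure produced a misleading message.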
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
The gard tool is not supported on FSP-based systems, but we can still
run the gard tests on them.
Acked-by: Stewart Smith <stewart@flamingspork.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Previous commit d75e82dbf introduced an unnecessary variable/check.
Remove that and add a barrier after setting sync_msg to NULL.
Cc: Oliver O'Halloran <oohall@gmail.com>
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Fixes: d75e82dbf (core/ipmi: Fix use-after-free)
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Recent work on Qemu adds support to emulate homer memory region and occ
common area region with respective device models, so remove `QUIRK_NO_PBA`
to enable HOMER/OCC common area region for Qemu emulated PowerNV host.
Introduce `QUIRK_QEMU` in enum proc_chip_quirks that will be used for
future work.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Balamuruhan S <bala24@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
`occ_sensor_init()` creates the sensor-groups device tree node and then
runs the `occ_sensor_sanity()` check before populating it. If there are
no sensors, as in Qemu, the sanity check fails, but the sensor-groups
node is still wrongly left in the device tree because the created node
is not cleaned up.
Signed-off-by: Balamuruhan S <bala24@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
On P9, the NPU doesn't support recovery if the link goes down
unexpectedly. It was not fully verified. We mark the device as broken
when we receive an error interrupt from the NPU. However, there's
nothing to prevent the OS from trying to reset the device. It may or
may not work, it's unsupported territory, so let's log a message to
make it clear, as it could help when debugging. We haven't hit any
cases where the reset goes badly enough that we'd want to prevent it,
so let it go for now. We can revisit later if we have evidence that
it's causing more problems than it is worth.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
In a hot-unplug scenario, the OS will try to unmap the PE. Skiboot
doesn't do anything with the linux PE for opencapi other than being a
mailbox, but at least let's be consistent.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Implement the get_power_state() and set_power_state() callbacks for
the opencapi slot and add properties in the device tree to mark the
opencapi slot as hot-pluggable.
We don't really power off/on the opencapi adapter. The slot at play
here is the virtual slot associated to the virtual opencapi PHB. The
real PCIe slot where the card is drawing its power from is
untouched (skiboot is not even aware which PCIe slot the card is
seated on). So the 'fake' power off fences the card and puts it in
reset so that the FPGA image can be updated. The 'fake' power on is
not doing much, as the unfencing happens on the subsequent link
training.
Opencapi slots are named 'OPENCAPI-xxxx' where xxxx is the opal ID of
the PHB/slot. This is meant to easily identify the slot used by an AFU
device, as the AFU device names are also built around that ID.
For example, the device /dev/ocxl/AFP3.0006:00:00.1.0 uses the slot
OPENCAPI-0006.
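
The naming scheme can be sketched as follows. The four-digit hexadecimal format is an assumption inferred from the OPENCAPI-0006 example, and `opencapi_slot_name` is a hypothetical helper, not a skiboot function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Build a slot label from a PHB ID. The "%04x" format is an assumption
 * based on the OPENCAPI-0006 example above. */
int opencapi_slot_name(char *buf, size_t len, uint32_t opal_id)
{
	assert(buf && len > 0);
	return snprintf(buf, len, "OPENCAPI-%04x", (unsigned int)opal_id);
}
```

With an ID of 6, this yields the label OPENCAPI-0006, matching the example.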
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
When resetting an opencapi link, the brick will be fenced
temporarily. Therefore we can't rely on the fencing state of the brick
any more to check for the health of an opencapi PHB, as we could
report errors if queried for a PHB state at the same time a link is
being reset.
Instead, we flag the device as 'broken' when an error interrupt is
received, just before raising an event to the OS. When the OS is
querying for the state of a PHB, we only have to look at the 'broken'
attribute.
Note that there's no recovery possible on P9 when an error interrupt
is received unexpectedly, as recovery is not supported by hardware. So
when a device/link is marked as 'broken', it stays broken. All the OS
can do is log the error and notify the drivers.
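
A minimal sketch of the sticky 'broken' flag follows. The structure, state names, and function names are hypothetical and only illustrate the idea that the state query ignores the transient fence.

```c
#include <assert.h>
#include <stdbool.h>

enum phb_state { PHB_OK, PHB_BROKEN };   /* hypothetical states */

struct ocapi_dev {
	bool broken;   /* set on error interrupt, never cleared */
	bool fenced;   /* may toggle temporarily during a link reset */
};

/* Called when an error interrupt is received for the device. */
void on_error_interrupt(struct ocapi_dev *dev)
{
	assert(dev);
	dev->broken = true;
	/* an event would be raised to the OS after this point */
}

/* State query: fencing is deliberately ignored, as it is transient. */
enum phb_state query_state(const struct ocapi_dev *dev)
{
	assert(dev);
	return dev->broken ? PHB_BROKEN : PHB_OK;
}
```

A query made while the brick is fenced for a reset still reports a healthy PHB, and a device marked broken stays broken.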
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
PHY reset can fail! Though past problems are now fixed, let's handle
any future failure.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Let's get rid of one transitional state, since there's no need to
pause in between releasing the reset signals of the ODL and the
adapter.
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Modify slightly the ordering of a few steps in our init sequence on
fundamental reset, so that it can be called from the OS, when the link
is already up:
- when the card is reset, the link goes down, so we need to fence the
brick to prevent errors propagating to the NPU and OS
- since fencing and unfencing don't require any delay, let's also
fence/unfence during the very first reset at boot. It's useless but
doesn't hurt and keeps the code simpler.
- resetting the PHY must be done a bit later, while fenced and the ODL
and DLx in reset
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Opencapi link state should be polled for up to 3 seconds. Current code
assumes a tight retry loop during fundamental reset at boot, which is
not going to be true on link retraining. So update the timeout
detection code to use a timebase instead of a simple retry count which
could be way too long.
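
A deadline-based poll can be sketched as follows. The caller passes in the current time in milliseconds, which stands in for a timebase read; all names are hypothetical, and only the 3 second budget comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LINK_TIMEOUT_MS 3000u   /* 3 second budget, from the text above */

enum poll_result { POLL_UP, POLL_RETRY, POLL_TIMEOUT };

struct link_poll {
	uint64_t deadline_ms;
};

/* Record the deadline once, when polling starts. */
void link_poll_start(struct link_poll *p, uint64_t now_ms)
{
	assert(p);
	p->deadline_ms = now_ms + LINK_TIMEOUT_MS;
}

/* One polling step: the outcome depends on elapsed time, not on how
 * many times the step has been called. */
enum poll_result link_poll_step(const struct link_poll *p, uint64_t now_ms,
				bool link_up)
{
	assert(p);
	if (link_up)
		return POLL_UP;
	if (now_ms >= p->deadline_ms)
		return POLL_TIMEOUT;
	return POLL_RETRY;
}
```

Because the deadline is absolute, the timeout is the same whether the caller polls in a tight loop or only once in a while.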
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Link retraining was showing reliability problems due to some
opencapi-only settings not being optimized. This patch updates some
extra PHY state, as agreed with the PHY team. Though they mostly
impact link retraining behavior, they should also be set at boot.
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
The PCI slot created for the opencapi PHB didn't have its ID properly
defined because it was created before we assign an ID to the
PHB. Simply switch the PCI slot creation and PHB registration calls to
fix it.
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
The PHY_RX_AC_COUPLED and PHY_RX_SPEED_SELECT for opencapi are group
settings for the obus. They should be set in the one-off PHY init
function at boot and not on the link reset path, as they theoretically
impact more than one link.
Since we cannot mix link type and/or speed on an optical bus, it has
no practical impact; it just looks cleaner.
Also use the OCAPIINF macro for the associated traces.
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Opencapi devices are found directly under the PHB and the PHB slot
doesn't have an associated PCI device (root complex). So when scanning
a PHB, devices are added directly under the PHB, like it's done at
boot time.
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
The link of PHB slots must be trained after powering on. This can be
done by calling the fundamental reset callback of the slot.
We could force a reset for all the slots and have a common path in
set_power_state(). But this patch only resets the PHB slot. Some slot
implementations do a power cycle during fundamental reset, so calling
a reset after powering on would repeat that operation.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
PHB slots don't have an associated device (slot->pd = NULL). They were
not used by the PCI hotplug framework so far, but with opencapi
virtual PHBs, that's changing. With opencapi, devices are directly
under the PHB (no root complex or intermediate bridge) and the slot
used for hotplug is the PHB slot.
This patch uses the proper phandle when replying asynchronously to the
OS when using a PHB slot.
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
set_power_timer() was not using any lock, though it alters the slot
state and devices found under it. There's a remote possibility that
set_power_timer() is called through check_timers() by a thread already
holding the phb lock, so we try to take the lock but yield and rearm
the timer if somebody else is already owning it. There really
shouldn't be any contention here.
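
The try-lock-or-rearm pattern can be sketched as follows, using a C11 atomic_flag as a stand-in for the PHB lock. The structure and function names are hypothetical and do not reflect the skiboot locking API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct slot {
	atomic_flag lock;   /* stand-in for the PHB lock */
	bool rearmed;       /* true if the timer was rescheduled */
	int state;          /* stand-in for the slot state */
};

void slot_init(struct slot *s)
{
	assert(s);
	atomic_flag_clear(&s->lock);
	s->rearmed = false;
	s->state = 0;
}

/* Timer callback: never blocks. Returns true if the work was done. */
bool slot_timer_cb(struct slot *s)
{
	assert(s);
	if (atomic_flag_test_and_set(&s->lock)) {
		s->rearmed = true;   /* lock already held: yield and retry later */
		return false;
	}
	s->state++;                  /* slot state is only changed under the lock */
	s->rearmed = false;
	atomic_flag_clear(&s->lock);
	return true;
}
```

Not blocking matters here because the callback may run on a thread that already holds the lock, in which case waiting would deadlock.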
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Refactor code executed to remove or rescan devices when a slot power
state changes, synchronously or asynchronously through a timer
callback. It will be more useful in a future patch.
No functional changes.
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Commit f01cd77 introduced a backend poller() for ipmi messages. But in some
corner cases it's possible that we end up calling poller() after freeing
the ipmi message.
Thread 1 :
ipmi_queue_msg_sync()
Waiting for ipmi sync message to complete
Thread 2 :
bt_poll() -> ipmi_cmd_done() -> callback handler -> free message
Oliver hit this issue during a fast-reboot test with a skiboot DEBUG build.
In debug builds we poison the memory after free, which helped us catch
this issue.
[ 460.295570781,3] ***********************************************
[ 460.295773157,3] Fatal MCE at 0000000030035cb4 .ipmi_queue_msg_sync+0x110 MSR 9000000000201002
[ 460.295887496,3] CFAR : 0000000030035ce8 MSR : 9000000000000000
[ 460.295956419,3] SRR0 : 0000000030035cb4 SRR1 : 9000000000201002
[ 460.296035015,3] HSRR0: 0000000030012624 HSRR1: 9000000002803002
[ 460.296102413,3] DSISR: 00000008 DAR : 99999999999999d1
[ 460.296169710,3] LR : 0000000030035ce4 CTR : 0000000030002880
[ 460.296248482,3] CR : 28002422 XER : 20040000
[ 460.296336621,3] GPR00: 0000000030035ce4 GPR16: 00000000301d36d8
[ 460.296415449,3] GPR01: 0000000031c133d0 GPR17: 00000000300f5cd8
[ 460.296482811,3] GPR02: 0000000030142700 GPR18: 0000000030407ff0
[ 460.296550265,3] GPR03: 0000000000000100 GPR19: 0000000000000000
[ 460.296629041,3] GPR04: 0000000028002424 GPR20: 0000000000000000
[ 460.296696369,3] GPR05: 0000000020040000 GPR21: 0000000030121d73
[ 460.296820977,3] GPR06: c000001fffffd480 GPR22: 0000000030121dd2
[ 460.296888226,3] GPR07: c000001fffffd480 GPR23: 0000000030613400
[ 460.296978218,3] GPR08: 0000000000000001 GPR24: 0000000000000001
[ 460.297056871,3] GPR09: 9999999999999999 GPR25: 0000000031c13960
[ 460.297124647,3] GPR10: 0000000000000000 GPR26: 0000000000000004
[ 460.297203811,3] GPR11: 0000000000000000 GPR27: 0000000000000003
[ 460.297271250,3] GPR12: 0000000028002424 GPR28: 0000000030613400
[ 460.297339026,3] GPR13: 0000000031c10000 GPR29: 0000000030406b50
[ 460.297417605,3] GPR14: 00000000300f58f8 GPR30: 0000000030406b40
[ 460.297485176,3] GPR15: 00000000300f58d8 GPR31: 00000000309249c8
Reported-by: Oliver O'Halloran <oohall@gmail.com>
Fixes: f01cd77 (ipmi: ensure forward progress on ipmi_queue_msg_sync())
Cc: skiboot-stable@lists.ozlabs.org # v6.3+
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Currently trying to clear a gard record results in errors:
$ ./opal-gard -pef part create /sys0/node0/proc1
$ ./opal-gard -pef part list
ID | Error | Type | Path
---------------------------------------------------------
00000001 | 00000000 | Manual | /Sys0/Node0/Proc1
=========================================================
$ ./opal-gard -pef part clear 00000001
Clearing gard record 0x00000001...done
ECC: uncorrectable error: ffffff00ffffffff ff
libflash ecc invalid
$
A little wrapper around hexdump(1) helps show where the error lies by grouping
output in blocks of nine such that the last byte is the ECC byte:
$ declare -f ecchd
ecchd ()
{
hexdump -e '"%08_ax""\t"' -e '9/1 "%02x ""\t""|"' -e '9/1 "%_p""|\n"' "$@"
}
A clean GARD partition displays as:
$ ecchd part
0002c000 ff ff ff ff ff ff ff ff 00 |.........|
*
00030ffb ff ff ff ff ff |.....|
$
Dumping the corrupt partition shows:
$ ecchd part
0002c000 ff ff ff ff ff ff ff ff 00 |.........|
*
0002c024 ff ff ff ff ff ff ff ff ff |.........|
0002c02d ff ff ff 00 ff ff ff ff ff |.........|
*
0002c051 ff ff ff 00 ff ff ff ff 00 |.........|
0002c05a ff ff ff ff ff ff ff ff 00 |.........|
*
00030ffb ff ff ff ff ff |.....|
$
blocklevel_smart_write() turned out to not be quite as smart as it thought it
was in that any unaligned write to ECC protected partitions aligned the
calculated ECC values to the start of the write buffer and not relative to the
start of the partition.
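
The alignment rule can be sketched as follows. This is a simplified illustration rather than the blocklevel_smart_write() code: all offsets are data offsets that ignore the interleaved ECC bytes, and `struct span` and `ecc_align_span` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define ECC_DATA_BYTES 8u   /* 8 data bytes are followed by 1 ECC byte */

struct span {
	uint64_t start;   /* aligned start position */
	uint64_t len;     /* length covering whole ECC words */
};

/* Expand a write at 'pos' so that it covers whole ECC words, measured
 * from the partition start rather than from the write position. */
struct span ecc_align_span(uint64_t part_start, uint64_t pos, uint64_t len)
{
	struct span s;
	uint64_t rel, head, end;

	assert(pos >= part_start);
	rel = pos - part_start;           /* offset inside the partition */
	head = rel % ECC_DATA_BYTES;      /* bytes into the current ECC word */
	end = rel + len;
	end += (ECC_DATA_BYTES - end % ECC_DATA_BYTES) % ECC_DATA_BYTES;

	s.start = pos - head;
	s.len = end - (rel - head);
	return s;
}
```

A write that starts 28 bytes into the partition begins 4 bytes into an ECC word, so the span is widened to start at byte 24. Computing ECC as if the write buffer itself started on a word boundary places the ECC bytes at the wrong positions.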
Fixes: 29d1e6f78109 ("libflash/blocklevel: add a smart write function which wraps up eraseing and writing")
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Group them by use (and name). It's not reverse christmas tree, but it's
a bit easier on the eye.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Lays the groundwork for fixing unaligned writes to ECC protected
partitions.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Other code paths don't handle writes spanning mixed regions, and it's a
headache, so deny it here too.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
The early-exit tests write_buf, but write_buf is assigned to buf on
declaration. Test buf directly instead to avoid unnecessary indirection.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
We're writing in chunks, so let's make it clear that size is relative to
the chunk that we're writing.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
The buffer is only used for ECC protected partitions, so let's call it
ecc_buf for clarity.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Attempting to clear a specific gard record leads to corruption of the target
record rather than the expected removal:
$ ./opal-gard -f romulus.pnor list
No GARD entries to display
$ ./opal-gard -f romulus.pnor create /sys0/node0/proc1
$ ./opal-gard -f romulus.pnor list
ID | Error | Type | Path
---------------------------------------------------------
00000001 | 00000000 | Manual | /Sys0/Node0/Proc1
=========================================================
$ ./opal-gard -f romulus.pnor clear 00000001
Clearing gard record 0x00000001...done
$ ./opal-gard -f romulus.pnor list
ID | Error | Type | Path
---------------------------------------------------------
00000001 | 00000000 | Unknown | /Sys0/Node0/Proc1
=========================================================
The GUARD partition needs to be compacted when clearing records as the
end of the list is a sentinel represented by the erased-flash state. The
compaction strategy is to read the trailing records and write them to
the offset of the record to be removed, followed by writing the sentinel
record at the offset of what was previously the last valid record.
The corruption occurs due to incorrect calculation of the offset at which the
trailing records will be written.
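
The compaction strategy can be sketched on an in-memory buffer as follows. The 4-byte record size and the function name are made up for illustration, and real flash would need an erase to restore the 0xff sentinel state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define REC_SIZE 4   /* hypothetical record size, for illustration only */

/* Remove record 'idx' from 'count' valid records stored back to back in
 * 'buf'. Returns the new number of valid records. */
size_t gard_remove(unsigned char *buf, size_t count, size_t idx)
{
	size_t trailing;

	assert(buf && idx < count);
	trailing = count - idx - 1;

	/* The trailing records are written at the offset of the removed
	 * record itself, not at the offset of the record after it. */
	memmove(buf + idx * REC_SIZE, buf + (idx + 1) * REC_SIZE,
		trailing * REC_SIZE);

	/* The slot of the previous last valid record becomes the sentinel. */
	memset(buf + (count - 1) * REC_SIZE, 0xff, REC_SIZE);
	return count - 1;
}
```

Getting the destination offset wrong in the first step overwrites or skips a record, which is the kind of corruption shown in the listing above.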
Cc: Skiboot Stable <skiboot-stable@lists.ozlabs.org>
Fixes: 5616c42d900a ("libflash/blocklevel: Make read/write be ECC agnostic for callers")
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Currently we checksum the read-only parts of skiboot's memory just
before loading and booting petitboot. Commit 9ddc1a6bfaef
("core/util: trap based assertions") modifies the .text after this
point since it needs to disable the trap instructions that we use
to trigger an abort() before entering the kernel.
We can fix this by moving the checksum to after the point where the
traps are patched out. We could do the patching sooner, but since
load_and_boot_kernel() is a fairly complex function it's preferable
to keep boot-time assertion infrastructure active until just before
we enter the kernel.
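
The ordering problem can be sketched as follows. The additive checksum is a stand-in rather than skiboot's actual romem checksum, and replacing each trap word with a nop is an assumption about how the patching is done; the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRAP_WORD 0x7fe00008u   /* PowerPC "trap" instruction */
#define NOP_WORD  0x60000000u   /* PowerPC "nop" instruction */

/* Stand-in checksum over a region of instruction words. */
uint32_t region_sum(const uint32_t *p, size_t words)
{
	uint32_t sum = 0;
	size_t i;

	assert(p);
	for (i = 0; i < words; i++)
		sum += p[i];
	return sum;
}

/* Replace every trap with a nop, as a stand-in for patching out the
 * trap based assertions before entering the kernel. */
void patch_out_traps(uint32_t *text, size_t words)
{
	size_t i;

	assert(text);
	for (i = 0; i < words; i++)
		if (text[i] == TRAP_WORD)
			text[i] = NOP_WORD;
}
```

A checksum taken before patch_out_traps() no longer matches the region afterwards, while one taken after the patching stays valid.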
Reported-by: Carol L Soto <clsoto@us.ibm.com>
Tested-by: Carol L Soto <clsoto@us.ibm.com>
Tested-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Fixes: 9ddc1a6bfaef ("core/util: trap based assertions")
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Right now the romem checksum runs from _start until the start of our
data area. This spans the area used for the MPIPL data structures since
they're included in the SPIRA-H data area.
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
The ELFv1 branch to NULL catcher puts a function descriptor at 0 which
points to a function that asserts. For ELFv2, put a trap at address 0.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[oliver: commit message prefix]
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Using traps for assertions like Linux does gives a few advantages:
- The asm code leading to the failure condition is nicer.
- The interrupt gives a clean snapshot of machine state to dump.
The difficulty with using traps for this in OPAL is that the runtime
component will not deal well with the OS taking the 0x700 interrupt
caused by a trap in OPAL.
The long term goal is to improve the ability of the OS to inspect and
debug OPAL at runtime. For now though, the traps are patched out before
passing control to the OS, and the assert falls through to in-line
failure handling.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[oliver: commit prefix, added and renamed the FWTS label, fix tests]
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|