Age | Commit message (Collapse) | Author | Files | Lines |
|
OS Secure Boot establishes a chain of trust from firmware to the OS.
However, OS Secure Boot can only be secure if the chain of trust
beneath it - from hardware to firmware - has been established by
Firmware Secure Boot. This patch ensures that OS Secure Boot is enabled
only if Firmware Secure Boot is enabled.
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Eric Richter <erichte@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
With DEBUG=1 builds we use the mcount hook to instrument how much stack
space we're using. If we detect that a function call has come within
2KB of the bottom of the stack we currently print a backtrace. This can
result in a huge amount of console IO in DEBUG=1 builds which can cause
op-test timeouts, etc.
Printing a backtrace on each function call isn't terribly useful, and it
ends up crowding out the backtrace that's printed when we hit a new
stack usage watermark. The watermark should provide enough information
to find and fix excessive stack usage issues so drop the per-function
backtrace printing and move the warning into the high-watermark check.
This change is largely necessary because of DEBUG=1 expands adds a
backtrace save area to struct lock which expands the size of it to nearly
2KB. struct cpu_thread (which lives at the bottom of the per-thread stacks)
contains three locks and an additional backtrace save area which is
enabled when DEBUG=1. The extra space requirements result in cpu_thread
ballooning from ~420 bytes to nearly 8KB.
Any growth in cpu_thread also results in less stack space being
available for the thread, so when DEBUG=1 is enabled we go from having
a 16KB stack to an 8KB stack. Although this seems large, skiboot does
have some fairly deep call chains (UART console flushing, TPM drivers,
both combined) which can cause the thread to come within 2KB of the stack
use warning zone.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
Maybe we should swap locations of the normal and emergency stacks so
cpu_thread takes space from the emergency stack rather than the normal one.
The e-stack should only be used at runtime where the call chains should be
smaller.
|
|
Previous commit 482f18adf21eeb5f6ce2a93334725509a8f6f0cd
added check for fused core mode and bailed out.
The check can be removed since fused core mode
is now supported in OPAL.
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Add PVR checks and feature mapping for POWER9 Cumulus chip.
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
cpu_get_core_index() currently uses pir_to_core_id() which returns
an EC number always (ie, a normal core number) even in fused core
mode. This is inconsistent with cpu_get_thread_index() which returns
a thread within a fused core (0...7) on P9.
So let's make things consistent and document it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
The "EC" primary is the primary thread of an EC, ie, the corresponding
small core "half" of the big core where the thread resides.
It will be necessary for the direct controls to target the right
half when doing special wakeups among others.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
pir_to_core_id() and pir_to_thread_id() are extensively
used by the direct controls code and are expected to return
the "normal" (non-fused, aka EC) core/thread IDs.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
P9 cores can be configured into fused core mode where two core chiplets
function as an 8-threaded, single core. So, bump four to eight in boot_entry
when in fused core mode and cpu_thread_count in init_boot_cpu.
The HID, AMOR, TSCR, RPR require the first active thread on that core chiplet
to load the copy for that core chiplet. So, send thread 1 of a fused core to
init_shared_sprs in boot_entry.
The code checks for fused core mode in the core thead state register and puts a
field in struct cpu_thread. This flag is checked when updating the HID and in
XIVE code when setting the special bar.
For XSCOM, the core ID is the non-fused EX. So, create macros to arrange the
bits. It's fairly verbose but somewhat readable.
This was tested on a P9 ZZ with 16 fused cores and ran HTX for over 24 hours.
Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Commit 34664746 moved opal_mpipl_save_crashing_pir() function call from
platform specific code to generic assert() path. I completely missed
to take care of all terminate path :-(
This resulted in breaking `opalcore` on Linux kernel initiated MPIPL. As :
- Linux initiated MPIPL calls platform termination function directly
- ELF core format needs crashing CPU details to generate proper code
Hence I think it makes sense to move this back to platform specific
terminate handler code.
Today we have two ways to trigger MPIPL based on service processor.
- On BMC system we call SBE S0 interrupt
- On FSP system we call `attn` instruction
In future if we add new ways to trigger MPIPL then we have to add platform
specific support code anyway. That way its fine to move this to platform
sepcific code.
One alternative is to make this call in all code path before making
platform.terminate call... which makes it more complicated than above approach.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
If OPAL boot fails after MPIPL init (opal_mpipl_init()) then we call MPIPL
boot instead of reboot. BMC is not aware of MPIPL. Hence it may result in
continuous MPIPL loop (boot -> crash -> MPIPL -> boot).
If OPAL boot fails (before loading kernel) then its better to call reboot.
So that BMC can detect `n` number of boot failures (generally n = 3) and
stop booting. That way we can avoid continuous loop.
This patch moves MPIPL init to the end of init process (just before starting
kernel). So that if we fail to boot OPAL we call normal reboot.
Also this patch introduces new function to detect MPIPL is enabled or not
(is_mpipl_enabled()). And in assert() path we check for this function
instead of `dump` DT node. So that it will make sure we will not call
MPIPL until opal_mpipl_init is complete.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
All callers of dt_resize_property() need to set the new property length
after calling it. append_chip_id() wasn't doing it, which caused this
assert when booting my machine:
[ 136.387213258,3] Unable to use memory range 0 from MSAREA 0
[ 136.387356677,3] Unable to use memory range 0 from MSAREA 2
[ 136.387408390,3] ***********************************************
[ 136.387454272,3] < assert failed at core/device.c:605 >
[ 136.387493225,3] .
[ 136.387512799,3] .
[ 136.387534056,3] .
[ 136.387550294,3] OO__)
[ 136.387579530,3] <"__/
[ 136.387605086,3] ^ ^
[ 136.387719329,3] Fatal TRAP at 0000000030028a18 .dt_property_set_cell+0x34 MSR 9000000000021002
[ 136.387801707,3] CFAR : 00000000300bfd3c MSR : 9000000000001000
[ 136.387847032,3] SRR0 : 0000000030028a18 SRR1 : 9000000000021002
[ 136.387893119,3] HSRR0: 0000000030012524 HSRR1: 9000000000001000
[ 136.387936830,3] DSISR: 40000000 DAR : 00000002019df000
[ 136.387983570,3] LR : 00000000300bfd40 CTR : 0000000000000000
[ 136.388046031,3] CR : 20004202 XER : 00000000
[ 136.388094553,3] GPR00: 00000000300bfd40 GPR16: 0000000000000001
[ 136.388139862,3] GPR01: 0000000031e536e0 GPR17: 00000000300ca3c9
[ 136.388181131,3] GPR02: 0000000030121200 GPR18: 0000000030103e1c
[ 136.388224105,3] GPR03: 000000003053fc60 GPR19: 0000000000000008
[ 136.388270356,3] GPR04: 0000000000000001 GPR20: 000000003053fba0
[ 136.388313950,3] GPR05: 0000000000000008 GPR21: 0000000000000001
[ 136.388363021,3] GPR06: 0000000031e50060 GPR22: 0000000000000001
[ 136.388416754,3] GPR07: 0000000000000000 GPR23: 0000000000000000
[ 136.388465729,3] GPR08: 0000000000000000 GPR24: 0000000000000000
[ 136.388508156,3] GPR09: 0000000000000004 GPR25: 0000000031204060
[ 136.388556203,3] GPR10: 0000000000000008 GPR26: 000000003120402c
[ 136.388599076,3] GPR11: 0000000000000000 GPR27: 0000000030010000
[ 136.388642108,3] GPR12: 0000000040004204 GPR28: 0000000000000002
[ 136.388694064,3] GPR13: 0000000031e50000 GPR29: 0000000031203ee0
[ 136.388743298,3] GPR14: 00000000300cbf03 GPR30: 0000000031202e80
[ 136.388797131,3] GPR15: 00000000300cc01c GPR31: 0000000030103a33
CPU 0048 Backtrace:
S: 0000000031e539e0 R: 0000000030028874 .dt_resize_property+0x28
S: 0000000031e53a60 R: 00000000300bfd40 .memory_parse+0xd84
S: 0000000031e53c40 R: 00000000300bc4d8 .parse_hdat+0xed0
S: 0000000031e53e30 R: 000000003001504c .main_cpu_entry+0x1ac
S: 0000000031e53f00 R: 0000000030002760 boot_entry+0x1b0
Avoid further appearances of the unidentified animal of doom by making
dt_resize_property() do the length updating itself, freeing its callers
from that need.
Suggested-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
We only really use the gcov output when doing the coverage report as a
part of the "docs" CI builds. It's useful for development to just run
the unit tests so make sure the "check" and "coverage" targets are
seperate.
This also speeds up our CI builds since those jobs are already doing a
seperate GCOV pass so building and running the GCOV binaries during the
check pass is redundant.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
This provides an initial facility to decode machine checks into
human readable strings, plus a minimum amount of metadata that
a handler has to understand in order to deal with the machine
check.
For now this is only used by skiboot to make MCE reporting nicer,
and an ERAT flush recovery attempt which is more about code
coverage than really being helpful.
***********************************************
Fatal MCE at 00000000300c9c0c .memcmp+0x3c MSR 9000000000141002
Cause: instruction fetch TLB multi-hit error
Effective address: 0x00000000300c9c0c
...
The intention is to subsequently provide an OPAL API with this
information that will enable an OS to implement a machine
independent OPAL machine check driver.
The code and data tables are derived from Linux code that I wrote,
so relicensing is okay.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Use magic marker in the exception stack frame that is used by the
unwinder to decode the interrupt type and NIA. The below example trace
comes from a modified skiboot that uses virtual memory, but any
interrupt type will appear similarly.
CPU 0000 Backtrace:
S: 0000000031c13580 R: 0000000030028210 .vm_dsi+0x360
S: 0000000031c13630 R: 000000003003b0dc .exception_entry+0x4fc
S: 0000000031c13830 R: 0000000030001f4c exception_entry_foo+0x4
--- Interrupt 0x300 at 000000003002431c ---
S: 0000000031c13b40 R: 000000003002430c .make_free.isra.0+0x110
S: 0000000031c13bd0 R: 0000000030025198 .mem_alloc+0x4a0
S: 0000000031c13c80 R: 0000000030028bac .__memalign+0x48
S: 0000000031c13d10 R: 0000000030028da4 .__zalloc+0x18
S: 0000000031c13d90 R: 000000003002fb34 .opal_init_msg+0x34
S: 0000000031c13e20 R: 00000000300234b4 .main_cpu_entry+0x61c
S: 0000000031c13f00 R: 00000000300031b8 boot_entry+0x1b0
--- OPAL boot ---
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[oliver: the new stackentry fields made our test heaps too small]
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
fixup! core: interrupt markers for stack traces
|
|
.head is for code and data which must reside at a fixed low address,
mainly entry points.
These are moved into .rodata. Despite being modified at runtime, this
facilitates these tables being write-protected in a later patch.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
The current fast reboot sequence is not as robust as it could be. It
is this:
- Fast reboot CPU stops all other threads with direct control xscoms;
- it disables ME (machine checks become checkstops);
- resets its SPRs (to get HID[HILE] for machine check interrupts) and
overwrites exception vectors with our vectors, with a special fast
reboot sreset vector that fixes endian (because OS owns HILE);
- then the fast reboot CPU enables ME.
At this point the fast reboot CPU can handle machine checks with the
skiboot handler, but no other cores can if the OS had switched HILE
(they'll execute garbled byte swapped instructions and crash badly).
- Then all CPUs run various cleanups, XIVE, resync TOD, etc.
- The boot CPU, which is not necessarily the same as the fast reboot
initiator CPU, runs xive_reset.
This is a lot of code to run, including locking and xscoms, with
machine check inoperable.
- Finally secondaries are released and everyone sets SPRs and enables
ME.
Secondaries on other cores don't wait for their thread 0 to set shared
SPRs before calling into the normal OPAL secondary code. This is
mostly okay because the boot CPU pauses here until all secondaries
reach their idle code, but it's not nice to release them out of the
fast reboot code in a state with various per-core SPRs in flux.
Fix this by having the fast reboot CPU not disable ME or reset its
SPRs, because machine checks can still be handled by the OS. Then
wait until all CPUs are called into fast reboot and spinning with
ME disabled, only then reset any SPRs, copy remaining exception
vectors, and now skiboot has taken over the machine check handling,
then the CPUs enable ME before cleaning up other things.
This way, the region with ME disabled and SPRs and exception vectors
in flux is kept absolutely minimal, with no xscoms, no MMIOs, and few
significant memory modifications, and all threads kept closely in step.
There are no windows where a machine check interrupt may execute
garbage due to mismatched HILE on any CPU.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Initial boot already saved original exception vectors to old_vectors,
copying again upon fast reboot will overwrite old_vectors with some
arbitrary vectors set up by the current OS.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
This patch disables Protected Execution Faciltiy (PEF).
This software procedure is needed for the lab because Cronus will be
configured to bring the machine up with PEF on. Hostboot has a similar
procedure for running with PEF off.
Skiboot can run with PEF on but the kernel cannot; the kernel will take
a machine check when trying to write a protected resource, such as the
PTCR.
So, use this until we have an ultravisor, or if we want to use BML with
Cronus without UV = 1.
Signed-off-by: Ryan Grimm <grimm@linux.ibm.com>
Tested-by: Alistair Popple <alistair@popple.id.au>
[oliver: replaced bare urfid with a macro for toolchain compatibility]
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Currently the wait_for_all_occ_init() function determines that the
OCCs associated with every Chip has been initialized by verifying if
the "Valid" bit in pstate table of that OCC is set.
However, on chips where all the EX units are guarded, the OCC, even
though it is active, does not update the pstate_table. Currently as a
result of this, OPAL concludes that the OCC is not functional and not
only disable Pstate initialization, but incorrectly report that that
OCCs were not initialized, thereby cutting other features such as
sensors.
Fix this by ensuring that
* We check if there is atleast one active EX unit in the chip
before checking if the OCC is active.
* On platforms with OCC-OPAL communication interface version 0x90
* wait_for_all_occ_init() only checks if the occ_state in the
OCC dynamic area is set to "Active State".
* move the "Valid" bit check to add_cpu_pstate_properties(),
which is where we create the device-tree entries for
frequency scaling.
Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Tested-by: Pavaman Subramaniyam <pavsubra@in.ibm.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Fused code mode is currently not supported in OPAL. Continuing to
boot the system would result in errors at later stages of boot.
Wait for console to be up and print message for developers to check
and fix the system modes.
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Tested-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
In simulation, hundreds of millions of cycles are chewed up in this code
path:
PC: 0x0000000030033450 -> <.bitmap_tst_bit>+0x18
LR: 0x000000003003347c -> <.buddy_check_alloc>+0x14
0x0000000031c13b30 -> <_ebss>+0x1803b30
0x000000003003351c -> <.buddy_check_alloc_down>+0x4c
0x00000000300339c4 -> <.buddy_free>+0x7c
0x0000000030033be8 -> <.buddy_create>+0xcc
0x0000000030089bbc -> <.xive_init>+0xf0
0x00000000300157cc -> <.main_cpu_entry>+0x8a0
0x000000003000275c -> <boot_entry>+0x1bc
Undefining BUDDY_DEBUG saves 30+ minutes of wall clock time, so fix
the "warning: unused parameter" messages when compiling.
Signed-off-by: Ryan Grimm <grimm@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
OPAL uses different path to trigger MPIPL:
- On BMC system we call SBE S0 interrupt
- On FSP system we call `attn` instruction
Currently on BMC system we collect crash CPU PIR details.. which is needed to
generate proper dump. This happens just before calling SBE S0 interrupt. Since
we don't use this path in FSP system OPAL is not saving crashing CPU details.
Hence by default `opalcore` is not pointing to crashing CPU and not showing
proper backtrace. We have to go through all CPUs to find crashing CPU backtrace.
This patch move this function to common place so that if MPIPL is supported
we collect crashing CPU data.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Add a driver for the SCOM ranges of the OCMB. Unlike most chips the OCMB
has two different (three if you count OpenCAPI config space) register
spaces and we need to ensure that the right access size is used on each.
Additionally the SCOM interface is a bit non-standard in that a full
physical address is passed as the SCOM address rather than a register
number so we don't need to perform any address transformations, we just
need to verify that the address falls into one of the nominated address
ranges.
Cc: Klaus Heinrich Kiwi <klaus@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Cc: Ilya Kuznetsov <ilya@yadro.com>
Cc: Artem Senichev <artemsen@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
SPDX makes it a simpler diff.
I have audited the commit history of each file to ensure that they are
exclusively authored by IBM and thus we have the right to relicense.
The motivation behind this is twofold:
1) We want to enable experiments with coreboot, which is GPLv2 licensed
2) An upcoming firmware component wants to incorporate code from skiboot
and code from the Linux kernel, which is GPLv2 licensed.
I have gone through the IBM internal way of gaining approval for this.
The following files are not exclusively authored by IBM, so are *not*
included in this update (I will be seeking approval from contributors):
core/direct-controls.c
core/flash.c
core/pcie-slot.c
external/common/arch_flash_unknown.c
external/common/rules.mk
external/gard/Makefile
external/gard/rules.mk
external/opal-prd/Makefile
external/pflash/Makefile
external/xscom-utils/Makefile
hdata/vpd.c
hw/dts.c
hw/ipmi/ipmi-watchdog.c
hw/phb4.c
include/cpu.h
include/phb4.h
include/platform.h
libflash/libffs.c
libstb/mbedtls/sha512.c
libstb/mbedtls/sha512.h
platforms/astbmc/barreleye.c
platforms/astbmc/garrison.c
platforms/astbmc/mihawk.c
platforms/astbmc/nicole.c
platforms/astbmc/p8dnu.c
platforms/astbmc/p8dtu.c
platforms/astbmc/p9dsu.c
platforms/astbmc/vesnin.c
platforms/rhesus/ec/config.h
platforms/rhesus/ec/gpio.h
platforms/rhesus/gpio.c
platforms/rhesus/rhesus.c
platforms/astbmc/talos.c
platforms/astbmc/romulus.c
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
[oliver: fixed up the drift]
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Minor cleanups to add clarity after commit ab1b05d2
"PCI: create optional loc-code platform callback"
Signed-off-by: Klaus Heinrich Kiwi <klaus@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Currently Linux will usually call opal_cec_reboot2() in response to
unrecoverable HMIs and other serious hardware errors. OPAL handles
platform errors by sending an error log to the BMC / FSP and
triggering a software checkstop.
Sending error logs to the BMC / FSP is normally an async operation,
but in this path we need to ensure that error logs are sent out before
the xstop is triggered. The easiest way to do that is to escalate the
severity of the generated error log from "abnormal reboot" to "panic"
since we force panic logs to be send synchronusly. It's also a more
accurate description of what's happening.
CC: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Reviewed-by: Klaus Heinrich Kiwi <klaus@linux.vnet.ibm.com>
[oliver: commit message]
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
local_alloc() function tries to find first allocatable local region and
tries to allocate memory. If that region doesn't have sufficient memory to
allocate then it will log below warning and tries to allocate memory from
next available region.
Warning:
--------
[ 268.346728058,3] mem_alloc(0x800000, 0x800000, "hw/xive.c:1630", ibm,firmware-allocs-memory@0) failed !
[ 268.346732718,6] Memory regions:
[ 268.346734353,6] 0x000030500000..000030ffffff : ibm,firmware-heap
[ 268.346833468,5] 420 allocs of 0x00000058 bytes at core/device.c:41 (total 0x9060)
[ 268.346978805,5] 2965 allocs of 0x00000038 bytes at core/device.c:424 (total 0x28898)
[ 268.347035614,5] 434 allocs of 0x00000040 bytes at core/device.c:424 (total 0x6c80)
[ 268.347093567,5] 365 allocs of 0x00000028 bytes at libc/string/strdup.c:23 (total 0x3908)
[ 268.347136818,5] 84 allocs of 0x00000048 bytes at core/device.c:424 (total 0x17a0)
[ 268.347179123,5] 21 allocs of 0x00000030 bytes at libc/string/strdup.c:23 (total 0x3f0)
....
....
Hostboot reserves memory for various nodes and passes this information via HDAT.
In some cases there will be small memory holes between two reservations
(ex: 16MB free space between two reservation).
add_region() function adds new region to the head of regions list.
mem_region_init() adds OPAL regions first and then hostboot regions. So
these smaller regions will be added to head of list. If we have smaller
free regions then we may hit above warning.
Hostboot uses top of the memory for various reservations. So if we sort
memory regions then allocator will use bigger region (region after OPAL
memory) for local allocation. And we will not hit above warning.
Memory region layout with this patch:
0 - 756MB : OS reserved region (for loading kernel)
756MB - ~856MB : OPAL memory (actual size depends on PIR)
856MB - ~956MB : Memory for MPIPL (actual size depends on OPAL runtime size)
956MB - ... : We will have free memory after 956MB which we can use
for local_alloc(). Typically this will be multiple GBs.
So it works fine.
.... - top_mem: Hostboot reservations + small holes
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Suggested-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Fast reboot started life as a debug hack and it escaped into the wild when
Stewart enabled it by default. There was some reasons for this, but the
main one is that a full reboot takes somewhere between one and five
minutes. For those of us who spend all day rebooting their POWER systems
this is great, but the utility for end users has always been pretty
questionable.
Rebooting a system should be a fairly infrequent activity in the field
with the main reasons for doing one being:
1) Kernel updates,
2) Misbehaving hardware
Although 1) can be performed by kexec we have found that it fails due to
2) occasionally. The reason for 2) is usually hardware getting itself
into a bad state. The universal fix for that type of hardware problem
is turning the hardware off and back on again so it's preferable that
a reboot actually does that.
This patch refactors the reboot handling OPAL calls so that fast-reboot
is only used by default when explicitly enabled, or manually invoked.
This allows developers to continue to use fast-reboot without expecting
users deal with its quirks (and understand how a "normal" reboot,
fast-reboot and MPIPL differ).
This has two user visible changes:
1. Full reboot is now the default. In order to get fast-reboot as the
default the nvram option needs to be set:
nvram -p ibm,skiboot --update-config fast-reset=1
2. The nvram option to force a fast-reboot even when some part of
skiboot has called disable_fast_reboot() has changed from
'fast-reset=im-feeling-lucky' to 'force-fast-reset=1' because
it's impossible to actually use that 'feature' if fast-reboot is
off by default.
nvram -p ibm,skiboot --update-config force-fast-reset=1
Cc: Stewart Smith <stewart@flamingspork.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
The OPAL_CEC_REBOOT2 OPAL call allows a specific type of reboot to be
requested. We can use this to allow the OS to request a fast-reboot
explicitly rather than relying on nvram hacks to change the default
behaviour.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Some platforms (mostly OpenPower-based) will favor a short,
slot-label-based string for the "ibm,loc-code" DT property. Other
platforms such as ZZ/FSP-based platforms will prefer the fully-qualified
slot-location-code for it.
This patches creates a new operation on the platform struct, allowing
for an optional callback to create the "ibm,loc-code" property in a
platform-specific way. If the callback is not defined, use the
cleaned-up default that was in use so far.
Signed-off-by: Klaus Heinrich Kiwi <klaus@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Also fix "DESC" to ASCII conversion.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
During boot, OPAL reserves memory required to capture OPAL dump and
architected register data. During MPIPL, hostboot will copy OPAL dump
to this memory. Post MPIPL kernel will use this memory to create opalcore.
We use mem_reserve_fw() for this reservation. At present this reservation
happens late in the init path. It may clash with memory allocated by
local_alloc().
We have two option to fix above issue:
- Use local_alloc() for allocating memory for OPAL dump
This works fine on first boot. We can use this method to reserve
memory. But Post MPIPL we still want to reserve destination
memory to make sure no one is stomping this area. Also this reservation
might have happened in between other local_allocations. So in Post MPIPL
boot allocator may not find enough memory in first region for other
local_alloc() requests and may throw mem_alloc() error before trying to
allocate from other regions.
- Early memory reservation for OPAL dump
Allocate and reserve memory just after memory region init.
This patch uses second approach to fix reservation issue.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Otherwise device tree will continue to have `mpipl-boot` and kernel may
think its MPIPL boot.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Reviewed-by: Stewart Smith <stewart@flamingspork.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Print the contents of the buffer as an array of bytes in hex, which
avoids endian issues and reading beyond the end of the buffer.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Should be no real code change, these mostly update type declarations
that sparse complains about.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
This fixes quite a few sparse endian annotations across the tree.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
This replaces several instances dt accesses with higher level
primitives throughout the tree.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
This adds support for building LE skiboot with LITTLE_ENDIAN=1.
This is not complete, notably PHB3, NPU* and *CAPI*, but it is
sufficient to build and boot on mambo and OpenPOWER POWER9 systems.
LE/ELFv2 is a nicer calling convention, and results in smaller image and
less stack usage. It also follows the rest of the Linux/OpenPOWER stack
moving to LE.
The OPALv3 call interface still requires an ugly transition through BE
for compatibility, but that is all handled on the OPAL side.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Convert memconsole dt construction and in-memory tables to use
explicit endian conversions.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|
|
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|