Age | Commit message | Author | Files | Lines |
|
Standard support for Ubuntu 20.04 ended on May 31, 2025. Remove it and
add Ubuntu 24.04.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Fedora 40 has reached end-of-life. Remove it and add Fedora 42.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-by: Dan Horák <dan@danny.cz>
|
|
Respect SSL_DIR if it is set, to use ssl headers and libs that are in a
nonstandard location.
When skiboot is built by op-build, the system ssl installation is being
used instead of the buildroot one. This change will let us fix that.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Starting with RPM v6, "packages built with RPM < 4.14.0 cannot be
verified due to their use of weak, obsolete MD5 and SHA1 digests." [1]
So, our ancient p9 mambo package is now failing to install on
fedora-rawhide. Use the suggested workaround of setting
%_pkgverify_flags to 0 to restore the old behavior.
[1] https://rpm.org/releases/6.0.0#compatibility-notes
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-by: Dan Horák <dan@danny.cz>
|
|
Add logic to handle cases where the Line Status Register (LSR)
reads 0xFF, which may indicate an error in reading the
register through LPC or the presence of multiple simultaneous UART errors.
Previously, this false reading of set bits led to a soft lockup or hang on
older production systems.
In such scenarios, processing data read/write operations does
not make sense. The function now returns `false` to signal
the failure and halt further operations.
Signed-off-by: Abhishek Singh Tomar <abhishek@linux.ibm.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
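
As an illustration only, the LSR check described in the commit above might look like the following minimal sketch. The names `LSR_INVALID`, `LSR_THRE` and `uart_tx_ready()` are hypothetical and are not taken from the skiboot source; the THRE bit position follows the standard 16550 register layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for this sketch; not taken from skiboot. */
#define LSR_INVALID	0xFF	/* all bits set: failed LPC read or many errors */
#define LSR_THRE	0x20	/* 16550: transmitter holding register empty */

/*
 * Decide whether the UART can accept a byte, given a raw LSR value.
 * An all-ones value is treated as a failed read, so the caller stops
 * instead of trusting status bits that were never really set.
 */
bool uart_tx_ready(uint8_t lsr)
{
	if (lsr == LSR_INVALID)
		return false;

	return (lsr & LSR_THRE) != 0;
}
```

The point of checking for 0xFF first is that every status bit tests as "set" in that value, so any later bit test would otherwise succeed by accident.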
|
|
According to doc/opal-api/opal-console-read-write-1-2.rst, the length
argument of OPAL_CONSOLE_WRITE_BUFFER_SPACE is only used to return a
value. Indeed, the API is called twice in the kernel code, and __length
remains uninitialized in both cases. This can lead to a hang/softlock
issue in older hardware.
Eliminate the problematic comparison which uses the uninitialized value.
Fixes: 6bf21350da32 ("uart: Drop console write data if BMC becomes unresponsive")
Signed-off-by: Abhishek Singh Tomar <abhishek@linux.ibm.com>
Reviewed-by: Aditya Gupta <adityag@linux.ibm.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
When BMC is in NTP mode, SET_SEL_TIME returns IPMI wrong state errors.
This better tracks and returns IPMI errors from OPAL_RTC_WRITE, which
prevents Linux from continuing to retry this non-transient failure.
Could the BMC be switched to non-NTP mode some time after the OS is up?
The host could see the OPAL_WRONG_STATE return and deal with this if
necessary.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
The IPMI Set Enables command is supposed to do a read-modify-write
operation to change bits if it is not coded specifically for the
system. Since the various BMCs supported by astbmc platform code
(e.g., QEMU and OpenBMC) are a bit different and subject to change,
it's safer to set bits with RMW.
Then bits should be set one by one to help isolate failures. And
the Set Enables command is changed to run synchronously so that
host/BMC behaviour is a bit more deterministic when setting up IPMI.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
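
The read-modify-write approach described in the commit above can be sketched as follows. This is only an illustration against a fake BMC: `struct fake_bmc`, `fake_get_enables()`, `fake_set_enables()` and `set_enable_bits()` are all hypothetical names, and the real code talks to the BMC through IPMI messages rather than a struct.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a BMC; not the skiboot IPMI interface. */
struct fake_bmc {
	uint8_t enables;	/* current global enables value */
	uint8_t reject_mask;	/* bits this fake BMC refuses to set */
};

bool fake_get_enables(struct fake_bmc *bmc, uint8_t *val)
{
	*val = bmc->enables;
	return true;
}

bool fake_set_enables(struct fake_bmc *bmc, uint8_t val)
{
	if (val & bmc->reject_mask)
		return false;
	bmc->enables = val;
	return true;
}

/*
 * Set each wanted bit with its own read-modify-write cycle, so a
 * failure can be pinned on one bit. Returns a mask of the bits that
 * could not be set.
 */
uint8_t set_enable_bits(struct fake_bmc *bmc, uint8_t wanted)
{
	uint8_t failed = 0;

	for (int bit = 0; bit < 8; bit++) {
		uint8_t mask = (uint8_t)(1u << bit);
		uint8_t cur;

		if (!(wanted & mask))
			continue;
		if (!fake_get_enables(bmc, &cur)) {
			failed |= mask;
			continue;
		}
		if (cur & mask)
			continue;	/* already set, nothing to write */
		if (!fake_set_enables(bmc, (uint8_t)(cur | mask)))
			failed |= mask;
	}

	return failed;
}
```

Because each write carries the value just read back, bits that the BMC already had enabled are preserved, and one rejected bit does not prevent the others from being set.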
|
|
The push-context operation is not defined if a context is already
valid; it should only be performed if the CAM is pulled. Add a check
to ensure the TIMA reset was performed properly before pushing a
context.
QEMU does not yet model the reset via PTER toggle correctly, so this
causes some noise in boot.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Poll timers are not delay based and have no kind of ordering, so
processing does not have to stop if a busy timer is encountered.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
When timer run code encounters an already-running timer, it has to
stop processing and run them later. In the case of poll timers the
SBE timer is scheduled for a minimum-delay, and for delay timers
nothing is done.
This looks backwards: poll timers do not get called from the
SBE interrupt so that delay is pointless, whereas it is helpful
for delay timers to ensure they're processed again soon.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
The LPC firmware memory space is 256MB in size, and it may select
up to 16 devices with the IDSEL field.
OPB addresses FW space as a 32-bit value with the top 4 bits selecting
the device and the bottom 28 addressing the FW memory space. Therefore
the top bits should be ignored when calculating the offset into the FW
window.
Fix this by allowing lpc_opb_prepare() to adjust the address directly
and correctly mask it. Now there's no need to return opb_base to the
caller either, fold that in at the same time.
This bug could be observed with QEMU's PNOR implementation that placed
some of the PNOR in device 1, though that has been changed in QEMU 10.0.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
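
The address split described in the commit above (4-bit device select on top, 28-bit offset below) can be illustrated with a small sketch. The macro and function names here are hypothetical and do not reflect how lpc_opb_prepare() is actually written.

```c
#include <stdint.h>

/* Hypothetical names; the bit widths come from the commit text. */
#define LPC_FW_IDSEL_SHIFT	28
#define LPC_FW_OFFSET_MASK	0x0fffffffu	/* 256MB - 1 */

/* Device selected by the top 4 bits of a 32-bit FW space address. */
uint32_t lpc_fw_idsel(uint32_t addr)
{
	return addr >> LPC_FW_IDSEL_SHIFT;
}

/* Offset into the 256MB FW window, with the device select masked off. */
uint32_t lpc_fw_offset(uint32_t addr)
{
	return addr & LPC_FW_OFFSET_MASK;
}
```

For example, address 0x10001000 selects device 1 at offset 0x1000; without the mask, the offset would wrongly land 256MB past the start of the window.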
|
|
Add validation of BT FIFO sizes against IPMI message allocations.
The BT interface capabilities command returns one less than the FIFO
size, so fix this off by one error in the sanity check.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Guarded CPUs are powered down so access to their PC xscom registers
fails. This prevents the failed attempt and accompanying warnings.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
On systems with redundant FSPs, OPAL detects primary/backup FSPs.
However, it ends up considering the backup FSP as active_fsp and starts
mailbox communication with it. This causes OPAL to send IPL messages to
the backup FSP instead of the primary. Since the primary FSP never
receives IPL messages from OPAL, it assumes that OPAL failed to boot and
enters the termination state.
The active_fsp is set in the fsp_update_links_states() function, which is
invoked during boot, through fsp_create_fsp(), as well as reset/reload,
through fsp_reinit_fsp(). During boot, when 2 FSPs are detected by
OPAL, fsp_update_links_states() sets the last one as active_fsp, which
may not be the primary FSP.
Fix this issue by detecting the primary FSP and setting it as active_fsp
during OPAL boot.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
The compile errors this ignores are now being resolved by use of the
"nonstring" attribute.
This reverts commit 009fd0976006d0327cf374c1ff8ae73dd4895efa.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
GCC 15 has introduced the "unterminated-string-initialization" warning.
When a character array is initialised with a string literal that fills
the whole array, so that the terminating null character is not included,
GCC 15 gives a warning (and warnings are treated as errors in the
skiboot build).
This causes the following errors when compiling skiboot with GCC 15:
core/init.c:79:27: error: initializer-string for array of ‘unsigned char’ truncates NUL terminator but destination lacks ‘nonstring’ attribute (9 chars into 8 available) [-Werror=unterminated-string-initialization]
79 | .eye_catcher = "OPALdbug",
| ^~~~~~~~~~
cc1: all warnings being treated as errors
...
In file included from hdata/hdata.h:8,
from hdata/spira.c:17:
hdata/spira.c:35:32: error: initializer-string for array of ‘char’ truncates NUL terminator but destination lacks ‘nonstring’ attribute (7 chars into 6 available) [-Werror=unterminated-string-initialization]
35 | .hdr = HDIF_SIMPLE_HDR("PROCIN", 1, struct proc_init_data),
| ^~~~~~~~
hdata/hdif.h:45:68: note: in definition of macro ‘HDIF_ID’
45 | #define HDIF_ID(_id) .d1f0 = CPU_TO_BE16(0xd1f0), .id = _id
| ^~~
hdata/spira.c:35:16: note: in expansion of macro ‘HDIF_SIMPLE_HDR’
35 | .hdr = HDIF_SIMPLE_HDR("PROCIN", 1, struct proc_init_data),
| ^~~~~~~~~~~~~~~
...
(similar errors few more times with hdata)
...
cc1: all warnings being treated as errors
Fix the errors by marking character arrays which are not supposed to be
"null-terminated strings" with the "nonstring" attribute, such as the
eye-catchers in the skiboot debug descriptor and the HDIF header.
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
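
For readers unfamiliar with the attribute, here is a minimal sketch of the technique named in the commit above. The `NONSTRING` macro, `struct eye_catcher_demo` and `demo_id_matches()` are hypothetical names for this illustration; the actual skiboot definitions may be spelled differently.

```c
#include <string.h>

/* Use the attribute only where the compiler knows it. */
#if defined(__has_attribute)
# if __has_attribute(nonstring)
#  define NONSTRING __attribute__((nonstring))
# endif
#endif
#ifndef NONSTRING
# define NONSTRING
#endif

/* Hypothetical struct: a fixed-width eye catcher with no NUL. */
struct eye_catcher_demo {
	char id[6] NONSTRING;
};

/* All six bytes are used, so the terminating NUL is dropped on purpose. */
static const struct eye_catcher_demo demo = { .id = "PROCIN" };

/* Compare with memcmp(): the array must never be treated as a C string. */
int demo_id_matches(const char *sig)
{
	return memcmp(demo.id, sig, sizeof(demo.id)) == 0;
}
```

The attribute documents intent as well as silencing the diagnostic: it tells the compiler that the array is a byte field, so string functions such as strlen() should not be used on it.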
|
|
Currently, if duplicate entries are encountered in the TPMREL section
while parsing HDAT, OPAL crashes with the backtrace below:
[ 119.205498180,3] DT: dt_attach_root failed, duplicate ibm,cvc-service@40
[ 119.206975658,3] ***********************************************
[ 119.208669044,3] Fatal MCE at 000000003003729c
.dt_find_property+0x30 MSR 9000000000001002
[ 119.210355268,3] Cause: unknown error
[ 119.211273270,3] CFAR : 0000000030037288 MSR : 9000000000001002
[ 119.212502638,3] SRR0 : 000000003003729c SRR1 : 9000000000001002
[ 119.214037362,3] HSRR0: 0000000030020024 HSRR1: 9000000000001000
[ 119.215266730,3] DSISR: 40000000 DAR : a600607d01006b79
[...]
CPU 0008 Backtrace:
S: 0000000031c53980 R: 0000000030026b0c .__memalign+0x58
S: 0000000031c53a10 R: 0000000030037378 .new_property+0xb0
S: 0000000031c53aa0 R: 0000000030037778 .__dt_add_property_strings+0x58
S: 0000000031c53b40 R: 000000003010bf74 .node_stb_parse+0x414
S: 0000000031c53c30 R: 0000000030102ee4 .parse_hdat+0x20cc
S: 0000000031c53e30 R: 0000000030022c04 .main_cpu_entry+0x1d0
S: 0000000031c53f00 R: 000000003000321c go_primary+0x10c
--- OPAL boot ---
Fix the null pointer dereference and proceed with a warning message
instead of crashing. Also add debug prints to display all entries.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Reviewed-by: Dan Horák <dan@danny.cz>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Since gcc commit 72e0c742bd01 ("gcov: make profile merging smarter"),
gcov expects there to be a checksum field after the stamp. For our
purposes it's not necessary for it to be a valid value.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Starting with GCC 12, gcov cannot parse the .gcda files we generate:
$ powerpc64le-linux-gcov-dump platforms/qemu/qemu.gcda
platforms/qemu/qemu.gcda:data:magic `adcg':version `B23*' (swapped endianness)
platforms/qemu/qemu.gcda:stamp 1742688024
platforms/qemu/qemu.gcda:checksum 2623079854
platforms/qemu/qemu.gcda: 01000000: 3:FUNCTION ident=1191288390, lineno_checksum=0xdb12f55c, cfg_checksum=0xf9e50e8f
platforms/qemu/qemu.gcda:tag `46db12f5' is incorrectly nested
platforms/qemu/qemu.gcda: 46db12f5:1559880974:UNKNOWN
This is due to gcc commit 23eb66d1d46a ("gcov: Use system IO
buffering"), where the length field of tags in the file changed to
represent total bytes, not a count of words.
Change what we write accordingly.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
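
The change in the meaning of the length field described in the commit above can be sketched as below. This is an illustration under the assumption that the GCC 12 cut-off stated in the commit is the only condition; `gcov_tag_length()` and `GCOV_WORD_SIZE` are hypothetical names, not the ones used in skiboot.

```c
#include <stdint.h>

#define GCOV_WORD_SIZE	4u	/* gcov words are 32 bits wide */

/*
 * Value to store in a tag's length field for a payload of nwords
 * 32-bit words. gcc_major is the major version of the toolchain
 * whose gcov will read the file.
 */
uint32_t gcov_tag_length(unsigned int gcc_major, uint32_t nwords)
{
	if (gcc_major >= 12)
		return nwords * GCOV_WORD_SIZE;	/* total bytes */

	return nwords;				/* count of words */
}
```

A reader that expects bytes but is given a word count stops a quarter of the way through the record, which would be consistent with the "incorrectly nested" tag error in the dump above.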
|
|
The number of gcov counters can be derived from the gcc source by doing
git grep -c ^DEF_GCOV_COUNTER $(git tag | grep ^releases/gcc) gcc/gcov-counter.def
Add the newer GCC releases to extract-gcov. While we're at it, rewrite
the preprocessor statements to use #elif for easier reading.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Remove the extra 0x from this message:
[ 0.042561024,5] GCOV: gcov_info_list at 0x0x30481280
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Add a line to the gcov documentation for what was added in commit
8d0f41e021b3 ("gcov: Add gcov data struct to sysfs").
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Standard support for Ubuntu 18.04 ended on May 31, 2023.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
When QEMU has no support for MPIPL, 'p9_sbe_terminate' returns early
at:
/* Return if MPIPL is not supported */
if (!is_mpipl_enabled())
return;
But with MPIPL supported in QEMU, 'p9_sbe_terminate' continues further and
calls 'flash_unregister' which causes a Machine Check due to nullptr
dereference of 'system_flash':
[ 13.240783728,5] Reboot: OS reported error. Performing MPIPL
[ 13.241662601,5] DUMP: Crashing PIR = 0x0
[ 13.244049276,5] RESET: Fast reboot disabled: Kernel re-entered OPAL
[ 1.815018] Disabling lock debugging due to kernel taint
[ 1.815518] MCE: CPU0: machine check (Severe) Real address Load (bad) DAR: 0000006000000098 [Not recovered]
[ 1.815544] MCE: CPU0: NIP: [0000000030040f54] 0x30040f54
[ 1.815911] MCE: CPU0: Initiator CPU
[ 1.815930] MCE: CPU0: Hardware error
[ 1.816110] opal: Hardware platform error: Unrecoverable Machine Check exception
[ 1.816338] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Tainted: G M 6.12.0-rc4+ #1
[ 1.816531] Tainted: [M]=MACHINE_CHECK
[ 1.816546] Hardware name: IBM PowerNV (emulated by qemu) POWER10 0x801200 opal:v7.1 PowerNV
[ 1.816629] NIP: 0000000030040f54 LR: 000000003007e528 CTR: 000000003004d75c
[ 1.816646] REGS: c0000004d5e47d60 TRAP: 0200 Tainted: G M (6.12.0-rc4+)
[ 1.816684] MSR: 9000000002a03002 <SF,HV,VEC,VSX,FP,ME,RI> CR: 28002284 XER: 00000000
[ 1.816863] CFAR: 000000003007e524 DAR: 0000006000000098 DSISR: 00000040 IRQMASK: 3
[ 1.816863] GPR00: 000000003007e528 0000000031c13ac0 0000000030192900 0000006000000060
[ 1.816863] GPR04: 0000000030500028 000000000000000a 0000000031c10068 0000000031c10068
[ 1.816863] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 1.816863] GPR12: 0000000028002284 c000000002e80000 c00000000001192c 0000000000000000
[ 1.816863] GPR16: 0000000031c10000 0000000000000000 0000000000000000 0000000000000000
[ 1.816863] GPR20: 0000000000000003 0000000000000074 0000000000000000 0000000000000000
[ 1.816863] GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 1.816863] GPR28: c000000002d0e8c8 00000000301257de c000000002d0e8c8 000000000000000c
[ 1.817061] NIP [0000000030040f54] 0x30040f54
[ 1.817074] LR [000000003007e528] 0x3007e528
[ 1.817165] Call Trace:
[ 1.817337] Code: 00000060 80002138 e01d0d48 00000000 01000000 00000180 a602087c 3700223d 602e29e9 100001f8 91ff21f8 180069e8 <380023e9> 0000292c 34008241 280041f8
[ 13.247702490,0] OPAL: Reboot requested due to Platform error.
[ 13.247857686,3] OPAL: failed to log an error
[ 13.248012502,2] NVRAM: Failed to load
Previously, the above machine check was never hit because the QEMU
platform didn't have MPIPL, so the caller 'p9_sbe_terminate' returned early.
Add a null check to ignore the unregister request if system_flash is not set.
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
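
The guard described in the commit above amounts to the following pattern. This is a simplified sketch with hypothetical names (`struct flash_demo`, `system_flash_demo`, `flash_unregister_demo()`); the real flash_unregister() does more work and has a different signature.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real flash structures. */
struct flash_demo {
	bool registered;
};

static struct flash_demo *system_flash_demo;

/*
 * Unregister the system flash if one was ever registered.
 * Returns false, without touching memory, when none is set.
 */
bool flash_unregister_demo(void)
{
	if (!system_flash_demo)
		return false;

	system_flash_demo->registered = false;
	system_flash_demo = NULL;
	return true;
}
```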
|
|
At the moment, the generic platform sets bmc_generic() as the BMC
platform, which does not have any support for initializing the flash, and
hence it fails to load the petitboot kernel.
[ 583.105000325,4] FLASH: Failed to load VERSION data
[ 583.105490257,5] INIT: Waiting for kernel...
[ 583.105523156,5] INIT: platform wait for kernel load failed
[ 583.105555219,5] INIT: Assuming kernel at 0x20000000
[ 583.105589925,3] INIT: ELF header not found. Assuming raw binary.
[...]
[ 583.299682673,5] INIT: Starting kernel at 0x20000000, fdt at 0x30a44eb0
1274673 bytes
[ 583.344432417,3] ***********************************************
[ 583.344490230,3] Fatal Exception 0x800 at 0000000020000000
MSR 9000000000000000
[ 583.344535875,3] CFAR : 0000000030022948 MSR : 9000000000000000
[ 583.344578019,3] SRR0 : 0000000020000000 SRR1 : 9000000000000000
[ 583.344620242,3] HSRR0: 0000000020000000 HSRR1: 9000000000000000
OPAL builds the device tree for BMC-based systems using HDAT. It
populates the bmc/compatible node with the BMC hardware version, e.g.
"ibm,ast2600,openbmc". Use that to identify the proper BMC hardware board
and initialize the BMC platform with the proper backend. This allows OPAL
to successfully load and boot into the petitboot kernel.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Everest's hub ID is 0x52, which OPAL previously didn't recognise:
[ 574.179390090,6] CEC: HUB FRU 0 is CPU Card
[ 574.179430286,6] CEC: 2 chips in FRU
[ 574.179464930,7] CEC: IO Hub Chip #0 OK
[ 574.179497312,7] CEC: PChip: 0 HUB ID: 0052 [EC=0x20] Hub#=0)
[ 574.179543358,3] CEC: Hub ID 0x0052 unsupported ! <--------
Because it does not recognise the hub ID, OPAL doesn't initialise the
PCIe slots. Define 0x52 as Everest's hub ID, so OPAL also initialises the
PCIe slots for Everest.
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Set up skiboot.tcl with a Power11 config so that it can boot on Power11 mambo.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Add support for QEMU simulator for Power11 when it starts supporting
"qemu,powernv11" machines.
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Update the cpu_feature structure to support Power11.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Detect Power11 PVR and use P10 code path.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
[adityag: Add Power11 chiptod device node]
[adityag: Fix the proc_gen checks in pir_to_thread_id and bmc sensor]
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Update the external architecture checker script and Makefile
for aarch64.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
In ffspart we assign this make variable:
FFSPART_VERSION ?= $(shell ./make_version.sh $(EXE))
However, ./make_version.sh is actually a make target, and whether it
exists or not at the time of this assignment is by chance, depending on
how the make concurrency works out.
In practice, this intermittently causes CI build failure:
make -j${MAKE_J} check
+ make -j4 check
...
[ RUN-TEST ] check-ffspart
...
make[1]: ./make_version.sh: No such file or directory
...
make[1]: *** [Makefile:13: check] Error 1
make[1]: Entering directory '/build/external/ffspart'
...
running test/tests/00-usage
running test/tests/01-param-sanity
Fatal error, cannot execute binary './ffspart'. Did you make?
make[1]: Leaving directory '/build/external/ffspart'
make: *** [/build/external/Makefile.check:21: check-ffspart] Error 2
make: *** Waiting for unfinished jobs....
The rule for make_version.sh is just a symlink:
make_version.sh:
$(Q_LN)ln -sf ../../make_version.sh
To avoid the race, call make_version.sh from its actual location instead
of relying on the link to be created. The same thing was done for gard
in commit 8ab0caf26de9 ("external/gard: Fix make dist target").
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Currently the install path for the opal-prd binary is defined in
the Makefile by the $sbindir variable, but its service file hard-codes
the path to /usr/sbin/opal-prd. The build should generate the service
file based on the actual $sbindir value.
Also strip the trailing slash from the $prefix variable.
Signed-off-by: Dan Horák <dan@danny.cz>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Commit 0a6a2ff30c9e ("mambo: Add persistent memory disk support") allows
the user to map disk images as persistent memory using the PMEM_DISK
environment variable. However, if the size of the disk image file passed is
not 2MB aligned, the Linux kernel fails to detect the pmem device, with a
misaligned error:
nd_pmem namespace0.0: [mem 0x20000000000-0x203fffe01ff flags 0x200]
misaligned, unable to map
nd_pmem namespace0.0: probe with driver nd_pmem failed with error -95
The Linux kernel then fails to mount the root fs from /dev/pmem0:
md: ... autorun DONE.
/dev/root: Can't open blockdev
VFS: Cannot open root device "/dev/pmem0" or unknown-block(0,0):
error -6
[...]
Kernel panic - not syncing: VFS: Unable to mount root fs on
unknown-block(0,0)
Fix this by adding the remaining bytes as padding to make the pmem device
memory map 2MB aligned.
Reported-by: Brad Thomasson <bthomas@us.ibm.com>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
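
The padding calculation described in the commit above is plain round-up arithmetic. The sketch below is in C for illustration only (mambo itself is driven by Tcl scripts), and `PMEM_ALIGN` and `pmem_padding()` are hypothetical names.

```c
#include <stdint.h>

#define PMEM_ALIGN	(2ull * 1024 * 1024)	/* 2MB */

/*
 * Number of bytes to append to an image of the given size so that
 * the mapped region ends on a 2MB boundary.
 */
uint64_t pmem_padding(uint64_t size)
{
	uint64_t rem = size % PMEM_ALIGN;

	return rem ? PMEM_ALIGN - rem : 0;
}
```

Applied to the range in the kernel log above (0x20000000000-0x203fffe01ff, a size of 0x3fffe0200 bytes), this yields 0x1fe00 bytes of padding, which rounds the region up to 0x400000000 bytes.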
|
|
Six bytes of the HDIF header are used as an eye catcher:
struct HDIF_common_hdr {
...
char id[6]; /* eye catcher string */
...
}
We assign all six characters of this string without a terminating nul,
so now that GCC 15 enables -Werror=unterminated-string-initialization by
default, the build breaks:
In file included from hdata/test/../spira.h:7,
from hdata/test/../cpu-common.c:5,
from hdata/test/hdata_to_dt.c:148:
hdata/test/../spira.c:35:32: error: initializer-string for array of 'char' is too long [-Werror=unterminated-string-initialization]
35 | .hdr = HDIF_SIMPLE_HDR("PROCIN", 1, struct proc_init_data),
| ^~~~~~~~
hdata/test/../hdif.h:45:68: note: in definition of macro 'HDIF_ID'
45 | #define HDIF_ID(_id) .d1f0 = CPU_TO_BE16(0xd1f0), .id = _id
| ^~~
hdata/test/../spira.c:35:16: note: in expansion of macro 'HDIF_SIMPLE_HDR'
35 | .hdr = HDIF_SIMPLE_HDR("PROCIN", 1, struct proc_init_data),
| ^~~~~~~~~~~~~~~
hdata/test/../spira.h:797:33: error: initializer-string for array of 'char' is too long [-Werror=unterminated-string-initialization]
797 | #define CPU_CTL_HDIF_SIG "CPUCTL"
| ^~~~~~~~
hdata/test/../hdif.h:45:68: note: in definition of macro 'HDIF_ID'
45 | #define HDIF_ID(_id) .d1f0 = CPU_TO_BE16(0xd1f0), .id = _id
| ^~~
hdata/test/../spira.c:73:16: note: in expansion of macro 'HDIF_SIMPLE_HDR'
73 | .hdr = HDIF_SIMPLE_HDR(CPU_CTL_HDIF_SIG, 2, struct cpu_ctl_init_data),
| ^~~~~~~~~~~~~~~
hdata/test/../spira.c:73:32: note: in expansion of macro 'CPU_CTL_HDIF_SIG'
73 | .hdr = HDIF_SIMPLE_HDR(CPU_CTL_HDIF_SIG, 2, struct cpu_ctl_init_data),
| ^~~~~~~~~~~~~~~~
hdata/test/../spira.h:30:33: error: initializer-string for array of 'char' is too long [-Werror=unterminated-string-initialization]
30 | #define SPIRAH_HDIF_SIG "SPIRAH"
| ^~~~~~~~
hdata/test/../hdif.h:45:68: note: in definition of macro 'HDIF_ID'
45 | #define HDIF_ID(_id) .d1f0 = CPU_TO_BE16(0xd1f0), .id = _id
| ^~~
hdata/test/../spira.c:126:16: note: in expansion of macro 'HDIF_SIMPLE_HDR'
126 | .hdr = HDIF_SIMPLE_HDR(SPIRAH_HDIF_SIG, SPIRAH_VERSION, struct spirah),
| ^~~~~~~~~~~~~~~
hdata/test/../spira.c:126:32: note: in expansion of macro 'SPIRAH_HDIF_SIG'
126 | .hdr = HDIF_SIMPLE_HDR(SPIRAH_HDIF_SIG, SPIRAH_VERSION, struct spirah),
| ^~~~~~~~~~~~~~~
To ignore the spurious error, build the single testcase that trips this
with -Wno-error=unterminated-string-initialization.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-by: Dan Horák <dan@danny.cz>
|
|
P10 has a lower minimum timeout threshold than P9 (100usecs).
Some P10 SBE timers run about 6.7% slow, which must be a hardware or
firmware issue. Use the SBE timer health checking code to detect this
and compensate for it. Speeding up timers as a rule is dangerous because
early expiry is a bug; however, the core timer code checks expiry against
the CPU's timebase when running timers, and with the previous changes it
will schedule a new SBE timer for the remaining delay. So if this
adjustment speeds things up slightly too much, it won't cause bugs.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
The SBE in P10 has a maximum expiry limit of just over 10s, so limit
SBE timers to 10s. If the desired timeout is longer than 10s,
additional SBE timers will be scheduled as the 10s timers are
serviced.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
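
The capping behaviour described in the commit above can be sketched as follows. This is an illustration only: `SBE_TIMER_MAX_US`, `sbe_timer_clamp()` and `sbe_timers_needed()` are hypothetical names, and the real code re-arms the timer from the timer-expiry path rather than counting in a loop.

```c
#include <stdint.h>

#define SBE_TIMER_MAX_US	(10ull * 1000 * 1000)	/* 10s cap from the commit text */

/* Delay to program into the SBE now, given the time still remaining. */
uint64_t sbe_timer_clamp(uint64_t remaining_us)
{
	return remaining_us > SBE_TIMER_MAX_US ? SBE_TIMER_MAX_US : remaining_us;
}

/* How many back-to-back SBE timers a given timeout ends up using. */
unsigned int sbe_timers_needed(uint64_t timeout_us)
{
	unsigned int n = 0;

	while (timeout_us > 0) {
		timeout_us -= sbe_timer_clamp(timeout_us);
		n++;
	}

	return n;
}
```

Under these assumptions a 25s timeout is served by three SBE timers of 10s, 10s and 5s.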
|
|
sbe_last_gen_stamp isn't a very clear name, so rename it to
sbe_current_timer_tb first of all. This is used to detect if
the timer should be programmed to get an earlier timeout.
One issue with it is that it is set *after* the SBE acks the
timer message, at which point the SBE could already have
started counting the timer. This means the SBE timer interrupt
could come in before that time, which is confusing and error
prone. Set the field at the point the timer is submitted to
the SBE.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
These aren't "defaults", but really minimum advertised accurate timeouts.
Rename them and make them variables to accommodate changes for P10.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
SBE timer messages are rate-limited so as not to flood the SBE. 2 timer
updates are permitted before the next timer interrupt. The problem with
this is that any subsequent sooner timers will not reprogram the
interrupt earlier so will be arbitrarily delayed.
Change this code to allow 3 updates, and have the 3rd update program
the SBE to the minimum expiry time, which gives rate-limiting without
compromising timer accuracy.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
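
One possible reading of the rate-limiting scheme in the commit above is sketched below. It is an assumption-laden illustration: `struct sbe_rate_demo`, `sbe_update_delay()` and `sbe_timer_interrupt_demo()` are hypothetical, and a return value of 0 is used here simply to mean "do not send a message".

```c
#include <stdint.h>

#define SBE_MAX_UPDATES	3	/* updates allowed between timer interrupts */

/* Hypothetical state: updates sent since the last SBE timer interrupt. */
struct sbe_rate_demo {
	unsigned int updates;
};

/*
 * Delay to program for a new, sooner target. The last permitted
 * update programs the minimum expiry, so any later targets are
 * picked up when that timer fires. Returns 0 when no message
 * should be sent.
 */
uint64_t sbe_update_delay(struct sbe_rate_demo *s, uint64_t wanted_us,
			  uint64_t min_us)
{
	if (s->updates >= SBE_MAX_UPDATES)
		return 0;

	s->updates++;
	if (s->updates == SBE_MAX_UPDATES)
		return min_us;

	return wanted_us < min_us ? min_us : wanted_us;
}

/* The SBE timer interrupt opens a new rate-limiting window. */
void sbe_timer_interrupt_demo(struct sbe_rate_demo *s)
{
	s->updates = 0;
}
```

The design trade-off is that the final update may fire earlier than strictly needed, which costs one extra interrupt but avoids delaying any timer that was queued after the limit was reached.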
|
|
Disabling the SBE timer entirely is counter-productive: the SBE
interrupt can be delayed for a number of reasons including booting
or OS bugs, and there is no other timer to replace it. If the SBE
timer is detected to be lagging, increase polling rate until it
fires but keep it running.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
When the SBE interrupt fires, clear the previous sbe_timer_target
and has_new_target variables, because the timer code will send us
an updated timer expiry after running check_timers().
This allows, for example, an SBE timer that has fired too early to be
rescheduled rather than left to be picked up by polling. The SBE timer
can fire early if the timer exceeds its maximum timeout, or if the SBE
timing is a little off.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Have the core timer code always call into the SBE timers with the
soonest time, so the SBE code can be more careful with maintaining the
hardware timer.
This fixes a bug where the SBE timer is not being set immediately on
schedule_timer. With a subsequent change to SBE code, it allows an SBE
timer that fires too early to cause a re-schedule of the SBE timer.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
SBE message acks should always apply to the first message in the list.
If the message list is empty, this would be a bug, so print an error
message in that case.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Add an SBE health check when initialising the SBEs, which sends a timer
message and checks for the ack and timer expiry responses. This is
better than eventually finding that a timer is not firing and shutting
down the SBE timer. It also tests SBEs on all chips in the system, not
just the primary.
This bypasses the queueing code to make things simpler, which is
okay because the SBEs are not up yet so no other messages are being
sent to the SBE.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
The sequence number is a low level SBE hardware detail, so it can
be assigned later when the message is being sent to the SBE. This
allows SBE messages to be sent without queueing in special cases.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
There appears to be no device-tree test for the P9 SBE presence like
there is for P8. The P9 device tree test looks for the "primary"
property, but this doesn't really test SBE presence because all chips
have an SBE. It just happens to work because mambo must not add that
property.
So add a platform quirk, and mark mambo and awan as not having SBE.
This is needed for a later change that runs a health check on every
SBE in the system.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[arbab: Add #include <chip.h>]
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|