Age | Commit message (Collapse) | Author | Files | Lines |
|
Changing the HID attn enable bit on POWER9 and POWER10 requires the
icache to be flushed *after* ATTN is changed. It is not clear that it
may be done at the same time, so move it to after the attn bit change.
Flushing the icache with HID requires a 0->1 edge and the bit does not
reset back to 0, so first write 1 then 0 ready for the next flush.
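The ordering above can be illustrated with a small simulation. This is only a sketch: the bit positions, the fake_hid structure and the helper names are hypothetical and are not the real HID layout or the skiboot code. The simulated register counts one flush per 0->1 edge of the flush bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define HID_ATTN_ENABLE   (1ull << 0)
#define HID_FLUSH_ICACHE  (1ull << 1)

/* Simulated HID register: counts a flush on each 0->1 edge of the flush bit. */
struct fake_hid {
	uint64_t val;
	unsigned int flushes;
};

static void hid_write(struct fake_hid *hid, uint64_t newval)
{
	bool was_set = (hid->val & HID_FLUSH_ICACHE) != 0;
	bool now_set = (newval & HID_FLUSH_ICACHE) != 0;

	if (!was_set && now_set)
		hid->flushes++;
	hid->val = newval;
}

/* Change the attn bit first, then pulse the flush bit to 1 and back to 0. */
static void set_attn(struct fake_hid *hid, bool enable)
{
	uint64_t v = hid->val;

	/* The flush bit must already be 0, or no edge would be generated. */
	assert(!(v & HID_FLUSH_ICACHE));

	if (enable)
		v |= HID_ATTN_ENABLE;
	else
		v &= ~HID_ATTN_ENABLE;

	hid_write(hid, v);                      /* 1. attn change */
	hid_write(hid, v | HID_FLUSH_ICACHE);   /* 2. 0->1 edge triggers the flush */
	hid_write(hid, v & ~HID_FLUSH_ICACHE);  /* 3. back to 0, ready for the next flush */
}
```

Calling set_attn() twice in this model produces two flushes, because the bit is returned to 0 after each pulse.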
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Some firmware configurations boot in LPAR-per-core mode, which is not
compatible with KVM on POWER9 and later machines.
Detect which LPAR mode the boot core is in (all others will be set
the same way), and if booted in LPAR-per-core mode then print a warning
and add a device-tree entry that the OS can test for.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Similar to commit c043065cf923 ("flash: Make size 64 bit safe"),
make the encoding of the PNOR partition, which is the partition that maps
the full disk, 64-bit safe.
Without this, a mambo disk larger than 4G fails to mount with the error below:
[ 2.075170] EXT4-fs (mtdblock0): bad geometry: block count 7864320 exceeds size of device (524288 blocks)
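The numbers in the error are consistent with a size being truncated to 32 bits. The sketch below assumes a 4096-byte filesystem block size, which the log does not state, and uses hypothetical helper names. Under that assumption, 7864320 blocks is 30 GiB, and the low 32 bits of that byte count are exactly 524288 blocks.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed filesystem block size; the log above does not state it. */
#define BLOCK_SIZE 4096ull

/* Size as seen after passing through a 32-bit field (high bits lost). */
static uint64_t size_via_u32(uint64_t bytes)
{
	return (uint32_t)bytes;
}

/* Size preserved when the field is 64 bits wide. */
static uint64_t size_via_u64(uint64_t bytes)
{
	return bytes;
}
```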
Fixes: 27fcf2fa8350 ("Expose PNOR Flash partitions to host MTD driver via devicetree")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
|
|
Resolves the -Warray-bounds warning during compilation:
/build/libc/include/string.h:34:16: warning: '__builtin_memset' offset [0, 2097151] is out of the bounds [0, 0] [-Warray-bounds]
34 | #define memset __builtin_memset
hw/fsp/fsp.c:1855:9: note: in expansion of macro 'memset'
1855 | memset(fsp_tce_table, 0, PSI_TCE_TABLE_SIZE);
Use a volatile pointer to avoid the optimization introduced with gcc-11 on
constant address assignment to a pointer.
Signed-off-by: Abhishek Singh Tomar <abhishek@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
Convert VAS and NX to use the hwprobe facility for init.
Reviewed-by: Dan Horák <dan@danny.cz>
[npiggin: remove imc_init because it moved later in boot (fbcbd4e47c)]
Signed-off-by: Stewart Smith <stewart@flamingspork.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
POWER8 support is large and significantly different than P9/10 code.
This change prepares to make P8 support configurable.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[ clg: Removed commented headers in slw.c ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
This saves about 10kB from skiboot.lid.xz
Reviewed-by: Dan Horák <dan@danny.cz>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
Rather than have code call processor-specific SBE routines depending
on version, hide those details in SBE APIs.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[ clg: Fixed run-timer test ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
We can use a base CPU of POWER9 if we don't have P8.
We can also hide PHB3 code behind this,
and shave 12kb off skiboot.lid.xz
Reviewed-by: Dan Horák <dan@danny.cz>
[npiggin: add cpp define, fail gracefully on P8]
Signed-off-by: Stewart Smith <stewart@flamingspork.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
Reviewed-by: Dan Horák <dan@danny.cz>
[npiggin: split out from initial hwprobe patch]
Signed-off-by: Stewart Smith <stewart@flamingspork.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
hwprobe is a little system to have different hardware probing modules
run in the dependency order they choose rather than hard coding
that order in core/init.c.
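The idea can be sketched as a table of probe modules that each name their dependencies, with a runner that resolves the order. This is not the hwprobe implementation: the structure, field names and the example "nx depends on vas" table are assumptions made for illustration, and the sketch does no cycle detection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPS 4

/* Hypothetical probe descriptor; field names are illustrative. */
struct probe {
	const char *name;
	const char *deps[MAX_DEPS];  /* NULL-terminated list of names */
	void (*fn)(void);
	bool done;
};

static struct probe *find_probe(struct probe *tbl, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (!strcmp(tbl[i].name, name))
			return &tbl[i];
	return NULL;
}

/* Run one probe after all of its dependencies (no cycle detection). */
static void run_probe(struct probe *tbl, size_t n, struct probe *p)
{
	if (p->done)
		return;
	for (size_t i = 0; i < MAX_DEPS && p->deps[i]; i++) {
		struct probe *dep = find_probe(tbl, n, p->deps[i]);

		assert(dep);  /* a missing dependency is a table bug */
		run_probe(tbl, n, dep);
	}
	p->fn();
	p->done = true;
}

static void run_all_probes(struct probe *tbl, size_t n)
{
	for (size_t i = 0; i < n; i++)
		run_probe(tbl, n, &tbl[i]);
}

/* Example table: "nx" is listed first but depends on "vas". */
static char order[32];
static void probe_vas(void) { strcat(order, "vas "); }
static void probe_nx(void)  { strcat(order, "nx "); }

static struct probe probes[] = {
	{ .name = "nx",  .deps = { "vas" }, .fn = probe_nx },
	{ .name = "vas", .fn = probe_vas },
};
```

With this table, run_all_probes(probes, 2) runs the vas probe before the nx probe even though nx appears first, so the order no longer needs to be hard coded by the caller.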
Reviewed-by: Dan Horák <dan@danny.cz>
Signed-off-by: Stewart Smith <stewart@flamingspork.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
POWER9/10 are missing TLB flushing after fast reboot. Add it back to
cpu_fast_reboot_complete(), which is where fast-reboot code thinks it
should be.
Suggested-by: Cédric Le Goater <clg@fr.ibm.com>
Fixes: 53ef0db6e2 ("asm/head.S: set POWER9 radix HID bit at entry")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
The sleep/wake synchronisation involves the waker setting a wake
condition then testing if the target needs to be woken, vs setting
a wake-required flag then testing the wake condition. The low level
sleep state call comes after that.
This patch moves the synchronisation out from the low level sleep
functions and consolidates both copies into one place.
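The ordering described above can be sketched with C11 atomics. The structure and function names here are hypothetical, not the skiboot ones, and the default sequentially consistent atomics stand in for whatever barriers the real code uses. Each side stores its own flag before loading the other side's, so at least one of them always notices the other.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-CPU state; names are illustrative. */
struct cpu {
	atomic_bool has_job;   /* wake condition, set by the waker */
	atomic_bool in_sleep;  /* wake-required flag, set by the sleeper */
};

/* Sleeper: publish the flag, then test the condition. True means sleep. */
static bool sleeper_may_sleep(struct cpu *c)
{
	atomic_store(&c->in_sleep, true);
	if (atomic_load(&c->has_job)) {
		/* Work arrived first: cancel the sleep, no wakeup needed. */
		atomic_store(&c->in_sleep, false);
		return false;
	}
	return true;
}

/* Waker: publish the condition, then test the flag. True means send an IPI. */
static bool waker_needs_ipi(struct cpu *c)
{
	atomic_store(&c->has_job, true);
	return atomic_load(&c->in_sleep);
}
```

The low level sleep call would only be made when sleeper_may_sleep() returns true, which is the point of keeping the synchronisation in one place above it.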
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
Rework the CPU idle state code:
* in_idle is true for any kind of idle including spinning. This is not
used anywhere except for state assertions for now.
* in_sleep is true for idle that requires an IPI to wake up.
* in_job_sleep is true for in_sleep idle which is also cpu_wake_on_job.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
There is no need to send the IPI while holding the job_lock. If the
target does wake after the job is queued and before we send the IPI,
it will check for new jobs anyway.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
Pull the IPI sending code into its own function where it is used in
two places.
cpu_wake() already checks in_idle, so its caller does not need to
check pm_enabled.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
POWER8 does not have to loop sending IPIs until the destination wakes
up. cpu_wake() only sends IPI so that should be enough here too.
This will help the next patch make a common IPI sending function.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
Idle reconfiguration is difficult to follow and verify as correct
because it can occur while CPUs are in low-level idle routines. For
example pm_enabled can change while CPUs are idle. If nothing else, this
can result in "cpu_idle_p9 called pm disabled" messages.
This changes the idle reconfiguration to always kick all other CPUs out
of idle code first whenever idle settings (pm_enabled, IPIs, sreset,
etc.) are to be changed.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
Upstream ccan uses (list, existing entry, new entry) parameter ordering
rather than (list, new entry, existing entry) ordering.
Switch these to make syncing with upstream simpler.
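For illustration, the (list, existing entry, new entry) ordering looks like this on a minimal circular doubly-linked list. The demo_* names and the simplified structures are stand-ins written for this sketch; they are not the ccan implementation.

```c
#include <assert.h>

struct list_node { struct list_node *next, *prev; };
struct list_head { struct list_node n; };

static void demo_list_init(struct list_head *h)
{
	h->n.next = h->n.prev = &h->n;
}

/*
 * Insert 'new_node' after 'existing', using the
 * (list, existing entry, new entry) parameter ordering.
 * The list argument is unused in this simplified sketch.
 */
static void demo_add_after(struct list_head *h, struct list_node *existing,
			   struct list_node *new_node)
{
	(void)h;
	new_node->prev = existing;
	new_node->next = existing->next;
	existing->next->prev = new_node;
	existing->next = new_node;
}

/* Same ordering for insertion before an existing entry. */
static void demo_add_before(struct list_head *h, struct list_node *existing,
			    struct list_node *new_node)
{
	demo_add_after(h, existing->prev, new_node);
}
```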
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
|
|
This patch adds a new function to dump PAU registers when an HMI has been
raised and an OpenCAPI link has been hit by an error.
For each register, the scom address and the register value are printed.
The hmi.c has been redesigned in order to support the new PHB/PCIEX
type (PAU OpenCapi). Now, the *npu* functions support NPU and PAU units of
P8, P9 and P10 chips.
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Implement the necessary operations for the OpenCAPI PHB type and
populate the associated device-tree properties.
The OpenCapi PCI config Addr/Data registers are reachable through
the Generation-ID Registers MMIO BARS.
The Config Address and Data registers are located at the following offsets
from the AFU Config BAR plus 320 KB.
• Config Address for Brick 0 – Offset 0
• Config Data for Brick 0 – Offsets:
◦ 128 – 4-byte config register
• Config Address for Brick 1 – Offset 256
• Config Data for Brick 1 – Offsets:
◦ 384 – 4-byte config register
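The offsets listed above can be written as a small address calculation. The helper and macro names are hypothetical and not taken from the driver; the sketch assumes that "320 KB" means 320 * 1024 bytes and only restates the layout given in the list.

```c
#include <assert.h>
#include <stdint.h>

#define CFG_REGS_OFFSET   (320u * 1024u)  /* 320 KB past the AFU Config BAR */
#define CFG_BRICK_STRIDE  256u            /* brick 1 registers start at offset 256 */
#define CFG_DATA_OFFSET   128u            /* data register relative to the address register */

/* Hypothetical helper: Config Address register for brick 0 or 1. */
static uint64_t cfg_addr_reg(uint64_t afu_cfg_bar, unsigned int brick)
{
	assert(brick < 2);
	return afu_cfg_bar + CFG_REGS_OFFSET + brick * CFG_BRICK_STRIDE;
}

/* Hypothetical helper: 4-byte Config Data register for brick 0 or 1. */
static uint64_t cfg_data_reg(uint64_t afu_cfg_bar, unsigned int brick)
{
	return cfg_addr_reg(afu_cfg_bar, brick) + CFG_DATA_OFFSET;
}
```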
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
OpenCapi for P10 is included in the P10 chip. This requires OCAPI capable
PHYs, Datalink Layer Logic and Transaction Layer Logic to be included.
The PHYs are the physical connection to the OCAPI interconnect.
The Datalink Layer provides link training.
The Transaction Layer executes the cache coherent and data movement
commands on the P10 chip.
The PAU provides the Transaction Layer functionality for the OCAPI
link(s) on the P10 chip.
The P10 PAU supports two OCAPI links. Six PAU accelerator units are
instantiated on the P10 chip for a total of twelve OCAPI links.
This patch adds the PAU OpenCAPI structure for supporting OpenCapi5.
The hw/pau.c file contains the main PAU management functions.
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
This patch enables Skiboot to initialize and Linux to boot to user space
on the AWAN core and chip models.
We need the distinction between core and chip models because the core
models do not have an XSCOM unit, CHIPTOD, nor RNG. The chip
model does have them and they work.
So, add a device_type property to the awan node to distinguish core from
chip. Sample DTS are provided for the core and chip models in
external/awan.
Just like Mambo, we need to return in slw_init before trying to
initialize SLW. Without an XSCOM unit in the device tree for the core
model, the SLW code path eventually fails an assert due to lack of
chips.
This commit defines a QUIRK_AWAN where previously Mambo used
QUIRK_MAMBO_CALLOUTS so now Mambo and AWAN core both work.
Also, fix up chip quirks so the core model and chip model boot and
initialize the appropriate units.
Disable sreset and power management in a couple spots because the chip
model does not support stop with EC=1 and enter_p9_pm_state spins in the
branch-to-self after stop.
Provide an external/awan/README.md with a high-level view of booting in
the environment.
Signed-off-by: Ryan Grimm <grimm@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
This significantly simplifies the SLW code.
HILE is now always supported.
Reviewed-by: Stewart Smith <stewart@flamingspork.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
If cpu_relax() is called when not at medium SMT priority, it will lose
the prior priority and return at medium. Add a debug check to catch
this, which would have flagged the previous bug.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Calling cpu_relax resets the SMT priority to medium, causing the idle
loop not to run with lowest priority. Just use barrier() instead; this
saves about 3 seconds on a SMT4 systemsim (mambo) boot.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
On P10, get_ics_phandle() calls xive2_get_phandle() directly. This
results in a NULL dereference on mambo when xive2 is not set up.
This was caught with the virtual memory boot patch on P10 mambo.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
npu3 was only used on the Swift platform to add support for
GPUs (nvlink). The Swift platform has never left the lab and support
for GPUs on it is pretty much dead. So let's remove it.
The patch removes all related code. Device tree entries are no
longer created and in the very unlikely case that someone is still
trying to boot it, the linux nvlink discovery code should be quiet.
Tested by booting on Swift with no GPU.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Reza Arbab <arbab@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
We only support the XIVE interface.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Update libpore with P10 STOP API. Add minor changes to make
P9 stop-api and P10 stop-api to co-exist in OPAL.
These calls are required for STOP11 support on P10.
STOP0,2,3 on P10 do not lose full core state or scoms.
stop-api based restore of SPRs or xscoms is required only
for STOP11 on P10.
STOP11 on P10 will be a limited lab test/stress feature
and not a product feature. (Same case as P9)
Co-authored-by: Pratik Rajesh Sampat <psampat@linux.ibm.com>
Signed-off-by: Pratik Rajesh Sampat <psampat@linux.ibm.com>
Co-authored-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Co-authored-by: Ryan Grimm <grimm@linux.ibm.com>
Signed-off-by: Ryan Grimm <grimm@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
The PHB5 logic on P10 is pretty close to the P9's version. So
we keep our base phb4 implementation and just add the few changes
within if statements.
Signed-off-by: Jordan Niethe <jpn@ozlabs.au.ibm.com>
[clg: misc cleanups and fixes ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[Fixed compilation issue - Vasant]
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
[Nick: Unify PHB4/PHB5 drivers ]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[Mikey: set default lane eq settings for phb5]
Signed-off-by: Michael Neuling <mikey@neuling.org>
[FB: squash commits + small cleanup ]
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
The XIVE2 interrupt controller of the POWER10 processor follows the
same logic as on POWER9 but the HW interface has been largely
revised. It has a new register interface, different BARs, extra
VSDs, new layout for the XIVE structures, and a set of new features
which are described below.
The OPAL XIVE2 driver code activating this controller was duplicated
from P9 for clarity as the registers and structures have changed
considerably. The same OPAL interface is implemented for OS
compatibility and it should not impact existing Linux kernels, KVM
included. Guest OS is not impacted either.
Support for new features will be implemented in time and will require
new support from the OS.
* XIVE2 BARS
The interrupt controller BARs have a different layout outlined below.
Each sub-engine now has its own range and the indirect TIMA access was
replaced with a set of pages, one per CPU, under the IC BAR:
- IC BAR (Interrupt Controller)
. 4 pages, one per sub-engine
. 128 indirect TIMA pages
- TM BAR (Thread Interrupt Management Area)
. 4 pages
- ESB BAR (ESB pages for IPIs)
. up to 1TB
- END BAR (ESB pages for ENDs)
. up to 2TB
- NVC BAR (Notification Virtual Crowd)
. up to 128
- NVPG BAR (Notification Virtual Process and Group)
. up to 1TB
- Direct mapped Thread Context Area (reads & writes)
OPAL does not use the grouping and crowd capability.
* Virtual Structure Tables
XIVE2 adds new tables types and also changes the field layout of the END
and NVP Virtualization Structure Descriptors.
- EAS
- END new layout
- NVT was split into:
. NVP (Processor), 32B
. NVG (Group), 32B
. NVC (Crowd == P9 block group) 32B
- IC for remote configuration
- SYNC for cache injection
- ERQ for event input queue
The setup is slightly different on XIVE2 because the indexing has changed
for some of the tables, block ID or the chip topology ID can be used.
* XIVE2 features
SCOM and MMIO registers have a new layout and XIVE2 adds a new global
capability and configuration registers.
The lowlevel hardware offers a set of new features among which :
- cache injection mechanism
- 4 cache watch engines
- a configurable number of priorities: 1-8
- StoreEOI with load-after-store ordering is activated by default
- new sync/kill operations for cache operations
Other features will have some impact on the Hypervisor and guest OS
when activated, but this is not required for initial support of the
controller.
- Gen2 TIMA layout
- A P9-compat mode, or Gen1, TIMA toggle bit for SW compatibility
- Automatic Context save & restore
- increase to 24bit for VP number
- New escalation schemes: ESB, Adaptive, CPPR
POWER10 adds support for User interrupts. When configured, the XIVE2
controller can notify directly user processes using the Event Based
Branch exception line of the thread. If not running, the OS is
notified through an escalation event. New OPAL and PAPR interfaces
will be required and OS support needs to be studied.
* XIVE2 P9-compat mode, or Gen1
The thread interrupt management area (TIMA) is a set of pages mapped
in the Hypervisor and in the guest OS address space giving access to
the interrupt thread context registers for interrupt management, ACK,
EOI, CPPR, etc.
XIVE2 changes slightly the TIMA layout with extra bits for the new
features, larger CAM lines and the controller provides configuration
switches for backward compatibility. This is called the XIVE2
P9-compat mode, or Gen1 TIMA. It impacts the layout of the TIMA and
the availability of the internal features associated with it,
Automatic Save & Restore for instance. Using a P9 layout also means
setting the controller in such a mode at init time.
The XIVE2 driver in OPAL chooses to initialize the XIVE2 controller
with a XIVE2/P10 TIMA directly because the layouts are compatible with
the Linux PowerNV and the guest OSes expectations.
For KVM support, the OPAL calls abstract the HW interface and no
assumption is made on the OS CAM line width.
* Activating new XIVE2 features
Everything related to OPAL internals such as the use of the new cache
sync mechanism can be implemented in time without impact on the OS.
Other features will require new device tree properties exposed to the
OS and extra support for the OS. Automatic Context save & restore is
one of the first features which should be looked at.
* XICS-over-XIVE driver (P8 compatibility)
The P8 emulation mode is an OPAL compat interface used for Linux
kernels which did not have XIVE native support. This was useful for
POWER9 bringup but it is much less so now. As it was adding a lot of
complexity and reducing the interrupt controller resources, this mode
is not available in the XIVE2 driver for POWER10.
It will still be possible to add this compat mode in the future if
required. The OS will have to reset the driver at boot time, like on
POWER9.
* Impact on other drivers (PSI, PHB, NPU)
Interrupts are allocated in a very similar way. Each controller might
have different ESB characteristics, StoreEOI support, 64K pages for
PSI. All is in place to support these changes already.
PHB5 will have support for "address-based trigger mode", probably in
the DD2.0 time frame when verification is completed. When activated,
the XIVE IC ESB pages will be used instead of the PHB ESB pages for a
lower interrupt latency.
LSI will still use old-fashioned triggers without StoreEOI.
* Yet to be addressed :
- OPAL P10 interface incomplete (stop states)
- Clarify the PHB5 strategy regarding the use of the XIVE IC ESB
pages instead of the PHB ones when address-based trigger mode is
supported.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
HDAT provides Topology ID table and the primary topology location on
P10. This primary location points to primary topology entry in ID table
which contains the primary topology index and this index is used to
define the paste base address per chip.
This patch reads Topology ID table and the primary topology location
from hdata and retrieves the primary topology index in the ID table.
Make this primary topology index value available with
ibm,primary-topology-index property per chip. VAS reads this property
to setup paste base address for each chip.
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
BMC is still defined as ast2500 but it should change to ast2600 when
available.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
[Folded Ravi's DAWR patch - Vasant]
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
This works around a core recovery issue in P10. The workaround involves
the CME polling for a core recovery and performing the recovery
procedure itself.
For this to happen, the host leaves core recovery off (HID[5]) and
then masks the PC system checkstop. This patch does this.
Firmware starts skiboot with recovery already off, so we just leave it
off for longer and then mask the PC system checkstop. This makes the
window longer where a core recovery can cause an xstop but this
window is still small and can still only happen on boot.
Signed-off-by: Michael Neuling <mikey@neuling.org>
[Added mambo check - Vasant]
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Co-authored-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Co-authored-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Co-authored-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Co-authored-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Co-authored-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Co-authored-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
De-assert special wakeup bits for the case when SPWU bit is set, however
the core is gated to maintain a coherent state for special wakeup.
Signed-off-by: Pratik R. Sampat <psampat@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Starting from p10 hostboot will no longer clear all the system memory except
its own space. OPAL uses the memory at SKIBOOT_BASE + SKIBOOT_SIZE for cpu
stack with pir as index. With hostboot no longer clearing memory this region
may hold junk contents. Currently OPAL initializes cpu stack memory only for
cpu pir that is found on the device-tree. For the rest, the cpu thread
contents are uninitialized. This sometimes causes for_each_cpu* macros to
return cpu thread for pir/cpu which isn't present on the system. The
for_each_cpu* macros iterate over cpu stacks using pir as index and return
cpu thread pointer if state != cpu_state_no_cpu. For cpus that are not found
on device-tree the state may hold junk value leading OPAL to access invalid
cpu thread area. This further leads to accessing pointers with junk values
causing machine check (MCE) during OPAL init code. Fix this by initializing
all the cpu thread areas up to cpu_max_pir.
[ 182.049714372,3] ***********************************************
[ 182.049878580,3] Fatal MCE at 0000000030039738 .init_trace_buffers+0x21c MSR 9000000000201002
[ 182.049943811,3] Cause: load real address error
[ 182.049968681,3] Effective address: 0x480113a4791c4a50
[ 182.050000736,3] CFAR : 00000000300395b8 MSR : 9000000000201002
[ 182.050035376,3] SRR0 : 0000000030039738 SRR1 : 9000000000201002
[ 182.050072878,3] HSRR0: 0000000030020024 HSRR1: 9000000000001000
[ 182.050117303,3] DSISR: 00000040 DAR : 480113a4791c4a50
[ 182.050149054,3] LR : 0000000030039744 CTR : 0000000000000000
[ 182.050182991,3] CR : 42000224 XER : 00000000
[ 182.050217262,3] GPR00: 000000003003962c GPR16: 0000000032d50000
[ 182.050255746,3] GPR01: 0000000032d53a50 GPR17: 0000000030003198
[ 182.050288081,3] GPR02: 000000003014cb00 GPR18: 0000000000000000
[ 182.050331474,3] GPR03: 0000000031c50000 GPR19: 0000000000000000
[ 182.050371934,3] GPR04: 0000000000000000 GPR20: 0000000000000000
[ 182.050416212,3] GPR05: ffffffffffffffff GPR21: 0000000000000001
[ 182.050454130,3] GPR06: 0000000000000005 GPR22: 00000000300f74eb
[ 182.050488053,3] GPR07: 0000000000000028 GPR23: 00000000000fffd8
[ 182.050522774,3] GPR08: 000000000000067f GPR24: 00000000000fff40
[ 182.050566878,3] GPR09: 480113a4791c4a18 GPR25: 0000000000000070
[ 182.050601524,3] GPR10: 00000000078b0353 GPR26: 00000000300f7527
[ 182.050640345,3] GPR11: 0000000000000000 GPR27: 00000000300f7516
[ 182.050680816,3] GPR12: 0000000042000222 GPR28: 000000003acd0000
[ 182.050724099,3] GPR13: 000000000025a908 GPR29: 000000003acd0000
[ 182.050759728,3] GPR14: 0000000000000000 GPR30: 0000000000000000
[ 182.050790430,3] GPR15: 0000000000000000 GPR31: 00000000301f0038
CPU 0228 Backtrace:
S: 0000000032d53d60 R: 000000003003962c .init_trace_buffers+0x110
S: 0000000032d53e30 R: 0000000030022f84 .main_cpu_entry+0x550
S: 0000000032d53f00 R: 00000000300031f8 not_fused+0x11c
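The failure mode and the fix can be modelled in a few lines. This is a simplified sketch, not the skiboot code: the cpu_thread layout, the MAX_PIR value and the helper names are assumptions, and count_cpus() merely stands in for an iterator such as for_each_cpu.

```c
#include <assert.h>
#include <string.h>

enum cpu_state { cpu_state_no_cpu = 0, cpu_state_present };

struct cpu_thread {
	unsigned int pir;
	enum cpu_state state;
};

#define MAX_PIR 7  /* hypothetical stand-in for cpu_max_pir */
static struct cpu_thread cpu_stacks[MAX_PIR + 1];

/* Mark every slot absent first, so no slot keeps a stale state. */
static void mark_all_cpus_absent(void)
{
	for (unsigned int pir = 0; pir <= MAX_PIR; pir++) {
		memset(&cpu_stacks[pir], 0, sizeof(cpu_stacks[pir]));
		cpu_stacks[pir].pir = pir;
		cpu_stacks[pir].state = cpu_state_no_cpu;
	}
}

/* Count the slots an iterator keyed on state != cpu_state_no_cpu would visit. */
static unsigned int count_cpus(void)
{
	unsigned int n = 0;

	for (unsigned int pir = 0; pir <= MAX_PIR; pir++)
		if (cpu_stacks[pir].state != cpu_state_no_cpu)
			n++;
	return n;
}
```

If the array is filled with junk before initialization, every slot looks like a CPU; after mark_all_cpus_absent(), only the slots explicitly marked present are visited.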
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[Folded Nick's patch that added mark_all_secondary_cpus_absent() - Vasant]
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Add support for tracing I2C transactions performed by skiboot. This covers
both internally initiated I2C ops and those requested by the kernel
via the OPAL API.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Previously we put all the trace buffer exports in the exports/ node.
However, there's one trace buffer for each core so I moved them into a
subdirectory since they were crowding up the place. Most kernels don't
support recursively exporting subnodes though, so add a hack to restore the
old behaviour for now.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
[Fixed run-trace test case - Vasant]
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Async machine check errors due to a bad real address from a store or a
foreign link timeout come with the load/store bit (PPC bit 42)
set in SRR1 but the cause is set in SRR1 not DSISR, unlike other
errors that have the load/store bit set.
This behaviour was omitted from the POWER9 User Manual but it is
confirmed to be the expected one. Update the machine check decoder
to match.
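For readers unfamiliar with IBM bit numbering, "PPC bit 42" counts from the most significant bit of the 64-bit register. The sketch below only shows how such a bit is tested; the macro and function names are written for this example and it does not reproduce the decoder's cause tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of a 64-bit value. */
#define PPC_BIT(n)  (0x8000000000000000ull >> (n))

/* Load/store indication in SRR1, as named in the text above. */
#define SRR1_LOAD_STORE  PPC_BIT(42)

/* True when SRR1 carries the load/store bit. */
static bool mce_is_load_store(uint64_t srr1)
{
	return (srr1 & SRR1_LOAD_STORE) != 0;
}
```

PPC bit 42 corresponds to the mask 0x200000 in conventional little-endian bit numbering (bit 21).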
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
cpu_pm_idle sets pm_enabled = false and expects all cpus
to exit idle. This is needed to re-enter with new settings.
Right after cpu_bringup() we call copy_sreset_vector() and then
cpu_set_sreset_enable(true). At this time some cpus are still
yet to enter idle and hence miss the doorbell to wakeup.
This leads to cpu_pm_idle waiting forever. This pattern happens
on some systems in fused-core mode.
The fact that the pm_enabled flag is changing right in the middle of
idle entry is seen from the "cpu_idle_p9 called with pm disabled" traces.
One method to fix this race is to retry the door-bell after a timeout.
This patch implements a small time out (few seconds) and then issues
the doorbell once again to kick the cpu that entered idle late after
missing the pm_enabled = false flag.
This checking loop runs in smt_lowest() and hence the timeout number
maps to a couple of seconds, which is sufficient to let the cpus settle in
idle and make them see the doorbell and exit.
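The retry scheme can be sketched against a simulated CPU that misses a number of doorbells. Everything here is hypothetical (the fake_cpu model, the helper names, the poll-count timeout); it only illustrates the "wait, time out, ring again" loop, not the actual skiboot implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated target CPU: it misses the first 'misses' doorbells. */
struct fake_cpu {
	bool in_idle;
	unsigned int misses;
	unsigned int doorbells;
};

static void send_doorbell(struct fake_cpu *c)
{
	c->doorbells++;
	if (c->misses)
		c->misses--;         /* doorbell arrived before idle entry: lost */
	else
		c->in_idle = false;  /* doorbell seen, CPU leaves idle */
}

/* Wait for the CPU to leave idle; on each timeout, ring the doorbell again. */
static unsigned int wait_exit_idle(struct fake_cpu *c, unsigned int timeout_polls)
{
	unsigned int retries = 0;

	send_doorbell(c);
	while (c->in_idle) {
		unsigned int polls = 0;

		while (c->in_idle && polls < timeout_polls)
			polls++;     /* real code would spin at low SMT priority */
		if (c->in_idle) {
			retries++;
			send_doorbell(c);
		}
	}
	return retries;
}
```

Without the retry, a CPU that missed the first doorbell would keep the waiter spinning forever, which is the hang described above.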
Example boot log:
[ 288.309322810,7] INIT: CPU PIR 0x000d called in
[ 288.309320768,7] INIT: CPU PIR 0x000b called in
[ 288.314603802,7] INIT: CPU PIR 0x0020 called in
[ 288.321303468,5] CPU: All 88 processors called in...
[ 288.315056796,6] cpu_idle_p9 called on cpu 0x024e with pm disabled
[ 288.321308091,6] cpu_idle_p9 called on cpu 0x0264 with pm disabled
[ 288.314424259,6] cpu_idle_p9 called on cpu 0x025b with pm disabled
[ 288.324928307,6] cpu_idle_p9 called on cpu 0x0065 with pm disabled
[ 305.207316004,6] cpu_pm_disable TIMEOUT on cpu 0x0261 to exit idle
[ 322.093298501,6] cpu_pm_disable TIMEOUT on cpu 0x0263 to exit idle
[ 338.491281028,6] cpu_pm_disable TIMEOUT on cpu 0x0265 to exit idle
[ 355.377263492,6] cpu_pm_disable TIMEOUT on cpu 0x0267 to exit idle
[ 372.263245960,6] cpu_pm_disable TIMEOUT on cpu 0x0269 to exit idle
[ 389.149228389,6] cpu_pm_disable TIMEOUT on cpu 0x026b to exit idle
[ 406.035210852,6] cpu_pm_disable TIMEOUT on cpu 0x026d to exit idle
[ 422.433193381,6] cpu_pm_disable TIMEOUT on cpu 0x026f to exit idle
[ 422.433277720,6] CHIPTOD: Calculated MCBS is 0x25 (Cfreq=2000000000 Tfreq=32000000)
Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
[Reworded commit message - Vasant]
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
imc_init() checks for the 24x7 microcode state at boot to
check whether the microcode is in proper state (running or paused).
But in a larger system, loading of 24x7 microcode by OCC gets delayed.
Because of this, imc_init() removes imc devices from the device tree.
Moving imc_init() function towards end of the main_cpu_entry()
works around this.
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Sample output from Cédric:
-------------------------
[ 88.294111649,7] cpu_idle_p9 called on cpu 0x063c with pm disabled
[ 88.289365222,7] cpu_idle_p9 called on cpu 0x025f with pm disabled
[ 88.289900684,7] cpu_idle_p9 called on cpu 0x045f with pm disabled
[ 88.302621295,7] CHIPTOD: Base TFMR=0x2512000000000000
[ 88.289899701,7] cpu_idle_p9 called on cpu 0x0456 with pm disabled
LOCK ERROR: Deadlock detected @0x30402740 (state: 0x0000000400000001)
[ 88.332264757,3] ***********************************************
[ 88.332300051,3] < assert failed at core/lock.c:32 >
[ 88.332328282,3] .
[ 88.332347335,3] .
[ 88.332364894,3] .
[ 88.332377963,3] OO__)
[ 88.332395458,3] <"__/
[ 88.332412628,3] ^ ^
[ 88.332450246,3] Fatal TRAP at 00000000300286a0 .lock_error+0x64 MSR 9000000000021002
[ 88.332501812,3] CFAR : 00000000300414f4 MSR : 9000000000021002
[ 88.332536539,3] SRR0 : 00000000300286a0 SRR1 : 9000000000021002
[ 88.332574644,3] HSRR0: 0000000030020024 HSRR1: 9000000000001000
[ 88.332610635,3] DSISR: 00000000 DAR : 0000000000000000
[ 88.332650628,3] LR : 0000000030028690 CTR : 00000000300f9fa0
[ 88.332684451,3] CR : 20002000 XER : 00000000
[ 88.332712767,3] GPR00: 0000000030028690 GPR16: 0000000032c98000
[ 88.332748046,3] GPR01: 0000000032c9b0a0 GPR17: 0000000000000000
[ 88.332784060,3] GPR02: 0000000030169d00 GPR18: 0000000000000000
[ 88.332822091,3] GPR03: 0000000032c9b310 GPR19: 0000000000000000
[ 88.332861357,3] GPR04: 0000000030041480 GPR20: 0000000000000000
[ 88.332897229,3] GPR05: 0000000000000000 GPR21: 0000000000000000
[ 88.332937051,3] GPR06: 0000000000000010 GPR22: 0000000000000000
[ 88.332968463,3] GPR07: 0000000000000000 GPR23: 0000000000000000
[ 88.333007333,3] GPR08: 000000000002cbb5 GPR24: 0000000000000000
[ 88.333041971,3] GPR09: 0000000000000000 GPR25: 0000000000000000
[ 88.333081073,3] GPR10: 0000000000000000 GPR26: 0000000000000003
[ 88.333114301,3] GPR11: 3839616263646566 GPR27: 0000000000000211
[ 88.333156040,3] GPR12: 0000000020002000 GPR28: 000000003042a134
[ 88.333189222,3] GPR13: 0000000000000000 GPR29: 0000000030402740
[ 88.333225638,3] GPR14: 0000000000000000 GPR30: 0000000000000001
[ 88.333259730,3] GPR15: 0000000000000000 GPR31: 0000000000000000
CPU 0211 Backtrace:
S: 0000000032c9b3b0 R: 0000000030028690 .lock_error+0x54
S: 0000000032c9b440 R: 0000000030028828 .add_lock_request+0xd0
S: 0000000032c9b4f0 R: 0000000030028a9c .lock_caller+0x8c
S: 0000000032c9b5a0 R: 0000000030021b30 .__mcount_stack_check+0x70
S: 0000000032c9b650 R: 00000000300fabb0 .list_check_node+0x1c
S: 0000000032c9b6f0 R: 00000000300fac98 .list_check+0x38
S: 0000000032c9b790 R: 00000000300289bc .try_lock_caller+0xac
S: 0000000032c9b830 R: 0000000030028ad8 .lock_caller+0xc8
S: 0000000032c9b8e0 R: 0000000030028d74 .lock_recursive_caller+0x54
S: 0000000032c9b980 R: 0000000030020cb8 .console_write+0x48
S: 0000000032c9ba30 R: 00000000300445a8 .vprlog+0xc8
S: 0000000032c9bc20 R: 0000000030044630 ._prlog+0x50
S: 0000000032c9bcb0 R: 0000000030029204 .cpu_idle_p9+0x74
S: 0000000032c9bd40 R: 0000000030029628 .cpu_idle_pm+0x4c
S: 0000000032c9bde0 R: 0000000030023fe0 .__secondary_cpu_entry+0xa0
S: 0000000032c9be70 R: 0000000030024034 .secondary_cpu_entry+0x40
S: 0000000032c9bf00 R: 0000000030003290 secondary_wait+0x8c
CPU 0x4:
opal_run_pollers ->
check_stacks -> takes stack_check_lock lock
prlog ->
console_write -> waits for con_lock
CPU 0x211
cpu_idle_p9 ->
prlog ->
console_write -> Takes con_lock lock
list_check_node -> tries to take stack_check_lock and hits deadlock.
I think we don't need to hold `stack_check_lock` while printing
backtraces. Instead it makes sense to hold backtrace lock (bt_lock)
and print output.
Reported-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Tested-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
Fixes:
core/opal.c:418:61: warning: Using plain integer as NULL pointer
Signed-off-by: Stewart Smith <stewart@flamingspork.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
If fast reboot fails then we return to Linux with OPAL_SUCCESS.
Current Linux code thinks that the request succeeded and enters
an infinite loop (see Linux pnv_restart() code).
This patch fixes the above issue by returning OPAL_UNSUPPORTED if fast
reboot fails.
Alternatively we can directly call full_reboot() itself. But I
think it makes sense to go back to Linux and report the failure.
And Linux falls back to normal reboot request.
Fixes: 10bbcd07 ("core/platform: Add an explicit fast-reboot type")
Cc: Oliver O'Halloran <oohall@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Dan Horák <dan@danny.cz>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
next_unguarded_primary dereferences NULL CPU -> UB -> infinite loop
Fast reboot works again after this patch.
Fixes: 98f5834253c7e ("cpu: Keep track of the "ec_primary" in big core more")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
In secure boot enabled systems, the petitboot linux kernel verifies the
OS kernel against x509 certificates that are wrapped in secure variables
controlled by OPAL. These secure variables are stored in the PNOR SECBOOT
partition, as well as the updates submitted for them using userspace
tools.
This patch adds read and write support to the PNOR SECBOOT partition in
a similar fashion to that of NVRAM, so that OPAL can handle the secure
variables.
Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com>
Signed-off-by: Eric Richter <erichte@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
|