path: root/core
Age | Commit message | Author | Files | Lines
2022-06-13 | cpu: Fix HID SPR icache flushing and attn change sequence | Nicholas Piggin | 1 | -0/+22
Changing the HID attn enable bit on POWER9 and POWER10 requires the icache to be flushed *after* ATTN is changed. It is not clear that it may be done at the same time, so move it to after the attn bit change. Flushing the icache with HID requires a 0->1 edge and the bit does not reset back to 0, so first write 1 then 0 ready for the next flush. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
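The ordering described above can be sketched against a simulated register. This is a minimal illustration only: the `sim_*` names and the bit positions are hypothetical and do not come from the skiboot source. The model assumes, as the message states, that a flush fires only on a 0->1 edge of the flush bit and that the bit does not clear itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen for illustration only. */
#define SIM_HID_ATTN_ENABLE   (1ULL << 0)
#define SIM_HID_FLUSH_ICACHE  (1ULL << 1)

/* Simulated core: stands in for the real HID SPR and icache. */
struct sim_core {
	uint64_t hid;
	unsigned int flushes;    /* number of icache flushes triggered */
	int attn_at_last_flush;  /* ATTN enable state seen by the last flush */
};

/* Simulated SPR write: a flush fires only on a 0->1 edge of the flush bit. */
static void sim_write_hid(struct sim_core *c, uint64_t val)
{
	int was_set = !!(c->hid & SIM_HID_FLUSH_ICACHE);
	int now_set = !!(val & SIM_HID_FLUSH_ICACHE);

	c->hid = val;
	if (!was_set && now_set) {
		c->flushes++;
		c->attn_at_last_flush = !!(val & SIM_HID_ATTN_ENABLE);
	}
}

/* Write 1 to create the edge, then 0 so the next flush sees an edge again. */
static void sim_flush_icache(struct sim_core *c)
{
	sim_write_hid(c, c->hid | SIM_HID_FLUSH_ICACHE);
	sim_write_hid(c, c->hid & ~SIM_HID_FLUSH_ICACHE);
}

/* Change ATTN on its own first, and only then flush the icache. */
static void sim_set_attn(struct sim_core *c, int enable)
{
	uint64_t hid = c->hid;

	if (enable)
		hid |= SIM_HID_ATTN_ENABLE;
	else
		hid &= ~SIM_HID_ATTN_ENABLE;
	sim_write_hid(c, hid);
	sim_flush_icache(c);
}
```

In this model, calling `sim_set_attn()` twice triggers two flushes, each of which observes the new ATTN setting, and the flush bit is left at 0 after every call.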
2022-06-13 | core: detect LPAR-per-core mode and report in dt | Nicholas Piggin | 1 | -0/+38
Some firmware configurations boot in LPAR-per-core mode, which is not compatible with KVM on POWER9 and later machines. Detect which LPAR mode the boot core is in (all others will be set the same way), and if booted in LPAR-per-core mode then print a warning and add a device-tree entry that the OS can test for. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
2022-04-29 | flash: Make PNOR partition encoding size 64 bit safe. | Aneesh Kumar K.V | 1 | -9/+17
Similar to commit c043065cf923 ("flash: Make size 64 bit safe"), make the size encoding of the PNOR partition, which is the partition that maps the full disk, 64 bit safe. Without this, a mambo disk larger than 4G fails to mount with the error below: [ 2.075170] EXT4-fs (mtdblock0): bad geometry: block count 7864320 exceeds size of device (524288 blocks) Fixes: 27fcf2fa8350 ("Expose PNOR Flash partitions to host MTD driver via devicetree") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
2022-02-04 | Fix array-bound compilation warnings | Abhishek Singh Tomar | 4 | -7/+14
Resolves the -Warray-bounds warning during compilation: /build/libc/include/string.h:34:16: warning: '__builtin_memset' offset [0, 2097151] is out of the bounds [0, 0] [-Warray-bounds] 34 | #define memset __builtin_memset hw/fsp/fsp.c:1855:9: note: in expansion of macro 'memset' 1855 | memset(fsp_tce_table, 0, PSI_TCE_TABLE_SIZE); Use a volatile pointer to avoid the optimization introduced with gcc-11 on constant address assignment to a pointer. Signed-off-by: Abhishek Singh Tomar <abhishek@linux.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-03 | hwprobe: convert vas_init(), nx_init() | Stewart Smith | 1 | -6/+0
Convert VAS and NX to use the hwprobe facility for init. Reviewed-by: Dan Horák <dan@danny.cz> [npiggin: remove imc_init because it moved later in boot (fbcbd4e47c)] Signed-off-by: Stewart Smith <stewart@flamingspork.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-03 | hw/slw: split P8 specific code into its own file | Nicholas Piggin | 2 | -0/+2
POWER8 support is large and significantly different than P9/10 code. This change prepares to make P8 support configurable. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [ clg: Removed commented headers in slw.c ] Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-03 | hw/slw: Move P8 bits behind CONFIG_P8 | Nicholas Piggin | 1 | -0/+2
This saves about 10kB from skiboot.lid.xz Reviewed-by: Dan Horák <dan@danny.cz> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-03 | SBE: create processor-independent timer APIs | Nicholas Piggin | 3 | -15/+9
Rather than have code call processor-specific SBE routines depending on version, hide those details in SBE APIs. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [ clg: Fixed run-timer test ] Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-03 | Add CONFIG_P8 with PHB3 behind it | Stewart Smith | 1 | -2/+9
We can use a base CPU of POWER9 if we don't have P8. We can also hide PHB3 code behind this, and shave 12kb off skiboot.lid.xz Reviewed-by: Dan Horák <dan@danny.cz> [npiggin: add cpp define, fail gracefully on P8] Signed-off-by: Stewart Smith <stewart@flamingspork.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-03 | hwprobe: convert PHB, NPU, PAU subsystems to hwprobe | Stewart Smith | 1 | -14/+1
Reviewed-by: Dan Horák <dan@danny.cz> [npiggin: split out from initial hwprobe patch] Signed-off-by: Stewart Smith <stewart@flamingspork.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-03 | Introduce hwprobe facility to avoid hard-coding probe functions | Stewart Smith | 3 | -0/+74
hwprobe is a little system to have different hardware probing modules run in the dependency order they choose rather than hard coding that order in core/init.c. Reviewed-by: Dan Horák <dan@danny.cz> Signed-off-by: Stewart Smith <stewart@flamingspork.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
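The idea of running probes in dependency order can be sketched as follows. This is not the hwprobe implementation: the `sketch_probe` structure, the fixed-size dependency array, and the module names used in the usage note are assumptions made for illustration. Each module lists the names it depends on, and a module runs only after all of its dependencies have run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical probe descriptor: a name plus the names it depends on. */
struct sketch_probe {
	const char *name;
	const char *deps[4];  /* NULL-terminated list of modules to run first */
	bool probed;
};

#define SKETCH_MAX_ORDER 16
static const char *run_order[SKETCH_MAX_ORDER];  /* records execution order */
static size_t run_count;

static struct sketch_probe *find_probe(struct sketch_probe *tbl, size_t n,
				       const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(tbl[i].name, name) == 0)
			return &tbl[i];
	return NULL;
}

static void run_probe(struct sketch_probe *tbl, size_t n,
		      struct sketch_probe *p)
{
	if (p->probed)
		return;
	/* Mark first so a dependency cycle cannot recurse forever. */
	p->probed = true;
	for (size_t i = 0; i < 4 && p->deps[i]; i++) {
		struct sketch_probe *d = find_probe(tbl, n, p->deps[i]);

		if (d)
			run_probe(tbl, n, d);
	}
	/* A real probe callback would be invoked here. */
	if (run_count < SKETCH_MAX_ORDER)
		run_order[run_count++] = p->name;
}

static void run_all_probes(struct sketch_probe *tbl, size_t n)
{
	for (size_t i = 0; i < n; i++)
		run_probe(tbl, n, &tbl[i]);
}
```

With a table declared in the order "npu" (depends on "phb"), "phb" (no dependencies), "pau" (depends on "phb" and "npu"), `run_all_probes()` records the order phb, npu, pau, regardless of the order of the table entries.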
2022-01-03 | fast-reboot: fix TLB cleanup after fast reboot | Nicholas Piggin | 1 | -0/+4
POWER9/10 are missing TLB flushing after fast reboot. Add it back to cpu_fast_reboot_complete(), which is where fast-reboot code thinks it should be. Suggested-by: Cédric Le Goater <clg@fr.ibm.com> Fixes: 53ef0db6e2 ("asm/head.S: set POWER9 radix HID bit at entry") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-23 | core/cpu: move sleep/wake synchronisation out from low level code | Nicholas Piggin | 1 | -110/+42
The sleep/wake synchronisation involves the waker setting a wake condition then testing if the target needs to be woken, vs setting a wake-required flag then testing the wake condition. The low level sleep state call comes after that. This patch moves the synchronisation out from the low level sleep functions and consolidates both copies into one place. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-23 | core/cpu: make cpu idle states simpler | Nicholas Piggin | 1 | -25/+50
Rework the CPU idle state code: * in_idle is true for any kind of idle including spinning. This is not used anywhere except for state assertions for now. * in_sleep is true for idle that requires an IPI to wake up. * in_job_sleep is true for in_sleep idle which is also cpu_wake_on_job. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-23 | core/cpu: move cpu_wake out of job_lock | Nicholas Piggin | 1 | -1/+2
There is no need to send the IPI while holding the job_lock. If the target does wake after the job is queued and before we send the IPI, it will check for new jobs anyway. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-23 | core/cpu: refactor IPI sending | Nicholas Piggin | 1 | -18/+13
Pull the IPI sending code into its own function where it is used in two places. cpu_wake() already checks in_idle, so its caller does not need to check pm_enabled. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-23 | core/cpu: remove POWER8 IPI loop | Nicholas Piggin | 1 | -3/+1
POWER8 does not have to loop sending IPIs until the destination wakes up. cpu_wake() only sends IPI so that should be enough here too. This will help the next patch make a common IPI sending function. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-23 | core/cpu: rewrite idle synchronisation | Nicholas Piggin | 1 | -100/+156
Idle reconfiguration is difficult to follow and verify as correct because it can occur while CPUs are in low-level idle routines. For example pm_enabled can change while CPUs are idle. If nothing else, this can result in "cpu_idle_p9 called pm disabled" messages. This changes the idle reconfiguration to always kick all other CPUs out of idle code first whenever idle settings (pm_enabled, IPIs, sreset, etc.) are to be changed. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-09 | ccan: switch list_add_before/after arguments to match upstream | Nicholas Piggin | 3 | -4/+4
Upstream ccan uses (list, existing entry, new entry) parameter ordering rather than (list, new entry, existing entry) ordering. Switch these to make syncing with upstream simpler. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
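The (list, existing entry, new entry) ordering can be illustrated with a tiny stand-alone list. The `mini_*` types and functions below are hypothetical stand-ins written for this sketch; they are not the ccan API, they only mirror the parameter order the message describes.

```c
#include <assert.h>

/* Minimal circular doubly linked list, for illustration only. */
struct mini_node {
	struct mini_node *prev, *next;
};

struct mini_head {
	struct mini_node n;
};

static void mini_head_init(struct mini_head *h)
{
	h->n.prev = h->n.next = &h->n;
}

/* (list, existing entry, new entry): link new_node right after existing. */
static void mini_add_after(struct mini_head *h, struct mini_node *existing,
			   struct mini_node *new_node)
{
	(void)h;  /* the list head is not needed for the linking itself */
	new_node->prev = existing;
	new_node->next = existing->next;
	existing->next->prev = new_node;
	existing->next = new_node;
}

/* (list, existing entry, new entry): link new_node right before existing. */
static void mini_add_before(struct mini_head *h, struct mini_node *existing,
			    struct mini_node *new_node)
{
	(void)h;
	new_node->next = existing;
	new_node->prev = existing->prev;
	existing->prev->next = new_node;
	existing->prev = new_node;
}
```

Reading a call such as `mini_add_before(&h, &c, &b)` left to right gives "in list h, before c, insert b", which is the reading the upstream ordering is meant to support.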
2021-10-19 | pau: hmi scom dump | Christophe Lombard | 1 | -144/+132
This patch adds a new function to dump PAU registers when an HMI has been raised and an OpenCAPI link has been hit by an error. For each register, the scom address and the register value are printed. hmi.c has been redesigned to support the new PHB/PCIEX type (PAU OpenCAPI). The *npu* functions now support the NPU and PAU units of P8, P9 and P10 chips. Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-10-19 | pau: create phb | Christophe Lombard | 2 | -4/+9
Implement the necessary operations for the OpenCAPI PHB type and populate the associated device-tree properties. The OpenCAPI PCI config Addr/Data registers are reachable through the Generation-ID Registers MMIO BARS. The Config Address and Data registers are located at the following offsets from the AFU Config BAR plus 320 KB. • Config Address for Brick 0 – Offset 0 • Config Data for Brick 0 – Offsets: ◦ 128 – 4-byte config register • Config Address for Brick 1 – Offset 256 • Config Data for Brick 1 – Offsets: ◦ 384 – 4-byte config register Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
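The offsets listed above can be turned into a small address calculation. This is a sketch under stated assumptions: the `sketch_*` names are hypothetical, and the per-brick stride of 256 bytes is inferred from the two bricks listed (address registers at offsets 0 and 256, data registers 128 bytes after each), not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* "AFU Config BAR plus 320 KB" */
#define SKETCH_CFG_AREA_OFFSET  (320u * 1024u)
/* Inferred: brick 1 address register sits 256 bytes after brick 0's. */
#define SKETCH_BRICK_STRIDE     256u
/* The 4-byte data register sits 128 bytes after the address register. */
#define SKETCH_CFG_DATA_OFFSET  128u

/* MMIO address of the Config Address register for a brick (0 or 1). */
static uint64_t sketch_cfg_addr_reg(uint64_t afu_cfg_bar, unsigned int brick)
{
	return afu_cfg_bar + SKETCH_CFG_AREA_OFFSET +
	       (uint64_t)brick * SKETCH_BRICK_STRIDE;
}

/* MMIO address of the Config Data register for a brick (0 or 1). */
static uint64_t sketch_cfg_data_reg(uint64_t afu_cfg_bar, unsigned int brick)
{
	return sketch_cfg_addr_reg(afu_cfg_bar, brick) + SKETCH_CFG_DATA_OFFSET;
}
```

For brick 1 this yields offsets 256 and 384 from the start of the config area, matching the list in the message.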
2021-10-19 | pau: introduce support | Christophe Lombard | 1 | -0/+3
OpenCAPI for P10 is included in the P10 chip. This requires OCAPI capable PHYs, Datalink Layer Logic and Transaction Layer Logic to be included. The PHYs are the physical connection to the OCAPI interconnect. The Datalink Layer provides link training. The Transaction Layer executes the cache coherent and data movement commands on the P10 chip. The PAU provides the Transaction Layer functionality for the OCAPI link(s) on the P10 chip. The P10 PAU supports two OCAPI links. Six PAU accelerator units are instantiated on the P10 chip for a total of twelve OCAPI links. This patch adds the PAU OpenCAPI structure for supporting OpenCAPI 5. The hw/pau.c file contains the main PAU management functions. Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-10-19 | AWAN simulator support for P10 | Ryan Grimm | 3 | -6/+18
This patch enables Skiboot to initialize and Linux to boot to user space on the AWAN core and chip models. We need the distinction between core and chip models because the core models do not have an XSCOM unit, CHIPTOD, nor RNG. The chip model does have them and they work. So, add a device_type property to the awan node to distinguish core from chip. Sample DTS are provided for the core and chip models in external/awan. Just like Mambo, we need to return in slw_init before trying to initialize SLW. Without an XSCOM unit in the device tree for the core model, the SLW code path eventually fails an assert due to lack of chips. This commit defines a QUIRK_AWAN where previously Mambo used QUIRK_MAMBO_CALLOUTS so now Mambo and AWAN core both work. Also, fix up chip quirks so the core model and chip model boot and initialize the appropriate units. Disable sreset and power management in a couple spots because the chip model does not support stop with EC=1 and enter_p9_pm_state spins in the branch-to-self after stop. Provide an external/awan/README.md with a high-level view of booting in the environment. Signed-off-by: Ryan Grimm <grimm@linux.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-10-19 | Remove support for POWER8 DD1 | Nicholas Piggin | 1 | -14/+9
This significantly simplifies the SLW code. HILE is now always supported. Reviewed-by: Stewart Smith <stewart@flamingspork.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-10-19 | cpu: add debug check in cpu_relax | Nicholas Piggin | 1 | -0/+6
If cpu_relax() is called when not at medium SMT priority, it will lose the prior priority and return at medium. Add a debug check to catch this, which would have flagged the previous bug. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-10-19 | cpu: cpu_idle_job SMT priority fix | Nicholas Piggin | 1 | -1/+0
Calling cpu_relax resets the SMT priority to medium, causing the idle loop not to run with lowest priority. Just use barrier() instead, this saves about 3 seconds on a SMT4 systemsim (mambo) boot. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-10-19 | interrupts: add_opal_interrupts avoid NULL dereference on P10 mambo | Nicholas Piggin | 1 | -1/+6
On P10, get_ics_phandle() calls xive2_get_phandle() directly. This results in a NULL dereference on mambo when xive2 is not set up. This was caught with the virtual memory boot patch on P10 mambo. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-09-09 | npu3: Remove GPU support on Swift | Frederic Barrat | 1 | -1/+0
npu3 was only used on the Swift platform to add support for GPUs (nvlink). The Swift platform has never left the lab and support for GPUs on it is pretty much dead. So let's remove it. The patch removes all related code. Device tree entries are no longer created and in the very unlikely case that someone is still trying to boot it, the linux nvlink discovery code should be quiet. Tested by booting on Swift with no GPU. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Reza Arbab <arbab@linux.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-08-18 | interrupts: Do not advertise XICS support on P10 | Cédric Le Goater | 1 | -1/+11
We only support the XIVE interface. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-08-06 | libpore: P10 stop-api support | Pratik Rajesh Sampat | 1 | -4/+27
Update libpore with P10 STOP API. Add minor changes to let the P9 stop-api and the P10 stop-api co-exist in OPAL. These calls are required for STOP11 support on P10. STOP0,2,3 on P10 do not lose full core state or scoms. stop-api based restore of SPRs or xscoms is required only for STOP11 on P10. STOP11 on P10 will be a limited lab test/stress feature and not a product feature. (Same case as P9) Co-authored-by: Pratik Rajesh Sampat <psampat@linux.ibm.com> Signed-off-by: Pratik Rajesh Sampat <psampat@linux.ibm.com> Co-authored-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com> Co-authored-by: Ryan Grimm <grimm@linux.ibm.com> Signed-off-by: Ryan Grimm <grimm@linux.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-08-06 | hw/phb5: Add initial support | Jordan Niethe | 2 | -1/+5
The PHB5 logic on P10 is pretty close to the P9's version. So we keep our base phb4 implementation and just add the few changes within if statements. Signed-off-by: Jordan Niethe <jpn@ozlabs.au.ibm.com> [clg: misc cleanups and fixes ] Signed-off-by: Cédric Le Goater <clg@kaod.org> [Fixed compilation issue - Vasant] Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [Nick: Unify PHB4/PHB5 drivers ] Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [Mikey: set default lane eq settings for phb5] Signed-off-by: Michael Neuling <mikey@neuling.org> [FB: squash commits + small cleanup ] Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-08-06 | xive/p10: Add a XIVE2 driver | Cédric Le Goater | 2 | -3/+13
The XIVE2 interrupt controller of the POWER10 processor follows the same logic as on POWER9 but the HW interface has been largely revised. It has a new register interface, different BARs, extra VSDs, new layout for the XIVE structures, and a set of new features which are described below. The OPAL XIVE2 driver code activating this controller was duplicated from P9 for clarity as the registers and structures have changed considerably. The same OPAL interface is implemented for OS compatibility and it should not impact existing Linux kernels, KVM included. Guest OS is not impacted either. Support for new features will be implemented in time and will require new support from the OS. * XIVE2 BARS The interrupt controller BARs have a different layout outlined below. Each sub-engine now has its own range and the indirect TIMA access was replaced with a set of pages, one per CPU, under the IC BAR: - IC BAR (Interrupt Controller) . 4 pages, one per sub-engine . 128 indirect TIMA pages - TM BAR (Thread Interrupt Management Area) . 4 pages - ESB BAR (ESB pages for IPIs) . up to 1TB - END BAR (ESB pages for ENDs) . up to 2TB - NVC BAR (Notification Virtual Crowd) . up to 128 - NVPG BAR (Notification Virtual Process and Group) . up to 1TB - Direct mapped Thread Context Area (reads & writes) OPAL does not use the grouping and crowd capability. * Virtual Structure Tables XIVE2 adds new table types and also changes the field layout of the END and NVP Virtualization Structure Descriptors. - EAS - END new layout - NVT was split into: . NVP (Processor), 32B . NVG (Group), 32B . NVC (Crowd == P9 block group) 32B - IC for remote configuration - SYNC for cache injection - ERQ for event input queue The setup is slightly different on XIVE2 because the indexing has changed for some of the tables: the block ID or the chip topology ID can be used. * XIVE2 features SCOM and MMIO registers have a new layout and XIVE2 adds new global capability and configuration registers. 
The low-level hardware offers a set of new features among which: - cache injection mechanism - 4 cache watch engines - a configurable number of priorities: 1-8 - StoreEOI with load-after-store ordering is activated by default - new sync/kill operations for cache operations Other features will have some impact on the Hypervisor and guest OS when activated, but this is not required for initial support of the controller. - Gen2 TIMA layout - A P9-compat mode, or Gen1, TIMA toggle bit for SW compatibility - Automatic Context save & restore - increase to 24bit for VP number - New escalation schemes: ESB, Adaptive, CPPR POWER10 adds support for User interrupts. When configured, the XIVE2 controller can directly notify user processes using the Event Based Branch exception line of the thread. If the process is not running, the OS is notified through an escalation event. New OPAL and PAPR interfaces will be required and OS support needs to be studied. * XIVE2 P9-compat mode, or Gen1 The thread interrupt management area (TIMA) is a set of pages mapped in the Hypervisor and in the guest OS address space giving access to the interrupt thread context registers for interrupt management, ACK, EOI, CPPR, etc. XIVE2 slightly changes the TIMA layout with extra bits for the new features and larger CAM lines, and the controller provides configuration switches for backward compatibility. This is called the XIVE2 P9-compat mode, or Gen1 TIMA. It impacts the layout of the TIMA and the availability of the internal features associated with it, Automatic Save & Restore for instance. Using a P9 layout also means setting the controller in such a mode at init time. The XIVE2 driver in OPAL chooses to initialize the XIVE2 controller with a XIVE2/P10 TIMA directly because the layouts are compatible with the Linux PowerNV and the guest OSes expectations. For KVM support, the OPAL calls abstract the HW interface and no assumption is made on the OS CAM line width. 
* Activating new XIVE2 features Everything related to OPAL internals such as the use of the new cache sync mechanism can be implemented in time without impact on the OS. Other features will require new device tree properties exposed to the OS and extra support from the OS. Automatic Context save & restore is one of the first features which should be looked at. * XICS-over-XIVE driver (P8 compatibility) The P8 emulation mode is an OPAL compat interface used for Linux kernels which did not have XIVE native support. This was useful for POWER9 bringup but it is much less so now. As it was adding a lot of complexity and reducing the interrupt controller resources, this mode is not available in the XIVE2 driver for POWER10. It will still be possible to add this compat mode in the future if required. The OS will have to reset the driver at boot time, like on POWER9. * Impact on other drivers (PSI, PHB, NPU) Interrupts are allocated in a very similar way. Each controller might have different ESB characteristics, StoreEOI support, 64K pages for PSI. All is in place to support these changes already. PHB5 will have support for "address-based trigger mode", probably in the DD2.0 time frame when verification is completed. When activated, the XIVE IC ESB pages will be used instead of the PHB ESB pages for a lower interrupt latency. LSI will still use old-fashioned triggers without StoreEOI. * Yet to be addressed: - OPAL P10 interface incomplete (stop states) - Clarify the PHB5 strategy regarding the use of the XIVE IC ESB pages instead of the PHB ones when address-based trigger mode is supported. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-08-06 | hdat/spira: Define ibm,primary-topology-index property per chip | Haren Myneni | 1 | -0/+3
HDAT provides the Topology ID table and the primary topology location on P10. This primary location points to the primary topology entry in the ID table, which contains the primary topology index, and this index is used to define the paste base address per chip. This patch reads the Topology ID table and the primary topology location from hdata and retrieves the primary topology index in the ID table. Make this primary topology index value available with the ibm,primary-topology-index property per chip. VAS reads this property to set up the paste base address for each chip. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-08-06 | plat/qemu/p10: add a POWER10 platform | Cédric Le Goater | 1 | -0/+1
BMC is still defined as ast2500 but it should change to ast2600 when available. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-08-06 | cpufeatures: Add POWER10 support | Nicholas Piggin | 1 | -22/+82
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> [Folded Ravi's DAWR patch - Vasant] Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-08-06 | p10: Workaround core recovery issue | Michael Neuling | 1 | -0/+36
This works around a core recovery issue in P10. The workaround involves the CME polling for a core recovery and performing the recovery procedure itself. For this to happen, the host leaves core recovery off (HID[5]) and then masks the PC system checkstop. This patch does this. Firmware starts skiboot with recovery already off, so we just leave it off for longer and then mask the PC system checkstop. This makes the window longer where a core recovery can cause an xstop but this window is still small and can still only happen on boot. Signed-off-by: Michael Neuling <mikey@neuling.org> [Added mambo check - Vasant] Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-08-06 | Initial POWER10 enablement | Nicholas Piggin | 8 | -66/+725
Co-authored-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Co-authored-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com> Co-authored-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Co-authored-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Co-authored-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Co-authored-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-08-04 | POWER9 Cleanups: de-assert SPW | Pratik R. Sampat | 1 | -0/+2
De-assert the special wakeup bits in the case where the SPWU bit is set but the core is gated, to maintain a coherent state for special wakeup. Signed-off-by: Pratik R. Sampat <psampat@linux.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-06-30 | core/cpu: Initialize all cpu thread areas to avoid invalid memory access. | Mahesh Salgaonkar | 1 | -2/+24
Starting from p10, hostboot will no longer clear all the system memory except its own space. OPAL uses the memory at SKIBOOT_BASE + SKIBOOT_SIZE for cpu stacks with pir as index. With hostboot no longer clearing memory, this region may hold junk contents. Currently OPAL initializes cpu stack memory only for the cpu pirs that are found in the device-tree. For the rest, the cpu thread contents are uninitialized. This sometimes causes the for_each_cpu* macros to return a cpu thread for a pir/cpu which isn't present on the system. The for_each_cpu* macros iterate over cpu stacks using pir as index and return a cpu thread pointer if state != cpu_state_no_cpu. For cpus that are not found in the device-tree the state may hold a junk value, leading OPAL to access an invalid cpu thread area. This further leads to accessing pointers with junk values, causing a machine check (MCE) during OPAL init code. Fix this by initializing all the cpu thread areas up to cpu_max_pir. [ 182.049714372,3] *********************************************** [ 182.049878580,3] Fatal MCE at 0000000030039738 .init_trace_buffers+0x21c MSR 9000000000201002 [ 182.049943811,3] Cause: load real address error [ 182.049968681,3] Effective address: 0x480113a4791c4a50 [ 182.050000736,3] CFAR : 00000000300395b8 MSR : 9000000000201002 [ 182.050035376,3] SRR0 : 0000000030039738 SRR1 : 9000000000201002 [ 182.050072878,3] HSRR0: 0000000030020024 HSRR1: 9000000000001000 [ 182.050117303,3] DSISR: 00000040 DAR : 480113a4791c4a50 [ 182.050149054,3] LR : 0000000030039744 CTR : 0000000000000000 [ 182.050182991,3] CR : 42000224 XER : 00000000 [ 182.050217262,3] GPR00: 000000003003962c GPR16: 0000000032d50000 [ 182.050255746,3] GPR01: 0000000032d53a50 GPR17: 0000000030003198 [ 182.050288081,3] GPR02: 000000003014cb00 GPR18: 0000000000000000 [ 182.050331474,3] GPR03: 0000000031c50000 GPR19: 0000000000000000 [ 182.050371934,3] GPR04: 0000000000000000 GPR20: 0000000000000000 [ 182.050416212,3] GPR05: ffffffffffffffff GPR21: 0000000000000001 [ 
182.050454130,3] GPR06: 0000000000000005 GPR22: 00000000300f74eb [ 182.050488053,3] GPR07: 0000000000000028 GPR23: 00000000000fffd8 [ 182.050522774,3] GPR08: 000000000000067f GPR24: 00000000000fff40 [ 182.050566878,3] GPR09: 480113a4791c4a18 GPR25: 0000000000000070 [ 182.050601524,3] GPR10: 00000000078b0353 GPR26: 00000000300f7527 [ 182.050640345,3] GPR11: 0000000000000000 GPR27: 00000000300f7516 [ 182.050680816,3] GPR12: 0000000042000222 GPR28: 000000003acd0000 [ 182.050724099,3] GPR13: 000000000025a908 GPR29: 000000003acd0000 [ 182.050759728,3] GPR14: 0000000000000000 GPR30: 0000000000000000 [ 182.050790430,3] GPR15: 0000000000000000 GPR31: 00000000301f0038 CPU 0228 Backtrace: S: 0000000032d53d60 R: 000000003003962c .init_trace_buffers+0x110 S: 0000000032d53e30 R: 0000000030022f84 .main_cpu_entry+0x550 S: 0000000032d53f00 R: 00000000300031f8 not_fused+0x11c Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [Folded Nick's patch to that added mark_all_secondary_cpus_absent() - Vasant] Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
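The failure mode and the fix can be modelled in a few lines. This sketch is not the skiboot code: the `sketch_*` names, the tiny maximum PIR, and the 0xA5 junk pattern are assumptions for illustration. It shows that an iteration which skips entries by state returns phantom CPUs when the backing memory holds junk, and returns only the real ones once every thread area up to the maximum PIR is initialized first.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_MAX_PIR 7  /* hypothetical, tiny for illustration */

enum sketch_cpu_state {
	SKETCH_STATE_NO_CPU = 0,
	SKETCH_STATE_PRESENT,
};

struct sketch_cpu_thread {
	unsigned int pir;
	int state;
};

/* Stands in for the per-PIR cpu stack area. */
static struct sketch_cpu_thread sketch_cpus[SKETCH_MAX_PIR + 1];

/* Model memory that firmware no longer clears before handing over. */
static void sketch_fill_with_junk(void)
{
	memset(sketch_cpus, 0xA5, sizeof(sketch_cpus));
}

/* The fix: initialize every thread area, not only the ones in the device tree. */
static void sketch_init_all_thread_areas(void)
{
	for (unsigned int pir = 0; pir <= SKETCH_MAX_PIR; pir++) {
		memset(&sketch_cpus[pir], 0, sizeof(sketch_cpus[pir]));
		sketch_cpus[pir].pir = pir;
		sketch_cpus[pir].state = SKETCH_STATE_NO_CPU;
	}
}

/* Models discovering a CPU in the device tree. */
static void sketch_mark_present(unsigned int pir)
{
	assert(pir <= SKETCH_MAX_PIR);
	sketch_cpus[pir].state = SKETCH_STATE_PRESENT;
}

/* Models a for_each_cpu-style walk that skips entries marked as absent. */
static unsigned int sketch_count_cpus(void)
{
	unsigned int n = 0;

	for (unsigned int pir = 0; pir <= SKETCH_MAX_PIR; pir++)
		if (sketch_cpus[pir].state != SKETCH_STATE_NO_CPU)
			n++;
	return n;
}
```

Without the initialization step, every junk-filled entry looks like a CPU to the walk; with it, only the entries explicitly marked present are counted.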
2021-06-30 | fast-reboot: Fix the bonus cleanup_cpu_state() | Oliver O'Halloran | 1 | -2/+10
Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-06-30 | i2c,trace: Add I2C operation trace events | Oliver O'Halloran | 1 | -0/+32
Add support for tracing I2C transactions performed by skiboot. This covers both internally initiated I2C ops and those requested by the kernel via the OPAL API. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-06-30 | trace: Add nvram hack to use the old trace export behaviour | Oliver O'Halloran | 3 | -7/+18
Previously we put all the trace buffer exports in the exports/ node. However, there's one trace buffer for each core so I moved them into a subdirectory since they were crowding up the place. Most kernels don't support recursively exporting subnodes though, so add a hack to restore the old behaviour for now. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> [Fixed run-trace test case - Vasant] Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2021-06-30 | core/mce: POWER9 fix machine check decoding of async errors | Nicholas Piggin | 1 | -0/+13
Async machine check errors due to a bad real address from a store or a foreign link time out come with the load/store bit (PPC bit 42) set in SRR1, but the cause is reported in SRR1, not DSISR, unlike other errors that have the load/store bit set. This behaviour was omitted from the POWER9 User Manual but it is confirmed to be the expected one. Update the machine check decoder to match. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
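"PPC bit 42" uses IBM bit numbering, where bit 0 is the most significant bit of the 64-bit register. The sketch below only shows how that bit translates to a mask and how it can be tested; the `SKETCH_*` and `sketch_*` names are hypothetical, and the decoding of the actual error cause is deliberately left out because the message does not give the field layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of a 64-bit value. */
#define SKETCH_PPC_BIT(n)  (1ULL << (63 - (n)))

/* Load/store indication in SRR1, "PPC bit 42" in the message above. */
#define SKETCH_SRR1_LDST   SKETCH_PPC_BIT(42)

/* True when the machine check SRR1 value has the load/store bit set. */
static bool sketch_mce_is_ldst(uint64_t srr1)
{
	return (srr1 & SKETCH_SRR1_LDST) != 0;
}
```

As a cross-check against the register dumps quoted elsewhere on this page, the SRR1 value 9000000000201002 from the "load real address error" machine check has this bit set, while the SRR1 value 9000000000021002 from the lock-error TRAP does not.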
2021-06-30 | cpu: Add retry in cpu_pm_disable to kick cpus out of idle | Vaidyanathan Srinivasan | 1 | -2/+11
cpu_pm_idle sets pm_enabled = false and expects all cpus to exit idle. This is needed to re-enter with new settings. Right after cpu_bringup() we call copy_sreset_vector() and then cpu_set_sreset_enable(true). At this time some cpus have yet to enter idle and hence miss the doorbell to wake up. This leads to cpu_pm_idle waiting forever. This pattern happens on some systems in fused-core mode. The fact that the pm_enabled flag is changing right in the middle of idle entry is seen from the "cpu_idle_p9 called with pm disabled" traces. One method to fix this race is to retry the doorbell after a timeout. This patch implements a small timeout (a few seconds) and then issues the doorbell once again to kick the cpu that entered idle late after missing the pm_enabled = false flag. This checking loop runs in smt_lowest() and hence the timeout number maps to a couple of seconds, which is sufficient to let the cpus settle in idle and make them see the doorbell and exit. Example boot log: [ 288.309322810,7] INIT: CPU PIR 0x000d called in [ 288.309320768,7] INIT: CPU PIR 0x000b called in [ 288.314603802,7] INIT: CPU PIR 0x0020 called in [ 288.321303468,5] CPU: All 88 processors called in... 
[ 288.315056796,6] cpu_idle_p9 called on cpu 0x024e with pm disabled [ 288.321308091,6] cpu_idle_p9 called on cpu 0x0264 with pm disabled [ 288.314424259,6] cpu_idle_p9 called on cpu 0x025b with pm disabled [ 288.324928307,6] cpu_idle_p9 called on cpu 0x0065 with pm disabled [ 305.207316004,6] cpu_pm_disable TIMEOUT on cpu 0x0261 to exit idle [ 322.093298501,6] cpu_pm_disable TIMEOUT on cpu 0x0263 to exit idle [ 338.491281028,6] cpu_pm_disable TIMEOUT on cpu 0x0265 to exit idle [ 355.377263492,6] cpu_pm_disable TIMEOUT on cpu 0x0267 to exit idle [ 372.263245960,6] cpu_pm_disable TIMEOUT on cpu 0x0269 to exit idle [ 389.149228389,6] cpu_pm_disable TIMEOUT on cpu 0x026b to exit idle [ 406.035210852,6] cpu_pm_disable TIMEOUT on cpu 0x026d to exit idle [ 422.433193381,6] cpu_pm_disable TIMEOUT on cpu 0x026f to exit idle [ 422.433277720,6] CHIPTOD: Calculated MCBS is 0x25 (Cfreq=2000000000 Tfreq=32000000) Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> [Reworded commit message - Vasant] Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
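The retry-after-timeout approach can be sketched with a simulated target CPU. Everything here is hypothetical: the `sketch_*` names, the poll-count timeout, and the way a "missed" doorbell is modelled are illustration only and are not taken from cpu_pm_disable. The waiter sends a doorbell, polls for a bounded number of iterations, and re-sends the doorbell if the target is still in idle.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated target CPU that may miss some doorbells (entered idle late). */
struct sketch_target {
	bool in_idle;
	unsigned int doorbells_to_miss;
	unsigned int doorbells_seen;
};

static void sketch_send_doorbell(struct sketch_target *t)
{
	t->doorbells_seen++;
	if (t->doorbells_to_miss) {
		t->doorbells_to_miss--;  /* doorbell arrived too early, lost */
		return;
	}
	t->in_idle = false;  /* target woke up and left idle */
}

/*
 * Kick the target out of idle. Returns the number of retries that were
 * needed, or -1 if the target was still idle after max_retries retries.
 */
static int sketch_kick_out_of_idle(struct sketch_target *t,
				   unsigned int poll_limit,
				   unsigned int max_retries)
{
	unsigned int retries = 0;

	sketch_send_doorbell(t);
	for (;;) {
		unsigned int i = 0;

		/* Bounded poll: this plays the role of the timeout. */
		do {
			if (!t->in_idle)
				return (int)retries;
		} while (++i < poll_limit);

		if (retries == max_retries)
			return -1;
		retries++;
		sketch_send_doorbell(t);  /* timeout expired: ring again */
	}
}
```

A target that misses exactly one doorbell is woken by the first retry, which corresponds to the "TIMEOUT ... to exit idle" lines in the boot log being followed by normal progress.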
2021-05-13 | hw/imc: move imc_init() towards end main_cpu_entry() | Madhavan Srinivasan | 1 | -3/+3
imc_init() checks for the 24x7 microcode state at boot to check whether the microcode is in proper state (running or paused). But in a larger system, loading of 24x7 microcode by OCC gets delayed. Because of this, imc_init() removes imc devices from the device tree. Moving imc_init() function towards end of the main_cpu_entry() works around this. Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
2020-12-15  Fix possible deadlock with DEBUG build  Vasant Hegde  1  -2/+2
Sample output from Cédric:
-------------------------
[ 88.294111649,7] cpu_idle_p9 called on cpu 0x063c with pm disabled
[ 88.289365222,7] cpu_idle_p9 called on cpu 0x025f with pm disabled
[ 88.289900684,7] cpu_idle_p9 called on cpu 0x045f with pm disabled
[ 88.302621295,7] CHIPTOD: Base TFMR=0x2512000000000000
[ 88.289899701,7] cpu_idle_p9 called on cpu 0x0456 with pm disabled
LOCK ERROR: Deadlock detected @0x30402740 (state: 0x0000000400000001)
[ 88.332264757,3] ***********************************************
[ 88.332300051,3] < assert failed at core/lock.c:32 >
[ 88.332328282,3] .
[ 88.332347335,3] .
[ 88.332364894,3] .
[ 88.332377963,3] OO__)
[ 88.332395458,3] <"__/
[ 88.332412628,3] ^ ^
[ 88.332450246,3] Fatal TRAP at 00000000300286a0 .lock_error+0x64 MSR 9000000000021002
[ 88.332501812,3] CFAR : 00000000300414f4 MSR : 9000000000021002
[ 88.332536539,3] SRR0 : 00000000300286a0 SRR1 : 9000000000021002
[ 88.332574644,3] HSRR0: 0000000030020024 HSRR1: 9000000000001000
[ 88.332610635,3] DSISR: 00000000 DAR : 0000000000000000
[ 88.332650628,3] LR : 0000000030028690 CTR : 00000000300f9fa0
[ 88.332684451,3] CR : 20002000 XER : 00000000
[ 88.332712767,3] GPR00: 0000000030028690 GPR16: 0000000032c98000
[ 88.332748046,3] GPR01: 0000000032c9b0a0 GPR17: 0000000000000000
[ 88.332784060,3] GPR02: 0000000030169d00 GPR18: 0000000000000000
[ 88.332822091,3] GPR03: 0000000032c9b310 GPR19: 0000000000000000
[ 88.332861357,3] GPR04: 0000000030041480 GPR20: 0000000000000000
[ 88.332897229,3] GPR05: 0000000000000000 GPR21: 0000000000000000
[ 88.332937051,3] GPR06: 0000000000000010 GPR22: 0000000000000000
[ 88.332968463,3] GPR07: 0000000000000000 GPR23: 0000000000000000
[ 88.333007333,3] GPR08: 000000000002cbb5 GPR24: 0000000000000000
[ 88.333041971,3] GPR09: 0000000000000000 GPR25: 0000000000000000
[ 88.333081073,3] GPR10: 0000000000000000 GPR26: 0000000000000003
[ 88.333114301,3] GPR11: 3839616263646566 GPR27: 0000000000000211
[ 88.333156040,3] GPR12: 0000000020002000 GPR28: 000000003042a134
[ 88.333189222,3] GPR13: 0000000000000000 GPR29: 0000000030402740
[ 88.333225638,3] GPR14: 0000000000000000 GPR30: 0000000000000001
[ 88.333259730,3] GPR15: 0000000000000000 GPR31: 0000000000000000
CPU 0211 Backtrace:
 S: 0000000032c9b3b0 R: 0000000030028690 .lock_error+0x54
 S: 0000000032c9b440 R: 0000000030028828 .add_lock_request+0xd0
 S: 0000000032c9b4f0 R: 0000000030028a9c .lock_caller+0x8c
 S: 0000000032c9b5a0 R: 0000000030021b30 .__mcount_stack_check+0x70
 S: 0000000032c9b650 R: 00000000300fabb0 .list_check_node+0x1c
 S: 0000000032c9b6f0 R: 00000000300fac98 .list_check+0x38
 S: 0000000032c9b790 R: 00000000300289bc .try_lock_caller+0xac
 S: 0000000032c9b830 R: 0000000030028ad8 .lock_caller+0xc8
 S: 0000000032c9b8e0 R: 0000000030028d74 .lock_recursive_caller+0x54
 S: 0000000032c9b980 R: 0000000030020cb8 .console_write+0x48
 S: 0000000032c9ba30 R: 00000000300445a8 .vprlog+0xc8
 S: 0000000032c9bc20 R: 0000000030044630 ._prlog+0x50
 S: 0000000032c9bcb0 R: 0000000030029204 .cpu_idle_p9+0x74
 S: 0000000032c9bd40 R: 0000000030029628 .cpu_idle_pm+0x4c
 S: 0000000032c9bde0 R: 0000000030023fe0 .__secondary_cpu_entry+0xa0
 S: 0000000032c9be70 R: 0000000030024034 .secondary_cpu_entry+0x40
 S: 0000000032c9bf00 R: 0000000030003290 secondary_wait+0x8c

CPU 0x4:
 opal_run_pollers -> check_stacks -> takes stack_check_lock lock
 prlog -> console_write -> waits for con_lock

CPU 0x211
 cpu_idle_p9 -> prlog -> console_write -> Takes con_lock lock
 list_check_node -> tries to take stack_check_lock and hits deadlock.

I think we don't need to hold `stack_check_lock` while printing backtraces. Instead it makes sense to hold backtrace lock (bt_lock) and print output.

Reported-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Tested-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
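The lock ordering problem in the commit above is a classic two-lock circular wait. The following is a rough sketch of it, assuming a toy lock that only records its owner: struct sim_lock and every sim_* function are hypothetical, and only the lock names (stack_check_lock, con_lock, bt_lock) and CPU numbers come from the commit message.

```c
#include <stdbool.h>

#define SIM_FREE (-1)

/* Toy lock that only remembers which CPU owns it (hypothetical). */
struct sim_lock {
	int owner;
};

static void sim_take(struct sim_lock *l, int cpu)
{
	l->owner = cpu;
}

static void sim_release(struct sim_lock *l)
{
	l->owner = SIM_FREE;
}

/* Two CPUs are deadlocked when each waits on a lock the other owns. */
static bool sim_circular_wait(int cpu_a, const struct sim_lock *a_wants,
			      int cpu_b, const struct sim_lock *b_wants)
{
	return a_wants->owner == cpu_b && b_wants->owner == cpu_a;
}

/* Before the fix: CPU 0x4 prints while still holding stack_check_lock. */
bool sim_old_order_deadlocks(void)
{
	struct sim_lock stack_check_lock = { SIM_FREE };
	struct sim_lock con_lock = { SIM_FREE };

	sim_take(&stack_check_lock, 0x4);	/* check_stacks */
	sim_take(&con_lock, 0x211);		/* console_write on CPU 0x211 */

	/* 0x4 now wants con_lock, 0x211 now wants stack_check_lock. */
	return sim_circular_wait(0x4, &con_lock, 0x211, &stack_check_lock);
}

/* After the fix: stack_check_lock is dropped and bt_lock held to print. */
bool sim_new_order_deadlocks(void)
{
	struct sim_lock stack_check_lock = { SIM_FREE };
	struct sim_lock con_lock = { SIM_FREE };
	struct sim_lock bt_lock = { SIM_FREE };

	sim_take(&stack_check_lock, 0x4);	/* scan the stacks */
	sim_release(&stack_check_lock);		/* released before printing */
	sim_take(&bt_lock, 0x4);		/* serialises backtrace output */
	sim_take(&con_lock, 0x211);

	/* 0x211 can still take stack_check_lock, so the cycle is gone. */
	return sim_circular_wait(0x4, &con_lock, 0x211, &stack_check_lock);
}
```

In the second case CPU 0x4 may still have to wait for con_lock, but CPU 0x211 is free to take stack_check_lock, finish its console write and release con_lock, so both make progress.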
2020-11-27  core/opal.c: sparse cleanup integer as NULL  Stewart Smith  1  -1/+1
Fixes:
core/opal.c:418:61: warning: Using plain integer as NULL pointer

Signed-off-by: Stewart Smith <stewart@flamingspork.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
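For readers unfamiliar with this sparse warning, here is a small illustration of the pattern it flags. It is not the code at core/opal.c:418; handler_fn, call_handler() and the two demo functions are hypothetical and exist only to show a literal 0 used where a pointer is expected.

```c
#include <stddef.h>

/* Hypothetical callback type, for illustration only. */
typedef int (*handler_fn)(void *data);

/* Call the handler if one was supplied, otherwise report -1. */
static int call_handler(handler_fn fn, void *data)
{
	return fn ? fn(data) : -1;
}

/* sparse would flag both arguments: plain integer used as NULL pointer. */
int demo_plain_integer(void)
{
	return call_handler(0, 0);
}

/* Same behaviour, but the intent is explicit and sparse stays quiet. */
int demo_null(void)
{
	return call_handler(NULL, NULL);
}
```

Both variants compile to the same thing, so a cleanup of this kind changes readability and static-analysis noise rather than behaviour.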
2020-11-27  core/platform: Fallback to full_reboot if fast-reboot fails  Vasant Hegde  1  -1/+2
If fast reboot fails, we return to Linux with OPAL_SUCCESS. Current Linux code then thinks the request succeeded and enters an infinite loop (see the Linux pnv_restart() code). This patch fixes the issue by returning OPAL_UNSUPPORTED if fast reboot fails. Alternatively, we could call full_reboot() directly, but I think it makes sense to go back to Linux and report the failure. Linux then falls back to a normal reboot request.

Fixes: 10bbcd07 ("core/platform: Add an explicit fast-reboot type")
Cc: Oliver O'Halloran <oohall@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Dan Horák <dan@danny.cz>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
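The effect of the return code change can be shown with a small model of the firmware call and its caller. This is a sketch under assumptions, not the skiboot or Linux code: the sim_* names and enum sim_outcome are hypothetical, and the SIM_OPAL_* values are stand-ins chosen for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in return codes for this sketch (assumed values). */
#define SIM_OPAL_SUCCESS	0
#define SIM_OPAL_UNSUPPORTED	(-7)

enum sim_outcome {
	SIM_FAST_REBOOT,	/* fast reboot actually happened */
	SIM_FULL_REBOOT,	/* caller fell back to a normal reboot */
	SIM_STUCK_WAITING	/* caller believes a reboot is in progress */
};

/*
 * Model of the firmware side. On real hardware a successful fast reboot
 * does not return; here *rebooted stands in for that.
 */
static int64_t sim_fw_fast_reboot(bool fast_works, bool patched, bool *rebooted)
{
	*rebooted = fast_works;
	if (fast_works)
		return SIM_OPAL_SUCCESS;
	/* Unpatched firmware reports success even though nothing happened. */
	return patched ? SIM_OPAL_UNSUPPORTED : SIM_OPAL_SUCCESS;
}

/* Model of the OS side: it only falls back when it is told about a failure. */
enum sim_outcome sim_os_restart(bool fast_works, bool patched)
{
	bool rebooted;
	int64_t rc = sim_fw_fast_reboot(fast_works, patched, &rebooted);

	if (rebooted)
		return SIM_FAST_REBOOT;
	if (rc == SIM_OPAL_SUCCESS)
		return SIM_STUCK_WAITING;	/* waits for a reboot that never comes */
	return SIM_FULL_REBOOT;
}
```

Returning an error keeps the fallback policy with the caller, which matches the reasoning in the commit message for not calling full_reboot() directly from firmware.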
2020-11-27  core/cpu: fix next_ungarded_primary  Nicholas Piggin  1  -4/+2
next_unguarded_primary dereferences a NULL CPU pointer, which is undefined behaviour and results in an infinite loop.

Fast reboot works again after this patch.

Fixes: 98f5834253c7e ("cpu: Keep track of the "ec_primary" in big core more")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
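The class of bug described above is an iterator that inspects a CPU before checking whether the list has ended. Below is a minimal sketch of the safe shape, assuming a simple singly linked list: struct sim_thread and the sim_* helpers are hypothetical and do not reproduce skiboot's struct cpu_thread or its real iterator.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical CPU thread record for this sketch. */
struct sim_thread {
	bool is_primary;		/* primary thread of its core */
	bool guarded;			/* guarded out, must be skipped */
	struct sim_thread *next;
};

/* Next CPU that is not guarded out; passing NULL starts from the head. */
static struct sim_thread *sim_next_unguarded(struct sim_thread *head,
					     struct sim_thread *cpu)
{
	cpu = cpu ? cpu->next : head;
	while (cpu && cpu->guarded)
		cpu = cpu->next;
	return cpu;
}

/* Next unguarded primary thread, or NULL at the end of the list. */
struct sim_thread *sim_next_unguarded_primary(struct sim_thread *head,
					      struct sim_thread *cpu)
{
	do {
		cpu = sim_next_unguarded(head, cpu);
	} while (cpu && !cpu->is_primary);	/* test cpu before dereferencing */
	return cpu;
}

/* Typical use: walk every unguarded primary exactly once. */
int sim_count_primaries(struct sim_thread *head)
{
	int n = 0;

	for (struct sim_thread *c = sim_next_unguarded_primary(head, NULL); c;
	     c = sim_next_unguarded_primary(head, c))
		n++;
	return n;
}
```

The important detail is the `cpu &&` guard in the loop condition. Without it, reaching the end of the list means reading a field through a NULL pointer, and the compiler is then free to produce a loop that never terminates.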
2020-10-01  core/flash.c: add SECBOOT read and write support  Claudio Carvalho  1  -0/+126
In secure boot enabled systems, the petitboot linux kernel verifies the OS kernel against x509 certificates that are wrapped in secure variables controlled by OPAL. These secure variables are stored in the PNOR SECBOOT partition, as well as the updates submitted for them using userspace tools.

This patch adds read and write support to the PNOR SECBOOT partition in a similar fashion to that of NVRAM, so that OPAL can handle the secure variables.

Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com>
Signed-off-by: Eric Richter <erichte@linux.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>