path: root/core
Age | Commit message | Author | Files | Lines
2018-04-30 | Add SBE driver support | Vasant Hegde | 1 | -0/+7
SBE (Self Boot Engine) on P9 has two different jobs:
- Boot the chip up to the point the core is functional
- Provide various services like timer, scom, stash MPIPL, etc., at runtime

OPAL can communicate to SBE via a set of data and control registers provided by the PSU block in P9 chip.
- Four 8 byte registers for Host to send command packets to SBE
- Four 8 byte registers for SBE to send response packets to Host
- Two doorbell registers (1 on each side) to alert either party when data is placed in above mentioned data register

Protocol constraints: Only one command is accepted in the command buffer until the response for the command is enqueued in the response buffer by SBE.

Usage: We will use SBE for various purposes like timer, MPIPL, etc. This patch implements the SBE MBOX spec for OPAL to communicate with SBE.

Design consideration:
- Each chip has SBE. We need to track SBE messages per chip. Hence added per chip sbe structure and list of messages to that chip
- SBE accepts only one command at a time. Hence serialized MBOX commands.
- OPAL gets interrupted once SBE sets doorbell register
- OPAL has to clear doorbell register after reading response
- Every command class has timeout option. Timed out messages are discarded
- SBE MBOX commands can be classified into four types:
  - Those that must be sent to the master only (ex: sending MDST/MDDT info)
  - Those that must be sent to slaves only (ex: continue MPIPL)
  - Those that must be sent to all chips (ex: close insecure window)
  - Those that can be sent to any chip (ex: timer)

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
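The "one command at a time, per chip" constraint described in this commit message can be sketched as a small per-chip FIFO with an in-flight flag. This is only an illustration: `sbe_chip`, `sbe_msg`, `sbe_queue_msg()`, `sbe_send_next()` and `sbe_response_done()` are hypothetical names, not the driver's actual structures or functions, and timeouts and doorbell register access are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the skiboot SBE driver API. */
#define SBE_QUEUE_MAX 8

struct sbe_msg {
	unsigned int cmd;	/* command word (illustrative) */
};

struct sbe_chip {
	struct sbe_msg *queue[SBE_QUEUE_MAX];	/* pending messages, FIFO order */
	size_t head, count;
	bool in_flight;		/* true while the SBE owns the command buffer */
};

/* Queue a message for this chip; returns false if the queue is full. */
static inline bool sbe_queue_msg(struct sbe_chip *chip, struct sbe_msg *msg)
{
	if (chip->count == SBE_QUEUE_MAX)
		return false;
	chip->queue[(chip->head + chip->count) % SBE_QUEUE_MAX] = msg;
	chip->count++;
	return true;
}

/* Pick the next message to send, or NULL if one is in flight or none is queued. */
static inline struct sbe_msg *sbe_send_next(struct sbe_chip *chip)
{
	if (chip->in_flight || chip->count == 0)
		return NULL;
	chip->in_flight = true;
	return chip->queue[chip->head];
}

/* Response doorbell handled: retire the head message and free the buffer. */
static inline void sbe_response_done(struct sbe_chip *chip)
{
	if (!chip->in_flight)
		return;
	chip->head = (chip->head + 1) % SBE_QUEUE_MAX;
	chip->count--;
	chip->in_flight = false;
}
```

Keeping the message at the head of the queue until its response arrives means a second `sbe_send_next()` call simply returns NULL, which is one way to serialize commands without a separate lock-step state machine.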
2018-04-29 | OPAL_PCI_SET_POWER_STATE: fix locking in error paths | Stewart Smith | 1 | -4/+12
Otherwise we could exit OPAL holding locks, potentially leading to all sorts of problems later on. Cc: stable # 5.3+ Fixes: 7a3e2c4ee3aa0 Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-29 | core/pci-dt-slots: Fix devfn lookup | Oliver O'Halloran | 1 | -1/+1
We only want to use the device part of the bdfn when looking up the switch down port. The required bit twiddling happens inside find_devfn(), and the masking here is broken since it: a) keeps the fn part of the bdfn, and b) masks off part of the device number. This breaks looking up the slot information in some cases. Fixes: 6878b806682f ("pci-dt-slot: Big ol' cleanup") Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
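For reference, here is a minimal sketch of splitting a bdfn into its device and function parts, assuming the conventional PCI layout of bus in bits 15:8, device in bits 7:3 and function in bits 2:0. The helper names are illustrative and are not taken from the skiboot source.

```c
#include <stdint.h>

/* Conventional PCI BDFN layout: bus[15:8], device[7:3], function[2:0]. */
static inline uint8_t bdfn_to_dev(uint16_t bdfn)
{
	return (bdfn >> 3) & 0x1f;	/* drop the 3 function bits, keep 5 device bits */
}

static inline uint8_t bdfn_to_fn(uint16_t bdfn)
{
	return bdfn & 0x7;		/* low 3 bits only */
}
```

With this layout, a bdfn of 0x0a13 is bus 0x0a, device 2, function 3; a lookup keyed on the device alone must not let the function bits leak into the comparison.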
2018-04-24 | opal/hmi: Generate one event per core for processor recovery. | Mahesh Salgaonkar | 1 | -3/+3
Processor recovery is a per-core error. All threads on that core receive the HMI, but not every thread needs to generate an HMI event for the same error. Let only thread 0 generate the event. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-24 | opal:hmi: Add missing processor recovery reason string. | Mahesh Salgaonkar | 1 | -0/+1
With this patch, the reason string is now printed for the CORE_WOF[43] bit:

[ 477.352234986,7] HMI: [Loc: U78D3.001.WZS004A-P1-C48]: P:8 C:22 T:3: Processor recovery occurred.
[ 477.352240742,7] HMI: Core WOF = 0x0000000000100000 recovered error:
[ 477.352242181,7] HMI: PC - Thread hang recovery

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
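The relationship between "CORE_WOF[43]" and the logged value 0x0000000000100000 comes from IBM (big-endian) bit numbering, where bit 0 is the most significant bit of the 64-bit register. A small sketch, using an illustrative `IBM_BIT()` macro defined here for the example:

```c
#include <stdbool.h>
#include <stdint.h>

/* IBM numbering: bit 0 is the most significant bit of a 64-bit word. */
#define IBM_BIT(n) (UINT64_C(0x8000000000000000) >> (n))

/* True if the given IBM-numbered bit is set in a register value. */
static inline bool ibm_bit_is_set(uint64_t reg, unsigned int bit)
{
	return (reg & IBM_BIT(bit)) != 0;
}
```

Bit 43 is therefore `1 << (63 - 43)`, i.e. `1 << 20`, which is the 0x100000 seen in the "Core WOF" log line.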
2018-04-23 | pci-dt-slot: Big ol' cleanup | Oliver O'Halloran | 1 | -80/+74
The underlying data that we get from HDAT can only really describe a PCIe system. As such we can simplify the devicetree slot lookup code by only caring about the important cases, namely, root ports and switch downstream ports. This also fixes a bug where the root port didn't get a Slot label applied, which results in devices under that port not having ibm,loc-code set. This results in the EEH core being unable to report the location of EEHed devices under that port. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-19 | sensors: Dont add DTS sensors when OCC inband sensors are available | Shilpasri G Bhat | 2 | -2/+3
There are two sets of core temperature sensors today. One is the DTS scom based core temperature sensors and the second group is the sensors provided by OCC. DTS is the highest temperature among the different temperature zones in the core, while OCC core temperature sensors are the average temperature of the core. DTS sensors are read directly by the host by SCOMing the DTS sensors, while OCC sensors are read and updated by OCC to main memory. Reading DTS sensors by SCOMing is a heavier and slower operation than reading OCC sensors, which is as good as reading memory. So don't add DTS sensors when OCC sensors are available. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Acked-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-18 | core/test/run-trace: fix on ppc64el | Stewart Smith | 1 | -1/+2
Hackish fix from benh Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-18 | core/fast-reboot: Increase timeout for dctl sreset to 1sec | Vaidyanathan Srinivasan | 1 | -1/+1
Direct control xscom can take more time to complete. We seem to wait too little on Boston, failing fast-reboot for no good reason. Increase timeout to 1 sec as a reasonable value for sreset to be delivered and core to start executing instructions. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-18 | core: Fix iteration condition to skip garded cpu | Vaidyanathan Srinivasan | 1 | -1/+1
Fix the logic error in the loop that iterated incorrectly over garded cpu. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-18 | core/opal: Allow poller re-entry if OPAL was re-entered | Nicholas Piggin | 1 | -4/+8
If an NMI interrupts the middle of running pollers and the OS invokes pollers again (e.g., for console output), the poller re-entrancy check will prevent it from running and spam the console. That check was designed to catch a poller calling opal_run_pollers; OPAL re-entrancy is something different and is detected elsewhere. Avoid the poller recursion check if OPAL has been re-entered. This is a best-effort attempt to cope with errors. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-18 | core/opal: Emergency stack for re-entry | Nicholas Piggin | 2 | -7/+19
This detects OPAL being re-entered by the OS, and switches to an emergency stack if it was. This protects the firmware's main stack from re-entrancy and allows the OS to use NMI facilities for crash / debug functionality. Further nested re-entry will destroy the previous emergency stack and prevent returning, but those should be rare cases. This stack is sized at 16kB, which doubles the size of CPU stacks, so as not to introduce a regression in primary stack size. The 16kB stack originally had a 4kB machine check stack at the top, which was removed by 80eee1946 ("opal: Remove machine check interrupt patching in OPAL."). So it is possible the size could be tightened again, but that would require further analysis. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-18 | asm/head: implement quiescing without stack or clobbering regs | Nicholas Piggin | 1 | -27/+14
Quiescing currently is implemented in C in opal_entry before the opal call handler is called. This works well enough for simple cases like fast reset when one CPU wants all others out of the way. Linux would like to use it to prevent an sreset IPI from interrupting firmware, which could lead to deadlocks when crash dumping or entering the debugger. Linux interrupts do not recover well when returning back to general OPAL code, due to r13 not being restored. OPAL also can't be re-entered, which may happen e.g., from the debugger. So move the quiesce hold/reject to entry code, before the stack or r1 or r13 registers are switched. OPAL can be interrupted and returned to or re-entered during this period. This does not completely solve all such problems. OPAL will be interrupted with sreset if the quiesce times out, and it can be interrupted by MCEs as well. These still have the issues above. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-18 | core/stack: backtrace unwind basic OPAL call details | Nicholas Piggin | 1 | -7/+26
Put OPAL callers' r1 into the stack back chain, and then use that to unwind back to the OPAL entry frame (as opposed to boot entry, which has a 0 back chain). From there, dump the OPAL call token and the caller's r1. A backtrace looks like this:

CPU 0000 Backtrace:
 S: 0000000031c03ba0 R: 000000003001a548 ._abort+0x4c
 S: 0000000031c03c20 R: 000000003001baac .opal_run_pollers+0x3c
 S: 0000000031c03ca0 R: 000000003001bcbc .opal_poll_events+0xc4
 S: 0000000031c03d20 R: 00000000300051dc opal_entry+0x12c
 --- OPAL call entry token: 0xa caller R1: 0xc0000000006d3b90 ---

This is pretty basic for the moment, but it does give you the bottom of the Linux stack. It will allow some interesting improvements in future.

First, with the eframe, all the call's parameters can be printed out as well. The ___backtrace / ___print_backtrace API needs to be reworked in order to support this, but it's otherwise very simple (see opal_trace_entry()).

Second, it will allow Linux's stack to be passed back to Linux via a debugging opal call. This will allow Linux's BUG() or xmon to also print the Linux back trace in case of an NMI or MCE or watchdog lockup that hits in OPAL.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-17 | opal/hmi: Generate hmi event for recovered HDEC parity error. | Mahesh Salgaonkar | 1 | -3/+2
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-17 | opal/hmi: check thread 0 tfmr to validate latched tfmr errors. | Mahesh Salgaonkar | 1 | -19/+42
Due to P9 errata, HDEC parity and TB residue errors are latched for non-zero threads 1-3 even if they are cleared. But these are not latched on thread 0. Hence, use xscom SCOMC/SCOMD to read thread 0 tfmr value and ignore them on non-zero threads if they are not present on thread 0. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-17 | opal/hmi: Print additional debug information in rendezvous. | Mahesh Salgaonkar | 1 | -2/+4
Helps in debugging... Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-17 | opal/hmi: Fix handling of TFMR parity/corrupt error. | Mahesh Salgaonkar | 1 | -5/+4
While testing the TFMR parity/corrupt error, it has been observed that HMIs are delivered twice for this error:
- The first time, the HMI is delivered with HMER[4,5]=1 and TFMR[60]=1.
- The second time, the HMI is delivered with HMER[4,5]=1 and TFMR[60]=0 with a valid TB.

On the second HMI we end up throwing the error message below even though the TB is in a valid state:

"HMI: TB invalid without core error reported"

This patch fixes the issue by ignoring HMER[5] and checking only for TFMR[60] before setting this_cpu()->tb_invalid to true.

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-17 | opal/hmi: Fix soft lockups during TOD errors | Mahesh Salgaonkar | 1 | -1/+15
There are some TOD errors which do not affect the working of TOD and TB; they stay in a valid state. Hence we don't need a rendez-vous for TOD errors that do not affect TB working. TOD errors that affect TOD/TB will report a global error on TFMR[44] along with bit 51, and they will go down the rendez-vous path as expected. But TOD errors that do not affect the TB register set only TFMR bit 51. TFMR bit 51 is cleared when any single thread clears the TOD error. Once cleared, bit 51 is reflected to all the cores on that chip. Any thread that reads the TFMR register after the error is cleared will see TFMR bit 51 reset. Hence the threads that see TFMR[51]=1 fall through the rendez-vous path, and threads that see TFMR[51]=0 return doing nothing. This ends up in soft lockups in the host kernel. This patch fixes the issue by not considering the TOD interrupt (TFMR[51]) a core-global error, avoiding the rendez-vous path completely. Instead, threads that see TFMR[51]=1 now take a different path that just does the TOD error recovery. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
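The decision described in this commit message can be sketched as a small classifier on the TFMR value. This is a simplified illustration that only looks at the two bits named in the text; the enum and function names are hypothetical, and the real handler considers other TFMR and HMER bits as well.

```c
#include <stdint.h>

/* IBM numbering: bit 0 is the most significant bit of a 64-bit word. */
#define IBM_BIT(n) (UINT64_C(0x8000000000000000) >> (n))

#define TFMR_BIT_44 IBM_BIT(44)	/* global error indication, per the commit text */
#define TFMR_BIT_51 IBM_BIT(51)	/* TOD interrupt, per the commit text */

enum tod_path {
	TOD_PATH_NONE,		/* error already cleared, nothing to do */
	TOD_PATH_RECOVER,	/* TOD error recovery only, no rendez-vous */
	TOD_PATH_RENDEZVOUS,	/* error affects TB: full rendez-vous path */
};

/* Hypothetical helper: choose a handling path from a TFMR snapshot. */
static inline enum tod_path classify_tod_error(uint64_t tfmr)
{
	if (tfmr & TFMR_BIT_44)
		return TOD_PATH_RENDEZVOUS;
	if (tfmr & TFMR_BIT_51)
		return TOD_PATH_RECOVER;
	return TOD_PATH_NONE;
}
```

The point of the fix is that a thread's path no longer depends on whether a sibling cleared bit 51 first: seeing bit 51 alone never commits a thread to waiting at a rendez-vous that other threads will skip.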
2018-04-17 | opal/hmi: Do not send HMI event if no errors are found. | Mahesh Salgaonkar | 1 | -8/+13
For TOD errors, all the cores in the chip get HMIs. Any one thread from any core can fix the issue and TFMR will have the error conditions cleared. The rest of the threads need not take any action if the TOD errors are already cleared. Hence thread 0 of every core should get a fresh copy of TFMR before going ahead with the recovery path. Initialize recover = -1, so that if no errors are found that thread need not send an HMI event to Linux. This helps stop every thread flooding the host with HMI events even when no errors are found. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-17 | opal/hmi: Initialize the hmi event with old value of HMER. | Mahesh Salgaonkar | 1 | -3/+6
Do this before we check for TFAC errors. Otherwise the event at the host console shows no error reported in the HMER register.

Without this patch the console event shows HMER with all zeros:

[ 216.753417] Severe Hypervisor Maintenance interrupt [Recovered]
[ 216.753498] Error detail: Timer facility experienced an error
[ 216.753509] HMER: 0000000000000000
[ 216.753518] TFMR: 3c12000870e04000

After this patch it shows the old HMER values on the host console:

[ 2237.652533] Severe Hypervisor Maintenance interrupt [Recovered]
[ 2237.652651] Error detail: Timer facility experienced an error
[ 2237.652766] HMER: 0840000000000000
[ 2237.652837] TFMR: 3c12000870e04000

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-17 | opal/hmi: Rework HMI handling of TFAC errors | Benjamin Herrenschmidt | 2 | -294/+231
This patch reworks the HMI handling for TFAC errors by introducing 4 rendez-vous points to improve the thread synchronization while handling timebase errors that require all threads to clear dirty data from the TB/HDEC registers before clearing the errors. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-17 | opal/hmi: Don't bother passing HMER to pre-recovery cleanup | Benjamin Herrenschmidt | 1 | -14/+6
The test for TFAC error is now redundant so we remove it and remove the HMER argument. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-17 | opal/hmi: Move timer related error handling to a separate function | Benjamin Herrenschmidt | 1 | -48/+58
Currently no functional change. This is a first step to completely rewriting how these things are handled. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-17 | opal/hmi: Add a new opal_handle_hmi2 that returns direct info to Linux | Benjamin Herrenschmidt | 1 | -45/+82
It returns a 64-bit flags mask currently set to provide info about which timer facilities were lost, and whether an event was generated. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-17 | opal/hmi: Remove races in clearing HMER | Benjamin Herrenschmidt | 1 | -10/+12
Writing to HMER acts as an "AND". The current code writes back the value we originally read with the bits we handled cleared. This is racy: if a new bit gets set in HW after the original read, we'll end up clearing it without handling it. Instead, use an all 1's mask with only the bit handled cleared. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
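The race is easy to reproduce against a simulated register whose writes AND into the current contents. The sketch below models that behaviour in plain C; `hmer_sim`, `hmer_write()`, `clear_racy()` and `clear_safe()` are illustrative names, not skiboot functions.

```c
#include <stdint.h>

/* Simulated register: a write ANDs the written value into the current contents. */
static uint64_t hmer_sim;

static inline void hmer_write(uint64_t val)
{
	hmer_sim &= val;
}

/* Racy: writes back a stale snapshot with the handled bits cleared. */
static inline void clear_racy(uint64_t snapshot, uint64_t handled)
{
	hmer_write(snapshot & ~handled);
}

/* Safe: all ones except the handled bits, so unrelated bits always survive. */
static inline void clear_safe(uint64_t handled)
{
	hmer_write(~handled);
}
```

If hardware sets a new bit between the snapshot and the write, `clear_racy()` wipes it because the snapshot has a zero in that position, while `clear_safe()` leaves it set for the next pass of the handler.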
2018-04-17 | opal/hmi: Don't re-read HMER multiple times | Benjamin Herrenschmidt | 1 | -21/+14
We want to make sure all reporting and actions are based upon the same snapshot of HMER in case bits get added by HW while we are in OPAL. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-11 | phb4: Restore bus numbers after CRS | Michael Neuling | 1 | -0/+16
Currently we restore PCIe bus numbers right after the link is up. Unfortunately at this point we haven't done CRS so config space may not be accessible. This moves the bus number restore till after CRS has happened. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-11 | core/pci: Set slot power limit when supported | Oliver O'Halloran | 2 | -0/+38
The PCIe slot capability can be implemented in a root or switch downstream port to set the maximum power a card is allowed to draw from the system. This patch adds support for setting the power limit when the platform has defined one. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-11 | interrupts: Create an "interrupts" property in the OPAL node | Benjamin Herrenschmidt | 1 | -7/+26
Deprecate the old "opal-interrupts". It's still there, but the new property follows the standard and allows us to specify whether an interrupt is level or edge sensitive. Similarly create "interrupt-names" whose content is identical to "opal-interrupts-names". Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-11 | core: Correctly load initramfs in stb container | Samuel Mendoza-Jonas | 2 | -5/+24
Skiboot does not calculate the actual size and start location of the initramfs if it is wrapped by an STB container (for example if loading an initramfs from the ROOTFS partition). Check if the initramfs is in an STB container and determine the size and location correctly in the same manner as the kernel. Since load_initramfs() is called after load_kernel() move the call to trustedboot_exit_boot_services() into load_and_boot_kernel() so it is called after both of these. Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-10 | pcie-slot: Don't fail powering on an already on switch | Benjamin Herrenschmidt | 1 | -1/+1
If the power state is already the required value, return OPAL_SUCCESS rather than OPAL_PARAMETER to avoid spurious errors during boot. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-By: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-10 | core/pci: Document some stuff | Oliver O'Halloran | 1 | -9/+13
Document the bridge class code hack and what ibm,pci-config-space-type actually means. Also replace some of the pci_cap() calls with a variable to make it a bit more readable. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-09 | libflash/blocklevel: Make read/write be ECC agnostic for callers | Cyril Bur | 1 | -5/+2
The blocklevel abstraction allows for regions of the backing store to be marked as ECC protected so that blocklevel can decode/encode the ECC bytes into the buffer automatically without the caller having to be ECC aware. Unfortunately this abstraction is far from perfect: it is only useful if reads and writes are performed at the start of the ECC region or in some circumstances at an ECC aligned position, which requires the caller to be aware of the ECC regions. The problem that has arisen is that the blocklevel abstraction is initialised somewhere but when it is later called the caller is unaware if ECC exists in the region it wants to arbitrarily read and write to. This should not have been a problem since blocklevel knows. Currently misaligned reads will fail ECC checks and misaligned writes will overwrite ECC bytes and the backing store will become corrupted. This patch adds the smarts to blocklevel_read() and blocklevel_write() to cope with the problem. Note that ECC can always be bypassed by calling blocklevel_raw_() functions. All this work means that the gard tool can safely call blocklevel_read() and blocklevel_write() and as long as the blocklevel knows of the presence of ECC then it will deal with all cases. This commit also removes code in the gard tool which compensated for inadequacies no longer present in blocklevel. Signed-off-by: Cyril Bur <cyril.bur@au1.ibm.com> Tested-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com> [stewart: core/flash: Adapt to new libflash ECC API] Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-05 | core/cpufeatures: Fix setting DARN and SCV HWCAP feature bits | Nicholas Piggin | 1 | -2/+2
DARN and SCV have been assigned AT_HWCAP2 (32-63) bits:

#define PPC_FEATURE2_DARN 0x00200000 /* darn random number insn */
#define PPC_FEATURE2_SCV 0x00100000 /* scv syscall */

A cpufeatures-aware OS will not advertise these to userspace without this patch.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2018-04-04 | core/cpu: Prevent clobbering of stack guard for boot-cpu | Vaibhav Jain | 1 | -1/+5
Commit 90d53934c2da ("core/cpu: discover stack region size before initialising memory regions") introduced memzero for struct cpu_thread in init_cpu_thread(). This has an unintended side effect of clobbering the stack-guard canary of the boot_cpu stack. This results in opal failing to init with this failure message:

CPU: P9 generation processor (max 4 threads/core)
CPU: Boot CPU PIR is 0x0004 PVR is 0x004e1200
Guard skip = 0
Stack corruption detected !
Aborting!
CPU 0004 Backtrace:
 S: 0000000031c13ab0 R: 0000000030013b0c .backtrace+0x5c
 S: 0000000031c13b50 R: 000000003001bd18 ._abort+0x60
 S: 0000000031c13be0 R: 0000000030013bbc .__stack_chk_fail+0x54
 S: 0000000031c13c60 R: 00000000300c5b70 .memset+0x12c
 S: 0000000031c13d00 R: 0000000030019aa8 .init_cpu_thread+0x40
 S: 0000000031c13d90 R: 000000003001b520 .init_boot_cpu+0x188
 S: 0000000031c13e30 R: 0000000030015050 .main_cpu_entry+0xd0
 S: 0000000031c13f00 R: 0000000030002700 boot_entry+0x1c0

So the patch provides a fix by tweaking the memset() call in init_cpu_thread() to skip over the stack-guard canary.

Fixes: 90d53934c2da ("core/cpu: discover stack region size before initialising memory regions")
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
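One way to zero a structure while leaving a guard value alone is to start the memset() just past the guard field. The sketch below assumes a hypothetical layout with the guard as the first member; `cpu_thread_sketch` and `init_thread_sketch()` are made-up names and do not reflect the real struct cpu_thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: the guard value sits at the start of the structure. */
struct cpu_thread_sketch {
	uint64_t stack_guard;
	uint32_t pir;
	uint32_t state;
};

static inline void init_thread_sketch(struct cpu_thread_sketch *t, uint32_t pir)
{
	/* Offset of the first byte after the guard field. */
	const size_t skip = offsetof(struct cpu_thread_sketch, stack_guard) +
			    sizeof(t->stack_guard);

	/* Zero everything after the guard, leaving the canary intact. */
	memset((char *)t + skip, 0, sizeof(*t) - skip);
	t->pir = pir;
}
```

Computing the skip with offsetof() rather than a literal keeps the code correct if fields are later added ahead of the guard, although in that case those fields would also be left unzeroed.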
2018-04-03 | core/lock.c: ensure valid start value for lock spin duration warning | Stewart Smith | 1 | -3/+9
The previous fix in a8e6cc3f4 only addressed half of the problem, as we could also get an invalid value for start, causing us to fail in a weird way. This was caught by the testcases.OpTestHMIHandling.HMI_TFMR_ERRORS test in op-test-framework. You'd get to this part of the test and get the erroneous lock spinning warnings:

PATH=/usr/local/sbin:$PATH putscom -c 00000000 0x2b010a84 0003080000000000 0000080000000000
[ 790.140976993,4] WARNING: Lock has been spinning for 790275ms
[ 790.140976993,4] WARNING: Lock has been spinning for 790275ms
[ 790.140976918,4] WARNING: Lock has been spinning for 790275ms

This patch checks the validity of timebase before setting start, and only checks the lock timeout if we got a valid start value.

Fixes: a8e6cc3f47525f86ef1d69d69a477b6264d0f8ee
Fixes: 84186ef0944c9413262f0974ddab3fb1343ccfe8
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Tested-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
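The shape of the fix, "only check the timeout if start was sampled from a valid timebase", can be sketched as below. The function name, the millisecond units and the 5000ms limit used in the usage note are assumptions made for the example, not values taken from core/lock.c.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: 'start_valid' models "the timebase was valid when
 * start was sampled". Without a trustworthy start, skip the check entirely.
 */
static inline bool lock_spin_timed_out(bool start_valid, uint64_t start_ms,
				       uint64_t now_ms, uint64_t limit_ms)
{
	if (!start_valid)
		return false;
	return now_ms - start_ms > limit_ms;
}
```

With a bogus start of 0 and a current time of 790275ms, any small limit such as 5000ms is exceeded immediately, which would be consistent with the reported spin time in the warnings tracking the uptime; gating on `start_valid` suppresses exactly that case.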
2018-03-28 | Fix 'make check' compile for mem_clear_range | Stewart Smith | 1 | -2/+3
We play funny business with printf format specifiers because of how we do unit tests. Fixes: c32943bfc1e254176ecab564fdb4752403a48cab Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2018-03-27 | core/cpu: discover stack region size before initialising memory regions | Nicholas Piggin | 3 | -39/+48
Stack allocation first allocates a memory region sized to hold stacks for all possible CPUs up to the maximum PIR of the architecture, zeros the region, then initialises all stacks. Max PIR is 32768 on POWER9, which is 512MB for stacks. The stack region is then shrunk after CPUs are discovered, but this is a bit of a hack, and it leaves a hole in the memory allocation regions as it's done after mem regions are initialised.

0x000000000000..00002fffffff : ibm,os-reserve - OS
0x000030000000..0000303fffff : ibm,firmware-code - OPAL
0x000030400000..000030ffffff : ibm,firmware-heap - OPAL
0x000031000000..000031bfffff : ibm,firmware-data - OPAL
0x000031c00000..000031c0ffff : ibm,firmware-stacks - OPAL
*** gap ***
0x000051c00000..000051d01fff : ibm,firmware-allocs-memory@0 - OPAL
0x000051d02000..00007fffffff : ibm,firmware-allocs-memory@0 - OS
0x000080000000..000080b3cdff : initramfs - OPAL
0x000080b3ce00..000080b7cdff : ibm,fake-nvram - OPAL
0x000080b7ce00..0000ffffffff : ibm,firmware-allocs-memory@0 - OS

This change moves zeroing into the per-cpu stack setup. The boot CPU stack is set up based on the current PIR. Then the size of the stack region is set, by discovering the maximum PIR of the system from the device tree, before mem regions are initialised. This results in all memory being accounted within memory regions, and less memory fragmentation of OPAL allocations.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
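The 512MB figure follows from simple arithmetic, assuming a 16kB stack per PIR (the per-CPU stack size mentioned in the emergency stack commit elsewhere in this log). The macro and function names below are illustrative only.

```c
#include <stdint.h>

#define STACK_SIZE_BYTES (16u * 1024u)	/* assumed 16kB stack per PIR */

/* Size in bytes of a region holding one stack for each of 'nr_pirs' PIRs. */
static inline uint64_t stack_region_size(uint32_t nr_pirs)
{
	return (uint64_t)nr_pirs * STACK_SIZE_BYTES;
}
```

For 32768 PIRs this gives 0x20000000 bytes (512MB), which equals the distance from the start of ibm,firmware-stacks at 0x31c00000 to the first region after the gap at 0x51c00000 in the map above. The 64kB that remains in ibm,firmware-stacks after shrinking is the size of four such stacks.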
2018-03-27 | nvram: run nvram_validate() after nvram_reformat() | Nicholas Piggin | 2 | -3/+8
nvram_reformat() sets nvram_valid = true, but it does not set skiboot_part_hdr. Call nvram_validate() instead, which sets everything up properly. Reviewed-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2018-03-27 | core/fast-reboot: zero memory after fast reboot | Nicholas Piggin | 2 | -0/+82
This improves the security and predictability of the fast reboot environment. There can not be a secure fence between fast reboots, because a malicious OS can modify the firmware itself. However a well-behaved OS can have a reasonable expectation that OS memory regions it has modified will be cleared upon fast reboot. The memory is zeroed after all other CPUs come up from fast reboot, just before the new kernel is loaded and booted into. This allows image preloading to run concurrently, and will allow parallelisation of the clearing in future. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2018-03-27 | mem-map: Use a symbolic constant for exception vector size | Nicholas Piggin | 1 | -8/+10
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2018-03-27 | core/fast-reboot: verify mem regions before fast reboot | Nicholas Piggin | 3 | -7/+35
Run the mem_region sanity checkers before proceeding with fast reboot. This is the beginning of proactive sanity checks on opal data for fast reboot (which complements the reactive disable_fast_reboot cases). Re-using and sharing any kind of debug code and unit test code for this purpose is encouraged. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2018-03-27 | NPU2: dump NPU2 registers on npu2 HMI | Stewart Smith | 1 | -2/+73
Due to the nature of debugging npu2 issues, folk are wanting the full list of NPU2 registers dumped when there's a problem. We have to list out each register as traversing the range triggers FIR bits that confuse PRD. Suggested-by: Ryan Black <rblack@us.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2018-03-27 | Revert "NPU2 HMIs: dump out a *LOT* of npu2 registers for debugging" | Stewart Smith | 1 | -37/+1
This reverts commit fbdc91e693fc3103f7e2a65054ed32bfb26a2e17. We don't need this as we need to do it a different way, with an explicit set of registers, as otherwise we trip other random FIR bits and everything becomes even more terrible. I suggest alcohol. Cc: stable Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2018-03-22 | core/fast-reboot: disable fast reboot upon fundamental entry/exit/locking errors | Nicholas Piggin | 2 | -0/+3
This disables fast reboot in several more cases where serious errors like lock corruption or call re-entrancy are detected. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2018-03-22 | core/opal: allow some re-entrant calls | Nicholas Piggin | 1 | -3/+16
This allows a small number of OPAL calls to succeed despite re-entering the firmware, and rejects others rather than aborting. This allows a system reset interrupt that interrupts OPAL to do something useful: sreset other CPUs, use the console (which allows xmon to work or stack traces to be printed), or reboot the system. Use OPAL_INTERNAL_ERROR when rejecting, rather than OPAL_BUSY, which is used for many other things that do not mean a serious permanent error. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
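An allow-list check at call entry is one plausible shape for this behaviour. In the sketch below the token and return-code values are placeholders chosen for the example (the real ones are defined in opal-api.h), and `reentry_allowed()` and `check_entry()` are hypothetical helpers rather than the functions the patch touches.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values for illustration only. */
enum { TOKEN_CONSOLE_WRITE = 1, TOKEN_REBOOT = 2, TOKEN_SRESET = 3, TOKEN_OTHER = 4 };
enum { RC_SUCCESS = 0, RC_INTERNAL_ERROR = -11 };

/* Calls that remain useful when a system reset interrupted OPAL. */
static inline bool reentry_allowed(uint64_t token)
{
	switch (token) {
	case TOKEN_CONSOLE_WRITE:
	case TOKEN_REBOOT:
	case TOKEN_SRESET:
		return true;
	default:
		return false;
	}
}

/* Reject, rather than abort on, re-entrant calls that are not on the list. */
static inline int64_t check_entry(uint64_t token, bool already_in_opal)
{
	if (already_in_opal && !reentry_allowed(token))
		return RC_INTERNAL_ERROR;
	return RC_SUCCESS;
}
```

Returning a distinct hard error lets the OS tell "firmware was re-entered" apart from an ordinary busy condition that it would be expected to retry.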
2018-03-22 | core/opal: abort in case of re-entrant OPAL call | Nicholas Piggin | 1 | -1/+1
The stack is already destroyed by the time we get here, so there is not much point continuing. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2018-03-14 | dts: spl_wakeup: Remove all workarounds in the spl wakeup logic | Shilpasri G Bhat | 1 | -30/+29
We coded a few workarounds in the special wakeup logic to handle buggy firmware. Now that the firmware is fixed, remove them, as they break the special wakeup protocol. As per the spec, we should not de-assert before assert is complete, so follow this protocol. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Tested-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2018-03-09 | Don't detect lock timeouts when timebase is invalid | Stewart Smith | 1 | -0/+7
We can have threads waiting on hmi_lock who have an invalid timebase. Because of this, we want to go poke the register directly rather than rely on this_cpu()->tb_invalid (which won't have been set yet). Without this patch, you get something like this when you're injecting timebase errors: [10976.202052846,4] WARNING: Lock has been spinning for 10976394ms Fixes: 84186ef0944c9413262f0974ddab3fb1343ccfe8 Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>