path: root/core/fast-reboot.c
Age | Commit message | Author | Files | Lines
2018-02-08 | fast-reboot: move pci_reset error handling into fast-reboot code | Nicholas Piggin | 1 | -1/+12
pci_reset() currently does a platform reboot if it fails. It should not know about fast-reboot at this level, so instead have it return an error, and the fast reboot caller will do the platform reboot. The code essentially does the same thing, but flexibility is improved. Ideally the fast reboot code should perform pci_reset and all such fail-able operations before the CPU resets itself and destroys its own stack. That's not the case now, but that should be the goal. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
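A minimal sketch of the control flow this commit describes, not the actual skiboot code: the reset routine only reports failure, and the fast-reboot caller decides to fall back to a platform reboot. Every name below (`fast_reboot_sketch`, `platform_reboot_sketch`, the `pci_reset_*_sketch` stubs, the `SKETCH_` codes) is a hypothetical stand-in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status codes standing in for firmware return values. */
enum { SKETCH_OK = 0, SKETCH_ERR = -1 };

/* Records whether the fallback path ran, so the flow can be observed. */
static bool took_platform_reboot;

/* Stand-ins for a PCI reset that succeeds or fails. Neither knows
 * anything about fast reboot; they only return a status. */
static int pci_reset_ok_sketch(void)   { return SKETCH_OK; }
static int pci_reset_fail_sketch(void) { return SKETCH_ERR; }

/* Stand-in for the full platform (IPL) reboot. */
static void platform_reboot_sketch(void)
{
	took_platform_reboot = true;
}

/* The fast-reboot caller owns the fallback decision. */
static int fast_reboot_sketch(int (*pci_reset)(void))
{
	assert(pci_reset != NULL);
	took_platform_reboot = false;
	if (pci_reset() != SKETCH_OK) {
		platform_reboot_sketch();
		return SKETCH_ERR;
	}
	return SKETCH_OK;
}
```

Keeping the fallback in the caller means the reset routine can be reused by code paths that want a different recovery policy.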
2017-12-03 | fast-reboot: bare bones fast reboot implementation for POWER9 | Nicholas Piggin | 1 | -21/+43
This is an initial fast reboot implementation for p9 which has only been tested on the Witherspoon platform, and without the use of NPUs, NX/VAS, etc. This has worked reasonably well so far, with no failures in about 100 reboots. It is hidden behind the traditional fast-reboot experimental nvram option, until more platforms and configurations are tested. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-12-03 | fast-reboot: move boot CPU cleanup logically together with secondaries | Nicholas Piggin | 1 | -8/+8
Move the boot CPU cleanup and state transition to active, logically together with secondaries. Don't release secondaries from fast reboot hold until everyone has cleaned up and transitioned to active. This is cosmetic, but it is helpful to run the fast reboot state machine the same way on all CPUs. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-12-03 | fast-reboot: move fdt freeing into init | Nicholas Piggin | 1 | -6/+2
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-12-03 | fast-reboot: improve failure error messages | Nicholas Piggin | 1 | -3/+13
Change existing failure error messages to PR_NOTICE so they get printed to the console, and add some new ones. It's not a more severe class because it falls back to IPL on failure. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-12-03 | fast-reboot: quiesce opal before initiating a fast reboot | Nicholas Piggin | 1 | -29/+18
Switch fast reboot to use quiescing rather than "wait for a while". If firmware can not be quiesced, then fast reboot is skipped. This significantly improves the robustness of fast reboot in the face of bugs or unexpected latencies. Complexity of synchronization in fast-reboot is reduced, because we are guaranteed to be single-threaded when quiesce succeeds, so locks can be removed. In the case that firmware can be quiesced, then it will generally reduce fast reboot times by nearly 200ms, because quiescing usually takes very little time. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
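The decision described above can be sketched roughly as follows. This is an illustration under assumptions, not skiboot's implementation: the quiesce hook is modelled as a plain callback, and all identifiers (`quiesce_fn`, `choose_reboot_path_sketch`, `PATH_FAST`, `PATH_IPL`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum reboot_path { PATH_FAST, PATH_IPL };

/* Hypothetical quiesce hook: returns true once no other thread is
 * executing in firmware, false if that could not be achieved. */
typedef bool (*quiesce_fn)(void);

static bool quiesce_succeeds_sketch(void)  { return true; }
static bool quiesce_times_out_sketch(void) { return false; }

static enum reboot_path choose_reboot_path_sketch(quiesce_fn quiesce)
{
	assert(quiesce != NULL);
	if (!quiesce())
		return PATH_IPL;	/* cannot quiesce: skip fast reboot */
	/* Quiesce succeeded, so this thread is the only one in firmware
	 * and the rest of the sequence needs no locking. */
	return PATH_FAST;
}
```

Gating on an explicit success condition rather than a fixed wait is what lets the later steps assume single-threaded execution.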
2017-12-03 | fast-reboot: move sreset direct controls to direct-controls.c | Nicholas Piggin | 1 | -283/+1
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-12-03 | fast-reboot: allow mambo fast reboot independent of CPU type | Nicholas Piggin | 1 | -1/+2
Don't tie mambo fast reboot to POWER8 CPU type. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-12-03 | fast-reboot: move de-asserting of special wakeups to the initiator | Nicholas Piggin | 1 | -3/+3
Currently the boot CPU (not the initiator) clears special wakeups after all CPUs have called in. After the earlier change to have the initiator wait for secondaries before calling in, this is no longer necessary. Have the initiator finish the entire sreset sequence, clearing special wakeups after all others have called in. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-12-03 | fast-reboot: inline fast_reset_p8 into fast_reboot | Nicholas Piggin | 1 | -30/+24
This function has shrunk to the point it's not so helpful to keep it, it's no longer power8 specific, and getting rid of it simplifies error handling a little in future changes. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-12-03 | fast-reboot: remove delay after sreset | Nicholas Piggin | 1 | -2/+1
There is a 100ms delay when targets reach sreset which does not appear to have a good purpose. Remove it and therefore reduce the sreset timeout by the same amount. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-12-03 | fast-reboot: add more barriers around cpu state changes | Nicholas Piggin | 1 | -1/+4
This is a bit of paranoia, but when a CPU changes state to signal it has reached a particular point, all previous stores should be visible. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
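The ordering requirement can be illustrated with portable C11 atomics. Note the substitution: this sketch uses release/acquire operations from `<stdatomic.h>` in place of the firmware's own barrier primitives, and the types and functions (`handoff_cpu`, `handoff_signal`, `handoff_read`, `handoff_demo`) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

enum handoff_state { HANDOFF_RESETTING, HANDOFF_PRESENT };

struct handoff_cpu {
	int scratch;		/* ordinary data written before signalling */
	atomic_int state;	/* state word that other CPUs poll */
};

/* Publish the data, then the state change. The release store ensures
 * the write to scratch is visible to anyone who observes the state. */
static void handoff_signal(struct handoff_cpu *cpu, int value)
{
	assert(cpu != NULL);
	cpu->scratch = value;
	atomic_store_explicit(&cpu->state, HANDOFF_PRESENT,
			      memory_order_release);
}

/* Returns 1 and fills *out if the CPU has signalled, else 0. */
static int handoff_read(struct handoff_cpu *cpu, int *out)
{
	assert(cpu != NULL && out != NULL);
	if (atomic_load_explicit(&cpu->state, memory_order_acquire)
	    != HANDOFF_PRESENT)
		return 0;
	*out = cpu->scratch;
	return 1;
}

/* Single-threaded walk-through; returns the value the reader saw. */
static int handoff_demo(void)
{
	struct handoff_cpu cpu;
	int seen = -1;

	cpu.scratch = 0;
	atomic_init(&cpu.state, HANDOFF_RESETTING);
	if (handoff_read(&cpu, &seen))
		return -2;	/* nothing has been published yet */
	handoff_signal(&cpu, 42);
	if (!handoff_read(&cpu, &seen))
		return -3;
	return seen;
}
```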
2017-12-03 | fast-reboot: add sreset timeout detection and handling | Nicholas Piggin | 1 | -33/+45
Have the initiator wait for all its sreset targets to call in, and time out after 200ms if they did not. Fail and revert to IPL reboot. Testing indicates that after successful sreset_all_others(), it takes less than 102ms (in hundreds of fast reboots) for secondaries to call in. 100 of that is due to an initial delay, but core un-splitting was not measured. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
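A rough sketch of the wait-with-timeout pattern, using a simulated millisecond clock so it runs anywhere. The 200ms figure comes from the commit message; everything else (`target_sketch`, `wait_for_targets_sketch`, the per-target call-in times) is a hypothetical model, not the firmware's timebase or CPU state handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SRESET_TIMEOUT_MS 200	/* timeout stated in the commit message */

/* Simulated target: the time at which this CPU would call in. */
struct target_sketch {
	unsigned int calls_in_at_ms;
};

static bool all_called_in(const struct target_sketch *t, size_t n,
			  unsigned int now_ms)
{
	for (size_t i = 0; i < n; i++)
		if (t[i].calls_in_at_ms > now_ms)
			return false;
	return true;
}

/* Returns true if every target called in within the timeout. On false,
 * the caller is expected to give up and revert to an IPL reboot. */
static bool wait_for_targets_sketch(const struct target_sketch *t, size_t n)
{
	assert(t != NULL || n == 0);
	for (unsigned int now_ms = 0; now_ms <= SRESET_TIMEOUT_MS; now_ms++) {
		if (all_called_in(t, n, now_ms))
			return true;
		/* A real loop would spin at low SMT priority here. */
	}
	return false;
}
```

With targets calling in at 5ms and 102ms the wait succeeds; a target that would only call in at 250ms makes it fail.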
2017-12-03 | fast-reboot: make spin loops consistent and SMT friendly | Nicholas Piggin | 1 | -9/+15
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-12-03 | fast-reboot: add sreset_all_others error handling | Nicholas Piggin | 1 | -10/+13
Pass back failures from sreset_all_others, also change return codes to OPAL_ form in sreset_all_prepare to match. Errors will revert to the IPL path, so it's not critical to completely clean up everything if that would complicate things. Detecting the error and failing is the important thing. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-12-03 | fast-reboot: factor out the mambo sreset code | Nicholas Piggin | 1 | -26/+62
Move the mambo sreset code out from the P8 implementation. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-12-03 | fast-reboot: clean up some common cpu iteration processes with macros | Nicholas Piggin | 1 | -46/+11
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-12-03 | fast-reboot: remove last man standing logic | Nicholas Piggin | 1 | -22/+1
The "last man standing" logic has the initiator CPU sreset all others, then one of them sresets the initiator. This complicates the fast reboot process and increases potential for errors. The initiator can simply branch to 0x100 directly. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-12-03 | fast-reboot: factor out direct control loops for sreset | Nicholas Piggin | 1 | -21/+58
This provides a simple API that is amenable to be implemented by the direct-controls subsystem in a future change. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-12-03 | fast-reboot: restore SMT priority on spin loop exit | Nicholas Piggin | 1 | -0/+1
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-28 | cpu: idle split pm enable into sreset and ipi components | Nicholas Piggin | 1 | -1/+2
pm idle requires the system reset vector and IPI facilities before it can be enabled. Split these out and manage them individually. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-06-26 | cpu: Rework HILE change | Benjamin Herrenschmidt | 1 | -2/+3
Create a more generic helper for changing HID0 bits on all processors. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-06-26 | Rename cleanup_tlb() to cleanup_local_tlb() | Benjamin Herrenschmidt | 1 | -1/+1
It uses tlbiel and only cleans up the TLB of the calling core. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-06-06 | cpu: Introduce smt_lowest() | Nicholas Piggin | 1 | -3/+3
Recent CPUs have introduced a lower SMT priority. This uses the Linux pattern of executing priority nops in descending order to get a simple portable way to put the CPU into lowest SMT priority. Introduce smt_lowest() and use it in place of smt_very_low and smt_low ; smt_very_low sequences. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-03-20 | fast-reboot: remove CAPI check | Andrew Donnellan | 1 | -12/+0
Now that we can handle disabling CAPI mode on PHBs, we don't need to disable fast reboot if there's a CAPI mode PHB. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2016-12-21 | opal/fast-reboot: set fw_progress sensor status with IPMI_FW_PCI_INIT. | Pridhiviraj Paidipeddi | 1 | -0/+3
In fast-reboot path, OPAL is re-initializing the PCI subsystem. Accordingly set firmware progress sensor status with IPMI_FW_PCI_INIT. Signed-off-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2016-11-02 | opal/fast-reboot: Send special reset sequence to operational CPUs only. | Mahesh Salgaonkar | 1 | -1/+29
In the fast reboot path, OPAL sends multiple special sequences to all the CPUs using XSCOM operations. On a freshly booted system where all CPUs are in good condition, fast reboot works fine. But fast reboot fails when any of the cores are GARDed by the service processor, because XSCOM operations fail or time out on GARDed CPUs/cores. Fix this issue by skipping GARDed CPUs in the fast reboot path. GARDed CPUs are presented as 'bad' to OPAL, and OPAL marks cpu->state for them as 'cpu_state_unavailable'. This patch checks the cpu state to skip GARDed CPUs during fast reboot. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
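The filtering this patch describes might look roughly like the loop below. It is a simplified model: the struct, the state enum, and `sreset_operational_sketch` are hypothetical, and the place where the real code would issue XSCOM operations is reduced to a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-value model of the per-CPU state; the real firmware
 * tracks more states than this. */
enum gard_state { GARD_UNAVAILABLE, GARD_ACTIVE };

struct gard_cpu {
	int pir;		/* processor identifier */
	enum gard_state state;
};

/* Returns how many CPUs would be sent the reset sequence. The
 * initiator and any unavailable (GARDed) CPU are skipped. */
static size_t sreset_operational_sketch(const struct gard_cpu *cpus,
					size_t n, int self_pir)
{
	size_t sent = 0;

	assert(cpus != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (cpus[i].pir == self_pir)
			continue;	/* do not reset ourselves */
		if (cpus[i].state == GARD_UNAVAILABLE)
			continue;	/* hardware access would fail here */
		sent++;			/* real code issues the sequence here */
	}
	return sent;
}
```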
2016-10-25 | fast-reboot: disable on FSP code update or unrecoverable HMI | Stewart Smith | 1 | -0/+19
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> [stewart@linux.vnet.ibm.com: unlock before return (suggested by Mahesh/Andrew), disable only on non-cancelling fsp codeupdate call (suggested by Vasant)] Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2016-10-24 | fast-reboot: abort fast reboot if CAPP attached | Andrew Donnellan | 1 | -0/+13
If a PHB is in CAPI mode, we cannot safely fast reboot - the PHB will be fenced during the reboot resulting in major problems when we load the new kernel. In order to handle this safely, we need to disable CAPI mode before resetting PHBs during the fast reboot. However, we don't currently support this. In the meantime, when fast rebooting, check if there are any PHBs with a CAPP attached, and if so, abort the fast reboot and revert to a normal reboot instead. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2016-10-17 | fast-reset: free fdt on fast reset, count fast reboots | Stewart Smith | 1 | -1/+5
A bit of a hack to free the flattened device tree on fast reset. This means we don't leak ~500kb memory every fast reset. We also count the number of fast resets we've done (if enabled), which means that for stress testing, we have a hope of finding out how many we managed to do before we hit a problem. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2016-10-17 | Fast reboot for P8 | Benjamin Herrenschmidt | 1 | -177/+361
This is an experimental patch that implements "Fast reboot" on P8 machines. The basic idea is that when the OS calls OPAL reboot, we gather all the threads in the system using a combination of patching the reset vector and soft-resetting them, then cleanup a few bits of hardware (we do re-probe PCIe for example), and reload & restart the bootloader.
For Trusted Boot, this means we *add* measurements to the TPM, so you will get *different* PCR values as compared to a full IPL. This makes sense as if you want to be sure you are running something known then, well, do a full IPL as soft reset should never be trusted to clear any malicious code.
This is very experimental and needs a lot of testing and also auditing code for other bits of HW that might need to be cleaned up.
BenH TODO: I also need to check if we are properly PERST'ing PCI devices.
This is partially based on old code I had to do that on P7. I only support it on P8 though as there are issues with the PSI interrupts on P7 that cannot be reliably solved.
Even though this should be considered somewhat experimental, we've had a lot of success on a variety of machines. Dozens/hundreds of reboots across Tuleta, Garrison and Habanero. Currently, we've hidden it behind a NVRAM config option, which *is* liable to change in the future (to ensure that only those who know what they're doing enable it).
You can enable the experimental support via nvram option:
nvram -p ibm,skiboot --update-config experimental-fast-reset=feeling-lucky
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [stewart@linux.vnet.ibm.com: hide behind nvram option, include Mambo fixes from Mikey] Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-02-26 | sparse: fix fonction declarations | Cédric Le Goater | 1 | -2/+2
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2014-11-18 | Remove useless global include memory.h | Benjamin Herrenschmidt | 1 | -1/+0
It only exposed one function that is local to the hdat stuff Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-02 | Initial commit of Open Source release | Benjamin Herrenschmidt | 1 | -0/+346
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>