aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2018-11-02phb4/capp: Only reset FIR bits that cause capp machine checkVaibhav Jain1-0/+1
[ Upstream commit 999246716d2da347aad46a28ed9899b832bffe6c ] During CAPP recovery do_capp_recovery_scoms() will reset the CAPP Fir register just after CAPP recovery is completed. This has an unintentional side effect of preventing PRD from analyzing and reporting this error. If PRD tries to read the CAPP FIR after opal has already reset it, then it logs a critical error complaining "No active error bits found". To prevent this from happening we update do_capp_recovery_scoms() to only reset fir bits that cause CAPP machine check (local xstop). This is done by reading the CAPP Fir Action0/1 & Mask registers and generating a mask which is then written on CAPP_FIR_CLEAR register. Cc: stable Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-11-02phb4: Check for RX errors after link trainingOliver O'Halloran2-0/+4
[ Upstream commit 9597a12ef4b3644e4b8644f659bec04ca139b7f9 ] Some PHB4 PHYs can get stuck in a bad state where they are constantly retraining the link. This happens transparently to skiboot and Linux but will causes PCIe to be slow. Resetting the PHB4 clears the problem. We can detect this case by looking at the RX errors count where we check for link stability. This patch does this by modifying the link optimal code to check for RX errors. If errors are occurring we retrain the link irrespective of the chip rev or card. Normally when this problem occurs, the RX error count is maxed out at 255. When there is no problem, the count is 0. We chose 8 as the max rx errors value to give us some margin for a few errors. There is also a knob that can be used to set the error threshold for when we should retrain the link. ie nvram -p ibm,skiboot --update-config phb-rx-err-max=8 Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-10-31Add fast-reboot property to /ibm,opal DT nodeStewart Smith1-0/+1
[ Upstream commit 7c8e1c6f89f3aac77661cfcee75ab515bd053d75 ] this means that if it's permanently disabled on boot, the test suite can pick that up and not try a fast reboot test. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-10-31platform: Restructure bmc_platform typeAndrew Jeffery2-2/+15
[ Upstream commit 9a830ee06c66058b1421c017b25a65a22921e9f6 ] Segregate the BMC platform configuration into hardware and software components. This allows population of platform default values for hardware configuration that may no-longer be accessible by the host. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> [stewart: fixup pci-quirk unit test] Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-10-31lpc: Introduce generic probe capabilityAndrew Jeffery1-4/+10
[ Upstream commit 5684204c2d0b470de72eff563c9e0172bbdbcb18 ] Introduce generic read and write probe functions that allow detection of valid addresses by way of synchronous testing for the SYNC no-response state. If the no-response state is detected the probe functions will return an error to the caller, who can do with it what they wish. In the process, rip out the naive mechanism for muting the equivalent asynchronous error logging (regretfully introduced recently by yours truly). Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-10-31astbmc: Remove coordinated isolation supportAndrew Jeffery1-2/+0
[ Upstream commit 1a1ff0ab2c78f9257bb77301191df38242d11f0d ] Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-10-31astbmc: Prefer ipmi-hiomap for PNOR accessAndrew Jeffery1-2/+2
[ Upstream commit b5edb1692b7f6af1a60758f4f63f52f795b5dba0 ] If the IPMI command is not available, fall back to the mailbox interface. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> [stewart: fix up mbox test] Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-10-31libflash: Add ipmi-hiomapAndrew Jeffery3-17/+87
[ Upstream commit 35c955970af6c90429a558470c779d158ce54ea9 ] ipmi-hiomap implements the PNOR access control protocol formerly known as "the mbox protocol" but uses IPMI instead of the AST LPC mailbox as a transport. As there is no-longer any mailbox involved in this alternate implementation the old protocol name is quite misleading, and so it has been renamed to "the hiomap protoocol" (Host I/O Mapping protocol). The same commands and events are used though this client-side implementation assumes v2 of the protocol is supported by the BMC. The code is a heavily-reworked copy of the mbox-flash source and is introduced this way to allow for the mbox implementation's eventual removal. mbox-flash should in theory be renamed to mbox-hiomap for consistency, but as it is on life-support effective immediately we may as well just remove it entirely when the time is right. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> [stewart: prlog debug over prerror for mbox fallback, fix indent] Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-10-31ipmi: Introduce registration for SEL command handlersAndrew Jeffery1-0/+5
[ Upstream commit d4048420962097ce5b46167b2715b458142d394f ] Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-10-31lpc: Silence LPC SYNC no-response error when necessaryAndrew Jeffery2-0/+4
[ Upstream commit 1d8793c64b596cfdcc3cf6035a3b4cbe3c341ae9 ] Add the ability to silence particular errors from the LPC bus when they can be expected, particularly: LPC[000]: Got SYNC no-response error. Error address reg: 0xd001002f This is necessary on platform exit on some astbmc machines to avoid unnecessary noise in the msglog. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-10-31ast-io: Rework setup/tear-down of communication with the BMCAndrew Jeffery1-4/+15
[ Upstream commit ebc8524a3a457f73083d984296bfd797940a711c ] It's possible for the platform to configure the BMC with SuperIO access disabled. Rework the interfaces to report failures if SuperIO is not enabled, and clean up once we're finished. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-10-31ast-bmc: Rename LPC FW cycle helpersAndrew Jeffery1-2/+2
[ Upstream commit 8972e44f97883e5aabf4b9c6737dcf3b22fd24b8 ] Introduce some consistency for readability and make the names better reflect the nature of the tests. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-10-24ast-bmc: Move copy routines to ast-sf-ctrlAndrew Jeffery1-6/+0
[ Upstream commit 5b1bc2ffe791ae94361d86b2ae063ee543bf2df5 ] The only user was hw/ast-bmc/ast-sf-ctrl.c, and for accessing flash the copy routines require knowledge of the PNOR LPC offset. For systems using MBOX the ast-sf-ctrl implementation is unused, so move the offset initialisation out of the common code-path and the copy routines to the place where they are necessary. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-10-24astbmc: Enable mbox depending on scratch regJoel Stanley1-0/+1
[ Upstream commit b09e48ffcdbffca97f7f6ebc2135a9e82dc5d9e9 ] P8 boxes can opt in for mbox pnor support if they set the scratch register bit to indicate it is supported. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-10-16phb4/capp: Update the expected Eye-catcher for CAPP ucode lidVaibhav Jain1-1/+1
[ Upstream commit d5ebd5519dcd1727bd2355d9e5aa4bfbcd7f3792 ] Currently on a FSP based P9 system load_capp_code() expects CAPP ucode lid header to have eye-catcher magic of 'CAPPPSLL'. However skiboot currently supports CAPP ucode only lids that have a eye-catcher magic of 'CAPPLIDH'. This prevents skiboot from loading the ucode with this error message: CAPP: ucode header invalid We fix this issue by updating load_capp_ucode() to use the eye-catcher value of 'CAPPLIDH' instead of 'CAPPPSLL'. Cc: stable Fixes: e50764d4f2b1("capi: Load capp microcode") Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-07-19phb4: Reallocate PEC2 DMA-Read engines to improve GPU-Direct bandwidthVaibhav Jain1-0/+2
We reallocate additional 16/8 DMA-Read engines allocated to stack0/1 on PEC2 respectively. This is needed to improve bandwidth available to the Mellanox CX5 adapter when trying to read GPU memory (GPU-Direct). If kernel cxl driver indicates a request to allocate maximum possible DMA read engines when calling enable_capi_mode() and card is attached to PEC2/stack0 slot then we assume its a Mellanox CX5 adapter. We then allocate additional 16/8 extra DMA read engines to stack0 and stack1 respectively on PEC2. This is done by populating the XPEC_PCI_PRDSTKOVR and XPEC_NEST_READ_STACK_OVERRIDE as suggested by the h/w team. Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com> (cherry picked from commit 3754dba77ef5a4d72dc579e789c0a7b06af02160) Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-07-18phb4: Disable nodal scoped DMA accesses when PB pump mode is enabledAlistair Popple1-0/+2
By default when a PCIe device issues a read request via the PHB it is first issued with nodal scope. When accessing GPU memory the NPU does not know at the time of response if the requested memory page is off node or not. Therefore every read of GPU memory by a PHB is retried with larger scope which introduces bandwidth and latency issues. On smaller boxes which have pump mode enabled nodal and group scoped reads are treated the same and both types of request are broadcast to one chip. Therefore we can avoid the retry by disabling nodal scope on the PHB for these boxes. On larger boxes nodal (single chip) and group (multiple chip) scoped reads are treated differently. Therefore we avoid disabling nodal scope on large boxes which have pump mode disabled to avoid all PHB requests being broadcast to multiple chips. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com> (cherry picked from commit 68518e542e6f7adfe4e97ac22024970ac2400872) Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-07-18Move pb_cen_hp_mode_curr register definition to xscom-p9-reg.hAlistair Popple2-2/+4
Currently it is defined in npu2-regs.h but needs to be used by other files as well so move it somewhere generic. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com> (cherry picked from commit b8702e2c69638f9cab818e76232af3481935e250) Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-07-17npu2/hw-procedures: Enable parity and credit overflow checksReza Arbab1-0/+4
Enable these error checking features by setting the appropriate bits in our one-off initialization of each "NTL Misc Config 2" register. The exception is NDL RX parity checking, which should be disabled during the link training procedures. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com> (cherry picked from commit 041d69bb1a7084778d63a846d109c148c7a0009a) Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-07-17npu2/hw-procedures: Don't open code NPU2_NTL_MISC_CFG2_BRICK_ENABLEReza Arbab1-0/+1
Name this bit properly. There's a lot more cleanup like this to be done, but I'm catching this one now as part of some related changes. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com> (cherry picked from commit c2493fd0ce30dd4204cf4cec2e9c4496201a0cf1) Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-07-11phb4/capp: Calculate STQ/DMA read engines based on link-width for PECVaibhav Jain1-0/+6
Presently in CAPI mode the number of STQ/DMA-read engines allocated on PEC2 for CAPP is fixed to 6 and 0-30 respectively irrespective of the PCI link width. These values are only suitable for x8 cards and quickly run out if a x16 card is plugged to a PEC2 attached slot. This usually manifests as CAPP reporting TLBI timeout due to these messages getting stalled due to insufficient STQs. To fix this we update enable_capi_mode() to check if PEC2 chiplet is in x16 mode and if yes then we allocate 4/0-47 STQ/DMA-read engines for the CAPP traffic. Cc: stable # v5.7+ Fixes: 37ea3cfdc852("capi: Enable capi mode for PHB4") Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com> (cherry picked from commit 47c09cdfe7a34843387c968ce75cea8dc578ab91) Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-06-19pci: Fix PCI_DEVICE_ID()Andrew Jeffery1-1/+1
The vendor ID is 16 bits not 8. This error leaves the top of the vendor ID in the bottom bits of the device ID, which resulted in e.g. a failure to run the PCI quirk for the AST VGA device. Fixes: 2b841bf0ef1b ("core/pci: Use cached vendor/device IDs in quirks") Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com> (cherry picked from commit 50dfd067835af53736ec8106b1f0e99339b54a81) Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-06-19NX: Add NX coprocessor init opal callHaren Myneni3-1/+5
The read offset (4:11) in Receive FIFO control register is incremented by FIFO size whenever CRB read by NX. But the index in RxFIFO has to match with the corresponding entry in FIFO maintained by VAS in kernel. VAS entry is reset to 0 when opening the receive window during driver initialization. So when NX842 is reloaded or in kexec boot, possibility of mismatch between RxFIFO control register and VAS entries in kernel. It could cause CRB failure / timeout from NX. This patch adds nx_coproc_init opal call for kernel to initialize readOffset (4:11) and Queued (15:23) in RxFIFO control register. Fixes: 3b3c5962f432 ("NX: Add P9 NX support for 842 compression engine") CC: stable # v5.8+ Signed-off-by: Haren Myneni <haren@us.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com> (cherry picked from commit 56026a13292453b072ad3cc9adf3dee960077f38) Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-05-18cpu: Clear PCR SPR in opal_reinit_cpus()Michael Neuling1-0/+1
Currently if Linux boots with a non-zero PCR, things can go bad where some early userspace programs can take illegal instructions. This is being fixed in Linux, but in the mean time, we should cleanup in skiboot also. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com> (cherry picked from commit 3d019581c98153fc047821e3266d90852141f70d) Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-05-09ipmi: Add BMC firmware version to device treeVasant Hegde1-0/+7
BMC Get device ID command gives BMC firmware version details. Lets add this to device tree. User space tools will use this information to display BMC version details. Stewart, I have added bmc information under /ibm,firmware-version node as its firmware version. But may be we should add new node (/bmc/firmware). So that we can keep BMC related information separately. Let me know your thoughts on this. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-05-04fsp: Fix msg vaargs usageJoel Stanley1-2/+2
hw/fsp/fsp.c:1011:17: warning: passing an object that undergoes default argument promotion to 'va_start' has undefined behavior [-Wvarargs] va_start(list, add_words); ^ hw/fsp/fsp.c:1007:59: note: parameter of type 'u8' (aka 'unsigned char') is declared here void fsp_fillmsg(struct fsp_msg *msg, u32 cmd_sub_mod, u8 add_words, ...) ^ [CC] platforms/ibm-fsp/apollo-pci.o hw/fsp/fsp.c:1026:17: warning: passing an object that undergoes default argument promotion to 'va_start' has undefined behavior [-Wvarargs] va_start(list, add_words); ^ hw/fsp/fsp.c:1016:47: note: parameter of type 'u8' (aka 'unsigned char') is declared here struct fsp_msg *fsp_mkmsg(u32 cmd_sub_mod, u8 add_words, ...) Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-05-04processor.h: implement sndmsg instructionsJoel Stanley1-5/+15
Clang doesn't know about msgsnd, msgclr, msgsync yet. Open code them using .long asm() calls. Instead of introducing ifdef hell, do this unconditionally for all compilers as the code generation does not change. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-30SBE: Add timer supportVasant Hegde1-0/+6
SBE on P9 provides one shot programmable timer facility. We can use this to implement OPAL timers and hence limit the reliance on the Linux heartbeat (similar to HW timer facility provided by SLW on P8). Design: - We will continue to run Linux heartbeat. - Each chip has SBE. This patch always schedules timer on SBE on master chip. - Start timer option starts new timer or modifies an active timer for the specified timeout. - SBE expects timeout value in microseconds. We track timeout value in TB. Hence we convert tb to microseconds before sending request to SBE. - We are requesting ack from SBE for timer message. It gaurantees that SBE has scheduled timer. - Disabling SBE timer We expect SBE to send timer expiry interrupt whenever timer expires. We wait for 10 more ms before disabling timer. In future we can consider below alternative approaches: - Presently SBE timer disable is permanent (until we reboot system). SBE sends "I'm back" interrupt after reset. We can consider restarting timer after SBE reset. - Reset SBE and start timer again. - Each chip has SBE. On multi chip system we can try to schedule timer on different chip. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-30Move P8 timer code to separate fileVasant Hegde2-6/+28
Lets move P8 timer support code from slw.c to sbe-p8.c (as suggested by BenH). There is a difference between timer support in P8 and P9. Hence I think it makes sense to name it as sbe-p8.c. Note that this is pure code movement and renaming functions/variables. No functionality changes. Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-30Add SBE driver supportVasant Hegde2-3/+207
SBE (Self Boot Engine) on P9 has two different jobs: - Boot the chip up to the point the core is functional - Provide various services like timer, scom, stash MPIPL, etc., at runtime OPAL can communicate to SBE via a set of data and control registers provided by the PSU block in P9 chip. - Four 8 byte registers for Host to send command packets to SBE - Four 8 byte registers for SBE to send response packets to Host - Two doorbell registers (1 on each side) to alert either party when data is placed in above mentioned data register Protocol constraints: Only one command is accepted in the command buffer until the response for the command is enqueued in the response buffer by SBE. Usage: We will use SBE for various purposes like timer, MPIPL, etc. This patch implements the SBE MBOX spec for OPAL to communicate with SBE. Design consideration: - Each chip has SBE. We need to track SBE messages per chip. Hence added per chip sbe structure and list of messages to that chip - SBE accepts only one command at a time. Hence serialized MBOX commands. - OPAL gets interrupted once SBE sets doorbell register - OPAL has to clear doorbell register after reading response - Every command class has timeout option. Timed out messages are discarded - SBE MBOX commands can be classified into four types : - Those that must be sent to the master only (ex: sending MDST/MDDT info) - Those that must be sent to slaves only (ex: continue MPIPL) - Those that must be sent to all chips (ex: close insecure window) - Those that can be sent to any chip (ex: timer) Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-19sensors: Dont add DTS sensors when OCC inband sensors are availableShilpasri G Bhat1-1/+1
There are two sets of core temperature sensors today. One is DTS scom based core temperature sensors and the second group is the sensors provided by OCC. DTS is the highest temperature among the different temperature zones in the core while OCC core temperature sensors are the average temperature of the core. DTS sensors are read directly by the host by SCOMing the DTS sensors while OCC sensors are read and updated by OCC to main memory. Reading DTS sensors by SCOMing is a heavy and slower operation as compared to reading OCC sensors which is as good as reading memory. So dont add DTS sensors when OCC sensors are available. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Acked-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-18core/opal: Emergency stack for re-entryNicholas Piggin3-5/+15
This detects OPAL being re-entered by the OS, and switches to an emergency stack if it was. This protects the firmware's main stack from re-entrancy and allows the OS to use NMI facilities for crash / debug functionality. Further nested re-entry will destroy the previous emergency stack and prevent returning, but those should be rare cases. This stack is sized at 16kB, which doubles the size of CPU stacks, so as not to introduce a regression in primary stack size. The 16kB stack originally had a 4kB machine check stack at the top, which was removed by 80eee1946 ("opal: Remove machine check interrupt patching in OPAL."). So it is possible the size could be tightened again, but that would require further analysis. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-18asm/head: implement quiescing without stack or clobbering regsNicholas Piggin1-1/+1
Quiescing currently is implmeented in C in opal_entry before the opal call handler is called. This works well enough for simple cases like fast reset when one CPU wants all others out of the way. Linux would like to use it to prevent an sreset IPI from interrupting firmware, which could lead to deadlocks when crash dumping or entering the debugger. Linux interrupts do not recover well when returning back to general OPAL code, due to r13 not being restored. OPAL also can't be re-entered, which may happen e.g., from the debugger. So move the quiesce hold/reject to entry code, beore the stack or r1 or r13 registers are switched. OPAL can be interrupted and returned to or re-entered during this period. This does not completely solve all such problems. OPAL will be interrupted with sreset if the quiesce times out, and it can be interrupted by MCEs as well. These still have the issues above. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-18core/stack: backtrace unwind basic OPAL call detailsNicholas Piggin1-3/+22
Put OPAL callers' r1 into the stack back chain, and then use that to unwind back to the OPAL entry frame (as opposed to boot entry, which has a 0 back chain). >From there, dump the OPAL call token and the caller's r1. A backtrace looks like this: CPU 0000 Backtrace: S: 0000000031c03ba0 R: 000000003001a548 ._abort+0x4c S: 0000000031c03c20 R: 000000003001baac .opal_run_pollers+0x3c S: 0000000031c03ca0 R: 000000003001bcbc .opal_poll_events+0xc4 S: 0000000031c03d20 R: 00000000300051dc opal_entry+0x12c --- OPAL call entry token: 0xa caller R1: 0xc0000000006d3b90 --- This is pretty basic for the moment, but it does give you the bottom of the Linux stack. It will allow some interesting improvements in future. First, with the eframe, all the call's parameters can be printed out as well. The ___backtrace / ___print_backtrace API needs to be reworked in order to support this, but it's otherwise very simple (see opal_trace_entry()). Second, it will allow Linux's stack to be passed back to Linux via a debugging opal call. This will allow Linux's BUG() or xmon to also print the Linux back trace in case of a NMI or MCE or watchdog lockup that hits in OPAL. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-17opal/hmi: Generate hmi event for recovered HDEC parity error.Mahesh Salgaonkar1-1/+1
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-17opal/hmi: check thread 0 tfmr to validate latched tfmr errors.Mahesh Salgaonkar1-0/+8
Due to P9 errata, HDEC parity and TB residue errors are latched for non-zero threads 1-3 even if they are cleared. But these are not latched on thread 0. Hence, use xscom SCOMC/SCOMD to read thread 0 tfmr value and ignore them on non-zero threads if they are not present on thread 0. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-17opal/hmi: Fix soft lockups during TOD errorsMahesh Salgaonkar1-0/+1
There are some TOD errors which do not affect working of TOD and TB. They stay in valid state. Hence we don't need rendez vous for TOD errors that does not affect TB working. TOD errors that affects TOD/TB will report a global error on TFMR[44] alongwith bit 51, and they will go in rendez vous path as expected. But the TOD errors that does not affect TB register sets only TFMR bit 51. The TFMR bit 51 is cleared when any single thread clears the TOD error. Once cleared, the bit 51 is reflected to all the cores on that chip. Any thread that reads the TFMR register after the error is cleared will see TFMR bit 51 reset. Hence the threads that see TFMR[51]=1, falls through rendez-vous path and threads that see TFMR[51]=0, returns doing nothing. This ends up in a soft lockups in host kernel. This patch fixes this issue by not considering TOD interrupt (TFMR[51]) as a core-global error and hence avoiding rendez-vous path completely. Instead threads that see TFMR[51]=1 will now take different path that just do the TOD error recovery. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-17opal/hmi: Do not send HMI event if no errors are found.Mahesh Salgaonkar1-1/+1
For TOD errors, all the cores in the chip get HMIs. Any one thread from any core can fix the issue and TFMR will have error conditions cleared. Rest of the threads need take any action if TOD errors are already cleared. Hence thread 0 of every core should get a fresh copy of TFMR before going ahead recovery path. Initialize recover = -1, so that if no errors found that thread need not send a HMI event to linux. This helps in stop flooding host with hmi event by every thread even there are no errors found. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-17opal/hmi: Rework HMI handling of TFAC errorsBenjamin Herrenschmidt2-3/+6
This patch reworks the HMI handling for TFAC errors by introducing 4 rendez-vous points improve the thread synchronization while handling timebase errors that requires all thread to clear dirty data from TB/HDEC register before clearing the errors. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-17opal/hmi: Add a new opal_handle_hmi2 that returns direct info to LinuxBenjamin Herrenschmidt2-1/+8
It returns a 64-bit flags mask currently set to provide info about which timer facilities were lost, and whether an event was generated. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-11phb4: Restore bus numbers after CRSMichael Neuling1-0/+1
Currently we restore PCIe bus numbers right after the link is up. Unfortunately as this point we haven't done CRS so config space may not be accessible. This moves the bus number restore till after CRS has happened. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-11phb4: Enable the PCIe slotcap on pluggable slotsOliver O'Halloran1-0/+1
Enables reporting of slot status information, etc in the config space of the root complex. Currently this is only used to set the slot power limit in our generic PCI code, but we might use it for other things later on. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-11core/pci: Set slot power limit when supportedOliver O'Halloran1-0/+1
The PCIe slot capability can be implemented in a root or switch downstream port to set the maximum power a card is allowed to draw from the system. This patch adds support for setting the power limit when the platform has defined one. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-11interrupts: Create an "interrupts" property in the OPAL nodeBenjamin Herrenschmidt1-0/+3
Deprecate the old "opal-interrupts", it's still there, but the new property follows the standard and allow us to specify whether an interrupt is level or edge sensitive. Similarly create "interrupt-names" whose content is identical to "opal-interrupts-names". Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-03xive: disable store EOI supportCédric Le Goater1-0/+3
Hardware has limitations which would require to put a sync after each store EOI to make sure the MMIO operations that change the ESB state are ordered. This is a killer for performance and the PHBs do not support the sync. So remove the store EOI for the moment, until hardware is improved. Also, while we are at changing the XIVE source flags, let's fix the settings for the PHB4s which should follow these rules : - SHIFT_BUG for DD10 - STORE_EOI for DD20 and if enabled - TRIGGER_PAGE for DDx0 and if not STORE_EOI Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2018-03-27core/cpu: discover stack region size before initialising memory regionsNicholas Piggin2-1/+1
Stack allocation first allocates a memory region sized to hold stacks for all possible CPUs up to the maximum PIR of the architecture, zeros the region, then initialises all stacks. Max PIR is 32768 on POWER9, which is 512MB for stacks. The stack region is then shrunk after CPUs are discovered, but this is a bit of a hack, and it leaves a hole in the memory allocation regions as it's done after mem regions are initialised. 0x000000000000..00002fffffff : ibm,os-reserve - OS 0x000030000000..0000303fffff : ibm,firmware-code - OPAL 0x000030400000..000030ffffff : ibm,firmware-heap - OPAL 0x000031000000..000031bfffff : ibm,firmware-data - OPAL 0x000031c00000..000031c0ffff : ibm,firmware-stacks - OPAL *** gap *** 0x000051c00000..000051d01fff : ibm,firmware-allocs-memory@0 - OPAL 0x000051d02000..00007fffffff : ibm,firmware-allocs-memory@0 - OS 0x000080000000..000080b3cdff : initramfs - OPAL 0x000080b3ce00..000080b7cdff : ibm,fake-nvram - OPAL 0x000080b7ce00..0000ffffffff : ibm,firmware-allocs-memory@0 - OS This change moves zeroing into the per-cpu stack setup. The boot CPU stack is set up based on the current PIR. Then the size of the stack region is set, by discovering the maximum PIR of the system from the device tree, before mem regions are intialised. This results in all memory being accounted within memory regions, and less memory fragmentation of OPAL allocations. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2018-03-27mem-map: Use a symbolic constant for exception vector sizeNicholas Piggin1-0/+5
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2018-03-27core/fast-reboot: verify mem regions before fast rebootNicholas Piggin1-0/+4
Run the mem_region sanity checkers before proceeding with fast reboot. This is the beginning of proactive sanity checks on opal data for fast reboot (with complements the reactive disable_fast_reboot cases). This is encouraged to re-use and share any kind of debug code and unit test code. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2018-03-27Revert "NPU2 HMIs: dump out a *LOT* of npu2 registers for debugging"Stewart Smith2-8/+3
This reverts commit fbdc91e693fc3103f7e2a65054ed32bfb26a2e17. We don't need this as we need to do it a different way, with a explicit set of registers as otherwise we trip other random FIR bits and everything becomes even more terrible. I suggest alcohol. Cc: stable Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2018-03-22npu2: Remove DD1 supportAndrew Donnellan2-3/+0
Major changes in the NPU between DD1 and DD2 necessitated a fair bit of revision-specific code. Now that all our lab machines are DD2, we no longer test anything on DD1 and it's time to get rid of it. Remove DD1-specific code and abort probe if we're running on a DD1 machine. Cc: Alistair Popple <alistair@popple.id.au> Cc: Reza Arbab <arbab@linux.vnet.ibm.com> Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-By: Alistair Popple <alistair@popple.id.au> Acked-by: Reza Arbab <arbab@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>