path: root/core
Age | Commit message | Author | Files | Lines
2014-10-08 | PCI: Refactor error injection | Gavin Shan | 1 | -9/+8
The patch refactors the code we had for PCI error injection. It doesn't change the logic:
* Rename error types and functions according to the comments given by Michael Ellerman when reviewing the kernel counterpart.
* Split the backend of error injection for PHB3 and P7IOC into multiple functions to improve code readability. Some logic is simplified without affecting its original functionality.
* Misc cleanup like renaming variables and functions.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-10-02 | hmi: decode_malfunction fixes | Ryan Grimm | 1 | -11/+7
Fix for nodes > 0. No need to map to node and local chip id. Just pass i as chip id. Remove unnecessary braces. In set_capp_recoverable, return not recovered if phb not found. Found by Milton Miller. Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-09-30 | attn: Make backtrace buffer global | Aruna Balakrishnaiah | 1 | -6/+5
Code cleanup. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-09-30 | hmi: Handle capp recoverable malfunctions | Ryan Grimm | 1 | -8/+79
Based on email from JT Kellington, Dave Larson, and Joe McGill and feedback from Ben H. handle_malfunction reads the bits in the malf alert reg, checks for is_capp_recoverable, and returns 1 if recoverable. It also calls into phb3 to put phb3 in capp error recovery state. Returns 0 if not capp recoverable and it's a TODO to add the logic to check the other FIRs. Don't send message when malf alert empty. Use return code -1 to tell opal_handle_hmi to swallow the event. Also, with locking, only one thread per core will send the message instead of all threads. Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
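The return-code convention described in this commit message can be sketched roughly as follows. This is only an illustration: the `HMI_*` constant names, the callback type, the example predicate `only_chip2`, and the MSB-0 bit numbering of the alert register are all assumptions, and the call into phb3 that puts the PHB into CAPP error recovery state is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for the three return codes the message describes. */
#define HMI_SWALLOW        (-1)  /* alert register empty: send no message */
#define HMI_NOT_RECOVERED    0   /* not CAPP recoverable (other FIRs: TODO) */
#define HMI_RECOVERED        1   /* CAPP recoverable */

/* Assumption: bit i of the register is counted from the most significant end. */
#define MSB0_BIT(i) (1ull << (63 - (i)))

/* Hypothetical callback standing in for the is_capp_recoverable check. */
typedef bool (*capp_check_fn)(int chip_id);

/* Example predicate for demonstration: only chip 2 is CAPP recoverable. */
bool only_chip2(int chip_id)
{
    return chip_id == 2;
}

int handle_malfunction(uint64_t malf_alert, capp_check_fn is_capp_recoverable)
{
    assert(is_capp_recoverable != NULL);

    /* Nothing set: tell the caller to swallow the event. */
    if (malf_alert == 0)
        return HMI_SWALLOW;

    /* Walk the set bits, passing the bit index straight through as chip id. */
    for (int i = 0; i < 64; i++) {
        if (!(malf_alert & MSB0_BIT(i)))
            continue;
        if (is_capp_recoverable(i))
            return HMI_RECOVERED;
    }
    return HMI_NOT_RECOVERED;
}
```

With the example predicate, an alert on chip 2 yields `HMI_RECOVERED`, an alert on any other chip yields `HMI_NOT_RECOVERED`, and an empty register yields `HMI_SWALLOW`.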
2014-09-30 | hmi: Add locking to hmi handler | Ryan Grimm | 1 | -0/+5
Take a lock before handle_hmi_event per Ben's suggestion. So, when we clear events, only one thread per core will report it. Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-09-30 | phb3/capi: Add two new modes to opal_pci_set_phb_capi_mode | Ryan Grimm | 1 | -10/+5
For user initiated capp recovery, provide a mode to turn snoops off. The perst alone does not turn snoops off and we need to do this as part of the capp recovery procedure before reinitializing the phb. A second mode turns snoops back on after recovery. The driver needs to do this after it reinitializes the PSL, otherwise tlbies could come in before the psl is initialized. Also write 0 to capp error status and control as part of the recovery procedure. Put modes as flag defines in opal.h so the driver can pick them up. Add a dt property "ibm,capi-modes" which tells the driver which modes sapphire supports, for backwards compatibility with older opals. Also, the driver can disable reset in sysfs if not supported. Move the mode checking into phb3.c so it's all in one place. Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-09-01 | ipmi: Improve RTC support | Benjamin Herrenschmidt | 1 | -3/+14
Create a device-node which will be used by Linux for matching and use a saner default time if IPMI doesn't work. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-09-01 | core: Setup the OPAL DT node before platform probe | Benjamin Herrenschmidt | 2 | -4/+3
The platform probe code might want to add things to it. While at it, make add_cpu_idle_state_properties() local to slw.c and call it from slw_init() instead of from add_opal_node(). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-29 | PCI: Disable completion timeout | Gavin Shan | 1 | -0/+22
For PCIe devices, there are 2 bits used to control completion timeout as follows:
PCIe Cap + 0x24, Device Capabilities 2 Register, bit#4
PCIe Cap + 0x28, Device Control 2 Register, bit#4
The patch adds function pci_disable_completion_timeout(), which is called during bootup or after PE reset. It's in response to bug#114961.
Suggested-by: Michael A. Perez <perezma@us.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
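A minimal sketch of how the two bits might be used, operating on an in-memory copy of config space rather than real hardware. The offsets and bit numbers come from the message above; the reading that the capability bit advertises support and the control bit performs the disable follows the PCIe specification as I understand it. The accessor helpers and the function name `disable_completion_timeout` are hypothetical and are not the actual pci_disable_completion_timeout() implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets relative to the PCIe capability, as quoted in the commit message. */
#define PCIE_DEVCAP2              0x24        /* Device Capabilities 2 (32-bit) */
#define PCIE_DEVCTL2              0x28        /* Device Control 2 (16-bit) */
#define PCIE_DEVCAP2_CTO_DIS_SUP  (1u << 4)   /* disable is supported */
#define PCIE_DEVCTL2_CTO_DIS      (1u << 4)   /* disable completion timeout */

/* Hypothetical little-endian accessors over a byte image of config space. */
static uint32_t cfg_read32(const uint8_t *cfg, uint32_t off)
{
    return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

static uint16_t cfg_read16(const uint8_t *cfg, uint32_t off)
{
    return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

static void cfg_write16(uint8_t *cfg, uint32_t off, uint16_t val)
{
    cfg[off] = (uint8_t)(val & 0xff);
    cfg[off + 1] = (uint8_t)(val >> 8);
}

/* Returns true if the disable bit was set, false if the device lacks support. */
bool disable_completion_timeout(uint8_t *cfg, uint32_t pcie_cap)
{
    assert(cfg != NULL);

    uint32_t cap2 = cfg_read32(cfg, pcie_cap + PCIE_DEVCAP2);
    if (!(cap2 & PCIE_DEVCAP2_CTO_DIS_SUP))
        return false;

    /* Read-modify-write so the other control bits are preserved. */
    uint16_t ctl2 = cfg_read16(cfg, pcie_cap + PCIE_DEVCTL2);
    cfg_write16(cfg, pcie_cap + PCIE_DEVCTL2, ctl2 | PCIE_DEVCTL2_CTO_DIS);
    return true;
}
```

Checking the capability bit first keeps the write away from devices that do not implement the disable control.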
2014-08-29 | PCI: Add pci_device_init() | Gavin Shan | 1 | -6/+6
The patch adds function pci_device_init(), which is called by phb->ops->device_init() to apply common initialization on the specified PCI device during bootup or after PE reset. Currently, we only put the logic of MPS configuration to the function, but more will be put there. Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-21 | cpu: Bump the reinit timeout up to 1s | Benjamin Herrenschmidt | 1 | -1/+2
Probably due to the way we spin, we seem to still be hitting the odd case where we fail to reinit due to a secondary not having quite reached the right state inside skiboot. Let's bump the timeout up. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-15 | ipmi: Add an IPMI command to get and set the RTC | Alistair Popple | 1 | -0/+98
Add IPMI GET_SEL_TIME and SET_SEL_TIME commands to the IPMI stack. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-15 | fsp/rtc: Use libc time functions | Alistair Popple | 2 | -2/+95
Our libc now has a proper implementation of mktime, which makes adding tm structures together easy. This patch makes the FSP RTC functions use the library functions and removes the generic time calculation code from the FSP RTC driver. The OPAL<->tm conversion functions are also made public as they will be useful for the IPMI RTC implementation. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
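To illustrate why a working mktime makes time arithmetic on `struct tm` easy: standard C lets the fields hold out-of-range values and has mktime() renormalise them. The helper below is a hypothetical sketch using the hosted C library, not the FSP RTC code itself.

```c
#include <assert.h>
#include <time.h>

/*
 * Hypothetical helper: advance *tm by `seconds` (which may be negative).
 * The seconds field is allowed to leave the 0..59 range; mktime() then
 * carries the overflow into minutes, hours, days and so on.
 */
time_t tm_add_seconds(struct tm *tm, int seconds)
{
    assert(tm != NULL);

    tm->tm_sec += seconds;
    tm->tm_isdst = -1;      /* let the library decide about DST */
    return mktime(tm);      /* normalises *tm in place */
}
```

For example, adding 20 seconds to 2014-08-15 23:59:50 leaves the structure holding 2014-08-16 00:00:10, with no hand-written carry logic.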
2014-08-15 | {core,hdata}/test: Add prlog to stub | Joel Stanley | 1 | -1/+12
We are missing a prlog for tests. This adds a dumb version that ignores the log level and uses printf to display all messages. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-15 | ipmi: Correct chassis control message | Joel Stanley | 1 | -5/+12
I misread the spec when implementing the chassis control message. This fixes the message, as well as correcting the naming of the IPMI fields to better reflect what they represent. Signed-off-by: Joel Stanley <joel@jms.id.au> Acked-by: Jeremy Kerr <jeremy.kerr@au.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 | attn: Dump backtrace to buffer | Aruna Balakrishnaiah | 1 | -5/+25
Existing backtrace will dump the backtrace to stderr. __backtrace will dump the backtrace to buffer. backtrace() will call __backtrace internally and dump it to stderr. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
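The split described here, one function that formats into a caller-supplied buffer and a thin wrapper that prints it, might look like the sketch below. The function names, the output format, and the fact that the caller passes in the return addresses (instead of walking the stack) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for __backtrace(): format `count` return addresses
 * into buf, one per line. Returns the number of characters stored, not
 * counting the terminating NUL. Output is truncated if buf is too small.
 */
size_t format_backtrace(char *buf, size_t len,
                        const unsigned long *pcs, size_t count)
{
    assert(buf != NULL && len > 0);

    size_t used = 0;
    buf[0] = '\0';
    for (size_t i = 0; i < count && used < len - 1; i++) {
        int n = snprintf(buf + used, len - used, "S%zu: %016lx\n", i, pcs[i]);
        if (n < 0)
            break;
        if ((size_t)n >= len - used) {   /* entry did not fit: truncated */
            used = len - 1;
            break;
        }
        used += (size_t)n;
    }
    return used;
}

/* Hypothetical stand-in for backtrace(): reuse the formatter, then print. */
void print_backtrace(const unsigned long *pcs, size_t count)
{
    char buf[256];

    format_backtrace(buf, sizeof(buf), pcs, count);
    fputs(buf, stderr);
}
```

Keeping the formatting separate means the same text can be stored somewhere other than stderr without duplicating the formatting logic.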
2014-08-13 | Add cpu_relax to stop cores spinning hard | Joel Stanley | 2 | -4/+20
Ensure a thread is not stopping its siblings from making forward progress when we are busy-waiting on older DD1.x CPU revisions where SMT priorities are somewhat broken. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 | plat/palmetto: Add shutdown and reboot | Joel Stanley | 1 | -0/+26
Rebooting and power down for the Palmetto is done by the BMC, which we speak to over the BT interface using IPMI. Implement the IPMI chassis commands which are used for power control, and hook them up to the palmetto platform callbacks for shutdown and reboot. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 | ipmi: Add a base IPMI stack with a BT driver | Alistair Popple | 2 | -1/+37
This patch adds a basic IPMI layer to the sapphire core and support for a BT IPMI interface as found on the Aspeed BMC of the Palmetto platform. [ Changed the compatible property -- BenH ] Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-08 | pci: Improve logging format and log levels | Benjamin Herrenschmidt | 1 | -59/+78
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-08 | Add fake RTC to generic platform | millerjo@us.ibm.com | 1 | -0/+2
Adds a fake RTC that can be initialized via a named reserve in the device tree that may, at some point, be on NVRAM. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-08 | PCI: Restore bus numbers after complete reset | Gavin Shan | 1 | -8/+39
The complete reset could be issued by kdump kernel to remove pending PCI traffic in order to avoid EEH errors in kdump scenario. However, the bus numbers configured into PCI bridges would be lost after the reset, so some PCI devices (e.g. IPR) can't be probed by kdump kernel successfully. The patch fixes above issue by restoring bus numbers after complete reset. It's in response to bug#113210. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-08 | PCI: Support parallel scanning | Gavin Shan | 1 | -18/+50
Currently, the tasks of scanning PHBs are done on master CPU one by one. The patch intends to do same tasks on multiple CPUs in order to save booting time with help of additional flags to PHB. With the patch applied, resetting and scanning 8 PHBs on one P8 box takes 15 seconds instead of 37 seconds, saving 22 seconds. NOTE: the printed logs during PCI enumeration should include PHB index to be self-explaining enough. I'll fix it later. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-08 | PCI: Split slot reset and scan | Gavin Shan | 1 | -20/+38
As Rolf reported, 2 downstream ports from different PHBs are connected to same physical bridge, which supports virtual "partitioned" functionalities. Fundamental reset issued on one PHB affects the functionality used by another PHB during PCI enumeration. Eventually, we can't detect the functionality and all devices behind it on one of two PHBs. The patch splits PCI enumeration to reset all PHBs and then scan them one by one to avoid above issue. Also, the patch replaces PCI_MAX_PHBs with ARRAY_SIZE, which is used heavily. Reported-by: Rolf Brudeseth <rolfb@us.ibm.com> Suggested-by: Benjamin Herrenschmidt <benh@au1.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
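The reordering can be pictured as two passes over the PHB array instead of one combined pass. The sketch below is a toy model: `struct phb`, `phb_reset`, `phb_scan`, `pci_enumerate` and the step counter are hypothetical and only record call order so that the ordering is visible.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, heavily reduced PHB: just records when each step ran. */
struct phb {
    int reset_order;
    int scan_order;
};

static int step;    /* global sequence counter for the demonstration */

static void phb_reset(struct phb *p) { p->reset_order = ++step; }
static void phb_scan(struct phb *p)  { p->scan_order = ++step; }

/*
 * Reset every populated PHB first, then scan them one by one, so that a
 * fundamental reset on one PHB can no longer disturb a scan in progress
 * on another PHB that shares the same physical bridge.
 */
void pci_enumerate(struct phb **phbs, size_t count)
{
    assert(phbs != NULL);

    for (size_t i = 0; i < count; i++)
        if (phbs[i])
            phb_reset(phbs[i]);

    for (size_t i = 0; i < count; i++)
        if (phbs[i])
            phb_scan(phbs[i]);
}
```

A caller with a fixed table would pass `ARRAY_SIZE(table)` as the count, which is the kind of use the message has in mind when it mentions ARRAY_SIZE.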
2014-08-08 | Make log level thresholds dynamic in debug_descriptor rather than static | Stewart Smith | 2 | -1/+20
This enables (advanced) users to vary what level of output they want at runtime in the memory console and through console drivers (fsp/uart).
You can vary two things by poking in the debug descriptor:
a) what log level is printed at all, e.g. only turn on PR_TRACE at specific points during runtime
b) what log level goes out the fsp/uart console (defaults to PR_PRINTF)
We use two 4bit numbers (1 byte) for this in debug descriptor (saving some space, not needlessly wasting space that we may want in future). The default is 0x75 (7=PR_DEBUG to in memory console, 5=PR_PRINTF to drivers). If you write 0x77 you will get debug info on uart/fsp console as well as in memory. If you write 0x95 you get PR_INSANE in memory but still only PR_NOTICE through drivers. People who write something like 0x1f will get a very quiet boot indeed.
A future patch would be to (when possible) peek at device tree entries for if we should change the default. A future patch would add an OPAL API to get/set this.
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
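The packed byte can be decoded as in the sketch below. The level numbers 5, 7 and 9 are taken from the message (which treats PR_PRINTF and PR_NOTICE as the same level); the function names and the "less than or equal to the threshold" comparison are assumptions about how the thresholds are applied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Only the levels whose numeric values appear in the commit message. */
enum {
    PR_NOTICE = 5,
    PR_PRINTF = PR_NOTICE,
    PR_DEBUG  = 7,
    PR_INSANE = 9,
};

#define DEFAULT_LOG_LEVELS 0x75   /* 7 to memory console, 5 to drivers */

/* High nibble: threshold for the in-memory console (printed at all). */
static int memory_threshold(uint8_t levels) { return levels >> 4; }

/* Low nibble: threshold for the fsp/uart console drivers. */
static int driver_threshold(uint8_t levels) { return levels & 0x0f; }

/* Assumed rule: a message is kept when its level is <= the threshold. */
bool log_to_memory(uint8_t levels, int level)
{
    assert(level >= 0);
    return level <= memory_threshold(levels);
}

/* Drivers only see what already made it into the memory console. */
bool log_to_drivers(uint8_t levels, int level)
{
    return log_to_memory(levels, level) && level <= driver_threshold(levels);
}
```

Under these assumptions 0x1f is quiet because the high nibble (1) filters almost everything out before the generous low nibble is ever consulted.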
2014-08-08 | Write log messages with log_level > PR_NOTICE only to in memory log | Stewart Smith | 2 | -13/+31
We modify write() (adding console_write()) which calls down to a modified __flush_console() which can now decide if it's flushing the added console contents to the console drivers or not. A future patch may add support for changing PR_NOTICE to some other level. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-08 | Trivial typo fix: case -> cause | Stewart Smith | 1 | -1/+1
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-08 | Use PR_EMERG priority in assert() codepath | Stewart Smith | 1 | -0/+6
Moving assert_fail() out of libc and into core/utils.c so that we can sanely call prlog(PR_EMERG). We shorten it from three fputs calls down to one prlog() call. This may increase the number of cycles and stack usage for when we hit an assert, which may not be desirable. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-08 | Use PR_EMERG priority in (part of) assert() | Stewart Smith | 1 | -1/+1
When handling assert and we're going to fail, get the message out with a high priority. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-08 | Use PR_EMERG log priority when printing backtrace | Stewart Smith | 1 | -1/+1
If we're printing a backtrace, things have probably gone horribly, horribly wrong - highest log priority. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-08 | replace printf() with console log, level 5 aka INFO messages | Stewart Smith | 1 | -0/+12
Replace the libc printf implementation with a wrapper that does fancy log things such as display timestamp and the log level. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-08 | Initial code for timestamps in log | Stewart Smith | 2 | -0/+55
This is the initial patch for having timestamps in the log. It currently only wraps prerror to our prlog() function and thus only (very slightly) modifies bootup log. We use the timebase as an indication of the progression of time. It is not perfect, and is indeed reset back to zero during boot, but it should serve adequately for our needs of "approximately this much time elapsed between log entries". Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-30 | core: Fix licence header in hmi.c | Benjamin Herrenschmidt | 1 | -6/+15
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-25 | opal: Recover from TOD sync check error. | Mahesh Salgaonkar | 1 | -1/+2
This patch implements basic framework for TOD error recovery. To start with, this patch implements TOD sync check error recovery as an example. Currently this patch recovers from sync check error on non-master chip. We can use same framework and recover from more TOD errors. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-25 | opal: Add opal call to handle HMI. | Mahesh Salgaonkar | 2 | -8/+86
With new proposed change, Linux will get the HMI interrupt directly. Linux will then invoke opal_handle_hmi to handle HMI recovery in opal. After handling HMI errors, opal will generate an OPAL HMI event and queue it up in opal message infrastructure so that Linux host can pull the event and act upon it accordingly. This patch also adds new message type for HMI event. Changes in v2: - Removed the token argument from opal_handle_hmi() Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-25 | opal: Move HMI handler to new file core/hmi.c | Mahesh Salgaonkar | 3 | -44/+182
Move the original hmi handler to new file core/hmi.c. No functionality change, just a code movement and variable name change. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-21 | PCI: Allow to set frozen state | Gavin Shan | 1 | -0/+19
The patch introduces a new OPAL API opal_pci_eeh_freeze_set(), which allows setting the frozen state for the specified PE, so that we can support "compound" PE in kernel. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-21 | PCI: Clear PAPR error injection | Gavin Shan | 1 | -0/+6
Though the p7ioc spec states the errors triggered by PAPR error injection register set (0x2b0, 0x2b8, 0x2c0) should be one-shot without "sticky" bit, Firebird-L machine doesn't follow the rule. It will cause endless frozen PE until we have to remove the PE permanently. The patch extends opal_pci_reset() allowing kernel to clear PAPR error injection register set at appropriate point. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-21 | core: PCI error injection API | Mike Qiu | 1 | -0/+26
The patch introduces new OPAL API opal_pci_err_injct() for injecting PCI errors. Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-13 | opal: poller re-entrancy try #3 | Benjamin Herrenschmidt | 1 | -31/+14
So my great attempt at avoiding all re-entrancies fails due to HBRT... at least until we have some kind of way to thread things, it will have to re-enter so let's bite the bullet, make the poller list walking lockless (we'll handle removal when we have to, ie, not yet) and slightly extend the coverage of the PSI lock while at it. All the other pollers already have their own locks anyway so we are actually removing some overhead. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-08 | opal: Add a debug helper to check for poller recursion | Benjamin Herrenschmidt | 1 | -1/+21
And check & warn inside opal_run_pollers() as well Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-08 | timebase: Add "nopoll" variants of the delays | Benjamin Herrenschmidt | 1 | -0/+18
In case where we don't want to recurse into opal pollers Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-08 | lock: Add helper to check if this CPU is already holding a lock | Benjamin Herrenschmidt | 1 | -0/+6
For debug purposes essentially Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-08 | opal: Replace fsp_poll() with a full run of all OPAL pollers | Benjamin Herrenschmidt | 3 | -15/+19
Otherwise we don't handle surveillance and PSI link monitoring. This should fix cases of surveillance timeouts during things like code update, such as BZ109939. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-02 | Initial commit of Open Source release | Benjamin Herrenschmidt | 39 | -0/+11572
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>