Age Commit message Author Files Lines
2015-10-14opal-prd: Improve error-checking in hservices_initJeremy Kerr1-3/+16
Currently, a signature failure for the HBRT image prints a log message, but doesn't actually abort the initialisation. This change adds a failure path for this, as well as hbrt_init() returning NULL. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-14external/gard/tests: Add tests for the gard toolCyril Bur15-1/+90
Simple tests for the gard tool that can be expanded on over time Signed-off-by: Cyril Bur <cyril.bur@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-14external/test: Create an external test harnessCyril Bur1-0/+95
Unlike skiboot, where individual functions can be tested, the external/ binaries can sometimes only be fully tested by observing the output of the full binary. As such, this little framework, designed to grab stdout and stderr and compare them to provided output files, should prove useful. Signed-off-by: Cyril Bur <cyril.bur@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-14Fix strcat onto uninit string for pci location codeStewart Smith1-0/+2
I don't *think* we've managed to hit this in the wild, although probably more by accident than by design. The fix is to explicitly set it to the empty string (''). Fixes: 58ccf6a977ade80e4475d7d350c4c076ab1accad Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
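A minimal sketch of the class of bug described here, not the actual skiboot code: the helper name `build_loc_code` and the buffer size are hypothetical. strcat() and strncat() search the destination for its existing terminator, so the buffer has to start out as an empty string.

```c
#include <assert.h>
#include <string.h>

#define LOC_CODE_SIZE 80 /* hypothetical buffer size, for illustration only */

/* Hypothetical helper: builds "<base>-<slot>" into out. */
static void build_loc_code(char out[LOC_CODE_SIZE],
                           const char *base, const char *slot)
{
    assert(base && slot);

    /*
     * Without this line, the first strncat() would scan uninitialised
     * memory looking for a NUL terminator, which is undefined behaviour.
     */
    out[0] = '\0';

    strncat(out, base, LOC_CODE_SIZE - 1);
    strncat(out, "-", LOC_CODE_SIZE - strlen(out) - 1);
    strncat(out, slot, LOC_CODE_SIZE - strlen(out) - 1);
}
```

The explicit terminator costs one store and removes any dependence on what the buffer happened to contain beforehand.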
2015-10-14external/pflash: Add (C) headerCyril Bur1-0/+16
Signed-off-by: Cyril Bur <cyril.bur@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-13Merge branch 'stable'Stewart Smith0-0/+0
2015-10-13fix prerror() build failure in fsp-leds.cskiboot-5.1.7Stewart Smith1-1/+1
Fixes: 8f433d6cd4f92b4f878e5ddc414e2800a2fb7140 Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-13Merge branch 'stable': FSP leds fix and skiboot 5.1.7 release notesStewart Smith2-1/+41
2015-10-13Add skiboot-5.1.7 release notesStewart Smith1-0/+29
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-09Merge branch 'update-2.1.1.1' into stableStewart Smith1-1/+12
2015-10-09PHB3: Remove unnecessary message in phb3_sm_fundamental_reset()skiboot-2.1.1-fw810.40-1Gavin Shan1-2/+1
This removes the unnecessary "Performing PERST..." message from phb3_sm_fundamental_reset(), as a subsequent message already indicates the situation. This also lowers the output level of all messages in this function to DEBUG. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-09PHB3: Retry fundamental resetGavin Shan2-2/+31
When issuing a fundamental reset on the IPR adapter below, which sits behind the root complex, there is a 50% chance that the link fails to come up after the reset. In that case, the adapter's config space is blocked and the adapter is not usable. host# lspci -ns 0004:01:00.0 0004:01:00.0 0104: 1014:034a (rev 01) host# lspci -s 0004:01:00.0 0004:01:00.0 RAID bus controller: IBM PCI-E IPR SAS Adapter (ASIC) (rev 01) This introduces another PHB3 state (PHB3_STATE_FRESET_START) that allows the fundamental reset to be redone if the link doesn't come up in time on the first attempt, improving the robustness of the PHB's fundamental reset. If the link comes up after the first reset, the second reset is not issued at all. Reported-by: Paul Nguyen <nguyenp@us.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-09hw/fsp/fsp-leds.c: use allocated buffer for FSP_CMD_GET_LED_LIST responseStewart Smith1-2/+11
This bug has existed since day 1 (of public release). We were incorrectly using PSI_DMA_LOC_COD_BUF as the *address* to write to for the FSP to read, rather than using that purely as the TCE table. What we *should* have been doing (and this patch now does) is allocating some (aligned) memory and using it. With this patch, we no longer write over some poor random memory location that could be in use by the host OS for something important. For example, in the (internal) bug report of this, it was futex_hash_bucket in Linux being replaced with our structure for replying to FSP_CMD_GET_LED_LIST (which is around 4kb), and Linux doesn't like it when you replace a bunch of lock data structures with essentially garbage. Since this is FSP LED code specific, this only affects FSP based systems. Reported-by: Dionysius d. Bell <belldi@us.ibm.com> Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-08PHB3: Emulate root complex pref 64-bits windowGavin Shan1-18/+70
On Naples, the prefetchable 64-bit window on the root complex can't be altered. However, the Linux kernel depends on that to detect the window properly. Otherwise, the PCI device BARs that should be covered by the PHB's M64 window are allocated from M32, which leaves the CAPI card unusable on the Linux side. This checks whether the root complex's prefetchable 64-bit window can be changed. If not, a PCI config register filter is registered for it to emulate the behaviour. Reported-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Tested-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-08PCI: Introduce config register filterGavin Shan3-0/+114
We have to provide emulated results for PCI config register accesses on some devices to eliminate the gap between hardware and software. One example is register 0x28 (prefetchable memory window upper 32 bits) of the root complex on Naples, which isn't writable. The Linux kernel relies on it being writable to detect the 64-bit window successfully. This introduces a config register filter for PCI devices to eliminate the above gap. Each PCI device maintains a list of filters, which are populated when the PCI device is initialized. When PCI config space is accessed, the filter list is searched to override the result from the user (write) or the hardware (read) if necessary. Reported-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Tested-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
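The filter lookup described above might look roughly like the following sketch. All names here (`cfg_filter`, `pci_dev_sketch`, `cfg_read32`, `cfg_write32`) are hypothetical and do not come from the skiboot source, and the hardware config space is stood in for by a plain array. A filtered register keeps its value in software, so a write to a read-only hardware register still reads back as written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical filter entry covering [offset, offset + len). */
struct cfg_filter {
    uint32_t offset;          /* first config-space offset covered */
    uint32_t len;             /* number of bytes covered */
    uint32_t shadow;          /* emulated register value */
    struct cfg_filter *next;  /* next filter on this device */
};

/* Hypothetical per-device state: just the filter list. */
struct pci_dev_sketch {
    struct cfg_filter *filters;
};

static struct cfg_filter *find_filter(struct pci_dev_sketch *pd,
                                      uint32_t offset)
{
    for (struct cfg_filter *f = pd->filters; f; f = f->next)
        if (offset >= f->offset && offset < f->offset + f->len)
            return f;
    return NULL;
}

/* Write path: a filtered register is captured in software only. */
static void cfg_write32(struct pci_dev_sketch *pd, uint32_t *hw_regs,
                        uint32_t offset, uint32_t val)
{
    struct cfg_filter *f = find_filter(pd, offset);

    assert((offset & 3) == 0);
    if (f)
        f->shadow = val;
    else
        hw_regs[offset / 4] = val;
}

/* Read path: a filtered register returns the emulated value. */
static uint32_t cfg_read32(struct pci_dev_sketch *pd,
                           const uint32_t *hw_regs, uint32_t offset)
{
    struct cfg_filter *f = find_filter(pd, offset);

    assert((offset & 3) == 0);
    return f ? f->shadow : hw_regs[offset / 4];
}
```

With a filter registered at offset 0x28, the write/read-back probe that the kernel uses to size the window sees a writable register, while unfiltered offsets still go straight to the hardware.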
2015-10-08hw/bt.c: Timeout messages when bt interface isn't functionalAlistair Popple1-9/+17
During system bring up we may not have a properly functioning IPMI interface. This prevents skiboot from completing the boot process, as it waits for certain BT messages to complete before continuing. This patch alters the BT message timeouts to ensure messages time out in the case of a non-responsive BT interface. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-08interrupts: Convert P8 IRQ assignment to functionsAlistair Popple4-30/+120
Interrupts on P8 are currently hard-coded using macros in include/interrupts.h. The new P8NVL processor has an extra PHB meaning it supports 4 PHBs in total which leads to the following assert fail when booting P8NVL based systems: [6614913194,3] register IRQ source overlap ! [6620562844,3] new: 2000..27f7 old: 2000..27f7 [6870377440,0] Assert fail: core/interrupts.c:67:0 This patch converts the existing macros to function calls so that different platforms can support extra PHBs at the expense of a reduced maximum number of chips. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-08Add unit test for timebase functionsStewart Smith2-0/+57
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-08Merge skiboot-5.1.6 release notesStewart Smith1-0/+31
2015-10-08Merge tag 'skiboot-5.1.6' into stable-5.1.7Stewart Smith1-0/+31
Tag skiboot-5.1.6
2015-10-08Add skiboot-5.1.6 release notesskiboot-5.1.6Stewart Smith1-0/+31
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-07pci: Use a fixed numbering of PHBs on OPAL and improve log consistencyBenjamin Herrenschmidt5-45/+71
On P8, we calculate the OPAL ID of the PHB as a function of the physical chip number and PHB index on that chip. P7 continues using "allocated" numbers for now. We also consistently print the PHB ID as a 4-digit hex number which facilitates decoding it, and print the chip:index location in the probe code to make it easier to correlate log entries. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [stewart@linux.vnet.ibm.com: use next_chip rather than get_chip] Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-07chiptod: chiptod code improvementsVipin K Parashar2-116/+114
This patch makes the changes below to the chiptod code to improve quality.
Changes in hw/chiptod.c:
- Uses pr_fmt macro for tagging log messages
- Simplifies if conditions
- Removes extra white spaces
Changes in hw/fsp/fsp-chiptod.c:
- Uses pr_fmt macro for tagging log messages
Signed-off-by: Vipin K Parashar <vipin@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-07hw/fake-rtc: Add support for emulated rtc clockVaibhav Jain1-9/+50
This patch adds support for an emulated rtc clock on top of the existing fake rtc implementation for the generic platform (e.g. BML). Presently a fake rtc clock is initialized when a reserved region named 'ibm,fake-rtc' is present in the boot device tree. This mem-region points to the initial value of BCD-coded date-time values. However, as this region is in system memory, it is not updated with time. This results in an error from the hwclock tool, which tries to detect a change in the system clock and then complains "Timed out waiting for time change." The patch overcomes this issue by emulating an rtc clock whose date-time values are calculated from the difference between the current timebase and its value when the initial epoch was assigned. The initial epoch is set from the values in the "ibm,fake-rtc" memory region. Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
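The calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the code from hw/fake-rtc: the struct and function names are hypothetical, the timebase frequency of 512 MHz is assumed, and the BCD decoding of the initial value is left out (the epoch is taken as plain seconds).

```c
#include <assert.h>
#include <stdint.h>

#define TB_HZ 512000000ULL /* assumed timebase frequency */

/* Hypothetical state captured once, when the fake rtc is initialised. */
struct fake_rtc_sketch {
    uint64_t epoch_secs;  /* initial time, already decoded to seconds */
    uint64_t tb_at_init;  /* timebase value when the epoch was assigned */
};

/* Current time = initial epoch + timebase ticks elapsed since then. */
static uint64_t fake_rtc_now(const struct fake_rtc_sketch *rtc,
                             uint64_t tb_now)
{
    assert(tb_now >= rtc->tb_at_init);
    return rtc->epoch_secs + (tb_now - rtc->tb_at_init) / TB_HZ;
}
```

Because the result advances with the timebase, a tool that polls for a change in the clock sees the seconds tick over instead of a frozen value.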
2015-10-07Merge printf format warning fix from skiboot-5.1.6 into stable-5.1.7Stewart Smith1-1/+1
2015-10-07Merge printf format warning fix from skiboot-5.1.6 treeStewart Smith1-1/+1
2015-10-07Fix printf format warningStewart Smith1-1/+1
Fixes: 55ae15b Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-07PHB3: Fix unexpected ER (all) on errinjct by PCI configGavin Shan1-2/+2
This issue was initially found on SRIOV VFs, and I then checked with Chad Larson, who put a lot of effort into sorting it out. Further experiments showed that the issue isn't limited to SRIOV VFs, so it can be seen on non-SRIOV adapters as well. First, I ensured that the outbound request discard interrupt (bit#12) is enabled in the PCI Express Port Interrupt Enable Register (offset: 0x558). I then injected an error to the root complex through the PAPR Error Injection Registers with a PCI config read. Eventually, all (256) PEs were frozen. After clearing the bit, the target PE#0 is frozen as expected. As Chad pointed out, the interrupt ("outbound request discard") is always raised during the error injection, and it is translated to the UTL's primary interrupt, which freezes all (256) PEs. This drops bit#12 of the PCI Express Port Interrupt Enable Register to avoid the UTL's primary interrupt caused by outbound request discard, so that error injection via PCI config read no longer freezes all (256) PEs. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-07Merge branch 'stable-5.1.7'Stewart Smith2-3/+31
2015-10-07PHB3: Remove unnecessary message in phb3_sm_fundamental_reset()Gavin Shan1-2/+1
This removes the unnecessary "Performing PERST..." message from phb3_sm_fundamental_reset(), as a subsequent message already indicates the situation. This also lowers the output level of all messages in this function to DEBUG. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-07PHB3: Retry fundamental resetGavin Shan2-2/+31
When issuing a fundamental reset on the IPR adapter below, which sits behind the root complex, there is a 50% chance that the link fails to come up after the reset. In that case, the adapter's config space is blocked and the adapter is not usable. host# lspci -ns 0004:01:00.0 0004:01:00.0 0104: 1014:034a (rev 01) host# lspci -s 0004:01:00.0 0004:01:00.0 RAID bus controller: IBM PCI-E IPR SAS Adapter (ASIC) (rev 01) This introduces another PHB3 state (PHB3_STATE_FRESET_START) that allows the fundamental reset to be redone if the link doesn't come up in time on the first attempt, improving the robustness of the PHB's fundamental reset. If the link comes up after the first reset, the second reset is not issued at all. Reported-by: Paul Nguyen <nguyenp@us.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-07Add doc/stable-skiboot-rules.txt documenting stable treeStewart Smith1-0/+62
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-07hw/bt.c: Check for timeout after checking for message responseAlistair Popple1-1/+2
When deciding if a BT message has timed out we should first check for a message response. This will ensure that messages will not time out if there was a delay calling the pollers. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
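The ordering matters because a late poll can find both conditions true at once. A minimal sketch, assuming a hypothetical message state (`bt_msg_sketch`) and result enum that are not taken from hw/bt.c:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum bt_result_sketch { BT_PENDING, BT_DONE, BT_TIMED_OUT };

/* Hypothetical in-flight message state. */
struct bt_msg_sketch {
    bool response_ready;  /* the BMC has posted a response */
    uint64_t deadline;    /* timebase value after which we give up */
};

static enum bt_result_sketch bt_poll_sketch(const struct bt_msg_sketch *m,
                                            uint64_t now)
{
    assert(m);

    /* Check for a response first: if the poller itself ran late, a
     * message that was answered in time must not be failed. */
    if (m->response_ready)
        return BT_DONE;

    if (now > m->deadline)
        return BT_TIMED_OUT;

    return BT_PENDING;
}
```

With the checks in the opposite order, a poller that is delayed past the deadline would report a timeout even though the response is already waiting.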
2015-10-07Ensure we run pollers in cpu_wait_job()Stewart Smith1-0/+9
In root causing a bug on AST BMC Alistair found that pollers weren't being run for around 3800ms. This was due to a wonderful accident that's probably about a year or more old where: In cpu_wait_job we have: unsigned long ticks = usecs_to_tb(5); ... time_wait(ticks); While in time_wait(), deciding on if to run pollers: unsigned long period = msecs_to_tb(5); ... if (remaining >= period) { Obviously, this means we never run pollers. Not ideal. This patch ensures we run pollers every 5ms in cpu_wait_job() as well as displaying how long we waited for a job if that wait was >1second. Reported-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
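The unit mismatch is easy to see with numbers. The sketch below assumes a 512 MHz timebase and reimplements the two conversion helpers for illustration; only the comparison itself mirrors the snippet quoted in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TB_HZ 512000000ULL /* assumed timebase frequency */

/* Illustrative reimplementations of the conversion helpers. */
static uint64_t usecs_to_tb(uint64_t us) { return us * (TB_HZ / 1000000ULL); }
static uint64_t msecs_to_tb(uint64_t ms) { return ms * (TB_HZ / 1000ULL); }

/* Mirrors the quoted check: pollers only run when the requested wait is
 * at least one poll period long. */
static bool wait_runs_pollers(uint64_t remaining)
{
    uint64_t period = msecs_to_tb(5);

    assert(period > 0);
    return remaining >= period;
}
```

Under these assumptions a 5 microsecond wait is 2,560 ticks, while the 5 millisecond period is 2,560,000 ticks, so a caller that only ever waits 5 microseconds at a time never satisfies the comparison.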
2015-10-01Add skiboot-5.1.5 release notesskiboot-5.1.5Stewart Smith1-0/+39
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-01Add ability to copy pflash binary to BMC to boot_tests.shStewart Smith1-3/+9
Some BMC firmware versions don't ship pflash. Support a PFLASH_TO_COPY environment variable that points to a pflash binary built for the BMC; the binary is copied over and used to pflash the partition or whole pnor. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-10-01centaur: Add indirect XSCOM supportBenjamin Herrenschmidt3-15/+118
It works just like P8, we copy the code for now rather than make it somewhat common due to our locking differences and to limit the risk close to release. We can refactor later. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-09-30xscom: Fix logging of indirect XSCOM errorsBenjamin Herrenschmidt1-3/+3
We didn't pass the right "is_write" argument for writes and the string used for logging was somewhat confusing. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-09-30PHB3: Fix incorrect commentsGavin Shan1-1/+1
When struct phb3::has_link is set to true, the downstream link of root port is up, not down. This fixes the incorrect comments. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-09-30ipmi-sel: Run power action immediately if host not upJoel Stanley1-4/+16
Our normal sequence for a soft power action (IPMI 'power soft' or 'power cycle') involves receiving a SEL from the BMC, sending a message to Linux's opal platform support which instructs the host OS to shut down, and finally the host requesting OPAL to cut power. When the host is not yet up we will send the message to /dev/null, and no action will be taken. This patch changes that behaviour to perform the action immediately if we know how. Signed-off-by: Joel Stanley <joel@jms.id.au> [stewart@linux.vnet.ibm.com: modify checking of OPAL_BOOT_COMPLETE flag, typo] Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-09-30Add opal_boot_complete to debug descriptorJoel Stanley2-1/+5
This tells us when we've entered the host. First use case is knowing if we can rely on host communication working, such as receiving and acting on an opal_msg. Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [stewart@linux.vnet.ibm.com: use real bit field rather than C bitfield] Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-09-30opal-prd: Increase IPMI timeout to a slightly better valueStewart Smith1-1/+1
We've seen various IPMI timeouts during testing (mainly hit by petitboot) but it seems that 5 seconds is the magic value that matches everywhere. This echoes what we use in petitboot, so at least being consistent with ourselves is a good idea. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-09-30PHB3: Fix wrong PE number in error injectionGavin Shan1-2/+2
We disallow injecting errors to the reserved PE#, which is 255 rather than 0 on PHB3. Because the check used 0, error OPAL_PARAM was returned when injecting an error to PE#0. This fixes the issue by checking against the correct PE number, 255. Reported-by: Pradeep Ramanna <pramann2@in.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
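As an illustration only, the corrected check amounts to something like the sketch below. The macro and function names and the return codes are hypothetical stand-ins, not the identifiers used in the PHB3 code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real return codes and constant. */
#define SKETCH_SUCCESS      0
#define SKETCH_PARAM_ERROR  (-1)
#define SKETCH_NUM_PES      256
#define SKETCH_RESERVED_PE  255  /* reserved PE number on PHB3, per the message */

/* Reject only the reserved PE; PE#0 is a valid injection target. */
static int check_inject_pe(uint32_t pe_no)
{
    assert(pe_no < SKETCH_NUM_PES);
    return pe_no == SKETCH_RESERVED_PE ? SKETCH_PARAM_ERROR : SKETCH_SUCCESS;
}
```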
2015-09-26Add skiboot-5.1.4 release notesskiboot-5.1.4Stewart Smith1-0/+32
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-09-25Rate limit OPAL_MSG_OCC to only one outstanding message to hostStewart Smith1-2/+17
In the event of a lot of OCC events (or many CPU cores), we could send many OCC messages to the host, which if it wasn't calling opal_get_msg really often, would cause skiboot to malloc() additional messages until we ran out of skiboot heap and things didn't end up being much fun. When running certain hardware exercisers, they seem to steal all time from Linux being able to call opal_get_msg, causing these to queue up and get "opalmsg: No available node in the free list, allocating" warnings followed by tonnes of backtraces of failing memory allocations.
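One way to keep a single outstanding message is sketched below. This is an assumption-laden illustration, not the skiboot implementation: the struct, the function names, and in particular the choice to coalesce to the most recent event while a message is in flight are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical rate-limiter state for one message type. */
struct occ_rate_limit {
    bool in_flight;     /* a message is queued and not yet consumed by the host */
    bool pending;       /* a newer event arrived while one was in flight */
    int pending_value;  /* most recent event seen while waiting */
    int last_sent;      /* payload of the last message handed to the host */
    int sent_count;     /* number of messages actually queued */
};

static void occ_send_now(struct occ_rate_limit *s, int value)
{
    s->in_flight = true;
    s->last_sent = value;
    s->sent_count++;
}

/* Called for every OCC event: queue at most one message to the host. */
static void occ_queue_event(struct occ_rate_limit *s, int value)
{
    assert(s);
    if (s->in_flight) {
        /* No allocation: remember only the latest event. */
        s->pending = true;
        s->pending_value = value;
        return;
    }
    occ_send_now(s, value);
}

/* Called when the host has consumed the outstanding message. */
static void occ_msg_consumed(struct occ_rate_limit *s)
{
    assert(s);
    s->in_flight = false;
    if (s->pending) {
        s->pending = false;
        occ_send_now(s, s->pending_value);
    }
}
```

Because nothing is allocated while a message is outstanding, a host that is slow to call opal_get_msg can no longer drive the firmware heap to exhaustion through this path.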
2015-09-22Improve debug/pr_fmt for libporeStewart Smith1-1/+3
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-09-22Ensure reserved memory ranges are exposed correctly to host (fix corrupted SLW image)Stewart Smith4-14/+29
Memory regions in skiboot have an interesting life cycle. First, we get a bunch from the initial device tree or hdat specifying some existing reserved ranges (as well as adding some of our own if they're missing) but we also get ranges for the entirety of RAM. The idea is that we can do node local allocations for per node resources (which we do) and then, just prior to booting linux, we copy the reserved memory regions to expose to linux along with a set of reserved regions to cover the node local allocations. The problem was that mem_range_is_reserved() was wanting subtly different semantics for memory region type than region_is_reserved() provided. That is, we were overriding the meaning of REGION_SKIBOOT_HEAP to mean both "this is reserved by skiboot" *and* "this is a memory region that covers all of memory and will be shrunk to cover just the memory we have allocated for it just before we boot the payload (linux)". So what would happen is we would ask "hey, is the memory holding the SLW image reserved?" and we'd get the answer of "yes" but referring to the memory region that covers the entirety of memory in a NUMA node, *not* meaning our intent of "this will be reserved when we start linux". To fix this, introduce a new memory region type REGION_MEMORY. This has the semantics of a memory region that covers a block of memory that we can allocate from (using local_alloc) and that the part that was allocated will be passed to linux as reserved, but that the entire range will not be reserved. So our new semantics are:
- region_is_reservable() is true if the region *MAY* be reserved (i.e. it is one of the regions that cover the whole of memory OR is explicitly reserved)
- region_is_reserved() is true if the region *WILL* be reserved (i.e. it is explicitly reserved)
This way we check that the SLW image is explicitly reserved and if it isn't, we reserve it. Fixes: 58033e44 Acked-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
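The two predicates can be sketched as below. REGION_MEMORY and REGION_SKIBOOT_HEAP are named in the message; the other enum members, the struct, and the function bodies are assumptions made for illustration and may not match the real definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced, partly hypothetical set of region types. */
enum region_type_sketch {
    REGION_MEMORY,        /* covers a whole block of RAM; only the allocated part gets reserved */
    REGION_SKIBOOT_HEAP,  /* reserved by skiboot */
    REGION_RESERVED,      /* assumed: explicitly reserved range */
    REGION_OS,            /* assumed: memory handed to the OS */
};

struct mem_region_sketch {
    enum region_type_sketch type;
};

/* True if the region *MAY* be reserved. */
static bool region_is_reservable(const struct mem_region_sketch *r)
{
    assert(r);
    return r->type != REGION_OS;
}

/* True if the region *WILL* be reserved in full. */
static bool region_is_reserved(const struct mem_region_sketch *r)
{
    assert(r);
    return r->type != REGION_OS && r->type != REGION_MEMORY;
}
```

A region that covers all of a node's RAM is therefore reservable but not reserved, which is the distinction needed when asking whether the SLW image is already covered by an explicit reservation.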
2015-09-15Add skiboot-5.1.3 release notesskiboot-5.1.3Stewart Smith1-0/+92
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-09-15PCI: Clear error bits after changing MPSGavin Shan1-3/+19
Changing MPS on a PCI upstream bridge might cause error bits to be set on downstream endpoints when the system boots into Linux, as the case below shows: host# lspci -vvs 0001:06:00.0 0001:06:00.0 Ethernet controller: Broadcom Corporation \ NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10) : DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend- : CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ This clears those error bits in AER and PCIe capability after MPS is changed. With the patch applied, no more error bits are seen. Reported-by: John Walthour <jwalthour@us.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2015-09-15platforms/astbmc: Move prd_init calls to astbmc_early_init()Jeremy Kerr4-5/+2
Currently, most astbmc platforms do their own call to prd_init(), but garrison is out-of-sync. This change moves the prd_init call to astbmc_early_init, so we don't need to enable it on every platform. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>