AgeCommit message (Collapse)AuthorFilesLines
2017-09-19skiboot-5.4.7 release notesStewart Smith1-0/+30
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> (cherry picked from commit 17661bef0e0968e60e0938e646e6d3ab0e201d46) Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-19add skiboot-5.1.21 release notesStewart Smith1-0/+24
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> (cherry picked from commit 7d64a8b4daa00a78e49493668ad4fd6789bfc883) Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-18hdat: fix parsing of P8 hdatStewart Smith2-6/+3
Also fixes hdat_to_dt test cases. Fixes: ad484081ef8a51811e7902aec436fa8f1ca9604a Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-15npu2: hw-procedures: Add settings to PHY_RESETReza Arbab1-0/+10
Set a few new values in the PHY_RESET procedure, as specified by our updated programming guide documentation. Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-15hdata: Parse extra NVLink infoOliver O'Halloran2-1/+32
Add parsing for the link speed information and the OCC GPU presence flags. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-15hdata: Parse NVLink informationOliver O'Halloran4-2/+202
Add the per-chip structures that describe how the A-Bus/NVLink/OpenCAPI phy is configured. This generates the npu@xyz nodes for each chip on systems that support it. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-15hdata: Parse IOSLOT informationOliver O'Halloran2-0/+470
Add structure definitions that describe the physical PCIe topology of a system and parse them into the device-tree based PCIe slot description. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-15hdata: Add xscom_to_pcrd()Oliver O'Halloran1-0/+39
Iterating the SPPCRD structures (per chip data) is a fairly common operation in the HDAT parser. Iterating the tuples directly is somewhat irritating since we need to check for disabled chips, etc. on every pass. A better way to handle this is to iterate through the xscom nodes (generated from the SPPCRD data) and map from the xscom node to the originating structure. This patch adds a function to do that. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-15hdata: Add an idata array iteratorOliver O'Halloran2-0/+87
Adds HDIF_get_iarray() which retrieves and validates an internal array header and HDIF_iarray_for_each() for walking the individual array entries. This reduces the amount of get-then-check boilerplate that we have with the existing HDIF_get_iarray_item() method for iterating internal data arrays. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
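The validate-once-then-iterate pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `sketch_iarray` layout, the function names and the macro are all hypothetical, use host byte order, and do not reproduce the real HDIF structures or the actual HDIF_get_iarray() / HDIF_iarray_for_each() signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical array header; the real HDIF layout is not reproduced here. */
struct sketch_iarray {
	uint32_t offset;  /* bytes from the header to the first entry */
	uint32_t count;   /* number of entries */
	uint32_t stride;  /* bytes between consecutive entries */
};

/* Validate once: return the header if every entry fits in buf, else NULL. */
static const struct sketch_iarray *sketch_get_iarray(const void *buf, size_t len)
{
	const struct sketch_iarray *arr = buf;

	if (!buf || len < sizeof(*arr))
		return NULL;
	if (arr->offset < sizeof(*arr) || arr->offset > len)
		return NULL;
	if (arr->count && arr->stride == 0)
		return NULL;
	if ((uint64_t)arr->count * arr->stride > len - arr->offset)
		return NULL;
	return arr;
}

/* Return entry i of an array that has already been validated. */
static const void *sketch_iarray_item(const struct sketch_iarray *arr, uint32_t i)
{
	assert(i < arr->count);
	return (const uint8_t *)arr + arr->offset + (size_t)i * arr->stride;
}

/* Walk every entry; no per-item error checking is needed in the loop body. */
#define sketch_iarray_for_each(arr, idx, ent)			\
	for ((idx) = 0;						\
	     (idx) < (arr)->count &&				\
	     ((ent) = sketch_iarray_item((arr), (idx)), 1);	\
	     (idx)++)
```

With this shape the caller checks for NULL once after the getter and then loops, rather than fetching and checking each item individually.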
2017-09-15libc: add strnlen()Oliver O'Halloran2-0/+14
Sometimes handy. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
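For reference, strnlen() returns the length of a string without ever reading past a caller-supplied bound. The sketch below shows the usual semantics; it is deliberately named `sketch_strnlen` because it is an illustration and not the code this commit added to skiboot's libc.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative only: length of s, examining at most maxlen bytes. */
static size_t sketch_strnlen(const char *s, size_t maxlen)
{
	size_t len = 0;

	assert(s != NULL);
	while (len < maxlen && s[len] != '\0')
		len++;
	return len;
}
```

If no terminator is found within the bound, the result equals `maxlen`, which lets callers detect unterminated fixed-width fields.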
2017-09-15core: make is_rodata test-friendlyOliver O'Halloran1-3/+10
Add a dummy is_rodata() implementation for use inside test code. Currently we don't need it to check whether the given pointer is actually read-only, but someone might want it to work properly in the future. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-15convert new witherspoon to dt helperOliver O'Halloran1-1/+1
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-15witherspoon: Deprecate manual npu creationOliver O'Halloran1-4/+54
In the future we will always create the npu nodes based on what's in the HDAT. For now we separate witherspoon into an old and new platform where the old platform will assume a sequoia planar and create the relevant NPU nodes for that planar. If you have a redbud system this will be broken, but this should be fine for most cases. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-15platform/witherspoon: Add slot names to tableOliver O'Halloran2-22/+68
Add the other PCIe devices to the witherspoon slot tables. This provides a fall back for systems without IOSLOT information in the HDAT. This is mainly to allow DD1 systems to continue being useful. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-15astbmc: Add methods for handling DT based slotsOliver O'Halloran2-0/+49
Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-15core/pci-slots: Move slot-label construction to a helperOliver O'Halloran3-32/+35
Move this out of the astbmc specific part into a generic helper. This allows us to use it more commonly. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-15core/pcie-slots: Make dynamic slot creation genericOliver O'Halloran3-62/+66
astbmc has some code to handle devices that are behind a "slot" on a riser card that can't be added to the static slot tables for a system. We probably want to use this code outside the slot table handling so move it somewhere generic and rework it so slot table specifics aren't buried inside it. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-15core/pci-dt-slot: Represent PCIe slots in the devicetreeOliver O'Halloran3-0/+238
In P9 we get information about the physical PCIe slot topology through the HDAT. As a rule we never directly consume the HDAT inside of Skiboot and we always parse and incorporate the data from HDAT into the Skiboot device tree. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> [stewart@linux.vnet.ibm.com: add (C) header] Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-15hw/slw.c: Offline code still uses p8 bitsBalbir Singh1-0/+3
I'm seeing an infinite loop while hot-unplugging a CPU. This is a workaround until we do the right thing for p9, and it may be a candidate for backporting. For now the patch just skips calling p8_pore* on p9 machines. When trying to hot-unplug core ID 20, the messages repeated in the infinite loop are: [ 740.250192896,3] LIBPORE: Core ID = 20 is not within valid range of [0;15] [ 740.250230176,3] SLW: Failed to set spr for CPU 51 Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-15pflash: Fix erase command for unaligned start addressSuraj Jitindar Singh1-1/+1
The erase_range() function handles erasing the flash for a given start address and length, and can handle an unaligned start address and length. However, in the unaligned start address case we are incorrectly calculating the remaining size, which can lead to incomplete erases. If we're going to update the remaining size based on what the start address was, then we want to do that before we override the original start address. So rearrange the code so that this is indeed the case. Reported-by: Pridhiviraj Paidipeddi <ppaidipe@in.ibm.com> Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: Cyril Bur <cyril.bur@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
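The ordering issue can be illustrated with a small sketch that splits an erase request into an unaligned head, a run of whole blocks, and a tail. The struct and function names are hypothetical and this is not pflash's erase_range(); the point is only that the head length and the remaining size are derived from the original start before start is moved.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of how one erase request is split up. */
struct sketch_erase_plan {
	uint32_t head_len;      /* partial block erased before whole blocks */
	uint32_t aligned_start; /* address where whole-block erasing begins */
	uint32_t aligned_len;   /* bytes covered by whole blocks */
	uint32_t tail_len;      /* partial block left at the end */
};

static struct sketch_erase_plan sketch_plan_erase(uint32_t start, uint32_t size,
						  uint32_t granule)
{
	struct sketch_erase_plan plan = { 0, start, 0, 0 };
	uint32_t mask = granule - 1;

	/* The erase granule must be a non-zero power of two. */
	assert(granule && (granule & mask) == 0);

	if (start & mask) {
		/* Head length is computed from the ORIGINAL start address. */
		uint32_t head = granule - (start & mask);

		if (head > size)
			head = size;
		plan.head_len = head;
		/* Shrink the remaining size first... */
		size -= head;
		/* ...and only then advance start past the head. */
		start += head;
	}

	plan.aligned_start = start;
	plan.aligned_len = size & ~mask;
	plan.tail_len = size & mask;
	return plan;
}
```

For example, with a 0x1000 granule, a request for 0x3000 bytes at 0x1800 splits into a 0x800 head, 0x2000 bytes of whole blocks starting at 0x2000, and a 0x800 tail, which together cover the full request.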
2017-09-15phb4: Use link if degradedMichael Neuling1-1/+7
In the recent change: 3f936bae97 phb4: Retrain link if degraded We retrain if the link is degraded. We do 3 retries to get an optimal link. Unfortunately if the last retry fails, we mark the PHB as bad and don't use it. Hence that PHB is lost even though it actually trained (just degraded). This fixes the problem by printing an error message (as below) but still marking the PHB as good. [ 7.179320404,3] PHB#0005[0:5]: LINK: Link degraded [ 8.387346665,3] PHB#0005[0:5]: LINK: Link degraded [ 10.078409137,3] PHB#0005[0:5]: LINK: Link degraded [ 11.281477269,3] PHB#0005[0:5]: LINK: Link degraded [ 11.283123885,3] PHB#0005[0:5]: LINK: Degraded but no more retries Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
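The resulting policy (retry while degraded, but keep a degraded link once retries run out) can be sketched as below. Everything here is hypothetical: the enum, the callback type, the scripted trainer used as a test double, and the constant name. Only the count of 3 retries and the "degraded is still usable" outcome come from the commit message, and the separate handling of links that never train is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_LINK_RETRIES 3	/* hypothetical name; the log states 3 retries */

enum sketch_link { SKETCH_LINK_DOWN, SKETCH_LINK_DEGRADED, SKETCH_LINK_OPTIMAL };

/* Hypothetical callback: reset, retrain once, and report the result. */
typedef enum sketch_link (*sketch_train_fn)(void *ctx);

struct sketch_result {
	bool usable;		/* should the PHB be marked good? */
	enum sketch_link state;	/* state after the final attempt */
	int attempts;		/* initial attempt plus retries */
};

static struct sketch_result sketch_bring_up(sketch_train_fn train, void *ctx)
{
	struct sketch_result res = { false, SKETCH_LINK_DOWN, 0 };
	int retries_left = SKETCH_LINK_RETRIES;

	for (;;) {
		res.state = train(ctx);
		res.attempts++;
		if (res.state == SKETCH_LINK_OPTIMAL || retries_left == 0)
			break;
		retries_left--;
	}

	/* A degraded link is still used; only a link that is down is not. */
	res.usable = (res.state != SKETCH_LINK_DOWN);
	return res;
}

/* Test double: replays scripted results, repeating the last one forever. */
struct sketch_script {
	const enum sketch_link *results;
	int len;
	int pos;
};

static enum sketch_link sketch_scripted_train(void *ctx)
{
	struct sketch_script *s = ctx;
	int idx = s->pos < s->len ? s->pos : s->len - 1;

	s->pos++;
	return s->results[idx];
}
```

A link that stays degraded is attempted four times in total (one initial attempt plus three retries), matching the four "Link degraded" lines in the log above, and is then reported as usable.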
2017-09-12phb4: Retrain link if degradedMichael Neuling1-1/+133
On P9 Scale Out (Nimbus) DD2.0 and Scale Up (Cumulus) DD1.0 (and below) the PCIe PHY can lock up, causing training issues. This can cause a degradation in speed or width in ~5% of training cases (depending on the card). This is fixed in later chip revisions. This issue can also cause PCIe links to not train at all, but this case is already handled. This patch checks if the PCIe link has trained optimally and if not, does a full PHB reset (to fix the PHY lockup) and retrains. One complication is that some devices are known to train degraded unless device-specific configuration is performed. Because of this, we only retrain when the device is in a whitelist. All devices in the current whitelist have been tested on a P9DSU/Boston, ZZ and Witherspoon. We always gather information on the link and print it in the logs even if the card is not in the whitelist. For testing purposes, there's an nvram option to retry all PCIe cards and all P9 chips when a degraded link is detected. The new option is 'pci-retry-all=true' which can be set using: nvram -p ibm,skiboot --update-config pci-retry-all=true This option may increase the boot time if used on a badly behaving card. Signed-off-by: Michael Neuling <mikey@neuling.org> [stewart@linux.vnet.ibm.com: fix Cumulus VERS_MAJ r.e. Mikey mail] Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12phb4: Make link retries a #defineMichael Neuling2-1/+2
Make link retries a #define rather than open coding it in the PHB4 init code. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12pci: Make pci_wait_crs() globalMichael Neuling2-1/+2
We are going to need pci_wait_crs() in the PHB4 code so make it global. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12phb4: Split phb4_get_link_state() into a new functionMichael Neuling1-6/+21
Split phb4_get_link_state() into a new function so that it can be reused to get info on the speed and width of the link. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12phb4: Move nvram read of pci-eeh-mmio initMichael Neuling1-1/+3
Move nvram read to the PHB4 init code so that it's only read once, rather than every time we go through PHB reset. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12phb4: Remove stable retriesMichael Neuling3-9/+0
This code was never used (since retries is set to 0), it's not very useful and it makes the code harder to read. So let's just remove it. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12xive: Fix opal_xive_dump_tm() to access W2 properlyBenjamin Herrenschmidt1-1/+7
The HW only supported limited access sizes. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12xive: Documentation updatesBenjamin Herrenschmidt1-23/+107
Correct the documentation in a couple of places to match the actual behaviour and improve bits and pieces of it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12npu2: Add vendor cap for IRQ testingSam Bobroff1-0/+28
Provide a way to test recoverable data link interrupts via a new vendor capability byte. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Acked-By: Alistair Popple <alistair@popple.id.au> ====== v2 -> v3: ====== * Corrected name of NPU RING (no 2). [Andrew Donnellan] * Corrected spelling of device. [Andrew Donnellan] hw/npu2.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12npu2: Enable recoverable data link (no-stall) interruptsSam Bobroff2-15/+131
Allow the NPU2 to trigger "recoverable data link" interrupts. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Acked-By: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12npu2: Update NPU to NPU2 in comments and messagesSam Bobroff1-13/+13
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12xive: Make opal_xive_allocate_irq() properly try all chipsBenjamin Herrenschmidt1-17/+37
When requested via OPAL_XIVE_ANY_CHIP, we need to try all chips. We first try the current one (on which the caller sits) and if that fails, we iterate all chips until the allocation succeeds. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
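The search order described above (the caller's chip first, then every other chip) can be sketched as follows. The chip table, the per-chip allocator and the `SKETCH_ANY_CHIP` constant are hypothetical stand-ins; this is not the opal_xive_allocate_irq() implementation.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_ANY_CHIP 0xffffffffu	/* stand-in for OPAL_XIVE_ANY_CHIP */
#define SKETCH_NO_IRQ   (-1)

/* Hypothetical per-chip allocator state. */
struct sketch_chip {
	uint32_t id;
	int free_irqs;	/* how many interrupts are still available */
	int next_irq;	/* next interrupt number to hand out */
};

static int sketch_alloc_on_chip(struct sketch_chip *c)
{
	if (c->free_irqs == 0)
		return SKETCH_NO_IRQ;
	c->free_irqs--;
	return c->next_irq++;
}

static int sketch_allocate_irq(struct sketch_chip *chips, int nchips,
			       uint32_t chip_id, uint32_t current_chip)
{
	int i, irq;

	/* A specific chip was requested: try only that one. */
	if (chip_id != SKETCH_ANY_CHIP) {
		for (i = 0; i < nchips; i++)
			if (chips[i].id == chip_id)
				return sketch_alloc_on_chip(&chips[i]);
		return SKETCH_NO_IRQ;
	}

	/* Any chip: prefer the chip the caller is running on. */
	for (i = 0; i < nchips; i++) {
		if (chips[i].id != current_chip)
			continue;
		irq = sketch_alloc_on_chip(&chips[i]);
		if (irq != SKETCH_NO_IRQ)
			return irq;
	}

	/* Fall back to every other chip until one succeeds. */
	for (i = 0; i < nchips; i++) {
		if (chips[i].id == current_chip)
			continue;
		irq = sketch_alloc_on_chip(&chips[i]);
		if (irq != SKETCH_NO_IRQ)
			return irq;
	}
	return SKETCH_NO_IRQ;
}
```

Trying the local chip first keeps the common case cheap, while the fallback loop ensures an "any chip" request only fails when every chip is exhausted.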
2017-09-12xive: Fix initialization & cleanup of HW thread contextsBenjamin Herrenschmidt1-36/+38
Instead of trying to "pull" everything and clear VT (which didn't work and caused some FIRs to be set), just clear and then set the PTER thread enable bit. This has the side effect of completely resetting the corresponding thread context. This fixes the spurious XIVE FIRs reported by PRD and fircheck. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12xive: Add debug option for detecting misrouted IPI in emulationBenjamin Herrenschmidt1-15/+116
This is high overhead so we don't enable it by default even in debug builds, it's also a bit messy, but it allowed me to detect and debug a locking issue earlier so it can be useful. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12xive: Increase the interrupt "gap" on debug buildsBenjamin Herrenschmidt1-2/+7
We normally allocate IPIs from 0x10. Make that 0x1000 on debug builds to limit the chances of overlapping with Linux interrupt numbers which makes debugging code that confuses them easier. Also add a warning in emulation if we get an interrupt in the queue whose number is below the gap. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
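A minimal sketch of a build-dependent gap and the matching "below the gap" check might look like this. The macro and function names are hypothetical; only the values 0x10 and 0x1000 come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names: first interrupt number handed out for IPIs. */
#ifdef SKETCH_DEBUG
#define SKETCH_IRQ_GAP 0x1000u	/* debug builds: larger gap */
#else
#define SKETCH_IRQ_GAP 0x10u	/* normal builds */
#endif

/* True if irq falls inside the gap and so was never allocated as an IPI. */
static bool sketch_irq_below_gap(uint32_t irq)
{
	return irq < SKETCH_IRQ_GAP;
}
```

An emulation path could call such a check on every interrupt pulled from a queue and warn when it returns true.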
2017-09-12xive: Fix locking around cache scrub & watchBenjamin Herrenschmidt1-0/+19
Thankfully the missing locking only affects debug code and init code that doesn't run concurrently. Also adds a DEBUG option that checks the lock is properly held. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12xive: Use symbolic constantBenjamin Herrenschmidt1-1/+1
Cosmetic fix. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12xive: Workaround HW issue with scrub facilityBenjamin Herrenschmidt1-1/+32
Without this, we sometimes don't observe from a CPU the values written to the ENDs or NVTs via the cache watch. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12xive: Add exerciser for cache watch/scrub facility in DEBUG buildsBenjamin Herrenschmidt1-45/+96
This runs 1000 iterations exercising the cache watch and scrub facilities on VPs and ENDs at boot. This exposes a HW bug with the scrub which will be worked around in a subsequent patch. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12xive: Make assertion in xive_eq_for_target() more informativeBenjamin Herrenschmidt1-1/+5
If this fails, print a bit more info about it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12xive: Add debug code to check initial cache updatesBenjamin Herrenschmidt1-0/+47
This adds debug code to check that the initial updates of in-memory VPs and EQs via the cache watch and cache scrub facilities have worked properly. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12xive: Ensure pressure relief interrupts are disabledBenjamin Herrenschmidt2-0/+3
We don't use them and we hijack the VP field with their configuration to store the EQ reference, so make sure the kernel or guest can't turn them back on by doing MMIO writes to ACK#. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12xive: Don't try setting the reserved ACK# field in VPsBenjamin Herrenschmidt1-4/+1
That doesn't work, the HW doesn't implement it in the cache watch facility anyway. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12xive: Remove useless memory barriers in VP/EQ initsBenjamin Herrenschmidt1-2/+0
We no longer update "live" memory structures, we use a temporary copy on the stack and update the actual memory structure using the cache watch, so those barriers are pointless. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12skiboot/hw/imc: Add nest_memory region to "exports" nodeMadhavan Srinivasan1-0/+22
Exports the In-Memory Collection counter nest memory to the OS. This allows the OS to view the nest counter region directly. This helps in nest microcode debug and to check counter raw value. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12skiboot/skiboot.tcl: Add imc device nodes to skiboot.tclMadhavan Srinivasan2-2/+120
Add In-Memory Collection counter dummy nodes to the skiboot.tcl to aid code testing in mambo for both OPAL and Kernel side enablement. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12skiboot/hw/imc: Add NULL pointer checkMadhavan Srinivasan1-0/+4
Minor cleanup to avoid null pointer access. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12npu2: Implement FLRReza Arbab3-1/+34
Add basic handling of FLR (function level reset) by porting the changes from commit b74841db759d ("npu: Implement FLR") to npu2. The only difference for npu2 is that we track the reset state explicitly with a link flag instead of inferring it from dev->procedure_{status,number,step,data}. Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12npu2: Add npu2_clear_link_flag()Reza Arbab2-0/+8
Add a complement to npu2_set_link_flag(). Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
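A set/clear pair over a flags field typically looks like the sketch below. The device struct, flag names and helpers are hypothetical illustrations of the idiom; the real npu2_set_link_flag() and npu2_clear_link_flag() may do more than update an in-memory field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration only. */
#define SKETCH_LINK_FLAG_A     0x01u
#define SKETCH_LINK_FLAG_RESET 0x02u	/* e.g. explicit reset-state tracking */

struct sketch_dev {
	uint8_t link_flags;
};

static void sketch_set_link_flag(struct sketch_dev *dev, uint8_t flag)
{
	dev->link_flags |= flag;
}

/* Complement of the setter: clears only the requested bits. */
static void sketch_clear_link_flag(struct sketch_dev *dev, uint8_t flag)
{
	dev->link_flags &= (uint8_t)~flag;
}

static bool sketch_link_flag_is_set(const struct sketch_dev *dev, uint8_t flag)
{
	return (dev->link_flags & flag) == flag;
}
```

Tracking reset state with an explicit flag like this avoids having to infer it from several unrelated procedure fields.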