AgeCommit message (Collapse)AuthorFilesLines
2019-03-20npu2/hw-procedures: Fix parallel zcal for opencapiFrederic Barrat3-5/+10
For opencapi, we currently do impedance calibration when initializing the PHY for the device, which could run in parallel if we were rich and had multiple opencapi devices. But if 2 devices are on the same obus, the 2 calibration sequences could overlap, which likely yields bad results and is useless anyway since it only needs to be done once per obus. This patch splits the opencapi PHY reset in 2 parts: - an 'init' part called serially at boot. That's when zcal is done. If we have 2 devices on the same socket, the zcal won't be redone, since we're called serially and we'll see it has already been done for the obus - a 'reset' part called during fundamental reset as a prereq for link training. It does the PHY setup for a set of lanes and the dccal. The PHY team confirmed there's no dependency between zcal and the other reset steps and it can be moved earlier. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
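The once-per-obus guard described in this commit can be sketched roughly as follows. This is only an illustration under stated assumptions, not the skiboot code: `NUM_OBUS_SKETCH`, `struct dev_sketch`, `phy_init_sketch()` and the `zcal_runs` counter are hypothetical names, and the real calibration sequence is replaced by a counter increment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: number of obus slots tracked (assumed for illustration). */
#define NUM_OBUS_SKETCH 4

struct dev_sketch {
	int obus;	/* obus the device sits on (assumed to be known) */
};

static bool zcal_done[NUM_OBUS_SKETCH];	/* one flag per obus */
static int zcal_runs;			/* counts calibrations, for the demo */

/* 'init' part: called serially at boot, so no locking is shown here. */
void phy_init_sketch(const struct dev_sketch *dev)
{
	assert(dev->obus >= 0 && dev->obus < NUM_OBUS_SKETCH);
	if (zcal_done[dev->obus])
		return;		/* already calibrated through another device */
	zcal_runs++;		/* stand-in for the real zcal sequence */
	zcal_done[dev->obus] = true;
}
```

Because the init part runs serially, a plain flag per obus is enough in this sketch; the later 'reset' part would never touch the flag.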
2019-03-20npu2-hw-procedures: Fix zcal in mixed opencapi and nvlink modeFrederic Barrat1-3/+21
The zcal procedure needs to be run once per obus. We keep track of which obus is already calibrated in an array indexed by the obus number. However, the obus number is inferred from the brick index, which works well for nvlink but not for opencapi. Create an obus_index() function, which, from a device, returns the correct obus index, irrespective of the device type. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
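A helper of the kind described here might look like the sketch below. The brick-to-obus mapping is an assumption made for illustration (it is not stated in the commit message), and `struct npu_dev_sketch`, `enum dev_type_sketch` and `obus_index_sketch()` are hypothetical names rather than skiboot definitions.

```c
#include <assert.h>

enum dev_type_sketch { DEV_NVLINK, DEV_OPENCAPI };

struct npu_dev_sketch {
	enum dev_type_sketch type;
	int brick_index;
};

/*
 * ASSUMED mapping, for illustration only:
 *   nvlink:   bricks 0-2 -> obus slot 0, bricks 3-5 -> obus slot 1
 *   opencapi: bricks 2-3 -> obus slot 0, bricks 4-5 -> obus slot 1
 * The same brick index can therefore land on a different obus
 * depending on the device type.
 */
int obus_index_sketch(const struct npu_dev_sketch *dev)
{
	assert(dev->brick_index >= 0 && dev->brick_index <= 5);
	if (dev->type == DEV_NVLINK)
		return dev->brick_index <= 2 ? 0 : 1;
	return dev->brick_index <= 3 ? 0 : 1;
}
```

Under this assumed mapping, brick index 3 resolves to a different obus for nvlink than for opencapi, which is the kind of mismatch the patch guards against.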
2019-03-20npu2-hw-procedures: Don't set iovalid for opencapi devicesFrederic Barrat1-0/+3
set_iovalid() is called on the PHY reset path. The hw logic it touches is meaningless for opencapi. It does no harm as long as all the links under the NPU are in opencapi mode, but when mixing opencapi and nvlink, we'll be in trouble: the code finds which bit to modify based on the brick index, which varies depending on the mode. So calling that function on an opencapi device may modify an nvlink brick! For example, for brick index 3. So we simply avoid doing anything when calling set_iovalid() for an opencapi device. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-20libffs: Fix string truncation gcc warning.Michal Suchanek1-1/+1
Use memcpy as other libffs functions do. Signed-off-by: Michal Suchanek <msuchanek@suse.de> Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-20skiboot v6.2.3 release notesVasant Hegde1-0/+45
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> (cherry picked from commit 8463ee4bc297fab0181fbb418954c3476a2adbde) Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-20skiboot v6.0.19 release notesVasant Hegde1-0/+37
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> (cherry picked from commit 3d135fe39a6ac509bfa49a9eb9e5f8386fc5109d) Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-18Update skiboot stable tree rulesVasant Hegde1-5/+7
We have a new mailing list (skiboot-stable@lists.ozlabs.org) for handling stable trees! From now on, to submit patches to a stable tree, either send them to the skiboot-stable@lists.ozlabs.org mailing list with the subject prefix [PATCH <stable-version>], or CC skiboot-stable@lists.ozlabs.org when sending patches to the upstream mailing list (skiboot@lists.ozlabs.org). This removes the need to use --suppress-cc or the related --no-cc-* options of git-send-email to strip the "CC" list. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-15p9dsu: Undo slot label name changesDeb McLemore1-16/+16
During some code updates the slot labels were updated to reflect the phb layout, however expectations were that the slot labels be aligned with the riser card slots and not the system planar slots. [stewart: The tale of how we got here is long and varied and not at all clear. The first ESS systems went out with a skiboot v5.9.8 with additional SuperMicro patches. It was probably a slot table, but who knows, we don't have the code so can't check. (It's possible it was all coming in through HDAT instead.) The op-build tree (thus the exact patches) shipped on systems that work correctly seems to not be around anywhere anymore (if it ever was). It was only in skiboot v6.0 that a slot table made it in, and, of course, only having remote machines in random configs, including possibly with riser cards from Briggs&Stratton rather than the ones destined for this system, doesn't make for verifying this at all. It also doesn't help that *consistently* there is *never* any review on slot tables, and we've had things be wrong in the past. Combine this with non-upstream Hostboot patches.] Cc: skiboot-stable@lists.ozlabs.org Cc: Benjamin Mashak <mashak@us.ibm.com> Cc: Michael Lim <youhour@us.ibm.com> Fixes: 64a16ae05bb2 ("p9dsu: Fix slot labels for p9dsu2u") Fixes: 87517c8737b9 ("p9dsu: Fix p9dsu slot tables") Fixes: 31231ed300f2 ("p9dsu: Fix p9dsu default variant") Signed-off-by: Deb McLemore <debmc@linux.ibm.com> [stewart: added more detailed explanation, cc stable] Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-15Drop old Coverity jobs (we build via separate .travis.yml in a branch)Stewart Smith1-24/+2
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-15opal-ci: drop fedora 28Stewart Smith4-41/+30
We're getting close to Fedora 30, and keeping N-1 fedora around for too long doesn't really add much. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-15opal-ci: Drop unneded reference to ubuntu 12.04Stewart Smith1-5/+0
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-15opal-ci: Drop CentOS6 supportStewart Smith2-38/+0
We use the same compiler on our CentOS7 image, and it has the bonus of being able to test against P8 and P9 Mambo. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-15fast-reboot: occ: Call occ_pstates_init() on fast-reset on all machinesShilpasri G Bhat1-2/+4
Commit 815417dcda2e ("init, occ: Initialise OCC earlier on BMC systems") conditionally invoked occ_pstates_init() only on FSP based systems in load_and_boot_kernel(). As a result, the pstate table is re-parsed on FSP systems but skipped on BMC systems during fast-reboot. This patch fixes that by invoking occ_pstates_init() on all machines during fast-reboot. Cc: skiboot-stable@lists.ozlabs.org Fixes: 815417dcda2e ("init, occ: Initialise OCC earlier on BMC systems") Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-15fast-reboot: occ: Remove 'freq-domain-mask' from fast-reboot pathShilpasri G Bhat1-43/+42
OCC can change the pstate table at runtime to modify pstate limits or for characterization purpose. These changes are reflected by re-parsing the pstate table during fast-reboot to update the device-tree. Only relevant pstate DT properties are deleted and newly added during fast-reboot. The device-tree properties like 'freq-domain-mask' and 'domain-runs-at' are currently hard-coded and need not be updated during fast-reboot. So this patch removes them from the fast-reboot path. This patch fixes the below crash: [ 270.313998453,5] OCC: All Chip Rdy after 0 ms [ 270.314148918,3] Duplicate property "freq-domain-mask" in node /ibm,opal/power-mgt [ 270.314208553,0] Aborting! CPU 083c Backtrace: S: 0000000035de3a20 R: 000000003001b480 ._abort+0x4c S: 0000000035de3aa0 R: 0000000030028704 .new_property+0xd8 S: 0000000035de3b30 R: 0000000030028964 .__dt_add_property_cells+0x30 S: 0000000035de3bd0 R: 0000000030042980 .occ_pstates_init+0x7c8 S: 0000000035de3d90 R: 00000000300145f4 .load_and_boot_kernel+0x980 S: 0000000035de3e70 R: 00000000300276b4 .fast_reboot_entry+0x37c S: 0000000035de3f00 R: 0000000030002ac4 reset_fast_reboot_wakeup+0x40 Fixes: b821f8c2a8e3("power-mgmt : occ : Add 'freq-domain-mask' DT property") Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Tested-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-13npu2-opencapi: Fix adapter reset when using 2 adaptersFrederic Barrat3-7/+34
If two opencapi adapters are on the same obus, we may try to train the two links in parallel at boot time, when all the PCI links are being trained. Both links use the same i2c controller to handle the reset signal, so some care is needed to make sure resetting one doesn't interfere with the reset of the other. We need to keep track of the current state of the i2c controller (and use locking). This went mostly unnoticed as you need to have 2 opencapi cards on the same socket and links tended to train anyway because of the retries. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-13npu2-opencapi: Extend delay after releasing reset on adapterFrederic Barrat1-2/+2
Give more time to the FPGA to process the reset signal. The previous delay, 5ms, is too short for newer adapters with bigger FPGAs. Extend it to 250ms. Ultimately, that delay will likely end up being added to the opencapi specification, but we are not there yet. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-13npu2-opencapi: ODL should be in reset when enabledFrederic Barrat1-0/+6
We haven't hit any problem so far, but from the ODL designer, the ODL should be in reset when it is enabled. The ODL remains in reset until we start a fundamental reset to initiate link training. We still assert and deassert the ODL reset signal as part of the normal procedure just before training the link. Asserting is therefore useless at boot, since the ODL is already in reset, but we keep it as it's only a scom write and it's needed when we reset/retrain from the OS. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-13npu2-opencapi: Keep ODL and adapter in reset at the same timeFrederic Barrat1-25/+43
Split the function to assert and deassert the reset signal on the ODL, so that we can keep the ODL in reset while we reset the adapter, therefore having a window where both sides are in reset. It is actually not required with our current DLx at boot time, but I need to split the ODL reset function for the following patch and it will become useful/required later when we introduce resetting an opencapi link from the OS. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-13npu2-opencapi: Rename functions used to reset an adapterFrederic Barrat1-4/+4
This is really to avoid confusion with a later patch and clarify whether we're resetting the ODL or the adapter. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-13npu2-opencapi: Setup perf counters to detect CRC errorsFrederic Barrat2-0/+79
It's possible to set up performance counters for the PLL to detect various conditions for the links in nvlink or opencapi mode. Since those counters are currently unused, let's configure them when an obus is in opencapi mode to detect CRC errors on the link. Each link has two counters: - CRC error detected by the host - CRC error detected by the DLx (NAK received by the host) We also dump the counters shortly after the link trains, but they can be read multiple times through cronus, pdbg or linux. The counters are configured to be reset after each read. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-13npu2-opencapi: Rework ODL register accessFrederic Barrat3-135/+21
ODL registers used to control the opencapi link state have an address built on a base address and an offset for each brick which can be computed instead of hard-coded individually for each brick. Rework how we access the ODL registers, to avoid repeating switch statements all over the place. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
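The idea of computing an ODL register address from a base plus a per-brick offset can be sketched as below. All constants are hypothetical placeholders and are NOT real scom addresses; the assumption that opencapi bricks are numbered 2 to 5, two per obus, is also made only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for illustration: NOT real scom addresses. */
#define ODL_BASE_SKETCH		0x1000ull
#define ODL_OBUS_STRIDE_SKETCH	0x0100ull	/* distance between obuses */
#define ODL_LINK_STRIDE_SKETCH	0x0010ull	/* distance between the 2 ODLs of an obus */

/* Assumes opencapi bricks are numbered 2..5, two per obus. */
uint64_t odl_reg_addr_sketch(uint64_t reg_offset, int brick_index)
{
	assert(brick_index >= 2 && brick_index <= 5);
	uint64_t obus = (uint64_t)(brick_index - 2) / 2;
	uint64_t link = (uint64_t)(brick_index - 2) % 2;

	return ODL_BASE_SKETCH + obus * ODL_OBUS_STRIDE_SKETCH
		+ link * ODL_LINK_STRIDE_SKETCH + reg_offset;
}
```

A single helper like this replaces one switch statement per register, which is the kind of saving the 135 removed lines suggest.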
2019-03-06skiboot v6.2.2 release notesVasant Hegde1-0/+227
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> (cherry picked from commit 5da21e2cc79d9f77d721daf170511fe3e1c027ef) Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-06skiboot v6.0.18 release notesVasant Hegde1-0/+184
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> (cherry picked from commit b90de1aae03c90ab817e2fcfd4a97329d733c4eb) Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-05Makefile: Check -Wno-stringop-truncation is supportedJoel Stanley1-1/+1
The cross compiler on my system throws a fit as it does not understand this option: cc1: error: unrecognized command line option '-Wno-stringop-truncation' [-Werror] Fixes: 918b7233d3bb ("Add -Wno-stringop-truncation for GCC8") Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-05asm/head: move unnecessary code out of headNicholas Piggin2-134/+139
head.S should be for things that must be located in low memory, like boot and interrupt entry. Move some code from there into misc.S that is not called from entry routines. The motivation for this patch is work to run skiboot in virtual memory mode, which does not map head.S code. Even without that motivation, it's still good to keep head.S clean. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-05p9dsu: Fix slot labels for p9dsu2uDeb McLemore1-5/+5
Update the slot labels for the p9dsu2u tables. Signed-off-by: Deb McLemore <debmc@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-05platforms/zz: Re-enable LXVPD slot information parsingOliver O'Halloran1-4/+3
From memory, this was disabled in the distant past since we were waiting for an update to the LXVPD format. It looks like that never happened, so re-enable it for the ZZ platform so that we can get PCI slot location codes on ZZ. Cc: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-05doc: s/stb_init()/secureboot_init()/ to match realityStewart Smith1-1/+1
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-05opal/hmi: set a flag to inform OS that TOD/TB has failed.Mahesh Salgaonkar3-2/+13
Set a flag to inform the OS about a TOD/TB failure as part of the new opal_handle_hmi2 handler. The OS can then use this flag to make sure functions depending on the TB value (e.g. udelay()) are aware that the TB is not ticking. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-05opal/hmi: Don't retry TOD recovery if it is already in failed state.Mahesh Salgaonkar1-9/+22
On TOD failure, all cores/threads receive an HMI; the very first thread that gets the interrupt fixes the TOD, whereas the others just reset their respective HMER error bit and return. But when the TOD is unrecoverable, all the threads try to do TOD recovery one by one, causing threads to spend more time inside OPAL. Set a global flag when the TOD is unrecoverable so that the rest of the threads go back to Linux immediately, avoiding lockups in the system reboot/panic path. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
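The "fail once, then bail out early" behaviour can be sketched as follows. This is a simplified single-threaded illustration under assumptions: all names (`tod_unrecoverable`, `handle_tod_error_sketch()`, `always_fail_recovery()`) are hypothetical, and the locking that real HMI handling needs is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool tod_unrecoverable;	/* set once; never cleared in this sketch */
static int recovery_attempts;	/* demo counter */

/* Stand-in for the real recovery; always fails here to exercise the flag. */
bool always_fail_recovery(void)
{
	recovery_attempts++;
	return false;
}

/* Returns true if the TOD is usable after handling the error. */
bool handle_tod_error_sketch(bool (*recover)(void))
{
	assert(recover != NULL);
	if (tod_unrecoverable)
		return false;	/* skip the costly retry, return to the OS */
	if (recover())
		return true;
	tod_unrecoverable = true;	/* later callers bail out early */
	return false;
}
```

With the flag in place, only the first caller pays for the failed recovery; every later caller returns immediately.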
2019-03-05opal/hmi: Fix double unlock of hmi lock in failure path.Mahesh Salgaonkar1-5/+1
unlock once and goto error_out. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-05zaius: Add BMC descriptionAndrew Jeffery1-1/+20
Frederic reported that Zaius was failing with a NULL dereference when trying to initialise IPMI HIOMAP. It turns out that the BMC wasn't described at all, so add a description. Tested on zaius1, which reached petitboot with the patch applied. Reported-by: Frederic Barrat <fbarrat@linux.ibm.com> Tested-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-01i2c: Fix sparse warnings for type assignmentStewart Smith1-5/+5
Use the correct beXX_to_cpu() macros. core/i2c.c:105:29: warning: incorrect type in assignment (different base types) core/i2c.c:105:29: expected unsigned int [usertype] offset core/i2c.c:105:29: got restricted beint32_t [usertype] subaddr core/i2c.c:110:29: warning: incorrect type in assignment (different base types) core/i2c.c:110:29: expected unsigned int [usertype] offset core/i2c.c:110:29: got restricted beint32_t [usertype] subaddr core/i2c.c:117:23: warning: incorrect type in assignment (different base types) core/i2c.c:117:23: expected unsigned int [usertype] dev_addr core/i2c.c:117:23: got restricted beint16_t [usertype] addr core/i2c.c:118:21: warning: incorrect type in assignment (different base types) core/i2c.c:118:21: expected unsigned int [usertype] rw_len core/i2c.c:118:21: got restricted beint32_t [usertype] size core/i2c.c:119:24: warning: cast from restricted beint64_t Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
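For readers unfamiliar with what such a conversion does, here is a minimal, host-endianness-independent sketch of a big-endian 32-bit read. `be32_to_cpu_sketch()` is a hypothetical stand-in written for this illustration; it is not the macro used in skiboot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Interpret 4 bytes as a big-endian 32-bit value.
 * Gives the same result on little- and big-endian hosts.
 */
uint32_t be32_to_cpu_sketch(const uint8_t *p)
{
	assert(p != NULL);
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

The sparse warnings quoted above flag places where a big-endian typed field was assigned to a native integer without going through a conversion of this kind.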
2019-03-01hw/bt: Do not disable ipmi message retry during OPAL bootVasant Hegde1-1/+2
Currently OPAL doesn't know whether the BMC is functioning or not. If the BMC is down (e.g. during a BMC reboot), we keep retrying to send the message to the BMC, so in some corner cases we may hit a hard lockup in the kernel. Ideally we should avoid the synchronous path as much as possible. For now, commit 01f977c3 added an option to disable message retry in the synchronous path. But this fix is not required during boot. Hence, let's not disable IPMI message retry during OPAL boot. Fixes: 01f977c3 (hw/bt: Add backend interface to disable ipmi message) Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-01core/ipmi: Add ipmi sync messages to top of the listVasant Hegde1-1/+1
In the ipmi_queue_msg_sync() path OPAL waits until it gets a response from the BMC. If we do not get a response on time, we may end up with kernel hard lockups. Hence, let's add sync messages to the top of the queue. This reduces the chance of hard lockups. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-01hw/bt: Introduce separate list for synchronous messagesVasant Hegde1-45/+63
The BT send logic always sends the top of the bt message list to the BMC. Once the BMC reads the message, it clears the interrupt and bt_idle() becomes true. bt_add_ipmi_msg_head() adds a message to the top of the list. If the bt message list is not empty then: - if bt_idle() is true, we end up sending a message to the BMC before getting the response for the inflight message. On some BMC implementations this appears to result in a message timeout. - else we end up starting the message timer without actually sending the message to the BMC, which is not correct. This patch introduces a separate list to track synchronous messages. bt_add_ipmi_msg_head() will add messages to the tail of this new list. We will always process this queue before processing the normal queue. Finally, this patch introduces a new variable (inflight_bt_msg) that points to the current inflight message. Suggested-by: Oliver O'Halloran <oohall@gmail.com> Suggested-by: Stewart Smith <stewart@linux.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
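The queueing scheme described in this commit (a separate synchronous list served first, plus a pointer to the single in-flight message) can be sketched like this. It is a simplified model under assumptions: every name here is hypothetical, there is no locking or timer handling, and "sending" is reduced to selecting the next message.

```c
#include <assert.h>
#include <stddef.h>

struct msg_sketch {
	int id;
	struct msg_sketch *next;
};

struct queue_sketch {
	struct msg_sketch *head, *tail;
};

static struct queue_sketch sync_q, normal_q;
static struct msg_sketch *inflight;	/* message awaiting a BMC response */

/* Append a message to the tail of a queue. */
void enqueue_sketch(struct queue_sketch *q, struct msg_sketch *m)
{
	assert(q != NULL && m != NULL);
	m->next = NULL;
	if (q->tail)
		q->tail->next = m;
	else
		q->head = m;
	q->tail = m;
}

/* Remove and return the head of a queue, or NULL if it is empty. */
struct msg_sketch *dequeue_sketch(struct queue_sketch *q)
{
	struct msg_sketch *m = q->head;

	if (m) {
		q->head = m->next;
		if (!q->head)
			q->tail = NULL;
	}
	return m;
}

/* Pick the next message to send, or NULL while one is still in flight. */
struct msg_sketch *send_next_sketch(void)
{
	if (inflight)
		return NULL;
	inflight = dequeue_sketch(&sync_q);	/* sync list always first */
	if (!inflight)
		inflight = dequeue_sketch(&normal_q);
	return inflight;
}

/* Called when the response for the in-flight message arrives. */
void complete_inflight_sketch(void)
{
	inflight = NULL;
}
```

Tracking the in-flight message explicitly means a newly queued synchronous message can no longer be sent on top of one that is still awaiting its response.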
2019-03-01hdata/memory: Removed share-id propertyVasant Hegde1-3/+0
Commit f35a3c37 (hdata/memory: Remove find_shared()) replaced find_shared() with dt_find_name_addr(). We no longer need the PRIVATE share-id property, so let's remove it. CC: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Reviewed-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-01hdata/memory: Fix warning messageVasant Hegde1-1/+1
Even though we added memory to device tree, we are getting below warning. Sample log: [ 57.136949696,3] Unable to use memory range 0 from MSAREA 0 [ 57.137049753,3] Unable to use memory range 0 from MSAREA 1 [ 57.137152335,3] Unable to use memory range 0 from MSAREA 2 [ 57.137251218,3] Unable to use memory range 0 from MSAREA 3 Fixes: 4822a7ba (hdata/memory: Add NVDIMM support) CC: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Reviewed-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-01Don't use variable length arrays in exception codeStewart Smith1-14/+13
OMG Kees Cook was right, the code is *smaller*. We save like a dozen instructions in the exception path! Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-01astbmc: Enable IPMI HIOMAP for AMI platformsAndrew Jeffery1-0/+1
Required for Habanero, Palmetto and Romulus. Cc: Lei YU <mine260309@gmail.com> Cc: Uma Yadlapati <yadlapat@us.ibm.com> Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-02-28xive: Make no_sync parameter affermative in __xive_set_irq_config()Michael Neuling1-6/+6
In __xive_set_irq_config() change the no_sync parameter to sync and fix all the call sites. Just a cleanup. No functional change. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-02-25hw/phb4: Fix indentation of brdgCtlOliver O'Halloran1-2/+1
Come on bridge control register. You're letting the team down. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-02-25npu2: Allow ATSD for LPAR other than 0Alexey Kardashevskiy2-1/+23
Each XTS MMIO ATSD# register is accompanied by another register - XTS MMIO ATSD0 LPARID# - which controls LPID filtering for ATSD transactions. When a host system passes a GPU through to a guest, we need to enable some ATSD for an LPAR. At the moment the host assigns one ATSD to a NVLink bridge and this maps it to an LPAR when GPU is assigned to the LPAR. The link number is used for an ATSD index. ATSD6&7 stay mapped to the host (LPAR=0) all the time which seems to be acceptable price for the simplicity. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-02-25npu2: Add XTS_BDF_MAP wildcard refcountAlexey Kardashevskiy2-16/+32
Currently PID wildcard is programmed into the NPU once and never cleared up. This works for the bare metal as MSR does not change while the host OS is running. However with the device virtualization, we need to keep track of wildcard entries use and clear them up before switching a GPU from a host to a guest or vice versa. This adds refcount to a NPU2, one counter per wildcard entry. The index is a short lparid (4 bits long) which is allocated in opal_npu_map_lpar() and should be smaller than NPU2_XTS_BDF_MAP_SIZE (defined as 16). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Reza Arbab <arbab@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
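A per-entry reference count of the kind described can be sketched as below. The names are hypothetical; only the table size of 16 and the use of a short lparid as the index come from the commit message. The return values tell the caller when the hardware entry would need to be programmed or cleared, which is an assumption about how such a counter might be used.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors NPU2_XTS_BDF_MAP_SIZE (16) from the commit message. */
#define BDF_MAP_SIZE_SKETCH 16

static unsigned int wildcard_refcount[BDF_MAP_SIZE_SKETCH];

/* Returns true when the caller must program the wildcard entry (first user). */
bool wildcard_get_sketch(unsigned int lparshort)
{
	assert(lparshort < BDF_MAP_SIZE_SKETCH);
	return wildcard_refcount[lparshort]++ == 0;
}

/* Returns true when the caller must clear the wildcard entry (last user gone). */
bool wildcard_put_sketch(unsigned int lparshort)
{
	assert(lparshort < BDF_MAP_SIZE_SKETCH);
	assert(wildcard_refcount[lparshort] > 0);
	return --wildcard_refcount[lparshort] == 0;
}
```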
2019-02-25power-mgmt : occ : Add 'freq-domain-mask' DT propertyAbhishek Goel2-0/+61
Add a new device-tree property freq-domain-mask to define groups of CPUs which share the same frequency. This property has been added under the power-mgt node. It is a bitmask. A bitwise AND is taken between this bitmask value and the PIR of a CPU. All the CPUs lying in the same frequency domain will have the same result for the AND. For example, for POWER9, 0xFFF0 indicates a quad-wide frequency domain. Taking the AND with the PIR of the CPUs yields the frequency domain, which is a quad-wise distribution because the last 4 bits, which represent the cores, have been masked. Similarly, 0xFFF8 represents a core-wide frequency domain for P8. Also, add a new device-tree property domain-runs-at which denotes the strategy the OCC uses to change the frequency of a frequency domain. There can be two strategies: FREQ_MOST_RECENTLY_SET and FREQ_MAX_IN_DOMAIN. FREQ_MOST_RECENTLY_SET: the OCC sets the frequency of the quad to the most recent frequency value requested by the CPUs in the quad. FREQ_MAX_IN_DOMAIN: the OCC sets the frequency of the CPUs in the quad to the maximum of the latest frequency requested by each of the component cores. Signed-off-by: Abhishek Goel <huntbag@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
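The mask arithmetic can be checked with a small sketch. The function names are hypothetical; the masks 0xFFF0 and 0xFFF8 are the ones quoted in the commit message, and the PIR values used in the note after the block are arbitrary sample numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Frequency domain identifier: the PIR with the masked-out low bits dropped. */
uint32_t freq_domain_sketch(uint32_t pir, uint32_t mask)
{
	assert(mask != 0);	/* a zero mask would put every CPU in one domain */
	return pir & mask;
}

/* Two CPUs share a frequency domain when their masked PIRs are equal. */
bool same_freq_domain_sketch(uint32_t pir_a, uint32_t pir_b, uint32_t mask)
{
	return freq_domain_sketch(pir_a, mask) == freq_domain_sketch(pir_b, mask);
}
```

For instance, with mask 0xFFF0 the sample PIRs 0x0830 and 0x083c both reduce to 0x0830 and so fall in the same domain, while with mask 0xFFF8 they reduce to 0x0830 and 0x0838 and are treated as separate domains.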
2019-02-26Retry link training at PCIe GEN1 if presence detected but training repeatedly failedTimothy Pearson2-14/+48
Certain older PCIe 1.0 devices will not train unless the training process starts at GEN1 speeds. As a last resort when a device will not train, fall back to GEN1 speed for the last training attempt. This is verified to fix devices based on the Conexant CX23888 on the Talos II platform. Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com> [stewart: cut P9NDD1.0 support, fixup dt_max_link_speed] Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
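The last-attempt fallback can be sketched as a small selection function. This is an illustration under assumptions: the function name, the 1-based attempt numbering and the integer encoding of link generations are hypothetical and do not come from the skiboot source.

```c
#include <assert.h>

#define GEN1_SKETCH 1	/* hypothetical encoding: generation number as an int */

/*
 * Choose the link speed for a training attempt.
 * 'attempt' is 1-based; only the final attempt falls back to GEN1.
 */
int training_speed_sketch(int attempt, int max_attempts, int max_speed)
{
	assert(attempt >= 1 && attempt <= max_attempts);
	assert(max_speed >= GEN1_SKETCH);
	return attempt == max_attempts ? GEN1_SKETCH : max_speed;
}
```

Keeping the fallback on the final attempt only means devices that train normally are never slowed down by it.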
2019-02-25imc/catalog: Decompress catalog asynchronouslySantosh Sivaraj3-84/+61
In-Memory Collection(IMC) counters catalog is compressed blob which is loaded from the flash; decompression starts once the data is loaded from nvram by the main thread. This can be optimized by using the libxz API function which creates a job to do the decompression by not blocking the main thread. Refactor decompress() to use the libxz asynchronous wrapper functions. This also cleans up the error handling path in imc_init(). CC: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Santosh Sivaraj <santosh@fossix.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-02-25flash: Add support for async decompressionSantosh Sivaraj2-0/+137
Implement a standard API for decompressing images using the existing method found in the IMC code. This patch also standardizes error codes and does the decompression asynchronously. The IMC decompress() function is refactored to decompress blobs/images as a separate CPU job. 'xz_decompress_start()' starts the decompression in a newly created CPU job; while 'wait_xz_decompress()' waits for the job to complete. The IMC code will be first user for the new APIs; whose implementation is provided as reference in the next patch. Signed-off-by: Santosh Sivaraj <santosh@fossix.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-02-25config: Get rid of FAST_REBOOT_CLEARS_MEMORYAndrew Donnellan1-3/+0
FAST_REBOOT_CLEARS_MEMORY is a relic of the initial attempts at fast reboot, which went away in 0279d8951ead ("Fast reboot for P8"). Remove it from config.h as it's misleading. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-02-25config: Get rid of DEBUG_MALLOCAndrew Donnellan2-5/+1
Since the initial release of skiboot, we've #defined DEBUG_MALLOC to 1. Also since the initial release of skiboot, DEBUG_MALLOC has been referenced absolutely nowhere. Get rid of it. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>