path: root/hw/npu2-hw-procedures.c
Age | Commit message | Author | Files | Lines
2020-05-26 | platform/mihawk: Tune equalization settings for opencapi | Frederic Barrat | 1 | -4/+19
The Bittware 250SOC adapter on Mihawk was showing a high count of CRC errors on one of the opencapi slots. The PHY team suggested new equalization settings to correct the errors. All existing adapters have been tested on mihawk to make sure the settings are compatible. However, the new settings should not be used on platforms other than mihawk. The changes specific to mihawk are: - Update the tx_ffe_pre_coeff and tx_ffe_post_coeff input parameters used during zcal - turn off the tx_ffe_boost parameter through scom Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Cc: skiboot-stable@lists.ozlabs.org # skiboot-op940.x Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2020-03-12 | Re-license IBM written files as Apache 2.0 OR GPLv2+ | Stewart Smith | 1 | -1/+1
SPDX makes it a simpler diff. I have audited the commit history of each file to ensure that they are exclusively authored by IBM and thus we have the right to relicense. The motivation behind this is twofold: 1) We want to enable experiments with coreboot, which is GPLv2 licensed 2) An upcoming firmware component wants to incorporate code from skiboot and code from the Linux kernel, which is GPLv2 licensed. I have gone through the IBM internal way of gaining approval for this. The following files are not exclusively authored by IBM, so are *not* included in this update (I will be seeking approval from contributors): core/direct-controls.c core/flash.c core/pcie-slot.c external/common/arch_flash_unknown.c external/common/rules.mk external/gard/Makefile external/gard/rules.mk external/opal-prd/Makefile external/pflash/Makefile external/xscom-utils/Makefile hdata/vpd.c hw/dts.c hw/ipmi/ipmi-watchdog.c hw/phb4.c include/cpu.h include/phb4.h include/platform.h libflash/libffs.c libstb/mbedtls/sha512.c libstb/mbedtls/sha512.h platforms/astbmc/barreleye.c platforms/astbmc/garrison.c platforms/astbmc/mihawk.c platforms/astbmc/nicole.c platforms/astbmc/p8dnu.c platforms/astbmc/p8dtu.c platforms/astbmc/p9dsu.c platforms/astbmc/vesnin.c platforms/rhesus/ec/config.h platforms/rhesus/ec/gpio.h platforms/rhesus/gpio.c platforms/rhesus/rhesus.c platforms/astbmc/talos.c platforms/astbmc/romulus.c Signed-off-by: Stewart Smith <stewart@linux.ibm.com> [oliver: fixed up the drift] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-12-04 | npu2: Clear fence on all bricks | Alexey Kardashevskiy | 1 | -5/+12
A bug in the NVidia driver can cause an UR HMI which fences bricks (links). At the moment we clear fence status only for the bricks of a specific device; however, this does not appear to be enough and we need to clear fences for all bricks. This is ok as we do not allow using GPUs individually anyway. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Reza Arbab <arbab@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-11-19 | npu2/hw-procedures: Remove assertion from check_credits() | Reza Arbab | 1 | -9/+6
The RX clock mux in the NVLink PHY can glitch, which will manifest in hard to diagnose behavior--at best, a checkstop during the first link traffic. The only reliable way we found to detect this was by checking for a discrepancy in the credits we expect to receive during link training. Since the time the check was added, we've found that * Commit ac6f1599ff33 ("npu2: hw-procedures: Add phy_rx_clock_sel()") does work around the original glitch. * Asserting is too harsh. Before root cause was established, it was thought this could have been a manufacturing defect and we wanted to loudly fail hardware acceptance boot cycle tests. * It seems there is a valid situation in which credits are off from the expected value. During GPU hot reset, a CPU prefetch across the link can affect the credit count before we check. Given all of the above, remove the assert(). Cc: stable # 6.0.x Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
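The change from asserting to reporting can be sketched as follows. This is a minimal illustration, not the skiboot code: `credit_matches()` and its parameters are hypothetical names, and the real procedure reads its values from NTL registers rather than taking them as arguments.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: compare a credit value read from hardware with
 * the value expected after link training. A mismatch is logged and
 * reported to the caller instead of aborting with assert().
 */
static bool credit_matches(const char *name, uint64_t actual, uint64_t expected)
{
    if (actual == expected)
        return true;

    fprintf(stderr, "%s: expected 0x%llx, read 0x%llx\n", name,
            (unsigned long long)expected, (unsigned long long)actual);
    return false;
}
```

Returning a status leaves the decision to the caller, which fits the commit's observation that a mismatch can be legitimate during GPU hot reset.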
2019-10-22 | npu2-opencapi: Detect PHY reset errors | Frederic Barrat | 1 | -3/+10
PHY reset can fail! Though past problems are now fixed, let's handle any future failure. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-22 | npu2-hw-procedures: Fix link retraining on reset | Frederic Barrat | 1 | -0/+16
Link retraining was showing reliability problems due to some opencapi-only settings not being optimized. This patch updates some extra PHY state, as agreed with the PHY team. Though they mostly impact link retraining behavior, they should also be set at boot. Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-10-22 | npu2-hw-procedures: Move some opencapi PHY settings in one-off init | Frederic Barrat | 1 | -19/+16
The PHY_RX_AC_COUPLED and PHY_RX_SPEED_SELECT for opencapi are group settings for the obus. They should be set in the one-off PHY init function at boot and not on the link reset path, as they theoretically impact more than one link. Since we cannot mix link type and/or speed on an optical bus, it has no practical impact; it just looks cleaner. Also use the OCAPIINF macro for the associated traces. Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-07-26 | SPDX-ify all skiboot code | Stewart Smith | 1 | -13/+5
Use Software Package Data Exchange (SPDX) to indicate license for each file that is unique to skiboot. At the same time, ensure the (C) who and years are correct. See https://spdx.org/ Signed-off-by: Stewart Smith <stewart@linux.ibm.com> [oliver: Added a few missing files] Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
2019-06-03 | npu2: Clear fence state for a brick being reset | Alexey Kardashevskiy | 1 | -0/+8
Resetting a GPU before resetting an NVLink leads to occasional HMIs which fence some bricks and prevent the "reset_ntl" procedure from succeeding at the "reset_ntl_release" step - the host system requires reboot; there may be other cases like this as well. This adds clearing of the fence bit in NPU.MISC.FENCE_STATE for the NVLink which we are about to reset. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-05-24 | npu2: Fix clearing the FIR bits | Alexey Kardashevskiy | 1 | -1/+1
FIR registers are SCOM-only so they cannot be accessed with the indirect write, and yet we use SCOM-based addresses for these; fix this. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-By: Alistair Popple <alistair@popple.id.au> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-20 | npu2/hw-procedures: Fix parallel zcal for opencapi | Frederic Barrat | 1 | -2/+5
For opencapi, we currently do impedance calibration when initializing the PHY for the device, which could run in parallel if we were rich and had multiple opencapi devices. But if 2 devices are on the same obus, the 2 calibration sequences could overlap, which likely yields bad results and is useless anyway since it only needs to be done once per obus. This patch splits the opencapi PHY reset in 2 parts: - an 'init' part called serially at boot. That's when zcal is done. If we have 2 devices on the same socket, the zcal won't be redone, since we're called serially and we'll see it has already been done for the obus - a 'reset' part called during fundamental reset as a prereq for link training. It does the PHY setup for a set of lanes and the dccal. The PHY team confirmed there's no dependency between zcal and the other reset steps and it can be moved earlier. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-20 | npu2-hw-procedures: Fix zcal in mixed opencapi and nvlink mode | Frederic Barrat | 1 | -3/+21
The zcal procedure needs to be run once per obus. We keep track of which obus is already calibrated in an array indexed by the obus number. However, the obus number is inferred from the brick index, which works well for nvlink but not for opencapi. Create an obus_index() function, which, from a device, returns the correct obus index, irrespective of the device type. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
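The once-per-obus bookkeeping described above might look roughly like this. It is a sketch under stated assumptions, not the skiboot implementation: the struct layout, the `opencapi_obus` field, `MAX_OBUS`, and the NVLink mapping of three bricks per obus (taken from the "once per group of 3 links" wording of an earlier commit in this log) are all illustrative.

```c
#include <stdbool.h>

#define MAX_OBUS 4 /* assumption: illustrative array size */

enum dev_type { DEV_NVLINK, DEV_OPENCAPI };

/* Hypothetical device descriptor, reduced to what the sketch needs. */
struct dev {
    enum dev_type type;
    unsigned int brick_index;
    unsigned int opencapi_obus; /* assumption: obus number known for OpenCAPI */
};

/* One flag per obus, indexed by obus number. */
static bool zcal_done[MAX_OBUS];

/* Return the obus index for a device, whatever its type. */
static unsigned int obus_index(const struct dev *d)
{
    if (d->type == DEV_OPENCAPI)
        return d->opencapi_obus;
    return d->brick_index / 3; /* assumption: 3 NVLink bricks per obus */
}

/* Returns true only when this call is the one that calibrates the obus. */
static bool zcal_once(const struct dev *d)
{
    unsigned int obus = obus_index(d);

    if (obus >= MAX_OBUS || zcal_done[obus])
        return false;

    /* ... the calibration sequence would run here ... */
    zcal_done[obus] = true;
    return true;
}
```

Routing every lookup through a single `obus_index()` keeps the array index correct even when the brick index no longer implies the obus, which is the failure the commit describes for opencapi.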
2019-03-20 | npu2-hw-procedures: Don't set iovalid for opencapi devices | Frederic Barrat | 1 | -0/+3
set_iovalid() is called on the PHY reset path. The hw logic it touches is meaningless for opencapi. It's not hurting as long as all the links under the NPU are in opencapi mode, but in case of mixing opencapi and nvlink, we'll be in trouble: the code finds which bit to modify based on the brick index, which varies depending on the mode. So calling that function on an opencapi device may modify an nvlink brick! For example, for brick index 3. So we simply avoid doing anything when calling set_iovalid() for an opencapi device. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-03-13 | npu2-opencapi: Rework ODL register access | Frederic Barrat | 1 | -16/+1
ODL registers used to control the opencapi link state have an address built on a base address and an offset for each brick which can be computed instead of hard-coded individually for each brick. Rework how we access the ODL registers, to avoid repeating switch statements all over the place. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
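Computing a register address from a base and a per-brick offset, rather than switching on the brick, can be sketched as below. The base address and stride are made-up placeholder values and `odl_reg_addr()` is a hypothetical name; the real register map is not given in this log.

```c
#include <stdint.h>

/* Placeholder values for illustration only; not the real register map. */
#define ODL_REG_BASE   0x1000u
#define ODL_REG_STRIDE 0x10u

/* Address of a per-brick ODL register: base plus a fixed offset per brick. */
static inline uint64_t odl_reg_addr(unsigned int brick_index)
{
    return ODL_REG_BASE + (uint64_t)brick_index * ODL_REG_STRIDE;
}
```

With one helper like this, each call site passes the brick index and no longer needs its own switch statement.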
2019-02-25 | sparse: symbol 'NPU2_PHY_*' was not declared. Should it be static? | Stewart Smith | 1 | -66/+71
Yes they should. Also, some are unused so we comment them out to at least keep the code as documentation complete. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-01-18 | sparse: Make tree 'constant is so big' warning clean | Stewart Smith | 1 | -20/+20
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-01-16 | npu2: Remove unused npu2_dev::procedure_data | Reza Arbab | 1 | -2/+0
This variable is never used. Remove it. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-09-17 | npu2: Split device index into brick and link index | Andrew Donnellan | 1 | -4/+4
On Witherspoon, OpenCAPI devices attached to link indexes 0 and 1 are handled by bricks 2 and 3. Rename index to brick_index, and add a new field, link_index, to refer to the link index. For now, we set those values identically. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Alistair Popple <alistair@popple.id.au> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-09-17 | hw/npu2-hw-procedures: Enable RX auto recal on OpenCAPI links | Andrew Donnellan | 1 | -0/+8
The RX_RC_ENABLE_AUTO_RECAL flag is required on OpenCAPI but not NVLink. Traditionally, Hostboot sets this value according to the machine type. However, now that Witherspoon supports both NVLink and OpenCAPI, it can't tell whether or not a link is OpenCAPI. So instead, set it in skiboot, where it will only be triggered after we've done device detection and found an OpenCAPI device. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Acked-by: Reza Arbab <arbab@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-07-17 | npu2/hw-procedures: Enable parity and credit overflow checks | Reza Arbab | 1 | -0/+6
Enable these error checking features by setting the appropriate bits in our one-off initialization of each "NTL Misc Config 2" register. The exception is NDL RX parity checking, which should be disabled during the link training procedures. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-07-10 | npu2/hw-procedures: Fence bricks via NTL instead of MISC | Reza Arbab | 1 | -24/+7
There are a couple of places we can set/unset fence for a brick: 1. MISC register: NPU2_MISC_FENCE_STATE 2. NTL register for the brick: NPU2_NTL_MISC_CFG1(ndev) Recent testing of ATS in combination with GPU reset has exposed a side effect of using (1); if fence is set for all six bricks, it triggers a sticky nmmu latch which prevents the NPU from getting ATR responses. This manifests as a hang in the tests. We have npu2_dev_fence_brick() which uses (1), and only two calls to it. Replace the call which sets fence with a write to (2). Remove the corresponding unset call entirely. It's unneeded because the procedures already do a progression from full fence to half to idle using (2). Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-04-23 | npu2/hw-procedures: fence bricks on GPU reset | Balbir Singh | 1 | -7/+45
The NPU workbook defines a way of fencing a brick and getting the brick out of fence state. We do have an implementation of bringing the brick out of fenced/quiesced state. We do the latter in our procedures, but to support run time reset we need to do the former. The fencing ensures that access to memory behind the links will not lead to HMIs, but instead SUEs will be populated in cache (in the case of speculation). The expectation is then that prior to and after reset, the operating system components will flush the cache for the region of memory behind the GPU. This patch does the following: 1. Implements a npu2_dev_fence_brick() function to set/clear fence state 2. Clears FIR bits prior to clearing the fence status 3. Clears the fence status 4. Takes the powerbus out of CQ fence much later now, in credits_check() which is the last hardware procedure called after link training. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Reviewed-By: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-03-01 | npu2-hw-procedures: Add support for OpenCAPI PHY link training | Andrew Donnellan | 1 | -12/+116
Unlike NVLink, which uses the pci-virt framework to fake a PCI configuration space for NVLink devices, the OpenCAPI device model presents us with a real configuration space handled by the device over the OpenCAPI link. As a result, we have to train the OpenCAPI link in skiboot before we do PCI probing, so that config space can be accessed, rather than having link training being triggered by the Linux driver. Add some helper functions to wrap the existing NVLink PHY training sequence so we can easily run it within skiboot. Additionally, we add OpenCAPI-specific lane settings, and a function to "bump" lanes that haven't trained properly (this process isn't documented in the workbook, but the hardware experts assure us that this improves link training reliability...) We also support the PRBS31 pattern that's used for bringup and test purposes. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Acked-by: Reza Arbab <arbab@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2018-03-01 | npu2: Rework NPU data structures for OpenCAPI | Andrew Donnellan | 1 | -3/+3
Unlike NVLink, OpenCAPI registers a separate PHB for each device, in order to allow us to force Linux to use the correct MMIO windows for each NPU link. This requires some reworking of NPU data structures to account for the fact that a PHB could correspond to either an NPU (NVLink) or a single link (OpenCAPI). At some later point, we may want to rework the NVLink code to present a separate PHB per device in order to simplify this. For now, we split NVLink-specific device data into a separate struct in order to make it clear which fields are NVLink-only. Additionally, add helper functions to correctly translate between OpenCAPI/NVLink PHBs and the underlying structures, and various fields for OpenCAPI data that we're going to need later on. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Reza Arbab <arbab@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2018-02-13 | hw/npu2: support creset of npu2 devices | Balbir Singh | 1 | -1/+1
creset calls into the hw procedure that resets the PHY; we don't take the devices out of reset, we just put them in reset. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Acked-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2018-01-30 | npu2-hw-procedures.c: Correct phy lane mapping | Alistair Popple | 1 | -3/+3
Each nvlink device is associated with a particular group of OBUS lanes via a lane mask which is read from HDAT via the device-tree. However Skiboot's interpretation of lane mask was different to what is exported from the HDAT. Specifically the lane mask bits in the HDAT are encoded in IBM bit ordering for a 24-bit wide value. So for example in normal bit ordering lane-0 is represented by having lane-mask bit 23 set and lane-23 is represented by lane-mask bit 0. This patch alters the Skiboot interpretation to match what is passed from HDAT. Signed-off-by: Alistair Popple <alistair@popple.id.au> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Reza Arbab <arbab@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
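The bit ordering described above can be written out as a small helper. This is a sketch only: `lane_to_mask_bit()` and `lane_in_mask()` are hypothetical names, and the only facts taken from the commit message are the 24-bit width and the mapping of lane 0 to bit 23 and lane 23 to bit 0.

```c
#include <stdbool.h>
#include <stdint.h>

#define LANE_MASK_WIDTH 24 /* width of the HDAT lane mask, per the commit message */

/*
 * Hypothetical helper: the mask bit, in conventional (LSB = bit 0)
 * numbering, that represents a lane. In IBM ordering lane 0 is the most
 * significant of the 24 bits. Callers must pass lane < LANE_MASK_WIDTH.
 */
static inline uint32_t lane_to_mask_bit(unsigned int lane)
{
    return 1u << (LANE_MASK_WIDTH - 1 - lane);
}

/* True if the given lane is selected in the lane mask. */
static inline bool lane_in_mask(uint32_t lane_mask, unsigned int lane)
{
    return lane < LANE_MASK_WIDTH &&
           (lane_mask & lane_to_mask_bit(lane)) != 0;
}
```

For example, a mask of `0x800000` selects lane 0 only, and a mask of `0x000001` selects lane 23 only.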
2018-01-30 | npu2-hw-procedures.c: Power up lanes during ntl reset | Alistair Popple | 1 | -0/+13
Newer versions of Hostboot will not power up the NVLink PHY lanes by default. The phy_reset procedure already powers up the lanes but they also need to be powered up in order to access the DL. The reset_ntl procedure is called by the device driver to bring the DL out of reset and get it into a working state. Therefore we also need to add lane and clock power up to the reset_ntl procedure. Signed-off-by: Alistair Popple <alistair@popple.id.au> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Reza Arbab <arbab@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-11-28 | npu2: hw-procedures: Change phy_rx_clock_sel values | Reza Arbab | 1 | -4/+4
The clock selection bits we set here are inputs to a state machine. DL clock select (bits 30-31) 0b00: lane 0 clock 0b01: lane 7 clock 0b10: grid clock 0b11: invalid/noop To recover from a potential glitch, we need to ensure that the value we set forces a state change. Our current sequence is to set 0x3 followed by 0x1. With the above now known, that is actually a noop followed by selection of lane 7. Depending on lane reversal, that selection is not a state change for some bricks. The way to force a state change in all cases is to switch to the grid clock, and then back to a lane. Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Acked-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
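The reasoning about forced state changes can be checked with a toy model. The encodings come from the commit message; everything else is an assumption made for illustration: the model treats 0b11 as leaving the selection unchanged, and `clk_sel_write()` and `count_state_changes()` are hypothetical names with no relation to the real register access code.

```c
#include <stddef.h>

/* DL clock select encodings, as listed in the commit message. */
enum dl_clk_sel {
    DL_CLK_LANE0 = 0x0,
    DL_CLK_LANE7 = 0x1,
    DL_CLK_GRID  = 0x2,
    DL_CLK_NOOP  = 0x3, /* invalid/noop */
};

/* Toy model: a noop write keeps the current selection, any other replaces it. */
static enum dl_clk_sel clk_sel_write(enum dl_clk_sel current, enum dl_clk_sel value)
{
    return value == DL_CLK_NOOP ? current : value;
}

/* Count how many writes in a sequence actually change the selection. */
static unsigned int count_state_changes(enum dl_clk_sel start,
                                        const enum dl_clk_sel *seq, size_t n)
{
    unsigned int changes = 0;
    enum dl_clk_sel cur = start;

    for (size_t i = 0; i < n; i++) {
        enum dl_clk_sel next = clk_sel_write(cur, seq[i]);

        if (next != cur)
            changes++;
        cur = next;
    }
    return changes;
}
```

In this model, a brick already on the lane 7 clock sees no state change from the old sequence (0x3 then 0x1), while going through the grid clock and back to a lane changes state on both writes whichever lane the brick uses.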
2017-11-28 | npu2: hw-procedures: Manipulate IOVALID during training | Reza Arbab | 1 | -0/+24
Ensure that the IOVALID bit for this brick is raised at the start of link training, in the reset_ntl procedure. Then, to protect us from a glitch when the PHY clock turns off or gets chopped, lower IOVALID for the duration of the phy_reset and phy_rx_dccal procedures. Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Acked-By: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-11-28 | npu2: hw-procedures: Add obus_brick_index() | Reza Arbab | 1 | -16/+13
We have code in reset_ntl() which finds the index number of our brick within its obus chiplet. Move that logic to a separate function for reuse. No functional change. Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Acked-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-11-21 | npu2: hw-procedures: Add check_credits procedure | Reza Arbab | 1 | -1/+38
As an immediate mitigator for a current hardware glitch, add a procedure that can be used to validate NTL credit values. This will be called as a safeguard to check that link training succeeded. Assert that things are exactly as we expect, because if they aren't, the system will experience a catastrophic failure shortly after the start of link traffic. Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Acked-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-11-15 | Revert "npu2: hw-procedures: Enable low power mode" | Reza Arbab | 1 | -18/+1
As it turns out, low power mode is not yet ready for prime time. We shouldn't write the low power config register until it is. This reverts commit a05054c53a37850a2118d01fcf6669ebb10d1a33. Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Acked-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-11-13 | npu2: hw-procedures: Refactor reset_ntl procedure | Reza Arbab | 1 | -15/+58
Change the implementation of reset_ntl to match the latest programming guide documentation. Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Reviewed-by: Balbir Singh <bsingharora@gmail.com> Reviewed-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-11-13 | npu2: hw-procedures: Add phy_rx_clock_sel() | Reza Arbab | 1 | -1/+19
Change the RX clk mux control to be done by software instead of HW. This avoids glitches caused by changing the mux setting. Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Reviewed-By: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-11-09 | npu2: hw-procedures: Enable low power mode | Reza Arbab | 1 | -1/+18
Add a procedure which sets the NTL low power config register. To actually enter low power mode, a corresponding change must be present in the GPU device driver. The link will not enter low power mode unless both sides agree, which means this change is safe to make independently. It should have no forward or backward dependencies on other components. Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-15 | npu2: hw-procedures: Add settings to PHY_RESET | Reza Arbab | 1 | -0/+10
Set a few new values in the PHY_RESET procedure, as specified by our updated programming guide documentation. Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-12 | npu2: Implement FLR | Reza Arbab | 1 | -0/+5
Add basic handling of FLR (function level reset) by porting the changes from commit b74841db759d ("npu: Implement FLR") to npu2. The only difference for npu2 is that we track the reset state explicitly with a link flag instead of inferring it from dev->procedure_{status,number,step,data}. Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-04 | npu2: hw-procedures: Update PHY DC calibration procedure | Reza Arbab | 1 | -1/+9
Per the updated programming guide (procedure 1.2.4), set rx_pr_edge_track_cntl and rx_pr_fw_off appropriately before and after calibration. Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-04 | npu2: hw-procedures: Change rx_pr_phase_step value | Reza Arbab | 1 | -1/+1
Change this value, per the updated programming guide. Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Acked-By: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-04 | npu2: hw-procedures: Add comments denoting procedure number | Reza Arbab | 1 | -0/+2
There are comments in this file to indicate where each numbered procedure from the programming guide is implemented, for easy searching. Add a couple which were missing. Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Acked-By: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-08-27 | hw/npu2-hw-procedures.c: Update PHY_RESET procedure | Alistair Popple | 1 | -0/+11
Newer versions of Hostboot will have various clocks powered down by default to save power. Therefore we need to power them up before accessing the OBUS PHY. Signed-off-by: Alistair Popple <alistair@popple.id.au> Reviewed-by: Reza Arbab <arbab@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-08-15 | npu2: Fix typo in message | Andrew Donnellan | 1 | -1/+1
Correct "procuedure" -> "procedure". Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-06-20 | NPU2: Add flag to nvlink config space indicating DL reset state | Alistair Popple | 1 | -0/+2
Device drivers need to be able to determine if the DL is out of reset or not so they can safely probe to see if links have already been trained. This patch adds a flag to the vendor specific config space indicating if the DL is out of reset. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-06-20 | hw/npu2-hw-procedures.c: Add nvram option to override zcal calculations | Alistair Popple | 1 | -8/+20
In some rare cases the zcal state machine may fail and flag an error. According to hardware designers it is sometimes ok to ignore this failure and use nominal values for the calculations. In this case we add a nvram variable (nv_zcal_override) which will cause skiboot to ignore the failure and use the nominal value specified in nvram. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
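The fallback described above could be sketched like this. It is an illustration under assumptions, not the skiboot logic: `zcal_select()` is a hypothetical name, the override is passed in as a plain integer rather than read from nvram, and a negative override stands for "nv_zcal_override not set". Preferring the hardware result when there is no error is also an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical selection of the value fed into the zcal calculations.
 *
 * hw_error:  the zcal state machine flagged a failure
 * hw_value:  result reported by the state machine (valid if !hw_error)
 * override:  nominal value from nvram, or negative if none is set
 *
 * Returns the value to use, or -1 if calibration failed and no
 * override is available.
 */
static int64_t zcal_select(bool hw_error, uint32_t hw_value, int64_t override)
{
    if (!hw_error)
        return hw_value;
    if (override >= 0)
        return override;
    return -1;
}
```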
2017-06-06 | npu2: Fix npu2_{read,write}_4b() | Reza Arbab | 1 | -2/+2
When writing or reading 4-byte values, we need to use the upper half of the 64-bit SCOM register. Fix npu2_{read,write}_4b() and their callers to use uint32_t, and appropriately shift the value being written or returned. Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Acked-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
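The shift described above amounts to the following pair of conversions. The helper names are hypothetical and the sketch covers only the placement of the 32-bit value within the 64-bit word, not the register access itself.

```c
#include <stdint.h>

/* Place a 4-byte value in the upper half of a 64-bit SCOM word. */
static inline uint64_t scom_from_4b(uint32_t val)
{
    return (uint64_t)val << 32;
}

/* Extract a 4-byte value from the upper half of a 64-bit SCOM word. */
static inline uint32_t scom_to_4b(uint64_t scom)
{
    return (uint32_t)(scom >> 32);
}
```

Using `uint32_t` in the signatures makes the width explicit, so a caller cannot accidentally pass a full 64-bit value that would be shifted out of range.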
2017-06-06 | hw/npu2-hw-procedures.c: Fix running of zcal procedure | Alistair Popple | 1 | -19/+5
The zcal procedure should only be run once per obus (ie. once per group of 3 links). Clean up the code and fix the potential buffer overflow due to a typo. Also updates the zcal settings to their proper values. Fixes coverity 143248. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-03-30 | npu2: Add hardware link training procedures | Alistair Popple | 1 | -0/+721
Unlike other system buses the NVLink2 links need to be trained at runtime as training requires interaction from the GPU device drivers. This patch implements the required training procedures for NVLink2, which are different than the NVLink1 equivalents. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>