path: root/hw
Age  Commit message  Author  Files  Lines
2019-02-19hw/test: generalise makefileStewart Smith1-9/+9
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-02-18opal: Deprecate reading the PHB statusAlexey Kardashevskiy5-27/+9
The OPAL_PCI_EEH_FREEZE_STATUS call takes a number of parameters, one of which is @phb_status. It is defined as __be64* and is always NULL in current upstream Linux, but if anyone ever decides to read that status, the PHB3 handler will assume it is a struct OpalIoPhb3ErrorData* (which is a lot bigger than 8 bytes) and zero it, corrupting the stack; p7ioc-phb has the same issue. This removes @phb_status from all eeh_freeze_status() hooks and moves the error message from PHB4 to the affected OPAL handlers. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-By: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-02-18phb4: Update some commentsOliver O'Halloran1-19/+13
I now know what an IODA cache is and I'm not happy about it. With the power of Comments™ you too can share the misery. Remove the big WARNING about the P8 specific hardware bug while we're here. That seems to have been copied over from phb3.c and no one thought about it too hard. Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-02-18phb4: Eliminate peltv_cacheOliver O'Halloran1-18/+12
The PELT-V is also an in-memory table and there is no reason to have two copies of it. Removing the cache shaves another 128KB off the size of each struct phb4. Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-02-18phb4: Eliminate p->rte_cacheOliver O'Halloran1-22/+15
In ancient times we added caches to struct phb3 for some of the IODA tables which can only be accessed indirectly via XSCOM. A cache for the Requester Translation Table (RTT) was also added even though this is an in-memory table. This was carried over to PHB4 when Ben did the initial copy and paste, but it's still largely pointless. There's no real need to have a second copy of the table. This patch removes the "cache" and changes all the users to reference the RTT directly if we need to. This reduces the size of the struct phb4 by 128KB. Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
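The 128KB figure is consistent with a table that holds one 16-bit entry for each of the 65536 possible bus/device/function numbers. The sketch below only checks that arithmetic; the entry width and entry count are assumptions and are not stated in the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: one entry per possible 16-bit BDFN. */
#define RTT_ENTRIES_ASSUMED (1u << 16)

/* Assumption: each entry is a 16-bit PE number. */
static inline size_t rtt_size_bytes(void)
{
	return (size_t)RTT_ENTRIES_ASSUMED * sizeof(uint16_t);
}
```

Under these assumptions rtt_size_bytes() evaluates to 131072 bytes, i.e. 128KB, which is the amount a duplicate copy of the table would cost per PHB.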
2019-02-18phb4: Remove pointless NULL checksOliver O'Halloran1-12/+2
When we allocate the various in-memory tables we assert() on the allocation. There's no point in checking if the table pointer is NULL or not at runtime. Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-02-18phb4: Rework BDFN filtering in phb4_set_pe()Oliver O'Halloran1-41/+17
General cleanup. For a function that does nothing more than a mask-and-compare the current implementation is way more convoluted than it has any right to be. Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-02-18ipmi/power: Fix system reboot issueVasant Hegde1-2/+24
The kernel makes the reboot/shutdown OPAL call to reboot or shut down the system. Once the kernel gets a response from OPAL, it runs opal_poll_events() until firmware handles the request. On BMC based systems, OPAL makes an IPMI call (IPMI_CHASSIS_CONTROL) to initiate system reboot/shutdown. At present OPAL queues the IPMI message and returns SUCCESS to the host. If the BMC is not ready to accept the command (for example, during a BMC reboot), the message will fail and we have to manually reboot/shut down the system using the BMC interface. This patch adds logic to validate the message return value. If the message failed, it is resent. At some stage the BMC will be ready to accept the message and will handle it. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
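The resend-on-failure idea can be sketched as follows. This is a simplified model, not the skiboot implementation: struct fake_bmc and both functions are hypothetical, and a blocking loop with a retry limit stands in for whatever asynchronous resend mechanism the real code uses.

```c
#include <stdbool.h>

/* Hypothetical BMC model: rejects the first busy_attempts sends. */
struct fake_bmc {
	int busy_attempts;
	int sends;
};

/* Returns true when the (fake) BMC accepts the chassis control message. */
static bool fake_bmc_send(struct fake_bmc *bmc)
{
	bmc->sends++;
	if (bmc->busy_attempts > 0) {
		bmc->busy_attempts--;
		return false;
	}
	return true;
}

/* Resend until accepted; returns the number of attempts, or -1 on giving up. */
static int send_until_accepted(struct fake_bmc *bmc, int max_tries)
{
	for (int attempt = 1; attempt <= max_tries; attempt++) {
		if (fake_bmc_send(bmc))
			return attempt;
	}
	return -1;
}
```

The point of the sketch is only that the return value of each send is checked, rather than assuming a queued message will succeed.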
2019-02-10Add PVR_TYPE_P9PReza Arbab1-0/+4
Enable a new PVR to get us running on another p9 variant. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-02-10Revert "astbmc: Try IPMI HIOMAP for P8"Joel Stanley1-6/+1
This reverts commit bd9839684d482417e8c60449592f4308e9a91dac as it broke booting on P8 systems, including Garrison (AMI BMC), Firestone (AMI BMC) and QEMU (BMC simulator). Issue https://github.com/open-power/skiboot/issues/217 tracks the failure. The P8 IPMI HIOMAP feature can be re-enabled once this issue is resolved. Reported-by: Sam Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: Joel Stanley <joel@jms.id.au> Acked-by: Sam Mendoza-Jonas <sam@mendozajonas.com> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-01-25Remove POWER9N DD1 supportNicholas Piggin6-351/+86
This is not a shipping product and is no longer supported by Linux or other firmware components. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-01-25xive: remove POWER9N DD1 NVT table size workaroundNicholas Piggin1-5/+1
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-01-25phb4: remove POWER9N DD1 creset workaroundNicholas Piggin1-5/+0
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-01-25SLW: Print verbose info on errors onlyAkshay Adiga1-2/+7
Change print level from debug to warning for reporting bad EC_PPM_SPECIAL_WKUP_* scom values. To reduce cluttering in the log print only on error. Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-01-25SLW: Remove Idle state support for Power8 DD1Akshay Adiga1-50/+0
Remove init routines that were only required for Power8 DD1 but were enabled for all Power8 DD versions. Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-01-18astbmc: Try IPMI HIOMAP for P8Andrew Jeffery1-1/+6
The HIOMAP protocol was developed after the release of P8 in preparation for P9. As a consequence P9 always uses it, but it has rarely been enabled for P8. P8DTU has recently added IPMI HIOMAP support to its BMC firmware, so enable its use in skiboot with P8 machines. Doing so requires some rework to ensure fallback works correctly as in the past the fallback was to mbox, which will only work for P9. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-01-18sparse: Make tree 'constant is so big' warning cleanStewart Smith9-180/+180
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-01-16npu2: Replace open coded dt_find_by_name_addr()Reza Arbab1-12/+1
We now have a dt function to do this. Use it. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-01-16npu2: Remove redundant assignment to p->phb_nvlink.scan_mapReza Arbab1-4/+1
In npu2_populate_devices(), we do

    p->phb_nvlink.scan_map |= 0x1 << ((dev->bdfn & 0xf8) >> 3);
    :
    dev->nvlink.pvd = pci_virt_add_device(&p->phb_nvlink, dev->bdfn, 0x100, dev);
    /* At this point, dev->nvlink.pvd->bdfn = dev->bdfn */
    if (dev->nvlink.pvd) {
        p->phb_nvlink.scan_map |= 0x1 << ((dev->nvlink.pvd->bdfn & 0xf8) >> 3);
        :
    }

Because dev->nvlink.pvd->bdfn equals dev->bdfn, the second assignment to scan_map is redundant. Remove it.

Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
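To see why the second assignment is redundant, the bit computation can be isolated into a small helper. The expression is taken from the commit message; the helper name scan_map_bit() is made up for this sketch.

```c
#include <stdint.h>

/*
 * Bits 7:3 of the BDFN hold the device number (0..31); the scan map
 * has one bit per device number.
 */
static inline uint32_t scan_map_bit(uint16_t bdfn)
{
	return UINT32_C(1) << ((bdfn & 0xf8) >> 3);
}
```

OR-ing the same bit into a map twice leaves the map unchanged, so when both BDFNs are equal the second update has no effect.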
2019-01-16npu2: Fix missing iteration in tce kill loopReza Arbab1-0/+1
When killing multiple pages, npu2_tce_kill() loops doing single page kills, but never advances the address. Fix this. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
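A minimal sketch of this class of bug and its fix, assuming a loop that issues one single-page kill per iteration. The struct kill_log recorder and both function names are hypothetical stand-ins for the real register write in npu2_tce_kill().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical recorder standing in for the hardware kill register. */
struct kill_log {
	uint64_t addr[8];
	size_t count;
};

static void kill_one(struct kill_log *log, uint64_t addr)
{
	if (log->count < 8)
		log->addr[log->count] = addr;
	log->count++;
}

static void kill_pages(struct kill_log *log, uint64_t addr,
		       uint64_t page_size, uint32_t npages)
{
	while (npages--) {
		kill_one(log, addr);
		addr += page_size; /* the step that was missing */
	}
}
```

Without the increment, every iteration would invalidate the first page again and leave the remaining pages untouched.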
2019-01-16npu2: Remove unused npu2_dev_nvlink::vendor_capReza Arbab1-1/+0
This variable is never used. Remove it. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-01-16npu2: Remove unused npu2_dev::procedure_dataReza Arbab1-2/+0
This variable is never used. Remove it. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-01-16npu2: Remove unused npu2::bdf2pe_cacheReza Arbab2-31/+0
This cache is written but never read. Wiring it up would gain us little (except added complexity), and it obviously hasn't been missed thus far, so remove it altogether. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-01-16npu2: Remove dead code from npu2_cfg_write_bar()Reza Arbab1-4/+0
We assign pci_cmd, but it never gets used. Remove it. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-01-16Remove duplicate npu2-common.o from $(HW_OBJS)Reza Arbab1-1/+1
We've listed npu2-common.o twice. Remove one. Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-01-16capp/phb4: Prevent HMI from getting triggered when disabling CAPPVaibhav Jain1-0/+11
While disabling CAPP, an HMI gets triggered as soon as the ETU is put in reset mode. This happens because, before we can disable CAPP, it detects the PHB link going down and triggers an HMI requesting Opal to perform CAPP recovery. This has the unintended side effect of spamming the Opal logs with malfunction alert messages and may also confuse the user. To prevent this we mask the CAPP FIR error 'PHB Link Down' Bit(31) when we are disabling CAPP, just before we put the ETU in reset in phb4_creset(). Since bringing down the PHB link now won't trigger an HMI and CAPP recovery, we manually set the PHB4_CAPP_RECOVERY flag on the phb to force recovery during creset. Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-01-16phb4/capp: Implement sequence to disable CAPP and enable fast-resetVaibhav Jain1-3/+64
We implement the h/w sequence to disable CAPP in disable_capi_mode() and with it also enable fast-reset for CAPI mode in phb4_set_capi_mode(). The sequence to disable CAPP is executed in three phases. The first two phases are implemented in disable_capi_mode(), where we reset the CAPP registers followed by the PEC registers to their init values. The third and final phase resets the PHB CAPI Compare/Mask Register and is done in phb4_init_ioda3(). The PHB reset is moved to phb4_init_ioda3() because by the time the Opal PCI reset state machine reaches this function the PHB is already un-fenced and its configuration registers are accessible via mmio. Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-01-16capp/phb4: Introduce PHB4 flag, PHB4_CAPP_DISABLE to disable CAPPVaibhav Jain1-1/+96
This patch introduces a PHB4 flag PHB4_CAPP_DISABLE and the scaffolding necessary to handle it during the CRESET flow. The flag is set when CAPP is requested to switch to PCIe mode via a call to phb4_set_capi_mode() with mode OPAL_PHB_CAPI_MODE_PCIE. This starts the sequence below, which ultimately ends in the newly introduced phb4_slot_sm_run_completed():

1. Set PHB4_CAPP_DISABLE in phb4->flags.
2. Start a CRESET on the phb slot. This also starts the opal pci reset state machine.
3. Wait for the slot state to be PHB4_SLOT_CRESET_WAIT_CQ.
4. Perform CAPP recovery while the PHB is still fenced, by calling do_capp_recovery_scoms().
5. Call the newly introduced 'disable_capi_mode()' to disable CAPP.
6. Wait for the slot reset to complete while it transitions to PHB4_SLOT_FRESET and optionally to PHB4_SLOT_LINK_START.
7. Once the slot reset is complete, the opal pci-core state machine will call slot->ops.completed_sm_run().
8. For PHB4 this branches to the newly introduced 'phb4_slot_sm_run_completed()'.
9. Inside this function we mark the CAPP as disabled and un-register the opal syncer phb4_host_sync_reset().
10. Optionally, if the slot reset was unsuccessful, disable fast-reboot.

Notes:

a. Function 'disable_capi_mode()' performs various sanity tests on CAPP to determine if it's OK to disable it and performs the xscoms necessary to disable it. However, the implementation proposed in this patch is a skeleton that only does the sanity tests. A follow-up patch will implement the xscoms necessary to disable CAPP.
b. The sequence expects the Opal PCI reset state machine to make forward progress, and hence needs someone to call slot->ops.run_sm(). This can be either from phb4_host_sync_reset() or opal_pci_poll().

Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-01-16capp/phb4: Force CAPP to PCIe mode during kernel shutdownVaibhav Jain1-0/+34
This patch introduces a new opal syncer for PHB4 named phb4_host_sync_reset(). We register this opal syncer when CAPP is activated successfully in phb4_set_capi_mode() so that it will be called at kernel shutdown during fast-reset. During kernel shutdown the function will then repeatedly call phb->ops->set_capi_mode() to switch CAPP to PCIe mode. If set_capi_mode() returns OPAL_BUSY, which indicates that CAPP is still transitioning to the new state, it calls slot->ops.run_sm() to ensure that the Opal slot reset state machine makes forward progress. Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
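The retry loop described here can be modelled roughly as below. Everything in the sketch is hypothetical: struct fake_slot, the RC_* values and the fake_* functions merely stand in for phb->ops->set_capi_mode(), slot->ops.run_sm() and the OPAL return codes.

```c
/* Hypothetical stand-ins for the OPAL return codes. */
enum { RC_SUCCESS = 0, RC_BUSY = -2 };

/* Hypothetical slot model: the reset needs steps_left more state machine runs. */
struct fake_slot {
	int steps_left;
	int sm_runs;
};

/* Stand-in for set_capi_mode(..., PCIe): busy while the reset is in flight. */
static int fake_set_pcie_mode(const struct fake_slot *slot)
{
	return slot->steps_left > 0 ? RC_BUSY : RC_SUCCESS;
}

/* Stand-in for slot->ops.run_sm(): advances the reset by one step. */
static void fake_run_sm(struct fake_slot *slot)
{
	slot->sm_runs++;
	if (slot->steps_left > 0)
		slot->steps_left--;
}

/* Keep requesting PCIe mode, driving the state machine while busy. */
static int sync_reset(struct fake_slot *slot)
{
	int rc;

	while ((rc = fake_set_pcie_mode(slot)) == RC_BUSY)
		fake_run_sm(slot);
	return rc;
}
```

The design point is that the syncer itself drives the state machine, so the switch can complete even if nothing else is polling during shutdown.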
2019-01-16phb4/capp: Update and re-factor phb4_set_capi_mode()Vaibhav Jain1-35/+53
Presently phb4_set_capi_mode() performs certain CAPP checks, such as checking whether the CAPP ucode is loaded or whether CAPP is still in recovery, even when the requested mode is to switch to PCI mode. Hence this patch updates and re-factors phb4_set_capi_mode() so that CAPP related checks are only performed when a request to enable CAPP is made with mode==OPAL_PHB_CAPI_MODE_CAPI/DMA_TVT1. We also update the other possible mode requests to return a more appropriate status code based on whether CAPP is activated or not. Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-01-16capp/phb: Introduce 'struct capp' to hold capp related info in 'struct phb'Vaibhav Jain2-7/+59
Previously struct proc_chip member 'capp_phb3_attached_mask' was used for Power-8 to keep track of PHB attached to the single CAPP on the chip. CAPP on that chip supported a flexible PHB assignment scheme. However since then new chips only support a static assignment, i.e. a CAPP can only be attached to a specific PEC. Hence instead of using 'proc_chip.capp_phb4_attached_mask' to manage CAPP <-> PEC assignments, which needs a global lock (capi_lock) to be updated, we introduce a new struct named 'capp', a pointer to which resides inside struct 'phb4'. Since updates to struct 'phb4' already happen in context of phb_lock, this eliminates the need to use mutex 'capi_lock' while updating 'capp_phb4_attached_mask'. This struct is also used to hold CAPP specific variables such as a pointer to the 'struct phb' to which the CAPP is attached, 'capp_xscom_offset' which is the xscom offset to be added to CAPP registers in case there is more than one on the chip, 'capp_index' which is the index of the CAPP on the chip, and 'attached_pe' which is the process endpoint index to which CAPP is attached. Finally member 'chip_id' holds the chip-id that's used for performing xscom read/writes. Also new helpers named capp_xscom_read()/write() are introduced to make access to CAPP xscom registers easier. Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
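A rough sketch of what such a struct and read helper could look like. The member names come from the commit message; the member types, the fake register array, and the functions fake_xscom_read() and capp_read_sketch() are assumptions made for illustration and are not the skiboot definitions.

```c
#include <stdint.h>

struct phb; /* opaque in this sketch */

/* Member names from the commit message; types are assumed. */
struct capp {
	struct phb *phb;            /* PHB the CAPP is attached to */
	unsigned int capp_index;    /* index of the CAPP on the chip */
	uint64_t capp_xscom_offset; /* added to CAPP register addresses */
	uint64_t attached_pe;       /* process endpoint index */
	uint32_t chip_id;           /* chip used for xscom accesses */
};

/* Hypothetical backend: a tiny fake xscom register space. */
static uint64_t fake_regs[16];

static int fake_xscom_read(uint32_t chip_id, uint64_t addr, uint64_t *val)
{
	(void)chip_id;
	if (addr >= 16)
		return -1;
	*val = fake_regs[addr];
	return 0;
}

/* The helper hides the per-CAPP offset and chip id from callers. */
static int capp_read_sketch(const struct capp *capp, uint64_t reg,
			    uint64_t *val)
{
	return fake_xscom_read(capp->chip_id, reg + capp->capp_xscom_offset,
			       val);
}
```

Callers then name a register once and the same code works for whichever CAPP instance the struct describes.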
2018-12-12Revert "npu2: Allow ATSD for LPAR other than 0"Stewart Smith1-18/+4
This reverts commit d8b161f4b361f70a7bb43be47d4a32b8f937287a. As discussed on list, a bit premature to merge, removing for now. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-12-10hw/bt.c: Move some debug ifdef to make static analysis happyStewart Smith1-2/+4
Okay, so maybe the static analysis warning is all useless, and maybe having the ifdef around a call is actually useful. I'll take the less noise in my CI static analysis thing. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-12-10Add purging CPU L2 and L3 caches into NPU hreset.Rashmica Gupta1-1/+138
If a GPU is passed through to a guest and the guest unexpectedly terminates, there can be cache lines in CPUs that belong to the GPU. So purge the caches as part of the reset sequence. L1 is write through, so doesn't need to be purged.

The sequence to purge the L2 and L3 caches from the hw team:

"L2 purge:
(1) initiate purge
putspy pu.ex EXP.L2.L2MISC.L2CERRS.PRD_PURGE_CMD_TYPE L2CAC_FLUSH -all
putspy pu.ex EXP.L2.L2MISC.L2CERRS.PRD_PURGE_CMD_TRIGGER ON -all
(2) check this is off in all caches to know purge completed
getspy pu.ex EXP.L2.L2MISC.L2CERRS.PRD_PURGE_CMD_REG_BUSY -all
(3) putspy pu.ex EXP.L2.L2MISC.L2CERRS.PRD_PURGE_CMD_TRIGGER OFF -all

L3 purge:
1) Start the purge:
putspy pu.ex EXP.L3.L3_MISC.L3CERRS.L3_PRD_PURGE_TTYPE FULL_PURGE -all
putspy pu.ex EXP.L3.L3_MISC.L3CERRS.L3_PRD_PURGE_REQ ON -all
2) Ensure that the purge has completed by checking the status bit:
getspy pu.ex EXP.L3.L3_MISC.L3CERRS.L3_PRD_PURGE_REQ -all
You should see it say OFF if it's done:
p9n.ex k0:n0:s0:p00:c0 EXP.L3.L3_MISC.L3CERRS.L3_PRD_PURGE_REQ OFF"

Suggested-by: Alistair Popple <alistair@popple.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Rashmica Gupta <rashmica.g@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
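The L2 steps follow a common trigger, poll, clear pattern, which might be sketched as below. The struct fake_cache model, the function names and the poll limit are all hypothetical; the real code would use xscom accesses to the registers named above.

```c
#include <stdbool.h>

/* Hypothetical cache model: stays busy for busy_polls status reads. */
struct fake_cache {
	bool trigger;
	int busy_polls;
};

/* Step 1: start the purge (the latency is only part of the fake model). */
static void start_purge(struct fake_cache *c, int latency)
{
	c->trigger = true;
	c->busy_polls = latency;
}

/* Step 2: read the busy status; each read consumes one busy poll. */
static bool purge_busy(struct fake_cache *c)
{
	if (c->busy_polls > 0) {
		c->busy_polls--;
		return true;
	}
	return false;
}

/* Returns 0 once the purge completes, or -1 if it is still busy at the limit. */
static int purge_l2_sketch(struct fake_cache *c, int latency, int max_polls)
{
	start_purge(c, latency);
	for (int i = 0; i < max_polls; i++) {
		if (!purge_busy(c)) {
			c->trigger = false; /* step 3: clear the trigger */
			return 0;
		}
	}
	return -1;
}
```

Bounding the number of polls is a choice made for this sketch so that a stuck purge cannot hang the reset path forever.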
2018-12-10npu2: Allow ATSD for LPAR other than 0Alexey Kardashevskiy1-4/+18
Each XTS MMIO ATSD# register is accompanied by another register - XTS MMIO ATSD0 LPARID# - which controls LPID filtering for ATSD transactions. When a host system passes a GPU through to a guest, we need to enable some ATSD for an LPAR. At the moment the host assigns one ATSD to a NVLink bridge and this maps it to an LPAR when GPU is assigned to the LPAR. The link number is used for an ATSD index. ATSD6&7 stay mapped to the host (LPAR=0) all the time which seems to be acceptable price for the simplicity. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-12-10npu2: Return sensible PCI error when not frozenAlexey Kardashevskiy1-2/+6
The current kernel calls OPAL_PCI_EEH_FREEZE_STATUS with an uninitialized @pci_error_type parameter and then analyzes it even if the OPAL call returned OPAL_SUCCESS. This results in unexpected EEH events and NPU freezes. This initializes @pci_error_type and @severity to known safe values. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
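The defensive pattern is to write every output parameter before any early return, so a caller that inspects them always sees defined values. A minimal sketch follows; the enum values and the function signature are hypothetical and do not reproduce the OPAL API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the OPAL constants. */
enum { ERR_NONE = 0, ERR_SOME = 1 };
enum { SEV_NONE = 0, SEV_SOME = 1 };
enum { STATE_NOT_FROZEN = 0, STATE_FROZEN = 1 };

static int freeze_status_sketch(bool frozen, uint8_t *state,
				uint16_t *err_type, uint16_t *severity)
{
	/* Always give the outputs safe values first. */
	*state = STATE_NOT_FROZEN;
	*err_type = ERR_NONE;
	if (severity)
		*severity = SEV_NONE;

	if (!frozen)
		return 0;

	*state = STATE_FROZEN;
	*err_type = ERR_SOME;
	if (severity)
		*severity = SEV_SOME;
	return 0;
}
```

With this shape, a caller that passes in uninitialized variables still reads "no error" when nothing is frozen.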
2018-12-10npu2: Advertise correct TCE page sizeAlexey Kardashevskiy1-0/+5
The P9 NPU workbook says that only 4K/64K/16M/256M page size are supported and in fact npu2_map_pe_dma_window() supports just these but in absence of the "ibm,supported-tce-sizes" property Linux assumes the default P9 PHB4 page sizes - 4K/64K/2M/1G - so when Linux tries 2M/1G TCEs, we get lots of "Unexpected TCE size" from npu2_tce_kill(). This advertises TCE page sizes so Linux could handle it correctly, i.e. fall back to 4K/64K TCEs. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
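The page sizes listed above correspond to page shifts of 12, 16, 24 and 28, while 2M and 1G correspond to shifts of 21 and 30. A small lookup illustrates the check; the table and function names are hypothetical, and how the property encodes the sizes is not stated in the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* 4K, 64K, 16M and 256M expressed as page shifts. */
static const unsigned int npu2_tce_shifts_sketch[] = { 12, 16, 24, 28 };

static bool tce_shift_supported(unsigned int shift)
{
	size_t n = sizeof(npu2_tce_shifts_sketch) /
		   sizeof(npu2_tce_shifts_sketch[0]);

	for (size_t i = 0; i < n; i++) {
		if (npu2_tce_shifts_sketch[i] == shift)
			return true;
	}
	return false;
}
```

A consumer that consults such a list would reject 2M and 1G and fall back to 4K or 64K TCEs.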
2018-11-28npu2-opencapi: Log ODL endpoint information registerFrederic Barrat1-1/+28
If the link trains in degraded mode, log the ODL endpoint information register for debug. Its content is specific to the DLx and TLx implementation, so this is really information useful for the hardware team. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-11-28npu2-opencapi: Detect if link trained in degraded modeFrederic Barrat1-19/+31
There's no status readily available to tell the effective link width. Instead, we have to look at the individual status of each lane, on the transmit and receive direction. All relevant information is in the ODL status register. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
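One way to derive an effective width from per-lane status is to count the trained lanes in each direction. The sketch below assumes an 8-lane link and one status bit per lane in each direction; the actual layout of the ODL status register is not given in the commit message, so the masks and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define FULL_WIDTH_LANES 8 /* assumption: x8 link */

/* Count the set bits in a per-lane status mask. */
static unsigned int count_lanes(uint8_t mask)
{
	unsigned int n = 0;

	while (mask) {
		n += mask & 1u;
		mask >>= 1;
	}
	return n;
}

/* Degraded if either direction has fewer than the full number of lanes. */
static bool link_degraded(uint8_t tx_lanes, uint8_t rx_lanes)
{
	return count_lanes(tx_lanes) < FULL_WIDTH_LANES ||
	       count_lanes(rx_lanes) < FULL_WIDTH_LANES;
}
```

Checking both directions matters because a link can train at full width one way and at reduced width the other.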
2018-11-28npu2-opencapi: Log extra information on link training failureFrederic Barrat1-3/+34
Log the link training status register in case of failure to train. It can have useful information for the hardware team. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-11-18Add the other 7 ATSD registers to the device tree.Rashmica Gupta1-5/+10
Suggested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Rashmica Gupta <rashmica.g@gmail.com> Reviewed-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-11-01phb4/capp: Only reset FIR bits that cause capp machine checkVaibhav Jain1-0/+17
During CAPP recovery do_capp_recovery_scoms() will reset the CAPP Fir register just after CAPP recovery is completed. This has an unintentional side effect of preventing PRD from analyzing and reporting this error. If PRD tries to read the CAPP FIR after opal has already reset it, then it logs a critical error complaining "No active error bits found". To prevent this from happening we update do_capp_recovery_scoms() to only reset fir bits that cause CAPP machine check (local xstop). This is done by reading the CAPP Fir Action0/1 & Mask registers and generating a mask which is then written on CAPP_FIR_CLEAR register. Cc: stable Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-11-01phb4: Check for RX errors after link trainingOliver O'Halloran1-3/+26
Some PHB4 PHYs can get stuck in a bad state where they are constantly retraining the link. This happens transparently to skiboot and Linux but will cause PCIe to be slow. Resetting the PHB4 clears the problem. We can detect this case by looking at the RX error count where we check for link stability. This patch does this by modifying the link optimal code to check for RX errors. If errors are occurring we retrain the link irrespective of the chip rev or card. Normally when this problem occurs, the RX error count is maxed out at 255. When there is no problem, the count is 0. We chose 8 as the max rx errors value to give us some margin for a few errors. There is also a knob that can be used to set the error threshold for when we should retrain the link, i.e.

    nvram -p ibm,skiboot --update-config phb-rx-err-max=8

Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
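The decision reduces to a threshold comparison on the RX error count. In the sketch below the names are hypothetical, and treating a count exactly equal to the threshold as acceptable is an assumption; the commit message does not say whether the comparison is strict.

```c
#include <stdbool.h>

#define RX_ERR_MAX_DEFAULT 8u /* default threshold from the commit message */

/* Assumption: counts above the threshold trigger a retrain. */
static bool should_retrain(unsigned int rx_errs, unsigned int rx_err_max)
{
	return rx_errs > rx_err_max;
}
```

Because the observed counts are either 0 or pinned at 255, any small threshold separates the two cases; 8 simply leaves margin for a few stray errors.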
2018-11-01nx: Don't abort on missing NX when using a QEMU machineBenjamin Herrenschmidt1-1/+2
These don't have an NX node (and probably never will) as they don't provide any coprocessor. However, the DARN instruction works so this abort is unnecessary. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-10-25npu2-opencapi: Enable presence detection on ZZFrederic Barrat1-2/+1
Presence detection for opencapi adapters was broken for ZZ planars v3 and below. All ZZ systems currently used in the lab have had their planar upgraded, so we can now remove the override we had to force presence and activate presence detection. Which should improve boot time. Considering the state of opal support on ZZ, this is really only for lab usage on BML. The opencapi enablement team has okay'd the change. In the unlikely case somebody tries opencapi on an old ZZ, the presence detection through i2c will show that no adapter is present and skiboot won't try to access or train the link. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-10-23lpc: Clear sync no-response field prior to device probeAndrew Jeffery1-1/+6
Artem Senichev reported[1] his P8 platform was failing to boot from a43e9a66aae9 ("astbmc: Fail SFC init if SIO is unavailable") with the following error:

    [ 110.097168975,3] PLAT: Failed to open PNOR flash controller

I reproduced this behaviour on a Palmetto; we need to ensure the state of the no-response error bit is clear before proceeding with the presence test. The fix appears to resolve the failure to open the PNOR flash controller on Palmetto and doesn't change the expected behaviour on Witherspoon.

[1] https://github.com/open-power/skiboot/issues/197

Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Tested-by: Artem Senichev <a.senichev@yadro.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-10-16phb4: Enable PHB MMIO-0/1 Bars only when mmio window existsVaibhav Jain1-2/+1
Presently phb4_probe_stack() will always enable the PHB MMIO0/1 windows even if they don't exist in phy_map. Hence we do some minor shuffling in phb4_probe_stack() so that the MMIO-0/1 BARs are only enabled if their corresponding MMIO window exists in the phy_map. If the phy_map for an mmio window is '0' we set the corresponding BAR register to '0'. Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Reviewed-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-10-16phb4/capp: Update the expected Eye-catcher for CAPP ucode lidVaibhav Jain1-2/+2
Currently on an FSP based P9 system load_capp_ucode() expects the CAPP ucode lid header to have an eye-catcher magic of 'CAPPPSLL'. However skiboot currently supports only CAPP ucode lids that have an eye-catcher magic of 'CAPPLIDH'. This prevents skiboot from loading the ucode with this error message:

    CAPP: ucode header invalid

We fix this issue by updating load_capp_ucode() to use the eye-catcher value of 'CAPPLIDH' instead of 'CAPPPSLL'. Cc: stable Fixes: e50764d4f2b1("capi: Load capp microcode") Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
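An eye-catcher check of this kind is essentially an 8-byte comparison at the start of the header. The sketch below assumes the magic occupies the first 8 bytes of the lid header and is stored as plain ASCII; the function name is hypothetical and the real header layout is not described in the commit message.

```c
#include <stdbool.h>
#include <string.h>

#define CAPP_EYECATCHER     "CAPPLIDH"
#define CAPP_EYECATCHER_LEN 8

/* Assumption: hdr points at a buffer of at least 8 bytes. */
static bool capp_hdr_valid(const void *hdr)
{
	return memcmp(hdr, CAPP_EYECATCHER, CAPP_EYECATCHER_LEN) == 0;
}
```

A header starting with 'CAPPPSLL' fails this comparison, which matches the "ucode header invalid" symptom when the expected value and the lid contents disagree.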
2018-10-16phb4/capp: Use link width to allocate STQ engines to CAPPVaibhav Jain1-17/+29
Update phb4_init_capp_regs() to allocate STQ Engines to CAPP/PEC2 based on link width instead of always assuming it to be x8. Also re-factor the function slightly to evaluate the link-width only once and cache it so that it can also be used to allocate DMA read engines. Cc: stable Fixes: 47c09cdfe7a3("phb4/capp: Calculate STQ/DMA read engines based on link-width for PEC") Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2018-10-11astbmc: Fail SFC init if SIO is unavailableAndrew Jeffery1-0/+3
If SuperIO is unavailable then the driver cannot perform accesses on which it currently depends. Test for SuperIO availability during initialisation and bail out immediately if it is absent. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>