aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-10-18skiboot 5.9-rc3 release notesv5.9-rc3Stewart Smith1-0/+42
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-16hdata/vpd: Improve vpd node find logicVasant Hegde1-14/+2
Use dt_find_by_name_addr() instead of dt_find_by_name(). That way we can avoid unnecessary memory allocation/cleanup. CC: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-16hdata/vpd: Rework vpd node creation logicVasant Hegde5-399/+363
Presently we traverse SLCA structure to create various FRU nodes under /vpd node. We assumed that children are always contiguous. It happened to be contiguous in P8 and worked fine, but failed in P9 system. So it ended up populating duplicate node under wrong parent. Also failed to populate some of the nodes. Unfortunately there is no way to reach all the children of a given parent from parent node :-( Hence we have to rework vpd creation logic. This patch goes through all the SLCA entries serially and creates vpd node. Assumptions: - SLCA index is always serial (0..n) - When we traverse serially parent entry comes before child - Redundant resources are always consecutive - Populate node if SLCA has 'installed' and 'VPD collected' bit set CC: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Reviewed-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-16Revert "npu2: Add vendor cap for IRQ testing"Alistair Popple1-28/+0
This reverts commit 9817c9e29b6fe00daa3a0e4420e69a97c90eb373 which seems to break setting the PCI dev flag and the link number in the PCIe vendor specific config space. This leads to the device driver attempting to re-init the DL when it shouldn't which can cause HMI's. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-16hw/imc: Fix the pvr (sub_id) for IMC Catalog loadMadhavan Srinivasan1-2/+2
Currently IMC catalog carry multiple dtbs in the pnor partition, one for each power9 major versions. And system pvr value (pvr_type and pvr_major version) is used as sub-id to load the right dtb from the partition. Since minor version of pvr is not used, mask it out. Reported-by: Shriya <shriyak@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-16cpu: Add OPAL_REINIT_CPUS_TM_SUSPEND_DISABLEDMichael Ellerman3-0/+20
Add a new CPU reinit flag, "TM Suspend Disabled", which requests that CPUs be configured so that TM (Transactional Memory) suspend mode is disabled. Currently this always fails, because skiboot has no way to query the state. A future hostboot change will add a mechanism for skiboot to determine the status and return an appropriate error code. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-16skiboot 5.9-rc2 release notesv5.9-rc2Stewart Smith1-0/+246
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-16opal-prd: Fix memory leakVasant Hegde1-0/+1
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Acked-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-15hdata/i2c: update the list of known i2c devsClaudio Carvalho1-4/+33
This updates the list of known i2c devices - as of HDAT spec v10.5e - so that they can be properly identified during the hdat parsing. Signed-off-by: Claudio Carvalho <cclaudio@linux.vnet.ibm.com> Reviewed-by: Oliver O'Halloran <oohal@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-15hdata/i2c: log unknown i2c devicesClaudio Carvalho1-4/+17
An i2c device is unknown if either the i2c device list is outdated or the device is marked as unknown (0xFF) in the hdat. This log both cases. Signed-off-by: Claudio Carvalho <cclaudio@linux.vnet.ibm.com> Reviewed-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-15hdata/i2c: add __packed to the host_i2c_hdr structureClaudio Carvalho1-1/+1
This adds __packed to the host_i2c_hdr structure since it defines an offset that refers to the beginning of the structure. Fixes: 41dc3eb4495c451a405974570f604622a3f829ef Signed-off-by: Claudio Carvalho <cclaudio@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-15opal/cpu: Mark the core as bad while disabling threads of the core.Mahesh Salgaonkar1-0/+10
If any of the core fails to sync its TB during chipTOD initialization, all the threads of that core are disabled. But this does not make linux kernel to ignore the core/cpus. It crashes while bringing them up with below backtrace: [ 38.883898] kexec_core: Starting new kernel cpu 0x0: Vector: 300 (Data Access) at [c0000003f277b730] pc: c0000000001b9890: internal_create_group+0x30/0x304 lr: c0000000001b9880: internal_create_group+0x20/0x304 sp: c0000003f277b9b0 msr: 900000000280b033 dar: 40 dsisr: 40000000 current = 0xc0000003f9f41000 paca = 0xc00000000fe00000 softe: 0 irq_happened: 0x01 pid = 2572, comm = kexec Linux version 4.13.2-openpower1 (jenkins@p89) (gcc version 6.4.0 (Buildroot 2017.08-00006-g319c6e1)) #1 SMP Wed Sep 20 05:42:11 UTC 2017 enter ? for help [c0000003f277b9b0] c0000000008a8780 (unreliable) [c0000003f277ba50] c00000000041c3ac topology_add_dev+0x2c/0x40 [c0000003f277ba70] c00000000006b078 cpuhp_invoke_callback+0x88/0x170 [c0000003f277bac0] c00000000006b22c cpuhp_up_callbacks+0x54/0xb8 [c0000003f277bb10] c00000000006bc68 cpu_up+0x11c/0x168 [c0000003f277bbc0] c00000000002f0e0 default_machine_kexec+0x1fc/0x274 [c0000003f277bc50] c00000000002e2d8 machine_kexec+0x50/0x58 [c0000003f277bc70] c0000000000de4e8 kernel_kexec+0x98/0xb4 [c0000003f277bce0] c00000000008b0f0 SyS_reboot+0x1c8/0x1f4 [c0000003f277be30] c00000000000b118 system_call+0x58/0x6c Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-15doc: Update VPD, ECID documentationVasant Hegde2-4/+16
Recently we added `ecid`, `wafer-id` and `wafer-location` properties under xscom node. Lets document these properties. Also update VPD documentation. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-15hw/imc: pause microcode at bootMadhavan Srinivasan1-0/+24
IMC nest counters has both in-band (ucode access) and out of band access to it. Since not all nest counter configurations are supported by ucode, out of band tools are used to characterize other configuration. So it is prefer to pause the nest microcode at boot to aid the nest out of band tools. If the ucode not paused and OS does not have IMC driver support, then out to band tools will race with ucode and end up getting undesirable values. Patch to check and pause the ucode at boot. OPAL provides APIs to control IMC counters. OPAL_IMC_COUNTERS_INIT is used to initialize these counters at boot. OPAL_IMC_COUNTERS_START and OPAL_IMC_COUNTERS_STOP API calls should be used to start and pause these IMC engines. `doc/opal-api/opal-imc-counters.rst` details the OPAL APIs and their usage. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-15hw/imc: Use ARRAY_SIZE instead of static macroMadhavan Srinivasan2-3/+1
disable_unavailable_units() loops through nest_pmus array to filter out the unsupported nest units from the imc catalog dtb. Current code use a static macro ('MAX_NEST_UNITS') for array limit, instead use ARRAY_SIZE. This will avoid updates to static macro when updating the nest_pmus array. Fixes: 712837cedca06 ('skiboot/imc: Update the nest_pmus array with occ/gpe microcode uav updates') Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-15xive: Fix VP free block group mode false-positive parameter checkNicholas Piggin1-1/+3
The check to ensure the buddy allocation idx is aligned to its allocation order was not taking into account the allocation split. This would result in opal_xive_free_vp_block failures despite giving the same value as returned by opal_xive_alloc_vp_block. E.g., starting then stopping 4 KVM guests gives the following pattern in the host: opal_xive_alloc_vp_block(5)=0x45000020 opal_xive_alloc_vp_block(5)=0x45000040 opal_xive_alloc_vp_block(5)=0x45000060 opal_xive_alloc_vp_block(5)=0x45000080 opal_xive_free_vp_block(0x45000020)=-1 opal_xive_free_vp_block(0x45000040)=0 opal_xive_free_vp_block(0x45000060)=-1 opal_xive_free_vp_block(0x45000080)=0 Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-15doc: clarify locking and async of OPAL_SENSOR_READStewart Smith1-1/+6
Reported-by: Robert Lippert <rlippert@google.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-15hw/p8-i2c: Fix deadlock in p9_i2c_bus_owner_changeAnton Blanchard1-1/+1
When debugging a system where Linux was taking soft lockup errors, I noticed two CPUs were stuck in OPAL: CPU0 lock p8_i2c_recover opal_handle_interrupt CPU1 sync_timer cancel_timer p9_i2c_bus_owner_change occ_p9_interrupt xive_source_interrupt opal_handle_interrupt p8_i2c_recover() is a timer, and is stuck trying to take master->lock. p9_i2c_bus_owner_change() has taken master->lock, but then is stuck waiting for all timers to complete. We deadlock. Fix this by using cancel_timer_async(), as suggested by Oliver. Fixes: 201fd50f208d ("hw/p8-i2c: Fix OCC locking") Suggested-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-11skiboot 5.4.8 release notesStewart Smith1-0/+158
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> (cherry picked from commit 43290f90e46d632ed5a314292c317e6f813c3b74) Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-11FSP/CONSOLE: Limit number of error loggingVasant Hegde1-8/+13
Commit c8a7535f (FSP/CONSOLE: Workaround for unresponsive ipmi daemon) added error logging when buffer is full. In some corner cases kernel may call this function multiple time and we may endup logging error again and again. This patch fixes it by generating error log only once. I think this is enough to indicate something went wrong. Also with previous patch, once console buffer is full, OPAL is returning error to payload from fsp_console_write_buffer_space(). So payload will never call fsp_console_write(). Hence move error logging logic to right place. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-11FSP/CONSOLE: Fix fsp_console_write_buffer_space() callVasant Hegde1-1/+35
Kernel calls fsp_console_write_buffer_space() to check console buffer space availability. If there is enough buffer space to write data, then kernel will call fsp_console_write() to write actual data. In some extreme corner cases (like one explained in commit c8a7535f) console becomes full and this function returns 0 to kernel (or space available in console buffer < next incoming data size). Kernel will continue retrying until it gets enough space. So we will start seeing RCU stalls. This patch keeps track of previous available space. If previous space is same as current means not enough space in console buffer to write incoming data. It may be due to very high console write operation and slow response from FSP -OR- FSP has stopped processing data (ex: because of ipmi daemon died). At this point we will start timer with timeout of SER_BUFFER_OUT_TIMEOUT (10 secs). If situation is not improved within 10 seconds means something went bad. Lets return OPAL_RESOURCE so that kernel can drop console write and continue. CC: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> CC: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [stewart: reset timeout in fsp_console_write() path] Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-11FSP/CONSOLE: Close SOL session during R/RVasant Hegde1-3/+0
Presently we are not closing SOL and FW console sessions during R/R. Host will continue to write to SOL buffer during FSP R/R. If there is heavy console write operation happening during FSP R/R (like running `top` command inside console), then at some point console buffer becomes full. fsp_console_write_buffer_space() returns 0 (or less than required space to write data) to host. While one thread is busy writing to console, if some other threads tries to write data to console we may see RCU stalls (like below) in kernel. kernel call trace: ------------------ [ 2082.828363] INFO: rcu_sched detected stalls on CPUs/tasks: { 32} (detected by 16, t=6002 jiffies, g=23154, c=23153, q=254769) [ 2082.828365] Task dump for CPU 32: [ 2082.828368] kworker/32:3 R running task 0 4637 2 0x00000884 [ 2082.828375] Workqueue: events dump_work_fn [ 2082.828376] Call Trace: [ 2082.828382] [c000000f1633fa00] [c00000000013b6b0] console_unlock+0x570/0x600 (unreliable) [ 2082.828384] [c000000f1633fae0] [c00000000013ba34] vprintk_emit+0x2f4/0x5c0 [ 2082.828389] [c000000f1633fb60] [c00000000099e644] printk+0x84/0x98 [ 2082.828391] [c000000f1633fb90] [c0000000000851a8] dump_work_fn+0x238/0x250 [ 2082.828394] [c000000f1633fc60] [c0000000000ecb98] process_one_work+0x198/0x4b0 [ 2082.828396] [c000000f1633fcf0] [c0000000000ed3dc] worker_thread+0x18c/0x5a0 [ 2082.828399] [c000000f1633fd80] [c0000000000f4650] kthread+0x110/0x130 [ 2082.828403] [c000000f1633fe30] [c000000000009674] ret_from_kernel_thread+0x5c/0x68 Hence lets close SOL (and FW console) during FSP R/R. CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-11FSP/CONSOLE: Do not associate unavailable consoleVasant Hegde1-0/+11
Presently OPAL sends associate/unassociate MBOX command for all FSP serial console (like below OPAL message). We have to check console is available or not before sending this message. OPAL log: ------- [ 5013.227994012,7] FSP: Reassociating HVSI console 1 [ 5013.227997540,7] FSP: Reassociating HVSI console 2 Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Acked-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-11FSP: Disable PSI link whenever FSP tells OPAL about impending R/RVasant Hegde2-18/+8
Commit 42d5d047 fixed scenario where DPO has been initiated, but FSP went into reset before the CEC power down came in. But this is generic issue that can happen in normal shutdown path as well. Hence disable PSI link as soon as we detect FSP impending R/R. CC: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> CC: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-11fsp: return OPAL_BUSY_EVENT on failure sending FSP_CMD_REBOOT / DEEP_REBOOTStewart Smith1-2/+2
See 696d378d7b7295366e115e89a785640bf72a5043 for all the details. Suggested-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-11fsp: return OPAL_BUSY_EVENT on failure sending FSP_CMD_POWERDOWN_NORMStewart Smith3-5/+21
We had a race condition between FSP Reset/Reload and powering down the system from the host: Roughly: FSP Host --- ---- Power on Power on (inject EPOW) (trigger FSP R/R) Processes EPOW event, starts shutting down calls OPAL_CEC_POWER_DOWN (is still in R/R) gets OPAL_INTERNAL_ERROR, spins in opal_poll_events (FSP comes back) spinning in opal_poll_events (thinks host is running) The call to OPAL_CEC_POWER_DOWN is only made once as the reset/reload error path for fsp_sync_msg() is to return -1, which means we give the OS OPAL_INTERNAL_ERROR, which is fine, except that our own API docs give us the opportunity to return OPAL_BUSY when trying again later may be successful, and we're ambiguous as to if you should retry on OPAL_INTERNAL_ERROR. For reference, the linux code looks like this: >static void __noreturn pnv_power_off(void) >{ > long rc = OPAL_BUSY; > > pnv_prepare_going_down(); > > while (rc == OPAL_BUSY || rc == OPAL_BUSY_EVENT) { > rc = opal_cec_power_down(0); > if (rc == OPAL_BUSY_EVENT) > opal_poll_events(NULL); > else > mdelay(10); > } > for (;;) > opal_poll_events(NULL); >} Which means that *practically* our only option is to return OPAL_BUSY or OPAL_BUSY_EVENT. We choose OPAL_BUSY_EVENT for FSP systems as we do want to ensure we're running pollers to communicate with the FSP and do the final bits of Reset/Reload handling before we power off the system. Additionally, we really should update our documentation to point all of these return codes and what action an OS should take. CC: stable Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Acked-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-11skiboot 5.9-rc1 release notesv5.9-rc1Stewart Smith2-0/+531
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-10hdat: Disable TM on Power9 DD 2.1Michael Ellerman1-1/+1
Update pa_features_p9[] to disable TM (Transactional Memory). On DD 2.1 TM is not usable by Linux without other workarounds, so skiboot must disable it. The presence of TM is communicated by setting bit 7 of byte 22 in the pa-features array. As no other bits are set in that byte, we currently have a value of 0x80. To disable TM we set bit 7 to 0, leaving a value of 0x0. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-10hdata/iohub: Check IOHUB child count before usingVasant Hegde1-1/+4
..else we endup getting below calltrace in older system. [ 169.179598388,3] HDIF: child array idx out of range! CPU 085c Backtrace: S: 0000000033d739b0 R: 00000000300136e8 .backtrace+0x40 S: 0000000033d73a50 R: 00000000300a1510 .HDIF_child_arr+0x34 S: 0000000033d73ac0 R: 00000000300a47a8 .io_parse+0x708 S: 0000000033d73c80 R: 000000003009f4ec .parse_hdat+0x177c S: 0000000033d73e30 R: 0000000030014750 .main_cpu_entry+0x148 S: 0000000033d73f00 R: 0000000030002690 boot_entry+0x198 Fixes: ad484081 (hdata: Parse IOSLOT information) CC: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-10FSP/NVRAM: Handle "get vNVRAM statistics" commandVasant Hegde1-0/+41
FSP sends MBOX command (cmd : 0xEB, subcmd : 0x05, mod : 0x00) to get vNVRAM statistics. OPAL doesn't maintain any such statistics. Hence return FSP_STATUS_INVALID_SUBCMD. Sample OPAL log: [16944.384670488,3] FSP: Unhandled message eb0500 [16944.474110465,3] FSP: Unhandled message eb0500 [16945.111280784,3] FSP: Unhandled message eb0500 [16945.293393485,3] FSP: Unhandled message eb0500 With this patch, I don't think FSP will ever call "free vNVRAM" MBOX command. But to be safer side lets return FSP_STATUS_INVALID_SUBCMD for this MBOX command as well. Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Tested-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-10xscom: Do not print error message for 'chiplet offline' return valuesVasant Hegde1-0/+14
xscom_read/write operations returns CHIPLET_OFFLINE when chiplet is offline. Some multicast xscom_read/write requests from HBRT results in xscom operation on offline chiplet(s) and printing below warnings in OPAL console. [ 135.036327572,3] XSCOM: Read failed, ret = -14 [ 135.092689829,3] XSCOM: Read failed, ret = -14 This results in unnecessary bugs. Hence remove error message for multicast SCOM operations. Suggested-by: Daniel M Crowell <dcrowell@us.ibm.com> Tested-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-10ipmi: Convert common debug prints to traceVasant Hegde2-4/+4
OPAL logs messages for every IPMI request from host. Sometime OPAL console is filled with only these messages. This path is pretty stable now and we have enough logs to cover bad path. Hence lets convert these debug message to trace/info message. [ 1356.423958816,7] opal_ipmi_recv(cmd: 0xf0 netfn: 0x3b resp_size: 0x02) [ 1356.430774496,7] opal_ipmi_send(cmd: 0xf0 netfn: 0x3a len: 0x3b) [ 1356.430797392,7] BT: seq 0x20 netfn 0x3a cmd 0xf0: Message sent to host [ 1356.431668496,7] BT: seq 0x20 netfn 0x3a cmd 0xf0: IPMI MSG done Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-10pci-iov: free memory across fast-rebootStewart Smith3-6/+26
pci_set_cap needs a callback to free data and we need to call that when we're doing __pci_reset() We also need to free pcrf entries. In the future, __pci_reset() and pci_remove_bus() need to come together to be one canonical place on how to free a PCI device rather than the two we have now. This patch *purely* focuses on the problem of not leaking memory across fast-reboot. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-10platforms/astbmc: Make platform symbols consistentAndrew Donnellan2-2/+2
The DECLARE_PLATFORM macro takes the supplied platform name and adds "_platform" to it when defining the platform struct. Remove the _platform from the platform name for zaius and romulus so we have "zaius_platform" rather than "zaius_platform_platform", for consistency's sake... Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-06libflash/blocklevel: suppress debug printoutStewart Smith1-1/+0
while this is PR_DEBUG, and we shouldn't be printing it to the console, we do because of a long standing bug in how we do log priorities. So, for the moment at least, just don't print it at all. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-06nx-compress: PR_DEBUG not prerror in the normal caseStewart Smith1-1/+1
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-06external/gard: Clear entire guard partition instead of entry by entrySuraj Jitindar Singh1-11/+18
When using the current implementation of the gard tool to ecc clear the entire GUARD partition it is done one gard record at a time. While this may be ok when accessing the actual flash this is very slow when done from the host over the mbox protocol (on the order of 4 minutes) because the bmc side is required to do many read, erase, writes under the hood. Fix this by rewriting the gard tool reset_partition() function. Now we allocate all the erased guard entries and (if required) apply ecc to the entire buffer. Then we can do one big erase and write of the entire partition. This reduces the time to clear the guard partition to on the order of 4 seconds. Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com> Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: Cyril Bur <cyril.bur@au1.ibm.com Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-06capp: Add lid definitions for P9 DD-2.0 & DD-2.1Vaibhav Jain1-0/+4
Update fsp_lid_map to include CAPP ucode lids for phb4-chipid == 0x200d1 and phb4-chipid == 0x201d1 that corresponds to P9 DD-2.0 & DD-2.1 chips respectively. Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-06hw/lpc-uart: read from RBR to clear character timeout interruptsJeremy Kerr1-0/+21
When using the aspeed SUART, we see a condition where the UART sends continuous character timeout interrupts. This change adds a (heavily commented) dummy read from the RBR to clear the interrupt condition on init. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-06fast-reset: by default (if possible)Stewart Smith1-2/+2
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-02phb4: Reassign link_retries counter in IODA purgeGuilherme G. Piccoli1-0/+2
Recently, a link_retries counter was added in pci/phb4 in order Skiboot can retry to train a link some times - default number of attempts to retrain a link is 3. Happens that, if during a regular boot process we exhaust the link retries and fail to train a PHB, the variable link_retries is stuck in 0. If a kdump happens later, a PHB reset procedure is triggered by Linux and, since we have a decrement-and-test in this variable, we end up setting it to -1; it's unsigned, hence we get an overflow. This patch fixes the issue by reassigning the default value to link_retries in every IODA purge. Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-02phb4: Add additional adapter to retrain whitelistJohn W Walthour1-2/+3
The single port version of the ConnectX-5 has a different device ID 0x1017. Updated descriptions to match pciutils database. Signed-off-by: John Walthour <jwalthour@us.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-02i2c: Move tpm i2c wrapper code into coreAndrew Donnellan4-95/+123
The TPM code has a wrapper around the main i2c API to allow synchronous use. Move it into core/i2c.c so it can be used by other possible users. In particular, a future patch will use this to drive OpenCAPI device resets during boot time. Cc: Claudio Carvalho <cclaudio@linux.vnet.ibm.com> Cc: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-02hdata: Fix sparse warning in add_ecid_data()Vasant Hegde1-1/+1
warning: hdata/spira.c:433:33: warning: restricted beint64_t degrades to integer Reported-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-02hdata/i2c: hdat_i2c_labels and hdat_i2c_devs should be staticStewart Smith1-2/+2
Silences sparse warnings: hdata/i2c.c:103:23: warning: symbol 'hdat_i2c_labels' was not declared. Should it be static? hdata/i2c.c:93:22: warning: symbol 'hdat_i2c_devs' was not declared. Should it be static? Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-02hdata/spira: silence 'so big is unsigned long' sparse warningsStewart Smith1-2/+2
hdata/spira.c:1401:17: warning: constant 0x8000000009010c3f is so big it is unsigned long hdata/spira.c:1401:17: warning: constant 0x800000000c010c3f is so big it is unsigned long hdata/spira.c:1401:17: warning: constant 0x8000000009010c3f is so big it is unsigned long hdata/spira.c:1401:17: warning: constant 0x800000000c010c3f is so big it is unsigned long Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-02phb4: make retry_whitelist staticStewart Smith1-1/+1
Silences sparse warning: hw/phb4.c:XX:20: warning: symbol 'retry_whitelist' was not declared. Should it be static? Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-10-02hdata/spira: be32_to_cpu() doesn't work on 8bitsStewart Smith1-1/+1
Sparse warning: hdata/spira.c:1458:41: warning: incorrect type in argument 1 (different base types) expected restricted beint32_t [usertype] be_val got unsigned char const [unsigned] [usertype] link_speed Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Reviewed-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-28capi: Mask Psl Credit timeout error for P9Vaibhav Jain1-0/+4
Mask the PSL credit timeout error in CAPP FIR Mask register bit(46). As per the h/w team this error is now deprecated and shouldn't cause any fir-action for P9. Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Acked-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
2017-09-28cpu: idle POWER9 power management implementationNicholas Piggin6-27/+205
Add pm idle support to POWER9. IPIs are implemented with doorbells. POWER9 can use the EC=ESL=0 (lite) stop when sreset is not available. EC=ESL=1 state with RL=3 is enabled when we have a sreset wakeup. Deep idle states are not implemented. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>