Age | Commit message | Author | Files | Lines
2019-06-24 | Move FSP specific op-panel calls to platform.exit() | Stewart Smith | 2 | -10/+7
We move the platform exit call much closer to executing the kernel, which should all be safe, and in fact a much better time to do watchdog related things. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-24 | core/init.c: no longer need to include FSP headers | Stewart Smith | 1 | -2/+0
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-24 | core/init.c: use if (platform.bmc) rather than !fsp_present() | Stewart Smith | 1 | -1/+1
This decouples FSP platform from core skiboot logic by using this small hack that may/may not be a good idea (although is already used elsewhere, so at least we're consistent). Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-24 | host_services_occ_base_setup is core homer code not host_services | Stewart Smith | 3 | -25/+25
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-24 | Move core/hostservices.c to platforms/ibm-fsp/ | Stewart Smith | 6 | -6/+4
It's only used on FSP systems so should really just be part of that platform support. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-24 | Split FSP OCC code out into hw/fsp/ | Stewart Smith | 3 | -388/+426
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-24 | Move more FSP functions to FSP platform | Stewart Smith | 2 | -12/+13
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-24 | op_display: make platform function rather than "FSP" specific | Stewart Smith | 19 | -4/+34
We have an implementation for non-FSP systems now, and we shouldn't be calling that from code in an fsp/ directory, so move op_display() to a platform function. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
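The idea of turning a display call into a platform function can be sketched as a function pointer stored in a per-platform structure. This is only an illustration: `struct sketch_platform`, `sketch_op_display()` and `sketch_panel_display()` are hypothetical names, not skiboot's actual `struct platform` layout or `op_display()` signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for a platform description. */
struct sketch_platform {
	const char *name;
	void (*op_display)(int code);	/* optional hook, may be NULL */
};

/* Last code shown by the fake panel below, for demonstration only. */
int sketch_last_code = -1;

/* One platform's implementation of the hook. */
void sketch_panel_display(int code)
{
	sketch_last_code = code;
}

/* Core code calls through the hook and never names a specific platform. */
bool sketch_op_display(const struct sketch_platform *p, int code)
{
	assert(p != NULL);
	if (!p->op_display)
		return false;	/* this platform has no op-panel */
	p->op_display(code);
	return true;
}
```

With this shape, generic code no longer needs to live in (or include headers from) a platform-specific directory; a platform without a panel simply leaves the hook unset.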
2019-06-24 | doc: travis-ci deploy docs! | Stewart Smith | 5 | -8/+52
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-19 | Disable -Waddress-of-packed-member for GCC9 | Stewart Smith | 1 | -1/+2
We throw a bunch of errors in errorlog code otherwise, which we should fix, but we don't *have* to yet. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-19 | hdata/vpd: fix printing (char*)0x00 | Stewart Smith | 1 | -4/+5
GCC9 now catches this bug:

    In file included from hdata/vpd.c:17:
    In function ‘vpd_vini_parse’,
        inlined from ‘vpd_data_parse’ at hdata/vpd.c:416:3:
    /home/stewart/skiboot/include/skiboot.h:93:31: error: ‘%s’ directive argument is null [-Werror=format-overflow=]
       93 | #define prlog(l, f, ...) do { _prlog(l, pr_fmt(f), ##__VA_ARGS__); } while(0)
          |                               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    hdata/vpd.c:390:5: note: in expansion of macro ‘prlog’
      390 |     prlog(PR_WARNING,
          |     ^~~~~
    hdata/vpd.c: In function ‘vpd_data_parse’:
    hdata/vpd.c:391:46: note: format string is defined here
      391 |         "VPD: CCIN desc not available for: %s\n",
          |                                            ^~
    cc1: all warnings being treated as errors

Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
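One common way to avoid passing a null pointer to a `%s` directive is to substitute a placeholder string before formatting. The sketch below shows that pattern only; it is not necessarily how the commit fixed hdata/vpd.c, and `sketch_str_or()` and `sketch_format_ccin_warning()` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: never hand a NULL pointer to a "%s" directive. */
const char *sketch_str_or(const char *s, const char *fallback)
{
	assert(fallback != NULL);
	return s ? s : fallback;
}

/* Formats a warning in the spirit of the one quoted in the GCC output. */
int sketch_format_ccin_warning(char *buf, size_t len, const char *ccin)
{
	return snprintf(buf, len, "VPD: CCIN desc not available for: %s\n",
			sketch_str_or(ccin, "(unknown)"));
}
```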
2019-06-19 | struct p9_sbe_msg doesn't need to be packed | Stewart Smith | 1 | -1/+1
Only the reg member is sent anywhere (via xscom_write), so the structure does not need to be packed. Fixes GCC9 build problem:

    hw/sbe-p9.c: In function ‘p9_sbe_msg_send’:
    hw/sbe-p9.c:270:9: error: taking address of packed member of ‘struct p9_sbe_msg’ may result in an unaligned pointer value [-Werror=address-of-packed-member]
      270 |  data = &msg->reg[0];
          |         ^~~~~~~~~~~~

Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
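To illustrate why GCC warns here, the sketch below compares a packed and an unpacked structure with the same fields. The layout is made up for the example (it is not the real `struct p9_sbe_msg`), and `__attribute__((packed))` is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packed layout: 'reg' starts at offset 1, so a pointer
 * to reg[0] is not guaranteed to be suitably aligned for uint64_t. */
struct sketch_packed_msg {
	uint8_t tag;
	uint64_t reg[4];
} __attribute__((packed));

/* Same fields without the attribute: the compiler pads after 'tag'. */
struct sketch_plain_msg {
	uint8_t tag;
	uint64_t reg[4];
};

/* Reading a packed member by value is fine; the compiler generates an
 * access that tolerates the misalignment. Keeping a uint64_t pointer
 * to the member is what the warning is about. */
uint64_t sketch_read_reg(const struct sketch_packed_msg *m, size_t i)
{
	assert(m != NULL && i < 4);
	return m->reg[i];
}
```

Dropping the attribute, as the commit does for its structure, removes the problem at the source when the packed layout is not actually required.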
2019-06-19 | travis: add back Coverity Scan | Stewart Smith | 1 | -0/+9
We need this in the master branch so that the coverity_scan branch gets picked up. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-13 | sparse: fix incorrect type assignment in core/ipmi.c | Stewart Smith | 1 | -1/+2
    core/ipmi.c:262:44: warning: incorrect type in assignment (different base types)
    core/ipmi.c:262:44:    expected unsigned long long [usertype] opal_event_ipmi_recv
    core/ipmi.c:262:44:    got restricted beint64_t

Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-13 | sparse: fix bt_lock should be static | Stewart Smith | 1 | -1/+1
Fix this sparse warning: core/stack.c:123:13: warning: symbol 'bt_lock' was not declared. Should it be static? Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-13 | sparse: fix OPAL API function argument test | Stewart Smith | 1 | -2/+9
This is based on a patch from Cédric from way back in 2015 (probably around the time we split opal.h into opal-api.h and opal-internal.h) that for some reason never got merged. It means that false warnings spat out of sparse due to our crazy-ass macro usage get silenced. It fixes warnings such as "Using plain integer as NULL pointer" when the prototype of the function is tested with the __test_args# defines. Suggested-by: Cédric Le Goater <clg@fr.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-13 | external/mambo: fix tcl startup code for mambo bogus net (repost) | Aaron Sawdey | 1 | -4/+6
Repost of the same thing with Signed-off-by, and Acked-by from Michael Neuling.

This fixes a couple issues with external/mambo/skiboot.tcl so I can use the mambo bogus net.

* newer distros (ubuntu 18.04) allow tap device to have a user specified name instead of just tapN so we need to pass in a name not a number.
* need some kind of default for net_mac, and need the mconfig for it to be set from an env var.

Thanks, Aaron

Acked-by: Michael Neuling <mikey at neuling.org>
Signed-off-by: Aaron Sawdey <sawdey at linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-13 | npu2: Purge cache when resetting a GPU | Reza Arbab | 1 | -0/+6
After putting all of a GPU's links in reset, do a cache purge in case we have CPU cache lines belonging to the now-inaccessible GPU memory. Fixes: 68d11e4460ec ("npu2: Reset NVLinks when resetting a GPU") Cc: skiboot-stable@lists.ozlabs.org Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-13 | npu2: Fix typo | Reza Arbab | 1 | -2/+2
Change "brigde" to "bridge". Fixes: 68d11e4460ec ("npu2: Reset NVLinks when resetting a GPU") Signed-off-by: Reza Arbab <arbab@linux.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-07 | doc: Futher document OPAL_REINIT_CPUS_MMU_* modes | Stewart Smith | 1 | -0/+14
Fixes: https://github.com/open-power/skiboot/issues/134 Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-04 | doc: Add OPAL tokens 46-48 as never used | Stewart Smith | 1 | -0/+6
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-04 | opal/hmi: Report NPU2 checkstop reason | Frederic Barrat | 1 | -0/+44
The NPU2 is currently not passing any information to linux to explain the cause of an HMI. NPU2 has three Fault Isolation Registers and over 30 of those FIR bits are configured to raise an HMI by default. We won't be able to fit all possible state in the 32-bit xstop_reason field of the HMI event, but we can still try to encode up to 4 HMI reasons. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
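Fitting up to four reasons into a 32-bit field can be sketched as packing one 8-bit code per byte. The byte order, the 8-bit width per reason and the function names below are assumptions made for illustration; the commit's actual encoding of `xstop_reason` may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_REASONS 4

/* Pack up to four 8-bit reason codes into one 32-bit word, first
 * reason in the most significant byte. Extra reasons are dropped. */
uint32_t sketch_pack_reasons(const uint8_t *reasons, size_t count)
{
	uint32_t word = 0;
	size_t i;

	for (i = 0; i < count && i < SKETCH_MAX_REASONS; i++)
		word |= (uint32_t)reasons[i] << (24 - 8 * i);
	return word;
}

/* Extract the reason stored in a given slot (0 is the first one). */
uint8_t sketch_unpack_reason(uint32_t word, unsigned int slot)
{
	assert(slot < SKETCH_MAX_REASONS);
	return (uint8_t)(word >> (24 - 8 * slot));
}
```

A scheme like this trades completeness for compactness: with more than 30 possible FIR bits, only the first few that fired can be reported.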
2019-06-04 | npu2-opencapi: Mask 2 XSL errors | Frederic Barrat | 1 | -9/+20
Commit f8dfd699f584 ("hw/npu2: Setup an error interrupt on some opencapi FIRs") converted some FIR bits default action from system checkstop to raising an error interrupt. For 2 XSL error events that can be triggered by a misbehaving AFU, the error interrupt is raised twice, once for each link (the XSL logic in the NPU is shared between 2 links). So a badly behaving AFU could impact another, unsuspecting opencapi adapter. It doesn't look good and it turns out we can do better. We can mask those 2 XSL errors. The error will also be picked up by the OTL logic, which is per link. So we'll still get an error interrupt, but only on the relevant link, and the other opencapi adapter can stay functional. Fixes: f8dfd699f584 ("hw/npu2: Setup an error interrupt on some opencapi FIRs") Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-04 | core/cpu: Fix theoretical use-after-free if no_return job returns | Stewart Smith | 1 | -2/+6
Practically speaking this should/would never happen, but static analysis caught it, and just *maybe* at some time in the future, someone will have less of a terrible day debugging something terrible if we fix it. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-04 | hw/lpc: Fix theoretical possible out-of-bounds-read | Stewart Smith | 1 | -2/+2
number of elements versus starting counting from 0. Found by static analysis. Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-04 | Remove P7 remnants: hw/cec.c, apollo platform | Stewart Smith | 5 | -209/+2
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-04 | Remove POWER7 and POWER7+ support | Stewart Smith | 44 | -6986/+42
It's been a good long while since either OPAL POWER7 user touched a machine, and even longer since they'd have been okay using an old version rather than tracking master. There's also been no testing of OPAL on POWER7 systems for an awfully long time, so it's pretty safe to assume that it's very much bitrotted. It also saves a whole 14kb of xz compressed payload space. Signed-off-by: Stewart Smith <stewart@linux.ibm.com> Enthusiasticly-Acked-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-03 | opal-prd: Fix prd message size issue | Vasant Hegde | 1 | -4/+23
If the buffer for PRD messages is too small, the read_prd_msg() call fails with the error below. The caller does not reallocate a large enough buffer, and it is hard to guess the right size.

Sample log:

    Mar 28 03:31:43 zz24p1 opal-prd: FW: error reading from firmware: alloc 32 rc -1: Invalid argument
    Mar 28 03:31:43 zz24p1 opal-prd: FW: error reading from firmware: alloc 32 rc -1: Invalid argument
    Mar 28 03:31:43 zz24p1 opal-prd: FW: error reading from firmware: alloc 32 rc -1: Invalid argument
    ....

Let's use the `opal-msg-size` device tree property to allocate memory for PRD messages.

Cc: Skiboot Stable <skiboot-stable@lists.ozlabs.org>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
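Sizing the buffer from a device tree property can be sketched as decoding one big-endian 32-bit cell and falling back to a default when the property is missing or malformed. How opal-prd actually locates and reads `opal-msg-size` is not shown in the log; the helper names and the 64-byte fallback below are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_DEFAULT_MSG_SIZE 64u	/* hypothetical fallback */

/* Device tree cells are stored big-endian; decode one 32-bit cell. */
uint32_t sketch_be32_decode(const uint8_t *cell)
{
	assert(cell != NULL);
	return ((uint32_t)cell[0] << 24) | ((uint32_t)cell[1] << 16) |
	       ((uint32_t)cell[2] << 8) | (uint32_t)cell[3];
}

/* Choose the buffer size: use the property when it is present and
 * well-formed, otherwise fall back to the default. */
size_t sketch_prd_msg_size(const uint8_t *prop, size_t prop_len)
{
	uint32_t size;

	if (!prop || prop_len != 4)
		return SKETCH_DEFAULT_MSG_SIZE;
	size = sketch_be32_decode(prop);
	return size ? size : SKETCH_DEFAULT_MSG_SIZE;
}
```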
2019-06-03 | prd: Implement generic FSP - HBRT interface | Vasant Hegde | 7 | -2/+149
This patch implements a generic interface to pass data from FSP to HBRT during runtime (FSP -> OPAL -> opal-prd -> HBRT). OPAL gets a notification from FSP for new HBRT messages. We convert the MBOX message to firmware_notify format and send it to HBRT. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-03 | prd: Implement generic HBRT - FSP interface | Vasant Hegde | 9 | -1/+233
This patch implements a generic interface to pass data from HBRT to FSP during runtime (HBRT -> opal-prd -> kernel -> OPAL -> FSP). HBRT sends data via the firmware_request interface. We have to convert that to MBOX format and send it to FSP. OPAL uses TCE mapped memory to send data, and FSP reuses the same memory for its response. Once processing is complete, FSP sends a response to OPAL. Finally, OPAL calls HBRT with a firmware_response message.

This also introduces a new opal_msg type (OPAL_MSG_PRD2) to pass bigger PRD messages to the kernel:

- if (prd_msg > OPAL_MSG_FIXED_PARAMS_SIZE) use OPAL_MSG_PRD2

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
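The size-based choice of message type described above can be sketched as follows. The enum and function names are hypothetical, and the 64-byte threshold is taken from the "8 params (64 bytes)" figure mentioned elsewhere in this log rather than from the actual header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 8 parameters of 8 bytes each, per the description in this log. */
#define SKETCH_FIXED_PARAMS_SIZE 64

enum sketch_msg_type {
	SKETCH_MSG_PRD,		/* fits in the fixed parameter area */
	SKETCH_MSG_PRD2,	/* needs the larger, variable-size path */
};

/* Pick the message type from the size of the PRD message. */
enum sketch_msg_type sketch_prd_msg_type(size_t prd_msg_size)
{
	assert(SKETCH_FIXED_PARAMS_SIZE == 8 * sizeof(uint64_t));
	return prd_msg_size > SKETCH_FIXED_PARAMS_SIZE ?
		SKETCH_MSG_PRD2 : SKETCH_MSG_PRD;
}
```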
2019-06-03 | prd: Validate _opal_queue_msg() return value | Vasant Hegde | 1 | -6/+12
To be on the safe side, validate the _opal_queue_msg() return value. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Acked-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-03 | hostservices: Do not call hservices_init on ZZ | Vasant Hegde | 1 | -2/+1
We have user space opal-prd running on ZZ. We don't use host services. Hence do not call hservices_init(). CC: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Acked-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-03 | core/test/run-msg: Add callback function test | Vasant Hegde | 1 | -1/+18
- Test callback function
- Add test case to test OPAL_PARTIAL return value
- Add test for OPAL_PARAMETER return value

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-03 | opal-msg: Enhance opal-get-msg API | Vasant Hegde | 4 | -25/+56
Linux uses the opal_get_msg (OPAL_GET_MSG) API to get OPAL messages. This interface supports up to 8 params (64 bytes). We have a requirement to send bigger data to Linux. This patch enhances OPAL to send bigger data to Linux.

- Linux will use the "opal-msg-size" device tree property to allocate memory for OPAL messages (the previous patch increased "opal-msg-size" to 64K).
- Replace the `reserved` field in "struct opal_msg" with `size`, so that the Linux side opal_get_msg user can detect the actual data size.
- If buffer size < actual message size, opal_get_msg will copy partial data and return OPAL_PARTIAL to Linux.
- Add a new variable "extended" to the "opal_msg_entry" structure to keep track of messages that have more than 64 bytes of data. We allocate separate memory for these messages and release it once the kernel has consumed the message.

Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Cc: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
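The partial-copy behaviour can be sketched as "copy what fits, then report truncation". This is a simplified model, not the real opal_get_msg() implementation: the function signature is hypothetical and the status values are illustrative stand-ins for the OPAL return codes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative status codes standing in for OPAL_SUCCESS/OPAL_PARTIAL. */
enum { SKETCH_SUCCESS = 0, SKETCH_PARTIAL = -3 };

/* Copy as much of the pending message as fits into the reader's
 * buffer and tell the reader whether the message was truncated. */
int sketch_get_msg(void *buf, size_t buf_size,
		   const void *msg, size_t msg_size, size_t *copied)
{
	size_t n = msg_size < buf_size ? msg_size : buf_size;

	assert(buf && msg && copied);
	memcpy(buf, msg, n);
	*copied = n;
	return n < msg_size ? SKETCH_PARTIAL : SKETCH_SUCCESS;
}
```

A reader that allocated its buffer from the advertised maximum message size should never see the partial status; an older reader with a 64-byte buffer would.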
2019-06-03 | opal-msg: Pass parameter size to _opal_queue_msg() | Vasant Hegde | 5 | -20/+27
Currently _opal_queue_msg() takes the number of parameters. So far this was fine, as opal_queue_msg() supported only a fixed number of parameters (8 * 8 bytes). Soon we are going to introduce variable size parameters, hence num_params -> params_size. Cc: Jeremy Kerr <jk@ozlabs.org> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Cc: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Acked-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-03 | opal-msg: Pass return value to callback handler | Vasant Hegde | 6 | -20/+21
The kernel calls the opal_get_msg() API to read an OPAL message. In this path OPAL calls the "callback" handler to inform the caller that the kernel has read the message. It assumes that the read always succeeds. This assumption was fine as long as messages were always a fixed size. The next patch introduces variable size OPAL messages. In that situation opal_get_msg() may fail due to insufficient buffer size (e.g. an old kernel combined with a new OPAL). So let's add a `return value` parameter to the "callback" handler, so that the caller knows the kernel didn't read the message and can take appropriate action. Cc: Jeremy Kerr <jk@ozlabs.org> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Cc: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Acked-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
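Passing the read status to the completion callback can be sketched like this. The types and names are hypothetical and much simpler than skiboot's message queue; the point is only that the callback receives a status instead of assuming success.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: receives the sender's context and the
 * status of the read (0 on success, negative on failure). */
typedef void (*sketch_consumed_fn)(void *data, int status);

struct sketch_entry {
	size_t size;			/* size of the queued message */
	sketch_consumed_fn consumed;	/* optional completion callback */
	void *data;			/* context passed to the callback */
};

/* Deliver one queued entry. The callback learns whether the reader's
 * buffer was large enough, so the sender can react to a failed read. */
int sketch_deliver(const struct sketch_entry *e, size_t reader_buf_size)
{
	int status;

	assert(e != NULL);
	status = e->size <= reader_buf_size ? 0 : -1;
	if (e->consumed)
		e->consumed(e->data, status);
	return status;
}

/* Example callback: record the status for later inspection. */
void sketch_record_status(void *data, int status)
{
	*(int *)data = status;
}
```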
2019-06-03 | core/opal: Increase opal-msg-size size | Vasant Hegde | 2 | -1/+4
Kernel will use `opal-msg-size` property to allocate memory for opal_msg. We want to send bigger data from OPAL to kernel. Hence increase opal-msg-size to 64K. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Acked-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-03 | trustedboot: Change PCR and event_type for the skiboot events | Claudio Carvalho | 1 | -13/+22
The existing skiboot events are being logged as EV_ACTION. However, the TCG PC Client spec says that EV_ACTION events should have one of the pre-defined strings in the event field recorded in the event log. For instance:

- "Calling Ready to Boot",
- "Entering ROM Based Setup",
- "User Password Entered", and
- "Start Option ROM Scan".

None of the EV_ACTION pre-defined strings are applicable to the existing skiboot events. Based on recent discussions with other POWER teams, this patch proposes a convention on what PCR and event types should be used for skiboot events. This also changes the skiboot source code to follow the convention.

The TCG PC Client spec defines several event types other than EV_ACTION. However, many of them are specific to UEFI events and some others are related to platform or CRTM events, which are more applicable to hostboot events.

Currently, most of the hostboot events are extended to PCR[0,1] and logged as either EV_PLATFORM_CONFIG_FLAGS, EV_S_CRTM_CONTENTS or EV_POST_CODE. The "Node Id" and "PAYLOAD" events, though, are extended to PCR[4,5,6] and logged as EV_COMPACT_HASH.

For the lack of an event type that fits the specific purpose, EV_COMPACT_HASH seems to be the most adequate one due to its flexibility. According to the TCG PC Client spec:

- May be used for any PCR except 0, 1, 2 and 3.
- The event field may be informative or may be hashed to generate the digest field, depending on the component recording the event.

Additionally, PCR[4,5] seem to be the most adequate PCRs. They would be used for skiboot and some skiroot events. According to the TCG PC Client spec, PCR[4] is intended to represent the entity that manages the transition between the pre-OS and OS-present state of the platform. PCR[4], along with PCR[5], identifies the initial OS loader.

In summary, for skiboot events:

- Events that represent data should be extended to PCR 4.
- Events that represent config should be extended to PCR 5.
- For the lack of an event type that fits the specific purpose, both data and config events should be logged as EV_COMPACT_HASH.

Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
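The convention summarised above can be sketched as a small classification function. The type and function names are hypothetical, and the numeric value used for EV_COMPACT_HASH is an assumption that should be checked against the TCG PC Client spec before relying on it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed numeric value for EV_COMPACT_HASH; verify against the spec. */
#define SKETCH_EV_COMPACT_HASH 0x0000000Cu

enum sketch_event_kind {
	SKETCH_EVENT_DATA,	/* event represents data */
	SKETCH_EVENT_CONFIG,	/* event represents configuration */
};

struct sketch_tpm_event {
	unsigned int pcr;	/* PCR the measurement is extended to */
	uint32_t event_type;	/* type recorded in the event log */
};

/* Apply the convention: data goes to PCR 4, config goes to PCR 5,
 * and both are logged with the same event type. */
struct sketch_tpm_event sketch_classify(enum sketch_event_kind kind)
{
	struct sketch_tpm_event ev;

	assert(kind == SKETCH_EVENT_DATA || kind == SKETCH_EVENT_CONFIG);
	ev.pcr = (kind == SKETCH_EVENT_DATA) ? 4 : 5;
	ev.event_type = SKETCH_EV_COMPACT_HASH;
	return ev;
}
```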
2019-06-03 | opal-api: Document hole in the token list | Oliver O'Halloran | 1 | -0/+3
OPAL call tokens 46, 47, and 48 have been unused since the dawn of time as far as I can tell. Document the hole so the next person to assume it's contiguous doesn't get tripped up by it. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-03 | npu2: Clear fence state for a brick being reset | Alexey Kardashevskiy | 1 | -0/+8
Resetting a GPU before resetting an NVLink leads to occasional HMIs which fence some bricks and prevent the "reset_ntl" procedure from succeeding at the "reset_ntl_release" step - the host system requires reboot; there may be other cases like this as well. This adds clearing of the fence bit in NPU.MISC.FENCE_STATE for the NVLink which we are about to reset. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-03 | hw/phb4: Make phb4_training_trace() more general | Oliver O'Halloran | 1 | -19/+25
phb4_training_trace() is used to monitor the Link Training Status State Machine (LTSSM) of the PHB's data link layer. Currently it is only used to observe the LTSSM while bringing up the link, but sometimes it's useful to see what's occurring in other situations (e.g. link disable, or secondary bus reset). This patch renames it to phb4_link_trace() and lets the caller specify the target LTSSM state and a flexible timeout to help in these situations. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
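A trace helper that takes a target state and a timeout can be sketched as a bounded polling loop. This is not the real phb4_link_trace(): the state source is abstracted behind a hypothetical callback, time is counted in loop iterations rather than real milliseconds, and all names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accessor returning the current link state. */
typedef uint32_t (*sketch_read_state_fn)(void *ctx);

/* Poll once per simulated millisecond until the target state is seen
 * or the timeout expires. Returns true if the target was reached. */
bool sketch_link_trace(sketch_read_state_fn read_state, void *ctx,
		       uint32_t target_state, unsigned int timeout_ms,
		       unsigned int *elapsed_ms)
{
	unsigned int t;

	assert(read_state != NULL);
	for (t = 0; t <= timeout_ms; t++) {
		if (read_state(ctx) == target_state) {
			if (elapsed_ms)
				*elapsed_ms = t;
			return true;
		}
		/* Real firmware would delay for about 1ms here. */
	}
	if (elapsed_ms)
		*elapsed_ms = timeout_ms;
	return false;
}

/* Fake state source: replays a fixed sequence, repeating the last one. */
struct sketch_fake_link {
	const uint32_t *states;
	unsigned int count;
	unsigned int pos;
};

uint32_t sketch_fake_read(void *ctx)
{
	struct sketch_fake_link *l = ctx;
	uint32_t s = l->states[l->pos];

	if (l->pos + 1 < l->count)
		l->pos++;
	return s;
}
```

Making the target state a parameter is what lets the same loop watch for "link disabled" or a post-reset state instead of only "link trained".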
2019-06-03 | hw/phb4: Set trace enable where it's used | Oliver O'Halloran | 1 | -12/+22
The current LTSSM state was added to the PHB4 link trace output in 961547bceed3 ("phb4: Enhanced PCIe training tracing"). That patch split enabling the LTSSM state output from the rest of the tracing code in phb4_training_trace() to ensure that it would capture events from right after PERST is lifted. This is not really necessary since LTSSM state changes occur over milliseconds. We lose nothing by delaying the enable slightly so this patch moves it into phb4_training_trace() to keep the tracing code in one place. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-03 | hw/phb4: Add missing LTSSM states | Oliver O'Halloran | 1 | -0/+6
The "disabled" and "loopback" states are missing from the table. We never expect to see the second, but the first does occasionally come up. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-03 | hw/phb4: Use read/write_reg in assert_perst | Oliver O'Halloran | 1 | -2/+2
While the PHB is fenced we can't use the MMIO interface to access PHB registers. While processing a complete reset we inject a PHB fence to isolate the PHB from the rest of the system, because the PHB won't respond to MMIOs from the rest of the system while being reset. We assert PERST after the fence has been erected, which requires us to use the XSCOM indirect interface to access the PHB registers rather than the MMIO interface. Previously we did that when asserting PERST in the CRESET path. However, in b8b4c79d4419 ("hw/phb4: Factor out PERST control") this was re-written to use the raw in_be64() accessor. This means that CRESET would not be asserted in the reset path. On some Mellanox cards this would prevent them from re-loading their firmware when the system was fast-reset. This patch fixes the problem by replacing the raw {in|out}_be64() accessors with the phb4_{read|write}_reg() functions. Reported-by: Carol L Soto <clsoto@us.ibm.com> Fixes: b8b4c79d4419 ("hw/phb4: Factor out PERST control") Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Tested-by: Carol L Soto <clsoto@us.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-03 | hw/phb4: Assert Link Disable bit after ETU init | Oliver O'Halloran | 1 | -0/+6
The cursed RAID card in ozrom1 has a bug where it ignores PERST being asserted. The PCIe Base spec is a little vague about what happens while PERST is asserted, but it does clearly specify that when PERST is de-asserted the Link Training and Status State Machine (LTSSM) of a device should return to the initial state (Detect) defined in the spec and the link training process should restart.

This bug was worked around in 9078f8268922 ("phb4: Delay training till after PERST is deasserted") by setting the link disable bit at the start of the FRESET process and clearing it after PERST was de-asserted. Although this fixed the bug, the patch offered no explanation of why the fix worked.

In b8b4c79d4419 ("hw/phb4: Factor out PERST control") the link disable workaround was moved into phb4_assert_perst(). This is always called in the CRESET case, but a following patch resulted in assert_perst() not being called if phb4_freset() was entered following a CRESET, since p->skip_perst was set in the CRESET handler. This is bad since a side-effect of the CRESET is that the Link Disable bit is cleared. This, combined with the RAID card ignoring PERST, results in the PCIe link being trained by the PHB while we're waiting out the 100ms ETU reset time.
If we hack skiboot to print a DLP trace after returning from phb4_init_hw() we get:

    PHB#0001[0:1]: Initialization complete
    PHB#0001[0:1]: TRACE:0x0000102101000000 0ms presence GEN1:x16:polling
    PHB#0001[0:1]: TRACE:0x0000001101000000 23ms GEN1:x16:detect
    PHB#0001[0:1]: TRACE:0x0000102101000000 23ms presence GEN1:x16:polling
    PHB#0001[0:1]: TRACE:0x0000183101000000 29ms training GEN1:x16:config
    PHB#0001[0:1]: TRACE:0x00001c5881000000 30ms training GEN1:x08:recovery
    PHB#0001[0:1]: TRACE:0x00001c5883000000 30ms training GEN3:x08:recovery
    PHB#0001[0:1]: TRACE:0x0000144883000000 33ms presence GEN3:x08:L0
    PHB#0001[0:1]: TRACE:0x0000154883000000 33ms trained GEN3:x08:L0
    PHB#0001[0:1]: CRESET: wait_time = 100
    PHB#0001[0:1]: FRESET: Starts
    PHB#0001[0:1]: FRESET: Prepare for link down
    PHB#0001[0:1]: FRESET: Assert skipped
    PHB#0001[0:1]: FRESET: Deassert
    PHB#0001[0:1]: TRACE:0x0000154883000000 0ms trained GEN3:x08:L0
    PHB#0001[0:1]: TRACE: Reached target state
    PHB#0001[0:1]: LINK: Start polling
    PHB#0001[0:1]: LINK: Electrical link detected
    PHB#0001[0:1]: LINK: Link is up
    PHB#0001[0:1]: LINK: Went down waiting for stabilty
    PHB#0001[0:1]: LINK: DLP train control: 0x0000105101000000
    PHB#0001[0:1]: CRESET: Starts

What has happened here is that the link is trained to 8x Gen3 33ms after we return from phb4_init_hw(), before we've waited the 100ms that we normally wait after re-initialising the ETU. When we "deassert" PERST later on in the FRESET handler, the link is in the L0 (normal) state. At this point we try to read from the Vendor/Device ID register to verify that the link is stable and immediately get a PHB fence due to a PCIe Completion Timeout. Skiboot attempts to recover by doing another CRESET, but this will encounter the same issue.

This patch fixes the problem by setting the Link Disable bit (by calling phb4_assert_perst()) immediately after we return from phb4_init_hw().
This prevents the link from being trained while PERST is asserted which seems to avoid the Completion Timeout. With the patch applied we get:

    PHB#0001[0:1]: Initialization complete
    PHB#0001[0:1]: TRACE:0x0000102101000000 0ms presence GEN1:x16:polling
    PHB#0001[0:1]: TRACE:0x0000001101000000 23ms GEN1:x16:detect
    PHB#0001[0:1]: TRACE:0x0000102101000000 23ms presence GEN1:x16:polling
    PHB#0001[0:1]: TRACE:0x0000909101000000 29ms presence GEN1:x16:disabled
    PHB#0001[0:1]: CRESET: wait_time = 100
    PHB#0001[0:1]: FRESET: Starts
    PHB#0001[0:1]: FRESET: Prepare for link down
    PHB#0001[0:1]: FRESET: Assert skipped
    PHB#0001[0:1]: FRESET: Deassert
    PHB#0001[0:1]: TRACE:0x0000001101000000 0ms GEN1:x16:detect
    PHB#0001[0:1]: TRACE:0x0000102101000000 0ms presence GEN1:x16:polling
    PHB#0001[0:1]: TRACE:0x0000001101000000 24ms GEN1:x16:detect
    PHB#0001[0:1]: TRACE:0x0000102101000000 36ms presence GEN1:x16:polling
    PHB#0001[0:1]: TRACE:0x0000183101000000 97ms training GEN1:x16:config
    PHB#0001[0:1]: TRACE:0x00001c5881000000 97ms training GEN1:x08:recovery
    PHB#0001[0:1]: TRACE:0x00001c5883000000 97ms training GEN3:x08:recovery
    PHB#0001[0:1]: TRACE:0x0000144883000000 99ms presence GEN3:x08:L0
    PHB#0001[0:1]: TRACE: Reached target state
    PHB#0001[0:1]: LINK: Start polling
    PHB#0001[0:1]: LINK: Electrical link detected
    PHB#0001[0:1]: LINK: Link is up
    PHB#0001[0:1]: LINK: Link is stable
    PHB#0001[0:1]: LINK: Card [9005:028c] Optimal Retry:disabled
    PHB#0001[0:1]: LINK: Speed Train:GEN3 PHB:GEN4 DEV:GEN3
    PHB#0001[0:1]: LINK: Width Train:x08 PHB:x08 DEV:x08
    PHB#0001[0:1]: LINK: RX Errors Now:0 Max:8 Lane:0x0000

Cc: Michael Neuling <mikey@neuling.org>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-03 | lpc-port80h: Don't write port 80h when running under Simics | Alistair Popple | 2 | -0/+6
Simics doesn't model LPC port 80h. Writing to it terminates the simulation due to an invalid LPC memory access. This patch adds a check to ensure port 80h isn't accessed if we are running under Simics. Signed-off-by: Alistair Popple <alistair@popple.id.au> [stewart: fixup run-port80h test] Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-03 | platforms/vesnin: PCI inventory via IPMI OEM | Artem Senichev | 1 | -33/+44
Replace raw protocol with OEM message supported by OpenBMC's IPMI plugins. BMC-side implementation (IPMI plug-in): https://github.com/YADRO-KNS/phosphor-pci-inventory Signed-off-by: Artem Senichev <a.senichev@yadro.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-03 | doc: fixup misc broken links | Stewart Smith | 4 | -4/+5
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-03 | doc: prettify OPAL_GET_XIVE and OPAL_SET_XIVE | Stewart Smith | 2 | -3/+13
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
2019-06-03 | doc: Document OPAL_CONFIG_CPU_IDLE_STATE | Stewart Smith | 1 | -0/+32
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>