skiboot 5.9-rc1 release notes (v5.9-rc1)

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
author: Stewart Smith <stewart@linux.vnet.ibm.com> 2017-10-11 16:35:19 +1100
committer: Stewart Smith <stewart@linux.vnet.ibm.com> 2017-10-11 16:35:19 +1100
commit: e363cd66debb6a83e64bdd3bbdbf0eff501443a8 (patch)
tree: c1aed158fbe53fd57f8baf0231b9b0aa416f4406 /doc/release-notes
parent: 48b8b46444f3f6b600012fdaa2fcdf9a52b3a6af (diff)
download: skiboot-e363cd66debb6a83e64bdd3bbdbf0eff501443a8.zip
skiboot-e363cd66debb6a83e64bdd3bbdbf0eff501443a8.tar.gz
skiboot-e363cd66debb6a83e64bdd3bbdbf0eff501443a8.tar.bz2
2 files changed, 531 insertions, 0 deletions
diff --git a/doc/release-notes/skiboot-5.1.21.rst b/doc/release-notes/skiboot-5.1.21.rst
index 1376673..3aab050 100644
--- a/doc/release-notes/skiboot-5.1.21.rst
+++ b/doc/release-notes/skiboot-5.1.21.rst
@@ -1,3 +1,5 @@
+.. _skiboot-5.1.21:
+
 skiboot-5.1.21
 --------------
 
diff --git a/doc/release-notes/skiboot-5.9-rc1.rst b/doc/release-notes/skiboot-5.9-rc1.rst
new file mode 100644
index 0000000..7822015
--- /dev/null
+++ b/doc/release-notes/skiboot-5.9-rc1.rst
@@ -0,0 +1,529 @@
+.. _skiboot-5.9-rc1:
+
+skiboot-5.9-rc1
+===============
+
+skiboot v5.9-rc1 was released on Wednesday October 11th 2017. It is the first
+release candidate of skiboot 5.9, which will become the new stable release
+of skiboot following the 5.8 release, first released August 31st 2017.
+
+skiboot v5.9-rc1 contains all bug fixes as of :ref:`skiboot-5.4.7`
+and :ref:`skiboot-5.1.21` (the currently maintained stable releases). We
+do not currently expect to do any 5.8.x stable releases.
+
+For how the skiboot stable releases work, see :ref:`stable-rules`.
+
+The current plan is to cut the final 5.9 by October 17th, with skiboot 5.9
+being for all POWER8 and POWER9 platforms in op-build v1.20 (due October 18th).
+This release will be targeted to early POWER9 systems.
+
+Over skiboot-5.8, we have the following changes:
+
+New Features
+------------
+
+POWER8
+^^^^^^
+- fast-reset by default (if possible)
+
+  Currently, this is limited to POWER8 systems.
+
+  A normal reboot will, rather than doing a full IPL, go through a
+  fast reboot procedure. This reduces the "reboot to petitboot" time
+  from minutes to a handful of seconds.
+
+POWER9
+^^^^^^
+- POWER9 power management during boot
+
+  Less power should be consumed during boot.
+- OPAL_SIGNAL_SYSTEM_RESET for POWER9
+
+  This implements OPAL_SIGNAL_SYSTEM_RESET, using scom registers to
+  quiesce the target thread and raise a system reset exception on it.
+  It has been tested on DD2 with stop0 ESL=0 and ESL=1 shallow power
+  saving modes.
+
+  DD1 is not implemented because it is sufficiently different as to
+  make support difficult.
+- Enable deep idle states for POWER9
+
+  - SLW: Add support for p9_stop_api
+
+    The p9_stop_api calls are used to set SPR state on a core wakeup from a
+    deeper low power state. p9_stop_api uses low level platform firmware and
+    self-restore microcode to restore the SPRs to the requested values.
+
+    Code is taken from:
+    https://github.com/open-power/hostboot/tree/master/src/import/chips/p9/procedures/utils/stopreg
+  - SLW: Removing timebase related flags for stop4
+
+    When a core enters stop4, it does not lose decrementer and time base.
+    Hence removing flags OPAL_PM_DEC_STOP and OPAL_PM_TIMEBASE_STOP.
+  - SLW: Allow deep states if homer address is known
+
+    Use a common variable has_wakeup_engine instead of has_slw to tell if
+    the:
+    - SLW image is populated in case of power8
+    - CME image is populated in case of power9
+
+    Currently we expect CME to be loaded if homer address is known (except
+    for simulators).
+  - SLW: Configure self-restore for HRMOR
+
+    Make a stop API call using libpore to restore the HRMOR register. HRMOR
+    needs to be cleared so that when threads exit stop, they arrive at the
+    Linux system_reset vector (0x100).
+  - SLW: Add opal_slw_set_reg support for power9
+
+    This OPAL call is made from Linux to OPAL to configure values in
+    various SPRs after wakeup from a deep idle state.
+- PHB4: CAPP recovery
+
+  CAPP recovery is initiated when a CAPP Machine Check is detected.
+  The CAPP recovery procedure is initiated via a Hypervisor Maintenance
+  interrupt (HMI).
+
+  CAPP Machine Check may arise from either an error that results in a PHB
+  freeze or from an internal CAPP error with CAPP checkstop FIR action.
+  An error that causes a PHB freeze will result in the link down signal
+  being asserted. The system continues running and the CAPP and PSL will
+  be re-initialized.
+
+  This implements CAPP recovery for POWER9 systems.
+- Add ``wafer-location`` property for POWER9
+
+  Extract wafer-location from ECID and add property under xscom node.
+  - bits  64:71 are the chip x location (7:0)
+  - bits  72:79 are the chip y location (7:0)
+
+  Sample output: ::
+
+    [root@wsp xscom@623fc00000000]# lsprop ecid
+    ecid             019a00d4 03100718 852c0000 00fd7911
+    [root@wsp xscom@623fc00000000]# lsprop wafer-location
+    wafer-location   00000085 0000002c
+- Add ``wafer-id`` property for POWER9
+
+  Wafer id is derived from ECID data.
+  - bits   4:63 are the wafer id (ten 6-bit fields each containing a code)
+
+  Sample output: ::
+
+    [root@wsp xscom@623fc00000000]# lsprop ecid
+    ecid             019a00d4 03100718 852c0000 00fd7911
+    [root@wsp xscom@623fc00000000]# lsprop wafer-id
+    wafer-id         "6Q0DG340SO"
+- Add ``ecid`` property under ``xscom`` node for POWER9.
+  Sample output: ::
+
+    [root@wsp xscom@623fc00000000]# lsprop ecid
+    ecid             019a00d4 03100718 852c0000 00fd7911
+- Add ibm,firmware-versions device tree node
+
+  On P8, hostboot provides a mini device tree. It contains the ``/ibm,firmware-versions``
+  node, which has version details for various firmware components.
+
+  On P9, OPAL builds the device tree. This patch adds support for parsing the VERSION
+  section of PNOR and creating the ``/ibm,firmware-versions`` device tree node.
+
+  Sample output: ::
+
+            /sys/firmware/devicetree/base/ibm,firmware-versions # lsprop .
+            occ              "6a00709"
+            skiboot          "v5.7-rc1-p344fb62"
+            buildroot        "2017.02.2-7-g23118ce"
+            capp-ucode       "9c73e9f"
+            petitboot        "v1.4.3-p98b6d83"
+            sbe              "02021c6"
+            open-power       "witherspoon-v1.17-128-gf1b53c7-dirty"
+            ....
+            ....
+
+POWER9
+------
+
+- Disable Transactional Memory on Power9 DD 2.1
+
+  Update pa_features_p9[] to disable TM (Transactional Memory). On DD 2.1
+  TM is not usable by Linux without other workarounds, so skiboot must
+  disable it.
+- xscom: Do not print error message for 'chiplet offline' return values
+
+  xscom_read/write operations return CHIPLET_OFFLINE when a chiplet is offline.
+  Some multicast xscom_read/write requests from HBRT result in xscom operations
+  on offline chiplet(s), which print the warnings below in the OPAL console: ::
+
+    [ 135.036327572,3] XSCOM: Read failed, ret = -14
+    [ 135.092689829,3] XSCOM: Read failed, ret = -14
+
+  Some SCOM users can deal correctly with this error code (notably opal-prd),
+  so the error message is (in practice) erroneous.
+- IMC: Fix the core_imc_event_mask
+
+  CORE_IMC_EVENT_MASK is a scom that contains bits to control event sampling for
+  different machine states for core IMC. The current event-mask setting samples
+  events only on host kernel (hypervisor) and host userspace.
+
+  This patch enables the sampling of events in other machine states (like guest
+  kernel and guest userspace).
+- IMC: Update the nest_pmus array with occ/gpe microcode uav updates
+
+  OCC/GPE nest microcode maintains the list of individual nest units
+  supported. Sync the recent updates to the UAV with nest_pmus array.
+
+  For reference, the occ/gpe microcode link for the UAV:
+  https://github.com/open-power/occ/blob/master/src/occ_gpe1/gpe1_24x7.h
+- Parse IOSLOT information from HDAT
+
+  Add structure definitions that describe the physical PCIe topology of
+  a system and parse them into the device-tree based PCIe slot
+  description.
+- idle: user context state loss flags fix for stop states
+
+  The "lite" stop variants with PSSCR[ESL]=PSSCR[EC]=0 do not lose user
+  context, while the non-lite variants do (ESL: enable state loss).
+
+  Some of the POWER9 idle states had these wrong.
+
+CAPI
+^^^^
+- POWER9 DD2 update
+
+  The CAPI initialization sequence has been updated in DD2.
+  This patch adapts to the changes, retaining compatibility with DD1.
+  The patch includes some changes to DD1 fix-ups as well.
+- Load CAPP microcode for POWER9 DD2.0 and DD2.1
+- capi: Mask PSL Credit timeout error for POWER9
+
+  Mask the PSL credit timeout error in CAPP FIR Mask register
+  bit(46). As per the h/w team this error is now deprecated and shouldn't
+  cause any FIR action for P9.
+
+NVLINK2
+^^^^^^^
+
+A notable change is that we now generate the device tree description of
+NVLINK based on the HDAT we get from hostboot. Since Hostboot will generate
+HDAT based on VPD, you now *MUST* have correct VPD programmed or we will
+*default* to a Sequoia layout, which will lead to random problems if you
+are not booting a Sequoia Witherspoon planar. In the case of booting with
+old VPD and/or Hostboot, we print a **giant scary warning** in order to scare you.
+
+- npu2: Read slot label from the HDAT link node
+
+  Binding GPUs to emulated NPU PCI devices is done using the slot labels.
+  Since the NPU devices do not have a matching slot node, we need to
+  copy the label in here.
+
+- npu2: Copy link speed from the npu HDAT node
+
+  This needs to be in the PCI device node so the speed of the NVLink
+  can be passed to the GPU driver.
+- npu2: hw-procedures: Add settings to PHY_RESET
+
+  Set a few new values in the PHY_RESET procedure, as specified by our
+  updated programming guide documentation.
+- Parse NVLink information from HDAT
+
+  Add the per-chip structures that describe how the A-Bus/NVLink/OpenCAPI
+  phy is configured. This generates the npu@xyz nodes for each chip on
+  systems that support it.
+- npu2: Add vendor cap for IRQ testing
+
+  Provide a way to test recoverable data link interrupts via a new
+  vendor capability byte.
+- npu2: Enable recoverable data link (no-stall) interrupts
+
+  Allow the NPU2 to trigger "recoverable data link" interrupts.
+
+- npu2: Implement basic FLR (Function Level Reset)
+- npu2: hw-procedures: Update PHY DC calibration procedure
+- npu2: hw-procedures: Change rx_pr_phase_step value
+
+XIVE
+^^^^
+- xive: Fix opal_xive_dump_tm() to access W2 properly.
+  The HW only supported limited access sizes.
+- xive: Make opal_xive_allocate_irq() properly try all chips
+
+  When requested via OPAL_XIVE_ANY_CHIP, we need to try all
+  chips. We first try the current one (on which the caller
+  sits) and if that fails, we iterate all chips until the
+  allocation succeeds.
+- xive: Fix initialization & cleanup of HW thread contexts
+
+  Instead of trying to "pull" everything and clear VT (which didn't
+  work and caused some FIRs to be set), instead just clear and then
+  set the PTER thread enable bit. This has the side effect of
+  completely resetting the corresponding thread context.
+
+  This fixes the spurious XIVE FIRs reported by PRD and fircheck.
+- xive: Add debug option for detecting misrouted IPI in emulation
+
+  This is high overhead so we don't enable it by default even
+  in debug builds, it's also a bit messy, but it allowed me to
+  detect and debug a locking issue earlier so it can be useful.
+- xive: Increase the interrupt "gap" on debug builds
+
+  We normally allocate IPIs from 0x10. Make that 0x1000 on debug
+  builds to limit the chances of overlapping with Linux interrupt
+  numbers, which makes debugging code that confuses them easier.
+
+  Also add a warning in emulation if we get an interrupt in the
+  queue whose number is below the gap.
+- xive: Fix locking around cache scrub & watch
+
+  Thankfully the missing locking only affects debug code and
+  init code that doesn't run concurrently. Also adds a DEBUG
+  option that checks the lock is properly held.
+- xive: Workaround HW issue with scrub facility
+
+  Without this, we sometimes don't observe from a CPU the
+  values written to the ENDs or NVTs via the cache watch.
+- xive: Add exerciser for cache watch/scrub facility in DEBUG builds
+- xive: Make assertion in xive_eq_for_target() more informative
+- xive: Add debug code to check initial cache updates
+- xive: Ensure pressure relief interrupts are disabled
+
+  We don't use them and we hijack the VP field with their
+  configuration to store the EQ reference, so make sure the
+  kernel or guest can't turn them back on by doing MMIO
+  writes to ACK#
+- xive: Don't try setting the reserved ACK# field in VPs
+
+  That doesn't work, the HW doesn't implement it in the cache
+  watch facility anyway.
+- xive: Remove useless memory barriers in VP/EQ inits
+
+  We no longer update "live" memory structures, we use a temporary
+  copy on the stack and update the actual memory structure using
+  the cache watch, so those barriers are pointless.
+
+PHB4
+^^^^
+- phb4: Mask RXE_ARB: DEC Stage Valid Error
+
+  Change the inits to mask out the RXE ARB: DEC Stage Valid Error (bit
+  37). This has been a fatal error but should be informational only.
+
+  This update will be in the next version of the phb4 workbook.
+- phb4: Add additional adapter to retrain whitelist
+
+  The single port version of the ConnectX-5 has a different device ID 0x1017.
+  Updated descriptions to match pciutils database.
+- PHB4: Default to PCIe GEN3 on POWER9 DD2.00
+
+  You can use the NVRAM override for DD2.00 screened parts.
+- phb4: Retrain link if degraded
+
+  On P9 Scale Out (Nimbus) DD2.0 and Scale Up (Cumulus) DD1.0 (and
+  below) the PCIe PHY can lock up, causing training issues. This can cause
+  a degradation in speed or width in ~5% of training cases (depending on
+  the card). This is fixed in later chip revisions. This issue can also
+  cause PCIe links to not train at all, but this case is already
+  handled.
+
+  This patch checks if the PCIe link has trained optimally and if not,
+  does a full PHB reset (to fix the PHY lockup) and retrain.
+
+  One complication is some devices are known to train degraded unless
+  device specific configuration is performed. Because of this, we only
+  retrain when the device is in a whitelist. All devices in the current
+  whitelist have been tested on a P9DSU/Boston, ZZ and Witherspoon.
+
+  We always gather information on the link and print it in the logs even
+  if the card is not in the whitelist.
+
+  For testing purposes, there's an NVRAM option to retry all PCIe cards and all
+  P9 chips when a degraded link is detected. The new option is
+  ``pci-retry-all=true``, which can be set using:
+  ``nvram -p ibm,skiboot --update-config pci-retry-all=true``.
+  This option may increase the boot time if used on a badly behaving
+  card.
+
+
+IBM FSP platforms
+-----------------
+
+- FSP/NVRAM: Handle "get vNVRAM statistics" command
+
+  FSP sends MBOX command (cmd : 0xEB, subcmd : 0x05, mod : 0x00) to get vNVRAM
+  statistics. OPAL doesn't maintain any such statistics. Hence return
+  FSP_STATUS_INVALID_SUBCMD.
+
+  Fixes these messages appearing in the OPAL log: ::
+
+      [16944.384670488,3] FSP: Unhandled message eb0500
+      [16944.474110465,3] FSP: Unhandled message eb0500
+      [16945.111280784,3] FSP: Unhandled message eb0500
+      [16945.293393485,3] FSP: Unhandled message eb0500
+- fsp: Move common prints to trace
+
+  These two prints just end up filling the skiboot logs on any machine
+  that's been booted for more than a few hours.
+
+  They have never been useful, so make them trace level. They were: ::
+    SURV: Received heartbeat acknowledge from FSP
+    SURV: Sending the heartbeat command to FSP
+
+BMC based systems
+-----------------
+- hw/lpc-uart: read from RBR to clear character timeout interrupts
+
+  When using the aspeed SUART, we see a condition where the UART sends
+  continuous character timeout interrupts. This change adds a (heavily
+  commented) dummy read from the RBR to clear the interrupt condition on
+  init.
+
+  This was observed on p9dsu systems, but likely applies to other systems
+  using the SUART.
+- astbmc: Add methods for handling Device Tree based slots
+  e.g. ones from HDAT on POWER9.
+
+General
+-------
+- ipmi: Convert common debug prints to trace
+
+  OPAL logs messages for every IPMI request from the host. Sometimes the OPAL
+  console is filled with only these messages. This path is pretty stable now and
+  we have enough logs to cover the bad path. Hence let's convert these debug
+  messages to trace/info messages. Examples are: ::
+
+    [ 1356.423958816,7] opal_ipmi_recv(cmd: 0xf0 netfn: 0x3b resp_size: 0x02)
+    [ 1356.430774496,7] opal_ipmi_send(cmd: 0xf0 netfn: 0x3a len: 0x3b)
+    [ 1356.430797392,7] BT: seq 0x20 netfn 0x3a cmd 0xf0: Message sent to host
+    [ 1356.431668496,7] BT: seq 0x20 netfn 0x3a cmd 0xf0: IPMI MSG done
+- libflash/file: Handle short read()s and write()s correctly
+
+  Currently we don't move the buffer along for a short read() or write(),
+  nor do we request only the remaining amount.
+
+- hw/p8-i2c: Rework timeout handling
+
+  Currently we treat a timeout as a hard failure and will automatically
+  fail any transactions that hit their timeout. This results in
+  unnecessarily failing I2C requests if interrupts are dropped, etc.
+  Although these are bad things that we should log, we can handle them
+  better by checking the actual hardware status and completing the
+  transaction if there are no real errors. This patch reworks the timeout
+  handling to check the status and continue the transaction if it can,
+  while logging an error if it detects a timeout due to a
+  dropped interrupt.
+- core/flash: Only expect ELF header for BOOTKERNEL partition flash resource
+
+  When loading a flash resource which isn't signed (secure and trusted
+  boot) and which doesn't have a subpartition, we assume it's the
+  BOOTKERNEL since previously this was the only such resource. Thus we
+  also assumed it had an ELF header which we parsed to get the size of the
+  partition rather than trusting the actual_size field in the FFS header.
+  A previous commit (9727fe3 DT: Add ibm,firmware-versions node) added the
+  version resource which isn't signed and also doesn't have a subpartition,
+  thus we expect it to have an ELF header. It doesn't so we print the
+  error message "FLASH: Invalid ELF header part VERSION".
+
+  It is a fluke that this works currently since we load the secure boot
+  header unconditionally and this happens to be the same size as the
+  version partition. We also don't update the return code on error so
+  we happen to return OPAL_SUCCESS.
+
+  To make this explicitly correct, only check for an ELF header if we are
+  loading the BOOTKERNEL resource, otherwise use the partition size from
+  the FFS header. Also set the return code on error so we don't
+  erroneously return OPAL_SUCCESS. Add a check that the resource will fit
+  in the supplied buffer to prevent buffer overrun.
+- flash: Support adding the no-erase property to flash
+
+  The mbox protocol explicitly states that an erase is not required
+  before a write. This means that issuing an erase from userspace,
+  through the mtd device, and back returns a successful operation
+  that does nothing. Unfortunately, this makes userspace tools unhappy.
+  Linux MTD devices support the MTD_NO_ERASE flag which conveys that
+  writes do not require erases on the underlying flash devices. We
+  should set this property on all of our
+  devices which do not require erases to be performed.
+
+  NOTE: This still requires a Linux kernel component to set the
+  MTD_NO_ERASE flag from the device tree property.
+
+Utilities
+---------
+- external/gard: Clear entire guard partition instead of entry by entry
+
+  When using the current implementation of the gard tool to ecc clear the
+  entire GUARD partition, it is done one gard record at a time. While this
+  may be ok when accessing the actual flash, this is very slow when done
+  from the host over the mbox protocol (on the order of 4 minutes) because
+  the BMC side is required to do many reads, erases and writes under the hood.
+
+  Fix this by rewriting the gard tool reset_partition() function. Now we
+  allocate all the erased guard entries and (if required) apply ecc to the
+  entire buffer. Then we can do one big erase and write of the entire
+  partition. This reduces the time to clear the guard partition to on the
+  order of 4 seconds.
+- opal-prd: Fix opal-prd command line options
+
+  HBRT OCC reset interface depends on service processor type.
+
+  - FSP: reset_pm_complex()
+  - BMC: process_occ_reset()
+
+  We have both ``occ`` and ``pm-complex`` command line interfaces.
+  This patch adds support to display an appropriate message depending
+  on system type.
+
+  === ==================== ============================
+  SP  Command              Action
+  === ==================== ============================
+  FSP opal-prd occ         display error message
+  FSP opal-prd pm-complex  Call pm_complex_reset()
+  BMC opal-prd occ         Call process_occ_reset()
+  BMC opal-prd pm-complex  display error message
+  === ==================== ============================
+
+- opal-prd: detect service processor type and
+  then make appropriate occ reset call.
+- pflash: Fix erase command for unaligned start address
+
+  The erase_range() function handles erasing the flash for a given start
+  address and length, and can handle an unaligned start address and
+  length. However in the unaligned start address case we are incorrectly
+  calculating the remaining size which can lead to incomplete erases.
+
+  If we're going to update the remaining size based on what the start
+  address was then we probably want to do that before we override the
+  original start address. So rearrange the code so that this is indeed the
+  case.
+- external/gard: Print an error if run on an FSP system
+
+Simulators
+----------
+
+- mambo: Add mambo socket program
+
+  This adds a program that can be run inside a mambo simulator in Linux
+  userspace which enables TCP sockets to be proxied in and out of the
+  simulator to the host.
+
+  Unlike mambo bogusnet, it requires no Linux or skiboot specific
+  drivers/infrastructure to run.
+
+  Run inside the simulator:
+
+  - to forward host ssh connections to sim ssh server:
+    ``./mambo-socket-proxy -h 10022 -s 22``, then connect to port 10022
+    on your host with ``ssh -p 10022 localhost``
+  - to allow http proxy access from inside the sim to local http proxy:
+    ``./mambo-socket-proxy -b proxy.mynetwork -h 3128 -s 3128``
+
+  Multiple connections are supported.
+- idle: disable stop*_lite POWER9 idle states for Mambo platform
+
+  Mambo prior to Mambo.7.8.21 had a bug where the stop idle instruction
+  with PSSCR[ESL]=PSSCR[EC]=0 would resume with MSR set as though it had
+  taken a system reset interrupt.
+
+  Linux currently executes this instruction with MSR already set that
+  way, so the problem went unnoticed. A proposed patch to Linux changes
+  that, and causes the idle code to crash. Work around this by disabling
+  lite stop states for the mambo platform for now.
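
The ``wafer-location`` and ``wafer-id`` sample outputs in the release notes above can be reproduced from the sample ``ecid`` value. The following is a minimal sketch in Python, not skiboot's actual code. It assumes IBM bit numbering (bit 0 is the most significant bit of the 128-bit ECID), and it assumes that each 6-bit wafer id code indexes into the alphabet 0-9 followed by A-Z. That code table is an inference that happens to match the sample output; the notes do not state it. All function names are hypothetical.

```python
# Illustration only: helper names are hypothetical, not skiboot functions.

ECID_BITS = 128
# Assumed code table (0-9 then A-Z), inferred from the sample output only.
# Codes of 36 and above are not covered by this assumption.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def ecid_from_words(words):
    """Join four 32-bit cells (most significant first) into one integer."""
    value = 0
    for word in words:
        value = (value << 32) | (word & 0xFFFFFFFF)
    return value


def field(ecid, first, last):
    """Extract bits first..last inclusive, where bit 0 is the MSB."""
    width = last - first + 1
    shift = ECID_BITS - 1 - last
    return (ecid >> shift) & ((1 << width) - 1)


def wafer_location(ecid):
    """Return (x, y) taken from bits 64:71 and bits 72:79."""
    return field(ecid, 64, 71), field(ecid, 72, 79)


def wafer_id(ecid):
    """Decode bits 4:63 as ten 6-bit codes using the assumed alphabet."""
    codes = [field(ecid, 4 + 6 * i, 9 + 6 * i) for i in range(10)]
    return "".join(ALPHABET[code] for code in codes)


# The ecid cells from the sample lsprop output in the notes.
ecid = ecid_from_words([0x019A00D4, 0x03100718, 0x852C0000, 0x00FD7911])
x, y = wafer_location(ecid)
print(f"wafer-location   {x:08x} {y:08x}")
print(f"wafer-id         {wafer_id(ecid)}")
```

With the sample ECID this yields x = 0x85, y = 0x2c and the string 6Q0DG340SO, matching the ``lsprop`` output quoted in the notes.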
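
The libflash/file entry in the notes above describes handling short read()s and write()s by moving the buffer along and requesting only the remaining amount. The general pattern can be sketched as follows, using Python's ``os.read`` and ``os.write`` on a pipe. This is an illustration of the technique under those assumptions, not the libflash code; ``read_full`` and ``write_full`` are hypothetical names.

```python
import os


def read_full(fd, length):
    """Read up to length bytes, looping over short reads; stops at EOF."""
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = os.read(fd, remaining)  # may return fewer bytes than asked
        if not chunk:                   # empty result means end of file
            break
        chunks.append(chunk)
        remaining -= len(chunk)         # request only what is still missing
    return b"".join(chunks)


def write_full(fd, data):
    """Write all of data, advancing past whatever each write accepted."""
    view = memoryview(data)
    written = 0
    while written < len(view):
        # os.write returns the number of bytes actually written.
        written += os.write(fd, view[written:])
    return written


# Small demonstration over a pipe.
r, w = os.pipe()
count = write_full(w, b"skiboot")
os.close(w)
data = read_full(r, 16)  # asks for more than is available, ends at EOF
os.close(r)
print(count, data)
```

Both loops track progress explicitly, so a call that transfers only part of the request is followed by another call for the rest instead of being treated as complete.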