From 9f25b34b7f2655c76300ee16c1c5d6840cad4985 Mon Sep 17 00:00:00 2001 From: Stewart Smith Date: Mon, 17 Oct 2016 15:04:57 +1100 Subject: skiboot 5.4.0-rc1 release notes Signed-off-by: Stewart Smith --- doc/conf.py | 2 +- doc/release-notes/skiboot-5.1.18.rst | 2 + doc/release-notes/skiboot-5.3.7.rst | 2 + doc/release-notes/skiboot-5.4.0-rc1.rst | 533 ++++++++++++++++++++++++++++++++ doc/stable-skiboot-rules.rst | 7 +- doc/stb.rst | 3 + 6 files changed, 546 insertions(+), 3 deletions(-) create mode 100644 doc/release-notes/skiboot-5.4.0-rc1.rst (limited to 'doc') diff --git a/doc/conf.py b/doc/conf.py index b5ccf04..eed3b07 100644 --- a/doc/conf.py +++ b/doc/conf.py @@ -53,7 +53,7 @@ copyright = u'2016, Stewart Smith, IBM, others' # The short X.Y version. version = '5.4' # The full version, including alpha/beta/rc tags. -release = '5.4.0-rc1-snapshot' +release = '5.4.0-rc1' # The language for content autogenerated by Sphinx. Refer to documentation # for a list of supported languages. diff --git a/doc/release-notes/skiboot-5.1.18.rst b/doc/release-notes/skiboot-5.1.18.rst index bdaf58a..03cd1db 100644 --- a/doc/release-notes/skiboot-5.1.18.rst +++ b/doc/release-notes/skiboot-5.1.18.rst @@ -1,3 +1,5 @@ +.. _skiboot-5.1.18: + skiboot-5.1.18 -------------- diff --git a/doc/release-notes/skiboot-5.3.7.rst b/doc/release-notes/skiboot-5.3.7.rst index 58ee129..5fba960 100644 --- a/doc/release-notes/skiboot-5.3.7.rst +++ b/doc/release-notes/skiboot-5.3.7.rst @@ -1,3 +1,5 @@ +.. _skiboot-5.3.7: + skiboot-5.3.7 ------------- diff --git a/doc/release-notes/skiboot-5.4.0-rc1.rst b/doc/release-notes/skiboot-5.4.0-rc1.rst new file mode 100644 index 0000000..5e782cc --- /dev/null +++ b/doc/release-notes/skiboot-5.4.0-rc1.rst @@ -0,0 +1,533 @@ +skiboot-5.4.0-rc1 +================= + +skiboot-5.4.0-rc1 was released on Monday October 17th 2016. It is the first +release candidate of skiboot 5.4, which will become the new stable release +of skiboot following the 5.3 release, first released August 2nd 2016. + +skiboot-5.4.0-rc1 contains all bug fixes as of :ref:`skiboot-5.3.7` +and :ref:`skiboot-5.1.18` (the currently maintained stable releases). + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +The current plan is to release a new release candidate every week until we +feel good about it. The aim is for skiboot-5.4.x to be in op-build v1.13, which +is due by November 23rd 2016. + +Over skiboot-5.3, we have the following changes: + +New Features +------------ +- Initial Trusted Boot support (see :ref:`stb-overview`). + There are several limitations with this initial release: + + - CAPP partition is not measured correctly + - Only Nuvoton TPM 2.0 is supported + - Requires hardware rework on late revision Habanero or Firestone boards + in order to install TPM. + + - Add i2c Nuvoton TPM 2.0 Driver + - romcode driver for POWER8 secure ROM + - See Device tree docs for tpm and ibm,secureboot nodes + - See main secure and trusted boot documentation. + + +- Fast reboot for P8 + + This makes reboot take an *awful* lot less time, somewhere between four + and ten times faster than a full IPL. It is currently experimental and not + enabled by default. + You can enable the experimental support via nvram option: :: + + # nvram -p ibm,skiboot --update-config experimental-fast-reset=feeling-lucky + + **WARNING**: This has *known* bugs. For example, if you have used a device + in CAPI mode, we will currently *NOT* reset it back to plain PCI. There + are also some known issues in most simulators. + +- Support ``ibm,skiboot`` NVRAM partition with skiboot configuration options. + + - These should generally only be used if you either completely know what + you are doing or need to work around a skiboot bug. They are **not** + intended for end users. + - Add support for supplying the kernel boot arguments from the ``bootargs`` + configuration string in the ``ibm,skiboot`` NVRAM partition. + - Enabling the experimental fast reset feature is done via this method. + +- Add support for nap mode on P8 while in skiboot + + - While nap has been exposed to the Operating System since day 1, we have + not utilized low power states when in skiboot itself, leading to higher + power consumption during boot. + We only enable the functionality after the 0x100 vector has been + patched, and we disable it before transferring control to Linux. + +- libflash: add 128MB MX66L1G45G part + +- Pointer validation of OPAL API call arguments. + + - If the kernel called an OPAL API with vmalloc'd address + or any other address range in real mode, we would hit + a problem with aliasing. Since the top 4 bits are ignored + in real mode, pointers from 0xc.. and 0xd.. (and other ranges) + could collide and lead to hard to solve bugs. This patch + adds the infrastructure for pointer validation and a simple + test case for testing the API + - The checks validate pointers sent in using ``opal_addr_valid()`` + +Documentation +------------- + +There have been a number of documentation fixes this release. Most prominent +is the switch to Sphinx (from the Python project) and ReStructured Text (RST) +as the documentation format. RST and Sphinx enable both production of pretty +documentation in HTML and PDF formats while remaining readable in their raw +form to those with no knowledge of RST. + +You can build a HTML site by doing the following: :: + + cd doc/ + make html + +As always, documentation patches are very, *very* welcome as we attempt to +document the OPAL API, the device tree bindings and important parts of +OPAL internals. + +We would like the Device Tree documentation to follow the style that can be +included in the Device Tree Specification. + + +General +------- +- Make console-log time more readable: seconds rather than timebase + Log format is now ``[SECONDS.(tb%512000000),LEVEL]`` + +- Flash (PNOR) code improvements + + - flash: Make size 64 bit safe + This makes the size of flash 64 bit safe so that we can have flash + devices greater than 4GB. This is especially useful for mambo disks + passed through to Linux. + - core/flash.c: load actual partition size + We are downloading 0x20000 bytes from PNOR for CAPP, but currently the + CAPP lid is only 40K. + - flash: Rework error paths and messages for multiple flash controllers + Now that we have mambo bogusdisk flash, we can have many flash chips. + This is resulting in some confusing output messages. + +- core/init: Fix "failure of getting node in the free list" warning on boot. +- slw: improve error message for SLW timer stuck + +- Centaur / XSCOM error handling + + - print message on disabling xscoms to centaur due to many errors + - Mark centaur offline after 10 consecutive access errors + +- XSCOM improvements + + - xscom: Map all HMER status codes to OPAL errors + - xscom: Initialize the data to a known value in ``xscom_read`` + In case of error, don't leave the data random. It helps debugging when + the user fails to check the error code. This happens due to a bug in the + PRD wrapper app. + - chip: Add a quirk for when core direct control XSCOMs are missing + +- p8-i2c: Don't crash if a centaur errored out + +- cpu: Make endian switch message more informative +- cpu: Display number of started CPUs during boot +- core/init: ensure that HRMOR is zero at boot +- asm: Fix backtrace for unexpected exception + +- cpu: Remove pollers calling heuristics from ``cpu_wait_job`` + This will be handled by ``time_wait_ms()``. Also remove a useless + ``smt_medium()``. + Note that this introduce a difference in behaviour: time_wait + will only call the pollers on the boot CPU while ``cpu_wait_job()`` + could call them on any. However, I can't think of a case where + this is a problem. + +- cpu: Remove global job queue + Instead, target a specific CPU for a global job at queuing time. + This will allow us to wake up the target using an interrupt when + implementing nap mode. + The algorithm used is to look for idle primary threads first, then + idle secondaries, and finally the less loaded thread. If nothing can + be found, we fallback to a synchronous call. +- lpc: Log LPC SYNC errors as unrecoverable ones for manufacturing +- lpc: Optimize SerIRQ dispatch based on which PSI IRQ fired +- interrupts: Add new source ``->attributes()`` callback + This allows a given source to provide per-interrupt attributes + such as whether it targets OPAL or Linux and it's estimated + frequency. + + The former allows to get rid of the double set of ops used to + decide which interrupts go where on some modules like the PHBs + and the latter will be eventually used to implement smart + caching of the source lookups. +- opal/hmi: Fix a TOD HMI failure during a race condition. +- platform: Add BT to Generic platform + + +NVRAM +----- +- Support ``ibm,skiboot`` partition for skiboot specific configuration options +- flash: Size NVRAM based on ECC for OpenPOWER platforms + If NVRAM has ECC (as per the ffs header) then the actual size of the + partition is less than reported by the ffs header in the PNOR then the + actual size of the partition is less than reported by the ffs header. + +NVLink/NPU +---------- + +- Fix reserved PE# +- NPU bdfn allocation bugfix +- Fix bad PE number check + NPUs have 4 PEs which are zero indexed, so {0, 1, 2, 3}. A bad PE number + check in npu_err_inject checks if the PE number is greater than 4 as a + fail case, so it would wrongly perform operations on a non-existant PE 4. +- Use PCI virtual device +- assert the NPU irq min is aligned. +- program NPU BUID reg properly +- npu: reword "error" to indicate it's actually a warning + Incorrect FWTS annotation. + Without this patch, you get spurious FirmWare Test Suite (FWTS) warnings + about NVLink not working on machines that aren't fully populated with + GPUs. +- external: NPU hardware procedure script + Performing NPU hardware procedures requires some config space magic. + Put all that magic into a script, so you can just specify the target + device and the procedure number. + +PCI +--- + +- Generic fixes + + - Claim surprise hotplug capability + - Reserve PCI buses for RC's slot + - Update PCI topology after power change + - Return slot cached power state + - Cache power state on slot without power control + - Avoid hot resets at boot time + - Fix initial PCIe slot power state + - Print CRS retry times + It's useful to know the CRS retry times before the PCI device is + detected successfully. In PCI hot add case, it usually indicates + time consumed for the adapter's firmware to be partially ready + (responsive PCI config space). + - core/pci: Fix the power-off timeout in ``pci_slot_power_off()`` + The timeout should be 1000ms instead of 1000 ticks while powering + off PCI slot in ``pci_slot_power_off()``. Otherwise, it's likely to + hit timeout powering off the PCI slot as below skiboot logs reveal: :: + + [5399576870,5] PHB#0005:02:11.0 Timeout powering off slot + +- PHB3 + + - Override root slot's ``prepare_link_change()`` with PHB's + - Disable surprise link down event on PCI slots + - Disable ECRC on Broadcom adapter behind PMC switch + +- astbmc platforms + + - Support dynamic PCI slot. We might insert a PCIe switch to PHB direct slot + and the downstream ports of the PCIe switch supports PCI hotplug. + + +CAPI +---- + +- hw/phb3: Update capi initialization sequence + The capi initialization sequence was revised in a circumvention + document when a 'link down' error was converted from fatal to Endpoint + Recoverable. Other, non-capi, register setup was corrected even before + the initial open-source release of skiboot, but a few capi-related + registers were not updated then, so this patch fixes it. + +IPMI +---- + +- core/ipmi: Set interrupt-parent property + This allows ipmi-opal to properly use the OPAL irqchip rather than + falling back to the event interface in Linux. + +Mambo Simulator +--------------- + +- Helpers for POWER9 Mambo. +- mambo: Advertise available RADIX page sizes +- mambo: Add section for kernel command line boot args + Users can set kernel command line boot arguments for Mambo in a tcl + script. +- mambo: add exception and qtrace helpers +- external/mambo: Update skiboot.tcl to add page-sizes nodes to device tree + +Simics Simulator +---------------- + +- chiptod: Enable ChipTOD in SIMICS + +Utilities +--------- + +- pflash + + - fix harmless buffer overflow: ``fl_total_size`` was ``uint32_t`` not ``uint64_t``. + - Don't try to write protect when writing to flash file + - Misc small improvements to code and code style + - makefile bug fixes + + +- external/boot_tests + + - remove lid from the BMC after flashing + - add the nobooting option -N + - add arbitrary lid option -F + +- ``getscom`` / ``getsram`` / ``putscom``: Parse chip-id as hex + We print the chip-id in hex (without a leading 0x), but we fail to + parse that same value correctly in ``getscom`` / ``getsram`` / ``putscom`` :: + + # getscom -l + ... + 80000000 | DD2.0 | Centaur memory buffer + # getscom -c 80000000 201140a + Error -19 reading XSCOM + + Fix this by assuming base 16 when parsing chip-id. + +PRD +--- + +- opal-prd: Fix error code from ``scom_read`` and ``scom_write`` +- opal-prd: Add get_interface_capabilities to host interfaces +- opal-prd: fix for 64-bit pnor sizes +- occ/prd/opal-prd: Queue OCC_RESET event message to host in OpenPOWER + During an OCC reset cycle the system is forced to Psafe pstate. + When OCC becomes active, the system has to be restored to its + last pstate as requested by host. So host needs to be notified + of OCC_RESET event or else system will continue to remian in + Psafe state until host requests a new pstate after the OCC + reset cycle. + +IBM FSP Based Platforms +----------------------- + +- fsp/console: Allocate irq for each hvc console + Allocate an irq number for each hvc console and set its interrupt-parent + property so that Linux can use the opal irqchip instead of the + OPAL_EVENT_CONSOLE_INPUT interface. +- platforms/firenze: Fix clock frequency dt property: :: + + [ 1.212366090,3] DT: Unexpected property length /xscom@3fc0000000000/i2cm@a0020/clock-frequency + +- HDAT: Fix typo in nest-frequency property + nest-frquency -> nest-frequency +- platforms/ibm-fsp: Use power_ctl bit when determining slot reset method + The power_ctl bit is used to represent if power management is available. + If power_ctl is set to true, then the I2C based external power management + functionality will be populated on the PCI slot. Otherwise we will try to + use the inband PERST as the fundamental reset, as before. +- FSP/ELOG: Fix elog timeout issue + Presently we set timeout value as soon as we add elog to queue. If + we have multiple elogs to write, it doesn't consider queue wait time. + Instead set timeout value when we are actually sending elog to FSP. +- FSP/ELOG: elog_enable flag should be false by default + This issue is one of the corner case, which is related to recent change + went upstream and only observed in the petitboot prompt, where we see + only one error log instead of getting all error log in + ``/sys/firmware/opal/elog``. + + + +POWER9 +------ + +- mambo: Make POWER9 look like DD2 +- flash: Move flash node under ``ibm,opal/flash/`` + This changes the boot ABI, so it's only active for P9 and later systems, + even though it's unrelated to hardware changes. There is an associated + Linux change to properly search for this node as well. +- core/cpu.c: Add OPAL call to setup Nest MMU +- psi: On p9, create an interrupt-map for routing PSI interrupts +- lpc: Add P9 LPC interrupts support +- chiptod: Basic P9 support +- psi: Add P9 support + +Testing and Debugging +--------------------- + +- test/qemu: bump qemu version used in CI, adds IPMI support +- platform/qemu: add BT and IPMI support + Enables testing BT and IPMI functionality in the Qemu simulator +- init: In debug builds, enable debug output to console +- mem_region: Be a bit smarter about poisoning + Don't poison chunks that are already free and poison regions on + first allocation. This speeds things up dramatically. +- libc: Use 8-bytes stores for non-0 memset too + Memory poisoning hammers this, so let's be a bit smart about it and + avoid falling back to byte stores when the data is not 0 +- fwts: add annotation for manufacturing mode +- check: Fix bugs in mem region tests +- Don't set -fstack-protector-all unconditionally + We set it already in DEBUG builds and we use -fstack-protector-strong + in release builds which provides most of the benefits and is more + efficient. +- Build host programs (and checks) with debug enabled + This enables memory poisoning in allocations and list checking + among other things. +- Add global DEBUG make flag + + +Contributors +------------ + +Extending the analysis done for the last few releases, we can see our trends +in code review across versions: + +======== ====== ======= ======= ====== ======== +Release csets Ack Reviews Tested Reported +======== ====== ======= ======= ====== ======== +5.0 329 15 20 1 0 +5.1 372 13 38 1 4 +5.2-rc1 334 20 34 6 11 +5.3-rc1 302 36 53 4 5 +5.4-rc1 278 8 19 0 4 +======== ====== ======= ======= ====== ======== + +This release has fewer changesets over previous 5.x first release candidates, +but that is not indicative of the size or complexity of these changes. + + +Processed 278 csets from 31 developers +A total of 17052 lines added, 4745 removed (delta 12307) + +Developers with the most changesets + +=========================== == ======= +=========================== == ======= +Stewart Smith 71 (25.5%) +Benjamin Herrenschmidt 50 (18.0%) +Claudio Carvalho 38 (13.7%) +Gavin Shan 20 (7.2%) +Oliver O'Halloran 18 (6.5%) +Mukesh Ojha 9 (3.2%) +Cyril Bur 7 (2.5%) +Russell Currey 7 (2.5%) +Vasant Hegde 7 (2.5%) +Pridhiviraj Paidipeddi 6 (2.2%) +Michael Neuling 6 (2.2%) +Alistair Popple 4 (1.4%) +Sam Mendoza-Jonas 3 (1.1%) +Vipin K Parashar 3 (1.1%) +Balbir Singh 3 (1.1%) +Mahesh Salgaonkar 3 (1.1%) +Frederic Barrat 3 (1.1%) +Chris Smart 2 (0.7%) +Jack Miller 2 (0.7%) +Patrick Williams 2 (0.7%) +Jeremy Kerr 2 (0.7%) +Suraj Jitindar Singh 2 (0.7%) +Milton Miller 2 (0.7%) +Shilpasri G Bhat 1 (0.4%) +Frederic Bonnard 1 (0.4%) +Joel Stanley 1 (0.4%) +Breno Leitao 1 (0.4%) +Anton Blanchard 1 (0.4%) +Nicholas Piggin 1 (0.4%) +Nageswara R Sastry 1 (0.4%) +Cédric Le Goater 1 (0.4%) +=========================== == ======= + +Developers with the most changed lines + +========================= ==== ======= +========================= ==== ======= +Claudio Carvalho 6817 (38.2%) +Stewart Smith 4677 (26.2%) +Benjamin Herrenschmidt 2586 (14.5%) +Gavin Shan 1005 (5.6%) +Cyril Bur 509 (2.9%) +Mukesh Ojha 361 (2.0%) +Oliver O'Halloran 343 (1.9%) +Russell Currey 343 (1.9%) +Balbir Singh 227 (1.3%) +Pridhiviraj Paidipeddi 194 (1.1%) +Michael Neuling 121 (0.7%) +Cédric Le Goater 115 (0.6%) +Vipin K Parashar 68 (0.4%) +Alistair Popple 66 (0.4%) +Vasant Hegde 65 (0.4%) +Shilpasri G Bhat 45 (0.3%) +Suraj Jitindar Singh 41 (0.2%) +Nicholas Piggin 34 (0.2%) +Sam Mendoza-Jonas 33 (0.2%) +Jack Miller 32 (0.2%) +Nageswara R Sastry 32 (0.2%) +Jeremy Kerr 23 (0.1%) +Mahesh Salgaonkar 21 (0.1%) +Chris Smart 20 (0.1%) +Milton Miller 19 (0.1%) +Patrick Williams 11 (0.1%) +Frederic Barrat 6 (0.0%) +Anton Blanchard 3 (0.0%) +Frederic Bonnard 2 (0.0%) +Joel Stanley 2 (0.0%) +Breno Leitao 2 (0.0%) +========================= ==== ======= + +Developers with the most lines removed + +========================= ==== ======= +========================= ==== ======= +Cyril Bur 299 (6.3%) +========================= ==== ======= + +Developers with the most signoffs (total 226) + +========================= ==== ======= +========================= ==== ======= +Stewart Smith 219 (96.9%) +Alistair Popple 4 (1.8%) +Cyril Bur 1 (0.4%) +Jeremy Kerr 1 (0.4%) +Benjamin Herrenschmidt 1 (0.4%) +========================= ==== ======= + +Developers with the most reviews (total 19) + +========================= ==== ======= +========================= ==== ======= +Mukesh Ojha 5 (26.3%) +Andrew Donnellan 4 (21.1%) +Vasant Hegde 3 (15.8%) +Russell Currey 3 (15.8%) +Balbir Singh 2 (10.5%) +Cyril Bur 1 (5.3%) +Vaidyanathan Srinivasan 1 (5.3%) +========================= ==== ======= + +Developers with the most test credits (total 0) + +Developers who gave the most tested-by credits (total 0) + +Developers with the most report credits (total 4) + +========================= ==== ======= +========================= ==== ======= +Benjamin Herrenschmidt 1 (25.0%) +Li Meng 1 (25.0%) +Pridhiviraj Paidipeddi 1 (25.0%) +Gavin Shan 1 (25.0%) +========================= ==== ======= + +Developers who gave the most report credits (total 4) + +========================= ==== ======= +========================= ==== ======= +Gavin Shan 1 (25.0%) +Vasant Hegde 1 (25.0%) +Russell Currey 1 (25.0%) +Stewart Smith 1 (25.0%) +========================= ==== ======= diff --git a/doc/stable-skiboot-rules.rst b/doc/stable-skiboot-rules.rst index bf4f5f0..5923d8e 100644 --- a/doc/stable-skiboot-rules.rst +++ b/doc/stable-skiboot-rules.rst @@ -1,5 +1,8 @@ -Stable Skiboot tree/releases -============================ +.. _stable-rules: + +====================================== +Skiboot stable tree rules and releases +====================================== If you're at all familiar with the Linux kernel stable trees, this should seem fairly familiar. diff --git a/doc/stb.rst b/doc/stb.rst index e8d2098..2b44370 100644 --- a/doc/stb.rst +++ b/doc/stb.rst @@ -1,3 +1,6 @@ +.. _stb-overview: + +=============================== Secure and Truted Boot Overview =============================== -- cgit v1.1