Age | Commit message (Collapse) | Author | Files | Lines |
|
[Upstream Commit 003ccd5775161d352c53cac3d00c6283eb036ffc]
The P9 NPU workbook says that only 4K/64K/16M/256M page size are supported
and in fact npu2_map_pe_dma_window() supports just these but in absence of
the "ibm,supported-tce-sizes" property Linux assumes the default P9 PHB4
page sizes - 4K/64K/2M/1G - so when Linux tries 2M/1G TCEs, we get lots of
"Unexpected TCE size" from npu2_tce_kill().
This advertises TCE page sizes so Linux could handle it correctly, i.e.
fall back to 4K/64K TCEs.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
|
|
[ Upstream commit 999246716d2da347aad46a28ed9899b832bffe6c ]
During CAPP recovery do_capp_recovery_scoms() will reset the CAPP Fir
register just after CAPP recovery is completed. This has an
unintentional side effect of preventing PRD from analyzing and
reporting this error. If PRD tries to read the CAPP FIR after opal has
already reset it, then it logs a critical error complaining "No active
error bits found".
To prevent this from happening we update do_capp_recovery_scoms() to
only reset fir bits that cause CAPP machine check (local xstop). This
is done by reading the CAPP Fir Action0/1 & Mask registers and
generating a mask which is then written on CAPP_FIR_CLEAR register.
Cc: stable
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit 9597a12ef4b3644e4b8644f659bec04ca139b7f9 ]
Some PHB4 PHYs can get stuck in a bad state where they are constantly
retraining the link. This happens transparently to skiboot and Linux
but will causes PCIe to be slow. Resetting the PHB4 clears the
problem.
We can detect this case by looking at the RX errors count where we
check for link stability. This patch does this by modifying the link
optimal code to check for RX errors. If errors are occurring we
retrain the link irrespective of the chip rev or card.
Normally when this problem occurs, the RX error count is maxed out at
255. When there is no problem, the count is 0. We chose 8 as the max
rx errors value to give us some margin for a few errors. There is also
a knob that can be used to set the error threshold for when we should
retrain the link. ie
nvram -p ibm,skiboot --update-config phb-rx-err-max=8
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit 7194e92cc700bfcc6f12f5fc12da06ef936bd2b8 ]
Artem Senichev reported[1] his P8 platform was failing to boot from
a43e9a66aae9 ("astbmc: Fail SFC init if SIO is unavailable") with the
following error:
[ 110.097168975,3] PLAT: Failed to open PNOR flash controller
I reproduced this behaviour on a Palmetto; we need to ensure the state
of the no-response error bit is clear before proceding with the presence
test.
The fix appears to resolve the failure to open the PNOR flash controller
on Palmetto and doesn't change the expected behaviour on Witherspoon.
[1] https://github.com/open-power/skiboot/issues/197
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Tested-by: Artem Senichev <a.senichev@yadro.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit a43e9a66aae9812f8790c4a9290989bb0774d2a6 ]
If SuperIO is unavailable then the driver cannot perform accesses on
which it currently depends. Test for SuperIO availability during
initialsation and bail out immediately if it is absent.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit 9a830ee06c66058b1421c017b25a65a22921e9f6 ]
Segregate the BMC platform configuration into hardware and software
components. This allows population of platform default values for
hardware configuration that may no-longer be accessible by the host.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
[stewart: fixup pci-quirk unit test]
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit dd554bacd13c6dea481ea4e1ec9f3c32087295d9 ]
Avoid the probabilistic approach and use a deterministic one instead.
The probe calls use a slow, synchronous method to capture the the state
of the target device, so it is used sparingly (only on first access).
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit 5684204c2d0b470de72eff563c9e0172bbdbcb18 ]
Introduce generic read and write probe functions that allow detection of
valid addresses by way of synchronous testing for the SYNC no-response
state. If the no-response state is detected the probe functions will
return an error to the caller, who can do with it what they wish.
In the process, rip out the naive mechanism for muting the equivalent
asynchronous error logging (regretfully introduced recently by yours
truly).
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit 1a1ff0ab2c78f9257bb77301191df38242d11f0d ]
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit b5edb1692b7f6af1a60758f4f63f52f795b5dba0 ]
If the IPMI command is not available, fall back to the mailbox
interface.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
[stewart: fix up mbox test]
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit d4048420962097ce5b46167b2715b458142d394f ]
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit 1d8793c64b596cfdcc3cf6035a3b4cbe3c341ae9 ]
Add the ability to silence particular errors from the LPC bus when they
can be expected, particularly:
LPC[000]: Got SYNC no-response error. Error address reg: 0xd001002f
This is necessary on platform exit on some astbmc machines to avoid
unnecessary noise in the msglog.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit ebc8524a3a457f73083d984296bfd797940a711c ]
It's possible for the platform to configure the BMC with SuperIO
access disabled. Rework the interfaces to report failures if SuperIO is
not enabled, and clean up once we're finished.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit 8972e44f97883e5aabf4b9c6737dcf3b22fd24b8 ]
Introduce some consistency for readability and make the names better
reflect the nature of the tests.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit f651e8eb56e2c17aeac58fd50c20f874d874169c ]
Fixes: 5b1bc2ffe791ae94361d86b2ae063ee543bf2df5
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit 5b1bc2ffe791ae94361d86b2ae063ee543bf2df5 ]
The only user was hw/ast-bmc/ast-sf-ctrl.c, and for accessing flash the
copy routines require knowledge of the PNOR LPC offset. For systems
using MBOX the ast-sf-ctrl implementation is unused, so move the offset
initialisation out of the common code-path and the copy routines to the
place where they are necessary.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit b09e48ffcdbffca97f7f6ebc2135a9e82dc5d9e9 ]
P8 boxes can opt in for mbox pnor support if they set the scratch
register bit to indicate it is supported.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit c032c5991207bf143dec38d6d0527fb1a1944fac ]
ASPEED BMCs use SIO register 0x29 to configure host firmwrae settings.
This documents those setings as currently used by Hostboot in [1].
Despite the naming, these settings are relevant for ast2500 systems as
well.
[1] src/usr/initservice/bootconfig/bootconfig_ast2400.H
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit 2d7419274dfad55f1909fd9ad948764d23aef978 ]
Update phb4_init_capp_regs() to allocates STQ Engines to CAPP/PEC2
based on link width instead of always assuming it to x8.
Also re-factor the function slightly to evaluate the link-width only
once and cache it so that it can also be used to allocate DMA read
engines.
Cc: stable
Fixes: 47c09cdfe7a3("phb4/capp: Calculate STQ/DMA read engines based on link-width for PEC")
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit d5ebd5519dcd1727bd2355d9e5aa4bfbcd7f3792 ]
Currently on a FSP based P9 system load_capp_code() expects CAPP ucode
lid header to have eye-catcher magic of 'CAPPPSLL'. However skiboot
currently supports CAPP ucode only lids that have a eye-catcher magic
of 'CAPPLIDH'. This prevents skiboot from loading the ucode with this
error message:
CAPP: ucode header invalid
We fix this issue by updating load_capp_ucode() to use the eye-catcher
value of 'CAPPLIDH' instead of 'CAPPPSLL'.
Cc: stable
Fixes: e50764d4f2b1("capi: Load capp microcode")
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit 8f650b6d55b4060cca7b8a2fa2850bc73890b179 ]
Suggested-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Yeah-boiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiied-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit 8a2b6d51b77172d5ff81aa412ff7aa97f57d4f90 ]
kill_type is enum of OPAL_PCI_TCE_KILL_PAGES, OPAL_PCI_TCE_KILL_PE,
OPAL_PCI_TCE_KILL_ALL and phb4_tce_kill() gets it right but
npu2_tce_kill() uses OPAL_PCI_TCE_KILL which is an OPAL API token.
This fixes an obvious mistype.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit 34ceb75f282952b40b615558f947c3fee533b1d4 ]
In opal_npu_tl_set(), we made a typo that means the OPAL_NPU_TL_SET call
may not clear the enable bits for templates that were previously enabled
but are now disabled.
Fix the typo so we clear NPU2_OTL_CONFIG1_TX_TEMP2_EN as well as
TEMP{1,3}_EN.
Reported-by: Tyler Seredynski <tseredynski@gmail.com>
Fixes: cd8b82a8e83ed ("npu2-opencapi: Add OpenCAPI OPAL API calls")
Cc: stable
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit 9a83ab711ea3c76919f311cb1c78e051ae59c808 ]
If the PHB encounters a UR or CA status on a CFG write, it will
incorrectly freeze the wrong PE. Instead of using the PE# specified
in the CONFIG_ADDRESS register, it will use the PE# of whatever
MMIO occurred last.
Work around this disabling freeze on such errors
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-By: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit 0a087154ca4f6759ad1e25c0b3933a9e6caeb456 ]
If the zalloc fails (and it can be a rather large allocation),
we will overwite memory at 0 instead of failing.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit cfecc3960c00ea9a9871c2358d8710c5d2c6539b ]
In a POWER9 chip, some PHB4s have 256 PEs, some have 512.
Currently, the diagnostics code retrieves 512 unconditionally,
which is wrong and causes us to incorrectly report bogus values
for the "high" PEs on the small PHBs.
Use the actual number of implemented PEs instead
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
[ Upstream commit 1520d6a1e3aaec74228d213083b68da70729121a ]
Presently phb4_probe_stack() causes an exception while trying to probe
a PHB if its garded. This causes skiboot to go into a reboot loop with
following exception log:
***********************************************
Fatal MCE at 000000003006ecd4 .probe_phb4+0x570
CFAR : 00000000300b98a0
<snip>
Aborting!
CPU 0018 Backtrace:
S: 0000000031cc37e0 R: 000000003001a51c ._abort+0x4c
S: 0000000031cc3860 R: 0000000030028170 .exception_entry+0x180
S: 0000000031cc3a40 R: 0000000000001f10 *
S: 0000000031cc3c20 R: 000000003006ecb0 .probe_phb4+0x54c
S: 0000000031cc3e30 R: 0000000030014ca4 .main_cpu_entry+0x5b0
S: 0000000031cc3f00 R: 0000000030002700 boot_entry+0x1b8
This is caused as phb4_probe_stack() will ignore all xscom read/write
errors to enable PHB Bars and then tries to perform an mmio to read
PHB Version registers that cause the fatal MCE.
We fix this by ignoring the PHB probe if the first xscom_write() to
populate the PHB Bar register fails, which indicates that there is
something wrong with the PHB.
Cc: stable
Fixes: dc21b4db3a2e('hw/phb4: Add initial support')
Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
If a capi device does a DMA write targeting an address lower than 4GB,
it does so through a 32-bit operation, per the PCI spec. In capi mode,
the first TVE entry is configured in bypass mode, so the address is
valid. But with any (bad) luck, the address could be 0xFFFFxxxx, thus
looking like a 32-bit MSI.
We currently enable both 32-bit and 64-bit MSIs, so the PHB will
interpret the DMA write as a MSI, which very likely results in an EEH
(MSI with a bad payload size).
We can fix it by disabling 32-bit MSI when switching the PHB to capi
mode. Capi devices are 64-bit.
Cc: stable
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit 3b9bc869a4fee22c99a4d24ba87ce938d46b11f4)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
The current capp recovery timeout control loop in
do_capp_recovery_scoms() uses a wrong comparison for return value of
tb_compare(). This may cause do_capp_recovery_scoms() to report an
timeout earlier than the 168ms stipulated time.
The patch fixes this by updating the loop timeout control branch in
do_capp_recovery_scoms() to use the correct enum tb_cmpval.
Cc: Stable #6.0+
Fixes: 09b853cae0aa0("capi: Poll Err/Status register during CAPP
recovery")
Reported-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit ec954f764efe064d7fc99e8a21a0ebdb7b8a3c91)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Commit 47c09cdfe7a3("phb4/capp: Calculate STQ/DMA read engines based
on link-width for PEC") update the CAPP init sequence by calculating
the needed STQ/DMA-read engines based on link width and populating it
in XPEC_NEST_CAPP_CNTL register. This however needs to be synchronized
with the value set in CAPP APC FSM Read Machine Mask Register.
Hence this patch update phb4_init_capp_regs() to calculate the link
width of the stack on PEC2 and populate the same values as previously
populated in PEC CAPP_CNTL register.
Cc: stable # v5.7+
Fixes: 47c09cdfe7a3("phb4/capp: Calculate STQ/DMA read engines based on link-width for PEC")
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit ef9caad57e59ffc1a9ee44d38a161f624993b67b)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Due to some HW errata, the block tracking facility (performance optimisation
for large systems) should be disabled on Nimbus chips. Disable it unconditionally
for now.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit 7db7c9f652295a47b7fed0fb62787ab795216a18)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
We reallocate additional 16/8 DMA-Read engines allocated to stack0/1
on PEC2 respectively. This is needed to improve bandwidth available to
the Mellanox CX5 adapter when trying to read GPU memory (GPU-Direct).
If kernel cxl driver indicates a request to allocate maximum possible
DMA read engines when calling enable_capi_mode() and card is attached
to PEC2/stack0 slot then we assume its a Mellanox CX5 adapter. We then
allocate additional 16/8 extra DMA read engines to stack0 and stack1
respectively on PEC2. This is done by populating the
XPEC_PCI_PRDSTKOVR and XPEC_NEST_READ_STACK_OVERRIDE as suggested by
the h/w team.
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit 3754dba77ef5a4d72dc579e789c0a7b06af02160)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
By default when a PCIe device issues a read request via the PHB it is first
issued with nodal scope. When accessing GPU memory the NPU does not know at the
time of response if the requested memory page is off node or not. Therefore
every read of GPU memory by a PHB is retried with larger scope which introduces
bandwidth and latency issues.
On smaller boxes which have pump mode enabled nodal and group scoped reads are
treated the same and both types of request are broadcast to one chip. Therefore
we can avoid the retry by disabling nodal scope on the PHB for these boxes. On
larger boxes nodal (single chip) and group (multiple chip) scoped reads are
treated differently. Therefore we avoid disabling nodal scope on large boxes
which have pump mode disabled to avoid all PHB requests being broadcast to
multiple chips.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit 68518e542e6f7adfe4e97ac22024970ac2400872)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Currently it is defined in npu2-regs.h but needs to be used by other files
as well so move it somewhere generic.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit b8702e2c69638f9cab818e76232af3481935e250)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Enable these error checking features by setting the appropriate bits in
our one-off initialization of each "NTL Misc Config 2" register.
The exception is NDL RX parity checking, which should be disabled during
the link training procedures.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit 041d69bb1a7084778d63a846d109c148c7a0009a)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Name this bit properly. There's a lot more cleanup like this to be done,
but I'm catching this one now as part of some related changes.
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit c2493fd0ce30dd4204cf4cec2e9c4496201a0cf1)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Presently in CAPI mode the number of STQ/DMA-read engines allocated on
PEC2 for CAPP is fixed to 6 and 0-30 respectively irrespective of the
PCI link width. These values are only suitable for x8 cards and
quickly run out if a x16 card is plugged to a PEC2 attached slot. This
usually manifests as CAPP reporting TLBI timeout due to these messages
getting stalled due to insufficient STQs.
To fix this we update enable_capi_mode() to check if PEC2 chiplet is
in x16 mode and if yes then we allocate 4/0-47 STQ/DMA-read engines
for the CAPP traffic.
Cc: stable # v5.7+
Fixes: 37ea3cfdc852("capi: Enable capi mode for PHB4")
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit 47c09cdfe7a34843387c968ce75cea8dc578ab91)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Fixes: 99505c03f493 ("sensor-groups: occ: Add support to disable/enable sensor group")
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit d6de8fe73b88f92d6a222905e1974ec73777d5e5)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
With the current code, the capi mmio window is not correctly configured
in the IODA table entry. The first entry (generally the non-prefetchable
BAR) is overwrriten.
This patch sets the capi window bar at the right place.
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit 98182a960c5ffd53eed139668e686bc5af6e2e5f)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
There are a couple of places we can set/unset fence for a brick:
1. MISC register: NPU2_MISC_FENCE_STATE
2. NTL register for the brick: NPU2_NTL_MISC_CFG1(ndev)
Recent testing of ATS in combination with GPU reset has exposed a side
effect of using (1); if fence is set for all six bricks, it triggers a
sticky nmmu latch which prevents the NPU from getting ATR responses.
This manifests as a hang in the tests.
We have npu2_dev_fence_brick() which uses (1), and only two calls to it.
Replace the call which sets fence with a write to (2). Remove the
corresponding unset call entirely. It's unneeded because the procedures
already do a progression from full fence to half to idle using (2).
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit 5ff8763c9b0421d8de0f4346ca211c853d2406d4)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
This helps some cards train on the second PERST (ie fast-reboot). The
reason is not clear why but it helps, so YOLO!
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit 9078f8268922b44c3b0f2cd44f567b9389073142)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
I'm going to defer training to this state soon, so move the tracing
first.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit efc4020a32fbb199c58ada9315d64a175162d066)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
We want to get through this as fast as possible so minimise by
removing msecs_to_tb() call. Changes number passed from 512 -> 1.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit da05882b8e6e146b5b4121b1e177c4aea47de8f2)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
The read offset (4:11) in Receive FIFO control register is incremented
by FIFO size whenever CRB read by NX. But the index in RxFIFO has to
match with the corresponding entry in FIFO maintained by VAS in kernel.
VAS entry is reset to 0 when opening the receive window during driver
initialization. So when NX842 is reloaded or in kexec boot, possibility
of mismatch between RxFIFO control register and VAS entries in kernel.
It could cause CRB failure / timeout from NX.
This patch adds nx_coproc_init opal call for kernel to initialize
readOffset (4:11) and Queued (15:23) in RxFIFO control register.
Fixes: 3b3c5962f432 ("NX: Add P9 NX support for 842 compression engine")
CC: stable # v5.8+
Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit 56026a13292453b072ad3cc9adf3dee960077f38)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
stop1_lite has been removed since it adds no additional benefit
over stop0_lite. stop2_lite has been removed since currently it adds
minimal benefit over stop2. However, the benefit is eclipsed by the time
required to ungate the clocks
Moreover, Lite states don't give up the SMT resources, can potentially
have a performance impact on sibling threads.
Since current OSs (Linux) aren't smart enough to make good decisions
with these stop states, we're (temporarly) removing them from what
we expose to the OS, the idea being to bring them back in a new
DT representation so that only an OS that knows what to do will
do things with them.
Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[stewart: add to explanation]
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit 34e9c3c1edb3eed02f428f9cbf97d99b3db43d4d)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Force reset was added as an attempt to work around some issues with TPM
devices locking up their I2C bus. In that particular case the problem
was that the device would hold the SCL line down permanently due to a
device firmware bug. The force reset doesn't actually do anything to
alleviate the situation here, it just happens to reset the internal
master state enough to make the I2C driver appear to work until
something tries to access the bus again.
On P9 systems with secure boot enabled there is the added problem
of the "diagostic mode" not being supported on I2C masters A,B,C and
D. Diagnostic mode allows the SCL and SDA lines to be driven directly
by software. Without this force reset is impossible to implement.
This patch removes the force reset functionality entirely since:
a) it doesn't do what it's supposed to, and
b) it's butt ugly code
Additionally, turn p8_i2c_reset_engine() into p8_i2c_reset_port().
There's no need to reset every port on a master in response to an
error that occurred on a specific port.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit 49656a181133013d0b436db8052e23895ad4ff11)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Add support for setting a default timeout for the I2C port to the
device-tree. This is consumed by skiboot.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
(cherry picked from commit ac6059026442f0da98293f800aa002271d579097)
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Without the WOF registers it's hard to figure out what went wrong first,
so print those when we print the FIRs when a fence is detected.
Suggested-by: Mike Perez <perezma@us.ibm.com>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
Stability improvements in microcode for stop4/stop5 are
available in upstream hcode images. Stop4 and stop5 can
be safely enabled by default.
Use ~0xE0000000 to cut all but stop0,1,2 in case there
are any issues with stop4/5.
example:
nvram -p ibm,skiboot --update-config opal-stop-state-disable-mask=0x1FFFFFFF
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|
|
BMC Get device ID command gives BMC firmware version details. Lets add this
to device tree. User space tools will use this information to display BMC
version details.
Stewart,
I have added bmc information under /ibm,firmware-version node as its firmware
version. But may be we should add new node (/bmc/firmware). So that we can
keep BMC related information separately. Let me know your thoughts on this.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
|