author | Stewart Smith <stewart@linux.vnet.ibm.com> | 2017-10-02 12:08:25 +1100 |
---|---|---|
committer | Stewart Smith <stewart@linux.vnet.ibm.com> | 2017-10-11 16:45:34 +1100 |
commit | 696d378d7b7295366e115e89a785640bf72a5043 (patch) | |
tree | 83cfb7042a91d5d7489684f1b23ba44294663537 /platforms | |
parent | e363cd66debb6a83e64bdd3bbdbf0eff501443a8 (diff) | |
fsp: return OPAL_BUSY_EVENT on failure sending FSP_CMD_POWERDOWN_NORM
We had a race condition between FSP Reset/Reload and powering down
the system from the host:
Roughly:
FSP                        Host
---                        ----
Power on
                           Power on
(inject EPOW)
(trigger FSP R/R)
                           Processes EPOW event, starts shutting down
                           calls OPAL_CEC_POWER_DOWN
(is still in R/R)
                           gets OPAL_INTERNAL_ERROR, spins in opal_poll_events
(FSP comes back)
                           spinning in opal_poll_events
(thinks host is running)
The call to OPAL_CEC_POWER_DOWN is only made once, because the
reset/reload error path for fsp_sync_msg() returns -1, which means we
give the OS OPAL_INTERNAL_ERROR. That is fine, except that our own API
docs give us the opportunity to return OPAL_BUSY when trying again
later may be successful, and we're ambiguous as to whether you should
retry on OPAL_INTERNAL_ERROR.
For reference, the linux code looks like this:
>static void __noreturn pnv_power_off(void)
>{
>	long rc = OPAL_BUSY;
>
>	pnv_prepare_going_down();
>
>	while (rc == OPAL_BUSY || rc == OPAL_BUSY_EVENT) {
>		rc = opal_cec_power_down(0);
>		if (rc == OPAL_BUSY_EVENT)
>			opal_poll_events(NULL);
>		else
>			mdelay(10);
>	}
>	for (;;)
>		opal_poll_events(NULL);
>}
Which means that *practically* our only option is to return OPAL_BUSY
or OPAL_BUSY_EVENT.
We choose OPAL_BUSY_EVENT for FSP systems as we do want to ensure we're
running pollers to communicate with the FSP and do the final bits of
Reset/Reload handling before we power off the system.
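To illustrate the reasoning above, here is a minimal, self-contained sketch
that models the host retry loop against a fake firmware. Everything named
fake_* and host_power_off() is hypothetical and exists only for this
illustration; the numeric return code values are assumed to match skiboot's
opal-api.h and should be treated as illustrative. The model assumes the FSP
finishes Reset/Reload only after the pollers have run a few times.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror opal-api.h; values are illustrative only. */
#define OPAL_SUCCESS		0
#define OPAL_BUSY		(-2)
#define OPAL_INTERNAL_ERROR	(-11)
#define OPAL_BUSY_EVENT		(-12)

/* Hypothetical model of the firmware/FSP state (not a skiboot structure). */
struct fake_fsp {
	int rr_polls_left;	/* poller runs still needed to finish R/R */
	int64_t rr_rc;		/* what power down returns while in R/R */
	int power_down_calls;	/* how often the host asked to power down */
	int powered_off;	/* set once the shutdown command got through */
};

/* Stand-in for OPAL_CEC_POWER_DOWN: fails while the FSP is in R/R. */
static int64_t fake_cec_power_down(struct fake_fsp *fsp)
{
	fsp->power_down_calls++;
	if (fsp->rr_polls_left > 0)
		return fsp->rr_rc;
	fsp->powered_off = 1;
	return OPAL_SUCCESS;
}

/* Stand-in for opal_poll_events(): each run makes R/R progress. */
static void fake_poll_events(struct fake_fsp *fsp)
{
	if (fsp->rr_polls_left > 0)
		fsp->rr_polls_left--;
}

/*
 * Same shape as the Linux loop quoted above, except that the final
 * "spin forever" loop is bounded so the sketch terminates.
 * Returns 1 if the system was powered off, 0 otherwise.
 */
static int host_power_off(struct fake_fsp *fsp, int final_polls)
{
	int64_t rc = OPAL_BUSY;

	while (rc == OPAL_BUSY || rc == OPAL_BUSY_EVENT) {
		rc = fake_cec_power_down(fsp);
		if (rc == OPAL_BUSY_EVENT)
			fake_poll_events(fsp);
		/* the real code calls mdelay(10) for any other rc */
	}
	for (int i = 0; i < final_polls; i++)
		fake_poll_events(fsp);

	return fsp->powered_off;
}
```

In this model, with rr_rc set to OPAL_INTERNAL_ERROR the host calls power
down exactly once and then only polls, so the system never powers off. With
OPAL_BUSY_EVENT the host polls between retries, R/R completes, and a later
call succeeds. Returning plain OPAL_BUSY would retry without running the
pollers, which in this model would never let R/R finish.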
Additionally, we really should update our documentation to point out all
of these return codes and what action an OS should take.
CC: stable
Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Diffstat (limited to 'platforms')
-rw-r--r-- | platforms/ibm-fsp/common.c | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/platforms/ibm-fsp/common.c b/platforms/ibm-fsp/common.c
index 237b63f..0a9b06f 100644
--- a/platforms/ibm-fsp/common.c
+++ b/platforms/ibm-fsp/common.c
@@ -223,7 +223,7 @@ int64_t ibm_fsp_cec_power_down(uint64_t request)
 	printf("FSP: Sending shutdown command to FSP...\n");
 
 	if (fsp_sync_msg(fsp_mkmsg(FSP_CMD_POWERDOWN_NORM, 1, request), true))
-		return OPAL_INTERNAL_ERROR;
+		return OPAL_BUSY_EVENT;
 
 	fsp_reset_links();
 	return OPAL_SUCCESS;